linux-kernel.vger.kernel.org archive mirror
* [RFC v1 00/16] perf kwork: Implement perf kwork top
@ 2023-08-12  8:49 Yang Jihong
  2023-08-12  8:49 ` [RFC v1 01/16] perf kwork: Fix incorrect and missing free atom in work_push_atom() Yang Jihong
                   ` (16 more replies)
  0 siblings, 17 replies; 26+ messages in thread
From: Yang Jihong @ 2023-08-12  8:49 UTC (permalink / raw)
  To: peterz, mingo, acme, mark.rutland, alexander.shishkin, jolsa,
	namhyung, irogers, adrian.hunter, kan.liang, sandipan.das,
	ravi.bangoria, linux-kernel, linux-perf-users
  Cc: yangjihong1

Common tools for collecting CPU usage statistics, such as top, obtain
data from timer interrupt sampling and then periodically read the
accumulated statistics from /proc/stat.

This method has some inaccuracies:
1. In the tick interrupt, the entire interval between the previous tick
   and the current tick is charged to the currently running task, even
   though the task may have been running for only part of that interval.
2. The top tool periodically reads the /proc/{PID}/status information
   for each task, so tasks with a short life cycle may be missed
   entirely.

As a result, the top tool cannot accurately report the CPU usage and
running time of tasks.

Accounting based on the sched_switch tracepoint can accurately
calculate the CPU usage of all tasks. This method is suitable for
scenarios that require high-precision performance comparison data.

In addition to parsing the perf.data file, the tool provides BPF-based
CPU usage statistics to reduce the load of the perf process.

Example usage:

  # perf kwork top -h

   Usage: perf kwork top [<options>]

      -C, --cpu <cpu>       list of cpus to profile
      -i, --input <file>    input file name
      -n, --name <name>     event name to profile
      -s, --sort <key[,key2...]>
                            sort by key(s): rate, runtime, tid
          --time <str>      Time span for analysis (start,stop)

  # ./perf kwork -k sched,irq,softirq record -- ./perf bench sched messaging -g 1 -l 10000
  # Running 'sched/messaging' benchmark:
  # 20 sender and receiver processes per group
  # 1 groups == 40 processes run

       Total time: 1.854 [sec]
  [ perf record: Woken up 13 times to write data ]
  [ perf record: Captured and wrote 27.436 MB perf.data (235986 samples) ]
  #
  # perf kwork top

  Total  :  22419.068 ms, 12 cpus
  %Cpu(s):  17.99% id,   0.00% hi,   0.09% si
  %Cpu0   [||||||||||||||||||||||||        82.56%]
  %Cpu1   [||||||||||||||||||||||||        81.73%]
  %Cpu2   [||||||||||||||||||||||||        83.00%]
  %Cpu3   [||||||||||||||||||||||||        81.01%]
  %Cpu4   [||||||||||||||||||||||||        81.66%]
  %Cpu5   [||||||||||||||||||||||||        82.10%]
  %Cpu6   [||||||||||||||||||||||||        81.65%]
  %Cpu7   [||||||||||||||||||||||||        82.00%]
  %Cpu8   [||||||||||||||||||||||||        82.54%]
  %Cpu9   [||||||||||||||||||||||||        81.02%]
  %Cpu10  [||||||||||||||||||||||||        81.78%]
  %Cpu11  [||||||||||||||||||||||||        81.87%]

        PID    %CPU           RUNTIME  COMMAND
    ----------------------------------------------------
    2913025   27.62        516.976 ms  sched-messaging
    2913017   27.56        516.273 ms  sched-messaging
    2913018   27.56        516.152 ms  sched-messaging
    2913008   27.54        515.447 ms  sched-messaging
    2913010   27.50        514.890 ms  sched-messaging
    2913007   27.48        514.411 ms  sched-messaging
    2913015   27.36        512.563 ms  sched-messaging
    2913026   27.35        512.198 ms  sched-messaging
    2913011   27.32        511.632 ms  sched-messaging
    2913014   27.32        511.381 ms  sched-messaging
    2913012   27.30        511.138 ms  sched-messaging
    2913013   27.30        511.109 ms  sched-messaging
    2913009   27.30        510.997 ms  sched-messaging
    2913019   27.25        510.488 ms  sched-messaging
    2913021   27.23        509.661 ms  sched-messaging
    2913023   27.21        509.400 ms  sched-messaging
    2913020   27.20        509.677 ms  sched-messaging
    2913022   27.19        509.061 ms  sched-messaging
    2913016   27.18        509.231 ms  sched-messaging
    2913024   27.18        509.108 ms  sched-messaging
    2912999   21.79        407.929 ms  sched-messaging
    2913000   21.78        408.045 ms  sched-messaging
    2912998   21.75        407.579 ms  sched-messaging
    2912993   21.69        406.166 ms  sched-messaging
    2912991   21.59        404.378 ms  sched-messaging
    2912992   21.57        403.845 ms  sched-messaging
    2913005   21.56        403.715 ms  sched-messaging
    2913001   21.55        403.803 ms  sched-messaging
    2912990   21.54        403.601 ms  sched-messaging
    2912994   21.52        403.087 ms  sched-messaging
    2912995   21.49        402.421 ms  sched-messaging
    2912996   21.48        402.666 ms  sched-messaging
    2912989   21.46        402.303 ms  sched-messaging
    2913003   21.46        402.034 ms  sched-messaging
    2912988   21.44        401.820 ms  sched-messaging
    2913004   21.37        400.242 ms  sched-messaging
    2912987   21.28        398.512 ms  sched-messaging
    2913002   21.27        398.759 ms  sched-messaging
    2912997   21.27        398.518 ms  sched-messaging
    2913006   21.23        397.888 ms  sched-messaging
          0   18.89        353.061 ms  swapper/3
          0   18.89        353.026 ms  swapper/9
          0   18.26        341.190 ms  swapper/6
          0   18.25        341.048 ms  swapper/4
          0   18.13        338.844 ms  swapper/10
          0   18.12        338.686 ms  swapper/1
          0   18.05        337.333 ms  swapper/11
          0   17.92        334.903 ms  swapper/7
          0   17.80        332.691 ms  swapper/5
          0   17.38        324.762 ms  swapper/8
          0   17.29        323.102 ms  swapper/0
          0   16.89        315.630 ms  swapper/2
    2912984    2.10         39.452 ms  perf
    2912985    0.73         14.004 ms  sched-messaging
     660381    0.14          3.527 ms  containerd
    3006112    0.07          1.402 ms  redis-server
    2904655    0.05          1.071 ms  kworker/0:0
     659822    0.03          1.729 ms  containerd
     659824    0.03          1.346 ms  containerd
       1053    0.01          0.353 ms  zabbix_agentd
    2895571    0.01          0.349 ms  kworker/8:1
         10    0.00          0.535 ms  rcu_sched
    2912192    0.00          0.272 ms  kworker/u24:2
    2903540    0.00          0.182 ms  sshd
        801    0.00          0.120 ms  jbd2/sda4-8
    2929605    0.00          0.108 ms  grpc_global_tim
    2912986    0.00          0.075 ms  perf
    2897765    0.00          0.073 ms  kworker/7:1
    2912817    0.00          0.070 ms  kworker/2:1
     391756    0.00          0.067 ms  grpc_global_tim
    2901459    0.00          0.056 ms  kworker/3:0
    2908185    0.00          0.052 ms  kworker/10:1
    2901521    0.00          0.050 ms  kworker/6:1
    2902444    0.00          0.046 ms  kworker/5:1
    2898513    0.00          0.041 ms  kworker/1:1
    2912280    0.00          0.039 ms  kworker/11:3
       1057    0.00          0.037 ms  zabbix_agentd
    2908010    0.00          0.037 ms  kworker/9:3
    2912951    0.00          0.036 ms  kworker/4:3
        793    0.00          0.027 ms  multipathd
         42    0.00          0.023 ms  ksoftirqd/5
          9    0.00          0.022 ms  ksoftirqd/0
         24    0.00          0.019 ms  ksoftirqd/2
         59    0.00          0.019 ms  migration/8
    2929607    0.00          0.018 ms  wecode-db
         29    0.00          0.012 ms  migration/3
         11    0.00          0.012 ms  migration/0
         65    0.00          0.012 ms  migration/9
         17    0.00          0.011 ms  migration/1
         71    0.00          0.011 ms  migration/10
         53    0.00          0.011 ms  migration/7
         23    0.00          0.011 ms  migration/2
         47    0.00          0.011 ms  migration/6
        789    0.00          0.010 ms  multipathd
         35    0.00          0.010 ms  migration/4
        480    0.00          0.009 ms  kworker/4:1H
         41    0.00          0.008 ms  migration/5
         54    0.00          0.006 ms  ksoftirqd/7
         36    0.00          0.005 ms  ksoftirqd/4
         72    0.00          0.005 ms  ksoftirqd/10
         66    0.00          0.005 ms  ksoftirqd/9
         30    0.00          0.004 ms  ksoftirqd/3
    2898362    0.00          0.003 ms  kworker/u24:0

  # perf kwork top -C 2,4,5

  Total  :   5604.885 ms, 3 cpus
  %Cpu(s):  17.65% id,   0.00% hi,   0.09% si
  %Cpu2   [||||||||||||||||||||||||        83.00%]
  %Cpu4   [||||||||||||||||||||||||        81.66%]
  %Cpu5   [||||||||||||||||||||||||        82.10%]

        PID    %CPU           RUNTIME  COMMAND
    ----------------------------------------------------
          0   18.25        341.048 ms  swapper/4
          0   17.80        332.691 ms  swapper/5
          0   16.89        315.630 ms  swapper/2
    2913012    8.99        168.201 ms  sched-messaging
    2913021    8.74        163.456 ms  sched-messaging
    2913026    8.34        156.120 ms  sched-messaging
    2913010    8.14        152.372 ms  sched-messaging
    2913007    7.74        144.754 ms  sched-messaging
    2913016    7.42        138.977 ms  sched-messaging
    2913022    7.28        136.279 ms  sched-messaging
    2913013    7.00        130.894 ms  sched-messaging
    2913019    6.96        130.455 ms  sched-messaging
    2913023    6.91        129.364 ms  sched-messaging
    2913008    6.47        121.331 ms  sched-messaging
    2913018    6.47        121.124 ms  sched-messaging
    2912994    6.42        120.213 ms  sched-messaging
    2913025    6.42        120.140 ms  sched-messaging
    2913020    6.37        119.386 ms  sched-messaging
    2913011    6.35        118.921 ms  sched-messaging
    2913014    6.24        116.779 ms  sched-messaging
    2913009    6.01        112.661 ms  sched-messaging
    2913005    5.96        111.587 ms  sched-messaging
    2913003    5.94        111.142 ms  sched-messaging
    2913024    5.87        109.876 ms  sched-messaging
    2912988    5.84        109.396 ms  sched-messaging
    2912987    5.80        108.550 ms  sched-messaging
    2912990    5.51        103.322 ms  sched-messaging
    2912993    5.45        102.045 ms  sched-messaging
    2912996    5.42        101.622 ms  sched-messaging
    2913006    5.42        101.592 ms  sched-messaging
    2913000    5.41        101.511 ms  sched-messaging
    2913002    5.41        101.420 ms  sched-messaging
    2912997    5.40        101.315 ms  sched-messaging
    2913004    5.33         99.872 ms  sched-messaging
    2913015    5.29         99.252 ms  sched-messaging
    2913017    5.22         97.822 ms  sched-messaging
    2912991    5.11         95.773 ms  sched-messaging
    2912992    5.11         95.579 ms  sched-messaging
    2913001    5.07         95.033 ms  sched-messaging
    2912989    4.93         92.425 ms  sched-messaging
    2912995    4.85         90.892 ms  sched-messaging
    2912998    4.78         89.617 ms  sched-messaging
    2912999    4.56         85.576 ms  sched-messaging
    3006112    0.07          1.402 ms  redis-server
     660381    0.02          0.566 ms  containerd
    2912984    0.02          0.452 ms  perf
     659824    0.01          0.369 ms  containerd
     659822    0.00          0.316 ms  containerd
         10    0.00          0.176 ms  rcu_sched
        801    0.00          0.120 ms  jbd2/sda4-8
    2912985    0.00          0.107 ms  sched-messaging
    2912986    0.00          0.075 ms  perf
    2912817    0.00          0.070 ms  kworker/2:1
    2912192    0.00          0.056 ms  kworker/u24:2
    2929605    0.00          0.046 ms  grpc_global_tim
    2902444    0.00          0.046 ms  kworker/5:1
    2912951    0.00          0.036 ms  kworker/4:3
         42    0.00          0.023 ms  ksoftirqd/5
         24    0.00          0.019 ms  ksoftirqd/2
         23    0.00          0.011 ms  migration/2
         35    0.00          0.010 ms  migration/4
        480    0.00          0.009 ms  kworker/4:1H
         41    0.00          0.008 ms  migration/5
         36    0.00          0.005 ms  ksoftirqd/4

  # perf kwork top -n perf

  Total  :  22419.068 ms, 12 cpus
  %Cpu(s):  17.99% id,   0.00% hi,   0.09% si
  %Cpu0   [                                 0.00%]
  %Cpu1   [                                 0.00%]
  %Cpu2   [                                 0.02%]
  %Cpu3   [                                 0.00%]
  %Cpu4   [                                 0.00%]
  %Cpu5   [                                 0.00%]
  %Cpu6   [                                 0.00%]
  %Cpu7   [                                 0.00%]
  %Cpu8   [                                 0.03%]
  %Cpu9   [                                 0.09%]
  %Cpu10  [                                 0.00%]
  %Cpu11  [                                 1.96%]

        PID    %CPU           RUNTIME  COMMAND
    ----------------------------------------------------
    2912984    2.10         39.452 ms  perf
    2912986    0.00          0.075 ms  perf

  # perf kwork top -s tid -n sched-messaging

  Total  :  22419.068 ms, 12 cpus
  %Cpu(s):  17.99% id,   0.00% hi,   0.09% si
  %Cpu0   [||||||||||||||||||||||||        81.80%]
  %Cpu1   [||||||||||||||||||||||||        81.65%]
  %Cpu2   [||||||||||||||||||||||||        82.88%]
  %Cpu3   [||||||||||||||||||||||||        80.96%]
  %Cpu4   [||||||||||||||||||||||||        81.64%]
  %Cpu5   [||||||||||||||||||||||||        82.04%]
  %Cpu6   [||||||||||||||||||||||||        81.63%]
  %Cpu7   [||||||||||||||||||||||||        81.97%]
  %Cpu8   [||||||||||||||||||||||||        82.43%]
  %Cpu9   [||||||||||||||||||||||||        80.86%]
  %Cpu10  [||||||||||||||||||||||||        81.72%]
  %Cpu11  [|||||||||||||||||||||||         79.89%]

        PID    %CPU           RUNTIME  COMMAND
    ----------------------------------------------------
    2912985    0.04          1.111 ms  sched-messaging
    2912987   21.28        398.512 ms  sched-messaging
    2912988   21.44        401.820 ms  sched-messaging
    2912989   21.46        402.303 ms  sched-messaging
    2912990   21.54        403.601 ms  sched-messaging
    2912991   21.59        404.378 ms  sched-messaging
    2912992   21.57        403.845 ms  sched-messaging
    2912993   21.69        406.166 ms  sched-messaging
    2912994   21.52        403.087 ms  sched-messaging
    2912995   21.49        402.421 ms  sched-messaging
    2912996   21.48        402.666 ms  sched-messaging
    2912997   21.27        398.518 ms  sched-messaging
    2912998   21.75        407.579 ms  sched-messaging
    2912999   21.79        407.929 ms  sched-messaging
    2913000   21.78        408.045 ms  sched-messaging
    2913001   21.55        403.803 ms  sched-messaging
    2913002   21.27        398.759 ms  sched-messaging
    2913003   21.46        402.034 ms  sched-messaging
    2913004   21.37        400.242 ms  sched-messaging
    2913005   21.56        403.715 ms  sched-messaging
    2913006   21.23        397.888 ms  sched-messaging
    2913007   27.48        514.411 ms  sched-messaging
    2913008   27.54        515.447 ms  sched-messaging
    2913009   27.30        510.997 ms  sched-messaging
    2913010   27.50        514.890 ms  sched-messaging
    2913011   27.32        511.632 ms  sched-messaging
    2913012   27.30        511.138 ms  sched-messaging
    2913013   27.30        511.109 ms  sched-messaging
    2913014   27.32        511.381 ms  sched-messaging
    2913015   27.36        512.563 ms  sched-messaging
    2913016   27.18        509.231 ms  sched-messaging
    2913017   27.56        516.273 ms  sched-messaging
    2913018   27.56        516.152 ms  sched-messaging
    2913019   27.25        510.488 ms  sched-messaging
    2913020   27.20        509.677 ms  sched-messaging
    2913021   27.23        509.661 ms  sched-messaging
    2913022   27.19        509.061 ms  sched-messaging
    2913023   27.21        509.400 ms  sched-messaging
    2913024   27.18        509.108 ms  sched-messaging
    2913025   27.62        516.976 ms  sched-messaging
    2913026   27.35        512.198 ms  sched-messaging

  # perf kwork top -b
  Starting trace, Hit <Ctrl+C> to stop and report
  ^C
  Total  :  89209.004 ms, 4 cpus
  %Cpu(s):  13.47% id,   0.01% hi,   0.24% si
  %Cpu0   [|||||||||||||||||||||||||       86.08%]
  %Cpu1   [||||||||||||||||||||||||||      87.08%]
  %Cpu2   [|||||||||||||||||||||||||       86.10%]
  %Cpu3   [|||||||||||||||||||||||||       85.86%]

        PID     SPID    %CPU           RUNTIME  COMMAND
    -------------------------------------------------------------
          0        0   13.86       3092.482 ms  [swapper/3]
          0        0   13.71       3057.756 ms  [swapper/0]
          0        0   13.65       3044.987 ms  [swapper/2]
          0        0   12.63       2818.079 ms  [swapper/1]
        320      320    1.32        295.427 ms  bash
        414      320    1.09        247.658 ms  sched-messaging
        678      320    1.09        246.881 ms  sched-messaging
        354      320    1.09        246.683 ms  sched-messaging
        409      320    1.08        246.839 ms  sched-messaging
        554      320    1.07        243.045 ms  sched-messaging
        405      320    1.04        237.340 ms  sched-messaging
        525      320    1.04        235.718 ms  sched-messaging
        486      320    1.03        235.369 ms  sched-messaging
        714      320    1.03        235.142 ms  sched-messaging
        435      320    1.03        233.290 ms  sched-messaging
        408      320    1.03        231.687 ms  sched-messaging
        716      320    1.02        233.513 ms  sched-messaging
        453      320    1.02        233.181 ms  sched-messaging
        522      320    1.02        230.101 ms  sched-messaging
        651      320    1.02        229.554 ms  sched-messaging
        418      320    1.01        231.359 ms  sched-messaging
        644      320    1.01        229.848 ms  sched-messaging
        535      320    1.01        229.655 ms  sched-messaging
        536      320    1.01        229.147 ms  sched-messaging
        410      320    1.01        227.700 ms  sched-messaging
        689      320    1.00        227.988 ms  sched-messaging
        412      320    1.00        227.052 ms  sched-messaging
        489      320    1.00        226.374 ms  sched-messaging
        521      320    1.00        225.499 ms  sched-messaging
        530      320    1.00        225.262 ms  sched-messaging
        681      320    1.00        225.187 ms  sched-messaging
        415      320    0.99        225.714 ms  sched-messaging
        643      320    0.99        225.090 ms  sched-messaging
        325      320    0.99        223.385 ms  sched-messaging
        498      320    0.99        222.936 ms  sched-messaging
        413      320    0.98        225.213 ms  sched-messaging
        645      320    0.98        223.211 ms  sched-messaging
        544      320    0.98        222.714 ms  sched-messaging
        441      320    0.98        222.590 ms  sched-messaging
        697      320    0.98        222.426 ms  sched-messaging
        523      320    0.98        221.841 ms  sched-messaging
        402      320    0.98        221.776 ms  sched-messaging
  <SNIP>

Yang Jihong (16):
  perf kwork: Fix incorrect and missing free atom in work_push_atom()
  perf kwork: Add the supported subcommands to the document
  perf kwork: Set ordered_events for perf_tool
  perf kwork: Add `kwork` and `src_type` to work_init() for struct
    kwork_class
  perf kwork: Overwrite original atom in the list when a new atom is
    pushed.
  perf kwork: Set default events list if not specified in
    setup_event_list()
  perf kwork: Add sched record support
  perf kwork: Add `root` parameter to work_sort()
  perf kwork: Implement perf kwork top
  perf evsel: Add evsel__intval_common() helper
  perf kwork top: Add statistics on hardirq event support
  perf kwork top: Add statistics on softirq event support
  perf kwork top: Add -C/--cpu -i/--input -n/--name -s/--sort --time
    options
  perf kwork top: Implements BPF-based cpu usage statistics
  perf kwork top: Add BPF-based statistics on hardirq event support
  perf kwork top: Add BPF-based statistics on softirq event support

 tools/perf/Documentation/perf-kwork.txt  |  38 +-
 tools/perf/Makefile.perf                 |   2 +-
 tools/perf/builtin-kwork.c               | 758 +++++++++++++++++++++--
 tools/perf/util/Build                    |   1 +
 tools/perf/util/bpf_kwork_top.c          | 308 +++++++++
 tools/perf/util/bpf_skel/kwork_top.bpf.c | 338 ++++++++++
 tools/perf/util/evsel.c                  |  13 +
 tools/perf/util/evsel.h                  |   2 +
 tools/perf/util/kwork.h                  |  61 +-
 9 files changed, 1472 insertions(+), 49 deletions(-)
 create mode 100644 tools/perf/util/bpf_kwork_top.c
 create mode 100644 tools/perf/util/bpf_skel/kwork_top.bpf.c

-- 
2.30.GIT


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [RFC v1 01/16] perf kwork: Fix incorrect and missing free atom in work_push_atom()
  2023-08-12  8:49 [RFC v1 00/16] perf kwork: Implement perf kwork top Yang Jihong
@ 2023-08-12  8:49 ` Yang Jihong
  2023-09-04  4:05   ` Ian Rogers
  2023-08-12  8:49 ` [RFC v1 02/16] perf kwork: Add the supported subcommands to the document Yang Jihong
                   ` (15 subsequent siblings)
  16 siblings, 1 reply; 26+ messages in thread
From: Yang Jihong @ 2023-08-12  8:49 UTC (permalink / raw)
  To: peterz, mingo, acme, mark.rutland, alexander.shishkin, jolsa,
	namhyung, irogers, adrian.hunter, kan.liang, sandipan.das,
	ravi.bangoria, linux-kernel, linux-perf-users
  Cc: yangjihong1

1. Atoms are managed in page mode and should be released using atom_free()
   instead of free().
2. When the event does not match, the atom needs to be freed.

Fixes: f98919ec4fcc ("perf kwork: Implement 'report' subcommand")
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
---
 tools/perf/builtin-kwork.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-kwork.c b/tools/perf/builtin-kwork.c
index 14bf7a8429e7..73b5dc099a8a 100644
--- a/tools/perf/builtin-kwork.c
+++ b/tools/perf/builtin-kwork.c
@@ -406,12 +406,14 @@ static int work_push_atom(struct perf_kwork *kwork,
 
 	work = work_findnew(&class->work_root, &key, &kwork->cmp_id);
 	if (work == NULL) {
-		free(atom);
+		atom_free(atom);
 		return -1;
 	}
 
-	if (!profile_event_match(kwork, work, sample))
+	if (!profile_event_match(kwork, work, sample)) {
+		atom_free(atom);
 		return 0;
+	}
 
 	if (dst_type < KWORK_TRACE_MAX) {
 		dst_atom = list_last_entry_or_null(&work->atom_list[dst_type],
-- 
2.30.GIT



* [RFC v1 02/16] perf kwork: Add the supported subcommands to the document
  2023-08-12  8:49 [RFC v1 00/16] perf kwork: Implement perf kwork top Yang Jihong
  2023-08-12  8:49 ` [RFC v1 01/16] perf kwork: Fix incorrect and missing free atom in work_push_atom() Yang Jihong
@ 2023-08-12  8:49 ` Yang Jihong
  2023-08-12  8:49 ` [RFC v1 03/16] perf kwork: Set ordered_events for perf_tool Yang Jihong
                   ` (14 subsequent siblings)
  16 siblings, 0 replies; 26+ messages in thread
From: Yang Jihong @ 2023-08-12  8:49 UTC (permalink / raw)
  To: peterz, mingo, acme, mark.rutland, alexander.shishkin, jolsa,
	namhyung, irogers, adrian.hunter, kan.liang, sandipan.das,
	ravi.bangoria, linux-kernel, linux-perf-users
  Cc: yangjihong1

Add missing report, latency and timehist subcommands to the document.

Fixes: f98919ec4fcc ("perf kwork: Implement 'report' subcommand")
Fixes: ad3d9f7a929a ("perf kwork: Implement perf kwork latency")
Fixes: bcc8b3e88d6f ("perf kwork: Implement perf kwork timehist")
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
---
 tools/perf/Documentation/perf-kwork.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/Documentation/perf-kwork.txt b/tools/perf/Documentation/perf-kwork.txt
index 3c36324712b6..482d6c52e2ed 100644
--- a/tools/perf/Documentation/perf-kwork.txt
+++ b/tools/perf/Documentation/perf-kwork.txt
@@ -8,7 +8,7 @@ perf-kwork - Tool to trace/measure kernel work properties (latencies)
 SYNOPSIS
 --------
 [verse]
-'perf kwork' {record}
+'perf kwork' {record|report|latency|timehist}
 
 DESCRIPTION
 -----------
-- 
2.30.GIT



* [RFC v1 03/16] perf kwork: Set ordered_events for perf_tool
  2023-08-12  8:49 [RFC v1 00/16] perf kwork: Implement perf kwork top Yang Jihong
  2023-08-12  8:49 ` [RFC v1 01/16] perf kwork: Fix incorrect and missing free atom in work_push_atom() Yang Jihong
  2023-08-12  8:49 ` [RFC v1 02/16] perf kwork: Add the supported subcommands to the document Yang Jihong
@ 2023-08-12  8:49 ` Yang Jihong
  2023-08-12  8:49 ` [RFC v1 04/16] perf kwork: Add `kwork` and `src_type` to work_init() for struct kwork_class Yang Jihong
                   ` (13 subsequent siblings)
  16 siblings, 0 replies; 26+ messages in thread
From: Yang Jihong @ 2023-08-12  8:49 UTC (permalink / raw)
  To: peterz, mingo, acme, mark.rutland, alexander.shishkin, jolsa,
	namhyung, irogers, adrian.hunter, kan.liang, sandipan.das,
	ravi.bangoria, linux-kernel, linux-perf-users
  Cc: yangjihong1

perf kwork processes data based on timestamps, so the events need to be
sorted.

Fixes: f98919ec4fcc ("perf kwork: Implement 'report' subcommand")
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
---
 tools/perf/builtin-kwork.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/tools/perf/builtin-kwork.c b/tools/perf/builtin-kwork.c
index 73b5dc099a8a..de2fbb7c56c3 100644
--- a/tools/perf/builtin-kwork.c
+++ b/tools/perf/builtin-kwork.c
@@ -1694,9 +1694,10 @@ int cmd_kwork(int argc, const char **argv)
 	static struct perf_kwork kwork = {
 		.class_list          = LIST_HEAD_INIT(kwork.class_list),
 		.tool = {
-			.mmap    = perf_event__process_mmap,
-			.mmap2   = perf_event__process_mmap2,
-			.sample  = perf_kwork__process_tracepoint_sample,
+			.mmap		= perf_event__process_mmap,
+			.mmap2		= perf_event__process_mmap2,
+			.sample		= perf_kwork__process_tracepoint_sample,
+			.ordered_events = true,
 		},
 		.atom_page_list      = LIST_HEAD_INIT(kwork.atom_page_list),
 		.sort_list           = LIST_HEAD_INIT(kwork.sort_list),
-- 
2.30.GIT



* [RFC v1 04/16] perf kwork: Add `kwork` and `src_type` to work_init() for struct kwork_class
  2023-08-12  8:49 [RFC v1 00/16] perf kwork: Implement perf kwork top Yang Jihong
                   ` (2 preceding siblings ...)
  2023-08-12  8:49 ` [RFC v1 03/16] perf kwork: Set ordered_events for perf_tool Yang Jihong
@ 2023-08-12  8:49 ` Yang Jihong
  2023-08-12  8:49 ` [RFC v1 05/16] perf kwork: Overwrite original atom in the list when a new atom is pushed Yang Jihong
                   ` (12 subsequent siblings)
  16 siblings, 0 replies; 26+ messages in thread
From: Yang Jihong @ 2023-08-12  8:49 UTC (permalink / raw)
  To: peterz, mingo, acme, mark.rutland, alexander.shishkin, jolsa,
	namhyung, irogers, adrian.hunter, kan.liang, sandipan.das,
	ravi.bangoria, linux-kernel, linux-perf-users
  Cc: yangjihong1

To support different types of reports, two parameters,
`struct perf_kwork *kwork` and `enum kwork_trace_type src_type`, are
added to work_init() of struct kwork_class for initialization in
different scenarios.

No functional change.

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
---
 tools/perf/builtin-kwork.c | 16 +++++++++++-----
 tools/perf/util/kwork.h    |  6 ++++--
 2 files changed, 15 insertions(+), 7 deletions(-)

diff --git a/tools/perf/builtin-kwork.c b/tools/perf/builtin-kwork.c
index de2fbb7c56c3..42ea59a957ae 100644
--- a/tools/perf/builtin-kwork.c
+++ b/tools/perf/builtin-kwork.c
@@ -398,7 +398,7 @@ static int work_push_atom(struct perf_kwork *kwork,
 	struct kwork_work *work, key;
 
 	BUG_ON(class->work_init == NULL);
-	class->work_init(class, &key, evsel, sample, machine);
+	class->work_init(kwork, class, &key, src_type, evsel, sample, machine);
 
 	atom = atom_new(kwork, sample);
 	if (atom == NULL)
@@ -445,7 +445,7 @@ static struct kwork_atom *work_pop_atom(struct perf_kwork *kwork,
 	struct kwork_work *work, key;
 
 	BUG_ON(class->work_init == NULL);
-	class->work_init(class, &key, evsel, sample, machine);
+	class->work_init(kwork, class, &key, src_type, evsel, sample, machine);
 
 	work = work_findnew(&class->work_root, &key, &kwork->cmp_id);
 	if (ret_work != NULL)
@@ -821,8 +821,10 @@ static int irq_class_init(struct kwork_class *class,
 	return 0;
 }
 
-static void irq_work_init(struct kwork_class *class,
+static void irq_work_init(struct perf_kwork *kwork __maybe_unused,
+			  struct kwork_class *class,
 			  struct kwork_work *work,
+			  enum kwork_trace_type src_type __maybe_unused,
 			  struct evsel *evsel,
 			  struct perf_sample *sample,
 			  struct machine *machine __maybe_unused)
@@ -940,8 +942,10 @@ static char *evsel__softirq_name(struct evsel *evsel, u64 num)
 	return name;
 }
 
-static void softirq_work_init(struct kwork_class *class,
+static void softirq_work_init(struct perf_kwork *kwork __maybe_unused,
+			      struct kwork_class *class,
 			      struct kwork_work *work,
+			      enum kwork_trace_type src_type __maybe_unused,
 			      struct evsel *evsel,
 			      struct perf_sample *sample,
 			      struct machine *machine __maybe_unused)
@@ -1031,8 +1035,10 @@ static int workqueue_class_init(struct kwork_class *class,
 	return 0;
 }
 
-static void workqueue_work_init(struct kwork_class *class,
+static void workqueue_work_init(struct perf_kwork *kwork __maybe_unused,
+				struct kwork_class *class,
 				struct kwork_work *work,
+				enum kwork_trace_type src_type __maybe_unused,
 				struct evsel *evsel,
 				struct perf_sample *sample,
 				struct machine *machine)
diff --git a/tools/perf/util/kwork.h b/tools/perf/util/kwork.h
index 53b7327550b8..736c7a08fb19 100644
--- a/tools/perf/util/kwork.h
+++ b/tools/perf/util/kwork.h
@@ -91,6 +91,7 @@ struct kwork_atom_page {
 	DECLARE_BITMAP(bitmap, NR_ATOM_PER_PAGE);
 };
 
+struct perf_kwork;
 struct kwork_class;
 struct kwork_work {
 	/*
@@ -142,8 +143,10 @@ struct kwork_class {
 	int (*class_init)(struct kwork_class *class,
 			  struct perf_session *session);
 
-	void (*work_init)(struct kwork_class *class,
+	void (*work_init)(struct perf_kwork *kwork,
+			  struct kwork_class *class,
 			  struct kwork_work *work,
+			  enum kwork_trace_type src_type,
 			  struct evsel *evsel,
 			  struct perf_sample *sample,
 			  struct machine *machine);
@@ -152,7 +155,6 @@ struct kwork_class {
 			  char *buf, int len);
 };
 
-struct perf_kwork;
 struct trace_kwork_handler {
 	int (*raise_event)(struct perf_kwork *kwork,
 			   struct kwork_class *class, struct evsel *evsel,
-- 
2.30.GIT



* [RFC v1 05/16] perf kwork: Overwrite original atom in the list when a new atom is pushed.
  2023-08-12  8:49 [RFC v1 00/16] perf kwork: Implement perf kwork top Yang Jihong
                   ` (3 preceding siblings ...)
  2023-08-12  8:49 ` [RFC v1 04/16] perf kwork: Add `kwork` and `src_type` to work_init() for struct kwork_class Yang Jihong
@ 2023-08-12  8:49 ` Yang Jihong
  2023-09-04  4:13   ` Ian Rogers
  2023-08-12  8:49 ` [RFC v1 06/16] perf kwork: Set default events list if not specified in setup_event_list() Yang Jihong
                   ` (11 subsequent siblings)
  16 siblings, 1 reply; 26+ messages in thread
From: Yang Jihong @ 2023-08-12  8:49 UTC (permalink / raw)
  To: peterz, mingo, acme, mark.rutland, alexander.shishkin, jolsa,
	namhyung, irogers, adrian.hunter, kan.liang, sandipan.das,
	ravi.bangoria, linux-kernel, linux-perf-users
  Cc: yangjihong1

work_push_atom() supports nesting, but none of the currently supported
kworks are nested. Add an `overwrite` parameter to overwrite the original
atom in the list when a new atom is pushed.

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
---
 tools/perf/builtin-kwork.c | 24 ++++++++++++++++++------
 1 file changed, 18 insertions(+), 6 deletions(-)

diff --git a/tools/perf/builtin-kwork.c b/tools/perf/builtin-kwork.c
index 42ea59a957ae..f620911a26a2 100644
--- a/tools/perf/builtin-kwork.c
+++ b/tools/perf/builtin-kwork.c
@@ -392,9 +392,10 @@ static int work_push_atom(struct perf_kwork *kwork,
 			  struct evsel *evsel,
 			  struct perf_sample *sample,
 			  struct machine *machine,
-			  struct kwork_work **ret_work)
+			  struct kwork_work **ret_work,
+			  bool overwrite)
 {
-	struct kwork_atom *atom, *dst_atom;
+	struct kwork_atom *atom, *dst_atom, *last_atom;
 	struct kwork_work *work, key;
 
 	BUG_ON(class->work_init == NULL);
@@ -427,6 +428,17 @@ static int work_push_atom(struct perf_kwork *kwork,
 	if (ret_work != NULL)
 		*ret_work = work;
 
+	if (overwrite) {
+		last_atom = list_last_entry_or_null(&work->atom_list[src_type],
+						    struct kwork_atom, list);
+		if (last_atom) {
+			atom_del(last_atom);
+
+			kwork->nr_skipped_events[src_type]++;
+			kwork->nr_skipped_events[KWORK_TRACE_MAX]++;
+		}
+	}
+
 	list_add_tail(&atom->list, &work->atom_list[src_type]);
 
 	return 0;
@@ -502,7 +514,7 @@ static int report_entry_event(struct perf_kwork *kwork,
 {
 	return work_push_atom(kwork, class, KWORK_TRACE_ENTRY,
 			      KWORK_TRACE_MAX, evsel, sample,
-			      machine, NULL);
+			      machine, NULL, true);
 }
 
 static int report_exit_event(struct perf_kwork *kwork,
@@ -557,7 +569,7 @@ static int latency_raise_event(struct perf_kwork *kwork,
 {
 	return work_push_atom(kwork, class, KWORK_TRACE_RAISE,
 			      KWORK_TRACE_MAX, evsel, sample,
-			      machine, NULL);
+			      machine, NULL, true);
 }
 
 static int latency_entry_event(struct perf_kwork *kwork,
@@ -716,7 +728,7 @@ static int timehist_raise_event(struct perf_kwork *kwork,
 {
 	return work_push_atom(kwork, class, KWORK_TRACE_RAISE,
 			      KWORK_TRACE_MAX, evsel, sample,
-			      machine, NULL);
+			      machine, NULL, true);
 }
 
 static int timehist_entry_event(struct perf_kwork *kwork,
@@ -730,7 +742,7 @@ static int timehist_entry_event(struct perf_kwork *kwork,
 
 	ret = work_push_atom(kwork, class, KWORK_TRACE_ENTRY,
 			     KWORK_TRACE_RAISE, evsel, sample,
-			     machine, &work);
+			     machine, &work, true);
 	if (ret)
 		return ret;
 
-- 
2.30.GIT


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [RFC v1 06/16] perf kwork: Set default events list if not specified in setup_event_list()
  2023-08-12  8:49 [RFC v1 00/16] perf kwork: Implement perf kwork top Yang Jihong
                   ` (4 preceding siblings ...)
  2023-08-12  8:49 ` [RFC v1 05/16] perf kwork: Overwrite original atom in the list when a new atom is pushed Yang Jihong
@ 2023-08-12  8:49 ` Yang Jihong
  2023-08-12  8:49 ` [RFC v1 07/16] perf kwork: Add sched record support Yang Jihong
                   ` (10 subsequent siblings)
  16 siblings, 0 replies; 26+ messages in thread
From: Yang Jihong @ 2023-08-12  8:49 UTC (permalink / raw)
  To: peterz, mingo, acme, mark.rutland, alexander.shishkin, jolsa,
	namhyung, irogers, adrian.hunter, kan.liang, sandipan.das,
	ravi.bangoria, linux-kernel, linux-perf-users
  Cc: yangjihong1

Currently, when no kwork event is specified, all events are configured by
default. Instead, set a default event list string, which is more flexible
and supports subsequent extension.
Also move setup_event_list() into each subcommand so that each can apply
its own settings.

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
---
 tools/perf/builtin-kwork.c | 25 ++++++++++---------------
 1 file changed, 10 insertions(+), 15 deletions(-)

diff --git a/tools/perf/builtin-kwork.c b/tools/perf/builtin-kwork.c
index f620911a26a2..e859c34b23f3 100644
--- a/tools/perf/builtin-kwork.c
+++ b/tools/perf/builtin-kwork.c
@@ -1623,8 +1623,11 @@ static void setup_event_list(struct perf_kwork *kwork,
 	struct kwork_class *class;
 	char *tmp, *tok, *str;
 
+	/*
+	 * set default events list if not specified
+	 */
 	if (kwork->event_list_str == NULL)
-		goto null_event_list_str;
+		kwork->event_list_str = "irq, softirq, workqueue";
 
 	str = strdup(kwork->event_list_str);
 	for (tok = strtok_r(str, ", ", &tmp);
@@ -1643,17 +1646,6 @@ static void setup_event_list(struct perf_kwork *kwork,
 	}
 	free(str);
 
-null_event_list_str:
-	/*
-	 * config all kwork events if not specified
-	 */
-	if (list_empty(&kwork->class_list)) {
-		for (i = 0; i < KWORK_CLASS_MAX; i++) {
-			list_add_tail(&kwork_class_supported_list[i]->list,
-				      &kwork->class_list);
-		}
-	}
-
 	pr_debug("Config event list:");
 	list_for_each_entry(class, &kwork->class_list, list)
 		pr_debug(" %s", class->name);
@@ -1835,12 +1827,12 @@ int cmd_kwork(int argc, const char **argv)
 	if (!argc)
 		usage_with_options(kwork_usage, kwork_options);
 
-	setup_event_list(&kwork, kwork_options, kwork_usage);
 	sort_dimension__add(&kwork, "id", &kwork.cmp_id);
 
-	if (strlen(argv[0]) > 2 && strstarts("record", argv[0]))
+	if (strlen(argv[0]) > 2 && strstarts("record", argv[0])) {
+		setup_event_list(&kwork, kwork_options, kwork_usage);
 		return perf_kwork__record(&kwork, argc, argv);
-	else if (strlen(argv[0]) > 2 && strstarts("report", argv[0])) {
+	} else if (strlen(argv[0]) > 2 && strstarts("report", argv[0])) {
 		kwork.sort_order = default_report_sort_order;
 		if (argc > 1) {
 			argc = parse_options(argc, argv, report_options, report_usage, 0);
@@ -1849,6 +1841,7 @@ int cmd_kwork(int argc, const char **argv)
 		}
 		kwork.report = KWORK_REPORT_RUNTIME;
 		setup_sorting(&kwork, report_options, report_usage);
+		setup_event_list(&kwork, kwork_options, kwork_usage);
 		return perf_kwork__report(&kwork);
 	} else if (strlen(argv[0]) > 2 && strstarts("latency", argv[0])) {
 		kwork.sort_order = default_latency_sort_order;
@@ -1859,6 +1852,7 @@ int cmd_kwork(int argc, const char **argv)
 		}
 		kwork.report = KWORK_REPORT_LATENCY;
 		setup_sorting(&kwork, latency_options, latency_usage);
+		setup_event_list(&kwork, kwork_options, kwork_usage);
 		return perf_kwork__report(&kwork);
 	} else if (strlen(argv[0]) > 2 && strstarts("timehist", argv[0])) {
 		if (argc > 1) {
@@ -1867,6 +1861,7 @@ int cmd_kwork(int argc, const char **argv)
 				usage_with_options(timehist_usage, timehist_options);
 		}
 		kwork.report = KWORK_REPORT_TIMEHIST;
+		setup_event_list(&kwork, kwork_options, kwork_usage);
 		return perf_kwork__timehist(&kwork);
 	} else
 		usage_with_options(kwork_usage, kwork_options);
-- 
2.30.GIT


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [RFC v1 07/16] perf kwork: Add sched record support
  2023-08-12  8:49 [RFC v1 00/16] perf kwork: Implement perf kwork top Yang Jihong
                   ` (5 preceding siblings ...)
  2023-08-12  8:49 ` [RFC v1 06/16] perf kwork: Set default events list if not specified in setup_event_list() Yang Jihong
@ 2023-08-12  8:49 ` Yang Jihong
  2023-08-12  8:49 ` [RFC v1 08/16] perf kwork: Add `root` parameter to work_sort() Yang Jihong
                   ` (9 subsequent siblings)
  16 siblings, 0 replies; 26+ messages in thread
From: Yang Jihong @ 2023-08-12  8:49 UTC (permalink / raw)
  To: peterz, mingo, acme, mark.rutland, alexander.shishkin, jolsa,
	namhyung, irogers, adrian.hunter, kan.liang, sandipan.das,
	ravi.bangoria, linux-kernel, linux-perf-users
  Cc: yangjihong1

Add a sched kwork_class to support recording and parsing of sched_switch
events.

As follows:

  # perf kwork -h

   Usage: perf kwork [<options>] {record|report|latency|timehist}

      -D, --dump-raw-trace  dump raw trace in ASCII
      -f, --force           don't complain, do it
      -k, --kwork <kwork>   list of kwork to profile (irq, softirq, workqueue, sched, etc)
      -v, --verbose         be more verbose (show symbol address, etc)

  # perf kwork -k sched record true
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.083 MB perf.data (47 samples) ]
  # perf evlist
  sched:sched_switch
  dummy:HG
  # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
---
 tools/perf/Documentation/perf-kwork.txt |  2 +-
 tools/perf/builtin-kwork.c              | 69 ++++++++++++++++++++++++-
 tools/perf/util/kwork.h                 |  5 ++
 3 files changed, 74 insertions(+), 2 deletions(-)

diff --git a/tools/perf/Documentation/perf-kwork.txt b/tools/perf/Documentation/perf-kwork.txt
index 482d6c52e2ed..2092ab916ea9 100644
--- a/tools/perf/Documentation/perf-kwork.txt
+++ b/tools/perf/Documentation/perf-kwork.txt
@@ -66,7 +66,7 @@ OPTIONS
 
 -k::
 --kwork::
-	List of kwork to profile (irq, softirq, workqueue, etc)
+	List of kwork to profile (irq, softirq, workqueue, sched, etc)
 
 -v::
 --verbose::
diff --git a/tools/perf/builtin-kwork.c b/tools/perf/builtin-kwork.c
index e859c34b23f3..4e2b9103b9e0 100644
--- a/tools/perf/builtin-kwork.c
+++ b/tools/perf/builtin-kwork.c
@@ -1084,10 +1084,77 @@ static struct kwork_class kwork_workqueue = {
 	.work_name      = workqueue_work_name,
 };
 
+static struct kwork_class kwork_sched;
+static int process_sched_switch_event(struct perf_tool *tool,
+				      struct evsel *evsel,
+				      struct perf_sample *sample,
+				      struct machine *machine)
+{
+	struct perf_kwork *kwork = container_of(tool, struct perf_kwork, tool);
+
+	if (kwork->tp_handler->sched_switch_event)
+		return kwork->tp_handler->sched_switch_event(kwork, &kwork_sched,
+							     evsel, sample, machine);
+	return 0;
+}
+
+const struct evsel_str_handler sched_tp_handlers[] = {
+	{ "sched:sched_switch",  process_sched_switch_event, },
+};
+
+static int sched_class_init(struct kwork_class *class,
+			    struct perf_session *session)
+{
+	if (perf_session__set_tracepoints_handlers(session,
+						   sched_tp_handlers)) {
+		pr_err("Failed to set sched tracepoints handlers\n");
+		return -1;
+	}
+
+	class->work_root = RB_ROOT_CACHED;
+	return 0;
+}
+
+static void sched_work_init(struct perf_kwork *kwork __maybe_unused,
+			    struct kwork_class *class,
+			    struct kwork_work *work,
+			    enum kwork_trace_type src_type,
+			    struct evsel *evsel,
+			    struct perf_sample *sample,
+			    struct machine *machine __maybe_unused)
+{
+	work->class = class;
+	work->cpu = sample->cpu;
+
+	if (src_type == KWORK_TRACE_EXIT) {
+		work->id = evsel__intval(evsel, sample, "prev_pid");
+		work->name = strdup(evsel__strval(evsel, sample, "prev_comm"));
+	} else if (src_type == KWORK_TRACE_ENTRY) {
+		work->id = evsel__intval(evsel, sample, "next_pid");
+		work->name = strdup(evsel__strval(evsel, sample, "next_comm"));
+	}
+}
+
+static void sched_work_name(struct kwork_work *work, char *buf, int len)
+{
+	snprintf(buf, len, "%s", work->name);
+}
+
+static struct kwork_class kwork_sched = {
+	.name		= "sched",
+	.type		= KWORK_CLASS_SCHED,
+	.nr_tracepoints	= ARRAY_SIZE(sched_tp_handlers),
+	.tp_handlers	= sched_tp_handlers,
+	.class_init	= sched_class_init,
+	.work_init	= sched_work_init,
+	.work_name	= sched_work_name,
+};
+
 static struct kwork_class *kwork_class_supported_list[KWORK_CLASS_MAX] = {
 	[KWORK_CLASS_IRQ]       = &kwork_irq,
 	[KWORK_CLASS_SOFTIRQ]   = &kwork_softirq,
 	[KWORK_CLASS_WORKQUEUE] = &kwork_workqueue,
+	[KWORK_CLASS_SCHED]     = &kwork_sched,
 };
 
 static void print_separator(int len)
@@ -1740,7 +1807,7 @@ int cmd_kwork(int argc, const char **argv)
 	OPT_BOOLEAN('D', "dump-raw-trace", &dump_trace,
 		    "dump raw trace in ASCII"),
 	OPT_STRING('k', "kwork", &kwork.event_list_str, "kwork",
-		   "list of kwork to profile (irq, softirq, workqueue, etc)"),
+		   "list of kwork to profile (irq, softirq, workqueue, sched, etc)"),
 	OPT_BOOLEAN('f', "force", &kwork.force, "don't complain, do it"),
 	OPT_END()
 	};
diff --git a/tools/perf/util/kwork.h b/tools/perf/util/kwork.h
index 736c7a08fb19..f8e9cdd1371a 100644
--- a/tools/perf/util/kwork.h
+++ b/tools/perf/util/kwork.h
@@ -16,6 +16,7 @@ enum kwork_class_type {
 	KWORK_CLASS_IRQ,
 	KWORK_CLASS_SOFTIRQ,
 	KWORK_CLASS_WORKQUEUE,
+	KWORK_CLASS_SCHED,
 	KWORK_CLASS_MAX,
 };
 
@@ -167,6 +168,10 @@ struct trace_kwork_handler {
 	int (*exit_event)(struct perf_kwork *kwork,
 			  struct kwork_class *class, struct evsel *evsel,
 			  struct perf_sample *sample, struct machine *machine);
+
+	int (*sched_switch_event)(struct perf_kwork *kwork,
+				  struct kwork_class *class, struct evsel *evsel,
+				  struct perf_sample *sample, struct machine *machine);
 };
 
 struct perf_kwork {
-- 
2.30.GIT


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [RFC v1 08/16] perf kwork: Add `root` parameter to work_sort()
  2023-08-12  8:49 [RFC v1 00/16] perf kwork: Implement perf kwork top Yang Jihong
                   ` (6 preceding siblings ...)
  2023-08-12  8:49 ` [RFC v1 07/16] perf kwork: Add sched record support Yang Jihong
@ 2023-08-12  8:49 ` Yang Jihong
  2023-08-12  8:49 ` [RFC v1 09/16] perf kwork: Implement perf kwork top Yang Jihong
                   ` (8 subsequent siblings)
  16 siblings, 0 replies; 26+ messages in thread
From: Yang Jihong @ 2023-08-12  8:49 UTC (permalink / raw)
  To: peterz, mingo, acme, mark.rutland, alexander.shishkin, jolsa,
	namhyung, irogers, adrian.hunter, kan.liang, sandipan.das,
	ravi.bangoria, linux-kernel, linux-perf-users
  Cc: yangjihong1

Add a `struct rb_root_cached *root` parameter to work_sort() to sort the
specified rb tree elements.

No functional change.

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
---
 tools/perf/builtin-kwork.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/perf/builtin-kwork.c b/tools/perf/builtin-kwork.c
index 4e2b9103b9e0..7d93951c2ed2 100644
--- a/tools/perf/builtin-kwork.c
+++ b/tools/perf/builtin-kwork.c
@@ -1378,11 +1378,11 @@ static void print_bad_events(struct perf_kwork *kwork)
 	}
 }
 
-static void work_sort(struct perf_kwork *kwork, struct kwork_class *class)
+static void work_sort(struct perf_kwork *kwork,
+		      struct kwork_class *class, struct rb_root_cached *root)
 {
 	struct rb_node *node;
 	struct kwork_work *data;
-	struct rb_root_cached *root = &class->work_root;
 
 	pr_debug("Sorting %s ...\n", class->name);
 	for (;;) {
@@ -1402,7 +1402,7 @@ static void perf_kwork__sort(struct perf_kwork *kwork)
 	struct kwork_class *class;
 
 	list_for_each_entry(class, &kwork->class_list, list)
-		work_sort(kwork, class);
+		work_sort(kwork, class, &class->work_root);
 }
 
 static int perf_kwork__check_config(struct perf_kwork *kwork,
-- 
2.30.GIT


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [RFC v1 09/16] perf kwork: Implement perf kwork top
  2023-08-12  8:49 [RFC v1 00/16] perf kwork: Implement perf kwork top Yang Jihong
                   ` (7 preceding siblings ...)
  2023-08-12  8:49 ` [RFC v1 08/16] perf kwork: Add `root` parameter to work_sort() Yang Jihong
@ 2023-08-12  8:49 ` Yang Jihong
  2023-08-12  8:49 ` [RFC v1 10/16] perf evsel: Add evsel__intval_common() helper Yang Jihong
                   ` (7 subsequent siblings)
  16 siblings, 0 replies; 26+ messages in thread
From: Yang Jihong @ 2023-08-12  8:49 UTC (permalink / raw)
  To: peterz, mingo, acme, mark.rutland, alexander.shishkin, jolsa,
	namhyung, irogers, adrian.hunter, kan.liang, sandipan.das,
	ravi.bangoria, linux-kernel, linux-perf-users
  Cc: yangjihong1

Some common tools for collecting statistics on CPU usage, such as top,
obtain statistics from timer interrupt sampling, and then periodically
read statistics from /proc/stat.

This method has some inaccuracies:
1. In the tick interrupt, the time between the last tick and the current
   tick is counted in the current task. However, the task may be running
   only part of the time.
2. For each task, the top tool periodically reads the /proc/{PID}/status
   information, so tasks with a short life cycle may be missed entirely.

In conclusion, the top tool cannot accurately collect statistics on the
CPU usage and running time of tasks.

A statistical method based on the sched_switch tracepoint can accurately
calculate the CPU usage of all tasks. This method is suitable for
scenarios that require high-precision performance comparison data.

Example usage:

  # perf kwork

   Usage: perf kwork [<options>] {record|report|latency|timehist|top}

      -D, --dump-raw-trace  dump raw trace in ASCII
      -f, --force           don't complain, do it
      -k, --kwork <kwork>   list of kwork to profile (irq, softirq, workqueue, sched, etc)
      -v, --verbose         be more verbose (show symbol address, etc)

  # perf kwork -k sched record -- perf bench sched messaging -g 1 -l 10000
  # Running 'sched/messaging' benchmark:
  # 20 sender and receiver processes per group
  # 1 groups == 40 processes run

       Total time: 14.074 [sec]
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 15.886 MB perf.data (129472 samples) ]
  # perf kwork top

  Total  : 115708.178 ms, 8 cpus
  %Cpu(s):   9.78% id
  %Cpu0   [|||||||||||||||||||||||||||     90.55%]
  %Cpu1   [|||||||||||||||||||||||||||     90.51%]
  %Cpu2   [||||||||||||||||||||||||||      88.57%]
  %Cpu3   [|||||||||||||||||||||||||||     91.18%]
  %Cpu4   [|||||||||||||||||||||||||||     91.09%]
  %Cpu5   [|||||||||||||||||||||||||||     90.88%]
  %Cpu6   [||||||||||||||||||||||||||      88.64%]
  %Cpu7   [|||||||||||||||||||||||||||     90.28%]

        PID    %CPU           RUNTIME  COMMMAND
    ----------------------------------------------------
       4113   22.23       3221.547 ms  sched-messaging
       4105   21.61       3131.495 ms  sched-messaging
       4119   21.53       3120.937 ms  sched-messaging
       4103   21.39       3101.614 ms  sched-messaging
       4106   21.37       3095.209 ms  sched-messaging
       4104   21.25       3077.269 ms  sched-messaging
       4115   21.21       3073.188 ms  sched-messaging
       4109   21.18       3069.022 ms  sched-messaging
       4111   20.78       3010.033 ms  sched-messaging
       4114   20.74       3007.073 ms  sched-messaging
       4108   20.73       3002.137 ms  sched-messaging
       4107   20.47       2967.292 ms  sched-messaging
       4117   20.39       2955.335 ms  sched-messaging
       4112   20.34       2947.080 ms  sched-messaging
       4118   20.32       2942.519 ms  sched-messaging
       4121   20.23       2929.865 ms  sched-messaging
       4110   20.22       2930.078 ms  sched-messaging
       4122   20.15       2919.542 ms  sched-messaging
       4120   19.77       2866.032 ms  sched-messaging
       4116   19.72       2857.660 ms  sched-messaging
       4127   16.19       2346.334 ms  sched-messaging
       4142   15.86       2297.600 ms  sched-messaging
       4141   15.62       2262.646 ms  sched-messaging
       4136   15.41       2231.408 ms  sched-messaging
       4130   15.38       2227.008 ms  sched-messaging
       4129   15.31       2217.692 ms  sched-messaging
       4126   15.21       2201.711 ms  sched-messaging
       4139   15.19       2200.722 ms  sched-messaging
       4137   15.10       2188.633 ms  sched-messaging
       4134   15.06       2182.082 ms  sched-messaging
       4132   15.02       2177.530 ms  sched-messaging
       4131   14.73       2131.973 ms  sched-messaging
       4125   14.68       2125.439 ms  sched-messaging
       4128   14.66       2122.255 ms  sched-messaging
       4123   14.65       2122.113 ms  sched-messaging
       4135   14.56       2107.144 ms  sched-messaging
       4133   14.51       2103.549 ms  sched-messaging
       4124   14.27       2066.671 ms  sched-messaging
       4140   14.17       2052.251 ms  sched-messaging
       4138   13.81       2000.361 ms  sched-messaging
          0   11.42       1652.009 ms  swapper/2
          0   11.35       1641.694 ms  swapper/6
          0    9.71       1405.108 ms  swapper/7
          0    9.48       1372.338 ms  swapper/1
          0    9.44       1366.013 ms  swapper/0
          0    9.11       1318.382 ms  swapper/5
          0    8.90       1287.582 ms  swapper/4
          0    8.81       1274.356 ms  swapper/3
       4100    2.61        379.328 ms  perf
       4101    1.16        169.487 ms  perf-exec
        151    0.65         94.741 ms  systemd-resolve
        249    0.36         53.030 ms  sd-resolve
        153    0.14         21.405 ms  systemd-timesyn
          1    0.10         16.200 ms  systemd
         16    0.09         15.785 ms  rcu_preempt
       4102    0.06          9.727 ms  perf
       4095    0.03          5.464 ms  kworker/7:1
         98    0.02          3.231 ms  jbd2/sda-8
        353    0.02          4.115 ms  sshd
         75    0.02          3.889 ms  kworker/2:1
         73    0.01          1.552 ms  kworker/5:1
         64    0.01          1.591 ms  kworker/4:1
         74    0.01          1.952 ms  kworker/3:1
         61    0.01          2.608 ms  kcompactd0
        397    0.01          1.602 ms  kworker/1:1
         69    0.01          1.817 ms  kworker/1:1H
         10    0.01          2.553 ms  kworker/u16:0
       2909    0.01          2.684 ms  kworker/0:2
       1211    0.00          0.426 ms  kworker/7:0
         97    0.00          0.153 ms  kworker/7:1H
         51    0.00          0.100 ms  ksoftirqd/7
        120    0.00          0.856 ms  systemd-journal
         76    0.00          1.414 ms  kworker/6:1
         46    0.00          0.246 ms  ksoftirqd/6
         45    0.00          0.164 ms  migration/6
         41    0.00          0.098 ms  ksoftirqd/5
         40    0.00          0.207 ms  migration/5
         86    0.00          1.339 ms  kworker/4:1H
         36    0.00          0.252 ms  ksoftirqd/4
         35    0.00          0.090 ms  migration/4
         31    0.00          0.156 ms  ksoftirqd/3
         30    0.00          0.073 ms  migration/3
         26    0.00          0.180 ms  ksoftirqd/2
         25    0.00          0.085 ms  migration/2
         21    0.00          0.106 ms  ksoftirqd/1
         20    0.00          0.118 ms  migration/1
        302    0.00          1.440 ms  systemd-logind
         17    0.00          0.132 ms  migration/0
         15    0.00          0.255 ms  ksoftirqd/0

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
---
 tools/perf/Documentation/perf-kwork.txt |   5 +-
 tools/perf/builtin-kwork.c              | 387 +++++++++++++++++++++++-
 tools/perf/util/kwork.h                 |  22 ++
 3 files changed, 412 insertions(+), 2 deletions(-)

diff --git a/tools/perf/Documentation/perf-kwork.txt b/tools/perf/Documentation/perf-kwork.txt
index 2092ab916ea9..0601fcb0feea 100644
--- a/tools/perf/Documentation/perf-kwork.txt
+++ b/tools/perf/Documentation/perf-kwork.txt
@@ -8,7 +8,7 @@ perf-kwork - Tool to trace/measure kernel work properties (latencies)
 SYNOPSIS
 --------
 [verse]
-'perf kwork' {record|report|latency|timehist}
+'perf kwork' {record|report|latency|timehist|top}
 
 DESCRIPTION
 -----------
@@ -23,6 +23,8 @@ There are several variants of 'perf kwork':
 
   'perf kwork timehist' provides an analysis of kernel work events.
 
+  'perf kwork top' to report the task cpu usage.
+
     Example usage:
         perf kwork record -- sleep 1
         perf kwork report
@@ -30,6 +32,7 @@ There are several variants of 'perf kwork':
         perf kwork latency
         perf kwork latency -b
         perf kwork timehist
+        perf kwork top
 
    By default it shows the individual work events such as irq, workqeueu,
    including the run time and delay (time between raise and actually entry):
diff --git a/tools/perf/builtin-kwork.c b/tools/perf/builtin-kwork.c
index 7d93951c2ed2..1b213af59135 100644
--- a/tools/perf/builtin-kwork.c
+++ b/tools/perf/builtin-kwork.c
@@ -45,6 +45,11 @@
 #define PRINT_BRACKETPAIR_WIDTH 2
 #define PRINT_TIME_UNIT_SEC_WIDTH 2
 #define PRINT_TIME_UNIT_MESC_WIDTH 3
+#define PRINT_PID_WIDTH 7
+#define PRINT_TASK_NAME_WIDTH 16
+#define PRINT_CPU_USAGE_WIDTH 6
+#define PRINT_CPU_USAGE_DECIMAL_WIDTH 2
+#define PRINT_CPU_USAGE_HIST_WIDTH 30
 #define PRINT_RUNTIME_HEADER_WIDTH (PRINT_RUNTIME_WIDTH + PRINT_TIME_UNIT_MESC_WIDTH)
 #define PRINT_LATENCY_HEADER_WIDTH (PRINT_LATENCY_WIDTH + PRINT_TIME_UNIT_MESC_WIDTH)
 #define PRINT_TIMEHIST_CPU_WIDTH (PRINT_CPU_WIDTH + PRINT_BRACKETPAIR_WIDTH)
@@ -131,6 +136,16 @@ static int max_latency_cmp(struct kwork_work *l, struct kwork_work *r)
 	return 0;
 }
 
+static int cpu_usage_cmp(struct kwork_work *l, struct kwork_work *r)
+{
+	if (l->cpu_usage > r->cpu_usage)
+		return 1;
+	if (l->cpu_usage < r->cpu_usage)
+		return -1;
+
+	return 0;
+}
+
 static int sort_dimension__add(struct perf_kwork *kwork __maybe_unused,
 			       const char *tok, struct list_head *list)
 {
@@ -155,12 +170,17 @@ static int sort_dimension__add(struct perf_kwork *kwork __maybe_unused,
 		.name = "avg",
 		.cmp  = avg_latency_cmp,
 	};
+	static struct sort_dimension rate_sort_dimension = {
+		.name = "rate",
+		.cmp  = cpu_usage_cmp,
+	};
 	struct sort_dimension *available_sorts[] = {
 		&id_sort_dimension,
 		&max_sort_dimension,
 		&count_sort_dimension,
 		&runtime_sort_dimension,
 		&avg_sort_dimension,
+		&rate_sort_dimension,
 	};
 
 	if (kwork->report == KWORK_REPORT_LATENCY)
@@ -485,6 +505,38 @@ static struct kwork_atom *work_pop_atom(struct perf_kwork *kwork,
 	return NULL;
 }
 
+static struct kwork_work *find_work_by_id(struct rb_root_cached *root,
+					  u64 id, int cpu)
+{
+	struct rb_node *next;
+	struct kwork_work *work;
+
+	next = rb_first_cached(root);
+	while (next) {
+		work = rb_entry(next, struct kwork_work, node);
+		if ((cpu != -1 && work->id == id && work->cpu == cpu) ||
+		    (cpu == -1 && work->id == id))
+			return work;
+
+		next = rb_next(next);
+	}
+
+	return NULL;
+}
+
+static struct kwork_class *get_kwork_class(struct perf_kwork *kwork,
+					   enum kwork_class_type type)
+{
+	struct kwork_class *class;
+
+	list_for_each_entry(class, &kwork->class_list, list) {
+		if (class->type == type)
+			return class;
+	}
+
+	return NULL;
+}
+
 static void report_update_exit_event(struct kwork_work *work,
 				     struct kwork_atom *atom,
 				     struct perf_sample *sample)
@@ -789,6 +841,54 @@ static int timehist_exit_event(struct perf_kwork *kwork,
 	return ret;
 }
 
+static void top_update_runtime(struct kwork_work *work,
+			       struct kwork_atom *atom,
+			       struct perf_sample *sample)
+{
+	u64 delta;
+	u64 exit_time = sample->time;
+	u64 entry_time = atom->time;
+
+	if ((entry_time != 0) && (exit_time >= entry_time)) {
+		delta = exit_time - entry_time;
+		work->total_runtime += delta;
+	}
+}
+
+static int top_entry_event(struct perf_kwork *kwork,
+			   struct kwork_class *class,
+			   struct evsel *evsel,
+			   struct perf_sample *sample,
+			   struct machine *machine)
+{
+	return work_push_atom(kwork, class, KWORK_TRACE_ENTRY,
+			      KWORK_TRACE_MAX, evsel, sample,
+			      machine, NULL, true);
+}
+
+static int top_sched_switch_event(struct perf_kwork *kwork,
+				  struct kwork_class *class,
+				  struct evsel *evsel,
+				  struct perf_sample *sample,
+				  struct machine *machine)
+{
+	struct kwork_atom *atom;
+	struct kwork_work *work;
+
+	atom = work_pop_atom(kwork, class, KWORK_TRACE_EXIT,
+			     KWORK_TRACE_ENTRY, evsel, sample,
+			     machine, &work);
+	if (!work)
+		return -1;
+
+	if (atom) {
+		top_update_runtime(work, atom, sample);
+		atom_del(atom);
+	}
+
+	return top_entry_event(kwork, class, evsel, sample, machine);
+}
+
 static struct kwork_class kwork_irq;
 static int process_irq_handler_entry_event(struct perf_tool *tool,
 					   struct evsel *evsel,
@@ -1378,6 +1478,101 @@ static void print_bad_events(struct perf_kwork *kwork)
 	}
 }
 
+const char *graph_load = "||||||||||||||||||||||||||||||||||||||||||||||||";
+const char *graph_idle = "                                                ";
+static void top_print_per_cpu_load(struct perf_kwork *kwork)
+{
+	int i, load_width;
+	u64 total, load, load_ratio;
+	struct kwork_top_stat *stat = &kwork->top_stat;
+
+	for (i = 0; i < MAX_NR_CPUS; i++) {
+		total = stat->cpus_runtime[i].total;
+		load = stat->cpus_runtime[i].load;
+		if (test_bit(i, stat->all_cpus_bitmap) && total) {
+			load_ratio = load * 10000 / total;
+			load_width = PRINT_CPU_USAGE_HIST_WIDTH *
+				load_ratio / 10000;
+
+			printf("%%Cpu%-*d[%.*s%.*s %*.*f%%]\n",
+			       PRINT_CPU_WIDTH, i,
+			       load_width, graph_load,
+			       PRINT_CPU_USAGE_HIST_WIDTH - load_width,
+			       graph_idle,
+			       PRINT_CPU_USAGE_WIDTH,
+			       PRINT_CPU_USAGE_DECIMAL_WIDTH,
+			       (double)load_ratio / 100);
+		}
+	}
+}
+
+static void top_print_cpu_usage(struct perf_kwork *kwork)
+{
+	struct kwork_top_stat *stat = &kwork->top_stat;
+	u64 idle_time = stat->cpus_runtime[MAX_NR_CPUS].idle;
+	int cpus_nr = bitmap_weight(stat->all_cpus_bitmap, MAX_NR_CPUS);
+	u64 cpus_total_time = stat->cpus_runtime[MAX_NR_CPUS].total;
+
+	printf("Total  : %*.*f ms, %d cpus\n",
+	       PRINT_RUNTIME_WIDTH, RPINT_DECIMAL_WIDTH,
+	       (double)cpus_total_time / NSEC_PER_MSEC,
+	       cpus_nr);
+
+	printf("%%Cpu(s): %*.*f%% id\n",
+	       PRINT_CPU_USAGE_WIDTH, PRINT_CPU_USAGE_DECIMAL_WIDTH,
+	       cpus_total_time ? (double)idle_time * 100 / cpus_total_time : 0);
+
+	top_print_per_cpu_load(kwork);
+}
+
+static void top_print_header(struct perf_kwork *kwork __maybe_unused)
+{
+	int ret;
+
+	printf("\n ");
+	ret = printf(" %*s  %*s  %*s  %-*s",
+		     PRINT_PID_WIDTH, "PID",
+		     PRINT_CPU_USAGE_WIDTH, "%CPU",
+		     PRINT_RUNTIME_HEADER_WIDTH + RPINT_DECIMAL_WIDTH, "RUNTIME",
+		     PRINT_TASK_NAME_WIDTH, "COMMMAND");
+	printf("\n ");
+	print_separator(ret);
+}
+
+static int top_print_work(struct perf_kwork *kwork __maybe_unused, struct kwork_work *work)
+{
+	int ret = 0;
+
+	printf(" ");
+
+	/*
+	 * pid
+	 */
+	ret += printf(" %*ld ", PRINT_PID_WIDTH, work->id);
+
+	/*
+	 * cpu usage
+	 */
+	ret += printf(" %*.*f ",
+		      PRINT_CPU_USAGE_WIDTH, PRINT_CPU_USAGE_DECIMAL_WIDTH,
+		      (double)work->cpu_usage / 100);
+
+	/*
+	 * total runtime
+	 */
+	ret += printf(" %*.*f ms ",
+		      PRINT_RUNTIME_WIDTH + RPINT_DECIMAL_WIDTH, RPINT_DECIMAL_WIDTH,
+		      (double)work->total_runtime / NSEC_PER_MSEC);
+
+	/*
+	 * command
+	 */
+	ret += printf(" %-*s", PRINT_TASK_NAME_WIDTH, work->name);
+
+	printf("\n");
+	return ret;
+}
+
 static void work_sort(struct perf_kwork *kwork,
 		      struct kwork_class *class, struct rb_root_cached *root)
 {
@@ -1425,6 +1620,9 @@ static int perf_kwork__check_config(struct perf_kwork *kwork,
 		.entry_event = timehist_entry_event,
 		.exit_event  = timehist_exit_event,
 	};
+	static struct trace_kwork_handler top_ops = {
+		.sched_switch_event = top_sched_switch_event,
+	};
 
 	switch (kwork->report) {
 	case KWORK_REPORT_RUNTIME:
@@ -1436,6 +1634,9 @@ static int perf_kwork__check_config(struct perf_kwork *kwork,
 	case KWORK_REPORT_TIMEHIST:
 		kwork->tp_handler = &timehist_ops;
 		break;
+	case KWORK_REPORT_TOP:
+		kwork->tp_handler = &top_ops;
+		break;
 	default:
 		pr_debug("Invalid report type %d\n", kwork->report);
 		return -1;
@@ -1682,6 +1883,169 @@ static int perf_kwork__timehist(struct perf_kwork *kwork)
 	return perf_kwork__read_events(kwork);
 }
 
+static void top_calc_total_runtime(struct perf_kwork *kwork)
+{
+	struct kwork_class *class;
+	struct kwork_work *work;
+	struct rb_node *next;
+	struct kwork_top_stat *stat = &kwork->top_stat;
+
+	class = get_kwork_class(kwork, KWORK_CLASS_SCHED);
+	if (!class)
+		return;
+
+	next = rb_first_cached(&class->work_root);
+	while (next) {
+		work = rb_entry(next, struct kwork_work, node);
+		BUG_ON(work->cpu >= MAX_NR_CPUS);
+		stat->cpus_runtime[work->cpu].total += work->total_runtime;
+		stat->cpus_runtime[MAX_NR_CPUS].total += work->total_runtime;
+		next = rb_next(next);
+	}
+}
+
+static void top_calc_idle_time(struct perf_kwork *kwork,
+				struct kwork_work *work)
+{
+	struct kwork_top_stat *stat = &kwork->top_stat;
+
+	if (work->id == 0) {
+		stat->cpus_runtime[work->cpu].idle += work->total_runtime;
+		stat->cpus_runtime[MAX_NR_CPUS].idle += work->total_runtime;
+	}
+}
+
+static void top_calc_cpu_usage(struct perf_kwork *kwork)
+{
+	struct kwork_class *class;
+	struct kwork_work *work;
+	struct rb_node *next;
+	struct kwork_top_stat *stat = &kwork->top_stat;
+
+	class = get_kwork_class(kwork, KWORK_CLASS_SCHED);
+	if (!class)
+		return;
+
+	next = rb_first_cached(&class->work_root);
+	while (next) {
+		work = rb_entry(next, struct kwork_work, node);
+
+		if (work->total_runtime == 0)
+			goto next;
+
+		__set_bit(work->cpu, stat->all_cpus_bitmap);
+
+		work->cpu_usage = work->total_runtime * 10000 /
+			stat->cpus_runtime[work->cpu].total;
+
+		top_calc_idle_time(kwork, work);
+next:
+		next = rb_next(next);
+	}
+}
+
+static void top_calc_load_runtime(struct perf_kwork *kwork,
+				  struct kwork_work *work)
+{
+	struct kwork_top_stat *stat = &kwork->top_stat;
+
+	if (work->id != 0) {
+		stat->cpus_runtime[work->cpu].load += work->total_runtime;
+		stat->cpus_runtime[MAX_NR_CPUS].load += work->total_runtime;
+	}
+}
+
+static void top_merge_tasks(struct perf_kwork *kwork)
+{
+	struct kwork_work *merged_work, *data;
+	struct kwork_class *class;
+	struct rb_node *node;
+	int cpu;
+	struct rb_root_cached merged_root = RB_ROOT_CACHED;
+
+	class = get_kwork_class(kwork, KWORK_CLASS_SCHED);
+	if (!class)
+		return;
+
+	for (;;) {
+		node = rb_first_cached(&class->work_root);
+		if (!node)
+			break;
+
+		rb_erase_cached(node, &class->work_root);
+		data = rb_entry(node, struct kwork_work, node);
+
+		cpu = data->cpu;
+		merged_work = find_work_by_id(&merged_root, data->id,
+					      data->id == 0 ? cpu : -1);
+		if (!merged_work) {
+			work_insert(&merged_root, data, &kwork->cmp_id);
+		} else {
+			merged_work->total_runtime += data->total_runtime;
+			merged_work->cpu_usage += data->cpu_usage;
+		}
+
+		top_calc_load_runtime(kwork, data);
+	}
+
+	work_sort(kwork, class, &merged_root);
+}
+
+static void perf_kwork__top_report(struct perf_kwork *kwork)
+{
+	struct kwork_work *work;
+	struct rb_node *next;
+
+	printf("\n");
+
+	top_print_cpu_usage(kwork);
+	top_print_header(kwork);
+	next = rb_first_cached(&kwork->sorted_work_root);
+	while (next) {
+		work = rb_entry(next, struct kwork_work, node);
+		process_skipped_events(kwork, work);
+
+		if (work->total_runtime == 0)
+			goto next;
+
+		top_print_work(kwork, work);
+
+next:
+		next = rb_next(next);
+	}
+
+	printf("\n");
+}
+
+static int perf_kwork__top(struct perf_kwork *kwork)
+{
+	struct __top_cpus_runtime *cpus_runtime;
+	int ret = 0;
+
+	cpus_runtime = zalloc(sizeof(struct __top_cpus_runtime) * (MAX_NR_CPUS + 1));
+	if (!cpus_runtime)
+		return -1;
+
+	kwork->top_stat.cpus_runtime = cpus_runtime;
+	bitmap_zero(kwork->top_stat.all_cpus_bitmap, MAX_NR_CPUS);
+
+	ret = perf_kwork__read_events(kwork);
+	if (ret)
+		goto out;
+
+	top_calc_total_runtime(kwork);
+	top_calc_cpu_usage(kwork);
+	top_merge_tasks(kwork);
+
+	setup_pager();
+
+	perf_kwork__top_report(kwork);
+
+out:
+	free(kwork->top_stat.cpus_runtime);
+	return ret;
+}
+
 static void setup_event_list(struct perf_kwork *kwork,
 			     const struct option *options,
 			     const char * const usage_msg[])
@@ -1801,6 +2165,7 @@ int cmd_kwork(int argc, const char **argv)
 	};
 	static const char default_report_sort_order[] = "runtime, max, count";
 	static const char default_latency_sort_order[] = "avg, max, count";
+	static const char default_top_sort_order[] = "rate, runtime";
 	const struct option kwork_options[] = {
 	OPT_INCR('v', "verbose", &verbose,
 		 "be more verbose (show symbol address, etc)"),
@@ -1868,6 +2233,9 @@ int cmd_kwork(int argc, const char **argv)
 		   "input file name"),
 	OPT_PARENT(kwork_options)
 	};
+	const struct option top_options[] = {
+	OPT_PARENT(kwork_options)
+	};
 	const char *kwork_usage[] = {
 		NULL,
 		NULL
@@ -1884,8 +2252,12 @@ int cmd_kwork(int argc, const char **argv)
 		"perf kwork timehist [<options>]",
 		NULL
 	};
+	const char * const top_usage[] = {
+		"perf kwork top [<options>]",
+		NULL
+	};
 	const char *const kwork_subcommands[] = {
-		"record", "report", "latency", "timehist", NULL
+		"record", "report", "latency", "timehist", "top", NULL
 	};
 
 	argc = parse_options_subcommand(argc, argv, kwork_options,
@@ -1930,6 +2302,19 @@ int cmd_kwork(int argc, const char **argv)
 		kwork.report = KWORK_REPORT_TIMEHIST;
 		setup_event_list(&kwork, kwork_options, kwork_usage);
 		return perf_kwork__timehist(&kwork);
+	} else if (strlen(argv[0]) > 2 && strstarts("top", argv[0])) {
+		kwork.sort_order = default_top_sort_order;
+		if (argc > 1) {
+			argc = parse_options(argc, argv, top_options, top_usage, 0);
+			if (argc)
+				usage_with_options(top_usage, top_options);
+		}
+		kwork.report = KWORK_REPORT_TOP;
+		if (!kwork.event_list_str)
+			kwork.event_list_str = "sched";
+		setup_event_list(&kwork, kwork_options, kwork_usage);
+		setup_sorting(&kwork, top_options, top_usage);
+		return perf_kwork__top(&kwork);
 	} else
 		usage_with_options(kwork_usage, kwork_options);
 
diff --git a/tools/perf/util/kwork.h b/tools/perf/util/kwork.h
index f8e9cdd1371a..41ed193d5d8b 100644
--- a/tools/perf/util/kwork.h
+++ b/tools/perf/util/kwork.h
@@ -24,6 +24,7 @@ enum kwork_report_type {
 	KWORK_REPORT_RUNTIME,
 	KWORK_REPORT_LATENCY,
 	KWORK_REPORT_TIMEHIST,
+	KWORK_REPORT_TOP,
 };
 
 enum kwork_trace_type {
@@ -129,6 +130,11 @@ struct kwork_work {
 	u64 max_latency_start;
 	u64 max_latency_end;
 	u64 total_latency;
+
+	/*
+	 * top report
+	 */
+	u32 cpu_usage;
 };
 
 struct kwork_class {
@@ -174,6 +180,17 @@ struct trace_kwork_handler {
 				  struct perf_sample *sample, struct machine *machine);
 };
 
+struct __top_cpus_runtime {
+	u64 load;
+	u64 idle;
+	u64 total;
+};
+
+struct kwork_top_stat {
+	DECLARE_BITMAP(all_cpus_bitmap, MAX_NR_CPUS);
+	struct __top_cpus_runtime *cpus_runtime;
+};
+
 struct perf_kwork {
 	/*
 	 * metadata
@@ -225,6 +242,11 @@ struct perf_kwork {
 	u64 all_runtime;
 	u64 all_count;
 	u64 nr_skipped_events[KWORK_TRACE_MAX + 1];
+
+	/*
+	 * perf kwork top data
+	 */
+	struct kwork_top_stat top_stat;
 };
 
 struct kwork_work *perf_kwork_add_work(struct perf_kwork *kwork,
-- 
2.30.GIT


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [RFC v1 10/16] perf evsel: Add evsel__intval_common() helper
  2023-08-12  8:49 [RFC v1 00/16] perf kwork: Implement perf kwork top Yang Jihong
                   ` (8 preceding siblings ...)
  2023-08-12  8:49 ` [RFC v1 09/16] perf kwork: Implement perf kwork top Yang Jihong
@ 2023-08-12  8:49 ` Yang Jihong
  2023-08-12  8:49 ` [RFC v1 11/16] perf kwork top: Add statistics on hardirq event support Yang Jihong
                   ` (6 subsequent siblings)
  16 siblings, 0 replies; 26+ messages in thread
From: Yang Jihong @ 2023-08-12  8:49 UTC (permalink / raw)
  To: peterz, mingo, acme, mark.rutland, alexander.shishkin, jolsa,
	namhyung, irogers, adrian.hunter, kan.liang, sandipan.das,
	ravi.bangoria, linux-kernel, linux-perf-users
  Cc: yangjihong1

Add an evsel__intval_common() helper that looks up a common field (e.g.
common_pid) in the tracepoint format and returns its value, or 0 if the
field is not present.
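
The fallback behaviour can be illustrated with a minimal, self-contained
sketch. This is not the patch's API: lookup_common_field() and the field
table below are stand-ins for libtraceevent's tep_find_common_field() and
the parsed tracepoint format, used only to show the "field ? value : 0"
semantics the helper adopts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for a parsed tracepoint common field (illustrative only). */
struct field {
	const char *name;
	uint64_t val;
};

/* Stand-in for tep_find_common_field(): linear search by name. */
static const struct field *lookup_common_field(const struct field *fields,
					       size_t n, const char *name)
{
	for (size_t i = 0; i < n; i++)
		if (!strcmp(fields[i].name, name))
			return &fields[i];
	return NULL;
}

/* Mirrors the helper's "field ? format_field__intval(...) : 0" pattern:
 * a missing field yields 0 rather than an error. */
static uint64_t intval_common(const struct field *fields, size_t n,
			      const char *name)
{
	const struct field *f = lookup_common_field(fields, n, name);

	return f ? f->val : 0;
}

/* Hypothetical sample data for demonstration. */
static const struct field demo_fields[] = {
	{ "common_pid", 4242 },
	{ "common_cpu", 3 },
};
```

Callers can therefore use the returned value directly without a separate
existence check, which is the same convention evsel__intval() follows.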

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
---
 tools/perf/util/evsel.c | 13 +++++++++++++
 tools/perf/util/evsel.h |  2 ++
 2 files changed, 15 insertions(+)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index e41bc4d9925f..f3ad2fa862bf 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -2764,6 +2764,11 @@ struct tep_format_field *evsel__field(struct evsel *evsel, const char *name)
 	return tep_find_field(evsel->tp_format, name);
 }
 
+struct tep_format_field *evsel__common_field(struct evsel *evsel, const char *name)
+{
+	return tep_find_common_field(evsel->tp_format, name);
+}
+
 void *evsel__rawptr(struct evsel *evsel, struct perf_sample *sample, const char *name)
 {
 	struct tep_format_field *field = evsel__field(evsel, name);
@@ -2832,6 +2837,14 @@ u64 evsel__intval(struct evsel *evsel, struct perf_sample *sample, const char *n
 
 	return field ? format_field__intval(field, sample, evsel->needs_swap) : 0;
 }
+
+u64 evsel__intval_common(struct evsel *evsel, struct perf_sample *sample, const char *name)
+{
+	struct tep_format_field *field = evsel__common_field(evsel, name);
+
+	return field ? format_field__intval(field, sample, evsel->needs_swap) : 0;
+}
+
 #endif
 
 bool evsel__fallback(struct evsel *evsel, int err, char *msg, size_t msgsize)
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 848534ec74fa..815be2491938 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -340,6 +340,7 @@ struct perf_sample;
 #ifdef HAVE_LIBTRACEEVENT
 void *evsel__rawptr(struct evsel *evsel, struct perf_sample *sample, const char *name);
 u64 evsel__intval(struct evsel *evsel, struct perf_sample *sample, const char *name);
+u64 evsel__intval_common(struct evsel *evsel, struct perf_sample *sample, const char *name);
 
 static inline char *evsel__strval(struct evsel *evsel, struct perf_sample *sample, const char *name)
 {
@@ -352,6 +353,7 @@ struct tep_format_field;
 u64 format_field__intval(struct tep_format_field *field, struct perf_sample *sample, bool needs_swap);
 
 struct tep_format_field *evsel__field(struct evsel *evsel, const char *name);
+struct tep_format_field *evsel__common_field(struct evsel *evsel, const char *name);
 
 static inline bool __evsel__match(const struct evsel *evsel, u32 type, u64 config)
 {
-- 
2.30.GIT


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [RFC v1 11/16] perf kwork top: Add statistics on hardirq event support
  2023-08-12  8:49 [RFC v1 00/16] perf kwork: Implement perf kwork top Yang Jihong
                   ` (9 preceding siblings ...)
  2023-08-12  8:49 ` [RFC v1 10/16] perf evsel: Add evsel__intval_common() helper Yang Jihong
@ 2023-08-12  8:49 ` Yang Jihong
  2023-08-12  8:49 ` [RFC v1 12/16] perf kwork top: Add statistics on softirq " Yang Jihong
                   ` (5 subsequent siblings)
  16 siblings, 0 replies; 26+ messages in thread
From: Yang Jihong @ 2023-08-12  8:49 UTC (permalink / raw)
  To: peterz, mingo, acme, mark.rutland, alexander.shishkin, jolsa,
	namhyung, irogers, adrian.hunter, kan.liang, sandipan.das,
	ravi.bangoria, linux-kernel, linux-perf-users
  Cc: yangjihong1

Calculate the runtime of hardirq events and subtract it from the runtime
of the task they interrupted, so that per-task CPU usage is reported more
precisely.
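
The core of the accounting in top_subtract_irq_runtime() can be sketched
as a small self-contained function. The name task_self_runtime() is
illustrative, not part of the patch; it only shows the guard mirrored
from the patch ("work->total_runtime > data->total_runtime"), which
avoids underflowing the task's runtime when the recorded irq time is not
fully covered by it.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative sketch: deduct the hardirq runtime attributed to a task
 * (matched by pid and cpu in the patch) from that task's sched runtime,
 * so %CPU reflects only time the task itself consumed. */
static uint64_t task_self_runtime(uint64_t task_runtime_ns,
				  uint64_t irq_runtime_ns)
{
	/* Subtract only when the task runtime strictly exceeds the irq
	 * time, as the patch does; otherwise leave it unchanged. */
	if (task_runtime_ns > irq_runtime_ns)
		return task_runtime_ns - irq_runtime_ns;
	return task_runtime_ns;
}
```

The subtracted irq time is then accumulated per cpu (and in the
MAX_NR_CPUS summary slot) so it can be reported as the "hi" percentage.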

Example usage:

  # perf kwork -k sched,irq record -- perf record -o perf_record.data -a sleep 10
  [ perf record: Woken up 2 times to write data ]
  [ perf record: Captured and wrote 1.054 MB perf_record.data (18019 samples) ]
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 1.798 MB perf.data (16334 samples) ]
  #
  # perf kwork top

  Total  : 139240.869 ms, 8 cpus
  %Cpu(s):  94.91% id,   0.05% hi
  %Cpu0   [                                 0.05%]
  %Cpu1   [|                                5.00%]
  %Cpu2   [                                 0.43%]
  %Cpu3   [                                 0.57%]
  %Cpu4   [                                 1.19%]
  %Cpu5   [||||||                          20.46%]
  %Cpu6   [                                 0.48%]
  %Cpu7   [|||                             12.10%]

        PID    %CPU           RUNTIME  COMMMAND
    ----------------------------------------------------
          0   99.54      17325.622 ms  swapper/2
          0   99.54      17327.527 ms  swapper/0
          0   99.51      17319.909 ms  swapper/6
          0   99.42      17304.934 ms  swapper/3
          0   98.80      17197.385 ms  swapper/4
          0   94.99      16534.991 ms  swapper/1
          0   87.89      15295.264 ms  swapper/7
          0   79.53      13843.182 ms  swapper/5
       4252   36.50       6361.768 ms  perf
       4256    1.17        205.215 ms  bash
        151    0.53         93.298 ms  systemd-resolve
       4254    0.39         69.468 ms  perf
        423    0.34         59.368 ms  bash
        412    0.29         51.204 ms  sshd
        249    0.20         35.288 ms  sd-resolve
         16    0.17         30.287 ms  rcu_preempt
        153    0.09         17.266 ms  systemd-timesyn
          1    0.09         17.078 ms  systemd
       4253    0.07         12.457 ms  perf
       4255    0.06         11.559 ms  perf
       4234    0.03          6.105 ms  kworker/u16:1
         69    0.03          6.259 ms  kworker/1:1H
       4251    0.02          4.615 ms  perf
       4095    0.02          4.890 ms  kworker/7:1
         61    0.02          4.005 ms  kcompactd0
         75    0.02          3.546 ms  kworker/2:1
         97    0.01          3.106 ms  kworker/7:1H
         98    0.01          1.995 ms  jbd2/sda-8
       4088    0.01          1.779 ms  kworker/u16:3
       2909    0.01          1.795 ms  kworker/0:2
       4246    0.00          1.117 ms  kworker/7:2
         51    0.00          0.327 ms  ksoftirqd/7
         50    0.00          0.369 ms  migration/7
        102    0.00          0.160 ms  kworker/6:1H
         76    0.00          0.609 ms  kworker/6:1
         45    0.00          0.779 ms  migration/6
         87    0.00          0.504 ms  kworker/5:1H
         73    0.00          1.130 ms  kworker/5:1
         41    0.00          0.152 ms  ksoftirqd/5
         40    0.00          0.702 ms  migration/5
         64    0.00          0.316 ms  kworker/4:1
         35    0.00          0.791 ms  migration/4
        353    0.00          2.211 ms  sshd
         74    0.00          0.272 ms  kworker/3:1
         30    0.00          0.819 ms  migration/3
         25    0.00          0.784 ms  migration/2
        397    0.00          0.539 ms  kworker/1:1
         21    0.00          1.600 ms  ksoftirqd/1
         20    0.00          0.773 ms  migration/1
         17    0.00          1.682 ms  migration/0
         15    0.00          0.076 ms  ksoftirqd/0

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
---
 tools/perf/builtin-kwork.c | 93 +++++++++++++++++++++++++++++++++++---
 tools/perf/util/kwork.h    |  1 +
 2 files changed, 88 insertions(+), 6 deletions(-)

diff --git a/tools/perf/builtin-kwork.c b/tools/perf/builtin-kwork.c
index 1b213af59135..66ad8c2e8020 100644
--- a/tools/perf/builtin-kwork.c
+++ b/tools/perf/builtin-kwork.c
@@ -866,6 +866,36 @@ static int top_entry_event(struct perf_kwork *kwork,
 			      machine, NULL, true);
 }
 
+static int top_exit_event(struct perf_kwork *kwork,
+			  struct kwork_class *class,
+			  struct evsel *evsel,
+			  struct perf_sample *sample,
+			  struct machine *machine)
+{
+	struct kwork_work *work, *sched_work;
+	struct kwork_class *sched_class;
+	struct kwork_atom *atom;
+
+	atom = work_pop_atom(kwork, class, KWORK_TRACE_EXIT,
+			     KWORK_TRACE_ENTRY, evsel, sample,
+			     machine, &work);
+	if (!work)
+		return -1;
+
+	if (atom) {
+		sched_class = get_kwork_class(kwork, KWORK_CLASS_SCHED);
+		if (sched_class) {
+			sched_work = find_work_by_id(&sched_class->work_root,
+						     work->id, work->cpu);
+			if (sched_work)
+				top_update_runtime(work, atom, sample);
+		}
+		atom_del(atom);
+	}
+
+	return 0;
+}
+
 static int top_sched_switch_event(struct perf_kwork *kwork,
 				  struct kwork_class *class,
 				  struct evsel *evsel,
@@ -933,7 +963,7 @@ static int irq_class_init(struct kwork_class *class,
 	return 0;
 }
 
-static void irq_work_init(struct perf_kwork *kwork __maybe_unused,
+static void irq_work_init(struct perf_kwork *kwork,
 			  struct kwork_class *class,
 			  struct kwork_work *work,
 			  enum kwork_trace_type src_type __maybe_unused,
@@ -943,8 +973,14 @@ static void irq_work_init(struct perf_kwork *kwork __maybe_unused,
 {
 	work->class = class;
 	work->cpu = sample->cpu;
-	work->id = evsel__intval(evsel, sample, "irq");
-	work->name = evsel__strval(evsel, sample, "name");
+
+	if (kwork->report == KWORK_REPORT_TOP) {
+		work->id = evsel__intval_common(evsel, sample, "common_pid");
+		work->name = NULL;
+	} else {
+		work->id = evsel__intval(evsel, sample, "irq");
+		work->name = evsel__strval(evsel, sample, "name");
+	}
 }
 
 static void irq_work_name(struct kwork_work *work, char *buf, int len)
@@ -1510,6 +1546,7 @@ static void top_print_cpu_usage(struct perf_kwork *kwork)
 {
 	struct kwork_top_stat *stat = &kwork->top_stat;
 	u64 idle_time = stat->cpus_runtime[MAX_NR_CPUS].idle;
+	u64 hardirq_time = stat->cpus_runtime[MAX_NR_CPUS].irq;
 	int cpus_nr = bitmap_weight(stat->all_cpus_bitmap, MAX_NR_CPUS);
 	u64 cpus_total_time = stat->cpus_runtime[MAX_NR_CPUS].total;
 
@@ -1518,9 +1555,12 @@ static void top_print_cpu_usage(struct perf_kwork *kwork)
 	       (double)cpus_total_time / NSEC_PER_MSEC,
 	       cpus_nr);
 
-	printf("%%Cpu(s): %*.*f%% id\n",
+	printf("%%Cpu(s): %*.*f%% id, %*.*f%% hi\n",
+	       PRINT_CPU_USAGE_WIDTH, PRINT_CPU_USAGE_DECIMAL_WIDTH,
+	       cpus_total_time ? (double)idle_time * 100 / cpus_total_time : 0,
+
 	       PRINT_CPU_USAGE_WIDTH, PRINT_CPU_USAGE_DECIMAL_WIDTH,
-	       cpus_total_time ? (double)idle_time * 100 / cpus_total_time : 0);
+	       cpus_total_time ? (double)hardirq_time * 100 / cpus_total_time : 0);
 
 	top_print_per_cpu_load(kwork);
 }
@@ -1621,6 +1661,8 @@ static int perf_kwork__check_config(struct perf_kwork *kwork,
 		.exit_event  = timehist_exit_event,
 	};
 	static struct trace_kwork_handler top_ops = {
+		.entry_event        = timehist_entry_event,
+		.exit_event         = top_exit_event,
 		.sched_switch_event = top_sched_switch_event,
 	};
 
@@ -1915,6 +1957,43 @@ static void top_calc_idle_time(struct perf_kwork *kwork,
 	}
 }
 
+static void top_calc_irq_runtime(struct perf_kwork *kwork,
+				 enum kwork_class_type type,
+				 struct kwork_work *work)
+{
+	struct kwork_top_stat *stat = &kwork->top_stat;
+
+	if (type == KWORK_CLASS_IRQ) {
+		stat->cpus_runtime[work->cpu].irq += work->total_runtime;
+		stat->cpus_runtime[MAX_NR_CPUS].irq += work->total_runtime;
+	}
+}
+
+static void top_subtract_irq_runtime(struct perf_kwork *kwork,
+				     struct kwork_work *work)
+{
+	struct kwork_class *class;
+	struct kwork_work *data;
+	unsigned int i;
+	int irq_class_list[] = {KWORK_CLASS_IRQ};
+
+	for (i = 0; i < ARRAY_SIZE(irq_class_list); i++) {
+		class = get_kwork_class(kwork, irq_class_list[i]);
+		if (!class)
+			continue;
+
+		data = find_work_by_id(&class->work_root,
+				       work->id, work->cpu);
+		if (!data)
+			continue;
+
+		if (work->total_runtime > data->total_runtime) {
+			work->total_runtime -= data->total_runtime;
+			top_calc_irq_runtime(kwork, irq_class_list[i], data);
+		}
+	}
+}
+
 static void top_calc_cpu_usage(struct perf_kwork *kwork)
 {
 	struct kwork_class *class;
@@ -1935,6 +2014,8 @@ static void top_calc_cpu_usage(struct perf_kwork *kwork)
 
 		__set_bit(work->cpu, stat->all_cpus_bitmap);
 
+		top_subtract_irq_runtime(kwork, work);
+
 		work->cpu_usage = work->total_runtime * 10000 /
 			stat->cpus_runtime[work->cpu].total;
 
@@ -2311,7 +2392,7 @@ int cmd_kwork(int argc, const char **argv)
 		}
 		kwork.report = KWORK_REPORT_TOP;
 		if (!kwork.event_list_str)
-			kwork.event_list_str = "sched";
+			kwork.event_list_str = "sched, irq";
 		setup_event_list(&kwork, kwork_options, kwork_usage);
 		setup_sorting(&kwork, top_options, top_usage);
 		return perf_kwork__top(&kwork);
diff --git a/tools/perf/util/kwork.h b/tools/perf/util/kwork.h
index 41ed193d5d8b..b3e410d02043 100644
--- a/tools/perf/util/kwork.h
+++ b/tools/perf/util/kwork.h
@@ -183,6 +183,7 @@ struct trace_kwork_handler {
 struct __top_cpus_runtime {
 	u64 load;
 	u64 idle;
+	u64 irq;
 	u64 total;
 };
 
-- 
2.30.GIT


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [RFC v1 12/16] perf kwork top: Add statistics on softirq event support
  2023-08-12  8:49 [RFC v1 00/16] perf kwork: Implement perf kwork top Yang Jihong
                   ` (10 preceding siblings ...)
  2023-08-12  8:49 ` [RFC v1 11/16] perf kwork top: Add statistics on hardirq event support Yang Jihong
@ 2023-08-12  8:49 ` Yang Jihong
  2023-08-12  8:49 ` [RFC v1 13/16] perf kwork top: Add -C/--cpu -i/--input -n/--name -s/--sort --time options Yang Jihong
                   ` (4 subsequent siblings)
  16 siblings, 0 replies; 26+ messages in thread
From: Yang Jihong @ 2023-08-12  8:49 UTC (permalink / raw)
  To: peterz, mingo, acme, mark.rutland, alexander.shishkin, jolsa,
	namhyung, irogers, adrian.hunter, kan.liang, sandipan.das,
	ravi.bangoria, linux-kernel, linux-perf-users
  Cc: yangjihong1

Calculate the runtime of softirq events and subtract it from the runtime
of the task they interrupted, so that per-task CPU usage is reported more
precisely.
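
The "%Cpu(s)" summary line extended by this patch divides each accumulated
category (idle, hardirq, softirq) by the total cpu time, guarding against
a zero total. A minimal sketch of that computation, with an illustrative
name (usage_percent() is not the patch's function):

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative sketch of the percentage printed for each category in
 * top_print_cpu_usage(): part and total are runtimes in nanoseconds,
 * and a zero total yields 0 rather than dividing by zero. */
static double usage_percent(uint64_t part_ns, uint64_t total_ns)
{
	return total_ns ? (double)part_ns * 100.0 / (double)total_ns : 0.0;
}
```

With id, hi and si each computed this way, the three values printed on
the "%Cpu(s)" line share one denominator, the MAX_NR_CPUS total slot.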

Example usage:

  # perf kwork -k sched,irq,softirq record -- perf record -e cpu-clock -o perf_record.data -a sleep 10
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.467 MB perf_record.data (7154 samples) ]
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 2.152 MB perf.data (22846 samples) ]
  # perf kwork top

  Total  : 136601.588 ms, 8 cpus
  %Cpu(s):  95.66% id,   0.04% hi,   0.05% si
  %Cpu0   [                                 0.02%]
  %Cpu1   [                                 0.01%]
  %Cpu2   [|                                4.61%]
  %Cpu3   [                                 0.04%]
  %Cpu4   [                                 0.01%]
  %Cpu5   [|||||                           17.31%]
  %Cpu6   [                                 0.51%]
  %Cpu7   [|||                             11.42%]

        PID    %CPU           RUNTIME  COMMMAND
    ----------------------------------------------------
          0   99.98      17073.515 ms  swapper/4
          0   99.98      17072.173 ms  swapper/1
          0   99.93      17064.229 ms  swapper/3
          0   99.62      17011.013 ms  swapper/0
          0   99.47      16985.180 ms  swapper/6
          0   95.17      16250.874 ms  swapper/2
          0   88.51      15111.684 ms  swapper/7
          0   82.62      14108.577 ms  swapper/5
       4342   33.00       5644.045 ms  perf
       4344    0.43         74.351 ms  perf
         16    0.13         22.296 ms  rcu_preempt
       4345    0.05         10.093 ms  perf
       4343    0.05          8.769 ms  perf
       4341    0.02          4.882 ms  perf
       4095    0.02          4.605 ms  kworker/7:1
         75    0.02          4.261 ms  kworker/2:1
        120    0.01          1.909 ms  systemd-journal
         98    0.01          2.540 ms  jbd2/sda-8
         61    0.01          3.404 ms  kcompactd0
        667    0.01          2.542 ms  kworker/u16:2
       4340    0.00          1.052 ms  kworker/7:2
         97    0.00          0.489 ms  kworker/7:1H
         51    0.00          0.209 ms  ksoftirqd/7
         50    0.00          0.646 ms  migration/7
         76    0.00          0.753 ms  kworker/6:1
         45    0.00          0.572 ms  migration/6
         87    0.00          0.145 ms  kworker/5:1H
         73    0.00          0.596 ms  kworker/5:1
         41    0.00          0.041 ms  ksoftirqd/5
         40    0.00          0.718 ms  migration/5
         64    0.00          0.115 ms  kworker/4:1
         35    0.00          0.556 ms  migration/4
        353    0.00          2.600 ms  sshd
         74    0.00          0.205 ms  kworker/3:1
         33    0.00          1.576 ms  kworker/3:0H
         30    0.00          0.996 ms  migration/3
         26    0.00          1.665 ms  ksoftirqd/2
         25    0.00          0.662 ms  migration/2
        397    0.00          0.057 ms  kworker/1:1
         20    0.00          1.005 ms  migration/1
       2909    0.00          1.053 ms  kworker/0:2
         17    0.00          0.720 ms  migration/0
         15    0.00          0.039 ms  ksoftirqd/0

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
---
 tools/perf/builtin-kwork.c | 30 ++++++++++++++++++++++--------
 tools/perf/util/kwork.h    |  1 +
 2 files changed, 23 insertions(+), 8 deletions(-)

diff --git a/tools/perf/builtin-kwork.c b/tools/perf/builtin-kwork.c
index 66ad8c2e8020..c741cc1a543f 100644
--- a/tools/perf/builtin-kwork.c
+++ b/tools/perf/builtin-kwork.c
@@ -1090,7 +1090,7 @@ static char *evsel__softirq_name(struct evsel *evsel, u64 num)
 	return name;
 }
 
-static void softirq_work_init(struct perf_kwork *kwork __maybe_unused,
+static void softirq_work_init(struct perf_kwork *kwork,
 			      struct kwork_class *class,
 			      struct kwork_work *work,
 			      enum kwork_trace_type src_type __maybe_unused,
@@ -1098,12 +1098,19 @@ static void softirq_work_init(struct perf_kwork *kwork __maybe_unused,
 			      struct perf_sample *sample,
 			      struct machine *machine __maybe_unused)
 {
-	u64 num = evsel__intval(evsel, sample, "vec");
+	u64 num;
 
-	work->id = num;
 	work->class = class;
 	work->cpu = sample->cpu;
-	work->name = evsel__softirq_name(evsel, num);
+
+	if (kwork->report == KWORK_REPORT_TOP) {
+		work->id = evsel__intval_common(evsel, sample, "common_pid");
+		work->name = NULL;
+	} else {
+		num = evsel__intval(evsel, sample, "vec");
+		work->id = num;
+		work->name = evsel__softirq_name(evsel, num);
+	}
 }
 
 static void softirq_work_name(struct kwork_work *work, char *buf, int len)
@@ -1547,6 +1554,7 @@ static void top_print_cpu_usage(struct perf_kwork *kwork)
 	struct kwork_top_stat *stat = &kwork->top_stat;
 	u64 idle_time = stat->cpus_runtime[MAX_NR_CPUS].idle;
 	u64 hardirq_time = stat->cpus_runtime[MAX_NR_CPUS].irq;
+	u64 softirq_time = stat->cpus_runtime[MAX_NR_CPUS].softirq;
 	int cpus_nr = bitmap_weight(stat->all_cpus_bitmap, MAX_NR_CPUS);
 	u64 cpus_total_time = stat->cpus_runtime[MAX_NR_CPUS].total;
 
@@ -1555,12 +1563,15 @@ static void top_print_cpu_usage(struct perf_kwork *kwork)
 	       (double)cpus_total_time / NSEC_PER_MSEC,
 	       cpus_nr);
 
-	printf("%%Cpu(s): %*.*f%% id, %*.*f%% hi\n",
+	printf("%%Cpu(s): %*.*f%% id, %*.*f%% hi, %*.*f%% si\n",
 	       PRINT_CPU_USAGE_WIDTH, PRINT_CPU_USAGE_DECIMAL_WIDTH,
 	       cpus_total_time ? (double)idle_time * 100 / cpus_total_time : 0,
 
 	       PRINT_CPU_USAGE_WIDTH, PRINT_CPU_USAGE_DECIMAL_WIDTH,
-	       cpus_total_time ? (double)hardirq_time * 100 / cpus_total_time : 0);
+	       cpus_total_time ? (double)hardirq_time * 100 / cpus_total_time : 0,
+
+	       PRINT_CPU_USAGE_WIDTH, PRINT_CPU_USAGE_DECIMAL_WIDTH,
+	       cpus_total_time ? (double)softirq_time * 100 / cpus_total_time : 0);
 
 	top_print_per_cpu_load(kwork);
 }
@@ -1966,6 +1977,9 @@ static void top_calc_irq_runtime(struct perf_kwork *kwork,
 	if (type == KWORK_CLASS_IRQ) {
 		stat->cpus_runtime[work->cpu].irq += work->total_runtime;
 		stat->cpus_runtime[MAX_NR_CPUS].irq += work->total_runtime;
+	} else if (type == KWORK_CLASS_SOFTIRQ) {
+		stat->cpus_runtime[work->cpu].softirq += work->total_runtime;
+		stat->cpus_runtime[MAX_NR_CPUS].softirq += work->total_runtime;
 	}
 }
 
@@ -1975,7 +1989,7 @@ static void top_subtract_irq_runtime(struct perf_kwork *kwork,
 	struct kwork_class *class;
 	struct kwork_work *data;
 	unsigned int i;
-	int irq_class_list[] = {KWORK_CLASS_IRQ};
+	int irq_class_list[] = {KWORK_CLASS_IRQ, KWORK_CLASS_SOFTIRQ};
 
 	for (i = 0; i < ARRAY_SIZE(irq_class_list); i++) {
 		class = get_kwork_class(kwork, irq_class_list[i]);
@@ -2392,7 +2406,7 @@ int cmd_kwork(int argc, const char **argv)
 		}
 		kwork.report = KWORK_REPORT_TOP;
 		if (!kwork.event_list_str)
-			kwork.event_list_str = "sched, irq";
+			kwork.event_list_str = "sched, irq, softirq";
 		setup_event_list(&kwork, kwork_options, kwork_usage);
 		setup_sorting(&kwork, top_options, top_usage);
 		return perf_kwork__top(&kwork);
diff --git a/tools/perf/util/kwork.h b/tools/perf/util/kwork.h
index b3e410d02043..723b34385df6 100644
--- a/tools/perf/util/kwork.h
+++ b/tools/perf/util/kwork.h
@@ -184,6 +184,7 @@ struct __top_cpus_runtime {
 	u64 load;
 	u64 idle;
 	u64 irq;
+	u64 softirq;
 	u64 total;
 };
 
-- 
2.30.GIT


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [RFC v1 13/16] perf kwork top: Add -C/--cpu -i/--input -n/--name -s/--sort --time options
  2023-08-12  8:49 [RFC v1 00/16] perf kwork: Implement perf kwork top Yang Jihong
                   ` (11 preceding siblings ...)
  2023-08-12  8:49 ` [RFC v1 12/16] perf kwork top: Add statistics on softirq " Yang Jihong
@ 2023-08-12  8:49 ` Yang Jihong
  2023-08-12  8:49 ` [RFC v1 14/16] perf kwork top: Implements BPF-based cpu usage statistics Yang Jihong
                   ` (3 subsequent siblings)
  16 siblings, 0 replies; 26+ messages in thread
From: Yang Jihong @ 2023-08-12  8:49 UTC (permalink / raw)
  To: peterz, mingo, acme, mark.rutland, alexander.shishkin, jolsa,
	namhyung, irogers, adrian.hunter, kan.liang, sandipan.das,
	ravi.bangoria, linux-kernel, linux-perf-users
  Cc: yangjihong1

Provide the following options for perf kwork top:

1. -C, --cpu <cpu>		list of cpus to profile
2. -i, --input <file>		input file name
3. -n, --name <name>		event name to profile
4. -s, --sort <key[,key2...]>	sort by key(s): rate, runtime, tid
5. --time <str>		Time span for analysis (start,stop)

Example usage:

  # perf kwork top -h

   Usage: perf kwork top [<options>]

      -C, --cpu <cpu>       list of cpus to profile
      -i, --input <file>    input file name
      -n, --name <name>     event name to profile
      -s, --sort <key[,key2...]>
                            sort by key(s): rate, runtime, tid
          --time <str>      Time span for analysis (start,stop)

  # perf kwork top -C 2,4,5

  Total  :  51226.940 ms, 3 cpus
  %Cpu(s):  92.59% id,   0.00% hi,   0.09% si
  %Cpu2   [|                                4.61%]
  %Cpu4   [                                 0.01%]
  %Cpu5   [|||||                           17.31%]

        PID    %CPU           RUNTIME  COMMMAND
    ----------------------------------------------------
          0   99.98      17073.515 ms  swapper/4
          0   95.17      16250.874 ms  swapper/2
          0   82.62      14108.577 ms  swapper/5
       4342   21.70       3708.358 ms  perf
         16    0.13         22.296 ms  rcu_preempt
         75    0.02          4.261 ms  kworker/2:1
         98    0.01          2.540 ms  jbd2/sda-8
         61    0.01          3.404 ms  kcompactd0
         87    0.00          0.145 ms  kworker/5:1H
         73    0.00          0.596 ms  kworker/5:1
         41    0.00          0.041 ms  ksoftirqd/5
         40    0.00          0.718 ms  migration/5
         64    0.00          0.115 ms  kworker/4:1
         35    0.00          0.556 ms  migration/4
        353    0.00          1.143 ms  sshd
         26    0.00          1.665 ms  ksoftirqd/2
         25    0.00          0.662 ms  migration/2

  # perf kwork top -i perf.data

  Total  : 136601.588 ms, 8 cpus
  %Cpu(s):  95.66% id,   0.04% hi,   0.05% si
  %Cpu0   [                                 0.02%]
  %Cpu1   [                                 0.01%]
  %Cpu2   [|                                4.61%]
  %Cpu3   [                                 0.04%]
  %Cpu4   [                                 0.01%]
  %Cpu5   [|||||                           17.31%]
  %Cpu6   [                                 0.51%]
  %Cpu7   [|||                             11.42%]

        PID    %CPU           RUNTIME  COMMMAND
    ----------------------------------------------------
          0   99.98      17073.515 ms  swapper/4
          0   99.98      17072.173 ms  swapper/1
          0   99.93      17064.229 ms  swapper/3
          0   99.62      17011.013 ms  swapper/0
          0   99.47      16985.180 ms  swapper/6
          0   95.17      16250.874 ms  swapper/2
          0   88.51      15111.684 ms  swapper/7
          0   82.62      14108.577 ms  swapper/5
       4342   33.00       5644.045 ms  perf
       4344    0.43         74.351 ms  perf
         16    0.13         22.296 ms  rcu_preempt
       4345    0.05         10.093 ms  perf
       4343    0.05          8.769 ms  perf
       4341    0.02          4.882 ms  perf
       4095    0.02          4.605 ms  kworker/7:1
         75    0.02          4.261 ms  kworker/2:1
        120    0.01          1.909 ms  systemd-journal
         98    0.01          2.540 ms  jbd2/sda-8
         61    0.01          3.404 ms  kcompactd0
        667    0.01          2.542 ms  kworker/u16:2
       4340    0.00          1.052 ms  kworker/7:2
         97    0.00          0.489 ms  kworker/7:1H
         51    0.00          0.209 ms  ksoftirqd/7
         50    0.00          0.646 ms  migration/7
         76    0.00          0.753 ms  kworker/6:1
         45    0.00          0.572 ms  migration/6
         87    0.00          0.145 ms  kworker/5:1H
         73    0.00          0.596 ms  kworker/5:1
         41    0.00          0.041 ms  ksoftirqd/5
         40    0.00          0.718 ms  migration/5
         64    0.00          0.115 ms  kworker/4:1
         35    0.00          0.556 ms  migration/4
        353    0.00          2.600 ms  sshd
         74    0.00          0.205 ms  kworker/3:1
         33    0.00          1.576 ms  kworker/3:0H
         30    0.00          0.996 ms  migration/3
         26    0.00          1.665 ms  ksoftirqd/2
         25    0.00          0.662 ms  migration/2
        397    0.00          0.057 ms  kworker/1:1
         20    0.00          1.005 ms  migration/1
       2909    0.00          1.053 ms  kworker/0:2
         17    0.00          0.720 ms  migration/0
         15    0.00          0.039 ms  ksoftirqd/0
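As a first approximation, a task's %CPU in this full-file report is its runtime over the per-cpu share of the Total line: 17073.515 ms against 136601.588 ms / 8 cpus gives ~99.99%, matching swapper/4's 99.98% to within rounding. A hypothetical sketch of that arithmetic (not the in-tree implementation):

```c
#include <assert.h>

/* %CPU of one task: runtime over the per-cpu share of the analyzed
 * wall time, i.e. the "Total" line divided by the number of cpus. */
static double cpu_usage(double runtime_ms, double total_ms, int nr_cpus)
{
	return runtime_ms / (total_ms / nr_cpus) * 100.0;
}
```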

  # perf kwork top -n perf

  Total  : 136601.588 ms, 8 cpus
  %Cpu(s):  95.66% id,   0.04% hi,   0.05% si
  %Cpu0   [                                 0.01%]
  %Cpu1   [                                 0.00%]
  %Cpu2   [|                                4.44%]
  %Cpu3   [                                 0.00%]
  %Cpu4   [                                 0.00%]
  %Cpu5   [                                 0.00%]
  %Cpu6   [                                 0.49%]
  %Cpu7   [|||                             11.38%]

        PID    %CPU           RUNTIME  COMMMAND
    ----------------------------------------------------
       4342   15.74       2695.516 ms  perf
       4344    0.43         74.351 ms  perf
       4345    0.05         10.093 ms  perf
       4343    0.05          8.769 ms  perf
       4341    0.02          4.882 ms  perf
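Note that the per-cpu gauges above still reflect all tasks (%Cpu2 shows 4.44%) even though only perf rows are listed: for 'top', the name filter is applied when merging tasks for display, not when collecting runtime. A standalone sketch of how the patch's profile_name_match() helper behaves:

```c
#include <assert.h>
#include <string.h>

/* Mirrors profile_name_match(): a work item is filtered out only when
 * a name filter is set, the item has a name, and the two differ.
 * Unnamed items and the no-filter case always pass. */
static int name_match(const char *filter, const char *name)
{
	if (filter && name && strcmp(name, filter) != 0)
		return 0;

	return 1;
}
```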

  # perf kwork top -s tid

  Total  : 136601.588 ms, 8 cpus
  %Cpu(s):  95.66% id,   0.04% hi,   0.05% si
  %Cpu0   [                                 0.02%]
  %Cpu1   [                                 0.01%]
  %Cpu2   [|                                4.61%]
  %Cpu3   [                                 0.04%]
  %Cpu4   [                                 0.01%]
  %Cpu5   [|||||                           17.31%]
  %Cpu6   [                                 0.51%]
  %Cpu7   [|||                             11.42%]

        PID    %CPU           RUNTIME  COMMMAND
    ----------------------------------------------------
          0   99.62      17011.013 ms  swapper/0
          0   99.98      17072.173 ms  swapper/1
          0   95.17      16250.874 ms  swapper/2
          0   99.93      17064.229 ms  swapper/3
          0   99.98      17073.515 ms  swapper/4
          0   82.62      14108.577 ms  swapper/5
          0   99.47      16985.180 ms  swapper/6
          0   88.51      15111.684 ms  swapper/7
         15    0.00          0.039 ms  ksoftirqd/0
         16    0.13         22.296 ms  rcu_preempt
         17    0.00          0.720 ms  migration/0
         20    0.00          1.005 ms  migration/1
         25    0.00          0.662 ms  migration/2
         26    0.00          1.665 ms  ksoftirqd/2
         30    0.00          0.996 ms  migration/3
         33    0.00          1.576 ms  kworker/3:0H
         35    0.00          0.556 ms  migration/4
         40    0.00          0.718 ms  migration/5
         41    0.00          0.041 ms  ksoftirqd/5
         45    0.00          0.572 ms  migration/6
         50    0.00          0.646 ms  migration/7
         51    0.00          0.209 ms  ksoftirqd/7
         61    0.01          3.404 ms  kcompactd0
         64    0.00          0.115 ms  kworker/4:1
         73    0.00          0.596 ms  kworker/5:1
         74    0.00          0.205 ms  kworker/3:1
         75    0.02          4.261 ms  kworker/2:1
         76    0.00          0.753 ms  kworker/6:1
         87    0.00          0.145 ms  kworker/5:1H
         97    0.00          0.489 ms  kworker/7:1H
         98    0.01          2.540 ms  jbd2/sda-8
        120    0.01          1.909 ms  systemd-journal
        353    0.00          2.600 ms  sshd
        397    0.00          0.057 ms  kworker/1:1
        667    0.01          2.542 ms  kworker/u16:2
       2909    0.00          1.053 ms  kworker/0:2
       4095    0.02          4.605 ms  kworker/7:1
       4340    0.00          1.052 ms  kworker/7:2
       4341    0.02          4.882 ms  perf
       4342   33.00       5644.045 ms  perf
       4343    0.05          8.769 ms  perf
       4344    0.43         74.351 ms  perf
       4345    0.05         10.093 ms  perf
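The per-cpu swapper threads all report tid 0, so the tid sort falls back to comparing the cpu for id 0, which is why each swapper/N keeps its own row in the listing above. As a standalone illustration with qsort and a hypothetical struct (the in-tree id_or_cpu_r_cmp() plugs into kwork's rb-tree sort machinery instead):

```c
#include <assert.h>
#include <stdlib.h>

struct work {
	int id;		/* tid; all swapper threads share tid 0 */
	int cpu;
};

/* Ascending by tid; for tid 0, order by cpu so the swappers stay
 * distinct instead of comparing equal. */
static int tid_cmp(const void *a, const void *b)
{
	const struct work *l = a, *r = b;

	if (l->id != r->id)
		return l->id < r->id ? -1 : 1;

	if (l->id != 0)
		return 0;

	return l->cpu - r->cpu;
}
```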

  # perf kwork top --time 128800,

  Total  :  53495.122 ms, 8 cpus
  %Cpu(s):  94.71% id,   0.09% hi,   0.09% si
  %Cpu0   [                                 0.07%]
  %Cpu1   [                                 0.04%]
  %Cpu2   [||                               8.49%]
  %Cpu3   [                                 0.09%]
  %Cpu4   [                                 0.02%]
  %Cpu5   [                                 0.06%]
  %Cpu6   [                                 0.12%]
  %Cpu7   [||||||                          21.24%]

        PID    %CPU           RUNTIME  COMMMAND
    ----------------------------------------------------
          0   99.96       3981.363 ms  swapper/4
          0   99.94       3978.955 ms  swapper/1
          0   99.91       9329.375 ms  swapper/5
          0   99.87       4906.829 ms  swapper/3
          0   99.86       9028.064 ms  swapper/6
          0   98.67       3928.161 ms  swapper/0
          0   91.17       8388.432 ms  swapper/2
          0   78.65       7125.602 ms  swapper/7
       4342   29.42       2675.198 ms  perf
         16    0.18         16.817 ms  rcu_preempt
       4345    0.09          8.183 ms  perf
       4344    0.04          4.290 ms  perf
       4343    0.03          2.844 ms  perf
        353    0.03          2.600 ms  sshd
       4095    0.02          2.702 ms  kworker/7:1
        120    0.02          1.909 ms  systemd-journal
         98    0.02          2.540 ms  jbd2/sda-8
         61    0.02          1.886 ms  kcompactd0
        667    0.02          1.011 ms  kworker/u16:2
         75    0.02          2.693 ms  kworker/2:1
       4341    0.01          1.838 ms  perf
         30    0.01          0.788 ms  migration/3
         26    0.01          1.665 ms  ksoftirqd/2
         20    0.01          0.752 ms  migration/1
       2909    0.01          0.604 ms  kworker/0:2
       4340    0.00          0.635 ms  kworker/7:2
         97    0.00          0.214 ms  kworker/7:1H
         51    0.00          0.209 ms  ksoftirqd/7
         50    0.00          0.646 ms  migration/7
         76    0.00          0.602 ms  kworker/6:1
         45    0.00          0.366 ms  migration/6
         87    0.00          0.145 ms  kworker/5:1H
         40    0.00          0.446 ms  migration/5
         35    0.00          0.318 ms  migration/4
         74    0.00          0.205 ms  kworker/3:1
         33    0.00          0.080 ms  kworker/3:0H
         25    0.00          0.448 ms  migration/2
        397    0.00          0.057 ms  kworker/1:1
         17    0.00          0.365 ms  migration/0

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
---
 tools/perf/Documentation/perf-kwork.txt | 26 +++++++++++
 tools/perf/builtin-kwork.c              | 57 +++++++++++++++++++++++--
 2 files changed, 80 insertions(+), 3 deletions(-)

diff --git a/tools/perf/Documentation/perf-kwork.txt b/tools/perf/Documentation/perf-kwork.txt
index 0601fcb0feea..34d6c285e527 100644
--- a/tools/perf/Documentation/perf-kwork.txt
+++ b/tools/perf/Documentation/perf-kwork.txt
@@ -178,6 +178,32 @@ OPTIONS for 'perf kwork timehist'
 	stop time is not given (i.e, time string is 'x.y,') then analysis goes
 	to end of file.
 
+OPTIONS for 'perf kwork top'
+---------------------------------
+
+-C::
+--cpu::
+	Only show events for the given CPU(s) (comma separated list).
+
+-i::
+--input::
+	Input file name. (default: perf.data unless stdin is a fifo)
+
+-n::
+--name::
+	Only show events for the given name.
+
+-s::
+--sort::
+	Sort by key(s): rate, runtime, tid
+
+--time::
+	Only analyze samples within given time window: <start>,<stop>. Times
+	have the format seconds.microseconds. If start is not given (i.e., time
+	string is ',x.y') then analysis starts at the beginning of the file. If
+	stop time is not given (i.e., time string is 'x.y,') then analysis goes
+	to end of file.
+
 SEE ALSO
 --------
 linkperf:perf-record[1]
diff --git a/tools/perf/builtin-kwork.c b/tools/perf/builtin-kwork.c
index c741cc1a543f..d5949ff4bd15 100644
--- a/tools/perf/builtin-kwork.c
+++ b/tools/perf/builtin-kwork.c
@@ -146,6 +146,24 @@ static int cpu_usage_cmp(struct kwork_work *l, struct kwork_work *r)
 	return 0;
 }
 
+static int id_or_cpu_r_cmp(struct kwork_work *l, struct kwork_work *r)
+{
+	if (l->id < r->id)
+		return 1;
+	if (l->id > r->id)
+		return -1;
+
+	if (l->id != 0)
+		return 0;
+
+	if (l->cpu < r->cpu)
+		return 1;
+	if (l->cpu > r->cpu)
+		return -1;
+
+	return 0;
+}
+
 static int sort_dimension__add(struct perf_kwork *kwork __maybe_unused,
 			       const char *tok, struct list_head *list)
 {
@@ -174,6 +192,10 @@ static int sort_dimension__add(struct perf_kwork *kwork __maybe_unused,
 		.name = "rate",
 		.cmp  = cpu_usage_cmp,
 	};
+	static struct sort_dimension tid_sort_dimension = {
+		.name = "tid",
+		.cmp  = id_or_cpu_r_cmp,
+	};
 	struct sort_dimension *available_sorts[] = {
 		&id_sort_dimension,
 		&max_sort_dimension,
@@ -181,6 +203,7 @@ static int sort_dimension__add(struct perf_kwork *kwork __maybe_unused,
 		&runtime_sort_dimension,
 		&avg_sort_dimension,
 		&rate_sort_dimension,
+		&tid_sort_dimension,
 	};
 
 	if (kwork->report == KWORK_REPORT_LATENCY)
@@ -381,6 +404,17 @@ static void profile_update_timespan(struct perf_kwork *kwork,
 		kwork->timeend = sample->time;
 }
 
+static bool profile_name_match(struct perf_kwork *kwork,
+			       struct kwork_work *work)
+{
+	if (kwork->profile_name && work->name &&
+	    (strcmp(work->name, kwork->profile_name) != 0)) {
+		return false;
+	}
+
+	return true;
+}
+
 static bool profile_event_match(struct perf_kwork *kwork,
 				struct kwork_work *work,
 				struct perf_sample *sample)
@@ -396,10 +430,14 @@ static bool profile_event_match(struct perf_kwork *kwork,
 	    ((ptime->end != 0) && (ptime->end < time)))
 		return false;
 
-	if ((kwork->profile_name != NULL) &&
-	    (work->name != NULL) &&
-	    (strcmp(work->name, kwork->profile_name) != 0))
+	/*
+	 * report top needs to collect the runtime of all tasks to
+	 * calculate the load of each core.
+	 */
+	if ((kwork->report != KWORK_REPORT_TOP) &&
+	    !profile_name_match(kwork, work)) {
 		return false;
+	}
 
 	profile_update_timespan(kwork, sample);
 	return true;
@@ -2070,6 +2108,9 @@ static void top_merge_tasks(struct perf_kwork *kwork)
 		rb_erase_cached(node, &class->work_root);
 		data = rb_entry(node, struct kwork_work, node);
 
+		if (!profile_name_match(kwork, data))
+			continue;
+
 		cpu = data->cpu;
 		merged_work = find_work_by_id(&merged_root, data->id,
 					      data->id == 0 ? cpu : -1);
@@ -2329,6 +2370,16 @@ int cmd_kwork(int argc, const char **argv)
 	OPT_PARENT(kwork_options)
 	};
 	const struct option top_options[] = {
+	OPT_STRING('s', "sort", &kwork.sort_order, "key[,key2...]",
+		   "sort by key(s): rate, runtime, tid"),
+	OPT_STRING('C', "cpu", &kwork.cpu_list, "cpu",
+		   "list of cpus to profile"),
+	OPT_STRING('n', "name", &kwork.profile_name, "name",
+		   "event name to profile"),
+	OPT_STRING(0, "time", &kwork.time_str, "str",
+		   "Time span for analysis (start,stop)"),
+	OPT_STRING('i', "input", &input_name, "file",
+		   "input file name"),
 	OPT_PARENT(kwork_options)
 	};
 	const char *kwork_usage[] = {
-- 
2.30.GIT



* [RFC v1 14/16] perf kwork top: Implements BPF-based cpu usage statistics
  2023-08-12  8:49 [RFC v1 00/16] perf kwork: Implement perf kwork top Yang Jihong
                   ` (12 preceding siblings ...)
  2023-08-12  8:49 ` [RFC v1 13/16] perf kwork top: Add -C/--cpu -i/--input -n/--name -s/--sort --time options Yang Jihong
@ 2023-08-12  8:49 ` Yang Jihong
  2023-08-12  8:49 ` [RFC v1 15/16] perf kwork top: Add BPF-based statistics on hardirq event support Yang Jihong
                   ` (2 subsequent siblings)
  16 siblings, 0 replies; 26+ messages in thread
From: Yang Jihong @ 2023-08-12  8:49 UTC (permalink / raw)
  To: peterz, mingo, acme, mark.rutland, alexander.shishkin, jolsa,
	namhyung, irogers, adrian.hunter, kan.liang, sandipan.das,
	ravi.bangoria, linux-kernel, linux-perf-users
  Cc: yangjihong1

Use BPF to collect CPU usage statistics, based on perf BPF skeletons.
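The accounting is driven by sched_switch: the task being scheduled in gets a timestamp, and the task being scheduled out is charged the delta since its own timestamp, or since trace start for tasks that were already running. A minimal plain-C model of that bookkeeping (simplified arrays stand in for the BPF task storage and per-cpu hash maps used by the patch):

```c
#include <assert.h>

#define MAX_TASKS 8

/* Simplified stand-ins for the patch's time_data/work_data state. */
static unsigned long long stamp[MAX_TASKS];    /* last sched_in, ns; 0 = unseen */
static unsigned long long runtime[MAX_TASKS];  /* accumulated on-cpu time, ns  */
static unsigned long long from_timestamp;      /* trace start, ns              */

static void on_sched_in(int pid, unsigned long long ts)
{
	stamp[pid] = ts;
}

static void on_sched_out(int pid, unsigned long long ts)
{
	/* A task already running when tracing began has no stored
	 * timestamp, so charge it from the trace start instead. */
	unsigned long long begin = stamp[pid] ? stamp[pid] : from_timestamp;

	runtime[pid] += ts - begin;
}
```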

Example usage:

  # perf kwork top -h

   Usage: perf kwork top [<options>]

      -b, --use-bpf         Use BPF to measure task cpu usage
      -C, --cpu <cpu>       list of cpus to profile
      -i, --input <file>    input file name
      -n, --name <name>     event name to profile
      -s, --sort <key[,key2...]>
                            sort by key(s): rate, runtime, tid
          --time <str>      Time span for analysis (start,stop)

  #
  # perf kwork -k sched top -b
  Starting trace, Hit <Ctrl+C> to stop and report
  ^C
  Total  : 160702.425 ms, 8 cpus
  %Cpu(s):  36.00% id,   0.00% hi,   0.00% si
  %Cpu0   [||||||||||||||||||              61.66%]
  %Cpu1   [||||||||||||||||||              61.27%]
  %Cpu2   [|||||||||||||||||||             66.40%]
  %Cpu3   [||||||||||||||||||              61.28%]
  %Cpu4   [||||||||||||||||||              61.82%]
  %Cpu5   [|||||||||||||||||||||||         77.41%]
  %Cpu6   [||||||||||||||||||              61.73%]
  %Cpu7   [||||||||||||||||||              63.25%]

        PID     SPID    %CPU           RUNTIME  COMMMAND
    -------------------------------------------------------------
          0        0   38.72       8089.463 ms  [swapper/1]
          0        0   38.71       8084.547 ms  [swapper/3]
          0        0   38.33       8007.532 ms  [swapper/0]
          0        0   38.26       7992.985 ms  [swapper/6]
          0        0   38.17       7971.865 ms  [swapper/4]
          0        0   36.74       7447.765 ms  [swapper/7]
          0        0   33.59       6486.942 ms  [swapper/2]
          0        0   22.58       3771.268 ms  [swapper/5]
       9545     9351    2.48        447.136 ms  sched-messaging
       9574     9351    2.09        418.583 ms  sched-messaging
       9724     9351    2.05        372.407 ms  sched-messaging
       9531     9351    2.01        368.804 ms  sched-messaging
       9512     9351    2.00        362.250 ms  sched-messaging
       9514     9351    1.95        357.767 ms  sched-messaging
       9538     9351    1.86        384.476 ms  sched-messaging
       9712     9351    1.84        386.490 ms  sched-messaging
       9723     9351    1.83        380.021 ms  sched-messaging
       9722     9351    1.82        382.738 ms  sched-messaging
       9517     9351    1.81        354.794 ms  sched-messaging
       9559     9351    1.79        344.305 ms  sched-messaging
       9725     9351    1.77        365.315 ms  sched-messaging
  <SNIP>

  # perf kwork -k sched top -b -n perf
  Starting trace, Hit <Ctrl+C> to stop and report
  ^C
  Total  : 151563.332 ms, 8 cpus
  %Cpu(s):  26.49% id,   0.00% hi,   0.00% si
  %Cpu0   [                                 0.01%]
  %Cpu1   [                                 0.00%]
  %Cpu2   [                                 0.00%]
  %Cpu3   [                                 0.00%]
  %Cpu4   [                                 0.00%]
  %Cpu5   [                                 0.00%]
  %Cpu6   [                                 0.00%]
  %Cpu7   [                                 0.00%]

        PID     SPID    %CPU           RUNTIME  COMMMAND
    -------------------------------------------------------------
       9754     9754    0.01          2.303 ms  perf

  #
  # perf kwork -k sched top -b -C 2,3,4
  Starting trace, Hit <Ctrl+C> to stop and report
  ^C
  Total  :  48016.721 ms, 3 cpus
  %Cpu(s):  27.82% id,   0.00% hi,   0.00% si
  %Cpu2   [||||||||||||||||||||||          74.68%]
  %Cpu3   [|||||||||||||||||||||           71.06%]
  %Cpu4   [|||||||||||||||||||||           70.91%]

        PID     SPID    %CPU           RUNTIME  COMMMAND
    -------------------------------------------------------------
          0        0   29.08       4734.998 ms  [swapper/4]
          0        0   28.93       4710.029 ms  [swapper/3]
          0        0   25.31       3912.363 ms  [swapper/2]
      10248    10158    1.62        264.931 ms  sched-messaging
      10253    10158    1.62        265.136 ms  sched-messaging
      10158    10158    1.60        263.013 ms  bash
      10360    10158    1.49        243.639 ms  sched-messaging
      10413    10158    1.48        238.604 ms  sched-messaging
      10531    10158    1.47        234.067 ms  sched-messaging
      10400    10158    1.47        240.631 ms  sched-messaging
      10355    10158    1.47        230.586 ms  sched-messaging
      10377    10158    1.43        234.835 ms  sched-messaging
      10526    10158    1.42        232.045 ms  sched-messaging
      10298    10158    1.41        222.396 ms  sched-messaging
      10410    10158    1.38        221.853 ms  sched-messaging
      10364    10158    1.38        226.042 ms  sched-messaging
      10480    10158    1.36        213.633 ms  sched-messaging
      10370    10158    1.36        223.620 ms  sched-messaging
      10553    10158    1.34        217.169 ms  sched-messaging
      10291    10158    1.34        211.516 ms  sched-messaging
      10251    10158    1.34        218.813 ms  sched-messaging
      10522    10158    1.33        218.498 ms  sched-messaging
      10288    10158    1.33        216.787 ms  sched-messaging
  <SNIP>
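Reading the results back (perf_kwork__top_read_bpf() below) walks kwork_top_works, a BPF_MAP_TYPE_PERCPU_HASH: each key lookup returns one value slot per possible CPU, and every nonzero slot becomes its own per-cpu work item. A simplified, self-contained model of that walk, with plain arrays and a hypothetical helper name:

```c
#include <assert.h>

#define NR_CPUS 4

struct work_data {
	unsigned long long runtime;
};

/* One per-cpu hash lookup yields NR_CPUS slots for the same key;
 * skip idle slots, sum the rest and count how many entries the
 * report would create for this task. */
static unsigned long long collect_slots(const struct work_data *data,
					int *nr_entries)
{
	unsigned long long sum = 0;
	int i, n = 0;

	for (i = 0; i < NR_CPUS; i++) {
		if (data[i].runtime == 0)
			continue;	/* task never ran on this cpu */

		sum += data[i].runtime;
		n++;
	}

	*nr_entries = n;
	return sum;
}
```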

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
---
 tools/perf/Documentation/perf-kwork.txt  |   5 +
 tools/perf/Makefile.perf                 |   2 +-
 tools/perf/builtin-kwork.c               |  62 ++++-
 tools/perf/util/Build                    |   1 +
 tools/perf/util/bpf_kwork_top.c          | 286 +++++++++++++++++++++++
 tools/perf/util/bpf_skel/kwork_top.bpf.c | 187 +++++++++++++++
 tools/perf/util/kwork.h                  |  26 +++
 7 files changed, 565 insertions(+), 4 deletions(-)
 create mode 100644 tools/perf/util/bpf_kwork_top.c
 create mode 100644 tools/perf/util/bpf_skel/kwork_top.bpf.c

diff --git a/tools/perf/Documentation/perf-kwork.txt b/tools/perf/Documentation/perf-kwork.txt
index 34d6c285e527..109ace1d5e90 100644
--- a/tools/perf/Documentation/perf-kwork.txt
+++ b/tools/perf/Documentation/perf-kwork.txt
@@ -33,6 +33,7 @@ There are several variants of 'perf kwork':
         perf kwork latency -b
         perf kwork timehist
         perf kwork top
+        perf kwork top -b
 
    By default it shows the individual work events such as irq, workqeueu,
    including the run time and delay (time between raise and actually entry):
@@ -181,6 +182,10 @@ OPTIONS for 'perf kwork timehist'
 OPTIONS for 'perf kwork top'
 ---------------------------------
 
+-b::
+--use-bpf::
+	Use BPF to measure task cpu usage.
+
 -C::
 --cpu::
 	Only show events for the given CPU(s) (comma separated list).
diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index 0ed7ee0c1665..bc61c0d0aae4 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -1054,7 +1054,7 @@ SKELETONS += $(SKEL_OUT)/bperf_leader.skel.h $(SKEL_OUT)/bperf_follower.skel.h
 SKELETONS += $(SKEL_OUT)/bperf_cgroup.skel.h $(SKEL_OUT)/func_latency.skel.h
 SKELETONS += $(SKEL_OUT)/off_cpu.skel.h $(SKEL_OUT)/lock_contention.skel.h
 SKELETONS += $(SKEL_OUT)/kwork_trace.skel.h $(SKEL_OUT)/sample_filter.skel.h
-SKELETONS += $(SKEL_OUT)/bench_uprobe.skel.h
+SKELETONS += $(SKEL_OUT)/bench_uprobe.skel.h $(SKEL_OUT)/kwork_top.skel.h
 
 $(SKEL_TMP_OUT) $(LIBAPI_OUTPUT) $(LIBBPF_OUTPUT) $(LIBPERF_OUTPUT) $(LIBSUBCMD_OUTPUT) $(LIBSYMBOL_OUTPUT):
 	$(Q)$(MKDIR) -p $@
diff --git a/tools/perf/builtin-kwork.c b/tools/perf/builtin-kwork.c
index d5949ff4bd15..04b966801643 100644
--- a/tools/perf/builtin-kwork.c
+++ b/tools/perf/builtin-kwork.c
@@ -1619,8 +1619,14 @@ static void top_print_header(struct perf_kwork *kwork __maybe_unused)
 	int ret;
 
 	printf("\n ");
-	ret = printf(" %*s  %*s  %*s  %-*s",
+	ret = printf(" %*s %s%*s%s %*s  %*s  %-*s",
 		     PRINT_PID_WIDTH, "PID",
+
+		     kwork->use_bpf ? " " : "",
+		     kwork->use_bpf ? PRINT_PID_WIDTH : 0,
+		     kwork->use_bpf ? "SPID" : "",
+		     kwork->use_bpf ? " " : "",
+
 		     PRINT_CPU_USAGE_WIDTH, "%CPU",
 		     PRINT_RUNTIME_HEADER_WIDTH + RPINT_DECIMAL_WIDTH, "RUNTIME",
 		     PRINT_TASK_NAME_WIDTH, "COMMMAND");
@@ -1639,6 +1645,12 @@ static int top_print_work(struct perf_kwork *kwork __maybe_unused, struct kwork_
 	 */
 	ret += printf(" %*ld ", PRINT_PID_WIDTH, work->id);
 
+	/*
+	 * tgid
+	 */
+	if (kwork->use_bpf)
+		ret += printf(" %*d ", PRINT_PID_WIDTH, work->tgid);
+
 	/*
 	 * cpu usage
 	 */
@@ -1656,7 +1668,13 @@ static int top_print_work(struct perf_kwork *kwork __maybe_unused, struct kwork_
 	/*
 	 * command
 	 */
-	ret += printf(" %-*s", PRINT_TASK_NAME_WIDTH, work->name);
+	if (kwork->use_bpf)
+		ret += printf(" %s%s%s",
+			      work->is_kthread ? "[" : "",
+			      work->name,
+			      work->is_kthread ? "]" : "");
+	else
+		ret += printf(" %-*s", PRINT_TASK_NAME_WIDTH, work->name);
 
 	printf("\n");
 	return ret;
@@ -2153,6 +2171,36 @@ static void perf_kwork__top_report(struct perf_kwork *kwork)
 	printf("\n");
 }
 
+static int perf_kwork__top_bpf(struct perf_kwork *kwork)
+{
+	int ret;
+
+	signal(SIGINT, sig_handler);
+	signal(SIGTERM, sig_handler);
+
+	ret = perf_kwork__top_prepare_bpf(kwork);
+	if (ret)
+		return -1;
+
+	printf("Starting trace, Hit <Ctrl+C> to stop and report\n");
+
+	perf_kwork__top_start();
+
+	/*
+	 * a simple pause, wait here for stop signal
+	 */
+	pause();
+
+	perf_kwork__top_finish();
+
+	perf_kwork__top_read_bpf(kwork);
+
+	perf_kwork__top_cleanup_bpf();
+
+	return 0;
+
+}
+
 static int perf_kwork__top(struct perf_kwork *kwork)
 {
 	struct __top_cpus_runtime *cpus_runtime;
@@ -2165,7 +2213,11 @@ static int perf_kwork__top(struct perf_kwork *kwork)
 	kwork->top_stat.cpus_runtime = cpus_runtime;
 	bitmap_zero(kwork->top_stat.all_cpus_bitmap, MAX_NR_CPUS);
 
-	ret = perf_kwork__read_events(kwork);
+	if (kwork->use_bpf)
+		ret = perf_kwork__top_bpf(kwork);
+	else
+		ret = perf_kwork__read_events(kwork);
+
 	if (ret)
 		goto out;
 
@@ -2380,6 +2432,10 @@ int cmd_kwork(int argc, const char **argv)
 		   "Time span for analysis (start,stop)"),
 	OPT_STRING('i', "input", &input_name, "file",
 		   "input file name"),
+#ifdef HAVE_BPF_SKEL
+	OPT_BOOLEAN('b', "use-bpf", &kwork.use_bpf,
+		    "Use BPF to measure task cpu usage"),
+#endif
 	OPT_PARENT(kwork_options)
 	};
 	const char *kwork_usage[] = {
diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index d487aec0b458..a46ac80de366 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -165,6 +165,7 @@ endif
 
 ifeq ($(CONFIG_LIBTRACEEVENT),y)
   perf-$(CONFIG_PERF_BPF_SKEL) += bpf_kwork.o
+  perf-$(CONFIG_PERF_BPF_SKEL) += bpf_kwork_top.o
 endif
 
 perf-$(CONFIG_BPF_PROLOGUE) += bpf-prologue.o
diff --git a/tools/perf/util/bpf_kwork_top.c b/tools/perf/util/bpf_kwork_top.c
new file mode 100644
index 000000000000..42897ea22c61
--- /dev/null
+++ b/tools/perf/util/bpf_kwork_top.c
@@ -0,0 +1,286 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * bpf_kwork_top.c
+ *
+ * Copyright (c) 2022  Huawei Inc,  Yang Jihong <yangjihong1@huawei.com>
+ */
+
+#include <time.h>
+#include <fcntl.h>
+#include <signal.h>
+#include <stdio.h>
+#include <unistd.h>
+
+#include <linux/time64.h>
+
+#include "util/debug.h"
+#include "util/evsel.h"
+#include "util/kwork.h"
+
+#include <bpf/bpf.h>
+#include <perf/cpumap.h>
+
+#include "util/bpf_skel/kwork_top.skel.h"
+
+/*
+ * This should be in sync with "util/bpf_skel/kwork_top.bpf.c"
+ */
+#define MAX_COMMAND_LEN 16
+
+struct time_data {
+	__u64 timestamp;
+};
+
+struct work_data {
+	__u64 runtime;
+};
+
+struct task_data {
+	__u32 tgid;
+	__u32 is_kthread;
+	char comm[MAX_COMMAND_LEN];
+};
+
+struct work_key {
+	__u32 type;
+	__u32 pid;
+	__u64 task_p;
+};
+
+struct task_key {
+	__u32 pid;
+	__u32 cpu;
+};
+
+struct kwork_class_bpf {
+	struct kwork_class *class;
+	void (*load_prepare)(void);
+};
+
+static struct kwork_top_bpf *skel;
+
+void perf_kwork__top_start(void)
+{
+	struct timespec ts;
+
+	clock_gettime(CLOCK_MONOTONIC, &ts);
+	skel->bss->from_timestamp = (u64)ts.tv_sec * NSEC_PER_SEC + ts.tv_nsec;
+	skel->bss->enabled = 1;
+	pr_debug("perf kwork top start at: %lld\n", skel->bss->from_timestamp);
+}
+
+void perf_kwork__top_finish(void)
+{
+	struct timespec ts;
+
+	skel->bss->enabled = 0;
+	clock_gettime(CLOCK_MONOTONIC, &ts);
+	skel->bss->to_timestamp = (u64)ts.tv_sec * NSEC_PER_SEC + ts.tv_nsec;
+	pr_debug("perf kwork top finish at: %lld\n", skel->bss->to_timestamp);
+}
+
+static void sched_load_prepare(void)
+{
+	bpf_program__set_autoload(skel->progs.on_switch, true);
+}
+
+static struct kwork_class_bpf kwork_sched_bpf = {
+	.load_prepare = sched_load_prepare,
+};
+
+static struct kwork_class_bpf *
+kwork_class_bpf_supported_list[KWORK_CLASS_MAX] = {
+	[KWORK_CLASS_SCHED]	= &kwork_sched_bpf,
+};
+
+static bool valid_kwork_class_type(enum kwork_class_type type)
+{
+	return type >= 0 && type < KWORK_CLASS_MAX;
+}
+
+static int setup_filters(struct perf_kwork *kwork)
+{
+	u8 val = 1;
+	int i, nr_cpus, fd;
+	struct perf_cpu_map *map;
+
+	if (kwork->cpu_list) {
+		fd = bpf_map__fd(skel->maps.kwork_top_cpu_filter);
+		if (fd < 0) {
+			pr_debug("Invalid cpu filter fd\n");
+			return -1;
+		}
+
+		map = perf_cpu_map__new(kwork->cpu_list);
+		if (!map) {
+			pr_debug("Invalid cpu_list\n");
+			return -1;
+		}
+
+		nr_cpus = libbpf_num_possible_cpus();
+		for (i = 0; i < perf_cpu_map__nr(map); i++) {
+			struct perf_cpu cpu = perf_cpu_map__cpu(map, i);
+
+			if (cpu.cpu >= nr_cpus) {
+				perf_cpu_map__put(map);
+				pr_err("Requested cpu %d too large\n", cpu.cpu);
+				return -1;
+			}
+			bpf_map_update_elem(fd, &cpu.cpu, &val, BPF_ANY);
+		}
+		perf_cpu_map__put(map);
+
+		skel->bss->has_cpu_filter = 1;
+	}
+
+	return 0;
+}
+
+int perf_kwork__top_prepare_bpf(struct perf_kwork *kwork)
+{
+	struct bpf_program *prog;
+	struct kwork_class *class;
+	struct kwork_class_bpf *class_bpf;
+	enum kwork_class_type type;
+
+	skel = kwork_top_bpf__open();
+	if (!skel) {
+		pr_debug("Failed to open kwork top skeleton\n");
+		return -1;
+	}
+
+	/*
+	 * set all progs to non-autoload,
+	 * then set corresponding progs according to config
+	 */
+	bpf_object__for_each_program(prog, skel->obj)
+		bpf_program__set_autoload(prog, false);
+
+	list_for_each_entry(class, &kwork->class_list, list) {
+		type = class->type;
+		if (!valid_kwork_class_type(type) ||
+		    !kwork_class_bpf_supported_list[type]) {
+			pr_err("Unsupported bpf trace class %s\n", class->name);
+			goto out;
+		}
+
+		class_bpf = kwork_class_bpf_supported_list[type];
+		class_bpf->class = class;
+
+		if (class_bpf->load_prepare)
+			class_bpf->load_prepare();
+	}
+
+	if (kwork_top_bpf__load(skel)) {
+		pr_debug("Failed to load kwork top skeleton\n");
+		goto out;
+	}
+
+	if (setup_filters(kwork))
+		goto out;
+
+	if (kwork_top_bpf__attach(skel)) {
+		pr_debug("Failed to attach kwork top skeleton\n");
+		goto out;
+	}
+
+	return 0;
+
+out:
+	kwork_top_bpf__destroy(skel);
+	return -1;
+}
+
+static void read_task_info(struct kwork_work *work)
+{
+	int fd;
+	struct task_data data;
+	struct task_key key = {
+		.pid = work->id,
+		.cpu = work->cpu,
+	};
+
+	fd = bpf_map__fd(skel->maps.kwork_top_tasks);
+	if (fd < 0) {
+		pr_debug("Invalid top tasks map fd\n");
+		return;
+	}
+
+	if (!bpf_map_lookup_elem(fd, &key, &data)) {
+		work->tgid = data.tgid;
+		work->is_kthread = data.is_kthread;
+		work->name = strdup(data.comm);
+	}
+}
+static int add_work(struct perf_kwork *kwork, struct work_key *key,
+		    struct work_data *data, int cpu)
+{
+	struct kwork_class_bpf *bpf_trace;
+	struct kwork_work *work;
+	struct kwork_work tmp = {
+		.id = key->pid,
+		.cpu = cpu,
+		.name = NULL,
+	};
+	enum kwork_class_type type = key->type;
+
+	if (!valid_kwork_class_type(type)) {
+		pr_debug("Invalid class type %d to add work\n", type);
+		return -1;
+	}
+
+	bpf_trace = kwork_class_bpf_supported_list[type];
+	tmp.class = bpf_trace->class;
+
+	work = perf_kwork_add_work(kwork, tmp.class, &tmp);
+	if (!work)
+		return -1;
+
+	work->total_runtime = data->runtime;
+	read_task_info(work);
+
+	return 0;
+}
+
+int perf_kwork__top_read_bpf(struct perf_kwork *kwork)
+{
+	int i, fd, nr_cpus;
+	struct work_data *data;
+	struct work_key key, prev;
+
+	fd = bpf_map__fd(skel->maps.kwork_top_works);
+	if (fd < 0) {
+		pr_debug("Invalid top runtime fd\n");
+		return -1;
+	}
+
+	nr_cpus = libbpf_num_possible_cpus();
+	data = calloc(nr_cpus, sizeof(struct work_data));
+	if (!data)
+		return -1;
+
+	memset(&prev, 0, sizeof(prev));
+	while (!bpf_map_get_next_key(fd, &prev, &key)) {
+		if ((bpf_map_lookup_elem(fd, &key, data)) != 0) {
+			pr_debug("Failed to lookup top elem\n");
+			return -1;
+		}
+
+		for (i = 0; i < nr_cpus; i++) {
+			if (data[i].runtime == 0)
+				continue;
+
+			if (add_work(kwork, &key, &data[i], i))
+				return -1;
+		}
+		prev = key;
+	}
+	free(data);
+
+	return 0;
+}
+
+void perf_kwork__top_cleanup_bpf(void)
+{
+	kwork_top_bpf__destroy(skel);
+}
diff --git a/tools/perf/util/bpf_skel/kwork_top.bpf.c b/tools/perf/util/bpf_skel/kwork_top.bpf.c
new file mode 100644
index 000000000000..47ad61608ec7
--- /dev/null
+++ b/tools/perf/util/bpf_skel/kwork_top.bpf.c
@@ -0,0 +1,187 @@
+// SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+// Copyright (c) 2022, Huawei
+
+#include "vmlinux.h"
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+#include <bpf/bpf_core_read.h>
+
+/*
+ * This should be in sync with "util/kwork.h"
+ */
+enum kwork_class_type {
+	KWORK_CLASS_IRQ,
+	KWORK_CLASS_SOFTIRQ,
+	KWORK_CLASS_WORKQUEUE,
+	KWORK_CLASS_SCHED,
+	KWORK_CLASS_MAX,
+};
+
+#define MAX_ENTRIES     102400
+#define MAX_NR_CPUS     2048
+#define PF_KTHREAD      0x00200000
+#define MAX_COMMAND_LEN 16
+
+struct time_data {
+	__u64 timestamp;
+};
+
+struct work_data {
+	__u64 runtime;
+};
+
+struct task_data {
+	__u32 tgid;
+	__u32 is_kthread;
+	char comm[MAX_COMMAND_LEN];
+};
+
+struct work_key {
+	__u32 type;
+	__u32 pid;
+	__u64 task_p;
+};
+
+struct task_key {
+	__u32 pid;
+	__u32 cpu;
+};
+
+struct {
+	__uint(type, BPF_MAP_TYPE_TASK_STORAGE);
+	__uint(map_flags, BPF_F_NO_PREALLOC);
+	__type(key, int);
+	__type(value, struct time_data);
+} kwork_top_task_time SEC(".maps");
+
+struct {
+	__uint(type, BPF_MAP_TYPE_HASH);
+	__uint(key_size, sizeof(struct task_key));
+	__uint(value_size, sizeof(struct task_data));
+	__uint(max_entries, MAX_ENTRIES);
+} kwork_top_tasks SEC(".maps");
+
+struct {
+	__uint(type, BPF_MAP_TYPE_PERCPU_HASH);
+	__uint(key_size, sizeof(struct work_key));
+	__uint(value_size, sizeof(struct work_data));
+	__uint(max_entries, MAX_ENTRIES);
+} kwork_top_works SEC(".maps");
+
+struct {
+	__uint(type, BPF_MAP_TYPE_HASH);
+	__uint(key_size, sizeof(u32));
+	__uint(value_size, sizeof(u8));
+	__uint(max_entries, MAX_NR_CPUS);
+} kwork_top_cpu_filter SEC(".maps");
+
+int enabled = 0;
+
+int has_cpu_filter = 0;
+
+__u64 from_timestamp = 0;
+__u64 to_timestamp = 0;
+
+static __always_inline int cpu_is_filtered(__u32 cpu)
+{
+	__u8 *cpu_val;
+
+	if (has_cpu_filter) {
+		cpu_val = bpf_map_lookup_elem(&kwork_top_cpu_filter, &cpu);
+		if (!cpu_val)
+			return 1;
+	}
+
+	return 0;
+}
+
+static __always_inline void update_task_info(struct task_struct *task, __u32 cpu)
+{
+	struct task_key key = {
+		.pid = task->pid,
+		.cpu = cpu,
+	};
+
+	if (!bpf_map_lookup_elem(&kwork_top_tasks, &key)) {
+		struct task_data data = {
+			.tgid = task->tgid,
+			.is_kthread = task->flags & PF_KTHREAD ? 1 : 0,
+		};
+		BPF_CORE_READ_STR_INTO(&data.comm, task, comm);
+
+		bpf_map_update_elem(&kwork_top_tasks, &key, &data, BPF_ANY);
+	}
+}
+
+static __always_inline void update_work(struct work_key *key, __u64 delta)
+{
+	struct work_data *data;
+
+	data = bpf_map_lookup_elem(&kwork_top_works, key);
+	if (data) {
+		data->runtime += delta;
+	} else {
+		struct work_data new_data = {
+			.runtime = delta,
+		};
+
+		bpf_map_update_elem(&kwork_top_works, key, &new_data, BPF_ANY);
+	}
+}
+
+static void on_sched_out(struct task_struct *task, __u64 ts, __u32 cpu)
+{
+	__u64 delta;
+	struct time_data *pelem;
+
+	pelem = bpf_task_storage_get(&kwork_top_task_time, task, NULL, 0);
+	if (pelem)
+		delta = ts - pelem->timestamp;
+	else
+		delta = ts - from_timestamp;
+
+	struct work_key key = {
+		.type = KWORK_CLASS_SCHED,
+		.pid = task->pid,
+		.task_p = (__u64)task,
+	};
+
+	update_work(&key, delta);
+	update_task_info(task, cpu);
+}
+
+static void on_sched_in(struct task_struct *task, __u64 ts)
+{
+	struct time_data *pelem;
+
+	pelem = bpf_task_storage_get(&kwork_top_task_time, task, NULL,
+				     BPF_LOCAL_STORAGE_GET_F_CREATE);
+	if (pelem)
+		pelem->timestamp = ts;
+}
+
+SEC("tp_btf/sched_switch")
+int on_switch(u64 *ctx)
+{
+	struct task_struct *prev, *next;
+
+	prev = (struct task_struct *)ctx[1];
+	next = (struct task_struct *)ctx[2];
+
+	if (!enabled)
+		return 0;
+
+	__u32 cpu = bpf_get_smp_processor_id();
+
+	if (cpu_is_filtered(cpu))
+		return 0;
+
+	__u64 ts = bpf_ktime_get_ns();
+
+	on_sched_out(prev, ts, cpu);
+	on_sched_in(next, ts);
+
+	return 0;
+}
+
+char LICENSE[] SEC("license") = "Dual BSD/GPL";
diff --git a/tools/perf/util/kwork.h b/tools/perf/util/kwork.h
index 723b34385df6..76fe2a821bcf 100644
--- a/tools/perf/util/kwork.h
+++ b/tools/perf/util/kwork.h
@@ -135,6 +135,8 @@ struct kwork_work {
 	 * top report
 	 */
 	u32 cpu_usage;
+	u32 tgid;
+	bool is_kthread;
 };
 
 struct kwork_class {
@@ -264,6 +266,13 @@ void perf_kwork__report_cleanup_bpf(void);
 void perf_kwork__trace_start(void);
 void perf_kwork__trace_finish(void);
 
+int perf_kwork__top_prepare_bpf(struct perf_kwork *kwork);
+int perf_kwork__top_read_bpf(struct perf_kwork *kwork);
+void perf_kwork__top_cleanup_bpf(void);
+
+void perf_kwork__top_start(void);
+void perf_kwork__top_finish(void);
+
 #else  /* !HAVE_BPF_SKEL */
 
 static inline int
@@ -283,6 +292,23 @@ static inline void perf_kwork__report_cleanup_bpf(void) {}
 static inline void perf_kwork__trace_start(void) {}
 static inline void perf_kwork__trace_finish(void) {}
 
+static inline int
+perf_kwork__top_prepare_bpf(struct perf_kwork *kwork __maybe_unused)
+{
+	return -1;
+}
+
+static inline int
+perf_kwork__top_read_bpf(struct perf_kwork *kwork __maybe_unused)
+{
+	return -1;
+}
+
+static inline void perf_kwork__top_cleanup_bpf(void) {}
+
+static inline void perf_kwork__top_start(void) {}
+static inline void perf_kwork__top_finish(void) {}
+
 #endif  /* HAVE_BPF_SKEL */
 
 #endif  /* PERF_UTIL_KWORK_H */
-- 
2.30.GIT


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [RFC v1 15/16] perf kwork top: Add BPF-based statistics on hardirq event support
  2023-08-12  8:49 [RFC v1 00/16] perf kwork: Implement perf kwork top Yang Jihong
                   ` (13 preceding siblings ...)
  2023-08-12  8:49 ` [RFC v1 14/16] perf kwork top: Implements BPF-based cpu usage statistics Yang Jihong
@ 2023-08-12  8:49 ` Yang Jihong
  2023-08-12  8:49 ` [RFC v1 16/16] perf kwork top: Add BPF-based statistics on softirq " Yang Jihong
  2023-09-04  5:13 ` [RFC v1 00/16] perf kwork: Implement perf kwork top Ian Rogers
  16 siblings, 0 replies; 26+ messages in thread
From: Yang Jihong @ 2023-08-12  8:49 UTC (permalink / raw)
  To: peterz, mingo, acme, mark.rutland, alexander.shishkin, jolsa,
	namhyung, irogers, adrian.hunter, kan.liang, sandipan.das,
	ravi.bangoria, linux-kernel, linux-perf-users
  Cc: yangjihong1

Use BPF to collect statistics on hardirq events, based on perf BPF skeletons.

Example usage:

  # perf kwork top -k sched,irq -b
  Starting trace, Hit <Ctrl+C> to stop and report
  ^C
  Total  : 136717.945 ms, 8 cpus
  %Cpu(s):  17.10% id,   0.01% hi,   0.00% si
  %Cpu0   [|||||||||||||||||||||||||       84.26%]
  %Cpu1   [|||||||||||||||||||||||||       84.77%]
  %Cpu2   [||||||||||||||||||||||||        83.22%]
  %Cpu3   [||||||||||||||||||||||||        80.37%]
  %Cpu4   [||||||||||||||||||||||||        81.49%]
  %Cpu5   [|||||||||||||||||||||||||       84.68%]
  %Cpu6   [|||||||||||||||||||||||||       84.48%]
  %Cpu7   [||||||||||||||||||||||||        80.21%]

        PID     SPID    %CPU           RUNTIME  COMMMAND
    -------------------------------------------------------------
          0        0   19.78       3482.833 ms  [swapper/7]
          0        0   19.62       3454.219 ms  [swapper/3]
          0        0   18.50       3258.339 ms  [swapper/4]
          0        0   16.76       2842.749 ms  [swapper/2]
          0        0   15.71       2627.905 ms  [swapper/0]
          0        0   15.51       2598.206 ms  [swapper/6]
          0        0   15.31       2561.820 ms  [swapper/5]
          0        0   15.22       2548.708 ms  [swapper/1]
      13253    13018    2.95        513.108 ms  sched-messaging
      13092    13018    2.67        454.167 ms  sched-messaging
      13401    13018    2.66        454.790 ms  sched-messaging
      13240    13018    2.64        454.587 ms  sched-messaging
      13251    13018    2.61        442.273 ms  sched-messaging
      13075    13018    2.61        438.932 ms  sched-messaging
      13220    13018    2.60        443.245 ms  sched-messaging
      13235    13018    2.59        443.268 ms  sched-messaging
      13222    13018    2.50        426.344 ms  sched-messaging
      13410    13018    2.49        426.191 ms  sched-messaging
      13228    13018    2.46        425.121 ms  sched-messaging
      13379    13018    2.38        409.950 ms  sched-messaging
      13236    13018    2.37        413.159 ms  sched-messaging
      13095    13018    2.36        396.572 ms  sched-messaging
      13325    13018    2.35        408.089 ms  sched-messaging
      13242    13018    2.32        394.750 ms  sched-messaging
      13386    13018    2.31        396.997 ms  sched-messaging
      13046    13018    2.29        383.833 ms  sched-messaging
      13109    13018    2.28        388.482 ms  sched-messaging
      13388    13018    2.28        393.576 ms  sched-messaging
      13238    13018    2.26        388.487 ms  sched-messaging
  <SNIP>

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
---
 tools/perf/util/bpf_kwork_top.c          | 11 ++++
 tools/perf/util/bpf_skel/kwork_top.bpf.c | 79 ++++++++++++++++++++++++
 2 files changed, 90 insertions(+)

diff --git a/tools/perf/util/bpf_kwork_top.c b/tools/perf/util/bpf_kwork_top.c
index 42897ea22c61..3998bd2a938f 100644
--- a/tools/perf/util/bpf_kwork_top.c
+++ b/tools/perf/util/bpf_kwork_top.c
@@ -79,6 +79,16 @@ void perf_kwork__top_finish(void)
 	pr_debug("perf kwork top finish at: %lld\n", skel->bss->to_timestamp);
 }
 
+static void irq_load_prepare(void)
+{
+	bpf_program__set_autoload(skel->progs.on_irq_handler_entry, true);
+	bpf_program__set_autoload(skel->progs.on_irq_handler_exit, true);
+}
+
+static struct kwork_class_bpf kwork_irq_bpf = {
+	.load_prepare = irq_load_prepare,
+};
+
 static void sched_load_prepare(void)
 {
 	bpf_program__set_autoload(skel->progs.on_switch, true);
@@ -90,6 +100,7 @@ static struct kwork_class_bpf kwork_sched_bpf = {
 
 static struct kwork_class_bpf *
 kwork_class_bpf_supported_list[KWORK_CLASS_MAX] = {
+	[KWORK_CLASS_IRQ]	= &kwork_irq_bpf,
 	[KWORK_CLASS_SCHED]	= &kwork_sched_bpf,
 };
 
diff --git a/tools/perf/util/bpf_skel/kwork_top.bpf.c b/tools/perf/util/bpf_skel/kwork_top.bpf.c
index 47ad61608ec7..9c7dc62386c7 100644
--- a/tools/perf/util/bpf_skel/kwork_top.bpf.c
+++ b/tools/perf/util/bpf_skel/kwork_top.bpf.c
@@ -54,6 +54,13 @@ struct {
 	__type(value, struct time_data);
 } kwork_top_task_time SEC(".maps");
 
+struct {
+	__uint(type, BPF_MAP_TYPE_PERCPU_HASH);
+	__uint(key_size, sizeof(struct work_key));
+	__uint(value_size, sizeof(struct time_data));
+	__uint(max_entries, MAX_ENTRIES);
+} kwork_top_irq_time SEC(".maps");
+
 struct {
 	__uint(type, BPF_MAP_TYPE_HASH);
 	__uint(key_size, sizeof(struct task_key));
@@ -184,4 +191,76 @@ int on_switch(u64 *ctx)
 	return 0;
 }
 
+SEC("tp_btf/irq_handler_entry")
+int on_irq_handler_entry(u64 *cxt)
+{
+	struct task_struct *task;
+
+	if (!enabled)
+		return 0;
+
+	__u32 cpu = bpf_get_smp_processor_id();
+
+	if (cpu_is_filtered(cpu))
+		return 0;
+
+	__u64 ts = bpf_ktime_get_ns();
+
+	task = (struct task_struct *)bpf_get_current_task();
+	if (!task)
+		return 0;
+
+	struct work_key key = {
+		.type = KWORK_CLASS_IRQ,
+		.pid = BPF_CORE_READ(task, pid),
+		.task_p = (__u64)task,
+	};
+
+	struct time_data data = {
+		.timestamp = ts,
+	};
+
+	bpf_map_update_elem(&kwork_top_irq_time, &key, &data, BPF_ANY);
+
+	return 0;
+}
+
+SEC("tp_btf/irq_handler_exit")
+int on_irq_handler_exit(u64 *cxt)
+{
+	__u64 delta;
+	struct task_struct *task;
+	struct time_data *pelem;
+
+	if (!enabled)
+		return 0;
+
+	__u32 cpu = bpf_get_smp_processor_id();
+
+	if (cpu_is_filtered(cpu))
+		return 0;
+
+	__u64 ts = bpf_ktime_get_ns();
+
+	task = (struct task_struct *)bpf_get_current_task();
+	if (!task)
+		return 0;
+
+	struct work_key key = {
+		.type = KWORK_CLASS_IRQ,
+		.pid = BPF_CORE_READ(task, pid),
+		.task_p = (__u64)task,
+	};
+
+	pelem = bpf_map_lookup_elem(&kwork_top_irq_time, &key);
+	if (pelem && pelem->timestamp != 0)
+		delta = ts - pelem->timestamp;
+	else
+		delta = ts - from_timestamp;
+
+	update_work(&key, delta);
+
+	return 0;
+}
+
 char LICENSE[] SEC("license") = "Dual BSD/GPL";
-- 
2.30.GIT



* [RFC v1 16/16] perf kwork top: Add BPF-based statistics on softirq event support
  2023-08-12  8:49 [RFC v1 00/16] perf kwork: Implement perf kwork top Yang Jihong
                   ` (14 preceding siblings ...)
  2023-08-12  8:49 ` [RFC v1 15/16] perf kwork top: Add BPF-based statistics on hardirq event support Yang Jihong
@ 2023-08-12  8:49 ` Yang Jihong
  2023-09-04  5:13 ` [RFC v1 00/16] perf kwork: Implement perf kwork top Ian Rogers
  16 siblings, 0 replies; 26+ messages in thread
From: Yang Jihong @ 2023-08-12  8:49 UTC (permalink / raw)
  To: peterz, mingo, acme, mark.rutland, alexander.shishkin, jolsa,
	namhyung, irogers, adrian.hunter, kan.liang, sandipan.das,
	ravi.bangoria, linux-kernel, linux-perf-users
  Cc: yangjihong1

Use BPF to collect statistics on softirq events, based on perf BPF skeletons.

Example usage:

  # perf kwork top -b
  Starting trace, Hit <Ctrl+C> to stop and report
  ^C
  Total  : 135445.704 ms, 8 cpus
  %Cpu(s):  28.35% id,   0.00% hi,   0.25% si
  %Cpu0   [||||||||||||||||||||            69.85%]
  %Cpu1   [||||||||||||||||||||||          74.10%]
  %Cpu2   [|||||||||||||||||||||           71.18%]
  %Cpu3   [||||||||||||||||||||            69.61%]
  %Cpu4   [||||||||||||||||||||||          74.05%]
  %Cpu5   [||||||||||||||||||||            69.33%]
  %Cpu6   [||||||||||||||||||||            69.71%]
  %Cpu7   [||||||||||||||||||||||          73.77%]

        PID     SPID    %CPU           RUNTIME  COMMMAND
    -------------------------------------------------------------
          0        0   30.43       5271.005 ms  [swapper/5]
          0        0   30.17       5226.644 ms  [swapper/3]
          0        0   30.08       5210.257 ms  [swapper/6]
          0        0   29.89       5177.177 ms  [swapper/0]
          0        0   28.51       4938.672 ms  [swapper/2]
          0        0   25.93       4223.464 ms  [swapper/7]
          0        0   25.69       4181.411 ms  [swapper/4]
          0        0   25.63       4173.804 ms  [swapper/1]
      16665    16265    2.16        360.600 ms  sched-messaging
      16537    16265    2.05        356.275 ms  sched-messaging
      16503    16265    2.01        343.063 ms  sched-messaging
      16424    16265    1.97        336.876 ms  sched-messaging
      16580    16265    1.94        323.658 ms  sched-messaging
      16515    16265    1.92        321.616 ms  sched-messaging
      16659    16265    1.91        325.538 ms  sched-messaging
      16634    16265    1.88        327.766 ms  sched-messaging
      16454    16265    1.87        326.843 ms  sched-messaging
      16382    16265    1.87        322.591 ms  sched-messaging
      16642    16265    1.86        320.506 ms  sched-messaging
      16582    16265    1.86        320.164 ms  sched-messaging
      16315    16265    1.86        326.872 ms  sched-messaging
      16637    16265    1.85        323.766 ms  sched-messaging
      16506    16265    1.82        311.688 ms  sched-messaging
      16512    16265    1.81        304.643 ms  sched-messaging
      16560    16265    1.80        314.751 ms  sched-messaging
      16320    16265    1.80        313.405 ms  sched-messaging
      16442    16265    1.80        314.403 ms  sched-messaging
      16626    16265    1.78        295.380 ms  sched-messaging
      16600    16265    1.77        309.444 ms  sched-messaging
      16550    16265    1.76        301.161 ms  sched-messaging
      16525    16265    1.75        296.560 ms  sched-messaging
      16314    16265    1.75        298.338 ms  sched-messaging
      16595    16265    1.74        304.390 ms  sched-messaging
      16555    16265    1.74        287.564 ms  sched-messaging
      16520    16265    1.74        295.734 ms  sched-messaging
      16507    16265    1.73        293.956 ms  sched-messaging
      16593    16265    1.72        296.443 ms  sched-messaging
      16531    16265    1.72        299.950 ms  sched-messaging
      16281    16265    1.72        301.339 ms  sched-messaging
  <SNIP>

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
---
 tools/perf/util/bpf_kwork_top.c          | 11 ++++
 tools/perf/util/bpf_skel/kwork_top.bpf.c | 72 ++++++++++++++++++++++++
 2 files changed, 83 insertions(+)

diff --git a/tools/perf/util/bpf_kwork_top.c b/tools/perf/util/bpf_kwork_top.c
index 3998bd2a938f..1a607b94f44d 100644
--- a/tools/perf/util/bpf_kwork_top.c
+++ b/tools/perf/util/bpf_kwork_top.c
@@ -89,6 +89,16 @@ static struct kwork_class_bpf kwork_irq_bpf = {
 	.load_prepare = irq_load_prepare,
 };
 
+static void softirq_load_prepare(void)
+{
+	bpf_program__set_autoload(skel->progs.on_softirq_entry, true);
+	bpf_program__set_autoload(skel->progs.on_softirq_exit, true);
+}
+
+static struct kwork_class_bpf kwork_softirq_bpf = {
+	.load_prepare = softirq_load_prepare,
+};
+
 static void sched_load_prepare(void)
 {
 	bpf_program__set_autoload(skel->progs.on_switch, true);
@@ -101,6 +111,7 @@ static struct kwork_class_bpf kwork_sched_bpf = {
 static struct kwork_class_bpf *
 kwork_class_bpf_supported_list[KWORK_CLASS_MAX] = {
 	[KWORK_CLASS_IRQ]	= &kwork_irq_bpf,
+	[KWORK_CLASS_SOFTIRQ]	= &kwork_softirq_bpf,
 	[KWORK_CLASS_SCHED]	= &kwork_sched_bpf,
 };
 
diff --git a/tools/perf/util/bpf_skel/kwork_top.bpf.c b/tools/perf/util/bpf_skel/kwork_top.bpf.c
index 9c7dc62386c7..84c15ccbab44 100644
--- a/tools/perf/util/bpf_skel/kwork_top.bpf.c
+++ b/tools/perf/util/bpf_skel/kwork_top.bpf.c
@@ -263,4 +263,76 @@ int on_irq_handler_exit(u64 *cxt)
 	return 0;
 }
 
+SEC("tp_btf/softirq_entry")
+int on_softirq_entry(u64 *cxt)
+{
+	struct task_struct *task;
+
+	if (!enabled)
+		return 0;
+
+	__u32 cpu = bpf_get_smp_processor_id();
+
+	if (cpu_is_filtered(cpu))
+		return 0;
+
+	__u64 ts = bpf_ktime_get_ns();
+
+	task = (struct task_struct *)bpf_get_current_task();
+	if (!task)
+		return 0;
+
+	struct work_key key = {
+		.type = KWORK_CLASS_SOFTIRQ,
+		.pid = BPF_CORE_READ(task, pid),
+		.task_p = (__u64)task,
+	};
+
+	struct time_data data = {
+		.timestamp = ts,
+	};
+
+	bpf_map_update_elem(&kwork_top_irq_time, &key, &data, BPF_ANY);
+
+	return 0;
+}
+
+SEC("tp_btf/softirq_exit")
+int on_softirq_exit(u64 *cxt)
+{
+	__u64 delta;
+	struct task_struct *task;
+	struct time_data *pelem;
+
+	if (!enabled)
+		return 0;
+
+	__u32 cpu = bpf_get_smp_processor_id();
+
+	if (cpu_is_filtered(cpu))
+		return 0;
+
+	__u64 ts = bpf_ktime_get_ns();
+
+	task = (struct task_struct *)bpf_get_current_task();
+	if (!task)
+		return 0;
+
+	struct work_key key = {
+		.type = KWORK_CLASS_SOFTIRQ,
+		.pid = BPF_CORE_READ(task, pid),
+		.task_p = (__u64)task,
+	};
+
+	pelem = bpf_map_lookup_elem(&kwork_top_irq_time, &key);
+	if (pelem)
+		delta = ts - pelem->timestamp;
+	else
+		delta = ts - from_timestamp;
+
+	update_work(&key, delta);
+
+	return 0;
+}
+
 char LICENSE[] SEC("license") = "Dual BSD/GPL";
-- 
2.30.GIT



* Re: [RFC v1 01/16] perf kwork: Fix incorrect and missing free atom in work_push_atom()
  2023-08-12  8:49 ` [RFC v1 01/16] perf kwork: Fix incorrect and missing free atom in work_push_atom() Yang Jihong
@ 2023-09-04  4:05   ` Ian Rogers
  2023-09-04 11:42     ` Yang Jihong
  2023-09-06 16:43     ` Arnaldo Carvalho de Melo
  0 siblings, 2 replies; 26+ messages in thread
From: Ian Rogers @ 2023-09-04  4:05 UTC (permalink / raw)
  To: Yang Jihong
  Cc: peterz, mingo, acme, mark.rutland, alexander.shishkin, jolsa,
	namhyung, adrian.hunter, kan.liang, sandipan.das, ravi.bangoria,
	linux-kernel, linux-perf-users

On Sat, Aug 12, 2023 at 1:52 AM Yang Jihong <yangjihong1@huawei.com> wrote:
>
> 1. Atoms are managed in page mode and should be released using atom_free()
>    instead of free().
> 2. When the event does not match, the atom needs to be freed.
>
> Fixes: f98919ec4fcc ("perf kwork: Implement 'report' subcommand")
> Signed-off-by: Yang Jihong <yangjihong1@huawei.com>

Reviewed-by: Ian Rogers <irogers@google.com>

> ---
>  tools/perf/builtin-kwork.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/tools/perf/builtin-kwork.c b/tools/perf/builtin-kwork.c
> index 14bf7a8429e7..73b5dc099a8a 100644
> --- a/tools/perf/builtin-kwork.c
> +++ b/tools/perf/builtin-kwork.c
> @@ -406,12 +406,14 @@ static int work_push_atom(struct perf_kwork *kwork,
>
>         work = work_findnew(&class->work_root, &key, &kwork->cmp_id);
>         if (work == NULL) {
> -               free(atom);
> +               atom_free(atom);

Presumably this is fixing a memory leak. It would be nice if kwork had
perf tests; then our builds with -fsanitize=address would highlight
this kind of issue. Here is a build command I use for this:
make -C tools/perf O=/tmp/perf DEBUG=1 EXTRA_CFLAGS="-O0 -g
-fno-omit-frame-pointer -fsanitize=address" BUILD_BPF_SKEL=1
NO_LIBTRACEEVENT=1
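
For illustration, the allocator mismatch the patch fixes can be sketched
with a minimal pool allocator (illustrative only — this is not perf's
actual atom-page implementation, and the names `pool_alloc`/`pool_free`
are invented for the sketch): objects carved out of a shared page must go
back through the pool's own free routine; handing one to libc free()
would corrupt the heap, just as free() on a page-managed atom does.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_CAP 64

struct atom {
	struct atom *next;	/* free-list link while unallocated */
	int data;
};

/* All atoms live inside one contiguous "page"; none is individually
 * heap-allocated, so none may ever be passed to libc free(). */
struct pool {
	struct atom slots[POOL_CAP];
	struct atom *free_list;
	size_t used;
};

static void pool_init(struct pool *p)
{
	p->free_list = NULL;
	p->used = 0;
}

static struct atom *pool_alloc(struct pool *p)
{
	struct atom *a = p->free_list;

	if (a) {			/* reuse a previously freed slot */
		p->free_list = a->next;
		return a;
	}
	if (p->used < POOL_CAP)		/* carve a fresh slot from the page */
		return &p->slots[p->used++];
	return NULL;			/* pool exhausted */
}

/* The only correct way to release an atom: push it back on the free list. */
static void pool_free(struct pool *p, struct atom *a)
{
	a->next = p->free_list;
	p->free_list = a;
}
```

On the error paths in work_push_atom(), the analogous rule is that every
atom obtained from the atom pages must be returned via atom_free() before
bailing out.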

Thanks,
Ian

>                 return -1;
>         }
>
> -       if (!profile_event_match(kwork, work, sample))
> +       if (!profile_event_match(kwork, work, sample)) {
> +               atom_free(atom);
>                 return 0;
> +       }
>
>         if (dst_type < KWORK_TRACE_MAX) {
>                 dst_atom = list_last_entry_or_null(&work->atom_list[dst_type],
> --
> 2.30.GIT
>


* Re: [RFC v1 05/16] perf kwork: Overwrite original atom in the list when a new atom is pushed.
  2023-08-12  8:49 ` [RFC v1 05/16] perf kwork: Overwrite original atom in the list when a new atom is pushed Yang Jihong
@ 2023-09-04  4:13   ` Ian Rogers
  2023-09-04 11:46     ` Yang Jihong
  0 siblings, 1 reply; 26+ messages in thread
From: Ian Rogers @ 2023-09-04  4:13 UTC (permalink / raw)
  To: Yang Jihong
  Cc: peterz, mingo, acme, mark.rutland, alexander.shishkin, jolsa,
	namhyung, adrian.hunter, kan.liang, sandipan.das, ravi.bangoria,
	linux-kernel, linux-perf-users

On Sat, Aug 12, 2023 at 1:52 AM Yang Jihong <yangjihong1@huawei.com> wrote:
>
> work_push_atom() supports nesting. Currently, none of the supported kworks
> are nested. An `overwrite` parameter is added to overwrite the original
> atom in the list.
>
> Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
> ---
>  tools/perf/builtin-kwork.c | 24 ++++++++++++++++++------
>  1 file changed, 18 insertions(+), 6 deletions(-)
>
> diff --git a/tools/perf/builtin-kwork.c b/tools/perf/builtin-kwork.c
> index 42ea59a957ae..f620911a26a2 100644
> --- a/tools/perf/builtin-kwork.c
> +++ b/tools/perf/builtin-kwork.c
> @@ -392,9 +392,10 @@ static int work_push_atom(struct perf_kwork *kwork,
>                           struct evsel *evsel,
>                           struct perf_sample *sample,
>                           struct machine *machine,
> -                         struct kwork_work **ret_work)
> +                         struct kwork_work **ret_work,
> +                         bool overwrite)

kerneldoc would be useful. Pushing something seems self-evident but
what does overwrite mean without reading the code?
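
A sketch of the kind of kerneldoc that could capture this, reusing the
patch's parameter names (the wording below is illustrative, not taken
from the patch):

```c
/**
 * work_push_atom() - allocate an atom for a sample and queue it on its work
 * @src_type:  trace type of the event being pushed (e.g. KWORK_TRACE_ENTRY)
 * @dst_type:  paired trace type to pop first, or KWORK_TRACE_MAX for none
 * @ret_work:  if non-NULL, set to the kwork_work the atom was queued on
 * @overwrite: if true, delete the last atom already queued on @src_type's
 *             list before adding the new one; used for non-nested kworks,
 *             where a stale queued atom means its matching exit event was
 *             lost and the atom would otherwise linger forever
 *
 * Return: 0 on success, -1 on allocation or lookup failure.
 */
```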

>  {
> -       struct kwork_atom *atom, *dst_atom;
> +       struct kwork_atom *atom, *dst_atom, *last_atom;
>         struct kwork_work *work, key;
>
>         BUG_ON(class->work_init == NULL);
> @@ -427,6 +428,17 @@ static int work_push_atom(struct perf_kwork *kwork,
>         if (ret_work != NULL)
>                 *ret_work = work;
>
> +       if (overwrite) {
> +               last_atom = list_last_entry_or_null(&work->atom_list[src_type],
> +                                                   struct kwork_atom, list);
> +               if (last_atom) {
> +                       atom_del(last_atom);
> +
> +                       kwork->nr_skipped_events[src_type]++;
> +                       kwork->nr_skipped_events[KWORK_TRACE_MAX]++;
> +               }
> +       }
> +
>         list_add_tail(&atom->list, &work->atom_list[src_type]);
>
>         return 0;
> @@ -502,7 +514,7 @@ static int report_entry_event(struct perf_kwork *kwork,
>  {
>         return work_push_atom(kwork, class, KWORK_TRACE_ENTRY,
>                               KWORK_TRACE_MAX, evsel, sample,
> -                             machine, NULL);
> +                             machine, NULL, true);

nit: for constant arguments it can be useful to add parameter-name
comments, which enable checks like clang-tidy's bugprone-argument-comment:
https://clang.llvm.org/extra/clang-tidy/checks/bugprone/argument-comment.html
This would make this:
        return work_push_atom(kwork, class, KWORK_TRACE_ENTRY,
                              KWORK_TRACE_MAX, evsel, sample,
                              machine, /*ret_work=*/NULL, /*overwrite=*/true);

Thanks,
Ian

>  }
>
>  static int report_exit_event(struct perf_kwork *kwork,
> @@ -557,7 +569,7 @@ static int latency_raise_event(struct perf_kwork *kwork,
>  {
>         return work_push_atom(kwork, class, KWORK_TRACE_RAISE,
>                               KWORK_TRACE_MAX, evsel, sample,
> -                             machine, NULL);
> +                             machine, NULL, true);
>  }
>
>  static int latency_entry_event(struct perf_kwork *kwork,
> @@ -716,7 +728,7 @@ static int timehist_raise_event(struct perf_kwork *kwork,
>  {
>         return work_push_atom(kwork, class, KWORK_TRACE_RAISE,
>                               KWORK_TRACE_MAX, evsel, sample,
> -                             machine, NULL);
> +                             machine, NULL, true);
>  }
>
>  static int timehist_entry_event(struct perf_kwork *kwork,
> @@ -730,7 +742,7 @@ static int timehist_entry_event(struct perf_kwork *kwork,
>
>         ret = work_push_atom(kwork, class, KWORK_TRACE_ENTRY,
>                              KWORK_TRACE_RAISE, evsel, sample,
> -                            machine, &work);
> +                            machine, &work, true);
>         if (ret)
>                 return ret;
>
> --
> 2.30.GIT
>


* Re: [RFC v1 00/16] perf kwork: Implement perf kwork top
  2023-08-12  8:49 [RFC v1 00/16] perf kwork: Implement perf kwork top Yang Jihong
                   ` (15 preceding siblings ...)
  2023-08-12  8:49 ` [RFC v1 16/16] perf kwork top: Add BPF-based statistics on softirq " Yang Jihong
@ 2023-09-04  5:13 ` Ian Rogers
  2023-09-04 11:59   ` Yang Jihong
  16 siblings, 1 reply; 26+ messages in thread
From: Ian Rogers @ 2023-09-04  5:13 UTC (permalink / raw)
  To: Yang Jihong
  Cc: peterz, mingo, acme, mark.rutland, alexander.shishkin, jolsa,
	namhyung, adrian.hunter, kan.liang, sandipan.das, ravi.bangoria,
	linux-kernel, linux-perf-users

On Sat, Aug 12, 2023 at 1:52 AM Yang Jihong <yangjihong1@huawei.com> wrote:
>
> Some common tools for collecting statistics on CPU usage, such as top,
> obtain statistics from timer interrupt sampling, and then periodically
> read statistics from /proc/stat.
>
> This method has some deviations:
> 1. In the tick interrupt, the time between the last tick and the current
>    tick is counted in the current task. However, the task may be running
>    only part of the time.
> 2. For each task, the top tool periodically reads the /proc/{PID}/status
>    information, so short-lived tasks may be missed.
>
> In conclusion, the top tool cannot accurately collect statistics on the
> CPU usage and running time of tasks.
>
> The statistical method based on the sched_switch tracepoint can accurately
> calculate the CPU usage of all tasks. This method suits scenarios that
> require high-precision performance comparison data.
>
> In addition to parsing the perf.data file, perf kwork top provides
> BPF-based CPU usage statistics to reduce the load of the perf process.
>
> Example usage:
>
>   # perf kwork top -h
>
>    Usage: perf kwork top [<options>]
>
>       -C, --cpu <cpu>       list of cpus to profile
>       -i, --input <file>    input file name
>       -n, --name <name>     event name to profile
>       -s, --sort <key[,key2...]>
>                             sort by key(s): rate, runtime, tid
>           --time <str>      Time span for analysis (start,stop)
>
>   # ./perf kwork -k sched,irq,softirq record -- ./perf bench sched messaging -g 1 -l 10000
>   # Running 'sched/messaging' benchmark:
>   # 20 sender and receiver processes per group
>   # 1 groups == 40 processes run
>
>        Total time: 1.854 [sec]
>   [ perf record: Woken up 13 times to write data ]
>   [ perf record: Captured and wrote 27.436 MB perf.data (235986 samples) ]
>   #
>   # perf kwork top
>
>   Total  :  22419.068 ms, 12 cpus
>   %Cpu(s):  17.99% id,   0.00% hi,   0.09% si
>   %Cpu0   [||||||||||||||||||||||||        82.56%]
>   %Cpu1   [||||||||||||||||||||||||        81.73%]
>   %Cpu2   [||||||||||||||||||||||||        83.00%]
>   %Cpu3   [||||||||||||||||||||||||        81.01%]
>   %Cpu4   [||||||||||||||||||||||||        81.66%]
>   %Cpu5   [||||||||||||||||||||||||        82.10%]
>   %Cpu6   [||||||||||||||||||||||||        81.65%]
>   %Cpu7   [||||||||||||||||||||||||        82.00%]
>   %Cpu8   [||||||||||||||||||||||||        82.54%]
>   %Cpu9   [||||||||||||||||||||||||        81.02%]
>   %Cpu10  [||||||||||||||||||||||||        81.78%]
>   %Cpu11  [||||||||||||||||||||||||        81.87%]
>
>         PID    %CPU           RUNTIME  COMMMAND
>     ----------------------------------------------------
>     2913025   27.62        516.976 ms  sched-messaging
>     2913017   27.56        516.273 ms  sched-messaging
>     2913018   27.56        516.152 ms  sched-messaging
>     2913008   27.54        515.447 ms  sched-messaging
>     2913010   27.50        514.890 ms  sched-messaging
>     2913007   27.48        514.411 ms  sched-messaging
>     2913015   27.36        512.563 ms  sched-messaging
>     2913026   27.35        512.198 ms  sched-messaging
>     2913011   27.32        511.632 ms  sched-messaging
>     2913014   27.32        511.381 ms  sched-messaging
>     2913012   27.30        511.138 ms  sched-messaging
>     2913013   27.30        511.109 ms  sched-messaging
>     2913009   27.30        510.997 ms  sched-messaging
>     2913019   27.25        510.488 ms  sched-messaging
>     2913021   27.23        509.661 ms  sched-messaging
>     2913023   27.21        509.400 ms  sched-messaging
>     2913020   27.20        509.677 ms  sched-messaging
>     2913022   27.19        509.061 ms  sched-messaging
>     2913016   27.18        509.231 ms  sched-messaging
>     2913024   27.18        509.108 ms  sched-messaging
>     2912999   21.79        407.929 ms  sched-messaging
>     2913000   21.78        408.045 ms  sched-messaging
>     2912998   21.75        407.579 ms  sched-messaging
>     2912993   21.69        406.166 ms  sched-messaging
>     2912991   21.59        404.378 ms  sched-messaging
>     2912992   21.57        403.845 ms  sched-messaging
>     2913005   21.56        403.715 ms  sched-messaging
>     2913001   21.55        403.803 ms  sched-messaging
>     2912990   21.54        403.601 ms  sched-messaging
>     2912994   21.52        403.087 ms  sched-messaging
>     2912995   21.49        402.421 ms  sched-messaging
>     2912996   21.48        402.666 ms  sched-messaging
>     2912989   21.46        402.303 ms  sched-messaging
>     2913003   21.46        402.034 ms  sched-messaging
>     2912988   21.44        401.820 ms  sched-messaging
>     2913004   21.37        400.242 ms  sched-messaging
>     2912987   21.28        398.512 ms  sched-messaging
>     2913002   21.27        398.759 ms  sched-messaging
>     2912997   21.27        398.518 ms  sched-messaging
>     2913006   21.23        397.888 ms  sched-messaging
>           0   18.89        353.061 ms  swapper/3
>           0   18.89        353.026 ms  swapper/9
>           0   18.26        341.190 ms  swapper/6
>           0   18.25        341.048 ms  swapper/4
>           0   18.13        338.844 ms  swapper/10
>           0   18.12        338.686 ms  swapper/1
>           0   18.05        337.333 ms  swapper/11
>           0   17.92        334.903 ms  swapper/7
>           0   17.80        332.691 ms  swapper/5
>           0   17.38        324.762 ms  swapper/8
>           0   17.29        323.102 ms  swapper/0
>           0   16.89        315.630 ms  swapper/2
>     2912984    2.10         39.452 ms  perf
>     2912985    0.73         14.004 ms  sched-messaging
>      660381    0.14          3.527 ms  containerd
>     3006112    0.07          1.402 ms  redis-server
>     2904655    0.05          1.071 ms  kworker/0:0
>      659822    0.03          1.729 ms  containerd
>      659824    0.03          1.346 ms  containerd
>        1053    0.01          0.353 ms  zabbix_agentd
>     2895571    0.01          0.349 ms  kworker/8:1
>          10    0.00          0.535 ms  rcu_sched
>     2912192    0.00          0.272 ms  kworker/u24:2
>     2903540    0.00          0.182 ms  sshd
>         801    0.00          0.120 ms  jbd2/sda4-8
>     2929605    0.00          0.108 ms  grpc_global_tim
>     2912986    0.00          0.075 ms  perf
>     2897765    0.00          0.073 ms  kworker/7:1
>     2912817    0.00          0.070 ms  kworker/2:1
>      391756    0.00          0.067 ms  grpc_global_tim
>     2901459    0.00          0.056 ms  kworker/3:0
>     2908185    0.00          0.052 ms  kworker/10:1
>     2901521    0.00          0.050 ms  kworker/6:1
>     2902444    0.00          0.046 ms  kworker/5:1
>     2898513    0.00          0.041 ms  kworker/1:1
>     2912280    0.00          0.039 ms  kworker/11:3
>        1057    0.00          0.037 ms  zabbix_agentd
>     2908010    0.00          0.037 ms  kworker/9:3
>     2912951    0.00          0.036 ms  kworker/4:3
>         793    0.00          0.027 ms  multipathd
>          42    0.00          0.023 ms  ksoftirqd/5
>           9    0.00          0.022 ms  ksoftirqd/0
>          24    0.00          0.019 ms  ksoftirqd/2
>          59    0.00          0.019 ms  migration/8
>     2929607    0.00          0.018 ms  wecode-db
>          29    0.00          0.012 ms  migration/3
>          11    0.00          0.012 ms  migration/0
>          65    0.00          0.012 ms  migration/9
>          17    0.00          0.011 ms  migration/1
>          71    0.00          0.011 ms  migration/10
>          53    0.00          0.011 ms  migration/7
>          23    0.00          0.011 ms  migration/2
>          47    0.00          0.011 ms  migration/6
>         789    0.00          0.010 ms  multipathd
>          35    0.00          0.010 ms  migration/4
>         480    0.00          0.009 ms  kworker/4:1H
>          41    0.00          0.008 ms  migration/5
>          54    0.00          0.006 ms  ksoftirqd/7
>          36    0.00          0.005 ms  ksoftirqd/4
>          72    0.00          0.005 ms  ksoftirqd/10
>          66    0.00          0.005 ms  ksoftirqd/9
>          30    0.00          0.004 ms  ksoftirqd/3
>     2898362    0.00          0.003 ms  kworker/u24:0
>
>   # perf kwork top -C 2,4,5
>
>   Total  :   5604.885 ms, 3 cpus
>   %Cpu(s):  17.65% id,   0.00% hi,   0.09% si
>   %Cpu2   [||||||||||||||||||||||||        83.00%]
>   %Cpu4   [||||||||||||||||||||||||        81.66%]
>   %Cpu5   [||||||||||||||||||||||||        82.10%]
>
>         PID    %CPU           RUNTIME  COMMAND
>     ----------------------------------------------------
>           0   18.25        341.048 ms  swapper/4
>           0   17.80        332.691 ms  swapper/5
>           0   16.89        315.630 ms  swapper/2
>     2913012    8.99        168.201 ms  sched-messaging
>     2913021    8.74        163.456 ms  sched-messaging
>     2913026    8.34        156.120 ms  sched-messaging
>     2913010    8.14        152.372 ms  sched-messaging
>     2913007    7.74        144.754 ms  sched-messaging
>     2913016    7.42        138.977 ms  sched-messaging
>     2913022    7.28        136.279 ms  sched-messaging
>     2913013    7.00        130.894 ms  sched-messaging
>     2913019    6.96        130.455 ms  sched-messaging
>     2913023    6.91        129.364 ms  sched-messaging
>     2913008    6.47        121.331 ms  sched-messaging
>     2913018    6.47        121.124 ms  sched-messaging
>     2912994    6.42        120.213 ms  sched-messaging
>     2913025    6.42        120.140 ms  sched-messaging
>     2913020    6.37        119.386 ms  sched-messaging
>     2913011    6.35        118.921 ms  sched-messaging
>     2913014    6.24        116.779 ms  sched-messaging
>     2913009    6.01        112.661 ms  sched-messaging
>     2913005    5.96        111.587 ms  sched-messaging
>     2913003    5.94        111.142 ms  sched-messaging
>     2913024    5.87        109.876 ms  sched-messaging
>     2912988    5.84        109.396 ms  sched-messaging
>     2912987    5.80        108.550 ms  sched-messaging
>     2912990    5.51        103.322 ms  sched-messaging
>     2912993    5.45        102.045 ms  sched-messaging
>     2912996    5.42        101.622 ms  sched-messaging
>     2913006    5.42        101.592 ms  sched-messaging
>     2913000    5.41        101.511 ms  sched-messaging
>     2913002    5.41        101.420 ms  sched-messaging
>     2912997    5.40        101.315 ms  sched-messaging
>     2913004    5.33         99.872 ms  sched-messaging
>     2913015    5.29         99.252 ms  sched-messaging
>     2913017    5.22         97.822 ms  sched-messaging
>     2912991    5.11         95.773 ms  sched-messaging
>     2912992    5.11         95.579 ms  sched-messaging
>     2913001    5.07         95.033 ms  sched-messaging
>     2912989    4.93         92.425 ms  sched-messaging
>     2912995    4.85         90.892 ms  sched-messaging
>     2912998    4.78         89.617 ms  sched-messaging
>     2912999    4.56         85.576 ms  sched-messaging
>     3006112    0.07          1.402 ms  redis-server
>      660381    0.02          0.566 ms  containerd
>     2912984    0.02          0.452 ms  perf
>      659824    0.01          0.369 ms  containerd
>      659822    0.00          0.316 ms  containerd
>          10    0.00          0.176 ms  rcu_sched
>         801    0.00          0.120 ms  jbd2/sda4-8
>     2912985    0.00          0.107 ms  sched-messaging
>     2912986    0.00          0.075 ms  perf
>     2912817    0.00          0.070 ms  kworker/2:1
>     2912192    0.00          0.056 ms  kworker/u24:2
>     2929605    0.00          0.046 ms  grpc_global_tim
>     2902444    0.00          0.046 ms  kworker/5:1
>     2912951    0.00          0.036 ms  kworker/4:3
>          42    0.00          0.023 ms  ksoftirqd/5
>          24    0.00          0.019 ms  ksoftirqd/2
>          23    0.00          0.011 ms  migration/2
>          35    0.00          0.010 ms  migration/4
>         480    0.00          0.009 ms  kworker/4:1H
>          41    0.00          0.008 ms  migration/5
>          36    0.00          0.005 ms  ksoftirqd/4
>
>   # perf kwork top -n perf
>
>   Total  :  22419.068 ms, 12 cpus
>   %Cpu(s):  17.99% id,   0.00% hi,   0.09% si
>   %Cpu0   [                                 0.00%]
>   %Cpu1   [                                 0.00%]
>   %Cpu2   [                                 0.02%]
>   %Cpu3   [                                 0.00%]
>   %Cpu4   [                                 0.00%]
>   %Cpu5   [                                 0.00%]
>   %Cpu6   [                                 0.00%]
>   %Cpu7   [                                 0.00%]
>   %Cpu8   [                                 0.03%]
>   %Cpu9   [                                 0.09%]
>   %Cpu10  [                                 0.00%]
>   %Cpu11  [                                 1.96%]
>
>         PID    %CPU           RUNTIME  COMMAND
>     ----------------------------------------------------
>     2912984    2.10         39.452 ms  perf
>     2912986    0.00          0.075 ms  perf
>
>   # perf kwork top -s tid -n sched-messaging
>
>   Total  :  22419.068 ms, 12 cpus
>   %Cpu(s):  17.99% id,   0.00% hi,   0.09% si
>   %Cpu0   [||||||||||||||||||||||||        81.80%]
>   %Cpu1   [||||||||||||||||||||||||        81.65%]
>   %Cpu2   [||||||||||||||||||||||||        82.88%]
>   %Cpu3   [||||||||||||||||||||||||        80.96%]
>   %Cpu4   [||||||||||||||||||||||||        81.64%]
>   %Cpu5   [||||||||||||||||||||||||        82.04%]
>   %Cpu6   [||||||||||||||||||||||||        81.63%]
>   %Cpu7   [||||||||||||||||||||||||        81.97%]
>   %Cpu8   [||||||||||||||||||||||||        82.43%]
>   %Cpu9   [||||||||||||||||||||||||        80.86%]
>   %Cpu10  [||||||||||||||||||||||||        81.72%]
>   %Cpu11  [|||||||||||||||||||||||         79.89%]
>
>         PID    %CPU           RUNTIME  COMMAND
>     ----------------------------------------------------
>     2912985    0.04          1.111 ms  sched-messaging
>     2912987   21.28        398.512 ms  sched-messaging
>     2912988   21.44        401.820 ms  sched-messaging
>     2912989   21.46        402.303 ms  sched-messaging
>     2912990   21.54        403.601 ms  sched-messaging
>     2912991   21.59        404.378 ms  sched-messaging
>     2912992   21.57        403.845 ms  sched-messaging
>     2912993   21.69        406.166 ms  sched-messaging
>     2912994   21.52        403.087 ms  sched-messaging
>     2912995   21.49        402.421 ms  sched-messaging
>     2912996   21.48        402.666 ms  sched-messaging
>     2912997   21.27        398.518 ms  sched-messaging
>     2912998   21.75        407.579 ms  sched-messaging
>     2912999   21.79        407.929 ms  sched-messaging
>     2913000   21.78        408.045 ms  sched-messaging
>     2913001   21.55        403.803 ms  sched-messaging
>     2913002   21.27        398.759 ms  sched-messaging
>     2913003   21.46        402.034 ms  sched-messaging
>     2913004   21.37        400.242 ms  sched-messaging
>     2913005   21.56        403.715 ms  sched-messaging
>     2913006   21.23        397.888 ms  sched-messaging
>     2913007   27.48        514.411 ms  sched-messaging
>     2913008   27.54        515.447 ms  sched-messaging
>     2913009   27.30        510.997 ms  sched-messaging
>     2913010   27.50        514.890 ms  sched-messaging
>     2913011   27.32        511.632 ms  sched-messaging
>     2913012   27.30        511.138 ms  sched-messaging
>     2913013   27.30        511.109 ms  sched-messaging
>     2913014   27.32        511.381 ms  sched-messaging
>     2913015   27.36        512.563 ms  sched-messaging
>     2913016   27.18        509.231 ms  sched-messaging
>     2913017   27.56        516.273 ms  sched-messaging
>     2913018   27.56        516.152 ms  sched-messaging
>     2913019   27.25        510.488 ms  sched-messaging
>     2913020   27.20        509.677 ms  sched-messaging
>     2913021   27.23        509.661 ms  sched-messaging
>     2913022   27.19        509.061 ms  sched-messaging
>     2913023   27.21        509.400 ms  sched-messaging
>     2913024   27.18        509.108 ms  sched-messaging
>     2913025   27.62        516.976 ms  sched-messaging
>     2913026   27.35        512.198 ms  sched-messaging
>
>   # perf kwork top -b
>   Starting trace, Hit <Ctrl+C> to stop and report
>   ^C
>   Total  :  89209.004 ms, 4 cpus
>   %Cpu(s):  13.47% id,   0.01% hi,   0.24% si
>   %Cpu0   [|||||||||||||||||||||||||       86.08%]
>   %Cpu1   [||||||||||||||||||||||||||      87.08%]
>   %Cpu2   [|||||||||||||||||||||||||       86.10%]
>   %Cpu3   [|||||||||||||||||||||||||       85.86%]
>
>         PID     SPID    %CPU           RUNTIME  COMMAND
>     -------------------------------------------------------------
>           0        0   13.86       3092.482 ms  [swapper/3]
>           0        0   13.71       3057.756 ms  [swapper/0]
>           0        0   13.65       3044.987 ms  [swapper/2]
>           0        0   12.63       2818.079 ms  [swapper/1]
>         320      320    1.32        295.427 ms  bash
>         414      320    1.09        247.658 ms  sched-messaging
>         678      320    1.09        246.881 ms  sched-messaging
>         354      320    1.09        246.683 ms  sched-messaging
>         409      320    1.08        246.839 ms  sched-messaging
>         554      320    1.07        243.045 ms  sched-messaging
>         405      320    1.04        237.340 ms  sched-messaging
>         525      320    1.04        235.718 ms  sched-messaging
>         486      320    1.03        235.369 ms  sched-messaging
>         714      320    1.03        235.142 ms  sched-messaging
>         435      320    1.03        233.290 ms  sched-messaging
>         408      320    1.03        231.687 ms  sched-messaging
>         716      320    1.02        233.513 ms  sched-messaging
>         453      320    1.02        233.181 ms  sched-messaging
>         522      320    1.02        230.101 ms  sched-messaging
>         651      320    1.02        229.554 ms  sched-messaging
>         418      320    1.01        231.359 ms  sched-messaging
>         644      320    1.01        229.848 ms  sched-messaging
>         535      320    1.01        229.655 ms  sched-messaging
>         536      320    1.01        229.147 ms  sched-messaging
>         410      320    1.01        227.700 ms  sched-messaging
>         689      320    1.00        227.988 ms  sched-messaging
>         412      320    1.00        227.052 ms  sched-messaging
>         489      320    1.00        226.374 ms  sched-messaging
>         521      320    1.00        225.499 ms  sched-messaging
>         530      320    1.00        225.262 ms  sched-messaging
>         681      320    1.00        225.187 ms  sched-messaging
>         415      320    0.99        225.714 ms  sched-messaging
>         643      320    0.99        225.090 ms  sched-messaging
>         325      320    0.99        223.385 ms  sched-messaging
>         498      320    0.99        222.936 ms  sched-messaging
>         413      320    0.98        225.213 ms  sched-messaging
>         645      320    0.98        223.211 ms  sched-messaging
>         544      320    0.98        222.714 ms  sched-messaging
>         441      320    0.98        222.590 ms  sched-messaging
>         697      320    0.98        222.426 ms  sched-messaging
>         523      320    0.98        221.841 ms  sched-messaging
>         402      320    0.98        221.776 ms  sched-messaging
>   <SNIP>
>
> Yang Jihong (16):
>   perf kwork: Fix incorrect and missing free atom in work_push_atom()
>   perf kwork: Add the supported subcommands to the document
>   perf kwork: Set ordered_events for perf_tool
>   perf kwork: Add `kwork` and `src_type` to work_init() for struct
>     kwork_class
>   perf kwork: Overwrite original atom in the list when a new atom is
>     pushed.
>   perf kwork: Set default events list if not specified in
>     setup_event_list()
>   perf kwork: Add sched record support
>   perf kwork: Add `root` parameter to work_sort()
>   perf kwork: Implement perf kwork top
>   perf evsel: Add evsel__intval_common() helper
>   perf kwork top: Add statistics on hardirq event support
>   perf kwork top: Add statistics on softirq event support
>   perf kwork top: Add -C/--cpu -i/--input -n/--name -s/--sort --time
>     options
>   perf kwork top: Implements BPF-based cpu usage statistics
>   perf kwork top: Add BPF-based statistics on hardirq event support
>   perf kwork top: Add BPF-based statistics on softirq event support

Besides documentation nits, series:
Reviewed-by: Ian Rogers <irogers@google.com>

Whilst looking at the series, could we clean up
tools/perf/util/kwork.h? Generally none of the structs are commented.
Some like:

struct kwork_atom {
       struct list_head list;
       u64 time;
       struct kwork_atom *prev;

       void *page_addr;
       unsigned long bit_inpage;
};

Why is it an atom? Why is there a prev when the kwork_atom is also on
a list (which has a prev)?

structs like trace_kwork_handler could be a forward reference in
kwork.h with the declaration in builtin-kwork.c.

Thanks,
Ian

>  tools/perf/Documentation/perf-kwork.txt  |  38 +-
>  tools/perf/Makefile.perf                 |   2 +-
>  tools/perf/builtin-kwork.c               | 758 +++++++++++++++++++++--
>  tools/perf/util/Build                    |   1 +
>  tools/perf/util/bpf_kwork_top.c          | 308 +++++++++
>  tools/perf/util/bpf_skel/kwork_top.bpf.c | 338 ++++++++++
>  tools/perf/util/evsel.c                  |  13 +
>  tools/perf/util/evsel.h                  |   2 +
>  tools/perf/util/kwork.h                  |  61 +-
>  9 files changed, 1472 insertions(+), 49 deletions(-)
>  create mode 100644 tools/perf/util/bpf_kwork_top.c
>  create mode 100644 tools/perf/util/bpf_skel/kwork_top.bpf.c
>
> --
> 2.30.GIT
>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC v1 01/16] perf kwork: Fix incorrect and missing free atom in work_push_atom()
  2023-09-04  4:05   ` Ian Rogers
@ 2023-09-04 11:42     ` Yang Jihong
  2023-09-06 16:43     ` Arnaldo Carvalho de Melo
  1 sibling, 0 replies; 26+ messages in thread
From: Yang Jihong @ 2023-09-04 11:42 UTC (permalink / raw)
  To: Ian Rogers
  Cc: peterz, mingo, acme, mark.rutland, alexander.shishkin, jolsa,
	namhyung, adrian.hunter, kan.liang, sandipan.das, ravi.bangoria,
	linux-kernel, linux-perf-users

Hello,

On 2023/9/4 12:05, Ian Rogers wrote:
> On Sat, Aug 12, 2023 at 1:52 AM Yang Jihong <yangjihong1@huawei.com> wrote:
>>
>> 1. Atoms are managed in page mode and should be released using atom_free()
>>     instead of free().
>> 2. When the event does not match, the atom needs to be freed.
>>
>> Fixes: f98919ec4fcc ("perf kwork: Implement 'report' subcommand")
>> Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
> 
> Reviewed-by: Ian Rogers <irogers@google.com>
> 
Thanks for the Reviewed-by tag.
>> ---
>>   tools/perf/builtin-kwork.c | 6 ++++--
>>   1 file changed, 4 insertions(+), 2 deletions(-)
>>
>> diff --git a/tools/perf/builtin-kwork.c b/tools/perf/builtin-kwork.c
>> index 14bf7a8429e7..73b5dc099a8a 100644
>> --- a/tools/perf/builtin-kwork.c
>> +++ b/tools/perf/builtin-kwork.c
>> @@ -406,12 +406,14 @@ static int work_push_atom(struct perf_kwork *kwork,
>>
>>          work = work_findnew(&class->work_root, &key, &kwork->cmp_id);
>>          if (work == NULL) {
>> -               free(atom);
>> +               atom_free(atom);
> 
> Presumably this is fixing a memory-leak. It would be nice if kwork had
> perf tests, then our builds with -fsanitize=address would highlight
> this kind of issue. Here is a build command I use for this:
> make -C tools/perf O=/tmp/perf DEBUG=1 EXTRA_CFLAGS="-O0 -g
> -fno-omit-frame-pointer -fsanitize=address" BUILD_BPF_SKEL=1
> NO_LIBTRACEEVENT=1
> 
Okay, I'm going to add a kwork test to the perf tests.
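
For context, a minimal sketch of why plain free() is wrong here, assuming
a page-mode allocator along the lines of the one in builtin-kwork.c (names
and layout are simplified; this is not the actual perf code): atoms live
inside a shared page tracked by a bitmap, so releasing one atom means
clearing its bit in the page, not freeing the pointer itself.

```c
#include <assert.h>
#include <stddef.h>

#define NR_ATOM_PER_PAGE 64

struct atom {
	unsigned long bit_inpage; /* slot index within the backing page */
	void *page_addr;          /* the page this atom was carved from */
};

struct atom_page {
	unsigned long bitmap;     /* one bit per slot, 1 = in use */
	struct atom atoms[NR_ATOM_PER_PAGE];
};

/* Hand out the first free slot in the page, or NULL if it is full. */
static struct atom *atom_new(struct atom_page *page)
{
	for (unsigned long i = 0; i < NR_ATOM_PER_PAGE; i++) {
		if (!(page->bitmap & (1UL << i))) {
			page->bitmap |= 1UL << i;
			page->atoms[i].bit_inpage = i;
			page->atoms[i].page_addr = page;
			return &page->atoms[i];
		}
	}
	return NULL;
}

/* Release only the slot; calling libc free() on the atom pointer
 * would try to free memory in the middle of the page allocation. */
static void atom_free(struct atom *atom)
{
	struct atom_page *page = atom->page_addr;

	page->bitmap &= ~(1UL << atom->bit_inpage);
}
```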

Thanks,
Yang


* Re: [RFC v1 05/16] perf kwork: Overwrite original atom in the list when a new atom is pushed.
  2023-09-04  4:13   ` Ian Rogers
@ 2023-09-04 11:46     ` Yang Jihong
  0 siblings, 0 replies; 26+ messages in thread
From: Yang Jihong @ 2023-09-04 11:46 UTC (permalink / raw)
  To: Ian Rogers
  Cc: peterz, mingo, acme, mark.rutland, alexander.shishkin, jolsa,
	namhyung, adrian.hunter, kan.liang, sandipan.das, ravi.bangoria,
	linux-kernel, linux-perf-users

Hello,

On 2023/9/4 12:13, Ian Rogers wrote:
> On Sat, Aug 12, 2023 at 1:52 AM Yang Jihong <yangjihong1@huawei.com> wrote:
>>
>> work_push_atom() supports nesting. Currently, none of the supported kworks
>> are nested. An `overwrite` parameter is added to overwrite the original
>> atom in the list.
>>
>> Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
>> ---
>>   tools/perf/builtin-kwork.c | 24 ++++++++++++++++++------
>>   1 file changed, 18 insertions(+), 6 deletions(-)
>>
>> diff --git a/tools/perf/builtin-kwork.c b/tools/perf/builtin-kwork.c
>> index 42ea59a957ae..f620911a26a2 100644
>> --- a/tools/perf/builtin-kwork.c
>> +++ b/tools/perf/builtin-kwork.c
>> @@ -392,9 +392,10 @@ static int work_push_atom(struct perf_kwork *kwork,
>>                            struct evsel *evsel,
>>                            struct perf_sample *sample,
>>                            struct machine *machine,
>> -                         struct kwork_work **ret_work)
>> +                         struct kwork_work **ret_work,
>> +                         bool overwrite)
> 
> kerneldoc would be useful. Pushing something seems self-evident but
> what does overwrite mean without reading the code?

Okay, I'll add comments.

> 
>>   {
>> -       struct kwork_atom *atom, *dst_atom;
>> +       struct kwork_atom *atom, *dst_atom, *last_atom;
>>          struct kwork_work *work, key;
>>
>>          BUG_ON(class->work_init == NULL);
>> @@ -427,6 +428,17 @@ static int work_push_atom(struct perf_kwork *kwork,
>>          if (ret_work != NULL)
>>                  *ret_work = work;
>>
>> +       if (overwrite) {
>> +               last_atom = list_last_entry_or_null(&work->atom_list[src_type],
>> +                                                   struct kwork_atom, list);
>> +               if (last_atom) {
>> +                       atom_del(last_atom);
>> +
>> +                       kwork->nr_skipped_events[src_type]++;
>> +                       kwork->nr_skipped_events[KWORK_TRACE_MAX]++;
>> +               }
>> +       }
>> +
>>          list_add_tail(&atom->list, &work->atom_list[src_type]);
>>
>>          return 0;
>> @@ -502,7 +514,7 @@ static int report_entry_event(struct perf_kwork *kwork,
>>   {
>>          return work_push_atom(kwork, class, KWORK_TRACE_ENTRY,
>>                                KWORK_TRACE_MAX, evsel, sample,
>> -                             machine, NULL);
>> +                             machine, NULL, true);
> 
> nit: for constant arguments it can be useful to add parameter names
> which can enable checks like clang-tidy's bugprone argument:
> https://clang.llvm.org/extra/clang-tidy/checks/bugprone/argument-comment.html
> This would make this:
>          return work_push_atom(kwork, class, KWORK_TRACE_ENTRY,
>                                KWORK_TRACE_MAX, evsel, sample,
>                                machine, /*ret_work=*/NULL, /*overwite=*/true);
> 
Thanks for your advice; I will add parameter names later.
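
A rough sketch of the overwrite semantics being discussed (a simplified,
array-based stand-in for the kernel list API; push_atom and the field
names here are made up for illustration): with overwrite set, a new entry
atom replaces the most recent unmatched one instead of nesting on top of
it, and the dropped event is counted as skipped.

```c
#include <stddef.h>

#define MAX_ATOMS 8

struct atom_list {
	int atoms[MAX_ATOMS]; /* pending entry atoms (e.g. timestamps) */
	size_t nr;            /* current depth of the list */
	int nr_skipped;       /* entry events dropped by overwrite */
};

/* Push a new entry atom.  With overwrite set (the non-nested case in
 * the patch), the stale last entry is dropped first, so the list never
 * grows past one pending entry per work item. */
static int push_atom(struct atom_list *list, int atom, int overwrite)
{
	if (overwrite && list->nr > 0) {
		list->nr--;
		list->nr_skipped++;
	}
	if (list->nr == MAX_ATOMS)
		return -1; /* list full */
	list->atoms[list->nr++] = atom;
	return 0;
}
```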

Thanks,
Yang


* Re: [RFC v1 00/16] perf kwork: Implement perf kwork top
  2023-09-04  5:13 ` [RFC v1 00/16] perf kwork: Implement perf kwork top Ian Rogers
@ 2023-09-04 11:59   ` Yang Jihong
  2023-09-04 14:54     ` Ian Rogers
  0 siblings, 1 reply; 26+ messages in thread
From: Yang Jihong @ 2023-09-04 11:59 UTC (permalink / raw)
  To: Ian Rogers
  Cc: peterz, mingo, acme, mark.rutland, alexander.shishkin, jolsa,
	namhyung, adrian.hunter, kan.liang, sandipan.das, ravi.bangoria,
	linux-kernel, linux-perf-users

Hello,

On 2023/9/4 13:13, Ian Rogers wrote:
> On Sat, Aug 12, 2023 at 1:52 AM Yang Jihong <yangjihong1@huawei.com> wrote:
>>
>> Some common tools for collecting statistics on CPU usage, such as top,
>> obtain statistics from timer interrupt sampling, and then periodically
>> read statistics from /proc/stat.
>>
>> This method has some deviations:
>> 1. In the tick interrupt, the time between the last tick and the current
>>     tick is charged to the current task. However, the task may have been
>>     running for only part of that interval.
>> 2. For each task, the top tool periodically reads the /proc/{PID}/status
>>     information, so short-lived tasks may be missed entirely.
>>
>> In conclusion, the top tool cannot accurately collect statistics on the
>> CPU usage and running time of tasks.
>>
>> The statistical method based on the sched_switch tracepoint can accurately
>> calculate the CPU usage of all tasks. This method suits scenarios that
>> require high-precision performance comparison data.
>>
>> In addition to parsing the perf.data file, the tool provides BPF-based CPU
>> usage statistics to reduce the load of the perf process.
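
The sched_switch-based accounting can be sketched roughly as follows (a
simplified single-CPU illustration, not the actual perf kwork code;
account_runtime and the struct names are invented here): each interval
between consecutive switches is charged to the task that was running
during it, i.e. the prev_pid of the later event.

```c
#include <stddef.h>

#define MAX_PID 16

/* One sched_switch tracepoint hit on a single CPU. */
struct sw_event {
	unsigned long long ts_ns; /* event timestamp */
	int prev_pid;             /* task being switched out */
	int next_pid;             /* task being switched in */
};

/* Charge the time between consecutive switches to the task that was
 * running (the prev_pid of the later event).  The real tool keys this
 * per CPU and per task; a small pid-indexed array stands in for that. */
static void account_runtime(const struct sw_event *ev, size_t n,
			    unsigned long long runtime_ns[MAX_PID])
{
	for (size_t i = 1; i < n; i++)
		runtime_ns[ev[i].prev_pid] += ev[i].ts_ns - ev[i - 1].ts_ns;
}
```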
>>
>> Example usage:
>>
>>    # perf kwork top -h
>>
>>     Usage: perf kwork top [<options>]
>>
>>        -C, --cpu <cpu>       list of cpus to profile
>>        -i, --input <file>    input file name
>>        -n, --name <name>     event name to profile
>>        -s, --sort <key[,key2...]>
>>                              sort by key(s): rate, runtime, tid
>>            --time <str>      Time span for analysis (start,stop)
>>
>>    # ./perf kwork -k sched,irq,softirq record -- ./perf bench sched messaging -g 1 -l 10000
>>    # Running 'sched/messaging' benchmark:
>>    # 20 sender and receiver processes per group
>>    # 1 groups == 40 processes run
>>
>>         Total time: 1.854 [sec]
>>    [ perf record: Woken up 13 times to write data ]
>>    [ perf record: Captured and wrote 27.436 MB perf.data (235986 samples) ]
>>    #
>>    # perf kwork top
>>
>>    Total  :  22419.068 ms, 12 cpus
>>    %Cpu(s):  17.99% id,   0.00% hi,   0.09% si
>>    %Cpu0   [||||||||||||||||||||||||        82.56%]
>>    %Cpu1   [||||||||||||||||||||||||        81.73%]
>>    %Cpu2   [||||||||||||||||||||||||        83.00%]
>>    %Cpu3   [||||||||||||||||||||||||        81.01%]
>>    %Cpu4   [||||||||||||||||||||||||        81.66%]
>>    %Cpu5   [||||||||||||||||||||||||        82.10%]
>>    %Cpu6   [||||||||||||||||||||||||        81.65%]
>>    %Cpu7   [||||||||||||||||||||||||        82.00%]
>>    %Cpu8   [||||||||||||||||||||||||        82.54%]
>>    %Cpu9   [||||||||||||||||||||||||        81.02%]
>>    %Cpu10  [||||||||||||||||||||||||        81.78%]
>>    %Cpu11  [||||||||||||||||||||||||        81.87%]
>>
>>          PID    %CPU           RUNTIME  COMMAND
>>      ----------------------------------------------------
>>      2913025   27.62        516.976 ms  sched-messaging
>>      2913017   27.56        516.273 ms  sched-messaging
>>      2913018   27.56        516.152 ms  sched-messaging
>>      2913008   27.54        515.447 ms  sched-messaging
>>      2913010   27.50        514.890 ms  sched-messaging
>>      2913007   27.48        514.411 ms  sched-messaging
>>      2913015   27.36        512.563 ms  sched-messaging
>>      2913026   27.35        512.198 ms  sched-messaging
>>      2913011   27.32        511.632 ms  sched-messaging
>>      2913014   27.32        511.381 ms  sched-messaging
>>      2913012   27.30        511.138 ms  sched-messaging
>>      2913013   27.30        511.109 ms  sched-messaging
>>      2913009   27.30        510.997 ms  sched-messaging
>>      2913019   27.25        510.488 ms  sched-messaging
>>      2913021   27.23        509.661 ms  sched-messaging
>>      2913023   27.21        509.400 ms  sched-messaging
>>      2913020   27.20        509.677 ms  sched-messaging
>>      2913022   27.19        509.061 ms  sched-messaging
>>      2913016   27.18        509.231 ms  sched-messaging
>>      2913024   27.18        509.108 ms  sched-messaging
>>      2912999   21.79        407.929 ms  sched-messaging
>>      2913000   21.78        408.045 ms  sched-messaging
>>      2912998   21.75        407.579 ms  sched-messaging
>>      2912993   21.69        406.166 ms  sched-messaging
>>      2912991   21.59        404.378 ms  sched-messaging
>>      2912992   21.57        403.845 ms  sched-messaging
>>      2913005   21.56        403.715 ms  sched-messaging
>>      2913001   21.55        403.803 ms  sched-messaging
>>      2912990   21.54        403.601 ms  sched-messaging
>>      2912994   21.52        403.087 ms  sched-messaging
>>      2912995   21.49        402.421 ms  sched-messaging
>>      2912996   21.48        402.666 ms  sched-messaging
>>      2912989   21.46        402.303 ms  sched-messaging
>>      2913003   21.46        402.034 ms  sched-messaging
>>      2912988   21.44        401.820 ms  sched-messaging
>>      2913004   21.37        400.242 ms  sched-messaging
>>      2912987   21.28        398.512 ms  sched-messaging
>>      2913002   21.27        398.759 ms  sched-messaging
>>      2912997   21.27        398.518 ms  sched-messaging
>>      2913006   21.23        397.888 ms  sched-messaging
>>            0   18.89        353.061 ms  swapper/3
>>            0   18.89        353.026 ms  swapper/9
>>            0   18.26        341.190 ms  swapper/6
>>            0   18.25        341.048 ms  swapper/4
>>            0   18.13        338.844 ms  swapper/10
>>            0   18.12        338.686 ms  swapper/1
>>            0   18.05        337.333 ms  swapper/11
>>            0   17.92        334.903 ms  swapper/7
>>            0   17.80        332.691 ms  swapper/5
>>            0   17.38        324.762 ms  swapper/8
>>            0   17.29        323.102 ms  swapper/0
>>            0   16.89        315.630 ms  swapper/2
>>      2912984    2.10         39.452 ms  perf
>>      2912985    0.73         14.004 ms  sched-messaging
>>       660381    0.14          3.527 ms  containerd
>>      3006112    0.07          1.402 ms  redis-server
>>      2904655    0.05          1.071 ms  kworker/0:0
>>       659822    0.03          1.729 ms  containerd
>>       659824    0.03          1.346 ms  containerd
>>         1053    0.01          0.353 ms  zabbix_agentd
>>      2895571    0.01          0.349 ms  kworker/8:1
>>           10    0.00          0.535 ms  rcu_sched
>>      2912192    0.00          0.272 ms  kworker/u24:2
>>      2903540    0.00          0.182 ms  sshd
>>          801    0.00          0.120 ms  jbd2/sda4-8
>>      2929605    0.00          0.108 ms  grpc_global_tim
>>      2912986    0.00          0.075 ms  perf
>>      2897765    0.00          0.073 ms  kworker/7:1
>>      2912817    0.00          0.070 ms  kworker/2:1
>>       391756    0.00          0.067 ms  grpc_global_tim
>>      2901459    0.00          0.056 ms  kworker/3:0
>>      2908185    0.00          0.052 ms  kworker/10:1
>>      2901521    0.00          0.050 ms  kworker/6:1
>>      2902444    0.00          0.046 ms  kworker/5:1
>>      2898513    0.00          0.041 ms  kworker/1:1
>>      2912280    0.00          0.039 ms  kworker/11:3
>>         1057    0.00          0.037 ms  zabbix_agentd
>>      2908010    0.00          0.037 ms  kworker/9:3
>>      2912951    0.00          0.036 ms  kworker/4:3
>>          793    0.00          0.027 ms  multipathd
>>           42    0.00          0.023 ms  ksoftirqd/5
>>            9    0.00          0.022 ms  ksoftirqd/0
>>           24    0.00          0.019 ms  ksoftirqd/2
>>           59    0.00          0.019 ms  migration/8
>>      2929607    0.00          0.018 ms  wecode-db
>>           29    0.00          0.012 ms  migration/3
>>           11    0.00          0.012 ms  migration/0
>>           65    0.00          0.012 ms  migration/9
>>           17    0.00          0.011 ms  migration/1
>>           71    0.00          0.011 ms  migration/10
>>           53    0.00          0.011 ms  migration/7
>>           23    0.00          0.011 ms  migration/2
>>           47    0.00          0.011 ms  migration/6
>>          789    0.00          0.010 ms  multipathd
>>           35    0.00          0.010 ms  migration/4
>>          480    0.00          0.009 ms  kworker/4:1H
>>           41    0.00          0.008 ms  migration/5
>>           54    0.00          0.006 ms  ksoftirqd/7
>>           36    0.00          0.005 ms  ksoftirqd/4
>>           72    0.00          0.005 ms  ksoftirqd/10
>>           66    0.00          0.005 ms  ksoftirqd/9
>>           30    0.00          0.004 ms  ksoftirqd/3
>>      2898362    0.00          0.003 ms  kworker/u24:0
>>
>>    # perf kwork top -C 2,4,5
>>
>>    Total  :   5604.885 ms, 3 cpus
>>    %Cpu(s):  17.65% id,   0.00% hi,   0.09% si
>>    %Cpu2   [||||||||||||||||||||||||        83.00%]
>>    %Cpu4   [||||||||||||||||||||||||        81.66%]
>>    %Cpu5   [||||||||||||||||||||||||        82.10%]
>>
>>          PID    %CPU           RUNTIME  COMMAND
>>      ----------------------------------------------------
>>            0   18.25        341.048 ms  swapper/4
>>            0   17.80        332.691 ms  swapper/5
>>            0   16.89        315.630 ms  swapper/2
>>      2913012    8.99        168.201 ms  sched-messaging
>>      2913021    8.74        163.456 ms  sched-messaging
>>      2913026    8.34        156.120 ms  sched-messaging
>>      2913010    8.14        152.372 ms  sched-messaging
>>      2913007    7.74        144.754 ms  sched-messaging
>>      2913016    7.42        138.977 ms  sched-messaging
>>      2913022    7.28        136.279 ms  sched-messaging
>>      2913013    7.00        130.894 ms  sched-messaging
>>      2913019    6.96        130.455 ms  sched-messaging
>>      2913023    6.91        129.364 ms  sched-messaging
>>      2913008    6.47        121.331 ms  sched-messaging
>>      2913018    6.47        121.124 ms  sched-messaging
>>      2912994    6.42        120.213 ms  sched-messaging
>>      2913025    6.42        120.140 ms  sched-messaging
>>      2913020    6.37        119.386 ms  sched-messaging
>>      2913011    6.35        118.921 ms  sched-messaging
>>      2913014    6.24        116.779 ms  sched-messaging
>>      2913009    6.01        112.661 ms  sched-messaging
>>      2913005    5.96        111.587 ms  sched-messaging
>>      2913003    5.94        111.142 ms  sched-messaging
>>      2913024    5.87        109.876 ms  sched-messaging
>>      2912988    5.84        109.396 ms  sched-messaging
>>      2912987    5.80        108.550 ms  sched-messaging
>>      2912990    5.51        103.322 ms  sched-messaging
>>      2912993    5.45        102.045 ms  sched-messaging
>>      2912996    5.42        101.622 ms  sched-messaging
>>      2913006    5.42        101.592 ms  sched-messaging
>>      2913000    5.41        101.511 ms  sched-messaging
>>      2913002    5.41        101.420 ms  sched-messaging
>>      2912997    5.40        101.315 ms  sched-messaging
>>      2913004    5.33         99.872 ms  sched-messaging
>>      2913015    5.29         99.252 ms  sched-messaging
>>      2913017    5.22         97.822 ms  sched-messaging
>>      2912991    5.11         95.773 ms  sched-messaging
>>      2912992    5.11         95.579 ms  sched-messaging
>>      2913001    5.07         95.033 ms  sched-messaging
>>      2912989    4.93         92.425 ms  sched-messaging
>>      2912995    4.85         90.892 ms  sched-messaging
>>      2912998    4.78         89.617 ms  sched-messaging
>>      2912999    4.56         85.576 ms  sched-messaging
>>      3006112    0.07          1.402 ms  redis-server
>>       660381    0.02          0.566 ms  containerd
>>      2912984    0.02          0.452 ms  perf
>>       659824    0.01          0.369 ms  containerd
>>       659822    0.00          0.316 ms  containerd
>>           10    0.00          0.176 ms  rcu_sched
>>          801    0.00          0.120 ms  jbd2/sda4-8
>>      2912985    0.00          0.107 ms  sched-messaging
>>      2912986    0.00          0.075 ms  perf
>>      2912817    0.00          0.070 ms  kworker/2:1
>>      2912192    0.00          0.056 ms  kworker/u24:2
>>      2929605    0.00          0.046 ms  grpc_global_tim
>>      2902444    0.00          0.046 ms  kworker/5:1
>>      2912951    0.00          0.036 ms  kworker/4:3
>>           42    0.00          0.023 ms  ksoftirqd/5
>>           24    0.00          0.019 ms  ksoftirqd/2
>>           23    0.00          0.011 ms  migration/2
>>           35    0.00          0.010 ms  migration/4
>>          480    0.00          0.009 ms  kworker/4:1H
>>           41    0.00          0.008 ms  migration/5
>>           36    0.00          0.005 ms  ksoftirqd/4
>>
>>    # perf kwork top -n perf
>>
>>    Total  :  22419.068 ms, 12 cpus
>>    %Cpu(s):  17.99% id,   0.00% hi,   0.09% si
>>    %Cpu0   [                                 0.00%]
>>    %Cpu1   [                                 0.00%]
>>    %Cpu2   [                                 0.02%]
>>    %Cpu3   [                                 0.00%]
>>    %Cpu4   [                                 0.00%]
>>    %Cpu5   [                                 0.00%]
>>    %Cpu6   [                                 0.00%]
>>    %Cpu7   [                                 0.00%]
>>    %Cpu8   [                                 0.03%]
>>    %Cpu9   [                                 0.09%]
>>    %Cpu10  [                                 0.00%]
>>    %Cpu11  [                                 1.96%]
>>
>>          PID    %CPU           RUNTIME  COMMAND
>>      ----------------------------------------------------
>>      2912984    2.10         39.452 ms  perf
>>      2912986    0.00          0.075 ms  perf
>>
>>    # perf kwork top -s tid -n sched-messaging
>>
>>    Total  :  22419.068 ms, 12 cpus
>>    %Cpu(s):  17.99% id,   0.00% hi,   0.09% si
>>    %Cpu0   [||||||||||||||||||||||||        81.80%]
>>    %Cpu1   [||||||||||||||||||||||||        81.65%]
>>    %Cpu2   [||||||||||||||||||||||||        82.88%]
>>    %Cpu3   [||||||||||||||||||||||||        80.96%]
>>    %Cpu4   [||||||||||||||||||||||||        81.64%]
>>    %Cpu5   [||||||||||||||||||||||||        82.04%]
>>    %Cpu6   [||||||||||||||||||||||||        81.63%]
>>    %Cpu7   [||||||||||||||||||||||||        81.97%]
>>    %Cpu8   [||||||||||||||||||||||||        82.43%]
>>    %Cpu9   [||||||||||||||||||||||||        80.86%]
>>    %Cpu10  [||||||||||||||||||||||||        81.72%]
>>    %Cpu11  [|||||||||||||||||||||||         79.89%]
>>
>>          PID    %CPU           RUNTIME  COMMAND
>>      ----------------------------------------------------
>>      2912985    0.04          1.111 ms  sched-messaging
>>      2912987   21.28        398.512 ms  sched-messaging
>>      2912988   21.44        401.820 ms  sched-messaging
>>      2912989   21.46        402.303 ms  sched-messaging
>>      2912990   21.54        403.601 ms  sched-messaging
>>      2912991   21.59        404.378 ms  sched-messaging
>>      2912992   21.57        403.845 ms  sched-messaging
>>      2912993   21.69        406.166 ms  sched-messaging
>>      2912994   21.52        403.087 ms  sched-messaging
>>      2912995   21.49        402.421 ms  sched-messaging
>>      2912996   21.48        402.666 ms  sched-messaging
>>      2912997   21.27        398.518 ms  sched-messaging
>>      2912998   21.75        407.579 ms  sched-messaging
>>      2912999   21.79        407.929 ms  sched-messaging
>>      2913000   21.78        408.045 ms  sched-messaging
>>      2913001   21.55        403.803 ms  sched-messaging
>>      2913002   21.27        398.759 ms  sched-messaging
>>      2913003   21.46        402.034 ms  sched-messaging
>>      2913004   21.37        400.242 ms  sched-messaging
>>      2913005   21.56        403.715 ms  sched-messaging
>>      2913006   21.23        397.888 ms  sched-messaging
>>      2913007   27.48        514.411 ms  sched-messaging
>>      2913008   27.54        515.447 ms  sched-messaging
>>      2913009   27.30        510.997 ms  sched-messaging
>>      2913010   27.50        514.890 ms  sched-messaging
>>      2913011   27.32        511.632 ms  sched-messaging
>>      2913012   27.30        511.138 ms  sched-messaging
>>      2913013   27.30        511.109 ms  sched-messaging
>>      2913014   27.32        511.381 ms  sched-messaging
>>      2913015   27.36        512.563 ms  sched-messaging
>>      2913016   27.18        509.231 ms  sched-messaging
>>      2913017   27.56        516.273 ms  sched-messaging
>>      2913018   27.56        516.152 ms  sched-messaging
>>      2913019   27.25        510.488 ms  sched-messaging
>>      2913020   27.20        509.677 ms  sched-messaging
>>      2913021   27.23        509.661 ms  sched-messaging
>>      2913022   27.19        509.061 ms  sched-messaging
>>      2913023   27.21        509.400 ms  sched-messaging
>>      2913024   27.18        509.108 ms  sched-messaging
>>      2913025   27.62        516.976 ms  sched-messaging
>>      2913026   27.35        512.198 ms  sched-messaging
>>
>>    # perf kwork top -b
>>    Starting trace, Hit <Ctrl+C> to stop and report
>>    ^C
>>    Total  :  89209.004 ms, 4 cpus
>>    %Cpu(s):  13.47% id,   0.01% hi,   0.24% si
>>    %Cpu0   [|||||||||||||||||||||||||       86.08%]
>>    %Cpu1   [||||||||||||||||||||||||||      87.08%]
>>    %Cpu2   [|||||||||||||||||||||||||       86.10%]
>>    %Cpu3   [|||||||||||||||||||||||||       85.86%]
>>
>>          PID     SPID    %CPU           RUNTIME  COMMAND
>>      -------------------------------------------------------------
>>            0        0   13.86       3092.482 ms  [swapper/3]
>>            0        0   13.71       3057.756 ms  [swapper/0]
>>            0        0   13.65       3044.987 ms  [swapper/2]
>>            0        0   12.63       2818.079 ms  [swapper/1]
>>          320      320    1.32        295.427 ms  bash
>>          414      320    1.09        247.658 ms  sched-messaging
>>          678      320    1.09        246.881 ms  sched-messaging
>>          354      320    1.09        246.683 ms  sched-messaging
>>          409      320    1.08        246.839 ms  sched-messaging
>>          554      320    1.07        243.045 ms  sched-messaging
>>          405      320    1.04        237.340 ms  sched-messaging
>>          525      320    1.04        235.718 ms  sched-messaging
>>          486      320    1.03        235.369 ms  sched-messaging
>>          714      320    1.03        235.142 ms  sched-messaging
>>          435      320    1.03        233.290 ms  sched-messaging
>>          408      320    1.03        231.687 ms  sched-messaging
>>          716      320    1.02        233.513 ms  sched-messaging
>>          453      320    1.02        233.181 ms  sched-messaging
>>          522      320    1.02        230.101 ms  sched-messaging
>>          651      320    1.02        229.554 ms  sched-messaging
>>          418      320    1.01        231.359 ms  sched-messaging
>>          644      320    1.01        229.848 ms  sched-messaging
>>          535      320    1.01        229.655 ms  sched-messaging
>>          536      320    1.01        229.147 ms  sched-messaging
>>          410      320    1.01        227.700 ms  sched-messaging
>>          689      320    1.00        227.988 ms  sched-messaging
>>          412      320    1.00        227.052 ms  sched-messaging
>>          489      320    1.00        226.374 ms  sched-messaging
>>          521      320    1.00        225.499 ms  sched-messaging
>>          530      320    1.00        225.262 ms  sched-messaging
>>          681      320    1.00        225.187 ms  sched-messaging
>>          415      320    0.99        225.714 ms  sched-messaging
>>          643      320    0.99        225.090 ms  sched-messaging
>>          325      320    0.99        223.385 ms  sched-messaging
>>          498      320    0.99        222.936 ms  sched-messaging
>>          413      320    0.98        225.213 ms  sched-messaging
>>          645      320    0.98        223.211 ms  sched-messaging
>>          544      320    0.98        222.714 ms  sched-messaging
>>          441      320    0.98        222.590 ms  sched-messaging
>>          697      320    0.98        222.426 ms  sched-messaging
>>          523      320    0.98        221.841 ms  sched-messaging
>>          402      320    0.98        221.776 ms  sched-messaging
>>    <SNIP>
>>
>> Yang Jihong (16):
>>    perf kwork: Fix incorrect and missing free atom in work_push_atom()
>>    perf kwork: Add the supported subcommands to the document
>>    perf kwork: Set ordered_events for perf_tool
>>    perf kwork: Add `kwork` and `src_type` to work_init() for struct
>>      kwork_class
>>    perf kwork: Overwrite original atom in the list when a new atom is
>>      pushed.
>>    perf kwork: Set default events list if not specified in
>>      setup_event_list()
>>    perf kwork: Add sched record support
>>    perf kwork: Add `root` parameter to work_sort()
>>    perf kwork: Implement perf kwork top
>>    perf evsel: Add evsel__intval_common() helper
>>    perf kwork top: Add statistics on hardirq event support
>>    perf kwork top: Add statistics on softirq event support
>>    perf kwork top: Add -C/--cpu -i/--input -n/--name -s/--sort --time
>>      options
>>    perf kwork top: Implements BPF-based cpu usage statistics
>>    perf kwork top: Add BPF-based statistics on hardirq event support
>>    perf kwork top: Add BPF-based statistics on softirq event support
> 
> Besides documentation nits, series:
> Reviewed-by: Ian Rogers <irogers@google.com>

Thanks for the review.

> 
> Whilst looking at the series, could we clean up
> tools/perf/util/kwork.h ? Generally none of the structs are commented.

Okay, I'll put up a separate patch and add comments about it.
> Some like:
> struct kwork_atom {
>         struct list_head list;
>         u64 time;
>         struct kwork_atom *prev;
> 
>         void *page_addr;
>         unsigned long bit_inpage;
> };
> 
> Why is it an atom? Why is there a prev when the kwork_atom is also on
> a list (which has a prev) ?
> 
An atom represents a minimum-granularity event.
For example, an "irq_handler_entry" event of an IRQ whose irqnum == 10 is
an atom.

The prev member stores the preceding event in the sequence.
For example, given the event sequence:
raise -> entry -> exit

Then:
entry_atom->prev = raise_atom

This relationship must be saved because "perf kwork timehist" needs to
calculate both the latency time and the running time of the work.

Thanks,
Yang

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC v1 00/16] perf kwork: Implement perf kwork top
  2023-09-04 11:59   ` Yang Jihong
@ 2023-09-04 14:54     ` Ian Rogers
  2023-09-05  1:01       ` Yang Jihong
  0 siblings, 1 reply; 26+ messages in thread
From: Ian Rogers @ 2023-09-04 14:54 UTC (permalink / raw)
  To: Yang Jihong
  Cc: peterz, mingo, acme, mark.rutland, alexander.shishkin, jolsa,
	namhyung, adrian.hunter, kan.liang, sandipan.das, ravi.bangoria,
	linux-kernel, linux-perf-users

On Mon, Sep 4, 2023 at 4:59 AM Yang Jihong <yangjihong1@huawei.com> wrote:
>
> Hello,
>
> On 2023/9/4 13:13, Ian Rogers wrote:
> > On Sat, Aug 12, 2023 at 1:52 AM Yang Jihong <yangjihong1@huawei.com> wrote:
> >>
> >> Some common tools for collecting statistics on CPU usage, such as top,
> >> obtain statistics from timer interrupt sampling, and then periodically
> >> read statistics from /proc/stat.
> >>
> >> This method has some deviations:
> >> 1. In the tick interrupt, the time between the last tick and the current
> >>     tick is charged entirely to the current task, even though the task
> >>     may have been running for only part of that interval.
> >> 2. The top tool periodically reads the /proc/{PID}/status information of
> >>     each task, so tasks with a short life cycle may be missed entirely.
> >>
> >> In conclusion, the top tool cannot accurately collect statistics on the
> >> CPU usage and running time of tasks.
> >>
> >> A statistical method based on the sched_switch tracepoint can accurately
> >> calculate the CPU usage of all tasks. This method is suitable for
> >> scenarios that require high-precision performance comparison data.
> >>
> >> In addition to parsing the perf.data file, perf kwork top also provides
> >> BPF-based CPU usage statistics to reduce the load of the perf process.
> >>
> >> Example usage:
> >>
> >>    # perf kwork top -h
> >>
> >>     Usage: perf kwork top [<options>]
> >>
> >>        -C, --cpu <cpu>       list of cpus to profile
> >>        -i, --input <file>    input file name
> >>        -n, --name <name>     event name to profile
> >>        -s, --sort <key[,key2...]>
> >>                              sort by key(s): rate, runtime, tid
> >>            --time <str>      Time span for analysis (start,stop)
> >>
> >>    # ./perf kwork -k sched,irq,softirq record -- ./perf bench sched messaging -g 1 -l 10000
> >>    # Running 'sched/messaging' benchmark:
> >>    # 20 sender and receiver processes per group
> >>    # 1 groups == 40 processes run
> >>
> >>         Total time: 1.854 [sec]
> >>    [ perf record: Woken up 13 times to write data ]
> >>    [ perf record: Captured and wrote 27.436 MB perf.data (235986 samples) ]
> >>    #
> >>    # perf kwork top
> >>
> >>    Total  :  22419.068 ms, 12 cpus
> >>    %Cpu(s):  17.99% id,   0.00% hi,   0.09% si
> >>    %Cpu0   [||||||||||||||||||||||||        82.56%]
> >>    %Cpu1   [||||||||||||||||||||||||        81.73%]
> >>    %Cpu2   [||||||||||||||||||||||||        83.00%]
> >>    %Cpu3   [||||||||||||||||||||||||        81.01%]
> >>    %Cpu4   [||||||||||||||||||||||||        81.66%]
> >>    %Cpu5   [||||||||||||||||||||||||        82.10%]
> >>    %Cpu6   [||||||||||||||||||||||||        81.65%]
> >>    %Cpu7   [||||||||||||||||||||||||        82.00%]
> >>    %Cpu8   [||||||||||||||||||||||||        82.54%]
> >>    %Cpu9   [||||||||||||||||||||||||        81.02%]
> >>    %Cpu10  [||||||||||||||||||||||||        81.78%]
> >>    %Cpu11  [||||||||||||||||||||||||        81.87%]
> >>
> >>          PID    %CPU           RUNTIME  COMMAND
> >>      ----------------------------------------------------
> >>      2913025   27.62        516.976 ms  sched-messaging
> >>      2913017   27.56        516.273 ms  sched-messaging
> >>      2913018   27.56        516.152 ms  sched-messaging
> >>      2913008   27.54        515.447 ms  sched-messaging
> >>      2913010   27.50        514.890 ms  sched-messaging
> >>      2913007   27.48        514.411 ms  sched-messaging
> >>      2913015   27.36        512.563 ms  sched-messaging
> >>      2913026   27.35        512.198 ms  sched-messaging
> >>      2913011   27.32        511.632 ms  sched-messaging
> >>      2913014   27.32        511.381 ms  sched-messaging
> >>      2913012   27.30        511.138 ms  sched-messaging
> >>      2913013   27.30        511.109 ms  sched-messaging
> >>      2913009   27.30        510.997 ms  sched-messaging
> >>      2913019   27.25        510.488 ms  sched-messaging
> >>      2913021   27.23        509.661 ms  sched-messaging
> >>      2913023   27.21        509.400 ms  sched-messaging
> >>      2913020   27.20        509.677 ms  sched-messaging
> >>      2913022   27.19        509.061 ms  sched-messaging
> >>      2913016   27.18        509.231 ms  sched-messaging
> >>      2913024   27.18        509.108 ms  sched-messaging
> >>      2912999   21.79        407.929 ms  sched-messaging
> >>      2913000   21.78        408.045 ms  sched-messaging
> >>      2912998   21.75        407.579 ms  sched-messaging
> >>      2912993   21.69        406.166 ms  sched-messaging
> >>      2912991   21.59        404.378 ms  sched-messaging
> >>      2912992   21.57        403.845 ms  sched-messaging
> >>      2913005   21.56        403.715 ms  sched-messaging
> >>      2913001   21.55        403.803 ms  sched-messaging
> >>      2912990   21.54        403.601 ms  sched-messaging
> >>      2912994   21.52        403.087 ms  sched-messaging
> >>      2912995   21.49        402.421 ms  sched-messaging
> >>      2912996   21.48        402.666 ms  sched-messaging
> >>      2912989   21.46        402.303 ms  sched-messaging
> >>      2913003   21.46        402.034 ms  sched-messaging
> >>      2912988   21.44        401.820 ms  sched-messaging
> >>      2913004   21.37        400.242 ms  sched-messaging
> >>      2912987   21.28        398.512 ms  sched-messaging
> >>      2913002   21.27        398.759 ms  sched-messaging
> >>      2912997   21.27        398.518 ms  sched-messaging
> >>      2913006   21.23        397.888 ms  sched-messaging
> >>            0   18.89        353.061 ms  swapper/3
> >>            0   18.89        353.026 ms  swapper/9
> >>            0   18.26        341.190 ms  swapper/6
> >>            0   18.25        341.048 ms  swapper/4
> >>            0   18.13        338.844 ms  swapper/10
> >>            0   18.12        338.686 ms  swapper/1
> >>            0   18.05        337.333 ms  swapper/11
> >>            0   17.92        334.903 ms  swapper/7
> >>            0   17.80        332.691 ms  swapper/5
> >>            0   17.38        324.762 ms  swapper/8
> >>            0   17.29        323.102 ms  swapper/0
> >>            0   16.89        315.630 ms  swapper/2
> >>      2912984    2.10         39.452 ms  perf
> >>      2912985    0.73         14.004 ms  sched-messaging
> >>       660381    0.14          3.527 ms  containerd
> >>      3006112    0.07          1.402 ms  redis-server
> >>      2904655    0.05          1.071 ms  kworker/0:0
> >>       659822    0.03          1.729 ms  containerd
> >>       659824    0.03          1.346 ms  containerd
> >>         1053    0.01          0.353 ms  zabbix_agentd
> >>      2895571    0.01          0.349 ms  kworker/8:1
> >>           10    0.00          0.535 ms  rcu_sched
> >>      2912192    0.00          0.272 ms  kworker/u24:2
> >>      2903540    0.00          0.182 ms  sshd
> >>          801    0.00          0.120 ms  jbd2/sda4-8
> >>      2929605    0.00          0.108 ms  grpc_global_tim
> >>      2912986    0.00          0.075 ms  perf
> >>      2897765    0.00          0.073 ms  kworker/7:1
> >>      2912817    0.00          0.070 ms  kworker/2:1
> >>       391756    0.00          0.067 ms  grpc_global_tim
> >>      2901459    0.00          0.056 ms  kworker/3:0
> >>      2908185    0.00          0.052 ms  kworker/10:1
> >>      2901521    0.00          0.050 ms  kworker/6:1
> >>      2902444    0.00          0.046 ms  kworker/5:1
> >>      2898513    0.00          0.041 ms  kworker/1:1
> >>      2912280    0.00          0.039 ms  kworker/11:3
> >>         1057    0.00          0.037 ms  zabbix_agentd
> >>      2908010    0.00          0.037 ms  kworker/9:3
> >>      2912951    0.00          0.036 ms  kworker/4:3
> >>          793    0.00          0.027 ms  multipathd
> >>           42    0.00          0.023 ms  ksoftirqd/5
> >>            9    0.00          0.022 ms  ksoftirqd/0
> >>           24    0.00          0.019 ms  ksoftirqd/2
> >>           59    0.00          0.019 ms  migration/8
> >>      2929607    0.00          0.018 ms  wecode-db
> >>           29    0.00          0.012 ms  migration/3
> >>           11    0.00          0.012 ms  migration/0
> >>           65    0.00          0.012 ms  migration/9
> >>           17    0.00          0.011 ms  migration/1
> >>           71    0.00          0.011 ms  migration/10
> >>           53    0.00          0.011 ms  migration/7
> >>           23    0.00          0.011 ms  migration/2
> >>           47    0.00          0.011 ms  migration/6
> >>          789    0.00          0.010 ms  multipathd
> >>           35    0.00          0.010 ms  migration/4
> >>          480    0.00          0.009 ms  kworker/4:1H
> >>           41    0.00          0.008 ms  migration/5
> >>           54    0.00          0.006 ms  ksoftirqd/7
> >>           36    0.00          0.005 ms  ksoftirqd/4
> >>           72    0.00          0.005 ms  ksoftirqd/10
> >>           66    0.00          0.005 ms  ksoftirqd/9
> >>           30    0.00          0.004 ms  ksoftirqd/3
> >>      2898362    0.00          0.003 ms  kworker/u24:0
> >>
> >>    <SNIP>
> >>
> >> Yang Jihong (16):
> >>    perf kwork: Fix incorrect and missing free atom in work_push_atom()
> >>    perf kwork: Add the supported subcommands to the document
> >>    perf kwork: Set ordered_events for perf_tool
> >>    perf kwork: Add `kwork` and `src_type` to work_init() for struct
> >>      kwork_class
> >>    perf kwork: Overwrite original atom in the list when a new atom is
> >>      pushed.
> >>    perf kwork: Set default events list if not specified in
> >>      setup_event_list()
> >>    perf kwork: Add sched record support
> >>    perf kwork: Add `root` parameter to work_sort()
> >>    perf kwork: Implement perf kwork top
> >>    perf evsel: Add evsel__intval_common() helper
> >>    perf kwork top: Add statistics on hardirq event support
> >>    perf kwork top: Add statistics on softirq event support
> >>    perf kwork top: Add -C/--cpu -i/--input -n/--name -s/--sort --time
> >>      options
> >>    perf kwork top: Implements BPF-based cpu usage statistics
> >>    perf kwork top: Add BPF-based statistics on hardirq event support
> >>    perf kwork top: Add BPF-based statistics on softirq event support
> >
> > Besides documentation nits, series:
> > Reviewed-by: Ian Rogers <irogers@google.com>
>
> Thanks for the review.
>
> >
> > Whilst looking at the series, could we clean up
> > tools/perf/util/kwork.h ? Generally none of the structs are commented.
>
> Okay, I'll put up a separate patch and add comments about it.
> > Some like:
> >  > struct kwork_atom {
> >         struct list_head list;
> >         u64 time;
> >         struct kwork_atom *prev;
> >
> >         void *page_addr;
> >         unsigned long bit_inpage;
> > };
> >
> > Why is it an atom? Why is there a prev when the kwork_atom is also on
> > a list (which has a prev) ?
> >
> An atom represents a minimum-granularity event.
> For example, an "irq_handler_entry" event for an IRQ whose irqnum == 10
> is an atom.
>
> The prev member is used to store the sequence of events.
> For example, the sequence of events is as follows:
> raise -> entry -> exit
>
> Then:
> entry_atom->prev = raise_atom
>
> This relationship needs to be saved because we need to calculate both
> the latency time and the running time in the "perf kwork timehist"
> subcommand.

Thanks Yang, this is good context. Could it be added to the header
file? I'm also wondering at some point we should start generating
kerneldoc from comments in the code. I don't know if you want to look
into kerneldoc style comments.

Thanks,
Ian

> Thanks,
> Yang

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC v1 00/16] perf kwork: Implement perf kwork top
  2023-09-04 14:54     ` Ian Rogers
@ 2023-09-05  1:01       ` Yang Jihong
  0 siblings, 0 replies; 26+ messages in thread
From: Yang Jihong @ 2023-09-05  1:01 UTC (permalink / raw)
  To: Ian Rogers
  Cc: peterz, mingo, acme, mark.rutland, alexander.shishkin, jolsa,
	namhyung, adrian.hunter, kan.liang, sandipan.das, ravi.bangoria,
	linux-kernel, linux-perf-users

Hello,

On 2023/9/4 22:54, Ian Rogers wrote:
> On Mon, Sep 4, 2023 at 4:59 AM Yang Jihong <yangjihong1@huawei.com> wrote:
>>
>> Hello,
>>
>> On 2023/9/4 13:13, Ian Rogers wrote:
>>> On Sat, Aug 12, 2023 at 1:52 AM Yang Jihong <yangjihong1@huawei.com> wrote:
>>>>
>>>> Some common tools for collecting statistics on CPU usage, such as top,
>>>> obtain statistics from timer interrupt sampling, and then periodically
>>>> read statistics from /proc/stat.
>>>>
>>>> This method has some deviations:
>>>> 1. In the tick interrupt, the time between the last tick and the current
>>>>      tick is charged to the current task. However, the task may have been
>>>>      running for only part of that interval.
>>>> 2. For each task, the top tool periodically reads the /proc/{PID}/status
>>>>      information. Tasks with a short life cycle may be missed entirely.
>>>>
>>>> In conclusion, the top tool cannot accurately collect statistics on the
>>>> CPU usage and running time of tasks.
>>>>
>>>> The statistical method based on the sched_switch tracepoint can accurately
>>>> calculate the CPU usage of all tasks. This method is suitable for
>>>> scenarios that require high-precision performance comparison data.
>>>>
>>>> In addition to parsing the perf.data file, perf kwork top also provides
>>>> BPF-based CPU usage statistics to reduce the load of the perf process.
>>>>
>>>> Example usage:
>>>>
>>>>     # perf kwork top -h
>>>>
>>>>      Usage: perf kwork top [<options>]
>>>>
>>>>         -C, --cpu <cpu>       list of cpus to profile
>>>>         -i, --input <file>    input file name
>>>>         -n, --name <name>     event name to profile
>>>>         -s, --sort <key[,key2...]>
>>>>                               sort by key(s): rate, runtime, tid
>>>>             --time <str>      Time span for analysis (start,stop)
>>>>
>>>>     # ./perf kwork -k sched,irq,softirq record -- ./perf bench sched messaging -g 1 -l 10000
>>>>     # Running 'sched/messaging' benchmark:
>>>>     # 20 sender and receiver processes per group
>>>>     # 1 groups == 40 processes run
>>>>
>>>>          Total time: 1.854 [sec]
>>>>     [ perf record: Woken up 13 times to write data ]
>>>>     [ perf record: Captured and wrote 27.436 MB perf.data (235986 samples) ]
>>>>     #
>>>>     # perf kwork top
>>>>
>>>>     Total  :  22419.068 ms, 12 cpus
>>>>     %Cpu(s):  17.99% id,   0.00% hi,   0.09% si
>>>>     %Cpu0   [||||||||||||||||||||||||        82.56%]
>>>>     %Cpu1   [||||||||||||||||||||||||        81.73%]
>>>>     %Cpu2   [||||||||||||||||||||||||        83.00%]
>>>>     %Cpu3   [||||||||||||||||||||||||        81.01%]
>>>>     %Cpu4   [||||||||||||||||||||||||        81.66%]
>>>>     %Cpu5   [||||||||||||||||||||||||        82.10%]
>>>>     %Cpu6   [||||||||||||||||||||||||        81.65%]
>>>>     %Cpu7   [||||||||||||||||||||||||        82.00%]
>>>>     %Cpu8   [||||||||||||||||||||||||        82.54%]
>>>>     %Cpu9   [||||||||||||||||||||||||        81.02%]
>>>>     %Cpu10  [||||||||||||||||||||||||        81.78%]
>>>>     %Cpu11  [||||||||||||||||||||||||        81.87%]
>>>>
>>>>           PID    %CPU           RUNTIME  COMMMAND
>>>>       ----------------------------------------------------
>>>>       2913025   27.62        516.976 ms  sched-messaging
>>>>       2913017   27.56        516.273 ms  sched-messaging
>>>>       2913018   27.56        516.152 ms  sched-messaging
>>>>       2913008   27.54        515.447 ms  sched-messaging
>>>>       2913010   27.50        514.890 ms  sched-messaging
>>>>       2913007   27.48        514.411 ms  sched-messaging
>>>>       2913015   27.36        512.563 ms  sched-messaging
>>>>       2913026   27.35        512.198 ms  sched-messaging
>>>>       2913011   27.32        511.632 ms  sched-messaging
>>>>       2913014   27.32        511.381 ms  sched-messaging
>>>>       2913012   27.30        511.138 ms  sched-messaging
>>>>       2913013   27.30        511.109 ms  sched-messaging
>>>>       2913009   27.30        510.997 ms  sched-messaging
>>>>       2913019   27.25        510.488 ms  sched-messaging
>>>>       2913021   27.23        509.661 ms  sched-messaging
>>>>       2913023   27.21        509.400 ms  sched-messaging
>>>>       2913020   27.20        509.677 ms  sched-messaging
>>>>       2913022   27.19        509.061 ms  sched-messaging
>>>>       2913016   27.18        509.231 ms  sched-messaging
>>>>       2913024   27.18        509.108 ms  sched-messaging
>>>>       2912999   21.79        407.929 ms  sched-messaging
>>>>       2913000   21.78        408.045 ms  sched-messaging
>>>>       2912998   21.75        407.579 ms  sched-messaging
>>>>       2912993   21.69        406.166 ms  sched-messaging
>>>>       2912991   21.59        404.378 ms  sched-messaging
>>>>       2912992   21.57        403.845 ms  sched-messaging
>>>>       2913005   21.56        403.715 ms  sched-messaging
>>>>       2913001   21.55        403.803 ms  sched-messaging
>>>>       2912990   21.54        403.601 ms  sched-messaging
>>>>       2912994   21.52        403.087 ms  sched-messaging
>>>>       2912995   21.49        402.421 ms  sched-messaging
>>>>       2912996   21.48        402.666 ms  sched-messaging
>>>>       2912989   21.46        402.303 ms  sched-messaging
>>>>       2913003   21.46        402.034 ms  sched-messaging
>>>>       2912988   21.44        401.820 ms  sched-messaging
>>>>       2913004   21.37        400.242 ms  sched-messaging
>>>>       2912987   21.28        398.512 ms  sched-messaging
>>>>       2913002   21.27        398.759 ms  sched-messaging
>>>>       2912997   21.27        398.518 ms  sched-messaging
>>>>       2913006   21.23        397.888 ms  sched-messaging
>>>>             0   18.89        353.061 ms  swapper/3
>>>>             0   18.89        353.026 ms  swapper/9
>>>>             0   18.26        341.190 ms  swapper/6
>>>>             0   18.25        341.048 ms  swapper/4
>>>>             0   18.13        338.844 ms  swapper/10
>>>>             0   18.12        338.686 ms  swapper/1
>>>>             0   18.05        337.333 ms  swapper/11
>>>>             0   17.92        334.903 ms  swapper/7
>>>>             0   17.80        332.691 ms  swapper/5
>>>>             0   17.38        324.762 ms  swapper/8
>>>>             0   17.29        323.102 ms  swapper/0
>>>>             0   16.89        315.630 ms  swapper/2
>>>>       2912984    2.10         39.452 ms  perf
>>>>       2912985    0.73         14.004 ms  sched-messaging
>>>>        660381    0.14          3.527 ms  containerd
>>>>       3006112    0.07          1.402 ms  redis-server
>>>>       2904655    0.05          1.071 ms  kworker/0:0
>>>>        659822    0.03          1.729 ms  containerd
>>>>        659824    0.03          1.346 ms  containerd
>>>>          1053    0.01          0.353 ms  zabbix_agentd
>>>>       2895571    0.01          0.349 ms  kworker/8:1
>>>>            10    0.00          0.535 ms  rcu_sched
>>>>       2912192    0.00          0.272 ms  kworker/u24:2
>>>>       2903540    0.00          0.182 ms  sshd
>>>>           801    0.00          0.120 ms  jbd2/sda4-8
>>>>       2929605    0.00          0.108 ms  grpc_global_tim
>>>>       2912986    0.00          0.075 ms  perf
>>>>       2897765    0.00          0.073 ms  kworker/7:1
>>>>       2912817    0.00          0.070 ms  kworker/2:1
>>>>        391756    0.00          0.067 ms  grpc_global_tim
>>>>       2901459    0.00          0.056 ms  kworker/3:0
>>>>       2908185    0.00          0.052 ms  kworker/10:1
>>>>       2901521    0.00          0.050 ms  kworker/6:1
>>>>       2902444    0.00          0.046 ms  kworker/5:1
>>>>       2898513    0.00          0.041 ms  kworker/1:1
>>>>       2912280    0.00          0.039 ms  kworker/11:3
>>>>          1057    0.00          0.037 ms  zabbix_agentd
>>>>       2908010    0.00          0.037 ms  kworker/9:3
>>>>       2912951    0.00          0.036 ms  kworker/4:3
>>>>           793    0.00          0.027 ms  multipathd
>>>>            42    0.00          0.023 ms  ksoftirqd/5
>>>>             9    0.00          0.022 ms  ksoftirqd/0
>>>>            24    0.00          0.019 ms  ksoftirqd/2
>>>>            59    0.00          0.019 ms  migration/8
>>>>       2929607    0.00          0.018 ms  wecode-db
>>>>            29    0.00          0.012 ms  migration/3
>>>>            11    0.00          0.012 ms  migration/0
>>>>            65    0.00          0.012 ms  migration/9
>>>>            17    0.00          0.011 ms  migration/1
>>>>            71    0.00          0.011 ms  migration/10
>>>>            53    0.00          0.011 ms  migration/7
>>>>            23    0.00          0.011 ms  migration/2
>>>>            47    0.00          0.011 ms  migration/6
>>>>           789    0.00          0.010 ms  multipathd
>>>>            35    0.00          0.010 ms  migration/4
>>>>           480    0.00          0.009 ms  kworker/4:1H
>>>>            41    0.00          0.008 ms  migration/5
>>>>            54    0.00          0.006 ms  ksoftirqd/7
>>>>            36    0.00          0.005 ms  ksoftirqd/4
>>>>            72    0.00          0.005 ms  ksoftirqd/10
>>>>            66    0.00          0.005 ms  ksoftirqd/9
>>>>            30    0.00          0.004 ms  ksoftirqd/3
>>>>       2898362    0.00          0.003 ms  kworker/u24:0
>>>>
>>>>     # perf kwork top -C 2,4,5
>>>>
>>>>     Total  :   5604.885 ms, 3 cpus
>>>>     %Cpu(s):  17.65% id,   0.00% hi,   0.09% si
>>>>     %Cpu2   [||||||||||||||||||||||||        83.00%]
>>>>     %Cpu4   [||||||||||||||||||||||||        81.66%]
>>>>     %Cpu5   [||||||||||||||||||||||||        82.10%]
>>>>
>>>>           PID    %CPU           RUNTIME  COMMMAND
>>>>       ----------------------------------------------------
>>>>             0   18.25        341.048 ms  swapper/4
>>>>             0   17.80        332.691 ms  swapper/5
>>>>             0   16.89        315.630 ms  swapper/2
>>>>       2913012    8.99        168.201 ms  sched-messaging
>>>>       2913021    8.74        163.456 ms  sched-messaging
>>>>       2913026    8.34        156.120 ms  sched-messaging
>>>>       2913010    8.14        152.372 ms  sched-messaging
>>>>       2913007    7.74        144.754 ms  sched-messaging
>>>>       2913016    7.42        138.977 ms  sched-messaging
>>>>       2913022    7.28        136.279 ms  sched-messaging
>>>>       2913013    7.00        130.894 ms  sched-messaging
>>>>       2913019    6.96        130.455 ms  sched-messaging
>>>>       2913023    6.91        129.364 ms  sched-messaging
>>>>       2913008    6.47        121.331 ms  sched-messaging
>>>>       2913018    6.47        121.124 ms  sched-messaging
>>>>       2912994    6.42        120.213 ms  sched-messaging
>>>>       2913025    6.42        120.140 ms  sched-messaging
>>>>       2913020    6.37        119.386 ms  sched-messaging
>>>>       2913011    6.35        118.921 ms  sched-messaging
>>>>       2913014    6.24        116.779 ms  sched-messaging
>>>>       2913009    6.01        112.661 ms  sched-messaging
>>>>       2913005    5.96        111.587 ms  sched-messaging
>>>>       2913003    5.94        111.142 ms  sched-messaging
>>>>       2913024    5.87        109.876 ms  sched-messaging
>>>>       2912988    5.84        109.396 ms  sched-messaging
>>>>       2912987    5.80        108.550 ms  sched-messaging
>>>>       2912990    5.51        103.322 ms  sched-messaging
>>>>       2912993    5.45        102.045 ms  sched-messaging
>>>>       2912996    5.42        101.622 ms  sched-messaging
>>>>       2913006    5.42        101.592 ms  sched-messaging
>>>>       2913000    5.41        101.511 ms  sched-messaging
>>>>       2913002    5.41        101.420 ms  sched-messaging
>>>>       2912997    5.40        101.315 ms  sched-messaging
>>>>       2913004    5.33         99.872 ms  sched-messaging
>>>>       2913015    5.29         99.252 ms  sched-messaging
>>>>       2913017    5.22         97.822 ms  sched-messaging
>>>>       2912991    5.11         95.773 ms  sched-messaging
>>>>       2912992    5.11         95.579 ms  sched-messaging
>>>>       2913001    5.07         95.033 ms  sched-messaging
>>>>       2912989    4.93         92.425 ms  sched-messaging
>>>>       2912995    4.85         90.892 ms  sched-messaging
>>>>       2912998    4.78         89.617 ms  sched-messaging
>>>>       2912999    4.56         85.576 ms  sched-messaging
>>>>       3006112    0.07          1.402 ms  redis-server
>>>>        660381    0.02          0.566 ms  containerd
>>>>       2912984    0.02          0.452 ms  perf
>>>>        659824    0.01          0.369 ms  containerd
>>>>        659822    0.00          0.316 ms  containerd
>>>>            10    0.00          0.176 ms  rcu_sched
>>>>           801    0.00          0.120 ms  jbd2/sda4-8
>>>>       2912985    0.00          0.107 ms  sched-messaging
>>>>       2912986    0.00          0.075 ms  perf
>>>>       2912817    0.00          0.070 ms  kworker/2:1
>>>>       2912192    0.00          0.056 ms  kworker/u24:2
>>>>       2929605    0.00          0.046 ms  grpc_global_tim
>>>>       2902444    0.00          0.046 ms  kworker/5:1
>>>>       2912951    0.00          0.036 ms  kworker/4:3
>>>>            42    0.00          0.023 ms  ksoftirqd/5
>>>>            24    0.00          0.019 ms  ksoftirqd/2
>>>>            23    0.00          0.011 ms  migration/2
>>>>            35    0.00          0.010 ms  migration/4
>>>>           480    0.00          0.009 ms  kworker/4:1H
>>>>            41    0.00          0.008 ms  migration/5
>>>>            36    0.00          0.005 ms  ksoftirqd/4
>>>>
>>>>     # perf kwork top -n perf
>>>>
>>>>     Total  :  22419.068 ms, 12 cpus
>>>>     %Cpu(s):  17.99% id,   0.00% hi,   0.09% si
>>>>     %Cpu0   [                                 0.00%]
>>>>     %Cpu1   [                                 0.00%]
>>>>     %Cpu2   [                                 0.02%]
>>>>     %Cpu3   [                                 0.00%]
>>>>     %Cpu4   [                                 0.00%]
>>>>     %Cpu5   [                                 0.00%]
>>>>     %Cpu6   [                                 0.00%]
>>>>     %Cpu7   [                                 0.00%]
>>>>     %Cpu8   [                                 0.03%]
>>>>     %Cpu9   [                                 0.09%]
>>>>     %Cpu10  [                                 0.00%]
>>>>     %Cpu11  [                                 1.96%]
>>>>
>>>>           PID    %CPU           RUNTIME  COMMMAND
>>>>       ----------------------------------------------------
>>>>       2912984    2.10         39.452 ms  perf
>>>>       2912986    0.00          0.075 ms  perf
>>>>
>>>>     # perf kwork top -s tid -n sched-messaging
>>>>
>>>>     Total  :  22419.068 ms, 12 cpus
>>>>     %Cpu(s):  17.99% id,   0.00% hi,   0.09% si
>>>>     %Cpu0   [||||||||||||||||||||||||        81.80%]
>>>>     %Cpu1   [||||||||||||||||||||||||        81.65%]
>>>>     %Cpu2   [||||||||||||||||||||||||        82.88%]
>>>>     %Cpu3   [||||||||||||||||||||||||        80.96%]
>>>>     %Cpu4   [||||||||||||||||||||||||        81.64%]
>>>>     %Cpu5   [||||||||||||||||||||||||        82.04%]
>>>>     %Cpu6   [||||||||||||||||||||||||        81.63%]
>>>>     %Cpu7   [||||||||||||||||||||||||        81.97%]
>>>>     %Cpu8   [||||||||||||||||||||||||        82.43%]
>>>>     %Cpu9   [||||||||||||||||||||||||        80.86%]
>>>>     %Cpu10  [||||||||||||||||||||||||        81.72%]
>>>>     %Cpu11  [|||||||||||||||||||||||         79.89%]
>>>>
>>>>           PID    %CPU           RUNTIME  COMMMAND
>>>>       ----------------------------------------------------
>>>>       2912985    0.04          1.111 ms  sched-messaging
>>>>       2912987   21.28        398.512 ms  sched-messaging
>>>>       2912988   21.44        401.820 ms  sched-messaging
>>>>       2912989   21.46        402.303 ms  sched-messaging
>>>>       2912990   21.54        403.601 ms  sched-messaging
>>>>       2912991   21.59        404.378 ms  sched-messaging
>>>>       2912992   21.57        403.845 ms  sched-messaging
>>>>       2912993   21.69        406.166 ms  sched-messaging
>>>>       2912994   21.52        403.087 ms  sched-messaging
>>>>       2912995   21.49        402.421 ms  sched-messaging
>>>>       2912996   21.48        402.666 ms  sched-messaging
>>>>       2912997   21.27        398.518 ms  sched-messaging
>>>>       2912998   21.75        407.579 ms  sched-messaging
>>>>       2912999   21.79        407.929 ms  sched-messaging
>>>>       2913000   21.78        408.045 ms  sched-messaging
>>>>       2913001   21.55        403.803 ms  sched-messaging
>>>>       2913002   21.27        398.759 ms  sched-messaging
>>>>       2913003   21.46        402.034 ms  sched-messaging
>>>>       2913004   21.37        400.242 ms  sched-messaging
>>>>       2913005   21.56        403.715 ms  sched-messaging
>>>>       2913006   21.23        397.888 ms  sched-messaging
>>>>       2913007   27.48        514.411 ms  sched-messaging
>>>>       2913008   27.54        515.447 ms  sched-messaging
>>>>       2913009   27.30        510.997 ms  sched-messaging
>>>>       2913010   27.50        514.890 ms  sched-messaging
>>>>       2913011   27.32        511.632 ms  sched-messaging
>>>>       2913012   27.30        511.138 ms  sched-messaging
>>>>       2913013   27.30        511.109 ms  sched-messaging
>>>>       2913014   27.32        511.381 ms  sched-messaging
>>>>       2913015   27.36        512.563 ms  sched-messaging
>>>>       2913016   27.18        509.231 ms  sched-messaging
>>>>       2913017   27.56        516.273 ms  sched-messaging
>>>>       2913018   27.56        516.152 ms  sched-messaging
>>>>       2913019   27.25        510.488 ms  sched-messaging
>>>>       2913020   27.20        509.677 ms  sched-messaging
>>>>       2913021   27.23        509.661 ms  sched-messaging
>>>>       2913022   27.19        509.061 ms  sched-messaging
>>>>       2913023   27.21        509.400 ms  sched-messaging
>>>>       2913024   27.18        509.108 ms  sched-messaging
>>>>       2913025   27.62        516.976 ms  sched-messaging
>>>>       2913026   27.35        512.198 ms  sched-messaging
>>>>
>>>>     # perf kwork top -b
>>>>     Starting trace, Hit <Ctrl+C> to stop and report
>>>>     ^C
>>>>     Total  :  89209.004 ms, 4 cpus
>>>>     %Cpu(s):  13.47% id,   0.01% hi,   0.24% si
>>>>     %Cpu0   [|||||||||||||||||||||||||       86.08%]
>>>>     %Cpu1   [||||||||||||||||||||||||||      87.08%]
>>>>     %Cpu2   [|||||||||||||||||||||||||       86.10%]
>>>>     %Cpu3   [|||||||||||||||||||||||||       85.86%]
>>>>
>>>>           PID     SPID    %CPU           RUNTIME  COMMMAND
>>>>       -------------------------------------------------------------
>>>>             0        0   13.86       3092.482 ms  [swapper/3]
>>>>             0        0   13.71       3057.756 ms  [swapper/0]
>>>>             0        0   13.65       3044.987 ms  [swapper/2]
>>>>             0        0   12.63       2818.079 ms  [swapper/1]
>>>>           320      320    1.32        295.427 ms  bash
>>>>           414      320    1.09        247.658 ms  sched-messaging
>>>>           678      320    1.09        246.881 ms  sched-messaging
>>>>           354      320    1.09        246.683 ms  sched-messaging
>>>>           409      320    1.08        246.839 ms  sched-messaging
>>>>           554      320    1.07        243.045 ms  sched-messaging
>>>>           405      320    1.04        237.340 ms  sched-messaging
>>>>           525      320    1.04        235.718 ms  sched-messaging
>>>>           486      320    1.03        235.369 ms  sched-messaging
>>>>           714      320    1.03        235.142 ms  sched-messaging
>>>>           435      320    1.03        233.290 ms  sched-messaging
>>>>           408      320    1.03        231.687 ms  sched-messaging
>>>>           716      320    1.02        233.513 ms  sched-messaging
>>>>           453      320    1.02        233.181 ms  sched-messaging
>>>>           522      320    1.02        230.101 ms  sched-messaging
>>>>           651      320    1.02        229.554 ms  sched-messaging
>>>>           418      320    1.01        231.359 ms  sched-messaging
>>>>           644      320    1.01        229.848 ms  sched-messaging
>>>>           535      320    1.01        229.655 ms  sched-messaging
>>>>           536      320    1.01        229.147 ms  sched-messaging
>>>>           410      320    1.01        227.700 ms  sched-messaging
>>>>           689      320    1.00        227.988 ms  sched-messaging
>>>>           412      320    1.00        227.052 ms  sched-messaging
>>>>           489      320    1.00        226.374 ms  sched-messaging
>>>>           521      320    1.00        225.499 ms  sched-messaging
>>>>           530      320    1.00        225.262 ms  sched-messaging
>>>>           681      320    1.00        225.187 ms  sched-messaging
>>>>           415      320    0.99        225.714 ms  sched-messaging
>>>>           643      320    0.99        225.090 ms  sched-messaging
>>>>           325      320    0.99        223.385 ms  sched-messaging
>>>>           498      320    0.99        222.936 ms  sched-messaging
>>>>           413      320    0.98        225.213 ms  sched-messaging
>>>>           645      320    0.98        223.211 ms  sched-messaging
>>>>           544      320    0.98        222.714 ms  sched-messaging
>>>>           441      320    0.98        222.590 ms  sched-messaging
>>>>           697      320    0.98        222.426 ms  sched-messaging
>>>>           523      320    0.98        221.841 ms  sched-messaging
>>>>           402      320    0.98        221.776 ms  sched-messaging
>>>>     <SNIP>
>>>>
>>>> Yang Jihong (16):
>>>>     perf kwork: Fix incorrect and missing free atom in work_push_atom()
>>>>     perf kwork: Add the supported subcommands to the document
>>>>     perf kwork: Set ordered_events for perf_tool
>>>>     perf kwork: Add `kwork` and `src_type` to work_init() for struct
>>>>       kwork_class
>>>>     perf kwork: Overwrite original atom in the list when a new atom is
>>>>       pushed.
>>>>     perf kwork: Set default events list if not specified in
>>>>       setup_event_list()
>>>>     perf kwork: Add sched record support
>>>>     perf kwork: Add `root` parameter to work_sort()
>>>>     perf kwork: Implement perf kwork top
>>>>     perf evsel: Add evsel__intval_common() helper
>>>>     perf kwork top: Add statistics on hardirq event support
>>>>     perf kwork top: Add statistics on softirq event support
>>>>     perf kwork top: Add -C/--cpu -i/--input -n/--name -s/--sort --time
>>>>       options
>>>>     perf kwork top: Implements BPF-based cpu usage statistics
>>>>     perf kwork top: Add BPF-based statistics on hardirq event support
>>>>     perf kwork top: Add BPF-based statistics on softirq event support
>>>
>>> Besides documentation nits, series:
>>> Reviewed-by: Ian Rogers <irogers@google.com>
>>
>> Thanks for the review.
>>
>>>
>>> Whilst looking at the series, could we clean up
>>> tools/perf/util/kwork.h ? Generally none of the structs are commented.
>>
>> Okay, I'll put up a separate patch and add comments about it.
>>> Some like:
>>>   > struct kwork_atom {
>>>          struct list_head list;
>>>          u64 time;
>>>          struct kwork_atom *prev;
>>>
>>>          void *page_addr;
>>>          unsigned long bit_inpage;
>>> };
>>>
>>> Why is it an atom? Why is there a prev when the kwork_atom is also on
>>> a list (which has a prev) ?
>>>
>> An atom represents a minimum-granularity event.
>> For example, an "irq_handler_entry" event for an IRQ whose irqnum == 10
>> is an atom.
>>
>> The prev member is used to store the sequence of events.
>> For example, the sequence of events is as follows:
>> raise -> entry -> exit
>>
>> Then:
>> entry_atom->prev = raise_atom
>>
>> This relationship needs to be saved because we need to calculate both
>> the latency time and the running time in the "perf kwork timehist"
>> subcommand.
> 
> Thanks Yang, this is good context. Could it be added to the header

Okay, I can submit a separate patch with comments.

> file? I'm also wondering at some point we should start generating
> kerneldoc from comments in the code. I don't know if you want to look
> into kerneldoc style comments.

Of course, could you tell me how to do it?
Not just for perf kwork, but for the whole perf tool code; I can do some 
work on this.

> 
> Thanks,
> Ian
> 
>> Thanks,
>> Yang
> 
> .
> 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC v1 01/16] perf kwork: Fix incorrect and missing free atom in work_push_atom()
  2023-09-04  4:05   ` Ian Rogers
  2023-09-04 11:42     ` Yang Jihong
@ 2023-09-06 16:43     ` Arnaldo Carvalho de Melo
  1 sibling, 0 replies; 26+ messages in thread
From: Arnaldo Carvalho de Melo @ 2023-09-06 16:43 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Yang Jihong, peterz, mingo, mark.rutland, alexander.shishkin,
	jolsa, namhyung, adrian.hunter, kan.liang, sandipan.das,
	ravi.bangoria, linux-kernel, linux-perf-users

Em Sun, Sep 03, 2023 at 09:05:06PM -0700, Ian Rogers escreveu:
> On Sat, Aug 12, 2023 at 1:52 AM Yang Jihong <yangjihong1@huawei.com> wrote:
> >
> > 1. Atoms are managed in page mode and should be released using atom_free()
> >    instead of free().
> > 2. When the event does not match, the atom needs to be freed.
> >
> > Fixes: f98919ec4fcc ("perf kwork: Implement 'report' subcommand")
> > Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
> 
> Reviewed-by: Ian Rogers <irogers@google.com>

Thanks, applied locally, will perform some tests and later today
probably push publicly, great work!

- Arnaldo

 
> > ---
> >  tools/perf/builtin-kwork.c | 6 ++++--
> >  1 file changed, 4 insertions(+), 2 deletions(-)
> >
> > diff --git a/tools/perf/builtin-kwork.c b/tools/perf/builtin-kwork.c
> > index 14bf7a8429e7..73b5dc099a8a 100644
> > --- a/tools/perf/builtin-kwork.c
> > +++ b/tools/perf/builtin-kwork.c
> > @@ -406,12 +406,14 @@ static int work_push_atom(struct perf_kwork *kwork,
> >
> >         work = work_findnew(&class->work_root, &key, &kwork->cmp_id);
> >         if (work == NULL) {
> > -               free(atom);
> > +               atom_free(atom);
> 
> Presumably this is fixing a memory-leak. It would be nice if kwork had
> perf tests, then our builds with -fsanitize=address would highlight
> this kind of issue. Here is a build command I use for this:
> make -C tools/perf O=/tmp/perf DEBUG=1 EXTRA_CFLAGS="-O0 -g
> -fno-omit-frame-pointer -fsanitize=address" BUILD_BPF_SKEL=1
> NO_LIBTRACEEVENT=1
> 
> Thanks,
> Ian
> 
> >                 return -1;
> >         }
> >
> > -       if (!profile_event_match(kwork, work, sample))
> > +       if (!profile_event_match(kwork, work, sample)) {
> > +               atom_free(atom);
> >                 return 0;
> > +       }
> >
> >         if (dst_type < KWORK_TRACE_MAX) {
> >                 dst_atom = list_last_entry_or_null(&work->atom_list[dst_type],
> > --
> > 2.30.GIT
> >

-- 

- Arnaldo

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2023-09-06 16:43 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-08-12  8:49 [RFC v1 00/16] perf kwork: Implement perf kwork top Yang Jihong
2023-08-12  8:49 ` [RFC v1 01/16] perf kwork: Fix incorrect and missing free atom in work_push_atom() Yang Jihong
2023-09-04  4:05   ` Ian Rogers
2023-09-04 11:42     ` Yang Jihong
2023-09-06 16:43     ` Arnaldo Carvalho de Melo
2023-08-12  8:49 ` [RFC v1 02/16] perf kwork: Add the supported subcommands to the document Yang Jihong
2023-08-12  8:49 ` [RFC v1 03/16] perf kwork: Set ordered_events for perf_tool Yang Jihong
2023-08-12  8:49 ` [RFC v1 04/16] perf kwork: Add `kwork` and `src_type` to work_init() for struct kwork_class Yang Jihong
2023-08-12  8:49 ` [RFC v1 05/16] perf kwork: Overwrite original atom in the list when a new atom is pushed Yang Jihong
2023-09-04  4:13   ` Ian Rogers
2023-09-04 11:46     ` Yang Jihong
2023-08-12  8:49 ` [RFC v1 06/16] perf kwork: Set default events list if not specified in setup_event_list() Yang Jihong
2023-08-12  8:49 ` [RFC v1 07/16] perf kwork: Add sched record support Yang Jihong
2023-08-12  8:49 ` [RFC v1 08/16] perf kwork: Add `root` parameter to work_sort() Yang Jihong
2023-08-12  8:49 ` [RFC v1 09/16] perf kwork: Implement perf kwork top Yang Jihong
2023-08-12  8:49 ` [RFC v1 10/16] perf evsel: Add evsel__intval_common() helper Yang Jihong
2023-08-12  8:49 ` [RFC v1 11/16] perf kwork top: Add statistics on hardirq event support Yang Jihong
2023-08-12  8:49 ` [RFC v1 12/16] perf kwork top: Add statistics on softirq " Yang Jihong
2023-08-12  8:49 ` [RFC v1 13/16] perf kwork top: Add -C/--cpu -i/--input -n/--name -s/--sort --time options Yang Jihong
2023-08-12  8:49 ` [RFC v1 14/16] perf kwork top: Implements BPF-based cpu usage statistics Yang Jihong
2023-08-12  8:49 ` [RFC v1 15/16] perf kwork top: Add BPF-based statistics on hardirq event support Yang Jihong
2023-08-12  8:49 ` [RFC v1 16/16] perf kwork top: Add BPF-based statistics on softirq " Yang Jihong
2023-09-04  5:13 ` [RFC v1 00/16] perf kwork: Implement perf kwork top Ian Rogers
2023-09-04 11:59   ` Yang Jihong
2023-09-04 14:54     ` Ian Rogers
2023-09-05  1:01       ` Yang Jihong
