* [PATCH v3 0/9] perf: enable compression of record mode trace to save storage space
@ 2019-02-26  5:31 Alexey Budankov
  2019-02-26  5:55 ` [PATCH v3 2/9] perf record: implement -f,--mmap-flush=<threshold> option Alexey Budankov
                   ` (10 more replies)
  0 siblings, 11 replies; 14+ messages in thread
From: Alexey Budankov @ 2019-02-26  5:31 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Ingo Molnar, Peter Zijlstra
  Cc: Jiri Olsa, Namhyung Kim, Alexander Shishkin, Andi Kleen, linux-kernel


This patch set implements runtime trace compression (-z option) in 
record mode and automatic trace decompression in report and inject modes. 
The streaming Zstandard (Zstd) API [1] is used to compress and 
decompress data that comes from the kernel mmaped data buffers.

The implemented -z,--compression_level=n option reduces trace file size 
by ~3-5x on average across a variety of tested workloads, which saves 
storage space on larger server systems where trace file size can easily 
reach several tens or even hundreds of GiBs, especially when profiling 
with dwarf-based stacks and tracing context switches.
The implemented -f,--mmap-flush option can be used to avoid compressing 
every single byte of data and to increase the compression ratio while 
lowering the tool's runtime overhead. The default option value is 1, which 
matches the behavior of the current perf record implementation. The option 
is independent of the -z setting and doesn't vary with the compression level:

  $ tools/perf/perf record -z 1 -e cycles -- matrix.gcc
  $ tools/perf/perf record --aio=1 -z 1 -e cycles -- matrix.gcc
  $ tools/perf/perf record -z 1 -f 1024 -e cycles -- matrix.gcc
  $ tools/perf/perf record --aio=1 -z 1 -f 1024 -e cycles -- matrix.gcc

Runtime compression overhead has been measured for serial and AIO 
trace writing modes when profiling matrix multiplication workload 
with the following results:

    -------------------------------------------------------------
    | SERIAL                      | AIO-1                       |
-----------------------------------------------------------------
|-z | OVH(x) | ratio(x) size(MiB) | OVH(x) | ratio(x) size(MiB) |
|----------------------------------------------------------------
| 0 | 1.00   | 1.000    179.424   | 1.00   | 1.000    187.527   |
| 1 | 1.04   | 8.427    181.148   | 1.01   | 8.474    188.562   |
| 2 | 1.07   | 8.055    186.953   | 1.03   | 7.912    191.773   |
| 3 | 1.04   | 8.283    181.908   | 1.03   | 8.220    191.078   |
| 5 | 1.09   | 8.101    187.705   | 1.05   | 7.780    190.065   |
| 8 | 1.05   | 9.217    179.191   | 1.12   | 6.111    193.024   |
-----------------------------------------------------------------

OVH = (Execution time with -z N) / (Execution time with -z 0)

ratio - compression ratio
size  - amount of data that was compressed, in MiB

size ~= trace file size x ratio

See complete description of measurement conditions and details below.

The introduced compression functionality can be disabled or configured at 
build time from the command line using the NO_LIBZSTD and LIBZSTD_DIR defines:

  $ make -C tools/perf NO_LIBZSTD=1 clean all
  $ make -C tools/perf LIBZSTD_DIR=/path/to/zstd/sources/ clean all

If your system has a version of the zstd package preinstalled, the 
build system finds it and uses it during the build. The status of feature 
auto detection is reported just before compilation starts, as usual.
If you prefer to build against a version of zstd that is not 
preinstalled, you can point the build at that version using the 
LIBZSTD_DIR define.

See 'tools/perf/perf test' run results below.

---
Alexey Budankov (9):
  feature: implement libzstd check, LIBZSTD_DIR and NO_LIBZSTD defines
  perf record: implement -f,--mmap-flush=<threshold> option
  perf session: define bytes_transferred and bytes_compressed metrics
  perf record: implement COMPRESSED event record and its attributes
  perf mmap: implement dedicated memory buffer for data compression
  perf util: introduce Zstd based streaming compression API
  perf record: implement -z,--compression_level=n option and compression
  perf report: implement record trace decompression
  perf inject: enable COMPRESSED records decompression

 tools/build/Makefile.feature             |   6 +-
 tools/build/feature/Makefile             |   6 +-
 tools/build/feature/test-all.c           |   5 +
 tools/build/feature/test-libzstd.c       |  12 ++
 tools/perf/Documentation/perf-record.txt |  10 ++
 tools/perf/Makefile.config               |  20 +++
 tools/perf/Makefile.perf                 |   3 +
 tools/perf/builtin-inject.c              |   4 +
 tools/perf/builtin-record.c              | 161 ++++++++++++++++++++---
 tools/perf/builtin-report.c              |   5 +-
 tools/perf/perf.h                        |   2 +
 tools/perf/util/Build                    |   2 +
 tools/perf/util/compress.h               |  22 ++++
 tools/perf/util/env.h                    |  11 ++
 tools/perf/util/event.c                  |   1 +
 tools/perf/util/event.h                  |   7 +
 tools/perf/util/evlist.c                 |   8 +-
 tools/perf/util/evlist.h                 |   3 +-
 tools/perf/util/header.c                 |  55 +++++++-
 tools/perf/util/header.h                 |   1 +
 tools/perf/util/mmap.c                   |  60 +++++++--
 tools/perf/util/mmap.h                   |  18 ++-
 tools/perf/util/session.c                | 124 ++++++++++++++++-
 tools/perf/util/session.h                |  14 ++
 tools/perf/util/tool.h                   |   2 +
 tools/perf/util/zstd.c                   | 143 ++++++++++++++++++++
 26 files changed, 659 insertions(+), 46 deletions(-)
 create mode 100644 tools/build/feature/test-libzstd.c
 create mode 100644 tools/perf/util/zstd.c

---
Changes in v3:
- moved -f,--mmap-flush option implementation into a separate patch
- moved definition and printing of bytes_transferred and bytes_compressed into a separate patch
- moved COMPRESSED feature into a separate patch
- added versioning and stored COMPRESSED feature attributes as u32
- implemented dedicated memory buffer for compression in case of serial streaming
- moved low level Zstd based compression functions into util/{compress.h,zstd.c}
- made the compress function a parameter of the __push(), __aio_push() functions
- enabled perf inject to decompress COMPRESSED records
- measured compression overhead for serial and AIO streaming using a
  basic matrix multiplication workload on a 4 core (8 logical CPU) Skylake

Changes in v2:
- moved compression/decompression code to session layer
- enabled allocation of aio data buffers for compression
- enabled trace compression for serial trace streaming

---
[1] https://github.com/facebook/zstd

---
OVERHEAD MEASUREMENTS:

uname -a
Linux localhost 4.20.7-200.fc29.x86_64 #1 SMP Wed Feb 6 19:16:42 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

cat /proc/cpuinfo
processor       : 7
vendor_id       : GenuineIntel
cpu family      : 6
model           : 94
model name      : Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
stepping        : 3
microcode       : 0xc6
cpu MHz         : 4021.884
cache size      : 8192 KB
physical id     : 0
siblings        : 8
core id         : 3
cpu cores       : 4
apicid          : 7
initial apicid  : 7
fpu             : yes
fpu_exception   : yes
cpuid level     : 22
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx rdseed adx smap clflushopt intel_pt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp flush_l1d
bugs            : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf
bogomips        : 8016.00
clflush size    : 64
cache_alignment : 64
address sizes   : 39 bits physical, 48 bits virtual
power management:

-----------------------------------------------------------------
#!/bin/bash -xv

echo 0 > /proc/sys/kernel/perf_event_paranoid
+ echo 0
cat /proc/sys/kernel/perf_event_paranoid
+ cat /proc/sys/kernel/perf_event_paranoid
0

echo performance | tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
+ echo performance
+ tee /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor /sys/devices/system/cpu/cpu2/cpufreq/scaling_governor /sys/devices/system/cpu/cpu3/cpufreq/scaling_governor /sys/devices/system/cpu/cpu4/cpufreq/scaling_governor /sys/devices/system/cpu/cpu5/cpufreq/scaling_governor /sys/devices/system/cpu/cpu6/cpufreq/scaling_governor /sys/devices/system/cpu/cpu7/cpufreq/scaling_governor
performance

for i in 0 1 2 3 5 8
do
    /usr/bin/time tools/perf/perf record -z $i -F 25000 -N -B -T -R -e cycles -- ../../matrix/linux/matrix.gcc
done
+ for i in 0 1 2 3 5 8
+ /usr/bin/time tools/perf/perf record -z 0 -F 25000 -N -B -T -R -e cycles -- ../../matrix/linux/matrix.gcc
Addr of buf1 = 0x7fe36de5c010
Offs of buf1 = 0x7fe36de5c180
Addr of buf2 = 0x7fe36be5b010
Offs of buf2 = 0x7fe36be5b1c0
Addr of buf3 = 0x7fe369e5a010
Offs of buf3 = 0x7fe369e5a100
Addr of buf4 = 0x7fe367e59010
Offs of buf4 = 0x7fe367e59140
Threads #: 8 Pthreads
Matrix size: 2048
Using multiply kernel: multiply1
Execution time = 16.949 seconds
[ perf record: Woken up 309 times to write data ]
[ perf record: Captured and wrote 179.424 MB perf.data ]
133.67user 0.35system 0:17.08elapsed 784%CPU (0avgtext+0avgdata 100580maxresident)k
0inputs+367480outputs (0major+34737minor)pagefaults 0swaps
+ for i in 0 1 2 3 5 8
+ /usr/bin/time tools/perf/perf record -z 1 -F 25000 -N -B -T -R -e cycles -- ../../matrix/linux/matrix.gcc
Addr of buf1 = 0x7fcaec334010
Offs of buf1 = 0x7fcaec334180
Addr of buf2 = 0x7fcaea333010
Offs of buf2 = 0x7fcaea3331c0
Addr of buf3 = 0x7fcae8332010
Offs of buf3 = 0x7fcae8332100
Addr of buf4 = 0x7fcae6331010
Offs of buf4 = 0x7fcae6331140
Threads #: 8 Pthreads
Matrix size: 2048
Using multiply kernel: multiply1
Execution time = 17.608 seconds
[ perf record: Woken up 595 times to write data ]
[ perf record: Compressed 181.148 MB to 21.497 MB, ratio is 8.427 ]
[ perf record: Captured and wrote 21.527 MB perf.data ]
135.69user 0.24system 0:17.73elapsed 766%CPU (0avgtext+0avgdata 100500maxresident)k
0inputs+44112outputs (0major+35033minor)pagefaults 0swaps
+ for i in 0 1 2 3 5 8
+ /usr/bin/time tools/perf/perf record -z 2 -F 25000 -N -B -T -R -e cycles -- ../../matrix/linux/matrix.gcc
Addr of buf1 = 0x7f1336f8d010
Offs of buf1 = 0x7f1336f8d180
Addr of buf2 = 0x7f1334f8c010
Offs of buf2 = 0x7f1334f8c1c0
Addr of buf3 = 0x7f1332f8b010
Offs of buf3 = 0x7f1332f8b100
Addr of buf4 = 0x7f1330f8a010
Offs of buf4 = 0x7f1330f8a140
Threads #: 8 Pthreads
Matrix size: 2048
Using multiply kernel: multiply1
Execution time = 18.175 seconds
[ perf record: Woken up 521 times to write data ]
[ perf record: Compressed 186.953 MB to 23.210 MB, ratio is 8.055 ]
[ perf record: Captured and wrote 23.239 MB perf.data ]
140.21user 0.25system 0:18.32elapsed 766%CPU (0avgtext+0avgdata 100560maxresident)k
0inputs+47608outputs (0major+35263minor)pagefaults 0swaps
+ for i in 0 1 2 3 5 8
+ /usr/bin/time tools/perf/perf record -z 3 -F 25000 -N -B -T -R -e cycles -- ../../matrix/linux/matrix.gcc
Addr of buf1 = 0x7f97060e3010
Offs of buf1 = 0x7f97060e3180
Addr of buf2 = 0x7f97040e2010
Offs of buf2 = 0x7f97040e21c0
Addr of buf3 = 0x7f97020e1010
Offs of buf3 = 0x7f97020e1100
Addr of buf4 = 0x7f97000e0010
Offs of buf4 = 0x7f97000e0140
Threads #: 8 Pthreads
Matrix size: 2048
Using multiply kernel: multiply1
Execution time = 17.688 seconds
[ perf record: Woken up 485 times to write data ]
[ perf record: Compressed 181.908 MB to 21.962 MB, ratio is 8.283 ]
[ perf record: Captured and wrote 21.991 MB perf.data ]
136.87user 0.23system 0:17.81elapsed 769%CPU (0avgtext+0avgdata 100616maxresident)k
0inputs+45064outputs (0major+35773minor)pagefaults 0swaps
+ for i in 0 1 2 3 5 8
+ /usr/bin/time tools/perf/perf record -z 5 -F 25000 -N -B -T -R -e cycles -- ../../matrix/linux/matrix.gcc
Addr of buf1 = 0x7f477b444010
Offs of buf1 = 0x7f477b444180
Addr of buf2 = 0x7f4779443010
Offs of buf2 = 0x7f47794431c0
Addr of buf3 = 0x7f4777442010
Offs of buf3 = 0x7f4777442100
Addr of buf4 = 0x7f4775441010
Offs of buf4 = 0x7f4775441140
Threads #: 8 Pthreads
Matrix size: 2048
Using multiply kernel: multiply1
Execution time = 18.406 seconds
[ perf record: Woken up 416 times to write data ]
[ perf record: Compressed 187.705 MB to 23.170 MB, ratio is 8.101 ]
[ perf record: Captured and wrote 23.200 MB perf.data ]
142.72user 0.26system 0:18.53elapsed 771%CPU (0avgtext+0avgdata 100520maxresident)k
0inputs+47528outputs (0major+36928minor)pagefaults 0swaps
+ for i in 0 1 2 3 5 8
+ /usr/bin/time tools/perf/perf record -z 8 -F 25000 -N -B -T -R -e cycles -- ../../matrix/linux/matrix.gcc
Addr of buf1 = 0x7fb5bf032010
Offs of buf1 = 0x7fb5bf032180
Addr of buf2 = 0x7fb5bd031010
Offs of buf2 = 0x7fb5bd0311c0
Addr of buf3 = 0x7fb5bb030010
Offs of buf3 = 0x7fb5bb030100
Addr of buf4 = 0x7fb5b902f010
Offs of buf4 = 0x7fb5b902f140
Threads #: 8 Pthreads
Matrix size: 2048
Using multiply kernel: multiply1
Execution time = 17.751 seconds
[ perf record: Woken up 391 times to write data ]
[ perf record: Compressed 179.191 MB to 19.441 MB, ratio is 9.217 ]
[ perf record: Captured and wrote 19.502 MB perf.data ]
138.90user 0.29system 0:17.88elapsed 778%CPU (0avgtext+0avgdata 100612maxresident)k
0inputs+39968outputs (0major+37436minor)pagefaults 0swaps

for i in 0 1 2 3 5 8
do
    /usr/bin/time tools/perf/perf record --aio=1 -z $i -F 25000 -N -B -T -R -e cycles -- ../../matrix/linux/matrix.gcc
done
+ for i in 0 1 2 3 5 8
+ /usr/bin/time tools/perf/perf record --aio=1 -z 0 -F 25000 -N -B -T -R -e cycles -- ../../matrix/linux/matrix.gcc
Addr of buf1 = 0x7feee4519010
Offs of buf1 = 0x7feee4519180
Addr of buf2 = 0x7feee2518010
Offs of buf2 = 0x7feee25181c0
Addr of buf3 = 0x7feee0517010
Offs of buf3 = 0x7feee0517100
Addr of buf4 = 0x7feede516010
Offs of buf4 = 0x7feede516140
Threads #: 8 Pthreads
Matrix size: 2048
Using multiply kernel: multiply1
Execution time = 17.912 seconds
[ perf record: Woken up 390 times to write data ]
[ perf record: Captured and wrote 187.527 MB perf.data ]
139.70user 0.39system 0:18.04elapsed 776%CPU (0avgtext+0avgdata 100624maxresident)k
0inputs+384072outputs (0major+35257minor)pagefaults 0swaps
+ for i in 0 1 2 3 5 8
+ /usr/bin/time tools/perf/perf record --aio=1 -z 1 -F 25000 -N -B -T -R -e cycles -- ../../matrix/linux/matrix.gcc
Addr of buf1 = 0x7f72b93ac010
Offs of buf1 = 0x7f72b93ac180
Addr of buf2 = 0x7f72b73ab010
Offs of buf2 = 0x7f72b73ab1c0
Addr of buf3 = 0x7f72b53aa010
Offs of buf3 = 0x7f72b53aa100
Addr of buf4 = 0x7f72b33a9010
Offs of buf4 = 0x7f72b33a9140
Threads #: 8 Pthreads
Matrix size: 2048
Using multiply kernel: multiply1
Execution time = 18.198 seconds
[ perf record: Woken up 416 times to write data ]
[ perf record: Compressed 188.562 MB to 22.252 MB, ratio is 8.474 ]
[ perf record: Captured and wrote 22.284 MB perf.data ]
141.12user 0.32system 0:18.32elapsed 771%CPU (0avgtext+0avgdata 100576maxresident)k
0inputs+45664outputs (0major+35040minor)pagefaults 0swaps
+ for i in 0 1 2 3 5 8
+ /usr/bin/time tools/perf/perf record --aio=1 -z 2 -F 25000 -N -B -T -R -e cycles -- ../../matrix/linux/matrix.gcc
Addr of buf1 = 0x7ffb9caf3010
Offs of buf1 = 0x7ffb9caf3180
Addr of buf2 = 0x7ffb9aaf2010
Offs of buf2 = 0x7ffb9aaf21c0
Addr of buf3 = 0x7ffb98af1010
Offs of buf3 = 0x7ffb98af1100
Addr of buf4 = 0x7ffb96af0010
Offs of buf4 = 0x7ffb96af0140
Threads #: 8 Pthreads
Matrix size: 2048
Using multiply kernel: multiply1
Execution time = 18.360 seconds
[ perf record: Woken up 442 times to write data ]
[ perf record: Compressed 191.773 MB to 24.238 MB, ratio is 7.912 ]
[ perf record: Captured and wrote 24.290 MB perf.data ]
143.76user 0.49system 0:18.50elapsed 779%CPU (0avgtext+0avgdata 100596maxresident)k
0inputs+49760outputs (0major+35276minor)pagefaults 0swaps
+ for i in 0 1 2 3 5 8
+ /usr/bin/time tools/perf/perf record --aio=1 -z 3 -F 25000 -N -B -T -R -e cycles -- ../../matrix/linux/matrix.gcc
Addr of buf1 = 0x7f13f2df2010
Offs of buf1 = 0x7f13f2df2180
Addr of buf2 = 0x7f13f0df1010
Offs of buf2 = 0x7f13f0df11c0
Addr of buf3 = 0x7f13eedf0010
Offs of buf3 = 0x7f13eedf0100
Addr of buf4 = 0x7f13ecdef010
Offs of buf4 = 0x7f13ecdef140
Threads #: 8 Pthreads
Matrix size: 2048
Using multiply kernel: multiply1
Execution time = 18.383 seconds
[ perf record: Woken up 499 times to write data ]
[ perf record: Compressed 191.078 MB to 23.246 MB, ratio is 8.220 ]
[ perf record: Captured and wrote 23.282 MB perf.data ]
143.72user 0.34system 0:18.51elapsed 778%CPU (0avgtext+0avgdata 100616maxresident)k
0inputs+47704outputs (0major+35783minor)pagefaults 0swaps
+ for i in 0 1 2 3 5 8
+ /usr/bin/time tools/perf/perf record --aio=1 -z 5 -F 25000 -N -B -T -R -e cycles -- ../../matrix/linux/matrix.gcc
Addr of buf1 = 0x7fca0d091010
Offs of buf1 = 0x7fca0d091180
Addr of buf2 = 0x7fca0b090010
Offs of buf2 = 0x7fca0b0901c0
Addr of buf3 = 0x7fca0908f010
Offs of buf3 = 0x7fca0908f100
Addr of buf4 = 0x7fca0708e010
Offs of buf4 = 0x7fca0708e140
Threads #: 8 Pthreads
Matrix size: 2048
Using multiply kernel: multiply1
Execution time = 18.758 seconds
[ perf record: Woken up 535 times to write data ]
[ perf record: Compressed 190.065 MB to 24.430 MB, ratio is 7.780 ]
[ perf record: Captured and wrote 24.519 MB perf.data ]
144.62user 0.66system 0:18.88elapsed 769%CPU (0avgtext+0avgdata 100528maxresident)k
0inputs+50232outputs (0major+36942minor)pagefaults 0swaps
+ for i in 0 1 2 3 5 8
+ /usr/bin/time tools/perf/perf record --aio=1 -z 8 -F 25000 -N -B -T -R -e cycles -- ../../matrix/linux/matrix.gcc
Addr of buf1 = 0x7f7e1f449010
Offs of buf1 = 0x7f7e1f449180
Addr of buf2 = 0x7f7e1d448010
Offs of buf2 = 0x7f7e1d4481c0
Addr of buf3 = 0x7f7e1b447010
Offs of buf3 = 0x7f7e1b447100
Addr of buf4 = 0x7f7e19446010
Offs of buf4 = 0x7f7e19446140
Threads #: 8 Pthreads
Matrix size: 2048
Using multiply kernel: multiply1
Execution time = 20.103 seconds
[ perf record: Woken up 260 times to write data ]
[ perf record: Compressed 193.024 MB to 31.588 MB, ratio is 6.111 ]
[ perf record: Captured and wrote 32.139 MB perf.data ]
151.73user 4.21system 0:20.23elapsed 770%CPU (0avgtext+0avgdata 100616maxresident)k
0inputs+65848outputs (0major+37431minor)pagefaults 0swaps

---
TESTING:

tools/perf/perf test
 1: vmlinux symtab matches kallsyms                       : Skip
 2: Detect openat syscall event                           : Ok
 3: Detect openat syscall event on all cpus               : Ok
 4: Read samples using the mmap interface                 : Ok
 5: Test data source output                               : Ok
 6: Parse event definition strings                        : Ok
 7: Simple expression parser                              : Ok
 8: PERF_RECORD_* events & perf_sample fields             : Ok
 9: Parse perf pmu format                                 : Ok
10: DSO data read                                         : Ok
11: DSO data cache                                        : Ok
12: DSO data reopen                                       : Ok
13: Roundtrip evsel->name                                 : Ok
14: Parse sched tracepoints fields                        : Ok
15: syscalls:sys_enter_openat event fields                : Ok
16: Setup struct perf_event_attr                          : Skip
17: Match and link multiple hists                         : Ok
18: 'import perf' in python                               : FAILED!
19: Breakpoint overflow signal handler                    : Ok
20: Breakpoint overflow sampling                          : Ok
21: Breakpoint accounting                                 : Ok
22: Watchpoint                                            :
22.1: Read Only Watchpoint                                : Skip
22.2: Write Only Watchpoint                               : Ok
22.3: Read / Write Watchpoint                             : Ok
22.4: Modify Watchpoint                                   : Ok
23: Number of exit events of a simple workload            : Ok
24: Software clock events period values                   : Ok
25: Object code reading                                   : Ok
26: Sample parsing                                        : Ok
27: Use a dummy software event to keep tracking           : Ok
28: Parse with no sample_id_all bit set                   : Ok
29: Filter hist entries                                   : Ok
30: Lookup mmap thread                                    : Ok
31: Share thread mg                                       : Ok
32: Sort output of hist entries                           : Ok
33: Cumulate child hist entries                           : Ok
34: Track with sched_switch                               : Ok
35: Filter fds with revents mask in a fdarray             : Ok
36: Add fd to a fdarray, making it autogrow               : Ok
37: kmod_path__parse                                      : Ok
38: Thread map                                            : Ok
39: LLVM search and compile                               :
39.1: Basic BPF llvm compile                              : Skip
39.2: kbuild searching                                    : Skip
39.3: Compile source for BPF prologue generation          : Skip
39.4: Compile source for BPF relocation                   : Skip
40: Session topology                                      : Ok
41: BPF filter                                            :
41.1: Basic BPF filtering                                 : Skip
41.2: BPF pinning                                         : Skip
41.3: BPF prologue generation                             : Skip
41.4: BPF relocation checker                              : Skip
42: Synthesize thread map                                 : Ok
43: Remove thread map                                     : Ok
44: Synthesize cpu map                                    : Ok
45: Synthesize stat config                                : Ok
46: Synthesize stat                                       : Ok
47: Synthesize stat round                                 : Ok
48: Synthesize attr update                                : Ok
49: Event times                                           : Ok
50: Read backward ring buffer                             : Ok
51: Print cpu map                                         : Ok
52: Probe SDT events                                      : Ok
53: is_printable_array                                    : Ok
54: Print bitmap                                          : Ok
55: perf hooks                                            : Ok
56: builtin clang support                                 : Skip (not compiled in)
57: unit_number__scnprintf                                : Ok
58: mem2node                                              : Ok
59: x86 rdpmc                                             : Ok
60: Convert perf time to TSC                              : Ok
61: DWARF unwind                                          : Ok
62: x86 instruction decoder - new instructions            : Ok
63: x86 bp modify                                         : Ok
64: Check open filename arg using perf trace + vfs_getname: Skip
65: Add vfs_getname probe to get syscall args filenames   : Skip
66: probe libc's inet_pton & backtrace it with ping       : Ok
67: Use vfs_getname probe to get syscall args filenames   : Skip


* [PATCH v3 2/9] perf record: implement -f,--mmap-flush=<threshold> option
  2019-02-26  5:31 [PATCH v3 0/9] perf: enable compression of record mode trace to save storage space Alexey Budankov
@ 2019-02-26  5:55 ` Alexey Budankov
  2019-02-26  5:57 ` [PATCH v3 3/9] perf session: define bytes_transferred and bytes_compressed metrics Alexey Budankov
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 14+ messages in thread
From: Alexey Budankov @ 2019-02-26  5:55 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Ingo Molnar, Peter Zijlstra
  Cc: Jiri Olsa, Namhyung Kim, Alexander Shishkin, Andi Kleen, linux-kernel


Implement the -f,--mmap-flush option, which specifies a threshold used to
postpone and/or trigger the move of data from mmaped kernel buffers to storage.

The option can be used to avoid moving every single byte of data into
the stored trace as soon as it arrives. The default option value is 1.

  $ tools/perf/perf record -f 1024 -e cycles -- matrix.gcc
  $ tools/perf/perf record --aio -f 1024 -e cycles -- matrix.gcc

The implemented sync param is the means to force a data move regardless of 
the threshold value. Even when a user provides a flush value on the command 
line, the tool still needs to be able to drain the memory buffers, at least 
at the end of the collection.

Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
---
 tools/perf/Documentation/perf-record.txt |  5 +++
 tools/perf/builtin-record.c              | 53 +++++++++++++++++++++---
 tools/perf/perf.h                        |  1 +
 tools/perf/util/evlist.c                 |  6 +--
 tools/perf/util/evlist.h                 |  3 +-
 tools/perf/util/mmap.c                   |  4 +-
 tools/perf/util/mmap.h                   |  3 +-
 7 files changed, 63 insertions(+), 12 deletions(-)

diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index 8f0c2be34848..8276d6517812 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -459,6 +459,11 @@ Set affinity mask of trace reading thread according to the policy defined by 'mo
   node - thread affinity mask is set to NUMA node cpu mask of the processed mmap buffer
   cpu  - thread affinity mask is set to cpu of the processed mmap buffer
 
+-f::
+--mmap-flush=n::
+Minimal number of bytes accumulated in mmaped kernel buffer that is flushed to a storage (default: 1).
+Maximal allowed value is a quarter of mmaped kernel buffer size.
+
 --all-kernel::
 Configure all used events to run in kernel space.
 
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 6c3719ac901d..6235cc6b59e9 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -334,6 +334,29 @@ static int record__aio_enabled(struct record *rec)
 	return rec->opts.nr_cblocks > 0;
 }
 
+#define MMAP_FLUSH_DEFAULT 1
+static int record__mmap_flush_parse(const struct option *opt,
+				    const char *str,
+				    int unset)
+{
+	int mmap_len;
+	struct record_opts *opts = (struct record_opts *)opt->value;
+
+	if (unset)
+		return 0;
+
+	if (str)
+		opts->mmap_flush = strtol(str, NULL, 0);
+	if (!opts->mmap_flush)
+		opts->mmap_flush = MMAP_FLUSH_DEFAULT;
+
+	mmap_len = perf_evlist__mmap_size(opts->mmap_pages);
+	if (opts->mmap_flush > mmap_len / 4)
+		opts->mmap_flush = mmap_len / 4;
+
+	return 0;
+}
+
 static int process_synthesized_event(struct perf_tool *tool,
 				     union perf_event *event,
 				     struct perf_sample *sample __maybe_unused,
@@ -543,7 +566,8 @@ static int record__mmap_evlist(struct record *rec,
 	if (perf_evlist__mmap_ex(evlist, opts->mmap_pages,
 				 opts->auxtrace_mmap_pages,
 				 opts->auxtrace_snapshot_mode,
-				 opts->nr_cblocks, opts->affinity) < 0) {
+				 opts->nr_cblocks, opts->affinity,
+				 opts->mmap_flush) < 0) {
 		if (errno == EPERM) {
 			pr_err("Permission error mapping pages.\n"
 			       "Consider increasing "
@@ -734,7 +758,7 @@ static void record__adjust_affinity(struct record *rec, struct perf_mmap *map)
 }
 
 static int record__mmap_read_evlist(struct record *rec, struct perf_evlist *evlist,
-				    bool overwrite)
+				    bool overwrite, bool sync)
 {
 	u64 bytes_written = rec->bytes_written;
 	int i;
@@ -757,12 +781,19 @@ static int record__mmap_read_evlist(struct record *rec, struct perf_evlist *evli
 		off = record__aio_get_pos(trace_fd);
 
 	for (i = 0; i < evlist->nr_mmaps; i++) {
+		u64 flush = MMAP_FLUSH_DEFAULT;
 		struct perf_mmap *map = &maps[i];
 
 		if (map->base) {
 			record__adjust_affinity(rec, map);
+			if (sync) {
+				flush = map->flush;
+				map->flush = MMAP_FLUSH_DEFAULT;
+			}
 			if (!record__aio_enabled(rec)) {
 				if (perf_mmap__push(map, rec, record__pushfn) != 0) {
+					if (sync)
+						map->flush = flush;
 					rc = -1;
 					goto out;
 				}
@@ -775,10 +806,14 @@ static int record__mmap_read_evlist(struct record *rec, struct perf_evlist *evli
 				idx = record__aio_sync(map, false);
 				if (perf_mmap__aio_push(map, rec, idx, record__aio_pushfn, &off) != 0) {
 					record__aio_set_pos(trace_fd, off);
+					if (sync)
+						map->flush = flush;
 					rc = -1;
 					goto out;
 				}
 			}
+			if (sync)
+				map->flush = flush;
 		}
 
 		if (map->auxtrace_mmap.base && !rec->opts.auxtrace_snapshot_mode &&
@@ -804,15 +839,15 @@ static int record__mmap_read_evlist(struct record *rec, struct perf_evlist *evli
 	return rc;
 }
 
-static int record__mmap_read_all(struct record *rec)
+static int record__mmap_read_all(struct record *rec, bool sync)
 {
 	int err;
 
-	err = record__mmap_read_evlist(rec, rec->evlist, false);
+	err = record__mmap_read_evlist(rec, rec->evlist, false, sync);
 	if (err)
 		return err;
 
-	return record__mmap_read_evlist(rec, rec->evlist, true);
+	return record__mmap_read_evlist(rec, rec->evlist, true, sync);
 }
 
 static void record__init_features(struct record *rec)
@@ -1311,7 +1346,7 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 		if (trigger_is_hit(&switch_output_trigger) || done || draining)
 			perf_evlist__toggle_bkw_mmap(rec->evlist, BKW_MMAP_DATA_PENDING);
 
-		if (record__mmap_read_all(rec) < 0) {
+		if (record__mmap_read_all(rec, false) < 0) {
 			trigger_error(&auxtrace_snapshot_trigger);
 			trigger_error(&switch_output_trigger);
 			err = -1;
@@ -1412,6 +1447,7 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 		record__synthesize_workload(rec, true);
 
 out_child:
+	record__mmap_read_all(rec, true);
 	record__aio_mmap_read_sync(rec);
 
 	if (forks) {
@@ -1814,6 +1850,7 @@ static struct record record = {
 			.uses_mmap   = true,
 			.default_per_cpu = true,
 		},
+		.mmap_flush          = MMAP_FLUSH_DEFAULT,
 	},
 	.tool = {
 		.sample		= process_sample_event,
@@ -1880,6 +1917,9 @@ static struct option __record_options[] = {
 	OPT_CALLBACK('m', "mmap-pages", &record.opts, "pages[,pages]",
 		     "number of mmap data pages and AUX area tracing mmap pages",
 		     record__parse_mmap_pages),
+	OPT_CALLBACK('f', "mmap-flush", &record.opts, "bytes",
+		     "Minimal number of bytes in mmap data pages that is written to a storage (default: 1)",
+		     record__mmap_flush_parse),
 	OPT_BOOLEAN(0, "group", &record.opts.group,
 		    "put the counters into a counter group"),
 	OPT_CALLBACK_NOOPT('g', NULL, &callchain_param,
@@ -2183,6 +2223,7 @@ int cmd_record(int argc, const char **argv)
 		pr_info("nr_cblocks: %d\n", rec->opts.nr_cblocks);
 
 	pr_debug("affinity: %s\n", affinity_tags[rec->opts.affinity]);
+	pr_debug("mmap flush: %d\n", rec->opts.mmap_flush);
 
 	err = __cmd_record(&record, argc, argv);
 out:
diff --git a/tools/perf/perf.h b/tools/perf/perf.h
index b120e547ddc7..7886cc9771cf 100644
--- a/tools/perf/perf.h
+++ b/tools/perf/perf.h
@@ -85,6 +85,7 @@ struct record_opts {
 	u64          clockid_res_ns;
 	int	     nr_cblocks;
 	int	     affinity;
+	int	     mmap_flush;
 };
 
 enum perf_affinity {
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 08cedb643ea6..937039faac59 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1022,7 +1022,7 @@ int perf_evlist__parse_mmap_pages(const struct option *opt, const char *str,
  */
 int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
 			 unsigned int auxtrace_pages,
-			 bool auxtrace_overwrite, int nr_cblocks, int affinity)
+			 bool auxtrace_overwrite, int nr_cblocks, int affinity, int flush)
 {
 	struct perf_evsel *evsel;
 	const struct cpu_map *cpus = evlist->cpus;
@@ -1032,7 +1032,7 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
 	 * Its value is decided by evsel's write_backward.
 	 * So &mp should not be passed through const pointer.
 	 */
-	struct mmap_params mp = { .nr_cblocks = nr_cblocks, .affinity = affinity };
+	struct mmap_params mp = { .nr_cblocks = nr_cblocks, .affinity = affinity, .flush = flush };
 
 	if (!evlist->mmap)
 		evlist->mmap = perf_evlist__alloc_mmap(evlist, false);
@@ -1064,7 +1064,7 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
 
 int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages)
 {
-	return perf_evlist__mmap_ex(evlist, pages, 0, false, 0, PERF_AFFINITY_SYS);
+	return perf_evlist__mmap_ex(evlist, pages, 0, false, 0, PERF_AFFINITY_SYS, 1);
 }
 
 int perf_evlist__create_maps(struct perf_evlist *evlist, struct target *target)
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 744906dd4887..edf18811e39f 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -165,7 +165,8 @@ unsigned long perf_event_mlock_kb_in_pages(void);
 
 int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
 			 unsigned int auxtrace_pages,
-			 bool auxtrace_overwrite, int nr_cblocks, int affinity);
+			 bool auxtrace_overwrite, int nr_cblocks,
+			 int affinity, int flush);
 int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages);
 void perf_evlist__munmap(struct perf_evlist *evlist);
 
diff --git a/tools/perf/util/mmap.c b/tools/perf/util/mmap.c
index cdc7740fc181..ef3d79b2c90b 100644
--- a/tools/perf/util/mmap.c
+++ b/tools/perf/util/mmap.c
@@ -440,6 +440,8 @@ int perf_mmap__mmap(struct perf_mmap *map, struct mmap_params *mp, int fd, int c
 
 	perf_mmap__setup_affinity_mask(map, mp);
 
+	map->flush = mp->flush;
+
 	if (auxtrace_mmap__mmap(&map->auxtrace_mmap,
 				&mp->auxtrace_mp, map->base, fd))
 		return -1;
@@ -492,7 +494,7 @@ static int __perf_mmap__read_init(struct perf_mmap *md)
 	md->start = md->overwrite ? head : old;
 	md->end = md->overwrite ? old : head;
 
-	if (md->start == md->end)
+	if ((md->end - md->start) < md->flush)
 		return -EAGAIN;
 
 	size = md->end - md->start;
diff --git a/tools/perf/util/mmap.h b/tools/perf/util/mmap.h
index e566c19b242b..b82f8c2d55c4 100644
--- a/tools/perf/util/mmap.h
+++ b/tools/perf/util/mmap.h
@@ -39,6 +39,7 @@ struct perf_mmap {
 	} aio;
 #endif
 	cpu_set_t	affinity_mask;
+	u64		flush;
 };
 
 /*
@@ -70,7 +71,7 @@ enum bkw_mmap_state {
 };
 
 struct mmap_params {
-	int			    prot, mask, nr_cblocks, affinity;
+	int			    prot, mask, nr_cblocks, affinity, flush;
 	struct auxtrace_mmap_params auxtrace_mp;
 };


* [PATCH v3 3/9] perf session: define bytes_transferred and bytes_compressed metrics
  2019-02-26  5:31 [PATCH v3 0/9] perf: enable compression of record mode trace to save storage space Alexey Budankov
  2019-02-26  5:55 ` [PATCH v3 2/9] perf record: implement -f,--mmap-flush=<threshold> option Alexey Budankov
@ 2019-02-26  5:57 ` Alexey Budankov
  2019-02-26  5:59 ` [PATCH v3 4/9] perf record: implement COMPRESSED event record and its attributes Alexey Budankov
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 14+ messages in thread
From: Alexey Budankov @ 2019-02-26  5:57 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Ingo Molnar, Peter Zijlstra
  Cc: Jiri Olsa, Namhyung Kim, Alexander Shishkin, Andi Kleen, linux-kernel


Define bytes_transferred and bytes_compressed metrics to calculate
comp_ratio=transferred/compressed at the end of data collection.

bytes_transferred accumulates the number of bytes that were read from
the mmaped kernel buffers for compression. bytes_compressed accumulates
the number of bytes that were produced by compression and are to be
written to storage.
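
As an illustration of the ratio calculation described above, here is a
minimal sketch. The helper names comp_ratio() and comp_ratio_rounded()
are hypothetical and not part of the patch; the rounding by adding 0.5
before truncation follows what the diff below does when it stores the
value in the u32 comp_ratio field.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: exact ratio of captured bytes to compressed bytes. */
static double comp_ratio(uint64_t bytes_transferred, uint64_t bytes_compressed)
{
	/* The patch only computes the ratio when both counters are non-zero. */
	assert(bytes_compressed != 0);
	return (double)bytes_transferred / (double)bytes_compressed;
}

/* Hypothetical helper: ratio rounded to the nearest integer (u32 field). */
static uint32_t comp_ratio_rounded(uint64_t bytes_transferred,
				   uint64_t bytes_compressed)
{
	return (uint32_t)(comp_ratio(bytes_transferred, bytes_compressed) + 0.5);
}
```

For example, 1000 bytes transferred and 125 bytes compressed give a
ratio of 8, both exact and rounded.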

Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
---
 tools/perf/builtin-record.c | 8 ++++++++
 tools/perf/util/env.h       | 1 +
 tools/perf/util/session.h   | 2 ++
 3 files changed, 11 insertions(+)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 6235cc6b59e9..299f484f3e30 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -1450,6 +1450,14 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 	record__mmap_read_all(rec, true);
 	record__aio_mmap_read_sync(rec);
 
+	if (!quiet && rec->session->bytes_transferred && rec->session->bytes_compressed) {
+		float ratio = (float)rec->session->bytes_transferred/(float)rec->session->bytes_compressed;
+
+		session->header.env.comp_ratio = ratio + 0.5;
+		fprintf(stderr,	"[ perf record: Compressed %.3f MB to %.3f MB, ratio is %.3f ]\n",
+			rec->session->bytes_transferred / 1024.0 / 1024.0, rec->session->bytes_compressed / 1024.0 / 1024.0, ratio);
+	}
+
 	if (forks) {
 		int exit_status;
 
diff --git a/tools/perf/util/env.h b/tools/perf/util/env.h
index d01b8355f4ca..fb39e9af128f 100644
--- a/tools/perf/util/env.h
+++ b/tools/perf/util/env.h
@@ -64,6 +64,7 @@ struct perf_env {
 	struct memory_node	*memory_nodes;
 	unsigned long long	 memory_bsize;
 	u64                     clockid_res_ns;
+	u32                     comp_ratio;
 };
 
 extern struct perf_env perf_env;
diff --git a/tools/perf/util/session.h b/tools/perf/util/session.h
index d96eccd7d27f..0e14884f28b2 100644
--- a/tools/perf/util/session.h
+++ b/tools/perf/util/session.h
@@ -35,6 +35,8 @@ struct perf_session {
 	struct ordered_events	ordered_events;
 	struct perf_data	*data;
 	struct perf_tool	*tool;
+	u64			bytes_transferred;
+	u64			bytes_compressed;
 };
 
 struct perf_tool;


* [PATCH v3 4/9] perf record: implement COMPRESSED event record and its attributes
  2019-02-26  5:31 [PATCH v3 0/9] perf: enable compression of record mode trace to save storage space Alexey Budankov
  2019-02-26  5:55 ` [PATCH v3 2/9] perf record: implement -f,--mmap-flush=<threshold> option Alexey Budankov
  2019-02-26  5:57 ` [PATCH v3 3/9] perf session: define bytes_transferred and bytes_compressed metrics Alexey Budankov
@ 2019-02-26  5:59 ` Alexey Budankov
  2019-02-26  6:03 ` [PATCH v3 5/9] perf mmap: implement dedicated memory buffer for data compression Alexey Budankov
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 14+ messages in thread
From: Alexey Budankov @ 2019-02-26  5:59 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Ingo Molnar, Peter Zijlstra
  Cc: Jiri Olsa, Namhyung Kim, Alexander Shishkin, Andi Kleen, linux-kernel


Implement the PERF_RECORD_COMPRESSED event, related data types, the
header feature, and functions to write, read and print the feature
attributes from the trace header section.

comp_mmap_len preserves the size of the mmaped kernel buffer that was
used during collection. At the loading stage comp_mmap_len is used as
the size of the decomp buffer for decompression of COMPRESSED event
content.
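
The layout of the header feature can be sketched as five consecutive
u32 values, in the order in which write_compressed() in the diff below
emits them: version, type, level, ratio, mmap length. The struct and
function names here are hypothetical stand-ins for illustration, and
the sketch assumes host byte order without any byte swapping.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the comp_* fields of struct perf_env. */
struct comp_feature {
	uint32_t ver, type, level, ratio, mmap_len;
};

#define COMP_FEATURE_SIZE (5 * sizeof(uint32_t))

/* Serialize the five fields back to back; returns bytes written. */
static size_t comp_feature_pack(const struct comp_feature *f,
				unsigned char *buf, size_t len)
{
	const uint32_t fields[5] = {
		f->ver, f->type, f->level, f->ratio, f->mmap_len
	};

	assert(len >= COMP_FEATURE_SIZE);
	memcpy(buf, fields, sizeof(fields));
	return sizeof(fields);
}

/* Read the fields in the same order; returns -1 on a short buffer. */
static int comp_feature_unpack(struct comp_feature *f,
			       const unsigned char *buf, size_t len)
{
	uint32_t fields[5];

	if (len < sizeof(fields))
		return -1;
	memcpy(fields, buf, sizeof(fields));
	f->ver      = fields[0];
	f->type     = fields[1];
	f->level    = fields[2];
	f->ratio    = fields[3];
	f->mmap_len = fields[4];
	return 0;
}
```

The reader has to consume the fields in exactly the order the writer
produced them, which is why process_compressed() mirrors
write_compressed() field by field.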

Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
---
 tools/perf/builtin-record.c |  9 ++++++
 tools/perf/perf.h           |  1 +
 tools/perf/util/env.h       | 10 +++++++
 tools/perf/util/event.c     |  1 +
 tools/perf/util/event.h     |  7 +++++
 tools/perf/util/header.c    | 55 ++++++++++++++++++++++++++++++++++++-
 tools/perf/util/header.h    |  1 +
 7 files changed, 83 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 299f484f3e30..61017fa0ee1c 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -357,6 +357,11 @@ static int record__mmap_flush_parse(const struct option *opt,
 	return 0;
 }
 
+static int record__comp_enabled(struct record *rec)
+{
+	return rec->opts.comp_level > 0;
+}
+
 static int process_synthesized_event(struct perf_tool *tool,
 				     union perf_event *event,
 				     struct perf_sample *sample __maybe_unused,
@@ -873,6 +878,9 @@ static void record__init_features(struct record *rec)
 	if (!(rec->opts.use_clockid && rec->opts.clockid_res_ns))
 		perf_header__clear_feat(&session->header, HEADER_CLOCKID);
 
+	if (!record__comp_enabled(rec))
+		perf_header__clear_feat(&session->header, HEADER_COMPRESSED);
+
 	perf_header__clear_feat(&session->header, HEADER_STAT);
 }
 
@@ -1211,6 +1219,7 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 		err = -1;
 		goto out_child;
 	}
+	session->header.env.comp_mmap_len = session->evlist->mmap_len;
 
 	err = bpf__apply_obj_config();
 	if (err) {
diff --git a/tools/perf/perf.h b/tools/perf/perf.h
index 7886cc9771cf..2c6caad45b10 100644
--- a/tools/perf/perf.h
+++ b/tools/perf/perf.h
@@ -86,6 +86,7 @@ struct record_opts {
 	int	     nr_cblocks;
 	int	     affinity;
 	int	     mmap_flush;
+	unsigned int comp_level;
 };
 
 enum perf_affinity {
diff --git a/tools/perf/util/env.h b/tools/perf/util/env.h
index fb39e9af128f..7990d63ab764 100644
--- a/tools/perf/util/env.h
+++ b/tools/perf/util/env.h
@@ -65,6 +65,16 @@ struct perf_env {
 	unsigned long long	 memory_bsize;
 	u64                     clockid_res_ns;
 	u32                     comp_ratio;
+	u32			comp_ver;
+	u32			comp_type;
+	u32			comp_level;
+	u32			comp_mmap_len;
+};
+
+enum perf_compress_type {
+	PERF_COMP_NONE = 0,
+	PERF_COMP_ZSTD,
+	PERF_COMP_MAX
 };
 
 extern struct perf_env perf_env;
diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index ba7be74fad6e..d1ad6c419724 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -68,6 +68,7 @@ static const char *perf_event__names[] = {
 	[PERF_RECORD_EVENT_UPDATE]		= "EVENT_UPDATE",
 	[PERF_RECORD_TIME_CONV]			= "TIME_CONV",
 	[PERF_RECORD_HEADER_FEATURE]		= "FEATURE",
+	[PERF_RECORD_COMPRESSED]		= "COMPRESSED",
 };
 
 static const char *perf_ns__names[] = {
diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index 36ae7e92dab1..8a13aefe734e 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -254,6 +254,7 @@ enum perf_user_event_type { /* above any possible kernel type */
 	PERF_RECORD_EVENT_UPDATE		= 78,
 	PERF_RECORD_TIME_CONV			= 79,
 	PERF_RECORD_HEADER_FEATURE		= 80,
+	PERF_RECORD_COMPRESSED			= 81,
 	PERF_RECORD_HEADER_MAX
 };
 
@@ -626,6 +627,11 @@ struct feature_event {
 	char				data[];
 };
 
+struct compressed_event {
+	struct perf_event_header	header;
+	char				data[];
+};
+
 union perf_event {
 	struct perf_event_header	header;
 	struct mmap_event		mmap;
@@ -659,6 +665,7 @@ union perf_event {
 	struct feature_event		feat;
 	struct ksymbol_event		ksymbol_event;
 	struct bpf_event		bpf_event;
+	struct compressed_event		pack;
 };
 
 void perf_event__print_totals(void);
diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index a2323d777dae..011b464fdad7 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -1250,6 +1250,30 @@ static int write_mem_topology(struct feat_fd *ff __maybe_unused,
 	return ret;
 }
 
+static int write_compressed(struct feat_fd *ff __maybe_unused,
+			    struct perf_evlist *evlist __maybe_unused)
+{
+	int ret;
+
+	ret = do_write(ff, &(ff->ph->env.comp_ver), sizeof(ff->ph->env.comp_ver));
+	if (ret)
+		return ret;
+
+	ret = do_write(ff, &(ff->ph->env.comp_type), sizeof(ff->ph->env.comp_type));
+	if (ret)
+		return ret;
+
+	ret = do_write(ff, &(ff->ph->env.comp_level), sizeof(ff->ph->env.comp_level));
+	if (ret)
+		return ret;
+
+	ret = do_write(ff, &(ff->ph->env.comp_ratio), sizeof(ff->ph->env.comp_ratio));
+	if (ret)
+		return ret;
+
+	return do_write(ff, &(ff->ph->env.comp_mmap_len), sizeof(ff->ph->env.comp_mmap_len));
+}
+
 static void print_hostname(struct feat_fd *ff, FILE *fp)
 {
 	fprintf(fp, "# hostname : %s\n", ff->ph->env.hostname);
@@ -1537,6 +1561,13 @@ static void print_cache(struct feat_fd *ff, FILE *fp __maybe_unused)
 	}
 }
 
+static void print_compressed(struct feat_fd *ff, FILE *fp)
+{
+	fprintf(fp, "# compressed : %s, level = %d, ratio = %d\n",
+		ff->ph->env.comp_type == PERF_COMP_ZSTD ? "Zstd" : "Unknown",
+		ff->ph->env.comp_level, ff->ph->env.comp_ratio);
+}
+
 static void print_pmu_mappings(struct feat_fd *ff, FILE *fp)
 {
 	const char *delimiter = "# pmu mappings: ";
@@ -2379,6 +2410,27 @@ static int process_clockid(struct feat_fd *ff,
 	return 0;
 }
 
+static int process_compressed(struct feat_fd *ff,
+			      void *data __maybe_unused)
+{
+	if (do_read_u32(ff, &(ff->ph->env.comp_ver)))
+		return -1;
+
+	if (do_read_u32(ff, &(ff->ph->env.comp_type)))
+		return -1;
+
+	if (do_read_u32(ff, &(ff->ph->env.comp_level)))
+		return -1;
+
+	if (do_read_u32(ff, &(ff->ph->env.comp_ratio)))
+		return -1;
+
+	if (do_read_u32(ff, &(ff->ph->env.comp_mmap_len)))
+		return -1;
+
+	return 0;
+}
+
 struct feature_ops {
 	int (*write)(struct feat_fd *ff, struct perf_evlist *evlist);
 	void (*print)(struct feat_fd *ff, FILE *fp);
@@ -2438,7 +2490,8 @@ static const struct feature_ops feat_ops[HEADER_LAST_FEATURE] = {
 	FEAT_OPN(CACHE,		cache,		true),
 	FEAT_OPR(SAMPLE_TIME,	sample_time,	false),
 	FEAT_OPR(MEM_TOPOLOGY,	mem_topology,	true),
-	FEAT_OPR(CLOCKID,       clockid,        false)
+	FEAT_OPR(CLOCKID,       clockid,        false),
+	FEAT_OPR(COMPRESSED,	compressed,	false)
 };
 
 struct header_print_data {
diff --git a/tools/perf/util/header.h b/tools/perf/util/header.h
index 0d553ddca0a3..ee867075dc64 100644
--- a/tools/perf/util/header.h
+++ b/tools/perf/util/header.h
@@ -39,6 +39,7 @@ enum {
 	HEADER_SAMPLE_TIME,
 	HEADER_MEM_TOPOLOGY,
 	HEADER_CLOCKID,
+	HEADER_COMPRESSED,
 	HEADER_LAST_FEATURE,
 	HEADER_FEAT_BITS	= 256,
 };


* [PATCH v3 5/9] perf mmap: implement dedicated memory buffer for data compression
  2019-02-26  5:31 [PATCH v3 0/9] perf: enable compression of record mode trace to save storage space Alexey Budankov
                   ` (2 preceding siblings ...)
  2019-02-26  5:59 ` [PATCH v3 4/9] perf record: implement COMPRESSED event record and its attributes Alexey Budankov
@ 2019-02-26  6:03 ` Alexey Budankov
  2019-02-26  6:10 ` [PATCH v3 6/9] perf util: introduce Zstd based streaming compression API Alexey Budankov
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 14+ messages in thread
From: Alexey Budankov @ 2019-02-26  6:03 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Ingo Molnar, Peter Zijlstra
  Cc: Jiri Olsa, Namhyung Kim, Alexander Shishkin, Andi Kleen, linux-kernel


Implement an mmap data buffer that serves as the working memory for
compressing sampling data in case of serial trace streaming.

In case of AIO trace streaming, the AIO buffers are used for
sampling data compression.

Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
---
 tools/perf/builtin-record.c |  6 +++++-
 tools/perf/util/evlist.c    |  8 +++++---
 tools/perf/util/evlist.h    |  2 +-
 tools/perf/util/mmap.c      | 25 +++++++++++++++++++++++++
 tools/perf/util/mmap.h      |  4 +++-
 5 files changed, 39 insertions(+), 6 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 61017fa0ee1c..71c67a87c713 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -572,7 +572,7 @@ static int record__mmap_evlist(struct record *rec,
 				 opts->auxtrace_mmap_pages,
 				 opts->auxtrace_snapshot_mode,
 				 opts->nr_cblocks, opts->affinity,
-				 opts->mmap_flush) < 0) {
+				 opts->mmap_flush, opts->comp_level) < 0) {
 		if (errno == EPERM) {
 			pr_err("Permission error mapping pages.\n"
 			       "Consider increasing "
@@ -2242,6 +2242,10 @@ int cmd_record(int argc, const char **argv)
 	pr_debug("affinity: %s\n", affinity_tags[rec->opts.affinity]);
 	pr_debug("mmap flush: %d\n", rec->opts.mmap_flush);
 
+	if (rec->opts.comp_level > 22)
+		rec->opts.comp_level = 0;
+	pr_debug("comp level: %d\n", rec->opts.comp_level);
+
 	err = __cmd_record(&record, argc, argv);
 out:
 	perf_evlist__delete(rec->evlist);
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 937039faac59..a13458b43dc1 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1022,7 +1022,8 @@ int perf_evlist__parse_mmap_pages(const struct option *opt, const char *str,
  */
 int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
 			 unsigned int auxtrace_pages,
-			 bool auxtrace_overwrite, int nr_cblocks, int affinity, int flush)
+			 bool auxtrace_overwrite, int nr_cblocks, int affinity, int flush,
+			 int comp_level)
 {
 	struct perf_evsel *evsel;
 	const struct cpu_map *cpus = evlist->cpus;
@@ -1032,7 +1033,8 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
 	 * Its value is decided by evsel's write_backward.
 	 * So &mp should not be passed through const pointer.
 	 */
-	struct mmap_params mp = { .nr_cblocks = nr_cblocks, .affinity = affinity, .flush = flush };
+	struct mmap_params mp = { .nr_cblocks = nr_cblocks, .affinity = affinity, .flush = flush,
+				  .comp_level = comp_level };
 
 	if (!evlist->mmap)
 		evlist->mmap = perf_evlist__alloc_mmap(evlist, false);
@@ -1064,7 +1066,7 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
 
 int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages)
 {
-	return perf_evlist__mmap_ex(evlist, pages, 0, false, 0, PERF_AFFINITY_SYS, 1);
+	return perf_evlist__mmap_ex(evlist, pages, 0, false, 0, PERF_AFFINITY_SYS, 1, 0);
 }
 
 int perf_evlist__create_maps(struct perf_evlist *evlist, struct target *target)
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index edf18811e39f..77c11dac4a63 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -166,7 +166,7 @@ unsigned long perf_event_mlock_kb_in_pages(void);
 int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
 			 unsigned int auxtrace_pages,
 			 bool auxtrace_overwrite, int nr_cblocks,
-			 int affinity, int flush);
+			 int affinity, int flush, int comp_level);
 int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages);
 void perf_evlist__munmap(struct perf_evlist *evlist);
 
diff --git a/tools/perf/util/mmap.c b/tools/perf/util/mmap.c
index ef3d79b2c90b..08fd846df604 100644
--- a/tools/perf/util/mmap.c
+++ b/tools/perf/util/mmap.c
@@ -159,6 +159,10 @@ void __weak auxtrace_mmap_params__set_idx(struct auxtrace_mmap_params *mp __mayb
 #ifdef HAVE_AIO_SUPPORT
 
 #ifdef HAVE_LIBNUMA_SUPPORT
+static int perf_mmap__aio_enabled(struct perf_mmap *map)
+{
+	return map->aio.nr_cblocks > 0;
+}
 static int perf_mmap__aio_alloc(struct perf_mmap *map, int idx)
 {
 	map->aio.data[idx] = mmap(NULL, perf_mmap__mmap_len(map), PROT_READ|PROT_WRITE,
@@ -199,6 +203,10 @@ static int perf_mmap__aio_bind(struct perf_mmap *map, int idx, int cpu, int affi
 	return 0;
 }
 #else
+static int perf_mmap__aio_enabled(struct perf_mmap *map __maybe_unused)
+{
+	return 0;
+}
 static int perf_mmap__aio_alloc(struct perf_mmap *map, int idx)
 {
 	map->aio.data[idx] = malloc(perf_mmap__mmap_len(map));
@@ -374,6 +382,10 @@ static void perf_mmap__aio_munmap(struct perf_mmap *map __maybe_unused)
 void perf_mmap__munmap(struct perf_mmap *map)
 {
 	perf_mmap__aio_munmap(map);
+	if (map->data != NULL) {
+		munmap(map->data, perf_mmap__mmap_len(map));
+		map->data = NULL;
+	}
 	if (map->base != NULL) {
 		munmap(map->base, perf_mmap__mmap_len(map));
 		map->base = NULL;
@@ -442,6 +454,19 @@ int perf_mmap__mmap(struct perf_mmap *map, struct mmap_params *mp, int fd, int c
 
 	map->flush = mp->flush;
 
+	map->comp_level = mp->comp_level;
+
+	if (map->comp_level && !perf_mmap__aio_enabled(map)) {
+		map->data = mmap(NULL, perf_mmap__mmap_len(map), PROT_READ|PROT_WRITE,
+				 MAP_PRIVATE|MAP_ANONYMOUS, 0, 0);
+		if (map->data == MAP_FAILED) {
+			pr_debug2("failed to mmap data buffer, error %d\n",
+					errno);
+			map->data = NULL;
+			return -1;
+		}
+	}
+
 	if (auxtrace_mmap__mmap(&map->auxtrace_mmap,
 				&mp->auxtrace_mp, map->base, fd))
 		return -1;
diff --git a/tools/perf/util/mmap.h b/tools/perf/util/mmap.h
index b82f8c2d55c4..a02427d609c0 100644
--- a/tools/perf/util/mmap.h
+++ b/tools/perf/util/mmap.h
@@ -40,6 +40,8 @@ struct perf_mmap {
 #endif
 	cpu_set_t	affinity_mask;
 	u64		flush;
+	void 		*data;
+	int		comp_level;
 };
 
 /*
@@ -71,7 +73,7 @@ enum bkw_mmap_state {
 };
 
 struct mmap_params {
-	int			    prot, mask, nr_cblocks, affinity, flush;
+	int prot, mask, nr_cblocks, affinity, flush, comp_level;
 	struct auxtrace_mmap_params auxtrace_mp;
 };


* [PATCH v3 6/9] perf util: introduce Zstd based streaming compression API
  2019-02-26  5:31 [PATCH v3 0/9] perf: enable compression of record mode trace to save storage space Alexey Budankov
                   ` (3 preceding siblings ...)
  2019-02-26  6:03 ` [PATCH v3 5/9] perf mmap: implement dedicated memory buffer for data compression Alexey Budankov
@ 2019-02-26  6:10 ` Alexey Budankov
  2019-02-26  6:20 ` [PATCH v3 7/9] perf record: implement -z,--compression_level=n option and compression Alexey Budankov
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 14+ messages in thread
From: Alexey Budankov @ 2019-02-26  6:10 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Ingo Molnar, Peter Zijlstra
  Cc: Jiri Olsa, Namhyung Kim, Alexander Shishkin, Andi Kleen, linux-kernel


The implemented functions are based on the Zstd streaming compression
API. They are used at runtime to compress data that comes from the
mmaped kernel buffers before it is stored into a trace.

Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
---
 tools/perf/util/Build      |  2 +
 tools/perf/util/compress.h | 18 ++++++++
 tools/perf/util/zstd.c     | 95 ++++++++++++++++++++++++++++++++++++++
 3 files changed, 115 insertions(+)
 create mode 100644 tools/perf/util/zstd.c

diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index 8dd3102301ea..920ee8bebd83 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -145,6 +145,8 @@ perf-y += scripting-engines/
 
 perf-$(CONFIG_ZLIB) += zlib.o
 perf-$(CONFIG_LZMA) += lzma.o
+perf-y += zstd.o
+
 perf-y += demangle-java.o
 perf-y += demangle-rust.o
 
diff --git a/tools/perf/util/compress.h b/tools/perf/util/compress.h
index 892e92e7e7fc..e0987616db94 100644
--- a/tools/perf/util/compress.h
+++ b/tools/perf/util/compress.h
@@ -2,6 +2,11 @@
 #ifndef PERF_COMPRESS_H
 #define PERF_COMPRESS_H
 
+#include <stdbool.h>
+#ifdef HAVE_ZSTD_SUPPORT
+#include <zstd.h>
+#endif
+
 #ifdef HAVE_ZLIB_SUPPORT
 int gzip_decompress_to_file(const char *input, int output_fd);
 bool gzip_is_compressed(const char *input);
@@ -12,4 +17,17 @@ int lzma_decompress_to_file(const char *input, int output_fd);
 bool lzma_is_compressed(const char *input);
 #endif
 
+struct zstd_data {
+#ifdef HAVE_ZSTD_SUPPORT
+	ZSTD_CStream	*cstream;
+#endif
+};
+
+int zstd_init(struct zstd_data *data, int level);
+int zstd_fini(struct zstd_data *data);
+
+size_t zstd_compress_stream_to_records(struct zstd_data *data,
+	void *dst, size_t dst_size, void *src, size_t src_size,	size_t max_record_size,
+	size_t process_header(void *record, size_t increment));
+
 #endif /* PERF_COMPRESS_H */
diff --git a/tools/perf/util/zstd.c b/tools/perf/util/zstd.c
new file mode 100644
index 000000000000..686c3a347dcc
--- /dev/null
+++ b/tools/perf/util/zstd.c
@@ -0,0 +1,95 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <string.h>
+
+#include "util/compress.h"
+#include "util/debug.h"
+
+#ifdef HAVE_ZSTD_SUPPORT
+
+int zstd_init(struct zstd_data *data, int level)
+{
+	size_t ret;
+
+	data->cstream = ZSTD_createCStream();
+	if (data->cstream == NULL) {
+		pr_err("Couldn't create compression stream.\n");
+		return -1;
+	}
+
+	ret = ZSTD_initCStream(data->cstream, level);
+	if (ZSTD_isError(ret)) {
+		pr_err("Failed to initialize compression stream: %s\n", ZSTD_getErrorName(ret));
+		return -1;
+	}
+
+	return 0;
+}
+
+int zstd_fini(struct zstd_data *data)
+{
+	if (data->cstream) {
+		ZSTD_freeCStream(data->cstream);
+		data->cstream = NULL;
+	}
+
+	return 0;
+}
+
+size_t zstd_compress_stream_to_records(struct zstd_data *data,
+	void *dst, size_t dst_size, void *src, size_t src_size,	size_t max_record_size,
+	size_t process_header(void *record, size_t increment))
+{
+	size_t ret, size, compressed = 0;
+	ZSTD_inBuffer input = { src, src_size, 0 };
+	ZSTD_outBuffer output;
+	void *record;
+
+	while (input.pos < input.size) {
+		record = dst;
+		size = process_header(record, 0);
+		compressed += size;
+		dst += size;
+		dst_size -= size;
+		output = (ZSTD_outBuffer){ dst, (dst_size > max_record_size) ?
+						max_record_size : dst_size, 0 };
+		ret = ZSTD_compressStream(data->cstream, &output, &input);
+		ZSTD_flushStream(data->cstream, &output);
+		if (ZSTD_isError(ret)) {
+			pr_err("failed to compress %ld bytes: %s\n",
+				(long)src_size, ZSTD_getErrorName(ret));
+			memcpy(dst, src, src_size);
+			return src_size;
+		}
+		size = output.pos;
+		size = process_header(record, size);
+		compressed += size;
+		dst += size;
+		dst_size -= size;
+	}
+
+	return compressed;
+}
+
+#else /* !HAVE_ZSTD_SUPPORT */
+
+int zstd_init(struct zstd_data *data __maybe_unused, int level __maybe_unused)
+{
+	return 0;
+}
+
+int zstd_fini(struct zstd_data *data __maybe_unused)
+{
+	return 0;
+}
+
+size_t zstd_compress_stream_to_records(struct zstd_data *data __maybe_unused,
+		void *dst, size_t dst_size __maybe_unused,
+		void *src, size_t src_size, size_t max_record_size __maybe_unused,
+		size_t process_header(void *record, size_t increment) __maybe_unused)
+{
+	memcpy(dst, src, src_size);
+	return 0;
+}
+
+#endif



* [PATCH v3 7/9] perf record: implement -z,--compression_level=n option and compression
  2019-02-26  5:31 [PATCH v3 0/9] perf: enable compression of record mode trace to save storage space Alexey Budankov
                   ` (4 preceding siblings ...)
  2019-02-26  6:10 ` [PATCH v3 6/9] perf util: introduce Zstd based streaming compression API Alexey Budankov
@ 2019-02-26  6:20 ` Alexey Budankov
  2019-02-26  6:26 ` [PATCH v3 8/9] perf report: implement record trace decompression Alexey Budankov
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 14+ messages in thread
From: Alexey Budankov @ 2019-02-26  6:20 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Ingo Molnar, Peter Zijlstra
  Cc: Jiri Olsa, Namhyung Kim, Alexander Shishkin, Andi Kleen, linux-kernel


Implement the -z,--compression_level=n option that enables compression
of mmaped kernel data buffer content at runtime during perf record
sampling collection.
    
Compression is implemented using the functions from zstd.c. The
compression operates on the mmap->data buffer in case of serial trace
writing and on the mmap AIO buffers in case of AIO trace writing. If the
Zstd streaming compression API fails for some reason, the data to be
compressed is just copied into the memory buffers using memcpy().
    
A compressed trace frame consists of an array of PERF_RECORD_COMPRESSED
records. Each element of the array is no longer than 64KiB because of
the u16 size limitation and consists of a perf_event_header followed by
a compressed chunk that is decompressed at the loading stage. The
--mmap-flush option value can be used to avoid compressing every single
byte of data and possibly increase the compression ratio.
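
The record size limit can be sketched as follows. struct event_header
is a hypothetical 8-byte stand-in with the same field widths as
perf_event_header, and records_needed() is an illustrative helper that
is not part of the patch. It gives a lower bound that assumes every
record except the last is filled completely, which the streaming
compressor does not guarantee.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for perf_event_header: 8 bytes, u16 size field. */
struct event_header {
	uint32_t type;
	uint16_t misc;
	uint16_t size;
};

/* Largest payload one record can carry: (2^16 - 1) minus the header. */
static size_t max_record_payload(void)
{
	return (size_t)UINT16_MAX - sizeof(struct event_header);
}

/* Minimum number of records needed to carry `compressed` payload bytes. */
static size_t records_needed(size_t compressed)
{
	size_t max = max_record_payload();

	assert(max > 0);
	return (compressed + max - 1) / max;
}
```

With an 8-byte header the payload limit is 65527 bytes, so a compressed
chunk of 65528 bytes already needs a second record.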
    
Compression overhead has been measured for serial and AIO trace writing
when profiling matrix multiplication workload:
    
        -------------------------------------------------------------
        | SERIAL                      | AIO-1                       |
    -----------------------------------------------------------------
    |-z | OVH(x) | ratio(x) size(MiB) | OVH(x) | ratio(x) size(MiB) |
    |----------------------------------------------------------------
    | 0 | 1,00   | 1,000    179,424   | 1,00   | 1,000    187,527   |
    | 1 | 1,04   | 8,427    181,148   | 1,01   | 8,474    188,562   |
    | 2 | 1,07   | 8,055    186,953   | 1,03   | 7,912    191,773   |
    | 3 | 1,04   | 8,283    181,908   | 1,03   | 8,220    191,078   |
    | 5 | 1,09   | 8,101    187,705   | 1,05   | 7,780    190,065   |
    | 8 | 1,05   | 9,217    179,191   | 1,12   | 6,111    193,024   |
    -----------------------------------------------------------------
    
    OVH = (Execution time with -z N) / (Execution time with -z 0)

    ratio - compression ratio
    size  - number of bytes that were compressed
    
            size ~= trace size x ratio
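
As a worked reading of the relation above, the approximate trace size
can be recovered by dividing the size column by the ratio column. The
helper name trace_size_mib() is hypothetical, and the result is only an
estimate because the relation itself is approximate.

```c
#include <assert.h>

/* Hypothetical helper: estimated trace size in MiB, from size ~= trace x ratio. */
static double trace_size_mib(double size_mib, double ratio)
{
	assert(ratio > 0.0);
	return size_mib / ratio;
}
```

For the SERIAL row with -z 1 (size 181,148 MiB, ratio 8,427) this gives
an estimated trace of roughly 21.5 MiB.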

Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
---
 tools/perf/Documentation/perf-record.txt |  5 ++
 tools/perf/builtin-record.c              | 87 ++++++++++++++++++++----
 tools/perf/util/mmap.c                   | 31 ++++++---
 tools/perf/util/mmap.h                   | 13 ++--
 tools/perf/util/session.h                |  2 +
 5 files changed, 110 insertions(+), 28 deletions(-)

diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index 8276d6517812..28c62a914c75 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -464,6 +464,11 @@ Set affinity mask of trace reading thread according to the policy defined by 'mo
 Minimal number of bytes accumulated in mmaped kernel buffer that is flushed to a storage (default: 1).
 Maximal allowed value is a quater of mmaped kernel buffer size.
 
+-z::
+--compression-level=n::
+Produce compressed trace using specified level n to save storage space (no compression: 0 - default,
+fastest compression: 1, smallest trace: 22)
+
 --all-kernel::
 Configure all used events to run in kernel space.
 
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 71c67a87c713..fa50387334f2 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -237,7 +237,7 @@ static int record__aio_sync(struct perf_mmap *md, bool sync_all)
 	} while (1);
 }
 
-static int record__aio_pushfn(void *to, struct aiocb *cblock, void *bf, size_t size, off_t off)
+static int record__aio_pushfn(void *to, void *bf, size_t size, off_t off, struct aiocb *cblock)
 {
 	struct record *rec = to;
 	int ret, trace_fd = rec->session->data->file.fd;
@@ -264,13 +264,15 @@ static void record__aio_set_pos(int trace_fd, off_t pos)
 	lseek(trace_fd, pos, SEEK_SET);
 }
 
+static int record__aio_enabled(struct record *rec);
+
 static void record__aio_mmap_read_sync(struct record *rec)
 {
 	int i;
 	struct perf_evlist *evlist = rec->evlist;
 	struct perf_mmap *maps = evlist->mmap;
 
-	if (!rec->opts.nr_cblocks)
+	if (!record__aio_enabled(rec))
 		return;
 
 	for (i = 0; i < evlist->nr_mmaps; i++) {
@@ -292,25 +294,28 @@ static int record__aio_parse(const struct option *opt,
 
 	if (unset) {
 		opts->nr_cblocks = 0;
-	} else {
-		if (str)
-			opts->nr_cblocks = strtol(str, NULL, 0);
-		if (!opts->nr_cblocks)
-			opts->nr_cblocks = nr_cblocks_default;
+		return 0;
 	}
 
+	if (str)
+		opts->nr_cblocks = strtol(str, NULL, 0);
+	if (!opts->nr_cblocks)
+		opts->nr_cblocks = nr_cblocks_default;
+
+	if (opts->nr_cblocks > nr_cblocks_max)
+		opts->nr_cblocks = nr_cblocks_max;
+
 	return 0;
 }
 #else /* HAVE_AIO_SUPPORT */
-static int nr_cblocks_max = 0;
-
 static int record__aio_sync(struct perf_mmap *md __maybe_unused, bool sync_all __maybe_unused)
 {
 	return -1;
 }
 
-static int record__aio_pushfn(void *to __maybe_unused, struct aiocb *cblock __maybe_unused,
-		void *bf __maybe_unused, size_t size __maybe_unused, off_t off __maybe_unused)
+static int record__aio_pushfn(void *to __maybe_unused, void *bf __maybe_unused,
+		size_t size __maybe_unused, off_t off __maybe_unused,
+		struct aiocb *cblock __maybe_unused)
 {
 	return -1;
 }
@@ -762,6 +767,40 @@ static void record__adjust_affinity(struct record *rec, struct perf_mmap *map)
 	}
 }
 
+static size_t record__process_comp_header(void *record, size_t increment)
+{
+	struct compressed_event *event = record;
+	size_t size = sizeof(struct compressed_event);
+
+	if (increment) {
+		event->header.size += increment;
+		return increment;
+	} else {
+		event->header.type = PERF_RECORD_COMPRESSED;
+		event->header.size = size;
+		return size;
+	}
+}
+
+static size_t record__zstd_compress(void *data, void *dst, size_t dst_size,
+		void *src, size_t src_size)
+{
+	size_t compressed;
+	struct perf_session *session = data;
+	/* maximum size of record data (2^16 - 1 - header) */
+	const size_t max_record_size = (1 << 8 * sizeof(u16)) -
+					1 - sizeof(struct compressed_event);
+
+	compressed = zstd_compress_stream_to_records(&(session->zstd_data),
+				dst, dst_size, src, src_size, max_record_size,
+				record__process_comp_header);
+
+	session->bytes_transferred += src_size;
+	session->bytes_compressed  += compressed;
+
+	return compressed;
+}
+
 static int record__mmap_read_evlist(struct record *rec, struct perf_evlist *evlist,
 				    bool overwrite, bool sync)
 {
@@ -771,6 +810,8 @@ static int record__mmap_read_evlist(struct record *rec, struct perf_evlist *evli
 	struct perf_mmap *maps;
 	int trace_fd = rec->data.file.fd;
 	off_t off;
+	struct perf_session *session = rec->session;
+	perf_mmap__compress_fn_t compress_fn;
 
 	if (!evlist)
 		return 0;
@@ -782,6 +823,8 @@ static int record__mmap_read_evlist(struct record *rec, struct perf_evlist *evli
 	if (overwrite && evlist->bkw_mmap_state != BKW_MMAP_DATA_PENDING)
 		return 0;
 
+	compress_fn = record__comp_enabled(rec) ? record__zstd_compress : NULL;
+
 	if (record__aio_enabled(rec))
 		off = record__aio_get_pos(trace_fd);
 
@@ -796,7 +839,8 @@ static int record__mmap_read_evlist(struct record *rec, struct perf_evlist *evli
 				map->flush = MMAP_FLUSH_DEFAULT;
 			}
 			if (!record__aio_enabled(rec)) {
-				if (perf_mmap__push(map, rec, record__pushfn) != 0) {
+				if (perf_mmap__push(map, rec, record__pushfn,
+						    compress_fn, session) != 0) {
 					if (sync)
 						map->flush = flush;
 					rc = -1;
@@ -809,7 +853,8 @@ static int record__mmap_read_evlist(struct record *rec, struct perf_evlist *evli
 				 * becomes available after previous aio write request.
 				 */
 				idx = record__aio_sync(map, false);
-				if (perf_mmap__aio_push(map, rec, idx, record__aio_pushfn, &off) != 0) {
+				if (perf_mmap__aio_push(map, rec, idx, record__aio_pushfn, &off,
+							compress_fn, session) != 0) {
 					record__aio_set_pos(trace_fd, off);
 					if (sync)
 						map->flush = flush;
@@ -1190,6 +1235,14 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 	fd = perf_data__fd(data);
 	rec->session = session;
 
+	if (zstd_init(&(session->zstd_data), rec->opts.comp_level) < 0) {
+		pr_err("Compression initialization failed.\n");
+		return -1;
+	}
+
+	session->header.env.comp_type  = PERF_COMP_ZSTD;
+	session->header.env.comp_level = rec->opts.comp_level;
+
 	record__init_features(rec);
 
 	if (rec->opts.use_clockid && rec->opts.clockid_res_ns)
@@ -1519,6 +1572,7 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 	}
 
 out_delete_session:
+	zstd_fini(&(session->zstd_data));
 	perf_session__delete(session);
 	return status;
 }
@@ -2039,6 +2093,10 @@ static struct option __record_options[] = {
 	OPT_CALLBACK(0, "affinity", &record.opts, "node|cpu",
 		     "Set affinity mask of trace reading thread to NUMA node cpu mask or cpu of processed mmap buffer",
 		     record__parse_affinity),
+#ifdef HAVE_ZSTD_SUPPORT
+	OPT_UINTEGER('z', "compression-level", &record.opts.comp_level,
+		     "Produce compressed trace (default: 0, fastest: 1, smallest: 22)"),
+#endif
 	OPT_END()
 };
 
@@ -2236,8 +2294,7 @@ int cmd_record(int argc, const char **argv)
 
 	if (rec->opts.nr_cblocks > nr_cblocks_max)
 		rec->opts.nr_cblocks = nr_cblocks_max;
-	if (verbose > 0)
-		pr_info("nr_cblocks: %d\n", rec->opts.nr_cblocks);
+	pr_debug("nr_cblocks: %d\n", rec->opts.nr_cblocks);
 
 	pr_debug("affinity: %s\n", affinity_tags[rec->opts.affinity]);
 	pr_debug("mmap flush: %d\n", rec->opts.mmap_flush);
diff --git a/tools/perf/util/mmap.c b/tools/perf/util/mmap.c
index 08fd846df604..ea0db6281e44 100644
--- a/tools/perf/util/mmap.c
+++ b/tools/perf/util/mmap.c
@@ -295,14 +295,15 @@ static void perf_mmap__aio_munmap(struct perf_mmap *map)
 }
 
 int perf_mmap__aio_push(struct perf_mmap *md, void *to, int idx,
-			int push(void *to, struct aiocb *cblock, void *buf, size_t size, off_t off),
-			off_t *off)
+			int push(void *to, void *buf, size_t size, off_t off, struct aiocb *cblock),
+			off_t *off, perf_mmap__compress_fn_t compress, void *comp_data)
 {
 	u64 head = perf_mmap__read_head(md);
 	unsigned char *data = md->base + page_size;
 	unsigned long size, size0 = 0;
 	void *buf;
 	int rc = 0;
+	size_t mmap_len = perf_mmap__mmap_len(md);
 
 	rc = perf_mmap__read_init(md);
 	if (rc < 0)
@@ -331,14 +332,20 @@ int perf_mmap__aio_push(struct perf_mmap *md, void *to, int idx,
 		buf = &data[md->start & md->mask];
 		size = md->mask + 1 - (md->start & md->mask);
 		md->start += size;
-		memcpy(md->aio.data[idx], buf, size);
 		size0 = size;
+		if (!compress)
+			memcpy(md->aio.data[idx], buf, size);
+		else
+			size0 = compress(comp_data, md->aio.data[idx], mmap_len, buf, size);
 	}
 
 	buf = &data[md->start & md->mask];
 	size = md->end - md->start;
 	md->start += size;
-	memcpy(md->aio.data[idx] + size0, buf, size);
+	if (!compress)
+		memcpy(md->aio.data[idx] + size0, buf, size);
+	else
+		size = compress(comp_data, md->aio.data[idx] + size0, mmap_len - size0, buf, size);
 
 	/*
 	 * Increment md->refcount to guard md->data[idx] buffer
@@ -354,7 +361,7 @@ int perf_mmap__aio_push(struct perf_mmap *md, void *to, int idx,
 	md->prev = head;
 	perf_mmap__consume(md);
 
-	rc = push(to, &md->aio.cblocks[idx], md->aio.data[idx], size0 + size, *off);
+	rc = push(to, md->aio.data[idx], size0 + size, *off, &md->aio.cblocks[idx]);
 	if (!rc) {
 		*off += size0 + size;
 	} else {
@@ -555,13 +562,15 @@ int perf_mmap__read_init(struct perf_mmap *map)
 }
 
 int perf_mmap__push(struct perf_mmap *md, void *to,
-		    int push(struct perf_mmap *map, void *to, void *buf, size_t size))
+		    int push(struct perf_mmap *map, void *to, void *buf, size_t size),
+		    perf_mmap__compress_fn_t compress, void *comp_data)
 {
 	u64 head = perf_mmap__read_head(md);
 	unsigned char *data = md->base + page_size;
 	unsigned long size;
 	void *buf;
 	int rc = 0;
+	size_t mmap_len = perf_mmap__mmap_len(md);
 
 	rc = perf_mmap__read_init(md);
 	if (rc < 0)
@@ -573,7 +582,10 @@ int perf_mmap__push(struct perf_mmap *md, void *to,
 		buf = &data[md->start & md->mask];
 		size = md->mask + 1 - (md->start & md->mask);
 		md->start += size;
-
+		if (compress) {
+			size = compress(comp_data, md->data, mmap_len, buf, size);
+			buf = md->data;
+		}
 		if (push(md, to, buf, size) < 0) {
 			rc = -1;
 			goto out;
@@ -583,7 +595,10 @@ int perf_mmap__push(struct perf_mmap *md, void *to,
 	buf = &data[md->start & md->mask];
 	size = md->end - md->start;
 	md->start += size;
-
+	if (compress) {
+		size = compress(comp_data, md->data, mmap_len, buf, size);
+		buf = md->data;
+	}
 	if (push(md, to, buf, size) < 0) {
 		rc = -1;
 		goto out;
diff --git a/tools/perf/util/mmap.h b/tools/perf/util/mmap.h
index a02427d609c0..2df3882c4b83 100644
--- a/tools/perf/util/mmap.h
+++ b/tools/perf/util/mmap.h
@@ -99,16 +99,19 @@ union perf_event *perf_mmap__read_forward(struct perf_mmap *map);
 
 union perf_event *perf_mmap__read_event(struct perf_mmap *map);
 
+typedef size_t (*perf_mmap__compress_fn_t)(void *data, void *dst, size_t dst_size,
+		void *src, size_t src_size);
 int perf_mmap__push(struct perf_mmap *md, void *to,
-		    int push(struct perf_mmap *map, void *to, void *buf, size_t size));
+		    int push(struct perf_mmap *map, void *to, void *buf, size_t size),
+		    perf_mmap__compress_fn_t compress, void *compress_data);
 #ifdef HAVE_AIO_SUPPORT
 int perf_mmap__aio_push(struct perf_mmap *md, void *to, int idx,
-			int push(void *to, struct aiocb *cblock, void *buf, size_t size, off_t off),
-			off_t *off);
+			int push(void *to, void *buf, size_t size, off_t off, struct aiocb *cblock),
+			off_t *off, perf_mmap__compress_fn_t compress, void *compress_data);
 #else
 static inline int perf_mmap__aio_push(struct perf_mmap *md __maybe_unused, void *to __maybe_unused, int idx __maybe_unused,
-	int push(void *to, struct aiocb *cblock, void *buf, size_t size, off_t off) __maybe_unused,
-	off_t *off __maybe_unused)
+	int push(void *to, void *buf, size_t size, off_t off, struct aiocb *cblock) __maybe_unused,
+	off_t *off __maybe_unused, perf_mmap__compress_fn_t compress __maybe_unused, void *compress_data __maybe_unused)
 {
 	return 0;
 }
diff --git a/tools/perf/util/session.h b/tools/perf/util/session.h
index 0e14884f28b2..6c984c895924 100644
--- a/tools/perf/util/session.h
+++ b/tools/perf/util/session.h
@@ -8,6 +8,7 @@
 #include "machine.h"
 #include "data.h"
 #include "ordered-events.h"
+#include "util/compress.h"
 #include <linux/kernel.h>
 #include <linux/rbtree.h>
 #include <linux/perf_event.h>
@@ -37,6 +38,7 @@ struct perf_session {
 	struct perf_tool	*tool;
 	u64			bytes_transferred;
 	u64			bytes_compressed;
+	struct zstd_data	zstd_data;
 };
 
 struct perf_tool;


* [PATCH v3 8/9] perf report: implement record trace decompression
  2019-02-26  5:31 [PATCH v3 0/9] perf: enable compression of record mode trace to save storage space Alexey Budankov
                   ` (5 preceding siblings ...)
  2019-02-26  6:20 ` [PATCH v3 7/9] perf record: implement -z,--compression_level=n option and compression Alexey Budankov
@ 2019-02-26  6:26 ` Alexey Budankov
  2019-02-26  6:28 ` [PATCH v3 9/9] perf inject: enable COMPRESSED records decompression Alexey Budankov
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 14+ messages in thread
From: Alexey Budankov @ 2019-02-26  6:26 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Ingo Molnar, Peter Zijlstra
  Cc: Jiri Olsa, Namhyung Kim, Alexander Shishkin, Andi Kleen, linux-kernel


Trace frames containing PERF_RECORD_COMPRESSED records are
decompressed using functions from zstd.c into a linked list
of mmaped memory regions (struct decomp), each holding up to
comp_mmap_len bytes of data.

After one COMPRESSED record is decompressed, its content is
iterated and the events are fetched for usual processing. The
mmaped memory regions with decompressed events are kept until
the tool process terminates.

When dumping a raw trace (e.g., perf report -D --header), file
offsets of events from compressed records are printed as zero.
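
The chaining of decompressed regions described above can be sketched
in isolation. The following is a simplified illustration and not the
patch code: struct rec_header, struct chunk, chunk_append() and
chunk_process() are hypothetical names, malloc stands in for mmap, and
plain byte copies stand in for the Zstd decompression call. It shows
how the unprocessed tail of the previous region is carried into the
new one so that a record split across two COMPRESSED records can still
be processed as a whole.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an event header; only 'size' matters here. */
struct rec_header {
	uint32_t type;
	uint16_t misc;
	uint16_t size;	/* total record size, header included */
};

/* Simplified analogue of struct decomp (heap allocated, not mmaped). */
struct chunk {
	struct chunk *next;
	size_t head;	/* offset of the first unprocessed byte */
	size_t size;	/* number of valid bytes in data[] */
	unsigned char data[];
};

/*
 * Allocate a new chunk, seed it with the unprocessed tail of 'last'
 * and append 'len' already decompressed bytes taken from 'src'.
 * Returns NULL if the data does not fit or allocation fails.
 */
static struct chunk *chunk_append(struct chunk *last, const void *src,
				  size_t len, size_t capacity)
{
	size_t rem = 0;
	struct chunk *c;

	if (last) {
		assert(last->head <= last->size);
		rem = last->size - last->head;
	}
	if (rem + len > capacity)
		return NULL;

	c = calloc(1, sizeof(*c) + capacity);
	if (!c)
		return NULL;

	if (rem)
		memcpy(c->data, last->data + last->head, rem);
	memcpy(c->data + rem, src, len);
	c->size = rem + len;

	if (last)
		last->next = c;
	return c;
}

/*
 * Consume every complete record in the chunk and return their count,
 * or -1 on a malformed header. A trailing partial record is left for
 * the next chunk to pick up.
 */
static int chunk_process(struct chunk *c)
{
	int n = 0;

	while (c->size - c->head >= sizeof(struct rec_header)) {
		struct rec_header hdr;

		memcpy(&hdr, c->data + c->head, sizeof(hdr));
		if (hdr.size < sizeof(hdr))
			return -1;
		if (hdr.size > c->size - c->head)
			break;	/* partial record: wait for more data */
		c->head += hdr.size;
		n++;
	}
	return n;
}
```

A caller would invoke chunk_append() once per COMPRESSED record and
then chunk_process() on the newest chunk only, which mirrors the way
the patch iterates session->decomp_last after every processed event.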

Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
---
 tools/perf/builtin-report.c |   5 +-
 tools/perf/util/compress.h  |   4 ++
 tools/perf/util/session.c   | 124 +++++++++++++++++++++++++++++++++++-
 tools/perf/util/session.h   |  10 +++
 tools/perf/util/tool.h      |   2 +
 tools/perf/util/zstd.c      |  48 ++++++++++++++
 6 files changed, 191 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 2e8c74d6430c..5f4483b525ed 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -1215,6 +1215,9 @@ int cmd_report(int argc, const char **argv)
 	if (session == NULL)
 		return -1;
 
+	if (zstd_init(&(session->zstd_data), 0) < 0)
+		pr_warning("Decompression initialization failed. Reported data may be incomplete.\n");
+
 	if (report.queue_size) {
 		ordered_events__set_alloc_size(&session->ordered_events,
 					       report.queue_size);
@@ -1427,7 +1430,7 @@ int cmd_report(int argc, const char **argv)
 
 error:
 	zfree(&report.ptime_range);
-
+	zstd_fini(&(session->zstd_data));
 	perf_session__delete(session);
 	return ret;
 }
diff --git a/tools/perf/util/compress.h b/tools/perf/util/compress.h
index e0987616db94..4f6672770ebb 100644
--- a/tools/perf/util/compress.h
+++ b/tools/perf/util/compress.h
@@ -20,6 +20,7 @@ bool lzma_is_compressed(const char *input);
 struct zstd_data {
 #ifdef HAVE_ZSTD_SUPPORT
 	ZSTD_CStream	*cstream;
+	ZSTD_DStream	*dstream;
 #endif
 };
 
@@ -30,4 +31,7 @@ size_t zstd_compress_stream_to_records(struct zstd_data *data,
 	void *dst, size_t dst_size, void *src, size_t src_size,	size_t max_record_size,
 	size_t process_header(void *record, size_t increment));
 
+size_t zstd_decompress_stream(struct zstd_data *data,
+	void *src, size_t src_size, void *dst, size_t dst_size);
+
 #endif /* PERF_COMPRESS_H */
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index c764bbc91009..b1bf37c30461 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -29,6 +29,67 @@
 #include "stat.h"
 #include "arch/common.h"
 
+#ifdef HAVE_ZSTD_SUPPORT
+static int perf_session__process_compressed_event(struct perf_session *session,
+					union perf_event *event, u64 file_offset)
+{
+	void *src;
+	size_t decomp_size, src_size;
+	u64 decomp_last_rem = 0;
+	size_t decomp_len = session->header.env.comp_mmap_len;
+	struct decomp *decomp, *decomp_last = session->decomp_last;
+
+	decomp = mmap(NULL, sizeof(struct decomp) + decomp_len, PROT_READ|PROT_WRITE,
+		      MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
+	if (decomp == MAP_FAILED) {
+		pr_err("Couldn't allocate memory for decompression\n");
+		return -1;
+	}
+
+	decomp->file_pos = file_offset;
+	decomp->head = 0;
+
+	if (decomp_last) {
+		decomp_last_rem = decomp_last->size - decomp_last->head;
+		memcpy(decomp->data, &(decomp_last->data[decomp_last->head]), decomp_last_rem);
+		decomp->size = decomp_last_rem;
+	}
+
+	src = (void *)event + sizeof(struct compressed_event);
+	src_size = event->pack.header.size - sizeof(struct compressed_event);
+
+	decomp_size = zstd_decompress_stream(&(session->zstd_data), src, src_size,
+				&(decomp->data[decomp_last_rem]), decomp_len - decomp_last_rem);
+	if (!decomp_size) {
+		munmap(decomp, sizeof(struct decomp) + decomp_len);
+		pr_err("Couldn't decompress data\n");
+		return -1;
+	}
+
+	decomp->size += decomp_size;
+
+	if (session->decomp == NULL) {
+		session->decomp = decomp;
+		session->decomp_last = decomp;
+	} else {
+		session->decomp_last->next = decomp;
+		session->decomp_last = decomp;
+	}
+
+	pr_debug("decomp (B): %zu to %zu\n", src_size, decomp_size);
+
+	return 0;
+}
+#else /* !HAVE_ZSTD_SUPPORT */
+static int perf_session__process_compressed_event(struct perf_session *session __maybe_unused,
+				union perf_event *event __maybe_unused,
+				u64 file_offset __maybe_unused)
+{
+	dump_printf(": unhandled!\n");
+	return 0;
+}
+#endif
+
 static int perf_session__deliver_event(struct perf_session *session,
 				       union perf_event *event,
 				       struct perf_tool *tool,
@@ -196,12 +257,23 @@ static void perf_session__delete_threads(struct perf_session *session)
 
 void perf_session__delete(struct perf_session *session)
 {
+	struct decomp *next, *decomp;
+	size_t decomp_len;
 	if (session == NULL)
 		return;
 	auxtrace__free(session);
 	auxtrace_index__free(&session->auxtrace_index);
 	perf_session__destroy_kernel_maps(session);
 	perf_session__delete_threads(session);
+	next = session->decomp;
+	decomp_len = session->header.env.comp_mmap_len;
+	do {
+		decomp = next;
+		if (decomp == NULL)
+			break;
+		next = decomp->next;
+		munmap(decomp, decomp_len + sizeof(struct decomp));
+	} while (1);
 	perf_env__exit(&session->header.env);
 	machines__exit(&session->machines);
 	if (session->data)
@@ -427,6 +499,8 @@ void perf_tool__fill_defaults(struct perf_tool *tool)
 		tool->time_conv = process_event_op2_stub;
 	if (tool->feature == NULL)
 		tool->feature = process_event_op2_stub;
+	if (tool->compressed == NULL)
+		tool->compressed = perf_session__process_compressed_event;
 }
 
 static void swap_sample_id_all(union perf_event *event, void *data)
@@ -1370,7 +1444,8 @@ static s64 perf_session__process_user_event(struct perf_session *session,
 	int fd = perf_data__fd(session->data);
 	int err;
 
-	dump_event(session->evlist, event, file_offset, &sample);
+	if (event->header.type != PERF_RECORD_COMPRESSED)
+		dump_event(session->evlist, event, file_offset, &sample);
 
 	/* These events are processed right away */
 	switch (event->header.type) {
@@ -1423,6 +1498,11 @@ static s64 perf_session__process_user_event(struct perf_session *session,
 		return tool->time_conv(session, event);
 	case PERF_RECORD_HEADER_FEATURE:
 		return tool->feature(session, event);
+	case PERF_RECORD_COMPRESSED:
+		err = tool->compressed(session, event, file_offset);
+		if (err)
+			dump_event(session->evlist, event, file_offset, &sample);
+		return 0;
 	default:
 		return -EINVAL;
 	}
@@ -1705,6 +1785,8 @@ static int perf_session__flush_thread_stacks(struct perf_session *session)
 
 volatile int session_done;
 
+static int __perf_session__process_decomp_events(struct perf_session *session);
+
 static int __perf_session__process_pipe_events(struct perf_session *session)
 {
 	struct ordered_events *oe = &session->ordered_events;
@@ -1785,6 +1867,10 @@ static int __perf_session__process_pipe_events(struct perf_session *session)
 	if (skip > 0)
 		head += skip;
 
+	err = __perf_session__process_decomp_events(session);
+	if (err)
+		goto out_err;
+
 	if (!session_done())
 		goto more;
 done:
@@ -1833,6 +1919,38 @@ fetch_mmaped_event(struct perf_session *session,
 	return event;
 }
 
+static int __perf_session__process_decomp_events(struct perf_session *session)
+{
+	s64 skip;
+	u64 size, file_pos = 0;
+	union perf_event *event;
+	struct decomp *decomp = session->decomp_last;
+
+	if (!decomp)
+		return 0;
+
+	while (decomp->head < decomp->size && !session_done()) {
+		event = fetch_mmaped_event(session, decomp->head, decomp->size, decomp->data);
+		if (!event)
+			break;
+
+		size = event->header.size;
+		if (size < sizeof(struct perf_event_header) ||
+		    (skip = perf_session__process_event(session, event, file_pos)) < 0) {
+			pr_err("%#" PRIx64 " [%#x]: failed to process type: %d\n",
+				decomp->file_pos + decomp->head, event->header.size, event->header.type);
+			return -EINVAL;
+		}
+
+		if (skip)
+			size += skip;
+
+		decomp->head += size;
+	}
+
+	return 0;
+}
+
 /*
  * On 64bit we can mmap the data file in one go. No need for tiny mmap
  * slices. On 32bit we use 32MB.
@@ -1933,6 +2051,10 @@ reader__process_events(struct reader *rd, struct perf_session *session,
 	head += size;
 	file_pos += size;
 
+	err = __perf_session__process_decomp_events(session);
+	if (err)
+		goto out;
+
 	ui_progress__update(prog, size);
 
 	if (session_done())
diff --git a/tools/perf/util/session.h b/tools/perf/util/session.h
index 6c984c895924..dd8920b745bc 100644
--- a/tools/perf/util/session.h
+++ b/tools/perf/util/session.h
@@ -39,6 +39,16 @@ struct perf_session {
 	u64			bytes_transferred;
 	u64			bytes_compressed;
 	struct zstd_data	zstd_data;
+	struct decomp		*decomp;
+	struct decomp		*decomp_last;
+};
+
+struct decomp {
+	struct decomp *next;
+	u64 file_pos;
+	u64 head;
+	size_t size;
+	char data[];
 };
 
 struct perf_tool;
diff --git a/tools/perf/util/tool.h b/tools/perf/util/tool.h
index 250391672f9f..9096a6e3de59 100644
--- a/tools/perf/util/tool.h
+++ b/tools/perf/util/tool.h
@@ -28,6 +28,7 @@ typedef int (*event_attr_op)(struct perf_tool *tool,
 
 typedef int (*event_op2)(struct perf_session *session, union perf_event *event);
 typedef s64 (*event_op3)(struct perf_session *session, union perf_event *event);
+typedef int (*event_op4)(struct perf_session *session, union perf_event *event, u64 data);
 
 typedef int (*event_oe)(struct perf_tool *tool, union perf_event *event,
 			struct ordered_events *oe);
@@ -72,6 +73,7 @@ struct perf_tool {
 			stat,
 			stat_round,
 			feature;
+	event_op4	compressed;
 	event_op3	auxtrace;
 	bool		ordered_events;
 	bool		ordering_requires_timestamps;
diff --git a/tools/perf/util/zstd.c b/tools/perf/util/zstd.c
index 686c3a347dcc..f80736266df9 100644
--- a/tools/perf/util/zstd.c
+++ b/tools/perf/util/zstd.c
@@ -11,6 +11,21 @@ int zstd_init(struct zstd_data *data, int level)
 {
 	size_t ret;
 
+	data->dstream = ZSTD_createDStream();
+	if (data->dstream == NULL) {
+		pr_err("Couldn't create decompression stream.\n");
+		return -1;
+	}
+
+	ret = ZSTD_initDStream(data->dstream);
+	if (ZSTD_isError(ret)) {
+		pr_err("Failed to initialize decompression stream: %s\n", ZSTD_getErrorName(ret));
+		return -1;
+	}
+
+	if (!level)
+		return 0;
+
 	data->cstream = ZSTD_createCStream();
 	if (data->cstream == NULL) {
 		pr_err("Couldn't create compression stream.\n");
@@ -28,6 +43,11 @@ int zstd_init(struct zstd_data *data, int level)
 
 int zstd_fini(struct zstd_data *data)
 {
+	if (data->dstream) {
+		ZSTD_freeDStream(data->dstream);
+		data->dstream = NULL;
+	}
+
 	if (data->cstream) {
 		ZSTD_freeCStream(data->cstream);
 		data->cstream = NULL;
@@ -71,6 +91,27 @@ size_t zstd_compress_stream_to_records(struct zstd_data *data,
 	return compressed;
 }
 
+size_t zstd_decompress_stream(struct zstd_data *data,
+		void *src, size_t src_size, void *dst, size_t dst_size)
+{
+	size_t ret;
+	ZSTD_inBuffer input = { src, src_size, 0 };
+	ZSTD_outBuffer output = { dst, dst_size, 0 };
+
+	while (input.pos < input.size) {
+		ret = ZSTD_decompressStream(data->dstream, &output, &input);
+		if (ZSTD_isError(ret)) {
+			pr_err("failed to decompress (B): %zu -> %zu : %s\n",
+				src_size, output.size, ZSTD_getErrorName(ret));
+			break;
+		}
+		output.dst  = dst + output.pos;
+		output.size = dst_size - output.pos;
+	}
+
+	return output.pos;
+}
+
 #else /* !HAVE_ZSTD_SUPPORT */
 
 int zstd_init(struct zstd_data *data __maybe_unused, int level __maybe_unused)
@@ -92,4 +133,11 @@ size_t zstd_compress_stream_to_records(struct zstd_data *data __maybe_unused,
 	return 0;
 }
 
+size_t zstd_decompress_stream(struct zstd_data *data __maybe_unused, void *src __maybe_unused,
+		size_t src_size __maybe_unused, void *dst __maybe_unused,
+		size_t dst_size __maybe_unused)
+{
+	return 0;
+}
+
 #endif

* [PATCH v3 9/9] perf inject: enable COMPRESSED records decompression
  2019-02-26  5:31 [PATCH v3 0/9] perf: enable compression of record mode trace to save storage space Alexey Budankov
                   ` (6 preceding siblings ...)
  2019-02-26  6:26 ` [PATCH v3 8/9] perf report: implement record trace decompression Alexey Budankov
@ 2019-02-26  6:28 ` Alexey Budankov
  2019-02-26  7:01 ` [PATCH v3 1/9] feature: implement libzstd check, LIBZSTD_DIR and NO_LIBZSTD defines Alexey Budankov
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 14+ messages in thread
From: Alexey Budankov @ 2019-02-26  6:28 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Ingo Molnar, Peter Zijlstra
  Cc: Jiri Olsa, Namhyung Kim, Alexander Shishkin, Andi Kleen, linux-kernel


Initialize the decompression API so that COMPRESSED records are
decompressed into the resulting output data file.

Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
---
 tools/perf/builtin-inject.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/tools/perf/builtin-inject.c b/tools/perf/builtin-inject.c
index 9bb1f35d5cb7..5a5bc4207766 100644
--- a/tools/perf/builtin-inject.c
+++ b/tools/perf/builtin-inject.c
@@ -839,6 +839,9 @@ int cmd_inject(int argc, const char **argv)
 	if (inject.session == NULL)
 		return -1;
 
+	if (zstd_init(&(inject.session->zstd_data), 0) < 0)
+		pr_warning("Decompression initialization failed.\n");
+
 	if (inject.build_ids) {
 		/*
 		 * to make sure the mmap records are ordered correctly
@@ -869,6 +872,7 @@ int cmd_inject(int argc, const char **argv)
 	ret = __cmd_inject(&inject);
 
 out_delete:
+	zstd_fini(&(inject.session->zstd_data));
 	perf_session__delete(inject.session);
 	return ret;
 }

* [PATCH v3 1/9] feature: implement libzstd check, LIBZSTD_DIR and NO_LIBZSTD defines
  2019-02-26  5:31 [PATCH v3 0/9] perf: enable compression of record mode trace to save storage space Alexey Budankov
                   ` (7 preceding siblings ...)
  2019-02-26  6:28 ` [PATCH v3 9/9] perf inject: enable COMPRESSED records decompression Alexey Budankov
@ 2019-02-26  7:01 ` Alexey Budankov
  2019-02-27 14:27 ` [PATCH v3 0/9] perf: enable compression of record mode trace to save storage space Jiri Olsa
  2019-02-27 14:28 ` Jiri Olsa
  10 siblings, 0 replies; 14+ messages in thread
From: Alexey Budankov @ 2019-02-26  7:01 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Ingo Molnar, Peter Zijlstra
  Cc: Jiri Olsa, Namhyung Kim, Alexander Shishkin, Andi Kleen, linux-kernel


Implement the libzstd feature check together with the NO_LIBZSTD and
LIBZSTD_DIR defines, which disable the feature or point the build at
alternative Zstd library sources from the command line:

  $ make -C tools/perf LIBZSTD_DIR=/path/to/zstd/sources/ clean all
  $ make -C tools/perf NO_LIBZSTD=1 clean all

The status of the auto-detected feature is reported just before
compilation starts. If your system has a version of the zstd library
preinstalled, the build system finds it and uses it during the build.

If you prefer to build against another version of the zstd library
that is not preinstalled, point the build at that version using the
LIBZSTD_DIR define.
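
For illustration, the effect of the detected feature on C sources can
be sketched as below. This is a minimal example under assumptions, not
code from the series: trace_compression_available(),
trace_compress_bound() and trace_compression_self_check() are
hypothetical helpers, while HAVE_ZSTD_SUPPORT is the define added to
CFLAGS by this patch and ZSTD_compressBound() is a regular libzstd
call. When the define is absent, for instance with NO_LIBZSTD=1, only
the stub branches are compiled and zstd.h is never included.

```c
#include <assert.h>
#include <stddef.h>

#ifdef HAVE_ZSTD_SUPPORT
#include <zstd.h>
#endif

/* Report whether this build was compiled with -DHAVE_ZSTD_SUPPORT. */
static int trace_compression_available(void)
{
#ifdef HAVE_ZSTD_SUPPORT
	return 1;
#else
	return 0;
#endif
}

/*
 * Worst-case compressed size for src_size input bytes, or 0 when the
 * build has no libzstd and the stub branch is compiled instead.
 */
static size_t trace_compress_bound(size_t src_size)
{
#ifdef HAVE_ZSTD_SUPPORT
	return ZSTD_compressBound(src_size);
#else
	(void)src_size;
	return 0;
#endif
}

/* Sanity check: without libzstd the bound helper must be a no-op stub. */
static void trace_compression_self_check(void)
{
	assert(trace_compression_available() ||
	       trace_compress_bound(4096) == 0);
}
```

Keeping both branches behind one define lets callers stay free of
#ifdef blocks, which is the same approach the stubs in
tools/perf/util/zstd.c take later in this series.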

Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
---
 tools/build/Makefile.feature       |  6 ++++--
 tools/build/feature/Makefile       |  6 +++++-
 tools/build/feature/test-all.c     |  5 +++++
 tools/build/feature/test-libzstd.c | 12 ++++++++++++
 tools/perf/Makefile.config         | 20 ++++++++++++++++++++
 tools/perf/Makefile.perf           |  3 +++
 6 files changed, 49 insertions(+), 3 deletions(-)
 create mode 100644 tools/build/feature/test-libzstd.c

diff --git a/tools/build/Makefile.feature b/tools/build/Makefile.feature
index 61e46d54a67c..adf791cbd726 100644
--- a/tools/build/Makefile.feature
+++ b/tools/build/Makefile.feature
@@ -66,7 +66,8 @@ FEATURE_TESTS_BASIC :=                  \
         sched_getcpu			\
         sdt				\
         setns				\
-        libaio
+        libaio				\
+        libzstd
 
 # FEATURE_TESTS_BASIC + FEATURE_TESTS_EXTRA is the complete list
 # of all feature tests
@@ -118,7 +119,8 @@ FEATURE_DISPLAY ?=              \
          lzma                   \
          get_cpuid              \
          bpf			\
-         libaio
+         libaio			\
+         libzstd
 
 # Set FEATURE_CHECK_(C|LD)FLAGS-all for all FEATURE_TESTS features.
 # If in the future we need per-feature checks/flags for features not
diff --git a/tools/build/feature/Makefile b/tools/build/feature/Makefile
index 7ceb4441b627..4b8244ee65ce 100644
--- a/tools/build/feature/Makefile
+++ b/tools/build/feature/Makefile
@@ -62,7 +62,8 @@ FILES=                                          \
          test-clang.bin				\
          test-llvm.bin				\
          test-llvm-version.bin			\
-         test-libaio.bin
+         test-libaio.bin			\
+         test-libzstd.bin
 
 FILES := $(addprefix $(OUTPUT),$(FILES))
 
@@ -301,6 +302,9 @@ $(OUTPUT)test-clang.bin:
 $(OUTPUT)test-libaio.bin:
 	$(BUILD) -lrt
 
+$(OUTPUT)test-libzstd.bin:
+	$(BUILD) -lzstd
+
 ###############################
 
 clean:
diff --git a/tools/build/feature/test-all.c b/tools/build/feature/test-all.c
index e903b86b742f..b0dda7db2a17 100644
--- a/tools/build/feature/test-all.c
+++ b/tools/build/feature/test-all.c
@@ -178,6 +178,10 @@
 # include "test-reallocarray.c"
 #undef main
 
+#define main main_test_libzstd
+# include "test-libzstd.c"
+#undef main
+
 int main(int argc, char *argv[])
 {
 	main_test_libpython();
@@ -219,6 +223,7 @@ int main(int argc, char *argv[])
 	main_test_setns();
 	main_test_libaio();
 	main_test_reallocarray();
+	main_test_libzstd();
 
 	return 0;
 }
diff --git a/tools/build/feature/test-libzstd.c b/tools/build/feature/test-libzstd.c
new file mode 100644
index 000000000000..55268c01b84d
--- /dev/null
+++ b/tools/build/feature/test-libzstd.c
@@ -0,0 +1,12 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <zstd.h>
+
+int main(void)
+{
+	ZSTD_CStream	*cstream;
+
+	cstream = ZSTD_createCStream();
+	ZSTD_freeCStream(cstream);
+
+	return 0;
+}
diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config
index 0f11d5891301..4949bdb16a66 100644
--- a/tools/perf/Makefile.config
+++ b/tools/perf/Makefile.config
@@ -152,6 +152,13 @@ endif
 FEATURE_CHECK_CFLAGS-libbabeltrace := $(LIBBABELTRACE_CFLAGS)
 FEATURE_CHECK_LDFLAGS-libbabeltrace := $(LIBBABELTRACE_LDFLAGS) -lbabeltrace-ctf
 
+ifdef LIBZSTD_DIR
+  LIBZSTD_CFLAGS  := -I$(LIBZSTD_DIR)/lib
+  LIBZSTD_LDFLAGS := -L$(LIBZSTD_DIR)/lib
+endif
+FEATURE_CHECK_CFLAGS-libzstd := $(LIBZSTD_CFLAGS)
+FEATURE_CHECK_LDFLAGS-libzstd := $(LIBZSTD_LDFLAGS)
+
 FEATURE_CHECK_CFLAGS-bpf = -I. -I$(srctree)/tools/include -I$(srctree)/tools/arch/$(SRCARCH)/include/uapi -I$(srctree)/tools/include/uapi
 # include ARCH specific config
 -include $(src-perf)/arch/$(SRCARCH)/Makefile
@@ -782,6 +789,19 @@ ifndef NO_LZMA
   endif
 endif
 
+ifndef NO_LIBZSTD
+  ifeq ($(feature-libzstd), 1)
+    CFLAGS += -DHAVE_ZSTD_SUPPORT
+    CFLAGS += $(LIBZSTD_CFLAGS)
+    LDFLAGS += $(LIBZSTD_LDFLAGS)
+    EXTLIBS += -lzstd
+    $(call detected,CONFIG_ZSTD)
+  else
+    msg := $(warning No libzstd found, disables trace compression, please install libzstd-dev[el] and/or set LIBZSTD_DIR);
+    NO_LIBZSTD := 1
+  endif
+endif
+
 ifndef NO_BACKTRACE
   ifeq ($(feature-backtrace), 1)
     CFLAGS += -DHAVE_BACKTRACE_SUPPORT
diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index 01f7555fd933..06b927ee6ee3 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -108,6 +108,9 @@ include ../scripts/utilities.mak
 # streaming for record mode. Currently Posix AIO trace streaming is
 # supported only when linking with glibc.
 #
+# Define NO_LIBZSTD if you do not want support of Zstandard based runtime
+# trace compression in record mode.
+#
 
 # As per kernel Makefile, avoid funny character set dependencies
 unexport LC_ALL

* Re: [PATCH v3 0/9] perf: enable compression of record mode trace to save storage space
  2019-02-26  5:31 [PATCH v3 0/9] perf: enable compression of record mode trace to save storage space Alexey Budankov
                   ` (8 preceding siblings ...)
  2019-02-26  7:01 ` [PATCH v3 1/9] feature: implement libzstd check, LIBZSTD_DIR and NO_LIBZSTD defines Alexey Budankov
@ 2019-02-27 14:27 ` Jiri Olsa
  2019-02-27 14:56   ` Alexey Budankov
  2019-02-27 14:28 ` Jiri Olsa
  10 siblings, 1 reply; 14+ messages in thread
From: Jiri Olsa @ 2019-02-27 14:27 UTC (permalink / raw)
  To: Alexey Budankov
  Cc: Arnaldo Carvalho de Melo, Ingo Molnar, Peter Zijlstra,
	Namhyung Kim, Alexander Shishkin, Andi Kleen, linux-kernel

On Tue, Feb 26, 2019 at 08:31:38AM +0300, Alexey Budankov wrote:
> 
> The patch set implements runtime trace compression (-z option) in 
> record mode and trace auto decompression in report and inject modes. 
> Streaming Zstandard (Zstd) API (zstd) is used for compression and 
> decompression of data that come from kernel mmaped data buffers.
> 
> Usage of implemented -z,--compression_level=n option provides ~3-5x 
> avg. trace file size reduction on variety of tested workloads what 
> saves storage space on larger server systems where trace file size 
> can easily reach several tens or even hundreds of GiBs, especially 
> when profiling with dwarf-based stacks and tracing of context switches.
> Implemented -f,--mmap-flush option can be used to avoid compressing 
> every single byte of data and increase compression ratio at the same 
> time lowering tool runtime overhead. Default option value is 1 what 
> is equal to the current perf record implementation. The option is 
> independent from -z setting and doesn't vary with compression level:
> 
>   $ tools/perf/perf record -z 1 -e cycles -- matrix.gcc
>   $ tools/perf/perf record --aio=1 -z 1 -e cycles -- matrix.gcc
>   $ tools/perf/perf record -z 1 -f 1024 -e cycles -- matrix.gcc
>   $ tools/perf/perf record --aio=1 -z 1 -f 1024 -e cycles -- matrix.gcc
> 
> Runtime compression overhead has been measured for serial and AIO 
> trace writing modes when profiling matrix multiplication workload 
> with the following results:
> 
>     -------------------------------------------------------------
>     | SERIAL			  | AIO-1                       |
> -----------------------------------------------------------------
> |-z | OVH(x) | ratio(x) size(MiB) | OVH(x) | ratio(x) size(MiB) |
> |----------------------------------------------------------------
> | 0 | 1,00   | 1,000    179,424   | 1,00   | 1,000    187,527   |
> | 1 | 1,04   | 8,427    181,148   | 1,01   | 8,474    188,562   |
> | 2 | 1,07   | 8,055    186,953   | 1,03   | 7,912    191,773   |
> | 3 | 1,04   | 8,283    181,908   | 1,03   | 8,220    191,078   |
> | 5 | 1,09   | 8,101    187,705   | 1,05   | 7,780    190,065   |
> | 8 | 1,05   | 9,217    179,191   | 1,12   | 6,111    193,024   |
> -----------------------------------------------------------------
> 
> OVH = (Execution time with -z N) / (Execution time with -z 0)
> 
> ratio - compression ratio
> size  - number of bytes that was compressed
> 
> size ~= trace file x ratio
> 
> See complete description of measurement conditions and details below.
> 
> Introduced compression functionality can be disabled or configured from 
> the command line using NO_LIBZSTD and LIBZSTD_DIR defines:
> 
>   $ make -C tools/perf NO_LIBZSTD=1 clean all
>   $ make -C tools/perf LIBZSTD_DIR=/path/to/zstd/sources/ clean all
> 
> If your system has some version of the zstd package preinstalled then 
> the build system finds and uses it during the build. Auto detection 
> feature status is reported just before compilation starts, as usual.
> If you still prefer to compile with some version of zstd that is not 
> preinstalled you have capability to refer the compilation to that 
> version using LIBZSTD_DIR define.
> 
> See 'tools/perf/perf test' run results below.
> 
> ---
> Alexey Budankov (9):
>   feature: implement libzstd check, LIBZSTD_DIR and NO_LIBZSTD defines
>   perf record: implement -f,--mmap-flush=<threshold> option
>   perf session: define bytes_transferred and bytes_compressed metrics
>   perf record: implement COMPRESSED event record and its attributes
>   perf mmap: implement dedicated memory buffer for data compression
>   perf util: introduce Zstd based streaming compression API
>   perf record: implement -z,--compression_level=n option and compression
>   perf report: implement record trace decompression
>   perf inject: enable COMPRESSED records decompression

what commit id is this post based on? I can't get it applied:

Applying: feature: implement libzstd check, LIBZSTD_DIR and NO_LIBZSTD defines
Applying: perf record: implement -f,--mmap-flush=<threshold> option
error: corrupt patch at line 276
Patch failed at 0002 perf record: implement -f,--mmap-flush=<threshold> option
Use 'git am --show-current-patch' to see the failed patch
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".


jirka

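
For reference, the relation `size ~= trace file x ratio` from the results table quoted above can be turned around to estimate how large the resulting trace file is for each -z level. This is only a minimal sketch: it assumes the table writes decimals with a comma (as the OVH column suggests) and that sizes are in MiB as the column header states. The helper names are illustrative and are not part of perf.

```python
# Rows copied from the SERIAL columns of the quoted table:
# (-z level, compression ratio, number of MiB that was compressed).
# The table uses a decimal comma, so values are kept as strings and parsed.
SERIAL_ROWS = [
    ("0", "1,000", "179,424"),
    ("1", "8,427", "181,148"),
    ("2", "8,055", "186,953"),
    ("3", "8,283", "181,908"),
    ("5", "8,101", "187,705"),
    ("8", "9,217", "179,191"),
]


def parse_decimal(text):
    """Parse a number written with a decimal comma, e.g. '8,427' -> 8.427."""
    return float(text.replace(",", "."))


def estimated_trace_size(ratio, size_mib):
    """Invert 'size ~= trace file x ratio' to estimate the trace file size in MiB."""
    return size_mib / ratio


def summarize(rows):
    """Return {level: estimated trace file size in MiB} for each table row."""
    return {
        level: estimated_trace_size(parse_decimal(ratio), parse_decimal(size))
        for level, ratio, size in rows
    }


if __name__ == "__main__":
    for level, trace_mib in summarize(SERIAL_ROWS).items():
        print(f"-z {level}: ~{trace_mib:.1f} MiB on disk")
```

With -z 0 the estimate equals the size column, since the ratio is 1; for the other levels it is the size column divided by the reported ratio.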

* Re: [PATCH v3 0/9] perf: enable compression of record mode trace to save storage space
  2019-02-26  5:31 [PATCH v3 0/9] perf: enable compression of record mode trace to save storage space Alexey Budankov
                   ` (9 preceding siblings ...)
  2019-02-27 14:27 ` [PATCH v3 0/9] perf: enable compression of record mode trace to save storage space Jiri Olsa
@ 2019-02-27 14:28 ` Jiri Olsa
  2019-02-27 15:40   ` Alexey Budankov
  10 siblings, 1 reply; 14+ messages in thread
From: Jiri Olsa @ 2019-02-27 14:28 UTC (permalink / raw)
  To: Alexey Budankov
  Cc: Arnaldo Carvalho de Melo, Ingo Molnar, Peter Zijlstra,
	Namhyung Kim, Alexander Shishkin, Andi Kleen, linux-kernel

On Tue, Feb 26, 2019 at 08:31:38AM +0300, Alexey Budankov wrote:

SNIP

> TESTING:
> 
> tools/perf/perf test
>  1: vmlinux symtab matches kallsyms                       : Skip
>  2: Detect openat syscall event                           : Ok
>  3: Detect openat syscall event on all cpus               : Ok
>  4: Read samples using the mmap interface                 : Ok
>  5: Test data source output                               : Ok
>  6: Parse event definition strings                        : Ok
>  7: Simple expression parser                              : Ok
>  8: PERF_RECORD_* events & perf_sample fields             : Ok
>  9: Parse perf pmu format                                 : Ok
> 10: DSO data read                                         : Ok
> 11: DSO data cache                                        : Ok
> 12: DSO data reopen                                       : Ok
> 13: Roundtrip evsel->name                                 : Ok
> 14: Parse sched tracepoints fields                        : Ok
> 15: syscalls:sys_enter_openat event fields                : Ok
> 16: Setup struct perf_event_attr                          : Skip
> 17: Match and link multiple hists                         : Ok
> 18: 'import perf' in python                               : FAILED!

Arnaldo's perf/core is passing for me in here:

[jolsa@krava perf]$ ./perf test 16 48
16: Setup struct perf_event_attr                          : Ok
48: Synthesize attr update                                : Ok

jirka


* Re: [PATCH v3 0/9] perf: enable compression of record mode trace to save storage space
  2019-02-27 14:27 ` [PATCH v3 0/9] perf: enable compression of record mode trace to save storage space Jiri Olsa
@ 2019-02-27 14:56   ` Alexey Budankov
  0 siblings, 0 replies; 14+ messages in thread
From: Alexey Budankov @ 2019-02-27 14:56 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Arnaldo Carvalho de Melo, Ingo Molnar, Peter Zijlstra,
	Namhyung Kim, Alexander Shishkin, Andi Kleen, linux-kernel


On 27.02.2019 17:27, Jiri Olsa wrote:
> On Tue, Feb 26, 2019 at 08:31:38AM +0300, Alexey Budankov wrote:
>>
>> The patch set implements runtime trace compression (-z option) in 
>> record mode and trace auto decompression in report and inject modes. 
>> Streaming Zstandard (Zstd) API (zstd) is used for compression and 
>> decompression of data that come from kernel mmaped data buffers.
>>
>> Usage of implemented -z,--compression_level=n option provides ~3-5x 
>> avg. trace file size reduction on variety of tested workloads what 
>> saves storage space on larger server systems where trace file size 
>> can easily reach several tens or even hundreds of GiBs, especially 
>> when profiling with dwarf-based stacks and tracing of context switches.
>> Implemented -f,--mmap-flush option can be used to avoid compressing 
>> every single byte of data and increase compression ratio at the same 
>> time lowering tool runtime overhead. Default option value is 1 what 
>> is equal to the current perf record implementation. The option is 
>> independent from -z setting and doesn't vary with compression level:
>>
>>   $ tools/perf/perf record -z 1 -e cycles -- matrix.gcc
>>   $ tools/perf/perf record --aio=1 -z 1 -e cycles -- matrix.gcc
>>   $ tools/perf/perf record -z 1 -f 1024 -e cycles -- matrix.gcc
>>   $ tools/perf/perf record --aio=1 -z 1 -f 1024 -e cycles -- matrix.gcc
>>
>> Runtime compression overhead has been measured for serial and AIO 
>> trace writing modes when profiling matrix multiplication workload 
>> with the following results:
>>
>>     -------------------------------------------------------------
>>     | SERIAL			  | AIO-1                       |
>> -----------------------------------------------------------------
>> |-z | OVH(x) | ratio(x) size(MiB) | OVH(x) | ratio(x) size(MiB) |
>> |----------------------------------------------------------------
>> | 0 | 1,00   | 1,000    179,424   | 1,00   | 1,000    187,527   |
>> | 1 | 1,04   | 8,427    181,148   | 1,01   | 8,474    188,562   |
>> | 2 | 1,07   | 8,055    186,953   | 1,03   | 7,912    191,773   |
>> | 3 | 1,04   | 8,283    181,908   | 1,03   | 8,220    191,078   |
>> | 5 | 1,09   | 8,101    187,705   | 1,05   | 7,780    190,065   |
>> | 8 | 1,05   | 9,217    179,191   | 1,12   | 6,111    193,024   |
>> -----------------------------------------------------------------
>>
>> OVH = (Execution time with -z N) / (Execution time with -z 0)
>>
>> ratio - compression ratio
>> size  - number of bytes that was compressed
>>
>> size ~= trace file x ratio
>>
>> See complete description of measurement conditions and details below.
>>
>> Introduced compression functionality can be disabled or configured from 
>> the command line using NO_LIBZSTD and LIBZSTD_DIR defines:
>>
>>   $ make -C tools/perf NO_LIBZSTD=1 clean all
>>   $ make -C tools/perf LIBZSTD_DIR=/path/to/zstd/sources/ clean all
>>
>> If your system has some version of the zstd package preinstalled then 
>> the build system finds and uses it during the build. Auto detection 
>> feature status is reported just before compilation starts, as usual.
>> If you still prefer to compile with some version of zstd that is not 
>> preinstalled you have capability to refer the compilation to that 
>> version using LIBZSTD_DIR define.
>>
>> See 'tools/perf/perf test' run results below.
>>
>> ---
>> Alexey Budankov (9):
>>   feature: implement libzstd check, LIBZSTD_DIR and NO_LIBZSTD defines
>>   perf record: implement -f,--mmap-flush=<threshold> option
>>   perf session: define bytes_transferred and bytes_compressed metrics
>>   perf record: implement COMPRESSED event record and its attributes
>>   perf mmap: implement dedicated memory buffer for data compression
>>   perf util: introduce Zstd based streaming compression API
>>   perf record: implement -z,--compression_level=n option and compression
>>   perf report: implement record trace decompression
>>   perf inject: enable COMPRESSED records decompression
> 
> what commit id is this post based on? I can't get it applied:

I think it is the one marked below:

ca9d246ede355946273271c37bf2e9e9fe533212 (HEAD -> zstd.V3) perf inject: enable COMPRESSED records decompression
4a71d4b79f5c1793e4d28851ff0ab11e1b0b2fd9 perf report: implement record trace decompression
5c2ba4e117cfacc415fbeac8e26484cfb80070b7 perf record: implement -z,--compression_level=n option and compression
9634ec18b3090daf5b0137a9aa7dd87552245a83 perf util: introduce Zstd based streaming compression API
bc692fb6a1a00e005592233a4ced1198e9d0a78d perf mmap: implement dedicated memory buffer for data compression
9ef63f64522f51bdba9e311303dd8bd73ce88f95 perf record: implement COMPRESSED event record and its attributes
8b470b5b6668532a679cecceeca2e56c1c71b8e2 perf session: define bytes_transferred and bytes_compressed metrics
1666675103895646f78b5f30c9bab77dfbfb91a1 perf record: implement -f,--mmap-flush=<threshold> option
3347a6774f85bcb06a04c95c188eb7f8e99811e1 feature: implement libzstd check, LIBZSTD_DIR and NO_LIBZSTD defines
306b805b097d42e66b0cdae19f97e1d1fa60909e (perf/core) Merge branch 'perf/core' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
=> c25b9ecf016e7824096611d4b8b0c6c271205316 (origin/perf/core) perf script python: add Python3 support to syscall-counts-by-pid.py
c6e562de9f34fd4df2833a5ef4cf59948385e816 perf script python: add Python3 support to syscall-counts.py
b3b4ccaa756984bc81ed7436e55c6fca5d02f00d perf script python: add Python3 support to stat-cpi.py

Thanks,
Alexey

> 
> Applying: feature: implement libzstd check, LIBZSTD_DIR and NO_LIBZSTD defines
> Applying: perf record: implement -f,--mmap-flush=<threshold> option
> error: corrupt patch at line 276
> Patch failed at 0002 perf record: implement -f,--mmap-flush=<threshold> option
> Use 'git am --show-current-patch' to see the failed patch
> When you have resolved this problem, run "git am --continue".
> If you prefer to skip this patch, run "git am --skip" instead.
> To restore the original branch and stop patching, run "git am --abort".
> 
> 
> jirka
> 


* Re: [PATCH v3 0/9] perf: enable compression of record mode trace to save storage space
  2019-02-27 14:28 ` Jiri Olsa
@ 2019-02-27 15:40   ` Alexey Budankov
  0 siblings, 0 replies; 14+ messages in thread
From: Alexey Budankov @ 2019-02-27 15:40 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Arnaldo Carvalho de Melo, Ingo Molnar, Peter Zijlstra,
	Namhyung Kim, Alexander Shishkin, Andi Kleen, linux-kernel


On 27.02.2019 17:28, Jiri Olsa wrote:
> On Tue, Feb 26, 2019 at 08:31:38AM +0300, Alexey Budankov wrote:
> 
> SNIP
> 
>> TESTING:
>>
>> tools/perf/perf test
>>  1: vmlinux symtab matches kallsyms                       : Skip
>>  2: Detect openat syscall event                           : Ok
>>  3: Detect openat syscall event on all cpus               : Ok
>>  4: Read samples using the mmap interface                 : Ok
>>  5: Test data source output                               : Ok
>>  6: Parse event definition strings                        : Ok
>>  7: Simple expression parser                              : Ok
>>  8: PERF_RECORD_* events & perf_sample fields             : Ok
>>  9: Parse perf pmu format                                 : Ok
>> 10: DSO data read                                         : Ok
>> 11: DSO data cache                                        : Ok
>> 12: DSO data reopen                                       : Ok
>> 13: Roundtrip evsel->name                                 : Ok
>> 14: Parse sched tracepoints fields                        : Ok
>> 15: syscalls:sys_enter_openat event fields                : Ok
>> 16: Setup struct perf_event_attr                          : Skip
>> 17: Match and link multiple hists                         : Ok
>> 18: 'import perf' in python                               : FAILED!
> 
> Arnaldo's perf/core is passing for me in here:
> 
> [jolsa@krava perf]$ ./perf test 16 48
> 16: Setup struct perf_event_attr                          : Ok
> 48: Synthesize attr update                                : Ok

It passes for me too when I run it from the tools/perf/ directory; from the top of the tree, test 16 is skipped:

[root@nntvtune39 acme]# pwd && tools/perf/perf test 16 48
/root/abudanko/kernel/acme
16: Setup struct perf_event_attr                          : Skip
48: Synthesize attr update                                : Ok
[root@nntvtune39 acme]# cd tools/perf/
[root@nntvtune39 perf]# pwd && tools/perf/perf test 16 48
[root@nntvtune39 perf]# pwd && perf test 16 48
/root/abudanko/kernel/acme/tools/perf
16: Setup struct perf_event_attr               : Ok
48: Synthesize attr update                     : Ok

Thanks,
Alexey

> 
> jirka
> 


end of thread

Thread overview: 14+ messages
2019-02-26  5:31 [PATCH v3 0/9] perf: enable compression of record mode trace to save storage space Alexey Budankov
2019-02-26  5:55 ` [PATCH v3 2/9] perf record: implement -f,--mmap-flush=<threshold> option Alexey Budankov
2019-02-26  5:57 ` [PATCH v3 3/9] perf session: define bytes_transferred and bytes_compressed metrics Alexey Budankov
2019-02-26  5:59 ` [PATCH v3 4/9] perf record: implement COMPRESSED event record and its attributes Alexey Budankov
2019-02-26  6:03 ` [PATCH v3 5/9] perf mmap: implement dedicated memory buffer for data compression Alexey Budankov
2019-02-26  6:10 ` [PATCH v3 6/9] perf util: introduce Zstd based streaming compression API Alexey Budankov
2019-02-26  6:20 ` [PATCH v3 7/9] perf record: implement -z,--compression_level=n option and compression Alexey Budankov
2019-02-26  6:26 ` [PATCH v3 8/9] perf report: implement record trace decompression Alexey Budankov
2019-02-26  6:28 ` [PATCH v3 9/9] perf inject: enable COMPRESSED records decompression Alexey Budankov
2019-02-26  7:01 ` [PATCH v3 1/9] feature: implement libzstd check, LIBZSTD_DIR and NO_LIBZSTD defines Alexey Budankov
2019-02-27 14:27 ` [PATCH v3 0/9] perf: enable compression of record mode trace to save storage space Jiri Olsa
2019-02-27 14:56   ` Alexey Budankov
2019-02-27 14:28 ` Jiri Olsa
2019-02-27 15:40   ` Alexey Budankov
