From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1753322AbbK3JCx (ORCPT ); Mon, 30 Nov 2015 04:02:53 -0500
Received: from mail-pa0-f48.google.com ([209.85.220.48]:33362 "EHLO
        mail-pa0-f48.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1751816AbbK3JCr (ORCPT ); Mon, 30 Nov 2015 04:02:47 -0500
From: Stephane Eranian
To: linux-kernel@vger.kernel.org
Cc: acme@redhat.com, peterz@infradead.org, mingo@elte.hu, ak@linux.intel.com,
        jolsa@redhat.com, namhyung@kernel.org, cel@us.ibm.com,
        sukadev@linux.vnet.ibm.com, sonnyrao@chromium.org,
        johnmccutchan@google.com, dsahern@gmail.com, adrian.hunter@intel.com,
        pawel.moll@arm.com
Subject: [PATCH v8 0/4] perf: add support for profiling jitted code
Date: Mon, 30 Nov 2015 10:02:19 +0100
Message-Id: <1448874143-7269-1-git-send-email-eranian@google.com>
X-Mailer: git-send-email 1.9.1
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

This patch series extends perf record/report/annotate to enable
profiling of jitted (just-in-time compiled) code.

The current perf tool provides very limited support for profiling jitted
code for some runtime environments. But the support is experimental and
cannot be used in complex environments. It relies on files in /tmp, for
instance. It does not support annotate mode or rejitted code.
This patch series adds a better way of profiling jitted code with the
following advantages:
 - support any jitted code environment (some with modifications)
 - support Java runtime with JVMTI interface with no modifications
 - provides a portable JVMTI agent library
 - known to support V8 runtime
 - known to support DART runtime
 - supports code rejitting and code movements
 - no files in /tmp
 - meta-data file is unique to each run
 - no changes to perf report/annotate
 - support per-thread and system-wide profiling
 - support monitoring of multiple simultaneous Jit runtimes
 - source level view in perf annotate
 - works on x86_64, i386, arm32, arm64

The support is based on cooperation with the runtime. For Java runtimes
supporting the JVMTI interface, no change is necessary. For other
runtimes, modifications are necessary to emit the meta-data to support
symbolization, annotation, and source line correlation of the samples.
Those modifications are relatively straightforward; some have been
implemented in V8 and DART.

The jit environment emits a binary dump file which contains the jitted
code (in raw format) and meta-data describing the mapping of functions.
The binary format is documented in the jitdump.h header file. It is
adapted from the OProfile jitdump format.

To enable synchronization of the runtime MMAPs with those recorded by
the kernel on behalf of the perf tool, the runtime needs to timestamp
every record in the dump file using the same time source. This is
possible since Linux 4.1, where the kernel supports a per-event
timestamp clock source. In the case of the JVMTI agent, the clock used
is CLOCK_MONOTONIC, thus perf record is invoked with -k mono such that
it matches the agent.

The current support only works when the runtime is monitored from start
to finish:

  perf record java --agentpath:libpfmjvmti.so my_class

Once the run is completed, the jitdump file needs to be injected into
the perf.data file. This is accomplished by using the perf inject
command.
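
To illustrate why a common time source matters, here is a minimal Python
sketch (not the agent's actual code). It stamps records with
CLOCK_MONOTONIC and interleaves two record streams by timestamp, which
is only meaningful because both streams read the same clock. The helper
names (now_ns, merge_by_time), the record tuples and the code_load_N and
sample_N labels are hypothetical, the "kernel" stream is simulated in
the same process, and none of this reflects the jitdump binary layout.

```python
import heapq
import time


def now_ns():
    """Read CLOCK_MONOTONIC in nanoseconds (the clock selected by -k mono)."""
    return time.clock_gettime_ns(time.CLOCK_MONOTONIC)


def merge_by_time(runtime_records, kernel_records):
    """Interleave two timestamp-ordered streams into one ordered list.

    Each record is a hypothetical (timestamp_ns, origin, label) tuple.
    """
    return list(heapq.merge(runtime_records, kernel_records,
                            key=lambda rec: rec[0]))


# Simulate a runtime emitting code-load records while samples are taken.
runtime_records = []
kernel_records = []
for i in range(3):
    runtime_records.append((now_ns(), "jit", f"code_load_{i}"))
    kernel_records.append((now_ns(), "kernel", f"sample_{i}"))

merged = merge_by_time(runtime_records, kernel_records)
for timestamp, origin, label in merged:
    print(timestamp, origin, label)
```

If the two streams used different clocks, the merged order would be
arbitrary and a sample could be attributed to the wrong version of a
rejitted function.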
perf inject will also generate an ELF image for each jitted function.
The injected MMAP records will point to these ELF images. The reasoning
behind using ELF images is that it makes processing for perf report and
annotate automatic and transparent. It also makes it easier to package
and analyze on a remote machine. Binutils tools can decode the ELF
images easily.

The reporting is unchanged: simply invoke perf report or perf annotate
on the modified perf.data file. The jitted code will appear symbolized
and the assembly view will display the instruction level profile and
source level profile.

As an added bonus, the series includes support for demangling function
signatures from OpenJDK.

Furthermore, we believe there is a way to skip the perf inject phase and
have perf report/annotate directly inject the MMAP records on the fly
during processing of the perf.data file. Perf report would also generate
the ELF files if necessary. Such an optimization would make using this
extension seamless in system-wide mode and larger environments. This
will be added in a later update as well.

In V2, we switched to Pawel Moll and David Ahern's posix clock kernel
module instead. We dropped the patch which modified the arguments to
map_init() because the change was not used. We are not printing the
return type of Java methods anymore and have made the Java demangler a
separate module. We also rebased to 3.19.0+ from tip.git.

In V3, we switched to Pawel Moll's CLOCK_MONOTONIC perf clock patches.
Those patches switch perf_events from sched_clock to CLOCK_MONOTONIC, a
clock source which is available to users.

In V4, we rebased to 4.0-rc5. We also simplified the process by getting
rid of the requirement to pass the jitdump file name to perf inject.
Now, perf inject automatically detects if jitdumps were generated and
merges the relevant meta-data. This is accomplished by having the jit
runtime mmap the jitdump file for the purpose of creating a MMAP record
in the perf.data file.
That MMAP contains all the info to locate the jitdump file and generate
the ELF images for jitted functions.

In V5, we rebased to acme's perf/core branch (instead of tip.git). We
fixed some bswap issues, switched to using scnprintf() and fixed
formatting issues. We also made sure all the files were included in the
patches and fixed one error message in the JVMTI agent.

In V6, we switched back to using tip.git to leverage PeterZ's clockid
patch for perf_events in 4.0.0-rc6. Clock sources can now be specified
per event and they are connected with the MONOTONIC Posix clock. We
leverage this extension to timestamp samples in the jit runtime and
correlate them with perf samples. Notice the -k mono option in the perf
record example below.

In V7, we rebased to 4.3.0-rc3 using tip.git (at commit 0dc7757). We
fixed several issues in the agent. We also added source line information
in the jitdump file from the JVMTI agent. This is still experimental and
probably has some issues. The source line info is encoded in DWARF2
format in each ELF image. The code to do this is leveraged from Oprofile
with some fixes and cleanups.

In V8, we rebased to 4.4.0-rc2 using tip.git (at commit 9962da9). We
received great contributions from Adrian Hunter
(adrian.hunter@intel.com). He has fixed several issues in the jitdump
injection code, see the changelog of the patch. The jitdump header has a
new flags field to be used for Intel PT. The series has been verified to
work on x86_64, i386, arm32 and arm64 running 4.1 or later.

To use the new feature:
 - need to run with 4.1 or later
 - compile perf
 - cd tools/perf/jvmti; make; install wherever is appropriate

Example using openJDK:

 $ perf record -k mono java -agentpath:libjvmti.so my_class
 $ perf inject -i perf.data --jit -o perf.data.jitted
 $ perf report -i perf.data.jitted

Thanks to all the contributors and testers.
Special thanks to PeterZ for adding the clock source to perf_events and
solving the problem of common timesource for user and kernel level
samples.

Thanks to the Oprofile authors for the DWARF2 source line code
generation.

Special thanks to Adrian Hunter for his many bug fixes and improvements
for the V8 version of this series.

Enjoy,

Stephane Eranian (4):
  perf tools: add Java demangling support
  perf inject: add jitdump mmap injection support
  perf tools: add JVMTI agent library
  perf/jit: add source line info support

 tools/build/Makefile.feature             |   2 +
 tools/build/feature/Makefile             |   4 +
 tools/build/feature/test-all.c           |   5 +
 tools/build/feature/test-libcrypto.c     |  17 +
 tools/perf/Documentation/perf-inject.txt |   7 +
 tools/perf/builtin-inject.c              |  93 +++++
 tools/perf/config/Makefile               |  11 +
 tools/perf/jvmti/Makefile                |  73 ++++
 tools/perf/jvmti/jvmti_agent.c           | 465 +++++++++++++++++++++
 tools/perf/jvmti/jvmti_agent.h           |  36 ++
 tools/perf/jvmti/libjvmti.c              | 304 ++++++++++++++
 tools/perf/util/Build                    |   7 +
 tools/perf/util/demangle-java.c          | 199 +++++++++
 tools/perf/util/demangle-java.h          |  10 +
 tools/perf/util/genelf.c                 | 449 +++++++++++++++++++++
 tools/perf/util/genelf.h                 |  67 +++
 tools/perf/util/genelf_debug.c           | 610 ++++++++++++++++++++++++++++
 tools/perf/util/jit.h                    |  15 +
 tools/perf/util/jitdump.c                | 672 +++++++++++++++++++++++++++++++
 tools/perf/util/jitdump.h                | 124 ++++++
 tools/perf/util/symbol-elf.c             |   3 +
 21 files changed, 3173 insertions(+)
 create mode 100644 tools/build/feature/test-libcrypto.c
 create mode 100644 tools/perf/jvmti/Makefile
 create mode 100644 tools/perf/jvmti/jvmti_agent.c
 create mode 100644 tools/perf/jvmti/jvmti_agent.h
 create mode 100644 tools/perf/jvmti/libjvmti.c
 create mode 100644 tools/perf/util/demangle-java.c
 create mode 100644 tools/perf/util/demangle-java.h
 create mode 100644 tools/perf/util/genelf.c
 create mode 100644 tools/perf/util/genelf.h
 create mode 100644 tools/perf/util/genelf_debug.c
 create mode 100644 tools/perf/util/jit.h
 create mode 100644 tools/perf/util/jitdump.c
 create mode 100644 tools/perf/util/jitdump.h

-- 
1.9.1