From: Alexander Shishkin <alexander.shishkin@linux.intel.com>
To: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ingo Molnar <mingo@redhat.com>,
	linux-kernel@vger.kernel.org, acme@redhat.com,
	kirill.shutemov@linux.intel.com, Borislav Petkov <bp@alien8.de>,
	rric@kernel.org,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>
Subject: [RFC PATCH 00/17] perf: Detached events
Date: Tue,  5 Sep 2017 16:30:09 +0300
Message-ID: <20170905133026.13689-1-alexander.shishkin@linux.intel.com>

Hi,

I'm going to keep this short.

Objective: include perf data (specifically, AUX/Intel PT) in process core
dumps.

Obstacles and how this patchset deals with them:
(1) Need to be able to have perf events running without consumer (perf
record) running in the background.
Detached events: a new flag to the perf_event_open() syscall makes a
'detached' event, which continues to exist after its file descriptor is
released. Not all detached events are per-thread AUX events: this also tries
to take into account the need for system-wide persistent events.

(2) Need to be able to kill those events, so they need to be accessible
after they are created.
Event files: detached events exist as files in tracefs (at the moment) and
can be opened/mmapped/read/removed.

(3) Ring buffer contents from these events need to end up in the core dump
file.
Injecting the perf ring buffer into the target task's address space, so that
it is part of what gets dumped for that task.

(4) Inheritance will have to allocate ring buffers for such events for this
feature to be useful.
A parentless detached event is created (with a ring buffer) upon
inheritance; there is no output redirection, so each event has its own ring
buffer.

(5) A side effect of (4) is that we can't use GFP_KERNEL pages for such ring
buffers, or else we'll have to fail inherit_event() (and, therefore, the
user's fork()) when they exhaust their mlock limit.
Using shmemfs-backed pages for such a ring buffer and only pinning them
while the corresponding target task is running. At other times these pages
can be swapped out.

(6) Ring buffer memory accounting needs to take this new arrangement into
account: one user can use up at most NR_CPUS * buffer_size memory at any
given point in time.
Only account the first such event and undo the accounting when the last
event is gone.
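
The "first in, last out" accounting in (6) can be sketched roughly as below.
This is only an illustration, not code from the patchset: struct user_acct,
account_event() and unaccount_event() are hypothetical names, and the sketch
assumes that the amount charged is the worst-case NR_CPUS * buffer_size bound
and that all of a user's events use the same buffer size.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-user bookkeeping; not the patchset's actual structure. */
struct user_acct {
	unsigned int nr_events;    /* live shmem-backed detached events */
	size_t       locked_bytes; /* bytes currently charged to the user */
};

/*
 * Charge the user only when their first such event appears.
 * Assumption: the charge is the worst-case bound nr_cpus * buffer_size.
 */
static void account_event(struct user_acct *u, size_t nr_cpus,
			  size_t buffer_size)
{
	if (u->nr_events++ == 0)
		u->locked_bytes += nr_cpus * buffer_size;
}

/* Undo the charge only when the user's last such event goes away. */
static void unaccount_event(struct user_acct *u, size_t nr_cpus,
			    size_t buffer_size)
{
	assert(u->nr_events > 0);
	if (--u->nr_events == 0)
		u->locked_bytes -= nr_cpus * buffer_size;
}
```

With this scheme, creating any number of events after the first one leaves
locked_bytes unchanged, so inheritance on fork() never has to fail on the
mlock limit once the first event has been admitted.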

(7) We'll also need to supply all the things that the [PT] decoder normally
finds out via sysfs attributes, like clock ratios, capabilities, etc., so
that they also find their way into the core dump file.
A "PMU info" structure is appended to the user page.

I've also hacked the perf tool to support all this; all these things can be
found at [1]. I'm not posting the tooling patches, though, as they are
thoroughly ugly and proof-of-concept. In short, perf record will create
detached events with '--detached' and afterwards will open detached events
via their path in tracefs.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/ash/linux.git/log/?h=perf-detached-shmem-wip

Alexander Shishkin (17):
  perf: Allow mmapping only user page
  perf: Factor out mlock accounting
  tracefs: De-globalize instances' callbacks
  tracefs: Add ->unlink callback to tracefs_dir_ops
  perf: Introduce detached events
  perf: Add buffers to the detached events
  perf: Add pmu_info to user page
  perf: Allow inheritance for detached events
  perf: Use shmemfs pages for userspace-only per-thread detached events
  perf: Implement pinning and scheduling for SHMEM events
  perf: Implement mlock accounting for shmem ring buffers
  perf: Track pinned events per user
  perf: Re-inject shmem buffers after exec
  perf: Add ioctl(REATTACH) for detached events
  perf: Allow controlled non-root access to detached events
  perf/x86/intel/pt: Add PMU info
  perf/x86/intel/bts: Add PMU info

 arch/x86/events/intel/bts.c     |  20 +-
 arch/x86/events/intel/pt.c      |  23 +-
 arch/x86/events/intel/pt.h      |  11 +
 fs/tracefs/inode.c              |  71 +++-
 include/linux/perf_event.h      |  33 ++
 include/linux/sched/user.h      |   6 +
 include/linux/tracefs.h         |   3 +-
 include/uapi/linux/perf_event.h |  15 +
 kernel/events/core.c            | 526 +++++++++++++++++++++++------
 kernel/events/internal.h        |  27 +-
 kernel/events/ring_buffer.c     | 730 ++++++++++++++++++++++++++++++++++++--
 kernel/trace/trace.c            |   8 +-
 kernel/user.c                   |   1 +
 13 files changed, 1315 insertions(+), 159 deletions(-)

-- 
2.14.1


Thread overview: 34+ messages
2017-09-05 13:30 Alexander Shishkin [this message]
2017-09-05 13:30 ` [RFC PATCH 01/17] perf: Allow mmapping only user page Alexander Shishkin
2017-09-06 16:28   ` Borislav Petkov
2017-09-13 11:35     ` Alexander Shishkin
2017-09-13 12:58       ` Borislav Petkov
2017-09-05 13:30 ` [RFC PATCH 02/17] perf: Factor out mlock accounting Alexander Shishkin
2017-09-05 13:30 ` [RFC PATCH 03/17] tracefs: De-globalize instances' callbacks Alexander Shishkin
2018-01-24 18:54   ` Steven Rostedt
2017-09-05 13:30 ` [RFC PATCH 04/17] tracefs: Add ->unlink callback to tracefs_dir_ops Alexander Shishkin
2017-09-05 13:30 ` [RFC PATCH 05/17] perf: Introduce detached events Alexander Shishkin
2017-10-03 14:34   ` Peter Zijlstra
2017-10-06 11:23     ` Alexander Shishkin
2017-09-05 13:30 ` [RFC PATCH 06/17] perf: Add buffers to the " Alexander Shishkin
2017-10-03 14:36   ` Peter Zijlstra
2017-09-05 13:30 ` [RFC PATCH 07/17] perf: Add pmu_info to user page Alexander Shishkin
2017-10-03 14:40   ` Peter Zijlstra
2017-09-05 13:30 ` [RFC PATCH 08/17] perf: Allow inheritance for detached events Alexander Shishkin
2017-10-03 14:42   ` Peter Zijlstra
2017-10-06 11:40     ` Alexander Shishkin
2017-09-05 13:30 ` [RFC PATCH 09/17] perf: Use shmemfs pages for userspace-only per-thread " Alexander Shishkin
2017-10-03 14:43   ` Peter Zijlstra
2017-10-06 11:52     ` Alexander Shishkin
2017-09-05 13:30 ` [RFC PATCH 10/17] perf: Implement pinning and scheduling for SHMEM events Alexander Shishkin
2017-09-05 13:30 ` [RFC PATCH 11/17] perf: Implement mlock accounting for shmem ring buffers Alexander Shishkin
2017-09-05 13:30 ` [RFC PATCH 12/17] perf: Track pinned events per user Alexander Shishkin
2017-09-05 13:30 ` [RFC PATCH 13/17] perf: Re-inject shmem buffers after exec Alexander Shishkin
2017-09-05 13:30 ` [RFC PATCH 14/17] perf: Add ioctl(REATTACH) for detached events Alexander Shishkin
2017-10-03 14:50   ` Peter Zijlstra
2017-09-05 13:30 ` [RFC PATCH 15/17] perf: Allow controlled non-root access to " Alexander Shishkin
2017-10-03 14:53   ` Peter Zijlstra
2017-09-05 13:30 ` [RFC PATCH 16/17] perf/x86/intel/pt: Add PMU info Alexander Shishkin
2017-09-05 13:30 ` [RFC PATCH 17/17] perf/x86/intel/bts: " Alexander Shishkin
2017-09-06 16:24 ` [RFC PATCH 00/17] perf: Detached events Borislav Petkov
2017-09-13 11:54   ` Alexander Shishkin
