From: tip-bot for Alexander Shishkin <tipbot@zytor.com>
To: linux-tip-commits@vger.kernel.org
Cc: torvalds@linux-foundation.org, eranian@google.com,
	linux-kernel@vger.kernel.org, efault@gmx.de, tglx@linutronix.de,
	kaixu.xia@linaro.org, mingo@kernel.org, bp@alien8.de,
	hpa@zytor.com, fweisbec@gmail.com, rric@kernel.org,
	paulus@samba.org, alexander.shishkin@linux.intel.com,
	peterz@infradead.org
Subject: [tip:perf/core] perf: Add API for PMUs to write to the AUX area
Date: Thu, 2 Apr 2015 11:39:07 -0700
Message-ID: <tip-fdc2670666f40ab3e03143f04d1ebf4a05e2c24a@git.kernel.org>
In-Reply-To: <1421237903-181015-8-git-send-email-alexander.shishkin@linux.intel.com>

Commit-ID:  fdc2670666f40ab3e03143f04d1ebf4a05e2c24a
Gitweb:     http://git.kernel.org/tip/fdc2670666f40ab3e03143f04d1ebf4a05e2c24a
Author:     Alexander Shishkin <alexander.shishkin@linux.intel.com>
AuthorDate: Wed, 14 Jan 2015 14:18:16 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Thu, 2 Apr 2015 17:14:13 +0200

perf: Add API for PMUs to write to the AUX area

For PMUs that wish to write data to the ring buffer's AUX area, provide
perf_aux_output_{begin,end}() calls to initiate and commit data writes,
similar to perf_output_{begin,end}(). These use the same output handle
structure. Like their software counterparts, they direct inherited
events' output to the parents' ring buffers.

After perf_aux_output_begin() returns successfully, handle->size is set
to the maximum amount of data that can be written relative to the
aux_tail pointer, so that no data the user hasn't seen yet will be
overwritten. It should therefore always be called before hardware
writing is enabled. On success, it returns the pointer to the PMU
driver's private structure allocated for this AUX area by
pmu::setup_aux. The same pointer can also be retrieved using
perf_get_aux() while hardware writing is enabled.
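
As an illustration of the handle->size computation described above, here
is a minimal userspace sketch. It assumes a power-of-two AUX size and
copies the CIRC_CNT()/CIRC_SPACE() definitions from <linux/circ_buf.h>;
the aux_writable() helper is a hypothetical name used only for this
example and is not part of the patch.

```c
#include <assert.h>

/* Userspace copies of the kernel's <linux/circ_buf.h> helpers. */
#define CIRC_CNT(head, tail, size)   (((head) - (tail)) & ((size) - 1))
#define CIRC_SPACE(head, tail, size) CIRC_CNT((tail), ((head) + 1), (size))

/*
 * Hypothetical helper mirroring the handle->size logic in
 * perf_aux_output_begin(): how many bytes may be written at aux_head
 * without overwriting data the consumer (aux_tail) has not read yet.
 * aux_head and aux_tail are free-running byte counters.
 */
static unsigned long aux_writable(unsigned long aux_head,
				  unsigned long aux_tail,
				  unsigned long aux_size)
{
	/* The masking in CIRC_SPACE() only works for power-of-two sizes. */
	assert(aux_size && !(aux_size & (aux_size - 1)));

	if (aux_head - aux_tail < aux_size)
		return CIRC_SPACE(aux_head, aux_tail, aux_size);

	/* Buffer is full: nothing can be written until the reader catches up. */
	return 0;
}
```

Note that CIRC_SPACE() always leaves one byte unused, so an empty 4096
byte buffer reports 4095 writable bytes.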

The PMU driver should pass the actual amount of data written as a
parameter to perf_aux_output_end(). All hardware writes should be
completed and visible before it is called.
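
The "write first, then commit" rule can be sketched in userspace with
C11 atomics. This is only an analogy, not the kernel's implementation:
struct mock_aux and mock_commit() are hypothetical names, memcpy()
stands in for the hardware writing trace data, and the release store
stands in for whatever hardware-specific ordering a real PMU driver has
to enforce before calling perf_aux_output_end().

```c
#include <assert.h>
#include <stdatomic.h>
#include <string.h>

/* Hypothetical stand-in for an AUX area and its published head. */
struct mock_aux {
	unsigned char	data[64];
	atomic_ulong	head;	/* free-running count of committed bytes */
};

/*
 * Write len bytes and then publish them by advancing head. The data
 * store happens before the release store of head, so a reader that
 * observes the new head with an acquire load also sees the data.
 * Returns the number of bytes committed (no wrap-around in this sketch).
 */
static unsigned long mock_commit(struct mock_aux *aux, const void *src,
				 unsigned long len)
{
	unsigned long head, off;

	assert(aux && src);

	head = atomic_load_explicit(&aux->head, memory_order_relaxed);
	off = head % sizeof(aux->data);
	if (len > sizeof(aux->data) - off)
		len = sizeof(aux->data) - off;

	memcpy(aux->data + off, src, len);	/* "hardware" writes */
	atomic_store_explicit(&aux->head, head + len,
			      memory_order_release);	/* commit */
	return len;
}
```

The return value plays the role of the size argument: it is the amount
actually written, which may be less than what was requested.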

Additionally, perf_aux_output_skip() adjusts the output handle and
aux_head in case some part of the buffer has to be skipped over to
maintain the hardware's alignment constraints.
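
A minimal userspace sketch of the skip bookkeeping, assuming a
hypothetical struct mock_handle with only the head and size fields.
mock_skip() follows the same check-then-advance shape as
perf_aux_output_skip(); mock_align_head() is an invented helper showing
how a driver might compute the padding for a power-of-two alignment.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical reduced output handle: write position and room left. */
struct mock_handle {
	unsigned long head;
	unsigned long size;
};

/* Skip size bytes, failing if that exceeds the writable space. */
static int mock_skip(struct mock_handle *h, unsigned long size)
{
	if (size > h->size)
		return -ENOSPC;

	h->head += size;
	h->size -= size;
	return 0;
}

/* Pad head up to the next multiple of align (a power of two). */
static int mock_align_head(struct mock_handle *h, unsigned long align)
{
	unsigned long pad;

	assert(align && !(align & (align - 1)));

	/* Distance to the next aligned offset; 0 if already aligned. */
	pad = (align - (h->head & (align - 1))) & (align - 1);
	return mock_skip(h, pad);
}
```

On failure the handle is left untouched, so the caller can decide
whether to stop tracing or retry later.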

Nested writers are forbidden, and guards are in place to catch such
attempts.
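
The nesting guard relies on an atomic exchange: the previous value of
the flag tells the caller whether a writer was already active. A
userspace sketch with C11 atomics, where mock_nest, mock_begin() and
mock_end() are hypothetical names standing in for rb->aux_nest and the
begin/end calls:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for rb->aux_nest: 1 while a writer is active. */
static atomic_int mock_nest;

/*
 * Try to become the writer. atomic_exchange() returns the previous
 * value, so a nonzero result means someone else already holds the
 * guard and this attempt must be rejected. The flag stays set for the
 * writer that owns it.
 */
static bool mock_begin(void)
{
	return atomic_exchange(&mock_nest, 1) == 0;
}

/* Release the guard; only valid after a successful mock_begin(). */
static void mock_end(void)
{
	assert(atomic_load(&mock_nest) == 1);
	atomic_store(&mock_nest, 0);
}
```

A rejected nested attempt does not clear the flag, so the outer writer
keeps ownership until it calls mock_end().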

Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kaixu Xia <kaixu.xia@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Robert Richter <rric@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: acme@infradead.org
Cc: adrian.hunter@intel.com
Cc: kan.liang@intel.com
Cc: markus.t.metzger@intel.com
Cc: mathieu.poirier@linaro.org
Link: http://lkml.kernel.org/r/1421237903-181015-8-git-send-email-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 include/linux/perf_event.h  |  24 +++++++-
 kernel/events/core.c        |   5 +-
 kernel/events/internal.h    |   4 ++
 kernel/events/ring_buffer.c | 139 ++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 168 insertions(+), 4 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index f936a1e..45c5873 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -573,7 +573,10 @@ struct perf_output_handle {
 	struct ring_buffer		*rb;
 	unsigned long			wakeup;
 	unsigned long			size;
-	void				*addr;
+	union {
+		void			*addr;
+		unsigned long		head;
+	};
 	int				page;
 };
 
@@ -608,6 +611,14 @@ perf_cgroup_from_task(struct task_struct *task)
 
 #ifdef CONFIG_PERF_EVENTS
 
+extern void *perf_aux_output_begin(struct perf_output_handle *handle,
+				   struct perf_event *event);
+extern void perf_aux_output_end(struct perf_output_handle *handle,
+				unsigned long size, bool truncated);
+extern int perf_aux_output_skip(struct perf_output_handle *handle,
+				unsigned long size);
+extern void *perf_get_aux(struct perf_output_handle *handle);
+
 extern int perf_pmu_register(struct pmu *pmu, const char *name, int type);
 extern void perf_pmu_unregister(struct pmu *pmu);
 
@@ -898,6 +909,17 @@ extern void perf_event_disable(struct perf_event *event);
 extern int __perf_event_disable(void *info);
 extern void perf_event_task_tick(void);
 #else /* !CONFIG_PERF_EVENTS: */
+static inline void *
+perf_aux_output_begin(struct perf_output_handle *handle,
+		      struct perf_event *event)				{ return NULL; }
+static inline void
+perf_aux_output_end(struct perf_output_handle *handle, unsigned long size,
+		    bool truncated)					{ }
+static inline int
+perf_aux_output_skip(struct perf_output_handle *handle,
+		     unsigned long size)				{ return -EINVAL; }
+static inline void *
+perf_get_aux(struct perf_output_handle *handle)				{ return NULL; }
 static inline void
 perf_event_task_sched_in(struct task_struct *prev,
 			 struct task_struct *task)			{ }
diff --git a/kernel/events/core.c b/kernel/events/core.c
index dbc2eff..81e8d14 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -3423,7 +3423,6 @@ static void free_event_rcu(struct rcu_head *head)
 	kfree(event);
 }
 
-static void ring_buffer_put(struct ring_buffer *rb);
 static void ring_buffer_attach(struct perf_event *event,
 			       struct ring_buffer *rb);
 
@@ -4361,7 +4360,7 @@ static void rb_free_rcu(struct rcu_head *rcu_head)
 	rb_free(rb);
 }
 
-static struct ring_buffer *ring_buffer_get(struct perf_event *event)
+struct ring_buffer *ring_buffer_get(struct perf_event *event)
 {
 	struct ring_buffer *rb;
 
@@ -4376,7 +4375,7 @@ static struct ring_buffer *ring_buffer_get(struct perf_event *event)
 	return rb;
 }
 
-static void ring_buffer_put(struct ring_buffer *rb)
+void ring_buffer_put(struct ring_buffer *rb)
 {
 	if (!atomic_dec_and_test(&rb->refcount))
 		return;
diff --git a/kernel/events/internal.h b/kernel/events/internal.h
index 4d117a9..b701ebc 100644
--- a/kernel/events/internal.h
+++ b/kernel/events/internal.h
@@ -36,6 +36,8 @@ struct ring_buffer {
 	struct user_struct		*mmap_user;
 
 	/* AUX area */
+	local_t				aux_head;
+	local_t				aux_nest;
 	unsigned long			aux_pgoff;
 	int				aux_nr_pages;
 	atomic_t			aux_mmap_count;
@@ -56,6 +58,8 @@ extern void perf_event_wakeup(struct perf_event *event);
 extern int rb_alloc_aux(struct ring_buffer *rb, struct perf_event *event,
 			pgoff_t pgoff, int nr_pages, int flags);
 extern void rb_free_aux(struct ring_buffer *rb);
+extern struct ring_buffer *ring_buffer_get(struct perf_event *event);
+extern void ring_buffer_put(struct ring_buffer *rb);
 
 static inline bool rb_has_aux(struct ring_buffer *rb)
 {
diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c
index 6e3be7a..0cc7b0f 100644
--- a/kernel/events/ring_buffer.c
+++ b/kernel/events/ring_buffer.c
@@ -243,6 +243,145 @@ ring_buffer_init(struct ring_buffer *rb, long watermark, int flags)
 	spin_lock_init(&rb->event_lock);
 }
 
+/*
+ * This is called before hardware starts writing to the AUX area to
+ * obtain an output handle and make sure there's room in the buffer.
+ * When the capture completes, call perf_aux_output_end() to commit
+ * the recorded data to the buffer.
+ *
+ * The ordering is similar to that of perf_output_{begin,end}, with
+ * the exception of (B), which should be taken care of by the pmu
+ * driver, since ordering rules will differ depending on hardware.
+ */
+void *perf_aux_output_begin(struct perf_output_handle *handle,
+			    struct perf_event *event)
+{
+	struct perf_event *output_event = event;
+	unsigned long aux_head, aux_tail;
+	struct ring_buffer *rb;
+
+	if (output_event->parent)
+		output_event = output_event->parent;
+
+	/*
+	 * Since this will typically be open across pmu::add/pmu::del, we
+	 * grab ring_buffer's refcount instead of holding rcu read lock
+	 * to make sure it doesn't disappear under us.
+	 */
+	rb = ring_buffer_get(output_event);
+	if (!rb)
+		return NULL;
+
+	if (!rb_has_aux(rb) || !atomic_inc_not_zero(&rb->aux_refcount))
+		goto err;
+
+	/*
+	 * Nesting is not supported for AUX area, make sure nested
+	 * writers are caught early
+	 */
+	if (WARN_ON_ONCE(local_xchg(&rb->aux_nest, 1)))
+		goto err_put;
+
+	aux_head = local_read(&rb->aux_head);
+	aux_tail = ACCESS_ONCE(rb->user_page->aux_tail);
+
+	handle->rb = rb;
+	handle->event = event;
+	handle->head = aux_head;
+	if (aux_head - aux_tail < perf_aux_size(rb))
+		handle->size = CIRC_SPACE(aux_head, aux_tail, perf_aux_size(rb));
+	else
+		handle->size = 0;
+
+	/*
+	 * handle->size computation depends on aux_tail load; this forms a
+	 * control dependency barrier separating aux_tail load from aux data
+	 * store that will be enabled on successful return
+	 */
+	if (!handle->size) { /* A, matches D */
+		event->pending_disable = 1;
+		perf_output_wakeup(handle);
+		local_set(&rb->aux_nest, 0);
+		goto err_put;
+	}
+
+	return handle->rb->aux_priv;
+
+err_put:
+	rb_free_aux(rb);
+
+err:
+	ring_buffer_put(rb);
+	handle->event = NULL;
+
+	return NULL;
+}
+
+/*
+ * Commit the data written by hardware into the ring buffer by adjusting
+ * aux_head and posting a PERF_RECORD_AUX into the perf buffer. It is the
+ * pmu driver's responsibility to observe ordering rules of the hardware,
+ * so that all the data is externally visible before this is called.
+ */
+void perf_aux_output_end(struct perf_output_handle *handle, unsigned long size,
+			 bool truncated)
+{
+	struct ring_buffer *rb = handle->rb;
+	unsigned long aux_head = local_read(&rb->aux_head);
+	u64 flags = 0;
+
+	if (truncated)
+		flags |= PERF_AUX_FLAG_TRUNCATED;
+
+	local_add(size, &rb->aux_head);
+
+	if (size || flags) {
+		/*
+		 * Only send RECORD_AUX if we have something useful to communicate
+		 */
+
+		perf_event_aux_event(handle->event, aux_head, size, flags);
+	}
+
+	rb->user_page->aux_head = local_read(&rb->aux_head);
+
+	perf_output_wakeup(handle);
+	handle->event = NULL;
+
+	local_set(&rb->aux_nest, 0);
+	rb_free_aux(rb);
+	ring_buffer_put(rb);
+}
+
+/*
+ * Skip over a given number of bytes in the AUX buffer, due to, for example,
+ * hardware's alignment constraints.
+ */
+int perf_aux_output_skip(struct perf_output_handle *handle, unsigned long size)
+{
+	struct ring_buffer *rb = handle->rb;
+	unsigned long aux_head;
+
+	if (size > handle->size)
+		return -ENOSPC;
+
+	aux_head = local_add_return(size, &rb->aux_head);
+
+	handle->head = aux_head;
+	handle->size -= size;
+
+	return 0;
+}
+
+void *perf_get_aux(struct perf_output_handle *handle)
+{
+	/* this is only valid between perf_aux_output_begin and *_end */
+	if (!handle->event)
+		return NULL;
+
+	return handle->rb->aux_priv;
+}
+
 #define PERF_AUX_GFP	(GFP_KERNEL | __GFP_ZERO | __GFP_NOWARN | __GFP_NORETRY)
 
 static struct page *rb_alloc_aux_page(int node, int order)

Thread overview: 48+ messages
2015-01-14 12:18 [PATCH v9 00/14] perf: Add infrastructure and support for Intel PT Alexander Shishkin
2015-01-14 12:18 ` [PATCH v9 01/14] perf: Add data_{offset,size} to user_page Alexander Shishkin
2015-04-02 18:37   ` [tip:perf/core] " tip-bot for Alexander Shishkin
2015-01-14 12:18 ` [PATCH v9 02/14] perf: Add AUX area to ring buffer for raw data streams Alexander Shishkin
2015-04-02 18:37   ` [tip:perf/core] " tip-bot for Peter Zijlstra
2015-01-14 12:18 ` [PATCH v9 03/14] perf: Support high-order allocations for AUX space Alexander Shishkin
2015-04-02 18:37   ` [tip:perf/core] " tip-bot for Alexander Shishkin
2015-01-14 12:18 ` [PATCH v9 04/14] perf: Add a capability for AUX_NO_SG pmus to do software double buffering Alexander Shishkin
2015-04-02 18:38   ` [tip:perf/core] " tip-bot for Alexander Shishkin
2015-01-14 12:18 ` [PATCH v9 05/14] perf: Add a pmu capability for "exclusive" events Alexander Shishkin
2015-01-14 12:18 ` [PATCH v9 06/14] perf: Add AUX record Alexander Shishkin
2015-03-24 11:07   ` Jiri Olsa
2015-03-24 11:27     ` Adrian Hunter
2015-03-24 13:06       ` Jiri Olsa
2015-04-02 18:38   ` [tip:perf/core] " tip-bot for Alexander Shishkin
2015-01-14 12:18 ` [PATCH v9 07/14] perf: Add api for pmus to write to AUX area Alexander Shishkin
2015-04-02 18:39   ` tip-bot for Alexander Shishkin [this message]
2015-01-14 12:18 ` [PATCH v9 08/14] perf: Support overwrite mode for " Alexander Shishkin
2015-04-02 18:39   ` [tip:perf/core] perf: Support overwrite mode for the " tip-bot for Alexander Shishkin
2015-01-14 12:18 ` [PATCH v9 09/14] perf: Add wakeup watermark control to " Alexander Shishkin
2015-04-02 18:39   ` [tip:perf/core] perf: Add wakeup watermark control to the " tip-bot for Alexander Shishkin
2015-01-14 12:18 ` [PATCH v9 10/14] x86: Add Intel Processor Trace (INTEL_PT) cpu feature detection Alexander Shishkin
2015-04-02 18:40   ` [tip:perf/core] " tip-bot for Alexander Shishkin
2015-01-14 12:18 ` [PATCH v9 11/14] x86: perf: Intel PT and LBR/BTS are mutually exclusive Alexander Shishkin
2015-04-02 18:40   ` [tip:perf/core] perf/x86: Mark Intel PT and LBR/ BTS as " tip-bot for Alexander Shishkin
2015-01-14 12:18 ` [PATCH v9 12/14] x86: perf: intel_pt: Intel PT PMU driver Alexander Shishkin
2015-01-15  9:06   ` Peter Zijlstra
2015-01-15 12:31     ` Alexander Shishkin
2015-01-20 13:20       ` Alexander Shishkin
2015-01-26 16:55         ` Peter Zijlstra
2015-01-27 18:03           ` Alexander Shishkin
2015-01-29 11:59             ` Peter Zijlstra
2015-01-29 15:03               ` Alexander Shishkin
2015-01-29 15:20                 ` Peter Zijlstra
2015-01-29 15:28                   ` Peter Zijlstra
2015-01-30  9:48                   ` Alexander Shishkin
2015-01-30 10:31                   ` [PATCH] perf: Add a pmu capability for "exclusive" events Alexander Shishkin
2015-04-02 18:38                     ` [tip:perf/core] " tip-bot for Alexander Shishkin
2015-01-30 10:39     ` [PATCH] x86: perf: intel_pt: Intel PT PMU driver Alexander Shishkin
2015-04-02 18:40       ` [tip:perf/core] perf/x86/intel/pt: Add " tip-bot for Alexander Shishkin
2015-01-30 10:40     ` [PATCH] x86: perf: intel_bts: Add BTS " Alexander Shishkin
2015-04-02 18:41       ` [tip:perf/core] perf/x86/intel/bts: " tip-bot for Alexander Shishkin
2015-01-14 12:18 ` [PATCH v9 13/14] x86: perf: intel_bts: " Alexander Shishkin
2015-01-14 12:18 ` [PATCH v9 14/14] perf: add ITRACE_START record to indicate that tracing has started Alexander Shishkin
2015-04-02 18:39   ` [tip:perf/core] perf: Add " tip-bot for Alexander Shishkin
2015-01-14 12:43 ` [PATCH v9 00/14] perf: Add infrastructure and support for Intel PT Alexander Shishkin
2015-01-14 14:38   ` Peter Zijlstra
2015-01-14 14:49     ` [PATCH v10 14/14] perf: add ITRACE_START record to indicate that tracing has started Alexander Shishkin
