All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] drm/i915/perf: More documentation hooked to i915.rst
@ 2016-12-05 11:20 Robert Bragg
  2016-12-05 12:16 ` Matthew Auld
                   ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Robert Bragg @ 2016-12-05 11:20 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter, Matthew Auld

This adds a 'Perf' section to i915.rst with the following sub sections:
- Overview
- Comparison with Core Perf
- i915 Driver Entry Points
- i915 Perf Stream
- i915 Perf Observation Architecture Stream
- All i915 Perf Internals

v2:
    section headers in i915.rst (Daniel Vetter)
    missing symbol docs + other fixups (Matthew Auld)

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Matthew Auld <matthew.auld@intel.com>
---
 Documentation/gpu/i915.rst       |  91 +++++++++
 drivers/gpu/drm/i915/i915_drv.h  | 151 +++++++++++---
 drivers/gpu/drm/i915/i915_perf.c | 412 ++++++++++++++++++++++++++++++++++++---
 3 files changed, 598 insertions(+), 56 deletions(-)

diff --git a/Documentation/gpu/i915.rst b/Documentation/gpu/i915.rst
index 117d2ab..847a094 100644
--- a/Documentation/gpu/i915.rst
+++ b/Documentation/gpu/i915.rst
@@ -356,4 +356,95 @@ switch_mm
 .. kernel-doc:: drivers/gpu/drm/i915/i915_trace.h
    :doc: switch_mm tracepoint
 
+Perf
+====
+
+Overview
+--------
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :doc: i915 Perf Overview
+
+Comparison with Core Perf
+-------------------------
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :doc: i915 Perf History and Comparison with Core Perf
+
+i915 Driver Entry Points
+------------------------
+
+This section covers the entrypoints exported outside of i915_perf.c to
+integrate with drm/i915 and to handle the `DRM_I915_PERF_OPEN` ioctl.
+
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_perf_init
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_perf_fini
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_perf_register
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_perf_unregister
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_perf_open_ioctl
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_perf_release
+
+i915 Perf Stream
+----------------
+
+This section covers the stream-semantics-agnostic structures and functions
+for representing an i915 perf stream FD and associated file operations.
+
+.. kernel-doc:: drivers/gpu/drm/i915/i915_drv.h
+   :functions: i915_perf_stream
+.. kernel-doc:: drivers/gpu/drm/i915/i915_drv.h
+   :functions: i915_perf_stream_ops
+
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: read_properties_unlocked
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_perf_open_ioctl_locked
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_perf_destroy_locked
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_perf_read
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_perf_ioctl
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_perf_enable_unlocked
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_perf_disable_unlocked
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_perf_poll
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_perf_poll_locked
+
+i915 Perf Observation Architecture Stream
+-----------------------------------------
+
+.. kernel-doc:: drivers/gpu/drm/i915/i915_drv.h
+   :functions: i915_oa_ops
+
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_oa_stream_init
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_oa_read
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_oa_stream_enable
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_oa_stream_disable
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_oa_wait_unlocked
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_oa_poll_wait
+
+All i915 Perf Internals
+-----------------------
+
+This section simply includes all currently documented i915 perf internals, in
+no particular order, but may include some more minor utilities or platform
+specific details than found in the more high-level sections.
+
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :internal:
+
 .. WARNING: DOCPROC directive not supported: !Cdrivers/gpu/drm/i915/i915_irq.c
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index ca9786c..1ddebc7 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1830,89 +1830,186 @@ struct i915_oa_reg {
 
 struct i915_perf_stream;
 
+/**
+ * struct i915_perf_stream_ops - the OPs to support a specific stream type
+ */
 struct i915_perf_stream_ops {
-	/* Enables the collection of HW samples, either in response to
-	 * I915_PERF_IOCTL_ENABLE or implicitly called when stream is
-	 * opened without I915_PERF_FLAG_DISABLED.
+	/**
+	 * @enable: Enables the collection of HW samples, either in response to
+	 * `I915_PERF_IOCTL_ENABLE` or implicitly called when stream is opened
+	 * without `I915_PERF_FLAG_DISABLED`.
 	 */
 	void (*enable)(struct i915_perf_stream *stream);
 
-	/* Disables the collection of HW samples, either in response to
-	 * I915_PERF_IOCTL_DISABLE or implicitly called before
-	 * destroying the stream.
+	/**
+	 * @disable: Disables the collection of HW samples, either in response
+	 * to `I915_PERF_IOCTL_DISABLE` or implicitly called before destroying
+	 * the stream.
 	 */
 	void (*disable)(struct i915_perf_stream *stream);
 
-	/* Call poll_wait, passing a wait queue that will be woken
+	/**
+	 * @poll_wait: Call poll_wait, passing a wait queue that will be woken
 	 * once there is something ready to read() for the stream
 	 */
 	void (*poll_wait)(struct i915_perf_stream *stream,
 			  struct file *file,
 			  poll_table *wait);
 
-	/* For handling a blocking read, wait until there is something
-	 * to ready to read() for the stream. E.g. wait on the same
+	/**
+	 * @wait_unlocked: For handling a blocking read, wait until there is
+	 * something to ready to read() for the stream. E.g. wait on the same
 	 * wait queue that would be passed to poll_wait().
 	 */
 	int (*wait_unlocked)(struct i915_perf_stream *stream);
 
-	/* read - Copy buffered metrics as records to userspace
-	 * @buf: the userspace, destination buffer
-	 * @count: the number of bytes to copy, requested by userspace
-	 * @offset: zero at the start of the read, updated as the read
-	 *          proceeds, it represents how many bytes have been
-	 *          copied so far and the buffer offset for copying the
-	 *          next record.
+	/**
+	 * @read: Copy buffered metrics as records to userspace
+	 * **buf**: the userspace, destination buffer
+	 * **count**: the number of bytes to copy, requested by userspace
+	 * **offset**: zero at the start of the read, updated as the read
+	 * proceeds, it represents how many bytes have been copied so far and
+	 * the buffer offset for copying the next record.
 	 *
-	 * Copy as many buffered i915 perf samples and records for
-	 * this stream to userspace as will fit in the given buffer.
+	 * Copy as many buffered i915 perf samples and records for this stream
+	 * to userspace as will fit in the given buffer.
 	 *
-	 * Only write complete records; returning -ENOSPC if there
-	 * isn't room for a complete record.
+	 * Only write complete records; returning -%ENOSPC if there isn't room
+	 * for a complete record.
 	 *
-	 * Return any error condition that results in a short read
-	 * such as -ENOSPC or -EFAULT, even though these may be
-	 * squashed before returning to userspace.
+	 * Return any error condition that results in a short read such as
+	 * -%ENOSPC or -%EFAULT, even though these may be squashed before
+	 * returning to userspace.
 	 */
 	int (*read)(struct i915_perf_stream *stream,
 		    char __user *buf,
 		    size_t count,
 		    size_t *offset);
 
-	/* Cleanup any stream specific resources.
+	/**
+	 * @destroy: Cleanup any stream specific resources.
 	 *
 	 * The stream will always be disabled before this is called.
 	 */
 	void (*destroy)(struct i915_perf_stream *stream);
 };
 
+/**
+ * struct i915_perf_stream - state for a single open stream FD
+ */
 struct i915_perf_stream {
+	/**
+	 * @dev_priv: i915 drm device
+	 */
 	struct drm_i915_private *dev_priv;
 
+	/**
+	 * @link: Links the stream into ``&drm_i915_private->streams``
+	 */
 	struct list_head link;
 
+	/**
+	 * @sample_flags: Flags representing the `DRM_I915_PERF_PROP_SAMPLE_*`
+	 * properties given when opening a stream, representing the contents
+	 * of a single sample as read() by userspace.
+	 */
 	u32 sample_flags;
+
+	/**
+	 * @sample_size: Considering the configured contents of a sample
+	 * combined with the required header size, this is the total size
+	 * of a single sample record.
+	 */
 	int sample_size;
 
+	/**
+	 * @ctx: %NULL if measuring system-wide across all contexts or a
+	 * specific context that is being monitored.
+	 */
 	struct i915_gem_context *ctx;
+
+	/**
+	 * @enabled: Whether the stream is currently enabled, considering
+	 * whether the stream was opened in a disabled state and based
+	 * on `I915_PERF_IOCTL_ENABLE` and `I915_PERF_IOCTL_DISABLE` calls.
+	 */
 	bool enabled;
 
+	/**
+	 * @ops: The callbacks providing the implementation of this specific
+	 * type of configured stream.
+	 */
 	const struct i915_perf_stream_ops *ops;
 };
 
+/**
+ * struct i915_oa_ops - Gen specific implementation of an OA unit stream
+ */
 struct i915_oa_ops {
+	/**
+	 * @init_oa_buffer: Resets the head and tail pointers of the
+	 * circular buffer for periodic OA reports.
+	 *
+	 * Called when first opening a stream for OA metrics, but also may be
+	 * called in response to an OA buffer overflow or other error
+	 * condition.
+	 *
+	 * Note it may be necessary to clear the full OA buffer here as part of
+	 * maintaining the invariable that new reports must be written to
+	 * zeroed memory for us to be able to reliable detect if an expected
+	 * report has not yet landed in memory.  (At least on Haswell the OA
+	 * buffer tail pointer is not synchronized with reports being visible
+	 * to the CPU)
+	 */
 	void (*init_oa_buffer)(struct drm_i915_private *dev_priv);
+
+	/**
+	 * @enable_metric_set: Applies any MUX configuration to set up the
+	 * Boolean and Custom (B/C) counters that are part of the counter
+	 * reports being sampled. May apply system constraints such as
+	 * disabling EU clock gating as required.
+	 */
 	int (*enable_metric_set)(struct drm_i915_private *dev_priv);
+
+	/**
+	 * @disable_metric_set: Remove system constraints associated with using
+	 * the OA unit.
+	 */
 	void (*disable_metric_set)(struct drm_i915_private *dev_priv);
+
+	/**
+	 * @oa_enable: Enable periodic sampling
+	 */
 	void (*oa_enable)(struct drm_i915_private *dev_priv);
+
+	/**
+	 * @oa_disable: Disable periodic sampling
+	 */
 	void (*oa_disable)(struct drm_i915_private *dev_priv);
-	void (*update_oacontrol)(struct drm_i915_private *dev_priv);
-	void (*update_hw_ctx_id_locked)(struct drm_i915_private *dev_priv,
-					u32 ctx_id);
+
+	/**
+	 * @read: Copy data from the circular OA buffer into a given userspace
+	 * buffer.
+	 */
 	int (*read)(struct i915_perf_stream *stream,
 		    char __user *buf,
 		    size_t count,
 		    size_t *offset);
+
+	/**
+	 * @oa_buffer_is_empty: Check if OA buffer empty (false positives OK)
+	 *
+	 * This is either called via fops or the poll check hrtimer (atomic
+	 * ctx) without any locks taken.
+	 *
+	 * It's safe to read OA config state here unlocked, assuming that this
+	 * is only called while the stream is enabled, while the global OA
+	 * configuration can't be modified.
+	 *
+	 * Efficiency is more important than avoiding some false positives
+	 * here, which will be handled gracefully - likely resulting in an
+	 * %EAGAIN error for userspace.
+	 */
 	bool (*oa_buffer_is_empty)(struct drm_i915_private *dev_priv);
 };
 
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 0c6e124..ae7bd0e 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -26,7 +26,7 @@
 
 
 /**
- * DOC: i915 Perf, streaming API for GPU metrics
+ * DOC: i915 Perf Overview
  *
  * Gen graphics supports a large number of performance counters that can help
  * driver and application developers understand and optimize their use of the
@@ -45,6 +45,10 @@
  * privileges by default, unless changed via the dev.i915.perf_event_paranoid
  * sysctl option.
  *
+ */
+
+/**
+ * DOC: i915 Perf History and Comparison with Core Perf
  *
  * The interface was initially inspired by the core Perf infrastructure but
  * some notable differences are:
@@ -75,8 +79,8 @@
  * gets copied from the GPU mapped buffers to userspace buffers.
  *
  *
- * Some notes regarding Linux Perf:
- * --------------------------------
+ * Issues hit with first prototype based on Core Perf
+ * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  *
  * The first prototype of this driver was based on the core perf
  * infrastructure, and while we did make that mostly work, with some changes to
@@ -135,7 +139,7 @@
  *   for combining with the side-band raw reports it captures using
  *   MI_REPORT_PERF_COUNT commands.
  *
- *   _ As a side note on perf's grouping feature; there was also some concern
+ *   - As a side note on perf's grouping feature; there was also some concern
  *     that using PERF_FORMAT_GROUP as a way to pack together counter values
  *     would quite drastically inflate our sample sizes, which would likely
  *     lower the effective sampling resolutions we could use when the available
@@ -277,6 +281,20 @@ static struct i915_oa_format hsw_oa_formats[I915_OA_FORMAT_MAX] = {
 
 #define SAMPLE_OA_REPORT      (1<<0)
 
+/**
+ * struct perf_open_properties - for validated properties given to open a stream
+ * @sample_flags: `DRM_I915_PERF_PROP_SAMPLE_*` properties are tracked as flags
+ * @single_context: Whether a single or all gpu contexts should be monitored
+ * @ctx_handle: A gem ctx handle for use with @single_context
+ * @metrics_set: An ID for an OA unit metric set advertised via sysfs
+ * @oa_format: An OA unit HW report format
+ * @oa_periodic: Whether to enable periodic OA unit sampling
+ * @oa_period_exponent: The OA unit sampling period is derived from this
+ *
+ * As read_properties_unlocked() enumerates and validates the properties given
+ * to open a stream of metrics the configuration is built up in the structure
+ * which starts out zero initialized.
+ */
 struct perf_open_properties {
 	u32 sample_flags;
 
@@ -314,7 +332,19 @@ static bool gen7_oa_buffer_is_empty_fop_unlocked(struct drm_i915_private *dev_pr
 }
 
 /**
- * Appends a status record to a userspace read() buffer.
+ * append_oa_status - Appends a status record to a userspace read() buffer.
+ * @stream: An i915-perf stream opened for OA metrics
+ * @buf: destination buffer given by userspace
+ * @count: the number of bytes userspace wants to read
+ * @offset: (inout): the current position for writing into @buf
+ * @type: The kind of status to report to userspace
+ *
+ * Writes a status record (such as `DRM_I915_PERF_RECORD_OA_REPORT_LOST`)
+ * into the userspace read() buffer.
+ *
+ * The @buf @offset will only be updated on success.
+ *
+ * Returns: 0 on success, negative error code on failure.
  */
 static int append_oa_status(struct i915_perf_stream *stream,
 			    char __user *buf,
@@ -336,7 +366,21 @@ static int append_oa_status(struct i915_perf_stream *stream,
 }
 
 /**
- * Copies single OA report into userspace read() buffer.
+ * append_oa_sample - Copies single OA report into userspace read() buffer.
+ * @stream: An i915-perf stream opened for OA metrics
+ * @buf: destination buffer given by userspace
+ * @count: the number of bytes userspace wants to read
+ * @offset: (inout): the current position for writing into @buf
+ * @report: A single OA report to (optionally) include as part of the sample
+ *
+ * The contents of a sample are configured through `DRM_I915_PERF_PROP_SAMPLE_*`
+ * properties when opening a stream, tracked as `stream->sample_flags`. This
+ * function copies the requested components of a single sample to the given
+ * read() @buf.
+ *
+ * The @buf @offset will only be updated on success.
+ *
+ * Returns: 0 on success, negative error code on failure.
  */
 static int append_oa_sample(struct i915_perf_stream *stream,
 			    char __user *buf,
@@ -380,10 +424,8 @@ static int append_oa_sample(struct i915_perf_stream *stream,
  * @head_ptr: (inout): the current oa buffer cpu read position
  * @tail: the current oa buffer gpu write position
  *
- * Returns 0 on success, negative error code on failure.
- *
- * Notably any error condition resulting in a short read (-ENOSPC or
- * -EFAULT) will be returned even though one or more records may
+ * Notably any error condition resulting in a short read (-%ENOSPC or
+ * -%EFAULT) will be returned even though one or more records may
  * have been successfully copied. In this case it's up to the caller
  * to decide if the error should be squashed before returning to
  * userspace.
@@ -392,6 +434,8 @@ static int append_oa_sample(struct i915_perf_stream *stream,
  * tail, so the head chases the tail?... If you think that's mad
  * and back-to-front you're not alone, but this follows the
  * Gen PRM naming convention.
+ *
+ * Returns: 0 on success, negative error code on failure.
  */
 static int gen7_append_oa_reports(struct i915_perf_stream *stream,
 				  char __user *buf,
@@ -496,6 +540,22 @@ static int gen7_append_oa_reports(struct i915_perf_stream *stream,
 	return ret;
 }
 
+/**
+ * gen7_oa_read - copy status records then buffered OA reports
+ * @stream: An i915-perf stream opened for OA metrics
+ * @buf: destination buffer given by userspace
+ * @count: the number of bytes userspace wants to read
+ * @offset: (inout): the current position for writing into @buf
+ *
+ * Checks Gen 7 specific OA unit status registers and if necessary appends
+ * corresponding status records for userspace (such as for a buffer full
+ * condition) and then initiate appending any buffered OA reports.
+ *
+ * Updates @offset according to the number of bytes successfully copied into
+ * the userspace buffer.
+ *
+ * Returns: zero on success or a negative error code
+ */
 static int gen7_oa_read(struct i915_perf_stream *stream,
 			char __user *buf,
 			size_t count,
@@ -597,6 +657,20 @@ static int gen7_oa_read(struct i915_perf_stream *stream,
 	return ret;
 }
 
+/**
+ * i915_oa_wait_unlocked - handles blocking IO until OA data available
+ * @stream: An i915-perf stream opened for OA metrics
+ *
+ * Called when userspace tries to read() from a blocking stream FD opened
+ * for OA metrics. It waits until the hrtimer callback finds a non-empty
+ * OA buffer and wakes us.
+ *
+ * Note: it's acceptable to have this return with some false positives
+ * since any subsequent read handling will return -EAGAIN if there isn't
+ * really data ready for userspace yet.
+ *
+ * Returns: zero on success or a negative error code
+ */
 static int i915_oa_wait_unlocked(struct i915_perf_stream *stream)
 {
 	struct drm_i915_private *dev_priv = stream->dev_priv;
@@ -615,6 +689,16 @@ static int i915_oa_wait_unlocked(struct i915_perf_stream *stream)
 					!dev_priv->perf.oa.ops.oa_buffer_is_empty(dev_priv));
 }
 
+/**
+ * i915_oa_poll_wait - call poll_wait() for an OA stream poll()
+ * @stream: An i915-perf stream opened for OA metrics
+ * @file: An i915 perf stream file
+ * @wait: poll() state table
+ *
+ * For handling userspace polling on an i915 perf stream opened for OA metrics,
+ * this starts a poll_wait with the wait queue that our hrtimer callback wakes
+ * when it sees data ready to read in the circular OA buffer.
+ */
 static void i915_oa_poll_wait(struct i915_perf_stream *stream,
 			      struct file *file,
 			      poll_table *wait)
@@ -624,6 +708,18 @@ static void i915_oa_poll_wait(struct i915_perf_stream *stream,
 	poll_wait(file, &dev_priv->perf.oa.poll_wq, wait);
 }
 
+/**
+ * i915_oa_read - just calls through to &i915_oa_ops->read
+ * @stream: An i915-perf stream opened for OA metrics
+ * @buf: destination buffer given by userspace
+ * @count: the number of bytes userspace wants to read
+ * @offset: (inout): the current position for writing into @buf
+ *
+ * Updates @offset according to the number of bytes successfully copied into
+ * the userspace buffer.
+ *
+ * Returns: zero on success or a negative error code
+ */
 static int i915_oa_read(struct i915_perf_stream *stream,
 			char __user *buf,
 			size_t count,
@@ -634,9 +730,15 @@ static int i915_oa_read(struct i915_perf_stream *stream,
 	return dev_priv->perf.oa.ops.read(stream, buf, count, offset);
 }
 
-/* Determine the render context hw id, and ensure it remains fixed for the
+/**
+ * oa_get_render_ctx_id - determine and hold ctx hw id
+ * @stream: An i915-perf stream opened for OA metrics
+ *
+ * Determine the render context hw id, and ensure it remains fixed for the
  * lifetime of the stream. This ensures that we don't have to worry about
  * updating the context ID in OACONTROL on the fly.
+ *
+ * Returns: zero on success or a negative error code
  */
 static int oa_get_render_ctx_id(struct i915_perf_stream *stream)
 {
@@ -673,6 +775,13 @@ static int oa_get_render_ctx_id(struct i915_perf_stream *stream)
 	return ret;
 }
 
+/**
+ * oa_put_render_ctx_id - counterpart to oa_get_render_ctx_id releases hold
+ * @stream: An i915-perf stream opened for OA metrics
+ *
+ * In case anything needed doing to ensure the context HW ID would remain valid
+ * for the lifetime of the stream, then that can be undone here.
+ */
 static void oa_put_render_ctx_id(struct i915_perf_stream *stream)
 {
 	struct drm_i915_private *dev_priv = stream->dev_priv;
@@ -945,6 +1054,15 @@ static void gen7_oa_enable(struct drm_i915_private *dev_priv)
 	spin_unlock_irqrestore(&dev_priv->perf.hook_lock, flags);
 }
 
+/**
+ * i915_oa_stream_enable - handle `I915_PERF_IOCTL_ENABLE` for OA stream
+ * @stream: An i915 perf stream opened for OA metrics
+ *
+ * [Re]enables hardware periodic sampling according to the period configured
+ * when opening the stream. This also starts a hrtimer that will periodically
+ * check for data in the circular OA buffer for notifying userspace (e.g.
+ * during a read() or poll()).
+ */
 static void i915_oa_stream_enable(struct i915_perf_stream *stream)
 {
 	struct drm_i915_private *dev_priv = stream->dev_priv;
@@ -962,6 +1080,14 @@ static void gen7_oa_disable(struct drm_i915_private *dev_priv)
 	I915_WRITE(GEN7_OACONTROL, 0);
 }
 
+/**
+ * i915_oa_stream_disable - handle `I915_PERF_IOCTL_DISABLE` for OA stream
+ * @stream: An i915 perf stream opened for OA metrics
+ *
+ * Stops the OA unit from periodically writing counter reports into the
+ * circular OA buffer. This also stops the hrtimer that periodically checks for
+ * data in the circular OA buffer, for notifying userspace.
+ */
 static void i915_oa_stream_disable(struct i915_perf_stream *stream)
 {
 	struct drm_i915_private *dev_priv = stream->dev_priv;
@@ -987,6 +1113,24 @@ static const struct i915_perf_stream_ops i915_oa_stream_ops = {
 	.read = i915_oa_read,
 };
 
+/**
+ * i915_oa_stream_init - validate combined props for OA stream and init
+ * @stream: An i915 perf stream
+ * @param: The open parameters passed to `DRM_I915_PERF_OPEN`
+ * @props: The property state that configures stream (individually validated)
+ *
+ * While read_properties_unlocked() validates properties in isolation it
+ * doesn't ensure that the combination necessarily makes sense.
+ *
+ * At this point it has been determined that userspace wants a stream of
+ * OA metrics, but still we need to further validate the combined
+ * properties are OK.
+ *
+ * If the configuration makes sense then we can allocate memory for
+ * a circular OA buffer and apply the requested metric set configuration.
+ *
+ * Returns: zero on success or a negative error code.
+ */
 static int i915_oa_stream_init(struct i915_perf_stream *stream,
 			       struct drm_i915_perf_open_param *param,
 			       struct perf_open_properties *props)
@@ -1111,6 +1255,31 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream,
 	return ret;
 }
 
+/**
+ * i915_perf_read_locked - &i915_perf_stream_ops->read with error normalisation
+ * @stream: An i915 perf stream
+ * @file: An i915 perf stream file
+ * @buf: destination buffer given by userspace
+ * @count: the number of bytes userspace wants to read
+ * @ppos: (inout) file seek position (unused)
+ *
+ * Besides wrapping &i915_perf_stream_ops->read this provides a common place to
+ * ensure that if we've successfully copied any data then reporting that takes
+ * precedence over any internal error status, so the data isn't lost.
+ *
+ * For example ret will be -ENOSPC whenever there is more buffered data than
+ * can be copied to userspace, but that's only interesting if we weren't able
+ * to copy some data because it implies the userspace buffer is too small to
+ * receive a single record (and we never split records).
+ *
+ * Another case with ret == -EFAULT is more of a grey area since it would seem
+ * like bad form for userspace to ask us to overrun its buffer, but the user
+ * knows best:
+ *
+ *   http://yarchive.net/comp/linux/partial_reads_writes.html
+ *
+ * Returns: The number of bytes copied or a negative error code on failure.
+ */
 static ssize_t i915_perf_read_locked(struct i915_perf_stream *stream,
 				     struct file *file,
 				     char __user *buf,
@@ -1126,25 +1295,27 @@ static ssize_t i915_perf_read_locked(struct i915_perf_stream *stream,
 	size_t offset = 0;
 	int ret = stream->ops->read(stream, buf, count, &offset);
 
-	/* If we've successfully copied any data then reporting that
-	 * takes precedence over any internal error status, so the
-	 * data isn't lost.
-	 *
-	 * For example ret will be -ENOSPC whenever there is more
-	 * buffered data than can be copied to userspace, but that's
-	 * only interesting if we weren't able to copy some data
-	 * because it implies the userspace buffer is too small to
-	 * receive a single record (and we never split records).
-	 *
-	 * Another case with ret == -EFAULT is more of a grey area
-	 * since it would seem like bad form for userspace to ask us
-	 * to overrun its buffer, but the user knows best:
-	 *
-	 *   http://yarchive.net/comp/linux/partial_reads_writes.html
-	 */
 	return offset ?: (ret ?: -EAGAIN);
 }
 
+/**
+ * i915_perf_read - handles read() FOP for i915 perf stream FDs
+ * @file: An i915 perf stream file
+ * @buf: destination buffer given by userspace
+ * @count: the number of bytes userspace wants to read
+ * @ppos: (inout) file seek position (unused)
+ *
+ * The entry point for handling a read() on a stream file descriptor from
+ * userspace. Most of the work is left to the i915_perf_read_locked() and
+ * &i915_perf_stream_ops->read but to save having stream implementations (of
+ * which we might have multiple later) we handle blocking read here.
+ *
+ * We can also consistently treat trying to read from a disabled stream
+ * as an IO error so implementations can assume the stream is enabled
+ * while reading.
+ *
+ * Returns: The number of bytes copied or a negative error code on failure.
+ */
 static ssize_t i915_perf_read(struct file *file,
 			      char __user *buf,
 			      size_t count,
@@ -1211,6 +1382,22 @@ static enum hrtimer_restart oa_poll_check_timer_cb(struct hrtimer *hrtimer)
 	return HRTIMER_RESTART;
 }
 
+/**
+ * i915_perf_poll_locked - poll_wait() with a suitable wait queue for stream
+ * @dev_priv: i915 device instance
+ * @stream: An i915 perf stream
+ * @file: An i915 perf stream file
+ * @wait: poll() state table
+ *
+ * For handling userspace polling on an i915 perf stream, this calls through to
+ * &i915_perf_stream_ops->poll_wait to call poll_wait() with a wait queue that
+ * will be woken for new stream data.
+ *
+ * Note: The &drm_i915_private->perf.lock mutex has been taken to serialize
+ * with any non-file-operation driver hooks.
+ *
+ * Returns: any poll events that are ready without sleeping
+ */
 static unsigned int i915_perf_poll_locked(struct drm_i915_private *dev_priv,
 					  struct i915_perf_stream *stream,
 					  struct file *file,
@@ -1232,6 +1419,19 @@ static unsigned int i915_perf_poll_locked(struct drm_i915_private *dev_priv,
 	return events;
 }
 
+/**
+ * i915_perf_poll - call poll_wait() with a suitable wait queue for stream
+ * @file: An i915 perf stream file
+ * @wait: poll() state table
+ *
+ * For handling userspace polling on an i915 perf stream, this ensures
+ * poll_wait() gets called with a wait queue that will be woken for new stream
+ * data.
+ *
+ * Note: Implementation deferred to i915_perf_poll_locked()
+ *
+ * Returns: any poll events that are ready without sleeping
+ */
 static unsigned int i915_perf_poll(struct file *file, poll_table *wait)
 {
 	struct i915_perf_stream *stream = file->private_data;
@@ -1245,6 +1445,16 @@ static unsigned int i915_perf_poll(struct file *file, poll_table *wait)
 	return ret;
 }
 
+/**
+ * i915_perf_enable_locked - handle `I915_PERF_IOCTL_ENABLE` ioctl
+ * @stream: A disabled i915 perf stream
+ *
+ * [Re]enables the associated capture of data for this stream.
+ *
+ * If a stream was previously enabled then there's currently no intention
+ * to provide userspace any guarantee about the preservation of previously
+ * buffered data.
+ */
 static void i915_perf_enable_locked(struct i915_perf_stream *stream)
 {
 	if (stream->enabled)
@@ -1257,6 +1467,20 @@ static void i915_perf_enable_locked(struct i915_perf_stream *stream)
 		stream->ops->enable(stream);
 }
 
+/**
+ * i915_perf_disable_locked - handle `I915_PERF_IOCTL_DISABLE` ioctl
+ * @stream: An enabled i915 perf stream
+ *
+ * Disables the associated capture of data for this stream.
+ *
+ * The intention is that disabling an re-enabling a stream will ideally be
+ * cheaper than destroying and re-opening a stream with the same configuration,
+ * though there are no formal guarantees about what state or buffered data
+ * must be retained between disabling and re-enabling a stream.
+ *
+ * Note: while a stream is disabled it's considered an error for userspace
+ * to attempt to read from the stream (-EIO).
+ */
 static void i915_perf_disable_locked(struct i915_perf_stream *stream)
 {
 	if (!stream->enabled)
@@ -1269,6 +1493,18 @@ static void i915_perf_disable_locked(struct i915_perf_stream *stream)
 		stream->ops->disable(stream);
 }
 
+/**
+ * i915_perf_ioctl - support ioctl() usage with i915 perf stream FDs
+ * @stream: An i915 perf stream
+ * @cmd: the ioctl request
+ * @arg: the ioctl data
+ *
+ * Note: The &drm_i915_private->perf.lock mutex has been taken to serialize
+ * with any non-file-operation driver hooks.
+ *
+ * Returns: zero on success or a negative error code. Returns -EINVAL for
+ * an unknown ioctl request.
+ */
 static long i915_perf_ioctl_locked(struct i915_perf_stream *stream,
 				   unsigned int cmd,
 				   unsigned long arg)
@@ -1285,6 +1521,17 @@ static long i915_perf_ioctl_locked(struct i915_perf_stream *stream,
 	return -EINVAL;
 }
 
+/**
+ * i915_perf_ioctl - support ioctl() usage with i915 perf stream FDs
+ * @file: An i915 perf stream file
+ * @cmd: the ioctl request
+ * @arg: the ioctl data
+ *
+ * Implementation deferred to i915_perf_ioctl_locked().
+ *
+ * Returns: zero on success or a negative error code. Returns -EINVAL for
+ * an unknown ioctl request.
+ */
 static long i915_perf_ioctl(struct file *file,
 			    unsigned int cmd,
 			    unsigned long arg)
@@ -1300,6 +1547,16 @@ static long i915_perf_ioctl(struct file *file,
 	return ret;
 }
 
+/**
+ * i915_perf_destroy_locked - destroy an i915 perf stream
+ * @stream: An i915 perf stream
+ *
+ * Frees all resources associated with the given i915 perf @stream, disabling
+ * any associated data capture in the process.
+ *
+ * Note: The &drm_i915_private->perf.lock mutex has been taken to serialize
+ * with any non-file-operation driver hooks.
+ */
 static void i915_perf_destroy_locked(struct i915_perf_stream *stream)
 {
 	struct drm_i915_private *dev_priv = stream->dev_priv;
@@ -1321,6 +1578,17 @@ static void i915_perf_destroy_locked(struct i915_perf_stream *stream)
 	kfree(stream);
 }
 
+/**
+ * i915_perf_release - handles userspace close() of a stream file
+ * @inode: anonymous inode associated with file
+ * @file: An i915 perf stream file
+ *
+ * Cleans up any resources associated with an open i915 perf stream file.
+ *
+ * NB: close() can't really fail from the userspace point of view.
+ *
+ * Returns: zero on success or a negative error code.
+ */
 static int i915_perf_release(struct inode *inode, struct file *file)
 {
 	struct i915_perf_stream *stream = file->private_data;
@@ -1365,6 +1633,30 @@ lookup_context(struct drm_i915_private *dev_priv,
 	return ctx;
 }
 
+/**
+ * i915_perf_open_ioctl_locked - DRM ioctl() for userspace to open a stream FD
+ * @dev_priv: i915 device instance
+ * @param: The open parameters passed to 'DRM_I915_PERF_OPEN`
+ * @props: individually validated u64 property value pairs
+ * @file: drm file
+ *
+ * See i915_perf_ioctl_open() for interface details.
+ *
+ * Implements further stream config validation and stream initialization on
+ * behalf of i915_perf_open_ioctl() with the &drm_i915_private->perf.lock mutex
+ * taken to serialize with any non-file-operation driver hooks.
+ *
+ * Note: at this point the @props have only been validated in isolation and
+ * it's still necessary to validate that the combination of properties makes
+ * sense.
+ *
+ * In the case where userspace is interested in OA unit metrics then further
+ * config validation and stream initialization details will be handled by
+ * i915_oa_stream_init(). The code here should only validate config state that
+ * will be relevant to all stream types / backends.
+ *
+ * Returns: zero on success or a negative error code.
+ */
 static int
 i915_perf_open_ioctl_locked(struct drm_i915_private *dev_priv,
 			    struct drm_i915_perf_open_param *param,
@@ -1459,12 +1751,20 @@ i915_perf_open_ioctl_locked(struct drm_i915_private *dev_priv,
 	return ret;
 }
 
-/* Note we copy the properties from userspace outside of the i915 perf
- * mutex to avoid an awkward lockdep with mmap_sem.
+/**
+ * read_properties_unlocked - validate + copy userspace stream open properties
+ * @dev_priv: i915 device instance
+ * @uprops: The array of u64 key value pairs given by userspace
+ * @n_props: The number of key value pairs expected in @uprops
+ * @props: The stream configuration built up while validating properties
  *
  * Note this function only validates properties in isolation it doesn't
  * validate that the combination of properties makes sense or that all
  * properties necessary for a particular kind of stream have been set.
+ *
+ * Note that there currently aren't any ordering requirements for properties so
+ * we shouldn't validate or assume anything about ordering here. This doesn't
+ * rule out defining new properties with ordering requirements in the future.
  */
 static int read_properties_unlocked(struct drm_i915_private *dev_priv,
 				    u64 __user *uprops,
@@ -1586,6 +1886,30 @@ static int read_properties_unlocked(struct drm_i915_private *dev_priv,
 	return 0;
 }
 
+/**
+ * i915_perf_open_ioctl - DRM ioctl() for userspace to open a stream FD
+ * @dev: drm device
+ * @data: ioctl data copied from userspace (unvalidated)
+ * @file: drm file
+ *
+ * Validates the stream open parameters given by userspace including flags
+ * and an array of u64 key, value pair properties.
+ *
+ * Very little is assumed up front about the nature of the stream being
+ * opened (for instance we don't assume it's for periodic OA unit metrics). An
+ * i915-perf stream is expected to be a suitable interface for other forms of
+ * buffered data written by the GPU besides periodic OA metrics.
+ *
+ * Note we copy the properties from userspace outside of the i915 perf
+ * mutex to avoid an awkward lockdep with mmap_sem.
+ *
+ * Most of the implementation details are handled by
+ * i915_perf_open_ioctl_locked() after taking the &drm_i915_private->perf.lock
+ * mutex for serializing with any non-file-operation driver hooks.
+ *
+ * Return: A newly opened i915 Perf stream file descriptor or negative
+ * error code on failure.
+ */
 int i915_perf_open_ioctl(struct drm_device *dev, void *data,
 			 struct drm_file *file)
 {
@@ -1622,6 +1946,14 @@ int i915_perf_open_ioctl(struct drm_device *dev, void *data,
 	return ret;
 }
 
+/**
+ * i915_perf_register - exposes i915-perf to userspace
+ * @dev_priv: i915 device instance
+ *
+ * In particular OA metric sets are advertised under a sysfs metrics/
+ * directory allowing userspace to enumerate valid IDs that can be
+ * used to open an i915-perf stream.
+ */
 void i915_perf_register(struct drm_i915_private *dev_priv)
 {
 	if (!IS_HASWELL(dev_priv))
@@ -1651,6 +1983,15 @@ void i915_perf_register(struct drm_i915_private *dev_priv)
 	mutex_unlock(&dev_priv->perf.lock);
 }
 
+/**
+ * i915_perf_unregister - hide i915-perf from userspace
+ * @dev_priv: i915 device instance
+ *
+ * i915-perf state cleanup is split up into an 'unregister' and
+ * 'deinit' phase where the interface is first hidden from
+ * userspace by i915_perf_unregister() before cleaning up
+ * remaining state in i915_perf_fini().
+ */
 void i915_perf_unregister(struct drm_i915_private *dev_priv)
 {
 	if (!IS_HASWELL(dev_priv))
@@ -1707,6 +2048,15 @@ static struct ctl_table dev_root[] = {
 	{}
 };
 
+/**
+ * i915_perf_init - initialize i915-perf state on module load
+ * @dev_priv: i915 device instance
+ *
+ * Initializes i915-perf state without exposing anything to userspace.
+ *
+ * Note: i915-perf initialization is split into an 'init' and 'register'
+ * phase with the i915_perf_register() exposing state to userspace.
+ */
 void i915_perf_init(struct drm_i915_private *dev_priv)
 {
 	if (!IS_HASWELL(dev_priv))
@@ -1742,6 +2092,10 @@ void i915_perf_init(struct drm_i915_private *dev_priv)
 	dev_priv->perf.initialized = true;
 }
 
+/**
+ * i915_perf_fini - Counter part to i915_perf_init()
+ * @dev_priv: i915 device instance
+ */
 void i915_perf_fini(struct drm_i915_private *dev_priv)
 {
 	if (!dev_priv->perf.initialized)
-- 
2.10.2

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH] drm/i915/perf: More documentation hooked to i915.rst
  2016-12-05 11:20 [PATCH] drm/i915/perf: More documentation hooked to i915.rst Robert Bragg
@ 2016-12-05 12:16 ` Matthew Auld
  2016-12-07 21:40   ` Robert Bragg
  2016-12-05 13:55 ` ✗ Fi.CI.BAT: failure for drm/i915/perf: More documentation hooked to i915.rst (rev2) Patchwork
  2016-12-07 22:53 ` ✓ Fi.CI.BAT: success for drm/i915/perf: More documentation hooked to i915.rst (rev3) Patchwork
  2 siblings, 1 reply; 16+ messages in thread
From: Matthew Auld @ 2016-12-05 12:16 UTC (permalink / raw)
  To: Robert Bragg; +Cc: Daniel Vetter, Intel Graphics Development, Matthew Auld

On 5 December 2016 at 11:20, Robert Bragg <robert@sixbynine.org> wrote:
> This adds a 'Perf' section to i915.rst with the following sub sections:
> - Overview
> - Comparison with Core Perf
> - i915 Driver Entry Points
> - i915 Perf Stream
> - i915 Perf Observation Architecture Stream
> - All i915 Perf Internals
>
> v2:
>     section headers in i915.rst (Daniel Vetter)
>     missing symbol docs + other fixups (Matthew Auld)
>
> Signed-off-by: Robert Bragg <robert@sixbynine.org>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Cc: Matthew Auld <matthew.auld@intel.com>
> ---
>  Documentation/gpu/i915.rst       |  91 +++++++++
>  drivers/gpu/drm/i915/i915_drv.h  | 151 +++++++++++---
>  drivers/gpu/drm/i915/i915_perf.c | 412 ++++++++++++++++++++++++++++++++++++---
>  3 files changed, 598 insertions(+), 56 deletions(-)
>
> diff --git a/Documentation/gpu/i915.rst b/Documentation/gpu/i915.rst
> index 117d2ab..847a094 100644
> --- a/Documentation/gpu/i915.rst
> +++ b/Documentation/gpu/i915.rst
> @@ -356,4 +356,95 @@ switch_mm
>  .. kernel-doc:: drivers/gpu/drm/i915/i915_trace.h
>     :doc: switch_mm tracepoint
>
> +Perf
> +====
> +
> +Overview
> +--------
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :doc: i915 Perf Overview
> +
> +Comparison with Core Perf
> +-------------------------
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :doc: i915 Perf History and Comparison with Core Perf
> +
> +i915 Driver Entry Points
> +------------------------
> +
> +This section covers the entrypoints exported outside of i915_perf.c to
> +integrate with drm/i915 and to handle the `DRM_I915_PERF_OPEN` ioctl.
> +
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: i915_perf_init
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: i915_perf_fini
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: i915_perf_register
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: i915_perf_unregister
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: i915_perf_open_ioctl
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: i915_perf_release
> +
> +i915 Perf Stream
> +----------------
> +
> +This section covers the stream-semantics-agnostic structures and functions
> +for representing an i915 perf stream FD and associated file operations.
> +
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_drv.h
> +   :functions: i915_perf_stream
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_drv.h
> +   :functions: i915_perf_stream_ops
> +
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: read_properties_unlocked
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: i915_perf_open_ioctl_locked
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: i915_perf_destroy_locked
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: i915_perf_read
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: i915_perf_ioctl
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: i915_perf_enable_unlocked
s/i915_perf_enable_unlocked/i915_perf_enable_locked

> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: i915_perf_disable_unlocked
s/i915_perf_disable_unlocked/i915_perf_disable_locked

With that:
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* ✗ Fi.CI.BAT: failure for drm/i915/perf: More documentation hooked to i915.rst (rev2)
  2016-12-05 11:20 [PATCH] drm/i915/perf: More documentation hooked to i915.rst Robert Bragg
  2016-12-05 12:16 ` Matthew Auld
@ 2016-12-05 13:55 ` Patchwork
  2016-12-05 14:25   ` Chris Wilson
  2016-12-07 22:53 ` ✓ Fi.CI.BAT: success for drm/i915/perf: More documentation hooked to i915.rst (rev3) Patchwork
  2 siblings, 1 reply; 16+ messages in thread
From: Patchwork @ 2016-12-05 13:55 UTC (permalink / raw)
  To: Robert Bragg; +Cc: intel-gfx

== Series Details ==

Series: drm/i915/perf: More documentation hooked to i915.rst (rev2)
URL   : https://patchwork.freedesktop.org/series/16114/
State : failure

== Summary ==

Series 16114v2 drm/i915/perf: More documentation hooked to i915.rst
https://patchwork.freedesktop.org/api/1.0/series/16114/revisions/2/mbox/

Test drv_module_reload:
        Subgroup basic-reload-inject:
                pass       -> INCOMPLETE (fi-hsw-4770)

fi-bdw-5557u     total:247  pass:232  dwarn:0   dfail:0   fail:0   skip:15 
fi-bsw-n3050     total:247  pass:207  dwarn:0   dfail:0   fail:0   skip:40 
fi-bxt-t5700     total:247  pass:219  dwarn:0   dfail:0   fail:0   skip:28 
fi-byt-j1900     total:247  pass:219  dwarn:0   dfail:0   fail:0   skip:28 
fi-byt-n2820     total:247  pass:215  dwarn:0   dfail:0   fail:0   skip:32 
fi-hsw-4770      total:7    pass:6    dwarn:0   dfail:0   fail:0   skip:0  
fi-hsw-4770r     total:247  pass:227  dwarn:0   dfail:0   fail:0   skip:20 
fi-ilk-650       total:247  pass:194  dwarn:0   dfail:0   fail:0   skip:53 
fi-ivb-3520m     total:247  pass:225  dwarn:0   dfail:0   fail:0   skip:22 
fi-ivb-3770      total:247  pass:225  dwarn:0   dfail:0   fail:0   skip:22 
fi-kbl-7500u     total:247  pass:225  dwarn:0   dfail:0   fail:0   skip:22 
fi-skl-6260u     total:247  pass:233  dwarn:0   dfail:0   fail:0   skip:14 
fi-skl-6700hq    total:247  pass:226  dwarn:0   dfail:0   fail:0   skip:21 
fi-skl-6700k     total:247  pass:223  dwarn:3   dfail:0   fail:0   skip:21 
fi-skl-6770hq    total:247  pass:233  dwarn:0   dfail:0   fail:0   skip:14 
fi-snb-2520m     total:247  pass:215  dwarn:0   dfail:0   fail:0   skip:32 
fi-snb-2600      total:247  pass:214  dwarn:0   dfail:0   fail:0   skip:33 

a17bd268c6f9e1076a9aea44573ff58b0caeb44d drm-tip: 2016y-12m-05d-11h-19m-41s UTC integration manifest
a58cf97 drm/i915/perf: More documentation hooked to i915.rst

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_3189/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: ✗ Fi.CI.BAT: failure for drm/i915/perf: More documentation hooked to i915.rst (rev2)
  2016-12-05 13:55 ` ✗ Fi.CI.BAT: failure for drm/i915/perf: More documentation hooked to i915.rst (rev2) Patchwork
@ 2016-12-05 14:25   ` Chris Wilson
  0 siblings, 0 replies; 16+ messages in thread
From: Chris Wilson @ 2016-12-05 14:25 UTC (permalink / raw)
  To: intel-gfx

On Mon, Dec 05, 2016 at 01:55:02PM -0000, Patchwork wrote:
> == Series Details ==
> 
> Series: drm/i915/perf: More documentation hooked to i915.rst (rev2)
> URL   : https://patchwork.freedesktop.org/series/16114/
> State : failure
> 
> == Summary ==
> 
> Series 16114v2 drm/i915/perf: More documentation hooked to i915.rst
> https://patchwork.freedesktop.org/api/1.0/series/16114/revisions/2/mbox/
> 
> Test drv_module_reload:
>         Subgroup basic-reload-inject:
>                 pass       -> INCOMPLETE (fi-hsw-4770)

A use-after-free inside drm_fb_hotplug_helper (RAX is POISON_FREE). Could
do with a bug report + CI regularly using KASAN to see what is being
poisoned and help identify the missing lock(s).
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH] drm/i915/perf: More documentation hooked to i915.rst
  2016-12-05 12:16 ` Matthew Auld
@ 2016-12-07 21:40   ` Robert Bragg
  2016-12-08 15:53     ` Daniel Vetter
  0 siblings, 1 reply; 16+ messages in thread
From: Robert Bragg @ 2016-12-07 21:40 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter

This adds a 'Perf' section to i915.rst with the following sub sections:
- Overview
- Comparison with Core Perf
- i915 Driver Entry Points
- i915 Perf Stream
- i915 Perf Observation Architecture Stream
- All i915 Perf Internals

v2:
    section headers in i915.rst (Daniel Vetter)
    missing symbol docs + other fixups (Matthew Auld)

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 Documentation/gpu/i915.rst       |  91 +++++++++
 drivers/gpu/drm/i915/i915_drv.h  | 151 +++++++++++---
 drivers/gpu/drm/i915/i915_perf.c | 412 ++++++++++++++++++++++++++++++++++++---
 3 files changed, 598 insertions(+), 56 deletions(-)

diff --git a/Documentation/gpu/i915.rst b/Documentation/gpu/i915.rst
index 117d2ab..3843ef6 100644
--- a/Documentation/gpu/i915.rst
+++ b/Documentation/gpu/i915.rst
@@ -356,4 +356,95 @@ switch_mm
 .. kernel-doc:: drivers/gpu/drm/i915/i915_trace.h
    :doc: switch_mm tracepoint
 
+Perf
+====
+
+Overview
+--------
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :doc: i915 Perf Overview
+
+Comparison with Core Perf
+-------------------------
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :doc: i915 Perf History and Comparison with Core Perf
+
+i915 Driver Entry Points
+------------------------
+
+This section covers the entrypoints exported outside of i915_perf.c to
+integrate with drm/i915 and to handle the `DRM_I915_PERF_OPEN` ioctl.
+
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_perf_init
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_perf_fini
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_perf_register
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_perf_unregister
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_perf_open_ioctl
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_perf_release
+
+i915 Perf Stream
+----------------
+
+This section covers the stream-semantics-agnostic structures and functions
+for representing an i915 perf stream FD and associated file operations.
+
+.. kernel-doc:: drivers/gpu/drm/i915/i915_drv.h
+   :functions: i915_perf_stream
+.. kernel-doc:: drivers/gpu/drm/i915/i915_drv.h
+   :functions: i915_perf_stream_ops
+
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: read_properties_unlocked
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_perf_open_ioctl_locked
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_perf_destroy_locked
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_perf_read
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_perf_ioctl
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_perf_enable_locked
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_perf_disable_locked
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_perf_poll
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_perf_poll_locked
+
+i915 Perf Observation Architecture Stream
+-----------------------------------------
+
+.. kernel-doc:: drivers/gpu/drm/i915/i915_drv.h
+   :functions: i915_oa_ops
+
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_oa_stream_init
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_oa_read
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_oa_stream_enable
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_oa_stream_disable
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_oa_wait_unlocked
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_oa_poll_wait
+
+All i915 Perf Internals
+-----------------------
+
+This section simply includes all currently documented i915 perf internals, in
+no particular order, but may include some more minor utilities or platform
+specific details than found in the more high-level sections.
+
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :internal:
+
 .. WARNING: DOCPROC directive not supported: !Cdrivers/gpu/drm/i915/i915_irq.c
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 33758ac..49c7651 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1847,89 +1847,186 @@ struct i915_oa_reg {
 
 struct i915_perf_stream;
 
+/**
+ * struct i915_perf_stream_ops - the OPs to support a specific stream type
+ */
 struct i915_perf_stream_ops {
-	/* Enables the collection of HW samples, either in response to
-	 * I915_PERF_IOCTL_ENABLE or implicitly called when stream is
-	 * opened without I915_PERF_FLAG_DISABLED.
+	/**
+	 * @enable: Enables the collection of HW samples, either in response to
+	 * `I915_PERF_IOCTL_ENABLE` or implicitly called when stream is opened
+	 * without `I915_PERF_FLAG_DISABLED`.
 	 */
 	void (*enable)(struct i915_perf_stream *stream);
 
-	/* Disables the collection of HW samples, either in response to
-	 * I915_PERF_IOCTL_DISABLE or implicitly called before
-	 * destroying the stream.
+	/**
+	 * @disable: Disables the collection of HW samples, either in response
+	 * to `I915_PERF_IOCTL_DISABLE` or implicitly called before destroying
+	 * the stream.
 	 */
 	void (*disable)(struct i915_perf_stream *stream);
 
-	/* Call poll_wait, passing a wait queue that will be woken
+	/**
+	 * @poll_wait: Call poll_wait, passing a wait queue that will be woken
 	 * once there is something ready to read() for the stream
 	 */
 	void (*poll_wait)(struct i915_perf_stream *stream,
 			  struct file *file,
 			  poll_table *wait);
 
-	/* For handling a blocking read, wait until there is something
-	 * to ready to read() for the stream. E.g. wait on the same
+	/**
+	 * @wait_unlocked: For handling a blocking read, wait until there is
+	 * something to ready to read() for the stream. E.g. wait on the same
 	 * wait queue that would be passed to poll_wait().
 	 */
 	int (*wait_unlocked)(struct i915_perf_stream *stream);
 
-	/* read - Copy buffered metrics as records to userspace
-	 * @buf: the userspace, destination buffer
-	 * @count: the number of bytes to copy, requested by userspace
-	 * @offset: zero at the start of the read, updated as the read
-	 *          proceeds, it represents how many bytes have been
-	 *          copied so far and the buffer offset for copying the
-	 *          next record.
+	/**
+	 * @read: Copy buffered metrics as records to userspace
+	 * **buf**: the userspace, destination buffer
+	 * **count**: the number of bytes to copy, requested by userspace
+	 * **offset**: zero at the start of the read, updated as the read
+	 * proceeds, it represents how many bytes have been copied so far and
+	 * the buffer offset for copying the next record.
 	 *
-	 * Copy as many buffered i915 perf samples and records for
-	 * this stream to userspace as will fit in the given buffer.
+	 * Copy as many buffered i915 perf samples and records for this stream
+	 * to userspace as will fit in the given buffer.
 	 *
-	 * Only write complete records; returning -ENOSPC if there
-	 * isn't room for a complete record.
+	 * Only write complete records; returning -%ENOSPC if there isn't room
+	 * for a complete record.
 	 *
-	 * Return any error condition that results in a short read
-	 * such as -ENOSPC or -EFAULT, even though these may be
-	 * squashed before returning to userspace.
+	 * Return any error condition that results in a short read such as
+	 * -%ENOSPC or -%EFAULT, even though these may be squashed before
+	 * returning to userspace.
 	 */
 	int (*read)(struct i915_perf_stream *stream,
 		    char __user *buf,
 		    size_t count,
 		    size_t *offset);
 
-	/* Cleanup any stream specific resources.
+	/**
+	 * @destroy: Cleanup any stream specific resources.
 	 *
 	 * The stream will always be disabled before this is called.
 	 */
 	void (*destroy)(struct i915_perf_stream *stream);
 };
 
+/**
+ * struct i915_perf_stream - state for a single open stream FD
+ */
 struct i915_perf_stream {
+	/**
+	 * @dev_priv: i915 drm device
+	 */
 	struct drm_i915_private *dev_priv;
 
+	/**
+	 * @link: Links the stream into ``&drm_i915_private->streams``
+	 */
 	struct list_head link;
 
+	/**
+	 * @sample_flags: Flags representing the `DRM_I915_PERF_PROP_SAMPLE_*`
+	 * properties given when opening a stream, representing the contents
+	 * of a single sample as read() by userspace.
+	 */
 	u32 sample_flags;
+
+	/**
+	 * @sample_size: Considering the configured contents of a sample
+	 * combined with the required header size, this is the total size
+	 * of a single sample record.
+	 */
 	int sample_size;
 
+	/**
+	 * @ctx: %NULL if measuring system-wide across all contexts or a
+	 * specific context that is being monitored.
+	 */
 	struct i915_gem_context *ctx;
+
+	/**
+	 * @enabled: Whether the stream is currently enabled, considering
+	 * whether the stream was opened in a disabled state and based
+	 * on `I915_PERF_IOCTL_ENABLE` and `I915_PERF_IOCTL_DISABLE` calls.
+	 */
 	bool enabled;
 
+	/**
+	 * @ops: The callbacks providing the implementation of this specific
+	 * type of configured stream.
+	 */
 	const struct i915_perf_stream_ops *ops;
 };
 
+/**
+ * struct i915_oa_ops - Gen specific implementation of an OA unit stream
+ */
 struct i915_oa_ops {
+	/**
+	 * @init_oa_buffer: Resets the head and tail pointers of the
+	 * circular buffer for periodic OA reports.
+	 *
+	 * Called when first opening a stream for OA metrics, but also may be
+	 * called in response to an OA buffer overflow or other error
+	 * condition.
+	 *
+	 * Note it may be necessary to clear the full OA buffer here as part of
+	 * maintaining the invariable that new reports must be written to
+	 * zeroed memory for us to be able to reliable detect if an expected
+	 * report has not yet landed in memory.  (At least on Haswell the OA
+	 * buffer tail pointer is not synchronized with reports being visible
+	 * to the CPU)
+	 */
 	void (*init_oa_buffer)(struct drm_i915_private *dev_priv);
+
+	/**
+	 * @enable_metric_set: Applies any MUX configuration to set up the
+	 * Boolean and Custom (B/C) counters that are part of the counter
+	 * reports being sampled. May apply system constraints such as
+	 * disabling EU clock gating as required.
+	 */
 	int (*enable_metric_set)(struct drm_i915_private *dev_priv);
+
+	/**
+	 * @disable_metric_set: Remove system constraints associated with using
+	 * the OA unit.
+	 */
 	void (*disable_metric_set)(struct drm_i915_private *dev_priv);
+
+	/**
+	 * @oa_enable: Enable periodic sampling
+	 */
 	void (*oa_enable)(struct drm_i915_private *dev_priv);
+
+	/**
+	 * @oa_disable: Disable periodic sampling
+	 */
 	void (*oa_disable)(struct drm_i915_private *dev_priv);
-	void (*update_oacontrol)(struct drm_i915_private *dev_priv);
-	void (*update_hw_ctx_id_locked)(struct drm_i915_private *dev_priv,
-					u32 ctx_id);
+
+	/**
+	 * @read: Copy data from the circular OA buffer into a given userspace
+	 * buffer.
+	 */
 	int (*read)(struct i915_perf_stream *stream,
 		    char __user *buf,
 		    size_t count,
 		    size_t *offset);
+
+	/**
+	 * @oa_buffer_is_empty: Check if OA buffer empty (false positives OK)
+	 *
+	 * This is either called via fops or the poll check hrtimer (atomic
+	 * ctx) without any locks taken.
+	 *
+	 * It's safe to read OA config state here unlocked, assuming that this
+	 * is only called while the stream is enabled, while the global OA
+	 * configuration can't be modified.
+	 *
+	 * Efficiency is more important than avoiding some false positives
+	 * here, which will be handled gracefully - likely resulting in an
+	 * %EAGAIN error for userspace.
+	 */
 	bool (*oa_buffer_is_empty)(struct drm_i915_private *dev_priv);
 };
 
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 0c6e124..ae7bd0e 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -26,7 +26,7 @@
 
 
 /**
- * DOC: i915 Perf, streaming API for GPU metrics
+ * DOC: i915 Perf Overview
  *
  * Gen graphics supports a large number of performance counters that can help
  * driver and application developers understand and optimize their use of the
@@ -45,6 +45,10 @@
  * privileges by default, unless changed via the dev.i915.perf_event_paranoid
  * sysctl option.
  *
+ */
+
+/**
+ * DOC: i915 Perf History and Comparison with Core Perf
  *
  * The interface was initially inspired by the core Perf infrastructure but
  * some notable differences are:
@@ -75,8 +79,8 @@
  * gets copied from the GPU mapped buffers to userspace buffers.
  *
  *
- * Some notes regarding Linux Perf:
- * --------------------------------
+ * Issues hit with first prototype based on Core Perf
+ * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  *
  * The first prototype of this driver was based on the core perf
  * infrastructure, and while we did make that mostly work, with some changes to
@@ -135,7 +139,7 @@
  *   for combining with the side-band raw reports it captures using
  *   MI_REPORT_PERF_COUNT commands.
  *
- *   _ As a side note on perf's grouping feature; there was also some concern
+ *   - As a side note on perf's grouping feature; there was also some concern
  *     that using PERF_FORMAT_GROUP as a way to pack together counter values
  *     would quite drastically inflate our sample sizes, which would likely
  *     lower the effective sampling resolutions we could use when the available
@@ -277,6 +281,20 @@ static struct i915_oa_format hsw_oa_formats[I915_OA_FORMAT_MAX] = {
 
 #define SAMPLE_OA_REPORT      (1<<0)
 
+/**
+ * struct perf_open_properties - for validated properties given to open a stream
+ * @sample_flags: `DRM_I915_PERF_PROP_SAMPLE_*` properties are tracked as flags
+ * @single_context: Whether a single or all gpu contexts should be monitored
+ * @ctx_handle: A gem ctx handle for use with @single_context
+ * @metrics_set: An ID for an OA unit metric set advertised via sysfs
+ * @oa_format: An OA unit HW report format
+ * @oa_periodic: Whether to enable periodic OA unit sampling
+ * @oa_period_exponent: The OA unit sampling period is derived from this
+ *
+ * As read_properties_unlocked() enumerates and validates the properties given
+ * to open a stream of metrics the configuration is built up in the structure
+ * which starts out zero initialized.
+ */
 struct perf_open_properties {
 	u32 sample_flags;
 
@@ -314,7 +332,19 @@ static bool gen7_oa_buffer_is_empty_fop_unlocked(struct drm_i915_private *dev_pr
 }
 
 /**
- * Appends a status record to a userspace read() buffer.
+ * append_oa_status - Appends a status record to a userspace read() buffer.
+ * @stream: An i915-perf stream opened for OA metrics
+ * @buf: destination buffer given by userspace
+ * @count: the number of bytes userspace wants to read
+ * @offset: (inout): the current position for writing into @buf
+ * @type: The kind of status to report to userspace
+ *
+ * Writes a status record (such as `DRM_I915_PERF_RECORD_OA_REPORT_LOST`)
+ * into the userspace read() buffer.
+ *
+ * The @buf @offset will only be updated on success.
+ *
+ * Returns: 0 on success, negative error code on failure.
  */
 static int append_oa_status(struct i915_perf_stream *stream,
 			    char __user *buf,
@@ -336,7 +366,21 @@ static int append_oa_status(struct i915_perf_stream *stream,
 }
 
 /**
- * Copies single OA report into userspace read() buffer.
+ * append_oa_sample - Copies single OA report into userspace read() buffer.
+ * @stream: An i915-perf stream opened for OA metrics
+ * @buf: destination buffer given by userspace
+ * @count: the number of bytes userspace wants to read
+ * @offset: (inout): the current position for writing into @buf
+ * @report: A single OA report to (optionally) include as part of the sample
+ *
+ * The contents of a sample are configured through `DRM_I915_PERF_PROP_SAMPLE_*`
+ * properties when opening a stream, tracked as `stream->sample_flags`. This
+ * function copies the requested components of a single sample to the given
+ * read() @buf.
+ *
+ * The @buf @offset will only be updated on success.
+ *
+ * Returns: 0 on success, negative error code on failure.
  */
 static int append_oa_sample(struct i915_perf_stream *stream,
 			    char __user *buf,
@@ -380,10 +424,8 @@ static int append_oa_sample(struct i915_perf_stream *stream,
  * @head_ptr: (inout): the current oa buffer cpu read position
  * @tail: the current oa buffer gpu write position
  *
- * Returns 0 on success, negative error code on failure.
- *
- * Notably any error condition resulting in a short read (-ENOSPC or
- * -EFAULT) will be returned even though one or more records may
+ * Notably any error condition resulting in a short read (-%ENOSPC or
+ * -%EFAULT) will be returned even though one or more records may
  * have been successfully copied. In this case it's up to the caller
  * to decide if the error should be squashed before returning to
  * userspace.
@@ -392,6 +434,8 @@ static int append_oa_sample(struct i915_perf_stream *stream,
  * tail, so the head chases the tail?... If you think that's mad
  * and back-to-front you're not alone, but this follows the
  * Gen PRM naming convention.
+ *
+ * Returns: 0 on success, negative error code on failure.
  */
 static int gen7_append_oa_reports(struct i915_perf_stream *stream,
 				  char __user *buf,
@@ -496,6 +540,22 @@ static int gen7_append_oa_reports(struct i915_perf_stream *stream,
 	return ret;
 }
 
+/**
+ * gen7_oa_read - copy status records then buffered OA reports
+ * @stream: An i915-perf stream opened for OA metrics
+ * @buf: destination buffer given by userspace
+ * @count: the number of bytes userspace wants to read
+ * @offset: (inout): the current position for writing into @buf
+ *
+ * Checks Gen 7 specific OA unit status registers and if necessary appends
+ * corresponding status records for userspace (such as for a buffer full
+ * condition) and then initiate appending any buffered OA reports.
+ *
+ * Updates @offset according to the number of bytes successfully copied into
+ * the userspace buffer.
+ *
+ * Returns: zero on success or a negative error code
+ */
 static int gen7_oa_read(struct i915_perf_stream *stream,
 			char __user *buf,
 			size_t count,
@@ -597,6 +657,20 @@ static int gen7_oa_read(struct i915_perf_stream *stream,
 	return ret;
 }
 
+/**
+ * i915_oa_wait_unlocked - handles blocking IO until OA data available
+ * @stream: An i915-perf stream opened for OA metrics
+ *
+ * Called when userspace tries to read() from a blocking stream FD opened
+ * for OA metrics. It waits until the hrtimer callback finds a non-empty
+ * OA buffer and wakes us.
+ *
+ * Note: it's acceptable to have this return with some false positives
+ * since any subsequent read handling will return -EAGAIN if there isn't
+ * really data ready for userspace yet.
+ *
+ * Returns: zero on success or a negative error code
+ */
 static int i915_oa_wait_unlocked(struct i915_perf_stream *stream)
 {
 	struct drm_i915_private *dev_priv = stream->dev_priv;
@@ -615,6 +689,16 @@ static int i915_oa_wait_unlocked(struct i915_perf_stream *stream)
 					!dev_priv->perf.oa.ops.oa_buffer_is_empty(dev_priv));
 }
 
+/**
+ * i915_oa_poll_wait - call poll_wait() for an OA stream poll()
+ * @stream: An i915-perf stream opened for OA metrics
+ * @file: An i915 perf stream file
+ * @wait: poll() state table
+ *
+ * For handling userspace polling on an i915 perf stream opened for OA metrics,
+ * this starts a poll_wait with the wait queue that our hrtimer callback wakes
+ * when it sees data ready to read in the circular OA buffer.
+ */
 static void i915_oa_poll_wait(struct i915_perf_stream *stream,
 			      struct file *file,
 			      poll_table *wait)
@@ -624,6 +708,18 @@ static void i915_oa_poll_wait(struct i915_perf_stream *stream,
 	poll_wait(file, &dev_priv->perf.oa.poll_wq, wait);
 }
 
+/**
+ * i915_oa_read - just calls through to &i915_oa_ops->read
+ * @stream: An i915-perf stream opened for OA metrics
+ * @buf: destination buffer given by userspace
+ * @count: the number of bytes userspace wants to read
+ * @offset: (inout): the current position for writing into @buf
+ *
+ * Updates @offset according to the number of bytes successfully copied into
+ * the userspace buffer.
+ *
+ * Returns: zero on success or a negative error code
+ */
 static int i915_oa_read(struct i915_perf_stream *stream,
 			char __user *buf,
 			size_t count,
@@ -634,9 +730,15 @@ static int i915_oa_read(struct i915_perf_stream *stream,
 	return dev_priv->perf.oa.ops.read(stream, buf, count, offset);
 }
 
-/* Determine the render context hw id, and ensure it remains fixed for the
+/**
+ * oa_get_render_ctx_id - determine and hold ctx hw id
+ * @stream: An i915-perf stream opened for OA metrics
+ *
+ * Determine the render context hw id, and ensure it remains fixed for the
  * lifetime of the stream. This ensures that we don't have to worry about
  * updating the context ID in OACONTROL on the fly.
+ *
+ * Returns: zero on success or a negative error code
  */
 static int oa_get_render_ctx_id(struct i915_perf_stream *stream)
 {
@@ -673,6 +775,13 @@ static int oa_get_render_ctx_id(struct i915_perf_stream *stream)
 	return ret;
 }
 
+/**
+ * oa_put_render_ctx_id - counterpart to oa_get_render_ctx_id releases hold
+ * @stream: An i915-perf stream opened for OA metrics
+ *
+ * In case anything needed doing to ensure the context HW ID would remain valid
+ * for the lifetime of the stream, then that can be undone here.
+ */
 static void oa_put_render_ctx_id(struct i915_perf_stream *stream)
 {
 	struct drm_i915_private *dev_priv = stream->dev_priv;
@@ -945,6 +1054,15 @@ static void gen7_oa_enable(struct drm_i915_private *dev_priv)
 	spin_unlock_irqrestore(&dev_priv->perf.hook_lock, flags);
 }
 
+/**
+ * i915_oa_stream_enable - handle `I915_PERF_IOCTL_ENABLE` for OA stream
+ * @stream: An i915 perf stream opened for OA metrics
+ *
+ * [Re]enables hardware periodic sampling according to the period configured
+ * when opening the stream. This also starts a hrtimer that will periodically
+ * check for data in the circular OA buffer for notifying userspace (e.g.
+ * during a read() or poll()).
+ */
 static void i915_oa_stream_enable(struct i915_perf_stream *stream)
 {
 	struct drm_i915_private *dev_priv = stream->dev_priv;
@@ -962,6 +1080,14 @@ static void gen7_oa_disable(struct drm_i915_private *dev_priv)
 	I915_WRITE(GEN7_OACONTROL, 0);
 }
 
+/**
+ * i915_oa_stream_disable - handle `I915_PERF_IOCTL_DISABLE` for OA stream
+ * @stream: An i915 perf stream opened for OA metrics
+ *
+ * Stops the OA unit from periodically writing counter reports into the
+ * circular OA buffer. This also stops the hrtimer that periodically checks for
+ * data in the circular OA buffer, for notifying userspace.
+ */
 static void i915_oa_stream_disable(struct i915_perf_stream *stream)
 {
 	struct drm_i915_private *dev_priv = stream->dev_priv;
@@ -987,6 +1113,24 @@ static const struct i915_perf_stream_ops i915_oa_stream_ops = {
 	.read = i915_oa_read,
 };
 
+/**
+ * i915_oa_stream_init - validate combined props for OA stream and init
+ * @stream: An i915 perf stream
+ * @param: The open parameters passed to `DRM_I915_PERF_OPEN`
+ * @props: The property state that configures stream (individually validated)
+ *
+ * While read_properties_unlocked() validates properties in isolation it
+ * doesn't ensure that the combination necessarily makes sense.
+ *
+ * At this point it has been determined that userspace wants a stream of
+ * OA metrics, but still we need to further validate the combined
+ * properties are OK.
+ *
+ * If the configuration makes sense then we can allocate memory for
+ * a circular OA buffer and apply the requested metric set configuration.
+ *
+ * Returns: zero on success or a negative error code.
+ */
 static int i915_oa_stream_init(struct i915_perf_stream *stream,
 			       struct drm_i915_perf_open_param *param,
 			       struct perf_open_properties *props)
@@ -1111,6 +1255,31 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream,
 	return ret;
 }
 
+/**
+ * i915_perf_read_locked - &i915_perf_stream_ops->read with error normalisation
+ * @stream: An i915 perf stream
+ * @file: An i915 perf stream file
+ * @buf: destination buffer given by userspace
+ * @count: the number of bytes userspace wants to read
+ * @ppos: (inout) file seek position (unused)
+ *
+ * Besides wrapping &i915_perf_stream_ops->read this provides a common place to
+ * ensure that if we've successfully copied any data then reporting that takes
+ * precedence over any internal error status, so the data isn't lost.
+ *
+ * For example ret will be -ENOSPC whenever there is more buffered data than
+ * can be copied to userspace, but that's only interesting if we weren't able
+ * to copy some data because it implies the userspace buffer is too small to
+ * receive a single record (and we never split records).
+ *
+ * Another case with ret == -EFAULT is more of a grey area since it would seem
+ * like bad form for userspace to ask us to overrun its buffer, but the user
+ * knows best:
+ *
+ *   http://yarchive.net/comp/linux/partial_reads_writes.html
+ *
+ * Returns: The number of bytes copied or a negative error code on failure.
+ */
 static ssize_t i915_perf_read_locked(struct i915_perf_stream *stream,
 				     struct file *file,
 				     char __user *buf,
@@ -1126,25 +1295,27 @@ static ssize_t i915_perf_read_locked(struct i915_perf_stream *stream,
 	size_t offset = 0;
 	int ret = stream->ops->read(stream, buf, count, &offset);
 
-	/* If we've successfully copied any data then reporting that
-	 * takes precedence over any internal error status, so the
-	 * data isn't lost.
-	 *
-	 * For example ret will be -ENOSPC whenever there is more
-	 * buffered data than can be copied to userspace, but that's
-	 * only interesting if we weren't able to copy some data
-	 * because it implies the userspace buffer is too small to
-	 * receive a single record (and we never split records).
-	 *
-	 * Another case with ret == -EFAULT is more of a grey area
-	 * since it would seem like bad form for userspace to ask us
-	 * to overrun its buffer, but the user knows best:
-	 *
-	 *   http://yarchive.net/comp/linux/partial_reads_writes.html
-	 */
 	return offset ?: (ret ?: -EAGAIN);
 }
 
+/**
+ * i915_perf_read - handles read() FOP for i915 perf stream FDs
+ * @file: An i915 perf stream file
+ * @buf: destination buffer given by userspace
+ * @count: the number of bytes userspace wants to read
+ * @ppos: (inout) file seek position (unused)
+ *
+ * The entry point for handling a read() on a stream file descriptor from
+ * userspace. Most of the work is left to the i915_perf_read_locked() and
+ * &i915_perf_stream_ops->read but to save having stream implementations (of
+ * which we might have multiple later) we handle blocking read here.
+ *
+ * We can also consistently treat trying to read from a disabled stream
+ * as an IO error so implementations can assume the stream is enabled
+ * while reading.
+ *
+ * Returns: The number of bytes copied or a negative error code on failure.
+ */
 static ssize_t i915_perf_read(struct file *file,
 			      char __user *buf,
 			      size_t count,
@@ -1211,6 +1382,22 @@ static enum hrtimer_restart oa_poll_check_timer_cb(struct hrtimer *hrtimer)
 	return HRTIMER_RESTART;
 }
 
+/**
+ * i915_perf_poll_locked - poll_wait() with a suitable wait queue for stream
+ * @dev_priv: i915 device instance
+ * @stream: An i915 perf stream
+ * @file: An i915 perf stream file
+ * @wait: poll() state table
+ *
+ * For handling userspace polling on an i915 perf stream, this calls through to
+ * &i915_perf_stream_ops->poll_wait to call poll_wait() with a wait queue that
+ * will be woken for new stream data.
+ *
+ * Note: The &drm_i915_private->perf.lock mutex has been taken to serialize
+ * with any non-file-operation driver hooks.
+ *
+ * Returns: any poll events that are ready without sleeping
+ */
 static unsigned int i915_perf_poll_locked(struct drm_i915_private *dev_priv,
 					  struct i915_perf_stream *stream,
 					  struct file *file,
@@ -1232,6 +1419,19 @@ static unsigned int i915_perf_poll_locked(struct drm_i915_private *dev_priv,
 	return events;
 }
 
+/**
+ * i915_perf_poll - call poll_wait() with a suitable wait queue for stream
+ * @file: An i915 perf stream file
+ * @wait: poll() state table
+ *
+ * For handling userspace polling on an i915 perf stream, this ensures
+ * poll_wait() gets called with a wait queue that will be woken for new stream
+ * data.
+ *
+ * Note: Implementation deferred to i915_perf_poll_locked()
+ *
+ * Returns: any poll events that are ready without sleeping
+ */
 static unsigned int i915_perf_poll(struct file *file, poll_table *wait)
 {
 	struct i915_perf_stream *stream = file->private_data;
@@ -1245,6 +1445,16 @@ static unsigned int i915_perf_poll(struct file *file, poll_table *wait)
 	return ret;
 }
 
+/**
+ * i915_perf_enable_locked - handle `I915_PERF_IOCTL_ENABLE` ioctl
+ * @stream: A disabled i915 perf stream
+ *
+ * [Re]enables the associated capture of data for this stream.
+ *
+ * If a stream was previously enabled then there's currently no intention
+ * to provide userspace any guarantee about the preservation of previously
+ * buffered data.
+ */
 static void i915_perf_enable_locked(struct i915_perf_stream *stream)
 {
 	if (stream->enabled)
@@ -1257,6 +1467,20 @@ static void i915_perf_enable_locked(struct i915_perf_stream *stream)
 		stream->ops->enable(stream);
 }
 
+/**
+ * i915_perf_disable_locked - handle `I915_PERF_IOCTL_DISABLE` ioctl
+ * @stream: An enabled i915 perf stream
+ *
+ * Disables the associated capture of data for this stream.
+ *
+ * The intention is that disabling an re-enabling a stream will ideally be
+ * cheaper than destroying and re-opening a stream with the same configuration,
+ * though there are no formal guarantees about what state or buffered data
+ * must be retained between disabling and re-enabling a stream.
+ *
+ * Note: while a stream is disabled it's considered an error for userspace
+ * to attempt to read from the stream (-EIO).
+ */
 static void i915_perf_disable_locked(struct i915_perf_stream *stream)
 {
 	if (!stream->enabled)
@@ -1269,6 +1493,18 @@ static void i915_perf_disable_locked(struct i915_perf_stream *stream)
 		stream->ops->disable(stream);
 }
 
+/**
+ * i915_perf_ioctl - support ioctl() usage with i915 perf stream FDs
+ * @stream: An i915 perf stream
+ * @cmd: the ioctl request
+ * @arg: the ioctl data
+ *
+ * Note: The &drm_i915_private->perf.lock mutex has been taken to serialize
+ * with any non-file-operation driver hooks.
+ *
+ * Returns: zero on success or a negative error code. Returns -EINVAL for
+ * an unknown ioctl request.
+ */
 static long i915_perf_ioctl_locked(struct i915_perf_stream *stream,
 				   unsigned int cmd,
 				   unsigned long arg)
@@ -1285,6 +1521,17 @@ static long i915_perf_ioctl_locked(struct i915_perf_stream *stream,
 	return -EINVAL;
 }
 
+/**
+ * i915_perf_ioctl - support ioctl() usage with i915 perf stream FDs
+ * @file: An i915 perf stream file
+ * @cmd: the ioctl request
+ * @arg: the ioctl data
+ *
+ * Implementation deferred to i915_perf_ioctl_locked().
+ *
+ * Returns: zero on success or a negative error code. Returns -EINVAL for
+ * an unknown ioctl request.
+ */
 static long i915_perf_ioctl(struct file *file,
 			    unsigned int cmd,
 			    unsigned long arg)
@@ -1300,6 +1547,16 @@ static long i915_perf_ioctl(struct file *file,
 	return ret;
 }
 
+/**
+ * i915_perf_destroy_locked - destroy an i915 perf stream
+ * @stream: An i915 perf stream
+ *
+ * Frees all resources associated with the given i915 perf @stream, disabling
+ * any associated data capture in the process.
+ *
+ * Note: The &drm_i915_private->perf.lock mutex has been taken to serialize
+ * with any non-file-operation driver hooks.
+ */
 static void i915_perf_destroy_locked(struct i915_perf_stream *stream)
 {
 	struct drm_i915_private *dev_priv = stream->dev_priv;
@@ -1321,6 +1578,17 @@ static void i915_perf_destroy_locked(struct i915_perf_stream *stream)
 	kfree(stream);
 }
 
+/**
+ * i915_perf_release - handles userspace close() of a stream file
+ * @inode: anonymous inode associated with file
+ * @file: An i915 perf stream file
+ *
+ * Cleans up any resources associated with an open i915 perf stream file.
+ *
+ * NB: close() can't really fail from the userspace point of view.
+ *
+ * Returns: zero on success or a negative error code.
+ */
 static int i915_perf_release(struct inode *inode, struct file *file)
 {
 	struct i915_perf_stream *stream = file->private_data;
@@ -1365,6 +1633,30 @@ lookup_context(struct drm_i915_private *dev_priv,
 	return ctx;
 }
 
+/**
+ * i915_perf_open_ioctl_locked - DRM ioctl() for userspace to open a stream FD
+ * @dev_priv: i915 device instance
+ * @param: The open parameters passed to 'DRM_I915_PERF_OPEN`
+ * @props: individually validated u64 property value pairs
+ * @file: drm file
+ *
+ * See i915_perf_ioctl_open() for interface details.
+ *
+ * Implements further stream config validation and stream initialization on
+ * behalf of i915_perf_open_ioctl() with the &drm_i915_private->perf.lock mutex
+ * taken to serialize with any non-file-operation driver hooks.
+ *
+ * Note: at this point the @props have only been validated in isolation and
+ * it's still necessary to validate that the combination of properties makes
+ * sense.
+ *
+ * In the case where userspace is interested in OA unit metrics then further
+ * config validation and stream initialization details will be handled by
+ * i915_oa_stream_init(). The code here should only validate config state that
+ * will be relevant to all stream types / backends.
+ *
+ * Returns: zero on success or a negative error code.
+ */
 static int
 i915_perf_open_ioctl_locked(struct drm_i915_private *dev_priv,
 			    struct drm_i915_perf_open_param *param,
@@ -1459,12 +1751,20 @@ i915_perf_open_ioctl_locked(struct drm_i915_private *dev_priv,
 	return ret;
 }
 
-/* Note we copy the properties from userspace outside of the i915 perf
- * mutex to avoid an awkward lockdep with mmap_sem.
+/**
+ * read_properties_unlocked - validate + copy userspace stream open properties
+ * @dev_priv: i915 device instance
+ * @uprops: The array of u64 key value pairs given by userspace
+ * @n_props: The number of key value pairs expected in @uprops
+ * @props: The stream configuration built up while validating properties
  *
  * Note this function only validates properties in isolation it doesn't
  * validate that the combination of properties makes sense or that all
  * properties necessary for a particular kind of stream have been set.
+ *
+ * Note that there currently aren't any ordering requirements for properties so
+ * we shouldn't validate or assume anything about ordering here. This doesn't
+ * rule out defining new properties with ordering requirements in the future.
  */
 static int read_properties_unlocked(struct drm_i915_private *dev_priv,
 				    u64 __user *uprops,
@@ -1586,6 +1886,30 @@ static int read_properties_unlocked(struct drm_i915_private *dev_priv,
 	return 0;
 }
 
+/**
+ * i915_perf_open_ioctl - DRM ioctl() for userspace to open a stream FD
+ * @dev: drm device
+ * @data: ioctl data copied from userspace (unvalidated)
+ * @file: drm file
+ *
+ * Validates the stream open parameters given by userspace including flags
+ * and an array of u64 key, value pair properties.
+ *
+ * Very little is assumed up front about the nature of the stream being
+ * opened (for instance we don't assume it's for periodic OA unit metrics). An
+ * i915-perf stream is expected to be a suitable interface for other forms of
+ * buffered data written by the GPU besides periodic OA metrics.
+ *
+ * Note we copy the properties from userspace outside of the i915 perf
+ * mutex to avoid an awkward lockdep with mmap_sem.
+ *
+ * Most of the implementation details are handled by
+ * i915_perf_open_ioctl_locked() after taking the &drm_i915_private->perf.lock
+ * mutex for serializing with any non-file-operation driver hooks.
+ *
+ * Return: A newly opened i915 Perf stream file descriptor or negative
+ * error code on failure.
+ */
 int i915_perf_open_ioctl(struct drm_device *dev, void *data,
 			 struct drm_file *file)
 {
@@ -1622,6 +1946,14 @@ int i915_perf_open_ioctl(struct drm_device *dev, void *data,
 	return ret;
 }
 
+/**
+ * i915_perf_register - exposes i915-perf to userspace
+ * @dev_priv: i915 device instance
+ *
+ * In particular OA metric sets are advertised under a sysfs metrics/
+ * directory allowing userspace to enumerate valid IDs that can be
+ * used to open an i915-perf stream.
+ */
 void i915_perf_register(struct drm_i915_private *dev_priv)
 {
 	if (!IS_HASWELL(dev_priv))
@@ -1651,6 +1983,15 @@ void i915_perf_register(struct drm_i915_private *dev_priv)
 	mutex_unlock(&dev_priv->perf.lock);
 }
 
+/**
+ * i915_perf_unregister - hide i915-perf from userspace
+ * @dev_priv: i915 device instance
+ *
+ * i915-perf state cleanup is split up into an 'unregister' and
+ * 'deinit' phase where the interface is first hidden from
+ * userspace by i915_perf_unregister() before cleaning up
+ * remaining state in i915_perf_fini().
+ */
 void i915_perf_unregister(struct drm_i915_private *dev_priv)
 {
 	if (!IS_HASWELL(dev_priv))
@@ -1707,6 +2048,15 @@ static struct ctl_table dev_root[] = {
 	{}
 };
 
+/**
+ * i915_perf_init - initialize i915-perf state on module load
+ * @dev_priv: i915 device instance
+ *
+ * Initializes i915-perf state without exposing anything to userspace.
+ *
+ * Note: i915-perf initialization is split into an 'init' and 'register'
+ * phase with the i915_perf_register() exposing state to userspace.
+ */
 void i915_perf_init(struct drm_i915_private *dev_priv)
 {
 	if (!IS_HASWELL(dev_priv))
@@ -1742,6 +2092,10 @@ void i915_perf_init(struct drm_i915_private *dev_priv)
 	dev_priv->perf.initialized = true;
 }
 
+/**
+ * i915_perf_fini - Counter part to i915_perf_init()
+ * @dev_priv: i915 device instance
+ */
 void i915_perf_fini(struct drm_i915_private *dev_priv)
 {
 	if (!dev_priv->perf.initialized)
-- 
2.10.2

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* ✓ Fi.CI.BAT: success for drm/i915/perf: More documentation hooked to i915.rst (rev3)
  2016-12-05 11:20 [PATCH] drm/i915/perf: More documentation hooked to i915.rst Robert Bragg
  2016-12-05 12:16 ` Matthew Auld
  2016-12-05 13:55 ` ✗ Fi.CI.BAT: failure for drm/i915/perf: More documentation hooked to i915.rst (rev2) Patchwork
@ 2016-12-07 22:53 ` Patchwork
  2 siblings, 0 replies; 16+ messages in thread
From: Patchwork @ 2016-12-07 22:53 UTC (permalink / raw)
  To: Robert Bragg; +Cc: intel-gfx

== Series Details ==

Series: drm/i915/perf: More documentation hooked to i915.rst (rev3)
URL   : https://patchwork.freedesktop.org/series/16114/
State : success

== Summary ==

Series 16114v3 drm/i915/perf: More documentation hooked to i915.rst
https://patchwork.freedesktop.org/api/1.0/series/16114/revisions/3/mbox/


fi-bdw-5557u     total:247  pass:219  dwarn:0   dfail:0   fail:0   skip:28 
fi-bsw-n3050     total:247  pass:195  dwarn:0   dfail:0   fail:0   skip:52 
fi-bxt-t5700     total:247  pass:206  dwarn:0   dfail:0   fail:0   skip:41 
fi-byt-j1900     total:247  pass:206  dwarn:0   dfail:0   fail:0   skip:41 
fi-hsw-4770      total:247  pass:214  dwarn:0   dfail:0   fail:0   skip:33 
fi-hsw-4770r     total:247  pass:214  dwarn:0   dfail:0   fail:0   skip:33 
fi-ilk-650       total:247  pass:181  dwarn:0   dfail:0   fail:0   skip:66 
fi-ivb-3520m     total:247  pass:213  dwarn:0   dfail:0   fail:0   skip:34 
fi-ivb-3770      total:247  pass:212  dwarn:0   dfail:0   fail:0   skip:35 
fi-kbl-7500u     total:247  pass:212  dwarn:0   dfail:0   fail:0   skip:35 
fi-skl-6260u     total:247  pass:220  dwarn:0   dfail:0   fail:0   skip:27 
fi-skl-6700hq    total:247  pass:214  dwarn:0   dfail:0   fail:0   skip:33 
fi-skl-6700k     total:247  pass:210  dwarn:3   dfail:0   fail:0   skip:34 
fi-skl-6770hq    total:247  pass:220  dwarn:0   dfail:0   fail:0   skip:27 
fi-snb-2520m     total:247  pass:202  dwarn:0   dfail:0   fail:0   skip:45 
fi-snb-2600      total:247  pass:201  dwarn:0   dfail:0   fail:0   skip:46 

ffcd96e45b3b3e4ca57d4879405d02b7b1b58946 drm-tip: 2016y-12m-07d-21h-45m-40s UTC integration manifest
d7cb56d drm/i915/perf: More documentation hooked to i915.rst

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_3227/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] drm/i915/perf: More documentation hooked to i915.rst
  2016-12-07 21:40   ` Robert Bragg
@ 2016-12-08 15:53     ` Daniel Vetter
  2016-12-09 12:41       ` Robert Bragg
  0 siblings, 1 reply; 16+ messages in thread
From: Daniel Vetter @ 2016-12-08 15:53 UTC (permalink / raw)
  To: Robert Bragg; +Cc: Daniel Vetter, intel-gfx

On Wed, Dec 07, 2016 at 09:40:33PM +0000, Robert Bragg wrote:
> This adds a 'Perf' section to i915.rst with the following sub sections:
> - Overview
> - Comparison with Core Perf
> - i915 Driver Entry Points
> - i915 Perf Stream
> - i915 Perf Observation Architecture Stream
> - All i915 Perf Internals
> 
> v2:
>     section headers in i915.rst (Daniel Vetter)
>     missing symbol docs + other fixups (Matthew Auld)
> 
> Signed-off-by: Robert Bragg <robert@sixbynine.org>
> Reviewed-by: Matthew Auld <matthew.auld@intel.com>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>

Obligatory bikeshed about explicitly listening all functions, but hey
great docs, I'll merge ;-)

Thanks, applied to dinq.
-Daniel

> ---
>  Documentation/gpu/i915.rst       |  91 +++++++++
>  drivers/gpu/drm/i915/i915_drv.h  | 151 +++++++++++---
>  drivers/gpu/drm/i915/i915_perf.c | 412 ++++++++++++++++++++++++++++++++++++---
>  3 files changed, 598 insertions(+), 56 deletions(-)
> 
> diff --git a/Documentation/gpu/i915.rst b/Documentation/gpu/i915.rst
> index 117d2ab..3843ef6 100644
> --- a/Documentation/gpu/i915.rst
> +++ b/Documentation/gpu/i915.rst
> @@ -356,4 +356,95 @@ switch_mm
>  .. kernel-doc:: drivers/gpu/drm/i915/i915_trace.h
>     :doc: switch_mm tracepoint
>  
> +Perf
> +====
> +
> +Overview
> +--------
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :doc: i915 Perf Overview
> +
> +Comparison with Core Perf
> +-------------------------
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :doc: i915 Perf History and Comparison with Core Perf
> +
> +i915 Driver Entry Points
> +------------------------
> +
> +This section covers the entrypoints exported outside of i915_perf.c to
> +integrate with drm/i915 and to handle the `DRM_I915_PERF_OPEN` ioctl.
> +
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: i915_perf_init
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: i915_perf_fini
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: i915_perf_register
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: i915_perf_unregister
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: i915_perf_open_ioctl
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: i915_perf_release
> +
> +i915 Perf Stream
> +----------------
> +
> +This section covers the stream-semantics-agnostic structures and functions
> +for representing an i915 perf stream FD and associated file operations.
> +
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_drv.h
> +   :functions: i915_perf_stream
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_drv.h
> +   :functions: i915_perf_stream_ops
> +
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: read_properties_unlocked
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: i915_perf_open_ioctl_locked
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: i915_perf_destroy_locked
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: i915_perf_read
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: i915_perf_ioctl
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: i915_perf_enable_locked
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: i915_perf_disable_locked
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: i915_perf_poll
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: i915_perf_poll_locked
> +
> +i915 Perf Observation Architecture Stream
> +-----------------------------------------
> +
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_drv.h
> +   :functions: i915_oa_ops
> +
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: i915_oa_stream_init
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: i915_oa_read
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: i915_oa_stream_enable
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: i915_oa_stream_disable
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: i915_oa_wait_unlocked
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: i915_oa_poll_wait
> +
> +All i915 Perf Internals
> +-----------------------
> +
> +This section simply includes all currently documented i915 perf internals, in
> +no particular order, but may include some more minor utilities or platform
> +specific details than found in the more high-level sections.
> +
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :internal:
> +
>  .. WARNING: DOCPROC directive not supported: !Cdrivers/gpu/drm/i915/i915_irq.c
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 33758ac..49c7651 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1847,89 +1847,186 @@ struct i915_oa_reg {
>  
>  struct i915_perf_stream;
>  
> +/**
> + * struct i915_perf_stream_ops - the OPs to support a specific stream type
> + */
>  struct i915_perf_stream_ops {
> -	/* Enables the collection of HW samples, either in response to
> -	 * I915_PERF_IOCTL_ENABLE or implicitly called when stream is
> -	 * opened without I915_PERF_FLAG_DISABLED.
> +	/**
> +	 * @enable: Enables the collection of HW samples, either in response to
> +	 * `I915_PERF_IOCTL_ENABLE` or implicitly called when stream is opened
> +	 * without `I915_PERF_FLAG_DISABLED`.
>  	 */
>  	void (*enable)(struct i915_perf_stream *stream);
>  
> -	/* Disables the collection of HW samples, either in response to
> -	 * I915_PERF_IOCTL_DISABLE or implicitly called before
> -	 * destroying the stream.
> +	/**
> +	 * @disable: Disables the collection of HW samples, either in response
> +	 * to `I915_PERF_IOCTL_DISABLE` or implicitly called before destroying
> +	 * the stream.
>  	 */
>  	void (*disable)(struct i915_perf_stream *stream);
>  
> -	/* Call poll_wait, passing a wait queue that will be woken
> +	/**
> +	 * @poll_wait: Call poll_wait, passing a wait queue that will be woken
>  	 * once there is something ready to read() for the stream
>  	 */
>  	void (*poll_wait)(struct i915_perf_stream *stream,
>  			  struct file *file,
>  			  poll_table *wait);
>  
> -	/* For handling a blocking read, wait until there is something
> -	 * to ready to read() for the stream. E.g. wait on the same
> +	/**
> +	 * @wait_unlocked: For handling a blocking read, wait until there is
> +	 * something to ready to read() for the stream. E.g. wait on the same
>  	 * wait queue that would be passed to poll_wait().
>  	 */
>  	int (*wait_unlocked)(struct i915_perf_stream *stream);
>  
> -	/* read - Copy buffered metrics as records to userspace
> -	 * @buf: the userspace, destination buffer
> -	 * @count: the number of bytes to copy, requested by userspace
> -	 * @offset: zero at the start of the read, updated as the read
> -	 *          proceeds, it represents how many bytes have been
> -	 *          copied so far and the buffer offset for copying the
> -	 *          next record.
> +	/**
> +	 * @read: Copy buffered metrics as records to userspace
> +	 * **buf**: the userspace, destination buffer
> +	 * **count**: the number of bytes to copy, requested by userspace
> +	 * **offset**: zero at the start of the read, updated as the read
> +	 * proceeds, it represents how many bytes have been copied so far and
> +	 * the buffer offset for copying the next record.
>  	 *
> -	 * Copy as many buffered i915 perf samples and records for
> -	 * this stream to userspace as will fit in the given buffer.
> +	 * Copy as many buffered i915 perf samples and records for this stream
> +	 * to userspace as will fit in the given buffer.
>  	 *
> -	 * Only write complete records; returning -ENOSPC if there
> -	 * isn't room for a complete record.
> +	 * Only write complete records; returning -%ENOSPC if there isn't room
> +	 * for a complete record.
>  	 *
> -	 * Return any error condition that results in a short read
> -	 * such as -ENOSPC or -EFAULT, even though these may be
> -	 * squashed before returning to userspace.
> +	 * Return any error condition that results in a short read such as
> +	 * -%ENOSPC or -%EFAULT, even though these may be squashed before
> +	 * returning to userspace.
>  	 */
>  	int (*read)(struct i915_perf_stream *stream,
>  		    char __user *buf,
>  		    size_t count,
>  		    size_t *offset);
>  
> -	/* Cleanup any stream specific resources.
> +	/**
> +	 * @destroy: Cleanup any stream specific resources.
>  	 *
>  	 * The stream will always be disabled before this is called.
>  	 */
>  	void (*destroy)(struct i915_perf_stream *stream);
>  };
>  
> +/**
> + * struct i915_perf_stream - state for a single open stream FD
> + */
>  struct i915_perf_stream {
> +	/**
> +	 * @dev_priv: i915 drm device
> +	 */
>  	struct drm_i915_private *dev_priv;
>  
> +	/**
> +	 * @link: Links the stream into ``&drm_i915_private->streams``
> +	 */
>  	struct list_head link;
>  
> +	/**
> +	 * @sample_flags: Flags representing the `DRM_I915_PERF_PROP_SAMPLE_*`
> +	 * properties given when opening a stream, representing the contents
> +	 * of a single sample as read() by userspace.
> +	 */
>  	u32 sample_flags;
> +
> +	/**
> +	 * @sample_size: Considering the configured contents of a sample
> +	 * combined with the required header size, this is the total size
> +	 * of a single sample record.
> +	 */
>  	int sample_size;
>  
> +	/**
> +	 * @ctx: %NULL if measuring system-wide across all contexts or a
> +	 * specific context that is being monitored.
> +	 */
>  	struct i915_gem_context *ctx;
> +
> +	/**
> +	 * @enabled: Whether the stream is currently enabled, considering
> +	 * whether the stream was opened in a disabled state and based
> +	 * on `I915_PERF_IOCTL_ENABLE` and `I915_PERF_IOCTL_DISABLE` calls.
> +	 */
>  	bool enabled;
>  
> +	/**
> +	 * @ops: The callbacks providing the implementation of this specific
> +	 * type of configured stream.
> +	 */
>  	const struct i915_perf_stream_ops *ops;
>  };
>  
> +/**
> + * struct i915_oa_ops - Gen specific implementation of an OA unit stream
> + */
>  struct i915_oa_ops {
> +	/**
> +	 * @init_oa_buffer: Resets the head and tail pointers of the
> +	 * circular buffer for periodic OA reports.
> +	 *
> +	 * Called when first opening a stream for OA metrics, but also may be
> +	 * called in response to an OA buffer overflow or other error
> +	 * condition.
> +	 *
> +	 * Note it may be necessary to clear the full OA buffer here as part of
> +	 * maintaining the invariable that new reports must be written to
> +	 * zeroed memory for us to be able to reliable detect if an expected
> +	 * report has not yet landed in memory.  (At least on Haswell the OA
> +	 * buffer tail pointer is not synchronized with reports being visible
> +	 * to the CPU)
> +	 */
>  	void (*init_oa_buffer)(struct drm_i915_private *dev_priv);
> +
> +	/**
> +	 * @enable_metric_set: Applies any MUX configuration to set up the
> +	 * Boolean and Custom (B/C) counters that are part of the counter
> +	 * reports being sampled. May apply system constraints such as
> +	 * disabling EU clock gating as required.
> +	 */
>  	int (*enable_metric_set)(struct drm_i915_private *dev_priv);
> +
> +	/**
> +	 * @disable_metric_set: Remove system constraints associated with using
> +	 * the OA unit.
> +	 */
>  	void (*disable_metric_set)(struct drm_i915_private *dev_priv);
> +
> +	/**
> +	 * @oa_enable: Enable periodic sampling
> +	 */
>  	void (*oa_enable)(struct drm_i915_private *dev_priv);
> +
> +	/**
> +	 * @oa_disable: Disable periodic sampling
> +	 */
>  	void (*oa_disable)(struct drm_i915_private *dev_priv);
> -	void (*update_oacontrol)(struct drm_i915_private *dev_priv);
> -	void (*update_hw_ctx_id_locked)(struct drm_i915_private *dev_priv,
> -					u32 ctx_id);
> +
> +	/**
> +	 * @read: Copy data from the circular OA buffer into a given userspace
> +	 * buffer.
> +	 */
>  	int (*read)(struct i915_perf_stream *stream,
>  		    char __user *buf,
>  		    size_t count,
>  		    size_t *offset);
> +
> +	/**
> +	 * @oa_buffer_is_empty: Check if OA buffer empty (false positives OK)
> +	 *
> +	 * This is either called via fops or the poll check hrtimer (atomic
> +	 * ctx) without any locks taken.
> +	 *
> +	 * It's safe to read OA config state here unlocked, assuming that this
> +	 * is only called while the stream is enabled, while the global OA
> +	 * configuration can't be modified.
> +	 *
> +	 * Efficiency is more important than avoiding some false positives
> +	 * here, which will be handled gracefully - likely resulting in an
> +	 * %EAGAIN error for userspace.
> +	 */
>  	bool (*oa_buffer_is_empty)(struct drm_i915_private *dev_priv);
>  };
>  
> diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
> index 0c6e124..ae7bd0e 100644
> --- a/drivers/gpu/drm/i915/i915_perf.c
> +++ b/drivers/gpu/drm/i915/i915_perf.c
> @@ -26,7 +26,7 @@
>  
>  
>  /**
> - * DOC: i915 Perf, streaming API for GPU metrics
> + * DOC: i915 Perf Overview
>   *
>   * Gen graphics supports a large number of performance counters that can help
>   * driver and application developers understand and optimize their use of the
> @@ -45,6 +45,10 @@
>   * privileges by default, unless changed via the dev.i915.perf_event_paranoid
>   * sysctl option.
>   *
> + */
> +
> +/**
> + * DOC: i915 Perf History and Comparison with Core Perf
>   *
>   * The interface was initially inspired by the core Perf infrastructure but
>   * some notable differences are:
> @@ -75,8 +79,8 @@
>   * gets copied from the GPU mapped buffers to userspace buffers.
>   *
>   *
> - * Some notes regarding Linux Perf:
> - * --------------------------------
> + * Issues hit with first prototype based on Core Perf
> + * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>   *
>   * The first prototype of this driver was based on the core perf
>   * infrastructure, and while we did make that mostly work, with some changes to
> @@ -135,7 +139,7 @@
>   *   for combining with the side-band raw reports it captures using
>   *   MI_REPORT_PERF_COUNT commands.
>   *
> - *   _ As a side note on perf's grouping feature; there was also some concern
> + *   - As a side note on perf's grouping feature; there was also some concern
>   *     that using PERF_FORMAT_GROUP as a way to pack together counter values
>   *     would quite drastically inflate our sample sizes, which would likely
>   *     lower the effective sampling resolutions we could use when the available
> @@ -277,6 +281,20 @@ static struct i915_oa_format hsw_oa_formats[I915_OA_FORMAT_MAX] = {
>  
>  #define SAMPLE_OA_REPORT      (1<<0)
>  
> +/**
> + * struct perf_open_properties - for validated properties given to open a stream
> + * @sample_flags: `DRM_I915_PERF_PROP_SAMPLE_*` properties are tracked as flags
> + * @single_context: Whether a single or all gpu contexts should be monitored
> + * @ctx_handle: A gem ctx handle for use with @single_context
> + * @metrics_set: An ID for an OA unit metric set advertised via sysfs
> + * @oa_format: An OA unit HW report format
> + * @oa_periodic: Whether to enable periodic OA unit sampling
> + * @oa_period_exponent: The OA unit sampling period is derived from this
> + *
> + * As read_properties_unlocked() enumerates and validates the properties given
> + * to open a stream of metrics the configuration is built up in the structure
> + * which starts out zero initialized.
> + */
>  struct perf_open_properties {
>  	u32 sample_flags;
>  
> @@ -314,7 +332,19 @@ static bool gen7_oa_buffer_is_empty_fop_unlocked(struct drm_i915_private *dev_pr
>  }
>  
>  /**
> - * Appends a status record to a userspace read() buffer.
> + * append_oa_status - Appends a status record to a userspace read() buffer.
> + * @stream: An i915-perf stream opened for OA metrics
> + * @buf: destination buffer given by userspace
> + * @count: the number of bytes userspace wants to read
> + * @offset: (inout): the current position for writing into @buf
> + * @type: The kind of status to report to userspace
> + *
> + * Writes a status record (such as `DRM_I915_PERF_RECORD_OA_REPORT_LOST`)
> + * into the userspace read() buffer.
> + *
> + * The @buf @offset will only be updated on success.
> + *
> + * Returns: 0 on success, negative error code on failure.
>   */
>  static int append_oa_status(struct i915_perf_stream *stream,
>  			    char __user *buf,
> @@ -336,7 +366,21 @@ static int append_oa_status(struct i915_perf_stream *stream,
>  }
>  
>  /**
> - * Copies single OA report into userspace read() buffer.
> + * append_oa_sample - Copies single OA report into userspace read() buffer.
> + * @stream: An i915-perf stream opened for OA metrics
> + * @buf: destination buffer given by userspace
> + * @count: the number of bytes userspace wants to read
> + * @offset: (inout): the current position for writing into @buf
> + * @report: A single OA report to (optionally) include as part of the sample
> + *
> + * The contents of a sample are configured through `DRM_I915_PERF_PROP_SAMPLE_*`
> + * properties when opening a stream, tracked as `stream->sample_flags`. This
> + * function copies the requested components of a single sample to the given
> + * read() @buf.
> + *
> + * The @buf @offset will only be updated on success.
> + *
> + * Returns: 0 on success, negative error code on failure.
>   */
>  static int append_oa_sample(struct i915_perf_stream *stream,
>  			    char __user *buf,
> @@ -380,10 +424,8 @@ static int append_oa_sample(struct i915_perf_stream *stream,
>   * @head_ptr: (inout): the current oa buffer cpu read position
>   * @tail: the current oa buffer gpu write position
>   *
> - * Returns 0 on success, negative error code on failure.
> - *
> - * Notably any error condition resulting in a short read (-ENOSPC or
> - * -EFAULT) will be returned even though one or more records may
> + * Notably any error condition resulting in a short read (-%ENOSPC or
> + * -%EFAULT) will be returned even though one or more records may
>   * have been successfully copied. In this case it's up to the caller
>   * to decide if the error should be squashed before returning to
>   * userspace.
> @@ -392,6 +434,8 @@ static int append_oa_sample(struct i915_perf_stream *stream,
>   * tail, so the head chases the tail?... If you think that's mad
>   * and back-to-front you're not alone, but this follows the
>   * Gen PRM naming convention.
> + *
> + * Returns: 0 on success, negative error code on failure.
>   */
>  static int gen7_append_oa_reports(struct i915_perf_stream *stream,
>  				  char __user *buf,
> @@ -496,6 +540,22 @@ static int gen7_append_oa_reports(struct i915_perf_stream *stream,
>  	return ret;
>  }
>  
> +/**
> + * gen7_oa_read - copy status records then buffered OA reports
> + * @stream: An i915-perf stream opened for OA metrics
> + * @buf: destination buffer given by userspace
> + * @count: the number of bytes userspace wants to read
> + * @offset: (inout): the current position for writing into @buf
> + *
> + * Checks Gen 7 specific OA unit status registers and if necessary appends
> + * corresponding status records for userspace (such as for a buffer full
> + * condition) and then initiate appending any buffered OA reports.
> + *
> + * Updates @offset according to the number of bytes successfully copied into
> + * the userspace buffer.
> + *
> + * Returns: zero on success or a negative error code
> + */
>  static int gen7_oa_read(struct i915_perf_stream *stream,
>  			char __user *buf,
>  			size_t count,
> @@ -597,6 +657,20 @@ static int gen7_oa_read(struct i915_perf_stream *stream,
>  	return ret;
>  }
>  
> +/**
> + * i915_oa_wait_unlocked - handles blocking IO until OA data available
> + * @stream: An i915-perf stream opened for OA metrics
> + *
> + * Called when userspace tries to read() from a blocking stream FD opened
> + * for OA metrics. It waits until the hrtimer callback finds a non-empty
> + * OA buffer and wakes us.
> + *
> + * Note: it's acceptable to have this return with some false positives
> + * since any subsequent read handling will return -EAGAIN if there isn't
> + * really data ready for userspace yet.
> + *
> + * Returns: zero on success or a negative error code
> + */
>  static int i915_oa_wait_unlocked(struct i915_perf_stream *stream)
>  {
>  	struct drm_i915_private *dev_priv = stream->dev_priv;
> @@ -615,6 +689,16 @@ static int i915_oa_wait_unlocked(struct i915_perf_stream *stream)
>  					!dev_priv->perf.oa.ops.oa_buffer_is_empty(dev_priv));
>  }
>  
> +/**
> + * i915_oa_poll_wait - call poll_wait() for an OA stream poll()
> + * @stream: An i915-perf stream opened for OA metrics
> + * @file: An i915 perf stream file
> + * @wait: poll() state table
> + *
> + * For handling userspace polling on an i915 perf stream opened for OA metrics,
> + * this starts a poll_wait with the wait queue that our hrtimer callback wakes
> + * when it sees data ready to read in the circular OA buffer.
> + */
>  static void i915_oa_poll_wait(struct i915_perf_stream *stream,
>  			      struct file *file,
>  			      poll_table *wait)
> @@ -624,6 +708,18 @@ static void i915_oa_poll_wait(struct i915_perf_stream *stream,
>  	poll_wait(file, &dev_priv->perf.oa.poll_wq, wait);
>  }
>  
> +/**
> + * i915_oa_read - just calls through to &i915_oa_ops->read
> + * @stream: An i915-perf stream opened for OA metrics
> + * @buf: destination buffer given by userspace
> + * @count: the number of bytes userspace wants to read
> + * @offset: (inout): the current position for writing into @buf
> + *
> + * Updates @offset according to the number of bytes successfully copied into
> + * the userspace buffer.
> + *
> + * Returns: zero on success or a negative error code
> + */
>  static int i915_oa_read(struct i915_perf_stream *stream,
>  			char __user *buf,
>  			size_t count,
> @@ -634,9 +730,15 @@ static int i915_oa_read(struct i915_perf_stream *stream,
>  	return dev_priv->perf.oa.ops.read(stream, buf, count, offset);
>  }
>  
> -/* Determine the render context hw id, and ensure it remains fixed for the
> +/**
> + * oa_get_render_ctx_id - determine and hold ctx hw id
> + * @stream: An i915-perf stream opened for OA metrics
> + *
> + * Determine the render context hw id, and ensure it remains fixed for the
>   * lifetime of the stream. This ensures that we don't have to worry about
>   * updating the context ID in OACONTROL on the fly.
> + *
> + * Returns: zero on success or a negative error code
>   */
>  static int oa_get_render_ctx_id(struct i915_perf_stream *stream)
>  {
> @@ -673,6 +775,13 @@ static int oa_get_render_ctx_id(struct i915_perf_stream *stream)
>  	return ret;
>  }
>  
> +/**
> + * oa_put_render_ctx_id - counterpart to oa_get_render_ctx_id releases hold
> + * @stream: An i915-perf stream opened for OA metrics
> + *
> + * In case anything needed doing to ensure the context HW ID would remain valid
> + * for the lifetime of the stream, then that can be undone here.
> + */
>  static void oa_put_render_ctx_id(struct i915_perf_stream *stream)
>  {
>  	struct drm_i915_private *dev_priv = stream->dev_priv;
> @@ -945,6 +1054,15 @@ static void gen7_oa_enable(struct drm_i915_private *dev_priv)
>  	spin_unlock_irqrestore(&dev_priv->perf.hook_lock, flags);
>  }
>  
> +/**
> + * i915_oa_stream_enable - handle `I915_PERF_IOCTL_ENABLE` for OA stream
> + * @stream: An i915 perf stream opened for OA metrics
> + *
> + * [Re]enables hardware periodic sampling according to the period configured
> + * when opening the stream. This also starts a hrtimer that will periodically
> + * check for data in the circular OA buffer for notifying userspace (e.g.
> + * during a read() or poll()).
> + */
>  static void i915_oa_stream_enable(struct i915_perf_stream *stream)
>  {
>  	struct drm_i915_private *dev_priv = stream->dev_priv;
> @@ -962,6 +1080,14 @@ static void gen7_oa_disable(struct drm_i915_private *dev_priv)
>  	I915_WRITE(GEN7_OACONTROL, 0);
>  }
>  
> +/**
> + * i915_oa_stream_disable - handle `I915_PERF_IOCTL_DISABLE` for OA stream
> + * @stream: An i915 perf stream opened for OA metrics
> + *
> + * Stops the OA unit from periodically writing counter reports into the
> + * circular OA buffer. This also stops the hrtimer that periodically checks for
> + * data in the circular OA buffer, for notifying userspace.
> + */
>  static void i915_oa_stream_disable(struct i915_perf_stream *stream)
>  {
>  	struct drm_i915_private *dev_priv = stream->dev_priv;
> @@ -987,6 +1113,24 @@ static const struct i915_perf_stream_ops i915_oa_stream_ops = {
>  	.read = i915_oa_read,
>  };
>  
> +/**
> + * i915_oa_stream_init - validate combined props for OA stream and init
> + * @stream: An i915 perf stream
> + * @param: The open parameters passed to `DRM_I915_PERF_OPEN`
> + * @props: The property state that configures stream (individually validated)
> + *
> + * While read_properties_unlocked() validates properties in isolation it
> + * doesn't ensure that the combination necessarily makes sense.
> + *
> + * At this point it has been determined that userspace wants a stream of
> + * OA metrics, but still we need to further validate the combined
> + * properties are OK.
> + *
> + * If the configuration makes sense then we can allocate memory for
> + * a circular OA buffer and apply the requested metric set configuration.
> + *
> + * Returns: zero on success or a negative error code.
> + */
>  static int i915_oa_stream_init(struct i915_perf_stream *stream,
>  			       struct drm_i915_perf_open_param *param,
>  			       struct perf_open_properties *props)
> @@ -1111,6 +1255,31 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream,
>  	return ret;
>  }
>  
> +/**
> + * i915_perf_read_locked - &i915_perf_stream_ops->read with error normalisation
> + * @stream: An i915 perf stream
> + * @file: An i915 perf stream file
> + * @buf: destination buffer given by userspace
> + * @count: the number of bytes userspace wants to read
> + * @ppos: (inout) file seek position (unused)
> + *
> + * Besides wrapping &i915_perf_stream_ops->read this provides a common place to
> + * ensure that if we've successfully copied any data then reporting that takes
> + * precedence over any internal error status, so the data isn't lost.
> + *
> + * For example ret will be -ENOSPC whenever there is more buffered data than
> + * can be copied to userspace, but that's only interesting if we weren't able
> + * to copy some data because it implies the userspace buffer is too small to
> + * receive a single record (and we never split records).
> + *
> + * Another case with ret == -EFAULT is more of a grey area since it would seem
> + * like bad form for userspace to ask us to overrun its buffer, but the user
> + * knows best:
> + *
> + *   http://yarchive.net/comp/linux/partial_reads_writes.html
> + *
> + * Returns: The number of bytes copied or a negative error code on failure.
> + */
>  static ssize_t i915_perf_read_locked(struct i915_perf_stream *stream,
>  				     struct file *file,
>  				     char __user *buf,
> @@ -1126,25 +1295,27 @@ static ssize_t i915_perf_read_locked(struct i915_perf_stream *stream,
>  	size_t offset = 0;
>  	int ret = stream->ops->read(stream, buf, count, &offset);
>  
> -	/* If we've successfully copied any data then reporting that
> -	 * takes precedence over any internal error status, so the
> -	 * data isn't lost.
> -	 *
> -	 * For example ret will be -ENOSPC whenever there is more
> -	 * buffered data than can be copied to userspace, but that's
> -	 * only interesting if we weren't able to copy some data
> -	 * because it implies the userspace buffer is too small to
> -	 * receive a single record (and we never split records).
> -	 *
> -	 * Another case with ret == -EFAULT is more of a grey area
> -	 * since it would seem like bad form for userspace to ask us
> -	 * to overrun its buffer, but the user knows best:
> -	 *
> -	 *   http://yarchive.net/comp/linux/partial_reads_writes.html
> -	 */
>  	return offset ?: (ret ?: -EAGAIN);
>  }
>  
> +/**
> + * i915_perf_read - handles read() FOP for i915 perf stream FDs
> + * @file: An i915 perf stream file
> + * @buf: destination buffer given by userspace
> + * @count: the number of bytes userspace wants to read
> + * @ppos: (inout) file seek position (unused)
> + *
> + * The entry point for handling a read() on a stream file descriptor from
> + * userspace. Most of the work is left to the i915_perf_read_locked() and
> + * &i915_perf_stream_ops->read but to save having stream implementations (of
> + * which we might have multiple later) we handle blocking read here.
> + *
> + * We can also consistently treat trying to read from a disabled stream
> + * as an IO error so implementations can assume the stream is enabled
> + * while reading.
> + *
> + * Returns: The number of bytes copied or a negative error code on failure.
> + */
>  static ssize_t i915_perf_read(struct file *file,
>  			      char __user *buf,
>  			      size_t count,
> @@ -1211,6 +1382,22 @@ static enum hrtimer_restart oa_poll_check_timer_cb(struct hrtimer *hrtimer)
>  	return HRTIMER_RESTART;
>  }
>  
> +/**
> + * i915_perf_poll_locked - poll_wait() with a suitable wait queue for stream
> + * @dev_priv: i915 device instance
> + * @stream: An i915 perf stream
> + * @file: An i915 perf stream file
> + * @wait: poll() state table
> + *
> + * For handling userspace polling on an i915 perf stream, this calls through to
> + * &i915_perf_stream_ops->poll_wait to call poll_wait() with a wait queue that
> + * will be woken for new stream data.
> + *
> + * Note: The &drm_i915_private->perf.lock mutex has been taken to serialize
> + * with any non-file-operation driver hooks.
> + *
> + * Returns: any poll events that are ready without sleeping
> + */
>  static unsigned int i915_perf_poll_locked(struct drm_i915_private *dev_priv,
>  					  struct i915_perf_stream *stream,
>  					  struct file *file,
> @@ -1232,6 +1419,19 @@ static unsigned int i915_perf_poll_locked(struct drm_i915_private *dev_priv,
>  	return events;
>  }
>  
> +/**
> + * i915_perf_poll - call poll_wait() with a suitable wait queue for stream
> + * @file: An i915 perf stream file
> + * @wait: poll() state table
> + *
> + * For handling userspace polling on an i915 perf stream, this ensures
> + * poll_wait() gets called with a wait queue that will be woken for new stream
> + * data.
> + *
> + * Note: Implementation deferred to i915_perf_poll_locked()
> + *
> + * Returns: any poll events that are ready without sleeping
> + */
>  static unsigned int i915_perf_poll(struct file *file, poll_table *wait)
>  {
>  	struct i915_perf_stream *stream = file->private_data;
> @@ -1245,6 +1445,16 @@ static unsigned int i915_perf_poll(struct file *file, poll_table *wait)
>  	return ret;
>  }
>  
> +/**
> + * i915_perf_enable_locked - handle `I915_PERF_IOCTL_ENABLE` ioctl
> + * @stream: A disabled i915 perf stream
> + *
> + * [Re]enables the associated capture of data for this stream.
> + *
> + * If a stream was previously enabled then there's currently no intention
> + * to provide userspace any guarantee about the preservation of previously
> + * buffered data.
> + */
>  static void i915_perf_enable_locked(struct i915_perf_stream *stream)
>  {
>  	if (stream->enabled)
> @@ -1257,6 +1467,20 @@ static void i915_perf_enable_locked(struct i915_perf_stream *stream)
>  		stream->ops->enable(stream);
>  }
>  
> +/**
> + * i915_perf_disable_locked - handle `I915_PERF_IOCTL_DISABLE` ioctl
> + * @stream: An enabled i915 perf stream
> + *
> + * Disables the associated capture of data for this stream.
> + *
> + * The intention is that disabling an re-enabling a stream will ideally be
> + * cheaper than destroying and re-opening a stream with the same configuration,
> + * though there are no formal guarantees about what state or buffered data
> + * must be retained between disabling and re-enabling a stream.
> + *
> + * Note: while a stream is disabled it's considered an error for userspace
> + * to attempt to read from the stream (-EIO).
> + */
>  static void i915_perf_disable_locked(struct i915_perf_stream *stream)
>  {
>  	if (!stream->enabled)
> @@ -1269,6 +1493,18 @@ static void i915_perf_disable_locked(struct i915_perf_stream *stream)
>  		stream->ops->disable(stream);
>  }
>  
> +/**
> + * i915_perf_ioctl - support ioctl() usage with i915 perf stream FDs
> + * @stream: An i915 perf stream
> + * @cmd: the ioctl request
> + * @arg: the ioctl data
> + *
> + * Note: The &drm_i915_private->perf.lock mutex has been taken to serialize
> + * with any non-file-operation driver hooks.
> + *
> + * Returns: zero on success or a negative error code. Returns -EINVAL for
> + * an unknown ioctl request.
> + */
>  static long i915_perf_ioctl_locked(struct i915_perf_stream *stream,
>  				   unsigned int cmd,
>  				   unsigned long arg)
> @@ -1285,6 +1521,17 @@ static long i915_perf_ioctl_locked(struct i915_perf_stream *stream,
>  	return -EINVAL;
>  }
>  
> +/**
> + * i915_perf_ioctl - support ioctl() usage with i915 perf stream FDs
> + * @file: An i915 perf stream file
> + * @cmd: the ioctl request
> + * @arg: the ioctl data
> + *
> + * Implementation deferred to i915_perf_ioctl_locked().
> + *
> + * Returns: zero on success or a negative error code. Returns -EINVAL for
> + * an unknown ioctl request.
> + */
>  static long i915_perf_ioctl(struct file *file,
>  			    unsigned int cmd,
>  			    unsigned long arg)
> @@ -1300,6 +1547,16 @@ static long i915_perf_ioctl(struct file *file,
>  	return ret;
>  }
>  
> +/**
> + * i915_perf_destroy_locked - destroy an i915 perf stream
> + * @stream: An i915 perf stream
> + *
> + * Frees all resources associated with the given i915 perf @stream, disabling
> + * any associated data capture in the process.
> + *
> + * Note: The &drm_i915_private->perf.lock mutex has been taken to serialize
> + * with any non-file-operation driver hooks.
> + */
>  static void i915_perf_destroy_locked(struct i915_perf_stream *stream)
>  {
>  	struct drm_i915_private *dev_priv = stream->dev_priv;
> @@ -1321,6 +1578,17 @@ static void i915_perf_destroy_locked(struct i915_perf_stream *stream)
>  	kfree(stream);
>  }
>  
> +/**
> + * i915_perf_release - handles userspace close() of a stream file
> + * @inode: anonymous inode associated with file
> + * @file: An i915 perf stream file
> + *
> + * Cleans up any resources associated with an open i915 perf stream file.
> + *
> + * NB: close() can't really fail from the userspace point of view.
> + *
> + * Returns: zero on success or a negative error code.
> + */
>  static int i915_perf_release(struct inode *inode, struct file *file)
>  {
>  	struct i915_perf_stream *stream = file->private_data;
> @@ -1365,6 +1633,30 @@ lookup_context(struct drm_i915_private *dev_priv,
>  	return ctx;
>  }
>  
> +/**
> + * i915_perf_open_ioctl_locked - DRM ioctl() for userspace to open a stream FD
> + * @dev_priv: i915 device instance
> + * @param: The open parameters passed to 'DRM_I915_PERF_OPEN`
> + * @props: individually validated u64 property value pairs
> + * @file: drm file
> + *
> + * See i915_perf_ioctl_open() for interface details.
> + *
> + * Implements further stream config validation and stream initialization on
> + * behalf of i915_perf_open_ioctl() with the &drm_i915_private->perf.lock mutex
> + * taken to serialize with any non-file-operation driver hooks.
> + *
> + * Note: at this point the @props have only been validated in isolation and
> + * it's still necessary to validate that the combination of properties makes
> + * sense.
> + *
> + * In the case where userspace is interested in OA unit metrics then further
> + * config validation and stream initialization details will be handled by
> + * i915_oa_stream_init(). The code here should only validate config state that
> + * will be relevant to all stream types / backends.
> + *
> + * Returns: zero on success or a negative error code.
> + */
>  static int
>  i915_perf_open_ioctl_locked(struct drm_i915_private *dev_priv,
>  			    struct drm_i915_perf_open_param *param,
> @@ -1459,12 +1751,20 @@ i915_perf_open_ioctl_locked(struct drm_i915_private *dev_priv,
>  	return ret;
>  }
>  
> -/* Note we copy the properties from userspace outside of the i915 perf
> - * mutex to avoid an awkward lockdep with mmap_sem.
> +/**
> + * read_properties_unlocked - validate + copy userspace stream open properties
> + * @dev_priv: i915 device instance
> + * @uprops: The array of u64 key value pairs given by userspace
> + * @n_props: The number of key value pairs expected in @uprops
> + * @props: The stream configuration built up while validating properties
>   *
>   * Note this function only validates properties in isolation it doesn't
>   * validate that the combination of properties makes sense or that all
>   * properties necessary for a particular kind of stream have been set.
> + *
> + * Note that there currently aren't any ordering requirements for properties so
> + * we shouldn't validate or assume anything about ordering here. This doesn't
> + * rule out defining new properties with ordering requirements in the future.
>   */
>  static int read_properties_unlocked(struct drm_i915_private *dev_priv,
>  				    u64 __user *uprops,
> @@ -1586,6 +1886,30 @@ static int read_properties_unlocked(struct drm_i915_private *dev_priv,
>  	return 0;
>  }
>  
> +/**
> + * i915_perf_open_ioctl - DRM ioctl() for userspace to open a stream FD
> + * @dev: drm device
> + * @data: ioctl data copied from userspace (unvalidated)
> + * @file: drm file
> + *
> + * Validates the stream open parameters given by userspace including flags
> + * and an array of u64 key, value pair properties.
> + *
> + * Very little is assumed up front about the nature of the stream being
> + * opened (for instance we don't assume it's for periodic OA unit metrics). An
> + * i915-perf stream is expected to be a suitable interface for other forms of
> + * buffered data written by the GPU besides periodic OA metrics.
> + *
> + * Note we copy the properties from userspace outside of the i915 perf
> + * mutex to avoid an awkward lockdep with mmap_sem.
> + *
> + * Most of the implementation details are handled by
> + * i915_perf_open_ioctl_locked() after taking the &drm_i915_private->perf.lock
> + * mutex for serializing with any non-file-operation driver hooks.
> + *
> + * Return: A newly opened i915 Perf stream file descriptor or negative
> + * error code on failure.
> + */
>  int i915_perf_open_ioctl(struct drm_device *dev, void *data,
>  			 struct drm_file *file)
>  {
> @@ -1622,6 +1946,14 @@ int i915_perf_open_ioctl(struct drm_device *dev, void *data,
>  	return ret;
>  }
>  
> +/**
> + * i915_perf_register - exposes i915-perf to userspace
> + * @dev_priv: i915 device instance
> + *
> + * In particular OA metric sets are advertised under a sysfs metrics/
> + * directory allowing userspace to enumerate valid IDs that can be
> + * used to open an i915-perf stream.
> + */
>  void i915_perf_register(struct drm_i915_private *dev_priv)
>  {
>  	if (!IS_HASWELL(dev_priv))
> @@ -1651,6 +1983,15 @@ void i915_perf_register(struct drm_i915_private *dev_priv)
>  	mutex_unlock(&dev_priv->perf.lock);
>  }
>  
> +/**
> + * i915_perf_unregister - hide i915-perf from userspace
> + * @dev_priv: i915 device instance
> + *
> + * i915-perf state cleanup is split up into an 'unregister' and
> + * 'deinit' phase where the interface is first hidden from
> + * userspace by i915_perf_unregister() before cleaning up
> + * remaining state in i915_perf_fini().
> + */
>  void i915_perf_unregister(struct drm_i915_private *dev_priv)
>  {
>  	if (!IS_HASWELL(dev_priv))
> @@ -1707,6 +2048,15 @@ static struct ctl_table dev_root[] = {
>  	{}
>  };
>  
> +/**
> + * i915_perf_init - initialize i915-perf state on module load
> + * @dev_priv: i915 device instance
> + *
> + * Initializes i915-perf state without exposing anything to userspace.
> + *
> + * Note: i915-perf initialization is split into an 'init' and 'register'
> + * phase with the i915_perf_register() exposing state to userspace.
> + */
>  void i915_perf_init(struct drm_i915_private *dev_priv)
>  {
>  	if (!IS_HASWELL(dev_priv))
> @@ -1742,6 +2092,10 @@ void i915_perf_init(struct drm_i915_private *dev_priv)
>  	dev_priv->perf.initialized = true;
>  }
>  
> +/**
> + * i915_perf_fini - Counter part to i915_perf_init()
> + * @dev_priv: i915 device instance
> + */
>  void i915_perf_fini(struct drm_i915_private *dev_priv)
>  {
>  	if (!dev_priv->perf.initialized)
> -- 
> 2.10.2
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] drm/i915/perf: More documentation hooked to i915.rst
  2016-12-08 15:53     ` Daniel Vetter
@ 2016-12-09 12:41       ` Robert Bragg
  2016-12-12  7:34         ` Daniel Vetter
  0 siblings, 1 reply; 16+ messages in thread
From: Robert Bragg @ 2016-12-09 12:41 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Daniel Vetter, Intel Graphics Development

On Thu, Dec 8, 2016 at 3:53 PM, Daniel Vetter <daniel@ffwll.ch> wrote:
> On Wed, Dec 07, 2016 at 09:40:33PM +0000, Robert Bragg wrote:
>> This adds a 'Perf' section to i915.rst with the following sub sections:
>> - Overview
>> - Comparison with Core Perf
>> - i915 Driver Entry Points
>> - i915 Perf Stream
>> - i915 Perf Observation Architecture Stream
>> - All i915 Perf Internals
>>
>> v2:
>>     section headers in i915.rst (Daniel Vetter)
>>     missing symbol docs + other fixups (Matthew Auld)
>>
>> Signed-off-by: Robert Bragg <robert@sixbynine.org>
>> Reviewed-by: Matthew Auld <matthew.auld@intel.com>
>> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
>
> Obligatory bikeshed about explicitly listening all functions, but hey
> great docs, I'll merge ;-)

Yeah, understood.

Not sure if you saw my last reply to this, but I think listing symbols
explicitly is appropriate for documentation, where the ordering and
groupings of symbols for the sake of introducing them in logical
stages doesn't necessarily match that found in code.

Not perfect in itself but having worked with gtk-doc in the past I
thought it made sense, and worked reasonably enough, to explicitly
list section headers and symbols in a sections.txt file to control the
order things are presented to developers. Of course we sometimes
forgot to add new symbols to the sections.txt file which is a trade
off, but it never seemed like that big of an issue, compared to having
the extra editorial control over your docs.

Jani's comment was also fair about the fact that the current approach
results in lots of runs of the kernel-doc script, but that one really
seems like an implementation detail where we should avoid compromising
docs to workaround the tools (ofc just my interpretation that it would
be a compromise). I did half consider modifying kernel-doc to allow
for the -functions argument to take an ordered list instead of a set,
but couldn't quite muster the energy to modify a perl script :-p

As a bit of an asside; last year for another project of mine I once
wrote an experimental tool for extracting gtk-doc comments from code
to using the python clang bindings:
https://github.com/rib/clib/blob/master/site/rst-from-c.py

It's crossed my mind to play around with being able to extract
kernel-doc with clang along similar lines which could potentially
track more type information so the docs could support more than just
.. c:function and .. c:type, such as c:macro and c:member and could
better handle things like function pointers as members of structs.
Maybe it's an idea worth considering.


>
> Thanks, applied to dinq.

Thanks,

- Robert

> -Daniel
>
>> ---
>>  Documentation/gpu/i915.rst       |  91 +++++++++
>>  drivers/gpu/drm/i915/i915_drv.h  | 151 +++++++++++---
>>  drivers/gpu/drm/i915/i915_perf.c | 412 ++++++++++++++++++++++++++++++++++++---
>>  3 files changed, 598 insertions(+), 56 deletions(-)
>>
>> diff --git a/Documentation/gpu/i915.rst b/Documentation/gpu/i915.rst
>> index 117d2ab..3843ef6 100644
>> --- a/Documentation/gpu/i915.rst
>> +++ b/Documentation/gpu/i915.rst
>> @@ -356,4 +356,95 @@ switch_mm
>>  .. kernel-doc:: drivers/gpu/drm/i915/i915_trace.h
>>     :doc: switch_mm tracepoint
>>
>> +Perf
>> +====
>> +
>> +Overview
>> +--------
>> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
>> +   :doc: i915 Perf Overview
>> +
>> +Comparison with Core Perf
>> +-------------------------
>> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
>> +   :doc: i915 Perf History and Comparison with Core Perf
>> +
>> +i915 Driver Entry Points
>> +------------------------
>> +
>> +This section covers the entrypoints exported outside of i915_perf.c to
>> +integrate with drm/i915 and to handle the `DRM_I915_PERF_OPEN` ioctl.
>> +
>> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
>> +   :functions: i915_perf_init
>> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
>> +   :functions: i915_perf_fini
>> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
>> +   :functions: i915_perf_register
>> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
>> +   :functions: i915_perf_unregister
>> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
>> +   :functions: i915_perf_open_ioctl
>> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
>> +   :functions: i915_perf_release
>> +
>> +i915 Perf Stream
>> +----------------
>> +
>> +This section covers the stream-semantics-agnostic structures and functions
>> +for representing an i915 perf stream FD and associated file operations.
>> +
>> +.. kernel-doc:: drivers/gpu/drm/i915/i915_drv.h
>> +   :functions: i915_perf_stream
>> +.. kernel-doc:: drivers/gpu/drm/i915/i915_drv.h
>> +   :functions: i915_perf_stream_ops
>> +
>> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
>> +   :functions: read_properties_unlocked
>> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
>> +   :functions: i915_perf_open_ioctl_locked
>> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
>> +   :functions: i915_perf_destroy_locked
>> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
>> +   :functions: i915_perf_read
>> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
>> +   :functions: i915_perf_ioctl
>> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
>> +   :functions: i915_perf_enable_locked
>> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
>> +   :functions: i915_perf_disable_locked
>> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
>> +   :functions: i915_perf_poll
>> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
>> +   :functions: i915_perf_poll_locked
>> +
>> +i915 Perf Observation Architecture Stream
>> +-----------------------------------------
>> +
>> +.. kernel-doc:: drivers/gpu/drm/i915/i915_drv.h
>> +   :functions: i915_oa_ops
>> +
>> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
>> +   :functions: i915_oa_stream_init
>> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
>> +   :functions: i915_oa_read
>> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
>> +   :functions: i915_oa_stream_enable
>> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
>> +   :functions: i915_oa_stream_disable
>> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
>> +   :functions: i915_oa_wait_unlocked
>> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
>> +   :functions: i915_oa_poll_wait
>> +
>> +All i915 Perf Internals
>> +-----------------------
>> +
>> +This section simply includes all currently documented i915 perf internals, in
>> +no particular order, but may include some more minor utilities or platform
>> +specific details than found in the more high-level sections.
>> +
>> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
>> +   :internal:
>> +
>>  .. WARNING: DOCPROC directive not supported: !Cdrivers/gpu/drm/i915/i915_irq.c
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>> index 33758ac..49c7651 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -1847,89 +1847,186 @@ struct i915_oa_reg {
>>
>>  struct i915_perf_stream;
>>
>> +/**
>> + * struct i915_perf_stream_ops - the OPs to support a specific stream type
>> + */
>>  struct i915_perf_stream_ops {
>> -     /* Enables the collection of HW samples, either in response to
>> -      * I915_PERF_IOCTL_ENABLE or implicitly called when stream is
>> -      * opened without I915_PERF_FLAG_DISABLED.
>> +     /**
>> +      * @enable: Enables the collection of HW samples, either in response to
>> +      * `I915_PERF_IOCTL_ENABLE` or implicitly called when stream is opened
>> +      * without `I915_PERF_FLAG_DISABLED`.
>>        */
>>       void (*enable)(struct i915_perf_stream *stream);
>>
>> -     /* Disables the collection of HW samples, either in response to
>> -      * I915_PERF_IOCTL_DISABLE or implicitly called before
>> -      * destroying the stream.
>> +     /**
>> +      * @disable: Disables the collection of HW samples, either in response
>> +      * to `I915_PERF_IOCTL_DISABLE` or implicitly called before destroying
>> +      * the stream.
>>        */
>>       void (*disable)(struct i915_perf_stream *stream);
>>
>> -     /* Call poll_wait, passing a wait queue that will be woken
>> +     /**
>> +      * @poll_wait: Call poll_wait, passing a wait queue that will be woken
>>        * once there is something ready to read() for the stream
>>        */
>>       void (*poll_wait)(struct i915_perf_stream *stream,
>>                         struct file *file,
>>                         poll_table *wait);
>>
>> -     /* For handling a blocking read, wait until there is something
>> -      * to ready to read() for the stream. E.g. wait on the same
>> +     /**
>> +      * @wait_unlocked: For handling a blocking read, wait until there is
>> +      * something to ready to read() for the stream. E.g. wait on the same
>>        * wait queue that would be passed to poll_wait().
>>        */
>>       int (*wait_unlocked)(struct i915_perf_stream *stream);
>>
>> -     /* read - Copy buffered metrics as records to userspace
>> -      * @buf: the userspace, destination buffer
>> -      * @count: the number of bytes to copy, requested by userspace
>> -      * @offset: zero at the start of the read, updated as the read
>> -      *          proceeds, it represents how many bytes have been
>> -      *          copied so far and the buffer offset for copying the
>> -      *          next record.
>> +     /**
>> +      * @read: Copy buffered metrics as records to userspace
>> +      * **buf**: the userspace, destination buffer
>> +      * **count**: the number of bytes to copy, requested by userspace
>> +      * **offset**: zero at the start of the read, updated as the read
>> +      * proceeds, it represents how many bytes have been copied so far and
>> +      * the buffer offset for copying the next record.
>>        *
>> -      * Copy as many buffered i915 perf samples and records for
>> -      * this stream to userspace as will fit in the given buffer.
>> +      * Copy as many buffered i915 perf samples and records for this stream
>> +      * to userspace as will fit in the given buffer.
>>        *
>> -      * Only write complete records; returning -ENOSPC if there
>> -      * isn't room for a complete record.
>> +      * Only write complete records; returning -%ENOSPC if there isn't room
>> +      * for a complete record.
>>        *
>> -      * Return any error condition that results in a short read
>> -      * such as -ENOSPC or -EFAULT, even though these may be
>> -      * squashed before returning to userspace.
>> +      * Return any error condition that results in a short read such as
>> +      * -%ENOSPC or -%EFAULT, even though these may be squashed before
>> +      * returning to userspace.
>>        */
>>       int (*read)(struct i915_perf_stream *stream,
>>                   char __user *buf,
>>                   size_t count,
>>                   size_t *offset);
>>
>> -     /* Cleanup any stream specific resources.
>> +     /**
>> +      * @destroy: Cleanup any stream specific resources.
>>        *
>>        * The stream will always be disabled before this is called.
>>        */
>>       void (*destroy)(struct i915_perf_stream *stream);
>>  };
>>
>> +/**
>> + * struct i915_perf_stream - state for a single open stream FD
>> + */
>>  struct i915_perf_stream {
>> +     /**
>> +      * @dev_priv: i915 drm device
>> +      */
>>       struct drm_i915_private *dev_priv;
>>
>> +     /**
>> +      * @link: Links the stream into ``&drm_i915_private->streams``
>> +      */
>>       struct list_head link;
>>
>> +     /**
>> +      * @sample_flags: Flags representing the `DRM_I915_PERF_PROP_SAMPLE_*`
>> +      * properties given when opening a stream, representing the contents
>> +      * of a single sample as read() by userspace.
>> +      */
>>       u32 sample_flags;
>> +
>> +     /**
>> +      * @sample_size: Considering the configured contents of a sample
>> +      * combined with the required header size, this is the total size
>> +      * of a single sample record.
>> +      */
>>       int sample_size;
>>
>> +     /**
>> +      * @ctx: %NULL if measuring system-wide across all contexts or a
>> +      * specific context that is being monitored.
>> +      */
>>       struct i915_gem_context *ctx;
>> +
>> +     /**
>> +      * @enabled: Whether the stream is currently enabled, considering
>> +      * whether the stream was opened in a disabled state and based
>> +      * on `I915_PERF_IOCTL_ENABLE` and `I915_PERF_IOCTL_DISABLE` calls.
>> +      */
>>       bool enabled;
>>
>> +     /**
>> +      * @ops: The callbacks providing the implementation of this specific
>> +      * type of configured stream.
>> +      */
>>       const struct i915_perf_stream_ops *ops;
>>  };
>>
>> +/**
>> + * struct i915_oa_ops - Gen specific implementation of an OA unit stream
>> + */
>>  struct i915_oa_ops {
>> +     /**
>> +      * @init_oa_buffer: Resets the head and tail pointers of the
>> +      * circular buffer for periodic OA reports.
>> +      *
>> +      * Called when first opening a stream for OA metrics, but also may be
>> +      * called in response to an OA buffer overflow or other error
>> +      * condition.
>> +      *
>> +      * Note it may be necessary to clear the full OA buffer here as part of
>> +      * maintaining the invariable that new reports must be written to
>> +      * zeroed memory for us to be able to reliable detect if an expected
>> +      * report has not yet landed in memory.  (At least on Haswell the OA
>> +      * buffer tail pointer is not synchronized with reports being visible
>> +      * to the CPU)
>> +      */
>>       void (*init_oa_buffer)(struct drm_i915_private *dev_priv);
>> +
>> +     /**
>> +      * @enable_metric_set: Applies any MUX configuration to set up the
>> +      * Boolean and Custom (B/C) counters that are part of the counter
>> +      * reports being sampled. May apply system constraints such as
>> +      * disabling EU clock gating as required.
>> +      */
>>       int (*enable_metric_set)(struct drm_i915_private *dev_priv);
>> +
>> +     /**
>> +      * @disable_metric_set: Remove system constraints associated with using
>> +      * the OA unit.
>> +      */
>>       void (*disable_metric_set)(struct drm_i915_private *dev_priv);
>> +
>> +     /**
>> +      * @oa_enable: Enable periodic sampling
>> +      */
>>       void (*oa_enable)(struct drm_i915_private *dev_priv);
>> +
>> +     /**
>> +      * @oa_disable: Disable periodic sampling
>> +      */
>>       void (*oa_disable)(struct drm_i915_private *dev_priv);
>> -     void (*update_oacontrol)(struct drm_i915_private *dev_priv);
>> -     void (*update_hw_ctx_id_locked)(struct drm_i915_private *dev_priv,
>> -                                     u32 ctx_id);
>> +
>> +     /**
>> +      * @read: Copy data from the circular OA buffer into a given userspace
>> +      * buffer.
>> +      */
>>       int (*read)(struct i915_perf_stream *stream,
>>                   char __user *buf,
>>                   size_t count,
>>                   size_t *offset);
>> +
>> +     /**
>> +      * @oa_buffer_is_empty: Check if OA buffer empty (false positives OK)
>> +      *
>> +      * This is either called via fops or the poll check hrtimer (atomic
>> +      * ctx) without any locks taken.
>> +      *
>> +      * It's safe to read OA config state here unlocked, assuming that this
>> +      * is only called while the stream is enabled, while the global OA
>> +      * configuration can't be modified.
>> +      *
>> +      * Efficiency is more important than avoiding some false positives
>> +      * here, which will be handled gracefully - likely resulting in an
>> +      * %EAGAIN error for userspace.
>> +      */
>>       bool (*oa_buffer_is_empty)(struct drm_i915_private *dev_priv);
>>  };
>>
>> diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
>> index 0c6e124..ae7bd0e 100644
>> --- a/drivers/gpu/drm/i915/i915_perf.c
>> +++ b/drivers/gpu/drm/i915/i915_perf.c
>> @@ -26,7 +26,7 @@
>>
>>
>>  /**
>> - * DOC: i915 Perf, streaming API for GPU metrics
>> + * DOC: i915 Perf Overview
>>   *
>>   * Gen graphics supports a large number of performance counters that can help
>>   * driver and application developers understand and optimize their use of the
>> @@ -45,6 +45,10 @@
>>   * privileges by default, unless changed via the dev.i915.perf_event_paranoid
>>   * sysctl option.
>>   *
>> + */
>> +
>> +/**
>> + * DOC: i915 Perf History and Comparison with Core Perf
>>   *
>>   * The interface was initially inspired by the core Perf infrastructure but
>>   * some notable differences are:
>> @@ -75,8 +79,8 @@
>>   * gets copied from the GPU mapped buffers to userspace buffers.
>>   *
>>   *
>> - * Some notes regarding Linux Perf:
>> - * --------------------------------
>> + * Issues hit with first prototype based on Core Perf
>> + * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>   *
>>   * The first prototype of this driver was based on the core perf
>>   * infrastructure, and while we did make that mostly work, with some changes to
>> @@ -135,7 +139,7 @@
>>   *   for combining with the side-band raw reports it captures using
>>   *   MI_REPORT_PERF_COUNT commands.
>>   *
>> - *   _ As a side note on perf's grouping feature; there was also some concern
>> + *   - As a side note on perf's grouping feature; there was also some concern
>>   *     that using PERF_FORMAT_GROUP as a way to pack together counter values
>>   *     would quite drastically inflate our sample sizes, which would likely
>>   *     lower the effective sampling resolutions we could use when the available
>> @@ -277,6 +281,20 @@ static struct i915_oa_format hsw_oa_formats[I915_OA_FORMAT_MAX] = {
>>
>>  #define SAMPLE_OA_REPORT      (1<<0)
>>
>> +/**
>> + * struct perf_open_properties - for validated properties given to open a stream
>> + * @sample_flags: `DRM_I915_PERF_PROP_SAMPLE_*` properties are tracked as flags
>> + * @single_context: Whether a single or all gpu contexts should be monitored
>> + * @ctx_handle: A gem ctx handle for use with @single_context
>> + * @metrics_set: An ID for an OA unit metric set advertised via sysfs
>> + * @oa_format: An OA unit HW report format
>> + * @oa_periodic: Whether to enable periodic OA unit sampling
>> + * @oa_period_exponent: The OA unit sampling period is derived from this
>> + *
>> + * As read_properties_unlocked() enumerates and validates the properties given
>> + * to open a stream of metrics the configuration is built up in the structure
>> + * which starts out zero initialized.
>> + */
>>  struct perf_open_properties {
>>       u32 sample_flags;
>>
>> @@ -314,7 +332,19 @@ static bool gen7_oa_buffer_is_empty_fop_unlocked(struct drm_i915_private *dev_pr
>>  }
>>
>>  /**
>> - * Appends a status record to a userspace read() buffer.
>> + * append_oa_status - Appends a status record to a userspace read() buffer.
>> + * @stream: An i915-perf stream opened for OA metrics
>> + * @buf: destination buffer given by userspace
>> + * @count: the number of bytes userspace wants to read
>> + * @offset: (inout): the current position for writing into @buf
>> + * @type: The kind of status to report to userspace
>> + *
>> + * Writes a status record (such as `DRM_I915_PERF_RECORD_OA_REPORT_LOST`)
>> + * into the userspace read() buffer.
>> + *
>> + * The @buf @offset will only be updated on success.
>> + *
>> + * Returns: 0 on success, negative error code on failure.
>>   */
>>  static int append_oa_status(struct i915_perf_stream *stream,
>>                           char __user *buf,
>> @@ -336,7 +366,21 @@ static int append_oa_status(struct i915_perf_stream *stream,
>>  }
>>
>>  /**
>> - * Copies single OA report into userspace read() buffer.
>> + * append_oa_sample - Copies single OA report into userspace read() buffer.
>> + * @stream: An i915-perf stream opened for OA metrics
>> + * @buf: destination buffer given by userspace
>> + * @count: the number of bytes userspace wants to read
>> + * @offset: (inout): the current position for writing into @buf
>> + * @report: A single OA report to (optionally) include as part of the sample
>> + *
>> + * The contents of a sample are configured through `DRM_I915_PERF_PROP_SAMPLE_*`
>> + * properties when opening a stream, tracked as `stream->sample_flags`. This
>> + * function copies the requested components of a single sample to the given
>> + * read() @buf.
>> + *
>> + * The @buf @offset will only be updated on success.
>> + *
>> + * Returns: 0 on success, negative error code on failure.
>>   */
>>  static int append_oa_sample(struct i915_perf_stream *stream,
>>                           char __user *buf,
>> @@ -380,10 +424,8 @@ static int append_oa_sample(struct i915_perf_stream *stream,
>>   * @head_ptr: (inout): the current oa buffer cpu read position
>>   * @tail: the current oa buffer gpu write position
>>   *
>> - * Returns 0 on success, negative error code on failure.
>> - *
>> - * Notably any error condition resulting in a short read (-ENOSPC or
>> - * -EFAULT) will be returned even though one or more records may
>> + * Notably any error condition resulting in a short read (-%ENOSPC or
>> + * -%EFAULT) will be returned even though one or more records may
>>   * have been successfully copied. In this case it's up to the caller
>>   * to decide if the error should be squashed before returning to
>>   * userspace.
>> @@ -392,6 +434,8 @@ static int append_oa_sample(struct i915_perf_stream *stream,
>>   * tail, so the head chases the tail?... If you think that's mad
>>   * and back-to-front you're not alone, but this follows the
>>   * Gen PRM naming convention.
>> + *
>> + * Returns: 0 on success, negative error code on failure.
>>   */
>>  static int gen7_append_oa_reports(struct i915_perf_stream *stream,
>>                                 char __user *buf,
>> @@ -496,6 +540,22 @@ static int gen7_append_oa_reports(struct i915_perf_stream *stream,
>>       return ret;
>>  }
>>
>> +/**
>> + * gen7_oa_read - copy status records then buffered OA reports
>> + * @stream: An i915-perf stream opened for OA metrics
>> + * @buf: destination buffer given by userspace
>> + * @count: the number of bytes userspace wants to read
>> + * @offset: (inout): the current position for writing into @buf
>> + *
>> + * Checks Gen 7 specific OA unit status registers and if necessary appends
>> + * corresponding status records for userspace (such as for a buffer full
>> + * condition) and then initiate appending any buffered OA reports.
>> + *
>> + * Updates @offset according to the number of bytes successfully copied into
>> + * the userspace buffer.
>> + *
>> + * Returns: zero on success or a negative error code
>> + */
>>  static int gen7_oa_read(struct i915_perf_stream *stream,
>>                       char __user *buf,
>>                       size_t count,
>> @@ -597,6 +657,20 @@ static int gen7_oa_read(struct i915_perf_stream *stream,
>>       return ret;
>>  }
>>
>> +/**
>> + * i915_oa_wait_unlocked - handles blocking IO until OA data available
>> + * @stream: An i915-perf stream opened for OA metrics
>> + *
>> + * Called when userspace tries to read() from a blocking stream FD opened
>> + * for OA metrics. It waits until the hrtimer callback finds a non-empty
>> + * OA buffer and wakes us.
>> + *
>> + * Note: it's acceptable to have this return with some false positives
>> + * since any subsequent read handling will return -EAGAIN if there isn't
>> + * really data ready for userspace yet.
>> + *
>> + * Returns: zero on success or a negative error code
>> + */
>>  static int i915_oa_wait_unlocked(struct i915_perf_stream *stream)
>>  {
>>       struct drm_i915_private *dev_priv = stream->dev_priv;
>> @@ -615,6 +689,16 @@ static int i915_oa_wait_unlocked(struct i915_perf_stream *stream)
>>                                       !dev_priv->perf.oa.ops.oa_buffer_is_empty(dev_priv));
>>  }
>>
>> +/**
>> + * i915_oa_poll_wait - call poll_wait() for an OA stream poll()
>> + * @stream: An i915-perf stream opened for OA metrics
>> + * @file: An i915 perf stream file
>> + * @wait: poll() state table
>> + *
>> + * For handling userspace polling on an i915 perf stream opened for OA metrics,
>> + * this starts a poll_wait with the wait queue that our hrtimer callback wakes
>> + * when it sees data ready to read in the circular OA buffer.
>> + */
>>  static void i915_oa_poll_wait(struct i915_perf_stream *stream,
>>                             struct file *file,
>>                             poll_table *wait)
>> @@ -624,6 +708,18 @@ static void i915_oa_poll_wait(struct i915_perf_stream *stream,
>>       poll_wait(file, &dev_priv->perf.oa.poll_wq, wait);
>>  }
>>
>> +/**
>> + * i915_oa_read - just calls through to &i915_oa_ops->read
>> + * @stream: An i915-perf stream opened for OA metrics
>> + * @buf: destination buffer given by userspace
>> + * @count: the number of bytes userspace wants to read
>> + * @offset: (inout): the current position for writing into @buf
>> + *
>> + * Updates @offset according to the number of bytes successfully copied into
>> + * the userspace buffer.
>> + *
>> + * Returns: zero on success or a negative error code
>> + */
>>  static int i915_oa_read(struct i915_perf_stream *stream,
>>                       char __user *buf,
>>                       size_t count,
>> @@ -634,9 +730,15 @@ static int i915_oa_read(struct i915_perf_stream *stream,
>>       return dev_priv->perf.oa.ops.read(stream, buf, count, offset);
>>  }
>>
>> -/* Determine the render context hw id, and ensure it remains fixed for the
>> +/**
>> + * oa_get_render_ctx_id - determine and hold ctx hw id
>> + * @stream: An i915-perf stream opened for OA metrics
>> + *
>> + * Determine the render context hw id, and ensure it remains fixed for the
>>   * lifetime of the stream. This ensures that we don't have to worry about
>>   * updating the context ID in OACONTROL on the fly.
>> + *
>> + * Returns: zero on success or a negative error code
>>   */
>>  static int oa_get_render_ctx_id(struct i915_perf_stream *stream)
>>  {
>> @@ -673,6 +775,13 @@ static int oa_get_render_ctx_id(struct i915_perf_stream *stream)
>>       return ret;
>>  }
>>
>> +/**
>> + * oa_put_render_ctx_id - counterpart to oa_get_render_ctx_id releases hold
>> + * @stream: An i915-perf stream opened for OA metrics
>> + *
>> + * In case anything needed doing to ensure the context HW ID would remain valid
>> + * for the lifetime of the stream, then that can be undone here.
>> + */
>>  static void oa_put_render_ctx_id(struct i915_perf_stream *stream)
>>  {
>>       struct drm_i915_private *dev_priv = stream->dev_priv;
>> @@ -945,6 +1054,15 @@ static void gen7_oa_enable(struct drm_i915_private *dev_priv)
>>       spin_unlock_irqrestore(&dev_priv->perf.hook_lock, flags);
>>  }
>>
>> +/**
>> + * i915_oa_stream_enable - handle `I915_PERF_IOCTL_ENABLE` for OA stream
>> + * @stream: An i915 perf stream opened for OA metrics
>> + *
>> + * [Re]enables hardware periodic sampling according to the period configured
>> + * when opening the stream. This also starts a hrtimer that will periodically
>> + * check for data in the circular OA buffer for notifying userspace (e.g.
>> + * during a read() or poll()).
>> + */
>>  static void i915_oa_stream_enable(struct i915_perf_stream *stream)
>>  {
>>       struct drm_i915_private *dev_priv = stream->dev_priv;
>> @@ -962,6 +1080,14 @@ static void gen7_oa_disable(struct drm_i915_private *dev_priv)
>>       I915_WRITE(GEN7_OACONTROL, 0);
>>  }
>>
>> +/**
>> + * i915_oa_stream_disable - handle `I915_PERF_IOCTL_DISABLE` for OA stream
>> + * @stream: An i915 perf stream opened for OA metrics
>> + *
>> + * Stops the OA unit from periodically writing counter reports into the
>> + * circular OA buffer. This also stops the hrtimer that periodically checks for
>> + * data in the circular OA buffer, for notifying userspace.
>> + */
>>  static void i915_oa_stream_disable(struct i915_perf_stream *stream)
>>  {
>>       struct drm_i915_private *dev_priv = stream->dev_priv;
>> @@ -987,6 +1113,24 @@ static const struct i915_perf_stream_ops i915_oa_stream_ops = {
>>       .read = i915_oa_read,
>>  };
>>
>> +/**
>> + * i915_oa_stream_init - validate combined props for OA stream and init
>> + * @stream: An i915 perf stream
>> + * @param: The open parameters passed to `DRM_I915_PERF_OPEN`
>> + * @props: The property state that configures stream (individually validated)
>> + *
>> + * While read_properties_unlocked() validates properties in isolation it
>> + * doesn't ensure that the combination necessarily makes sense.
>> + *
>> + * At this point it has been determined that userspace wants a stream of
>> + * OA metrics, but still we need to further validate the combined
>> + * properties are OK.
>> + *
>> + * If the configuration makes sense then we can allocate memory for
>> + * a circular OA buffer and apply the requested metric set configuration.
>> + *
>> + * Returns: zero on success or a negative error code.
>> + */
>>  static int i915_oa_stream_init(struct i915_perf_stream *stream,
>>                              struct drm_i915_perf_open_param *param,
>>                              struct perf_open_properties *props)
>> @@ -1111,6 +1255,31 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream,
>>       return ret;
>>  }
>>
>> +/**
>> + * i915_perf_read_locked - &i915_perf_stream_ops->read with error normalisation
>> + * @stream: An i915 perf stream
>> + * @file: An i915 perf stream file
>> + * @buf: destination buffer given by userspace
>> + * @count: the number of bytes userspace wants to read
>> + * @ppos: (inout) file seek position (unused)
>> + *
>> + * Besides wrapping &i915_perf_stream_ops->read this provides a common place to
>> + * ensure that if we've successfully copied any data then reporting that takes
>> + * precedence over any internal error status, so the data isn't lost.
>> + *
>> + * For example ret will be -ENOSPC whenever there is more buffered data than
>> + * can be copied to userspace, but that's only interesting if we weren't able
>> + * to copy some data because it implies the userspace buffer is too small to
>> + * receive a single record (and we never split records).
>> + *
>> + * Another case with ret == -EFAULT is more of a grey area since it would seem
>> + * like bad form for userspace to ask us to overrun its buffer, but the user
>> + * knows best:
>> + *
>> + *   http://yarchive.net/comp/linux/partial_reads_writes.html
>> + *
>> + * Returns: The number of bytes copied or a negative error code on failure.
>> + */
>>  static ssize_t i915_perf_read_locked(struct i915_perf_stream *stream,
>>                                    struct file *file,
>>                                    char __user *buf,
>> @@ -1126,25 +1295,27 @@ static ssize_t i915_perf_read_locked(struct i915_perf_stream *stream,
>>       size_t offset = 0;
>>       int ret = stream->ops->read(stream, buf, count, &offset);
>>
>> -     /* If we've successfully copied any data then reporting that
>> -      * takes precedence over any internal error status, so the
>> -      * data isn't lost.
>> -      *
>> -      * For example ret will be -ENOSPC whenever there is more
>> -      * buffered data than can be copied to userspace, but that's
>> -      * only interesting if we weren't able to copy some data
>> -      * because it implies the userspace buffer is too small to
>> -      * receive a single record (and we never split records).
>> -      *
>> -      * Another case with ret == -EFAULT is more of a grey area
>> -      * since it would seem like bad form for userspace to ask us
>> -      * to overrun its buffer, but the user knows best:
>> -      *
>> -      *   http://yarchive.net/comp/linux/partial_reads_writes.html
>> -      */
>>       return offset ?: (ret ?: -EAGAIN);
>>  }
>>
>> +/**
>> + * i915_perf_read - handles read() FOP for i915 perf stream FDs
>> + * @file: An i915 perf stream file
>> + * @buf: destination buffer given by userspace
>> + * @count: the number of bytes userspace wants to read
>> + * @ppos: (inout) file seek position (unused)
>> + *
>> + * The entry point for handling a read() on a stream file descriptor from
>> + * userspace. Most of the work is left to the i915_perf_read_locked() and
>> + * &i915_perf_stream_ops->read but to save having stream implementations (of
>> + * which we might have multiple later) we handle blocking read here.
>> + *
>> + * We can also consistently treat trying to read from a disabled stream
>> + * as an IO error so implementations can assume the stream is enabled
>> + * while reading.
>> + *
>> + * Returns: The number of bytes copied or a negative error code on failure.
>> + */
>>  static ssize_t i915_perf_read(struct file *file,
>>                             char __user *buf,
>>                             size_t count,
>> @@ -1211,6 +1382,22 @@ static enum hrtimer_restart oa_poll_check_timer_cb(struct hrtimer *hrtimer)
>>       return HRTIMER_RESTART;
>>  }
>>
>> +/**
>> + * i915_perf_poll_locked - poll_wait() with a suitable wait queue for stream
>> + * @dev_priv: i915 device instance
>> + * @stream: An i915 perf stream
>> + * @file: An i915 perf stream file
>> + * @wait: poll() state table
>> + *
>> + * For handling userspace polling on an i915 perf stream, this calls through to
>> + * &i915_perf_stream_ops->poll_wait to call poll_wait() with a wait queue that
>> + * will be woken for new stream data.
>> + *
>> + * Note: The &drm_i915_private->perf.lock mutex has been taken to serialize
>> + * with any non-file-operation driver hooks.
>> + *
>> + * Returns: any poll events that are ready without sleeping
>> + */
>>  static unsigned int i915_perf_poll_locked(struct drm_i915_private *dev_priv,
>>                                         struct i915_perf_stream *stream,
>>                                         struct file *file,
>> @@ -1232,6 +1419,19 @@ static unsigned int i915_perf_poll_locked(struct drm_i915_private *dev_priv,
>>       return events;
>>  }
>>
>> +/**
>> + * i915_perf_poll - call poll_wait() with a suitable wait queue for stream
>> + * @file: An i915 perf stream file
>> + * @wait: poll() state table
>> + *
>> + * For handling userspace polling on an i915 perf stream, this ensures
>> + * poll_wait() gets called with a wait queue that will be woken for new stream
>> + * data.
>> + *
>> + * Note: Implementation deferred to i915_perf_poll_locked()
>> + *
>> + * Returns: any poll events that are ready without sleeping
>> + */
>>  static unsigned int i915_perf_poll(struct file *file, poll_table *wait)
>>  {
>>       struct i915_perf_stream *stream = file->private_data;
>> @@ -1245,6 +1445,16 @@ static unsigned int i915_perf_poll(struct file *file, poll_table *wait)
>>       return ret;
>>  }
>>
>> +/**
>> + * i915_perf_enable_locked - handle `I915_PERF_IOCTL_ENABLE` ioctl
>> + * @stream: A disabled i915 perf stream
>> + *
>> + * [Re]enables the associated capture of data for this stream.
>> + *
>> + * If a stream was previously enabled then there's currently no intention
>> + * to provide userspace any guarantee about the preservation of previously
>> + * buffered data.
>> + */
>>  static void i915_perf_enable_locked(struct i915_perf_stream *stream)
>>  {
>>       if (stream->enabled)
>> @@ -1257,6 +1467,20 @@ static void i915_perf_enable_locked(struct i915_perf_stream *stream)
>>               stream->ops->enable(stream);
>>  }
>>
>> +/**
>> + * i915_perf_disable_locked - handle `I915_PERF_IOCTL_DISABLE` ioctl
>> + * @stream: An enabled i915 perf stream
>> + *
>> + * Disables the associated capture of data for this stream.
>> + *
>> + * The intention is that disabling an re-enabling a stream will ideally be
>> + * cheaper than destroying and re-opening a stream with the same configuration,
>> + * though there are no formal guarantees about what state or buffered data
>> + * must be retained between disabling and re-enabling a stream.
>> + *
>> + * Note: while a stream is disabled it's considered an error for userspace
>> + * to attempt to read from the stream (-EIO).
>> + */
>>  static void i915_perf_disable_locked(struct i915_perf_stream *stream)
>>  {
>>       if (!stream->enabled)
>> @@ -1269,6 +1493,18 @@ static void i915_perf_disable_locked(struct i915_perf_stream *stream)
>>               stream->ops->disable(stream);
>>  }
>>
>> +/**
>> + * i915_perf_ioctl - support ioctl() usage with i915 perf stream FDs
>> + * @stream: An i915 perf stream
>> + * @cmd: the ioctl request
>> + * @arg: the ioctl data
>> + *
>> + * Note: The &drm_i915_private->perf.lock mutex has been taken to serialize
>> + * with any non-file-operation driver hooks.
>> + *
>> + * Returns: zero on success or a negative error code. Returns -EINVAL for
>> + * an unknown ioctl request.
>> + */
>>  static long i915_perf_ioctl_locked(struct i915_perf_stream *stream,
>>                                  unsigned int cmd,
>>                                  unsigned long arg)
>> @@ -1285,6 +1521,17 @@ static long i915_perf_ioctl_locked(struct i915_perf_stream *stream,
>>       return -EINVAL;
>>  }
>>
>> +/**
>> + * i915_perf_ioctl - support ioctl() usage with i915 perf stream FDs
>> + * @file: An i915 perf stream file
>> + * @cmd: the ioctl request
>> + * @arg: the ioctl data
>> + *
>> + * Implementation deferred to i915_perf_ioctl_locked().
>> + *
>> + * Returns: zero on success or a negative error code. Returns -EINVAL for
>> + * an unknown ioctl request.
>> + */
>>  static long i915_perf_ioctl(struct file *file,
>>                           unsigned int cmd,
>>                           unsigned long arg)
>> @@ -1300,6 +1547,16 @@ static long i915_perf_ioctl(struct file *file,
>>       return ret;
>>  }
>>
>> +/**
>> + * i915_perf_destroy_locked - destroy an i915 perf stream
>> + * @stream: An i915 perf stream
>> + *
>> + * Frees all resources associated with the given i915 perf @stream, disabling
>> + * any associated data capture in the process.
>> + *
>> + * Note: The &drm_i915_private->perf.lock mutex has been taken to serialize
>> + * with any non-file-operation driver hooks.
>> + */
>>  static void i915_perf_destroy_locked(struct i915_perf_stream *stream)
>>  {
>>       struct drm_i915_private *dev_priv = stream->dev_priv;
>> @@ -1321,6 +1578,17 @@ static void i915_perf_destroy_locked(struct i915_perf_stream *stream)
>>       kfree(stream);
>>  }
>>
>> +/**
>> + * i915_perf_release - handles userspace close() of a stream file
>> + * @inode: anonymous inode associated with file
>> + * @file: An i915 perf stream file
>> + *
>> + * Cleans up any resources associated with an open i915 perf stream file.
>> + *
>> + * NB: close() can't really fail from the userspace point of view.
>> + *
>> + * Returns: zero on success or a negative error code.
>> + */
>>  static int i915_perf_release(struct inode *inode, struct file *file)
>>  {
>>       struct i915_perf_stream *stream = file->private_data;
>> @@ -1365,6 +1633,30 @@ lookup_context(struct drm_i915_private *dev_priv,
>>       return ctx;
>>  }
>>
>> +/**
>> + * i915_perf_open_ioctl_locked - DRM ioctl() for userspace to open a stream FD
>> + * @dev_priv: i915 device instance
>> + * @param: The open parameters passed to 'DRM_I915_PERF_OPEN`
>> + * @props: individually validated u64 property value pairs
>> + * @file: drm file
>> + *
>> + * See i915_perf_ioctl_open() for interface details.
>> + *
>> + * Implements further stream config validation and stream initialization on
>> + * behalf of i915_perf_open_ioctl() with the &drm_i915_private->perf.lock mutex
>> + * taken to serialize with any non-file-operation driver hooks.
>> + *
>> + * Note: at this point the @props have only been validated in isolation and
>> + * it's still necessary to validate that the combination of properties makes
>> + * sense.
>> + *
>> + * In the case where userspace is interested in OA unit metrics then further
>> + * config validation and stream initialization details will be handled by
>> + * i915_oa_stream_init(). The code here should only validate config state that
>> + * will be relevant to all stream types / backends.
>> + *
>> + * Returns: zero on success or a negative error code.
>> + */
>>  static int
>>  i915_perf_open_ioctl_locked(struct drm_i915_private *dev_priv,
>>                           struct drm_i915_perf_open_param *param,
>> @@ -1459,12 +1751,20 @@ i915_perf_open_ioctl_locked(struct drm_i915_private *dev_priv,
>>       return ret;
>>  }
>>
>> -/* Note we copy the properties from userspace outside of the i915 perf
>> - * mutex to avoid an awkward lockdep with mmap_sem.
>> +/**
>> + * read_properties_unlocked - validate + copy userspace stream open properties
>> + * @dev_priv: i915 device instance
>> + * @uprops: The array of u64 key value pairs given by userspace
>> + * @n_props: The number of key value pairs expected in @uprops
>> + * @props: The stream configuration built up while validating properties
>>   *
>>   * Note this function only validates properties in isolation it doesn't
>>   * validate that the combination of properties makes sense or that all
>>   * properties necessary for a particular kind of stream have been set.
>> + *
>> + * Note that there currently aren't any ordering requirements for properties so
>> + * we shouldn't validate or assume anything about ordering here. This doesn't
>> + * rule out defining new properties with ordering requirements in the future.
>>   */
>>  static int read_properties_unlocked(struct drm_i915_private *dev_priv,
>>                                   u64 __user *uprops,
>> @@ -1586,6 +1886,30 @@ static int read_properties_unlocked(struct drm_i915_private *dev_priv,
>>       return 0;
>>  }
>>
>> +/**
>> + * i915_perf_open_ioctl - DRM ioctl() for userspace to open a stream FD
>> + * @dev: drm device
>> + * @data: ioctl data copied from userspace (unvalidated)
>> + * @file: drm file
>> + *
>> + * Validates the stream open parameters given by userspace including flags
>> + * and an array of u64 key, value pair properties.
>> + *
>> + * Very little is assumed up front about the nature of the stream being
>> + * opened (for instance we don't assume it's for periodic OA unit metrics). An
>> + * i915-perf stream is expected to be a suitable interface for other forms of
>> + * buffered data written by the GPU besides periodic OA metrics.
>> + *
>> + * Note we copy the properties from userspace outside of the i915 perf
>> + * mutex to avoid an awkward lockdep with mmap_sem.
>> + *
>> + * Most of the implementation details are handled by
>> + * i915_perf_open_ioctl_locked() after taking the &drm_i915_private->perf.lock
>> + * mutex for serializing with any non-file-operation driver hooks.
>> + *
>> + * Return: A newly opened i915 Perf stream file descriptor or negative
>> + * error code on failure.
>> + */
>>  int i915_perf_open_ioctl(struct drm_device *dev, void *data,
>>                        struct drm_file *file)
>>  {
>> @@ -1622,6 +1946,14 @@ int i915_perf_open_ioctl(struct drm_device *dev, void *data,
>>       return ret;
>>  }
>>
>> +/**
>> + * i915_perf_register - exposes i915-perf to userspace
>> + * @dev_priv: i915 device instance
>> + *
>> + * In particular OA metric sets are advertised under a sysfs metrics/
>> + * directory allowing userspace to enumerate valid IDs that can be
>> + * used to open an i915-perf stream.
>> + */
>>  void i915_perf_register(struct drm_i915_private *dev_priv)
>>  {
>>       if (!IS_HASWELL(dev_priv))
>> @@ -1651,6 +1983,15 @@ void i915_perf_register(struct drm_i915_private *dev_priv)
>>       mutex_unlock(&dev_priv->perf.lock);
>>  }
>>
>> +/**
>> + * i915_perf_unregister - hide i915-perf from userspace
>> + * @dev_priv: i915 device instance
>> + *
>> + * i915-perf state cleanup is split up into an 'unregister' and
>> + * 'deinit' phase where the interface is first hidden from
>> + * userspace by i915_perf_unregister() before cleaning up
>> + * remaining state in i915_perf_fini().
>> + */
>>  void i915_perf_unregister(struct drm_i915_private *dev_priv)
>>  {
>>       if (!IS_HASWELL(dev_priv))
>> @@ -1707,6 +2048,15 @@ static struct ctl_table dev_root[] = {
>>       {}
>>  };
>>
>> +/**
>> + * i915_perf_init - initialize i915-perf state on module load
>> + * @dev_priv: i915 device instance
>> + *
>> + * Initializes i915-perf state without exposing anything to userspace.
>> + *
>> + * Note: i915-perf initialization is split into an 'init' and 'register'
>> + * phase with the i915_perf_register() exposing state to userspace.
>> + */
>>  void i915_perf_init(struct drm_i915_private *dev_priv)
>>  {
>>       if (!IS_HASWELL(dev_priv))
>> @@ -1742,6 +2092,10 @@ void i915_perf_init(struct drm_i915_private *dev_priv)
>>       dev_priv->perf.initialized = true;
>>  }
>>
>> +/**
>> + * i915_perf_fini - Counter part to i915_perf_init()
>> + * @dev_priv: i915 device instance
>> + */
>>  void i915_perf_fini(struct drm_i915_private *dev_priv)
>>  {
>>       if (!dev_priv->perf.initialized)
>> --
>> 2.10.2
>>
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] drm/i915/perf: More documentation hooked to i915.rst
  2016-12-09 12:41       ` Robert Bragg
@ 2016-12-12  7:34         ` Daniel Vetter
  2016-12-12 10:00           ` Jani Nikula
  0 siblings, 1 reply; 16+ messages in thread
From: Daniel Vetter @ 2016-12-12  7:34 UTC (permalink / raw)
  To: Robert Bragg; +Cc: Daniel Vetter, Intel Graphics Development

On Fri, Dec 09, 2016 at 12:41:08PM +0000, Robert Bragg wrote:
> On Thu, Dec 8, 2016 at 3:53 PM, Daniel Vetter <daniel@ffwll.ch> wrote:
> > On Wed, Dec 07, 2016 at 09:40:33PM +0000, Robert Bragg wrote:
> >> This adds a 'Perf' section to i915.rst with the following sub sections:
> >> - Overview
> >> - Comparison with Core Perf
> >> - i915 Driver Entry Points
> >> - i915 Perf Stream
> >> - i915 Perf Observation Architecture Stream
> >> - All i915 Perf Internals
> >>
> >> v2:
> >>     section headers in i915.rst (Daniel Vetter)
> >>     missing symbol docs + other fixups (Matthew Auld)
> >>
> >> Signed-off-by: Robert Bragg <robert@sixbynine.org>
> >> Reviewed-by: Matthew Auld <matthew.auld@intel.com>
> >> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> >
> > Obligatory bikeshed about explicitly listening all functions, but hey
> > great docs, I'll merge ;-)
> 
> Yeah, understood.
> 
> Not sure if you saw my last reply to this, but I think listing symbols
> explicitly is appropriate for documentation, where the ordering and
> groupings of symbols for the sake of introducing them in logical
> stages doesn't necessarily match that found in code.
> 
> Not perfect in itself but having worked with gtk-doc in the past I
> thought it made sense, and worked reasonably enough, to explicitly
> list section headers and symbols in a sections.txt file to control the
> order things are presented to developers. Of course we sometimes
> forgot to add new symbols to the sections.txt file which is a trade
> off, but it never seemed like that big of an issue, compared to having
> the extra editorial control over your docs.

It's a trade-off, and given how bad we are with keeping docs up-to-date
I'm heavily in favour of optimizing for that over perfectly looking docs.
If we'd have a full-time doc tech writer to help maintain this I wouldn't
bikeshed at all ;-)

btw 0day spotted an issue in your patch, I didn't look at it though.

> Jani's comment was also fair about the fact that the current approach
> results in lots of runs of the kernel-doc script, but that one really
> seems like an implementation detail where we should avoid compromising
> docs to workaround the tools (ofc just my interpretation that it would
> be a compromise). I did half consider modifying kernel-doc to allow
> for the -functions argument to take an ordered list instead of a set,
> but couldn't quite muster the energy to modify a perl script :-p
> 
> As a bit of an asside; last year for another project of mine I once
> wrote an experimental tool for extracting gtk-doc comments from code
> to using the python clang bindings:
> https://github.com/rib/clib/blob/master/site/rst-from-c.py
> 
> It's crossed my mind to play around with being able to extract
> kernel-doc with clang along similar lines which could potentially
> track more type information so the docs could support more than just
> .. c:function and .. c:type, such as c:macro and c:member and could
> better handle things like function pointers as members of structs.
> Maybe it's an idea worth considering.

Talk to Jani, he has it implemented already somewhere. But kernel-doc is a
horrible, and we have about 50k existing comments that all need to still
parse, so it's really tricky to move forward.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] drm/i915/perf: More documentation hooked to i915.rst
  2016-12-12  7:34         ` Daniel Vetter
@ 2016-12-12 10:00           ` Jani Nikula
  0 siblings, 0 replies; 16+ messages in thread
From: Jani Nikula @ 2016-12-12 10:00 UTC (permalink / raw)
  To: Daniel Vetter, Robert Bragg; +Cc: Daniel Vetter, Intel Graphics Development

On Mon, 12 Dec 2016, Daniel Vetter <daniel@ffwll.ch> wrote:
> On Fri, Dec 09, 2016 at 12:41:08PM +0000, Robert Bragg wrote:
>> As a bit of an asside; last year for another project of mine I once
>> wrote an experimental tool for extracting gtk-doc comments from code
>> to using the python clang bindings:
>> https://github.com/rib/clib/blob/master/site/rst-from-c.py
>> 
>> It's crossed my mind to play around with being able to extract
>> kernel-doc with clang along similar lines which could potentially
>> track more type information so the docs could support more than just
>> .. c:function and .. c:type, such as c:macro and c:member and could
>> better handle things like function pointers as members of structs.
>> Maybe it's an idea worth considering.
>
> Talk to Jani, he has it implemented already somewhere. But kernel-doc is a
> horrible, and we have about 50k existing comments that all need to still
> parse, so it's really tricky to move forward.

Yeah, I admit to having a little side project to use python-clang to
extract documentation comments directly into Sphinx.

But I've pretty much decided I'm not going to pursue making it a
replacement for kernel-doc. There is no point in that unless it has a
chance of being merged upstream, and I think depending on clang is a
non-starter for that. I'm just totally disinterested in having that
discussion.

Instead, I'm trying to make it a really small and clean alternative for
Doxygen in Sphinx based documentation.

BR,
Jani.


-- 
Jani Nikula, Intel Open Source Technology Center
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] drm/i915/perf: More documentation hooked to i915.rst
  2016-12-01 12:12   ` Jani Nikula
@ 2016-12-02 15:17     ` Robert Bragg
  0 siblings, 0 replies; 16+ messages in thread
From: Robert Bragg @ 2016-12-02 15:17 UTC (permalink / raw)
  To: Jani Nikula; +Cc: Daniel Vetter, Intel Graphics Development


[-- Attachment #1.1: Type: text/plain, Size: 1947 bytes --]

On Thu, Dec 1, 2016 at 12:12 PM, Jani Nikula <jani.nikula@linux.intel.com>
wrote:

> On Wed, 30 Nov 2016, Daniel Vetter <daniel@ffwll.ch> wrote:
> > On Tue, Nov 29, 2016 at 05:00:55PM +0000, Robert Bragg wrote:
> >> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> >> +   :functions: i915_perf_init
> >> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> >> +   :functions: i915_perf_fini
> >> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> >> +   :functions: i915_perf_register
> >> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> >> +   :functions: i915_perf_unregister
> >> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> >> +   :functions: i915_perf_open_ioctl
> >> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> >> +   :functions: i915_perf_release
> >
> > One potential issue with listing everything explicitly is that if someone
> > ever (and this is bound to happen) adds a new function, they'll forget to
> > add it. Hence we just pull them all in, and if you want to refernce some
> > specifically, do that in the overview sections.
>
> One real issue with listing everything separately is that kernel-doc
> parses the source file once per every kernel-doc directive.
>

Yeah, this is unfortunate and I'd originally hoped I could pass an ordered
list which could reduce how often kernel-doc is run. In practice I haven't
seen a performance problem with doing this though.


>
> Also, doesn't Sphinx complain about not having a blank line to end the
> indented block after the directive? It might not, but I thought it
> might.
>

Apparently it's ok, I've been generating and previewing the documentation
and haven't seen a warning about this.

>From the restructure text spec, regarding white space:
"Blank lines may be omitted when the markup makes element separation
unambiguous, in conjunction with indentation."

Regards,
- Robert


>
> BR,
> Jani.
>
>
> --
> Jani Nikula, Intel Open Source Technology Center
>

[-- Attachment #1.2: Type: text/html, Size: 3090 bytes --]

[-- Attachment #2: Type: text/plain, Size: 160 bytes --]

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] drm/i915/perf: More documentation hooked to i915.rst
  2016-11-30 19:41 ` Daniel Vetter
  2016-12-01 12:12   ` Jani Nikula
@ 2016-12-02 15:05   ` Robert Bragg
  1 sibling, 0 replies; 16+ messages in thread
From: Robert Bragg @ 2016-12-02 15:05 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Daniel Vetter, Intel Graphics Development


[-- Attachment #1.1: Type: text/plain, Size: 37802 bytes --]

On Nov 30, 2016 19:41, "Daniel Vetter" <daniel@ffwll.ch> wrote:
>
> On Tue, Nov 29, 2016 at 05:00:55PM +0000, Robert Bragg wrote:
> > This adds a 'Perf' section to i915.rst with the following sub sections:
> > - Overview
> > - Comparison with Core Perf
> > - i915 Driver Entry Points
> > - i915 Perf Stream
> > - i915 Perf Observation Architecture Stream
> > - All i915 Perf Internals
> >
> > Signed-off-by: Robert Bragg <robert@sixbynine.org>
> > Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
>
> Two style bikesheds below, feel free to ignore.
>
> > ---
> >  Documentation/gpu/i915.rst       |  92 +++++++++++++
> >  drivers/gpu/drm/i915/i915_drv.h  | 151 ++++++++++++++++----
> >  drivers/gpu/drm/i915/i915_perf.c | 289 ++++++++++++++++++++++++++++++
+++++----
> >  3 files changed, 478 insertions(+), 54 deletions(-)
> >
> > diff --git a/Documentation/gpu/i915.rst b/Documentation/gpu/i915.rst
> > index 117d2ab..714bd4b 100644
> > --- a/Documentation/gpu/i915.rst
> > +++ b/Documentation/gpu/i915.rst
> > @@ -356,4 +356,96 @@ switch_mm
> >  .. kernel-doc:: drivers/gpu/drm/i915/i915_trace.h
> >     :doc: switch_mm tracepoint
> >
> > +Perf
> > +====
> > +
> > +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> > +   :doc: i915 Perf Overview
> > +
> > +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> > +   :doc: i915 Perf History and Comparison with Core Perf
> > +
> > +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> > +   :doc: i915 Perf File Operations
>
> You have the headings in the DOC comments itself, which works until
> someone reorganizes stuff. Then it tends to fall apart badly.

Yeah, could be better.

>
> > +
> > +i915 Driver Entry Points
> > +------------------------
> > +
> > +This section covers the entrypoints exported outside of i915_perf.c to
> > +integrate with drm/i915 and to handle the `DRM_I915_PERF_OPEN` ioctl.
> > +
> > +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> > +   :functions: i915_perf_init
> > +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> > +   :functions: i915_perf_fini
> > +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> > +   :functions: i915_perf_register
> > +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> > +   :functions: i915_perf_unregister
> > +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> > +   :functions: i915_perf_open_ioctl
> > +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> > +   :functions: i915_perf_release
>
> One potential issue with listing everything explicitly is that if someone
> ever (and this is bound to happen) adds a new function, they'll forget to
> add it. Hence we just pull them all in, and if you want to refernce some
> specifically, do that in the overview sections. And also sprinkle lots of
> cross-references all over to make groups of functions easier to discover.
Without any structure it just didn't seem like documentation; just a dump
of internals info which didn't look that helpful to me.

There are some fairly noteworthy separations of responsibilities between
functions providing the i915 perf stream infrastructure and then the code
for OA unit streams, and then code specifically for Haswell. Splitting them
into sections seems worthwhile to me.

I don't think it's that big a deal listing symbols individually and
maintaining this when adding new symbols. Having maintained Cogl using
gtk-doc where symbols had to be explicitly listed in a -sections.txt file
to added them to the documentation, that wasn't too bad. It's true that
slip ups happen, but having control over the order symbols are presented
seems good to me. It could be nice if an ordered list could be passed to
:functions: to reduce how many times the corresponding perl script runs.

It could maybe be good if there was some way to tag/label symbols in
kerneldoc for selection in restructured text. It would also be nice if the
tooling understood what symbols were in the document so far, so it could
somehow be possible to have a dumping ground section for 'everything else'
at the end.

Regards,
- Robert

>
> But in the end your docs, your turf ;-)
> -Daniel
>
> > +
> > +i915 Perf Stream
> > +----------------
> > +
> > +This section covers the stream-semantics-agnostic structures and
functions
> > +for representing an i915 perf stream FD and associated file operations.
> > +
> > +.. kernel-doc:: drivers/gpu/drm/i915/i915_drv.h
> > +   :functions: i915_perf_stream
> > +.. kernel-doc:: drivers/gpu/drm/i915/i915_drv.h
> > +   :functions: i915_perf_stream_ops
> > +
> > +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> > +   :functions: read_properties_unlocked
> > +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> > +   :functions: i915_perf_open_ioctl_unlocked
> > +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> > +   :functions: i915_perf_destroy_unlocked
> > +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> > +   :functions: i915_perf_read
> > +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> > +   :functions: i915_perf_ioctl
> > +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> > +   :functions: i915_perf_enable_unlocked
> > +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> > +   :functions: i915_perf_disable_unlocked
> > +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> > +   :functions: i915_perf_poll
> > +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> > +   :functions: i915_perf_poll_unlocked
> > +
> > +i915 Perf Observation Architecture Stream
> > +-----------------------------------------
> > +
> > +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> > +   :functions: OA_BUFFER_SIZE
> > +.. kernel-doc:: drivers/gpu/drm/i915/i915_drv.h
> > +   :functions: i915_oa_ops
> > +
> > +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> > +   :functions: i915_oa_stream_init
> > +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> > +   :functions: i915_oa_read
> > +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> > +   :functions: i915_oa_stream_enable
> > +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> > +   :functions: i915_oa_stream_disable
> > +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> > +   :functions: i915_oa_wait_unlocked
> > +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> > +   :functions: i915_oa_poll_wait
> > +
> > +All i915 Perf Internals
> > +-----------------------
> > +
> > +This section simply includes all currently documented i915 perf
internals, in
> > +no particular order, but may include some more minor utilities or
platform
> > +specific details than found in the more high-level sections.
> > +
> > +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> > +   :internal:
> > +
> >  .. WARNING: DOCPROC directive not supported:
!Cdrivers/gpu/drm/i915/i915_irq.c
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h
b/drivers/gpu/drm/i915/i915_drv.h
> > index 1ec9619..9f92755 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -1827,89 +1827,186 @@ struct i915_oa_reg {
> >
> >  struct i915_perf_stream;
> >
> > +/**
> > + * struct i915_perf_stream_ops - the OPs to support a specific stream
type
> > + */
> >  struct i915_perf_stream_ops {
> > -     /* Enables the collection of HW samples, either in response to
> > -      * I915_PERF_IOCTL_ENABLE or implicitly called when stream is
> > -      * opened without I915_PERF_FLAG_DISABLED.
> > +     /**
> > +      * @enable: Enables the collection of HW samples, either in
response to
> > +      * I915_PERF_IOCTL_ENABLE or implicitly called when stream is
opened
> > +      * without I915_PERF_FLAG_DISABLED.
> >        */
> >       void (*enable)(struct i915_perf_stream *stream);
> >
> > -     /* Disables the collection of HW samples, either in response to
> > -      * I915_PERF_IOCTL_DISABLE or implicitly called before
> > -      * destroying the stream.
> > +     /**
> > +      * @disable: Disables the collection of HW samples, either in
response
> > +      * to I915_PERF_IOCTL_DISABLE or implicitly called before
destroying
> > +      * the stream.
> >        */
> >       void (*disable)(struct i915_perf_stream *stream);
> >
> > -     /* Call poll_wait, passing a wait queue that will be woken
> > +     /**
> > +      * @poll_wait: Call poll_wait, passing a wait queue that will be
woken
> >        * once there is something ready to read() for the stream
> >        */
> >       void (*poll_wait)(struct i915_perf_stream *stream,
> >                         struct file *file,
> >                         poll_table *wait);
> >
> > -     /* For handling a blocking read, wait until there is something
> > -      * to ready to read() for the stream. E.g. wait on the same
> > +     /**
> > +      * @wait_unlocked: For handling a blocking read, wait until there
is
> > +      * something to ready to read() for the stream. E.g. wait on the
same
> >        * wait queue that would be passed to poll_wait().
> >        */
> >       int (*wait_unlocked)(struct i915_perf_stream *stream);
> >
> > -     /* read - Copy buffered metrics as records to userspace
> > -      * @buf: the userspace, destination buffer
> > -      * @count: the number of bytes to copy, requested by userspace
> > -      * @offset: zero at the start of the read, updated as the read
> > -      *          proceeds, it represents how many bytes have been
> > -      *          copied so far and the buffer offset for copying the
> > -      *          next record.
> > +     /**
> > +      * @read: Copy buffered metrics as records to userspace
> > +      * **buf**: the userspace, destination buffer
> > +      * **count**: the number of bytes to copy, requested by userspace
> > +      * **offset**: zero at the start of the read, updated as the read
> > +      * proceeds, it represents how many bytes have been copied so far
and
> > +      * the buffer offset for copying the next record.
> >        *
> > -      * Copy as many buffered i915 perf samples and records for
> > -      * this stream to userspace as will fit in the given buffer.
> > +      * Copy as many buffered i915 perf samples and records for this
stream
> > +      * to userspace as will fit in the given buffer.
> >        *
> > -      * Only write complete records; returning -ENOSPC if there
> > -      * isn't room for a complete record.
> > +      * Only write complete records; returning -ENOSPC if there isn't
room
> > +      * for a complete record.
> >        *
> > -      * Return any error condition that results in a short read
> > -      * such as -ENOSPC or -EFAULT, even though these may be
> > -      * squashed before returning to userspace.
> > +      * Return any error condition that results in a short read such as
> > +      * -ENOSPC or -EFAULT, even though these may be squashed before
> > +      * returning to userspace.
> >        */
> >       int (*read)(struct i915_perf_stream *stream,
> >                   char __user *buf,
> >                   size_t count,
> >                   size_t *offset);
> >
> > -     /* Cleanup any stream specific resources.
> > +     /**
> > +      * @destroy: Cleanup any stream specific resources.
> >        *
> >        * The stream will always be disabled before this is called.
> >        */
> >       void (*destroy)(struct i915_perf_stream *stream);
> >  };
> >
> > +/**
> > + * struct i915_perf_stream - state for a single open stream FD
> > + */
> >  struct i915_perf_stream {
> > +     /**
> > +      * @dev_priv: i915 drm device
> > +      */
> >       struct drm_i915_private *dev_priv;
> >
> > +     /**
> > +      * @link: Links the stream into ``&drm_i915_private->streams``
> > +      */
> >       struct list_head link;
> >
> > +     /**
> > +      * @sample_flags: Flags representing the
`DRM_I915_PERF_PROP_SAMPLE_*`
> > +      * properties given when opening a stream, representing the
contents
> > +      * of a single sample as read() by userspace.
> > +      */
> >       u32 sample_flags;
> > +
> > +     /**
> > +      * @sample_size: Considering the configured contents of a sample
> > +      * combined with the required header size, this is the total size
> > +      * of a single sample record.
> > +      */
> >       int sample_size;
> >
> > +     /**
> > +      * @ctx: %NULL if measuring system-wide across all contexts or a
> > +      * specific context that is being monitored.
> > +      */
> >       struct i915_gem_context *ctx;
> > +
> > +     /**
> > +      * @enabled: Whether the stream is currently enabled, considering
> > +      * whether the stream was opened in a disabled state and based
> > +      * on `I915_PERF_IOCTL_ENABLE` and `I915_PERF_IOCTL_DISABLE`
calls.
> > +      */
> >       bool enabled;
> >
> > +     /**
> > +      * @ops: The callbacks providing the implementation of this
specific
> > +      * type of configured stream.
> > +      */
> >       const struct i915_perf_stream_ops *ops;
> >  };
> >
> > +/**
> > + * struct i915_oa_ops - Gen specific implementation of an OA unit
stream
> > + */
> >  struct i915_oa_ops {
> > +     /**
> > +      * @init_oa_buffer: Resets the head and tail pointers of the
> > +      * circular buffer for periodic OA reports.
> > +      *
> > +      * Called when first opening a stream for OA metrics, but also
may be
> > +      * called in response to an OA buffer overflow or other error
> > +      * condition.
> > +      *
> > +      * Note it may be necessary to clear the full OA buffer here as
part of
> > +      * maintaining the invariable that new reports must be written to
> > +      * zeroed memory for us to be able to reliable detect if an
expected
> > +      * report has not yet landed in memory.  (At least on Haswell the
OA
> > +      * buffer tail pointer is not synchronized with reports being
visible
> > +      * to the CPU)
> > +      */
> >       void (*init_oa_buffer)(struct drm_i915_private *dev_priv);
> > +
> > +     /**
> > +      * @enable_metric_set: Applies any MUX configuration to set up the
> > +      * Boolean and Custom (B/C) counters that are part of the counter
> > +      * reports being sampled. May apply system constraints such as
> > +      * disabling EU clock gating as required.
> > +      */
> >       int (*enable_metric_set)(struct drm_i915_private *dev_priv);
> > +
> > +     /**
> > +      * @disable_metric_set: Remove system constraints associated with
using
> > +      * the OA unit.
> > +      */
> >       void (*disable_metric_set)(struct drm_i915_private *dev_priv);
> > +
> > +     /**
> > +      * @oa_enable: Enable periodic sampling
> > +      */
> >       void (*oa_enable)(struct drm_i915_private *dev_priv);
> > +
> > +     /**
> > +      * @oa_disable: Disable periodic sampling
> > +      */
> >       void (*oa_disable)(struct drm_i915_private *dev_priv);
> > -     void (*update_oacontrol)(struct drm_i915_private *dev_priv);
> > -     void (*update_hw_ctx_id_locked)(struct drm_i915_private *dev_priv,
> > -                                     u32 ctx_id);
> > +
> > +     /**
> > +      * @read: Copy data from the circular OA buffer into a given
userspace
> > +      * buffer.
> > +      */
> >       int (*read)(struct i915_perf_stream *stream,
> >                   char __user *buf,
> >                   size_t count,
> >                   size_t *offset);
> > +
> > +     /**
> > +      * @oa_buffer_is_empty: Check if OA buffer empty (false positives
OK)
> > +      *
> > +      * This is either called via fops or the poll check hrtimer
(atomic
> > +      * ctx) without any locks taken.
> > +      *
> > +      * It's safe to read OA config state here unlocked, assuming that
this
> > +      * is only called while the stream is enabled, while the global OA
> > +      * configuration can't be modified.
> > +      *
> > +      * Efficiency is more important than avoiding some false positives
> > +      * here, which will be handled gracefully - likely resulting in an
> > +      * EAGAIN error for userspace.
> > +      */
> >       bool (*oa_buffer_is_empty)(struct drm_i915_private *dev_priv);
> >  };
> >
> > diff --git a/drivers/gpu/drm/i915/i915_perf.c
b/drivers/gpu/drm/i915/i915_perf.c
> > index 68b7c27..0be8cef 100644
> > --- a/drivers/gpu/drm/i915/i915_perf.c
> > +++ b/drivers/gpu/drm/i915/i915_perf.c
> > @@ -26,7 +26,10 @@
> >
> >
> >  /**
> > - * DOC: i915 Perf, streaming API for GPU metrics
> > + * DOC: i915 Perf Overview
> > + *
> > + * Overview
> > + * --------
> >   *
> >   * Gen graphics supports a large number of performance counters that
can help
> >   * driver and application developers understand and optimize their use
of the
> > @@ -45,6 +48,13 @@
> >   * privileges by default, unless changed via the
dev.i915.perf_event_paranoid
> >   * sysctl option.
> >   *
> > + */
> > +
> > +/**
> > + * DOC: i915 Perf History and Comparison with Core Perf
> > + *
> > + * Comparison with Core Perf
> > + * -------------------------
> >   *
> >   * The interface was initially inspired by the core Perf
infrastructure but
> >   * some notable differences are:
> > @@ -75,8 +85,8 @@
> >   * gets copied from the GPU mapped buffers to userspace buffers.
> >   *
> >   *
> > - * Some notes regarding Linux Perf:
> > - * --------------------------------
> > + * Issues hit with first prototype based on Core Perf
> > + * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >   *
> >   * The first prototype of this driver was based on the core perf
> >   * infrastructure, and while we did make that mostly work, with some
changes to
> > @@ -135,7 +145,7 @@
> >   *   for combining with the side-band raw reports it captures using
> >   *   MI_REPORT_PERF_COUNT commands.
> >   *
> > - *   _ As a side note on perf's grouping feature; there was also some
concern
> > + *   - As a side note on perf's grouping feature; there was also some
concern
> >   *     that using PERF_FORMAT_GROUP as a way to pack together counter
values
> >   *     would quite drastically inflate our sample sizes, which would
likely
> >   *     lower the effective sampling resolutions we could use when the
available
> > @@ -277,6 +287,20 @@ static struct i915_oa_format
hsw_oa_formats[I915_OA_FORMAT_MAX] = {
> >
> >  #define SAMPLE_OA_REPORT      (1<<0)
> >
> > +/**
> > + * struct perf_open_properties - for validated properties given to
open a stream
> > + * @sample_flags: `DRM_I915_PERF_PROP_SAMPLE_*` properties are tracked
as flags
> > + * @single_context: Whether a single or all gpu contexts should be
monitored
> > + * @ctx_handle: A gem ctx handle for use with @single_context
> > + * @metrics_set: An ID for an OA unit metric set advertised via sysfs
> > + * @oa_format: An OA unit HW report format
> > + * @oa_periodic: Whether to enable periodic OA unit sampling
> > + * @oa_period_exponent: The OA unit sampling period is derived from
this
> > + *
> > + * As read_properties_unlocked() enumerates and validates the
properties given
> > + * to open a stream of metrics the configuration is built up in the
structure
> > + * which starts out zero initialized.
> > + */
> >  struct perf_open_properties {
> >       u32 sample_flags;
> >
> > @@ -314,7 +338,19 @@ static bool gen7_oa_buffer_is_empty_fop_unlocked(struct
drm_i915_private *dev_pr
> >  }
> >
> >  /**
> > - * Appends a status record to a userspace read() buffer.
> > + * append_oa_status - Appends a status record to a userspace read()
buffer.
> > + * @stream: An i915-perf stream opened for OA metrics
> > + * @buf: destination buffer given by userspace
> > + * @count: the number of bytes userspace wants to read
> > + * @offset: (inout): the current position for writing into @buf
> > + * @type: The kind of status to report to userspace
> > + *
> > + * Writes a status record (such as `DRM_I915_PERF_RECORD_OA_
REPORT_LOST`)
> > + * into the userspace read() buffer.
> > + *
> > + * The @buf @offset will only be updated on success.
> > + *
> > + * Returns: 0 on success, negative error code on failure.
> >   */
> >  static int append_oa_status(struct i915_perf_stream *stream,
> >                           char __user *buf,
> > @@ -336,7 +372,21 @@ static int append_oa_status(struct
i915_perf_stream *stream,
> >  }
> >
> >  /**
> > - * Copies single OA report into userspace read() buffer.
> > + * append_oa_sample - Copies single OA report into userspace read()
buffer.
> > + * @stream: An i915-perf stream opened for OA metrics
> > + * @buf: destination buffer given by userspace
> > + * @count: the number of bytes userspace wants to read
> > + * @offset: (inout): the current position for writing into @buf
> > + * @report: A single OA report to (optionally) include as part of the
sample
> > + *
> > + * The contents of a sample are configured through
`DRM_I915_PERF_PROP_SAMPLE_*`
> > + * properties when opening a stream, tracked as
`stream->sample_flags`. This
> > + * function copies the requested components of a single sample to the
given
> > + * read() @buf.
> > + *
> > + * The @buf @offset will only be updated on success.
> > + *
> > + * Returns: 0 on success, negative error code on failure.
> >   */
> >  static int append_oa_sample(struct i915_perf_stream *stream,
> >                           char __user *buf,
> > @@ -380,8 +430,6 @@ static int append_oa_sample(struct i915_perf_stream
*stream,
> >   * @head_ptr: (inout): the current oa buffer cpu read position
> >   * @tail: the current oa buffer gpu write position
> >   *
> > - * Returns 0 on success, negative error code on failure.
> > - *
> >   * Notably any error condition resulting in a short read (-ENOSPC or
> >   * -EFAULT) will be returned even though one or more records may
> >   * have been successfully copied. In this case it's up to the caller
> > @@ -392,6 +440,8 @@ static int append_oa_sample(struct i915_perf_stream
*stream,
> >   * tail, so the head chases the tail?... If you think that's mad
> >   * and back-to-front you're not alone, but this follows the
> >   * Gen PRM naming convention.
> > + *
> > + * Returns: 0 on success, negative error code on failure.
> >   */
> >  static int gen7_append_oa_reports(struct i915_perf_stream *stream,
> >                                 char __user *buf,
> > @@ -496,6 +546,22 @@ static int gen7_append_oa_reports(struct
i915_perf_stream *stream,
> >       return ret;
> >  }
> >
> > +/**
> > + * gen7_oa_read - copy status records then buffered OA reports
> > + * @stream: An i915-perf stream opened for OA metrics
> > + * @buf: destination buffer given by userspace
> > + * @count: the number of bytes userspace wants to read
> > + * @offset: (inout): the current position for writing into @buf
> > + *
> > + * Checks Gen 7 specific OA unit status registers and if necessary
appends
> > + * corresponding status records for userspace (such as for a buffer
full
> > + * condition) and then initiate appending any buffered OA reports.
> > + *
> > + * Updates @offset according to the number of bytes successfully
copied into
> > + * the userspace buffer.
> > + *
> > + * Returns: zero on success or a negative error code
> > + */
> >  static int gen7_oa_read(struct i915_perf_stream *stream,
> >                       char __user *buf,
> >                       size_t count,
> > @@ -597,6 +663,20 @@ static int gen7_oa_read(struct i915_perf_stream
*stream,
> >       return ret;
> >  }
> >
> > +/**
> > + * i915_oa_wait_unlocked - handles blocking IO until OA data available
> > + * @stream: An i915-perf stream opened for OA metrics
> > + *
> > + * Called when userspace tries to read() from a blocking stream FD
opened
> > + * for OA metrics. It waits until the hrtimer callback finds a
non-empty
> > + * OA buffer and wakes us.
> > + *
> > + * Note: it's acceptable to have this return with some false positives
> > + * since any subsequent read handling will return EAGAIN if there isn't
> > + * really data ready for userspace yet.
> > + *
> > + * Returns: zero on success or a negative error code
> > + */
> >  static int i915_oa_wait_unlocked(struct i915_perf_stream *stream)
> >  {
> >       struct drm_i915_private *dev_priv = stream->dev_priv;
> > @@ -615,6 +695,16 @@ static int i915_oa_wait_unlocked(struct
i915_perf_stream *stream)
> >                                       !dev_priv->perf.oa.ops.oa_
buffer_is_empty(dev_priv));
> >  }
> >
> > +/**
> > + * i915_oa_poll_wait - call poll_wait() for an OA stream poll()
> > + * @stream: An i915-perf stream opened for OA metrics
> > + * @file: An i915 perf stream file
> > + * @wait: poll() state table
> > + *
> > + * For handling userspace polling on an i915 perf stream opened for OA
metrics,
> > + * this starts a poll_wait with the wait queue that our hrtimer
callback wakes
> > + * when it sees data ready to read in the circular OA buffer.
> > + */
> >  static void i915_oa_poll_wait(struct i915_perf_stream *stream,
> >                             struct file *file,
> >                             poll_table *wait)
> > @@ -624,6 +714,18 @@ static void i915_oa_poll_wait(struct
i915_perf_stream *stream,
> >       poll_wait(file, &dev_priv->perf.oa.poll_wq, wait);
> >  }
> >
> > +/**
> > + * i915_oa_read - just calls through to ``&i915_oa_ops->read``
> > + * @stream: An i915-perf stream opened for OA metrics
> > + * @buf: destination buffer given by userspace
> > + * @count: the number of bytes userspace wants to read
> > + * @offset: (inout): the current position for writing into @buf
> > + *
> > + * Updates @offset according to the number of bytes successfully
copied into
> > + * the userspace buffer.
> > + *
> > + * Returns: zero on success or a negative error code
> > + */
> >  static int i915_oa_read(struct i915_perf_stream *stream,
> >                       char __user *buf,
> >                       size_t count,
> > @@ -634,9 +736,15 @@ static int i915_oa_read(struct i915_perf_stream
*stream,
> >       return dev_priv->perf.oa.ops.read(stream, buf, count, offset);
> >  }
> >
> > -/* Determine the render context hw id, and ensure it remains fixed for
the
> > +/**
> > + * oa_get_render_ctx_id - determine and hold ctx hw id
> > + * @stream: An i915-perf stream opened for OA metrics
> > + *
> > + * Determine the render context hw id, and ensure it remains fixed for
the
> >   * lifetime of the stream. This ensures that we don't have to worry
about
> >   * updating the context ID in OACONTROL on the fly.
> > + *
> > + * Returns: zero on success or a negative error code
> >   */
> >  static int oa_get_render_ctx_id(struct i915_perf_stream *stream)
> >  {
> > @@ -673,6 +781,13 @@ static int oa_get_render_ctx_id(struct
i915_perf_stream *stream)
> >       return ret;
> >  }
> >
> > +/**
> > + * oa_put_render_ctx_id - counterpart to oa_get_render_ctx_id releases
hold
> > + * @stream: An i915-perf stream opened for OA metrics
> > + *
> > + * In case anything needed doing to ensure the context HW ID would
remain valid
> > + * for the lifetime of the stream, then that can be undone here.
> > + */
> >  static void oa_put_render_ctx_id(struct i915_perf_stream *stream)
> >  {
> >       struct drm_i915_private *dev_priv = stream->dev_priv;
> > @@ -945,6 +1060,15 @@ static void gen7_oa_enable(struct
drm_i915_private *dev_priv)
> >       spin_unlock_irqrestore(&dev_priv->perf.hook_lock, flags);
> >  }
> >
> > +/**
> > + * i915_oa_stream_enable - handle I915_PERF_IOCTL_ENABLE for OA stream
> > + * @stream: An i915 perf stream opened for OA metrics
> > + *
> > + * [Re]enables hardware periodic sampling according to the period
configured
> > + * when opening the stream. This also starts a hrtimer that will
periodically
> > + * check for data in the circular OA buffer for notifying userspace
(e.g.
> > + * during a read() or poll()).
> > + */
> >  static void i915_oa_stream_enable(struct i915_perf_stream *stream)
> >  {
> >       struct drm_i915_private *dev_priv = stream->dev_priv;
> > @@ -962,6 +1086,14 @@ static void gen7_oa_disable(struct
drm_i915_private *dev_priv)
> >       I915_WRITE(GEN7_OACONTROL, 0);
> >  }
> >
> > +/**
> > + * i915_oa_stream_disable - handle I915_PERF_IOCTL_DISABLE for OA
stream
> > + * @stream: An i915 perf stream opened for OA metrics
> > + *
> > + * Stops the OA unit from periodically writing counter reports into the
> > + * circular OA buffer. This also stops the hrtimer that periodically
checks for
> > + * data in the circular OA buffer, for notifying userspace.
> > + */
> >  static void i915_oa_stream_disable(struct i915_perf_stream *stream)
> >  {
> >       struct drm_i915_private *dev_priv = stream->dev_priv;
> > @@ -987,6 +1119,24 @@ static const struct i915_perf_stream_ops
i915_oa_stream_ops = {
> >       .read = i915_oa_read,
> >  };
> >
> > +/**
> > + * i915_oa_stream_init - validate combined props for OA stream and init
> > + * @stream: An i915 perf stream
> > + * @param: The open parameters passed to 'DRM_I915_PERF_OPEN`
> > + * @props: The property state that configures stream (individually
validated)
> > + *
> > + * While read_properties_unlocked() validates properties in isolation
it
> > + * doesn't ensure that the combination necessarily makes sense.
> > + *
> > + * At this point it has been determined that userspace wants a stream
of
> > + * OA metrics, but still we need to further validate the combined
> > + * properties are OK.
> > + *
> > + * If the configuration makes sense then we can allocate memory for
> > + * a circular OA buffer and apply the requested metric set
configuration.
> > + *
> > + * Returns: zero on success or a negative error code.
> > + */
> >  static int i915_oa_stream_init(struct i915_perf_stream *stream,
> >                              struct drm_i915_perf_open_param *param,
> >                              struct perf_open_properties *props)
> > @@ -1110,6 +1260,31 @@ static int i915_oa_stream_init(struct
i915_perf_stream *stream,
> >       return ret;
> >  }
> >
> > +/**
> > + * i915_perf_read_locked - wrap stream->ops->read with error
notmalisation
> > + * @stream: An i915 perf stream
> > + * @file: An i915 perf stream file
> > + * @buf: destination buffer given by userspace
> > + * @count: the number of bytes userspace wants to read
> > + * @ppos: (inout) file seek position (unused)
> > + *
> > + * Besides wrapping stream->ops->read() this provides a common place
to ensure
> > + * that if we've successfully copied any data then reporting that takes
> > + * precedence over any internal error status, so the data isn't lost.
> > + *
> > + * For example ret will be -ENOSPC whenever there is more buffered
data than
> > + * can be copied to userspace, but that's only interesting if we
weren't able
> > + * to copy some data because it implies the userspace buffer is too
small to
> > + * receive a single record (and we never split records).
> > + *
> > + * Another case with ret == -EFAULT is more of a grey area since it
would seem
> > + * like bad form for userspace to ask us to overrun its buffer, but
the user
> > + * knows best:
> > + *
> > + *   http://yarchive.net/comp/linux/partial_reads_writes.html
> > + *
> > + * Returns: The number of bytes copied or a negative error code on
failure.
> > + */
> >  static ssize_t i915_perf_read_locked(struct i915_perf_stream *stream,
> >                                    struct file *file,
> >                                    char __user *buf,
> > @@ -1125,25 +1300,27 @@ static ssize_t i915_perf_read_locked(struct
i915_perf_stream *stream,
> >       size_t offset = 0;
> >       int ret = stream->ops->read(stream, buf, count, &offset);
> >
> > -     /* If we've successfully copied any data then reporting that
> > -      * takes precedence over any internal error status, so the
> > -      * data isn't lost.
> > -      *
> > -      * For example ret will be -ENOSPC whenever there is more
> > -      * buffered data than can be copied to userspace, but that's
> > -      * only interesting if we weren't able to copy some data
> > -      * because it implies the userspace buffer is too small to
> > -      * receive a single record (and we never split records).
> > -      *
> > -      * Another case with ret == -EFAULT is more of a grey area
> > -      * since it would seem like bad form for userspace to ask us
> > -      * to overrun its buffer, but the user knows best:
> > -      *
> > -      *   http://yarchive.net/comp/linux/partial_reads_writes.html
> > -      */
> >       return offset ?: (ret ?: -EAGAIN);
> >  }
> >
> > +/**
> > + * i915_perf_read - handles read() FOP for i915 perf stream FDs
> > + * @file: An i915 perf stream file
> > + * @buf: destination buffer given by userspace
> > + * @count: the number of bytes userspace wants to read
> > + * @ppos: (inout) file seek position (unused)
> > + *
> > + * The entry point for handling a read() on a stream file descriptor
from
> > + * userspace. Most of the work is left to the i915_perf_read_locked()
and
> > + * stream->op->read() but to save having stream implementations (of
which
> > + * we might have multiple later) we handle blocking read here.
> > + *
> > + * We can also consistently treat trying to read from a disable stream
> > + * as an IO error so implementations can assume the stream is enabled
> > + * while reading.
> > + *
> > + * Returns: The number of bytes copied or a negative error code on
failure.
> > + */
> >  static ssize_t i915_perf_read(struct file *file,
> >                             char __user *buf,
> >                             size_t count,
> > @@ -1458,12 +1635,20 @@ i915_perf_open_ioctl_locked(struct
drm_i915_private *dev_priv,
> >       return ret;
> >  }
> >
> > -/* Note we copy the properties from userspace outside of the i915 perf
> > - * mutex to avoid an awkward lockdep with mmap_sem.
> > +/**
> > + * read_properties_unlocked - validate + copy userspace stream open
properties
> > + * @dev_priv: i915 device instance
> > + * @uprops: The array of u64 key value pairs given by userspace
> > + * @n_props: The number of key value pairs expected in @uprops
> > + * @props: The stream configuration built up while validating
properties
> >   *
> >   * Note this function only validates properties in isolation it doesn't
> >   * validate that the combination of properties makes sense or that all
> >   * properties necessary for a particular kind of stream have been set.
> > + *
> > + * Note that there currently aren't any ordering requirements for
properties so
> > + * we shouldn't validate or assume anything about ordering here. This
doesn't
> > + * rule out defining new properties with ordering requirements in the
future.
> >   */
> >  static int read_properties_unlocked(struct drm_i915_private *dev_priv,
> >                                   u64 __user *uprops,
> > @@ -1585,6 +1770,26 @@ static int read_properties_unlocked(struct
drm_i915_private *dev_priv,
> >       return 0;
> >  }
> >
> > +/**
> > + * i915_perf_open_ioctl - DRM ioctl() for userspace to open a stream FD
> > + * @dev: drm device
> > + * @data: ioctl data copied from userspace (unvalidated)
> > + * @file: drm file
> > + *
> > + * Validates the stream open parameters given by userspace including
flags
> > + * and an array of u64 key, value pair properties.
> > + *
> > + * Very little is assumed up front about the nature of the stream being
> > + * opened (for instance we don't assume it's for periodic OA unit
metrics). An
> > + * i915-perf stream is expected to be a suitable interface for other
forms of
> > + * buffered data written by the GPU besides periodic OA metrics.
> > + *
> > + * Note we copy the properties from userspace outside of the i915 perf
> > + * mutex to avoid an awkward lockdep with mmap_sem.
> > + *
> > + * Return: A newly opened i915 Perf stream file descriptor or negative
> > + * error code on failure.
> > + */
> >  int i915_perf_open_ioctl(struct drm_device *dev, void *data,
> >                        struct drm_file *file)
> >  {
> > @@ -1621,6 +1826,14 @@ int i915_perf_open_ioctl(struct drm_device *dev,
void *data,
> >       return ret;
> >  }
> >
> > +/**
> > + * i915_perf_register - exposes i915-perf to userspace
> > + * @dev_priv: i915 device instance
> > + *
> > + * In particular OA metric sets are advertised under a sysfs metrics/
> > + * directory allowing userspace to enumerate valid IDs that can be
> > + * used to open an i915-perf stream.
> > + */
> >  void i915_perf_register(struct drm_i915_private *dev_priv)
> >  {
> >       if (!IS_HASWELL(dev_priv))
> > @@ -1650,6 +1863,15 @@ void i915_perf_register(struct drm_i915_private
*dev_priv)
> >       mutex_unlock(&dev_priv->perf.lock);
> >  }
> >
> > +/**
> > + * i915_perf_register - hide i915-perf from userspace
> > + * @dev_priv: i915 device instance
> > + *
> > + * i915-perf state cleanup is split up into an 'unregister' and
> > + * 'deinit' phase where the interface is first hidden from
> > + * userspace by i915_perf_unregister() before cleaning up
> > + * remaining state in i915_perf_fini().
> > + */
> >  void i915_perf_unregister(struct drm_i915_private *dev_priv)
> >  {
> >       if (!IS_HASWELL(dev_priv))
> > @@ -1706,6 +1928,15 @@ static struct ctl_table dev_root[] = {
> >       {}
> >  };
> >
> > +/**
> > + * i915_perf_init - initialize i915-perf state on module load
> > + * @dev_priv: i915 device instance
> > + *
> > + * Initializes state i915-perf state without exposing anything to
userspace.
> > + *
> > + * Note: i915-perf initialization is split into an 'init' and
'register'
> > + * phase with the i915_perf_register() exposing state to userspace.
> > + */
> >  void i915_perf_init(struct drm_i915_private *dev_priv)
> >  {
> >       if (!IS_HASWELL(dev_priv))
> > @@ -1741,6 +1972,10 @@ void i915_perf_init(struct drm_i915_private
*dev_priv)
> >       dev_priv->perf.initialized = true;
> >  }
> >
> > +/**
> > + * i915_perf_fini - Counter part to i915_perf_init()
> > + * @dev_priv: i915 device instance
> > + */
> >  void i915_perf_fini(struct drm_i915_private *dev_priv)
> >  {
> >       if (!dev_priv->perf.initialized)
> > --
> > 2.10.2
> >
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch

[-- Attachment #1.2: Type: text/html, Size: 49484 bytes --]

[-- Attachment #2: Type: text/plain, Size: 160 bytes --]

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] drm/i915/perf: More documentation hooked to i915.rst
  2016-11-30 19:41 ` Daniel Vetter
@ 2016-12-01 12:12   ` Jani Nikula
  2016-12-02 15:17     ` Robert Bragg
  2016-12-02 15:05   ` Robert Bragg
  1 sibling, 1 reply; 16+ messages in thread
From: Jani Nikula @ 2016-12-01 12:12 UTC (permalink / raw)
  To: Daniel Vetter, Robert Bragg; +Cc: Daniel Vetter, intel-gfx

On Wed, 30 Nov 2016, Daniel Vetter <daniel@ffwll.ch> wrote:
> On Tue, Nov 29, 2016 at 05:00:55PM +0000, Robert Bragg wrote:
>> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
>> +   :functions: i915_perf_init
>> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
>> +   :functions: i915_perf_fini
>> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
>> +   :functions: i915_perf_register
>> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
>> +   :functions: i915_perf_unregister
>> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
>> +   :functions: i915_perf_open_ioctl
>> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
>> +   :functions: i915_perf_release
>
> One potential issue with listing everything explicitly is that if someone
> ever (and this is bound to happen) adds a new function, they'll forget to
> add it. Hence we just pull them all in, and if you want to refernce some
> specifically, do that in the overview sections.

One real issue with listing everything separately is that kernel-doc
parses the source file once per every kernel-doc directive.

Also, doesn't Sphinx complain about not having a blank line to end the
indented block after the directive? It might not, but I thought it
might.

BR,
Jani.


-- 
Jani Nikula, Intel Open Source Technology Center
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] drm/i915/perf: More documentation hooked to i915.rst
  2016-11-29 17:00 [PATCH] drm/i915/perf: More documentation hooked to i915.rst Robert Bragg
  2016-11-30 16:30 ` Matthew Auld
@ 2016-11-30 19:41 ` Daniel Vetter
  2016-12-01 12:12   ` Jani Nikula
  2016-12-02 15:05   ` Robert Bragg
  1 sibling, 2 replies; 16+ messages in thread
From: Daniel Vetter @ 2016-11-30 19:41 UTC (permalink / raw)
  To: Robert Bragg; +Cc: Daniel Vetter, intel-gfx

On Tue, Nov 29, 2016 at 05:00:55PM +0000, Robert Bragg wrote:
> This adds a 'Perf' section to i915.rst with the following sub sections:
> - Overview
> - Comparison with Core Perf
> - i915 Driver Entry Points
> - i915 Perf Stream
> - i915 Perf Observation Architecture Stream
> - All i915 Perf Internals
> 
> Signed-off-by: Robert Bragg <robert@sixbynine.org>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>

Two style bikesheds below, feel free to ignore.

> ---
>  Documentation/gpu/i915.rst       |  92 +++++++++++++
>  drivers/gpu/drm/i915/i915_drv.h  | 151 ++++++++++++++++----
>  drivers/gpu/drm/i915/i915_perf.c | 289 +++++++++++++++++++++++++++++++++++----
>  3 files changed, 478 insertions(+), 54 deletions(-)
> 
> diff --git a/Documentation/gpu/i915.rst b/Documentation/gpu/i915.rst
> index 117d2ab..714bd4b 100644
> --- a/Documentation/gpu/i915.rst
> +++ b/Documentation/gpu/i915.rst
> @@ -356,4 +356,96 @@ switch_mm
>  .. kernel-doc:: drivers/gpu/drm/i915/i915_trace.h
>     :doc: switch_mm tracepoint
>  
> +Perf
> +====
> +
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :doc: i915 Perf Overview
> +
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :doc: i915 Perf History and Comparison with Core Perf
> +
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :doc: i915 Perf File Operations

You have the headings in the DOC comments itself, which works until
someone reorganizes stuff. Then it tends to fall apart badly.

> +
> +i915 Driver Entry Points
> +------------------------
> +
> +This section covers the entrypoints exported outside of i915_perf.c to
> +integrate with drm/i915 and to handle the `DRM_I915_PERF_OPEN` ioctl.
> +
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: i915_perf_init
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: i915_perf_fini
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: i915_perf_register
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: i915_perf_unregister
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: i915_perf_open_ioctl
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: i915_perf_release

One potential issue with listing everything explicitly is that if someone
ever (and this is bound to happen) adds a new function, they'll forget to
add it. Hence we just pull them all in, and if you want to refernce some
specifically, do that in the overview sections. And also sprinkle lots of
cross-references all over to make groups of functions easier to discover.

But in the end your docs, your turf ;-)
-Daniel

> +
> +i915 Perf Stream
> +----------------
> +
> +This section covers the stream-semantics-agnostic structures and functions
> +for representing an i915 perf stream FD and associated file operations.
> +
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_drv.h
> +   :functions: i915_perf_stream
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_drv.h
> +   :functions: i915_perf_stream_ops
> +
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: read_properties_unlocked
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: i915_perf_open_ioctl_unlocked
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: i915_perf_destroy_unlocked
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: i915_perf_read
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: i915_perf_ioctl
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: i915_perf_enable_unlocked
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: i915_perf_disable_unlocked
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: i915_perf_poll
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: i915_perf_poll_unlocked
> +
> +i915 Perf Observation Architecture Stream
> +-----------------------------------------
> +
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: OA_BUFFER_SIZE
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_drv.h
> +   :functions: i915_oa_ops
> +
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: i915_oa_stream_init
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: i915_oa_read
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: i915_oa_stream_enable
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: i915_oa_stream_disable
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: i915_oa_wait_unlocked
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: i915_oa_poll_wait
> +
> +All i915 Perf Internals
> +-----------------------
> +
> +This section simply includes all currently documented i915 perf internals, in
> +no particular order, but may include some more minor utilities or platform
> +specific details than found in the more high-level sections.
> +
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :internal:
> +
>  .. WARNING: DOCPROC directive not supported: !Cdrivers/gpu/drm/i915/i915_irq.c
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 1ec9619..9f92755 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1827,89 +1827,186 @@ struct i915_oa_reg {
>  
>  struct i915_perf_stream;
>  
> +/**
> + * struct i915_perf_stream_ops - the OPs to support a specific stream type
> + */
>  struct i915_perf_stream_ops {
> -	/* Enables the collection of HW samples, either in response to
> -	 * I915_PERF_IOCTL_ENABLE or implicitly called when stream is
> -	 * opened without I915_PERF_FLAG_DISABLED.
> +	/**
> +	 * @enable: Enables the collection of HW samples, either in response to
> +	 * I915_PERF_IOCTL_ENABLE or implicitly called when stream is opened
> +	 * without I915_PERF_FLAG_DISABLED.
>  	 */
>  	void (*enable)(struct i915_perf_stream *stream);
>  
> -	/* Disables the collection of HW samples, either in response to
> -	 * I915_PERF_IOCTL_DISABLE or implicitly called before
> -	 * destroying the stream.
> +	/**
> +	 * @disable: Disables the collection of HW samples, either in response
> +	 * to I915_PERF_IOCTL_DISABLE or implicitly called before destroying
> +	 * the stream.
>  	 */
>  	void (*disable)(struct i915_perf_stream *stream);
>  
> -	/* Call poll_wait, passing a wait queue that will be woken
> +	/**
> +	 * @poll_wait: Call poll_wait, passing a wait queue that will be woken
>  	 * once there is something ready to read() for the stream
>  	 */
>  	void (*poll_wait)(struct i915_perf_stream *stream,
>  			  struct file *file,
>  			  poll_table *wait);
>  
> -	/* For handling a blocking read, wait until there is something
> -	 * to ready to read() for the stream. E.g. wait on the same
> +	/**
> +	 * @wait_unlocked: For handling a blocking read, wait until there is
> +	 * something to ready to read() for the stream. E.g. wait on the same
>  	 * wait queue that would be passed to poll_wait().
>  	 */
>  	int (*wait_unlocked)(struct i915_perf_stream *stream);
>  
> -	/* read - Copy buffered metrics as records to userspace
> -	 * @buf: the userspace, destination buffer
> -	 * @count: the number of bytes to copy, requested by userspace
> -	 * @offset: zero at the start of the read, updated as the read
> -	 *          proceeds, it represents how many bytes have been
> -	 *          copied so far and the buffer offset for copying the
> -	 *          next record.
> +	/**
> +	 * @read: Copy buffered metrics as records to userspace
> +	 * **buf**: the userspace, destination buffer
> +	 * **count**: the number of bytes to copy, requested by userspace
> +	 * **offset**: zero at the start of the read, updated as the read
> +	 * proceeds, it represents how many bytes have been copied so far and
> +	 * the buffer offset for copying the next record.
>  	 *
> -	 * Copy as many buffered i915 perf samples and records for
> -	 * this stream to userspace as will fit in the given buffer.
> +	 * Copy as many buffered i915 perf samples and records for this stream
> +	 * to userspace as will fit in the given buffer.
>  	 *
> -	 * Only write complete records; returning -ENOSPC if there
> -	 * isn't room for a complete record.
> +	 * Only write complete records; returning -ENOSPC if there isn't room
> +	 * for a complete record.
>  	 *
> -	 * Return any error condition that results in a short read
> -	 * such as -ENOSPC or -EFAULT, even though these may be
> -	 * squashed before returning to userspace.
> +	 * Return any error condition that results in a short read such as
> +	 * -ENOSPC or -EFAULT, even though these may be squashed before
> +	 * returning to userspace.
>  	 */
>  	int (*read)(struct i915_perf_stream *stream,
>  		    char __user *buf,
>  		    size_t count,
>  		    size_t *offset);
>  
> -	/* Cleanup any stream specific resources.
> +	/**
> +	 * @destroy: Cleanup any stream specific resources.
>  	 *
>  	 * The stream will always be disabled before this is called.
>  	 */
>  	void (*destroy)(struct i915_perf_stream *stream);
>  };
>  
> +/**
> + * struct i915_perf_stream - state for a single open stream FD
> + */
>  struct i915_perf_stream {
> +	/**
> +	 * @dev_priv: i915 drm device
> +	 */
>  	struct drm_i915_private *dev_priv;
>  
> +	/**
> +	 * @link: Links the stream into ``&drm_i915_private->streams``
> +	 */
>  	struct list_head link;
>  
> +	/**
> +	 * @sample_flags: Flags representing the `DRM_I915_PERF_PROP_SAMPLE_*`
> +	 * properties given when opening a stream, representing the contents
> +	 * of a single sample as read() by userspace.
> +	 */
>  	u32 sample_flags;
> +
> +	/**
> +	 * @sample_size: Considering the configured contents of a sample
> +	 * combined with the required header size, this is the total size
> +	 * of a single sample record.
> +	 */
>  	int sample_size;
>  
> +	/**
> +	 * @ctx: %NULL if measuring system-wide across all contexts or a
> +	 * specific context that is being monitored.
> +	 */
>  	struct i915_gem_context *ctx;
> +
> +	/**
> +	 * @enabled: Whether the stream is currently enabled, considering
> +	 * whether the stream was opened in a disabled state and based
> +	 * on `I915_PERF_IOCTL_ENABLE` and `I915_PERF_IOCTL_DISABLE` calls.
> +	 */
>  	bool enabled;
>  
> +	/**
> +	 * @ops: The callbacks providing the implementation of this specific
> +	 * type of configured stream.
> +	 */
>  	const struct i915_perf_stream_ops *ops;
>  };
>  
> +/**
> + * struct i915_oa_ops - Gen specific implementation of an OA unit stream
> + */
>  struct i915_oa_ops {
> +	/**
> +	 * @init_oa_buffer: Resets the head and tail pointers of the
> +	 * circular buffer for periodic OA reports.
> +	 *
> +	 * Called when first opening a stream for OA metrics, but also may be
> +	 * called in response to an OA buffer overflow or other error
> +	 * condition.
> +	 *
> +	 * Note it may be necessary to clear the full OA buffer here as part of
> +	 * maintaining the invariable that new reports must be written to
> +	 * zeroed memory for us to be able to reliable detect if an expected
> +	 * report has not yet landed in memory.  (At least on Haswell the OA
> +	 * buffer tail pointer is not synchronized with reports being visible
> +	 * to the CPU)
> +	 */
>  	void (*init_oa_buffer)(struct drm_i915_private *dev_priv);
> +
> +	/**
> +	 * @enable_metric_set: Applies any MUX configuration to set up the
> +	 * Boolean and Custom (B/C) counters that are part of the counter
> +	 * reports being sampled. May apply system constraints such as
> +	 * disabling EU clock gating as required.
> +	 */
>  	int (*enable_metric_set)(struct drm_i915_private *dev_priv);
> +
> +	/**
> +	 * @disable_metric_set: Remove system constraints associated with using
> +	 * the OA unit.
> +	 */
>  	void (*disable_metric_set)(struct drm_i915_private *dev_priv);
> +
> +	/**
> +	 * @oa_enable: Enable periodic sampling
> +	 */
>  	void (*oa_enable)(struct drm_i915_private *dev_priv);
> +
> +	/**
> +	 * @oa_disable: Disable periodic sampling
> +	 */
>  	void (*oa_disable)(struct drm_i915_private *dev_priv);
> -	void (*update_oacontrol)(struct drm_i915_private *dev_priv);
> -	void (*update_hw_ctx_id_locked)(struct drm_i915_private *dev_priv,
> -					u32 ctx_id);
> +
> +	/**
> +	 * @read: Copy data from the circular OA buffer into a given userspace
> +	 * buffer.
> +	 */
>  	int (*read)(struct i915_perf_stream *stream,
>  		    char __user *buf,
>  		    size_t count,
>  		    size_t *offset);
> +
> +	/**
> +	 * @oa_buffer_is_empty: Check if OA buffer empty (false positives OK)
> +	 *
> +	 * This is either called via fops or the poll check hrtimer (atomic
> +	 * ctx) without any locks taken.
> +	 *
> +	 * It's safe to read OA config state here unlocked, assuming that this
> +	 * is only called while the stream is enabled, while the global OA
> +	 * configuration can't be modified.
> +	 *
> +	 * Efficiency is more important than avoiding some false positives
> +	 * here, which will be handled gracefully - likely resulting in an
> +	 * EAGAIN error for userspace.
> +	 */
>  	bool (*oa_buffer_is_empty)(struct drm_i915_private *dev_priv);
>  };
>  
> diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
> index 68b7c27..0be8cef 100644
> --- a/drivers/gpu/drm/i915/i915_perf.c
> +++ b/drivers/gpu/drm/i915/i915_perf.c
> @@ -26,7 +26,10 @@
>  
>  
>  /**
> - * DOC: i915 Perf, streaming API for GPU metrics
> + * DOC: i915 Perf Overview
> + *
> + * Overview
> + * --------
>   *
>   * Gen graphics supports a large number of performance counters that can help
>   * driver and application developers understand and optimize their use of the
> @@ -45,6 +48,13 @@
>   * privileges by default, unless changed via the dev.i915.perf_event_paranoid
>   * sysctl option.
>   *
> + */
> +
> +/**
> + * DOC: i915 Perf History and Comparison with Core Perf
> + *
> + * Comparison with Core Perf
> + * -------------------------
>   *
>   * The interface was initially inspired by the core Perf infrastructure but
>   * some notable differences are:
> @@ -75,8 +85,8 @@
>   * gets copied from the GPU mapped buffers to userspace buffers.
>   *
>   *
> - * Some notes regarding Linux Perf:
> - * --------------------------------
> + * Issues hit with first prototype based on Core Perf
> + * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>   *
>   * The first prototype of this driver was based on the core perf
>   * infrastructure, and while we did make that mostly work, with some changes to
> @@ -135,7 +145,7 @@
>   *   for combining with the side-band raw reports it captures using
>   *   MI_REPORT_PERF_COUNT commands.
>   *
> - *   _ As a side note on perf's grouping feature; there was also some concern
> + *   - As a side note on perf's grouping feature; there was also some concern
>   *     that using PERF_FORMAT_GROUP as a way to pack together counter values
>   *     would quite drastically inflate our sample sizes, which would likely
>   *     lower the effective sampling resolutions we could use when the available
> @@ -277,6 +287,20 @@ static struct i915_oa_format hsw_oa_formats[I915_OA_FORMAT_MAX] = {
>  
>  #define SAMPLE_OA_REPORT      (1<<0)
>  
> +/**
> + * struct perf_open_properties - for validated properties given to open a stream
> + * @sample_flags: `DRM_I915_PERF_PROP_SAMPLE_*` properties are tracked as flags
> + * @single_context: Whether a single or all gpu contexts should be monitored
> + * @ctx_handle: A gem ctx handle for use with @single_context
> + * @metrics_set: An ID for an OA unit metric set advertised via sysfs
> + * @oa_format: An OA unit HW report format
> + * @oa_periodic: Whether to enable periodic OA unit sampling
> + * @oa_period_exponent: The OA unit sampling period is derived from this
> + *
> + * As read_properties_unlocked() enumerates and validates the properties given
> + * to open a stream of metrics the configuration is built up in the structure
> + * which starts out zero initialized.
> + */
>  struct perf_open_properties {
>  	u32 sample_flags;
>  
> @@ -314,7 +338,19 @@ static bool gen7_oa_buffer_is_empty_fop_unlocked(struct drm_i915_private *dev_pr
>  }
>  
>  /**
> - * Appends a status record to a userspace read() buffer.
> + * append_oa_status - Appends a status record to a userspace read() buffer.
> + * @stream: An i915-perf stream opened for OA metrics
> + * @buf: destination buffer given by userspace
> + * @count: the number of bytes userspace wants to read
> + * @offset: (inout): the current position for writing into @buf
> + * @type: The kind of status to report to userspace
> + *
> + * Writes a status record (such as `DRM_I915_PERF_RECORD_OA_REPORT_LOST`)
> + * into the userspace read() buffer.
> + *
> + * The @buf @offset will only be updated on success.
> + *
> + * Returns: 0 on success, negative error code on failure.
>   */
>  static int append_oa_status(struct i915_perf_stream *stream,
>  			    char __user *buf,
> @@ -336,7 +372,21 @@ static int append_oa_status(struct i915_perf_stream *stream,
>  }
>  
>  /**
> - * Copies single OA report into userspace read() buffer.
> + * append_oa_sample - Copies single OA report into userspace read() buffer.
> + * @stream: An i915-perf stream opened for OA metrics
> + * @buf: destination buffer given by userspace
> + * @count: the number of bytes userspace wants to read
> + * @offset: (inout): the current position for writing into @buf
> + * @report: A single OA report to (optionally) include as part of the sample
> + *
> + * The contents of a sample are configured through `DRM_I915_PERF_PROP_SAMPLE_*`
> + * properties when opening a stream, tracked as `stream->sample_flags`. This
> + * function copies the requested components of a single sample to the given
> + * read() @buf.
> + *
> + * The @buf @offset will only be updated on success.
> + *
> + * Returns: 0 on success, negative error code on failure.
>   */
>  static int append_oa_sample(struct i915_perf_stream *stream,
>  			    char __user *buf,
> @@ -380,8 +430,6 @@ static int append_oa_sample(struct i915_perf_stream *stream,
>   * @head_ptr: (inout): the current oa buffer cpu read position
>   * @tail: the current oa buffer gpu write position
>   *
> - * Returns 0 on success, negative error code on failure.
> - *
>   * Notably any error condition resulting in a short read (-ENOSPC or
>   * -EFAULT) will be returned even though one or more records may
>   * have been successfully copied. In this case it's up to the caller
> @@ -392,6 +440,8 @@ static int append_oa_sample(struct i915_perf_stream *stream,
>   * tail, so the head chases the tail?... If you think that's mad
>   * and back-to-front you're not alone, but this follows the
>   * Gen PRM naming convention.
> + *
> + * Returns: 0 on success, negative error code on failure.
>   */
>  static int gen7_append_oa_reports(struct i915_perf_stream *stream,
>  				  char __user *buf,
> @@ -496,6 +546,22 @@ static int gen7_append_oa_reports(struct i915_perf_stream *stream,
>  	return ret;
>  }
>  
> +/**
> + * gen7_oa_read - copy status records then buffered OA reports
> + * @stream: An i915-perf stream opened for OA metrics
> + * @buf: destination buffer given by userspace
> + * @count: the number of bytes userspace wants to read
> + * @offset: (inout): the current position for writing into @buf
> + *
> + * Checks Gen 7 specific OA unit status registers and if necessary appends
> + * corresponding status records for userspace (such as for a buffer full
> + * condition) and then initiate appending any buffered OA reports.
> + *
> + * Updates @offset according to the number of bytes successfully copied into
> + * the userspace buffer.
> + *
> + * Returns: zero on success or a negative error code
> + */
>  static int gen7_oa_read(struct i915_perf_stream *stream,
>  			char __user *buf,
>  			size_t count,
> @@ -597,6 +663,20 @@ static int gen7_oa_read(struct i915_perf_stream *stream,
>  	return ret;
>  }
>  
> +/**
> + * i915_oa_wait_unlocked - handles blocking IO until OA data available
> + * @stream: An i915-perf stream opened for OA metrics
> + *
> + * Called when userspace tries to read() from a blocking stream FD opened
> + * for OA metrics. It waits until the hrtimer callback finds a non-empty
> + * OA buffer and wakes us.
> + *
> + * Note: it's acceptable to have this return with some false positives
> + * since any subsequent read handling will return EAGAIN if there isn't
> + * really data ready for userspace yet.
> + *
> + * Returns: zero on success or a negative error code
> + */
>  static int i915_oa_wait_unlocked(struct i915_perf_stream *stream)
>  {
>  	struct drm_i915_private *dev_priv = stream->dev_priv;
> @@ -615,6 +695,16 @@ static int i915_oa_wait_unlocked(struct i915_perf_stream *stream)
>  					!dev_priv->perf.oa.ops.oa_buffer_is_empty(dev_priv));
>  }
>  
> +/**
> + * i915_oa_poll_wait - call poll_wait() for an OA stream poll()
> + * @stream: An i915-perf stream opened for OA metrics
> + * @file: An i915 perf stream file
> + * @wait: poll() state table
> + *
> + * For handling userspace polling on an i915 perf stream opened for OA metrics,
> + * this starts a poll_wait with the wait queue that our hrtimer callback wakes
> + * when it sees data ready to read in the circular OA buffer.
> + */
>  static void i915_oa_poll_wait(struct i915_perf_stream *stream,
>  			      struct file *file,
>  			      poll_table *wait)
> @@ -624,6 +714,18 @@ static void i915_oa_poll_wait(struct i915_perf_stream *stream,
>  	poll_wait(file, &dev_priv->perf.oa.poll_wq, wait);
>  }
>  
> +/**
> + * i915_oa_read - just calls through to ``&i915_oa_ops->read``
> + * @stream: An i915-perf stream opened for OA metrics
> + * @buf: destination buffer given by userspace
> + * @count: the number of bytes userspace wants to read
> + * @offset: (inout): the current position for writing into @buf
> + *
> + * Updates @offset according to the number of bytes successfully copied into
> + * the userspace buffer.
> + *
> + * Returns: zero on success or a negative error code
> + */
>  static int i915_oa_read(struct i915_perf_stream *stream,
>  			char __user *buf,
>  			size_t count,
> @@ -634,9 +736,15 @@ static int i915_oa_read(struct i915_perf_stream *stream,
>  	return dev_priv->perf.oa.ops.read(stream, buf, count, offset);
>  }
>  
> -/* Determine the render context hw id, and ensure it remains fixed for the
> +/**
> + * oa_get_render_ctx_id - determine and hold ctx hw id
> + * @stream: An i915-perf stream opened for OA metrics
> + *
> + * Determine the render context hw id, and ensure it remains fixed for the
>   * lifetime of the stream. This ensures that we don't have to worry about
>   * updating the context ID in OACONTROL on the fly.
> + *
> + * Returns: zero on success or a negative error code
>   */
>  static int oa_get_render_ctx_id(struct i915_perf_stream *stream)
>  {
> @@ -673,6 +781,13 @@ static int oa_get_render_ctx_id(struct i915_perf_stream *stream)
>  	return ret;
>  }
>  
> +/**
> + * oa_put_render_ctx_id - counterpart to oa_get_render_ctx_id releases hold
> + * @stream: An i915-perf stream opened for OA metrics
> + *
> + * In case anything needed doing to ensure the context HW ID would remain valid
> + * for the lifetime of the stream, then that can be undone here.
> + */
>  static void oa_put_render_ctx_id(struct i915_perf_stream *stream)
>  {
>  	struct drm_i915_private *dev_priv = stream->dev_priv;
> @@ -945,6 +1060,15 @@ static void gen7_oa_enable(struct drm_i915_private *dev_priv)
>  	spin_unlock_irqrestore(&dev_priv->perf.hook_lock, flags);
>  }
>  
> +/**
> + * i915_oa_stream_enable - handle I915_PERF_IOCTL_ENABLE for OA stream
> + * @stream: An i915 perf stream opened for OA metrics
> + *
> + * [Re]enables hardware periodic sampling according to the period configured
> + * when opening the stream. This also starts a hrtimer that will periodically
> + * check for data in the circular OA buffer for notifying userspace (e.g.
> + * during a read() or poll()).
> + */
>  static void i915_oa_stream_enable(struct i915_perf_stream *stream)
>  {
>  	struct drm_i915_private *dev_priv = stream->dev_priv;
> @@ -962,6 +1086,14 @@ static void gen7_oa_disable(struct drm_i915_private *dev_priv)
>  	I915_WRITE(GEN7_OACONTROL, 0);
>  }
>  
> +/**
> + * i915_oa_stream_disable - handle I915_PERF_IOCTL_DISABLE for OA stream
> + * @stream: An i915 perf stream opened for OA metrics
> + *
> + * Stops the OA unit from periodically writing counter reports into the
> + * circular OA buffer. This also stops the hrtimer that periodically checks for
> + * data in the circular OA buffer, for notifying userspace.
> + */
>  static void i915_oa_stream_disable(struct i915_perf_stream *stream)
>  {
>  	struct drm_i915_private *dev_priv = stream->dev_priv;
> @@ -987,6 +1119,24 @@ static const struct i915_perf_stream_ops i915_oa_stream_ops = {
>  	.read = i915_oa_read,
>  };
>  
> +/**
> + * i915_oa_stream_init - validate combined props for OA stream and init
> + * @stream: An i915 perf stream
> + * @param: The open parameters passed to 'DRM_I915_PERF_OPEN`
> + * @props: The property state that configures stream (individually validated)
> + *
> + * While read_properties_unlocked() validates properties in isolation it
> + * doesn't ensure that the combination necessarily makes sense.
> + *
> + * At this point it has been determined that userspace wants a stream of
> + * OA metrics, but still we need to further validate the combined
> + * properties are OK.
> + *
> + * If the configuration makes sense then we can allocate memory for
> + * a circular OA buffer and apply the requested metric set configuration.
> + *
> + * Returns: zero on success or a negative error code.
> + */
>  static int i915_oa_stream_init(struct i915_perf_stream *stream,
>  			       struct drm_i915_perf_open_param *param,
>  			       struct perf_open_properties *props)
> @@ -1110,6 +1260,31 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream,
>  	return ret;
>  }
>  
> +/**
> + * i915_perf_read_locked - wrap stream->ops->read with error notmalisation
> + * @stream: An i915 perf stream
> + * @file: An i915 perf stream file
> + * @buf: destination buffer given by userspace
> + * @count: the number of bytes userspace wants to read
> + * @ppos: (inout) file seek position (unused)
> + *
> + * Besides wrapping stream->ops->read() this provides a common place to ensure
> + * that if we've successfully copied any data then reporting that takes
> + * precedence over any internal error status, so the data isn't lost.
> + *
> + * For example ret will be -ENOSPC whenever there is more buffered data than
> + * can be copied to userspace, but that's only interesting if we weren't able
> + * to copy some data because it implies the userspace buffer is too small to
> + * receive a single record (and we never split records).
> + *
> + * Another case with ret == -EFAULT is more of a grey area since it would seem
> + * like bad form for userspace to ask us to overrun its buffer, but the user
> + * knows best:
> + *
> + *   http://yarchive.net/comp/linux/partial_reads_writes.html
> + *
> + * Returns: The number of bytes copied or a negative error code on failure.
> + */
>  static ssize_t i915_perf_read_locked(struct i915_perf_stream *stream,
>  				     struct file *file,
>  				     char __user *buf,
> @@ -1125,25 +1300,27 @@ static ssize_t i915_perf_read_locked(struct i915_perf_stream *stream,
>  	size_t offset = 0;
>  	int ret = stream->ops->read(stream, buf, count, &offset);
>  
> -	/* If we've successfully copied any data then reporting that
> -	 * takes precedence over any internal error status, so the
> -	 * data isn't lost.
> -	 *
> -	 * For example ret will be -ENOSPC whenever there is more
> -	 * buffered data than can be copied to userspace, but that's
> -	 * only interesting if we weren't able to copy some data
> -	 * because it implies the userspace buffer is too small to
> -	 * receive a single record (and we never split records).
> -	 *
> -	 * Another case with ret == -EFAULT is more of a grey area
> -	 * since it would seem like bad form for userspace to ask us
> -	 * to overrun its buffer, but the user knows best:
> -	 *
> -	 *   http://yarchive.net/comp/linux/partial_reads_writes.html
> -	 */
>  	return offset ?: (ret ?: -EAGAIN);
>  }
>  
> +/**
> + * i915_perf_read - handles read() FOP for i915 perf stream FDs
> + * @file: An i915 perf stream file
> + * @buf: destination buffer given by userspace
> + * @count: the number of bytes userspace wants to read
> + * @ppos: (inout) file seek position (unused)
> + *
> + * The entry point for handling a read() on a stream file descriptor from
> + * userspace. Most of the work is left to the i915_perf_read_locked() and
> + * stream->op->read() but to save having stream implementations (of which
> + * we might have multiple later) we handle blocking read here.
> + *
> + * We can also consistently treat trying to read from a disable stream
> + * as an IO error so implementations can assume the stream is enabled
> + * while reading.
> + *
> + * Returns: The number of bytes copied or a negative error code on failure.
> + */
>  static ssize_t i915_perf_read(struct file *file,
>  			      char __user *buf,
>  			      size_t count,
> @@ -1458,12 +1635,20 @@ i915_perf_open_ioctl_locked(struct drm_i915_private *dev_priv,
>  	return ret;
>  }
>  
> -/* Note we copy the properties from userspace outside of the i915 perf
> - * mutex to avoid an awkward lockdep with mmap_sem.
> +/**
> + * read_properties_unlocked - validate + copy userspace stream open properties
> + * @dev_priv: i915 device instance
> + * @uprops: The array of u64 key value pairs given by userspace
> + * @n_props: The number of key value pairs expected in @uprops
> + * @props: The stream configuration built up while validating properties
>   *
>   * Note this function only validates properties in isolation it doesn't
>   * validate that the combination of properties makes sense or that all
>   * properties necessary for a particular kind of stream have been set.
> + *
> + * Note that there currently aren't any ordering requirements for properties so
> + * we shouldn't validate or assume anything about ordering here. This doesn't
> + * rule out defining new properties with ordering requirements in the future.
>   */
>  static int read_properties_unlocked(struct drm_i915_private *dev_priv,
>  				    u64 __user *uprops,
> @@ -1585,6 +1770,26 @@ static int read_properties_unlocked(struct drm_i915_private *dev_priv,
>  	return 0;
>  }
>  
> +/**
> + * i915_perf_open_ioctl - DRM ioctl() for userspace to open a stream FD
> + * @dev: drm device
> + * @data: ioctl data copied from userspace (unvalidated)
> + * @file: drm file
> + *
> + * Validates the stream open parameters given by userspace including flags
> + * and an array of u64 key, value pair properties.
> + *
> + * Very little is assumed up front about the nature of the stream being
> + * opened (for instance we don't assume it's for periodic OA unit metrics). An
> + * i915-perf stream is expected to be a suitable interface for other forms of
> + * buffered data written by the GPU besides periodic OA metrics.
> + *
> + * Note we copy the properties from userspace outside of the i915 perf
> + * mutex to avoid an awkward lockdep with mmap_sem.
> + *
> + * Return: A newly opened i915 Perf stream file descriptor or negative
> + * error code on failure.
> + */
>  int i915_perf_open_ioctl(struct drm_device *dev, void *data,
>  			 struct drm_file *file)
>  {
> @@ -1621,6 +1826,14 @@ int i915_perf_open_ioctl(struct drm_device *dev, void *data,
>  	return ret;
>  }
>  
> +/**
> + * i915_perf_register - exposes i915-perf to userspace
> + * @dev_priv: i915 device instance
> + *
> + * In particular OA metric sets are advertised under a sysfs metrics/
> + * directory allowing userspace to enumerate valid IDs that can be
> + * used to open an i915-perf stream.
> + */
>  void i915_perf_register(struct drm_i915_private *dev_priv)
>  {
>  	if (!IS_HASWELL(dev_priv))
> @@ -1650,6 +1863,15 @@ void i915_perf_register(struct drm_i915_private *dev_priv)
>  	mutex_unlock(&dev_priv->perf.lock);
>  }
>  
> +/**
> + * i915_perf_register - hide i915-perf from userspace
> + * @dev_priv: i915 device instance
> + *
> + * i915-perf state cleanup is split up into an 'unregister' and
> + * 'deinit' phase where the interface is first hidden from
> + * userspace by i915_perf_unregister() before cleaning up
> + * remaining state in i915_perf_fini().
> + */
>  void i915_perf_unregister(struct drm_i915_private *dev_priv)
>  {
>  	if (!IS_HASWELL(dev_priv))
> @@ -1706,6 +1928,15 @@ static struct ctl_table dev_root[] = {
>  	{}
>  };
>  
> +/**
> + * i915_perf_init - initialize i915-perf state on module load
> + * @dev_priv: i915 device instance
> + *
> + * Initializes state i915-perf state without exposing anything to userspace.
> + *
> + * Note: i915-perf initialization is split into an 'init' and 'register'
> + * phase with the i915_perf_register() exposing state to userspace.
> + */
>  void i915_perf_init(struct drm_i915_private *dev_priv)
>  {
>  	if (!IS_HASWELL(dev_priv))
> @@ -1741,6 +1972,10 @@ void i915_perf_init(struct drm_i915_private *dev_priv)
>  	dev_priv->perf.initialized = true;
>  }
>  
> +/**
> + * i915_perf_fini - Counter part to i915_perf_init()
> + * @dev_priv: i915 device instance
> + */
>  void i915_perf_fini(struct drm_i915_private *dev_priv)
>  {
>  	if (!dev_priv->perf.initialized)
> -- 
> 2.10.2
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] drm/i915/perf: More documentation hooked to i915.rst
  2016-11-29 17:00 [PATCH] drm/i915/perf: More documentation hooked to i915.rst Robert Bragg
@ 2016-11-30 16:30 ` Matthew Auld
  2016-11-30 19:41 ` Daniel Vetter
  1 sibling, 0 replies; 16+ messages in thread
From: Matthew Auld @ 2016-11-30 16:30 UTC (permalink / raw)
  To: Robert Bragg; +Cc: Daniel Vetter, Intel Graphics Development

On 29 November 2016 at 17:00, Robert Bragg <robert@sixbynine.org> wrote:
> This adds a 'Perf' section to i915.rst with the following sub sections:
> - Overview
> - Comparison with Core Perf
> - i915 Driver Entry Points
> - i915 Perf Stream
> - i915 Perf Observation Architecture Stream
> - All i915 Perf Internals
>
> Signed-off-by: Robert Bragg <robert@sixbynine.org>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> ---
>  Documentation/gpu/i915.rst       |  92 +++++++++++++
>  drivers/gpu/drm/i915/i915_drv.h  | 151 ++++++++++++++++----
>  drivers/gpu/drm/i915/i915_perf.c | 289 +++++++++++++++++++++++++++++++++++----
>  3 files changed, 478 insertions(+), 54 deletions(-)
>
> diff --git a/Documentation/gpu/i915.rst b/Documentation/gpu/i915.rst
> index 117d2ab..714bd4b 100644
> --- a/Documentation/gpu/i915.rst
> +++ b/Documentation/gpu/i915.rst
> @@ -356,4 +356,96 @@ switch_mm
>  .. kernel-doc:: drivers/gpu/drm/i915/i915_trace.h
>     :doc: switch_mm tracepoint
>
> +Perf
> +====
> +
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :doc: i915 Perf Overview
> +
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :doc: i915 Perf History and Comparison with Core Perf
> +
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :doc: i915 Perf File Operations
I see no such :DOC entry in i915_perf.c.

> +
> +i915 Driver Entry Points
> +------------------------
> +
> +This section covers the entrypoints exported outside of i915_perf.c to
> +integrate with drm/i915 and to handle the `DRM_I915_PERF_OPEN` ioctl.
> +
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: i915_perf_init
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: i915_perf_fini
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: i915_perf_register
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: i915_perf_unregister
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: i915_perf_open_ioctl
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: i915_perf_release
I see no kernel doc for this.

> +
> +i915 Perf Stream
> +----------------
> +
> +This section covers the stream-semantics-agnostic structures and functions
> +for representing an i915 perf stream FD and associated file operations.
> +
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_drv.h
> +   :functions: i915_perf_stream
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_drv.h
> +   :functions: i915_perf_stream_ops
> +
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: read_properties_unlocked
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: i915_perf_open_ioctl_unlocked
i915_perf_open_ioctl_locked, and there is no kernel doc for it.

> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: i915_perf_destroy_unlocked
i915_perf_destroy_locked, and there is no kernel doc for it.

> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: i915_perf_read
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: i915_perf_ioctl
I see no kernel doc for this.

> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: i915_perf_enable_unlocked
i915_perf_enable_locked, and there is no kernel doc for it.

> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: i915_perf_disable_unlocked
i915_perf_disable_locked, and there is no kernel doc for it.

> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: i915_perf_poll
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: i915_perf_poll_unlocked
i915_perf_poll_locked, and there is no kernel doc for it

> +
> +i915 Perf Observation Architecture Stream
> +-----------------------------------------
> +
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: OA_BUFFER_SIZE
hmm, but this is not a function, enum, union, struct or typedef, so
can we actually add kernel doc for preprocessor directives ? Also
OA_BUFFER_SIZE lacks the proper formatting to get picked up anyway.

> +.. kernel-doc:: drivers/gpu/drm/i915/i915_drv.h
> +   :functions: i915_oa_ops
> +
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: i915_oa_stream_init
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: i915_oa_read
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: i915_oa_stream_enable
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: i915_oa_stream_disable
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: i915_oa_wait_unlocked
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :functions: i915_oa_poll_wait
> +
> +All i915 Perf Internals
> +-----------------------
> +
> +This section simply includes all currently documented i915 perf internals, in
> +no particular order, but may include some more minor utilities or platform
> +specific details than found in the more high-level sections.
> +
> +.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
> +   :internal:
> +
>  .. WARNING: DOCPROC directive not supported: !Cdrivers/gpu/drm/i915/i915_irq.c
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 1ec9619..9f92755 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1827,89 +1827,186 @@ struct i915_oa_reg {
>
>  struct i915_perf_stream;
>
> +/**
> + * struct i915_perf_stream_ops - the OPs to support a specific stream type
> + */
>  struct i915_perf_stream_ops {
> -       /* Enables the collection of HW samples, either in response to
> -        * I915_PERF_IOCTL_ENABLE or implicitly called when stream is
> -        * opened without I915_PERF_FLAG_DISABLED.
> +       /**
> +        * @enable: Enables the collection of HW samples, either in response to
> +        * I915_PERF_IOCTL_ENABLE or implicitly called when stream is opened
> +        * without I915_PERF_FLAG_DISABLED.
>          */
>         void (*enable)(struct i915_perf_stream *stream);
>
> -       /* Disables the collection of HW samples, either in response to
> -        * I915_PERF_IOCTL_DISABLE or implicitly called before
> -        * destroying the stream.
> +       /**
> +        * @disable: Disables the collection of HW samples, either in response
> +        * to I915_PERF_IOCTL_DISABLE or implicitly called before destroying
> +        * the stream.
>          */
>         void (*disable)(struct i915_perf_stream *stream);
>
> -       /* Call poll_wait, passing a wait queue that will be woken
> +       /**
> +        * @poll_wait: Call poll_wait, passing a wait queue that will be woken
>          * once there is something ready to read() for the stream
>          */
>         void (*poll_wait)(struct i915_perf_stream *stream,
>                           struct file *file,
>                           poll_table *wait);
>
> -       /* For handling a blocking read, wait until there is something
> -        * to ready to read() for the stream. E.g. wait on the same
> +       /**
> +        * @wait_unlocked: For handling a blocking read, wait until there is
> +        * something to ready to read() for the stream. E.g. wait on the same
>          * wait queue that would be passed to poll_wait().
>          */
>         int (*wait_unlocked)(struct i915_perf_stream *stream);
>
> -       /* read - Copy buffered metrics as records to userspace
> -        * @buf: the userspace, destination buffer
> -        * @count: the number of bytes to copy, requested by userspace
> -        * @offset: zero at the start of the read, updated as the read
> -        *          proceeds, it represents how many bytes have been
> -        *          copied so far and the buffer offset for copying the
> -        *          next record.
> +       /**
> +        * @read: Copy buffered metrics as records to userspace
> +        * **buf**: the userspace, destination buffer
> +        * **count**: the number of bytes to copy, requested by userspace
> +        * **offset**: zero at the start of the read, updated as the read
> +        * proceeds, it represents how many bytes have been copied so far and
> +        * the buffer offset for copying the next record.
So there's no proper way of documenting the parameters of a function
pointer which happens to also be a member of a struct ? I guess you
would have to do a typedef, but then that's a bit ugly...

>          *
> -        * Copy as many buffered i915 perf samples and records for
> -        * this stream to userspace as will fit in the given buffer.
> +        * Copy as many buffered i915 perf samples and records for this stream
> +        * to userspace as will fit in the given buffer.
>          *
> -        * Only write complete records; returning -ENOSPC if there
> -        * isn't room for a complete record.
> +        * Only write complete records; returning -ENOSPC if there isn't room
> +        * for a complete record.
>          *
> -        * Return any error condition that results in a short read
> -        * such as -ENOSPC or -EFAULT, even though these may be
> -        * squashed before returning to userspace.
> +        * Return any error condition that results in a short read such as
> +        * -ENOSPC or -EFAULT, even though these may be squashed before
> +        * returning to userspace.
>          */
>         int (*read)(struct i915_perf_stream *stream,
>                     char __user *buf,
>                     size_t count,
>                     size_t *offset);
>
> -       /* Cleanup any stream specific resources.
> +       /**
> +        * @destroy: Cleanup any stream specific resources.
>          *
>          * The stream will always be disabled before this is called.
>          */
>         void (*destroy)(struct i915_perf_stream *stream);
>  };
>
> +/**
> + * struct i915_perf_stream - state for a single open stream FD
> + */
>  struct i915_perf_stream {
> +       /**
> +        * @dev_priv: i915 drm device
> +        */
>         struct drm_i915_private *dev_priv;
>
> +       /**
> +        * @link: Links the stream into ``&drm_i915_private->streams``
> +        */
>         struct list_head link;
>
> +       /**
> +        * @sample_flags: Flags representing the `DRM_I915_PERF_PROP_SAMPLE_*`
> +        * properties given when opening a stream, representing the contents
> +        * of a single sample as read() by userspace.
> +        */
>         u32 sample_flags;
> +
> +       /**
> +        * @sample_size: Considering the configured contents of a sample
> +        * combined with the required header size, this is the total size
> +        * of a single sample record.
> +        */
>         int sample_size;
>
> +       /**
> +        * @ctx: %NULL if measuring system-wide across all contexts or a
> +        * specific context that is being monitored.
> +        */
>         struct i915_gem_context *ctx;
> +
> +       /**
> +        * @enabled: Whether the stream is currently enabled, considering
> +        * whether the stream was opened in a disabled state and based
> +        * on `I915_PERF_IOCTL_ENABLE` and `I915_PERF_IOCTL_DISABLE` calls.
> +        */
>         bool enabled;
>
> +       /**
> +        * @ops: The callbacks providing the implementation of this specific
> +        * type of configured stream.
> +        */
>         const struct i915_perf_stream_ops *ops;
>  };
>
> +/**
> + * struct i915_oa_ops - Gen specific implementation of an OA unit stream
> + */
>  struct i915_oa_ops {
> +       /**
> +        * @init_oa_buffer: Resets the head and tail pointers of the
> +        * circular buffer for periodic OA reports.
> +        *
> +        * Called when first opening a stream for OA metrics, but also may be
> +        * called in response to an OA buffer overflow or other error
> +        * condition.
> +        *
> +        * Note it may be necessary to clear the full OA buffer here as part of
> +        * maintaining the invariable that new reports must be written to
> +        * zeroed memory for us to be able to reliable detect if an expected
> +        * report has not yet landed in memory.  (At least on Haswell the OA
> +        * buffer tail pointer is not synchronized with reports being visible
> +        * to the CPU)
> +        */
>         void (*init_oa_buffer)(struct drm_i915_private *dev_priv);
> +
> +       /**
> +        * @enable_metric_set: Applies any MUX configuration to set up the
> +        * Boolean and Custom (B/C) counters that are part of the counter
> +        * reports being sampled. May apply system constraints such as
> +        * disabling EU clock gating as required.
> +        */
>         int (*enable_metric_set)(struct drm_i915_private *dev_priv);
> +
> +       /**
> +        * @disable_metric_set: Remove system constraints associated with using
> +        * the OA unit.
> +        */
>         void (*disable_metric_set)(struct drm_i915_private *dev_priv);
> +
> +       /**
> +        * @oa_enable: Enable periodic sampling
> +        */
>         void (*oa_enable)(struct drm_i915_private *dev_priv);
> +
> +       /**
> +        * @oa_disable: Disable periodic sampling
> +        */
>         void (*oa_disable)(struct drm_i915_private *dev_priv);
> -       void (*update_oacontrol)(struct drm_i915_private *dev_priv);
> -       void (*update_hw_ctx_id_locked)(struct drm_i915_private *dev_priv,
> -                                       u32 ctx_id);
> +
> +       /**
> +        * @read: Copy data from the circular OA buffer into a given userspace
> +        * buffer.
> +        */
>         int (*read)(struct i915_perf_stream *stream,
>                     char __user *buf,
>                     size_t count,
>                     size_t *offset);
> +
> +       /**
> +        * @oa_buffer_is_empty: Check if OA buffer empty (false positives OK)
> +        *
> +        * This is either called via fops or the poll check hrtimer (atomic
> +        * ctx) without any locks taken.
> +        *
> +        * It's safe to read OA config state here unlocked, assuming that this
> +        * is only called while the stream is enabled, while the global OA
> +        * configuration can't be modified.
> +        *
> +        * Efficiency is more important than avoiding some false positives
> +        * here, which will be handled gracefully - likely resulting in an
> +        * EAGAIN error for userspace.
> +        */
>         bool (*oa_buffer_is_empty)(struct drm_i915_private *dev_priv);
>  };
>
> diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
> index 68b7c27..0be8cef 100644
> --- a/drivers/gpu/drm/i915/i915_perf.c
> +++ b/drivers/gpu/drm/i915/i915_perf.c
> @@ -26,7 +26,10 @@
>
>
>  /**
> - * DOC: i915 Perf, streaming API for GPU metrics
> + * DOC: i915 Perf Overview
> + *
> + * Overview
> + * --------
>   *
>   * Gen graphics supports a large number of performance counters that can help
>   * driver and application developers understand and optimize their use of the
> @@ -45,6 +48,13 @@
>   * privileges by default, unless changed via the dev.i915.perf_event_paranoid
>   * sysctl option.
>   *
> + */
> +
> +/**
> + * DOC: i915 Perf History and Comparison with Core Perf
> + *
> + * Comparison with Core Perf
> + * -------------------------
>   *
>   * The interface was initially inspired by the core Perf infrastructure but
>   * some notable differences are:
> @@ -75,8 +85,8 @@
>   * gets copied from the GPU mapped buffers to userspace buffers.
>   *
>   *
> - * Some notes regarding Linux Perf:
> - * --------------------------------
> + * Issues hit with first prototype based on Core Perf
> + * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>   *
>   * The first prototype of this driver was based on the core perf
>   * infrastructure, and while we did make that mostly work, with some changes to
> @@ -135,7 +145,7 @@
>   *   for combining with the side-band raw reports it captures using
>   *   MI_REPORT_PERF_COUNT commands.
>   *
> - *   _ As a side note on perf's grouping feature; there was also some concern
> + *   - As a side note on perf's grouping feature; there was also some concern
>   *     that using PERF_FORMAT_GROUP as a way to pack together counter values
>   *     would quite drastically inflate our sample sizes, which would likely
>   *     lower the effective sampling resolutions we could use when the available
> @@ -277,6 +287,20 @@ static struct i915_oa_format hsw_oa_formats[I915_OA_FORMAT_MAX] = {
>
>  #define SAMPLE_OA_REPORT      (1<<0)
>
> +/**
> + * struct perf_open_properties - for validated properties given to open a stream
> + * @sample_flags: `DRM_I915_PERF_PROP_SAMPLE_*` properties are tracked as flags
> + * @single_context: Whether a single or all gpu contexts should be monitored
> + * @ctx_handle: A gem ctx handle for use with @single_context
> + * @metrics_set: An ID for an OA unit metric set advertised via sysfs
> + * @oa_format: An OA unit HW report format
> + * @oa_periodic: Whether to enable periodic OA unit sampling
> + * @oa_period_exponent: The OA unit sampling period is derived from this
> + *
> + * As read_properties_unlocked() enumerates and validates the properties given
> + * to open a stream of metrics the configuration is built up in the structure
> + * which starts out zero initialized.
> + */
>  struct perf_open_properties {
>         u32 sample_flags;
>
> @@ -314,7 +338,19 @@ static bool gen7_oa_buffer_is_empty_fop_unlocked(struct drm_i915_private *dev_pr
>  }
>
>  /**
> - * Appends a status record to a userspace read() buffer.
> + * append_oa_status - Appends a status record to a userspace read() buffer.
> + * @stream: An i915-perf stream opened for OA metrics
> + * @buf: destination buffer given by userspace
> + * @count: the number of bytes userspace wants to read
> + * @offset: (inout): the current position for writing into @buf
> + * @type: The kind of status to report to userspace
> + *
> + * Writes a status record (such as `DRM_I915_PERF_RECORD_OA_REPORT_LOST`)
> + * into the userspace read() buffer.
> + *
> + * The @buf @offset will only be updated on success.
> + *
> + * Returns: 0 on success, negative error code on failure.
>   */
>  static int append_oa_status(struct i915_perf_stream *stream,
>                             char __user *buf,
> @@ -336,7 +372,21 @@ static int append_oa_status(struct i915_perf_stream *stream,
>  }
>
>  /**
> - * Copies single OA report into userspace read() buffer.
> + * append_oa_sample - Copies single OA report into userspace read() buffer.
> + * @stream: An i915-perf stream opened for OA metrics
> + * @buf: destination buffer given by userspace
> + * @count: the number of bytes userspace wants to read
> + * @offset: (inout): the current position for writing into @buf
> + * @report: A single OA report to (optionally) include as part of the sample
> + *
> + * The contents of a sample are configured through `DRM_I915_PERF_PROP_SAMPLE_*`
> + * properties when opening a stream, tracked as `stream->sample_flags`. This
> + * function copies the requested components of a single sample to the given
> + * read() @buf.
> + *
> + * The @buf @offset will only be updated on success.
> + *
> + * Returns: 0 on success, negative error code on failure.
>   */
>  static int append_oa_sample(struct i915_perf_stream *stream,
>                             char __user *buf,
> @@ -380,8 +430,6 @@ static int append_oa_sample(struct i915_perf_stream *stream,
>   * @head_ptr: (inout): the current oa buffer cpu read position
>   * @tail: the current oa buffer gpu write position
>   *
> - * Returns 0 on success, negative error code on failure.
> - *
>   * Notably any error condition resulting in a short read (-ENOSPC or
>   * -EFAULT) will be returned even though one or more records may
>   * have been successfully copied. In this case it's up to the caller
> @@ -392,6 +440,8 @@ static int append_oa_sample(struct i915_perf_stream *stream,
>   * tail, so the head chases the tail?... If you think that's mad
>   * and back-to-front you're not alone, but this follows the
>   * Gen PRM naming convention.
> + *
> + * Returns: 0 on success, negative error code on failure.
>   */
>  static int gen7_append_oa_reports(struct i915_perf_stream *stream,
>                                   char __user *buf,
> @@ -496,6 +546,22 @@ static int gen7_append_oa_reports(struct i915_perf_stream *stream,
>         return ret;
>  }
>
> +/**
> + * gen7_oa_read - copy status records then buffered OA reports
> + * @stream: An i915-perf stream opened for OA metrics
> + * @buf: destination buffer given by userspace
> + * @count: the number of bytes userspace wants to read
> + * @offset: (inout): the current position for writing into @buf
> + *
> + * Checks Gen 7 specific OA unit status registers and if necessary appends
> + * corresponding status records for userspace (such as for a buffer full
> + * condition) and then initiate appending any buffered OA reports.
> + *
> + * Updates @offset according to the number of bytes successfully copied into
> + * the userspace buffer.
> + *
> + * Returns: zero on success or a negative error code
> + */
>  static int gen7_oa_read(struct i915_perf_stream *stream,
>                         char __user *buf,
>                         size_t count,
> @@ -597,6 +663,20 @@ static int gen7_oa_read(struct i915_perf_stream *stream,
>         return ret;
>  }
>
> +/**
> + * i915_oa_wait_unlocked - handles blocking IO until OA data available
> + * @stream: An i915-perf stream opened for OA metrics
> + *
> + * Called when userspace tries to read() from a blocking stream FD opened
> + * for OA metrics. It waits until the hrtimer callback finds a non-empty
> + * OA buffer and wakes us.
> + *
> + * Note: it's acceptable to have this return with some false positives
> + * since any subsequent read handling will return EAGAIN if there isn't
s/EAGAIN/-EAGAIN/

> + * really data ready for userspace yet.
> + *
> + * Returns: zero on success or a negative error code
> + */
>  static int i915_oa_wait_unlocked(struct i915_perf_stream *stream)
>  {
>         struct drm_i915_private *dev_priv = stream->dev_priv;
> @@ -615,6 +695,16 @@ static int i915_oa_wait_unlocked(struct i915_perf_stream *stream)
>                                         !dev_priv->perf.oa.ops.oa_buffer_is_empty(dev_priv));
>  }
>
> +/**
> + * i915_oa_poll_wait - call poll_wait() for an OA stream poll()
> + * @stream: An i915-perf stream opened for OA metrics
> + * @file: An i915 perf stream file
> + * @wait: poll() state table
> + *
> + * For handling userspace polling on an i915 perf stream opened for OA metrics,
> + * this starts a poll_wait with the wait queue that our hrtimer callback wakes
> + * when it sees data ready to read in the circular OA buffer.
> + */
>  static void i915_oa_poll_wait(struct i915_perf_stream *stream,
>                               struct file *file,
>                               poll_table *wait)
> @@ -624,6 +714,18 @@ static void i915_oa_poll_wait(struct i915_perf_stream *stream,
>         poll_wait(file, &dev_priv->perf.oa.poll_wq, wait);
>  }
>
> +/**
> + * i915_oa_read - just calls through to ``&i915_oa_ops->read``
I've got no idea what ``&i915_oa_ops->read`` is supposed to generate,
but it spews out :c:type:`i915_oa_ops->read <i915_oa_ops>` for me...

> + * @stream: An i915-perf stream opened for OA metrics
> + * @buf: destination buffer given by userspace
> + * @count: the number of bytes userspace wants to read
> + * @offset: (inout): the current position for writing into @buf
> + *
> + * Updates @offset according to the number of bytes successfully copied into
> + * the userspace buffer.
> + *
> + * Returns: zero on success or a negative error code
> + */
>  static int i915_oa_read(struct i915_perf_stream *stream,
>                         char __user *buf,
>                         size_t count,
> @@ -634,9 +736,15 @@ static int i915_oa_read(struct i915_perf_stream *stream,
>         return dev_priv->perf.oa.ops.read(stream, buf, count, offset);
>  }
>
> -/* Determine the render context hw id, and ensure it remains fixed for the
> +/**
> + * oa_get_render_ctx_id - determine and hold ctx hw id
> + * @stream: An i915-perf stream opened for OA metrics
> + *
> + * Determine the render context hw id, and ensure it remains fixed for the
>   * lifetime of the stream. This ensures that we don't have to worry about
>   * updating the context ID in OACONTROL on the fly.
> + *
> + * Returns: zero on success or a negative error code
>   */
>  static int oa_get_render_ctx_id(struct i915_perf_stream *stream)
>  {
> @@ -673,6 +781,13 @@ static int oa_get_render_ctx_id(struct i915_perf_stream *stream)
>         return ret;
>  }
>
> +/**
> + * oa_put_render_ctx_id - counterpart to oa_get_render_ctx_id releases hold
> + * @stream: An i915-perf stream opened for OA metrics
> + *
> + * In case anything needed doing to ensure the context HW ID would remain valid
> + * for the lifetime of the stream, then that can be undone here.
> + */
>  static void oa_put_render_ctx_id(struct i915_perf_stream *stream)
>  {
>         struct drm_i915_private *dev_priv = stream->dev_priv;
> @@ -945,6 +1060,15 @@ static void gen7_oa_enable(struct drm_i915_private *dev_priv)
>         spin_unlock_irqrestore(&dev_priv->perf.hook_lock, flags);
>  }
>
> +/**
> + * i915_oa_stream_enable - handle I915_PERF_IOCTL_ENABLE for OA stream
> + * @stream: An i915 perf stream opened for OA metrics
> + *
> + * [Re]enables hardware periodic sampling according to the period configured
> + * when opening the stream. This also starts a hrtimer that will periodically
> + * check for data in the circular OA buffer for notifying userspace (e.g.
> + * during a read() or poll()).
> + */
>  static void i915_oa_stream_enable(struct i915_perf_stream *stream)
>  {
>         struct drm_i915_private *dev_priv = stream->dev_priv;
> @@ -962,6 +1086,14 @@ static void gen7_oa_disable(struct drm_i915_private *dev_priv)
>         I915_WRITE(GEN7_OACONTROL, 0);
>  }
>
> +/**
> + * i915_oa_stream_disable - handle I915_PERF_IOCTL_DISABLE for OA stream
> + * @stream: An i915 perf stream opened for OA metrics
> + *
> + * Stops the OA unit from periodically writing counter reports into the
> + * circular OA buffer. This also stops the hrtimer that periodically checks for
> + * data in the circular OA buffer, for notifying userspace.
> + */
>  static void i915_oa_stream_disable(struct i915_perf_stream *stream)
>  {
>         struct drm_i915_private *dev_priv = stream->dev_priv;
> @@ -987,6 +1119,24 @@ static const struct i915_perf_stream_ops i915_oa_stream_ops = {
>         .read = i915_oa_read,
>  };
>
> +/**
> + * i915_oa_stream_init - validate combined props for OA stream and init
> + * @stream: An i915 perf stream
> + * @param: The open parameters passed to 'DRM_I915_PERF_OPEN`
> + * @props: The property state that configures stream (individually validated)
> + *
> + * While read_properties_unlocked() validates properties in isolation it
> + * doesn't ensure that the combination necessarily makes sense.
> + *
> + * At this point it has been determined that userspace wants a stream of
> + * OA metrics, but still we need to further validate the combined
> + * properties are OK.
> + *
> + * If the configuration makes sense then we can allocate memory for
> + * a circular OA buffer and apply the requested metric set configuration.
> + *
> + * Returns: zero on success or a negative error code.
> + */
>  static int i915_oa_stream_init(struct i915_perf_stream *stream,
>                                struct drm_i915_perf_open_param *param,
>                                struct perf_open_properties *props)
> @@ -1110,6 +1260,31 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream,
>         return ret;
>  }
>
> +/**
> + * i915_perf_read_locked - wrap stream->ops->read with error notmalisation
notmalisation? normalisation? nationalisation?

> + * @stream: An i915 perf stream
> + * @file: An i915 perf stream file
> + * @buf: destination buffer given by userspace
> + * @count: the number of bytes userspace wants to read
> + * @ppos: (inout) file seek position (unused)
> + *
> + * Besides wrapping stream->ops->read() this provides a common place to ensure
This generates stream->ops->:c:func:read() for me...hmm, so maybe we
don't want the brackets ?

> + * that if we've successfully copied any data then reporting that takes
> + * precedence over any internal error status, so the data isn't lost.
> + *
> + * For example ret will be -ENOSPC whenever there is more buffered data than
> + * can be copied to userspace, but that's only interesting if we weren't able
> + * to copy some data because it implies the userspace buffer is too small to
> + * receive a single record (and we never split records).
> + *
> + * Another case with ret == -EFAULT is more of a grey area since it would seem
> + * like bad form for userspace to ask us to overrun its buffer, but the user
> + * knows best:
> + *
> + *   http://yarchive.net/comp/linux/partial_reads_writes.html
> + *
> + * Returns: The number of bytes copied or a negative error code on failure.
> + */
>  static ssize_t i915_perf_read_locked(struct i915_perf_stream *stream,
>                                      struct file *file,
>                                      char __user *buf,
> @@ -1125,25 +1300,27 @@ static ssize_t i915_perf_read_locked(struct i915_perf_stream *stream,
>         size_t offset = 0;
>         int ret = stream->ops->read(stream, buf, count, &offset);
>
> -       /* If we've successfully copied any data then reporting that
> -        * takes precedence over any internal error status, so the
> -        * data isn't lost.
> -        *
> -        * For example ret will be -ENOSPC whenever there is more
> -        * buffered data than can be copied to userspace, but that's
> -        * only interesting if we weren't able to copy some data
> -        * because it implies the userspace buffer is too small to
> -        * receive a single record (and we never split records).
> -        *
> -        * Another case with ret == -EFAULT is more of a grey area
> -        * since it would seem like bad form for userspace to ask us
> -        * to overrun its buffer, but the user knows best:
> -        *
> -        *   http://yarchive.net/comp/linux/partial_reads_writes.html
> -        */
>         return offset ?: (ret ?: -EAGAIN);
>  }
>
> +/**
> + * i915_perf_read - handles read() FOP for i915 perf stream FDs
> + * @file: An i915 perf stream file
> + * @buf: destination buffer given by userspace
> + * @count: the number of bytes userspace wants to read
> + * @ppos: (inout) file seek position (unused)
> + *
> + * The entry point for handling a read() on a stream file descriptor from
> + * userspace. Most of the work is left to the i915_perf_read_locked() and
> + * stream->op->read() but to save having stream implementations (of which
s/stream->op->read()/stream->ops->read

> + * we might have multiple later) we handle blocking read here.
> + *
> + * We can also consistently treat trying to read from a disable stream
s/disable/disabled

> + * as an IO error so implementations can assume the stream is enabled
> + * while reading.
> + *
> + * Returns: The number of bytes copied or a negative error code on failure.
> + */
>  static ssize_t i915_perf_read(struct file *file,
>                               char __user *buf,
>                               size_t count,
> @@ -1458,12 +1635,20 @@ i915_perf_open_ioctl_locked(struct drm_i915_private *dev_priv,
>         return ret;
>  }
>
> -/* Note we copy the properties from userspace outside of the i915 perf
> - * mutex to avoid an awkward lockdep with mmap_sem.
> +/**
> + * read_properties_unlocked - validate + copy userspace stream open properties
> + * @dev_priv: i915 device instance
> + * @uprops: The array of u64 key value pairs given by userspace
> + * @n_props: The number of key value pairs expected in @uprops
> + * @props: The stream configuration built up while validating properties
>   *
>   * Note this function only validates properties in isolation it doesn't
>   * validate that the combination of properties makes sense or that all
>   * properties necessary for a particular kind of stream have been set.
> + *
> + * Note that there currently aren't any ordering requirements for properties so
> + * we shouldn't validate or assume anything about ordering here. This doesn't
> + * rule out defining new properties with ordering requirements in the future.
>   */
>  static int read_properties_unlocked(struct drm_i915_private *dev_priv,
>                                     u64 __user *uprops,
> @@ -1585,6 +1770,26 @@ static int read_properties_unlocked(struct drm_i915_private *dev_priv,
>         return 0;
>  }
>
> +/**
> + * i915_perf_open_ioctl - DRM ioctl() for userspace to open a stream FD
> + * @dev: drm device
> + * @data: ioctl data copied from userspace (unvalidated)
> + * @file: drm file
> + *
> + * Validates the stream open parameters given by userspace including flags
> + * and an array of u64 key, value pair properties.
> + *
> + * Very little is assumed up front about the nature of the stream being
> + * opened (for instance we don't assume it's for periodic OA unit metrics). An
> + * i915-perf stream is expected to be a suitable interface for other forms of
> + * buffered data written by the GPU besides periodic OA metrics.
> + *
> + * Note we copy the properties from userspace outside of the i915 perf
> + * mutex to avoid an awkward lockdep with mmap_sem.
> + *
> + * Return: A newly opened i915 Perf stream file descriptor or negative
> + * error code on failure.
> + */
>  int i915_perf_open_ioctl(struct drm_device *dev, void *data,
>                          struct drm_file *file)
>  {
> @@ -1621,6 +1826,14 @@ int i915_perf_open_ioctl(struct drm_device *dev, void *data,
>         return ret;
>  }
>
> +/**
> + * i915_perf_register - exposes i915-perf to userspace
> + * @dev_priv: i915 device instance
> + *
> + * In particular OA metric sets are advertised under a sysfs metrics/
> + * directory allowing userspace to enumerate valid IDs that can be
> + * used to open an i915-perf stream.
> + */
>  void i915_perf_register(struct drm_i915_private *dev_priv)
>  {
>         if (!IS_HASWELL(dev_priv))
> @@ -1650,6 +1863,15 @@ void i915_perf_register(struct drm_i915_private *dev_priv)
>         mutex_unlock(&dev_priv->perf.lock);
>  }
>
> +/**
> + * i915_perf_register - hide i915-perf from userspace
i915_perf_unregister

> + * @dev_priv: i915 device instance
> + *
> + * i915-perf state cleanup is split up into an 'unregister' and
> + * 'deinit' phase where the interface is first hidden from
> + * userspace by i915_perf_unregister() before cleaning up
> + * remaining state in i915_perf_fini().
> + */
>  void i915_perf_unregister(struct drm_i915_private *dev_priv)
>  {
>         if (!IS_HASWELL(dev_priv))
> @@ -1706,6 +1928,15 @@ static struct ctl_table dev_root[] = {
>         {}
>  };
>
> +/**
> + * i915_perf_init - initialize i915-perf state on module load
> + * @dev_priv: i915 device instance
> + *
> + * Initializes state i915-perf state without exposing anything to userspace.
s/state i915-perf state/i915-perf state/

Other than that it all looks good.
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH] drm/i915/perf: More documentation hooked to i915.rst
@ 2016-11-29 17:00 Robert Bragg
  2016-11-30 16:30 ` Matthew Auld
  2016-11-30 19:41 ` Daniel Vetter
  0 siblings, 2 replies; 16+ messages in thread
From: Robert Bragg @ 2016-11-29 17:00 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter

This adds a 'Perf' section to i915.rst with the following sub sections:
- Overview
- Comparison with Core Perf
- i915 Driver Entry Points
- i915 Perf Stream
- i915 Perf Observation Architecture Stream
- All i915 Perf Internals

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 Documentation/gpu/i915.rst       |  92 +++++++++++++
 drivers/gpu/drm/i915/i915_drv.h  | 151 ++++++++++++++++----
 drivers/gpu/drm/i915/i915_perf.c | 289 +++++++++++++++++++++++++++++++++++----
 3 files changed, 478 insertions(+), 54 deletions(-)

diff --git a/Documentation/gpu/i915.rst b/Documentation/gpu/i915.rst
index 117d2ab..714bd4b 100644
--- a/Documentation/gpu/i915.rst
+++ b/Documentation/gpu/i915.rst
@@ -356,4 +356,96 @@ switch_mm
 .. kernel-doc:: drivers/gpu/drm/i915/i915_trace.h
    :doc: switch_mm tracepoint
 
+Perf
+====
+
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :doc: i915 Perf Overview
+
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :doc: i915 Perf History and Comparison with Core Perf
+
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :doc: i915 Perf File Operations
+
+i915 Driver Entry Points
+------------------------
+
+This section covers the entrypoints exported outside of i915_perf.c to
+integrate with drm/i915 and to handle the `DRM_I915_PERF_OPEN` ioctl.
+
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_perf_init
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_perf_fini
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_perf_register
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_perf_unregister
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_perf_open_ioctl
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_perf_release
+
+i915 Perf Stream
+----------------
+
+This section covers the stream-semantics-agnostic structures and functions
+for representing an i915 perf stream FD and associated file operations.
+
+.. kernel-doc:: drivers/gpu/drm/i915/i915_drv.h
+   :functions: i915_perf_stream
+.. kernel-doc:: drivers/gpu/drm/i915/i915_drv.h
+   :functions: i915_perf_stream_ops
+
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: read_properties_unlocked
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_perf_open_ioctl_unlocked
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_perf_destroy_unlocked
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_perf_read
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_perf_ioctl
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_perf_enable_unlocked
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_perf_disable_unlocked
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_perf_poll
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_perf_poll_unlocked
+
+i915 Perf Observation Architecture Stream
+-----------------------------------------
+
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: OA_BUFFER_SIZE
+.. kernel-doc:: drivers/gpu/drm/i915/i915_drv.h
+   :functions: i915_oa_ops
+
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_oa_stream_init
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_oa_read
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_oa_stream_enable
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_oa_stream_disable
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_oa_wait_unlocked
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_oa_poll_wait
+
+All i915 Perf Internals
+-----------------------
+
+This section simply includes all currently documented i915 perf internals, in
+no particular order, but may include some more minor utilities or platform
+specific details than found in the more high-level sections.
+
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :internal:
+
 .. WARNING: DOCPROC directive not supported: !Cdrivers/gpu/drm/i915/i915_irq.c
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 1ec9619..9f92755 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1827,89 +1827,186 @@ struct i915_oa_reg {
 
 struct i915_perf_stream;
 
+/**
+ * struct i915_perf_stream_ops - the OPs to support a specific stream type
+ */
 struct i915_perf_stream_ops {
-	/* Enables the collection of HW samples, either in response to
-	 * I915_PERF_IOCTL_ENABLE or implicitly called when stream is
-	 * opened without I915_PERF_FLAG_DISABLED.
+	/**
+	 * @enable: Enables the collection of HW samples, either in response to
+	 * I915_PERF_IOCTL_ENABLE or implicitly called when stream is opened
+	 * without I915_PERF_FLAG_DISABLED.
 	 */
 	void (*enable)(struct i915_perf_stream *stream);
 
-	/* Disables the collection of HW samples, either in response to
-	 * I915_PERF_IOCTL_DISABLE or implicitly called before
-	 * destroying the stream.
+	/**
+	 * @disable: Disables the collection of HW samples, either in response
+	 * to I915_PERF_IOCTL_DISABLE or implicitly called before destroying
+	 * the stream.
 	 */
 	void (*disable)(struct i915_perf_stream *stream);
 
-	/* Call poll_wait, passing a wait queue that will be woken
+	/**
+	 * @poll_wait: Call poll_wait, passing a wait queue that will be woken
 	 * once there is something ready to read() for the stream
 	 */
 	void (*poll_wait)(struct i915_perf_stream *stream,
 			  struct file *file,
 			  poll_table *wait);
 
-	/* For handling a blocking read, wait until there is something
-	 * to ready to read() for the stream. E.g. wait on the same
+	/**
+	 * @wait_unlocked: For handling a blocking read, wait until there is
+	 * something to ready to read() for the stream. E.g. wait on the same
 	 * wait queue that would be passed to poll_wait().
 	 */
 	int (*wait_unlocked)(struct i915_perf_stream *stream);
 
-	/* read - Copy buffered metrics as records to userspace
-	 * @buf: the userspace, destination buffer
-	 * @count: the number of bytes to copy, requested by userspace
-	 * @offset: zero at the start of the read, updated as the read
-	 *          proceeds, it represents how many bytes have been
-	 *          copied so far and the buffer offset for copying the
-	 *          next record.
+	/**
+	 * @read: Copy buffered metrics as records to userspace
+	 * **buf**: the userspace, destination buffer
+	 * **count**: the number of bytes to copy, requested by userspace
+	 * **offset**: zero at the start of the read, updated as the read
+	 * proceeds, it represents how many bytes have been copied so far and
+	 * the buffer offset for copying the next record.
 	 *
-	 * Copy as many buffered i915 perf samples and records for
-	 * this stream to userspace as will fit in the given buffer.
+	 * Copy as many buffered i915 perf samples and records for this stream
+	 * to userspace as will fit in the given buffer.
 	 *
-	 * Only write complete records; returning -ENOSPC if there
-	 * isn't room for a complete record.
+	 * Only write complete records; returning -ENOSPC if there isn't room
+	 * for a complete record.
 	 *
-	 * Return any error condition that results in a short read
-	 * such as -ENOSPC or -EFAULT, even though these may be
-	 * squashed before returning to userspace.
+	 * Return any error condition that results in a short read such as
+	 * -ENOSPC or -EFAULT, even though these may be squashed before
+	 * returning to userspace.
 	 */
 	int (*read)(struct i915_perf_stream *stream,
 		    char __user *buf,
 		    size_t count,
 		    size_t *offset);
 
-	/* Cleanup any stream specific resources.
+	/**
+	 * @destroy: Cleanup any stream specific resources.
 	 *
 	 * The stream will always be disabled before this is called.
 	 */
 	void (*destroy)(struct i915_perf_stream *stream);
 };
 
+/**
+ * struct i915_perf_stream - state for a single open stream FD
+ */
 struct i915_perf_stream {
+	/**
+	 * @dev_priv: i915 drm device
+	 */
 	struct drm_i915_private *dev_priv;
 
+	/**
+	 * @link: Links the stream into ``&drm_i915_private->streams``
+	 */
 	struct list_head link;
 
+	/**
+	 * @sample_flags: Flags representing the `DRM_I915_PERF_PROP_SAMPLE_*`
+	 * properties given when opening a stream, representing the contents
+	 * of a single sample as read() by userspace.
+	 */
 	u32 sample_flags;
+
+	/**
+	 * @sample_size: Considering the configured contents of a sample
+	 * combined with the required header size, this is the total size
+	 * of a single sample record.
+	 */
 	int sample_size;
 
+	/**
+	 * @ctx: %NULL if measuring system-wide across all contexts or a
+	 * specific context that is being monitored.
+	 */
 	struct i915_gem_context *ctx;
+
+	/**
+	 * @enabled: Whether the stream is currently enabled, considering
+	 * whether the stream was opened in a disabled state and based
+	 * on `I915_PERF_IOCTL_ENABLE` and `I915_PERF_IOCTL_DISABLE` calls.
+	 */
 	bool enabled;
 
+	/**
+	 * @ops: The callbacks providing the implementation of this specific
+	 * type of configured stream.
+	 */
 	const struct i915_perf_stream_ops *ops;
 };
 
+/**
+ * struct i915_oa_ops - Gen specific implementation of an OA unit stream
+ */
 struct i915_oa_ops {
+	/**
+	 * @init_oa_buffer: Resets the head and tail pointers of the
+	 * circular buffer for periodic OA reports.
+	 *
+	 * Called when first opening a stream for OA metrics, but also may be
+	 * called in response to an OA buffer overflow or other error
+	 * condition.
+	 *
+	 * Note it may be necessary to clear the full OA buffer here as part of
+	 * maintaining the invariable that new reports must be written to
+	 * zeroed memory for us to be able to reliable detect if an expected
+	 * report has not yet landed in memory.  (At least on Haswell the OA
+	 * buffer tail pointer is not synchronized with reports being visible
+	 * to the CPU)
+	 */
 	void (*init_oa_buffer)(struct drm_i915_private *dev_priv);
+
+	/**
+	 * @enable_metric_set: Applies any MUX configuration to set up the
+	 * Boolean and Custom (B/C) counters that are part of the counter
+	 * reports being sampled. May apply system constraints such as
+	 * disabling EU clock gating as required.
+	 */
 	int (*enable_metric_set)(struct drm_i915_private *dev_priv);
+
+	/**
+	 * @disable_metric_set: Remove system constraints associated with using
+	 * the OA unit.
+	 */
 	void (*disable_metric_set)(struct drm_i915_private *dev_priv);
+
+	/**
+	 * @oa_enable: Enable periodic sampling
+	 */
 	void (*oa_enable)(struct drm_i915_private *dev_priv);
+
+	/**
+	 * @oa_disable: Disable periodic sampling
+	 */
 	void (*oa_disable)(struct drm_i915_private *dev_priv);
-	void (*update_oacontrol)(struct drm_i915_private *dev_priv);
-	void (*update_hw_ctx_id_locked)(struct drm_i915_private *dev_priv,
-					u32 ctx_id);
+
+	/**
+	 * @read: Copy data from the circular OA buffer into a given userspace
+	 * buffer.
+	 */
 	int (*read)(struct i915_perf_stream *stream,
 		    char __user *buf,
 		    size_t count,
 		    size_t *offset);
+
+	/**
+	 * @oa_buffer_is_empty: Check if OA buffer empty (false positives OK)
+	 *
+	 * This is either called via fops or the poll check hrtimer (atomic
+	 * ctx) without any locks taken.
+	 *
+	 * It's safe to read OA config state here unlocked, assuming that this
+	 * is only called while the stream is enabled, while the global OA
+	 * configuration can't be modified.
+	 *
+	 * Efficiency is more important than avoiding some false positives
+	 * here, which will be handled gracefully - likely resulting in an
+	 * EAGAIN error for userspace.
+	 */
 	bool (*oa_buffer_is_empty)(struct drm_i915_private *dev_priv);
 };
 
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 68b7c27..0be8cef 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -26,7 +26,10 @@
 
 
 /**
- * DOC: i915 Perf, streaming API for GPU metrics
+ * DOC: i915 Perf Overview
+ *
+ * Overview
+ * --------
  *
  * Gen graphics supports a large number of performance counters that can help
  * driver and application developers understand and optimize their use of the
@@ -45,6 +48,13 @@
  * privileges by default, unless changed via the dev.i915.perf_event_paranoid
  * sysctl option.
  *
+ */
+
+/**
+ * DOC: i915 Perf History and Comparison with Core Perf
+ *
+ * Comparison with Core Perf
+ * -------------------------
  *
  * The interface was initially inspired by the core Perf infrastructure but
  * some notable differences are:
@@ -75,8 +85,8 @@
  * gets copied from the GPU mapped buffers to userspace buffers.
  *
  *
- * Some notes regarding Linux Perf:
- * --------------------------------
+ * Issues hit with first prototype based on Core Perf
+ * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  *
  * The first prototype of this driver was based on the core perf
  * infrastructure, and while we did make that mostly work, with some changes to
@@ -135,7 +145,7 @@
  *   for combining with the side-band raw reports it captures using
  *   MI_REPORT_PERF_COUNT commands.
  *
- *   _ As a side note on perf's grouping feature; there was also some concern
+ *   - As a side note on perf's grouping feature; there was also some concern
  *     that using PERF_FORMAT_GROUP as a way to pack together counter values
  *     would quite drastically inflate our sample sizes, which would likely
  *     lower the effective sampling resolutions we could use when the available
@@ -277,6 +287,20 @@ static struct i915_oa_format hsw_oa_formats[I915_OA_FORMAT_MAX] = {
 
 #define SAMPLE_OA_REPORT      (1<<0)
 
+/**
+ * struct perf_open_properties - for validated properties given to open a stream
+ * @sample_flags: `DRM_I915_PERF_PROP_SAMPLE_*` properties are tracked as flags
+ * @single_context: Whether a single or all gpu contexts should be monitored
+ * @ctx_handle: A gem ctx handle for use with @single_context
+ * @metrics_set: An ID for an OA unit metric set advertised via sysfs
+ * @oa_format: An OA unit HW report format
+ * @oa_periodic: Whether to enable periodic OA unit sampling
+ * @oa_period_exponent: The OA unit sampling period is derived from this
+ *
+ * As read_properties_unlocked() enumerates and validates the properties given
+ * to open a stream of metrics the configuration is built up in the structure
+ * which starts out zero initialized.
+ */
 struct perf_open_properties {
 	u32 sample_flags;
 
@@ -314,7 +338,19 @@ static bool gen7_oa_buffer_is_empty_fop_unlocked(struct drm_i915_private *dev_pr
 }
 
 /**
- * Appends a status record to a userspace read() buffer.
+ * append_oa_status - Appends a status record to a userspace read() buffer.
+ * @stream: An i915-perf stream opened for OA metrics
+ * @buf: destination buffer given by userspace
+ * @count: the number of bytes userspace wants to read
+ * @offset: (inout): the current position for writing into @buf
+ * @type: The kind of status to report to userspace
+ *
+ * Writes a status record (such as `DRM_I915_PERF_RECORD_OA_REPORT_LOST`)
+ * into the userspace read() buffer.
+ *
+ * The @buf @offset will only be updated on success.
+ *
+ * Returns: 0 on success, negative error code on failure.
  */
 static int append_oa_status(struct i915_perf_stream *stream,
 			    char __user *buf,
@@ -336,7 +372,21 @@ static int append_oa_status(struct i915_perf_stream *stream,
 }
 
 /**
- * Copies single OA report into userspace read() buffer.
+ * append_oa_sample - Copies single OA report into userspace read() buffer.
+ * @stream: An i915-perf stream opened for OA metrics
+ * @buf: destination buffer given by userspace
+ * @count: the number of bytes userspace wants to read
+ * @offset: (inout): the current position for writing into @buf
+ * @report: A single OA report to (optionally) include as part of the sample
+ *
+ * The contents of a sample are configured through `DRM_I915_PERF_PROP_SAMPLE_*`
+ * properties when opening a stream, tracked as `stream->sample_flags`. This
+ * function copies the requested components of a single sample to the given
+ * read() @buf.
+ *
+ * The @buf @offset will only be updated on success.
+ *
+ * Returns: 0 on success, negative error code on failure.
  */
 static int append_oa_sample(struct i915_perf_stream *stream,
 			    char __user *buf,
@@ -380,8 +430,6 @@ static int append_oa_sample(struct i915_perf_stream *stream,
  * @head_ptr: (inout): the current oa buffer cpu read position
  * @tail: the current oa buffer gpu write position
  *
- * Returns 0 on success, negative error code on failure.
- *
  * Notably any error condition resulting in a short read (-ENOSPC or
  * -EFAULT) will be returned even though one or more records may
  * have been successfully copied. In this case it's up to the caller
@@ -392,6 +440,8 @@ static int append_oa_sample(struct i915_perf_stream *stream,
  * tail, so the head chases the tail?... If you think that's mad
  * and back-to-front you're not alone, but this follows the
  * Gen PRM naming convention.
+ *
+ * Returns: 0 on success, negative error code on failure.
  */
 static int gen7_append_oa_reports(struct i915_perf_stream *stream,
 				  char __user *buf,
@@ -496,6 +546,22 @@ static int gen7_append_oa_reports(struct i915_perf_stream *stream,
 	return ret;
 }
 
+/**
+ * gen7_oa_read - copy status records then buffered OA reports
+ * @stream: An i915-perf stream opened for OA metrics
+ * @buf: destination buffer given by userspace
+ * @count: the number of bytes userspace wants to read
+ * @offset: (inout): the current position for writing into @buf
+ *
+ * Checks Gen 7 specific OA unit status registers and if necessary appends
+ * corresponding status records for userspace (such as for a buffer full
+ * condition) and then initiate appending any buffered OA reports.
+ *
+ * Updates @offset according to the number of bytes successfully copied into
+ * the userspace buffer.
+ *
+ * Returns: zero on success or a negative error code
+ */
 static int gen7_oa_read(struct i915_perf_stream *stream,
 			char __user *buf,
 			size_t count,
@@ -597,6 +663,20 @@ static int gen7_oa_read(struct i915_perf_stream *stream,
 	return ret;
 }
 
+/**
+ * i915_oa_wait_unlocked - handles blocking IO until OA data available
+ * @stream: An i915-perf stream opened for OA metrics
+ *
+ * Called when userspace tries to read() from a blocking stream FD opened
+ * for OA metrics. It waits until the hrtimer callback finds a non-empty
+ * OA buffer and wakes us.
+ *
+ * Note: it's acceptable to have this return with some false positives
+ * since any subsequent read handling will return EAGAIN if there isn't
+ * really data ready for userspace yet.
+ *
+ * Returns: zero on success or a negative error code
+ */
 static int i915_oa_wait_unlocked(struct i915_perf_stream *stream)
 {
 	struct drm_i915_private *dev_priv = stream->dev_priv;
@@ -615,6 +695,16 @@ static int i915_oa_wait_unlocked(struct i915_perf_stream *stream)
 					!dev_priv->perf.oa.ops.oa_buffer_is_empty(dev_priv));
 }
 
+/**
+ * i915_oa_poll_wait - call poll_wait() for an OA stream poll()
+ * @stream: An i915-perf stream opened for OA metrics
+ * @file: An i915 perf stream file
+ * @wait: poll() state table
+ *
+ * For handling userspace polling on an i915 perf stream opened for OA metrics,
+ * this starts a poll_wait with the wait queue that our hrtimer callback wakes
+ * when it sees data ready to read in the circular OA buffer.
+ */
 static void i915_oa_poll_wait(struct i915_perf_stream *stream,
 			      struct file *file,
 			      poll_table *wait)
@@ -624,6 +714,18 @@ static void i915_oa_poll_wait(struct i915_perf_stream *stream,
 	poll_wait(file, &dev_priv->perf.oa.poll_wq, wait);
 }
 
+/**
+ * i915_oa_read - just calls through to ``&i915_oa_ops->read``
+ * @stream: An i915-perf stream opened for OA metrics
+ * @buf: destination buffer given by userspace
+ * @count: the number of bytes userspace wants to read
+ * @offset: (inout): the current position for writing into @buf
+ *
+ * Updates @offset according to the number of bytes successfully copied into
+ * the userspace buffer.
+ *
+ * Returns: zero on success or a negative error code
+ */
 static int i915_oa_read(struct i915_perf_stream *stream,
 			char __user *buf,
 			size_t count,
@@ -634,9 +736,15 @@ static int i915_oa_read(struct i915_perf_stream *stream,
 	return dev_priv->perf.oa.ops.read(stream, buf, count, offset);
 }
 
-/* Determine the render context hw id, and ensure it remains fixed for the
+/**
+ * oa_get_render_ctx_id - determine and hold ctx hw id
+ * @stream: An i915-perf stream opened for OA metrics
+ *
+ * Determine the render context hw id, and ensure it remains fixed for the
  * lifetime of the stream. This ensures that we don't have to worry about
  * updating the context ID in OACONTROL on the fly.
+ *
+ * Returns: zero on success or a negative error code
  */
 static int oa_get_render_ctx_id(struct i915_perf_stream *stream)
 {
@@ -673,6 +781,13 @@ static int oa_get_render_ctx_id(struct i915_perf_stream *stream)
 	return ret;
 }
 
+/**
+ * oa_put_render_ctx_id - counterpart to oa_get_render_ctx_id releases hold
+ * @stream: An i915-perf stream opened for OA metrics
+ *
+ * In case anything needed doing to ensure the context HW ID would remain valid
+ * for the lifetime of the stream, then that can be undone here.
+ */
 static void oa_put_render_ctx_id(struct i915_perf_stream *stream)
 {
 	struct drm_i915_private *dev_priv = stream->dev_priv;
@@ -945,6 +1060,15 @@ static void gen7_oa_enable(struct drm_i915_private *dev_priv)
 	spin_unlock_irqrestore(&dev_priv->perf.hook_lock, flags);
 }
 
+/**
+ * i915_oa_stream_enable - handle I915_PERF_IOCTL_ENABLE for OA stream
+ * @stream: An i915 perf stream opened for OA metrics
+ *
+ * [Re]enables hardware periodic sampling according to the period configured
+ * when opening the stream. This also starts a hrtimer that will periodically
+ * check for data in the circular OA buffer for notifying userspace (e.g.
+ * during a read() or poll()).
+ */
 static void i915_oa_stream_enable(struct i915_perf_stream *stream)
 {
 	struct drm_i915_private *dev_priv = stream->dev_priv;
@@ -962,6 +1086,14 @@ static void gen7_oa_disable(struct drm_i915_private *dev_priv)
 	I915_WRITE(GEN7_OACONTROL, 0);
 }
 
+/**
+ * i915_oa_stream_disable - handle I915_PERF_IOCTL_DISABLE for OA stream
+ * @stream: An i915 perf stream opened for OA metrics
+ *
+ * Stops the OA unit from periodically writing counter reports into the
+ * circular OA buffer. This also stops the hrtimer that periodically checks for
+ * data in the circular OA buffer, for notifying userspace.
+ */
 static void i915_oa_stream_disable(struct i915_perf_stream *stream)
 {
 	struct drm_i915_private *dev_priv = stream->dev_priv;
@@ -987,6 +1119,24 @@ static const struct i915_perf_stream_ops i915_oa_stream_ops = {
 	.read = i915_oa_read,
 };
 
+/**
+ * i915_oa_stream_init - validate combined props for OA stream and init
+ * @stream: An i915 perf stream
+ * @param: The open parameters passed to 'DRM_I915_PERF_OPEN`
+ * @props: The property state that configures stream (individually validated)
+ *
+ * While read_properties_unlocked() validates properties in isolation it
+ * doesn't ensure that the combination necessarily makes sense.
+ *
+ * At this point it has been determined that userspace wants a stream of
+ * OA metrics, but still we need to further validate the combined
+ * properties are OK.
+ *
+ * If the configuration makes sense then we can allocate memory for
+ * a circular OA buffer and apply the requested metric set configuration.
+ *
+ * Returns: zero on success or a negative error code.
+ */
 static int i915_oa_stream_init(struct i915_perf_stream *stream,
 			       struct drm_i915_perf_open_param *param,
 			       struct perf_open_properties *props)
@@ -1110,6 +1260,31 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream,
 	return ret;
 }
 
+/**
+ * i915_perf_read_locked - wrap stream->ops->read with error notmalisation
+ * @stream: An i915 perf stream
+ * @file: An i915 perf stream file
+ * @buf: destination buffer given by userspace
+ * @count: the number of bytes userspace wants to read
+ * @ppos: (inout) file seek position (unused)
+ *
+ * Besides wrapping stream->ops->read() this provides a common place to ensure
+ * that if we've successfully copied any data then reporting that takes
+ * precedence over any internal error status, so the data isn't lost.
+ *
+ * For example ret will be -ENOSPC whenever there is more buffered data than
+ * can be copied to userspace, but that's only interesting if we weren't able
+ * to copy some data because it implies the userspace buffer is too small to
+ * receive a single record (and we never split records).
+ *
+ * Another case with ret == -EFAULT is more of a grey area since it would seem
+ * like bad form for userspace to ask us to overrun its buffer, but the user
+ * knows best:
+ *
+ *   http://yarchive.net/comp/linux/partial_reads_writes.html
+ *
+ * Returns: The number of bytes copied or a negative error code on failure.
+ */
 static ssize_t i915_perf_read_locked(struct i915_perf_stream *stream,
 				     struct file *file,
 				     char __user *buf,
@@ -1125,25 +1300,27 @@ static ssize_t i915_perf_read_locked(struct i915_perf_stream *stream,
 	size_t offset = 0;
 	int ret = stream->ops->read(stream, buf, count, &offset);
 
-	/* If we've successfully copied any data then reporting that
-	 * takes precedence over any internal error status, so the
-	 * data isn't lost.
-	 *
-	 * For example ret will be -ENOSPC whenever there is more
-	 * buffered data than can be copied to userspace, but that's
-	 * only interesting if we weren't able to copy some data
-	 * because it implies the userspace buffer is too small to
-	 * receive a single record (and we never split records).
-	 *
-	 * Another case with ret == -EFAULT is more of a grey area
-	 * since it would seem like bad form for userspace to ask us
-	 * to overrun its buffer, but the user knows best:
-	 *
-	 *   http://yarchive.net/comp/linux/partial_reads_writes.html
-	 */
 	return offset ?: (ret ?: -EAGAIN);
 }
 
+/**
+ * i915_perf_read - handles read() FOP for i915 perf stream FDs
+ * @file: An i915 perf stream file
+ * @buf: destination buffer given by userspace
+ * @count: the number of bytes userspace wants to read
+ * @ppos: (inout) file seek position (unused)
+ *
+ * The entry point for handling a read() on a stream file descriptor from
+ * userspace. Most of the work is left to the i915_perf_read_locked() and
+ * stream->op->read() but to save having stream implementations (of which
+ * we might have multiple later) we handle blocking read here.
+ *
+ * We can also consistently treat trying to read from a disable stream
+ * as an IO error so implementations can assume the stream is enabled
+ * while reading.
+ *
+ * Returns: The number of bytes copied or a negative error code on failure.
+ */
 static ssize_t i915_perf_read(struct file *file,
 			      char __user *buf,
 			      size_t count,
@@ -1458,12 +1635,20 @@ i915_perf_open_ioctl_locked(struct drm_i915_private *dev_priv,
 	return ret;
 }
 
-/* Note we copy the properties from userspace outside of the i915 perf
- * mutex to avoid an awkward lockdep with mmap_sem.
+/**
+ * read_properties_unlocked - validate + copy userspace stream open properties
+ * @dev_priv: i915 device instance
+ * @uprops: The array of u64 key value pairs given by userspace
+ * @n_props: The number of key value pairs expected in @uprops
+ * @props: The stream configuration built up while validating properties
  *
  * Note this function only validates properties in isolation it doesn't
  * validate that the combination of properties makes sense or that all
  * properties necessary for a particular kind of stream have been set.
+ *
+ * Note that there currently aren't any ordering requirements for properties so
+ * we shouldn't validate or assume anything about ordering here. This doesn't
+ * rule out defining new properties with ordering requirements in the future.
  */
 static int read_properties_unlocked(struct drm_i915_private *dev_priv,
 				    u64 __user *uprops,
@@ -1585,6 +1770,26 @@ static int read_properties_unlocked(struct drm_i915_private *dev_priv,
 	return 0;
 }
 
+/**
+ * i915_perf_open_ioctl - DRM ioctl() for userspace to open a stream FD
+ * @dev: drm device
+ * @data: ioctl data copied from userspace (unvalidated)
+ * @file: drm file
+ *
+ * Validates the stream open parameters given by userspace including flags
+ * and an array of u64 key, value pair properties.
+ *
+ * Very little is assumed up front about the nature of the stream being
+ * opened (for instance we don't assume it's for periodic OA unit metrics). An
+ * i915-perf stream is expected to be a suitable interface for other forms of
+ * buffered data written by the GPU besides periodic OA metrics.
+ *
+ * Note we copy the properties from userspace outside of the i915 perf
+ * mutex to avoid an awkward lockdep with mmap_sem.
+ *
+ * Return: A newly opened i915 Perf stream file descriptor or negative
+ * error code on failure.
+ */
 int i915_perf_open_ioctl(struct drm_device *dev, void *data,
 			 struct drm_file *file)
 {
@@ -1621,6 +1826,14 @@ int i915_perf_open_ioctl(struct drm_device *dev, void *data,
 	return ret;
 }
 
+/**
+ * i915_perf_register - exposes i915-perf to userspace
+ * @dev_priv: i915 device instance
+ *
+ * In particular OA metric sets are advertised under a sysfs metrics/
+ * directory allowing userspace to enumerate valid IDs that can be
+ * used to open an i915-perf stream.
+ */
 void i915_perf_register(struct drm_i915_private *dev_priv)
 {
 	if (!IS_HASWELL(dev_priv))
@@ -1650,6 +1863,15 @@ void i915_perf_register(struct drm_i915_private *dev_priv)
 	mutex_unlock(&dev_priv->perf.lock);
 }
 
+/**
+ * i915_perf_register - hide i915-perf from userspace
+ * @dev_priv: i915 device instance
+ *
+ * i915-perf state cleanup is split up into an 'unregister' and
+ * 'deinit' phase where the interface is first hidden from
+ * userspace by i915_perf_unregister() before cleaning up
+ * remaining state in i915_perf_fini().
+ */
 void i915_perf_unregister(struct drm_i915_private *dev_priv)
 {
 	if (!IS_HASWELL(dev_priv))
@@ -1706,6 +1928,15 @@ static struct ctl_table dev_root[] = {
 	{}
 };
 
+/**
+ * i915_perf_init - initialize i915-perf state on module load
+ * @dev_priv: i915 device instance
+ *
+ * Initializes state i915-perf state without exposing anything to userspace.
+ *
+ * Note: i915-perf initialization is split into an 'init' and 'register'
+ * phase with the i915_perf_register() exposing state to userspace.
+ */
 void i915_perf_init(struct drm_i915_private *dev_priv)
 {
 	if (!IS_HASWELL(dev_priv))
@@ -1741,6 +1972,10 @@ void i915_perf_init(struct drm_i915_private *dev_priv)
 	dev_priv->perf.initialized = true;
 }
 
+/**
+ * i915_perf_fini - Counter part to i915_perf_init()
+ * @dev_priv: i915 device instance
+ */
 void i915_perf_fini(struct drm_i915_private *dev_priv)
 {
 	if (!dev_priv->perf.initialized)
-- 
2.10.2

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2016-12-12 10:00 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-12-05 11:20 [PATCH] drm/i915/perf: More documentation hooked to i915.rst Robert Bragg
2016-12-05 12:16 ` Matthew Auld
2016-12-07 21:40   ` Robert Bragg
2016-12-08 15:53     ` Daniel Vetter
2016-12-09 12:41       ` Robert Bragg
2016-12-12  7:34         ` Daniel Vetter
2016-12-12 10:00           ` Jani Nikula
2016-12-05 13:55 ` ✗ Fi.CI.BAT: failure for drm/i915/perf: More documentation hooked to i915.rst (rev2) Patchwork
2016-12-05 14:25   ` Chris Wilson
2016-12-07 22:53 ` ✓ Fi.CI.BAT: success for drm/i915/perf: More documentation hooked to i915.rst (rev3) Patchwork
  -- strict thread matches above, loose matches on Subject: below --
2016-11-29 17:00 [PATCH] drm/i915/perf: More documentation hooked to i915.rst Robert Bragg
2016-11-30 16:30 ` Matthew Auld
2016-11-30 19:41 ` Daniel Vetter
2016-12-01 12:12   ` Jani Nikula
2016-12-02 15:17     ` Robert Bragg
2016-12-02 15:05   ` Robert Bragg

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.