* [PATCH v3 1/4] drm/i915/uapi: hide kernel doc warnings
@ 2021-04-15 15:59 Matthew Auld
  2021-04-15 15:59 ` [PATCH v3 2/4] drm/i915/uapi: convert i915_user_extension to kernel doc Matthew Auld
                   ` (3 more replies)
  0 siblings, 4 replies; 27+ messages in thread
From: Matthew Auld @ 2021-04-15 15:59 UTC (permalink / raw)
  To: intel-gfx
  Cc: Jordan Justen, dri-devel, Kenneth Graunke, Jason Ekstrand,
	mesa-dev, Daniel Vetter

It's not properly formatted kernel-doc; just nerf the warnings for now.
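
For context: kernel-doc only treats comments opening with "/**" as
structured documentation, and then warns when the body doesn't follow
the expected layout; a plain "/*" opener is invisible to the tooling.
A minimal sketch of the difference (illustrative only, not part of
this patch):

	/* Plain comment; the kernel-doc tooling ignores this. */
	#define EXAMPLE_PLAIN_DEFINE 0

	/**
	 * struct example_args - An example kernel-doc comment
	 * @value: Members must follow the "@member: description"
	 *	layout, otherwise the build emits a warning.
	 */
	struct example_args {
		__u32 value;
	};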

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: Dave Airlie <airlied@gmail.com>
Cc: dri-devel@lists.freedesktop.org
Cc: mesa-dev@lists.freedesktop.org
---
 include/uapi/drm/i915_drm.h | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index ddc47bbf48b6..a50257cde9ff 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -1054,12 +1054,12 @@ struct drm_i915_gem_exec_fence {
 	__u32 flags;
 };
 
-/**
+/*
  * See drm_i915_gem_execbuffer_ext_timeline_fences.
  */
 #define DRM_I915_GEM_EXECBUFFER_EXT_TIMELINE_FENCES 0
 
-/**
+/*
  * This structure describes an array of drm_syncobj and associated points for
  * timeline variants of drm_syncobj. It is invalid to append this structure to
  * the execbuf if I915_EXEC_FENCE_ARRAY is set.
@@ -1700,7 +1700,7 @@ struct drm_i915_gem_context_param {
 	__u64 value;
 };
 
-/**
+/*
  * Context SSEU programming
  *
  * It may be necessary for either functional or performance reason to configure
@@ -2067,7 +2067,7 @@ struct drm_i915_perf_open_param {
 	__u64 properties_ptr;
 };
 
-/**
+/*
  * Enable data capture for a stream that was either opened in a disabled state
  * via I915_PERF_FLAG_DISABLED or was later disabled via
  * I915_PERF_IOCTL_DISABLE.
@@ -2081,7 +2081,7 @@ struct drm_i915_perf_open_param {
  */
 #define I915_PERF_IOCTL_ENABLE	_IO('i', 0x0)
 
-/**
+/*
  * Disable data capture for a stream.
  *
  * It is an error to try and read a stream that is disabled.
@@ -2090,7 +2090,7 @@ struct drm_i915_perf_open_param {
  */
 #define I915_PERF_IOCTL_DISABLE	_IO('i', 0x1)
 
-/**
+/*
  * Change metrics_set captured by a stream.
  *
  * If the stream is bound to a specific context, the configuration change
@@ -2103,7 +2103,7 @@ struct drm_i915_perf_open_param {
  */
 #define I915_PERF_IOCTL_CONFIG	_IO('i', 0x2)
 
-/**
+/*
  * Common to all i915 perf records
  */
 struct drm_i915_perf_record_header {
@@ -2151,7 +2151,7 @@ enum drm_i915_perf_record_type {
 	DRM_I915_PERF_RECORD_MAX /* non-ABI */
 };
 
-/**
+/*
  * Structure to upload perf dynamic configuration into the kernel.
  */
 struct drm_i915_perf_oa_config {
-- 
2.26.3


* [PATCH v3 2/4] drm/i915/uapi: convert i915_user_extension to kernel doc
  2021-04-15 15:59 [PATCH v3 1/4] drm/i915/uapi: hide kernel doc warnings Matthew Auld
@ 2021-04-15 15:59 ` Matthew Auld
  2021-04-16  8:46   ` Daniel Vetter
  2021-04-15 15:59 ` [PATCH v3 3/4] drm/i915/uapi: convert i915_query and friend " Matthew Auld
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 27+ messages in thread
From: Matthew Auld @ 2021-04-15 15:59 UTC (permalink / raw)
  To: intel-gfx
  Cc: Jordan Justen, dri-devel, Kenneth Graunke, Jason Ekstrand,
	mesa-dev, Daniel Vetter

Also add some example usage for the extension chaining, which is quite
nifty.
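
For context, the kernel side walks such a chain by chasing the
next_extension pointers across the __user boundary and dispatching on
each name. Roughly something like the following (a simplified sketch,
with error handling trimmed; dispatch_extension() is a stand-in for
the real per-ioctl handler):

	u64 next = args->extensions;

	while (next) {
		struct i915_user_extension ext;
		int err;

		if (copy_from_user(&ext, u64_to_user_ptr(next), sizeof(ext)))
			return -EFAULT;

		err = dispatch_extension(&ext); /* switch on ext.name */
		if (err)
			return err;

		next = ext.next_extension;
	}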

Suggested-by: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: Dave Airlie <airlied@gmail.com>
Cc: dri-devel@lists.freedesktop.org
Cc: mesa-dev@lists.freedesktop.org
---
 include/uapi/drm/i915_drm.h | 46 +++++++++++++++++++++++++++++++++----
 1 file changed, 42 insertions(+), 4 deletions(-)

diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index a50257cde9ff..d9c954a5a456 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -62,8 +62,8 @@ extern "C" {
 #define I915_ERROR_UEVENT		"ERROR"
 #define I915_RESET_UEVENT		"RESET"
 
-/*
- * i915_user_extension: Base class for defining a chain of extensions
+/**
+ * struct i915_user_extension - Base class for defining a chain of extensions
  *
  * Many interfaces need to grow over time. In most cases we can simply
  * extend the struct and have userspace pass in more data. Another option,
@@ -76,12 +76,50 @@ extern "C" {
  * increasing complexity, and for large parts of that interface to be
  * entirely optional. The downside is more pointer chasing; chasing across
  * the __user boundary with pointers encapsulated inside u64.
+ *
+ * Example chaining:
+ *
+ * .. code-block:: C
+ *
+ *	struct i915_user_extension ext3 = {
+ *		.next_extension = 0, // end
+ *		.name = ...,
+ *	};
+ *	struct i915_user_extension ext2 = {
+ *		.next_extension = (uintptr_t)&ext3,
+ *		.name = ...,
+ *	};
+ *	struct i915_user_extension ext1 = {
+ *		.next_extension = (uintptr_t)&ext2,
+ *		.name = ...,
+ *	};
+ *
+ * Typically the i915_user_extension would be embedded in some uAPI struct, and
+ * in this case we would feed it the head of the chain (i.e. ext1), which would
+ * then apply all of the above extensions.
+ *
  */
 struct i915_user_extension {
+	/**
+	 * @next_extension:
+	 *
+	 * Pointer to the next i915_user_extension, or zero if the end.
+	 */
 	__u64 next_extension;
+	/** @name: Name of the extension */
 	__u32 name;
-	__u32 flags; /* All undefined bits must be zero. */
-	__u32 rsvd[4]; /* Reserved for future use; must be zero. */
+	/**
+	 * @flags: MBZ
+	 *
+	 * All undefined bits must be zero.
+	 */
+	__u32 flags;
+	/**
+	 * @rsvd: MBZ
+	 *
+	 * Reserved for future use; must be zero.
+	 */
+	__u32 rsvd[4];
 };
 
 /*
-- 
2.26.3


* [PATCH v3 3/4] drm/i915/uapi: convert i915_query and friend to kernel doc
  2021-04-15 15:59 [PATCH v3 1/4] drm/i915/uapi: hide kernel doc warnings Matthew Auld
  2021-04-15 15:59 ` [PATCH v3 2/4] drm/i915/uapi: convert i915_user_extension to kernel doc Matthew Auld
@ 2021-04-15 15:59 ` Matthew Auld
  2021-04-15 22:25   ` [Mesa-dev] " Ian Romanick
                     ` (2 more replies)
  2021-04-15 15:59 ` [PATCH v3 4/4] drm/doc/rfc: i915 DG1 uAPI Matthew Auld
  2021-04-16  8:44 ` [PATCH v3 1/4] drm/i915/uapi: hide kernel doc warnings Daniel Vetter
  3 siblings, 3 replies; 27+ messages in thread
From: Matthew Auld @ 2021-04-15 15:59 UTC (permalink / raw)
  To: intel-gfx
  Cc: Jordan Justen, dri-devel, Kenneth Graunke, Jason Ekstrand,
	mesa-dev, Daniel Vetter

Add a note about the two-step process.
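
From the userspace side the two-step flow looks roughly like this
(sketch only; error handling is elided, and the engine-info query is
just an arbitrary example):

	struct drm_i915_query_item item = {
		.query_id = DRM_I915_QUERY_ENGINE_INFO,
		/* .length is zero, so step one only asks for the size */
	};
	struct drm_i915_query query = {
		.num_items = 1,
		.items_ptr = (uintptr_t)&item,
	};

	ioctl(fd, DRM_IOCTL_I915_QUERY, &query); /* fills in item.length */

	void *blob = malloc(item.length);
	item.data_ptr = (uintptr_t)blob;

	ioctl(fd, DRM_IOCTL_I915_QUERY, &query); /* fills in the blob */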

Suggested-by: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: Dave Airlie <airlied@gmail.com>
Cc: dri-devel@lists.freedesktop.org
Cc: mesa-dev@lists.freedesktop.org
---
 include/uapi/drm/i915_drm.h | 57 ++++++++++++++++++++++++++++++-------
 1 file changed, 46 insertions(+), 11 deletions(-)

diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index d9c954a5a456..ef36f1a0adde 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -2210,14 +2210,23 @@ struct drm_i915_perf_oa_config {
 	__u64 flex_regs_ptr;
 };
 
+/**
+ * struct drm_i915_query_item - An individual query for the kernel to process.
+ *
+ * The behaviour is determined by the @query_id. Note that exactly what
+ * @data_ptr points to also depends on the specific @query_id.
+ */
 struct drm_i915_query_item {
+	/** @query_id: The id for this query */
 	__u64 query_id;
 #define DRM_I915_QUERY_TOPOLOGY_INFO    1
 #define DRM_I915_QUERY_ENGINE_INFO	2
 #define DRM_I915_QUERY_PERF_CONFIG      3
 /* Must be kept compact -- no holes and well documented */
 
-	/*
+	/**
+	 * @length:
+	 *
 	 * When set to zero by userspace, this is filled with the size of the
 	 * data to be written at the data_ptr pointer. The kernel sets this
 	 * value to a negative value to signal an error on a particular query
@@ -2225,21 +2234,26 @@ struct drm_i915_query_item {
 	 */
 	__s32 length;
 
-	/*
+	/**
+	 * @flags:
+	 *
 	 * When query_id == DRM_I915_QUERY_TOPOLOGY_INFO, must be 0.
 	 *
 	 * When query_id == DRM_I915_QUERY_PERF_CONFIG, must be one of the
-	 * following :
-	 *         - DRM_I915_QUERY_PERF_CONFIG_LIST
-	 *         - DRM_I915_QUERY_PERF_CONFIG_DATA_FOR_UUID
-	 *         - DRM_I915_QUERY_PERF_CONFIG_FOR_UUID
+	 * following:
+	 *
+	 *	- DRM_I915_QUERY_PERF_CONFIG_LIST
+	 *	- DRM_I915_QUERY_PERF_CONFIG_DATA_FOR_UUID
+	 *	- DRM_I915_QUERY_PERF_CONFIG_DATA_FOR_ID
 	 */
 	__u32 flags;
 #define DRM_I915_QUERY_PERF_CONFIG_LIST          1
 #define DRM_I915_QUERY_PERF_CONFIG_DATA_FOR_UUID 2
 #define DRM_I915_QUERY_PERF_CONFIG_DATA_FOR_ID   3
 
-	/*
+	/**
+	 * @data_ptr:
+	 *
 	 * Data will be written at the location pointed by data_ptr when the
 	 * value of length matches the length of the data to be written by the
 	 * kernel.
@@ -2247,16 +2261,37 @@ struct drm_i915_query_item {
 	__u64 data_ptr;
 };
 
+/**
+ * struct drm_i915_query - Supply an array of drm_i915_query_item for the kernel
+ * to fill out.
+ *
+ * Note that this is generally a two-step process for each drm_i915_query_item
+ * in the array:
+ *
+ *	1.) Call the DRM_IOCTL_I915_QUERY, giving it our array of
+ *	drm_i915_query_item, with drm_i915_query_item.length set to zero. The
+ *	kernel will then fill in the size, in bytes, which tells userspace how
+ *	much memory it needs to allocate for the blob (say, for an array of
+ *	properties).
+ *
+ *	2.) Next we call DRM_IOCTL_I915_QUERY again, this time with the
+ *	drm_i915_query_item.data_ptr equal to our newly allocated blob. Note
+ *	that the drm_i915_query_item.length should still be the same as what the
+ *	kernel previously set. At this point the kernel can fill in the blob.
+ *
+ */
 struct drm_i915_query {
+	/** @num_items: The number of elements in the @items_ptr array */
 	__u32 num_items;
 
-	/*
-	 * Unused for now. Must be cleared to zero.
+	/**
+	 * @flags: Unused for now. Must be cleared to zero.
 	 */
 	__u32 flags;
 
-	/*
-	 * This points to an array of num_items drm_i915_query_item structures.
+	/**
+	 * @items_ptr: This points to an array of num_items drm_i915_query_item
+	 * structures.
 	 */
 	__u64 items_ptr;
 };
-- 
2.26.3


* [PATCH v3 4/4] drm/doc/rfc: i915 DG1 uAPI
  2021-04-15 15:59 [PATCH v3 1/4] drm/i915/uapi: hide kernel doc warnings Matthew Auld
  2021-04-15 15:59 ` [PATCH v3 2/4] drm/i915/uapi: convert i915_user_extension to kernel doc Matthew Auld
  2021-04-15 15:59 ` [PATCH v3 3/4] drm/i915/uapi: convert i915_query and friend " Matthew Auld
@ 2021-04-15 15:59 ` Matthew Auld
  2021-04-16  9:15   ` Daniel Vetter
  2021-04-16 16:38   ` Jason Ekstrand
  2021-04-16  8:44 ` [PATCH v3 1/4] drm/i915/uapi: hide kernel doc warnings Daniel Vetter
  3 siblings, 2 replies; 27+ messages in thread
From: Matthew Auld @ 2021-04-15 15:59 UTC (permalink / raw)
  To: intel-gfx
  Cc: Jordan Justen, dri-devel, Kenneth Graunke, Jason Ekstrand,
	mesa-dev, Daniel Vetter

Add an entry for the new uAPI needed for DG1.

v2(Daniel):
  - include the overall upstreaming plan
  - add a note for mmap, there are differences here for TTM vs i915
  - bunch of other suggestions from Daniel
v3:
 (Daniel)
  - add a note for set/get caching stuff
  - add some more docs for existing query and extensions stuff
  - add an actual code example for regions query
  - bunch of other stuff
 (Jason)
  - uAPI change(!):
	- try a simpler design with the placements extension
	- rather than have a generic setparam which can cover multiple
	  use cases, have each extension be responsible for one thing
	  only (see the sketch below)
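
As an illustration of the priority-order semantics, a buffer which
prefers device local-memory but can fall back to system memory would
presumably be created with something like the following (a sketch
against the RFC structs below):

	struct drm_i915_gem_memory_class_instance placements[] = {
		{ .memory_class = I915_MEMORY_CLASS_DEVICE, .memory_instance = 0 },
		{ .memory_class = I915_MEMORY_CLASS_SYSTEM, .memory_instance = 0 },
	};
	struct drm_i915_gem_create_ext_memory_regions regions = {
		.base = { .name = I915_GEM_CREATE_EXT_MEMORY_REGIONS },
		.regions = (uintptr_t)placements,
		.num_regions = 2, /* lmem preferred, smem as the fallback */
	};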

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: Dave Airlie <airlied@gmail.com>
Cc: dri-devel@lists.freedesktop.org
Cc: mesa-dev@lists.freedesktop.org
---
 Documentation/gpu/rfc/i915_gem_lmem.h   | 255 ++++++++++++++++++++++++
 Documentation/gpu/rfc/i915_gem_lmem.rst | 139 +++++++++++++
 Documentation/gpu/rfc/index.rst         |   4 +
 3 files changed, 398 insertions(+)
 create mode 100644 Documentation/gpu/rfc/i915_gem_lmem.h
 create mode 100644 Documentation/gpu/rfc/i915_gem_lmem.rst

diff --git a/Documentation/gpu/rfc/i915_gem_lmem.h b/Documentation/gpu/rfc/i915_gem_lmem.h
new file mode 100644
index 000000000000..2a82a452e9f2
--- /dev/null
+++ b/Documentation/gpu/rfc/i915_gem_lmem.h
@@ -0,0 +1,255 @@
+/*
+ * Note that drm_i915_query_item and drm_i915_query are existing bits of uAPI.
+ * For the regions query we are just adding a new query id, so no actual new
+ * ioctl or anything, but we include them here for reference.
+ */
+struct drm_i915_query_item {
+#define DRM_I915_QUERY_MEMORY_REGIONS   0xdeadbeaf
+	....
+        __u64 query_id;
+
+        /*
+         * When set to zero by userspace, this is filled with the size of the
+         * data to be written at the data_ptr pointer. The kernel sets this
+         * value to a negative value to signal an error on a particular query
+         * item.
+         */
+        __s32 length;
+
+        __u32 flags;
+        /*
+         * Data will be written at the location pointed by data_ptr when the
+         * value of length matches the length of the data to be written by the
+         * kernel.
+         */
+        __u64 data_ptr;
+};
+
+struct drm_i915_query {
+        __u32 num_items;
+        /*
+         * Unused for now. Must be cleared to zero.
+         */
+        __u32 flags;
+        /*
+         * This points to an array of num_items drm_i915_query_item structures.
+         */
+        __u64 items_ptr;
+};
+
+#define DRM_IOCTL_I915_QUERY	DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_QUERY, struct drm_i915_query)
+
+/**
+ * enum drm_i915_gem_memory_class - Supported memory classes
+ */
+enum drm_i915_gem_memory_class {
+	/** @I915_MEMORY_CLASS_SYSTEM: system memory */
+	I915_MEMORY_CLASS_SYSTEM = 0,
+	/** @I915_MEMORY_CLASS_DEVICE: device local-memory */
+	I915_MEMORY_CLASS_DEVICE,
+};
+
+/**
+ * struct drm_i915_gem_memory_class_instance - Identify particular memory region
+ */
+struct drm_i915_gem_memory_class_instance {
+	/** @memory_class: see enum drm_i915_gem_memory_class */
+	__u16 memory_class;
+
+	/** @memory_instance: which instance */
+	__u16 memory_instance;
+};
+
+/**
+ * struct drm_i915_memory_region_info
+ *
+ * Describes one region as known to the driver.
+ *
+ * Note that we reserve quite a lot of stuff here for potential future work. As
+ * an example we might want to expose the capabilities (see @caps) for a given
+ * region, which could include things like whether the region is CPU
+ * mappable/accessible, etc.
+ */
+struct drm_i915_memory_region_info {
+	/** @region: class:instance pair encoding */
+	struct drm_i915_gem_memory_class_instance region;
+
+	/** @rsvd0: MBZ */
+	__u32 rsvd0;
+
+	/** @caps: MBZ */
+	__u64 caps;
+
+	/** @flags: MBZ */
+	__u64 flags;
+
+	/** @probed_size: Memory probed by the driver (-1 = unknown) */
+	__u64 probed_size;
+
+	/** @unallocated_size: Estimate of memory remaining (-1 = unknown) */
+	__u64 unallocated_size;
+
+	/** @rsvd1: MBZ */
+	__u64 rsvd1[8];
+};
+
+/**
+ * struct drm_i915_query_memory_regions
+ *
+ * Region info query enumerates all regions known to the driver by filling in
+ * an array of struct drm_i915_memory_region_info structures.
+ *
+ * Example for getting the list of supported regions:
+ *
+ * .. code-block:: C
+ *
+ *	struct drm_i915_query_memory_regions *info;
+ *	struct drm_i915_query_item item = {
+ *		.query_id = DRM_I915_QUERY_MEMORY_REGIONS,
+ *	};
+ *	struct drm_i915_query query = {
+ *		.num_items = 1,
+ *		.items_ptr = (uintptr_t)&item,
+ *	};
+ *	int err, i;
+ *
+ *	// First query the size of the blob we need; this needs to be large
+ *	// enough to hold our array of regions. The kernel will fill out the
+ *	// item.length for us, which is the number of bytes we need.
+ *	err = ioctl(fd, DRM_IOCTL_I915_QUERY, &query);
+ *	if (err) ...
+ *
+ *	info = calloc(1, item.length);
+ *	// Now that we allocated the required number of bytes, we call the ioctl
+ *	// again, this time with the data_ptr pointing to our newly allocated
+ *	// blob, which the kernel can then populate with all the region info.
+ *	item.data_ptr = (uintptr_t)info;
+ *
+ *	err = ioctl(fd, DRM_IOCTL_I915_QUERY, &query);
+ *	if (err) ...
+ *
+ *	// We can now access each region in the array
+ *	for (i = 0; i < info->num_regions; i++) {
+ *		struct drm_i915_memory_region_info mr = info->regions[i];
+ *		__u16 class = mr.region.memory_class;
+ *		__u16 instance = mr.region.memory_instance;
+ *
+ *		....
+ *	}
+ *
+ *	free(info);
+ */
+struct drm_i915_query_memory_regions {
+	/** @num_regions: Number of supported regions */
+	__u32 num_regions;
+
+	/** @rsvd: MBZ */
+	__u32 rsvd[3];
+
+	/** @regions: Info about each supported region */
+	struct drm_i915_memory_region_info regions[];
+};
+
+#define DRM_I915_GEM_CREATE_EXT		0xdeadbeaf
+#define DRM_IOCTL_I915_GEM_CREATE_EXT	DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_CREATE_EXT, struct drm_i915_gem_create_ext)
+
+/**
+ * struct drm_i915_gem_create_ext
+ *
+ * Existing gem_create behaviour, with added extension support.
+ *
+ * Note that in the future we want to have our buffer flags here, at least for
+ * the stuff that is immutable. Previously we would have two ioctls, one to
+ * create the object with gem_create, and another to apply various parameters,
+ * however this creates some ambiguity for the params which are considered
+ * immutable. Also in general we're phasing out the various SET/GET ioctls.
+ */
+struct drm_i915_gem_create_ext {
+	/**
+	 * @size: Requested size for the object.
+	 *
+	 * The (page-aligned) allocated size for the object will be returned.
+	 *
+	 * Note that for some devices we might have further minimum
+	 * page-size restrictions (larger than 4K), like for device local-memory.
+	 * However in general the final size here should always reflect any
+	 * rounding up, if for example using the I915_GEM_CREATE_EXT_MEMORY_REGIONS
+	 * extension to place the object in device local-memory.
+	 */
+	__u64 size;
+	/**
+	 * @handle: Returned handle for the object.
+	 *
+	 * Object handles are nonzero.
+	 */
+	__u32 handle;
+	/** @flags: MBZ */
+	__u32 flags;
+	/**
+	 * @extensions: The chain of extensions to apply to this object.
+	 *
+	 * This will be useful in the future when we need to support several
+	 * different extensions, and we need to apply more than one when
+	 * creating the object. See struct i915_user_extension.
+	 *
+	 * If we don't supply any extensions then we get the same old gem_create
+	 * behaviour.
+	 *
+	 * For I915_GEM_CREATE_EXT_MEMORY_REGIONS usage see
+	 * struct drm_i915_gem_create_ext_memory_regions.
+	 */
+#define I915_GEM_CREATE_EXT_MEMORY_REGIONS 0
+	__u64 extensions;
+};
+
+/**
+ * struct drm_i915_gem_create_ext_memory_regions
+ *
+ * I915_GEM_CREATE_EXT_MEMORY_REGIONS extension:
+ *
+ * Set the object with the desired set of placements/regions in priority
+ * order (each entry must be unique and supported by the device), as an array of
+ * drm_i915_gem_memory_class_instance, or an equivalent layout of class:instance
+ * pair encodings. See DRM_I915_QUERY_MEMORY_REGIONS for how to query the
+ * supported regions for a device.
+ *
+ * As an example, on discrete devices, if we wish to set the placement as
+ * device local-memory we can do something like:
+ *
+ * .. code-block:: C
+ *
+ *	struct drm_i915_gem_memory_class_instance region_lmem = {
+ *		.memory_class = I915_MEMORY_CLASS_DEVICE,
+ *		.memory_instance = 0,
+ *	};
+ *	struct drm_i915_gem_create_ext_memory_regions regions = {
+ *		.base = { .name = I915_GEM_CREATE_EXT_MEMORY_REGIONS },
+ *		.regions = (uintptr_t)&region_lmem,
+ *		.num_regions = 1,
+ *	};
+ *	struct drm_i915_gem_create_ext create_ext = {
+ *		.size = 16 * PAGE_SIZE,
+ *		.extensions = (uintptr_t)&regions,
+ *	};
+ *
+ *	int err = ioctl(fd, DRM_IOCTL_I915_GEM_CREATE_EXT, &create_ext);
+ *	if (err) ...
+ *
+ * At which point we get the object handle in create_ext.handle, if all went
+ * well.
+ */
+struct drm_i915_gem_create_ext_memory_regions {
+	/** @base: Extension link. See struct i915_user_extension. */
+	struct i915_user_extension base;
+
+	/** @pad: MBZ */
+	__u32 pad;
+	/** @num_regions: Number of elements in the placements array. */
+	__u32 num_regions;
+	/**
+	 * @regions: The placements array.
+	 *
+	 * Should be an array of drm_i915_gem_memory_class_instance.
+	 */
+	__u64 regions;
+};
diff --git a/Documentation/gpu/rfc/i915_gem_lmem.rst b/Documentation/gpu/rfc/i915_gem_lmem.rst
new file mode 100644
index 000000000000..52f1db15ae94
--- /dev/null
+++ b/Documentation/gpu/rfc/i915_gem_lmem.rst
@@ -0,0 +1,139 @@
+=========================
+I915 DG1/LMEM RFC Section
+=========================
+
+Upstream plan
+=============
+For upstream the overall plan for landing all the DG1 stuff and turning it on
+for real, with all the uAPI bits, is:
+
+* Merge basic HW enabling of DG1 (still without pciid)
+* Merge the uAPI bits behind a special CONFIG_BROKEN (or so) flag
+        * At this point we can still make changes, but importantly this lets us
+          start running IGTs which can utilize local-memory in CI
+* Convert over to TTM, make sure it all keeps working
+* Add pciid for DG1 and turn on uAPI for real
+
+New object placement and region query uAPI
+==========================================
+Starting from DG1 we need to give userspace the ability to allocate buffers from
+device local-memory. Currently the driver supports gem_create, which can place
+buffers in system memory via shmem, and the usual assortment of other
+interfaces, like dumb buffers and userptr.
+
+To support this new capability, while also providing a uAPI which will work
+beyond just DG1, we propose to offer three new bits of uAPI:
+
+Query uAPI
+----------
+Existing query interface
+^^^^^^^^^^^^^^^^^^^^^^^^
+.. kernel-doc:: include/uapi/drm/i915_drm.h
+        :functions: drm_i915_query_item drm_i915_query
+
+DRM_I915_QUERY_MEMORY_REGIONS
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+New query ID which allows userspace to discover the list of supported memory
+regions(like system-memory and local-memory) for a given device. We identify
+each region with a class and instance pair, which should be unique. The class
+here would be DEVICE or SYSTEM, and the instance would be zero, on platforms
+like DG1.
+
+Side note: The class/instance design is borrowed from our existing engine uAPI,
+where we describe every physical engine in terms of its class, and the
+particular instance, since we can have more than one per class.
+
+In the future we also want to expose more information which can further
+describe the capabilities of a region.
+
+.. kernel-doc:: Documentation/gpu/rfc/i915_gem_lmem.h
+        :functions: drm_i915_gem_memory_class drm_i915_gem_memory_class_instance drm_i915_memory_region_info drm_i915_query_memory_regions
+
+GEM_CREATE_EXT
+--------------
+New ioctl which is basically just gem_create but now allows userspace to
+provide a chain of possible extensions. Note that if we don't provide any
+extensions then we get the exact same behaviour as gem_create.
+
+Side note: We also need to support PXP[1] in the near future, which is also
+applicable to integrated platforms, and adds its own gem_create_ext extension,
+which basically lets userspace mark a buffer as "protected".
+
+.. kernel-doc:: Documentation/gpu/rfc/i915_gem_lmem.h
+        :functions: drm_i915_gem_create_ext
+
+It's raining extensions
+^^^^^^^^^^^^^^^^^^^^^^^
+As noted above, extensions can be supplied as a chain in gem_create_ext using the
+existing i915_user_extension. This will be useful in the future when we need to
+support several different extensions, and we need to apply more than one when
+creating the object.
+
+.. kernel-doc:: include/uapi/drm/i915_drm.h
+        :functions: i915_user_extension
+
+I915_GEM_CREATE_EXT_MEMORY_REGIONS
+----------------------------------
+Implemented as an extension for gem_create_ext, we would now allow userspace to
+optionally provide an immutable list of preferred placements at creation time,
+in priority order, for a given buffer object.  For the placements we expect
+them each to use the class/instance encoding, as per the output of the regions
+query. Having the list in priority order will be useful in the future when
+placing an object, say during eviction.
+
+.. kernel-doc:: Documentation/gpu/rfc/i915_gem_lmem.h
+        :functions: drm_i915_gem_create_ext_memory_regions
+
+One fair criticism here is that this seems a little over-engineered[2]. If we
+just consider DG1 then yes, a simple gem_create.flags or something is totally
+all that's needed to tell the kernel to allocate the buffer in local-memory or
+whatever. However looking to the future we need uAPI which can also support
+upcoming Xe HP multi-tile architecture in a sane way, where there can be
+multiple local-memory instances for a given device, and so using both class and
+instance in our uAPI to describe regions is desirable, although specifically
+for DG1 it's uninteresting, since we only have a single local-memory instance.
+
+Existing uAPI issues
+====================
+Some potential issues we still need to resolve.
+
+I915 MMAP
+---------
+In i915 there are multiple ways to mmap a GEM object, including mapping the same
+object using different mapping types (WC vs WB), i.e. multiple active mmaps per
+object. TTM expects one mmap at most for the lifetime of the object. If it
+turns out that we have to backpedal here, there might be some potential
+userspace fallout.
+
+I915 SET/GET_CACHING
+--------------------
+In i915 we have the set/get_caching ioctls. TTM doesn't let us change this, but
+DG1 doesn't support non-snooped pcie transactions, so we can just always
+allocate as WB for smem-only buffers.  If/when our hw gains support for
+non-snooped pcie transactions then we must fix this mode at allocation time as
+a new GEM extension.
+
+This is related to the mmap problem, because in general (meaning, when we're
+not running on intel cpus) the cpu mmap must not, ever, be inconsistent with
+allocation mode.
+
+One possible idea is to let the kernel pick the mmap mode for userspace from the
+following table:
+
+smem-only: WB. Userspace does not need to call clflush.
+
+smem+lmem: We allocate uncached memory, and give userspace a WC mapping both
+for when the buffer is in smem and for when it's in lmem. GPU does snooped
+access, which is a bit inefficient.
+
+lmem only: always WC
+
+This means on discrete you only get a single mmap mode, all others must be
+rejected. That's probably going to be a new default mode or something like
+that.
+
+Links
+=====
+[1] https://patchwork.freedesktop.org/series/86798/
+
+[2] https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5599#note_553791
diff --git a/Documentation/gpu/rfc/index.rst b/Documentation/gpu/rfc/index.rst
index a8621f7dab8b..05670442ca1b 100644
--- a/Documentation/gpu/rfc/index.rst
+++ b/Documentation/gpu/rfc/index.rst
@@ -15,3 +15,7 @@ host such documentation:
 
 * Once the code has landed move all the documentation to the right places in
   the main core, helper or driver sections.
+
+.. toctree::
+
+    i915_gem_lmem.rst
-- 
2.26.3


* Re: [Mesa-dev] [PATCH v3 3/4] drm/i915/uapi: convert i915_query and friend to kernel doc
  2021-04-15 15:59 ` [PATCH v3 3/4] drm/i915/uapi: convert i915_query and friend " Matthew Auld
@ 2021-04-15 22:25   ` Ian Romanick
  2021-04-16  8:59     ` Daniel Vetter
  2021-04-16  7:57   ` [Intel-gfx] " Tvrtko Ursulin
  2021-04-16  8:49   ` Daniel Vetter
  2 siblings, 1 reply; 27+ messages in thread
From: Ian Romanick @ 2021-04-15 22:25 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx
  Cc: mesa-dev, Kenneth Graunke, dri-devel, Daniel Vetter

On 4/15/21 8:59 AM, Matthew Auld wrote:
> Add a note about the two-step process.
> 
> Suggested-by: Daniel Vetter <daniel@ffwll.ch>
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Jordan Justen <jordan.l.justen@intel.com>
> Cc: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Kenneth Graunke <kenneth@whitecape.org>
> Cc: Jason Ekstrand <jason@jlekstrand.net>
> Cc: Dave Airlie <airlied@gmail.com>
> Cc: dri-devel@lists.freedesktop.org
> Cc: mesa-dev@lists.freedesktop.org
> ---
>  include/uapi/drm/i915_drm.h | 57 ++++++++++++++++++++++++++++++-------
>  1 file changed, 46 insertions(+), 11 deletions(-)
> 
> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> index d9c954a5a456..ef36f1a0adde 100644
> --- a/include/uapi/drm/i915_drm.h
> +++ b/include/uapi/drm/i915_drm.h
> @@ -2210,14 +2210,23 @@ struct drm_i915_perf_oa_config {
>  	__u64 flex_regs_ptr;
>  };
>  
> +/**
> + * struct drm_i915_query_item - An individual query for the kernel to process.
> + *
> + * The behaviour is determined by the @query_id. Note that exactly what

Since we just had a big discussion about this on mesa-dev w.r.t. Mesa
code and documentation... does the kernel have a policy about which
flavor (pun intended) of English should be used?

> + * @data_ptr points to also depends on the specific @query_id.
> + */
>  struct drm_i915_query_item {
> +	/** @query_id: The id for this query */
>  	__u64 query_id;
>  #define DRM_I915_QUERY_TOPOLOGY_INFO    1
>  #define DRM_I915_QUERY_ENGINE_INFO	2
>  #define DRM_I915_QUERY_PERF_CONFIG      3
>  /* Must be kept compact -- no holes and well documented */
>  
> -	/*
> +	/**
> +	 * @length:
> +	 *
>  	 * When set to zero by userspace, this is filled with the size of the
>  	 * data to be written at the data_ptr pointer. The kernel sets this
>  	 * value to a negative value to signal an error on a particular query
> @@ -2225,21 +2234,26 @@ struct drm_i915_query_item {
>  	 */
>  	__s32 length;
>  
> -	/*
> +	/**
> +	 * @flags:
> +	 *
>  	 * When query_id == DRM_I915_QUERY_TOPOLOGY_INFO, must be 0.
>  	 *
>  	 * When query_id == DRM_I915_QUERY_PERF_CONFIG, must be one of the
> -	 * following :
> -	 *         - DRM_I915_QUERY_PERF_CONFIG_LIST
> -	 *         - DRM_I915_QUERY_PERF_CONFIG_DATA_FOR_UUID
> -	 *         - DRM_I915_QUERY_PERF_CONFIG_FOR_UUID
> +	 * following:
> +	 *
> +	 *	- DRM_I915_QUERY_PERF_CONFIG_LIST
> +	 *      - DRM_I915_QUERY_PERF_CONFIG_DATA_FOR_UUID
> +	 *      - DRM_I915_QUERY_PERF_CONFIG_DATA_FOR_ID
>  	 */
>  	__u32 flags;
>  #define DRM_I915_QUERY_PERF_CONFIG_LIST          1
>  #define DRM_I915_QUERY_PERF_CONFIG_DATA_FOR_UUID 2
>  #define DRM_I915_QUERY_PERF_CONFIG_DATA_FOR_ID   3
>  
> -	/*
> +	/**
> +	 * @data_ptr:
> +	 *
>  	 * Data will be written at the location pointed by data_ptr when the
>  	 * value of length matches the length of the data to be written by the
>  	 * kernel.
> @@ -2247,16 +2261,37 @@ struct drm_i915_query_item {
>  	__u64 data_ptr;
>  };
>  
> +/**
> + * struct drm_i915_query - Supply an array of drm_i915_query_item for the kernel
> + * to fill out.
> + *
> + * Note that this is generally a two step process for each drm_i915_query_item
> + * in the array:
> + *
> + *	1.) Call the DRM_IOCTL_I915_QUERY, giving it our array of
> + *	drm_i915_query_item, with drm_i915_query_item.length set to zero. The
> + *	kernel will then fill in the size, in bytes, which tells userspace how
> + *	much memory it needs to allocate for the blob (say, for an array of
> + *	properties).
> + *
> + *	2.) Next we call DRM_IOCTL_I915_QUERY again, this time with the
> + *	drm_i915_query_item.data_ptr equal to our newly allocated blob. Note
> + *	that the drm_i915_query_item.length should still be the same as what the
> + *	kernel previously set. At this point the kernel can fill in the blob.
> + *
> + */
>  struct drm_i915_query {
> +	/** @num_items: The number of elements in the @items_ptr array */
>  	__u32 num_items;
>  
> -	/*
> -	 * Unused for now. Must be cleared to zero.
> +	/**
> +	 * @flags: Unused for now. Must be cleared to zero.
>  	 */
>  	__u32 flags;
>  
> -	/*
> -	 * This points to an array of num_items drm_i915_query_item structures.
> +	/**
> +	 * @items_ptr: This points to an array of num_items drm_i915_query_item
> +	 * structures.
>  	 */
>  	__u64 items_ptr;
>  };
> 


* Re: [Intel-gfx] [PATCH v3 3/4] drm/i915/uapi: convert i915_query and friend to kernel doc
  2021-04-15 15:59 ` [PATCH v3 3/4] drm/i915/uapi: convert i915_query and friend " Matthew Auld
  2021-04-15 22:25   ` [Mesa-dev] " Ian Romanick
@ 2021-04-16  7:57   ` Tvrtko Ursulin
  2021-04-16  8:49   ` Daniel Vetter
  2 siblings, 0 replies; 27+ messages in thread
From: Tvrtko Ursulin @ 2021-04-16  7:57 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx
  Cc: mesa-dev, Kenneth Graunke, dri-devel, Daniel Vetter


On 15/04/2021 16:59, Matthew Auld wrote:
> Add a note about the two-step process.
> 
> Suggested-by: Daniel Vetter <daniel@ffwll.ch>
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Jordan Justen <jordan.l.justen@intel.com>
> Cc: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Kenneth Graunke <kenneth@whitecape.org>
> Cc: Jason Ekstrand <jason@jlekstrand.net>
> Cc: Dave Airlie <airlied@gmail.com>
> Cc: dri-devel@lists.freedesktop.org
> Cc: mesa-dev@lists.freedesktop.org
> ---
>   include/uapi/drm/i915_drm.h | 57 ++++++++++++++++++++++++++++++-------
>   1 file changed, 46 insertions(+), 11 deletions(-)
> 
> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> index d9c954a5a456..ef36f1a0adde 100644
> --- a/include/uapi/drm/i915_drm.h
> +++ b/include/uapi/drm/i915_drm.h
> @@ -2210,14 +2210,23 @@ struct drm_i915_perf_oa_config {
>   	__u64 flex_regs_ptr;
>   };
>   
> +/**
> + * struct drm_i915_query_item - An individual query for the kernel to process.
> + *
> + * The behaviour is determined by the @query_id. Note that exactly what
> + * @data_ptr points to also depends on the specific @query_id.
> + */
>   struct drm_i915_query_item {
> +	/** @query_id: The id for this query */
>   	__u64 query_id;
>   #define DRM_I915_QUERY_TOPOLOGY_INFO    1
>   #define DRM_I915_QUERY_ENGINE_INFO	2
>   #define DRM_I915_QUERY_PERF_CONFIG      3
>   /* Must be kept compact -- no holes and well documented */
>   
> -	/*
> +	/**
> +	 * @length:
> +	 *
>   	 * When set to zero by userspace, this is filled with the size of the
>   	 * data to be written at the data_ptr pointer. The kernel sets this
>   	 * value to a negative value to signal an error on a particular query
> @@ -2225,21 +2234,26 @@ struct drm_i915_query_item {
>   	 */
>   	__s32 length;
>   
> -	/*
> +	/**
> +	 * @flags:
> +	 *
>   	 * When query_id == DRM_I915_QUERY_TOPOLOGY_INFO, must be 0.
>   	 *
>   	 * When query_id == DRM_I915_QUERY_PERF_CONFIG, must be one of the
> -	 * following :
> -	 *         - DRM_I915_QUERY_PERF_CONFIG_LIST
> -	 *         - DRM_I915_QUERY_PERF_CONFIG_DATA_FOR_UUID
> -	 *         - DRM_I915_QUERY_PERF_CONFIG_FOR_UUID
> +	 * following:
> +	 *
> +	 *	- DRM_I915_QUERY_PERF_CONFIG_LIST
> +	 *      - DRM_I915_QUERY_PERF_CONFIG_DATA_FOR_UUID
> +	 *      - DRM_I915_QUERY_PERF_CONFIG_DATA_FOR_ID
>   	 */
>   	__u32 flags;
>   #define DRM_I915_QUERY_PERF_CONFIG_LIST          1
>   #define DRM_I915_QUERY_PERF_CONFIG_DATA_FOR_UUID 2
>   #define DRM_I915_QUERY_PERF_CONFIG_DATA_FOR_ID   3
>   
> -	/*
> +	/**
> +	 * @data_ptr:
> +	 *
>   	 * Data will be written at the location pointed by data_ptr when the
>   	 * value of length matches the length of the data to be written by the
>   	 * kernel.
> @@ -2247,16 +2261,37 @@ struct drm_i915_query_item {
>   	__u64 data_ptr;
>   };
>   
> +/**
> + * struct drm_i915_query - Supply an array of drm_i915_query_item for the kernel
> + * to fill out.
> + *
> + * Note that this is generally a two step process for each drm_i915_query_item
> + * in the array:
> + *
> + *	1.) Call the DRM_IOCTL_I915_QUERY, giving it our array of
> + *	drm_i915_query_item, with drm_i915_query_item.length set to zero. The
> + *	kernel will then fill in the size, in bytes, which tells userspace how
> + *	much memory it needs to allocate for the blob (say, for an array of
> + *	properties).
> + *
> + *	2.) Next we call DRM_IOCTL_I915_QUERY again, this time with the
> + *	drm_i915_query_item.data_ptr equal to our newly allocated blob. Note
> + *	that the drm_i915_query_item.length should still be the same as what the
> + *	kernel previously set. At this point the kernel can fill in the blob.

I'd also document the one-step alternative where userspace passes in a
buffer equal to or larger than the required size. For many small queries
(most?) this actually works just as well, with one ioctl only.
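
Something like the below, I guess (a sketch, assuming the kernel
accepts any buffer at least as large as the required size and then
writes back the actual length):

	char buf[4096]; /* hopefully big enough for this query */
	struct drm_i915_query_item item = {
		.query_id = DRM_I915_QUERY_TOPOLOGY_INFO,
		.length = sizeof(buf),
		.data_ptr = (uintptr_t)buf,
	};
	struct drm_i915_query query = {
		.num_items = 1,
		.items_ptr = (uintptr_t)&item,
	};

	ioctl(fd, DRM_IOCTL_I915_QUERY, &query); /* one ioctl, done */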

Regards,

Tvrtko

> + *
> + */
>   struct drm_i915_query {
> +	/** @num_items: The number of elements in the @items_ptr array */
>   	__u32 num_items;
>   
> -	/*
> -	 * Unused for now. Must be cleared to zero.
> +	/**
> +	 * @flags: Unused for now. Must be cleared to zero.
>   	 */
>   	__u32 flags;
>   
> -	/*
> -	 * This points to an array of num_items drm_i915_query_item structures.
> +	/**
> +	 * @items_ptr: This points to an array of num_items drm_i915_query_item
> +	 * structures.
>   	 */
>   	__u64 items_ptr;
>   };
> 

* Re: [PATCH v3 1/4] drm/i915/uapi: hide kernel doc warnings
  2021-04-15 15:59 [PATCH v3 1/4] drm/i915/uapi: hide kernel doc warnings Matthew Auld
                   ` (2 preceding siblings ...)
  2021-04-15 15:59 ` [PATCH v3 4/4] drm/doc/rfc: i915 DG1 uAPI Matthew Auld
@ 2021-04-16  8:44 ` Daniel Vetter
  2021-04-16  8:54   ` Daniel Vetter
  3 siblings, 1 reply; 27+ messages in thread
From: Daniel Vetter @ 2021-04-16  8:44 UTC (permalink / raw)
  To: Matthew Auld
  Cc: Jordan Justen, intel-gfx, dri-devel, Kenneth Graunke,
	Jason Ekstrand, Daniel Vetter, mesa-dev

On Thu, Apr 15, 2021 at 04:59:55PM +0100, Matthew Auld wrote:
> It's not properly formatted kernel doc, just nerf the warnings for now.
> 
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Jordan Justen <jordan.l.justen@intel.com>
> Cc: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Kenneth Graunke <kenneth@whitecape.org>
> Cc: Jason Ekstrand <jason@jlekstrand.net>
> Cc: Dave Airlie <airlied@gmail.com>
> Cc: dri-devel@lists.freedesktop.org
> Cc: mesa-dev@lists.freedesktop.org

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

> ---
>  include/uapi/drm/i915_drm.h | 16 ++++++++--------
>  1 file changed, 8 insertions(+), 8 deletions(-)
> 
> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> index ddc47bbf48b6..a50257cde9ff 100644
> --- a/include/uapi/drm/i915_drm.h
> +++ b/include/uapi/drm/i915_drm.h
> @@ -1054,12 +1054,12 @@ struct drm_i915_gem_exec_fence {
>  	__u32 flags;
>  };
>  
> -/**
> +/*
>   * See drm_i915_gem_execbuffer_ext_timeline_fences.
>   */
>  #define DRM_I915_GEM_EXECBUFFER_EXT_TIMELINE_FENCES 0
>  
> -/**
> +/*
>   * This structure describes an array of drm_syncobj and associated points for
>   * timeline variants of drm_syncobj. It is invalid to append this structure to
>   * the execbuf if I915_EXEC_FENCE_ARRAY is set.
> @@ -1700,7 +1700,7 @@ struct drm_i915_gem_context_param {
>  	__u64 value;
>  };
>  
> -/**
> +/*
>   * Context SSEU programming
>   *
>   * It may be necessary for either functional or performance reason to configure
> @@ -2067,7 +2067,7 @@ struct drm_i915_perf_open_param {
>  	__u64 properties_ptr;
>  };
>  
> -/**
> +/*
>   * Enable data capture for a stream that was either opened in a disabled state
>   * via I915_PERF_FLAG_DISABLED or was later disabled via
>   * I915_PERF_IOCTL_DISABLE.
> @@ -2081,7 +2081,7 @@ struct drm_i915_perf_open_param {
>   */
>  #define I915_PERF_IOCTL_ENABLE	_IO('i', 0x0)
>  
> -/**
> +/*
>   * Disable data capture for a stream.
>   *
>   * It is an error to try and read a stream that is disabled.
> @@ -2090,7 +2090,7 @@ struct drm_i915_perf_open_param {
>   */
>  #define I915_PERF_IOCTL_DISABLE	_IO('i', 0x1)
>  
> -/**
> +/*
>   * Change metrics_set captured by a stream.
>   *
>   * If the stream is bound to a specific context, the configuration change
> @@ -2103,7 +2103,7 @@ struct drm_i915_perf_open_param {
>   */
>  #define I915_PERF_IOCTL_CONFIG	_IO('i', 0x2)
>  
> -/**
> +/*
>   * Common to all i915 perf records
>   */
>  struct drm_i915_perf_record_header {
> @@ -2151,7 +2151,7 @@ enum drm_i915_perf_record_type {
>  	DRM_I915_PERF_RECORD_MAX /* non-ABI */
>  };
>  
> -/**
> +/*
>   * Structure to upload perf dynamic configuration into the kernel.
>   */
>  struct drm_i915_perf_oa_config {
> -- 
> 2.26.3
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

* Re: [PATCH v3 2/4] drm/i915/uapi: convert i915_user_extension to kernel doc
  2021-04-15 15:59 ` [PATCH v3 2/4] drm/i915/uapi: convert i915_user_extension to kernel doc Matthew Auld
@ 2021-04-16  8:46   ` Daniel Vetter
  0 siblings, 0 replies; 27+ messages in thread
From: Daniel Vetter @ 2021-04-16  8:46 UTC (permalink / raw)
  To: Matthew Auld
  Cc: Jason Ekstrand, Jordan Justen, intel-gfx, dri-devel,
	Kenneth Graunke, mesa-dev, Daniel Vetter

On Thu, Apr 15, 2021 at 04:59:56PM +0100, Matthew Auld wrote:
> Add some example usage for the extension chaining also, which is quite
> nifty.
> 
> Suggested-by: Daniel Vetter <daniel@ffwll.ch>
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Jordan Justen <jordan.l.justen@intel.com>
> Cc: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Kenneth Graunke <kenneth@whitecape.org>
> Cc: Jason Ekstrand <jason@jlekstrand.net>
> Cc: Dave Airlie <airlied@gmail.com>
> Cc: dri-devel@lists.freedesktop.org
> Cc: mesa-dev@lists.freedesktop.org
> ---
>  include/uapi/drm/i915_drm.h | 46 +++++++++++++++++++++++++++++++++----
>  1 file changed, 42 insertions(+), 4 deletions(-)
> 
> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> index a50257cde9ff..d9c954a5a456 100644
> --- a/include/uapi/drm/i915_drm.h
> +++ b/include/uapi/drm/i915_drm.h
> @@ -62,8 +62,8 @@ extern "C" {
>  #define I915_ERROR_UEVENT		"ERROR"
>  #define I915_RESET_UEVENT		"RESET"
>  
> -/*
> - * i915_user_extension: Base class for defining a chain of extensions
> +/**
> + * struct i915_user_extension - Base class for defining a chain of extensions
>   *
>   * Many interfaces need to grow over time. In most cases we can simply
>   * extend the struct and have userspace pass in more data. Another option,
> @@ -76,12 +76,50 @@ extern "C" {
>   * increasing complexity, and for large parts of that interface to be
>   * entirely optional. The downside is more pointer chasing; chasing across
>   * the __user boundary with pointers encapsulated inside u64.
> + *
> + * Example chaining:
> + *
> + * .. code-block:: C
> + *
> + *	struct i915_user_extension ext3 = {
> + *		.next_extension = 0, // end
> + *		.name = ...,
> + *	};
> + *	struct i915_user_extension ext2 = {
> + *		.next_extension = (uintptr_t)&ext3,
> + *		.name = ...,
> + *	};
> + *	struct i915_user_extension ext1 = {
> + *		.next_extension = (uintptr_t)&ext2,
> + *		.name = ...,
> + *	};
> + *
> + * Typically the i915_user_extension would be embedded in some uAPI struct, and
> + * in this case we would feed it the head of the chain (i.e. ext1), which would
> + * then apply all of the above extensions.
> + *
>   */
>  struct i915_user_extension {
> +	/**
> +	 * @next_extension:
> +	 *
> +	 * Pointer to the next i915_user_extension, or zero if the end.
> +	 */
>  	__u64 next_extension;
> +	/** @name: Name of the extension */

Maybe clarify that the namespace here is per ioctl, not global. And maybe
also that the name is just an integer #define or something like that.
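
I.e. make it obvious that the name is just something like the
per-ioctl define from the later patches, e.g.:

	/* namespace is local to the ioctl consuming the extension chain */
	#define I915_GEM_CREATE_EXT_MEMORY_REGIONS 0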

Either way, this is solid documentation:

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

>  	__u32 name;
> -	__u32 flags; /* All undefined bits must be zero. */
> -	__u32 rsvd[4]; /* Reserved for future use; must be zero. */
> +	/**
> +	 * @flags: MBZ
> +	 *
> +	 * All undefined bits must be zero.
> +	 */
> +	__u32 flags;
> +	/**
> +	 * @rsvd: MBZ
> +	 *
> +	 * Reserved for future use; must be zero.
> +	 */
> +	__u32 rsvd[4];
>  };
>  
>  /*
> -- 
> 2.26.3
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

* Re: [PATCH v3 3/4] drm/i915/uapi: convert i915_query and friend to kernel doc
  2021-04-15 15:59 ` [PATCH v3 3/4] drm/i915/uapi: convert i915_query and friend " Matthew Auld
  2021-04-15 22:25   ` [Mesa-dev] " Ian Romanick
  2021-04-16  7:57   ` [Intel-gfx] " Tvrtko Ursulin
@ 2021-04-16  8:49   ` Daniel Vetter
  2 siblings, 0 replies; 27+ messages in thread
From: Daniel Vetter @ 2021-04-16  8:49 UTC (permalink / raw)
  To: Matthew Auld
  Cc: Jason Ekstrand, Jordan Justen, intel-gfx, dri-devel,
	Kenneth Graunke, mesa-dev, Daniel Vetter

On Thu, Apr 15, 2021 at 04:59:57PM +0100, Matthew Auld wrote:
> Add a note about the two-step process.
> 
> Suggested-by: Daniel Vetter <daniel@ffwll.ch>
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Jordan Justen <jordan.l.justen@intel.com>
> Cc: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Kenneth Graunke <kenneth@whitecape.org>
> Cc: Jason Ekstrand <jason@jlekstrand.net>
> Cc: Dave Airlie <airlied@gmail.com>
> Cc: dri-devel@lists.freedesktop.org
> Cc: mesa-dev@lists.freedesktop.org
> ---
>  include/uapi/drm/i915_drm.h | 57 ++++++++++++++++++++++++++++++-------
>  1 file changed, 46 insertions(+), 11 deletions(-)
> 
> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> index d9c954a5a456..ef36f1a0adde 100644
> --- a/include/uapi/drm/i915_drm.h
> +++ b/include/uapi/drm/i915_drm.h
> @@ -2210,14 +2210,23 @@ struct drm_i915_perf_oa_config {
>  	__u64 flex_regs_ptr;
>  };
>  
> +/**
> + * struct drm_i915_query_item - An individual query for the kernel to process.
> + *
> + * The behaviour is determined by the @query_id. Note that exactly what
> + * @data_ptr points to also depends on the specific @query_id.
> + */
>  struct drm_i915_query_item {
> +	/** @query_id: The id for this query */
>  	__u64 query_id;
>  #define DRM_I915_QUERY_TOPOLOGY_INFO    1
>  #define DRM_I915_QUERY_ENGINE_INFO	2
>  #define DRM_I915_QUERY_PERF_CONFIG      3
>  /* Must be kept compact -- no holes and well documented */
>  
> -	/*
> +	/**
> +	 * @length:
> +	 *
>  	 * When set to zero by userspace, this is filled with the size of the
>  	 * data to be written at the data_ptr pointer. The kernel sets this
>  	 * value to a negative value to signal an error on a particular query
> @@ -2225,21 +2234,26 @@ struct drm_i915_query_item {
>  	 */
>  	__s32 length;
>  
> -	/*
> +	/**
> +	 * @flags:
> +	 *
>  	 * When query_id == DRM_I915_QUERY_TOPOLOGY_INFO, must be 0.
>  	 *
>  	 * When query_id == DRM_I915_QUERY_PERF_CONFIG, must be one of the
> -	 * following :
> -	 *         - DRM_I915_QUERY_PERF_CONFIG_LIST
> -	 *         - DRM_I915_QUERY_PERF_CONFIG_DATA_FOR_UUID
> -	 *         - DRM_I915_QUERY_PERF_CONFIG_FOR_UUID
> +	 * following:
> +	 *
> +	 *	- DRM_I915_QUERY_PERF_CONFIG_LIST
> +	 *      - DRM_I915_QUERY_PERF_CONFIG_DATA_FOR_UUID
> +	 *      - DRM_I915_QUERY_PERF_CONFIG_DATA_FOR_ID
>  	 */
>  	__u32 flags;
>  #define DRM_I915_QUERY_PERF_CONFIG_LIST          1
>  #define DRM_I915_QUERY_PERF_CONFIG_DATA_FOR_UUID 2
>  #define DRM_I915_QUERY_PERF_CONFIG_DATA_FOR_ID   3
>  
> -	/*
> +	/**
> +	 * @data_ptr:
> +	 *
>  	 * Data will be written at the location pointed by data_ptr when the
>  	 * value of length matches the length of the data to be written by the
>  	 * kernel.
> @@ -2247,16 +2261,37 @@ struct drm_i915_query_item {
>  	__u64 data_ptr;
>  };
>  
> +/**
> + * struct drm_i915_query - Supply an array of drm_i915_query_item for the kernel
> + * to fill out.
> + *
> + * Note that this is generally a two step process for each drm_i915_query_item
> + * in the array:
> + *
> + *	1.) Call the DRM_IOCTL_I915_QUERY, giving it our array of

I'm not sure this results in pretty rendering in htmldocs output. Please
check this.

This also made me realize that we're not pulling any of this into the drm
documents at all. I'll revise my review on patch 1.

Docs here look good:

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>


> + *	drm_i915_query_item, with drm_i915_query_item.length set to zero. The
> + *	kernel will then fill in the size, in bytes, which tells userspace how
> + *	much memory it needs to allocate for the blob (say, for an array of
> + *	properties).
> + *
> + *	2.) Next we call DRM_IOCTL_I915_QUERY again, this time with the
> + *	drm_i915_query_item.data_ptr equal to our newly allocated blob. Note
> + *	that the drm_i915_query_item.length should still be the same as what the
> + *	kernel previously set. At this point the kernel can fill in the blob.
> + *
> + */
>  struct drm_i915_query {
> +	/** @num_items: The number of elements in the @items_ptr array */
>  	__u32 num_items;
>  
> -	/*
> -	 * Unused for now. Must be cleared to zero.
> +	/**
> +	 * @flags: Unused for now. Must be cleared to zero.
>  	 */
>  	__u32 flags;
>  
> -	/*
> -	 * This points to an array of num_items drm_i915_query_item structures.
> +	/**
> +	 * @items_ptr: This points to an array of num_items drm_i915_query_item
> +	 * structures.
>  	 */
>  	__u64 items_ptr;
>  };
> -- 
> 2.26.3
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

* Re: [PATCH v3 1/4] drm/i915/uapi: hide kernel doc warnings
  2021-04-16  8:44 ` [PATCH v3 1/4] drm/i915/uapi: hide kernel doc warnings Daniel Vetter
@ 2021-04-16  8:54   ` Daniel Vetter
  0 siblings, 0 replies; 27+ messages in thread
From: Daniel Vetter @ 2021-04-16  8:54 UTC (permalink / raw)
  To: Matthew Auld
  Cc: Jordan Justen, intel-gfx, dri-devel, Kenneth Graunke,
	Jason Ekstrand, Daniel Vetter, mesa-dev

On Fri, Apr 16, 2021 at 10:44:28AM +0200, Daniel Vetter wrote:
> On Thu, Apr 15, 2021 at 04:59:55PM +0100, Matthew Auld wrote:
> > It's not properly formatted kernel doc, just nerf the warnings for now.
> > 
> > Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > Cc: Jordan Justen <jordan.l.justen@intel.com>
> > Cc: Daniel Vetter <daniel.vetter@intel.com>
> > Cc: Kenneth Graunke <kenneth@whitecape.org>
> > Cc: Jason Ekstrand <jason@jlekstrand.net>
> > Cc: Dave Airlie <airlied@gmail.com>
> > Cc: dri-devel@lists.freedesktop.org
> > Cc: mesa-dev@lists.freedesktop.org
> 
> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

Ok I need to revise, we need to pull this into Documentation/gpu/. I think
best would be to create a new driver-uapi.rst file, put it right after
drm-uapi.rst, and then add a section for drm/i915 uapi or similar.
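
I.e. roughly something like this for the new file (just a sketch):

	=============
	drm/i915 uAPI
	=============

	.. kernel-doc:: include/uapi/drm/i915_drm.h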

Also since pxp patches, Jason's ctx cleanup and lmem all need this prep
work in patches 1-3 here, can you pls just resend those with the review
feedback so we can fast-track merging?

Thanks, Daniel

> 
> > ---
> >  include/uapi/drm/i915_drm.h | 16 ++++++++--------
> >  1 file changed, 8 insertions(+), 8 deletions(-)
> > 
> > diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> > index ddc47bbf48b6..a50257cde9ff 100644
> > --- a/include/uapi/drm/i915_drm.h
> > +++ b/include/uapi/drm/i915_drm.h
> > @@ -1054,12 +1054,12 @@ struct drm_i915_gem_exec_fence {
> >  	__u32 flags;
> >  };
> >  
> > -/**
> > +/*
> >   * See drm_i915_gem_execbuffer_ext_timeline_fences.
> >   */
> >  #define DRM_I915_GEM_EXECBUFFER_EXT_TIMELINE_FENCES 0
> >  
> > -/**
> > +/*
> >   * This structure describes an array of drm_syncobj and associated points for
> >   * timeline variants of drm_syncobj. It is invalid to append this structure to
> >   * the execbuf if I915_EXEC_FENCE_ARRAY is set.
> > @@ -1700,7 +1700,7 @@ struct drm_i915_gem_context_param {
> >  	__u64 value;
> >  };
> >  
> > -/**
> > +/*
> >   * Context SSEU programming
> >   *
> >   * It may be necessary for either functional or performance reason to configure
> > @@ -2067,7 +2067,7 @@ struct drm_i915_perf_open_param {
> >  	__u64 properties_ptr;
> >  };
> >  
> > -/**
> > +/*
> >   * Enable data capture for a stream that was either opened in a disabled state
> >   * via I915_PERF_FLAG_DISABLED or was later disabled via
> >   * I915_PERF_IOCTL_DISABLE.
> > @@ -2081,7 +2081,7 @@ struct drm_i915_perf_open_param {
> >   */
> >  #define I915_PERF_IOCTL_ENABLE	_IO('i', 0x0)
> >  
> > -/**
> > +/*
> >   * Disable data capture for a stream.
> >   *
> >   * It is an error to try and read a stream that is disabled.
> > @@ -2090,7 +2090,7 @@ struct drm_i915_perf_open_param {
> >   */
> >  #define I915_PERF_IOCTL_DISABLE	_IO('i', 0x1)
> >  
> > -/**
> > +/*
> >   * Change metrics_set captured by a stream.
> >   *
> >   * If the stream is bound to a specific context, the configuration change
> > @@ -2103,7 +2103,7 @@ struct drm_i915_perf_open_param {
> >   */
> >  #define I915_PERF_IOCTL_CONFIG	_IO('i', 0x2)
> >  
> > -/**
> > +/*
> >   * Common to all i915 perf records
> >   */
> >  struct drm_i915_perf_record_header {
> > @@ -2151,7 +2151,7 @@ enum drm_i915_perf_record_type {
> >  	DRM_I915_PERF_RECORD_MAX /* non-ABI */
> >  };
> >  
> > -/**
> > +/*
> >   * Structure to upload perf dynamic configuration into the kernel.
> >   */
> >  struct drm_i915_perf_oa_config {
> > -- 
> > 2.26.3
> > 
> 
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

* Re: [Mesa-dev] [PATCH v3 3/4] drm/i915/uapi: convert i915_query and friend to kernel doc
  2021-04-15 22:25   ` [Mesa-dev] " Ian Romanick
@ 2021-04-16  8:59     ` Daniel Vetter
  2021-04-16 19:04       ` Jonathan Corbet
  0 siblings, 1 reply; 27+ messages in thread
From: Daniel Vetter @ 2021-04-16  8:59 UTC (permalink / raw)
  To: Ian Romanick, Linux Doc Mailing List, Jonathan Corbet
  Cc: intel-gfx, dri-devel, Kenneth Graunke, Matthew Auld,
	Daniel Vetter, Mesa Dev

On Fri, Apr 16, 2021 at 12:25 AM Ian Romanick <idr@freedesktop.org> wrote:
> On 4/15/21 8:59 AM, Matthew Auld wrote:
> > Add a note about the two-step process.
> >
> > Suggested-by: Daniel Vetter <daniel@ffwll.ch>
> > Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > Cc: Jordan Justen <jordan.l.justen@intel.com>
> > Cc: Daniel Vetter <daniel.vetter@intel.com>
> > Cc: Kenneth Graunke <kenneth@whitecape.org>
> > Cc: Jason Ekstrand <jason@jlekstrand.net>
> > Cc: Dave Airlie <airlied@gmail.com>
> > Cc: dri-devel@lists.freedesktop.org
> > Cc: mesa-dev@lists.freedesktop.org
> > ---
> >  include/uapi/drm/i915_drm.h | 57 ++++++++++++++++++++++++++++++-------
> >  1 file changed, 46 insertions(+), 11 deletions(-)
> >
> > diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> > index d9c954a5a456..ef36f1a0adde 100644
> > --- a/include/uapi/drm/i915_drm.h
> > +++ b/include/uapi/drm/i915_drm.h
> > @@ -2210,14 +2210,23 @@ struct drm_i915_perf_oa_config {
> >       __u64 flex_regs_ptr;
> >  };
> >
> > +/**
> > + * struct drm_i915_query_item - An individual query for the kernel to process.
> > + *
> > + * The behaviour is determined by the @query_id. Note that exactly what
>
> Since we just had a big discussion about this on mesa-dev w.r.t. Mesa
> code and documentation... does the kernel have a policy about which
> flavor (pun intended) of English should be used?

I'm not finding it documented in
https://dri.freedesktop.org/docs/drm/doc-guide/sphinx.html but I
thought we'd discussed it. Adding linux-doc and Jon Corbet.
-Daniel

>
> > + * @data_ptr is also depends on the specific @query_id.
> > + */
> >  struct drm_i915_query_item {
> > +     /** @query_id: The id for this query */
> >       __u64 query_id;
> >  #define DRM_I915_QUERY_TOPOLOGY_INFO    1
> >  #define DRM_I915_QUERY_ENGINE_INFO   2
> >  #define DRM_I915_QUERY_PERF_CONFIG      3
> >  /* Must be kept compact -- no holes and well documented */
> >
> > -     /*
> > +     /**
> > +      * @length:
> > +      *
> >        * When set to zero by userspace, this is filled with the size of the
> >        * data to be written at the data_ptr pointer. The kernel sets this
> >        * value to a negative value to signal an error on a particular query
> > @@ -2225,21 +2234,26 @@ struct drm_i915_query_item {
> >        */
> >       __s32 length;
> >
> > -     /*
> > +     /**
> > +      * @flags:
> > +      *
> >        * When query_id == DRM_I915_QUERY_TOPOLOGY_INFO, must be 0.
> >        *
> >        * When query_id == DRM_I915_QUERY_PERF_CONFIG, must be one of the
> > -      * following :
> > -      *         - DRM_I915_QUERY_PERF_CONFIG_LIST
> > -      *         - DRM_I915_QUERY_PERF_CONFIG_DATA_FOR_UUID
> > -      *         - DRM_I915_QUERY_PERF_CONFIG_FOR_UUID
> > +      * following:
> > +      *
> > +      *      - DRM_I915_QUERY_PERF_CONFIG_LIST
> > +      *      - DRM_I915_QUERY_PERF_CONFIG_DATA_FOR_UUID
> > +      *      - DRM_I915_QUERY_PERF_CONFIG_FOR_UUID
> >        */
> >       __u32 flags;
> >  #define DRM_I915_QUERY_PERF_CONFIG_LIST          1
> >  #define DRM_I915_QUERY_PERF_CONFIG_DATA_FOR_UUID 2
> >  #define DRM_I915_QUERY_PERF_CONFIG_DATA_FOR_ID   3
> >
> > -     /*
> > +     /**
> > +      * @data_ptr:
> > +      *
> >        * Data will be written at the location pointed by data_ptr when the
> >        * value of length matches the length of the data to be written by the
> >        * kernel.
> > @@ -2247,16 +2261,37 @@ struct drm_i915_query_item {
> >       __u64 data_ptr;
> >  };
> >
> > +/**
> > + * struct drm_i915_query - Supply an array of drm_i915_query_item for the kernel
> > + * to fill out.
> > + *
> > + * Note that this is generally a two-step process for each drm_i915_query_item
> > + * in the array:
> > + *
> > + *   1.) Call the DRM_IOCTL_I915_QUERY, giving it our array of
> > + *   drm_i915_query_item, with drm_i915_query_item.length set to zero. The
> > + *   kernel will then fill in the size, in bytes, which tells userspace how
> > + *   much memory it needs to allocate for the blob (say, for an array of
> > + *   properties).
> > + *
> > + *   2.) Next we call DRM_IOCTL_I915_QUERY again, this time with the
> > + *   drm_i915_query_item.data_ptr equal to our newly allocated blob. Note
> > + *   that the drm_i915_query_item.length should still be the same as what the
> > + *   kernel previously set. At this point the kernel can fill in the blob.
> > + *
> > + */
> >  struct drm_i915_query {
> > +     /** @num_items: The number of elements in the @items_ptr array */
> >       __u32 num_items;
> >
> > -     /*
> > -      * Unused for now. Must be cleared to zero.
> > +     /**
> > +      * @flags: Unused for now. Must be cleared to zero.
> >        */
> >       __u32 flags;
> >
> > -     /*
> > -      * This points to an array of num_items drm_i915_query_item structures.
> > +     /**
> > +      * @items_ptr: This points to an array of num_items drm_i915_query_item
> > +      * structures.
> >        */
> >       __u64 items_ptr;
> >  };
> >
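
Btw as a rough sketch for readers (error handling elided, and using the
existing DRM_I915_QUERY_TOPOLOGY_INFO id from above purely as an example),
the two-step dance plus the per-item error check looks like:

    struct drm_i915_query_item item = {
            .query_id = DRM_I915_QUERY_TOPOLOGY_INFO,
    };
    struct drm_i915_query query = {
            .num_items = 1,
            .items_ptr = (uintptr_t)&item,
    };
    void *data;
    int err;

    /* Step 1: the kernel fills in the required size. */
    err = ioctl(fd, DRM_IOCTL_I915_QUERY, &query);
    if (err) ...
    /* A negative length signals a per-item error. */
    if (item.length < 0) ...

    data = calloc(1, item.length);
    item.data_ptr = (uintptr_t)data;

    /* Step 2: same call again, now the kernel writes the blob. */
    err = ioctl(fd, DRM_IOCTL_I915_QUERY, &query);
    if (err) ...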
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 4/4] drm/doc/rfc: i915 DG1 uAPI
  2021-04-15 15:59 ` [PATCH v3 4/4] drm/doc/rfc: i915 DG1 uAPI Matthew Auld
@ 2021-04-16  9:15   ` Daniel Vetter
  2021-04-16 16:38   ` Jason Ekstrand
  1 sibling, 0 replies; 27+ messages in thread
From: Daniel Vetter @ 2021-04-16  9:15 UTC (permalink / raw)
  To: Matthew Auld
  Cc: Jordan Justen, intel-gfx, dri-devel, Kenneth Graunke,
	Jason Ekstrand, Daniel Vetter, mesa-dev

On Thu, Apr 15, 2021 at 04:59:58PM +0100, Matthew Auld wrote:
> Add an entry for the new uAPI needed for DG1.
> 
> v2(Daniel):
>   - include the overall upstreaming plan
>   - add a note for mmap, there are differences here for TTM vs i915
>   - bunch of other suggestions from Daniel
> v3:
>  (Daniel)
>   - add a note for set/get caching stuff
>   - add some more docs for existing query and extensions stuff
>   - add an actual code example for regions query
>   - bunch of other stuff
>  (Jason)
>   - uAPI change(!):
> 	- try a simpler design with the placements extension
> 	- rather than have a generic setparam which can cover multiple
> 	  use cases, have each extension be responsible for one thing
> 	  only
> 
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Jordan Justen <jordan.l.justen@intel.com>
> Cc: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Kenneth Graunke <kenneth@whitecape.org>
> Cc: Jason Ekstrand <jason@jlekstrand.net>
> Cc: Dave Airlie <airlied@gmail.com>
> Cc: dri-devel@lists.freedesktop.org
> Cc: mesa-dev@lists.freedesktop.org
> ---
>  Documentation/gpu/rfc/i915_gem_lmem.h   | 255 ++++++++++++++++++++++++
>  Documentation/gpu/rfc/i915_gem_lmem.rst | 139 +++++++++++++
>  Documentation/gpu/rfc/index.rst         |   4 +
>  3 files changed, 398 insertions(+)
>  create mode 100644 Documentation/gpu/rfc/i915_gem_lmem.h
>  create mode 100644 Documentation/gpu/rfc/i915_gem_lmem.rst
> 
> diff --git a/Documentation/gpu/rfc/i915_gem_lmem.h b/Documentation/gpu/rfc/i915_gem_lmem.h
> new file mode 100644
> index 000000000000..2a82a452e9f2
> --- /dev/null
> +++ b/Documentation/gpu/rfc/i915_gem_lmem.h
> @@ -0,0 +1,255 @@
> +/*
> + * Note that drm_i915_query_item and drm_i915_query are existing bits of uAPI.
> + * For the regions query we are just adding a new query id, so no actual new
> + * ioctl or anything, but including it here for reference.
> + */

Oops I didn't realize this. I think the better/prettier way is to just
mention how it's built on top of the query ioctl and structs, and use
kerneldoc hyperlinks to point there. That way it's still easy to find, and
also serves as better documentation for the uapi when it's all merged.

See https://dri.freedesktop.org/docs/drm/doc-guide/kernel-doc.html#highlights-and-cross-references

That's also why it matters that we pull the kerneldoc into our overall
documentation, otherwise the hyperlinks aren't working.
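
E.g. something along these lines in the new header's comment (just a
sketch of the wording; the &struct syntax is what generates the
hyperlinks, see the link above):

    /*
     * The regions query is built on top of the existing
     * &struct drm_i915_query and &struct drm_i915_query_item, using a
     * new DRM_I915_QUERY_MEMORY_REGIONS query id; no new ioctl needed.
     */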

> +struct drm_i915_query_item {
> +#define DRM_I915_QUERY_MEMORY_REGIONS   0xdeadbeaf
> +	....
> +        __u64 query_id;
> +
> +        /*
> +         * When set to zero by userspace, this is filled with the size of the
> +         * data to be written at the data_ptr pointer. The kernel sets this
> +         * value to a negative value to signal an error on a particular query
> +         * item.
> +         */
> +        __s32 length;
> +
> +        __u32 flags;
> +        /*
> +         * Data will be written at the location pointed by data_ptr when the
> +         * value of length matches the length of the data to be written by the
> +         * kernel.
> +         */
> +        __u64 data_ptr;
> +};
> +
> +struct drm_i915_query {
> +        __u32 num_items;
> +        /*
> +         * Unused for now. Must be cleared to zero.
> +         */
> +        __u32 flags;
> +        /*
> +         * This points to an array of num_items drm_i915_query_item structures.
> +         */
> +        __u64 items_ptr;
> +};
> +
> +#define DRM_IOCTL_I915_QUERY	DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_QUERY, struct drm_i915_query)
> +
> +/**
> + * enum drm_i915_gem_memory_class
> + */
> +enum drm_i915_gem_memory_class {
> +	/** @I915_MEMORY_CLASS_SYSTEM: system memory */
> +	I915_MEMORY_CLASS_SYSTEM = 0,
> +	/** @I915_MEMORY_CLASS_DEVICE: device local-memory */
> +	I915_MEMORY_CLASS_DEVICE,
> +};
> +
> +/**
> + * struct drm_i915_gem_memory_class_instance
> + */
> +struct drm_i915_gem_memory_class_instance {
> +	/** @memory_class: see enum drm_i915_gem_memory_class */
> +	__u16 memory_class;
> +
> +	/** @memory_instance: which instance */
> +	__u16 memory_instance;
> +};
> +
> +/**
> + * struct drm_i915_memory_region_info
> + *
> + * Describes one region as known to the driver.
> + *
> + * Note that we reserve quite a lot of stuff here for potential future work. As
> + * an example we might want to expose the capabilities (see @caps) for a given
> + * region, which could include things like whether the region is CPU
> + * mappable/accessible, etc.
> + */
> +struct drm_i915_memory_region_info {
> +	/** @region: class:instance pair encoding */
> +	struct drm_i915_gem_memory_class_instance region;
> +
> +	/** @rsvd0: MBZ */
> +	__u32 rsvd0;
> +
> +	/** @caps: MBZ */
> +	__u64 caps;
> +
> +	/** @flags: MBZ */
> +	__u64 flags;
> +
> +	/** @probed_size: Memory probed by the driver (-1 = unknown) */
> +	__u64 probed_size;
> +
> +	/** @unallocated_size: Estimate of memory remaining (-1 = unknown) */
> +	__u64 unallocated_size;
> +
> +	/** @rsvd1: MBZ */
> +	__u64 rsvd1[8];
> +};
> +
> +/**
> + * struct drm_i915_query_memory_regions
> + *
> + * Region info query enumerates all regions known to the driver by filling in
> + * an array of struct drm_i915_memory_region_info structures.
> + *
> + * Example for getting the list of supported regions:
> + *
> + * .. code-block:: C
> + *
> + *	struct drm_i915_query_memory_regions *info;
> + *	struct drm_i915_query_item item = {
> + *		.query_id = DRM_I915_QUERY_MEMORY_REGIONS,
> + *	};
> + *	struct drm_i915_query query = {
> + *		.num_items = 1,
> + *		.items_ptr = (uintptr_t)&item,
> + *	};
> + *	int err, i;
> + *
> + *	// First query the size of the blob we need, this needs to be large
> + *	// enough to hold our array of regions. The kernel will fill out the
> + *	// item.length for us, which is the number of bytes we need.
> + *	err = ioctl(fd, DRM_IOCTL_I915_QUERY, &query);
> + *	if (err) ...
> + *
> + *	info = calloc(1, item.length);
> + *	// Now that we allocated the required number of bytes, we call the ioctl
> + *	// again, this time with the data_ptr pointing to our newly allocated
> + *	// blob, which the kernel can then populate with all the region info.
> + *	item.data_ptr = (uintptr_t)info;
> + *
> + *	err = ioctl(fd, DRM_IOCTL_I915_QUERY, &query);
> + *	if (err) ...
> + *
> + *	// We can now access each region in the array
> + *	for (i = 0; i < info->num_regions; i++) {
> + *		struct drm_i915_memory_region_info mr = info->regions[i];
> + *		__u16 class = mr.region.memory_class;
> + *		__u16 instance = mr.region.memory_instance;
> + *
> + *		....
> + *	}
> + *
> + *	free(info);
> + */
> +struct drm_i915_query_memory_regions {
> +	/** @num_regions: Number of supported regions */
> +	__u32 num_regions;
> +
> +	/** @rsvd: MBZ */
> +	__u32 rsvd[3];
> +
> +	/** @regions: Info about each supported region */
> +	struct drm_i915_memory_region_info regions[];
> +};
> +
> +#define DRM_I915_GEM_CREATE_EXT		0xdeadbeaf
> +#define DRM_IOCTL_I915_GEM_CREATE_EXT	DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_CREATE_EXT, struct drm_i915_gem_create_ext)
> +
> +/**
> + * struct drm_i915_gem_create_ext
> + *
> + * Existing gem_create behaviour, with added extension support.
> + *
> + * Note that in the future we want to have our buffer flags here, at least for
> + * the stuff that is immutable. Previously we would have two ioctls, one to
> + * create the object with gem_create, and another to apply various parameters,
> + * however this creates some ambiguity for the params which are considered
> + * immutable. Also in general we're phasing out the various SET/GET ioctls.
> + */
> +struct drm_i915_gem_create_ext {
> +	/**
> +	 * @size: Requested size for the object.
> +	 *
> +	 * The (page-aligned) allocated size for the object will be returned.
> +	 *
> +	 * Note that for some devices we might have further minimum
> +	 * page-size restrictions (larger than 4K), like for device local-memory.
> +	 * However in general the final size here should always reflect any
> +	 * rounding up, if for example using the I915_GEM_CREATE_EXT_MEMORY_REGIONS
> +	 * extension to place the object in device local-memory.
> +	 */
> +	__u64 size;
> +	/**
> +	 * @handle: Returned handle for the object.
> +	 *
> +	 * Object handles are nonzero.
> +	 */
> +	__u32 handle;
> +	/** @flags: MBZ */
> +	__u32 flags;
> +	/**
> +	 * @extensions: The chain of extensions to apply to this object.
> +	 *
> +	 * This will be useful in the future when we need to support several
> +	 * different extensions, and we need to apply more than one when
> +	 * creating the object. See struct i915_user_extension.
> +	 *
> +	 * If we don't supply any extensions then we get the same old gem_create
> +	 * behaviour.
> +	 *
> +	 * For I915_GEM_CREATE_EXT_MEMORY_REGIONS usage see
> +	 * drm_i915_gem_create_ext_memory_regions
> +	 */
> +#define I915_GEM_CREATE_EXT_MEMORY_REGIONS 0
> +	__u64 extensions;
> +};
> +
> +/**
> + * struct drm_i915_gem_create_ext_memory_regions
> + *
> + * I915_GEM_CREATE_EXT_MEMORY_REGIONS extension:
> + *
> + * Set the object with the desired set of placements/regions in priority
> + * order (each entry must be unique and supported by the device), as an array of
> + * drm_i915_gem_memory_class_instance, or an equivalent layout of class:instance

I didn't look, but please make sure all the type references hyperlink.
Usually prepending struct or similar is good enough. See the link above
for how it works.

> + * pair encodings. See DRM_I915_QUERY_MEMORY_REGIONS for how to query the
> + * supported regions for a device.
> + *
> + * As an example, on discrete devices, if we wish to set the placement as
> + * device local-memory we can do something like:
> + *
> + * .. code-block:: C
> + *
> + *	struct drm_i915_gem_memory_class_instance region_lmem = {
> + *              .memory_class = I915_MEMORY_CLASS_DEVICE,
> + *              .memory_instance = 0,
> + *      };
> + *      struct drm_i915_gem_create_ext_memory_regions regions = {
> + *              .base = { .name = I915_GEM_CREATE_EXT_MEMORY_REGIONS },
> + *              .regions = (uintptr_t)&region_lmem,
> + *              .num_regions = 1,
> + *      };
> + *      struct drm_i915_gem_create_ext create_ext = {
> + *              .size = 16 * PAGE_SIZE,
> + *              .extensions = (uintptr_t)&regions,
> + *      };
> + *
> + *      int err = ioctl(fd, DRM_IOCTL_I915_GEM_CREATE_EXT, &create_ext);
> + *      if (err) ...
> + *
> + * At which point we get the object handle in create_ext.handle, if all went
> + * well.
> + */
> +struct drm_i915_gem_create_ext_memory_regions {
> +	/** @base: Extension link. See struct i915_user_extension. */
> +	struct i915_user_extension base;
> +
> +	/** @pad: MBZ */
> +	__u32 pad;
> +	/** @num_regions: Number of elements in the placements array. */
> +	__u32 num_regions;
> +	/**
> +	 * @regions: The placements array.
> +	 *
> +	 * Should be an array of drm_i915_gem_memory_class_instance.
> +	 */
> +	__u64 regions;
> +};
> diff --git a/Documentation/gpu/rfc/i915_gem_lmem.rst b/Documentation/gpu/rfc/i915_gem_lmem.rst
> new file mode 100644
> index 000000000000..52f1db15ae94
> --- /dev/null
> +++ b/Documentation/gpu/rfc/i915_gem_lmem.rst
> @@ -0,0 +1,139 @@
> +=========================
> +I915 DG1/LMEM RFC Section
> +=========================
> +
> +Upstream plan
> +=============
> +For upstream the overall plan for landing all the DG1 stuff and turning it on
> +for real, with all the uAPI bits, is:
> +
> +* Merge basic HW enabling of DG1 (still without pciid)
> +* Merge the uAPI bits behind a special CONFIG_BROKEN (or so) flag
> +        * At this point we can still make changes, but importantly this lets us
> +          start running IGTs which can utilize local-memory in CI
> +* Convert over to TTM, make sure it all keeps working

For completeness I think it would be good to list the items we need to
look at here in shared ttm code to make it fit for us. Thomas H has already
started some rfc thread on dri-devel, please check with him what he thinks
we should list here. From my side I think the two main ones are:

	- ttm shrinker is needed, since we currently have that
	- dma_resv_lockitem for full dma_resv_lock in the eviction code


> +* Add pciid for DG1 and turn on uAPI for real
> +
> +New object placement and region query uAPI
> +==========================================
> +Starting from DG1 we need to give userspace the ability to allocate buffers from
> +device local-memory. Currently the driver supports gem_create, which can place
> +buffers in system memory via shmem, and the usual assortment of other
> +interfaces, like dumb buffers and userptr.
> +
> +To support this new capability, while also providing a uAPI which will work
> +beyond just DG1, we propose to offer three new bits of uAPI:
> +
> +Query uAPI
> +----------
> +Existing query interface
> +^^^^^^^^^^^^^^^^^^^^^^^^
> +.. kernel-doc:: include/uapi/drm/i915_drm.h
> +        :functions: drm_i915_query_item drm_i915_query

Ah here is the include. Imo no point in referencing existing uapi headers,
better to put them into a full section of their own. Sphinx hyperlinks
will work their magic.

> +
> +DRM_I915_QUERY_MEMORY_REGIONS
> +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> +New query ID which allows userspace to discover the list of supported memory
> +regions (like system-memory and local-memory) for a given device. We identify
> +each region with a class and instance pair, which should be unique. The class
> +here would be DEVICE or SYSTEM, and the instance would be zero, on platforms
> +like DG1.
> +
> +Side note: The class/instance design is borrowed from our existing engine uAPI,
> +where we describe every physical engine in terms of its class, and the
> +particular instance, since we can have more than one per class.
> +
> +In the future we also want to expose more information which can further
> +describe the capabilities of a region.
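
(For example, a capability bit in @caps might eventually look something
like the below; the name is purely hypothetical, nothing like it is
defined yet, hence why @caps is still MBZ:

    /* hypothetical: set if the region is CPU mappable */
    #define I915_MEMORY_REGION_CAP_CPU_MAPPABLE (1ull << 0)
)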
> +
> +.. kernel-doc:: Documentation/gpu/rfc/i915_gem_lmem.h
> +        :functions: drm_i915_gem_memory_class drm_i915_gem_memory_class_instance drm_i915_memory_region_info drm_i915_query_memory_regions
> +
> +GEM_CREATE_EXT
> +--------------
> +New ioctl which is basically just gem_create but now allows userspace to
> +provide a chain of possible extensions. Note that if we don't provide any
> +extensions then we get the exact same behaviour as gem_create.
> +
> +Side note: We also need to support PXP[1] in the near future, which is also
> +applicable to integrated platforms, and adds its own gem_create_ext extension,
> +which basically lets userspace mark a buffer as "protected".
> +
> +.. kernel-doc:: Documentation/gpu/rfc/i915_gem_lmem.h
> +        :functions: drm_i915_gem_create_ext
> +
> +It's raining extensions
> +^^^^^^^^^^^^^^^^^^^^^^^
> +As noted above, extensions can be supplied as a chain in gem_create_ext using the
> +existing i915_user_extension. This will be useful in the future when we need to
> +support several different extensions, and we need to apply more than one when
> +creating the object.
> +
> +.. kernel-doc:: include/uapi/drm/i915_drm.h
> +        :functions: i915_user_extension
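
For the chaining itself, a rough sketch (the second extension here is
made up purely for illustration; only the memory regions one exists so
far):

    struct i915_user_extension second_ext = {
            .name = SOME_FUTURE_EXTENSION, /* hypothetical */
    };
    struct drm_i915_gem_create_ext_memory_regions regions = {
            .base = {
                    .name = I915_GEM_CREATE_EXT_MEMORY_REGIONS,
                    /* link the next extension in the chain */
                    .next_extension = (uintptr_t)&second_ext,
            },
            /* ... regions/num_regions as in the example above ... */
    };
    struct drm_i915_gem_create_ext create_ext = {
            .size = ...,
            .extensions = (uintptr_t)&regions,
    };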
> +
> +I915_GEM_CREATE_EXT_MEMORY_REGIONS
> +----------------------------------
> +Implemented as an extension for gem_create_ext, we would now allow userspace to
> +optionally provide an immutable list of preferred placements at creation time,
> +in priority order, for a given buffer object.  For the placements we expect
> +them each to use the class/instance encoding, as per the output of the regions
> +query. Having the list in priority order will be useful in the future when
> +placing an object, say during eviction.
> +
> +.. kernel-doc:: Documentation/gpu/rfc/i915_gem_lmem.h
> +        :functions: drm_i915_gem_create_ext_memory_regions
> +
> +One fair criticism here is that this seems a little over-engineered[2]. If we
> +just consider DG1 then yes, a simple gem_create.flags or something is totally
> +all that's needed to tell the kernel to allocate the buffer in local-memory or
> +whatever. However looking to the future we need uAPI which can also support
> +upcoming Xe HP multi-tile architecture in a sane way, where there can be
> +multiple local-memory instances for a given device, and so using both class and
> +instance in our uAPI to describe regions is desirable, although specifically
> +for DG1 it's uninteresting, since we only have a single local-memory instance.
> +
> +Existing uAPI issues
> +====================
> +Some potential issues we still need to resolve.
> +
> +I915 MMAP
> +---------
> +In i915 there are multiple ways to mmap a GEM object, including mapping the
> +same object using different mapping types (WC vs WB), i.e. multiple active
> +mmaps per object. TTM expects at most one mmap for the lifetime of the object. If it
> +turns out that we have to backpedal here, there might be some potential
> +userspace fallout.
> +
> +I915 SET/GET_CACHING
> +--------------------
> +In i915 we have the set/get_caching ioctls. TTM doesn't let us change this, but
> +DG1 doesn't support non-snooped pcie transactions, so we can just always
> +allocate as WB for smem-only buffers.  If/when our hw gains support for
> +non-snooped pcie transactions then we must fix this mode at allocation time as
> +a new GEM extension.
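
(If/when that happens, presumably it would be yet another gem_create_ext
extension, something like the below; completely hypothetical, name and
layout included:

    struct drm_i915_gem_create_ext_caching {
            struct i915_user_extension base;
            /* e.g. the existing I915_CACHING_NONE / I915_CACHING_CACHED */
            __u32 caching;
            __u32 pad;
    };
)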
> +
> +This is related to the mmap problem, because in general (meaning, when we're
> +not running on intel cpus) the cpu mmap must not, ever, be inconsistent with
> +allocation mode.
> +
> +One possible idea is to let the kernel pick the mmap mode for userspace from the
> +following table:
> +
> +smem-only: WB. Userspace does not need to call clflush.
> +
> +smem+lmem: We allocate uncached memory, and give userspace a WC mapping
> +both when the buffer is in smem and when it's in lmem. GPU does snooped
> +access, which is a bit inefficient.
> +
> +lmem only: always WC
> +
> +This means on discrete you only get a single mmap mode, all others must be
> +rejected. That's probably going to be a new default mode or something like
> +that.
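
E.g. something like the below on the mmap_offset side, where the kernel
owns the decision (flag name hypothetical):

    struct drm_i915_gem_mmap_offset mmap_arg = {
            .handle = handle,
            /* hypothetical: kernel picks WB vs WC per the table above */
            .flags = I915_MMAP_OFFSET_KERNEL_PICKS,
    };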
> +
> +Links
> +=====
> +[1] https://patchwork.freedesktop.org/series/86798/
> +
> +[2] https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5599#note_553791
> diff --git a/Documentation/gpu/rfc/index.rst b/Documentation/gpu/rfc/index.rst
> index a8621f7dab8b..05670442ca1b 100644
> --- a/Documentation/gpu/rfc/index.rst
> +++ b/Documentation/gpu/rfc/index.rst
> @@ -15,3 +15,7 @@ host such documentation:
>  
>  * Once the code has landed move all the documentation to the right places in
>    the main core, helper or driver sections.
> +
> +.. toctree::
> +
> +    i915_gem_lmem.rst

I think this strikes the right balance between brevity and details for
this new rfc process now, so with the details mentioned above checked:

Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>

I think for merging we need an ack from Dave, mesa side, and I'll also
ping Jon Bloomfield. Please cc him on the next round of this one too. Then
I think we should be good to merge.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 4/4] drm/doc/rfc: i915 DG1 uAPI
  2021-04-15 15:59 ` [PATCH v3 4/4] drm/doc/rfc: i915 DG1 uAPI Matthew Auld
  2021-04-16  9:15   ` Daniel Vetter
@ 2021-04-16 16:38   ` Jason Ekstrand
  2021-04-16 17:02     ` Daniel Vetter
  2021-04-19 12:02     ` Matthew Auld
  1 sibling, 2 replies; 27+ messages in thread
From: Jason Ekstrand @ 2021-04-16 16:38 UTC (permalink / raw)
  To: Matthew Auld
  Cc: Jordan Justen, Intel GFX, Maling list - DRI developers,
	Kenneth Graunke, ML mesa-dev, Daniel Vetter

On Thu, Apr 15, 2021 at 11:04 AM Matthew Auld <matthew.auld@intel.com> wrote:
>
> Add an entry for the new uAPI needed for DG1.
>
> v2(Daniel):
>   - include the overall upstreaming plan
>   - add a note for mmap, there are differences here for TTM vs i915
>   - bunch of other suggestions from Daniel
> v3:
>  (Daniel)
>   - add a note for set/get caching stuff
>   - add some more docs for existing query and extensions stuff
>   - add an actual code example for regions query
>   - bunch of other stuff
>  (Jason)
>   - uAPI change(!):
>         - try a simpler design with the placements extension
>         - rather than have a generic setparam which can cover multiple
>           use cases, have each extension be responsible for one thing
>           only
>
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Jordan Justen <jordan.l.justen@intel.com>
> Cc: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Kenneth Graunke <kenneth@whitecape.org>
> Cc: Jason Ekstrand <jason@jlekstrand.net>
> Cc: Dave Airlie <airlied@gmail.com>
> Cc: dri-devel@lists.freedesktop.org
> Cc: mesa-dev@lists.freedesktop.org
> ---
>  Documentation/gpu/rfc/i915_gem_lmem.h   | 255 ++++++++++++++++++++++++
>  Documentation/gpu/rfc/i915_gem_lmem.rst | 139 +++++++++++++
>  Documentation/gpu/rfc/index.rst         |   4 +
>  3 files changed, 398 insertions(+)
>  create mode 100644 Documentation/gpu/rfc/i915_gem_lmem.h
>  create mode 100644 Documentation/gpu/rfc/i915_gem_lmem.rst
>
> diff --git a/Documentation/gpu/rfc/i915_gem_lmem.h b/Documentation/gpu/rfc/i915_gem_lmem.h
> new file mode 100644
> index 000000000000..2a82a452e9f2
> --- /dev/null
> +++ b/Documentation/gpu/rfc/i915_gem_lmem.h
> @@ -0,0 +1,255 @@
> +/*
> + * Note that drm_i915_query_item and drm_i915_query are existing bits of uAPI.
> + * For the regions query we are just adding a new query id, so no actual new
> + * ioctl or anything, but including it here for reference.
> + */
> +struct drm_i915_query_item {
> +#define DRM_I915_QUERY_MEMORY_REGIONS   0xdeadbeaf
> +       ....
> +        __u64 query_id;
> +
> +        /*
> +         * When set to zero by userspace, this is filled with the size of the
> +         * data to be written at the data_ptr pointer. The kernel sets this
> +         * value to a negative value to signal an error on a particular query
> +         * item.
> +         */
> +        __s32 length;
> +
> +        __u32 flags;
> +        /*
> +         * Data will be written at the location pointed by data_ptr when the
> +         * value of length matches the length of the data to be written by the
> +         * kernel.
> +         */
> +        __u64 data_ptr;
> +};
> +
> +struct drm_i915_query {
> +        __u32 num_items;
> +        /*
> +         * Unused for now. Must be cleared to zero.
> +         */
> +        __u32 flags;
> +        /*
> +         * This points to an array of num_items drm_i915_query_item structures.
> +         */
> +        __u64 items_ptr;
> +};
> +
> +#define DRM_IOCTL_I915_QUERY   DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_QUERY, struct drm_i915_query)
> +
> +/**
> + * enum drm_i915_gem_memory_class
> + */
> +enum drm_i915_gem_memory_class {
> +       /** @I915_MEMORY_CLASS_SYSTEM: system memory */
> +       I915_MEMORY_CLASS_SYSTEM = 0,
> +       /** @I915_MEMORY_CLASS_DEVICE: device local-memory */
> +       I915_MEMORY_CLASS_DEVICE,
> +};
> +
> +/**
> + * struct drm_i915_gem_memory_class_instance
> + */
> +struct drm_i915_gem_memory_class_instance {
> +       /** @memory_class: see enum drm_i915_gem_memory_class */
> +       __u16 memory_class;
> +
> +       /** @memory_instance: which instance */
> +       __u16 memory_instance;
> +};
> +
> +/**
> + * struct drm_i915_memory_region_info
> + *
> + * Describes one region as known to the driver.
> + *
> + * Note that we reserve quite a lot of stuff here for potential future work. As
> + * an example we might want to expose the capabilities (see @caps) for a given
> + * region, which could include things like whether the region is CPU
> + * mappable/accessible, etc.

I get caps but I'm seriously at a loss as to what the rest of this
would be used for.  Why are caps and flags both there and separate?
Flags are typically something you set, not query.  Also, what's with
rsvd1 at the end?  This smells of substantial over-building to me.

I thought to myself, "maybe I'm missing a future use-case" so I looked
at the internal tree and none of this is being used there either.
This indicates to me that either I'm missing something and there's
code somewhere I don't know about or, with three years of building on
internal branches, we still haven't proven that any of this is needed.
If it's the latter, which I strongly suspect, maybe we should drop the
unnecessary bits and only add them back in if and when we have proof
that they're useful.

To be clear, I don't mind the query API as such and the class/instance
stuff seems fine and I really like being able to get the sizes
directly.  What concerns me is all this extra future-proofing that we
have zero proof is actually useful.  In my experience, when you build
out like this without so much as a use-case, you always end up
building the wrong thing.

> + */
> +struct drm_i915_memory_region_info {
> +       /** @region: class:instance pair encoding */
> +       struct drm_i915_gem_memory_class_instance region;
> +
> +       /** @rsvd0: MBZ */
> +       __u32 rsvd0;
> +
> +       /** @caps: MBZ */
> +       __u64 caps;
> +
> +       /** @flags: MBZ */
> +       __u64 flags;
> +
> +       /** @probed_size: Memory probed by the driver (-1 = unknown) */
> +       __u64 probed_size;
> +
> +       /** @unallocated_size: Estimate of memory remaining (-1 = unknown) */
> +       __u64 unallocated_size;
> +
> +       /** @rsvd1: MBZ */
> +       __u64 rsvd1[8];
> +};
> +
> +/**
> + * struct drm_i915_query_memory_regions
> + *
> + * Region info query enumerates all regions known to the driver by filling in
> + * an array of struct drm_i915_memory_region_info structures.
> + *
> + * Example for getting the list of supported regions:
> + *
> + * .. code-block:: C
> + *
> + *     struct drm_i915_query_memory_regions *info;
> + *     struct drm_i915_query_item item = {
> + *             .query_id = DRM_I915_QUERY_MEMORY_REGIONS,
> + *     };
> + *     struct drm_i915_query query = {
> + *             .num_items = 1,
> + *             .items_ptr = (uintptr_t)&item,
> + *     };
> + *     int err, i;
> + *
> + *     // First query the size of the blob we need, this needs to be large
> + *     // enough to hold our array of regions. The kernel will fill out the
> + *     // item.length for us, which is the number of bytes we need.
> + *     err = ioctl(fd, DRM_IOCTL_I915_QUERY, &query);
> + *     if (err) ...
> + *
> + *     info = calloc(1, item.length);
> + *     // Now that we allocated the required number of bytes, we call the ioctl
> + *     // again, this time with the data_ptr pointing to our newly allocated
> + *     // blob, which the kernel can then populate with all the region info.
> + *     item.data_ptr = (uintptr_t)info;
> + *
> + *     err = ioctl(fd, DRM_IOCTL_I915_QUERY, &query);
> + *     if (err) ...
> + *
> + *     // We can now access each region in the array
> + *     for (i = 0; i < info->num_regions; i++) {
> + *             struct drm_i915_memory_region_info mr = info->regions[i];
> + *             __u16 class = mr.region.memory_class;
> + *             __u16 instance = mr.region.memory_instance;
> + *
> + *             ....
> + *     }
> + *
> + *     free(info);
> + */
> +struct drm_i915_query_memory_regions {
> +       /** @num_regions: Number of supported regions */
> +       __u32 num_regions;
> +
> +       /** @rsvd: MBZ */
> +       __u32 rsvd[3];

Why pad to 16B instead of 8B?

> +
> +       /** @regions: Info about each supported region */
> +       struct drm_i915_memory_region_info regions[];
> +};
> +
> +#define DRM_I915_GEM_CREATE_EXT                0xdeadbeaf
> +#define DRM_IOCTL_I915_GEM_CREATE_EXT  DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_CREATE_EXT, struct drm_i915_gem_create_ext)
> +
> +/**
> + * struct drm_i915_gem_create_ext
> + *
> + * Existing gem_create behaviour, with added extension support.
> + *
> + * Note that in the future we want to have our buffer flags here, at least for
> + * the stuff that is immutable. Previously we would have two ioctls, one to
> + * create the object with gem_create, and another to apply various parameters,
> + * however this creates some ambiguity for the params which are considered
> + * immutable. Also in general we're phasing out the various SET/GET ioctls.
> + */
> +struct drm_i915_gem_create_ext {
> +       /**
> +        * @size: Requested size for the object.
> +        *
> +        * The (page-aligned) allocated size for the object will be returned.
> +        *
> +        * Note that for some devices we might have further minimum
> +        * page-size restrictions (larger than 4K), like for device local-memory.
> +        * However in general the final size here should always reflect any
> +        * rounding up, if for example using the I915_GEM_CREATE_EXT_MEMORY_REGIONS
> +        * extension to place the object in device local-memory.
> +        */
> +       __u64 size;
> +       /**
> +        * @handle: Returned handle for the object.
> +        *
> +        * Object handles are nonzero.
> +        */
> +       __u32 handle;
> +       /** @flags: MBZ */
> +       __u32 flags;
> +       /**
> +        * @extensions: The chain of extensions to apply to this object.
> +        *
> +        * This will be useful in the future when we need to support several
> +        * different extensions, and we need to apply more than one when
> +        * creating the object. See struct i915_user_extension.
> +        *
> +        * If we don't supply any extensions then we get the same old gem_create
> +        * behaviour.
> +        *
> +        * For I915_GEM_CREATE_EXT_MEMORY_REGIONS usage see
> +        * drm_i915_gem_create_ext_memory_regions
> +        */
> +#define I915_GEM_CREATE_EXT_MEMORY_REGIONS 0
> +       __u64 extensions;
> +};
> +
> +/**
> + * struct drm_i915_gem_create_ext_memory_regions
> + *
> + * I915_GEM_CREATE_EXT_MEMORY_REGIONS extension:
> + *
> + * Set the object with the desired set of placements/regions in priority
> + * order (each entry must be unique and supported by the device), as an array of
> + * drm_i915_gem_memory_class_instance, or an equivalent layout of class:instance
> + * pair encodings. See DRM_I915_QUERY_MEMORY_REGIONS for how to query the
> + * supported regions for a device.
> + *
> + * As an example, on discrete devices, if we wish to set the placement as
> + * device local-memory we can do something like:
> + *
> + * .. code-block:: C
> + *
> + *     struct drm_i915_gem_memory_class_instance region_lmem = {
> + *              .memory_class = I915_MEMORY_CLASS_DEVICE,
> + *              .memory_instance = 0,
> + *      };
> + *      struct drm_i915_gem_create_ext_memory_regions regions = {
> + *              .base = { .name = I915_GEM_CREATE_EXT_MEMORY_REGIONS },
> + *              .regions = (uintptr_t)&region_lmem,
> + *              .num_regions = 1,
> + *      };
> + *      struct drm_i915_gem_create_ext create_ext = {
> + *              .size = 16 * PAGE_SIZE,
> + *              .extensions = (uintptr_t)&regions,
> + *      };
> + *
> + *      int err = ioctl(fd, DRM_IOCTL_I915_GEM_CREATE_EXT, &create_ext);
> + *      if (err) ...
> + *
> + * At which point we get the object handle in create_ext.handle, if all went
> + * well.
> + */
> +struct drm_i915_gem_create_ext_memory_regions {
> +       /** @base: Extension link. See struct i915_user_extension. */
> +       struct i915_user_extension base;
> +
> +       /** @pad: MBZ */
> +       __u32 pad;
> +       /** @num_regions: Number of elements in the placements array. */
> +       __u32 num_regions;
> +       /**
> +        * @regions: The placements array.
> +        *
> +        * Should be an array of drm_i915_gem_memory_class_instance.
> +        */
> +       __u64 regions;
> +};
> diff --git a/Documentation/gpu/rfc/i915_gem_lmem.rst b/Documentation/gpu/rfc/i915_gem_lmem.rst
> new file mode 100644
> index 000000000000..52f1db15ae94
> --- /dev/null
> +++ b/Documentation/gpu/rfc/i915_gem_lmem.rst
> @@ -0,0 +1,139 @@
> +=========================
> +I915 DG1/LMEM RFC Section
> +=========================
> +
> +Upstream plan
> +=============
> +For upstream the overall plan for landing all the DG1 stuff and turning it on
> +for real, with all the uAPI bits, is:
> +
> +* Merge basic HW enabling of DG1 (still without pciid)
> +* Merge the uAPI bits behind a special CONFIG_BROKEN (or so) flag
> +        * At this point we can still make changes, but importantly this lets us
> +          start running IGTs which can utilize local-memory in CI
> +* Convert over to TTM, make sure it all keeps working
> +* Add pciid for DG1 and turn on uAPI for real
> +
> +New object placement and region query uAPI
> +==========================================
> +Starting from DG1 we need to give userspace the ability to allocate buffers from
> +device local-memory. Currently the driver supports gem_create, which can place
> +buffers in system memory via shmem, and the usual assortment of other
> +interfaces, like dumb buffers and userptr.
> +
> +To support this new capability, while also providing a uAPI which will work
> +beyond just DG1, we propose to offer three new bits of uAPI:
> +
> +Query uAPI
> +----------
> +Existing query interface
> +^^^^^^^^^^^^^^^^^^^^^^^^
> +.. kernel-doc:: include/uapi/drm/i915_drm.h
> +        :functions: drm_i915_query_item drm_i915_query
> +
> +DRM_I915_QUERY_MEMORY_REGIONS
> +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> +New query ID which allows userspace to discover the list of supported memory
> +regions (like system-memory and local-memory) for a given device. We identify
> +each region with a class and instance pair, which should be unique. The class
> +here would be DEVICE or SYSTEM, and the instance would be zero, on platforms
> +like DG1.
> +
> +Side note: The class/instance design is borrowed from our existing engine uAPI,
> +where we describe every physical engine in terms of its class, and the
> +particular instance, since we can have more than one per class.
> +
> +In the future we also want to expose more information which can further
> +describe the capabilities of a region.
> +
> +.. kernel-doc:: Documentation/gpu/rfc/i915_gem_lmem.h
> +        :functions: drm_i915_gem_memory_class drm_i915_gem_memory_class_instance drm_i915_memory_region_info drm_i915_query_memory_regions
> +
> +GEM_CREATE_EXT
> +--------------
> +New ioctl which is basically just gem_create but now allows userspace to
> +provide a chain of possible extensions. Note that if we don't provide any
> +extensions then we get the exact same behaviour as gem_create.
> +
> +Side note: We also need to support PXP[1] in the near future, which is also
> +applicable to integrated platforms, and adds its own gem_create_ext extension,
> +which basically lets userspace mark a buffer as "protected".

A bit off-topic, but do we really need a whole extension for that?  Or
can we just throw a bit in flags?  I'm a big fan of landing create_ext
anyway; I like extensibility.  I'm just questioning whether or not
that one needs its own struct.
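
I.e. something like (flag name hypothetical):

    #define I915_GEM_CREATE_FLAG_PROTECTED (1 << 0) /* hypothetical */

    struct drm_i915_gem_create_ext create_ext = {
            .size = ...,
            .flags = I915_GEM_CREATE_FLAG_PROTECTED,
    };

instead of a whole extension struct just for one bit.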

--Jason


> +.. kernel-doc:: Documentation/gpu/rfc/i915_gem_lmem.h
> +        :functions: drm_i915_gem_create_ext
> +
> +It's raining extensions
> +^^^^^^^^^^^^^^^^^^^^^^^
> +As noted above, extensions can be supplied as a chain in gem_create_ext using the
> +existing i915_user_extension. This will be useful in the future when we need to
> +support several different extensions, and we need to apply more than one when
> +creating the object.
> +
> +.. kernel-doc:: include/uapi/drm/i915_drm.h
> +        :functions: i915_user_extension
> +
> +I915_GEM_CREATE_EXT_MEMORY_REGIONS
> +----------------------------------
> +Implemented as an extension for gem_create_ext, we would now allow userspace to
> +optionally provide an immutable list of preferred placements at creation time,
> +in priority order, for a given buffer object.  For the placements we expect
> +them each to use the class/instance encoding, as per the output of the regions
> +query. Having the list in priority order will be useful in the future when
> +placing an object, say during eviction.
> +
> +.. kernel-doc:: Documentation/gpu/rfc/i915_gem_lmem.h
> +        :functions: drm_i915_gem_create_ext_memory_regions
> +
> +One fair criticism here is that this seems a little over-engineered[2]. If we
> +just consider DG1 then yes, a simple gem_create.flags or something is totally
> +all that's needed to tell the kernel to allocate the buffer in local-memory or
> +whatever. However looking to the future we need uAPI which can also support
> +upcoming Xe HP multi-tile architecture in a sane way, where there can be
> +multiple local-memory instances for a given device, and so using both class and
> +instance in our uAPI to describe regions is desirable, although specifically
> +for DG1 it's uninteresting, since we only have a single local-memory instance.
> +
> +Existing uAPI issues
> +====================
> +Some potential issues we still need to resolve.
> +
> +I915 MMAP
> +---------
> +In i915 there are multiple ways to mmap a GEM object, including mapping the
> +same object using different mapping types (WC vs WB), i.e. multiple active
> +mmaps per object. TTM expects at most one mmap for the lifetime of the object. If it
> +turns out that we have to backpedal here, there might be some potential
> +userspace fallout.
> +
> +I915 SET/GET_CACHING
> +--------------------
> +In i915 we have the set/get_caching ioctls. TTM doesn't let us change this, but
> +DG1 doesn't support non-snooped pcie transactions, so we can just always
> +allocate as WB for smem-only buffers.  If/when our hw gains support for
> +non-snooped pcie transactions then we must fix this mode at allocation time as
> +a new GEM extension.
> +
> +This is related to the mmap problem, because in general (meaning, when we're
> +not running on intel cpus) the cpu mmap must not, ever, be inconsistent with
> +allocation mode.
> +
> +One possible idea is to let the kernel pick the mmap mode for userspace from the
> +following table:
> +
> +smem-only: WB. Userspace does not need to call clflush.
> +
> +smem+lmem: We allocate uncached memory, and give userspace a WC mapping
> +both when the buffer is in smem and when it's in lmem. GPU does snooped
> +access, which is a bit inefficient.
> +
> +lmem only: always WC
> +
> +This means on discrete you only get a single mmap mode, all others must be
> +rejected. That's probably going to be a new default mode or something like
> +that.
> +
> +Links
> +=====
> +[1] https://patchwork.freedesktop.org/series/86798/
> +
> +[2] https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5599#note_553791
> diff --git a/Documentation/gpu/rfc/index.rst b/Documentation/gpu/rfc/index.rst
> index a8621f7dab8b..05670442ca1b 100644
> --- a/Documentation/gpu/rfc/index.rst
> +++ b/Documentation/gpu/rfc/index.rst
> @@ -15,3 +15,7 @@ host such documentation:
>
>  * Once the code has landed move all the documentation to the right places in
>    the main core, helper or driver sections.
> +
> +.. toctree::
> +
> +    i915_gem_lmem.rst
> --
> 2.26.3
>
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 4/4] drm/doc/rfc: i915 DG1 uAPI
  2021-04-16 16:38   ` Jason Ekstrand
@ 2021-04-16 17:02     ` Daniel Vetter
  2021-04-16 17:33       ` Daniele Ceraolo Spurio
  2021-04-19 12:02     ` Matthew Auld
  1 sibling, 1 reply; 27+ messages in thread
From: Daniel Vetter @ 2021-04-16 17:02 UTC (permalink / raw)
  To: Jason Ekstrand, Ceraolo Spurio, Daniele, Lionel Landwerlin
  Cc: Jordan Justen, Intel GFX, Maling list - DRI developers,
	Kenneth Graunke, Matthew Auld, Daniel Vetter, ML mesa-dev

On Fri, Apr 16, 2021 at 6:38 PM Jason Ekstrand <jason@jlekstrand.net> wrote:
>
> On Thu, Apr 15, 2021 at 11:04 AM Matthew Auld <matthew.auld@intel.com> wrote:
> >
> > Add an entry for the new uAPI needed for DG1.
> >
> > v2(Daniel):
> >   - include the overall upstreaming plan
> >   - add a note for mmap, there are differences here for TTM vs i915
> >   - bunch of other suggestions from Daniel
> > v3:
> >  (Daniel)
> >   - add a note for set/get caching stuff
> >   - add some more docs for existing query and extensions stuff
> >   - add an actual code example for regions query
> >   - bunch of other stuff
> >  (Jason)
> >   - uAPI change(!):
> >         - try a simpler design with the placements extension
> >         - rather than have a generic setparam which can cover multiple
> >           use cases, have each extension be responsible for one thing
> >           only
> >
> > Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > Cc: Jordan Justen <jordan.l.justen@intel.com>
> > Cc: Daniel Vetter <daniel.vetter@intel.com>
> > Cc: Kenneth Graunke <kenneth@whitecape.org>
> > Cc: Jason Ekstrand <jason@jlekstrand.net>
> > Cc: Dave Airlie <airlied@gmail.com>
> > Cc: dri-devel@lists.freedesktop.org
> > Cc: mesa-dev@lists.freedesktop.org
> > ---
> >  Documentation/gpu/rfc/i915_gem_lmem.h   | 255 ++++++++++++++++++++++++
> >  Documentation/gpu/rfc/i915_gem_lmem.rst | 139 +++++++++++++
> >  Documentation/gpu/rfc/index.rst         |   4 +
> >  3 files changed, 398 insertions(+)
> >  create mode 100644 Documentation/gpu/rfc/i915_gem_lmem.h
> >  create mode 100644 Documentation/gpu/rfc/i915_gem_lmem.rst
> >
> > diff --git a/Documentation/gpu/rfc/i915_gem_lmem.h b/Documentation/gpu/rfc/i915_gem_lmem.h
> > new file mode 100644
> > index 000000000000..2a82a452e9f2
> > --- /dev/null
> > +++ b/Documentation/gpu/rfc/i915_gem_lmem.h
> > @@ -0,0 +1,255 @@
> > +/*
> > + * Note that drm_i915_query_item and drm_i915_query are existing bits of uAPI.
> > + * For the regions query we are just adding a new query id, so no actual new
> > + * ioctl or anything, but including it here for reference.
> > + */
> > +struct drm_i915_query_item {
> > +#define DRM_I915_QUERY_MEMORY_REGIONS   0xdeadbeaf
> > +       ....
> > +        __u64 query_id;
> > +
> > +        /*
> > +         * When set to zero by userspace, this is filled with the size of the
> > +         * data to be written at the data_ptr pointer. The kernel sets this
> > +         * value to a negative value to signal an error on a particular query
> > +         * item.
> > +         */
> > +        __s32 length;
> > +
> > +        __u32 flags;
> > +        /*
> > +         * Data will be written at the location pointed by data_ptr when the
> > +         * value of length matches the length of the data to be written by the
> > +         * kernel.
> > +         */
> > +        __u64 data_ptr;
> > +};
> > +
> > +struct drm_i915_query {
> > +        __u32 num_items;
> > +        /*
> > +         * Unused for now. Must be cleared to zero.
> > +         */
> > +        __u32 flags;
> > +        /*
> > +         * This points to an array of num_items drm_i915_query_item structures.
> > +         */
> > +        __u64 items_ptr;
> > +};
> > +
> > +#define DRM_IOCTL_I915_QUERY   DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_QUERY, struct drm_i915_query)
> > +
> > +/**
> > + * enum drm_i915_gem_memory_class
> > + */
> > +enum drm_i915_gem_memory_class {
> > +       /** @I915_MEMORY_CLASS_SYSTEM: system memory */
> > +       I915_MEMORY_CLASS_SYSTEM = 0,
> > +       /** @I915_MEMORY_CLASS_DEVICE: device local-memory */
> > +       I915_MEMORY_CLASS_DEVICE,
> > +};
> > +
> > +/**
> > + * struct drm_i915_gem_memory_class_instance
> > + */
> > +struct drm_i915_gem_memory_class_instance {
> > +       /** @memory_class: see enum drm_i915_gem_memory_class */
> > +       __u16 memory_class;
> > +
> > +       /** @memory_instance: which instance */
> > +       __u16 memory_instance;
> > +};
> > +
> > +/**
> > + * struct drm_i915_memory_region_info
> > + *
> > + * Describes one region as known to the driver.
> > + *
> > + * Note that we reserve quite a lot of stuff here for potential future work. As
> > + * an example we might want to expose the capabilities (see @caps) for a given
> > + * region, which could include things like whether the region is CPU
> > + * mappable/accessible, etc.
>
> I get caps but I'm seriously at a loss as to what the rest of this
> would be used for.  Why are caps and flags both there and separate?
> Flags are typically something you set, not query.  Also, what's with
> rsvd1 at the end?  This smells of substantial over-building to me.
>
> I thought to myself, "maybe I'm missing a future use-case" so I looked
> at the internal tree and none of this is being used there either.
> This indicates to me that either I'm missing something and there's
> code somewhere I don't know about or, with three years of building on
> internal branches, we still haven't proven that any of this is needed.
> If it's the latter, which I strongly suspect, maybe we should drop the
> unnecessary bits and only add them back in if and when we have proof
> that they're useful.
>
> To be clear, I don't mind the query API as such and the class/instance
> stuff seems fine and I really like being able to get the sizes
> directly.  What concerns me is all this extra future-proofing that we
> have zero proof is actually useful.  In my experience, when you build
> out like this without so much as a use-case, you always end up
> building the wrong thing.
>
> > + */
> > +struct drm_i915_memory_region_info {
> > +       /** @region: class:instance pair encoding */
> > +       struct drm_i915_gem_memory_class_instance region;
> > +
> > +       /** @rsvd0: MBZ */
> > +       __u32 rsvd0;
> > +
> > +       /** @caps: MBZ */
> > +       __u64 caps;
> > +
> > +       /** @flags: MBZ */
> > +       __u64 flags;
> > +
> > +       /** @probed_size: Memory probed by the driver (-1 = unknown) */
> > +       __u64 probed_size;
> > +
> > +       /** @unallocated_size: Estimate of memory remaining (-1 = unknown) */
> > +       __u64 unallocated_size;
> > +
> > +       /** @rsvd1: MBZ */
> > +       __u64 rsvd1[8];
> > +};
> > +
> > +/**
> > + * struct drm_i915_query_memory_regions
> > + *
> > + * Region info query enumerates all regions known to the driver by filling in
> > + * an array of struct drm_i915_memory_region_info structures.
> > + *
> > + * Example for getting the list of supported regions:
> > + *
> > + * .. code-block:: C
> > + *
> > + *     struct drm_i915_query_memory_regions *info;
> > + *     struct drm_i915_query_item item = {
> > + *             .query_id = DRM_I915_QUERY_MEMORY_REGIONS,
> > + *     };
> > + *     struct drm_i915_query query = {
> > + *             .num_items = 1,
> > + *             .items_ptr = (uintptr_t)&item,
> > + *     };
> > + *     int err, i;
> > + *
> > + *     // First query the size of the blob we need, this needs to be large
> > + *     // enough to hold our array of regions. The kernel will fill out the
> > + *     // item.length for us, which is the number of bytes we need.
> > + *     err = ioctl(fd, DRM_IOCTL_I915_QUERY, &query);
> > + *     if (err) ...
> > + *
> > + *     info = calloc(1, item.length);
> > + *     // Now that we allocated the required number of bytes, we call the ioctl
> > + *     // again, this time with the data_ptr pointing to our newly allocated
> > + *     // blob, which the kernel can then populate with all the region info.
> > + *     item.data_ptr = (uintptr_t)info;
> > + *
> > + *     err = ioctl(fd, DRM_IOCTL_I915_QUERY, &query);
> > + *     if (err) ...
> > + *
> > + *     // We can now access each region in the array
> > + *     for (i = 0; i < info->num_regions; i++) {
> > + *             struct drm_i915_memory_region_info mr = info->regions[i];
> > + *             __u16 class = mr.region.memory_class;
> > + *             __u16 instance = mr.region.memory_instance;
> > + *
> > + *             ....
> > + *     }
> > + *
> > + *     free(info);
> > + */
> > +struct drm_i915_query_memory_regions {
> > +       /** @num_regions: Number of supported regions */
> > +       __u32 num_regions;
> > +
> > +       /** @rsvd: MBZ */
> > +       __u32 rsvd[3];
>
> Why pad to 16B instead of 8B?
>
> > +
> > +       /** @regions: Info about each supported region */
> > +       struct drm_i915_memory_region_info regions[];
> > +};
> > +
> > +#define DRM_I915_GEM_CREATE_EXT                0xdeadbeaf
> > +#define DRM_IOCTL_I915_GEM_CREATE_EXT  DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_CREATE_EXT, struct drm_i915_gem_create_ext)
> > +
> > +/**
> > + * struct drm_i915_gem_create_ext
> > + *
> > + * Existing gem_create behaviour, with added extension support.
> > + *
> > + * Note that in the future we want to have our buffer flags here, at least for
> > + * the stuff that is immutable. Previously we would have two ioctls, one to
> > + * create the object with gem_create, and another to apply various parameters,
> > + * however this creates some ambiguity for the params which are considered
> > + * immutable. Also in general we're phasing out the various SET/GET ioctls.
> > + */
> > +struct drm_i915_gem_create_ext {
> > +       /**
> > +        * @size: Requested size for the object.
> > +        *
> > +        * The (page-aligned) allocated size for the object will be returned.
> > +        *
> > +        * Note that for some devices we might have further minimum
> > +        * page-size restrictions (larger than 4K), like for device local-memory.
> > +        * However in general the final size here should always reflect any
> > +        * rounding up, if for example using the I915_GEM_CREATE_EXT_MEMORY_REGIONS
> > +        * extension to place the object in device local-memory.
> > +        */
> > +       __u64 size;
> > +       /**
> > +        * @handle: Returned handle for the object.
> > +        *
> > +        * Object handles are nonzero.
> > +        */
> > +       __u32 handle;
> > +       /** @flags: MBZ */
> > +       __u32 flags;
> > +       /**
> > +        * @extensions: The chain of extensions to apply to this object.
> > +        *
> > +        * This will be useful in the future when we need to support several
> > +        * different extensions, and we need to apply more than one when
> > +        * creating the object. See struct i915_user_extension.
> > +        *
> > +        * If we don't supply any extensions then we get the same old gem_create
> > +        * behaviour.
> > +        *
> > +        * For I915_GEM_CREATE_EXT_MEMORY_REGIONS usage see
> > +        * drm_i915_gem_create_ext_memory_regions.
> > +        */
> > +#define I915_GEM_CREATE_EXT_MEMORY_REGIONS 0
> > +       __u64 extensions;
> > +};
> > +
> > +/**
> > + * struct drm_i915_gem_create_ext_memory_regions
> > + *
> > + * I915_GEM_CREATE_EXT_MEMORY_REGIONS extension:
> > + *
> > + * Set the object with the desired set of placements/regions in priority
> > + * order (each entry must be unique and supported by the device), as an array of
> > + * drm_i915_gem_memory_class_instance, or an equivalent layout of class:instance
> > + * pair encodings. See DRM_I915_QUERY_MEMORY_REGIONS for how to query the
> > + * supported regions for a device.
> > + *
> > + * As an example, on discrete devices, if we wish to set the placement as
> > + * device local-memory we can do something like:
> > + *
> > + * .. code-block:: C
> > + *
> > + *     struct drm_i915_gem_memory_class_instance region_lmem = {
> > + *              .memory_class = I915_MEMORY_CLASS_DEVICE,
> > + *              .memory_instance = 0,
> > + *      };
> > + *      struct drm_i915_gem_create_ext_memory_regions regions = {
> > + *              .base = { .name = I915_GEM_CREATE_EXT_MEMORY_REGIONS },
> > + *              .regions = (uintptr_t)&region_lmem,
> > + *              .num_regions = 1,
> > + *      };
> > + *      struct drm_i915_gem_create_ext create_ext = {
> > + *              .size = 16 * PAGE_SIZE,
> > + *              .extensions = (uintptr_t)&regions,
> > + *      };
> > + *
> > + *      int err = ioctl(fd, DRM_IOCTL_I915_GEM_CREATE_EXT, &create_ext);
> > + *      if (err) ...
> > + *
> > + * At which point we get the object handle in create_ext.handle, if all went
> > + * well.
> > + */
> > +struct drm_i915_gem_create_ext_memory_regions {
> > +       /** @base: Extension link. See struct i915_user_extension. */
> > +       struct i915_user_extension base;
> > +
> > +       /** @pad: MBZ */
> > +       __u32 pad;
> > +       /** @num_regions: Number of elements in the placements array. */
> > +       __u32 num_regions;
> > +       /**
> > +        * @regions: The placements array.
> > +        *
> > +        * Should be an array of drm_i915_gem_memory_class_instance.
> > +        */
> > +       __u64 regions;
> > +};
> > diff --git a/Documentation/gpu/rfc/i915_gem_lmem.rst b/Documentation/gpu/rfc/i915_gem_lmem.rst
> > new file mode 100644
> > index 000000000000..52f1db15ae94
> > --- /dev/null
> > +++ b/Documentation/gpu/rfc/i915_gem_lmem.rst
> > @@ -0,0 +1,139 @@
> > +=========================
> > +I915 DG1/LMEM RFC Section
> > +=========================
> > +
> > +Upstream plan
> > +=============
> > +For upstream the overall plan for landing all the DG1 stuff and turning it on
> > +for real, with all the uAPI bits, is:
> > +
> > +* Merge basic HW enabling of DG1 (still without pciid)
> > +* Merge the uAPI bits behind a special CONFIG_BROKEN (or so) flag
> > +        * At this point we can still make changes, but importantly this lets us
> > +          start running IGTs which can utilize local-memory in CI
> > +* Convert over to TTM, make sure it all keeps working
> > +* Add pciid for DG1 and turn on uAPI for real
> > +
> > +New object placement and region query uAPI
> > +==========================================
> > +Starting from DG1 we need to give userspace the ability to allocate buffers from
> > +device local-memory. Currently the driver supports gem_create, which can place
> > +buffers in system memory via shmem, and the usual assortment of other
> > +interfaces, like dumb buffers and userptr.
> > +
> > +To support this new capability, while also providing a uAPI which will work
> > +beyond just DG1, we propose to offer three new bits of uAPI:
> > +
> > +Query uAPI
> > +----------
> > +Existing query interface
> > +^^^^^^^^^^^^^^^^^^^^^^^^
> > +.. kernel-doc:: include/uapi/drm/i915_drm.h
> > +        :functions: drm_i915_query_item drm_i915_query
> > +
> > +DRM_I915_QUERY_MEMORY_REGIONS
> > +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > +New query ID which allows userspace to discover the list of supported memory
> > +regions (like system-memory and local-memory) for a given device. We identify
> > +each region with a class and instance pair, which should be unique. The class
> > +here would be DEVICE or SYSTEM, and the instance would be zero, on platforms
> > +like DG1.
> > +
> > +Side note: The class/instance design is borrowed from our existing engine uAPI,
> > +where we describe every physical engine in terms of its class, and the
> > +particular instance, since we can have more than one per class.
> > +
> > +In the future we also want to expose more information which can further
> > +describe the capabilities of a region.
> > +
> > +.. kernel-doc:: Documentation/gpu/rfc/i915_gem_lmem.h
> > +        :functions: drm_i915_gem_memory_class drm_i915_gem_memory_class_instance drm_i915_memory_region_info drm_i915_query_memory_regions
> > +
> > +GEM_CREATE_EXT
> > +--------------
> > +New ioctl which is basically just gem_create but now allows userspace to
> > +provide a chain of possible extensions. Note that if we don't provide any
> > +extensions then we get the exact same behaviour as gem_create.
> > +
> > +Side note: We also need to support PXP[1] in the near future, which is also
> > +applicable to integrated platforms, and adds its own gem_create_ext extension,
> > +which basically lets userspace mark a buffer as "protected".
>
> A bit off-topic, but do we really need a whole extension for that?  Or
> can we just throw a bit in flags?  I'm a big fan of landing create_ext
> anyway; I like extensibility.  I'm just questioning whether or not
> that one needs its own struct.

Yeah for PXP just a flag is all we need. If we can't/don't want to
stuff that into the main structure then we could do an empty
extension: its presence would indicate that we want PXP enabled.
Definitely we shouldn't build the entire setparam machinery for gem bo
copied over from ctx_create_ext while Jason is working on around 100
or so igt+kernel patches to undo the unnecessary uapi complexity this
caused over there in ctx land.
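
To make that concrete, a minimal sketch of such an empty extension (the
struct name here is hypothetical, nothing like this has been posted yet):

    /*
     * Hypothetical: chaining this into gem_create_ext.extensions would by
     * itself mark the object as protected; no payload is needed.
     */
    struct drm_i915_gem_create_ext_protected_content {
            /* Extension link. See struct i915_user_extension. */
            struct i915_user_extension base;
    };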

Also adding Daniele and Lionel to keep them in the loop here on what's
happening to gem_create_ext. Daniele, I think for merging we need to a)
resubmit the pxp patches to dri-devel per the new process that we've had
for 2-3 weeks now, and then b) land the uapi bits only after this
patch here from Matt has landed, to make sure we don't end up with the
wrong version of gem_create_ext. Is that ok?
-Daniel

>
> --Jason
>
>
> > +.. kernel-doc:: Documentation/gpu/rfc/i915_gem_lmem.h
> > +        :functions: drm_i915_gem_create_ext
> > +
> > +It's raining extensions
> > +^^^^^^^^^^^^^^^^^^^^^^^
> > +As noted above, extensions can be supplied as a chain in gem_create_ext using the
> > +existing i915_user_extension. This will be useful in the future when we need to
> > +support several different extensions, and we need to apply more than one when
> > +creating the object.
> > +
> > +.. kernel-doc:: include/uapi/drm/i915_drm.h
> > +        :functions: i915_user_extension
> > +
> > +I915_GEM_CREATE_EXT_MEMORY_REGIONS
> > +----------------------------------
> > +Implemented as an extension for gem_create_ext, we would now allow userspace to
> > +optionally provide an immutable list of preferred placements at creation time,
> > +in priority order, for a given buffer object.  For the placements we expect
> > +them each to use the class/instance encoding, as per the output of the regions
> > +query. Having the list in priority order will be useful in the future when
> > +placing an object, say during eviction.
> > +
> > +.. kernel-doc:: Documentation/gpu/rfc/i915_gem_lmem.h
> > +        :functions: drm_i915_gem_create_ext_memory_regions
> > +
> > +One fair criticism here is that this seems a little over-engineered[2]. If we
> > +just consider DG1 then yes, a simple gem_create.flags or something is all
> > +that's needed to tell the kernel to allocate the buffer in local-memory or
> > +whatever. However looking to the future we need uAPI which can also support
> > +upcoming Xe HP multi-tile architecture in a sane way, where there can be
> > +multiple local-memory instances for a given device, and so using both class and
> > +instance in our uAPI to describe regions is desirable, although specifically
> > +for DG1 it's uninteresting, since we only have a single local-memory instance.
> > +
> > +Existing uAPI issues
> > +====================
> > +Some potential issues we still need to resolve.
> > +
> > +I915 MMAP
> > +---------
> > +In i915 there are multiple ways to mmap a GEM object, including mapping the same
> > +object using different mapping types (WC vs WB), i.e. multiple active mmaps per
> > +object. TTM expects one mmap at most for the lifetime of the object. If it
> > +turns out that we have to backpedal here, there might be some potential
> > +userspace fallout.
> > +
> > +I915 SET/GET_CACHING
> > +--------------------
> > +In i915 we have the set/get_caching ioctls. TTM doesn't let us change this, but
> > +DG1 doesn't support non-snooped pcie transactions, so we can just always
> > +allocate as WB for smem-only buffers.  If/when our hw gains support for
> > +non-snooped pcie transactions then we must fix this mode at allocation time as
> > +a new GEM extension.
> > +
> > +This is related to the mmap problem, because in general (meaning, when we're
> > +not running on intel cpus) the cpu mmap must not, ever, be inconsistent with
> > +allocation mode.
> > +
> > +One possible idea is to let the kernel pick the mmap mode for userspace from
> > +the following table:
> > +
> > +smem-only: WB. Userspace does not need to call clflush.
> > +
> > +smem+lmem: We allocate uncached memory, and give userspace a WC mapping,
> > +both when the buffer is in smem and when it's in lmem. GPU does snooped
> > +access, which is a bit inefficient.
> > +
> > +lmem only: always WC
> > +
> > +This means on discrete you only get a single mmap mode, all others must be
> > +rejected. That's probably going to be a new default mode or something like
> > +that.
> > +
> > +Links
> > +=====
> > +[1] https://patchwork.freedesktop.org/series/86798/
> > +
> > +[2] https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5599#note_553791
> > diff --git a/Documentation/gpu/rfc/index.rst b/Documentation/gpu/rfc/index.rst
> > index a8621f7dab8b..05670442ca1b 100644
> > --- a/Documentation/gpu/rfc/index.rst
> > +++ b/Documentation/gpu/rfc/index.rst
> > @@ -15,3 +15,7 @@ host such documentation:
> >
> >  * Once the code has landed move all the documentation to the right places in
> >    the main core, helper or driver sections.
> > +
> > +.. toctree::
> > +
> > +    i915_gem_lmem.rst
> > --
> > 2.26.3
> >



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 4/4] drm/doc/rfc: i915 DG1 uAPI
  2021-04-16 17:02     ` Daniel Vetter
@ 2021-04-16 17:33       ` Daniele Ceraolo Spurio
  0 siblings, 0 replies; 27+ messages in thread
From: Daniele Ceraolo Spurio @ 2021-04-16 17:33 UTC (permalink / raw)
  To: Daniel Vetter, Jason Ekstrand, Lionel Landwerlin
  Cc: Jordan Justen, Intel GFX, Maling list - DRI developers,
	Kenneth Graunke, Matthew Auld, Daniel Vetter, ML mesa-dev



On 4/16/2021 10:02 AM, Daniel Vetter wrote:
> On Fri, Apr 16, 2021 at 6:38 PM Jason Ekstrand <jason@jlekstrand.net> wrote:
>> On Thu, Apr 15, 2021 at 11:04 AM Matthew Auld <matthew.auld@intel.com> wrote:
>>> Add an entry for the new uAPI needed for DG1.
>>>
>>> v2(Daniel):
>>>    - include the overall upstreaming plan
>>>    - add a note for mmap, there are differences here for TTM vs i915
>>>    - bunch of other suggestions from Daniel
>>> v3:
>>>   (Daniel)
>>>    - add a note for set/get caching stuff
>>>    - add some more docs for existing query and extensions stuff
>>>    - add an actual code example for regions query
>>>    - bunch of other stuff
>>>   (Jason)
>>>    - uAPI change(!):
>>>          - try a simpler design with the placements extension
>>>          - rather than have a generic setparam which can cover multiple
>>>            use cases, have each extension be responsible for one thing
>>>            only
>>>
>>> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
>>> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
>>> Cc: Jordan Justen <jordan.l.justen@intel.com>
>>> Cc: Daniel Vetter <daniel.vetter@intel.com>
>>> Cc: Kenneth Graunke <kenneth@whitecape.org>
>>> Cc: Jason Ekstrand <jason@jlekstrand.net>
>>> Cc: Dave Airlie <airlied@gmail.com>
>>> Cc: dri-devel@lists.freedesktop.org
>>> Cc: mesa-dev@lists.freedesktop.org
>>> ---
>>>   Documentation/gpu/rfc/i915_gem_lmem.h   | 255 ++++++++++++++++++++++++
>>>   Documentation/gpu/rfc/i915_gem_lmem.rst | 139 +++++++++++++
>>>   Documentation/gpu/rfc/index.rst         |   4 +
>>>   3 files changed, 398 insertions(+)
>>>   create mode 100644 Documentation/gpu/rfc/i915_gem_lmem.h
>>>   create mode 100644 Documentation/gpu/rfc/i915_gem_lmem.rst
>>>
>>> diff --git a/Documentation/gpu/rfc/i915_gem_lmem.h b/Documentation/gpu/rfc/i915_gem_lmem.h
>>> new file mode 100644
>>> index 000000000000..2a82a452e9f2
>>> --- /dev/null
>>> +++ b/Documentation/gpu/rfc/i915_gem_lmem.h
>>> @@ -0,0 +1,255 @@
>>> +/*
>>> + * Note that drm_i915_query_item and drm_i915_query are existing bits of uAPI.
>>> + * For the regions query we are just adding a new query id, so no actual new
>>> + * ioctl or anything, but including it here for reference.
>>> + */
>>> +struct drm_i915_query_item {
>>> +#define DRM_I915_QUERY_MEMORY_REGIONS   0xdeadbeaf
>>> +       ....
>>> +        __u64 query_id;
>>> +
>>> +        /*
>>> +         * When set to zero by userspace, this is filled with the size of the
>>> +         * data to be written at the data_ptr pointer. The kernel sets this
>>> +         * value to a negative value to signal an error on a particular query
>>> +         * item.
>>> +         */
>>> +        __s32 length;
>>> +
>>> +        __u32 flags;
>>> +        /*
>>> +         * Data will be written at the location pointed by data_ptr when the
>>> +         * value of length matches the length of the data to be written by the
>>> +         * kernel.
>>> +         */
>>> +        __u64 data_ptr;
>>> +};
>>> +
>>> +struct drm_i915_query {
>>> +        __u32 num_items;
>>> +        /*
>>> +         * Unused for now. Must be cleared to zero.
>>> +         */
>>> +        __u32 flags;
>>> +        /*
>>> +         * This points to an array of num_items drm_i915_query_item structures.
>>> +         */
>>> +        __u64 items_ptr;
>>> +};
>>> +
>>> +#define DRM_IOCTL_I915_QUERY   DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_QUERY, struct drm_i915_query)
>>> +
>>> +/**
>>> + * enum drm_i915_gem_memory_class
>>> + */
>>> +enum drm_i915_gem_memory_class {
>>> +       /** @I915_MEMORY_CLASS_SYSTEM: system memory */
>>> +       I915_MEMORY_CLASS_SYSTEM = 0,
>>> +       /** @I915_MEMORY_CLASS_DEVICE: device local-memory */
>>> +       I915_MEMORY_CLASS_DEVICE,
>>> +};
>>> +
>>> +/**
>>> + * struct drm_i915_gem_memory_class_instance
>>> + */
>>> +struct drm_i915_gem_memory_class_instance {
>>> +       /** @memory_class: see enum drm_i915_gem_memory_class */
>>> +       __u16 memory_class;
>>> +
>>> +       /** @memory_instance: which instance */
>>> +       __u16 memory_instance;
>>> +};
>>> +
>>> +/**
>>> + * struct drm_i915_memory_region_info
>>> + *
>>> + * Describes one region as known to the driver.
>>> + *
>>> + * Note that we reserve quite a lot of stuff here for potential future work. As
>>> + * an example we might want to expose the capabilities (see caps) for a given
>>> + * region, which could include things like whether the region is CPU
>>> + * mappable/accessible, etc.
>> I get caps but I'm seriously at a loss as to what the rest of this
>> would be used for.  Why are caps and flags both there and separate?
>> Flags are typically something you set, not query.  Also, what's with
>> rsvd1 at the end?  This smells of substantial over-building to me.
>>
>> I thought to myself, "maybe I'm missing a future use-case" so I looked
>> at the internal tree and none of this is being used there either.
>> This indicates to me that either I'm missing something and there's
>> code somewhere I don't know about or, with three years of building on
>> internal branches, we still haven't proven that any of this is needed.
>> If it's the latter, which I strongly suspect, maybe we should drop the
>> unnecessary bits and only add them back in if and when we have proof
>> that they're useful.
>>
>> To be clear, I don't mind the query API as such and the class/instance
>> stuff seems fine and I really like being able to get the sizes
>> directly.  What concerns me is all this extra future-proofing that we
>> have zero proof is actually useful.  In my experience, when you build
>> out like this without so much as a use-case, you always end up
>> building the wrong thing.
>>
>>> + */
>>> +struct drm_i915_memory_region_info {
>>> +       /** @region: class:instance pair encoding */
>>> +       struct drm_i915_gem_memory_class_instance region;
>>> +
>>> +       /** @rsvd0: MBZ */
>>> +       __u32 rsvd0;
>>> +
>>> +       /** @caps: MBZ */
>>> +       __u64 caps;
>>> +
>>> +       /** @flags: MBZ */
>>> +       __u64 flags;
>>> +
>>> +       /** @probed_size: Memory probed by the driver (-1 = unknown) */
>>> +       __u64 probed_size;
>>> +
>>> +       /** @unallocated_size: Estimate of memory remaining (-1 = unknown) */
>>> +       __u64 unallocated_size;
>>> +
>>> +       /** @rsvd1: MBZ */
>>> +       __u64 rsvd1[8];
>>> +};
>>> +
>>> +/**
>>> + * struct drm_i915_query_memory_regions
>>> + *
>>> + * Region info query enumerates all regions known to the driver by filling in
>>> + * an array of struct drm_i915_memory_region_info structures.
>>> + *
>>> + * Example for getting the list of supported regions:
>>> + *
>>> + * .. code-block:: C
>>> + *
>>> + *     struct drm_i915_query_memory_regions *info;
>>> + *     struct drm_i915_query_item item = {
>>> + *             .query_id = DRM_I915_QUERY_MEMORY_REGIONS,
>>> + *     };
>>> + *     struct drm_i915_query query = {
>>> + *             .num_items = 1,
>>> + *             .items_ptr = (uintptr_t)&item,
>>> + *     };
>>> + *     int err, i;
>>> + *
>>> + *     // First query the size of the blob we need, this needs to be large
>>> + *     // enough to hold our array of regions. The kernel will fill out the
>>> + *     // item.length for us, which is the number of bytes we need.
>>> + *     err = ioctl(fd, DRM_IOCTL_I915_QUERY, &query);
>>> + *     if (err) ...
>>> + *
>>> + *     info = calloc(1, item.length);
>>> + *     // Now that we allocated the required number of bytes, we call the ioctl
>>> + *     // again, this time with the data_ptr pointing to our newly allocated
>>> + *     // blob, which the kernel can then populate with all the region info.
>>> + *     item.data_ptr = (uintptr_t)info;
>>> + *
>>> + *     err = ioctl(fd, DRM_IOCTL_I915_QUERY, &query);
>>> + *     if (err) ...
>>> + *
>>> + *     // We can now access each region in the array
>>> + *     for (i = 0; i < info->num_regions; i++) {
>>> + *             struct drm_i915_memory_region_info mr = info->regions[i];
>>> + *             __u16 class = mr.region.memory_class;
>>> + *             __u16 instance = mr.region.memory_instance;
>>> + *
>>> + *             ....
>>> + *     }
>>> + *
>>> + *     free(info);
>>> + */
>>> +struct drm_i915_query_memory_regions {
>>> +       /** @num_regions: Number of supported regions */
>>> +       __u32 num_regions;
>>> +
>>> +       /** @rsvd: MBZ */
>>> +       __u32 rsvd[3];
>> Why pad to 16B instead of 8B?
>>
>>> +
>>> +       /** @regions: Info about each supported region */
>>> +       struct drm_i915_memory_region_info regions[];
>>> +};
>>> +
>>> +#define DRM_I915_GEM_CREATE_EXT                0xdeadbeaf
>>> +#define DRM_IOCTL_I915_GEM_CREATE_EXT  DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_CREATE_EXT, struct drm_i915_gem_create_ext)
>>> +
>>> +/**
>>> + * struct drm_i915_gem_create_ext
>>> + *
>>> + * Existing gem_create behaviour, with added extension support.
>>> + *
>>> + * Note that in the future we want to have our buffer flags here, at least for
>>> + * the stuff that is immutable. Previously we would have two ioctls, one to
>>> + * create the object with gem_create, and another to apply various parameters;
>>> + * however, this creates some ambiguity for the params which are considered
>>> + * immutable. Also in general we're phasing out the various SET/GET ioctls.
>>> + */
>>> +struct drm_i915_gem_create_ext {
>>> +       /**
>>> +        * @size: Requested size for the object.
>>> +        *
>>> +        * The (page-aligned) allocated size for the object will be returned.
>>> +        *
>>> +        * Note that for some devices we might have further minimum
>>> +        * page-size restrictions (larger than 4K), like for device local-memory.
>>> +        * However in general the final size here should always reflect any
>>> +        * rounding up, if for example using the I915_GEM_CREATE_EXT_MEMORY_REGIONS
>>> +        * extension to place the object in device local-memory.
>>> +        */
>>> +       __u64 size;
>>> +       /**
>>> +        * @handle: Returned handle for the object.
>>> +        *
>>> +        * Object handles are nonzero.
>>> +        */
>>> +       __u32 handle;
>>> +       /** @flags: MBZ */
>>> +       __u32 flags;
>>> +       /**
>>> +        * @extensions: The chain of extensions to apply to this object.
>>> +        *
>>> +        * This will be useful in the future when we need to support several
>>> +        * different extensions, and we need to apply more than one when
>>> +        * creating the object. See struct i915_user_extension.
>>> +        *
>>> +        * If we don't supply any extensions then we get the same old gem_create
>>> +        * behaviour.
>>> +        *
>>> +        * For I915_GEM_CREATE_EXT_MEMORY_REGIONS usage see
>>> +        * drm_i915_gem_create_ext_memory_regions.
>>> +        */
>>> +#define I915_GEM_CREATE_EXT_MEMORY_REGIONS 0
>>> +       __u64 extensions;
>>> +};
>>> +
>>> +/**
>>> + * struct drm_i915_gem_create_ext_memory_regions
>>> + *
>>> + * I915_GEM_CREATE_EXT_MEMORY_REGIONS extension:
>>> + *
>>> + * Set the object with the desired set of placements/regions in priority
>>> + * order (each entry must be unique and supported by the device), as an array of
>>> + * drm_i915_gem_memory_class_instance, or an equivalent layout of class:instance
>>> + * pair encodings. See DRM_I915_QUERY_MEMORY_REGIONS for how to query the
>>> + * supported regions for a device.
>>> + *
>>> + * As an example, on discrete devices, if we wish to set the placement as
>>> + * device local-memory we can do something like:
>>> + *
>>> + * .. code-block:: C
>>> + *
>>> + *     struct drm_i915_gem_memory_class_instance region_lmem = {
>>> + *              .memory_class = I915_MEMORY_CLASS_DEVICE,
>>> + *              .memory_instance = 0,
>>> + *      };
>>> + *      struct drm_i915_gem_create_ext_memory_regions regions = {
>>> + *              .base = { .name = I915_GEM_CREATE_EXT_MEMORY_REGIONS },
>>> + *              .regions = (uintptr_t)&region_lmem,
>>> + *              .num_regions = 1,
>>> + *      };
>>> + *      struct drm_i915_gem_create_ext create_ext = {
>>> + *              .size = 16 * PAGE_SIZE,
>>> + *              .extensions = (uintptr_t)&regions,
>>> + *      };
>>> + *
>>> + *      int err = ioctl(fd, DRM_IOCTL_I915_GEM_CREATE_EXT, &create_ext);
>>> + *      if (err) ...
>>> + *
>>> + * At which point we get the object handle in create_ext.handle, if all went
>>> + * well.
>>> + */
>>> +struct drm_i915_gem_create_ext_memory_regions {
>>> +       /** @base: Extension link. See struct i915_user_extension. */
>>> +       struct i915_user_extension base;
>>> +
>>> +       /** @pad: MBZ */
>>> +       __u32 pad;
>>> +       /** @num_regions: Number of elements in the placements array. */
>>> +       __u32 num_regions;
>>> +       /**
>>> +        * @regions: The placements array.
>>> +        *
>>> +        * Should be an array of drm_i915_gem_memory_class_instance.
>>> +        */
>>> +       __u64 regions;
>>> +};
>>> diff --git a/Documentation/gpu/rfc/i915_gem_lmem.rst b/Documentation/gpu/rfc/i915_gem_lmem.rst
>>> new file mode 100644
>>> index 000000000000..52f1db15ae94
>>> --- /dev/null
>>> +++ b/Documentation/gpu/rfc/i915_gem_lmem.rst
>>> @@ -0,0 +1,139 @@
>>> +=========================
>>> +I915 DG1/LMEM RFC Section
>>> +=========================
>>> +
>>> +Upstream plan
>>> +=============
>>> +For upstream the overall plan for landing all the DG1 stuff and turning it on
>>> +for real, with all the uAPI bits, is:
>>> +
>>> +* Merge basic HW enabling of DG1 (still without pciid)
>>> +* Merge the uAPI bits behind a special CONFIG_BROKEN (or so) flag
>>> +        * At this point we can still make changes, but importantly this lets us
>>> +          start running IGTs which can utilize local-memory in CI
>>> +* Convert over to TTM, make sure it all keeps working
>>> +* Add pciid for DG1 and turn on uAPI for real
>>> +
>>> +New object placement and region query uAPI
>>> +==========================================
>>> +Starting from DG1 we need to give userspace the ability to allocate buffers from
>>> +device local-memory. Currently the driver supports gem_create, which can place
>>> +buffers in system memory via shmem, and the usual assortment of other
>>> +interfaces, like dumb buffers and userptr.
>>> +
>>> +To support this new capability, while also providing a uAPI which will work
>>> +beyond just DG1, we propose to offer three new bits of uAPI:
>>> +
>>> +Query uAPI
>>> +----------
>>> +Existing query interface
>>> +^^^^^^^^^^^^^^^^^^^^^^^^
>>> +.. kernel-doc:: include/uapi/drm/i915_drm.h
>>> +        :functions: drm_i915_query_item drm_i915_query
>>> +
>>> +DRM_I915_QUERY_MEMORY_REGIONS
>>> +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>> +New query ID which allows userspace to discover the list of supported memory
>>> +regions (like system-memory and local-memory) for a given device. We identify
>>> +each region with a class and instance pair, which should be unique. The class
>>> +here would be DEVICE or SYSTEM, and the instance would be zero, on platforms
>>> +like DG1.
>>> +
>>> +Side note: The class/instance design is borrowed from our existing engine uAPI,
>>> +where we describe every physical engine in terms of its class, and the
>>> +particular instance, since we can have more than one per class.
>>> +
>>> +In the future we also want to expose more information which can further
>>> +describe the capabilities of a region.
>>> +
>>> +.. kernel-doc:: Documentation/gpu/rfc/i915_gem_lmem.h
>>> +        :functions: drm_i915_gem_memory_class drm_i915_gem_memory_class_instance drm_i915_memory_region_info drm_i915_query_memory_regions
>>> +
>>> +GEM_CREATE_EXT
>>> +--------------
>>> +New ioctl which is basically just gem_create but now allows userspace to
>>> +provide a chain of possible extensions. Note that if we don't provide any
>>> +extensions then we get the exact same behaviour as gem_create.
>>> +
>>> +Side note: We also need to support PXP[1] in the near future, which is also
>>> +applicable to integrated platforms, and adds its own gem_create_ext extension,
>>> +which basically lets userspace mark a buffer as "protected".
>> A bit off-topic, but do we really need a whole extension for that?  Or
>> can we just throw a bit in flags?  I'm a big fan of landing create_ext
>> anyway; I like extensibility.  I'm just questioning whether or not
>> that one needs its own struct.
> Yeah for PXP just a flag is all we need. If we can't/don't want to
> stuff that into the main structure then we could do an empty
> extension: its presence would indicate that we want PXP enabled.

There is actually a potential future use for the PXP extension as there 
are different PXP object use types; we currently plan on supporting only 
one, but I'd prefer to keep the extension option open for potential 
future expansion.
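
For illustration, since extensions chain via the next_extension field of
struct i915_user_extension, a PXP extension would compose with the
placements one along these lines; just a rough sketch, where the
PROTECTED_CONTENT names are hypothetical and region_lmem is borrowed from
the example in the patch:

    /* Hypothetical extension, its mere presence requests a PXP object. */
    struct drm_i915_gem_create_ext_protected_content protected_ext = {
            .base = { .name = I915_GEM_CREATE_EXT_PROTECTED_CONTENT },
    };
    struct drm_i915_gem_create_ext_memory_regions regions = {
            .base = {
                    .name = I915_GEM_CREATE_EXT_MEMORY_REGIONS,
                    /* Chain the next extension here. */
                    .next_extension = (uintptr_t)&protected_ext,
            },
            .regions = (uintptr_t)&region_lmem,
            .num_regions = 1,
    };
    struct drm_i915_gem_create_ext create_ext = {
            .size = 16 * PAGE_SIZE,
            /* The kernel walks the whole chain from here. */
            .extensions = (uintptr_t)&regions,
    };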

> Definitely we shouldn't build the entire setparam machinery for gem bo
> copied over from ctx_create_ext while Jason is working on around 100
> or so igt+kernel patches to undo the unnecessary uapi complexity this
> caused over there in ctx land.
>
> Also adding Daniele and Lionel to keep them in the loop here on what's
> happening to gem_create_ext. Daniele, I think for merging we need to a)
> resubmit the pxp patches to dri-devel per the new process that we've had
> for 2-3 weeks now, and then b) land the uapi bits only after this
> patch here from Matt has landed, to make sure we don't end up with the
> wrong version of gem_create_ext. Is that ok?

Sure. I might send a new rev for review before this lands, but I'll note 
the dependency in the cover letter and rebase and re-send afterwards.

Daniele

> -Daniel
>
>> --Jason
>>
>>
>>> +.. kernel-doc:: Documentation/gpu/rfc/i915_gem_lmem.h
>>> +        :functions: drm_i915_gem_create_ext
>>> +
>>> +It's raining extensions
>>> +^^^^^^^^^^^^^^^^^^^^^^^
>>> +As noted above, extensions can be supplied as a chain in gem_create_ext using the
>>> +existing i915_user_extension. This will be useful in the future when we need to
>>> +support several different extensions, and we need to apply more than one when
>>> +creating the object.
>>> +
>>> +.. kernel-doc:: include/uapi/drm/i915_drm.h
>>> +        :functions: i915_user_extension
>>> +
>>> +I915_GEM_CREATE_EXT_MEMORY_REGIONS
>>> +----------------------------------
>>> +Implemented as an extension for gem_create_ext, we would now allow userspace to
>>> +optionally provide an immutable list of preferred placements at creation time,
>>> +in priority order, for a given buffer object.  For the placements we expect
>>> +them each to use the class/instance encoding, as per the output of the regions
>>> +query. Having the list in priority order will be useful in the future when
>>> +placing an object, say during eviction.
>>> +
>>> +.. kernel-doc:: Documentation/gpu/rfc/i915_gem_lmem.h
>>> +        :functions: drm_i915_gem_create_ext_memory_regions
>>> +
>>> +One fair criticism here is that this seems a little over-engineered[2]. If we
>>> +just consider DG1 then yes, a simple gem_create.flags or something is all
>>> +that's needed to tell the kernel to allocate the buffer in local-memory or
>>> +whatever. However looking to the future we need uAPI which can also support
>>> +upcoming Xe HP multi-tile architecture in a sane way, where there can be
>>> +multiple local-memory instances for a given device, and so using both class and
>>> +instance in our uAPI to describe regions is desirable, although specifically
>>> +for DG1 it's uninteresting, since we only have a single local-memory instance.
>>> +
>>> +Existing uAPI issues
>>> +====================
>>> +Some potential issues we still need to resolve.
>>> +
>>> +I915 MMAP
>>> +---------
>>> +In i915 there are multiple ways to mmap a GEM object, including mapping the same
>>> +object using different mapping types (WC vs WB), i.e. multiple active mmaps per
>>> +object. TTM expects one mmap at most for the lifetime of the object. If it
>>> +turns out that we have to backpedal here, there might be some potential
>>> +userspace fallout.
>>> +
>>> +I915 SET/GET_CACHING
>>> +--------------------
>>> +In i915 we have the set/get_caching ioctls. TTM doesn't let us change this, but
>>> +DG1 doesn't support non-snooped pcie transactions, so we can just always
>>> +allocate as WB for smem-only buffers.  If/when our hw gains support for
>>> +non-snooped pcie transactions then we must fix this mode at allocation time as
>>> +a new GEM extension.
>>> +
>>> +This is related to the mmap problem, because in general (meaning, when we're
>>> +not running on intel cpus) the cpu mmap must not, ever, be inconsistent with
>>> +allocation mode.
>>> +
>>> +One possible idea is to let the kernel pick the mmap mode for userspace from
>>> +the following table:
>>> +
>>> +smem-only: WB. Userspace does not need to call clflush.
>>> +
>>> +smem+lmem: We allocate uncached memory, and give userspace a WC mapping,
>>> +both when the buffer is in smem and when it's in lmem. GPU does snooped
>>> +access, which is a bit inefficient.
>>> +
>>> +lmem only: always WC
>>> +
>>> +This means on discrete you only get a single mmap mode, all others must be
>>> +rejected. That's probably going to be a new default mode or something like
>>> +that.
>>> +
>>> +Links
>>> +=====
>>> +[1] https://patchwork.freedesktop.org/series/86798/
>>> +
>>> +[2] https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5599#note_553791
>>> diff --git a/Documentation/gpu/rfc/index.rst b/Documentation/gpu/rfc/index.rst
>>> index a8621f7dab8b..05670442ca1b 100644
>>> --- a/Documentation/gpu/rfc/index.rst
>>> +++ b/Documentation/gpu/rfc/index.rst
>>> @@ -15,3 +15,7 @@ host such documentation:
>>>
>>>   * Once the code has landed move all the documentation to the right places in
>>>     the main core, helper or driver sections.
>>> +
>>> +.. toctree::
>>> +
>>> +    i915_gem_lmem.rst
>>> --
>>> 2.26.3
>>>

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Mesa-dev] [PATCH v3 3/4] drm/i915/uapi: convert i915_query and friend to kernel doc
  2021-04-16  8:59     ` Daniel Vetter
@ 2021-04-16 19:04       ` Jonathan Corbet
  0 siblings, 0 replies; 27+ messages in thread
From: Jonathan Corbet @ 2021-04-16 19:04 UTC (permalink / raw)
  To: Daniel Vetter, Ian Romanick, Linux Doc Mailing List
  Cc: intel-gfx, dri-devel, Kenneth Graunke, Matthew Auld,
	Daniel Vetter, Mesa Dev

Daniel Vetter <daniel@ffwll.ch> writes:

> On Fri, Apr 16, 2021 at 12:25 AM Ian Romanick <idr@freedesktop.org> wrote:
>> Since we just had a big discussion about this on mesa-dev w.r.t. Mesa
>> code and documentation... does the kernel have a policy about which
>> flavor (pun intended) of English should be used?
>
> I'm not finding it documented in
> https://dri.freedesktop.org/docs/drm/doc-guide/sphinx.html but I
> thought we'd discussed it. Adding linux-doc and Jon Corbet.

It's in Documentation/doc-guide/contributing.rst:

> Please note that some things are *not* typos and should not be "fixed":
> 
>  - Both American and British English spellings are allowed within the
>    kernel documentation.  There is no need to fix one by replacing it with
>    the other.

Thanks,

jon
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 4/4] drm/doc/rfc: i915 DG1 uAPI
  2021-04-16 16:38   ` Jason Ekstrand
  2021-04-16 17:02     ` Daniel Vetter
@ 2021-04-19 12:02     ` Matthew Auld
  2021-04-19 15:19       ` Jason Ekstrand
  1 sibling, 1 reply; 27+ messages in thread
From: Matthew Auld @ 2021-04-19 12:02 UTC (permalink / raw)
  To: Jason Ekstrand
  Cc: Jordan Justen, Intel GFX, Maling list - DRI developers,
	Kenneth Graunke, ML mesa-dev, Daniel Vetter

On 16/04/2021 17:38, Jason Ekstrand wrote:
> On Thu, Apr 15, 2021 at 11:04 AM Matthew Auld <matthew.auld@intel.com> wrote:
>>
>> Add an entry for the new uAPI needed for DG1.
>>
>> v2(Daniel):
>>    - include the overall upstreaming plan
>>    - add a note for mmap, there are differences here for TTM vs i915
>>    - bunch of other suggestions from Daniel
>> v3:
>>   (Daniel)
>>    - add a note for set/get caching stuff
>>    - add some more docs for existing query and extensions stuff
>>    - add an actual code example for regions query
>>    - bunch of other stuff
>>   (Jason)
>>    - uAPI change(!):
>>          - try a simpler design with the placements extension
>>          - rather than have a generic setparam which can cover multiple
>>            use cases, have each extension be responsible for one thing
>>            only
>>
>> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
>> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
>> Cc: Jordan Justen <jordan.l.justen@intel.com>
>> Cc: Daniel Vetter <daniel.vetter@intel.com>
>> Cc: Kenneth Graunke <kenneth@whitecape.org>
>> Cc: Jason Ekstrand <jason@jlekstrand.net>
>> Cc: Dave Airlie <airlied@gmail.com>
>> Cc: dri-devel@lists.freedesktop.org
>> Cc: mesa-dev@lists.freedesktop.org
>> ---
>>   Documentation/gpu/rfc/i915_gem_lmem.h   | 255 ++++++++++++++++++++++++
>>   Documentation/gpu/rfc/i915_gem_lmem.rst | 139 +++++++++++++
>>   Documentation/gpu/rfc/index.rst         |   4 +
>>   3 files changed, 398 insertions(+)
>>   create mode 100644 Documentation/gpu/rfc/i915_gem_lmem.h
>>   create mode 100644 Documentation/gpu/rfc/i915_gem_lmem.rst
>>
>> diff --git a/Documentation/gpu/rfc/i915_gem_lmem.h b/Documentation/gpu/rfc/i915_gem_lmem.h
>> new file mode 100644
>> index 000000000000..2a82a452e9f2
>> --- /dev/null
>> +++ b/Documentation/gpu/rfc/i915_gem_lmem.h
>> @@ -0,0 +1,255 @@
>> +/*
>> + * Note that drm_i915_query_item and drm_i915_query are existing bits of uAPI.
>> + * For the regions query we are just adding a new query id, so no actual new
>> + * ioctl or anything, but including it here for reference.
>> + */
>> +struct drm_i915_query_item {
>> +#define DRM_I915_QUERY_MEMORY_REGIONS   0xdeadbeaf
>> +       ....
>> +        __u64 query_id;
>> +
>> +        /*
>> +         * When set to zero by userspace, this is filled with the size of the
>> +         * data to be written at the data_ptr pointer. The kernel sets this
>> +         * value to a negative value to signal an error on a particular query
>> +         * item.
>> +         */
>> +        __s32 length;
>> +
>> +        __u32 flags;
>> +        /*
>> +         * Data will be written at the location pointed by data_ptr when the
>> +         * value of length matches the length of the data to be written by the
>> +         * kernel.
>> +         */
>> +        __u64 data_ptr;
>> +};
>> +
>> +struct drm_i915_query {
>> +        __u32 num_items;
>> +        /*
>> +         * Unused for now. Must be cleared to zero.
>> +         */
>> +        __u32 flags;
>> +        /*
>> +         * This points to an array of num_items drm_i915_query_item structures.
>> +         */
>> +        __u64 items_ptr;
>> +};
>> +
>> +#define DRM_IOCTL_I915_QUERY   DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_QUERY, struct drm_i915_query)
>> +
>> +/**
>> + * enum drm_i915_gem_memory_class
>> + */
>> +enum drm_i915_gem_memory_class {
>> +       /** @I915_MEMORY_CLASS_SYSTEM: system memory */
>> +       I915_MEMORY_CLASS_SYSTEM = 0,
>> +       /** @I915_MEMORY_CLASS_DEVICE: device local-memory */
>> +       I915_MEMORY_CLASS_DEVICE,
>> +};
>> +
>> +/**
>> + * struct drm_i915_gem_memory_class_instance
>> + */
>> +struct drm_i915_gem_memory_class_instance {
>> +       /** @memory_class: see enum drm_i915_gem_memory_class */
>> +       __u16 memory_class;
>> +
>> +       /** @memory_instance: which instance */
>> +       __u16 memory_instance;
>> +};
>> +
>> +/**
>> + * struct drm_i915_memory_region_info
>> + *
>> + * Describes one region as known to the driver.
>> + *
>> + * Note that we reserve quite a lot of stuff here for potential future work. As
>> + * an example we might want to expose the capabilities (see caps) for a given
>> + * region, which could include things like whether the region is CPU
>> + * mappable/accessible, etc.
> 
> I get caps but I'm seriously at a loss as to what the rest of this
> would be used for.  Why are caps and flags both there and separate?
> Flags are typically something you set, not query.  Also, what's with
> rsvd1 at the end?  This smells of substantial over-building to me.
> 
> I thought to myself, "maybe I'm missing a future use-case" so I looked
> at the internal tree and none of this is being used there either.
> This indicates to me that either I'm missing something and there's
> code somewhere I don't know about or, with three years of building on
> internal branches, we still haven't proven that any of this is needed.
> If it's the latter, which I strongly suspect, maybe we should drop the
> unnecessary bits and only add them back in if and when we have proof
> that they're useful.

Do you mean just drop caps/flags here, but keep/inflate rsvd0/rsvd1, 
which is less opinionated about future unknowns? If so, that makes sense to me.
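
i.e. something along these lines, just as a sketch of the slimmed-down
version being discussed here, with caps/flags gone and the reserved
fields kept:

    struct drm_i915_memory_region_info {
            /** @region: class:instance pair encoding */
            struct drm_i915_gem_memory_class_instance region;

            /** @rsvd0: MBZ */
            __u32 rsvd0;

            /** @probed_size: Memory probed by the driver (-1 = unknown) */
            __u64 probed_size;

            /** @unallocated_size: Estimate of memory remaining (-1 = unknown) */
            __u64 unallocated_size;

            /** @rsvd1: MBZ */
            __u64 rsvd1[8];
    };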

> 
> To be clear, I don't mind the query API as such and the class/instance
> stuff seems fine and I really like being able to get the sizes
> directly.  What concerns me is all this extra future-proofing that we
> have zero proof is actually useful.  In my experience, when you build
> out like this without so much as a use-case, you always end up
> building the wrong thing.
> 
>> + */
>> +struct drm_i915_memory_region_info {
>> +       /** @region: class:instance pair encoding */
>> +       struct drm_i915_gem_memory_class_instance region;
>> +
>> +       /** @rsvd0: MBZ */
>> +       __u32 rsvd0;
>> +
>> +       /** @caps: MBZ */
>> +       __u64 caps;
>> +
>> +       /** @flags: MBZ */
>> +       __u64 flags;
>> +
>> +       /** @probed_size: Memory probed by the driver (-1 = unknown) */
>> +       __u64 probed_size;
>> +
>> +       /** @unallocated_size: Estimate of memory remaining (-1 = unknown) */
>> +       __u64 unallocated_size;
>> +
>> +       /** @rsvd1: MBZ */
>> +       __u64 rsvd1[8];
>> +};
>> +
>> +/**
>> + * struct drm_i915_query_memory_regions
>> + *
>> + * Region info query enumerates all regions known to the driver by filling in
>> + * an array of struct drm_i915_memory_region_info structures.
>> + *
>> + * Example for getting the list of supported regions:
>> + *
>> + * .. code-block:: C
>> + *
>> + *     struct drm_i915_query_memory_regions *info;
>> + *     struct drm_i915_query_item item = {
>> + *             .query_id = DRM_I915_QUERY_MEMORY_REGIONS,
>> + *     };
>> + *     struct drm_i915_query query = {
>> + *             .num_items = 1,
>> + *             .items_ptr = (uintptr_t)&item,
>> + *     };
>> + *     int err, i;
>> + *
>> + *     // First query the size of the blob we need, this needs to be large
>> + *     // enough to hold our array of regions. The kernel will fill out the
>> + *     // item.length for us, which is the number of bytes we need.
>> + *     err = ioctl(fd, DRM_IOCTL_I915_QUERY, &query);
>> + *     if (err) ...
>> + *
>> + *     info = calloc(1, item.length);
>> + *     // Now that we allocated the required number of bytes, we call the ioctl
>> + *     // again, this time with the data_ptr pointing to our newly allocated
>> + *     // blob, which the kernel can then populate with all the region info.
>> + *     item.data_ptr = (uintptr_t)info;
>> + *
>> + *     err = ioctl(fd, DRM_IOCTL_I915_QUERY, &query);
>> + *     if (err) ...
>> + *
>> + *     // We can now access each region in the array
>> + *     for (i = 0; i < info->num_regions; i++) {
>> + *             struct drm_i915_memory_region_info mr = info->regions[i];
>> + *             __u16 class = mr.region.memory_class;
>> + *             __u16 instance = mr.region.memory_instance;
>> + *
>> + *             ....
>> + *     }
>> + *
>> + *     free(info);
>> + */
>> +struct drm_i915_query_memory_regions {
>> +       /** @num_regions: Number of supported regions */
>> +       __u32 num_regions;
>> +
>> +       /** @rsvd: MBZ */
>> +       __u32 rsvd[3];
> 
> Why pad to 16B instead of 8B?

It's copy-pasta from engine_info. I can shrink it if you want; I don't
have a strong opinion.
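
For reference, shrinking to 8B would still keep the regions[] array 8-byte
aligned, since the header in front of it stays a multiple of 8:

    /** @num_regions: Number of supported regions */
    __u32 num_regions;

    /** @rsvd: MBZ */
    __u32 rsvd;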

> 
>> +
>> +       /** @regions: Info about each supported region */
>> +       struct drm_i915_memory_region_info regions[];
>> +};
>> +
>> +#define DRM_I915_GEM_CREATE_EXT                0xdeadbeaf
>> +#define DRM_IOCTL_I915_GEM_CREATE_EXT  DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_CREATE_EXT, struct drm_i915_gem_create_ext)
>> +
>> +/**
>> + * struct drm_i915_gem_create_ext
>> + *
>> + * Existing gem_create behaviour, with added extension support.
>> + *
>> + * Note that in the future we want to have our buffer flags here, at least for
>> + * the stuff that is immutable. Previously we would have two ioctls, one to
>> + * create the object with gem_create, and another to apply various parameters;
>> + * however, this creates some ambiguity for the params which are considered
>> + * immutable. Also in general we're phasing out the various SET/GET ioctls.
>> + */
>> +struct drm_i915_gem_create_ext {
>> +       /**
>> +        * @size: Requested size for the object.
>> +        *
>> +        * The (page-aligned) allocated size for the object will be returned.
>> +        *
>> +        * Note that for some devices we might have further minimum
>> +        * page-size restrictions (larger than 4K), like for device local-memory.
>> +        * However in general the final size here should always reflect any
>> +        * rounding up, if for example using the I915_GEM_CREATE_EXT_MEMORY_REGIONS
>> +        * extension to place the object in device local-memory.
>> +        */
>> +       __u64 size;
>> +       /**
>> +        * @handle: Returned handle for the object.
>> +        *
>> +        * Object handles are nonzero.
>> +        */
>> +       __u32 handle;
>> +       /** @flags: MBZ */
>> +       __u32 flags;
>> +       /**
>> +        * @extensions: The chain of extensions to apply to this object.
>> +        *
>> +        * This will be useful in the future when we need to support several
>> +        * different extensions, and we need to apply more than one when
>> +        * creating the object. See struct i915_user_extension.
>> +        *
>> +        * If we don't supply any extensions then we get the same old gem_create
>> +        * behaviour.
>> +        *
>> +        * For I915_GEM_CREATE_EXT_MEMORY_REGIONS usage see
>> +        * drm_i915_gem_create_ext_memory_regions.
>> +        */
>> +#define I915_GEM_CREATE_EXT_MEMORY_REGIONS 0
>> +       __u64 extensions;
>> +};
>> +
>> +/**
>> + * struct drm_i915_gem_create_ext_memory_regions
>> + *
>> + * I915_GEM_CREATE_EXT_MEMORY_REGIONS extension:
>> + *
>> + * Set the object with the desired set of placements/regions in priority
>> + * order (each entry must be unique and supported by the device), as an array of
>> + * drm_i915_gem_memory_class_instance, or an equivalent layout of class:instance
>> + * pair encodings. See DRM_I915_QUERY_MEMORY_REGIONS for how to query the
>> + * supported regions for a device.
>> + *
>> + * As an example, on discrete devices, if we wish to set the placement as
>> + * device local-memory we can do something like:
>> + *
>> + * .. code-block:: C
>> + *
>> + *     struct drm_i915_gem_memory_class_instance region_lmem = {
>> + *              .memory_class = I915_MEMORY_CLASS_DEVICE,
>> + *              .memory_instance = 0,
>> + *      };
>> + *      struct drm_i915_gem_create_ext_memory_regions regions = {
>> + *              .base = { .name = I915_GEM_CREATE_EXT_MEMORY_REGIONS },
>> + *              .regions = (uintptr_t)&region_lmem,
>> + *              .num_regions = 1,
>> + *      };
>> + *      struct drm_i915_gem_create_ext create_ext = {
>> + *              .size = 16 * PAGE_SIZE,
>> + *              .extensions = (uintptr_t)&regions,
>> + *      };
>> + *
>> + *      int err = ioctl(fd, DRM_IOCTL_I915_GEM_CREATE_EXT, &create_ext);
>> + *      if (err) ...
>> + *
>> + * At which point we get the object handle in create_ext.handle, if all went
>> + * well.
>> + */
>> +struct drm_i915_gem_create_ext_memory_regions {
>> +       /** @base: Extension link. See struct i915_user_extension. */
>> +       struct i915_user_extension base;
>> +
>> +       /** @pad: MBZ */
>> +       __u32 pad;
>> +       /** @num_regions: Number of elements in the placements array. */
>> +       __u32 num_regions;
>> +       /**
>> +        * @regions: The placements array.
>> +        *
>> +        * Should be an array of drm_i915_gem_memory_class_instance.
>> +        */
>> +       __u64 regions;
>> +};
>> diff --git a/Documentation/gpu/rfc/i915_gem_lmem.rst b/Documentation/gpu/rfc/i915_gem_lmem.rst
>> new file mode 100644
>> index 000000000000..52f1db15ae94
>> --- /dev/null
>> +++ b/Documentation/gpu/rfc/i915_gem_lmem.rst
>> @@ -0,0 +1,139 @@
>> +=========================
>> +I915 DG1/LMEM RFC Section
>> +=========================
>> +
>> +Upstream plan
>> +=============
>> +For upstream the overall plan for landing all the DG1 stuff and turning it on
>> +for real, with all the uAPI bits, is:
>> +
>> +* Merge basic HW enabling of DG1 (still without pciid)
>> +* Merge the uAPI bits behind a special CONFIG_BROKEN (or so) flag
>> +        * At this point we can still make changes, but importantly this lets us
>> +          start running IGTs which can utilize local-memory in CI
>> +* Convert over to TTM, make sure it all keeps working
>> +* Add pciid for DG1 and turn on uAPI for real
>> +
>> +New object placement and region query uAPI
>> +==========================================
>> +Starting from DG1 we need to give userspace the ability to allocate buffers from
>> +device local-memory. Currently the driver supports gem_create, which can place
>> +buffers in system memory via shmem, and the usual assortment of other
>> +interfaces, like dumb buffers and userptr.
>> +
>> +To support this new capability, while also providing a uAPI which will work
>> +beyond just DG1, we propose to offer three new bits of uAPI:
>> +
>> +Query uAPI
>> +----------
>> +Existing query interface
>> +^^^^^^^^^^^^^^^^^^^^^^^^
>> +.. kernel-doc:: include/uapi/drm/i915_drm.h
>> +        :functions: drm_i915_query_item drm_i915_query
>> +
>> +DRM_I915_QUERY_MEMORY_REGIONS
>> +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>> +New query ID which allows userspace to discover the list of supported memory
>> +regions (like system-memory and local-memory) for a given device. We identify
>> +each region with a class and instance pair, which should be unique. The class
>> +here would be DEVICE or SYSTEM, and the instance would be zero, on platforms
>> +like DG1.
>> +
>> +Side note: The class/instance design is borrowed from our existing engine uAPI,
>> +where we describe every physical engine in terms of its class, and the
>> +particular instance, since we can have more than one per class.
>> +
>> +In the future we also want to expose more information which can further
>> +describe the capabilities of a region.
>> +
>> +.. kernel-doc:: Documentation/gpu/rfc/i915_gem_lmem.h
>> +        :functions: drm_i915_gem_memory_class drm_i915_gem_memory_class_instance drm_i915_memory_region_info drm_i915_query_memory_regions
>> +
>> +GEM_CREATE_EXT
>> +--------------
>> +New ioctl which is basically just gem_create but now allows userspace to
>> +provide a chain of possible extensions. Note that if we don't provide any
>> +extensions then we get the exact same behaviour as gem_create.
>> +
>> +Side note: We also need to support PXP[1] in the near future, which is also
>> +applicable to integrated platforms, and adds its own gem_create_ext extension,
>> +which basically lets userspace mark a buffer as "protected".
> 
> A bit off-topic, but do we really need a whole extension for that?  Or
> can we just throw a bit in flags?  I'm a big fan of landing create_ext
> anyway; I like extensibility.  I'm just questioning whether or not
> that one needs its own struct.
> 
> --Jason
> 
> 
>> +.. kernel-doc:: Documentation/gpu/rfc/i915_gem_lmem.h
>> +        :functions: drm_i915_gem_create_ext
>> +
>> +It's raining extensions
>> +^^^^^^^^^^^^^^^^^^^^^^^
>> +As noted above, extensions can be supplied as a chain in gem_create_ext using the
>> +existing i915_user_extension. This will be useful in the future when we need to
>> +support several different extensions, and we need to apply more than one when
>> +creating the object.
>> +
>> +.. kernel-doc:: include/uapi/drm/i915_drm.h
>> +        :functions: i915_user_extension
>> +
>> +I915_GEM_CREATE_EXT_MEMORY_REGIONS
>> +----------------------------------
>> +Implemented as an extension for gem_create_ext, we would now allow userspace to
>> +optionally provide an immutable list of preferred placements at creation time,
>> +in priority order, for a given buffer object.  For the placements we expect
>> +them each to use the class/instance encoding, as per the output of the regions
>> +query. Having the list in priority order will be useful in the future when
>> +placing an object, say during eviction.
>> +
>> +.. kernel-doc:: Documentation/gpu/rfc/i915_gem_lmem.h
>> +        :functions: drm_i915_gem_create_ext_memory_regions
>> +
>> +One fair criticism here is that this seems a little over-engineered[2]. If we
>> +just consider DG1 then yes, a simple gem_create.flags or something is all
>> +that's needed to tell the kernel to allocate the buffer in local-memory or
>> +whatever. However looking to the future we need uAPI which can also support
>> +upcoming Xe HP multi-tile architecture in a sane way, where there can be
>> +multiple local-memory instances for a given device, and so using both class and
>> +instance in our uAPI to describe regions is desirable, although specifically
>> +for DG1 it's uninteresting, since we only have a single local-memory instance.
>> +
>> +Existing uAPI issues
>> +====================
>> +Some potential issues we still need to resolve.
>> +
>> +I915 MMAP
>> +---------
>> +In i915 there are multiple ways to mmap a GEM object, including mapping the same
>> +object using different mapping types (WC vs WB), i.e. multiple active mmaps per
>> +object. TTM expects one mmap at most for the lifetime of the object. If it
>> +turns out that we have to backpedal here, there might be some potential
>> +userspace fallout.
>> +
>> +I915 SET/GET_CACHING
>> +--------------------
>> +In i915 we have the set/get_caching ioctls. TTM doesn't let us change this, but
>> +DG1 doesn't support non-snooped pcie transactions, so we can just always
>> +allocate as WB for smem-only buffers.  If/when our hw gains support for
>> +non-snooped pcie transactions then we must fix this mode at allocation time as
>> +a new GEM extension.
>> +
>> +This is related to the mmap problem, because in general (meaning, when we're
>> +not running on intel cpus) the cpu mmap must not, ever, be inconsistent with
>> +allocation mode.
>> +
>> +One possible idea is to let the kernel pick the mmap mode for userspace from
>> +the following table:
>> +
>> +smem-only: WB. Userspace does not need to call clflush.
>> +
>> +smem+lmem: We allocate uncached memory, and give userspace a WC mapping,
>> +both when the buffer is in smem and when it's in lmem. GPU does snooped
>> +access, which is a bit inefficient.
>> +
>> +lmem only: always WC
>> +
>> +This means on discrete you only get a single mmap mode, all others must be
>> +rejected. That's probably going to be a new default mode or something like
>> +that.
>> +
>> +Links
>> +=====
>> +[1] https://patchwork.freedesktop.org/series/86798/
>> +
>> +[2] https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5599#note_553791
>> diff --git a/Documentation/gpu/rfc/index.rst b/Documentation/gpu/rfc/index.rst
>> index a8621f7dab8b..05670442ca1b 100644
>> --- a/Documentation/gpu/rfc/index.rst
>> +++ b/Documentation/gpu/rfc/index.rst
>> @@ -15,3 +15,7 @@ host such documentation:
>>
>>   * Once the code has landed move all the documentation to the right places in
>>     the main core, helper or driver sections.
>> +
>> +.. toctree::
>> +
>> +    i915_gem_lmem.rst
>> --
>> 2.26.3
>>
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 4/4] drm/doc/rfc: i915 DG1 uAPI
  2021-04-19 12:02     ` Matthew Auld
@ 2021-04-19 15:19       ` Jason Ekstrand
  2021-04-20 16:34         ` Tvrtko Ursulin
  0 siblings, 1 reply; 27+ messages in thread
From: Jason Ekstrand @ 2021-04-19 15:19 UTC (permalink / raw)
  To: Matthew Auld
  Cc: Jordan Justen, Intel GFX, Maling list - DRI developers,
	Kenneth Graunke, ML mesa-dev, Daniel Vetter

On Mon, Apr 19, 2021 at 7:02 AM Matthew Auld <matthew.auld@intel.com> wrote:
>
> On 16/04/2021 17:38, Jason Ekstrand wrote:
> > On Thu, Apr 15, 2021 at 11:04 AM Matthew Auld <matthew.auld@intel.com> wrote:
> >>
> >> Add an entry for the new uAPI needed for DG1.
> >>
> >> v2(Daniel):
> >>    - include the overall upstreaming plan
> >>    - add a note for mmap, there are differences here for TTM vs i915
> >>    - bunch of other suggestions from Daniel
> >> v3:
> >>   (Daniel)
> >>    - add a note for set/get caching stuff
> >>    - add some more docs for existing query and extensions stuff
> >>    - add an actual code example for regions query
> >>    - bunch of other stuff
> >>   (Jason)
> >>    - uAPI change(!):
> >>          - try a simpler design with the placements extension
> >>          - rather than have a generic setparam which can cover multiple
> >>            use cases, have each extension be responsible for one thing
> >>            only
> >>
> >> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> >> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> >> Cc: Jordan Justen <jordan.l.justen@intel.com>
> >> Cc: Daniel Vetter <daniel.vetter@intel.com>
> >> Cc: Kenneth Graunke <kenneth@whitecape.org>
> >> Cc: Jason Ekstrand <jason@jlekstrand.net>
> >> Cc: Dave Airlie <airlied@gmail.com>
> >> Cc: dri-devel@lists.freedesktop.org
> >> Cc: mesa-dev@lists.freedesktop.org
> >> ---
> >>   Documentation/gpu/rfc/i915_gem_lmem.h   | 255 ++++++++++++++++++++++++
> >>   Documentation/gpu/rfc/i915_gem_lmem.rst | 139 +++++++++++++
> >>   Documentation/gpu/rfc/index.rst         |   4 +
> >>   3 files changed, 398 insertions(+)
> >>   create mode 100644 Documentation/gpu/rfc/i915_gem_lmem.h
> >>   create mode 100644 Documentation/gpu/rfc/i915_gem_lmem.rst
> >>
> >> diff --git a/Documentation/gpu/rfc/i915_gem_lmem.h b/Documentation/gpu/rfc/i915_gem_lmem.h
> >> new file mode 100644
> >> index 000000000000..2a82a452e9f2
> >> --- /dev/null
> >> +++ b/Documentation/gpu/rfc/i915_gem_lmem.h
> >> @@ -0,0 +1,255 @@
> >> +/*
> >> + * Note that drm_i915_query_item and drm_i915_query are existing bits of uAPI.
> >> + * For the regions query we are just adding a new query id, so no actual new
> >> + * ioctl or anything, but including it here for reference.
> >> + */
> >> +struct drm_i915_query_item {
> >> +#define DRM_I915_QUERY_MEMORY_REGIONS   0xdeadbeaf
> >> +       ....
> >> +        __u64 query_id;
> >> +
> >> +        /*
> >> +         * When set to zero by userspace, this is filled with the size of the
> >> +         * data to be written at the data_ptr pointer. The kernel sets this
> >> +         * value to a negative value to signal an error on a particular query
> >> +         * item.
> >> +         */
> >> +        __s32 length;
> >> +
> >> +        __u32 flags;
> >> +        /*
> >> +         * Data will be written at the location pointed by data_ptr when the
> >> +         * value of length matches the length of the data to be written by the
> >> +         * kernel.
> >> +         */
> >> +        __u64 data_ptr;
> >> +};
> >> +
> >> +struct drm_i915_query {
> >> +        __u32 num_items;
> >> +        /*
> >> +         * Unused for now. Must be cleared to zero.
> >> +         */
> >> +        __u32 flags;
> >> +        /*
> >> +         * This points to an array of num_items drm_i915_query_item structures.
> >> +         */
> >> +        __u64 items_ptr;
> >> +};
> >> +
> >> +#define DRM_IOCTL_I915_QUERY   DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_QUERY, struct drm_i915_query)
> >> +
> >> +/**
> >> + * enum drm_i915_gem_memory_class
> >> + */
> >> +enum drm_i915_gem_memory_class {
> >> +       /** @I915_MEMORY_CLASS_SYSTEM: system memory */
> >> +       I915_MEMORY_CLASS_SYSTEM = 0,
> >> +       /** @I915_MEMORY_CLASS_DEVICE: device local-memory */
> >> +       I915_MEMORY_CLASS_DEVICE,
> >> +};
> >> +
> >> +/**
> >> + * struct drm_i915_gem_memory_class_instance
> >> + */
> >> +struct drm_i915_gem_memory_class_instance {
> >> +       /** @memory_class: see enum drm_i915_gem_memory_class */
> >> +       __u16 memory_class;
> >> +
> >> +       /** @memory_instance: which instance */
> >> +       __u16 memory_instance;
> >> +};
> >> +
> >> +/**
> >> + * struct drm_i915_memory_region_info
> >> + *
> >> + * Describes one region as known to the driver.
> >> + *
> >> + * Note that we reserve quite a lot of stuff here for potential future work. As
> >> + * an example we might want to expose the capabilities (see caps) for a given
> >> + * region, which could include things like whether the region is CPU
> >> + * mappable/accessible etc.
> >
> > I get caps but I'm seriously at a loss as to what the rest of this
> > would be used for.  Why are caps and flags both there and separate?
> > Flags are typically something you set, not query.  Also, what's with
> > rsvd1 at the end?  This smells of substantial over-building to me.
> >
> > I thought to myself, "maybe I'm missing a future use-case" so I looked
> > at the internal tree and none of this is being used there either.
> > This indicates to me that either I'm missing something and there's
> > code somewhere I don't know about or, with three years of building on
> > internal branches, we still haven't proven that any of this is needed.
> > If it's the latter, which I strongly suspect, maybe we should drop the
> > unnecessary bits and only add them back in if and when we have proof
> > that they're useful.
>
> Do you mean just drop caps/flags here, but keep/inflate rsvd0/rsvd1,
> which is less opinionated about future unknowns? If so, makes sense to me.

I meant drop flags and rsvd1.  We need rsvd0 for padding and  I can
see some value to caps.  We may want to advertise, for instance, what
mapping coherency types are available per-heap.  But I don't see any
use for any of the other fields.
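
For instance (the cap names below are made up, just to illustrate the
idea):

    /* Hypothetical per-region cap bits; illustrative only. */
    #define I915_MEMORY_REGION_CAP_MMAP_WB (1ull << 0)
    #define I915_MEMORY_REGION_CAP_MMAP_WC (1ull << 1)

    static bool region_supports_wb(const struct drm_i915_memory_region_info *mr)
    {
            return mr->caps & I915_MEMORY_REGION_CAP_MMAP_WB;
    }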

> >
> > To be clear, I don't mind the query API as such and the class/instance
> > stuff seems fine and I really like being able to get the sizes
> > directly.  What concerns me is all this extra future-proofing that we
> > have zero proof is actually useful.  In my experience, when you build
> > out like this without so much as a use-case, you always end up
> > building the wrong thing.
> >
> >> + */
> >> +struct drm_i915_memory_region_info {
> >> +       /** @region: class:instance pair encoding */
> >> +       struct drm_i915_gem_memory_class_instance region;
> >> +
> >> +       /** @rsvd0: MBZ */
> >> +       __u32 rsvd0;
> >> +
> >> +       /** @caps: MBZ */
> >> +       __u64 caps;
> >> +
> >> +       /** @flags: MBZ */
> >> +       __u64 flags;
> >> +
> >> +       /** @probed_size: Memory probed by the driver (-1 = unknown) */
> >> +       __u64 probed_size;
> >> +
> >> +       /** @unallocated_size: Estimate of memory remaining (-1 = unknown) */
> >> +       __u64 unallocated_size;
> >> +
> >> +       /** @rsvd1: MBZ */
> >> +       __u64 rsvd1[8];
> >> +};
> >> +
> >> +/**
> >> + * struct drm_i915_query_memory_regions
> >> + *
> >> + * Region info query enumerates all regions known to the driver by filling in
> >> + * an array of struct drm_i915_memory_region_info structures.
> >> + *
> >> + * Example for getting the list of supported regions:
> >> + *
> >> + * .. code-block:: C
> >> + *
> >> + *     struct drm_i915_query_memory_regions *info;
> >> + *     struct drm_i915_query_item item = {
> >> + *             .query_id = DRM_I915_QUERY_MEMORY_REGIONS,
> >> + *     };
> >> + *     struct drm_i915_query query = {
> >> + *             .num_items = 1,
> >> + *             .items_ptr = (uintptr_t)&item,
> >> + *     };
> >> + *     int err, i;
> >> + *
> >> + *     // First query the size of the blob we need, this needs to be large
> >> + *     // enough to hold our array of regions. The kernel will fill out the
> >> + *     // item.length for us, which is the number of bytes we need.
> >> + *     err = ioctl(fd, DRM_IOCTL_I915_QUERY, &query);
> >> + *     if (err) ...
> >> + *
> >> + *     info = calloc(1, item.length);
> >> + *     // Now that we allocated the required number of bytes, we call the ioctl
> >> + *     // again, this time with the data_ptr pointing to our newly allocated
> >> + *     // blob, which the kernel can then populate with all the region info.
> >> + *     item.data_ptr = (uintptr_t)info;
> >> + *
> >> + *     err = ioctl(fd, DRM_IOCTL_I915_QUERY, &query);
> >> + *     if (err) ...
> >> + *
> >> + *     // We can now access each region in the array
> >> + *     for (i = 0; i < info->num_regions; i++) {
> >> + *             struct drm_i915_memory_region_info mr = info->regions[i];
> >> + *             u16 class = mr.region.memory_class;
> >> + *             u16 instance = mr.region.memory_instance;
> >> + *
> >> + *             ....
> >> + *     }
> >> + *
> >> + *     free(info);
> >> + */
> >> +struct drm_i915_query_memory_regions {
> >> +       /** @num_regions: Number of supported regions */
> >> +       __u32 num_regions;
> >> +
> >> +       /** @rsvd: MBZ */
> >> +       __u32 rsvd[3];
> >
> > Why pad to 16B instead of 8B?
>
> It's copy-pasta from engine_info. I can shrink it if you want? I don't
> have a strong opinion.

Yeah, I'd shrink to just a __u32.  We could probably drop it entirely
but aligning to 8B seems like a good idea before an array.
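
Something like this (sketch only):

    struct drm_i915_query_memory_regions {
            /** @num_regions: Number of supported regions */
            __u32 num_regions;

            /** @rsvd: MBZ; keeps the array below 8B-aligned */
            __u32 rsvd;

            /** @regions: Info about each supported region */
            struct drm_i915_memory_region_info regions[];
    };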

--Jason



* Re: [PATCH v3 4/4] drm/doc/rfc: i915 DG1 uAPI
  2021-04-19 15:19       ` Jason Ekstrand
@ 2021-04-20 16:34         ` Tvrtko Ursulin
  2021-04-20 17:00           ` Jason Ekstrand
  0 siblings, 1 reply; 27+ messages in thread
From: Tvrtko Ursulin @ 2021-04-20 16:34 UTC (permalink / raw)
  To: Jason Ekstrand, Matthew Auld
  Cc: Jordan Justen, Intel GFX, Maling list - DRI developers,
	Kenneth Graunke, ML mesa-dev, Daniel Vetter


On 19/04/2021 16:19, Jason Ekstrand wrote:
> On Mon, Apr 19, 2021 at 7:02 AM Matthew Auld <matthew.auld@intel.com> wrote:
>>
>> On 16/04/2021 17:38, Jason Ekstrand wrote:
>>> On Thu, Apr 15, 2021 at 11:04 AM Matthew Auld <matthew.auld@intel.com> wrote:
>>>>
>>>> [snip]
>>>
>>> I get caps but I'm seriously at a loss as to what the rest of this
>>> would be used for.  Why are caps and flags both there and separate?
>>> Flags are typically something you set, not query.  Also, what's with
>>> rsvd1 at the end?  This smells of substantial over-building to me.
>>>
>>> I thought to myself, "maybe I'm missing a future use-case" so I looked
>>> at the internal tree and none of this is being used there either.
>>> This indicates to me that either I'm missing something and there's
>>> code somewhere I don't know about or, with three years of building on
>>> internal branches, we still haven't proven that any of this is needed.
>>> If it's the latter, which I strongly suspect, maybe we should drop the
>>> unnecessary bits and only add them back in if and when we have proof
>>> that they're useful.
>>
>> Do you mean just drop caps/flags here, but keep/inflate rsvd0/rsvd1,
>> which is less opinionated about future unknowns? If so, makes sense to me.
> 
> I meant drop flags and rsvd1.  We need rsvd0 for padding and  I can
> see some value to caps.  We may want to advertise, for instance, what
> mapping coherency types are available per-heap.  But I don't see any
> use for any of the other fields.

I'd suggest making sure at least enough rsvd fields remain so that flags 
could be added later if needed. Experience from engine info shows that 
both were required in order to extend the query via re-purposing the 
rsvds and adding flag bits to indicate when a certain rsvd contains a 
new piece of information. I probably cannot go into too much detail 
here, but anyway the point is just to make sure too much is not stripped 
out so that instead of simply adding fields/flags we have to add a new 
query in the future. IMO some rsvd fields are not really harmful and if 
they can make things easier in the future why not.
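
As a sketch of the pattern (names are hypothetical, loosely modelled on how
the engine info query was extended):

    /* A flag bit signals that a formerly reserved field now carries data. */
    #define I915_MEMORY_REGION_INFO_HAS_FOO (1ull << 0)

    static __u64 region_foo(const struct drm_i915_memory_region_info *info)
    {
            if (info->flags & I915_MEMORY_REGION_INFO_HAS_FOO)
                    return info->rsvd1[0]; /* rsvd1[0] re-purposed as "foo" */
            return 0; /* older kernel, "foo" not reported */
    }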

Regards,

Tvrtko

* Re: [PATCH v3 4/4] drm/doc/rfc: i915 DG1 uAPI
  2021-04-20 16:34         ` Tvrtko Ursulin
@ 2021-04-20 17:00           ` Jason Ekstrand
  2021-04-21  8:22             ` Tvrtko Ursulin
  0 siblings, 1 reply; 27+ messages in thread
From: Jason Ekstrand @ 2021-04-20 17:00 UTC (permalink / raw)
  To: Tvrtko Ursulin
  Cc: Jordan Justen, Intel GFX, Maling list - DRI developers,
	Kenneth Graunke, Matthew Auld, Daniel Vetter, ML mesa-dev

On Tue, Apr 20, 2021 at 11:34 AM Tvrtko Ursulin
<tvrtko.ursulin@linux.intel.com> wrote:
>
>
> On 19/04/2021 16:19, Jason Ekstrand wrote:
> > On Mon, Apr 19, 2021 at 7:02 AM Matthew Auld <matthew.auld@intel.com> wrote:
> >>
> >> On 16/04/2021 17:38, Jason Ekstrand wrote:
> >>> On Thu, Apr 15, 2021 at 11:04 AM Matthew Auld <matthew.auld@intel.com> wrote:
> >>>>
> >>>>>> [snip]
> >>>
> >>> I get caps but I'm seriously at a loss as to what the rest of this
> >>> would be used for.  Why are caps and flags both there and separate?
> >>> Flags are typically something you set, not query.  Also, what's with
> >>> rsvd1 at the end?  This smells of substantial over-building to me.
> >>>
> >>> I thought to myself, "maybe I'm missing a future use-case" so I looked
> >>> at the internal tree and none of this is being used there either.
> >>> This indicates to me that either I'm missing something and there's
> >>> code somewhere I don't know about or, with three years of building on
> >>> internal branches, we still haven't proven that any of this is needed.
> >>> If it's the latter, which I strongly suspect, maybe we should drop the
> >>> unnecessary bits and only add them back in if and when we have proof
> >>> that they're useful.
> >>
> >> Do you mean just drop caps/flags here, but keep/inflate rsvd0/rsvd1,
> >> which is less opinionated about future unknowns? If so, makes sense to me.
> >
> > I meant drop flags and rsvd1.  We need rsvd0 for padding and  I can
> > see some value to caps.  We may want to advertise, for instance, what
> > mapping coherency types are available per-heap.  But I don't see any
> > use for any of the other fields.
>
> I'd suggest making sure at least enough rsvd fields remain so that flags
> could be added later if needed. Experience from engine info shows that
> both were required in order to extend the query via re-purposing the
> rsvds and adding flag bits to indicate when a certain rsvd contains a
> new piece of information.

Looking at DII, all I see is we started using caps.  I already said
I'm fine with caps.  I can already imagine some useful ones like
specifying what kinds of mappings we can do.

If we're concerned about more complicated stuff, I argue that we have
no ability to predict what that will look like and so just throwing in
a bunch of __u32 rsvd[N] is blind guessing.  I'm seeing a lot of that
in the recently added APIs such as the flags and rsvd[4] in
i915_user_extension.  What's that there for?  Why can't you put that
information in the extension struct which derives from it?  Maybe it's
so that we can extend it.  But we added that struct as part of an
extension mechanism!?!

If we want to make things extensible, Vulkan actually provides some
prior art for this in the form of allowing queries to be extended just
like input structs.  We could add a __u64 extensions field to
memory_region_info and, if we ever need to query more info, the client
can provide a chain of additional per-region queries.  Yeah, there are
problems with it and it gets a bit clunky but it does work pretty
well.
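
A rough sketch of the idea, all names hypothetical:

    /* drm_i915_memory_region_info would gain a __u64 extensions field
     * pointing at a chain of i915_user_extension, just like on input. */
    struct drm_i915_memory_region_info_ext_foo {
            struct i915_user_extension base; /* .name selects this sub-query */
            __u64 foo;                       /* written back by the kernel */
    };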

> I probably cannot go into too much detail
> here, but anyway the point is just to make sure too much is not stripped
> out so that instead of simply adding fields/flags we have to add a new
> query in the future. IMO some rsvd fields are not really harmful and if
> they can make things easier in the future why not.

Maybe it's my tired and generally grumpy state of mind but I'm not
particularly favorable towards "why not?" as a justification for
immutable kernel APIs.  We've already found a few places where
Zoidberg API design has caused us problems.  We need an answer to
"why?"  Future extensibility is a potentially valid answer but we need
to do a better job of thinking through it than we have in the past.

--Jason

* Re: [PATCH v3 4/4] drm/doc/rfc: i915 DG1 uAPI
  2021-04-20 17:00           ` Jason Ekstrand
@ 2021-04-21  8:22             ` Tvrtko Ursulin
  2021-04-21 13:54               ` Jason Ekstrand
  0 siblings, 1 reply; 27+ messages in thread
From: Tvrtko Ursulin @ 2021-04-21  8:22 UTC (permalink / raw)
  To: Jason Ekstrand
  Cc: Jordan Justen, Intel GFX, Maling list - DRI developers,
	Kenneth Graunke, Matthew Auld, Daniel Vetter, ML mesa-dev


On 20/04/2021 18:00, Jason Ekstrand wrote:
> On Tue, Apr 20, 2021 at 11:34 AM Tvrtko Ursulin
> <tvrtko.ursulin@linux.intel.com> wrote:
>>
>>
>> On 19/04/2021 16:19, Jason Ekstrand wrote:
>>> On Mon, Apr 19, 2021 at 7:02 AM Matthew Auld <matthew.auld@intel.com> wrote:
>>>>
>>>> On 16/04/2021 17:38, Jason Ekstrand wrote:
>>>>> On Thu, Apr 15, 2021 at 11:04 AM Matthew Auld <matthew.auld@intel.com> wrote:
>>>>>>
>>>>>> [snip]
>>>>>
>>>>> I get caps but I'm seriously at a loss as to what the rest of this
>>>>> would be used for.  Why are caps and flags both there and separate?
>>>>> Flags are typically something you set, not query.  Also, what's with
>>>>> rsvd1 at the end?  This smells of substantial over-building to me.
>>>>>
>>>>> I thought to myself, "maybe I'm missing a future use-case" so I looked
>>>>> at the internal tree and none of this is being used there either.
>>>>> This indicates to me that either I'm missing something and there's
>>>>> code somewhere I don't know about or, with three years of building on
>>>>> internal branches, we still haven't proven that any of this is needed.
>>>>> If it's the latter, which I strongly suspect, maybe we should drop the
>>>>> unnecessary bits and only add them back in if and when we have proof
>>>>> that they're useful.
>>>>
>>>> Do you mean just drop caps/flags here, but keep/inflate rsvd0/rsvd1,
>>>> which is less opinionated about future unknowns? If so, makes sense to me.
>>>
>>> I meant drop flags and rsvd1.  We need rsvd0 for padding and  I can
>>> see some value to caps.  We may want to advertise, for instance, what
>>> mapping coherency types are available per-heap.  But I don't see any
>>> use for any of the other fields.
>>
>> I'd suggest making sure at least enough rsvd fields remain so that flags
>> could be added later if needed. Experience from engine info shows that
>> both were required in order to extend the query via re-purposing the
>> rsvds and adding flag bits to indicate when a certain rsvd contains a
>> new piece of information.
> 
> Looking at DII, all I see is we started using caps.  I already said
> I'm fine with caps.  I can already imagine some useful ones like
> specifying what kinds of mappings we can do.
> 
> If we're concerned about more complicated stuff, I argue that we have
> no ability to predict what that will look like and so just throwing in
> a bunch of __u32 rsvd[N] is blind guessing.  I'm seeing a lot of that
> in the recently added APIs such as the flags and rsvd[4] in
> i915_user_extension.  What's that there for?  Why can't you put that
> information in the extension struct which derives from it?  Maybe it's
> so that we can extend it.  But we added that struct as part of an
> extension mechanism!?!
> 
> If we want to make things extensible, Vulkan actually provides some
> prior art for this in the form of allowing queries to be extended just
> like input structs.  We could add a __u64 extensions field to
> memory_region_info and, if we ever need to query more info, the client
> can provide a chain of additional per-region queries.  Yeah, there are
> problems with it and it gets a bit clunky but it does work pretty
> well.
> 
>> I probably cannot go into too much detail
>> here, but anyway the point is just to make sure too much is not stripped
>> out so that instead of simply adding fields/flags we have to add a new
>> query in the future. IMO some rsvd fields are not really harmful and if
>> they can make things easier in the future why not.
> 
> Maybe it's my tired and generally grumpy state of mind but I'm not
> particularly favorable towards "why not?" as a justification for
> immutable kernel APIs.  We've already found a few places where
> Zoidberg API design has caused us problems.  We need an answer to
> "why?"  Future extensibility is a potentially valid answer but we need
> to do a better job of thinking through it than we have in the past.

I did not simply say why not, did I? It is a balance thing between cost
and benefit. I see the cost of rsvd fields as approaching zero really,
and the cost of having to add a query v2 if we end up not having enough
rsvd as definitely way bigger.

If you look at the mentioned engine info query you will see that as soon
as you add some caps, flags become useful (so userspace can answer the
question of whether the object doesn't support the cap or the kernel
doesn't even know about the cap).

Furthermore, in that uapi, caps pertain to the property of the 
underlying object being queried, while the flags pertain to the query 
itself. I find that separation logical and useful.

I am not claiming to know memory region query will end up the same, and 
I definitely agree we cannot guess the future. I am just saying rsvd 
fields are inconsequential really in terms of maintenance burden and 
have been proven useful in the past. So I disagree with the drive to 
kick them all out.

Btw extension chains also work for me. It is a bit more complicated and
we don't have prior art in i915 for using them on the read/get/query side,
but we could add that if a couple of rsvd fields are so objectionable.

Regards,

Tvrtko

* Re: [PATCH v3 4/4] drm/doc/rfc: i915 DG1 uAPI
  2021-04-21  8:22             ` Tvrtko Ursulin
@ 2021-04-21 13:54               ` Jason Ekstrand
  2021-04-21 14:25                 ` Tvrtko Ursulin
  0 siblings, 1 reply; 27+ messages in thread
From: Jason Ekstrand @ 2021-04-21 13:54 UTC (permalink / raw)
  To: Tvrtko Ursulin
  Cc: Jordan Justen, Intel GFX, Maling list - DRI developers,
	Kenneth Graunke, Matthew Auld, Daniel Vetter, ML mesa-dev

On Wed, Apr 21, 2021 at 3:22 AM Tvrtko Ursulin
<tvrtko.ursulin@linux.intel.com> wrote:
>
> On 20/04/2021 18:00, Jason Ekstrand wrote:
> > On Tue, Apr 20, 2021 at 11:34 AM Tvrtko Ursulin
> > <tvrtko.ursulin@linux.intel.com> wrote:
> >>
> >>
> >> On 19/04/2021 16:19, Jason Ekstrand wrote:
> >>> On Mon, Apr 19, 2021 at 7:02 AM Matthew Auld <matthew.auld@intel.com> wrote:
> >>>>
> >>>> On 16/04/2021 17:38, Jason Ekstrand wrote:
> >>>>> On Thu, Apr 15, 2021 at 11:04 AM Matthew Auld <matthew.auld@intel.com> wrote:
> >>>>>>
> >>>>>> [snip]
> >>>>>
> >>>>> I get caps but I'm seriously at a loss as to what the rest of this
> >>>>> would be used for.  Why are caps and flags both there and separate?
> >>>>> Flags are typically something you set, not query.  Also, what's with
> >>>>> rsvd1 at the end?  This smells of substantial over-building to me.
> >>>>>
> >>>>> I thought to myself, "maybe I'm missing a future use-case" so I looked
> >>>>> at the internal tree and none of this is being used there either.
> >>>>> This indicates to me that either I'm missing something and there's
> >>>>> code somewhere I don't know about or, with three years of building on
> >>>>> internal branches, we still haven't proven that any of this is needed.
> >>>>> If it's the latter, which I strongly suspect, maybe we should drop the
> >>>>> unnecessary bits and only add them back in if and when we have proof
> >>>>> that they're useful.
> >>>>
> >>>> Do you mean just drop caps/flags here, but keep/inflate rsvd0/rsvd1,
> >>>> which is less opinionated about future unknowns? If so, makes sense to me.
> >>>
> >>> I meant drop flags and rsvd1.  We need rsvd0 for padding and  I can
> >>> see some value to caps.  We may want to advertise, for instance, what
> >>> mapping coherency types are available per-heap.  But I don't see any
> >>> use for any of the other fields.
> >>
> >> I'd suggest making sure at least enough rsvd fields remain so that flags
> >> could be added later if needed. Experience from engine info shows that
> >> both were required in order to extend the query via re-purposing the
> >> rsvds and adding flag bits to indicate when a certain rsvd contains a
> >> new piece of information.
> >
> > Looking at DII, all I see is we started using caps.  I already said
> > I'm fine with caps.  I can already imagine some useful ones like
> > specifying what kinds of mappings we can do.
> >
> > If we're concerned about more complicated stuff, I argue that we have
> > no ability to predict what that will look like and so just throwing in
> > a bunch of __u32 rsvd[N] is blind guessing.  I'm seeing a lot of that
> > in the recently added APIs such as the flags and rsvd[4] in
> > i915_user_extension.  What's that there for?  Why can't you put that
> > information in the extension struct which derives from it?  Maybe it's
> > so that we can extend it.  But we added that struct as part of an
> > extension mechanism!?!
> >
> > If we want to make things extensible, Vulkan actually provides some
> > prior art for this in the form of allowing queries to be extended just
> > like input structs.  We could add a __u64 extensions field to
> > memory_region_info and, if we ever need to query more info, the client
> > can provide a chain of additional per-region queries.  Yeah, there are
> > problems with it and it gets a bit clunky but it does work pretty
> > well.
> >
> >> I probably cannot go into too much detail
> >> here, but anyway the point is just to make sure too much is not stripped
> >> out so that instead of simply adding fields/flags we have to add a new
> >> query in the future. IMO some rsvd fields are not really harmful and if
> >> they can make things easier in the future why not.
> >
> > Maybe it's my tired and generally grumpy state of mind but I'm not
> > particularly favorable towards "why not?" as a justification for
> > immutable kernel APIs.  We've already found a few places where
> > Zoidberg API design has caused us problems.  We need an answer to
> > "why?"  Future extensibility is a potentially valid answer but we need
> > to do a better job of thinking through it than we have in the past.
>
> I did not simply say why not, did I?

You literally did:  "...and if they can make things easier in the
future why not."


> It is a balance thing between cost
> and benefit. I see the cost of rsvd fields as approaching zero really,
> and cost of having to add query v2 if we end up having not enough rsvd
> as definitely way bigger.
>
> If you look at the mentioned engine info query you will see that as soon
> as you add some caps, flags become useful (so userspace can answer the
> question of whether the object does not support this cap or the kernel
> does not even know about the cap).
>
> Furthermore, in that uapi, caps pertain to the property of the
> underlying object being queried, while the flags pertain to the query
> itself. I find that separation logical and useful.

Ok, that answers the question I asked above: "what are flags for and
why are they different?"  At the very least, that should be
documented.  Then again...  We really want a GETPARAM query any time a
kernel interface changes, such as adding caps, and we can say that
userspace should ignore caps it doesn't understand.  I think that
solves both directions of the negotiation without flags.
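
To spell out that last bit as a sketch, given the caps word from a
returned region info (call it info.caps); the cap names are made up,
none of this is real uAPI:

#define EXAMPLE_REGION_CAP_CPU_VISIBLE (1ull << 0)
#define EXAMPLE_REGION_CAP_COHERENT    (1ull << 1)
#define EXAMPLE_KNOWN_REGION_CAPS \
    (EXAMPLE_REGION_CAP_CPU_VISIBLE | EXAMPLE_REGION_CAP_COHERENT)

/* Drop anything a newer kernel reports that this build of userspace
 * doesn't understand, then carry on as usual. */
__u64 usable_caps = info.caps & EXAMPLE_KNOWN_REGION_CAPS;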

> I am not claiming to know memory region query will end up the same, and
> I definitely agree we cannot guess the future. I am just saying rsvd
> fields are inconsequential really in terms of maintenance burden and
> have been proven useful in the past. So I disagree with the drive to
> kick them all out.

Sure, it doesn't cost anything to have extra zeros in the struct.
However, if/when the API grows using rsvd fields, we end up with "if
CAP_FOO is set, rsvd[5] means blah" which makes for a horribly
confusing API.  As a userspace person who has to remember how to use
this stuff, I'd rather make another call or chain in a struct than try
to remember and/or figure out what all 8 rsvd fields mean.
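
To make that concrete, this is the shape of thing rsvd reuse tends to
decay into; the names are invented but the pattern is the one I'm
worried about:

/* If EXAMPLE_CAP_FOO is set in info.caps, rsvd[5] is the foo size in
 * bytes; otherwise rsvd[5] is meaningless and must be ignored. */
__u64 foo_size = (info.caps & EXAMPLE_CAP_FOO) ? info.rsvd[5] : 0;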

> Btw extension chains also work for me. It is a bit more complicated and
> we don't have prior art in i915 to use them on the read/get/query side,
> but we could add it if a couple of rsvd fields are so objectionable.

Another option, which I think I mentioned somewhere, is that we could
add a secondary query which takes a memory region class and instance
and lets you query additional properties one-at-a-time.  That might be
easier to extend.  Sure, it means more ioctls but they're not that
expensive and they should only happen at driver initialization so I'm
not that inclined to care about the cost there.
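
As a completely hypothetical sketch of that (none of these names
exist), reusing the class/instance pair from the regions query:

struct example_i915_region_property {
    /* in: which region we are asking about */
    struct drm_i915_gem_memory_class_instance region;
    /* in: one property id per ioctl call, e.g. "probed size" */
    __u32 property_id;
    /* out: value of the requested property */
    __u64 value;
};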

--Jason
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 4/4] drm/doc/rfc: i915 DG1 uAPI
  2021-04-21 13:54               ` Jason Ekstrand
@ 2021-04-21 14:25                 ` Tvrtko Ursulin
  2021-04-21 17:17                   ` Jason Ekstrand
  0 siblings, 1 reply; 27+ messages in thread
From: Tvrtko Ursulin @ 2021-04-21 14:25 UTC (permalink / raw)
  To: Jason Ekstrand
  Cc: Jordan Justen, Intel GFX, Mailing list - DRI developers,
	Kenneth Graunke, Matthew Auld, Daniel Vetter, ML mesa-dev


On 21/04/2021 14:54, Jason Ekstrand wrote:
> On Wed, Apr 21, 2021 at 3:22 AM Tvrtko Ursulin
> <tvrtko.ursulin@linux.intel.com> wrote:
>>
>> On 20/04/2021 18:00, Jason Ekstrand wrote:
>>> On Tue, Apr 20, 2021 at 11:34 AM Tvrtko Ursulin
>>> <tvrtko.ursulin@linux.intel.com> wrote:
>>>>
>>>>
>>>> On 19/04/2021 16:19, Jason Ekstrand wrote:
>>>>> On Mon, Apr 19, 2021 at 7:02 AM Matthew Auld <matthew.auld@intel.com> wrote:
>>>>>>
>>>>>> On 16/04/2021 17:38, Jason Ekstrand wrote:
>>>>>>> On Thu, Apr 15, 2021 at 11:04 AM Matthew Auld <matthew.auld@intel.com> wrote:
>>>>>>>>
>>>>>>>> Add an entry for the new uAPI needed for DG1.
>>>>>>>>
>>>>>>>> v2(Daniel):
>>>>>>>>       - include the overall upstreaming plan
>>>>>>>>       - add a note for mmap, there are differences here for TTM vs i915
>>>>>>>>       - bunch of other suggestions from Daniel
>>>>>>>> v3:
>>>>>>>>      (Daniel)
>>>>>>>>       - add a note for set/get caching stuff
>>>>>>>>       - add some more docs for existing query and extensions stuff
>>>>>>>>       - add an actual code example for regions query
>>>>>>>>       - bunch of other stuff
>>>>>>>>      (Jason)
>>>>>>>>       - uAPI change(!):
>>>>>>>>             - try a simpler design with the placements extension
>>>>>>>>             - rather than have a generic setparam which can cover multiple
>>>>>>>>               use cases, have each extension be responsible for one thing
>>>>>>>>               only
>>>>>>>>
>>>>>>>> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
>>>>>>>> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
>>>>>>>> Cc: Jordan Justen <jordan.l.justen@intel.com>
>>>>>>>> Cc: Daniel Vetter <daniel.vetter@intel.com>
>>>>>>>> Cc: Kenneth Graunke <kenneth@whitecape.org>
>>>>>>>> Cc: Jason Ekstrand <jason@jlekstrand.net>
>>>>>>>> Cc: Dave Airlie <airlied@gmail.com>
>>>>>>>> Cc: dri-devel@lists.freedesktop.org
>>>>>>>> Cc: mesa-dev@lists.freedesktop.org
>>>>>>>> ---
>>>>>>>>      Documentation/gpu/rfc/i915_gem_lmem.h   | 255 ++++++++++++++++++++++++
>>>>>>>>      Documentation/gpu/rfc/i915_gem_lmem.rst | 139 +++++++++++++
>>>>>>>>      Documentation/gpu/rfc/index.rst         |   4 +
>>>>>>>>      3 files changed, 398 insertions(+)
>>>>>>>>      create mode 100644 Documentation/gpu/rfc/i915_gem_lmem.h
>>>>>>>>      create mode 100644 Documentation/gpu/rfc/i915_gem_lmem.rst
>>>>>>>>
>>>>>>>> diff --git a/Documentation/gpu/rfc/i915_gem_lmem.h b/Documentation/gpu/rfc/i915_gem_lmem.h
>>>>>>>> new file mode 100644
>>>>>>>> index 000000000000..2a82a452e9f2
>>>>>>>> --- /dev/null
>>>>>>>> +++ b/Documentation/gpu/rfc/i915_gem_lmem.h
>>>>>>>> @@ -0,0 +1,255 @@
>>>>>>>> +/*
>>>>>>>> + * Note that drm_i915_query_item and drm_i915_query are existing bits of uAPI.
>>>>>>>> + * For the regions query we are just adding a new query id, so no actual new
>>>>>>>> + * ioctl or anything, but including it here for reference.
>>>>>>>> + */
>>>>>>>> +struct drm_i915_query_item {
>>>>>>>> +#define DRM_I915_QUERY_MEMORY_REGIONS   0xdeadbeaf
>>>>>>>> +       ....
>>>>>>>> +        __u64 query_id;
>>>>>>>> +
>>>>>>>> +        /*
>>>>>>>> +         * When set to zero by userspace, this is filled with the size of the
>>>>>>>> +         * data to be written at the data_ptr pointer. The kernel sets this
>>>>>>>> +         * value to a negative value to signal an error on a particular query
>>>>>>>> +         * item.
>>>>>>>> +         */
>>>>>>>> +        __s32 length;
>>>>>>>> +
>>>>>>>> +        __u32 flags;
>>>>>>>> +        /*
>>>>>>>> +         * Data will be written at the location pointed by data_ptr when the
>>>>>>>> +         * value of length matches the length of the data to be written by the
>>>>>>>> +         * kernel.
>>>>>>>> +         */
>>>>>>>> +        __u64 data_ptr;
>>>>>>>> +};
>>>>>>>> +
>>>>>>>> +struct drm_i915_query {
>>>>>>>> +        __u32 num_items;
>>>>>>>> +        /*
>>>>>>>> +         * Unused for now. Must be cleared to zero.
>>>>>>>> +         */
>>>>>>>> +        __u32 flags;
>>>>>>>> +        /*
>>>>>>>> +         * This points to an array of num_items drm_i915_query_item structures.
>>>>>>>> +         */
>>>>>>>> +        __u64 items_ptr;
>>>>>>>> +};
>>>>>>>> +
>>>>>>>> +#define DRM_IOCTL_I915_QUERY   DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_QUERY, struct drm_i915_query)
>>>>>>>> +
>>>>>>>> +/**
>>>>>>>> + * enum drm_i915_gem_memory_class
>>>>>>>> + */
>>>>>>>> +enum drm_i915_gem_memory_class {
>>>>>>>> +       /** @I915_MEMORY_CLASS_SYSTEM: system memory */
>>>>>>>> +       I915_MEMORY_CLASS_SYSTEM = 0,
>>>>>>>> +       /** @I915_MEMORY_CLASS_DEVICE: device local-memory */
>>>>>>>> +       I915_MEMORY_CLASS_DEVICE,
>>>>>>>> +};
>>>>>>>> +
>>>>>>>> +/**
>>>>>>>> + * struct drm_i915_gem_memory_class_instance
>>>>>>>> + */
>>>>>>>> +struct drm_i915_gem_memory_class_instance {
>>>>>>>> +       /** @memory_class: see enum drm_i915_gem_memory_class */
>>>>>>>> +       __u16 memory_class;
>>>>>>>> +
>>>>>>>> +       /** @memory_instance: which instance */
>>>>>>>> +       __u16 memory_instance;
>>>>>>>> +};
>>>>>>>> +
>>>>>>>> +/**
>>>>>>>> + * struct drm_i915_memory_region_info
>>>>>>>> + *
>>>>>>>> + * Describes one region as known to the driver.
>>>>>>>> + *
>>>>>>>> + * Note that we reserve quite a lot of stuff here for potential future work. As
>>>>>>>> + * an example we might want to expose the capabilities (see caps) for a
>>>>>>>> + * given region, which could include things like whether the region is
>>>>>>>> + * CPU mappable/accessible, etc.
>>>>>>>
>>>>>>> I get caps but I'm seriously at a loss as to what the rest of this
>>>>>>> would be used for.  Why are caps and flags both there and separate?
>>>>>>> Flags are typically something you set, not query.  Also, what's with
>>>>>>> rsvd1 at the end?  This smells of substantial over-building to me.
>>>>>>>
>>>>>>> I thought to myself, "maybe I'm missing a future use-case" so I looked
>>>>>>> at the internal tree and none of this is being used there either.
>>>>>>> This indicates to me that either I'm missing something and there's
>>>>>>> code somewhere I don't know about or, with three years of building on
>>>>>>> internal branches, we still haven't proven that any of this is needed.
>>>>>>> If it's the latter, which I strongly suspect, maybe we should drop the
>>>>>>> unnecessary bits and only add them back in if and when we have proof
>>>>>>> that they're useful.
>>>>>>
>>>>>> Do you mean just drop caps/flags here, but keep/inflate rsvd0/rsvd1,
>>>>>> which is less opinionated about future unknowns? If so, makes sense to me.
>>>>>
>>>>> I meant drop flags and rsvd1.  We need rsvd0 for padding and  I can
>>>>> see some value to caps.  We may want to advertise, for instance, what
>>>>> mapping coherency types are available per-heap.  But I don't see any
>>>>> use for any of the other fields.
>>>>
>>>> I'd suggest making sure at least enough rsvd fields remain so that flags
>>>> could be added later if needed. Experience from engine info shows that
>>>> both were required in order to extend the query via re-purposing the
>>>> rsvds and adding flag bits to indicate when a certain rsvd contains a
>>>> new piece of information.
>>>
>>> Looking at DII, all I see is we started using caps.  I already said
>>> I'm fine with caps.  I can already imagine some useful ones like
>>> specifying what kinds of mappings we can do.
>>>
>>> If we're concerned about more complicated stuff, I argue that we have
>>> no ability to predict what that will look like and so just throwing in
>>> a bunch of __u32 rsvd[N] is blind guessing.  I'm seeing a lot of that
>>> in the recently added APIs such as the flags and rsvd[4] in
>>> i915_user_extension.  What's that there for?  Why can't you put that
>>> information in the extension struct which derives from it?  Maybe it's
>>> so that we can extend it.  But we added that struct as part of an
>>> extension mechanism!?!
>>>
>>> If we want to make things extensible, Vulkan actually provides some
>>> prior art for this in the form of allowing queries to be extended just
>>> like input structs.  We could add a __u64 extensions field to
>>> memory_region_info and, if we ever need to query more info, the client
>>> can provide a chain of additional per-region queries.  Yeah, there are
>>> problems with it and it gets a bit clunky but it does work pretty
>>> well.
>>>
>>>> I probably cannot go into too much detail
>>>> here, but anyway the point is just to make sure too much is not stripped
>>>> out so that instead of simply adding fields/flags we have to add a new
>>>> query in the future. IMO some rsvd fields are not really harmful and if
>>>> they can make things easier in the future why not.
>>>
>>> Maybe it's my tired and generally grumpy state of mind but I'm not
>>> particularly favorable towards "why not?" as a justification for
>>> immutable kernel APIs.  We've already found a few places where
>>> Zoidberg API design has caused us problems.  We need an answer to
>>> "why?"  Future extensibility is a potentially valid answer but we need
>>> to do a better job of thinking through it than we have in the past.
>>
>> I did not simply say why not, did I?
> 
> You literally did:  "...and if they can make things easier in the
> future why not."

You quote the second *part* of *one* sentence from my reply in response
to my statement that I said more in my reply than just that bit?

>> It is a balance thing between cost
>> and benefit. I see the cost of rsvd fields as approaching zero really,
>> and cost of having to add query v2 if we end up having not enough rsvd
>> as definitely way bigger.
>>
>> If you look at the mentioned engine info query you will see that as soon
>> as you add some caps, flags become useful (so userspace can answer the
>> question of whether the object does not support this cap or the kernel
>> does not even know about the cap).
>>
>> Furthermore, in that uapi, caps pertain to the property of the
>> underlying object being queried, while the flags pertain to the query
>> itself. I find that separation logical and useful.
> 
> Ok, that answers the question I asked above: "what are flags for and
> why are they different?"  At the very least, that should be
> documented.  Then again...  We really want a GETPARAM query any time a
> kernel interface changes, such as adding caps, and we can say that
> userspace should ignore caps it doesn't understand.  I think that
> solves both directions of the negotiation without flags.

I said to look at engine info, didn't I?

Getparam also works to some extent, but it's IMO too flat and top-level
to hold the answers to random hierarchical questions.

GET_PARAM_DOES_QUERY_<x>_SUPPORT_CAP_<y>? Nah.. a bit clumsy I think 
when we can return the supported caps in the query itself.
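
Very roughly, and with the fields boiled down to just the idea rather
than the actual engine info layout, the shape I mean is:

struct example_info {
    /* flags describe the query result itself, e.g. which of the
     * fields below this kernel actually filled in */
    __u64 flags;
    /* caps describe the object being queried */
    __u64 caps;
    __u64 rsvd[4];
};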

>> I am not claiming to know memory region query will end up the same, and
>> I definitely agree we cannot guess the future. I am just saying rsvd
>> fields are inconsequential really in terms of maintenance burden and
>> have been proven useful in the past. So I disagree with the drive to
>> kick them all out.
> 
> Sure, it doesn't cost anything to have extra zeros in the struct.
> However, if/when the API grows using rsvd fields, we end up with "if
> CAP_FOO is set, rsvd[5] means blah" which makes for a horribly
> confusing API.  As a userspace person who has to remember how to use
> this stuff, I'd rather make another call or chain in a struct than try
> to remember and/or figure out what all 8 rsvd fields mean.

Well, it's not called rsvd in the version of the uapi which is aware of
the new field; there it has a new name.

>> Btw extension chains also work for me. It is a bit more complicated and
>> we don't have prior art in i915 to use them on the read/get/query side,
>> but we could add it if a couple of rsvd fields are so objectionable.
> 
> Another option, which I think I mentioned somewhere, is that we could
> add a secondary query which takes a memory region class and instance
> and lets you query additional properties one-at-a-time.  That might be
> easier to extend.  Sure, it means more ioctls but they're not that
> expensive and they should only happen at driver initialization so I'm
> not that inclined to care about the cost there.

Or leave flags and some rsvd so you can add extensions later. :)

Regards,

Tvrtko
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 4/4] drm/doc/rfc: i915 DG1 uAPI
  2021-04-21 14:25                 ` Tvrtko Ursulin
@ 2021-04-21 17:17                   ` Jason Ekstrand
  2021-04-21 18:28                     ` Tvrtko Ursulin
  0 siblings, 1 reply; 27+ messages in thread
From: Jason Ekstrand @ 2021-04-21 17:17 UTC (permalink / raw)
  To: Tvrtko Ursulin
  Cc: Jordan Justen, Intel GFX, Mailing list - DRI developers,
	Kenneth Graunke, Matthew Auld, Daniel Vetter, ML mesa-dev

On Wed, Apr 21, 2021 at 9:25 AM Tvrtko Ursulin
<tvrtko.ursulin@linux.intel.com> wrote:
>
>
> On 21/04/2021 14:54, Jason Ekstrand wrote:
> > On Wed, Apr 21, 2021 at 3:22 AM Tvrtko Ursulin
> > <tvrtko.ursulin@linux.intel.com> wrote:
> >>
> >> On 20/04/2021 18:00, Jason Ekstrand wrote:
> >>> On Tue, Apr 20, 2021 at 11:34 AM Tvrtko Ursulin
> >>> <tvrtko.ursulin@linux.intel.com> wrote:
> >>>>
> >>>>
> >>>> On 19/04/2021 16:19, Jason Ekstrand wrote:
> >>>>> On Mon, Apr 19, 2021 at 7:02 AM Matthew Auld <matthew.auld@intel.com> wrote:
> >>>>>>
> >>>>>> On 16/04/2021 17:38, Jason Ekstrand wrote:
> >>>>>>> On Thu, Apr 15, 2021 at 11:04 AM Matthew Auld <matthew.auld@intel.com> wrote:
> >>>>>>>>
> >>>>>>>> Add an entry for the new uAPI needed for DG1.
> >>>>>>>>
> >>>>>>>> v2(Daniel):
> >>>>>>>>       - include the overall upstreaming plan
> >>>>>>>>       - add a note for mmap, there are differences here for TTM vs i915
> >>>>>>>>       - bunch of other suggestions from Daniel
> >>>>>>>> v3:
> >>>>>>>>      (Daniel)
> >>>>>>>>       - add a note for set/get caching stuff
> >>>>>>>>       - add some more docs for existing query and extensions stuff
> >>>>>>>>       - add an actual code example for regions query
> >>>>>>>>       - bunch of other stuff
> >>>>>>>>      (Jason)
> >>>>>>>>       - uAPI change(!):
> >>>>>>>>             - try a simpler design with the placements extension
> >>>>>>>>             - rather than have a generic setparam which can cover multiple
> >>>>>>>>               use cases, have each extension be responsible for one thing
> >>>>>>>>               only
> >>>>>>>>
> >>>>>>>> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> >>>>>>>> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> >>>>>>>> Cc: Jordan Justen <jordan.l.justen@intel.com>
> >>>>>>>> Cc: Daniel Vetter <daniel.vetter@intel.com>
> >>>>>>>> Cc: Kenneth Graunke <kenneth@whitecape.org>
> >>>>>>>> Cc: Jason Ekstrand <jason@jlekstrand.net>
> >>>>>>>> Cc: Dave Airlie <airlied@gmail.com>
> >>>>>>>> Cc: dri-devel@lists.freedesktop.org
> >>>>>>>> Cc: mesa-dev@lists.freedesktop.org
> >>>>>>>> ---
> >>>>>>>>      Documentation/gpu/rfc/i915_gem_lmem.h   | 255 ++++++++++++++++++++++++
> >>>>>>>>      Documentation/gpu/rfc/i915_gem_lmem.rst | 139 +++++++++++++
> >>>>>>>>      Documentation/gpu/rfc/index.rst         |   4 +
> >>>>>>>>      3 files changed, 398 insertions(+)
> >>>>>>>>      create mode 100644 Documentation/gpu/rfc/i915_gem_lmem.h
> >>>>>>>>      create mode 100644 Documentation/gpu/rfc/i915_gem_lmem.rst
> >>>>>>>>
> >>>>>>>> diff --git a/Documentation/gpu/rfc/i915_gem_lmem.h b/Documentation/gpu/rfc/i915_gem_lmem.h
> >>>>>>>> new file mode 100644
> >>>>>>>> index 000000000000..2a82a452e9f2
> >>>>>>>> --- /dev/null
> >>>>>>>> +++ b/Documentation/gpu/rfc/i915_gem_lmem.h
> >>>>>>>> @@ -0,0 +1,255 @@
> >>>>>>>> +/*
> >>>>>>>> + * Note that drm_i915_query_item and drm_i915_query are existing bits of uAPI.
> >>>>>>>> + * For the regions query we are just adding a new query id, so no actual new
> >>>>>>>> + * ioctl or anything, but including it here for reference.
> >>>>>>>> + */
> >>>>>>>> +struct drm_i915_query_item {
> >>>>>>>> +#define DRM_I915_QUERY_MEMORY_REGIONS   0xdeadbeaf
> >>>>>>>> +       ....
> >>>>>>>> +        __u64 query_id;
> >>>>>>>> +
> >>>>>>>> +        /*
> >>>>>>>> +         * When set to zero by userspace, this is filled with the size of the
> >>>>>>>> +         * data to be written at the data_ptr pointer. The kernel sets this
> >>>>>>>> +         * value to a negative value to signal an error on a particular query
> >>>>>>>> +         * item.
> >>>>>>>> +         */
> >>>>>>>> +        __s32 length;
> >>>>>>>> +
> >>>>>>>> +        __u32 flags;
> >>>>>>>> +        /*
> >>>>>>>> +         * Data will be written at the location pointed by data_ptr when the
> >>>>>>>> +         * value of length matches the length of the data to be written by the
> >>>>>>>> +         * kernel.
> >>>>>>>> +         */
> >>>>>>>> +        __u64 data_ptr;
> >>>>>>>> +};
> >>>>>>>> +
> >>>>>>>> +struct drm_i915_query {
> >>>>>>>> +        __u32 num_items;
> >>>>>>>> +        /*
> >>>>>>>> +         * Unused for now. Must be cleared to zero.
> >>>>>>>> +         */
> >>>>>>>> +        __u32 flags;
> >>>>>>>> +        /*
> >>>>>>>> +         * This points to an array of num_items drm_i915_query_item structures.
> >>>>>>>> +         */
> >>>>>>>> +        __u64 items_ptr;
> >>>>>>>> +};
> >>>>>>>> +
> >>>>>>>> +#define DRM_IOCTL_I915_QUERY   DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_QUERY, struct drm_i915_query)
> >>>>>>>> +
> >>>>>>>> +/**
> >>>>>>>> + * enum drm_i915_gem_memory_class
> >>>>>>>> + */
> >>>>>>>> +enum drm_i915_gem_memory_class {
> >>>>>>>> +       /** @I915_MEMORY_CLASS_SYSTEM: system memory */
> >>>>>>>> +       I915_MEMORY_CLASS_SYSTEM = 0,
> >>>>>>>> +       /** @I915_MEMORY_CLASS_DEVICE: device local-memory */
> >>>>>>>> +       I915_MEMORY_CLASS_DEVICE,
> >>>>>>>> +};
> >>>>>>>> +
> >>>>>>>> +/**
> >>>>>>>> + * struct drm_i915_gem_memory_class_instance
> >>>>>>>> + */
> >>>>>>>> +struct drm_i915_gem_memory_class_instance {
> >>>>>>>> +       /** @memory_class: see enum drm_i915_gem_memory_class */
> >>>>>>>> +       __u16 memory_class;
> >>>>>>>> +
> >>>>>>>> +       /** @memory_instance: which instance */
> >>>>>>>> +       __u16 memory_instance;
> >>>>>>>> +};
> >>>>>>>> +
> >>>>>>>> +/**
> >>>>>>>> + * struct drm_i915_memory_region_info
> >>>>>>>> + *
> >>>>>>>> + * Describes one region as known to the driver.
> >>>>>>>> + *
> >>>>>>>> + * Note that we reserve quite a lot of stuff here for potential future work. As
> >>>>>>>> + * an example we might want to expose the capabilities (see caps) for a
> >>>>>>>> + * given region, which could include things like whether the region is
> >>>>>>>> + * CPU mappable/accessible, etc.
> >>>>>>>
> >>>>>>> I get caps but I'm seriously at a loss as to what the rest of this
> >>>>>>> would be used for.  Why are caps and flags both there and separate?
> >>>>>>> Flags are typically something you set, not query.  Also, what's with
> >>>>>>> rsvd1 at the end?  This smells of substantial over-building to me.
> >>>>>>>
> >>>>>>> I thought to myself, "maybe I'm missing a future use-case" so I looked
> >>>>>>> at the internal tree and none of this is being used there either.
> >>>>>>> This indicates to me that either I'm missing something and there's
> >>>>>>> code somewhere I don't know about or, with three years of building on
> >>>>>>> internal branches, we still haven't proven that any of this is needed.
> >>>>>>> If it's the latter, which I strongly suspect, maybe we should drop the
> >>>>>>> unnecessary bits and only add them back in if and when we have proof
> >>>>>>> that they're useful.
> >>>>>>
> >>>>>> Do you mean just drop caps/flags here, but keep/inflate rsvd0/rsvd1,
> >>>>>> which is less opinionated about future unknowns? If so, makes sense to me.
> >>>>>
> >>>>> I meant drop flags and rsvd1.  We need rsvd0 for padding and  I can
> >>>>> see some value to caps.  We may want to advertise, for instance, what
> >>>>> mapping coherency types are available per-heap.  But I don't see any
> >>>>> use for any of the other fields.
> >>>>
> >>>> I'd suggest making sure at least enough rsvd fields remain so that flags
> >>>> could be added later if needed. Experience from engine info shows that
> >>>> both were required in order to extend the query via re-purposing the
> >>>> rsvds and adding flag bits to indicate when a certain rsvd contains a
> >>>> new piece of information.
> >>>
> >>> Looking at DII, all I see is we started using caps.  I already said
> >>> I'm fine with caps.  I can already imagine some useful ones like
> >>> specifying what kinds of mappings we can do.
> >>>
> >>> If we're concerned about more complicated stuff, I argue that we have
> >>> no ability to predict what that will look like and so just throwing in
> >>> a bunch of __u32 rsvd[N] is blind guessing.  I'm seeing a lot of that
> >>> in the recently added APIs such as the flags and rsvd[4] in
> >>> i915_user_extension.  What's that there for?  Why can't you put that
> >>> information in the extension struct which derives from it?  Maybe it's
> >>> so that we can extend it.  But we added that struct as part of an
> >>> extension mechanism!?!
> >>>
> >>> If we want to make things extensible, Vulkan actually provides some
> >>> prior art for this in the form of allowing queries to be extended just
> >>> like input structs.  We could add a __u64 extensions field to
> >>> memory_region_info and, if we ever need to query more info, the client
> >>> can provide a chain of additional per-region queries.  Yeah, there are
> >>> problems with it and it gets a bit clunky but it does work pretty
> >>> well.
> >>>
> >>>> I probably cannot go into too much detail
> >>>> here, but anyway the point is just to make sure too much is not stripped
> >>>> out so that instead of simply adding fields/flags we have to add a new
> >>>> query in the future. IMO some rsvd fields are not really harmful and if
> >>>> they can make things easier in the future why not.
> >>>
> >>> Maybe it's my tired and generally grumpy state of mind but I'm not
> >>> particularly favorable towards "why not?" as a justification for
> >>> immutable kernel APIs.  We've already found a few places where
> >>> Zoidberg API design has caused us problems.  We need an answer to
> >>> "why?"  Future extensibility is a potentially valid answer but we need
> >>> to do a better job of thinking through it than we have in the past.
> >>
> >> I did not simply say why not, did I?
> >
> > You literally did:  "...and if they can make things easier in the
> > future why not."
>
> You quote the second *part* of *one* sentence from my reply in response
> to my statement that I said more in my reply than just that bit?

My point is that "might possibly be useful to someone some day and no
reason why not" is NOT a valid justification for an API.  That's
exactly how we ended up with mutable engine sets and similar horrors.
Yes, this is clearly less dangerous but it's not clearly any more
useful.  We need, at the very least, a plan.


> >> It is a balance thing between cost
> >> and benefit. I see the cost of rsvd fields as approaching zero really,
> >> and cost of having to add query v2 if we end up having not enough rsvd
> >> as definitely way bigger.
> >>
> >> If you look at the mentioned engine info query you will see that as soon
> >> as you add some caps, flags become useful (so userspace can answer the
> >> question of whether the object does not support this cap or the kernel
> >> does not even know about the cap).
> >>
> >> Furthermore, in that uapi, caps pertain to the property of the
> >> underlying object being queried, while the flags pertain to the query
> >> itself. I find that separation logical and useful.
> >
> > Ok, that answers the question I asked above: "what are flags for and
> > why are they different?"  At the very least, that should be
> > documented.  Then again...  We really want a GETPARAM query any time a
> > kernel interface changes, such as adding caps, and we can say that
> > userspace should ignore caps it doesn't understand.  I think that
> > solves both directions of the negotiation without flags.
>
> I said to look at engine info, didn't I?
>
> Getparam also works to some extent, but it's IMO too flat and top-level
> to hold the answers to random hierarchical questions.
>
> GET_PARAM_DOES_QUERY_<x>_SUPPORT_CAP_<y>? Nah.. a bit clumsy I think
> when we can return the supported caps in the query itself.

Ok, so the flags are an output from i915?  Sorry, but without digging
through the code, that wasn't obvious.

As far as GET_PARAM_DOES_QUERY_<x>_SUPPORT_CAP_<y>, that's fine with
me.  Sure, it's a little clumsy, but it's better than the
guess-and-check that we often do with -EINVAL.  If we had a
`known_caps`, that'd work too.  Or we could split the difference with
GET_PARAM_QUERY_ENGINE_INFO_KNOWN_CAPS.  In any case, there are many
options that aren't that bad and aren't throwing misc rsvd bits in
there.
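
For the record, the `known_caps` variant would look something like
this (all names invented):

struct example_region_info {
    /* out: every cap bit this kernel knows how to report */
    __u64 known_caps;
    /* out: the caps this particular region actually has */
    __u64 caps;
};

/* Any given bit now has three distinguishable states: supported,
 * known but unsupported, and unknown to this kernel. */
bool supported = info.caps & EXAMPLE_CAP_FOO;
bool known     = info.known_caps & EXAMPLE_CAP_FOO;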


> >> I am not claiming to know memory region query will end up the same, and
> >> I definitely agree we cannot guess the future. I am just saying rsvd
> >> fields are inconsequential really in terms of maintenance burden and
> >> have been proven useful in the past. So I disagree with the drive to
> >> kick them all out.
> >
> > Sure, it doesn't cost anything to have extra zeros in the struct.
> > However, if/when the API grows using rsvd fields, we end up with "if
> > CAP_FOO is set, rsvd[5] means blah" which makes for a horribly
> > confusing API.  As a userspace person who has to remember how to use
> > this stuff, I'd rather make another call or chain in a struct than try
> > to remember and/or figure out what all 8 rsvd fields mean.
>
> Well, it's not called rsvd in the version of the uapi which is aware of
> the new field; there it has a new name.

Are we allowed to do that?  This is a genuine question.  When I've
tried in the past (cliprects), I was told we couldn't rename it even
though literally no one had used it in code for years.

--Jason


> >> Btw extension chains also work for me. It is a bit more complicated and
> >> we don't have prior art in i915 to use them on the read/get/query side,
> >> but we could add it if a couple of rsvd fields are so objectionable.
> >
> > Another option, which I think I mentioned somewhere, is that we could
> > add a secondary query which takes a memory region class and instance
> > and lets you query additional properties one-at-a-time.  That might be
> > easier to extend.  Sure, it means more ioctls but they're not that
> > expensive and they should only happen at driver initialization so I'm
> > not that inclined to care about the cost there.
>
> Or leave flags and some rsvd so you can add extensions later. :)
>
> Regards,
>
> Tvrtko
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 4/4] drm/doc/rfc: i915 DG1 uAPI
  2021-04-21 17:17                   ` Jason Ekstrand
@ 2021-04-21 18:28                     ` Tvrtko Ursulin
  2021-04-21 19:23                       ` [Mesa-dev] " Daniel Vetter
  0 siblings, 1 reply; 27+ messages in thread
From: Tvrtko Ursulin @ 2021-04-21 18:28 UTC (permalink / raw)
  To: Jason Ekstrand
  Cc: Jordan Justen, Intel GFX, Mailing list - DRI developers,
	Kenneth Graunke, Matthew Auld, Daniel Vetter, ML mesa-dev


On 21/04/2021 18:17, Jason Ekstrand wrote:
> On Wed, Apr 21, 2021 at 9:25 AM Tvrtko Ursulin
> <tvrtko.ursulin@linux.intel.com> wrote:
>>
>>
>> On 21/04/2021 14:54, Jason Ekstrand wrote:
>>> On Wed, Apr 21, 2021 at 3:22 AM Tvrtko Ursulin
>>> <tvrtko.ursulin@linux.intel.com> wrote:
>>>>
>>>> On 20/04/2021 18:00, Jason Ekstrand wrote:
>>>>> On Tue, Apr 20, 2021 at 11:34 AM Tvrtko Ursulin
>>>>> <tvrtko.ursulin@linux.intel.com> wrote:
>>>>>>
>>>>>>
>>>>>> On 19/04/2021 16:19, Jason Ekstrand wrote:
>>>>>>> On Mon, Apr 19, 2021 at 7:02 AM Matthew Auld <matthew.auld@intel.com> wrote:
>>>>>>>>
>>>>>>>> On 16/04/2021 17:38, Jason Ekstrand wrote:
>>>>>>>>> On Thu, Apr 15, 2021 at 11:04 AM Matthew Auld <matthew.auld@intel.com> wrote:
>>>>>>>>>>
>>>>>>>>>> Add an entry for the new uAPI needed for DG1.
>>>>>>>>>>
>>>>>>>>>> v2(Daniel):
>>>>>>>>>>        - include the overall upstreaming plan
>>>>>>>>>>        - add a note for mmap, there are differences here for TTM vs i915
>>>>>>>>>>        - bunch of other suggestions from Daniel
>>>>>>>>>> v3:
>>>>>>>>>>       (Daniel)
>>>>>>>>>>        - add a note for set/get caching stuff
>>>>>>>>>>        - add some more docs for existing query and extensions stuff
>>>>>>>>>>        - add an actual code example for regions query
>>>>>>>>>>        - bunch of other stuff
>>>>>>>>>>       (Jason)
>>>>>>>>>>        - uAPI change(!):
>>>>>>>>>>              - try a simpler design with the placements extension
>>>>>>>>>>              - rather than have a generic setparam which can cover multiple
>>>>>>>>>>                use cases, have each extension be responsible for one thing
>>>>>>>>>>                only
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
>>>>>>>>>> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
>>>>>>>>>> Cc: Jordan Justen <jordan.l.justen@intel.com>
>>>>>>>>>> Cc: Daniel Vetter <daniel.vetter@intel.com>
>>>>>>>>>> Cc: Kenneth Graunke <kenneth@whitecape.org>
>>>>>>>>>> Cc: Jason Ekstrand <jason@jlekstrand.net>
>>>>>>>>>> Cc: Dave Airlie <airlied@gmail.com>
>>>>>>>>>> Cc: dri-devel@lists.freedesktop.org
>>>>>>>>>> Cc: mesa-dev@lists.freedesktop.org
>>>>>>>>>> ---
>>>>>>>>>>       Documentation/gpu/rfc/i915_gem_lmem.h   | 255 ++++++++++++++++++++++++
>>>>>>>>>>       Documentation/gpu/rfc/i915_gem_lmem.rst | 139 +++++++++++++
>>>>>>>>>>       Documentation/gpu/rfc/index.rst         |   4 +
>>>>>>>>>>       3 files changed, 398 insertions(+)
>>>>>>>>>>       create mode 100644 Documentation/gpu/rfc/i915_gem_lmem.h
>>>>>>>>>>       create mode 100644 Documentation/gpu/rfc/i915_gem_lmem.rst
>>>>>>>>>>
>>>>>>>>>> diff --git a/Documentation/gpu/rfc/i915_gem_lmem.h b/Documentation/gpu/rfc/i915_gem_lmem.h
>>>>>>>>>> new file mode 100644
>>>>>>>>>> index 000000000000..2a82a452e9f2
>>>>>>>>>> --- /dev/null
>>>>>>>>>> +++ b/Documentation/gpu/rfc/i915_gem_lmem.h
>>>>>>>>>> @@ -0,0 +1,255 @@
>>>>>>>>>> +/*
>>>>>>>>>> + * Note that drm_i915_query_item and drm_i915_query are existing bits of uAPI.
>>>>>>>>>> + * For the regions query we are just adding a new query id, so no actual new
>>>>>>>>>> + * ioctl or anything, but including it here for reference.
>>>>>>>>>> + */
>>>>>>>>>> +struct drm_i915_query_item {
>>>>>>>>>> +#define DRM_I915_QUERY_MEMORY_REGIONS   0xdeadbeaf
>>>>>>>>>> +       ....
>>>>>>>>>> +        __u64 query_id;
>>>>>>>>>> +
>>>>>>>>>> +        /*
>>>>>>>>>> +         * When set to zero by userspace, this is filled with the size of the
>>>>>>>>>> +         * data to be written at the data_ptr pointer. The kernel sets this
>>>>>>>>>> +         * value to a negative value to signal an error on a particular query
>>>>>>>>>> +         * item.
>>>>>>>>>> +         */
>>>>>>>>>> +        __s32 length;
>>>>>>>>>> +
>>>>>>>>>> +        __u32 flags;
>>>>>>>>>> +        /*
>>>>>>>>>> +         * Data will be written at the location pointed by data_ptr when the
>>>>>>>>>> +         * value of length matches the length of the data to be written by the
>>>>>>>>>> +         * kernel.
>>>>>>>>>> +         */
>>>>>>>>>> +        __u64 data_ptr;
>>>>>>>>>> +};
>>>>>>>>>> +
>>>>>>>>>> +struct drm_i915_query {
>>>>>>>>>> +        __u32 num_items;
>>>>>>>>>> +        /*
>>>>>>>>>> +         * Unused for now. Must be cleared to zero.
>>>>>>>>>> +         */
>>>>>>>>>> +        __u32 flags;
>>>>>>>>>> +        /*
>>>>>>>>>> +         * This points to an array of num_items drm_i915_query_item structures.
>>>>>>>>>> +         */
>>>>>>>>>> +        __u64 items_ptr;
>>>>>>>>>> +};
>>>>>>>>>> +
>>>>>>>>>> +#define DRM_IOCTL_I915_QUERY   DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_QUERY, struct drm_i915_query)
>>>>>>>>>> +
>>>>>>>>>> +/**
>>>>>>>>>> + * enum drm_i915_gem_memory_class
>>>>>>>>>> + */
>>>>>>>>>> +enum drm_i915_gem_memory_class {
>>>>>>>>>> +       /** @I915_MEMORY_CLASS_SYSTEM: system memory */
>>>>>>>>>> +       I915_MEMORY_CLASS_SYSTEM = 0,
>>>>>>>>>> +       /** @I915_MEMORY_CLASS_DEVICE: device local-memory */
>>>>>>>>>> +       I915_MEMORY_CLASS_DEVICE,
>>>>>>>>>> +};
>>>>>>>>>> +
>>>>>>>>>> +/**
>>>>>>>>>> + * struct drm_i915_gem_memory_class_instance
>>>>>>>>>> + */
>>>>>>>>>> +struct drm_i915_gem_memory_class_instance {
>>>>>>>>>> +       /** @memory_class: see enum drm_i915_gem_memory_class */
>>>>>>>>>> +       __u16 memory_class;
>>>>>>>>>> +
>>>>>>>>>> +       /** @memory_instance: which instance */
>>>>>>>>>> +       __u16 memory_instance;
>>>>>>>>>> +};
>>>>>>>>>> +
>>>>>>>>>> +/**
>>>>>>>>>> + * struct drm_i915_memory_region_info
>>>>>>>>>> + *
>>>>>>>>>> + * Describes one region as known to the driver.
>>>>>>>>>> + *
>>>>>>>>>> + * Note that we reserve quite a lot of stuff here for potential future work. As
>>>>>>>>>> + * an example we might want to expose the capabilities (see caps) for a
>>>>>>>>>> + * given region, which could include things like whether the region is
>>>>>>>>>> + * CPU mappable/accessible, etc.
>>>>>>>>>
>>>>>>>>> I get caps but I'm seriously at a loss as to what the rest of this
>>>>>>>>> would be used for.  Why are caps and flags both there and separate?
>>>>>>>>> Flags are typically something you set, not query.  Also, what's with
>>>>>>>>> rsvd1 at the end?  This smells of substantial over-building to me.
>>>>>>>>>
>>>>>>>>> I thought to myself, "maybe I'm missing a future use-case" so I looked
>>>>>>>>> at the internal tree and none of this is being used there either.
>>>>>>>>> This indicates to me that either I'm missing something and there's
>>>>>>>>> code somewhere I don't know about or, with three years of building on
>>>>>>>>> internal branches, we still haven't proven that any of this is needed.
>>>>>>>>> If it's the latter, which I strongly suspect, maybe we should drop the
>>>>>>>>> unnecessary bits and only add them back in if and when we have proof
>>>>>>>>> that they're useful.
>>>>>>>>
>>>>>>>> Do you mean just drop caps/flags here, but keep/inflate rsvd0/rsvd1,
>>>>>>>> which is less opinionated about future unknowns? If so, makes sense to me.
>>>>>>>
>>>>>>> I meant drop flags and rsvd1.  We need rsvd0 for padding and  I can
>>>>>>> see some value to caps.  We may want to advertise, for instance, what
>>>>>>> mapping coherency types are available per-heap.  But I don't see any
>>>>>>> use for any of the other fields.
>>>>>>
>>>>>> I'd suggest making sure at least enough rsvd fields remain so that flags
>>>>>> could be added later if needed. Experience from engine info shows that
>>>>>> both were required in order to extend the query via re-purposing the
>>>>>> rsvds and adding flag bits to indicate when a certain rsvd contains a
>>>>>> new piece of information.
>>>>>
>>>>> Looking at DII, all I see is we started using caps.  I already said
>>>>> I'm fine with caps.  I can already imagine some useful ones like
>>>>> specifying what kinds of mappings we can do.
>>>>>
>>>>> If we're concerned about more complicated stuff, I argue that we have
>>>>> no ability to predict what that will look like and so just throwing in
>>>>> a bunch of __u32 rsvd[N] is blind guessing.  I'm seeing a lot of that
>>>>> in the recently added APIs such as the flags and rsvd[4] in
>>>>> i915_user_extension.  What's that there for?  Why can't you put that
>>>>> information in the extension struct which derives from it?  Maybe it's
>>>>> so that we can extend it.  But we added that struct as part of an
>>>>> extension mechanism!?!
>>>>>
>>>>> If we want to make things extensible, Vulkan actually provides some
>>>>> prior art for this in the form of allowing queries to be extended just
>>>>> like input structs.  We could add a __u64 extensions field to
>>>>> memory_region_info and, if we ever need to query more info, the client
>>>>> can provide a chain of additional per-region queries.  Yeah, there are
>>>>> problems with it and it gets a bit clunky but it does work pretty
>>>>> well.
>>>>>
>>>>>> I probably cannot go into too much detail
>>>>>> here, but anyway the point is just to make sure too much is not stripped
>>>>>> out so that instead of simply adding fields/flags we have to add a new
>>>>>> query in the future. IMO some rsvd fields are not really harmful and if
>>>>>> they can make things easier in the future why not.
>>>>>
>>>>> Maybe it's my tired and generally grumpy state of mind but I'm not
>>>>> particularly favorable towards "why not?" as a justification for
>>>>> immutable kernel APIs.  We've already found a few places where
>>>>> Zoidberg API design has caused us problems.  We need an answer to
>>>>> "why?"  Future extensibility is a potentially valid answer but we need
>>>>> to do a better job of thinking through it than we have in the past.
>>>>
>>>> I did not simply say why not, did I?
>>>
>>> You literally did:  "...and if they can make things easier in the
>>> future why not."
>>
>> You quote the second *part* of *one* sentence from my reply in response
>> to my statement that I said more in my reply than just that bit?
> 
> My point is that "might possibly be useful to someone some day and no
> reason why not" is NOT a valid justification for an API.  That's
> exactly how we ended up with mutable engine sets and similar horrors.
> Yes, this is clearly less dangerous but it's not clearly any more
> useful.  We need, at the very least, a plan.

I also mentioned that we had past experience with rsvd fields being
useful in the engine info query.

And I don't really subscribe to the black-and-white view of bringing 
mutable engines into the rsvd story. Not everything can be reduced to 
the same fundamentals.

>>>> It is a balance thing between cost
>>>> and benefit. I see the cost of rsvd fields as approaching zero really,
>>>> and cost of having to add query v2 if we end up having not enough rsvd
>>>> as definitely way bigger.
>>>>
>>>> If you look at the mentioned engine info query you will see that as soon
>>>> as you add some caps, flags become useful (so userspace can answer the
>>>> question of whether the object does not support this cap or the kernel
>>>> does not even know about the cap).
>>>>
>>>> Furthermore, in that uapi, caps pertain to the property of the
>>>> underlying object being queried, while the flags pertain to the query
>>>> itself. I find that separation logical and useful.
>>>
>>> Ok, that answers the question I asked above: "what are flags for and
>>> why are they different?"  At the very least, that should be
>>> documented.  Then again...  We really want a GETPARAM query any time a
>>> kernel interface changes, such as adding caps, and we can say that
>>> userspace should ignore caps it doesn't understand.  I think that
>>> solves both directions of the negotiation without flags.
>>
>> I said to look at engine info, didn't I?
>>
>> Getparam also works to some extent, but it's IMO too flat and top-level
>> to hold the answers to random hierarchical questions.
>>
>> GET_PARAM_DOES_QUERY_<x>_SUPPORT_CAP_<y>? Nah.. a bit clumsy I think
>> when we can return the supported caps in the query itself.
> 
> Ok, so the flags are an output from i915?  Sorry, but without digging
> through the code, that wasn't obvious.
> 
> As far as GET_PARAM_DOES_QUERY_<x>_SUPPORT_CAP_<y>, that's fine with
> me.  Sure, it's a little clumsy, but it's better than the
> guess-and-check that we often do with -EINVAL.  If we had a
> `known_caps`, that'd work too.  Or we could split the difference with
> GET_PARAM_QUERY_ENGINE_INFO_KNOWN_CAPS.  In any case, there are many
> options that aren't that bad and aren't throwing misc rsvd bits in
> there.

The GET_PARAM route would IMO be horrible in the case of engine info.
So anything but GET_PARAM, in my view.

>>>> I am not claiming to know memory region query will end up the same, and
>>>> I definitely agree we cannot guess the future. I am just saying rsvd
>>>> fields are inconsequential really in terms of maintenance burden and
>>>> have been proven useful in the past. So I disagree with the drive to
>>>> kick them all out.
>>>
>>> Sure, it doesn't cost anything to have extra zeros in the struct.
>>> However, if/when the API grows using rsvd fields, we end up with "if
>>> CAP_FOO is set, rsvd[5] means blah" which makes for a horribly
>>> confusing API.  As a userspace person who has to remember how to use
>>> this stuff, I'd rather make another call or chain in a struct than try
>>> to remember and/or figure out what all 8 rsvd fields mean.
>>
>> Well, it's not called rsvd in the version of the uapi which is aware of
>> the new field; there it has a new name.
> 
> Are we allowed to do that?  This is a genuine question.  When I've
> tried in the past (cliprects), I was told we couldn't rename it even
> though literally no one had used it in code for years.

Well, we did the union for pad_to_size, so I thought we are allowed that
trick at least. From my experience, backward source-level compatibility
is not always there even with things like glibc. Despite that, whether
we are generally required to stay backward source compatible I will not
claim either way.
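
If I remember the pad_to_size addition correctly, the trick was
roughly this; the layout stays identical, so old userspace built
against the rsvd name keeps compiling:

union {
    __u64 rsvd1;        /* what old userspace was built against */
    __u64 pad_to_size;  /* the new, meaningful name */
};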

Regards,

Tvrtko
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Mesa-dev] [PATCH v3 4/4] drm/doc/rfc: i915 DG1 uAPI
  2021-04-21 18:28                     ` Tvrtko Ursulin
@ 2021-04-21 19:23                       ` Daniel Vetter
  2021-04-26 15:22                         ` Jason Ekstrand
  0 siblings, 1 reply; 27+ messages in thread
From: Daniel Vetter @ 2021-04-21 19:23 UTC (permalink / raw)
  To: Tvrtko Ursulin
  Cc: Intel GFX, Mailing list - DRI developers, Kenneth Graunke,
	Matthew Auld, Jason Ekstrand, ML mesa-dev, Daniel Vetter

On Wed, Apr 21, 2021 at 8:28 PM Tvrtko Ursulin
<tvrtko.ursulin@linux.intel.com> wrote:
> On 21/04/2021 18:17, Jason Ekstrand wrote:
> > On Wed, Apr 21, 2021 at 9:25 AM Tvrtko Ursulin
> > <tvrtko.ursulin@linux.intel.com> wrote:
> >> On 21/04/2021 14:54, Jason Ekstrand wrote:
> >>> On Wed, Apr 21, 2021 at 3:22 AM Tvrtko Ursulin
> >>> <tvrtko.ursulin@linux.intel.com> wrote:
> >>>> On 20/04/2021 18:00, Jason Ekstrand wrote:
> >>>> I am not claiming to know memory region query will end up the same, and
> >>>> I definitely agree we cannot guess the future. I am just saying rsvd
> >>>> fields are inconsequential really in terms of maintenance burden and
> >>>> have been proven useful in the past. So I disagree with the drive to
> >>>> kick them all out.
> >>>
> >>> Sure, it doesn't cost anything to have extra zeros in the struct.
> >>> However, if/when the API grows using rsvd fields, we end up with "if
> >>> CAP_FOO is set, rsvd[5] means blah" which makes for a horribly
> >>> confusing API.  As a userspace person who has to remember how to use
> >>> this stuff, I'd rather make another call or chain in a struct than try
> >>> to remember and/or figure out what all 8 rsvd fields mean.
> >>
> >> Well, it's not called rsvd in the version of the uapi which is aware of
> >> the new field; there it has a new name.
> >
> > Are we allowed to do that?  This is a genuine question.  When I've
> > tried in the past (cliprects), I was told we couldn't rename it even
> > though literally no one had used it in code for years.
>
> Well, we did the union for pad_to_size, so I thought we are allowed that
> trick at least. From my experience, backward source-level compatibility
> is not always there even with things like glibc. Despite that, whether
> we are generally required to stay backward source compatible I will not
> claim either way.

I think the anonymous union with an exactly same-sized field is ok. We
also try hard to be source compatible, but we have screwed up in the
past and shrugged it off. The one example that comes to mind is
structures extended at the bottom with a new field, which the kernel
automatically zero-extends for old userspace. But when you recompile,
your new-old userspace might no longer clear the new fields, because
the ioctl code didn't start out by memset()ing the entire struct.

But by and large we managed to not botch things up on source compat,
though it's definitely a lot trickier than ABI compat.
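
Concretely, with a hypothetical struct, the failure mode is:

struct example_param {
    __u32 handle;
    __u32 shiny_new_field; /* added in kernel N+1 */
};

struct example_param p; /* note: no memset(&p, 0, sizeof(p)) */
p.handle = handle;      /* this was the whole struct in kernel N */
/* Recompiled against the N+1 headers, shiny_new_field is now
 * uninitialised stack garbage handed straight to the ioctl. */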
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Mesa-dev] [PATCH v3 4/4] drm/doc/rfc: i915 DG1 uAPI
  2021-04-21 19:23                       ` [Mesa-dev] " Daniel Vetter
@ 2021-04-26 15:22                         ` Jason Ekstrand
  0 siblings, 0 replies; 27+ messages in thread
From: Jason Ekstrand @ 2021-04-26 15:22 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Tvrtko Ursulin, Intel GFX, Mailing list - DRI developers,
	Kenneth Graunke, Matthew Auld, ML mesa-dev, Daniel Vetter

On Wed, Apr 21, 2021 at 2:23 PM Daniel Vetter <daniel@ffwll.ch> wrote:
>
> On Wed, Apr 21, 2021 at 8:28 PM Tvrtko Ursulin
> <tvrtko.ursulin@linux.intel.com> wrote:
> > On 21/04/2021 18:17, Jason Ekstrand wrote:
> > > On Wed, Apr 21, 2021 at 9:25 AM Tvrtko Ursulin
> > > <tvrtko.ursulin@linux.intel.com> wrote:
> > >> On 21/04/2021 14:54, Jason Ekstrand wrote:
> > >>> On Wed, Apr 21, 2021 at 3:22 AM Tvrtko Ursulin
> > >>> <tvrtko.ursulin@linux.intel.com> wrote:
> > >>>> On 20/04/2021 18:00, Jason Ekstrand wrote:
> > >>>> I am not claiming to know memory region query will end up the same, and
> > >>>> I definitely agree we cannot guess the future. I am just saying rsvd
> > >>>> fields are inconsequential really in terms of maintenance burden and
> > >>>> have been proven useful in the past. So I disagree with the drive to
> > >>>> kick them all out.
> > >>>
> > >>> Sure, it doesn't cost anything to have extra zeros in the struct.
> > >>> However, if/when the API grows using rsvd fields, we end up with "if
> > >>> CAP_FOO is set, rsvd[5] means blah" which makes for a horribly
> > >>> confusing API.  As a userspace person who has to remember how to use
> > >>> this stuff, I'd rather make another call or chain in a struct than try
> > >>> to remember and/or figure out what all 8 rsvd fields mean.
> > >>
> > >> Well, it's not called rsvd in the version of the uapi which is aware of
> > >> the new field; there it has a new name.
> > >
> > > Are we allowed to do that?  This is a genuine question.  When I've
> > > tried in the past (cliprects), I was told we couldn't rename it even
> > > though literally no one had used it in code for years.
> >
> > Well, we did the union for pad_to_size, so I thought we are allowed that
> > trick at least. From my experience, backward source-level compatibility
> > is not always there even with things like glibc. Despite that, whether
> > we are generally required to stay backward source compatible I will not
> > claim either way.

I'm starting to lose the will to care about this particular bike shed.
I think I'm fine with keeping some RSVD fields as long as:

 1. We're very clear in the docs with flags and caps about what things
are inputs and what things are outputs and how they interact.
Preferably, everything is an output.
 2. If we already know that caps is useless without supported_caps,
let's either add supported_caps in now or throw out both and use some
rsvd for them when they begin to be needed.
 3. We have a plan for how we're going to use them in a
backwards-compatible way.

On 3, it sounds like we have a rough sketch of a plan but I'm still
unclear on some details.  In particular, we have an rsvd[8] at the end
but, if we're going to replace individual bits of it, we can't ever
shorten or re-name that array.  We could, potentially, do

union {
    __u32 rsvd[8];
    struct {
        __u32 new_field;
    };
};

and trust C to put all the fields of our anonymous struct at the top.
Otherwise, we've got to fill out our struct with more rsvd and that
gets to be a mess.

Another option would be to have

__u32 rsvd1;
__u32 rsvd2;
__u32 rsvd3;
/* etc... */

and we can replace them one at a time.

Again, my big point is that I want us to have a PLAN and not end up in
a scenario where rsvd[5] has some magical meaning.  What I see in DII
does not give me confidence.  However, I do believe that such a plan
can exist.
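
For instance (names invented, just to show the shape of such a plan),
the rule could be: every repurposed slot gets a real name via the
union trick above, plus a kernel-set flag bit saying it's valid:

#define EXAMPLE_REGION_INFO_HAS_FOO (1u << 0) /* set by the kernel */

union {
    __u32 rsvd[8];
    struct {
        __u32 foo; /* valid only when ..._HAS_FOO is set in flags */
    };
};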

--Jason

> I think the anonymous union with an exactly same-sized field is ok. We
> also try hard to be source compatible, but we have screwed up in the
> past and shrugged it off. The one example that comes to mind is
> structures extended at the bottom with a new field, which the kernel
> automatically zero-extends for old userspace. But when you recompile,
> your new-old userspace might no longer clear the new fields, because
> the ioctl code didn't start out by memset()ing the entire struct.

Also, we need to ensure that we memset everything to 0. :-)

--Jason
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2021-04-26 15:22 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-04-15 15:59 [PATCH v3 1/4] drm/i915/uapi: hide kernel doc warnings Matthew Auld
2021-04-15 15:59 ` [PATCH v3 2/4] drm/i915/uapi: convert i915_user_extension to kernel doc Matthew Auld
2021-04-16  8:46   ` Daniel Vetter
2021-04-15 15:59 ` [PATCH v3 3/4] drm/i915/uapi: convert i915_query and friend " Matthew Auld
2021-04-15 22:25   ` [Mesa-dev] " Ian Romanick
2021-04-16  8:59     ` Daniel Vetter
2021-04-16 19:04       ` Jonathan Corbet
2021-04-16  7:57   ` [Intel-gfx] " Tvrtko Ursulin
2021-04-16  8:49   ` Daniel Vetter
2021-04-15 15:59 ` [PATCH v3 4/4] drm/doc/rfc: i915 DG1 uAPI Matthew Auld
2021-04-16  9:15   ` Daniel Vetter
2021-04-16 16:38   ` Jason Ekstrand
2021-04-16 17:02     ` Daniel Vetter
2021-04-16 17:33       ` Daniele Ceraolo Spurio
2021-04-19 12:02     ` Matthew Auld
2021-04-19 15:19       ` Jason Ekstrand
2021-04-20 16:34         ` Tvrtko Ursulin
2021-04-20 17:00           ` Jason Ekstrand
2021-04-21  8:22             ` Tvrtko Ursulin
2021-04-21 13:54               ` Jason Ekstrand
2021-04-21 14:25                 ` Tvrtko Ursulin
2021-04-21 17:17                   ` Jason Ekstrand
2021-04-21 18:28                     ` Tvrtko Ursulin
2021-04-21 19:23                       ` [Mesa-dev] " Daniel Vetter
2021-04-26 15:22                         ` Jason Ekstrand
2021-04-16  8:44 ` [PATCH v3 1/4] drm/i915/uapi: hide kernel doc warnings Daniel Vetter
2021-04-16  8:54   ` Daniel Vetter

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).