linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH 0/3] new subsystem for compute accelerator devices
@ 2022-10-22 21:46 Oded Gabbay
  2022-10-22 21:46 ` [RFC PATCH 1/3] drivers/accel: add new kconfig and update MAINTAINERS Oded Gabbay
                   ` (5 more replies)
  0 siblings, 6 replies; 35+ messages in thread
From: Oded Gabbay @ 2022-10-22 21:46 UTC (permalink / raw)
  To: David Airlie, Daniel Vetter, Arnd Bergmann, Greg Kroah-Hartman,
	linux-kernel, dri-devel, Jason Gunthorpe, John Hubbard,
	Alex Deucher
  Cc: Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	Yuji Ishikawa, Jiho Chu, Daniel Stone, Tvrtko Ursulin,
	Jeffrey Hugo, Christoph Hellwig, Kevin Hilman, Jagan Teki,
	Jacek Lawrynowicz, Maciej Kwapulinski

In the last couple of months we had a discussion [1] about creating a new
subsystem for compute accelerator devices in the kernel.

After an analysis that was done by DRM maintainers and myself, and following
a BOF session at the Linux Plumbers conference a few weeks ago [2], we
decided to create a new subsystem that will use the DRM subsystem's code and
functionality. i.e. the accel core code will be part of the DRM subsystem.

This will allow us to leverage the extensive DRM code-base and
collaborate with DRM developers that have experience with this type of
devices. In addition, new features that will be added for the accelerator
drivers can be of use to GPU drivers as well (e.g. RAS).

As agreed in the BOF session, the accelerator devices will be exposed to
user-space with a new, dedicated device char files and a dedicated major
number (261), to clearly separate them from graphic cards and the graphic
user-space s/w stack. Furthermore, the drivers will be located in a separate
place in the kernel tree (drivers/accel/).

This series of patches is the first step in this direction as it adds the
necessary infrastructure for accelerator devices to DRM. The new devices will
be exposed with the following convention:

device char files - /dev/accel/accel*
sysfs             - /sys/class/accel/accel*/
debugfs           - /sys/kernel/debug/accel/accel*/

I tried to reuse the existing DRM code as much as possible, while keeping it
readable and maintainable.

One thing that is missing from this series is defining a namespace for the
new accel subsystem, while I'll add in the next iteration of this patch-set,
after I will receive feedback from the community.

As for drivers, once this series will be accepted (after adding the namespace),
I will start working on migrating the habanalabs driver to the new accel
subsystem. I have talked about it with Dave and we agreed that it will be
a good start to simply move the driver as-is with minimal changes, and then
start working on the driver's individual features that will be either added
to the accel core code (with or without changes), or will be removed and
instead the driver will use existing DRM code.

In addition, I know of at least 3 or 4 drivers that were submitted for review
and are good candidates to be included in this new subsystem, instead of being
a drm render node driver or a misc driver.

[1] https://lkml.org/lkml/2022/7/31/83
[2] https://airlied.blogspot.com/2022/09/accelerators-bof-outcomes-summary.html

Thanks,
Oded

Oded Gabbay (3):
  drivers/accel: add new kconfig and update MAINTAINERS
  drm: define new accel major and register it
  drm: add dedicated minor for accelerator devices

 Documentation/admin-guide/devices.txt |   5 +
 MAINTAINERS                           |   8 +
 drivers/Kconfig                       |   2 +
 drivers/accel/Kconfig                 |  24 +++
 drivers/gpu/drm/drm_drv.c             | 214 +++++++++++++++++++++-----
 drivers/gpu/drm/drm_file.c            |  69 ++++++---
 drivers/gpu/drm/drm_internal.h        |   5 +-
 drivers/gpu/drm/drm_sysfs.c           |  81 +++++++++-
 include/drm/drm_device.h              |   3 +
 include/drm/drm_drv.h                 |   8 +
 include/drm/drm_file.h                |  21 ++-
 include/drm/drm_ioctl.h               |   1 +
 12 files changed, 374 insertions(+), 67 deletions(-)
 create mode 100644 drivers/accel/Kconfig

--
2.34.1


^ permalink raw reply	[flat|nested] 35+ messages in thread

* [RFC PATCH 1/3] drivers/accel: add new kconfig and update MAINTAINERS
  2022-10-22 21:46 [RFC PATCH 0/3] new subsystem for compute accelerator devices Oded Gabbay
@ 2022-10-22 21:46 ` Oded Gabbay
  2022-10-23 12:40   ` Greg Kroah-Hartman
  2022-10-24 15:01   ` Jeffrey Hugo
  2022-10-22 21:46 ` [RFC PATCH 2/3] drm: define new accel major and register it Oded Gabbay
                   ` (4 subsequent siblings)
  5 siblings, 2 replies; 35+ messages in thread
From: Oded Gabbay @ 2022-10-22 21:46 UTC (permalink / raw)
  To: David Airlie, Daniel Vetter, Arnd Bergmann, Greg Kroah-Hartman,
	linux-kernel, dri-devel, Jason Gunthorpe, John Hubbard,
	Alex Deucher
  Cc: Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	Yuji Ishikawa, Jiho Chu, Daniel Stone, Tvrtko Ursulin,
	Jeffrey Hugo, Christoph Hellwig, Kevin Hilman, Jagan Teki,
	Jacek Lawrynowicz, Maciej Kwapulinski

Add a new Kconfig for the accel subsystem. The Kconfig currently
contains only the basic CONFIG_ACCEL option that will be used to
decide whether to compile the accel registration code as part of the
drm core functionality.

I have taken the liberty to appropriate the dri-devel mailing list and
the dri-devel IRC channel for the accel subsystem.

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
---
 MAINTAINERS           |  8 ++++++++
 drivers/Kconfig       |  2 ++
 drivers/accel/Kconfig | 24 ++++++++++++++++++++++++
 3 files changed, 34 insertions(+)
 create mode 100644 drivers/accel/Kconfig

diff --git a/MAINTAINERS b/MAINTAINERS
index cf0f18502372..790d472801d5 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -6820,6 +6820,14 @@ F:	include/drm/drm*
 F:	include/linux/vga*
 F:	include/uapi/drm/drm*
 
+DRM COMPUTE ACCELERATORS DRIVERS AND FRAMEWORK
+M:	Oded Gabbay <ogabbay@kernel.org>
+L:	dri-devel@lists.freedesktop.org
+S:	Maintained
+C:	irc://irc.oftc.net/dri-devel
+T:	git https://git.kernel.org/pub/scm/linux/kernel/git/ogabbay/accel.git
+F:	drivers/accel/
+
 DRM DRIVERS FOR ALLWINNER A10
 M:	Maxime Ripard <mripard@kernel.org>
 M:	Chen-Yu Tsai <wens@csie.org>
diff --git a/drivers/Kconfig b/drivers/Kconfig
index 19ee995bd0ae..968bd0a6fd78 100644
--- a/drivers/Kconfig
+++ b/drivers/Kconfig
@@ -99,6 +99,8 @@ source "drivers/media/Kconfig"
 
 source "drivers/video/Kconfig"
 
+source "drivers/accel/Kconfig"
+
 source "sound/Kconfig"
 
 source "drivers/hid/Kconfig"
diff --git a/drivers/accel/Kconfig b/drivers/accel/Kconfig
new file mode 100644
index 000000000000..282ea24f90c5
--- /dev/null
+++ b/drivers/accel/Kconfig
@@ -0,0 +1,24 @@
+# SPDX-License-Identifier: GPL-2.0-only
+#
+# Compute Acceleration device configuration
+#
+# This framework provides support for compute acceleration devices, such
+# as, but not limited to, Machine-Learning and Deep-Learning acceleration
+# devices
+#
+menuconfig ACCEL
+	tristate "Compute Acceleration Framework"
+	depends on DRM
+	help
+	  Framework for device drivers of compute acceleration devices, such
+	  as, but not limited to, Machine-Learning and Deep-Learning
+	  acceleration devices.
+	  If you say Y here, you need to select the module that's right for
+	  your acceleration device from the list below.
+	  This framework is integrated with the DRM subsystem as compute
+	  accelerators and GPUs share a lot in common and can use almost the
+	  same infrastructure code.
+	  Having said that, acceleration devices will have a different
+	  major number than GPUs, and will be exposed to user-space using
+	  different device files, called accel/accel* (in /dev, sysfs
+	  and debugfs)
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [RFC PATCH 2/3] drm: define new accel major and register it
  2022-10-22 21:46 [RFC PATCH 0/3] new subsystem for compute accelerator devices Oded Gabbay
  2022-10-22 21:46 ` [RFC PATCH 1/3] drivers/accel: add new kconfig and update MAINTAINERS Oded Gabbay
@ 2022-10-22 21:46 ` Oded Gabbay
  2022-10-23 12:40   ` Greg Kroah-Hartman
  2022-10-22 21:46 ` [RFC PATCH 3/3] drm: add dedicated minor for accelerator devices Oded Gabbay
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 35+ messages in thread
From: Oded Gabbay @ 2022-10-22 21:46 UTC (permalink / raw)
  To: David Airlie, Daniel Vetter, Arnd Bergmann, Greg Kroah-Hartman,
	linux-kernel, dri-devel, Jason Gunthorpe, John Hubbard,
	Alex Deucher
  Cc: Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	Yuji Ishikawa, Jiho Chu, Daniel Stone, Tvrtko Ursulin,
	Jeffrey Hugo, Christoph Hellwig, Kevin Hilman, Jagan Teki,
	Jacek Lawrynowicz, Maciej Kwapulinski

The accelerator devices will be exposed to the user space with a new,
dedicated major number - 261.

The drm core registers the new major number as a char device and create
corresponding sysfs and debugfs root entries, same as for the drm major.

In case CONFIG_ACCEL is not selected, this code is not compiled in.

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
---
 Documentation/admin-guide/devices.txt |  5 +++
 drivers/gpu/drm/drm_drv.c             | 45 +++++++++++++++++++++++
 drivers/gpu/drm/drm_internal.h        |  3 ++
 drivers/gpu/drm/drm_sysfs.c           | 52 +++++++++++++++++++++++++++
 include/drm/drm_ioctl.h               |  1 +
 5 files changed, 106 insertions(+)

diff --git a/Documentation/admin-guide/devices.txt b/Documentation/admin-guide/devices.txt
index 9764d6edb189..06c525e01ea5 100644
--- a/Documentation/admin-guide/devices.txt
+++ b/Documentation/admin-guide/devices.txt
@@ -3080,6 +3080,11 @@
 		  ...
 		  255 = /dev/osd255	256th OSD Device
 
+ 261 char	Compute Acceleration Devices
+		  0 = /dev/accel/accel0	First acceleration device
+		  1 = /dev/accel/accel1	Second acceleration device
+		    ...
+
  384-511 char	RESERVED FOR DYNAMIC ASSIGNMENT
 		Character devices that request a dynamic allocation of major
 		number will take numbers starting from 511 and downward,
diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
index 8214a0b1ab7f..b58ffb1433d6 100644
--- a/drivers/gpu/drm/drm_drv.c
+++ b/drivers/gpu/drm/drm_drv.c
@@ -67,6 +67,10 @@ static bool drm_core_init_complete;
 
 static struct dentry *drm_debugfs_root;
 
+#ifdef CONFIG_ACCEL
+static struct dentry *accel_debugfs_root;
+#endif
+
 DEFINE_STATIC_SRCU(drm_unplug_srcu);
 
 /*
@@ -1031,9 +1035,19 @@ static const struct file_operations drm_stub_fops = {
 	.llseek = noop_llseek,
 };
 
+static void accel_core_exit(void)
+{
+#ifdef CONFIG_ACCEL
+	unregister_chrdev(ACCEL_MAJOR, "accel");
+	debugfs_remove(accel_debugfs_root);
+	accel_sysfs_destroy();
+#endif
+}
+
 static void drm_core_exit(void)
 {
 	drm_privacy_screen_lookup_exit();
+	accel_core_exit();
 	unregister_chrdev(DRM_MAJOR, "drm");
 	debugfs_remove(drm_debugfs_root);
 	drm_sysfs_destroy();
@@ -1041,6 +1055,33 @@ static void drm_core_exit(void)
 	drm_connector_ida_destroy();
 }
 
+static int __init accel_core_init(void)
+{
+#ifdef CONFIG_ACCEL
+	int ret;
+
+	ret = accel_sysfs_init();
+	if (ret < 0) {
+		DRM_ERROR("Cannot create ACCEL class: %d\n", ret);
+		goto error;
+	}
+
+	accel_debugfs_root = debugfs_create_dir("accel", NULL);
+
+	ret = register_chrdev(ACCEL_MAJOR, "accel", &drm_stub_fops);
+	if (ret < 0)
+		goto error;
+
+error:
+	/* Any cleanup will be done in drm_core_exit() that will call
+	 * to accel_core_exit()
+	 */
+	return ret;
+#else
+	return 0;
+#endif
+}
+
 static int __init drm_core_init(void)
 {
 	int ret;
@@ -1061,6 +1102,10 @@ static int __init drm_core_init(void)
 	if (ret < 0)
 		goto error;
 
+	ret = accel_core_init();
+	if (ret < 0)
+		goto error;
+
 	drm_privacy_screen_lookup_init();
 
 	drm_core_init_complete = true;
diff --git a/drivers/gpu/drm/drm_internal.h b/drivers/gpu/drm/drm_internal.h
index 7bb98e6a446d..cbeb9bd3c312 100644
--- a/drivers/gpu/drm/drm_internal.h
+++ b/drivers/gpu/drm/drm_internal.h
@@ -147,9 +147,12 @@ void drm_master_internal_release(struct drm_device *dev);
 
 /* drm_sysfs.c */
 extern struct class *drm_class;
+extern struct class *accel_class;
 
 int drm_sysfs_init(void);
 void drm_sysfs_destroy(void);
+int accel_sysfs_init(void);
+void accel_sysfs_destroy(void);
 struct device *drm_sysfs_minor_alloc(struct drm_minor *minor);
 int drm_sysfs_connector_add(struct drm_connector *connector);
 void drm_sysfs_connector_remove(struct drm_connector *connector);
diff --git a/drivers/gpu/drm/drm_sysfs.c b/drivers/gpu/drm/drm_sysfs.c
index 430e00b16eec..70b2a28f55c4 100644
--- a/drivers/gpu/drm/drm_sysfs.c
+++ b/drivers/gpu/drm/drm_sysfs.c
@@ -56,6 +56,7 @@ static struct device_type drm_sysfs_device_connector = {
 };
 
 struct class *drm_class;
+struct class *accel_class;
 
 #ifdef CONFIG_ACPI
 static bool drm_connector_acpi_bus_match(struct device *dev)
@@ -148,6 +149,57 @@ static void drm_sysfs_release(struct device *dev)
 	kfree(dev);
 }
 
+static char *accel_devnode(struct device *dev, umode_t *mode)
+{
+	return kasprintf(GFP_KERNEL, "accel/%s", dev_name(dev));
+}
+
+static CLASS_ATTR_STRING(accel_version, 0444, "accel 1.0.0 20221018");
+
+/**
+ * accel_sysfs_init - initialize sysfs helpers
+ *
+ * This is used to create the ACCEL class, which is the implicit parent of any
+ * other top-level ACCEL sysfs objects.
+ *
+ * You must call accel_sysfs_destroy() to release the allocated resources.
+ *
+ * Return: 0 on success, negative error code on failure.
+ */
+int accel_sysfs_init(void)
+{
+	int err;
+
+	accel_class = class_create(THIS_MODULE, "accel");
+	if (IS_ERR(accel_class))
+		return PTR_ERR(accel_class);
+
+	err = class_create_file(accel_class, &class_attr_accel_version.attr);
+	if (err) {
+		class_destroy(accel_class);
+		accel_class = NULL;
+		return err;
+	}
+
+	accel_class->devnode = accel_devnode;
+
+	return 0;
+}
+
+/**
+ * accel_sysfs_destroy - destroys ACCEL class
+ *
+ * Destroy the ACCEL device class.
+ */
+void accel_sysfs_destroy(void)
+{
+	if (IS_ERR_OR_NULL(accel_class))
+		return;
+	class_remove_file(accel_class, &class_attr_accel_version.attr);
+	class_destroy(accel_class);
+	accel_class = NULL;
+}
+
 /*
  * Connector properties
  */
diff --git a/include/drm/drm_ioctl.h b/include/drm/drm_ioctl.h
index 6ed61c371f6c..88e4926208e7 100644
--- a/include/drm/drm_ioctl.h
+++ b/include/drm/drm_ioctl.h
@@ -70,6 +70,7 @@ typedef int drm_ioctl_compat_t(struct file *filp, unsigned int cmd,
 #define DRM_IOCTL_NR(n)                _IOC_NR(n)
 #define DRM_IOCTL_TYPE(n)              _IOC_TYPE(n)
 #define DRM_MAJOR       226
+#define ACCEL_MAJOR     261
 
 /**
  * enum drm_ioctl_flags - DRM ioctl flags
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [RFC PATCH 3/3] drm: add dedicated minor for accelerator devices
  2022-10-22 21:46 [RFC PATCH 0/3] new subsystem for compute accelerator devices Oded Gabbay
  2022-10-22 21:46 ` [RFC PATCH 1/3] drivers/accel: add new kconfig and update MAINTAINERS Oded Gabbay
  2022-10-22 21:46 ` [RFC PATCH 2/3] drm: define new accel major and register it Oded Gabbay
@ 2022-10-22 21:46 ` Oded Gabbay
  2022-10-23 12:41   ` Greg Kroah-Hartman
                     ` (2 more replies)
  2022-10-23 14:02 ` [RFC PATCH 0/3] new subsystem for compute " Bagas Sanjaya
                   ` (2 subsequent siblings)
  5 siblings, 3 replies; 35+ messages in thread
From: Oded Gabbay @ 2022-10-22 21:46 UTC (permalink / raw)
  To: David Airlie, Daniel Vetter, Arnd Bergmann, Greg Kroah-Hartman,
	linux-kernel, dri-devel, Jason Gunthorpe, John Hubbard,
	Alex Deucher
  Cc: Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	Yuji Ishikawa, Jiho Chu, Daniel Stone, Tvrtko Ursulin,
	Jeffrey Hugo, Christoph Hellwig, Kevin Hilman, Jagan Teki,
	Jacek Lawrynowicz, Maciej Kwapulinski

The accelerator devices are exposed to user-space using a dedicated
major. In addition, they are represented in /dev with new, dedicated
device char names: /dev/accel/accel*. This is done to make sure any
user-space software that tries to open a graphic card won't open
the accelerator device by mistake.

The above implies that the minor numbering should be separated from
the rest of the drm devices. However, to avoid code duplication, we
want the drm_minor structure to be able to represent the accelerator
device.

To achieve this, we add a new drm_minor* to drm_device that represents
the accelerator device. This pointer is initialized for drivers that
declare they handle compute accelerator, using a new driver feature
flag called DRIVER_COMPUTE_ACCEL. It is important to note that this
driver feature is mutually exclusive with DRIVER_RENDER. Devices that
want to expose both graphics and compute device char files should be
handled by two drivers that are connected using the auxiliary bus
framework.

In addition, we define a different idr to handle the accelerators
minors. This is done to make the minor's index be identical to the
device index in /dev/. In most places, this is hidden inside the drm
core functions except when calling drm_minor_acquire(), where I had to
add an extra parameter to specify the idr to use (because the
accelerators minors index and the drm primary minor index both begin
at 0).

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
---
 drivers/gpu/drm/drm_drv.c      | 171 +++++++++++++++++++++++++--------
 drivers/gpu/drm/drm_file.c     |  69 +++++++++----
 drivers/gpu/drm/drm_internal.h |   2 +-
 drivers/gpu/drm/drm_sysfs.c    |  29 ++++--
 include/drm/drm_device.h       |   3 +
 include/drm/drm_drv.h          |   8 ++
 include/drm/drm_file.h         |  21 +++-
 7 files changed, 235 insertions(+), 68 deletions(-)

diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
index b58ffb1433d6..c13701a8d4be 100644
--- a/drivers/gpu/drm/drm_drv.c
+++ b/drivers/gpu/drm/drm_drv.c
@@ -56,6 +56,9 @@ MODULE_LICENSE("GPL and additional rights");
 static DEFINE_SPINLOCK(drm_minor_lock);
 static struct idr drm_minors_idr;
 
+static DEFINE_SPINLOCK(accel_minor_lock);
+static struct idr accel_minors_idr;
+
 /*
  * If the drm core fails to init for whatever reason,
  * we should prevent any drivers from registering with it.
@@ -94,6 +97,8 @@ static struct drm_minor **drm_minor_get_slot(struct drm_device *dev,
 		return &dev->primary;
 	case DRM_MINOR_RENDER:
 		return &dev->render;
+	case DRM_MINOR_ACCEL:
+		return &dev->accel;
 	default:
 		BUG();
 	}
@@ -108,9 +113,15 @@ static void drm_minor_alloc_release(struct drm_device *dev, void *data)
 
 	put_device(minor->kdev);
 
-	spin_lock_irqsave(&drm_minor_lock, flags);
-	idr_remove(&drm_minors_idr, minor->index);
-	spin_unlock_irqrestore(&drm_minor_lock, flags);
+	if (minor->type == DRM_MINOR_ACCEL) {
+		spin_lock_irqsave(&accel_minor_lock, flags);
+		idr_remove(&accel_minors_idr, minor->index);
+		spin_unlock_irqrestore(&accel_minor_lock, flags);
+	} else {
+		spin_lock_irqsave(&drm_minor_lock, flags);
+		idr_remove(&drm_minors_idr, minor->index);
+		spin_unlock_irqrestore(&drm_minor_lock, flags);
+	}
 }
 
 static int drm_minor_alloc(struct drm_device *dev, unsigned int type)
@@ -127,13 +138,23 @@ static int drm_minor_alloc(struct drm_device *dev, unsigned int type)
 	minor->dev = dev;
 
 	idr_preload(GFP_KERNEL);
-	spin_lock_irqsave(&drm_minor_lock, flags);
-	r = idr_alloc(&drm_minors_idr,
-		      NULL,
-		      64 * type,
-		      64 * (type + 1),
-		      GFP_NOWAIT);
-	spin_unlock_irqrestore(&drm_minor_lock, flags);
+	if (type == DRM_MINOR_ACCEL) {
+		spin_lock_irqsave(&accel_minor_lock, flags);
+		r = idr_alloc(&accel_minors_idr,
+			NULL,
+			64 * (type - DRM_MINOR_ACCEL),
+			64 * (type - DRM_MINOR_ACCEL + 1),
+			GFP_NOWAIT);
+		spin_unlock_irqrestore(&accel_minor_lock, flags);
+	} else {
+		spin_lock_irqsave(&drm_minor_lock, flags);
+		r = idr_alloc(&drm_minors_idr,
+			NULL,
+			64 * type,
+			64 * (type + 1),
+			GFP_NOWAIT);
+		spin_unlock_irqrestore(&drm_minor_lock, flags);
+	}
 	idr_preload_end();
 
 	if (r < 0)
@@ -167,7 +188,11 @@ static int drm_minor_register(struct drm_device *dev, unsigned int type)
 
 	ret = drm_debugfs_init(minor, minor->index, drm_debugfs_root);
 	if (ret) {
-		DRM_ERROR("DRM: Failed to initialize /sys/kernel/debug/dri.\n");
+		if (minor->type == DRM_MINOR_ACCEL)
+			DRM_ERROR("DRM: Failed to initialize /sys/kernel/debug/accel.\n");
+		else
+			DRM_ERROR("DRM: Failed to initialize /sys/kernel/debug/dri.\n");
+
 		goto err_debugfs;
 	}
 
@@ -176,9 +201,15 @@ static int drm_minor_register(struct drm_device *dev, unsigned int type)
 		goto err_debugfs;
 
 	/* replace NULL with @minor so lookups will succeed from now on */
-	spin_lock_irqsave(&drm_minor_lock, flags);
-	idr_replace(&drm_minors_idr, minor, minor->index);
-	spin_unlock_irqrestore(&drm_minor_lock, flags);
+	if (minor->type == DRM_MINOR_ACCEL) {
+		spin_lock_irqsave(&accel_minor_lock, flags);
+		idr_replace(&accel_minors_idr, minor, minor->index);
+		spin_unlock_irqrestore(&accel_minor_lock, flags);
+	} else {
+		spin_lock_irqsave(&drm_minor_lock, flags);
+		idr_replace(&drm_minors_idr, minor, minor->index);
+		spin_unlock_irqrestore(&drm_minor_lock, flags);
+	}
 
 	DRM_DEBUG("new minor registered %d\n", minor->index);
 	return 0;
@@ -198,9 +229,15 @@ static void drm_minor_unregister(struct drm_device *dev, unsigned int type)
 		return;
 
 	/* replace @minor with NULL so lookups will fail from now on */
-	spin_lock_irqsave(&drm_minor_lock, flags);
-	idr_replace(&drm_minors_idr, NULL, minor->index);
-	spin_unlock_irqrestore(&drm_minor_lock, flags);
+	if (minor->type == DRM_MINOR_ACCEL) {
+		spin_lock_irqsave(&accel_minor_lock, flags);
+		idr_replace(&accel_minors_idr, NULL, minor->index);
+		spin_unlock_irqrestore(&accel_minor_lock, flags);
+	} else {
+		spin_lock_irqsave(&drm_minor_lock, flags);
+		idr_replace(&drm_minors_idr, NULL, minor->index);
+		spin_unlock_irqrestore(&drm_minor_lock, flags);
+	}
 
 	device_del(minor->kdev);
 	dev_set_drvdata(minor->kdev, NULL); /* safety belt */
@@ -216,16 +253,24 @@ static void drm_minor_unregister(struct drm_device *dev, unsigned int type)
  * minor->dev pointer will stay valid! However, the device may get unplugged and
  * unregistered while you hold the minor.
  */
-struct drm_minor *drm_minor_acquire(unsigned int minor_id)
+struct drm_minor *drm_minor_acquire(unsigned int minor_id, bool is_accel_minor)
 {
 	struct drm_minor *minor;
 	unsigned long flags;
 
-	spin_lock_irqsave(&drm_minor_lock, flags);
-	minor = idr_find(&drm_minors_idr, minor_id);
-	if (minor)
-		drm_dev_get(minor->dev);
-	spin_unlock_irqrestore(&drm_minor_lock, flags);
+	if (is_accel_minor) {
+		spin_lock_irqsave(&accel_minor_lock, flags);
+		minor = idr_find(&accel_minors_idr, minor_id);
+		if (minor)
+			drm_dev_get(minor->dev);
+		spin_unlock_irqrestore(&accel_minor_lock, flags);
+	} else {
+		spin_lock_irqsave(&drm_minor_lock, flags);
+		minor = idr_find(&drm_minors_idr, minor_id);
+		if (minor)
+			drm_dev_get(minor->dev);
+		spin_unlock_irqrestore(&drm_minor_lock, flags);
+	}
 
 	if (!minor) {
 		return ERR_PTR(-ENODEV);
@@ -607,6 +652,14 @@ static int drm_dev_init(struct drm_device *dev,
 	/* no per-device feature limits by default */
 	dev->driver_features = ~0u;
 
+	if (drm_core_check_feature(dev, DRIVER_COMPUTE_ACCEL) &&
+				(drm_core_check_feature(dev, DRIVER_RENDER) ||
+				drm_core_check_feature(dev, DRIVER_MODESET))) {
+
+		DRM_ERROR("DRM driver can't be both a compute acceleration and graphics driver\n");
+		return -EINVAL;
+	}
+
 	drm_legacy_init_members(dev);
 	INIT_LIST_HEAD(&dev->filelist);
 	INIT_LIST_HEAD(&dev->filelist_internal);
@@ -632,15 +685,21 @@ static int drm_dev_init(struct drm_device *dev,
 
 	dev->anon_inode = inode;
 
-	if (drm_core_check_feature(dev, DRIVER_RENDER)) {
-		ret = drm_minor_alloc(dev, DRM_MINOR_RENDER);
+	if (drm_core_check_feature(dev, DRIVER_COMPUTE_ACCEL)) {
+		ret = drm_minor_alloc(dev, DRM_MINOR_ACCEL);
 		if (ret)
 			goto err;
-	}
+	} else {
+		if (drm_core_check_feature(dev, DRIVER_RENDER)) {
+			ret = drm_minor_alloc(dev, DRM_MINOR_RENDER);
+			if (ret)
+				goto err;
+		}
 
-	ret = drm_minor_alloc(dev, DRM_MINOR_PRIMARY);
-	if (ret)
-		goto err;
+		ret = drm_minor_alloc(dev, DRM_MINOR_PRIMARY);
+		if (ret)
+			goto err;
+	}
 
 	ret = drm_legacy_create_map_hash(dev);
 	if (ret)
@@ -887,6 +946,10 @@ int drm_dev_register(struct drm_device *dev, unsigned long flags)
 	if (ret)
 		goto err_minors;
 
+	ret = drm_minor_register(dev, DRM_MINOR_ACCEL);
+	if (ret)
+		goto err_minors;
+
 	ret = create_compat_control_link(dev);
 	if (ret)
 		goto err_minors;
@@ -906,12 +969,13 @@ int drm_dev_register(struct drm_device *dev, unsigned long flags)
 		 driver->name, driver->major, driver->minor,
 		 driver->patchlevel, driver->date,
 		 dev->dev ? dev_name(dev->dev) : "virtual device",
-		 dev->primary->index);
+		 dev->primary ? dev->primary->index : dev->accel->index);
 
 	goto out_unlock;
 
 err_minors:
 	remove_compat_control_link(dev);
+	drm_minor_unregister(dev, DRM_MINOR_ACCEL);
 	drm_minor_unregister(dev, DRM_MINOR_PRIMARY);
 	drm_minor_unregister(dev, DRM_MINOR_RENDER);
 out_unlock:
@@ -954,6 +1018,7 @@ void drm_dev_unregister(struct drm_device *dev)
 	drm_legacy_rmmaps(dev);
 
 	remove_compat_control_link(dev);
+	drm_minor_unregister(dev, DRM_MINOR_ACCEL);
 	drm_minor_unregister(dev, DRM_MINOR_PRIMARY);
 	drm_minor_unregister(dev, DRM_MINOR_RENDER);
 }
@@ -999,18 +1064,11 @@ EXPORT_SYMBOL(drm_dev_set_unique);
  * registered minor.
  */
 
-static int drm_stub_open(struct inode *inode, struct file *filp)
+static int stub_open(struct inode *inode, struct file *filp, struct drm_minor *minor)
 {
 	const struct file_operations *new_fops;
-	struct drm_minor *minor;
 	int err;
 
-	DRM_DEBUG("\n");
-
-	minor = drm_minor_acquire(iminor(inode));
-	if (IS_ERR(minor))
-		return PTR_ERR(minor);
-
 	new_fops = fops_get(minor->dev->driver->fops);
 	if (!new_fops) {
 		err = -ENODEV;
@@ -1029,18 +1087,51 @@ static int drm_stub_open(struct inode *inode, struct file *filp)
 	return err;
 }
 
+static int drm_stub_open(struct inode *inode, struct file *filp)
+{
+	struct drm_minor *minor;
+
+	DRM_DEBUG("\n");
+
+	minor = drm_minor_acquire(iminor(inode), false);
+	if (IS_ERR(minor))
+		return PTR_ERR(minor);
+
+	return stub_open(inode, filp, minor);
+}
+
+static int accel_stub_open(struct inode *inode, struct file *filp)
+{
+	struct drm_minor *minor;
+
+	DRM_DEBUG("\n");
+
+	minor = drm_minor_acquire(iminor(inode), true);
+	if (IS_ERR(minor))
+		return PTR_ERR(minor);
+
+	return stub_open(inode, filp, minor);
+}
+
 static const struct file_operations drm_stub_fops = {
 	.owner = THIS_MODULE,
 	.open = drm_stub_open,
 	.llseek = noop_llseek,
 };
 
+static const struct file_operations accel_stub_fops = {
+	.owner = THIS_MODULE,
+	.open = accel_stub_open,
+	.llseek = noop_llseek,
+};
+
 static void accel_core_exit(void)
 {
 #ifdef CONFIG_ACCEL
 	unregister_chrdev(ACCEL_MAJOR, "accel");
 	debugfs_remove(accel_debugfs_root);
 	accel_sysfs_destroy();
+	idr_destroy(&accel_minors_idr);
 #endif
 }
 
@@ -1060,6 +1151,8 @@ static int __init accel_core_init(void)
 #ifdef CONFIG_ACCEL
 	int ret;
 
+	idr_init(&accel_minors_idr);
+
 	ret = accel_sysfs_init();
 	if (ret < 0) {
 		DRM_ERROR("Cannot create ACCEL class: %d\n", ret);
@@ -1068,7 +1161,7 @@ static int __init accel_core_init(void)
 
 	accel_debugfs_root = debugfs_create_dir("accel", NULL);
 
-	ret = register_chrdev(ACCEL_MAJOR, "accel", &drm_stub_fops);
+	ret = register_chrdev(ACCEL_MAJOR, "accel", &accel_stub_fops);
 	if (ret < 0)
 		goto error;
 
diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c
index a8b4d918e9a3..d9c60108ab7b 100644
--- a/drivers/gpu/drm/drm_file.c
+++ b/drivers/gpu/drm/drm_file.c
@@ -389,30 +389,12 @@ static int drm_open_helper(struct file *filp, struct drm_minor *minor)
 	return 0;
 }
 
-/**
- * drm_open - open method for DRM file
- * @inode: device inode
- * @filp: file pointer.
- *
- * This function must be used by drivers as their &file_operations.open method.
- * It looks up the correct DRM device and instantiates all the per-file
- * resources for it. It also calls the &drm_driver.open driver callback.
- *
- * RETURNS:
- *
- * 0 on success or negative errno value on failure.
- */
-int drm_open(struct inode *inode, struct file *filp)
+static int __drm_open(struct inode *inode, struct file *filp, struct drm_minor *minor)
 {
 	struct drm_device *dev;
-	struct drm_minor *minor;
 	int retcode;
 	int need_setup = 0;
 
-	minor = drm_minor_acquire(iminor(inode));
-	if (IS_ERR(minor))
-		return PTR_ERR(minor);
-
 	dev = minor->dev;
 	if (drm_dev_needs_global_mutex(dev))
 		mutex_lock(&drm_global_mutex);
@@ -446,8 +428,57 @@ int drm_open(struct inode *inode, struct file *filp)
 	drm_minor_release(minor);
 	return retcode;
 }
+
+/**
+ * drm_open - open method for DRM file
+ * @inode: device inode
+ * @filp: file pointer.
+ *
+ * This function must be used by drivers as their &file_operations.open method.
+ * It looks up the correct DRM device and instantiates all the per-file
+ * resources for it. It also calls the &drm_driver.open driver callback.
+ *
+ * RETURNS:
+ *
+ * 0 on success or negative errno value on failure.
+ */
+int drm_open(struct inode *inode, struct file *filp)
+{
+	struct drm_minor *minor;
+
+	minor = drm_minor_acquire(iminor(inode), false);
+	if (IS_ERR(minor))
+		return PTR_ERR(minor);
+
+	return __drm_open(inode, filp, minor);
+}
 EXPORT_SYMBOL(drm_open);
 
+/**
+ * accel_open - open method for ACCEL file
+ * @inode: device inode
+ * @filp: file pointer.
+ *
+ * This function must be used by drivers as their &file_operations.open method.
+ * It looks up the correct ACCEL device and instantiates all the per-file
+ * resources for it. It also calls the &drm_driver.open driver callback.
+ *
+ * RETURNS:
+ *
+ * 0 on success or negative errno value on failure.
+ */
+int accel_open(struct inode *inode, struct file *filp)
+{
+	struct drm_minor *minor;
+
+	minor = drm_minor_acquire(iminor(inode), true);
+	if (IS_ERR(minor))
+		return PTR_ERR(minor);
+
+	return __drm_open(inode, filp, minor);
+}
+EXPORT_SYMBOL(accel_open);
+
 void drm_lastclose(struct drm_device * dev)
 {
 	DRM_DEBUG("\n");
diff --git a/drivers/gpu/drm/drm_internal.h b/drivers/gpu/drm/drm_internal.h
index cbeb9bd3c312..e12efc243527 100644
--- a/drivers/gpu/drm/drm_internal.h
+++ b/drivers/gpu/drm/drm_internal.h
@@ -78,7 +78,7 @@ void drm_prime_remove_buf_handle(struct drm_prime_file_private *prime_fpriv,
 				 uint32_t handle);
 
 /* drm_drv.c */
-struct drm_minor *drm_minor_acquire(unsigned int minor_id);
+struct drm_minor *drm_minor_acquire(unsigned int minor_id, bool is_accel_minor);
 void drm_minor_release(struct drm_minor *minor);
 
 /* drm_managed.c */
diff --git a/drivers/gpu/drm/drm_sysfs.c b/drivers/gpu/drm/drm_sysfs.c
index 70b2a28f55c4..1335476ad373 100644
--- a/drivers/gpu/drm/drm_sysfs.c
+++ b/drivers/gpu/drm/drm_sysfs.c
@@ -51,6 +51,10 @@ static struct device_type drm_sysfs_device_minor = {
 	.name = "drm_minor"
 };
 
+static struct device_type accel_sysfs_device_minor = {
+	.name = "accel_minor"
+};
+
 static struct device_type drm_sysfs_device_connector = {
 	.name = "drm_connector",
 };
@@ -523,19 +527,28 @@ struct device *drm_sysfs_minor_alloc(struct drm_minor *minor)
 	struct device *kdev;
 	int r;
 
-	if (minor->type == DRM_MINOR_RENDER)
-		minor_str = "renderD%d";
-	else
-		minor_str = "card%d";
-
 	kdev = kzalloc(sizeof(*kdev), GFP_KERNEL);
 	if (!kdev)
 		return ERR_PTR(-ENOMEM);
 
 	device_initialize(kdev);
-	kdev->devt = MKDEV(DRM_MAJOR, minor->index);
-	kdev->class = drm_class;
-	kdev->type = &drm_sysfs_device_minor;
+
+	if (minor->type == DRM_MINOR_ACCEL) {
+		minor_str = "accel%d";
+		kdev->devt = MKDEV(ACCEL_MAJOR, minor->index);
+		kdev->class = accel_class;
+		kdev->type = &accel_sysfs_device_minor;
+	} else {
+		if (minor->type == DRM_MINOR_RENDER)
+			minor_str = "renderD%d";
+		else
+			minor_str = "card%d";
+
+		kdev->devt = MKDEV(DRM_MAJOR, minor->index);
+		kdev->class = drm_class;
+		kdev->type = &drm_sysfs_device_minor;
+	}
+
 	kdev->parent = minor->dev->dev;
 	kdev->release = drm_sysfs_release;
 	dev_set_drvdata(kdev, minor);
diff --git a/include/drm/drm_device.h b/include/drm/drm_device.h
index 9923c7a6885e..933ce2048e20 100644
--- a/include/drm/drm_device.h
+++ b/include/drm/drm_device.h
@@ -93,6 +93,9 @@ struct drm_device {
 	/** @render: Render node */
 	struct drm_minor *render;
 
+	/** @accel: Compute Acceleration node */
+	struct drm_minor *accel;
+
 	/**
 	 * @registered:
 	 *
diff --git a/include/drm/drm_drv.h b/include/drm/drm_drv.h
index f6159acb8856..4f64102ba8d9 100644
--- a/include/drm/drm_drv.h
+++ b/include/drm/drm_drv.h
@@ -94,6 +94,14 @@ enum drm_driver_feature {
 	 * synchronization of command submission.
 	 */
 	DRIVER_SYNCOBJ_TIMELINE         = BIT(6),
+	/**
+	 * @DRIVER_COMPUTE_ACCEL:
+	 *
+	 * Driver supports compute acceleration devices. This flag is mutually exclusive with
+	 * @DRIVER_RENDER and @DRIVER_MODESET. Devices that support both graphics and compute
+	 * acceleration should be handled by two drivers that are connected using auxiliry bus.
+	 */
+	DRIVER_COMPUTE_ACCEL     = BIT(7),
 
 	/* IMPORTANT: Below are all the legacy flags, add new ones above. */
 
diff --git a/include/drm/drm_file.h b/include/drm/drm_file.h
index d780fd151789..e0895a90f394 100644
--- a/include/drm/drm_file.h
+++ b/include/drm/drm_file.h
@@ -51,11 +51,15 @@ struct file;
 
 /* Note that the order of this enum is ABI (it determines
  * /dev/dri/renderD* numbers).
+ *
+ * Setting DRM_MINOR_ACCEL to 32 gives enough space for more drm minors to
+ * be implemented before we hit any future
  */
 enum drm_minor_type {
 	DRM_MINOR_PRIMARY,
 	DRM_MINOR_CONTROL,
 	DRM_MINOR_RENDER,
+	DRM_MINOR_ACCEL = 32,
 };
 
 /**
@@ -70,7 +74,7 @@ enum drm_minor_type {
 struct drm_minor {
 	/* private: */
 	int index;			/* Minor device number */
-	int type;                       /* Control or render */
+	int type;                       /* Control or render or accel */
 	struct device *kdev;		/* Linux device */
 	struct drm_device *dev;
 
@@ -397,7 +401,22 @@ static inline bool drm_is_render_client(const struct drm_file *file_priv)
 	return file_priv->minor->type == DRM_MINOR_RENDER;
 }
 
+/**
+ * drm_is_accel_client - is this an open file of the compute acceleration node
+ * @file_priv: DRM file
+ *
+ * Returns true if this is an open file of the compute acceleration node, i.e.
+ * &drm_file.minor of @file_priv is a accel minor.
+ *
+ * See also the :ref:`section on accel nodes <drm_accel_node>`.
+ */
+static inline bool drm_is_accel_client(const struct drm_file *file_priv)
+{
+	return file_priv->minor->type == DRM_MINOR_ACCEL;
+}
+
 int drm_open(struct inode *inode, struct file *filp);
+int accel_open(struct inode *inode, struct file *filp);
 ssize_t drm_read(struct file *filp, char __user *buffer,
 		 size_t count, loff_t *offset);
 int drm_release(struct inode *inode, struct file *filp);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [RFC PATCH 2/3] drm: define new accel major and register it
  2022-10-22 21:46 ` [RFC PATCH 2/3] drm: define new accel major and register it Oded Gabbay
@ 2022-10-23 12:40   ` Greg Kroah-Hartman
  2022-10-24  7:23     ` Oded Gabbay
  0 siblings, 1 reply; 35+ messages in thread
From: Greg Kroah-Hartman @ 2022-10-23 12:40 UTC (permalink / raw)
  To: Oded Gabbay
  Cc: David Airlie, Daniel Vetter, Arnd Bergmann, linux-kernel,
	dri-devel, Jason Gunthorpe, John Hubbard, Alex Deucher,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	Yuji Ishikawa, Jiho Chu, Daniel Stone, Tvrtko Ursulin,
	Jeffrey Hugo, Christoph Hellwig, Kevin Hilman, Jagan Teki,
	Jacek Lawrynowicz, Maciej Kwapulinski

On Sun, Oct 23, 2022 at 12:46:21AM +0300, Oded Gabbay wrote:
> The accelerator devices will be exposed to the user space with a new,
> dedicated major number - 261.
> 
> The drm core registers the new major number as a char device and create
> corresponding sysfs and debugfs root entries, same as for the drm major.
> 
> In case CONFIG_ACCEL is not selected, this code is not compiled in.
> 
> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
> ---
>  Documentation/admin-guide/devices.txt |  5 +++
>  drivers/gpu/drm/drm_drv.c             | 45 +++++++++++++++++++++++
>  drivers/gpu/drm/drm_internal.h        |  3 ++
>  drivers/gpu/drm/drm_sysfs.c           | 52 +++++++++++++++++++++++++++
>  include/drm/drm_ioctl.h               |  1 +
>  5 files changed, 106 insertions(+)
> 
> diff --git a/Documentation/admin-guide/devices.txt b/Documentation/admin-guide/devices.txt
> index 9764d6edb189..06c525e01ea5 100644
> --- a/Documentation/admin-guide/devices.txt
> +++ b/Documentation/admin-guide/devices.txt
> @@ -3080,6 +3080,11 @@
>  		  ...
>  		  255 = /dev/osd255	256th OSD Device
>  
> + 261 char	Compute Acceleration Devices
> +		  0 = /dev/accel/accel0	First acceleration device
> +		  1 = /dev/accel/accel1	Second acceleration device
> +		    ...
> +
>   384-511 char	RESERVED FOR DYNAMIC ASSIGNMENT
>  		Character devices that request a dynamic allocation of major
>  		number will take numbers starting from 511 and downward,
> diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
> index 8214a0b1ab7f..b58ffb1433d6 100644
> --- a/drivers/gpu/drm/drm_drv.c
> +++ b/drivers/gpu/drm/drm_drv.c
> @@ -67,6 +67,10 @@ static bool drm_core_init_complete;
>  
>  static struct dentry *drm_debugfs_root;
>  
> +#ifdef CONFIG_ACCEL
> +static struct dentry *accel_debugfs_root;
> +#endif
> +
>  DEFINE_STATIC_SRCU(drm_unplug_srcu);
>  
>  /*
> @@ -1031,9 +1035,19 @@ static const struct file_operations drm_stub_fops = {
>  	.llseek = noop_llseek,
>  };
>  
> +static void accel_core_exit(void)
> +{
> +#ifdef CONFIG_ACCEL
> +	unregister_chrdev(ACCEL_MAJOR, "accel");
> +	debugfs_remove(accel_debugfs_root);
> +	accel_sysfs_destroy();
> +#endif
> +}

Why is all of this in drm_drv.c?

Why not put it in drm/accel/accel.c or something like that?  Then put
the proper stuff into a .h file and then you have no #ifdef in the .c
files.

Keeping #ifdef out of C files is key, please do not do things like you
have here.  Especially as it ends up with this kind of mess:

> +static int __init accel_core_init(void)
> +{
> +#ifdef CONFIG_ACCEL
> +	int ret;
> +
> +	ret = accel_sysfs_init();
> +	if (ret < 0) {
> +		DRM_ERROR("Cannot create ACCEL class: %d\n", ret);
> +		goto error;
> +	}
> +
> +	accel_debugfs_root = debugfs_create_dir("accel", NULL);
> +
> +	ret = register_chrdev(ACCEL_MAJOR, "accel", &drm_stub_fops);
> +	if (ret < 0)
> +		goto error;
> +
> +error:
> +	/* Any cleanup will be done in drm_core_exit() that will call
> +	 * to accel_core_exit()
> +	 */
> +	return ret;
> +#else
> +	return 0;
> +#endif
> +}


That's just a mess.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC PATCH 1/3] drivers/accel: add new kconfig and update MAINTAINERS
  2022-10-22 21:46 ` [RFC PATCH 1/3] drivers/accel: add new kconfig and update MAINTAINERS Oded Gabbay
@ 2022-10-23 12:40   ` Greg Kroah-Hartman
  2022-10-24  7:19     ` Oded Gabbay
  2022-10-24 15:01   ` Jeffrey Hugo
  1 sibling, 1 reply; 35+ messages in thread
From: Greg Kroah-Hartman @ 2022-10-23 12:40 UTC (permalink / raw)
  To: Oded Gabbay
  Cc: David Airlie, Daniel Vetter, Arnd Bergmann, linux-kernel,
	dri-devel, Jason Gunthorpe, John Hubbard, Alex Deucher,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	Yuji Ishikawa, Jiho Chu, Daniel Stone, Tvrtko Ursulin,
	Jeffrey Hugo, Christoph Hellwig, Kevin Hilman, Jagan Teki,
	Jacek Lawrynowicz, Maciej Kwapulinski

On Sun, Oct 23, 2022 at 12:46:20AM +0300, Oded Gabbay wrote:
> Add a new Kconfig for the accel subsystem. The Kconfig currently
> contains only the basic CONFIG_ACCEL option that will be used to
> decide whether to compile the accel registration code as part of the
> drm core functionality.
> 
> I have taken the liberty to appropriate the dri-devel mailing list and
> the dri-devel IRC channel for the accel subsystem.
> 
> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
> ---
>  MAINTAINERS           |  8 ++++++++
>  drivers/Kconfig       |  2 ++
>  drivers/accel/Kconfig | 24 ++++++++++++++++++++++++

YOu never use drivers/accel/ again in this patch series, was that
intentional?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC PATCH 3/3] drm: add dedicated minor for accelerator devices
  2022-10-22 21:46 ` [RFC PATCH 3/3] drm: add dedicated minor for accelerator devices Oded Gabbay
@ 2022-10-23 12:41   ` Greg Kroah-Hartman
  2022-10-24  7:23     ` Oded Gabbay
  2022-10-24 15:21   ` Jeffrey Hugo
  2022-10-25  6:43   ` Jiho Chu
  2 siblings, 1 reply; 35+ messages in thread
From: Greg Kroah-Hartman @ 2022-10-23 12:41 UTC (permalink / raw)
  To: Oded Gabbay
  Cc: David Airlie, Daniel Vetter, Arnd Bergmann, linux-kernel,
	dri-devel, Jason Gunthorpe, John Hubbard, Alex Deucher,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	Yuji Ishikawa, Jiho Chu, Daniel Stone, Tvrtko Ursulin,
	Jeffrey Hugo, Christoph Hellwig, Kevin Hilman, Jagan Teki,
	Jacek Lawrynowicz, Maciej Kwapulinski

On Sun, Oct 23, 2022 at 12:46:22AM +0300, Oded Gabbay wrote:
> +/**
> + * accel_open - open method for ACCEL file
> + * @inode: device inode
> + * @filp: file pointer.
> + *
> + * This function must be used by drivers as their &file_operations.open method.
> + * It looks up the correct ACCEL device and instantiates all the per-file
> + * resources for it. It also calls the &drm_driver.open driver callback.
> + *
> + * RETURNS:
> + *
> + * 0 on success or negative errno value on failure.
> + */
> +int accel_open(struct inode *inode, struct file *filp)
> +{
> +	struct drm_minor *minor;
> +
> +	minor = drm_minor_acquire(iminor(inode), true);
> +	if (IS_ERR(minor))
> +		return PTR_ERR(minor);
> +
> +	return __drm_open(inode, filp, minor);
> +}
> +EXPORT_SYMBOL(accel_open);

EXPORT_SYMBOL_GPL() please.

And again, this should probably to into drivers/accel/ not here.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC PATCH 0/3] new subsystem for compute accelerator devices
  2022-10-22 21:46 [RFC PATCH 0/3] new subsystem for compute accelerator devices Oded Gabbay
                   ` (2 preceding siblings ...)
  2022-10-22 21:46 ` [RFC PATCH 3/3] drm: add dedicated minor for accelerator devices Oded Gabbay
@ 2022-10-23 14:02 ` Bagas Sanjaya
  2022-10-23 14:14   ` Greg Kroah-Hartman
  2022-10-24 11:55 ` Thomas Zimmermann
  2022-10-24 13:55 ` Alex Deucher
  5 siblings, 1 reply; 35+ messages in thread
From: Bagas Sanjaya @ 2022-10-23 14:02 UTC (permalink / raw)
  To: Oded Gabbay
  Cc: David Airlie, Daniel Vetter, Arnd Bergmann, Greg Kroah-Hartman,
	linux-kernel, dri-devel, linux-doc, Jason Gunthorpe,
	John Hubbard, Alex Deucher, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, Yuji Ishikawa, Jiho Chu, Daniel Stone,
	Tvrtko Ursulin, Jeffrey Hugo, Christoph Hellwig, Kevin Hilman,
	Jagan Teki, Jacek Lawrynowicz, Maciej Kwapulinski,
	Jonathan Corbet

[-- Attachment #1: Type: text/plain, Size: 3268 bytes --]

On Sun, Oct 23, 2022 at 12:46:19AM +0300, Oded Gabbay wrote:
> In the last couple of months we had a discussion [1] about creating a new
> subsystem for compute accelerator devices in the kernel.
> 
> After an analysis that was done by DRM maintainers and myself, and following
> a BOF session at the Linux Plumbers conference a few weeks ago [2], we
> decided to create a new subsystem that will use the DRM subsystem's code and
> functionality. i.e. the accel core code will be part of the DRM subsystem.
> 
> This will allow us to leverage the extensive DRM code-base and
> collaborate with DRM developers that have experience with this type of
> devices. In addition, new features that will be added for the accelerator
> drivers can be of use to GPU drivers as well (e.g. RAS).
> 
> As agreed in the BOF session, the accelerator devices will be exposed to
> user-space with a new, dedicated device char files and a dedicated major
> number (261), to clearly separate them from graphic cards and the graphic
> user-space s/w stack. Furthermore, the drivers will be located in a separate
> place in the kernel tree (drivers/accel/).
> 
> This series of patches is the first step in this direction as it adds the
> necessary infrastructure for accelerator devices to DRM. The new devices will
> be exposed with the following convention:
> 
> device char files - /dev/accel/accel*
> sysfs             - /sys/class/accel/accel*/
> debugfs           - /sys/kernel/debug/accel/accel*/
> 
> I tried to reuse the existing DRM code as much as possible, while keeping it
> readable and maintainable.
> 
> One thing that is missing from this series is defining a namespace for the
> new accel subsystem, while I'll add in the next iteration of this patch-set,
> after I will receive feedback from the community.
> 
> As for drivers, once this series will be accepted (after adding the namespace),
> I will start working on migrating the habanalabs driver to the new accel
> subsystem. I have talked about it with Dave and we agreed that it will be
> a good start to simply move the driver as-is with minimal changes, and then
> start working on the driver's individual features that will be either added
> to the accel core code (with or without changes), or will be removed and
> instead the driver will use existing DRM code.
> 
> In addition, I know of at least 3 or 4 drivers that were submitted for review
> and are good candidates to be included in this new subsystem, instead of being
> a drm render node driver or a misc driver.
> 
> [1] https://lkml.org/lkml/2022/7/31/83
> [2] https://airlied.blogspot.com/2022/09/accelerators-bof-outcomes-summary.html
> 

Since this is new subsystem, it should have its own git tree where you
collected accelerator-related patches. By convention, there should be
"next" branch targeting for next kernel release and "fixes" branch for
bugfixes pending for current release. Both branches should be included
into linux-next. The names don't necessarily be that, though.

Also, it had been great if you write short, descriptive documentation
about the subsystem (maintainers handbook).

Cc'ing linux-doc folks.

-- 
An old man doll... just what I always wanted! - Clara

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC PATCH 0/3] new subsystem for compute accelerator devices
  2022-10-23 14:02 ` [RFC PATCH 0/3] new subsystem for compute " Bagas Sanjaya
@ 2022-10-23 14:14   ` Greg Kroah-Hartman
  0 siblings, 0 replies; 35+ messages in thread
From: Greg Kroah-Hartman @ 2022-10-23 14:14 UTC (permalink / raw)
  To: Bagas Sanjaya
  Cc: Oded Gabbay, David Airlie, Daniel Vetter, Arnd Bergmann,
	linux-kernel, dri-devel, linux-doc, Jason Gunthorpe,
	John Hubbard, Alex Deucher, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, Yuji Ishikawa, Jiho Chu, Daniel Stone,
	Tvrtko Ursulin, Jeffrey Hugo, Christoph Hellwig, Kevin Hilman,
	Jagan Teki, Jacek Lawrynowicz, Maciej Kwapulinski,
	Jonathan Corbet

On Sun, Oct 23, 2022 at 09:02:49PM +0700, Bagas Sanjaya wrote:
> On Sun, Oct 23, 2022 at 12:46:19AM +0300, Oded Gabbay wrote:
> > In the last couple of months we had a discussion [1] about creating a new
> > subsystem for compute accelerator devices in the kernel.
> > 
> > After an analysis that was done by DRM maintainers and myself, and following
> > a BOF session at the Linux Plumbers conference a few weeks ago [2], we
> > decided to create a new subsystem that will use the DRM subsystem's code and
> > functionality. i.e. the accel core code will be part of the DRM subsystem.
> > 
> > This will allow us to leverage the extensive DRM code-base and
> > collaborate with DRM developers that have experience with this type of
> > devices. In addition, new features that will be added for the accelerator
> > drivers can be of use to GPU drivers as well (e.g. RAS).
> > 
> > As agreed in the BOF session, the accelerator devices will be exposed to
> > user-space with a new, dedicated device char files and a dedicated major
> > number (261), to clearly separate them from graphic cards and the graphic
> > user-space s/w stack. Furthermore, the drivers will be located in a separate
> > place in the kernel tree (drivers/accel/).
> > 
> > This series of patches is the first step in this direction as it adds the
> > necessary infrastructure for accelerator devices to DRM. The new devices will
> > be exposed with the following convention:
> > 
> > device char files - /dev/accel/accel*
> > sysfs             - /sys/class/accel/accel*/
> > debugfs           - /sys/kernel/debug/accel/accel*/
> > 
> > I tried to reuse the existing DRM code as much as possible, while keeping it
> > readable and maintainable.
> > 
> > One thing that is missing from this series is defining a namespace for the
> > new accel subsystem, while I'll add in the next iteration of this patch-set,
> > after I will receive feedback from the community.
> > 
> > As for drivers, once this series will be accepted (after adding the namespace),
> > I will start working on migrating the habanalabs driver to the new accel
> > subsystem. I have talked about it with Dave and we agreed that it will be
> > a good start to simply move the driver as-is with minimal changes, and then
> > start working on the driver's individual features that will be either added
> > to the accel core code (with or without changes), or will be removed and
> > instead the driver will use existing DRM code.
> > 
> > In addition, I know of at least 3 or 4 drivers that were submitted for review
> > and are good candidates to be included in this new subsystem, instead of being
> > a drm render node driver or a misc driver.
> > 
> > [1] https://lkml.org/lkml/2022/7/31/83
> > [2] https://airlied.blogspot.com/2022/09/accelerators-bof-outcomes-summary.html
> > 
> 
> Since this is new subsystem, it should have its own git tree where you
> collected accelerator-related patches.

No, that is never a requirement, where did you get that idea?

> By convention, there should be
> "next" branch targeting for next kernel release and "fixes" branch for
> bugfixes pending for current release. Both branches should be included
> into linux-next. The names don't necessarily be that, though.

Again, no, that has never been a requirement.

> Also, it had been great if you write short, descriptive documentation
> about the subsystem (maintainers handbook).

Also no, this is fine, it's an RFC sent to all of the people involved in
the discussions about this new subsystem.  The changelog here is totally
sufficient.

Please do not confuse people and ask them to do things that are not
requirements, that's not helpful at all.

greg k-h

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC PATCH 1/3] drivers/accel: add new kconfig and update MAINTAINERS
  2022-10-23 12:40   ` Greg Kroah-Hartman
@ 2022-10-24  7:19     ` Oded Gabbay
  0 siblings, 0 replies; 35+ messages in thread
From: Oded Gabbay @ 2022-10-24  7:19 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: David Airlie, Daniel Vetter, Arnd Bergmann, linux-kernel,
	dri-devel, Jason Gunthorpe, John Hubbard, Alex Deucher,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	Yuji Ishikawa, Jiho Chu, Daniel Stone, Tvrtko Ursulin,
	Jeffrey Hugo, Christoph Hellwig, Kevin Hilman, Jagan Teki,
	Jacek Lawrynowicz, Maciej Kwapulinski

On Sun, Oct 23, 2022 at 3:40 PM Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
>
> On Sun, Oct 23, 2022 at 12:46:20AM +0300, Oded Gabbay wrote:
> > Add a new Kconfig for the accel subsystem. The Kconfig currently
> > contains only the basic CONFIG_ACCEL option that will be used to
> > decide whether to compile the accel registration code as part of the
> > drm core functionality.
> >
> > I have taken the liberty to appropriate the dri-devel mailing list and
> > the dri-devel IRC channel for the accel subsystem.
> >
> > Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
> > ---
> >  MAINTAINERS           |  8 ++++++++
> >  drivers/Kconfig       |  2 ++
> >  drivers/accel/Kconfig | 24 ++++++++++++++++++++++++
>
> YOu never use drivers/accel/ again in this patch series, was that
> intentional?
Yes, because I didn't plan for accel to have any core code at this stage.
But according to your other comments, this will probably change
(unless someone thinks otherwise).
Oded
>
> thanks,
>
> greg k-h

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC PATCH 2/3] drm: define new accel major and register it
  2022-10-23 12:40   ` Greg Kroah-Hartman
@ 2022-10-24  7:23     ` Oded Gabbay
  2022-10-24  7:52       ` Dave Airlie
  0 siblings, 1 reply; 35+ messages in thread
From: Oded Gabbay @ 2022-10-24  7:23 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: David Airlie, Daniel Vetter, Arnd Bergmann, linux-kernel,
	dri-devel, Jason Gunthorpe, John Hubbard, Alex Deucher,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	Yuji Ishikawa, Jiho Chu, Daniel Stone, Tvrtko Ursulin,
	Jeffrey Hugo, Christoph Hellwig, Kevin Hilman, Jagan Teki,
	Jacek Lawrynowicz, Maciej Kwapulinski

On Sun, Oct 23, 2022 at 3:40 PM Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
>
> On Sun, Oct 23, 2022 at 12:46:21AM +0300, Oded Gabbay wrote:
> > The accelerator devices will be exposed to the user space with a new,
> > dedicated major number - 261.
> >
> > The drm core registers the new major number as a char device and create
> > corresponding sysfs and debugfs root entries, same as for the drm major.
> >
> > In case CONFIG_ACCEL is not selected, this code is not compiled in.
> >
> > Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
> > ---
> >  Documentation/admin-guide/devices.txt |  5 +++
> >  drivers/gpu/drm/drm_drv.c             | 45 +++++++++++++++++++++++
> >  drivers/gpu/drm/drm_internal.h        |  3 ++
> >  drivers/gpu/drm/drm_sysfs.c           | 52 +++++++++++++++++++++++++++
> >  include/drm/drm_ioctl.h               |  1 +
> >  5 files changed, 106 insertions(+)
> >
> > diff --git a/Documentation/admin-guide/devices.txt b/Documentation/admin-guide/devices.txt
> > index 9764d6edb189..06c525e01ea5 100644
> > --- a/Documentation/admin-guide/devices.txt
> > +++ b/Documentation/admin-guide/devices.txt
> > @@ -3080,6 +3080,11 @@
> >                 ...
> >                 255 = /dev/osd255     256th OSD Device
> >
> > + 261 char    Compute Acceleration Devices
> > +               0 = /dev/accel/accel0 First acceleration device
> > +               1 = /dev/accel/accel1 Second acceleration device
> > +                 ...
> > +
> >   384-511 char        RESERVED FOR DYNAMIC ASSIGNMENT
> >               Character devices that request a dynamic allocation of major
> >               number will take numbers starting from 511 and downward,
> > diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
> > index 8214a0b1ab7f..b58ffb1433d6 100644
> > --- a/drivers/gpu/drm/drm_drv.c
> > +++ b/drivers/gpu/drm/drm_drv.c
> > @@ -67,6 +67,10 @@ static bool drm_core_init_complete;
> >
> >  static struct dentry *drm_debugfs_root;
> >
> > +#ifdef CONFIG_ACCEL
> > +static struct dentry *accel_debugfs_root;
> > +#endif
> > +
> >  DEFINE_STATIC_SRCU(drm_unplug_srcu);
> >
> >  /*
> > @@ -1031,9 +1035,19 @@ static const struct file_operations drm_stub_fops = {
> >       .llseek = noop_llseek,
> >  };
> >
> > +static void accel_core_exit(void)
> > +{
> > +#ifdef CONFIG_ACCEL
> > +     unregister_chrdev(ACCEL_MAJOR, "accel");
> > +     debugfs_remove(accel_debugfs_root);
> > +     accel_sysfs_destroy();
> > +#endif
> > +}
>
> Why is all of this in drm_drv.c?
>
> Why not put it in drm/accel/accel.c or something like that?  Then put
> the proper stuff into a .h file and then you have no #ifdef in the .c
> files.
I thought about that, adding an accel.c in drivers/accel/ and putting
this code there.
Eventually I thought that for two functions it's not worth it, but I
guess that in addition to the reason you gave, one can argue that
there will probably be more code in that file anyway, so why not open
it now.
I'll change this if no one else thinks otherwise.
Oded

>
> Keeping #ifdef out of C files is key, please do not do things like you
> have here.  Especially as it ends up with this kind of mess:
>
> > +static int __init accel_core_init(void)
> > +{
> > +#ifdef CONFIG_ACCEL
> > +     int ret;
> > +
> > +     ret = accel_sysfs_init();
> > +     if (ret < 0) {
> > +             DRM_ERROR("Cannot create ACCEL class: %d\n", ret);
> > +             goto error;
> > +     }
> > +
> > +     accel_debugfs_root = debugfs_create_dir("accel", NULL);
> > +
> > +     ret = register_chrdev(ACCEL_MAJOR, "accel", &drm_stub_fops);
> > +     if (ret < 0)
> > +             goto error;
> > +
> > +error:
> > +     /* Any cleanup will be done in drm_core_exit() that will call
> > +      * to accel_core_exit()
> > +      */
> > +     return ret;
> > +#else
> > +     return 0;
> > +#endif
> > +}
>
>
> That's just a mess.
>
> thanks,
>
> greg k-h

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC PATCH 3/3] drm: add dedicated minor for accelerator devices
  2022-10-23 12:41   ` Greg Kroah-Hartman
@ 2022-10-24  7:23     ` Oded Gabbay
  0 siblings, 0 replies; 35+ messages in thread
From: Oded Gabbay @ 2022-10-24  7:23 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: David Airlie, Daniel Vetter, Arnd Bergmann, linux-kernel,
	dri-devel, Jason Gunthorpe, John Hubbard, Alex Deucher,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	Yuji Ishikawa, Jiho Chu, Daniel Stone, Tvrtko Ursulin,
	Jeffrey Hugo, Christoph Hellwig, Kevin Hilman, Jagan Teki,
	Jacek Lawrynowicz, Maciej Kwapulinski

On Sun, Oct 23, 2022 at 3:41 PM Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
>
> On Sun, Oct 23, 2022 at 12:46:22AM +0300, Oded Gabbay wrote:
> > +/**
> > + * accel_open - open method for ACCEL file
> > + * @inode: device inode
> > + * @filp: file pointer.
> > + *
> > + * This function must be used by drivers as their &file_operations.open method.
> > + * It looks up the correct ACCEL device and instantiates all the per-file
> > + * resources for it. It also calls the &drm_driver.open driver callback.
> > + *
> > + * RETURNS:
> > + *
> > + * 0 on success or negative errno value on failure.
> > + */
> > +int accel_open(struct inode *inode, struct file *filp)
> > +{
> > +     struct drm_minor *minor;
> > +
> > +     minor = drm_minor_acquire(iminor(inode), true);
> > +     if (IS_ERR(minor))
> > +             return PTR_ERR(minor);
> > +
> > +     return __drm_open(inode, filp, minor);
> > +}
> > +EXPORT_SYMBOL(accel_open);
>
> EXPORT_SYMBOL_GPL() please.
>
> And again, this should probably to into drivers/accel/ not here.
Got it, will do.
Thanks,
Oded
>
> thanks,
>
> greg k-h

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC PATCH 2/3] drm: define new accel major and register it
  2022-10-24  7:23     ` Oded Gabbay
@ 2022-10-24  7:52       ` Dave Airlie
  2022-10-24 15:08         ` Jeffrey Hugo
  0 siblings, 1 reply; 35+ messages in thread
From: Dave Airlie @ 2022-10-24  7:52 UTC (permalink / raw)
  To: Oded Gabbay
  Cc: Greg Kroah-Hartman, Daniel Vetter, Arnd Bergmann, linux-kernel,
	dri-devel, Jason Gunthorpe, John Hubbard, Alex Deucher,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	Yuji Ishikawa, Jiho Chu, Daniel Stone, Tvrtko Ursulin,
	Jeffrey Hugo, Christoph Hellwig, Kevin Hilman, Jagan Teki,
	Jacek Lawrynowicz, Maciej Kwapulinski

On Mon, 24 Oct 2022 at 17:23, Oded Gabbay <ogabbay@kernel.org> wrote:
>
> On Sun, Oct 23, 2022 at 3:40 PM Greg Kroah-Hartman
> <gregkh@linuxfoundation.org> wrote:
> >
> > On Sun, Oct 23, 2022 at 12:46:21AM +0300, Oded Gabbay wrote:
> > > The accelerator devices will be exposed to the user space with a new,
> > > dedicated major number - 261.
> > >
> > > The drm core registers the new major number as a char device and create
> > > corresponding sysfs and debugfs root entries, same as for the drm major.
> > >
> > > In case CONFIG_ACCEL is not selected, this code is not compiled in.
> > >
> > > Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
> > > ---
> > >  Documentation/admin-guide/devices.txt |  5 +++
> > >  drivers/gpu/drm/drm_drv.c             | 45 +++++++++++++++++++++++
> > >  drivers/gpu/drm/drm_internal.h        |  3 ++
> > >  drivers/gpu/drm/drm_sysfs.c           | 52 +++++++++++++++++++++++++++
> > >  include/drm/drm_ioctl.h               |  1 +
> > >  5 files changed, 106 insertions(+)
> > >
> > > diff --git a/Documentation/admin-guide/devices.txt b/Documentation/admin-guide/devices.txt
> > > index 9764d6edb189..06c525e01ea5 100644
> > > --- a/Documentation/admin-guide/devices.txt
> > > +++ b/Documentation/admin-guide/devices.txt
> > > @@ -3080,6 +3080,11 @@
> > >                 ...
> > >                 255 = /dev/osd255     256th OSD Device
> > >
> > > + 261 char    Compute Acceleration Devices
> > > +               0 = /dev/accel/accel0 First acceleration device
> > > +               1 = /dev/accel/accel1 Second acceleration device
> > > +                 ...
> > > +
> > >   384-511 char        RESERVED FOR DYNAMIC ASSIGNMENT
> > >               Character devices that request a dynamic allocation of major
> > >               number will take numbers starting from 511 and downward,
> > > diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
> > > index 8214a0b1ab7f..b58ffb1433d6 100644
> > > --- a/drivers/gpu/drm/drm_drv.c
> > > +++ b/drivers/gpu/drm/drm_drv.c
> > > @@ -67,6 +67,10 @@ static bool drm_core_init_complete;
> > >
> > >  static struct dentry *drm_debugfs_root;
> > >
> > > +#ifdef CONFIG_ACCEL
> > > +static struct dentry *accel_debugfs_root;
> > > +#endif
> > > +
> > >  DEFINE_STATIC_SRCU(drm_unplug_srcu);
> > >
> > >  /*
> > > @@ -1031,9 +1035,19 @@ static const struct file_operations drm_stub_fops = {
> > >       .llseek = noop_llseek,
> > >  };
> > >
> > > +static void accel_core_exit(void)
> > > +{
> > > +#ifdef CONFIG_ACCEL
> > > +     unregister_chrdev(ACCEL_MAJOR, "accel");
> > > +     debugfs_remove(accel_debugfs_root);
> > > +     accel_sysfs_destroy();
> > > +#endif
> > > +}
> >
> > Why is all of this in drm_drv.c?
> >
> > Why not put it in drm/accel/accel.c or something like that?  Then put
> > the proper stuff into a .h file and then you have no #ifdef in the .c
> > files.
> I thought about that, adding an accel.c in drivers/accel/ and putting
> this code there.
> Eventually I thought that for two functions it's not worth it, but I
> guess that in addition to the reason you gave, one can argue that
> there will probably be more code in that file anyway, so why not open
> it now.
> I'll change this if no one else thinks otherwise.

Seems like a good idea to start doing it now, might make things easier
to keep separated.

Dave.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC PATCH 0/3] new subsystem for compute accelerator devices
  2022-10-22 21:46 [RFC PATCH 0/3] new subsystem for compute accelerator devices Oded Gabbay
                   ` (3 preceding siblings ...)
  2022-10-23 14:02 ` [RFC PATCH 0/3] new subsystem for compute " Bagas Sanjaya
@ 2022-10-24 11:55 ` Thomas Zimmermann
  2022-10-24 12:35   ` Greg Kroah-Hartman
  2022-10-24 12:43   ` Oded Gabbay
  2022-10-24 13:55 ` Alex Deucher
  5 siblings, 2 replies; 35+ messages in thread
From: Thomas Zimmermann @ 2022-10-24 11:55 UTC (permalink / raw)
  To: Oded Gabbay, David Airlie, Daniel Vetter, Arnd Bergmann,
	Greg Kroah-Hartman, linux-kernel, dri-devel, Jason Gunthorpe,
	John Hubbard, Alex Deucher
  Cc: Maarten Lankhorst, Maxime Ripard, Yuji Ishikawa, Jiho Chu,
	Daniel Stone, Tvrtko Ursulin, Jeffrey Hugo, Christoph Hellwig,
	Kevin Hilman, Jagan Teki, Jacek Lawrynowicz, Maciej Kwapulinski


[-- Attachment #1.1: Type: text/plain, Size: 4516 bytes --]

Hi

Am 22.10.22 um 23:46 schrieb Oded Gabbay:
> In the last couple of months we had a discussion [1] about creating a new
> subsystem for compute accelerator devices in the kernel.
> 
> After an analysis that was done by DRM maintainers and myself, and following
> a BOF session at the Linux Plumbers conference a few weeks ago [2], we
> decided to create a new subsystem that will use the DRM subsystem's code and
> functionality. i.e. the accel core code will be part of the DRM subsystem.
> 
> This will allow us to leverage the extensive DRM code-base and
> collaborate with DRM developers that have experience with this type of
> devices. In addition, new features that will be added for the accelerator
> drivers can be of use to GPU drivers as well (e.g. RAS).
> 
> As agreed in the BOF session, the accelerator devices will be exposed to
> user-space with a new, dedicated device char files and a dedicated major
> number (261), to clearly separate them from graphic cards and the graphic
> user-space s/w stack. Furthermore, the drivers will be located in a separate
> place in the kernel tree (drivers/accel/).
> 
> This series of patches is the first step in this direction as it adds the
> necessary infrastructure for accelerator devices to DRM. The new devices will
> be exposed with the following convention:
> 
> device char files - /dev/accel/accel*
> sysfs             - /sys/class/accel/accel*/
> debugfs           - /sys/kernel/debug/accel/accel*/

I know I'm really late to this discussion, but wouldn't 'compute' be a 
better name?

(I agree that skynet would also be nice :)

> 
> I tried to reuse the existing DRM code as much as possible, while keeping it
> readable and maintainable.
> 
> One thing that is missing from this series is defining a namespace for the
> new accel subsystem, while I'll add in the next iteration of this patch-set,
> after I will receive feedback from the community.
> 
> As for drivers, once this series will be accepted (after adding the namespace),
> I will start working on migrating the habanalabs driver to the new accel
> subsystem. I have talked about it with Dave and we agreed that it will be
> a good start to simply move the driver as-is with minimal changes, and then
> start working on the driver's individual features that will be either added
> to the accel core code (with or without changes), or will be removed and
> instead the driver will use existing DRM code.

What's your opinion on the long-term prospect of DRM vs accel? I assume 
that over time, DRM helpers will move into accel and some DRM drivers 
will start depending on accel?

After reading the provided links, I wondered if we shouldn't rename 
drivers/gpu to drivers/accel and put the new subsystem into 
drivers/accel/compute. We'd have DRM and compute devices next to each 
other and shared helpers could be located in other subdirectories within 
accel/

Best regards
Thomas

> 
> In addition, I know of at least 3 or 4 drivers that were submitted for review
> and are good candidates to be included in this new subsystem, instead of being
> a drm render node driver or a misc driver.
> 
> [1] https://lkml.org/lkml/2022/7/31/83
> [2] https://airlied.blogspot.com/2022/09/accelerators-bof-outcomes-summary.html
> 
> Thanks,
> Oded
> 
> Oded Gabbay (3):
>    drivers/accel: add new kconfig and update MAINTAINERS
>    drm: define new accel major and register it
>    drm: add dedicated minor for accelerator devices
> 
>   Documentation/admin-guide/devices.txt |   5 +
>   MAINTAINERS                           |   8 +
>   drivers/Kconfig                       |   2 +
>   drivers/accel/Kconfig                 |  24 +++
>   drivers/gpu/drm/drm_drv.c             | 214 +++++++++++++++++++++-----
>   drivers/gpu/drm/drm_file.c            |  69 ++++++---
>   drivers/gpu/drm/drm_internal.h        |   5 +-
>   drivers/gpu/drm/drm_sysfs.c           |  81 +++++++++-
>   include/drm/drm_device.h              |   3 +
>   include/drm/drm_drv.h                 |   8 +
>   include/drm/drm_file.h                |  21 ++-
>   include/drm/drm_ioctl.h               |   1 +
>   12 files changed, 374 insertions(+), 67 deletions(-)
>   create mode 100644 drivers/accel/Kconfig
> 
> --
> 2.34.1
> 

-- 
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Ivo Totev

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 840 bytes --]

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC PATCH 0/3] new subsystem for compute accelerator devices
  2022-10-24 11:55 ` Thomas Zimmermann
@ 2022-10-24 12:35   ` Greg Kroah-Hartman
  2022-10-24 12:43   ` Oded Gabbay
  1 sibling, 0 replies; 35+ messages in thread
From: Greg Kroah-Hartman @ 2022-10-24 12:35 UTC (permalink / raw)
  To: Thomas Zimmermann
  Cc: Oded Gabbay, David Airlie, Daniel Vetter, Arnd Bergmann,
	linux-kernel, dri-devel, Jason Gunthorpe, John Hubbard,
	Alex Deucher, Maarten Lankhorst, Maxime Ripard, Yuji Ishikawa,
	Jiho Chu, Daniel Stone, Tvrtko Ursulin, Jeffrey Hugo,
	Christoph Hellwig, Kevin Hilman, Jagan Teki, Jacek Lawrynowicz,
	Maciej Kwapulinski

On Mon, Oct 24, 2022 at 01:55:56PM +0200, Thomas Zimmermann wrote:
> Hi
> 
> Am 22.10.22 um 23:46 schrieb Oded Gabbay:
> > In the last couple of months we had a discussion [1] about creating a new
> > subsystem for compute accelerator devices in the kernel.
> > 
> > After an analysis that was done by DRM maintainers and myself, and following
> > a BOF session at the Linux Plumbers conference a few weeks ago [2], we
> > decided to create a new subsystem that will use the DRM subsystem's code and
> > functionality. i.e. the accel core code will be part of the DRM subsystem.
> > 
> > This will allow us to leverage the extensive DRM code-base and
> > collaborate with DRM developers that have experience with this type of
> > devices. In addition, new features that will be added for the accelerator
> > drivers can be of use to GPU drivers as well (e.g. RAS).
> > 
> > As agreed in the BOF session, the accelerator devices will be exposed to
> > user-space with a new, dedicated device char files and a dedicated major
> > number (261), to clearly separate them from graphic cards and the graphic
> > user-space s/w stack. Furthermore, the drivers will be located in a separate
> > place in the kernel tree (drivers/accel/).
> > 
> > This series of patches is the first step in this direction as it adds the
> > necessary infrastructure for accelerator devices to DRM. The new devices will
> > be exposed with the following convention:
> > 
> > device char files - /dev/accel/accel*
> > sysfs             - /sys/class/accel/accel*/
> > debugfs           - /sys/kernel/debug/accel/accel*/
> 
> I know I'm really late to this discussion, but wouldn't 'compute' be a
> better name?
> 
> (I agree that skynet would also be nice :)

See the summary of the meeting we all held at the Plumbers conference
about this.  "accel" was the "least hated" of all of the options, so I
think we'll stick with that for now.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC PATCH 0/3] new subsystem for compute accelerator devices
  2022-10-24 11:55 ` Thomas Zimmermann
  2022-10-24 12:35   ` Greg Kroah-Hartman
@ 2022-10-24 12:43   ` Oded Gabbay
  2022-10-25  2:21     ` John Hubbard
  1 sibling, 1 reply; 35+ messages in thread
From: Oded Gabbay @ 2022-10-24 12:43 UTC (permalink / raw)
  To: Thomas Zimmermann
  Cc: David Airlie, Daniel Vetter, Arnd Bergmann, Greg Kroah-Hartman,
	linux-kernel, dri-devel, Jason Gunthorpe, John Hubbard,
	Alex Deucher, Maarten Lankhorst, Maxime Ripard, Yuji Ishikawa,
	Jiho Chu, Daniel Stone, Tvrtko Ursulin, Jeffrey Hugo,
	Christoph Hellwig, Kevin Hilman, Jagan Teki, Jacek Lawrynowicz,
	Maciej Kwapulinski

On Mon, Oct 24, 2022 at 2:56 PM Thomas Zimmermann <tzimmermann@suse.de> wrote:
>
> Hi
>
> Am 22.10.22 um 23:46 schrieb Oded Gabbay:
> > In the last couple of months we had a discussion [1] about creating a new
> > subsystem for compute accelerator devices in the kernel.
> >
> > After an analysis that was done by DRM maintainers and myself, and following
> > a BOF session at the Linux Plumbers conference a few weeks ago [2], we
> > decided to create a new subsystem that will use the DRM subsystem's code and
> > functionality. i.e. the accel core code will be part of the DRM subsystem.
> >
> > This will allow us to leverage the extensive DRM code-base and
> > collaborate with DRM developers that have experience with this type of
> > devices. In addition, new features that will be added for the accelerator
> > drivers can be of use to GPU drivers as well (e.g. RAS).
> >
> > As agreed in the BOF session, the accelerator devices will be exposed to
> > user-space with a new, dedicated device char files and a dedicated major
> > number (261), to clearly separate them from graphic cards and the graphic
> > user-space s/w stack. Furthermore, the drivers will be located in a separate
> > place in the kernel tree (drivers/accel/).
> >
> > This series of patches is the first step in this direction as it adds the
> > necessary infrastructure for accelerator devices to DRM. The new devices will
> > be exposed with the following convention:
> >
> > device char files - /dev/accel/accel*
> > sysfs             - /sys/class/accel/accel*/
> > debugfs           - /sys/kernel/debug/accel/accel*/
>
> I know I'm really late to this discussion, but wouldn't 'compute' be a
> better name?
I also thought like you :)
But I consulted with Dave while writing these patches and he suggested
accel/accel.
I'm fine either way...

>
> (I agree that skynet would also be nice :)
>
> >
> > I tried to reuse the existing DRM code as much as possible, while keeping it
> > readable and maintainable.
> >
> > One thing that is missing from this series is defining a namespace for the
> > new accel subsystem, while I'll add in the next iteration of this patch-set,
> > after I will receive feedback from the community.
> >
> > As for drivers, once this series will be accepted (after adding the namespace),
> > I will start working on migrating the habanalabs driver to the new accel
> > subsystem. I have talked about it with Dave and we agreed that it will be
> > a good start to simply move the driver as-is with minimal changes, and then
> > start working on the driver's individual features that will be either added
> > to the accel core code (with or without changes), or will be removed and
> > instead the driver will use existing DRM code.
>
> What's your opinion on the long-term prospect of DRM vs accel? I assume
> that over time, DRM helpers will move into accel and some DRM drivers
> will start depending on accel?
I don't think that is what I had in mind.
What I had in mind is that accel helpers are only relevant for accel
drivers, and any code that might also be relevant for DRM drivers will
be placed in DRM core code. e.g. GEM enhancements, RAS netlink
support. btw, I suspect this will be the majority of the code.
In addition, DRM drivers should never set the new DRIVER_COMPUTE_ACCEL
driver feature in their structure so they should have zero dependency
on the accel core code.

>
> After reading the provided links, I wondered if we shouldn't rename
> drivers/gpu to drivers/accel and put the new subsystem into
> drivers/accel/compute. We'd have DRM and compute devices next to each
> other and shared helpers could be located in other subdirectories within
> accel/
I think this idea was brought up at the BOF session and Dave and
others said it will be too big of a burden (due to backports) to do
it.
From Dave's blogpost:
"Moving things around now for current drivers is too hard to deal with
for backports etc. Adding a new directory for accel drivers would be a
good plan, even if they used the drm framework."

Thanks,
Oded
>
> Best regards
> Thomas
>
> >
> > In addition, I know of at least 3 or 4 drivers that were submitted for review
> > and are good candidates to be included in this new subsystem, instead of being
> > a drm render node driver or a misc driver.
> >
> > [1] https://lkml.org/lkml/2022/7/31/83
> > [2] https://airlied.blogspot.com/2022/09/accelerators-bof-outcomes-summary.html
> >
> > Thanks,
> > Oded
> >
> > Oded Gabbay (3):
> >    drivers/accel: add new kconfig and update MAINTAINERS
> >    drm: define new accel major and register it
> >    drm: add dedicated minor for accelerator devices
> >
> >   Documentation/admin-guide/devices.txt |   5 +
> >   MAINTAINERS                           |   8 +
> >   drivers/Kconfig                       |   2 +
> >   drivers/accel/Kconfig                 |  24 +++
> >   drivers/gpu/drm/drm_drv.c             | 214 +++++++++++++++++++++-----
> >   drivers/gpu/drm/drm_file.c            |  69 ++++++---
> >   drivers/gpu/drm/drm_internal.h        |   5 +-
> >   drivers/gpu/drm/drm_sysfs.c           |  81 +++++++++-
> >   include/drm/drm_device.h              |   3 +
> >   include/drm/drm_drv.h                 |   8 +
> >   include/drm/drm_file.h                |  21 ++-
> >   include/drm/drm_ioctl.h               |   1 +
> >   12 files changed, 374 insertions(+), 67 deletions(-)
> >   create mode 100644 drivers/accel/Kconfig
> >
> > --
> > 2.34.1
> >
>
> --
> Thomas Zimmermann
> Graphics Driver Developer
> SUSE Software Solutions Germany GmbH
> Maxfeldstr. 5, 90409 Nürnberg, Germany
> (HRB 36809, AG Nürnberg)
> Geschäftsführer: Ivo Totev

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC PATCH 0/3] new subsystem for compute accelerator devices
  2022-10-22 21:46 [RFC PATCH 0/3] new subsystem for compute accelerator devices Oded Gabbay
                   ` (4 preceding siblings ...)
  2022-10-24 11:55 ` Thomas Zimmermann
@ 2022-10-24 13:55 ` Alex Deucher
  2022-10-24 14:41   ` Oded Gabbay
  5 siblings, 1 reply; 35+ messages in thread
From: Alex Deucher @ 2022-10-24 13:55 UTC (permalink / raw)
  To: Oded Gabbay
  Cc: David Airlie, Daniel Vetter, Arnd Bergmann, Greg Kroah-Hartman,
	linux-kernel, dri-devel, Jason Gunthorpe, John Hubbard,
	Alex Deucher, Tvrtko Ursulin, Jacek Lawrynowicz, Jeffrey Hugo,
	Jiho Chu, Christoph Hellwig, Thomas Zimmermann, Kevin Hilman,
	Yuji Ishikawa, Maciej Kwapulinski, Jagan Teki

On Sat, Oct 22, 2022 at 5:46 PM Oded Gabbay <ogabbay@kernel.org> wrote:
>
> In the last couple of months we had a discussion [1] about creating a new
> subsystem for compute accelerator devices in the kernel.
>
> After an analysis that was done by DRM maintainers and myself, and following
> a BOF session at the Linux Plumbers conference a few weeks ago [2], we
> decided to create a new subsystem that will use the DRM subsystem's code and
> functionality. i.e. the accel core code will be part of the DRM subsystem.
>
> This will allow us to leverage the extensive DRM code-base and
> collaborate with DRM developers that have experience with this type of
> devices. In addition, new features that will be added for the accelerator
> drivers can be of use to GPU drivers as well (e.g. RAS).
>
> As agreed in the BOF session, the accelerator devices will be exposed to
> user-space with a new, dedicated device char files and a dedicated major
> number (261), to clearly separate them from graphic cards and the graphic
> user-space s/w stack. Furthermore, the drivers will be located in a separate
> place in the kernel tree (drivers/accel/).
>
> This series of patches is the first step in this direction as it adds the
> necessary infrastructure for accelerator devices to DRM. The new devices will
> be exposed with the following convention:
>
> device char files - /dev/accel/accel*
> sysfs             - /sys/class/accel/accel*/
> debugfs           - /sys/kernel/debug/accel/accel*/
>
> I tried to reuse the existing DRM code as much as possible, while keeping it
> readable and maintainable.

Wouldn't something like this:
https://patchwork.freedesktop.org/series/109575/
Be simpler and provide better backwards compatibility for existing
non-gfx devices in the drm subsystem as well as newer devices?

Alex

>
> One thing that is missing from this series is defining a namespace for the
> new accel subsystem, while I'll add in the next iteration of this patch-set,
> after I will receive feedback from the community.
>
> As for drivers, once this series will be accepted (after adding the namespace),
> I will start working on migrating the habanalabs driver to the new accel
> subsystem. I have talked about it with Dave and we agreed that it will be
> a good start to simply move the driver as-is with minimal changes, and then
> start working on the driver's individual features that will be either added
> to the accel core code (with or without changes), or will be removed and
> instead the driver will use existing DRM code.
>
> In addition, I know of at least 3 or 4 drivers that were submitted for review
> and are good candidates to be included in this new subsystem, instead of being
> a drm render node driver or a misc driver.
>
> [1] https://lkml.org/lkml/2022/7/31/83
> [2] https://airlied.blogspot.com/2022/09/accelerators-bof-outcomes-summary.html
>
> Thanks,
> Oded
>
> Oded Gabbay (3):
>   drivers/accel: add new kconfig and update MAINTAINERS
>   drm: define new accel major and register it
>   drm: add dedicated minor for accelerator devices
>
>  Documentation/admin-guide/devices.txt |   5 +
>  MAINTAINERS                           |   8 +
>  drivers/Kconfig                       |   2 +
>  drivers/accel/Kconfig                 |  24 +++
>  drivers/gpu/drm/drm_drv.c             | 214 +++++++++++++++++++++-----
>  drivers/gpu/drm/drm_file.c            |  69 ++++++---
>  drivers/gpu/drm/drm_internal.h        |   5 +-
>  drivers/gpu/drm/drm_sysfs.c           |  81 +++++++++-
>  include/drm/drm_device.h              |   3 +
>  include/drm/drm_drv.h                 |   8 +
>  include/drm/drm_file.h                |  21 ++-
>  include/drm/drm_ioctl.h               |   1 +
>  12 files changed, 374 insertions(+), 67 deletions(-)
>  create mode 100644 drivers/accel/Kconfig
>
> --
> 2.34.1
>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC PATCH 0/3] new subsystem for compute accelerator devices
  2022-10-24 13:55 ` Alex Deucher
@ 2022-10-24 14:41   ` Oded Gabbay
  2022-10-24 15:10     ` Alex Deucher
  0 siblings, 1 reply; 35+ messages in thread
From: Oded Gabbay @ 2022-10-24 14:41 UTC (permalink / raw)
  To: Alex Deucher
  Cc: David Airlie, Daniel Vetter, Arnd Bergmann, Greg Kroah-Hartman,
	linux-kernel, dri-devel, Jason Gunthorpe, John Hubbard,
	Alex Deucher, Tvrtko Ursulin, Jacek Lawrynowicz, Jeffrey Hugo,
	Jiho Chu, Christoph Hellwig, Thomas Zimmermann, Kevin Hilman,
	Yuji Ishikawa, Maciej Kwapulinski, Jagan Teki

On Mon, Oct 24, 2022 at 4:55 PM Alex Deucher <alexdeucher@gmail.com> wrote:
>
> On Sat, Oct 22, 2022 at 5:46 PM Oded Gabbay <ogabbay@kernel.org> wrote:
> >
> > In the last couple of months we had a discussion [1] about creating a new
> > subsystem for compute accelerator devices in the kernel.
> >
> > After an analysis that was done by DRM maintainers and myself, and following
> > a BOF session at the Linux Plumbers conference a few weeks ago [2], we
> > decided to create a new subsystem that will use the DRM subsystem's code and
> > functionality. i.e. the accel core code will be part of the DRM subsystem.
> >
> > This will allow us to leverage the extensive DRM code-base and
> > collaborate with DRM developers that have experience with this type of
> > devices. In addition, new features that will be added for the accelerator
> > drivers can be of use to GPU drivers as well (e.g. RAS).
> >
> > As agreed in the BOF session, the accelerator devices will be exposed to
> > user-space with a new, dedicated device char files and a dedicated major
> > number (261), to clearly separate them from graphic cards and the graphic
> > user-space s/w stack. Furthermore, the drivers will be located in a separate
> > place in the kernel tree (drivers/accel/).
> >
> > This series of patches is the first step in this direction as it adds the
> > necessary infrastructure for accelerator devices to DRM. The new devices will
> > be exposed with the following convention:
> >
> > device char files - /dev/accel/accel*
> > sysfs             - /sys/class/accel/accel*/
> > debugfs           - /sys/kernel/debug/accel/accel*/
> >
> > I tried to reuse the existing DRM code as much as possible, while keeping it
> > readable and maintainable.
>
> Wouldn't something like this:
> https://patchwork.freedesktop.org/series/109575/
> Be simpler and provide better backwards compatibility for existing
> non-gfx devices in the drm subsystem as well as newer devices?

As Greg said, see the summary. The consensus in the LPC session was
that we need to clearly separate accel devices from existing gpu
devices (whether they use primary and/or render nodes). That is the
main guideline according to which I wrote the patches. I don't think I
want to change this decision.

Also, there was never any intention to provide backward compatibility
for existing non-gfx devices. Why would we want that ? We are mainly
talking about drivers that are currently trying to get upstream, and
the habana driver.

Oded
>
> Alex
>
> >
> > One thing that is missing from this series is defining a namespace for the
> > new accel subsystem, while I'll add in the next iteration of this patch-set,
> > after I will receive feedback from the community.
> >
> > As for drivers, once this series will be accepted (after adding the namespace),
> > I will start working on migrating the habanalabs driver to the new accel
> > subsystem. I have talked about it with Dave and we agreed that it will be
> > a good start to simply move the driver as-is with minimal changes, and then
> > start working on the driver's individual features that will be either added
> > to the accel core code (with or without changes), or will be removed and
> > instead the driver will use existing DRM code.
> >
> > In addition, I know of at least 3 or 4 drivers that were submitted for review
> > and are good candidates to be included in this new subsystem, instead of being
> > a drm render node driver or a misc driver.
> >
> > [1] https://lkml.org/lkml/2022/7/31/83
> > [2] https://airlied.blogspot.com/2022/09/accelerators-bof-outcomes-summary.html
> >
> > Thanks,
> > Oded
> >
> > Oded Gabbay (3):
> >   drivers/accel: add new kconfig and update MAINTAINERS
> >   drm: define new accel major and register it
> >   drm: add dedicated minor for accelerator devices
> >
> >  Documentation/admin-guide/devices.txt |   5 +
> >  MAINTAINERS                           |   8 +
> >  drivers/Kconfig                       |   2 +
> >  drivers/accel/Kconfig                 |  24 +++
> >  drivers/gpu/drm/drm_drv.c             | 214 +++++++++++++++++++++-----
> >  drivers/gpu/drm/drm_file.c            |  69 ++++++---
> >  drivers/gpu/drm/drm_internal.h        |   5 +-
> >  drivers/gpu/drm/drm_sysfs.c           |  81 +++++++++-
> >  include/drm/drm_device.h              |   3 +
> >  include/drm/drm_drv.h                 |   8 +
> >  include/drm/drm_file.h                |  21 ++-
> >  include/drm/drm_ioctl.h               |   1 +
> >  12 files changed, 374 insertions(+), 67 deletions(-)
> >  create mode 100644 drivers/accel/Kconfig
> >
> > --
> > 2.34.1
> >

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC PATCH 1/3] drivers/accel: add new kconfig and update MAINTAINERS
  2022-10-22 21:46 ` [RFC PATCH 1/3] drivers/accel: add new kconfig and update MAINTAINERS Oded Gabbay
  2022-10-23 12:40   ` Greg Kroah-Hartman
@ 2022-10-24 15:01   ` Jeffrey Hugo
  1 sibling, 0 replies; 35+ messages in thread
From: Jeffrey Hugo @ 2022-10-24 15:01 UTC (permalink / raw)
  To: Oded Gabbay, David Airlie, Daniel Vetter, Arnd Bergmann,
	Greg Kroah-Hartman, linux-kernel, dri-devel, Jason Gunthorpe,
	John Hubbard, Alex Deucher
  Cc: Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	Yuji Ishikawa, Jiho Chu, Daniel Stone, Tvrtko Ursulin,
	Christoph Hellwig, Kevin Hilman, Jagan Teki, Jacek Lawrynowicz,
	Maciej Kwapulinski

On 10/22/2022 3:46 PM, Oded Gabbay wrote:
> Add a new Kconfig for the accel subsystem. The Kconfig currently
> contains only the basic CONFIG_ACCEL option that will be used to
> decide whether to compile the accel registration code as part of the
> drm core functionality.
> 
> I have taken the liberty to appropriate the dri-devel mailing list and
> the dri-devel IRC channel for the accel subsystem.
> 
> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
> ---
>   MAINTAINERS           |  8 ++++++++
>   drivers/Kconfig       |  2 ++
>   drivers/accel/Kconfig | 24 ++++++++++++++++++++++++
>   3 files changed, 34 insertions(+)
>   create mode 100644 drivers/accel/Kconfig
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index cf0f18502372..790d472801d5 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -6820,6 +6820,14 @@ F:	include/drm/drm*
>   F:	include/linux/vga*
>   F:	include/uapi/drm/drm*
>   
> +DRM COMPUTE ACCELERATORS DRIVERS AND FRAMEWORK
> +M:	Oded Gabbay <ogabbay@kernel.org>
> +L:	dri-devel@lists.freedesktop.org
> +S:	Maintained
> +C:	irc://irc.oftc.net/dri-devel
> +T:	git https://git.kernel.org/pub/scm/linux/kernel/git/ogabbay/accel.git
> +F:	drivers/accel/

I'm willing to assist with reviews, etc.  While I appreciate your 
efforts to drive this, you shouldn't be taking on everything going 
forward. Feel free to add me (or don't) to this entry as you see fit.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC PATCH 2/3] drm: define new accel major and register it
  2022-10-24  7:52       ` Dave Airlie
@ 2022-10-24 15:08         ` Jeffrey Hugo
  0 siblings, 0 replies; 35+ messages in thread
From: Jeffrey Hugo @ 2022-10-24 15:08 UTC (permalink / raw)
  To: Dave Airlie, Oded Gabbay
  Cc: Greg Kroah-Hartman, Daniel Vetter, Arnd Bergmann, linux-kernel,
	dri-devel, Jason Gunthorpe, John Hubbard, Alex Deucher,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	Yuji Ishikawa, Jiho Chu, Daniel Stone, Tvrtko Ursulin,
	Christoph Hellwig, Kevin Hilman, Jagan Teki, Jacek Lawrynowicz,
	Maciej Kwapulinski

On 10/24/2022 1:52 AM, Dave Airlie wrote:
> On Mon, 24 Oct 2022 at 17:23, Oded Gabbay <ogabbay@kernel.org> wrote:
>>
>> On Sun, Oct 23, 2022 at 3:40 PM Greg Kroah-Hartman
>> <gregkh@linuxfoundation.org> wrote:
>>>
>>> On Sun, Oct 23, 2022 at 12:46:21AM +0300, Oded Gabbay wrote:
>>>> The accelerator devices will be exposed to the user space with a new,
>>>> dedicated major number - 261.
>>>>
>>>> The drm core registers the new major number as a char device and create
>>>> corresponding sysfs and debugfs root entries, same as for the drm major.
>>>>
>>>> In case CONFIG_ACCEL is not selected, this code is not compiled in.
>>>>
>>>> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
>>>> ---
>>>>   Documentation/admin-guide/devices.txt |  5 +++
>>>>   drivers/gpu/drm/drm_drv.c             | 45 +++++++++++++++++++++++
>>>>   drivers/gpu/drm/drm_internal.h        |  3 ++
>>>>   drivers/gpu/drm/drm_sysfs.c           | 52 +++++++++++++++++++++++++++
>>>>   include/drm/drm_ioctl.h               |  1 +
>>>>   5 files changed, 106 insertions(+)
>>>>
>>>> diff --git a/Documentation/admin-guide/devices.txt b/Documentation/admin-guide/devices.txt
>>>> index 9764d6edb189..06c525e01ea5 100644
>>>> --- a/Documentation/admin-guide/devices.txt
>>>> +++ b/Documentation/admin-guide/devices.txt
>>>> @@ -3080,6 +3080,11 @@
>>>>                  ...
>>>>                  255 = /dev/osd255     256th OSD Device
>>>>
>>>> + 261 char    Compute Acceleration Devices
>>>> +               0 = /dev/accel/accel0 First acceleration device
>>>> +               1 = /dev/accel/accel1 Second acceleration device
>>>> +                 ...
>>>> +
>>>>    384-511 char        RESERVED FOR DYNAMIC ASSIGNMENT
>>>>                Character devices that request a dynamic allocation of major
>>>>                number will take numbers starting from 511 and downward,
>>>> diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
>>>> index 8214a0b1ab7f..b58ffb1433d6 100644
>>>> --- a/drivers/gpu/drm/drm_drv.c
>>>> +++ b/drivers/gpu/drm/drm_drv.c
>>>> @@ -67,6 +67,10 @@ static bool drm_core_init_complete;
>>>>
>>>>   static struct dentry *drm_debugfs_root;
>>>>
>>>> +#ifdef CONFIG_ACCEL
>>>> +static struct dentry *accel_debugfs_root;
>>>> +#endif
>>>> +
>>>>   DEFINE_STATIC_SRCU(drm_unplug_srcu);
>>>>
>>>>   /*
>>>> @@ -1031,9 +1035,19 @@ static const struct file_operations drm_stub_fops = {
>>>>        .llseek = noop_llseek,
>>>>   };
>>>>
>>>> +static void accel_core_exit(void)
>>>> +{
>>>> +#ifdef CONFIG_ACCEL
>>>> +     unregister_chrdev(ACCEL_MAJOR, "accel");
>>>> +     debugfs_remove(accel_debugfs_root);
>>>> +     accel_sysfs_destroy();
>>>> +#endif
>>>> +}
>>>
>>> Why is all of this in drm_drv.c?
>>>
>>> Why not put it in drm/accel/accel.c or something like that?  Then put
>>> the proper stuff into a .h file and then you have no #ifdef in the .c
>>> files.
>> I thought about that, adding an accel.c in drivers/accel/ and putting
>> this code there.
>> Eventually I thought that for two functions it's not worth it, but I
>> guess that in addition to the reason you gave, one can argue that
>> there will probably be more code in that file anyway, so why not open
>> it now.
>> I'll change this if no one else thinks otherwise.
> 
> Seems like a good idea to start doing it now, might make things easier
> to keep separated.

I agree.  I was a bit confused going through this patch, and envisioning 
how an accel driver would use the interface.  I think an 
accel_internal.h would be clearer.

-Jeff

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC PATCH 0/3] new subsystem for compute accelerator devices
  2022-10-24 14:41   ` Oded Gabbay
@ 2022-10-24 15:10     ` Alex Deucher
  2022-10-26  6:10       ` Oded Gabbay
  0 siblings, 1 reply; 35+ messages in thread
From: Alex Deucher @ 2022-10-24 15:10 UTC (permalink / raw)
  To: Oded Gabbay
  Cc: David Airlie, Daniel Vetter, Arnd Bergmann, Greg Kroah-Hartman,
	linux-kernel, dri-devel, Jason Gunthorpe, John Hubbard,
	Alex Deucher, Tvrtko Ursulin, Jacek Lawrynowicz, Jeffrey Hugo,
	Jiho Chu, Christoph Hellwig, Thomas Zimmermann, Kevin Hilman,
	Yuji Ishikawa, Maciej Kwapulinski, Jagan Teki

On Mon, Oct 24, 2022 at 10:41 AM Oded Gabbay <ogabbay@kernel.org> wrote:
>
> On Mon, Oct 24, 2022 at 4:55 PM Alex Deucher <alexdeucher@gmail.com> wrote:
> >
> > On Sat, Oct 22, 2022 at 5:46 PM Oded Gabbay <ogabbay@kernel.org> wrote:
> > >
> > > In the last couple of months we had a discussion [1] about creating a new
> > > subsystem for compute accelerator devices in the kernel.
> > >
> > > After an analysis that was done by DRM maintainers and myself, and following
> > > a BOF session at the Linux Plumbers conference a few weeks ago [2], we
> > > decided to create a new subsystem that will use the DRM subsystem's code and
> > > functionality. i.e. the accel core code will be part of the DRM subsystem.
> > >
> > > This will allow us to leverage the extensive DRM code-base and
> > > collaborate with DRM developers that have experience with this type of
> > > devices. In addition, new features that will be added for the accelerator
> > > drivers can be of use to GPU drivers as well (e.g. RAS).
> > >
> > > As agreed in the BOF session, the accelerator devices will be exposed to
> > > user-space with a new, dedicated device char files and a dedicated major
> > > number (261), to clearly separate them from graphic cards and the graphic
> > > user-space s/w stack. Furthermore, the drivers will be located in a separate
> > > place in the kernel tree (drivers/accel/).
> > >
> > > This series of patches is the first step in this direction as it adds the
> > > necessary infrastructure for accelerator devices to DRM. The new devices will
> > > be exposed with the following convention:
> > >
> > > device char files - /dev/accel/accel*
> > > sysfs             - /sys/class/accel/accel*/
> > > debugfs           - /sys/kernel/debug/accel/accel*/
> > >
> > > I tried to reuse the existing DRM code as much as possible, while keeping it
> > > readable and maintainable.
> >
> > Wouldn't something like this:
> > https://patchwork.freedesktop.org/series/109575/
> > Be simpler and provide better backwards compatibility for existing
> > non-gfx devices in the drm subsystem as well as newer devices?
>
> As Greg said, see the summary. The consensus in the LPC session was
> that we need to clearly separate accel devices from existing gpu
> devices (whether they use primary and/or render nodes). That is the
> main guideline according to which I wrote the patches. I don't think I
> want to change this decision.
>
> Also, there was never any intention to provide backward compatibility
> for existing non-gfx devices. Why would we want that ? We are mainly
> talking about drivers that are currently trying to get upstream, and
> the habana driver.

If someone already has a non-gfx device which uses the drm subsystem,
should they be converted to the new accel stuff?  What about new
devices that utilize the same driver?  SHould they use accel or
continue to use drm?  For the sake of the rest of the stack drm would
make more sense, but if accel grows a bunch of stuff that all accel
drivers should be using what do we do?  Also using render nodes also
makes the devices compatible with all of the existing user space tools
that use the existing drm device nodes like libdrm, etc.  I'm failing
to see what advantage accel brings other than requiring userspace to
support two very similar device nodes.

Alex

>
> Oded
> >
> > Alex
> >
> > >
> > > One thing that is missing from this series is defining a namespace for the
> > > new accel subsystem, while I'll add in the next iteration of this patch-set,
> > > after I will receive feedback from the community.
> > >
> > > As for drivers, once this series will be accepted (after adding the namespace),
> > > I will start working on migrating the habanalabs driver to the new accel
> > > subsystem. I have talked about it with Dave and we agreed that it will be
> > > a good start to simply move the driver as-is with minimal changes, and then
> > > start working on the driver's individual features that will be either added
> > > to the accel core code (with or without changes), or will be removed and
> > > instead the driver will use existing DRM code.
> > >
> > > In addition, I know of at least 3 or 4 drivers that were submitted for review
> > > and are good candidates to be included in this new subsystem, instead of being
> > > a drm render node driver or a misc driver.
> > >
> > > [1] https://lkml.org/lkml/2022/7/31/83
> > > [2] https://airlied.blogspot.com/2022/09/accelerators-bof-outcomes-summary.html
> > >
> > > Thanks,
> > > Oded
> > >
> > > Oded Gabbay (3):
> > >   drivers/accel: add new kconfig and update MAINTAINERS
> > >   drm: define new accel major and register it
> > >   drm: add dedicated minor for accelerator devices
> > >
> > >  Documentation/admin-guide/devices.txt |   5 +
> > >  MAINTAINERS                           |   8 +
> > >  drivers/Kconfig                       |   2 +
> > >  drivers/accel/Kconfig                 |  24 +++
> > >  drivers/gpu/drm/drm_drv.c             | 214 +++++++++++++++++++++-----
> > >  drivers/gpu/drm/drm_file.c            |  69 ++++++---
> > >  drivers/gpu/drm/drm_internal.h        |   5 +-
> > >  drivers/gpu/drm/drm_sysfs.c           |  81 +++++++++-
> > >  include/drm/drm_device.h              |   3 +
> > >  include/drm/drm_drv.h                 |   8 +
> > >  include/drm/drm_file.h                |  21 ++-
> > >  include/drm/drm_ioctl.h               |   1 +
> > >  12 files changed, 374 insertions(+), 67 deletions(-)
> > >  create mode 100644 drivers/accel/Kconfig
> > >
> > > --
> > > 2.34.1
> > >

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC PATCH 3/3] drm: add dedicated minor for accelerator devices
  2022-10-22 21:46 ` [RFC PATCH 3/3] drm: add dedicated minor for accelerator devices Oded Gabbay
  2022-10-23 12:41   ` Greg Kroah-Hartman
@ 2022-10-24 15:21   ` Jeffrey Hugo
  2022-10-24 17:43     ` Oded Gabbay
  2022-10-25  6:43   ` Jiho Chu
  2 siblings, 1 reply; 35+ messages in thread
From: Jeffrey Hugo @ 2022-10-24 15:21 UTC (permalink / raw)
  To: Oded Gabbay, David Airlie, Daniel Vetter, Arnd Bergmann,
	Greg Kroah-Hartman, linux-kernel, dri-devel, Jason Gunthorpe,
	John Hubbard, Alex Deucher
  Cc: Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	Yuji Ishikawa, Jiho Chu, Daniel Stone, Tvrtko Ursulin,
	Christoph Hellwig, Kevin Hilman, Jagan Teki, Jacek Lawrynowicz,
	Maciej Kwapulinski

On 10/22/2022 3:46 PM, Oded Gabbay wrote:
> The accelerator devices are exposed to user-space using a dedicated
> major. In addition, they are represented in /dev with new, dedicated
> device char names: /dev/accel/accel*. This is done to make sure any
> user-space software that tries to open a graphic card won't open
> the accelerator device by mistake.
> 
> The above implies that the minor numbering should be separated from
> the rest of the drm devices. However, to avoid code duplication, we
> want the drm_minor structure to be able to represent the accelerator
> device.
> 
> To achieve this, we add a new drm_minor* to drm_device that represents
> the accelerator device. This pointer is initialized for drivers that
> declare they handle compute accelerator, using a new driver feature
> flag called DRIVER_COMPUTE_ACCEL. It is important to note that this
> driver feature is mutually exclusive with DRIVER_RENDER. Devices that
> want to expose both graphics and compute device char files should be
> handled by two drivers that are connected using the auxiliary bus
> framework.
> 
> In addition, we define a different idr to handle the accelerators
> minors. This is done to make the minor's index be identical to the
> device index in /dev/. In most places, this is hidden inside the drm
> core functions except when calling drm_minor_acquire(), where I had to
> add an extra parameter to specify the idr to use (because the
> accelerators minors index and the drm primary minor index both begin
> at 0).
> 
> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
> ---
>   drivers/gpu/drm/drm_drv.c      | 171 +++++++++++++++++++++++++--------
>   drivers/gpu/drm/drm_file.c     |  69 +++++++++----
>   drivers/gpu/drm/drm_internal.h |   2 +-
>   drivers/gpu/drm/drm_sysfs.c    |  29 ++++--
>   include/drm/drm_device.h       |   3 +
>   include/drm/drm_drv.h          |   8 ++
>   include/drm/drm_file.h         |  21 +++-
>   7 files changed, 235 insertions(+), 68 deletions(-)

Can we please add something to Documentation?  I know this leverages DRM 
a lot, but I believe that a new subsystem should not be introduced 
without documentation.  A lot of the info in the commit message is very 
good, but should not be buried in the git log.

Besides, imagine this has been in mainline for N years, and someone 
completely new to the kernel wants to write an accel driver.  They 
should be able to get started with something from Documentation that 
at-least gives that person some insight into what to grep the code for.

> 
> diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
> index b58ffb1433d6..c13701a8d4be 100644
> --- a/drivers/gpu/drm/drm_drv.c
> +++ b/drivers/gpu/drm/drm_drv.c
> @@ -56,6 +56,9 @@ MODULE_LICENSE("GPL and additional rights");
>   static DEFINE_SPINLOCK(drm_minor_lock);
>   static struct idr drm_minors_idr;
>   
> +static DEFINE_SPINLOCK(accel_minor_lock);
> +static struct idr accel_minors_idr;

IDR is deprecated.  XArray is the preferred mechanism.
Yes, there already is IDR here, but I believe we should not be adding 
new uses.  Maybe at some point, the current IDR will be converted.  Also 
with XArray, I think you don't need the spinlock since XArray has 
internal locking already.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC PATCH 3/3] drm: add dedicated minor for accelerator devices
  2022-10-24 15:21   ` Jeffrey Hugo
@ 2022-10-24 17:43     ` Oded Gabbay
  2022-10-25 13:26       ` Michał Winiarski
  0 siblings, 1 reply; 35+ messages in thread
From: Oded Gabbay @ 2022-10-24 17:43 UTC (permalink / raw)
  To: Jeffrey Hugo
  Cc: David Airlie, Daniel Vetter, Arnd Bergmann, Greg Kroah-Hartman,
	linux-kernel, dri-devel, Jason Gunthorpe, John Hubbard,
	Alex Deucher, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, Yuji Ishikawa, Jiho Chu, Daniel Stone,
	Tvrtko Ursulin, Christoph Hellwig, Kevin Hilman, Jagan Teki,
	Jacek Lawrynowicz, Maciej Kwapulinski

On Mon, Oct 24, 2022 at 6:21 PM Jeffrey Hugo <quic_jhugo@quicinc.com> wrote:
>
> On 10/22/2022 3:46 PM, Oded Gabbay wrote:
> > The accelerator devices are exposed to user-space using a dedicated
> > major. In addition, they are represented in /dev with new, dedicated
> > device char names: /dev/accel/accel*. This is done to make sure any
> > user-space software that tries to open a graphic card won't open
> > the accelerator device by mistake.
> >
> > The above implies that the minor numbering should be separated from
> > the rest of the drm devices. However, to avoid code duplication, we
> > want the drm_minor structure to be able to represent the accelerator
> > device.
> >
> > To achieve this, we add a new drm_minor* to drm_device that represents
> > the accelerator device. This pointer is initialized for drivers that
> > declare they handle compute accelerator, using a new driver feature
> > flag called DRIVER_COMPUTE_ACCEL. It is important to note that this
> > driver feature is mutually exclusive with DRIVER_RENDER. Devices that
> > want to expose both graphics and compute device char files should be
> > handled by two drivers that are connected using the auxiliary bus
> > framework.
> >
> > In addition, we define a different idr to handle the accelerators
> > minors. This is done to make the minor's index be identical to the
> > device index in /dev/. In most places, this is hidden inside the drm
> > core functions except when calling drm_minor_acquire(), where I had to
> > add an extra parameter to specify the idr to use (because the
> > accelerators minors index and the drm primary minor index both begin
> > at 0).
> >
> > Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
> > ---
> >   drivers/gpu/drm/drm_drv.c      | 171 +++++++++++++++++++++++++--------
> >   drivers/gpu/drm/drm_file.c     |  69 +++++++++----
> >   drivers/gpu/drm/drm_internal.h |   2 +-
> >   drivers/gpu/drm/drm_sysfs.c    |  29 ++++--
> >   include/drm/drm_device.h       |   3 +
> >   include/drm/drm_drv.h          |   8 ++
> >   include/drm/drm_file.h         |  21 +++-
> >   7 files changed, 235 insertions(+), 68 deletions(-)
>
> Can we please add something to Documentation?  I know this leverages DRM
> a lot, but I believe that a new subsystem should not be introduced
> without documentation.  A lot of the info in the commit message is very
> good, but should not be buried in the git log.
>
> Besides, imagine this has been in mainline for N years, and someone
> completely new to the kernel wants to write an accel driver.  They
> should be able to get started with something from Documentation that
> at-least gives that person some insight into what to grep the code for.
Agreed. The only reason I haven't done it at this stage was because I
wanted to get an initial reaction to the code itself, see if the
direction is accepted.
I didn't want to write documentation and then completely re-write it.
So I will do it for the next patch-set, once I collect everyone's
feedback and I see there is a majority agreement.
>
> >
> > diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
> > index b58ffb1433d6..c13701a8d4be 100644
> > --- a/drivers/gpu/drm/drm_drv.c
> > +++ b/drivers/gpu/drm/drm_drv.c
> > @@ -56,6 +56,9 @@ MODULE_LICENSE("GPL and additional rights");
> >   static DEFINE_SPINLOCK(drm_minor_lock);
> >   static struct idr drm_minors_idr;
> >
> > +static DEFINE_SPINLOCK(accel_minor_lock);
> > +static struct idr accel_minors_idr;
>
> IDR is deprecated.  XArray is the preferred mechanism.
> Yes, there already is IDR here, but I believe we should not be adding
> new uses.  Maybe at some point, the current IDR will be converted.  Also
> with XArray, I think you don't need the spinlock since XArray has
> internal locking already.
ok, I wasn't aware. I don't have any problem replacing the idr to xarray.

Thanks,
Oded

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC PATCH 0/3] new subsystem for compute accelerator devices
  2022-10-24 12:43   ` Oded Gabbay
@ 2022-10-25  2:21     ` John Hubbard
  2022-10-25  2:27       ` Dave Airlie
  0 siblings, 1 reply; 35+ messages in thread
From: John Hubbard @ 2022-10-25  2:21 UTC (permalink / raw)
  To: Oded Gabbay, Thomas Zimmermann
  Cc: David Airlie, Daniel Vetter, Arnd Bergmann, Greg Kroah-Hartman,
	linux-kernel, dri-devel, Jason Gunthorpe, Alex Deucher,
	Maarten Lankhorst, Maxime Ripard, Yuji Ishikawa, Jiho Chu,
	Daniel Stone, Tvrtko Ursulin, Jeffrey Hugo, Christoph Hellwig,
	Kevin Hilman, Jagan Teki, Jacek Lawrynowicz, Maciej Kwapulinski

On 10/24/22 05:43, Oded Gabbay wrote:

Hi Oded,

The patches make sense to me. I'm still just reading through and looking
for minor issues, but at a high level it seems to match what the LPC
discussions pointed to.

>> What's your opinion on the long-term prospect of DRM vs accel? I assume
>> that over time, DRM helpers will move into accel and some DRM drivers
>> will start depending on accel?
> I don't think that is what I had in mind.
> What I had in mind is that accel helpers are only relevant for accel
> drivers, and any code that might also be relevant for DRM drivers will
> be placed in DRM core code. e.g. GEM enhancements, RAS netlink

Yes. That is how I understood it ("it" being both the LPC discussions,
and this patchset) as well:

     * accel-only code goes in drivers/accel, thus allowing for
       smaller, simpler drivers (as compared to full drm) for that case.

     * graphics and display code still goes in drivers/gpu/drm, because
       it is much too hard to rename or move that directory.

     * code common to both also goes in drivers/gpu/drm.

Looking ahead a bit more:

For full-featured GPUs that do both Graphics and Compute, I expect
that a *lot* of the code will end up in drivers/gpu/drm. Because so
much of setting up for Compute is also really just setting up for
Graphics--that's how it evolved, after all!

And as things are structured now, it looks like those full featured
GPU stacks will also need an aux bus (which I only just now learned
about, but it looks quite helpful here). And also, user space will
need to open both /dev/dri/* and /dev/accel/* nodes, if it needs
access to anything live objects that drivers/accel owns.


thanks,
-- 
John Hubbard
NVIDIA

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC PATCH 0/3] new subsystem for compute accelerator devices
  2022-10-25  2:21     ` John Hubbard
@ 2022-10-25  2:27       ` Dave Airlie
  2022-10-25 11:15         ` Jason Gunthorpe
  0 siblings, 1 reply; 35+ messages in thread
From: Dave Airlie @ 2022-10-25  2:27 UTC (permalink / raw)
  To: John Hubbard
  Cc: Oded Gabbay, Thomas Zimmermann, Daniel Vetter, Arnd Bergmann,
	Greg Kroah-Hartman, linux-kernel, dri-devel, Jason Gunthorpe,
	Alex Deucher, Maarten Lankhorst, Maxime Ripard, Yuji Ishikawa,
	Jiho Chu, Daniel Stone, Tvrtko Ursulin, Jeffrey Hugo,
	Christoph Hellwig, Kevin Hilman, Jagan Teki, Jacek Lawrynowicz,
	Maciej Kwapulinski

On Tue, 25 Oct 2022 at 12:21, John Hubbard <jhubbard@nvidia.com> wrote:
>
> On 10/24/22 05:43, Oded Gabbay wrote:
>
> Hi Oded,
>
> The patches make sense to me. I'm still just reading through and looking
> for minor issues, but at a high level it seems to match what the LPC
> discussions pointed to.
>
> >> What's your opinion on the long-term prospect of DRM vs accel? I assume
> >> that over time, DRM helpers will move into accel and some DRM drivers
> >> will start depending on accel?
> > I don't think that is what I had in mind.
> > What I had in mind is that accel helpers are only relevant for accel
> > drivers, and any code that might also be relevant for DRM drivers will
> > be placed in DRM core code. e.g. GEM enhancements, RAS netlink
>
> Yes. That is how I understood it ("it" being both the LPC discussions,
> and this patchset) as well:
>
>      * accel-only code goes in drivers/accel, thus allowing for
>        smaller, simpler drivers (as compared to full drm) for that case.
>
>      * graphics and display code still goes in drivers/gpu/drm, because
>        it is much too hard to rename or move that directory.
>
>      * code common to both also goes in drivers/gpu/drm.
>
> Looking ahead a bit more:
>
> For full-featured GPUs that do both Graphics and Compute, I expect
> that a *lot* of the code will end up in drivers/gpu/drm. Because so
> much of setting up for Compute is also really just setting up for
> Graphics--that's how it evolved, after all!
>
> And as things are structured now, it looks like those full featured
> GPU stacks will also need an aux bus (which I only just now learned
> about, but it looks quite helpful here). And also, user space will
> need to open both /dev/dri/* and /dev/accel/* nodes, if it needs
> access to anything live objects that drivers/accel owns.
>

I actually don't know if we really need to worry about compute nodes
for fully featured devices.

The userspace for those is normally bespoke like ROCm, which uses
amdkfd, and amdkfd doesn't operate like most device files from what I
know, so I'm not sure we'd want it to operate as an accel device.
Or the userspace is OpenCL like where we have stacks that already bind
using the drm interfaces so again not sure if there's any value there.

For anything which already has a userspace and stuff I don't think
this adds any value, for nvidia type cards I doubt there is much use
in using an accel node for the GPU related things at all.

Dave.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC PATCH 3/3] drm: add dedicated minor for accelerator devices
  2022-10-22 21:46 ` [RFC PATCH 3/3] drm: add dedicated minor for accelerator devices Oded Gabbay
  2022-10-23 12:41   ` Greg Kroah-Hartman
  2022-10-24 15:21   ` Jeffrey Hugo
@ 2022-10-25  6:43   ` Jiho Chu
  2022-10-26  6:38     ` Oded Gabbay
  2 siblings, 1 reply; 35+ messages in thread
From: Jiho Chu @ 2022-10-25  6:43 UTC (permalink / raw)
  To: Oded Gabbay
  Cc: David Airlie, Daniel Vetter, Arnd Bergmann, Greg Kroah-Hartman,
	linux-kernel, dri-devel, Jason Gunthorpe, John Hubbard,
	Alex Deucher, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, Yuji Ishikawa, Daniel Stone, Tvrtko Ursulin,
	Jeffrey Hugo, Christoph Hellwig, Kevin Hilman, Jagan Teki,
	Jacek Lawrynowicz, Maciej Kwapulinski


On Sun, 23 Oct 2022 00:46:22 +0300
Oded Gabbay <ogabbay@kernel.org> wrote:

> diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
> index b58ffb1433d6..c13701a8d4be 100644
> --- a/drivers/gpu/drm/drm_drv.c
> +++ b/drivers/gpu/drm/drm_drv.c
> @@ -56,6 +56,9 @@ MODULE_LICENSE("GPL and additional rights");
>  static DEFINE_SPINLOCK(drm_minor_lock);
>  static struct idr drm_minors_idr;
>  
> +static DEFINE_SPINLOCK(accel_minor_lock);
> +static struct idr accel_minors_idr;
> +
>  /*
>   * If the drm core fails to init for whatever reason,
>   * we should prevent any drivers from registering with it.
> @@ -94,6 +97,8 @@ static struct drm_minor **drm_minor_get_slot(struct drm_device *dev,
>  		return &dev->primary;
>  	case DRM_MINOR_RENDER:
>  		return &dev->render;
> +	case DRM_MINOR_ACCEL:
> +		return &dev->accel;
>  	default:
>  		BUG();
>  	}
> @@ -108,9 +113,15 @@ static void drm_minor_alloc_release(struct drm_device *dev, void *data)
>  
>  	put_device(minor->kdev);
>  
> -	spin_lock_irqsave(&drm_minor_lock, flags);
> -	idr_remove(&drm_minors_idr, minor->index);
> -	spin_unlock_irqrestore(&drm_minor_lock, flags);
> +	if (minor->type == DRM_MINOR_ACCEL) {
> +		spin_lock_irqsave(&accel_minor_lock, flags);
> +		idr_remove(&accel_minors_idr, minor->index);
> +		spin_unlock_irqrestore(&accel_minor_lock, flags);
> +	} else {
> +		spin_lock_irqsave(&drm_minor_lock, flags);
> +		idr_remove(&drm_minors_idr, minor->index);
> +		spin_unlock_irqrestore(&drm_minor_lock, flags);
> +	}
>  }
>  
>  static int drm_minor_alloc(struct drm_device *dev, unsigned int type)
> @@ -127,13 +138,23 @@ static int drm_minor_alloc(struct drm_device *dev, unsigned int type)
>  	minor->dev = dev;
>  
>  	idr_preload(GFP_KERNEL);
> -	spin_lock_irqsave(&drm_minor_lock, flags);
> -	r = idr_alloc(&drm_minors_idr,
> -		      NULL,
> -		      64 * type,
> -		      64 * (type + 1),
> -		      GFP_NOWAIT);
> -	spin_unlock_irqrestore(&drm_minor_lock, flags);
> +	if (type == DRM_MINOR_ACCEL) {
> +		spin_lock_irqsave(&accel_minor_lock, flags);
> +		r = idr_alloc(&accel_minors_idr,
> +			NULL,
> +			64 * (type - DRM_MINOR_ACCEL),
> +			64 * (type - DRM_MINOR_ACCEL + 1),
> +			GFP_NOWAIT);
> +		spin_unlock_irqrestore(&accel_minor_lock, flags);
> +	} else {
> +		spin_lock_irqsave(&drm_minor_lock, flags);
> +		r = idr_alloc(&drm_minors_idr,
> +			NULL,
> +			64 * type,
> +			64 * (type + 1),
> +			GFP_NOWAIT);
> +		spin_unlock_irqrestore(&drm_minor_lock, flags);
> +	}

Hi,
There are many functions which checks drm type and decides its behaviors. It's good to
re-use exiting codes, but accel devices use totally different major/minor, and so it needs to be moved to 
/drvier/accel/ (maybe later..). How about seperating functions for alloc/release minor (accel_minor_alloc..)? 
also, for others which have drm type related codes.




> @@ -607,6 +652,14 @@ static int drm_dev_init(struct drm_device *dev,
>  	/* no per-device feature limits by default */
>  	dev->driver_features = ~0u;
>  
> +	if (drm_core_check_feature(dev, DRIVER_COMPUTE_ACCEL) &&
> +				(drm_core_check_feature(dev, DRIVER_RENDER) ||
> +				drm_core_check_feature(dev, DRIVER_MODESET))) {
> +
> +		DRM_ERROR("DRM driver can't be both a compute acceleration and graphics driver\n");
> +		return -EINVAL;
> +	}
> +

It's fine for the device only for acceleration, but can't graphic devices have acceleration feature?


Thanks,
Jiho Chu

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC PATCH 0/3] new subsystem for compute accelerator devices
  2022-10-25  2:27       ` Dave Airlie
@ 2022-10-25 11:15         ` Jason Gunthorpe
  2022-10-25 14:21           ` Alex Deucher
  0 siblings, 1 reply; 35+ messages in thread
From: Jason Gunthorpe @ 2022-10-25 11:15 UTC (permalink / raw)
  To: Dave Airlie
  Cc: John Hubbard, Oded Gabbay, Thomas Zimmermann, Daniel Vetter,
	Arnd Bergmann, Greg Kroah-Hartman, linux-kernel, dri-devel,
	Alex Deucher, Maarten Lankhorst, Maxime Ripard, Yuji Ishikawa,
	Jiho Chu, Daniel Stone, Tvrtko Ursulin, Jeffrey Hugo,
	Christoph Hellwig, Kevin Hilman, Jagan Teki, Jacek Lawrynowicz,
	Maciej Kwapulinski

On Tue, Oct 25, 2022 at 12:27:11PM +1000, Dave Airlie wrote:

> The userspace for those is normally bespoke like ROCm, which uses
> amdkfd, and amdkfd doesn't operate like most device files from what I
> know, so I'm not sure we'd want it to operate as an accel device.

I intensely dislike this direction that drivers will create their own
char devs buried inside their device driver with no support or
supervision.

We've been here before with RDMA and it is just a complete mess.

Whatever special non-drm stuff amdkfd need to do should be supported
through the new subsystem, in a proper maintainable way.

Jason

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC PATCH 3/3] drm: add dedicated minor for accelerator devices
  2022-10-24 17:43     ` Oded Gabbay
@ 2022-10-25 13:26       ` Michał Winiarski
  2022-10-26  6:38         ` Oded Gabbay
  0 siblings, 1 reply; 35+ messages in thread
From: Michał Winiarski @ 2022-10-25 13:26 UTC (permalink / raw)
  To: Oded Gabbay
  Cc: Jeffrey Hugo, David Airlie, Daniel Vetter, Arnd Bergmann,
	Greg Kroah-Hartman, linux-kernel, dri-devel, Jason Gunthorpe,
	John Hubbard, Alex Deucher, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, Yuji Ishikawa, Jiho Chu, Daniel Stone,
	Tvrtko Ursulin, Christoph Hellwig, Kevin Hilman, Jagan Teki,
	Jacek Lawrynowicz, Maciej Kwapulinski

On Mon, Oct 24, 2022 at 08:43:58PM +0300, Oded Gabbay wrote:
> On Mon, Oct 24, 2022 at 6:21 PM Jeffrey Hugo <quic_jhugo@quicinc.com> wrote:
> >
> > On 10/22/2022 3:46 PM, Oded Gabbay wrote:
> > > The accelerator devices are exposed to user-space using a dedicated
> > > major. In addition, they are represented in /dev with new, dedicated
> > > device char names: /dev/accel/accel*. This is done to make sure any
> > > user-space software that tries to open a graphic card won't open
> > > the accelerator device by mistake.
> > >
> > > The above implies that the minor numbering should be separated from
> > > the rest of the drm devices. However, to avoid code duplication, we
> > > want the drm_minor structure to be able to represent the accelerator
> > > device.
> > >
> > > To achieve this, we add a new drm_minor* to drm_device that represents
> > > the accelerator device. This pointer is initialized for drivers that
> > > declare they handle compute accelerator, using a new driver feature
> > > flag called DRIVER_COMPUTE_ACCEL. It is important to note that this
> > > driver feature is mutually exclusive with DRIVER_RENDER. Devices that
> > > want to expose both graphics and compute device char files should be
> > > handled by two drivers that are connected using the auxiliary bus
> > > framework.
> > >
> > > In addition, we define a different idr to handle the accelerators
> > > minors. This is done to make the minor's index be identical to the
> > > device index in /dev/. In most places, this is hidden inside the drm
> > > core functions except when calling drm_minor_acquire(), where I had to
> > > add an extra parameter to specify the idr to use (because the
> > > accelerators minors index and the drm primary minor index both begin
> > > at 0).
> > >
> > > Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
> > > ---
> > >   drivers/gpu/drm/drm_drv.c      | 171 +++++++++++++++++++++++++--------
> > >   drivers/gpu/drm/drm_file.c     |  69 +++++++++----
> > >   drivers/gpu/drm/drm_internal.h |   2 +-
> > >   drivers/gpu/drm/drm_sysfs.c    |  29 ++++--
> > >   include/drm/drm_device.h       |   3 +
> > >   include/drm/drm_drv.h          |   8 ++
> > >   include/drm/drm_file.h         |  21 +++-
> > >   7 files changed, 235 insertions(+), 68 deletions(-)
> >
> > Can we please add something to Documentation?  I know this leverages DRM
> > a lot, but I believe that a new subsystem should not be introduced
> > without documentation.  A lot of the info in the commit message is very
> > good, but should not be buried in the git log.
> >
> > Besides, imagine this has been in mainline for N years, and someone
> > completely new to the kernel wants to write an accel driver.  They
> > should be able to get started with something from Documentation that
> > at-least gives that person some insight into what to grep the code for.
> Agreed. The only reason I haven't done it at this stage was because I
> wanted to get an initial reaction to the code itself, see if the
> direction is accepted.
> I didn't want to write documentation and then completely re-write it.
> So I will do it for the next patch-set, once I collect everyone's
> feedback and I see there is a majority agreement.
> >
> > >
> > > diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
> > > index b58ffb1433d6..c13701a8d4be 100644
> > > --- a/drivers/gpu/drm/drm_drv.c
> > > +++ b/drivers/gpu/drm/drm_drv.c
> > > @@ -56,6 +56,9 @@ MODULE_LICENSE("GPL and additional rights");
> > >   static DEFINE_SPINLOCK(drm_minor_lock);
> > >   static struct idr drm_minors_idr;
> > >
> > > +static DEFINE_SPINLOCK(accel_minor_lock);
> > > +static struct idr accel_minors_idr;
> >
> > IDR is deprecated.  XArray is the preferred mechanism.
> > Yes, there already is IDR here, but I believe we should not be adding
> > new uses.  Maybe at some point, the current IDR will be converted.  Also
> > with XArray, I think you don't need the spinlock since XArray has
> > internal locking already.
> ok, I wasn't aware. I don't have any problem replacing the idr to xarray.

The conversion is sitting on the mailinglist for a while now
(unfortunately, without much interest).
Perhaps you could help with reviewing it?
https://lore.kernel.org/dri-devel/20220911211443.581481-2-michal.winiarski@intel.com/

-Michał

> 
> Thanks,
> Oded
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC PATCH 0/3] new subsystem for compute accelerator devices
  2022-10-25 11:15         ` Jason Gunthorpe
@ 2022-10-25 14:21           ` Alex Deucher
  2022-10-25 14:34             ` Jason Gunthorpe
  0 siblings, 1 reply; 35+ messages in thread
From: Alex Deucher @ 2022-10-25 14:21 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Dave Airlie, Tvrtko Ursulin, Jiho Chu, Jeffrey Hugo,
	Thomas Zimmermann, Arnd Bergmann, John Hubbard, Oded Gabbay,
	linux-kernel, dri-devel, Christoph Hellwig, Jacek Lawrynowicz,
	Greg Kroah-Hartman, Alex Deucher, Yuji Ishikawa, Kevin Hilman,
	Maciej Kwapulinski, Jagan Teki

On Tue, Oct 25, 2022 at 7:15 AM Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> On Tue, Oct 25, 2022 at 12:27:11PM +1000, Dave Airlie wrote:
>
> > The userspace for those is normally bespoke like ROCm, which uses
> > amdkfd, and amdkfd doesn't operate like most device files from what I
> > know, so I'm not sure we'd want it to operate as an accel device.
>
> I intensely dislike this direction that drivers will create their own
> char devs buried inside their device driver with no support or
> supervision.
>
> We've been here before with RDMA and it is just a complete mess.
>
> Whatever special non-drm stuff amdkfd need to do should be supported
> through the new subsystem, in a proper maintainable way.

We plan to eventually move ROCm over the drm interfaces once we get
user mode queues working on non-compute queues which is already in
progress.  ROCm already uses the existing drm nodes and libdrm for a
number of things today (buffer sharing, media and compute command
submission in certain cases, etc.).  I don't see much value in the
accel nodes for AMD products at this time.  Even when we transition,
there are still a bunch of things that we'd need to think about, so
the current kfd node may stick around until we figure out a plan for
those areas.  E.g., the kfd node provides platform level compute
topology information; e.g., the NUMA details for connected GPUs and
CPUs, non-GPU compute node information, cache level topologies, etc.

Alex

>
> Jason

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC PATCH 0/3] new subsystem for compute accelerator devices
  2022-10-25 14:21           ` Alex Deucher
@ 2022-10-25 14:34             ` Jason Gunthorpe
  2022-10-25 14:43               ` Alex Deucher
  0 siblings, 1 reply; 35+ messages in thread
From: Jason Gunthorpe @ 2022-10-25 14:34 UTC (permalink / raw)
  To: Alex Deucher
  Cc: Dave Airlie, Tvrtko Ursulin, Jiho Chu, Jeffrey Hugo,
	Thomas Zimmermann, Arnd Bergmann, John Hubbard, Oded Gabbay,
	linux-kernel, dri-devel, Christoph Hellwig, Jacek Lawrynowicz,
	Greg Kroah-Hartman, Alex Deucher, Yuji Ishikawa, Kevin Hilman,
	Maciej Kwapulinski, Jagan Teki

On Tue, Oct 25, 2022 at 10:21:34AM -0400, Alex Deucher wrote:

> E.g., the kfd node provides platform level compute
> topology information; e.g., the NUMA details for connected GPUs and
> CPUs, non-GPU compute node information, cache level topologies, etc.

See, this is exactly what I'm talking about. What on earth does any of
this have to do with DRM?

We alread have places in the kernel that own and expose these kinds of
information, drivers need to use them. Not re-invent them.

Jason

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC PATCH 0/3] new subsystem for compute accelerator devices
  2022-10-25 14:34             ` Jason Gunthorpe
@ 2022-10-25 14:43               ` Alex Deucher
  0 siblings, 0 replies; 35+ messages in thread
From: Alex Deucher @ 2022-10-25 14:43 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Dave Airlie, Tvrtko Ursulin, Jiho Chu, Jeffrey Hugo,
	Thomas Zimmermann, Arnd Bergmann, John Hubbard, Oded Gabbay,
	linux-kernel, dri-devel, Christoph Hellwig, Jacek Lawrynowicz,
	Greg Kroah-Hartman, Alex Deucher, Yuji Ishikawa, Kevin Hilman,
	Maciej Kwapulinski, Jagan Teki

On Tue, Oct 25, 2022 at 10:34 AM Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> On Tue, Oct 25, 2022 at 10:21:34AM -0400, Alex Deucher wrote:
>
> > E.g., the kfd node provides platform level compute
> > topology information; e.g., the NUMA details for connected GPUs and
> > CPUs, non-GPU compute node information, cache level topologies, etc.
>
> See, this is exactly what I'm talking about. What on earth does any of
> this have to do with DRM?

At least for the GPU information it seems relevant.  What value are
acceleration device cache topologies outside of the subsytsem that
uses them?

>
> We alread have places in the kernel that own and expose these kinds of
> information, drivers need to use them. Not re-invent them.

I don't disagree, but I'm not sure where the best place for these
should be.  Probably a lack of knowledge of where this should actually
live and indifference from the maintainers of those areas since this
use case doesn't match existing ones.

Alex

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC PATCH 0/3] new subsystem for compute accelerator devices
  2022-10-24 15:10     ` Alex Deucher
@ 2022-10-26  6:10       ` Oded Gabbay
  0 siblings, 0 replies; 35+ messages in thread
From: Oded Gabbay @ 2022-10-26  6:10 UTC (permalink / raw)
  To: Alex Deucher
  Cc: David Airlie, Daniel Vetter, Arnd Bergmann, Greg Kroah-Hartman,
	linux-kernel, dri-devel, Jason Gunthorpe, John Hubbard,
	Alex Deucher, Tvrtko Ursulin, Jacek Lawrynowicz, Jeffrey Hugo,
	Jiho Chu, Christoph Hellwig, Thomas Zimmermann, Kevin Hilman,
	Yuji Ishikawa, Maciej Kwapulinski, Jagan Teki

On Mon, Oct 24, 2022 at 6:10 PM Alex Deucher <alexdeucher@gmail.com> wrote:
>
> On Mon, Oct 24, 2022 at 10:41 AM Oded Gabbay <ogabbay@kernel.org> wrote:
> >
> > On Mon, Oct 24, 2022 at 4:55 PM Alex Deucher <alexdeucher@gmail.com> wrote:
> > >
> > > On Sat, Oct 22, 2022 at 5:46 PM Oded Gabbay <ogabbay@kernel.org> wrote:
> > > >
> > > > In the last couple of months we had a discussion [1] about creating a new
> > > > subsystem for compute accelerator devices in the kernel.
> > > >
> > > > After an analysis that was done by DRM maintainers and myself, and following
> > > > a BOF session at the Linux Plumbers conference a few weeks ago [2], we
> > > > decided to create a new subsystem that will use the DRM subsystem's code and
> > > > functionality. i.e. the accel core code will be part of the DRM subsystem.
> > > >
> > > > This will allow us to leverage the extensive DRM code-base and
> > > > collaborate with DRM developers that have experience with this type of
> > > > devices. In addition, new features that will be added for the accelerator
> > > > drivers can be of use to GPU drivers as well (e.g. RAS).
> > > >
> > > > As agreed in the BOF session, the accelerator devices will be exposed to
> > > > user-space with a new, dedicated device char files and a dedicated major
> > > > number (261), to clearly separate them from graphic cards and the graphic
> > > > user-space s/w stack. Furthermore, the drivers will be located in a separate
> > > > place in the kernel tree (drivers/accel/).
> > > >
> > > > This series of patches is the first step in this direction as it adds the
> > > > necessary infrastructure for accelerator devices to DRM. The new devices will
> > > > be exposed with the following convention:
> > > >
> > > > device char files - /dev/accel/accel*
> > > > sysfs             - /sys/class/accel/accel*/
> > > > debugfs           - /sys/kernel/debug/accel/accel*/
> > > >
> > > > I tried to reuse the existing DRM code as much as possible, while keeping it
> > > > readable and maintainable.
> > >
> > > Wouldn't something like this:
> > > https://patchwork.freedesktop.org/series/109575/
> > > Be simpler and provide better backwards compatibility for existing
> > > non-gfx devices in the drm subsystem as well as newer devices?
> >
> > As Greg said, see the summary. The consensus in the LPC session was
> > that we need to clearly separate accel devices from existing gpu
> > devices (whether they use primary and/or render nodes). That is the
> > main guideline according to which I wrote the patches. I don't think I
> > want to change this decision.
> >
> > Also, there was never any intention to provide backward compatibility
> > for existing non-gfx devices. Why would we want that ? We are mainly
> > talking about drivers that are currently trying to get upstream, and
> > the habana driver.
>
> If someone already has a non-gfx device which uses the drm subsystem,
> should they be converted to the new accel stuff?  What about new
> devices that utilize the same driver?  SHould they use accel or
> continue to use drm?
My baseline assumption was that this subsystem is mainly (but not
solely) for new drivers that are now trying to get upstreamed and for
the habana driver.
imo we should not force existing drivers to convert their entire
driver just because we created a new subsystem. If they want to do it,
they are more than welcomed.
But that's only my opinion and other maintainers might think otherwise.

> For the sake of the rest of the stack drm would
> make more sense, but if accel grows a bunch of stuff that all accel
> drivers should be using what do we do?
First of all, as I wrote in another email, I don't think accel core
code will be very large. Otherwise, I probably would have tried to
convince people that the accel stuff should be totally independent of
drm.
You can see I tried to make the code tightly-coupled with drm (too
much according to the reviews) and I did that because I believe most
core code will be common to drm and accel. So I'm not worried about
this aspect.

Second, yes, if for some reason there will be accel-only features that
devices want to use, they will need to create an accel device that
will have this functionality and be connected via auxiliary bus to
their main driver (which can be drm or other subsystem, e.g. nvme).
For example, to utilize Ethernet and RDMA features, habana is now
writing Ethernet and RDMA drivers that will be upstreamed and they
will be connected to the main/compute driver via auxiliary bus.

> Also using render nodes also
> makes the devices compatible with all of the existing user space tools
> that use the existing drm device nodes like libdrm, etc.  I'm failing
> to see what advantage accel brings other than requiring userspace to
> support two very similar device nodes.
This is exactly what we are trying to avoid here :) We want to make
sure that all existing user space tools that use drm devices will NOT
work with the accel devices.
Accel devices are not GPUs. The h/w ip might be a part of a GPU ASIC,
but the specific functionality is not related to the
drm/mesa/x-server/wayland/opengl/vulkan stack.
I don't want them to expose render nodes that Chrome or some other
application tries to open because it thinks it is a GPU...

So it was the majority opinion of the people in LPC that we should
make a clear separation. If there is no separation, then I don't see
the point in doing an accel subsystem, let's just continue to do drm.

Thanks,
Oded

>
> Alex
>
> >
> > Oded
> > >
> > > Alex
> > >
> > > >
> > > > One thing that is missing from this series is defining a namespace for the
> > > > new accel subsystem, while I'll add in the next iteration of this patch-set,
> > > > after I will receive feedback from the community.
> > > >
> > > > As for drivers, once this series will be accepted (after adding the namespace),
> > > > I will start working on migrating the habanalabs driver to the new accel
> > > > subsystem. I have talked about it with Dave and we agreed that it will be
> > > > a good start to simply move the driver as-is with minimal changes, and then
> > > > start working on the driver's individual features that will be either added
> > > > to the accel core code (with or without changes), or will be removed and
> > > > instead the driver will use existing DRM code.
> > > >
> > > > In addition, I know of at least 3 or 4 drivers that were submitted for review
> > > > and are good candidates to be included in this new subsystem, instead of being
> > > > a drm render node driver or a misc driver.
> > > >
> > > > [1] https://lkml.org/lkml/2022/7/31/83
> > > > [2] https://airlied.blogspot.com/2022/09/accelerators-bof-outcomes-summary.html
> > > >
> > > > Thanks,
> > > > Oded
> > > >
> > > > Oded Gabbay (3):
> > > >   drivers/accel: add new kconfig and update MAINTAINERS
> > > >   drm: define new accel major and register it
> > > >   drm: add dedicated minor for accelerator devices
> > > >
> > > >  Documentation/admin-guide/devices.txt |   5 +
> > > >  MAINTAINERS                           |   8 +
> > > >  drivers/Kconfig                       |   2 +
> > > >  drivers/accel/Kconfig                 |  24 +++
> > > >  drivers/gpu/drm/drm_drv.c             | 214 +++++++++++++++++++++-----
> > > >  drivers/gpu/drm/drm_file.c            |  69 ++++++---
> > > >  drivers/gpu/drm/drm_internal.h        |   5 +-
> > > >  drivers/gpu/drm/drm_sysfs.c           |  81 +++++++++-
> > > >  include/drm/drm_device.h              |   3 +
> > > >  include/drm/drm_drv.h                 |   8 +
> > > >  include/drm/drm_file.h                |  21 ++-
> > > >  include/drm/drm_ioctl.h               |   1 +
> > > >  12 files changed, 374 insertions(+), 67 deletions(-)
> > > >  create mode 100644 drivers/accel/Kconfig
> > > >
> > > > --
> > > > 2.34.1
> > > >

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC PATCH 3/3] drm: add dedicated minor for accelerator devices
  2022-10-25  6:43   ` Jiho Chu
@ 2022-10-26  6:38     ` Oded Gabbay
  2022-10-28  6:56       ` Jiho Chu
  0 siblings, 1 reply; 35+ messages in thread
From: Oded Gabbay @ 2022-10-26  6:38 UTC (permalink / raw)
  To: Jiho Chu
  Cc: David Airlie, Daniel Vetter, Arnd Bergmann, Greg Kroah-Hartman,
	linux-kernel, dri-devel, Jason Gunthorpe, John Hubbard,
	Alex Deucher, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, Yuji Ishikawa, Daniel Stone, Tvrtko Ursulin,
	Jeffrey Hugo, Christoph Hellwig, Kevin Hilman, Jagan Teki,
	Jacek Lawrynowicz, Maciej Kwapulinski

On Tue, Oct 25, 2022 at 9:43 AM Jiho Chu <jiho.chu@samsung.com> wrote:
>
>
> On Sun, 23 Oct 2022 00:46:22 +0300
> Oded Gabbay <ogabbay@kernel.org> wrote:
>
> > diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
> > index b58ffb1433d6..c13701a8d4be 100644
> > --- a/drivers/gpu/drm/drm_drv.c
> > +++ b/drivers/gpu/drm/drm_drv.c
> > @@ -56,6 +56,9 @@ MODULE_LICENSE("GPL and additional rights");
> >  static DEFINE_SPINLOCK(drm_minor_lock);
> >  static struct idr drm_minors_idr;
> >
> > +static DEFINE_SPINLOCK(accel_minor_lock);
> > +static struct idr accel_minors_idr;
> > +
> >  /*
> >   * If the drm core fails to init for whatever reason,
> >   * we should prevent any drivers from registering with it.
> > @@ -94,6 +97,8 @@ static struct drm_minor **drm_minor_get_slot(struct drm_device *dev,
> >               return &dev->primary;
> >       case DRM_MINOR_RENDER:
> >               return &dev->render;
> > +     case DRM_MINOR_ACCEL:
> > +             return &dev->accel;
> >       default:
> >               BUG();
> >       }
> > @@ -108,9 +113,15 @@ static void drm_minor_alloc_release(struct drm_device *dev, void *data)
> >
> >       put_device(minor->kdev);
> >
> > -     spin_lock_irqsave(&drm_minor_lock, flags);
> > -     idr_remove(&drm_minors_idr, minor->index);
> > -     spin_unlock_irqrestore(&drm_minor_lock, flags);
> > +     if (minor->type == DRM_MINOR_ACCEL) {
> > +             spin_lock_irqsave(&accel_minor_lock, flags);
> > +             idr_remove(&accel_minors_idr, minor->index);
> > +             spin_unlock_irqrestore(&accel_minor_lock, flags);
> > +     } else {
> > +             spin_lock_irqsave(&drm_minor_lock, flags);
> > +             idr_remove(&drm_minors_idr, minor->index);
> > +             spin_unlock_irqrestore(&drm_minor_lock, flags);
> > +     }
> >  }
> >
> >  static int drm_minor_alloc(struct drm_device *dev, unsigned int type)
> > @@ -127,13 +138,23 @@ static int drm_minor_alloc(struct drm_device *dev, unsigned int type)
> >       minor->dev = dev;
> >
> >       idr_preload(GFP_KERNEL);
> > -     spin_lock_irqsave(&drm_minor_lock, flags);
> > -     r = idr_alloc(&drm_minors_idr,
> > -                   NULL,
> > -                   64 * type,
> > -                   64 * (type + 1),
> > -                   GFP_NOWAIT);
> > -     spin_unlock_irqrestore(&drm_minor_lock, flags);
> > +     if (type == DRM_MINOR_ACCEL) {
> > +             spin_lock_irqsave(&accel_minor_lock, flags);
> > +             r = idr_alloc(&accel_minors_idr,
> > +                     NULL,
> > +                     64 * (type - DRM_MINOR_ACCEL),
> > +                     64 * (type - DRM_MINOR_ACCEL + 1),
> > +                     GFP_NOWAIT);
> > +             spin_unlock_irqrestore(&accel_minor_lock, flags);
> > +     } else {
> > +             spin_lock_irqsave(&drm_minor_lock, flags);
> > +             r = idr_alloc(&drm_minors_idr,
> > +                     NULL,
> > +                     64 * type,
> > +                     64 * (type + 1),
> > +                     GFP_NOWAIT);
> > +             spin_unlock_irqrestore(&drm_minor_lock, flags);
> > +     }
>
> Hi,
> There are many functions which checks drm type and decides its behaviors. It's good to
> re-use exiting codes, but accel devices use totally different major/minor, and so it needs to be moved to
> /drvier/accel/ (maybe later..). How about seperating functions for alloc/release minor (accel_minor_alloc..)?
> also, for others which have drm type related codes.
My feeling was moving the minor code handling to a different file (in
addition to moving the major code handling) will cause too much
duplication.
My main theme is that an accel minor is another minor in drm, even if
a bit different. i.e. It uses the same drm_minor structure.
The driver declares he wants to use this minor using a drm driver feature flag.
imo, all of that indicates the code should be inside drm.
>
>
>
>
> > @@ -607,6 +652,14 @@ static int drm_dev_init(struct drm_device *dev,
> >       /* no per-device feature limits by default */
> >       dev->driver_features = ~0u;
> >
> > +     if (drm_core_check_feature(dev, DRIVER_COMPUTE_ACCEL) &&
> > +                             (drm_core_check_feature(dev, DRIVER_RENDER) ||
> > +                             drm_core_check_feature(dev, DRIVER_MODESET))) {
> > +
> > +             DRM_ERROR("DRM driver can't be both a compute acceleration and graphics driver\n");
> > +             return -EINVAL;
> > +     }
> > +
>
> It's fine for the device only for acceleration, but can't graphic devices have acceleration feature?
Of course they can :) In that case, and if they want to expose an
accel device char, they should write an accel driver and connect it to
their main graphics driver via auxiliary bus.

I could have added two flags - compute_accel, and compute_accel_only
(similar to a patch that was sent to add render only flag), but imo it
would make the code more convoluted. I prefer the clean separation and
using standard auxiliary bus.

Thanks,
Oded

>
>
> Thanks,
> Jiho Chu

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC PATCH 3/3] drm: add dedicated minor for accelerator devices
  2022-10-25 13:26       ` Michał Winiarski
@ 2022-10-26  6:38         ` Oded Gabbay
  0 siblings, 0 replies; 35+ messages in thread
From: Oded Gabbay @ 2022-10-26  6:38 UTC (permalink / raw)
  To: Michał Winiarski
  Cc: Jeffrey Hugo, David Airlie, Daniel Vetter, Arnd Bergmann,
	Greg Kroah-Hartman, linux-kernel, dri-devel, Jason Gunthorpe,
	John Hubbard, Alex Deucher, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, Yuji Ishikawa, Jiho Chu, Daniel Stone,
	Tvrtko Ursulin, Christoph Hellwig, Kevin Hilman, Jagan Teki,
	Jacek Lawrynowicz, Maciej Kwapulinski

On Tue, Oct 25, 2022 at 4:27 PM Michał Winiarski
<michal.winiarski@intel.com> wrote:
>
> On Mon, Oct 24, 2022 at 08:43:58PM +0300, Oded Gabbay wrote:
> > On Mon, Oct 24, 2022 at 6:21 PM Jeffrey Hugo <quic_jhugo@quicinc.com> wrote:
> > >
> > > On 10/22/2022 3:46 PM, Oded Gabbay wrote:
> > > > The accelerator devices are exposed to user-space using a dedicated
> > > > major. In addition, they are represented in /dev with new, dedicated
> > > > device char names: /dev/accel/accel*. This is done to make sure any
> > > > user-space software that tries to open a graphic card won't open
> > > > the accelerator device by mistake.
> > > >
> > > > The above implies that the minor numbering should be separated from
> > > > the rest of the drm devices. However, to avoid code duplication, we
> > > > want the drm_minor structure to be able to represent the accelerator
> > > > device.
> > > >
> > > > To achieve this, we add a new drm_minor* to drm_device that represents
> > > > the accelerator device. This pointer is initialized for drivers that
> > > > declare they handle compute accelerator, using a new driver feature
> > > > flag called DRIVER_COMPUTE_ACCEL. It is important to note that this
> > > > driver feature is mutually exclusive with DRIVER_RENDER. Devices that
> > > > want to expose both graphics and compute device char files should be
> > > > handled by two drivers that are connected using the auxiliary bus
> > > > framework.
> > > >
> > > > In addition, we define a different idr to handle the accelerators
> > > > minors. This is done to make the minor's index be identical to the
> > > > device index in /dev/. In most places, this is hidden inside the drm
> > > > core functions except when calling drm_minor_acquire(), where I had to
> > > > add an extra parameter to specify the idr to use (because the
> > > > accelerators minors index and the drm primary minor index both begin
> > > > at 0).
> > > >
> > > > Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
> > > > ---
> > > >   drivers/gpu/drm/drm_drv.c      | 171 +++++++++++++++++++++++++--------
> > > >   drivers/gpu/drm/drm_file.c     |  69 +++++++++----
> > > >   drivers/gpu/drm/drm_internal.h |   2 +-
> > > >   drivers/gpu/drm/drm_sysfs.c    |  29 ++++--
> > > >   include/drm/drm_device.h       |   3 +
> > > >   include/drm/drm_drv.h          |   8 ++
> > > >   include/drm/drm_file.h         |  21 +++-
> > > >   7 files changed, 235 insertions(+), 68 deletions(-)
> > >
> > > Can we please add something to Documentation?  I know this leverages DRM
> > > a lot, but I believe that a new subsystem should not be introduced
> > > without documentation.  A lot of the info in the commit message is very
> > > good, but should not be buried in the git log.
> > >
> > > Besides, imagine this has been in mainline for N years, and someone
> > > completely new to the kernel wants to write an accel driver.  They
> > > should be able to get started with something from Documentation that
> > > at-least gives that person some insight into what to grep the code for.
> > Agreed. The only reason I haven't done it at this stage was because I
> > wanted to get an initial reaction to the code itself, see if the
> > direction is accepted.
> > I didn't want to write documentation and then completely re-write it.
> > So I will do it for the next patch-set, once I collect everyone's
> > feedback and I see there is a majority agreement.
> > >
> > > >
> > > > diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
> > > > index b58ffb1433d6..c13701a8d4be 100644
> > > > --- a/drivers/gpu/drm/drm_drv.c
> > > > +++ b/drivers/gpu/drm/drm_drv.c
> > > > @@ -56,6 +56,9 @@ MODULE_LICENSE("GPL and additional rights");
> > > >   static DEFINE_SPINLOCK(drm_minor_lock);
> > > >   static struct idr drm_minors_idr;
> > > >
> > > > +static DEFINE_SPINLOCK(accel_minor_lock);
> > > > +static struct idr accel_minors_idr;
> > >
> > > IDR is deprecated.  XArray is the preferred mechanism.
> > > Yes, there already is IDR here, but I believe we should not be adding
> > > new uses.  Maybe at some point, the current IDR will be converted.  Also
> > > with XArray, I think you don't need the spinlock since XArray has
> > > internal locking already.
> > ok, I wasn't aware. I don't have any problem replacing the idr to xarray.
>
> The conversion is sitting on the mailinglist for a while now
> (unfortunately, without much interest).
> Perhaps you could help with reviewing it?
> https://lore.kernel.org/dri-devel/20220911211443.581481-2-michal.winiarski@intel.com/
>
> -Michał
I'll do it.
Oded

>
> >
> > Thanks,
> > Oded
> >

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC PATCH 3/3] drm: add dedicated minor for accelerator devices
  2022-10-26  6:38     ` Oded Gabbay
@ 2022-10-28  6:56       ` Jiho Chu
  0 siblings, 0 replies; 35+ messages in thread
From: Jiho Chu @ 2022-10-28  6:56 UTC (permalink / raw)
  To: Oded Gabbay
  Cc: David Airlie, Daniel Vetter, Arnd Bergmann, Greg Kroah-Hartman,
	linux-kernel, dri-devel, Jason Gunthorpe, John Hubbard,
	Alex Deucher, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, Yuji Ishikawa, Daniel Stone, Tvrtko Ursulin,
	Jeffrey Hugo, Christoph Hellwig, Kevin Hilman, Jagan Teki,
	Jacek Lawrynowicz, Maciej Kwapulinski

On Wed, 26 Oct 2022 09:38:13 +0300
Oded Gabbay <ogabbay@kernel.org> wrote:

> On Tue, Oct 25, 2022 at 9:43 AM Jiho Chu <jiho.chu@samsung.com> wrote:
> >
> >
> > On Sun, 23 Oct 2022 00:46:22 +0300
> > Oded Gabbay <ogabbay@kernel.org> wrote:
> >
> > > diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
> > > index b58ffb1433d6..c13701a8d4be 100644
> > > --- a/drivers/gpu/drm/drm_drv.c
> > > +++ b/drivers/gpu/drm/drm_drv.c
> > > @@ -56,6 +56,9 @@ MODULE_LICENSE("GPL and additional rights");
> > >  static DEFINE_SPINLOCK(drm_minor_lock);
> > >  static struct idr drm_minors_idr;
> > >
> > > +static DEFINE_SPINLOCK(accel_minor_lock);
> > > +static struct idr accel_minors_idr;
> > > +
> > >  /*
> > >   * If the drm core fails to init for whatever reason,
> > >   * we should prevent any drivers from registering with it.
> > > @@ -94,6 +97,8 @@ static struct drm_minor **drm_minor_get_slot(struct drm_device *dev,
> > >               return &dev->primary;
> > >       case DRM_MINOR_RENDER:
> > >               return &dev->render;
> > > +     case DRM_MINOR_ACCEL:
> > > +             return &dev->accel;
> > >       default:
> > >               BUG();
> > >       }
> > > @@ -108,9 +113,15 @@ static void drm_minor_alloc_release(struct drm_device *dev, void *data)
> > >
> > >       put_device(minor->kdev);
> > >
> > > -     spin_lock_irqsave(&drm_minor_lock, flags);
> > > -     idr_remove(&drm_minors_idr, minor->index);
> > > -     spin_unlock_irqrestore(&drm_minor_lock, flags);
> > > +     if (minor->type == DRM_MINOR_ACCEL) {
> > > +             spin_lock_irqsave(&accel_minor_lock, flags);
> > > +             idr_remove(&accel_minors_idr, minor->index);
> > > +             spin_unlock_irqrestore(&accel_minor_lock, flags);
> > > +     } else {
> > > +             spin_lock_irqsave(&drm_minor_lock, flags);
> > > +             idr_remove(&drm_minors_idr, minor->index);
> > > +             spin_unlock_irqrestore(&drm_minor_lock, flags);
> > > +     }
> > >  }
> > >
> > >  static int drm_minor_alloc(struct drm_device *dev, unsigned int type)
> > > @@ -127,13 +138,23 @@ static int drm_minor_alloc(struct drm_device *dev, unsigned int type)
> > >       minor->dev = dev;
> > >
> > >       idr_preload(GFP_KERNEL);
> > > -     spin_lock_irqsave(&drm_minor_lock, flags);
> > > -     r = idr_alloc(&drm_minors_idr,
> > > -                   NULL,
> > > -                   64 * type,
> > > -                   64 * (type + 1),
> > > -                   GFP_NOWAIT);
> > > -     spin_unlock_irqrestore(&drm_minor_lock, flags);
> > > +     if (type == DRM_MINOR_ACCEL) {
> > > +             spin_lock_irqsave(&accel_minor_lock, flags);
> > > +             r = idr_alloc(&accel_minors_idr,
> > > +                     NULL,
> > > +                     64 * (type - DRM_MINOR_ACCEL),
> > > +                     64 * (type - DRM_MINOR_ACCEL + 1),
> > > +                     GFP_NOWAIT);
> > > +             spin_unlock_irqrestore(&accel_minor_lock, flags);
> > > +     } else {
> > > +             spin_lock_irqsave(&drm_minor_lock, flags);
> > > +             r = idr_alloc(&drm_minors_idr,
> > > +                     NULL,
> > > +                     64 * type,
> > > +                     64 * (type + 1),
> > > +                     GFP_NOWAIT);
> > > +             spin_unlock_irqrestore(&drm_minor_lock, flags);
> > > +     }
> >
> > Hi,
> > There are many functions which checks drm type and decides its behaviors. It's good to
> > re-use exiting codes, but accel devices use totally different major/minor, and so it needs to be moved to
> > /drvier/accel/ (maybe later..). How about seperating functions for alloc/release minor (accel_minor_alloc..)?
> > also, for others which have drm type related codes.
> My feeling was moving the minor code handling to a different file (in
> addition to moving the major code handling) will cause too much
> duplication.
> My main theme is that an accel minor is another minor in drm, even if
> a bit different. i.e. It uses the same drm_minor structure.
> The driver declares he wants to use this minor using a drm driver feature flag.
> imo, all of that indicates the code should be inside drm.
> >
> >
> >
> >
> > > @@ -607,6 +652,14 @@ static int drm_dev_init(struct drm_device *dev,
> > >       /* no per-device feature limits by default */
> > >       dev->driver_features = ~0u;
> > >
> > > +     if (drm_core_check_feature(dev, DRIVER_COMPUTE_ACCEL) &&
> > > +                             (drm_core_check_feature(dev, DRIVER_RENDER) ||
> > > +                             drm_core_check_feature(dev, DRIVER_MODESET))) {
> > > +
> > > +             DRM_ERROR("DRM driver can't be both a compute acceleration and graphics driver\n");
> > > +             return -EINVAL;
> > > +     }
> > > +
> >
> > It's fine for the device only for acceleration, but can't graphic devices have acceleration feature?
> Of course they can :) In that case, and if they want to expose an
> accel device char, they should write an accel driver and connect it to
> their main graphics driver via auxiliary bus.
> 
> I could have added two flags - compute_accel, and compute_accel_only
> (similar to a patch that was sent to add render only flag), but imo it
> would make the code more convoluted. I prefer the clean separation and
> using standard auxiliary bus.
> 
> Thanks,
> Oded
> 

I understood. Seperation would be good as you mentioned in other mail.
This subsystem would be better choice for acceleration only devices, who need some features
of drm, but deoesnot want to include whole graphics related considerations.
I'll prepare Samsung's NPU driver using this after your reference driver is presented (maybe habana').

Thanks.
Jiho Chu



^ permalink raw reply	[flat|nested] 35+ messages in thread

end of thread, other threads:[~2022-10-28  6:57 UTC | newest]

Thread overview: 35+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-10-22 21:46 [RFC PATCH 0/3] new subsystem for compute accelerator devices Oded Gabbay
2022-10-22 21:46 ` [RFC PATCH 1/3] drivers/accel: add new kconfig and update MAINTAINERS Oded Gabbay
2022-10-23 12:40   ` Greg Kroah-Hartman
2022-10-24  7:19     ` Oded Gabbay
2022-10-24 15:01   ` Jeffrey Hugo
2022-10-22 21:46 ` [RFC PATCH 2/3] drm: define new accel major and register it Oded Gabbay
2022-10-23 12:40   ` Greg Kroah-Hartman
2022-10-24  7:23     ` Oded Gabbay
2022-10-24  7:52       ` Dave Airlie
2022-10-24 15:08         ` Jeffrey Hugo
2022-10-22 21:46 ` [RFC PATCH 3/3] drm: add dedicated minor for accelerator devices Oded Gabbay
2022-10-23 12:41   ` Greg Kroah-Hartman
2022-10-24  7:23     ` Oded Gabbay
2022-10-24 15:21   ` Jeffrey Hugo
2022-10-24 17:43     ` Oded Gabbay
2022-10-25 13:26       ` Michał Winiarski
2022-10-26  6:38         ` Oded Gabbay
2022-10-25  6:43   ` Jiho Chu
2022-10-26  6:38     ` Oded Gabbay
2022-10-28  6:56       ` Jiho Chu
2022-10-23 14:02 ` [RFC PATCH 0/3] new subsystem for compute " Bagas Sanjaya
2022-10-23 14:14   ` Greg Kroah-Hartman
2022-10-24 11:55 ` Thomas Zimmermann
2022-10-24 12:35   ` Greg Kroah-Hartman
2022-10-24 12:43   ` Oded Gabbay
2022-10-25  2:21     ` John Hubbard
2022-10-25  2:27       ` Dave Airlie
2022-10-25 11:15         ` Jason Gunthorpe
2022-10-25 14:21           ` Alex Deucher
2022-10-25 14:34             ` Jason Gunthorpe
2022-10-25 14:43               ` Alex Deucher
2022-10-24 13:55 ` Alex Deucher
2022-10-24 14:41   ` Oded Gabbay
2022-10-24 15:10     ` Alex Deucher
2022-10-26  6:10       ` Oded Gabbay

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).