dri-devel.lists.freedesktop.org archive mirror
* Try to address the drm_debugfs issues
@ 2023-02-09  8:18 Christian König
  2023-02-09  8:18 ` [PATCH 1/3] drm/debugfs: separate debugfs creation into init and register Christian König
                   ` (4 more replies)
  0 siblings, 5 replies; 50+ messages in thread
From: Christian König @ 2023-02-09  8:18 UTC (permalink / raw)
  To: daniel.vetter, wambui.karugax, mcanal, maxime, mwen, mairacanal; +Cc: dri-devel

Hello everyone,

drm_debugfs has a couple of well-known design problems.

In particular, it wasn't possible to add files between initializing and
registering DRM devices, since the underlying debugfs directory wasn't created
yet.

The resulting necessity of the driver->debugfs_init() callback function is a
mid-layering which is really frowned upon, since it creates a horrible
driver->DRM->driver design layering.

The recent patch "drm/debugfs: create device-centered debugfs functions" tried
to address those problems, but doesn't seem to work correctly. This looks like
a misunderstanding of the call flow around drm_debugfs_init(), which is called
multiple times, once for the primary and once for the render node.

So what happens now is the following:

1. drm_dev_init() initially allocates the drm_minor objects.
2. ... back to the driver ...
3. drm_dev_register() is called.

4. drm_debugfs_init() is called for the primary node.
5. drm_framebuffer_debugfs_init(), drm_client_debugfs_init() and
   drm_atomic_debugfs_init() call drm_debugfs_add_file(s)() to add the files
   for the primary node.
6. The driver->debugfs_init() callback is called to add debugfs files for the
   primary node.
7. The added files are consumed and added to the primary node debugfs directory.

8. drm_debugfs_init() is called for the render node.
9. drm_framebuffer_debugfs_init(), drm_client_debugfs_init() and
   drm_atomic_debugfs_init() call drm_debugfs_add_file(s)() to add the files
   again for the render node.
10. The driver->debugfs_init() callback is called to add debugfs files for the
    render node.
11. The added files are consumed and added to the render node debugfs directory.

12. Some more files are added through drm_debugfs_add_file().
13. drm_debugfs_late_register() adds the files once more to the primary node
    debugfs directory.
14. From this point on files added through drm_debugfs_add_file() are simply ignored.
15. ... back to the driver ...
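
The fill-and-consume behaviour in steps 4 to 14 can be replayed with a small
user-space model. This is only a sketch: the toy_* names, the fixed-size array
and the file names are made up for illustration and are not kernel code. The
single pending list stands in for dev->debugfs_list, and toy_consume() stands
in for the list_for_each_entry_safe() loops that create the files and empty
the list.

```c
#include <assert.h>
#include <string.h>

#define MAX_FILES 16

/* Toy stand-in for a debugfs directory: just a list of file names. */
struct toy_dir {
	const char *names[MAX_FILES];
	int count;
};

/* Toy stand-in for struct drm_device and its two minor directories. */
struct toy_dev {
	struct toy_dir pending;  /* models dev->debugfs_list */
	struct toy_dir primary;  /* models the primary node directory */
	struct toy_dir render;   /* models the render node directory */
};

static void toy_dir_add(struct toy_dir *dir, const char *name)
{
	assert(dir->count < MAX_FILES);
	dir->names[dir->count++] = name;
}

static int toy_dir_has(const struct toy_dir *dir, const char *name)
{
	for (int i = 0; i < dir->count; i++)
		if (strcmp(dir->names[i], name) == 0)
			return 1;
	return 0;
}

/* Models drm_debugfs_add_file(): the file is only queued, not created. */
static void toy_add_file(struct toy_dev *dev, const char *name)
{
	toy_dir_add(&dev->pending, name);
}

/* Models the consume loop: create everything queued so far in one
 * directory, then empty the list. */
static void toy_consume(struct toy_dev *dev, struct toy_dir *target)
{
	for (int i = 0; i < dev->pending.count; i++)
		toy_dir_add(target, dev->pending.names[i]);
	dev->pending.count = 0;
}

/* Replays steps 4-14 of the call flow described above. */
static void toy_replay(struct toy_dev *dev)
{
	memset(dev, 0, sizeof(*dev));

	toy_add_file(dev, "core");        /* step 5 */
	toy_add_file(dev, "driver");      /* step 6 */
	toy_consume(dev, &dev->primary);  /* step 7 */

	toy_add_file(dev, "core");        /* step 9 */
	toy_add_file(dev, "driver");      /* step 10 */
	toy_consume(dev, &dev->render);   /* step 11 */

	toy_add_file(dev, "late");        /* step 12 */
	toy_consume(dev, &dev->primary);  /* step 13 */

	toy_add_file(dev, "too_late");    /* step 14: queued, never consumed */
}
```

In this model "late" ends up only in the primary directory and "too_late" in
neither, so the directory a file lands in depends purely on when it was queued.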

Because of this the dev->debugfs_mutex lock is also completely pointless, since
any concurrent use of the interface would just randomly add the files either to
the primary or the render node, or not at all.

Even worse, this implementation is the final nail in the coffin for removing
the driver->debugfs_init() mid-layering, because otherwise drivers can't
control where their debugfs files (primary/render node) are actually added.

This patch set here now tries to clean this up a bit, but most likely isn't
fully complete either since I didn't audit every driver/call path.

Please comment/discuss.

Cheers,
Christian.




* [PATCH 1/3] drm/debugfs: separate debugfs creation into init and register
  2023-02-09  8:18 Try to address the drm_debugfs issues Christian König
@ 2023-02-09  8:18 ` Christian König
  2023-02-14 11:56   ` Stanislaw Gruszka
  2023-02-09  8:18 ` [PATCH 2/3] drm/debugfs: split registration into dev and minor Christian König
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 50+ messages in thread
From: Christian König @ 2023-02-09  8:18 UTC (permalink / raw)
  To: daniel.vetter, wambui.karugax, mcanal, maxime, mwen, mairacanal; +Cc: dri-devel

This way we can create debugfs files directly, even between init and register.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/drm_debugfs.c  | 12 ++++++++----
 drivers/gpu/drm/drm_drv.c      | 15 +++++++--------
 drivers/gpu/drm/drm_internal.h |  1 +
 3 files changed, 16 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/drm_debugfs.c b/drivers/gpu/drm/drm_debugfs.c
index 4f643a490dc3..2724cac03509 100644
--- a/drivers/gpu/drm/drm_debugfs.c
+++ b/drivers/gpu/drm/drm_debugfs.c
@@ -221,8 +221,6 @@ EXPORT_SYMBOL(drm_debugfs_create_files);
 int drm_debugfs_init(struct drm_minor *minor, int minor_id,
 		     struct dentry *root)
 {
-	struct drm_device *dev = minor->dev;
-	struct drm_debugfs_entry *entry, *tmp;
 	char name[64];
 
 	INIT_LIST_HEAD(&minor->debugfs_list);
@@ -230,6 +228,14 @@ int drm_debugfs_init(struct drm_minor *minor, int minor_id,
 	sprintf(name, "%d", minor_id);
 	minor->debugfs_root = debugfs_create_dir(name, root);
 
+	return 0;
+}
+
+void drm_debugfs_register(struct drm_minor *minor)
+{
+	struct drm_device *dev = minor->dev;
+	struct drm_debugfs_entry *entry, *tmp;
+
 	drm_debugfs_add_files(minor->dev, drm_debugfs_list, DRM_DEBUGFS_ENTRIES);
 
 	if (drm_drv_uses_atomic_modeset(dev)) {
@@ -250,8 +256,6 @@ int drm_debugfs_init(struct drm_minor *minor, int minor_id,
 				    minor->debugfs_root, entry, &drm_debugfs_entry_fops);
 		list_del(&entry->list);
 	}
-
-	return 0;
 }
 
 void drm_debugfs_late_register(struct drm_device *dev)
diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
index c6eb8972451a..88ce22c04672 100644
--- a/drivers/gpu/drm/drm_drv.c
+++ b/drivers/gpu/drm/drm_drv.c
@@ -156,6 +156,10 @@ static int drm_minor_alloc(struct drm_device *dev, unsigned int type)
 	if (IS_ERR(minor->kdev))
 		return PTR_ERR(minor->kdev);
 
+	r = drm_debugfs_init(minor, minor->index, drm_debugfs_root);
+	if (r)
+		return r;
+
 	*drm_minor_get_slot(dev, type) = minor;
 	return 0;
 }
@@ -172,15 +176,10 @@ static int drm_minor_register(struct drm_device *dev, unsigned int type)
 	if (!minor)
 		return 0;
 
-	if (minor->type == DRM_MINOR_ACCEL) {
+	if (minor->type == DRM_MINOR_ACCEL)
 		accel_debugfs_init(minor, minor->index);
-	} else {
-		ret = drm_debugfs_init(minor, minor->index, drm_debugfs_root);
-		if (ret) {
-			DRM_ERROR("DRM: Failed to initialize /sys/kernel/debug/dri.\n");
-			goto err_debugfs;
-		}
-	}
+	else
+		drm_debugfs_register(minor);
 
 	ret = device_add(minor->kdev);
 	if (ret)
diff --git a/drivers/gpu/drm/drm_internal.h b/drivers/gpu/drm/drm_internal.h
index ed2103ee272c..332fb65a935a 100644
--- a/drivers/gpu/drm/drm_internal.h
+++ b/drivers/gpu/drm/drm_internal.h
@@ -185,6 +185,7 @@ int drm_gem_dumb_destroy(struct drm_file *file, struct drm_device *dev,
 #if defined(CONFIG_DEBUG_FS)
 int drm_debugfs_init(struct drm_minor *minor, int minor_id,
 		     struct dentry *root);
+void drm_debugfs_register(struct drm_minor *minor);
 void drm_debugfs_cleanup(struct drm_minor *minor);
 void drm_debugfs_late_register(struct drm_device *dev);
 void drm_debugfs_connector_add(struct drm_connector *connector);
-- 
2.34.1



* [PATCH 2/3] drm/debugfs: split registration into dev and minor
  2023-02-09  8:18 Try to address the drm_debugfs issues Christian König
  2023-02-09  8:18 ` [PATCH 1/3] drm/debugfs: separate debugfs creation into init and register Christian König
@ 2023-02-09  8:18 ` Christian König
  2023-02-09 11:12   ` Maíra Canal
  2023-02-09  8:18 ` [PATCH 3/3] drm/debugfs: remove dev->debugfs_list and debugfs_mutex Christian König
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 50+ messages in thread
From: Christian König @ 2023-02-09  8:18 UTC (permalink / raw)
  To: daniel.vetter, wambui.karugax, mcanal, maxime, mwen, mairacanal; +Cc: dri-devel

The different subsystems should probably only register their debugfs
files once.

This temporarily removes the common files from the render node directory.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/drm_atomic.c        |  4 ++--
 drivers/gpu/drm/drm_client.c        |  4 ++--
 drivers/gpu/drm/drm_crtc_internal.h |  2 +-
 drivers/gpu/drm/drm_debugfs.c       | 24 ++++++++++++------------
 drivers/gpu/drm/drm_drv.c           |  4 +++-
 drivers/gpu/drm/drm_framebuffer.c   |  4 ++--
 drivers/gpu/drm/drm_internal.h      |  5 +++--
 include/drm/drm_client.h            |  2 +-
 8 files changed, 26 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/drm_atomic.c b/drivers/gpu/drm/drm_atomic.c
index 5457c02ca1ab..ae6ec1dd162a 100644
--- a/drivers/gpu/drm/drm_atomic.c
+++ b/drivers/gpu/drm/drm_atomic.c
@@ -1770,9 +1770,9 @@ static const struct drm_debugfs_info drm_atomic_debugfs_list[] = {
 	{"state", drm_state_info, 0},
 };
 
-void drm_atomic_debugfs_init(struct drm_minor *minor)
+void drm_atomic_debugfs_init(struct drm_device *dev)
 {
-	drm_debugfs_add_files(minor->dev, drm_atomic_debugfs_list,
+	drm_debugfs_add_files(dev, drm_atomic_debugfs_list,
 			      ARRAY_SIZE(drm_atomic_debugfs_list));
 }
 #endif
diff --git a/drivers/gpu/drm/drm_client.c b/drivers/gpu/drm/drm_client.c
index 009e7b10455c..847acf0ef570 100644
--- a/drivers/gpu/drm/drm_client.c
+++ b/drivers/gpu/drm/drm_client.c
@@ -507,9 +507,9 @@ static const struct drm_debugfs_info drm_client_debugfs_list[] = {
 	{ "internal_clients", drm_client_debugfs_internal_clients, 0 },
 };
 
-void drm_client_debugfs_init(struct drm_minor *minor)
+void drm_client_debugfs_init(struct drm_device *dev)
 {
-	drm_debugfs_add_files(minor->dev, drm_client_debugfs_list,
+	drm_debugfs_add_files(dev, drm_client_debugfs_list,
 			      ARRAY_SIZE(drm_client_debugfs_list));
 }
 #endif
diff --git a/drivers/gpu/drm/drm_crtc_internal.h b/drivers/gpu/drm/drm_crtc_internal.h
index 501a10edd0e1..8556c3b3ff88 100644
--- a/drivers/gpu/drm/drm_crtc_internal.h
+++ b/drivers/gpu/drm/drm_crtc_internal.h
@@ -232,7 +232,7 @@ int drm_mode_dirtyfb_ioctl(struct drm_device *dev,
 /* drm_atomic.c */
 #ifdef CONFIG_DEBUG_FS
 struct drm_minor;
-void drm_atomic_debugfs_init(struct drm_minor *minor);
+void drm_atomic_debugfs_init(struct drm_device *dev);
 #endif
 
 int __drm_atomic_helper_disable_plane(struct drm_plane *plane,
diff --git a/drivers/gpu/drm/drm_debugfs.c b/drivers/gpu/drm/drm_debugfs.c
index 2724cac03509..558e3a7271a5 100644
--- a/drivers/gpu/drm/drm_debugfs.c
+++ b/drivers/gpu/drm/drm_debugfs.c
@@ -231,22 +231,22 @@ int drm_debugfs_init(struct drm_minor *minor, int minor_id,
 	return 0;
 }
 
-void drm_debugfs_register(struct drm_minor *minor)
+void drm_debugfs_dev_register(struct drm_device *dev)
 {
-	struct drm_device *dev = minor->dev;
-	struct drm_debugfs_entry *entry, *tmp;
-
-	drm_debugfs_add_files(minor->dev, drm_debugfs_list, DRM_DEBUGFS_ENTRIES);
-
-	if (drm_drv_uses_atomic_modeset(dev)) {
-		drm_atomic_debugfs_init(minor);
-	}
+	drm_debugfs_add_files(dev, drm_debugfs_list, DRM_DEBUGFS_ENTRIES);
 
 	if (drm_core_check_feature(dev, DRIVER_MODESET)) {
-		drm_framebuffer_debugfs_init(minor);
-
-		drm_client_debugfs_init(minor);
+		drm_framebuffer_debugfs_init(dev);
+		drm_client_debugfs_init(dev);
 	}
+	if (drm_drv_uses_atomic_modeset(dev))
+		drm_atomic_debugfs_init(dev);
+}
+
+void drm_debugfs_minor_register(struct drm_minor *minor)
+{
+	struct drm_device *dev = minor->dev;
+	struct drm_debugfs_entry *entry, *tmp;
 
 	if (dev->driver->debugfs_init)
 		dev->driver->debugfs_init(minor);
diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
index 88ce22c04672..2cbe028e548c 100644
--- a/drivers/gpu/drm/drm_drv.c
+++ b/drivers/gpu/drm/drm_drv.c
@@ -179,7 +179,7 @@ static int drm_minor_register(struct drm_device *dev, unsigned int type)
 	if (minor->type == DRM_MINOR_ACCEL)
 		accel_debugfs_init(minor, minor->index);
 	else
-		drm_debugfs_register(minor);
+		drm_debugfs_minor_register(minor);
 
 	ret = device_add(minor->kdev);
 	if (ret)
@@ -913,6 +913,8 @@ int drm_dev_register(struct drm_device *dev, unsigned long flags)
 	if (drm_dev_needs_global_mutex(dev))
 		mutex_lock(&drm_global_mutex);
 
+	drm_debugfs_dev_register(dev);
+
 	ret = drm_minor_register(dev, DRM_MINOR_RENDER);
 	if (ret)
 		goto err_minors;
diff --git a/drivers/gpu/drm/drm_framebuffer.c b/drivers/gpu/drm/drm_framebuffer.c
index aff3746dedfb..ba51deb6d042 100644
--- a/drivers/gpu/drm/drm_framebuffer.c
+++ b/drivers/gpu/drm/drm_framebuffer.c
@@ -1222,9 +1222,9 @@ static const struct drm_debugfs_info drm_framebuffer_debugfs_list[] = {
 	{ "framebuffer", drm_framebuffer_info, 0 },
 };
 
-void drm_framebuffer_debugfs_init(struct drm_minor *minor)
+void drm_framebuffer_debugfs_init(struct drm_device *dev)
 {
-	drm_debugfs_add_files(minor->dev, drm_framebuffer_debugfs_list,
+	drm_debugfs_add_files(dev, drm_framebuffer_debugfs_list,
 			      ARRAY_SIZE(drm_framebuffer_debugfs_list));
 }
 #endif
diff --git a/drivers/gpu/drm/drm_internal.h b/drivers/gpu/drm/drm_internal.h
index 332fb65a935a..5ff7bf88f162 100644
--- a/drivers/gpu/drm/drm_internal.h
+++ b/drivers/gpu/drm/drm_internal.h
@@ -185,7 +185,8 @@ int drm_gem_dumb_destroy(struct drm_file *file, struct drm_device *dev,
 #if defined(CONFIG_DEBUG_FS)
 int drm_debugfs_init(struct drm_minor *minor, int minor_id,
 		     struct dentry *root);
-void drm_debugfs_register(struct drm_minor *minor);
+void drm_debugfs_dev_register(struct drm_device *dev);
+void drm_debugfs_minor_register(struct drm_minor *minor);
 void drm_debugfs_cleanup(struct drm_minor *minor);
 void drm_debugfs_late_register(struct drm_device *dev);
 void drm_debugfs_connector_add(struct drm_connector *connector);
@@ -261,4 +262,4 @@ int drm_syncobj_query_ioctl(struct drm_device *dev, void *data,
 /* drm_framebuffer.c */
 void drm_framebuffer_print_info(struct drm_printer *p, unsigned int indent,
 				const struct drm_framebuffer *fb);
-void drm_framebuffer_debugfs_init(struct drm_minor *minor);
+void drm_framebuffer_debugfs_init(struct drm_device *dev);
diff --git a/include/drm/drm_client.h b/include/drm/drm_client.h
index 39482527a775..507d132cf494 100644
--- a/include/drm/drm_client.h
+++ b/include/drm/drm_client.h
@@ -200,6 +200,6 @@ int drm_client_modeset_dpms(struct drm_client_dev *client, int mode);
 	drm_for_each_connector_iter(connector, iter) \
 		if (connector->connector_type != DRM_MODE_CONNECTOR_WRITEBACK)
 
-void drm_client_debugfs_init(struct drm_minor *minor);
+void drm_client_debugfs_init(struct drm_device *dev);
 
 #endif
-- 
2.34.1



* [PATCH 3/3] drm/debugfs: remove dev->debugfs_list and debugfs_mutex
  2023-02-09  8:18 Try to address the drm_debugfs issues Christian König
  2023-02-09  8:18 ` [PATCH 1/3] drm/debugfs: separate debugfs creation into init and register Christian König
  2023-02-09  8:18 ` [PATCH 2/3] drm/debugfs: split registration into dev and minor Christian König
@ 2023-02-09  8:18 ` Christian König
  2023-02-14 12:19   ` Stanislaw Gruszka
  2023-02-16 11:33   ` Daniel Vetter
  2023-02-09 11:23 ` Try to address the drm_debugfs issues Maíra Canal
  2023-02-14  8:59 ` Stanislaw Gruszka
  4 siblings, 2 replies; 50+ messages in thread
From: Christian König @ 2023-02-09  8:18 UTC (permalink / raw)
  To: daniel.vetter, wambui.karugax, mcanal, maxime, mwen, mairacanal; +Cc: dri-devel

The mutex was completely pointless in the first place since any
parallel adding of files to this list would result in random
behavior since the list is filled and consumed multiple times.

Completely drop that approach and just create the files directly.

This also re-adds the debugfs files to the render node directory and
removes drm_debugfs_late_register().

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/drm_debugfs.c     | 32 +++++++------------------------
 drivers/gpu/drm/drm_drv.c         |  3 ---
 drivers/gpu/drm/drm_internal.h    |  5 -----
 drivers/gpu/drm/drm_mode_config.c |  2 --
 include/drm/drm_device.h          | 15 ---------------
 5 files changed, 7 insertions(+), 50 deletions(-)

diff --git a/drivers/gpu/drm/drm_debugfs.c b/drivers/gpu/drm/drm_debugfs.c
index 558e3a7271a5..a40288e67264 100644
--- a/drivers/gpu/drm/drm_debugfs.c
+++ b/drivers/gpu/drm/drm_debugfs.c
@@ -246,31 +246,9 @@ void drm_debugfs_dev_register(struct drm_device *dev)
 void drm_debugfs_minor_register(struct drm_minor *minor)
 {
 	struct drm_device *dev = minor->dev;
-	struct drm_debugfs_entry *entry, *tmp;
 
 	if (dev->driver->debugfs_init)
 		dev->driver->debugfs_init(minor);
-
-	list_for_each_entry_safe(entry, tmp, &dev->debugfs_list, list) {
-		debugfs_create_file(entry->file.name, 0444,
-				    minor->debugfs_root, entry, &drm_debugfs_entry_fops);
-		list_del(&entry->list);
-	}
-}
-
-void drm_debugfs_late_register(struct drm_device *dev)
-{
-	struct drm_minor *minor = dev->primary;
-	struct drm_debugfs_entry *entry, *tmp;
-
-	if (!minor)
-		return;
-
-	list_for_each_entry_safe(entry, tmp, &dev->debugfs_list, list) {
-		debugfs_create_file(entry->file.name, 0444,
-				    minor->debugfs_root, entry, &drm_debugfs_entry_fops);
-		list_del(&entry->list);
-	}
 }
 
 int drm_debugfs_remove_files(const struct drm_info_list *files, int count,
@@ -343,9 +321,13 @@ void drm_debugfs_add_file(struct drm_device *dev, const char *name,
 	entry->file.data = data;
 	entry->dev = dev;
 
-	mutex_lock(&dev->debugfs_mutex);
-	list_add(&entry->list, &dev->debugfs_list);
-	mutex_unlock(&dev->debugfs_mutex);
+	debugfs_create_file(name, 0444, dev->primary->debugfs_root, entry,
+			    &drm_debugfs_entry_fops);
+
+	/* TODO: This should probably only be a symlink */
+	if (dev->render)
+		debugfs_create_file(name, 0444, dev->render->debugfs_root,
+				    entry, &drm_debugfs_entry_fops);
 }
 EXPORT_SYMBOL(drm_debugfs_add_file);
 
diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
index 2cbe028e548c..e7b88b65866c 100644
--- a/drivers/gpu/drm/drm_drv.c
+++ b/drivers/gpu/drm/drm_drv.c
@@ -597,7 +597,6 @@ static void drm_dev_init_release(struct drm_device *dev, void *res)
 	mutex_destroy(&dev->clientlist_mutex);
 	mutex_destroy(&dev->filelist_mutex);
 	mutex_destroy(&dev->struct_mutex);
-	mutex_destroy(&dev->debugfs_mutex);
 	drm_legacy_destroy_members(dev);
 }
 
@@ -638,14 +637,12 @@ static int drm_dev_init(struct drm_device *dev,
 	INIT_LIST_HEAD(&dev->filelist_internal);
 	INIT_LIST_HEAD(&dev->clientlist);
 	INIT_LIST_HEAD(&dev->vblank_event_list);
-	INIT_LIST_HEAD(&dev->debugfs_list);
 
 	spin_lock_init(&dev->event_lock);
 	mutex_init(&dev->struct_mutex);
 	mutex_init(&dev->filelist_mutex);
 	mutex_init(&dev->clientlist_mutex);
 	mutex_init(&dev->master_mutex);
-	mutex_init(&dev->debugfs_mutex);
 
 	ret = drmm_add_action_or_reset(dev, drm_dev_init_release, NULL);
 	if (ret)
diff --git a/drivers/gpu/drm/drm_internal.h b/drivers/gpu/drm/drm_internal.h
index 5ff7bf88f162..e215d00ba65c 100644
--- a/drivers/gpu/drm/drm_internal.h
+++ b/drivers/gpu/drm/drm_internal.h
@@ -188,7 +188,6 @@ int drm_debugfs_init(struct drm_minor *minor, int minor_id,
 void drm_debugfs_dev_register(struct drm_device *dev);
 void drm_debugfs_minor_register(struct drm_minor *minor);
 void drm_debugfs_cleanup(struct drm_minor *minor);
-void drm_debugfs_late_register(struct drm_device *dev);
 void drm_debugfs_connector_add(struct drm_connector *connector);
 void drm_debugfs_connector_remove(struct drm_connector *connector);
 void drm_debugfs_crtc_add(struct drm_crtc *crtc);
@@ -205,10 +204,6 @@ static inline void drm_debugfs_cleanup(struct drm_minor *minor)
 {
 }
 
-static inline void drm_debugfs_late_register(struct drm_device *dev)
-{
-}
-
 static inline void drm_debugfs_connector_add(struct drm_connector *connector)
 {
 }
diff --git a/drivers/gpu/drm/drm_mode_config.c b/drivers/gpu/drm/drm_mode_config.c
index 87eb591fe9b5..8525ef851540 100644
--- a/drivers/gpu/drm/drm_mode_config.c
+++ b/drivers/gpu/drm/drm_mode_config.c
@@ -54,8 +54,6 @@ int drm_modeset_register_all(struct drm_device *dev)
 	if (ret)
 		goto err_connector;
 
-	drm_debugfs_late_register(dev);
-
 	return 0;
 
 err_connector:
diff --git a/include/drm/drm_device.h b/include/drm/drm_device.h
index 7cf4afae2e79..900ad7478dd8 100644
--- a/include/drm/drm_device.h
+++ b/include/drm/drm_device.h
@@ -311,21 +311,6 @@ struct drm_device {
 	 */
 	struct drm_fb_helper *fb_helper;
 
-	/**
-	 * @debugfs_mutex:
-	 *
-	 * Protects &debugfs_list access.
-	 */
-	struct mutex debugfs_mutex;
-
-	/**
-	 * @debugfs_list:
-	 *
-	 * List of debugfs files to be created by the DRM device. The files
-	 * must be added during drm_dev_register().
-	 */
-	struct list_head debugfs_list;
-
 	/* Everything below here is for legacy driver, never use! */
 	/* private: */
 #if IS_ENABLED(CONFIG_DRM_LEGACY)
-- 
2.34.1



* Re: [PATCH 2/3] drm/debugfs: split registration into dev and minor
  2023-02-09  8:18 ` [PATCH 2/3] drm/debugfs: split registration into dev and minor Christian König
@ 2023-02-09 11:12   ` Maíra Canal
  2023-02-09 12:03     ` Christian König
  0 siblings, 1 reply; 50+ messages in thread
From: Maíra Canal @ 2023-02-09 11:12 UTC (permalink / raw)
  To: Christian König, daniel.vetter, wambui.karugax, maxime,
	mwen, mairacanal
  Cc: dri-devel

On 2/9/23 05:18, Christian König wrote:
> The different subsystems should probably only register their debugfs
> files once.
> 
> This temporarily removes the common files from the render node directory.
> 
> Signed-off-by: Christian König <christian.koenig@amd.com>
> ---
>   drivers/gpu/drm/drm_atomic.c        |  4 ++--
>   drivers/gpu/drm/drm_client.c        |  4 ++--
>   drivers/gpu/drm/drm_crtc_internal.h |  2 +-
>   drivers/gpu/drm/drm_debugfs.c       | 24 ++++++++++++------------
>   drivers/gpu/drm/drm_drv.c           |  4 +++-
>   drivers/gpu/drm/drm_framebuffer.c   |  4 ++--
>   drivers/gpu/drm/drm_internal.h      |  5 +++--
>   include/drm/drm_client.h            |  2 +-
>   8 files changed, 26 insertions(+), 23 deletions(-)
> 

[...]

> diff --git a/drivers/gpu/drm/drm_internal.h b/drivers/gpu/drm/drm_internal.h
> index 332fb65a935a..5ff7bf88f162 100644
> --- a/drivers/gpu/drm/drm_internal.h
> +++ b/drivers/gpu/drm/drm_internal.h
> @@ -185,7 +185,8 @@ int drm_gem_dumb_destroy(struct drm_file *file, struct drm_device *dev,
>   #if defined(CONFIG_DEBUG_FS)
>   int drm_debugfs_init(struct drm_minor *minor, int minor_id,
>   		     struct dentry *root);
> -void drm_debugfs_register(struct drm_minor *minor);
> +void drm_debugfs_dev_register(struct drm_device *dev);
> +void drm_debugfs_minor_register(struct drm_minor *minor);

For this patch and the previous one, I believe you need to add the functions
to the #else path as well, otherwise it won't compile for CONFIG_DEBUG_FS=n.
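
A minimal sketch of what the #else path could look like, assuming empty static
inline stubs in the same style as the existing drm_debugfs_cleanup() stub in
drm_internal.h. The function names are the ones this patch introduces;
toy_register() is a hypothetical caller added only so the fragment can be
compiled on its own, and CONFIG_DEBUG_FS is deliberately left undefined here.

```c
#include <stddef.h>

struct drm_device;
struct drm_minor;

#if defined(CONFIG_DEBUG_FS)
void drm_debugfs_dev_register(struct drm_device *dev);
void drm_debugfs_minor_register(struct drm_minor *minor);
#else
/* No-op stubs so the call sites still compile without debugfs. */
static inline void drm_debugfs_dev_register(struct drm_device *dev)
{
}

static inline void drm_debugfs_minor_register(struct drm_minor *minor)
{
}
#endif

/* Hypothetical caller, standing in for the call sites in drm_drv.c. */
static inline int toy_register(struct drm_device *dev, struct drm_minor *minor)
{
	drm_debugfs_dev_register(dev);
	drm_debugfs_minor_register(minor);
	return 0;
}
```

The same pattern would apply to drm_debugfs_register() from the previous patch.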

Best Regards,
- Maíra Canal

>   void drm_debugfs_cleanup(struct drm_minor *minor);
>   void drm_debugfs_late_register(struct drm_device *dev);
>   void drm_debugfs_connector_add(struct drm_connector *connector);
> @@ -261,4 +262,4 @@ int drm_syncobj_query_ioctl(struct drm_device *dev, void *data,
>   /* drm_framebuffer.c */
>   void drm_framebuffer_print_info(struct drm_printer *p, unsigned int indent,
>   				const struct drm_framebuffer *fb);
> -void drm_framebuffer_debugfs_init(struct drm_minor *minor);
> +void drm_framebuffer_debugfs_init(struct drm_device *dev);
> diff --git a/include/drm/drm_client.h b/include/drm/drm_client.h
> index 39482527a775..507d132cf494 100644
> --- a/include/drm/drm_client.h
> +++ b/include/drm/drm_client.h
> @@ -200,6 +200,6 @@ int drm_client_modeset_dpms(struct drm_client_dev *client, int mode);
>   	drm_for_each_connector_iter(connector, iter) \
>   		if (connector->connector_type != DRM_MODE_CONNECTOR_WRITEBACK)
>   
> -void drm_client_debugfs_init(struct drm_minor *minor);
> +void drm_client_debugfs_init(struct drm_device *dev);
>   
>   #endif


* Re: Try to address the drm_debugfs issues
  2023-02-09  8:18 Try to address the drm_debugfs issues Christian König
                   ` (2 preceding siblings ...)
  2023-02-09  8:18 ` [PATCH 3/3] drm/debugfs: remove dev->debugfs_list and debugfs_mutex Christian König
@ 2023-02-09 11:23 ` Maíra Canal
  2023-02-09 12:13   ` Christian König
  2023-02-14  8:59 ` Stanislaw Gruszka
  4 siblings, 1 reply; 50+ messages in thread
From: Maíra Canal @ 2023-02-09 11:23 UTC (permalink / raw)
  To: Christian König, daniel.vetter, wambui.karugax, maxime,
	mwen, mairacanal
  Cc: dri-devel

On 2/9/23 05:18, Christian König wrote:
> Hello everyone,
> 
> drm_debugfs has a couple of well-known design problems.
> 
> In particular, it wasn't possible to add files between initializing and
> registering DRM devices, since the underlying debugfs directory wasn't created
> yet.
> 
> The resulting necessity of the driver->debugfs_init() callback function is a
> mid-layering which is really frowned upon, since it creates a horrible
> driver->DRM->driver design layering.
> 
> The recent patch "drm/debugfs: create device-centered debugfs functions" tried
> to address those problems, but doesn't seem to work correctly. This looks like
> a misunderstanding of the call flow around drm_debugfs_init(), which is called
> multiple times, once for the primary and once for the render node.
> 
> So what happens now is the following:
> 
> 1. drm_dev_init() initially allocates the drm_minor objects.
> 2. ... back to the driver ...
> 3. drm_dev_register() is called.
> 
> 4. drm_debugfs_init() is called for the primary node.
> 5. drm_framebuffer_debugfs_init(), drm_client_debugfs_init() and
>     drm_atomic_debugfs_init() call drm_debugfs_add_file(s)() to add the files
>     for the primary node.
> 6. The driver->debugfs_init() callback is called to add debugfs files for the
>     primary node.
> 7. The added files are consumed and added to the primary node debugfs directory.
> 
> 8. drm_debugfs_init() is called for the render node.
> 9. drm_framebuffer_debugfs_init(), drm_client_debugfs_init() and
>     drm_atomic_debugfs_init() call drm_debugfs_add_file(s)() to add the files
>     again for the render node.
> 10. The driver->debugfs_init() callback is called to add debugfs files for the
>      render node.
> 11. The added files are consumed and added to the render node debugfs directory.
> 
> 12. Some more files are added through drm_debugfs_add_file().
> 13. drm_debugfs_late_register() adds the files once more to the primary node
>      debugfs directory.
> 14. From this point on files added through drm_debugfs_add_file() are simply ignored.
> 15. ... back to the driver ...
> 
> Because of this the dev->debugfs_mutex lock is also completely pointless, since
> any concurrent use of the interface would just randomly add the files either to
> the primary or the render node, or not at all.
> 
> Even worse, this implementation is the final nail in the coffin for removing
> the driver->debugfs_init() mid-layering, because otherwise drivers can't
> control where their debugfs files (primary/render node) are actually added.
> 
> This patch set here now tries to clean this up a bit, but most likely isn't
> fully complete either since I didn't audit every driver/call path.

I tested the patchset on v3d, vc4 and vkms and all the files are generated
as expected, but I'm getting the following errors in dmesg:

[    3.872026] debugfs: File 'v3d_ident' in directory '0' already present!
[    3.872064] debugfs: File 'v3d_ident' in directory '128' already present!
[    3.872078] debugfs: File 'v3d_regs' in directory '0' already present!
[    3.872087] debugfs: File 'v3d_regs' in directory '128' already present!
[    3.872097] debugfs: File 'measure_clock' in directory '0' already present!
[    3.872105] debugfs: File 'measure_clock' in directory '128' already present!
[    3.872116] debugfs: File 'bo_stats' in directory '0' already present!
[    3.872124] debugfs: File 'bo_stats' in directory '128' already present!

It looks like the render node is being added twice, since this doesn't happen
for vc4 and vkms.
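
One possible reading of these messages, sketched as a user-space model and not
confirmed anywhere in this thread: after patch 3, drm_debugfs_add_file()
creates each file in both the primary and the render directory, while
drm_debugfs_minor_register() still invokes driver->debugfs_init() once per
minor. Assuming the driver callback adds its files through
drm_debugfs_add_file(), the second invocation would hit names that already
exist in both directories. All toy_* names below are hypothetical.

```c
#include <assert.h>
#include <string.h>

#define MAX_FILES 16

/* Toy stand-in for a debugfs directory. */
struct toy_dir {
	const char *names[MAX_FILES];
	int count;
	int rejected;  /* models the "already present!" messages */
};

struct toy_dev {
	struct toy_dir primary;  /* directory '0' */
	struct toy_dir render;   /* directory '128' */
};

/* Models debugfs_create_file(): refuse a name that already exists. */
static void toy_create(struct toy_dir *dir, const char *name)
{
	for (int i = 0; i < dir->count; i++) {
		if (strcmp(dir->names[i], name) == 0) {
			dir->rejected++;
			return;
		}
	}
	assert(dir->count < MAX_FILES);
	dir->names[dir->count++] = name;
}

/* Models drm_debugfs_add_file() after patch 3: create in both directories. */
static void toy_add_file(struct toy_dev *dev, const char *name)
{
	toy_create(&dev->primary, name);
	toy_create(&dev->render, name);
}

/* Hypothetical driver->debugfs_init() callback adding two files. */
static void toy_driver_debugfs_init(struct toy_dev *dev)
{
	toy_add_file(dev, "v3d_ident");
	toy_add_file(dev, "v3d_regs");
}

/* Models registration: the callback runs once for each minor. */
static void toy_register(struct toy_dev *dev)
{
	memset(dev, 0, sizeof(*dev));
	toy_driver_debugfs_init(dev);  /* called for the render minor */
	toy_driver_debugfs_init(dev);  /* called for the primary minor */
}
```

In this model every file is rejected once per directory, which matches the
pattern above of each name being reported once for '0' and once for '128'.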

Otherwise, the patchset looks good to me, but maybe Daniel has some other
thoughts about it.

Best Regards,
- Maíra Canal

> 
> Please comment/discuss.
> 
> Cheers,
> Christian.
> 
> 


* Re: [PATCH 2/3] drm/debugfs: split registration into dev and minor
  2023-02-09 11:12   ` Maíra Canal
@ 2023-02-09 12:03     ` Christian König
  0 siblings, 0 replies; 50+ messages in thread
From: Christian König @ 2023-02-09 12:03 UTC (permalink / raw)
  To: Maíra Canal, daniel.vetter, wambui.karugax, maxime, mwen,
	mairacanal
  Cc: dri-devel

Am 09.02.23 um 12:12 schrieb Maíra Canal:
> On 2/9/23 05:18, Christian König wrote:
>> The different subsystems should probably only register their debugfs
>> files once.
>>
>> This temporarily removes the common files from the render node directory.
>>
>> Signed-off-by: Christian König <christian.koenig@amd.com>
>> ---
>>   drivers/gpu/drm/drm_atomic.c        |  4 ++--
>>   drivers/gpu/drm/drm_client.c        |  4 ++--
>>   drivers/gpu/drm/drm_crtc_internal.h |  2 +-
>>   drivers/gpu/drm/drm_debugfs.c       | 24 ++++++++++++------------
>>   drivers/gpu/drm/drm_drv.c           |  4 +++-
>>   drivers/gpu/drm/drm_framebuffer.c   |  4 ++--
>>   drivers/gpu/drm/drm_internal.h      |  5 +++--
>>   include/drm/drm_client.h            |  2 +-
>>   8 files changed, 26 insertions(+), 23 deletions(-)
>>
>
> [...]
>
>> diff --git a/drivers/gpu/drm/drm_internal.h b/drivers/gpu/drm/drm_internal.h
>> index 332fb65a935a..5ff7bf88f162 100644
>> --- a/drivers/gpu/drm/drm_internal.h
>> +++ b/drivers/gpu/drm/drm_internal.h
>> @@ -185,7 +185,8 @@ int drm_gem_dumb_destroy(struct drm_file *file, struct drm_device *dev,
>>   #if defined(CONFIG_DEBUG_FS)
>>   int drm_debugfs_init(struct drm_minor *minor, int minor_id,
>>                struct dentry *root);
>> -void drm_debugfs_register(struct drm_minor *minor);
>> +void drm_debugfs_dev_register(struct drm_device *dev);
>> +void drm_debugfs_minor_register(struct drm_minor *minor);
>
> For this patch and the previous one, I believe you need to add the functions
> to the #else path as well, otherwise it won't compile for CONFIG_DEBUG_FS=n.

Oh, good point. Going to fix this.

Thanks,
Christian.

>
> Best Regards,
> - Maíra Canal
>
>>   void drm_debugfs_cleanup(struct drm_minor *minor);
>>   void drm_debugfs_late_register(struct drm_device *dev);
>>   void drm_debugfs_connector_add(struct drm_connector *connector);
>> @@ -261,4 +262,4 @@ int drm_syncobj_query_ioctl(struct drm_device *dev, void *data,
>>   /* drm_framebuffer.c */
>>   void drm_framebuffer_print_info(struct drm_printer *p, unsigned int indent,
>>                   const struct drm_framebuffer *fb);
>> -void drm_framebuffer_debugfs_init(struct drm_minor *minor);
>> +void drm_framebuffer_debugfs_init(struct drm_device *dev);
>> diff --git a/include/drm/drm_client.h b/include/drm/drm_client.h
>> index 39482527a775..507d132cf494 100644
>> --- a/include/drm/drm_client.h
>> +++ b/include/drm/drm_client.h
>> @@ -200,6 +200,6 @@ int drm_client_modeset_dpms(struct drm_client_dev *client, int mode);
>>       drm_for_each_connector_iter(connector, iter) \
>>           if (connector->connector_type != DRM_MODE_CONNECTOR_WRITEBACK)
>>
>> -void drm_client_debugfs_init(struct drm_minor *minor);
>> +void drm_client_debugfs_init(struct drm_device *dev);
>>
>>   #endif



* Re: Try to address the drm_debugfs issues
  2023-02-09 11:23 ` Try to address the drm_debugfs issues Maíra Canal
@ 2023-02-09 12:13   ` Christian König
  2023-02-09 13:06     ` Maíra Canal
  0 siblings, 1 reply; 50+ messages in thread
From: Christian König @ 2023-02-09 12:13 UTC (permalink / raw)
  To: Maíra Canal, daniel.vetter, wambui.karugax, maxime, mwen,
	mairacanal
  Cc: dri-devel

Am 09.02.23 um 12:23 schrieb Maíra Canal:
> On 2/9/23 05:18, Christian König wrote:
>> Hello everyone,
>>
>> drm_debugfs has a couple of well-known design problems.
>>
>> In particular, it wasn't possible to add files between initializing and
>> registering DRM devices, since the underlying debugfs directory wasn't created
>> yet.
>>
>> The resulting necessity of the driver->debugfs_init() callback function is a
>> mid-layering which is really frowned upon, since it creates a horrible
>> driver->DRM->driver design layering.
>>
>> The recent patch "drm/debugfs: create device-centered debugfs functions" tried
>> to address those problems, but doesn't seem to work correctly. This looks like
>> a misunderstanding of the call flow around drm_debugfs_init(), which is called
>> multiple times, once for the primary and once for the render node.
>>
>> So what happens now is the following:
>>
>> 1. drm_dev_init() initially allocates the drm_minor objects.
>> 2. ... back to the driver ...
>> 3. drm_dev_register() is called.
>>
>> 4. drm_debugfs_init() is called for the primary node.
>> 5. drm_framebuffer_debugfs_init(), drm_client_debugfs_init() and
>>     drm_atomic_debugfs_init() call drm_debugfs_add_file(s)() to add the files
>>     for the primary node.
>> 6. The driver->debugfs_init() callback is called to add debugfs files for the
>>     primary node.
>> 7. The added files are consumed and added to the primary node debugfs directory.
>>
>> 8. drm_debugfs_init() is called for the render node.
>> 9. drm_framebuffer_debugfs_init(), drm_client_debugfs_init() and
>>     drm_atomic_debugfs_init() call drm_debugfs_add_file(s)() to add the files
>>     again for the render node.
>> 10. The driver->debugfs_init() callback is called to add debugfs files for the
>>      render node.
>> 11. The added files are consumed and added to the render node debugfs directory.
>>
>> 12. Some more files are added through drm_debugfs_add_file().
>> 13. drm_debugfs_late_register() adds the files once more to the primary node
>>      debugfs directory.
>> 14. From this point on files added through drm_debugfs_add_file() are simply ignored.
>> 15. ... back to the driver ...
>>
>> Because of this the dev->debugfs_mutex lock is also completely pointless, since
>> any concurrent use of the interface would just randomly add the files either to
>> the primary or the render node, or not at all.
>>
>> Even worse is that this implementation nails the coffin for removing the
>> driver->debugfs_init() mid-layering because otherwise drivers can't 
>> control
>> where their debugfs (primary/render node) are actually added.
>>
>> This patch set here now tries to clean this up a bit, but most likely 
>> isn't
>> fully complete either since I didn't audit every driver/call path.
>
> I tested the patchset on the v3d, vc4 and vkms and all the files are 
> generated
> as expected, but I'm getting the following errors on dmesg:
>
> [    3.872026] debugfs: File 'v3d_ident' in directory '0' already 
> present!
> [    3.872064] debugfs: File 'v3d_ident' in directory '128' already 
> present!
> [    3.872078] debugfs: File 'v3d_regs' in directory '0' already present!
> [    3.872087] debugfs: File 'v3d_regs' in directory '128' already 
> present!
> [    3.872097] debugfs: File 'measure_clock' in directory '0' already 
> present!
> [    3.872105] debugfs: File 'measure_clock' in directory '128' 
> already present!
> [    3.872116] debugfs: File 'bo_stats' in directory '0' already present!
> [    3.872124] debugfs: File 'bo_stats' in directory '128' already 
> present!
>
> It looks like the render node is being added twice, since this doesn't 
> happen
> for vc4 and vkms.

Thanks for the feedback, and yes, that's exactly what I meant when I said 
that I haven't looked into all code paths.

Could it be that v3d registers its debugfs files from the debugfs_init 
callback?

One alternative would be to just completely nuke support for separate 
render node debugfs files and only add a symlink to the primary node. 
Opinions?
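
To make the symlink idea a bit more concrete, here is a minimal userspace
sketch, not kernel code. It assumes a hypothetical model_dir standing in for
the primary minor's directory ('0') and a hypothetical model_link standing in
for a render node entry ('128') that only points at it. None of these names
exist in DRM or debugfs; in the kernel the link itself would presumably be
created with debugfs_create_symlink().

```c
/*
 * Userspace model only: model_dir, model_link and the model_* helpers are
 * hypothetical names for illustration, not DRM or debugfs API.
 */
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_MAX_FILES 16

/* Stand-in for the debugfs directory of the primary minor, e.g. "0". */
struct model_dir {
	const char *name;
	const char *files[MODEL_MAX_FILES];
	size_t nfiles;
};

/* Stand-in for a symlink such as "128" pointing at the primary directory. */
struct model_link {
	const char *name;
	struct model_dir *target;
};

/* Return 1 if a file with this name is already in the directory, else 0. */
int model_dir_has(const struct model_dir *dir, const char *file)
{
	size_t i;

	assert(dir && file);
	for (i = 0; i < dir->nfiles; i++)
		if (strcmp(dir->files[i], file) == 0)
			return 1;
	return 0;
}

/*
 * Register a file exactly once. Returns 0 on success and -1 when the name
 * is already present (the case debugfs complains about) or the table is full.
 */
int model_dir_add(struct model_dir *dir, const char *file)
{
	assert(dir && file);
	if (model_dir_has(dir, file) || dir->nfiles == MODEL_MAX_FILES)
		return -1;
	dir->files[dir->nfiles++] = file;
	return 0;
}

/* A lookup through the link simply resolves to the target directory. */
int model_link_has(const struct model_link *link, const char *file)
{
	assert(link && link->target);
	return model_dir_has(link->target, file);
}
```

In this model a file is registered a single time in the primary directory
and is still visible under the render node name, so there is no second
registration pass that could collide with the first one.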

Regards,
Christian.

>
> Otherwise, the patchset looks good to me, but maybe Daniel has some other
> thoughts about it.
>
> Best Regards,
> - Maíra Canal
>
>>
>> Please comment/discuss.
>>
>> Cheers,
>> Christian.
>>
>>



* Re: Try to address the drm_debugfs issues
  2023-02-09 12:13   ` Christian König
@ 2023-02-09 13:06     ` Maíra Canal
  2023-02-09 14:06       ` Christian König
  2023-02-13 18:16       ` Stanislaw Gruszka
  0 siblings, 2 replies; 50+ messages in thread
From: Maíra Canal @ 2023-02-09 13:06 UTC (permalink / raw)
  To: Christian König, daniel.vetter, wambui.karugax, maxime,
	mwen, mairacanal
  Cc: dri-devel

On 2/9/23 09:13, Christian König wrote:
> Am 09.02.23 um 12:23 schrieb Maíra Canal:
>> On 2/9/23 05:18, Christian König wrote:
>>> Hello everyone,
>>>
>>> the drm_debugfs has a couple of well known design problems.
>>>
>>> Especially it wasn't possible to add files between initializing and registering
>>> of DRM devices since the underlying debugfs directory wasn't created yet.
>>>
>>> The resulting necessity of the driver->debugfs_init() callback function is a
>>> mid-layering which is really frowned on since it creates a horrible
>>> driver->DRM->driver design layering.
>>>
>>> The recent patch "drm/debugfs: create device-centered debugfs functions" tried
>>> to address those problem, but doesn't seem to work correctly. This looks like
>>> a misunderstanding of the call flow around drm_debugfs_init(), which is called
>>> multiple times, once for the primary and once for the render node.
>>>
>>> So what happens now is the following:
>>>
>>> 1. drm_dev_init() initially allocates the drm_minor objects.
>>> 2. ... back to the driver ...
>>> 3. drm_dev_register() is called.
>>>
>>> 4. drm_debugfs_init() is called for the primary node.
>>> 5. drm_framebuffer_debugfs_init(), drm_client_debugfs_init() and
>>>     drm_atomic_debugfs_init() call drm_debugfs_add_file(s)() to add the files
>>>     for the primary node.
>>> 6. The driver->debugfs_init() callback is called to add debugfs files for the
>>>     primary node.
>>> 7. The added files are consumed and added to the primary node debugfs directory.
>>>
>>> 8. drm_debugfs_init() is called for the render node.
>>> 9. drm_framebuffer_debugfs_init(), drm_client_debugfs_init() and
>>>     drm_atomic_debugfs_init() call drm_debugfs_add_file(s)() to add the files
>>>     again for the render node.
>>> 10. The driver->debugfs_init() callback is called to add debugfs files for the
>>>      render node.
>>> 11. The added files are consumed and added to the render node debugfs directory.
>>>
>>> 12. Some more files are added through drm_debugfs_add_file().
>>> 13. drm_debugfs_late_register() add the files once more to the primary node
>>>      debugfs directory.
>>> 14. From this point on files added through drm_debugfs_add_file() are simply ignored.
>>> 15. ... back to the driver ...
>>>
>>> Because of this the dev->debugfs_mutex lock is also completely pointless since
>>> any concurrent use of the interface would just randomly either add the files to
>>> the primary or render node or just not at all.
>>>
>>> Even worse is that this implementation nails the coffin for removing the
>>> driver->debugfs_init() mid-layering because otherwise drivers can't control
>>> where their debugfs (primary/render node) are actually added.
>>>
>>> This patch set here now tries to clean this up a bit, but most likely isn't
>>> fully complete either since I didn't audit every driver/call path.
>>
>> I tested the patchset on the v3d, vc4 and vkms and all the files are generated
>> as expected, but I'm getting the following errors on dmesg:
>>
>> [    3.872026] debugfs: File 'v3d_ident' in directory '0' already present!
>> [    3.872064] debugfs: File 'v3d_ident' in directory '128' already present!
>> [    3.872078] debugfs: File 'v3d_regs' in directory '0' already present!
>> [    3.872087] debugfs: File 'v3d_regs' in directory '128' already present!
>> [    3.872097] debugfs: File 'measure_clock' in directory '0' already present!
>> [    3.872105] debugfs: File 'measure_clock' in directory '128' already present!
>> [    3.872116] debugfs: File 'bo_stats' in directory '0' already present!
>> [    3.872124] debugfs: File 'bo_stats' in directory '128' already present!
>>
>> It looks like the render node is being added twice, since this doesn't happen
>> for vc4 and vkms.
> 
> Thanks for the feedback and yes that's exactly what I meant with that I haven't looked into all code paths.
> 
> Could it be that v3d registers it's debugfs files from the debugfs_init callback?

Although this is true, I'm not sure this is the reason why the files are
being registered twice, as this doesn't happen with vc4, which also uses the
debugfs_init callback. I believe it is somehow related to the fact that
v3d provides both the primary node and the render node.

Best Regards,
- Maíra Canal

> 
> One alternative would be to just completely nuke support for separate render node debugfs files and only add a symlink to the primary node. Opinions?
> 
> Regards,
> Christian.
> 
>>
>> Otherwise, the patchset looks good to me, but maybe Daniel has some other
>> thoughts about it.
>>
>> Best Regards,
>> - Maíra Canal
>>
>>>
>>> Please comment/discuss.
>>>
>>> Cheers,
>>> Christian.
>>>
>>>
> 


* Re: Try to address the drm_debugfs issues
  2023-02-09 13:06     ` Maíra Canal
@ 2023-02-09 14:06       ` Christian König
  2023-02-09 14:19         ` Maxime Ripard
  2023-02-16 11:34         ` Daniel Vetter
  2023-02-13 18:16       ` Stanislaw Gruszka
  1 sibling, 2 replies; 50+ messages in thread
From: Christian König @ 2023-02-09 14:06 UTC (permalink / raw)
  To: Maíra Canal, daniel.vetter, wambui.karugax, maxime, mwen,
	mairacanal
  Cc: dri-devel

Am 09.02.23 um 14:06 schrieb Maíra Canal:
> On 2/9/23 09:13, Christian König wrote:
>> Am 09.02.23 um 12:23 schrieb Maíra Canal:
>>> On 2/9/23 05:18, Christian König wrote:
>>>> Hello everyone,
>>>>
>>>> the drm_debugfs has a couple of well known design problems.
>>>>
>>>> Especially it wasn't possible to add files between initializing and 
>>>> registering
>>>> of DRM devices since the underlying debugfs directory wasn't 
>>>> created yet.
>>>>
>>>> The resulting necessity of the driver->debugfs_init() callback 
>>>> function is a
>>>> mid-layering which is really frowned on since it creates a horrible
>>>> driver->DRM->driver design layering.
>>>>
>>>> The recent patch "drm/debugfs: create device-centered debugfs 
>>>> functions" tried
>>>> to address those problem, but doesn't seem to work correctly. This 
>>>> looks like
>>>> a misunderstanding of the call flow around drm_debugfs_init(), 
>>>> which is called
>>>> multiple times, once for the primary and once for the render node.
>>>>
>>>> So what happens now is the following:
>>>>
>>>> 1. drm_dev_init() initially allocates the drm_minor objects.
>>>> 2. ... back to the driver ...
>>>> 3. drm_dev_register() is called.
>>>>
>>>> 4. drm_debugfs_init() is called for the primary node.
>>>> 5. drm_framebuffer_debugfs_init(), drm_client_debugfs_init() and
>>>>     drm_atomic_debugfs_init() call drm_debugfs_add_file(s)() to add 
>>>> the files
>>>>     for the primary node.
>>>> 6. The driver->debugfs_init() callback is called to add debugfs 
>>>> files for the
>>>>     primary node.
>>>> 7. The added files are consumed and added to the primary node 
>>>> debugfs directory.
>>>>
>>>> 8. drm_debugfs_init() is called for the render node.
>>>> 9. drm_framebuffer_debugfs_init(), drm_client_debugfs_init() and
>>>>     drm_atomic_debugfs_init() call drm_debugfs_add_file(s)() to add 
>>>> the files
>>>>     again for the render node.
>>>> 10. The driver->debugfs_init() callback is called to add debugfs 
>>>> files for the
>>>>      render node.
>>>> 11. The added files are consumed and added to the render node 
>>>> debugfs directory.
>>>>
>>>> 12. Some more files are added through drm_debugfs_add_file().
>>>> 13. drm_debugfs_late_register() add the files once more to the 
>>>> primary node
>>>>      debugfs directory.
>>>> 14. From this point on files added through drm_debugfs_add_file() 
>>>> are simply ignored.
>>>> 15. ... back to the driver ...
>>>>
>>>> Because of this the dev->debugfs_mutex lock is also completely 
>>>> pointless since
>>>> any concurrent use of the interface would just randomly either add 
>>>> the files to
>>>> the primary or render node or just not at all.
>>>>
>>>> Even worse is that this implementation nails the coffin for 
>>>> removing the
>>>> driver->debugfs_init() mid-layering because otherwise drivers can't 
>>>> control
>>>> where their debugfs (primary/render node) are actually added.
>>>>
>>>> This patch set here now tries to clean this up a bit, but most 
>>>> likely isn't
>>>> fully complete either since I didn't audit every driver/call path.
>>>
>>> I tested the patchset on the v3d, vc4 and vkms and all the files are 
>>> generated
>>> as expected, but I'm getting the following errors on dmesg:
>>>
>>> [    3.872026] debugfs: File 'v3d_ident' in directory '0' already 
>>> present!
>>> [    3.872064] debugfs: File 'v3d_ident' in directory '128' already 
>>> present!
>>> [    3.872078] debugfs: File 'v3d_regs' in directory '0' already 
>>> present!
>>> [    3.872087] debugfs: File 'v3d_regs' in directory '128' already 
>>> present!
>>> [    3.872097] debugfs: File 'measure_clock' in directory '0' 
>>> already present!
>>> [    3.872105] debugfs: File 'measure_clock' in directory '128' 
>>> already present!
>>> [    3.872116] debugfs: File 'bo_stats' in directory '0' already 
>>> present!
>>> [    3.872124] debugfs: File 'bo_stats' in directory '128' already 
>>> present!
>>>
>>> It looks like the render node is being added twice, since this 
>>> doesn't happen
>>> for vc4 and vkms.
>>
>> Thanks for the feedback and yes that's exactly what I meant with that 
>> I haven't looked into all code paths.
>>
>> Could it be that v3d registers it's debugfs files from the 
>> debugfs_init callback?
>
> Although this is true, I'm not sure if this is the reason why the 
> files are
> being registered twice, as this doesn't happen to vc4, and it also 
> uses the
> debugfs_init callback. I believe it is somewhat related to the fact that
> v3d is the primary node and the render node.

I see. Thanks for the hint.

>
> Best Regards,
> - Maíra Canal
>
>>
>> One alternative would be to just completely nuke support for separate 
>> render node debugfs files and only add a symlink to the primary node. 
>> Opinions?

What do you think of this approach? I can't come up with any reason why 
we should have separate debugfs files for render nodes, and I think that 
is pretty much the same reasoning that led you to the patch for per-device 
debugfs files instead of per-minor ones.

Regards,
Christian.

>>
>> Regards,
>> Christian.
>>
>>>
>>> Otherwise, the patchset looks good to me, but maybe Daniel has some 
>>> other
>>> thoughts about it.
>>>
>>> Best Regards,
>>> - Maíra Canal
>>>
>>>>
>>>> Please comment/discuss.
>>>>
>>>> Cheers,
>>>> Christian.
>>>>
>>>>
>>



* Re: Try to address the drm_debugfs issues
  2023-02-09 14:06       ` Christian König
@ 2023-02-09 14:19         ` Maxime Ripard
  2023-02-09 15:52           ` Christian König
  2023-02-16 11:34         ` Daniel Vetter
  1 sibling, 1 reply; 50+ messages in thread
From: Maxime Ripard @ 2023-02-09 14:19 UTC (permalink / raw)
  To: Christian König
  Cc: daniel.vetter, Maíra Canal, dri-devel, mwen, mairacanal,
	wambui.karugax


On Thu, Feb 09, 2023 at 03:06:10PM +0100, Christian König wrote:
> Am 09.02.23 um 14:06 schrieb Maíra Canal:
> > On 2/9/23 09:13, Christian König wrote:
> > > Am 09.02.23 um 12:23 schrieb Maíra Canal:
> > > > On 2/9/23 05:18, Christian König wrote:
> > > > > Hello everyone,
> > > > > 
> > > > > the drm_debugfs has a couple of well known design problems.
> > > > > 
> > > > > Especially it wasn't possible to add files between
> > > > > initializing and registering
> > > > > of DRM devices since the underlying debugfs directory wasn't
> > > > > created yet.
> > > > > 
> > > > > The resulting necessity of the driver->debugfs_init()
> > > > > callback function is a
> > > > > mid-layering which is really frowned on since it creates a horrible
> > > > > driver->DRM->driver design layering.
> > > > > 
> > > > > The recent patch "drm/debugfs: create device-centered
> > > > > debugfs functions" tried
> > > > > to address those problem, but doesn't seem to work
> > > > > correctly. This looks like
> > > > > a misunderstanding of the call flow around
> > > > > drm_debugfs_init(), which is called
> > > > > multiple times, once for the primary and once for the render node.
> > > > > 
> > > > > So what happens now is the following:
> > > > > 
> > > > > 1. drm_dev_init() initially allocates the drm_minor objects.
> > > > > 2. ... back to the driver ...
> > > > > 3. drm_dev_register() is called.
> > > > > 
> > > > > 4. drm_debugfs_init() is called for the primary node.
> > > > > 5. drm_framebuffer_debugfs_init(), drm_client_debugfs_init() and
> > > > >     drm_atomic_debugfs_init() call drm_debugfs_add_file(s)()
> > > > > to add the files
> > > > >     for the primary node.
> > > > > 6. The driver->debugfs_init() callback is called to add
> > > > > debugfs files for the
> > > > >     primary node.
> > > > > 7. The added files are consumed and added to the primary
> > > > > node debugfs directory.
> > > > > 
> > > > > 8. drm_debugfs_init() is called for the render node.
> > > > > 9. drm_framebuffer_debugfs_init(), drm_client_debugfs_init() and
> > > > >     drm_atomic_debugfs_init() call drm_debugfs_add_file(s)()
> > > > > to add the files
> > > > >     again for the render node.
> > > > > 10. The driver->debugfs_init() callback is called to add
> > > > > debugfs files for the
> > > > >      render node.
> > > > > 11. The added files are consumed and added to the render
> > > > > node debugfs directory.
> > > > > 
> > > > > 12. Some more files are added through drm_debugfs_add_file().
> > > > > 13. drm_debugfs_late_register() add the files once more to
> > > > > the primary node
> > > > >      debugfs directory.
> > > > > 14. From this point on files added through
> > > > > drm_debugfs_add_file() are simply ignored.
> > > > > 15. ... back to the driver ...
> > > > > 
> > > > > Because of this the dev->debugfs_mutex lock is also
> > > > > completely pointless since
> > > > > any concurrent use of the interface would just randomly
> > > > > either add the files to
> > > > > the primary or render node or just not at all.
> > > > > 
> > > > > Even worse is that this implementation nails the coffin for
> > > > > removing the
> > > > > driver->debugfs_init() mid-layering because otherwise
> > > > > drivers can't control
> > > > > where their debugfs (primary/render node) are actually added.
> > > > > 
> > > > > This patch set here now tries to clean this up a bit, but
> > > > > most likely isn't
> > > > > fully complete either since I didn't audit every driver/call path.
> > > > 
> > > > I tested the patchset on the v3d, vc4 and vkms and all the files
> > > > are generated
> > > > as expected, but I'm getting the following errors on dmesg:
> > > > 
> > > > [    3.872026] debugfs: File 'v3d_ident' in directory '0'
> > > > already present!
> > > > [    3.872064] debugfs: File 'v3d_ident' in directory '128'
> > > > already present!
> > > > [    3.872078] debugfs: File 'v3d_regs' in directory '0' already
> > > > present!
> > > > [    3.872087] debugfs: File 'v3d_regs' in directory '128'
> > > > already present!
> > > > [    3.872097] debugfs: File 'measure_clock' in directory '0'
> > > > already present!
> > > > [    3.872105] debugfs: File 'measure_clock' in directory '128'
> > > > already present!
> > > > [    3.872116] debugfs: File 'bo_stats' in directory '0' already
> > > > present!
> > > > [    3.872124] debugfs: File 'bo_stats' in directory '128'
> > > > already present!
> > > > 
> > > > It looks like the render node is being added twice, since this
> > > > doesn't happen
> > > > for vc4 and vkms.
> > > 
> > > Thanks for the feedback and yes that's exactly what I meant with
> > > that I haven't looked into all code paths.
> > > 
> > > Could it be that v3d registers it's debugfs files from the
> > > debugfs_init callback?
> > 
> > Although this is true, I'm not sure if this is the reason why the files
> > are
> > being registered twice, as this doesn't happen to vc4, and it also uses
> > the
> > debugfs_init callback. I believe it is somewhat related to the fact that
> > v3d is the primary node and the render node.
> 
> I see. Thanks for the hint.
> 
> > 
> > Best Regards,
> > - Maíra Canal
> > 
> > > 
> > > One alternative would be to just completely nuke support for
> > > separate render node debugfs files and only add a symlink to the
> > > primary node. Opinions?
> 
> What do you think of this approach? I can't come up with any reason why we
> should have separate debugfs files for render nodes and I think it is pretty
> much the same reason you came up with the patch for per device debugfs files
> instead of per minor.

They are two entirely separate devices and drivers, it doesn't make much
sense to move their debugfs files to one or the other.

Maxime



* Re: Try to address the drm_debugfs issues
  2023-02-09 14:19         ` Maxime Ripard
@ 2023-02-09 15:52           ` Christian König
  2023-02-09 18:48             ` Maxime Ripard
  0 siblings, 1 reply; 50+ messages in thread
From: Christian König @ 2023-02-09 15:52 UTC (permalink / raw)
  To: Maxime Ripard
  Cc: daniel.vetter, Maíra Canal, dri-devel, mwen, mairacanal,
	wambui.karugax

Am 09.02.23 um 15:19 schrieb Maxime Ripard:
> On Thu, Feb 09, 2023 at 03:06:10PM +0100, Christian König wrote:
>> Am 09.02.23 um 14:06 schrieb Maíra Canal:
>>> On 2/9/23 09:13, Christian König wrote:
>>>> Am 09.02.23 um 12:23 schrieb Maíra Canal:
>>>>> On 2/9/23 05:18, Christian König wrote:
>>>>>> Hello everyone,
>>>>>>
>>>>>> the drm_debugfs has a couple of well known design problems.
>>>>>>
>>>>>> Especially it wasn't possible to add files between
>>>>>> initializing and registering
>>>>>> of DRM devices since the underlying debugfs directory wasn't
>>>>>> created yet.
>>>>>>
>>>>>> The resulting necessity of the driver->debugfs_init()
>>>>>> callback function is a
>>>>>> mid-layering which is really frowned on since it creates a horrible
>>>>>> driver->DRM->driver design layering.
>>>>>>
>>>>>> The recent patch "drm/debugfs: create device-centered
>>>>>> debugfs functions" tried
>>>>>> to address those problem, but doesn't seem to work
>>>>>> correctly. This looks like
>>>>>> a misunderstanding of the call flow around
>>>>>> drm_debugfs_init(), which is called
>>>>>> multiple times, once for the primary and once for the render node.
>>>>>>
>>>>>> So what happens now is the following:
>>>>>>
>>>>>> 1. drm_dev_init() initially allocates the drm_minor objects.
>>>>>> 2. ... back to the driver ...
>>>>>> 3. drm_dev_register() is called.
>>>>>>
>>>>>> 4. drm_debugfs_init() is called for the primary node.
>>>>>> 5. drm_framebuffer_debugfs_init(), drm_client_debugfs_init() and
>>>>>>      drm_atomic_debugfs_init() call drm_debugfs_add_file(s)()
>>>>>> to add the files
>>>>>>      for the primary node.
>>>>>> 6. The driver->debugfs_init() callback is called to add
>>>>>> debugfs files for the
>>>>>>      primary node.
>>>>>> 7. The added files are consumed and added to the primary
>>>>>> node debugfs directory.
>>>>>>
>>>>>> 8. drm_debugfs_init() is called for the render node.
>>>>>> 9. drm_framebuffer_debugfs_init(), drm_client_debugfs_init() and
>>>>>>      drm_atomic_debugfs_init() call drm_debugfs_add_file(s)()
>>>>>> to add the files
>>>>>>      again for the render node.
>>>>>> 10. The driver->debugfs_init() callback is called to add
>>>>>> debugfs files for the
>>>>>>       render node.
>>>>>> 11. The added files are consumed and added to the render
>>>>>> node debugfs directory.
>>>>>>
>>>>>> 12. Some more files are added through drm_debugfs_add_file().
>>>>>> 13. drm_debugfs_late_register() add the files once more to
>>>>>> the primary node
>>>>>>       debugfs directory.
>>>>>> 14. From this point on files added through
>>>>>> drm_debugfs_add_file() are simply ignored.
>>>>>> 15. ... back to the driver ...
>>>>>>
>>>>>> Because of this the dev->debugfs_mutex lock is also
>>>>>> completely pointless since
>>>>>> any concurrent use of the interface would just randomly
>>>>>> either add the files to
>>>>>> the primary or render node or just not at all.
>>>>>>
>>>>>> Even worse is that this implementation nails the coffin for
>>>>>> removing the
>>>>>> driver->debugfs_init() mid-layering because otherwise
>>>>>> drivers can't control
>>>>>> where their debugfs (primary/render node) are actually added.
>>>>>>
>>>>>> This patch set here now tries to clean this up a bit, but
>>>>>> most likely isn't
>>>>>> fully complete either since I didn't audit every driver/call path.
>>>>> I tested the patchset on the v3d, vc4 and vkms and all the files
>>>>> are generated
>>>>> as expected, but I'm getting the following errors on dmesg:
>>>>>
>>>>> [    3.872026] debugfs: File 'v3d_ident' in directory '0'
>>>>> already present!
>>>>> [    3.872064] debugfs: File 'v3d_ident' in directory '128'
>>>>> already present!
>>>>> [    3.872078] debugfs: File 'v3d_regs' in directory '0' already
>>>>> present!
>>>>> [    3.872087] debugfs: File 'v3d_regs' in directory '128'
>>>>> already present!
>>>>> [    3.872097] debugfs: File 'measure_clock' in directory '0'
>>>>> already present!
>>>>> [    3.872105] debugfs: File 'measure_clock' in directory '128'
>>>>> already present!
>>>>> [    3.872116] debugfs: File 'bo_stats' in directory '0' already
>>>>> present!
>>>>> [    3.872124] debugfs: File 'bo_stats' in directory '128'
>>>>> already present!
>>>>>
>>>>> It looks like the render node is being added twice, since this
>>>>> doesn't happen
>>>>> for vc4 and vkms.
>>>> Thanks for the feedback and yes that's exactly what I meant with
>>>> that I haven't looked into all code paths.
>>>>
>>>> Could it be that v3d registers it's debugfs files from the
>>>> debugfs_init callback?
>>> Although this is true, I'm not sure if this is the reason why the files
>>> are
>>> being registered twice, as this doesn't happen to vc4, and it also uses
>>> the
>>> debugfs_init callback. I believe it is somewhat related to the fact that
>>> v3d is the primary node and the render node.
>> I see. Thanks for the hint.
>>
>>> Best Regards,
>>> - Maíra Canal
>>>
>>>> One alternative would be to just completely nuke support for
>>>> separate render node debugfs files and only add a symlink to the
>>>> primary node. Opinions?
>> What do you think of this approach? I can't come up with any reason why we
>> should have separate debugfs files for render nodes and I think it is pretty
>> much the same reason you came up with the patch for per device debugfs files
>> instead of per minor.
> They are two entirely separate devices and drivers, it doesn't make much
> sense to move their debugfs files to one or the other.

Well, that is exactly what isn't true. The primary and render nodes are 
just two files under /dev for the same hardware device and driver.

We just offer different functionality through the two interfaces, but 
essentially there isn't any information we could expose for one which 
isn't true for the other as well.
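
A rough userspace sketch of that relationship, assuming hypothetical
model_device and model_minor types (loosely modelled on a minor pointing
back at its device, not the real DRM structures). The indices 0 and 128 are
taken from the directory names in the dmesg output quoted above.

```c
/*
 * Userspace model only: model_device, model_minor and the model_* helpers
 * are hypothetical names for illustration, not the real DRM structures.
 */
#include <assert.h>
#include <stddef.h>

enum model_minor_type {
	MODEL_MINOR_PRIMARY,
	MODEL_MINOR_RENDER,
};

struct model_device;

/* One node under /dev; it only carries an index and a back-pointer. */
struct model_minor {
	enum model_minor_type type;
	int index;
	struct model_device *dev;
};

/* The single hardware device that both nodes belong to. */
struct model_device {
	const char *driver_name;
	struct model_minor primary;
	struct model_minor render;
};

/* Set up one device with a primary minor (0) and a render minor (128). */
void model_device_init(struct model_device *dev, const char *driver_name)
{
	assert(dev && driver_name);
	dev->driver_name = driver_name;
	dev->primary = (struct model_minor){
		.type = MODEL_MINOR_PRIMARY, .index = 0, .dev = dev,
	};
	dev->render = (struct model_minor){
		.type = MODEL_MINOR_RENDER, .index = 128, .dev = dev,
	};
}

/* Return the minor with the given index, or NULL if there is none. */
struct model_minor *model_minor_lookup(struct model_device *dev, int index)
{
	assert(dev);
	if (index == dev->primary.index)
		return &dev->primary;
	if (index == dev->render.index)
		return &dev->render;
	return NULL;
}
```

Both lookups end up at the same model_device, so in this model there is no
state that belongs to one node but not to the other.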

Christian.

>
> Maxime



* Re: Try to address the drm_debugfs issues
  2023-02-09 15:52           ` Christian König
@ 2023-02-09 18:48             ` Maxime Ripard
  2023-02-10 12:07               ` Christian König
  0 siblings, 1 reply; 50+ messages in thread
From: Maxime Ripard @ 2023-02-09 18:48 UTC (permalink / raw)
  To: Christian König
  Cc: daniel.vetter, Maíra Canal, dri-devel, mwen, mairacanal,
	wambui.karugax

On Thu, Feb 09, 2023 at 04:52:54PM +0100, Christian König wrote:
> Am 09.02.23 um 15:19 schrieb Maxime Ripard:
> > On Thu, Feb 09, 2023 at 03:06:10PM +0100, Christian König wrote:
> > > Am 09.02.23 um 14:06 schrieb Maíra Canal:
> > > > On 2/9/23 09:13, Christian König wrote:
> > > > > Am 09.02.23 um 12:23 schrieb Maíra Canal:
> > > > > > On 2/9/23 05:18, Christian König wrote:
> > > > > > > Hello everyone,
> > > > > > > 
> > > > > > > the drm_debugfs has a couple of well known design problems.
> > > > > > > 
> > > > > > > Especially it wasn't possible to add files between
> > > > > > > initializing and registering
> > > > > > > of DRM devices since the underlying debugfs directory wasn't
> > > > > > > created yet.
> > > > > > > 
> > > > > > > The resulting necessity of the driver->debugfs_init()
> > > > > > > callback function is a
> > > > > > > mid-layering which is really frowned on since it creates a horrible
> > > > > > > driver->DRM->driver design layering.
> > > > > > > 
> > > > > > > The recent patch "drm/debugfs: create device-centered
> > > > > > > debugfs functions" tried
> > > > > > > to address those problem, but doesn't seem to work
> > > > > > > correctly. This looks like
> > > > > > > a misunderstanding of the call flow around
> > > > > > > drm_debugfs_init(), which is called
> > > > > > > multiple times, once for the primary and once for the render node.
> > > > > > > 
> > > > > > > So what happens now is the following:
> > > > > > > 
> > > > > > > 1. drm_dev_init() initially allocates the drm_minor objects.
> > > > > > > 2. ... back to the driver ...
> > > > > > > 3. drm_dev_register() is called.
> > > > > > > 
> > > > > > > 4. drm_debugfs_init() is called for the primary node.
> > > > > > > 5. drm_framebuffer_debugfs_init(), drm_client_debugfs_init() and
> > > > > > >      drm_atomic_debugfs_init() call drm_debugfs_add_file(s)()
> > > > > > > to add the files
> > > > > > >      for the primary node.
> > > > > > > 6. The driver->debugfs_init() callback is called to add
> > > > > > > debugfs files for the
> > > > > > >      primary node.
> > > > > > > 7. The added files are consumed and added to the primary
> > > > > > > node debugfs directory.
> > > > > > > 
> > > > > > > 8. drm_debugfs_init() is called for the render node.
> > > > > > > 9. drm_framebuffer_debugfs_init(), drm_client_debugfs_init() and
> > > > > > >      drm_atomic_debugfs_init() call drm_debugfs_add_file(s)()
> > > > > > > to add the files
> > > > > > >      again for the render node.
> > > > > > > 10. The driver->debugfs_init() callback is called to add
> > > > > > > debugfs files for the
> > > > > > >       render node.
> > > > > > > 11. The added files are consumed and added to the render
> > > > > > > node debugfs directory.
> > > > > > > 
> > > > > > > 12. Some more files are added through drm_debugfs_add_file().
> > > > > > > 13. drm_debugfs_late_register() add the files once more to
> > > > > > > the primary node
> > > > > > >       debugfs directory.
> > > > > > > 14. From this point on files added through
> > > > > > > drm_debugfs_add_file() are simply ignored.
> > > > > > > 15. ... back to the driver ...
> > > > > > > 
> > > > > > > Because of this the dev->debugfs_mutex lock is also
> > > > > > > completely pointless since
> > > > > > > any concurrent use of the interface would just randomly
> > > > > > > either add the files to
> > > > > > > the primary or render node or just not at all.
> > > > > > > 
> > > > > > > Even worse is that this implementation nails the coffin for
> > > > > > > removing the
> > > > > > > driver->debugfs_init() mid-layering because otherwise
> > > > > > > drivers can't control
> > > > > > > where their debugfs (primary/render node) are actually added.
> > > > > > > 
> > > > > > > This patch set here now tries to clean this up a bit, but
> > > > > > > most likely isn't
> > > > > > > fully complete either since I didn't audit every driver/call path.
> > > > > > I tested the patchset on the v3d, vc4 and vkms and all the files
> > > > > > are generated
> > > > > > as expected, but I'm getting the following errors on dmesg:
> > > > > > 
> > > > > > [    3.872026] debugfs: File 'v3d_ident' in directory '0'
> > > > > > already present!
> > > > > > [    3.872064] debugfs: File 'v3d_ident' in directory '128'
> > > > > > already present!
> > > > > > [    3.872078] debugfs: File 'v3d_regs' in directory '0' already
> > > > > > present!
> > > > > > [    3.872087] debugfs: File 'v3d_regs' in directory '128'
> > > > > > already present!
> > > > > > [    3.872097] debugfs: File 'measure_clock' in directory '0'
> > > > > > already present!
> > > > > > [    3.872105] debugfs: File 'measure_clock' in directory '128'
> > > > > > already present!
> > > > > > [    3.872116] debugfs: File 'bo_stats' in directory '0' already
> > > > > > present!
> > > > > > [    3.872124] debugfs: File 'bo_stats' in directory '128'
> > > > > > already present!
> > > > > > 
> > > > > > It looks like the render node is being added twice, since this
> > > > > > doesn't happen
> > > > > > for vc4 and vkms.
> > > > > Thanks for the feedback and yes that's exactly what I meant with
> > > > > that I haven't looked into all code paths.
> > > > > 
> > > > > Could it be that v3d registers it's debugfs files from the
> > > > > debugfs_init callback?
> > > > Although this is true, I'm not sure if this is the reason why the files
> > > > are
> > > > being registered twice, as this doesn't happen to vc4, and it also uses
> > > > the
> > > > debugfs_init callback. I believe it is somewhat related to the fact that
> > > > v3d is the primary node and the render node.
> > > I see. Thanks for the hint.
> > > 
> > > > Best Regards,
> > > > - Maíra Canal
> > > > 
> > > > > One alternative would be to just completely nuke support for
> > > > > separate render node debugfs files and only add a symlink to the
> > > > > primary node. Opinions?
> > > What do you think of this approach? I can't come up with any reason why we
> > > should have separate debugfs files for render nodes and I think it is pretty
> > > much the same reason you came up with the patch for per device debugfs files
> > > instead of per minor.
> > They are two entirely separate devices and drivers, it doesn't make much
> > sense to move their debugfs files to one or the other.
> 
> Well exactly that isn't true.

Ok.

> The primary and render node are just two file under /dev for the same
> hardware device and driver.
> 
> We just offer different functionality through the two interfaces, but
> essentially there isn't any information we could expose for one which
> isn't true for the other as well.

I'd like to know what criteria you're using to say that they are the
same hardware device then, because they don't share the same MMIO
mappings, interrupts, clocks, IOMMUs, power domains, etc. They can also
operate independently.

So unless that criterion is that they share the RAM, they cannot be
considered the same hardware device.

Maxime

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: Try to address the drm_debugfs issues
  2023-02-09 18:48             ` Maxime Ripard
@ 2023-02-10 12:07               ` Christian König
  2023-02-10 12:18                 ` Maxime Ripard
  0 siblings, 1 reply; 50+ messages in thread
From: Christian König @ 2023-02-10 12:07 UTC (permalink / raw)
  To: Maxime Ripard
  Cc: daniel.vetter, Maíra Canal, dri-devel, mwen, mairacanal,
	wambui.karugax

Am 09.02.23 um 19:48 schrieb Maxime Ripard:
> On Thu, Feb 09, 2023 at 04:52:54PM +0100, Christian König wrote:
>> Am 09.02.23 um 15:19 schrieb Maxime Ripard:
>>> On Thu, Feb 09, 2023 at 03:06:10PM +0100, Christian König wrote:
>>>> Am 09.02.23 um 14:06 schrieb Maíra Canal:
>>>>> On 2/9/23 09:13, Christian König wrote:
>>>>>> Am 09.02.23 um 12:23 schrieb Maíra Canal:
>>>>>>> On 2/9/23 05:18, Christian König wrote:
>>>>>>>> Hello everyone,
>>>>>>>>
>>>>>>>> the drm_debugfs has a couple of well known design problems.
>>>>>>>>
>>>>>>>> Especially it wasn't possible to add files between
>>>>>>>> initializing and registering
>>>>>>>> of DRM devices since the underlying debugfs directory wasn't
>>>>>>>> created yet.
>>>>>>>>
>>>>>>>> The resulting necessity of the driver->debugfs_init()
>>>>>>>> callback function is a
>>>>>>>> mid-layering which is really frowned on since it creates a horrible
>>>>>>>> driver->DRM->driver design layering.
>>>>>>>>
>>>>>>>> The recent patch "drm/debugfs: create device-centered
>>>>>>>> debugfs functions" tried
>>>>>>>> to address those problem, but doesn't seem to work
>>>>>>>> correctly. This looks like
>>>>>>>> a misunderstanding of the call flow around
>>>>>>>> drm_debugfs_init(), which is called
>>>>>>>> multiple times, once for the primary and once for the render node.
>>>>>>>>
>>>>>>>> So what happens now is the following:
>>>>>>>>
>>>>>>>> 1. drm_dev_init() initially allocates the drm_minor objects.
>>>>>>>> 2. ... back to the driver ...
>>>>>>>> 3. drm_dev_register() is called.
>>>>>>>>
>>>>>>>> 4. drm_debugfs_init() is called for the primary node.
>>>>>>>> 5. drm_framebuffer_debugfs_init(), drm_client_debugfs_init() and
>>>>>>>>       drm_atomic_debugfs_init() call drm_debugfs_add_file(s)()
>>>>>>>> to add the files
>>>>>>>>       for the primary node.
>>>>>>>> 6. The driver->debugfs_init() callback is called to add
>>>>>>>> debugfs files for the
>>>>>>>>       primary node.
>>>>>>>> 7. The added files are consumed and added to the primary
>>>>>>>> node debugfs directory.
>>>>>>>>
>>>>>>>> 8. drm_debugfs_init() is called for the render node.
>>>>>>>> 9. drm_framebuffer_debugfs_init(), drm_client_debugfs_init() and
>>>>>>>>       drm_atomic_debugfs_init() call drm_debugfs_add_file(s)()
>>>>>>>> to add the files
>>>>>>>>       again for the render node.
>>>>>>>> 10. The driver->debugfs_init() callback is called to add
>>>>>>>> debugfs files for the
>>>>>>>>        render node.
>>>>>>>> 11. The added files are consumed and added to the render
>>>>>>>> node debugfs directory.
>>>>>>>>
>>>>>>>> 12. Some more files are added through drm_debugfs_add_file().
>>>>>>>> 13. drm_debugfs_late_register() add the files once more to
>>>>>>>> the primary node
>>>>>>>>        debugfs directory.
>>>>>>>> 14. From this point on files added through
>>>>>>>> drm_debugfs_add_file() are simply ignored.
>>>>>>>> 15. ... back to the driver ...
>>>>>>>>
>>>>>>>> Because of this the dev->debugfs_mutex lock is also
>>>>>>>> completely pointless since
>>>>>>>> any concurrent use of the interface would just randomly
>>>>>>>> either add the files to
>>>>>>>> the primary or render node or just not at all.
>>>>>>>>
>>>>>>>> Even worse is that this implementation nails the coffin for
>>>>>>>> removing the
>>>>>>>> driver->debugfs_init() mid-layering because otherwise
>>>>>>>> drivers can't control
>>>>>>>> where their debugfs (primary/render node) are actually added.
>>>>>>>>
>>>>>>>> This patch set here now tries to clean this up a bit, but
>>>>>>>> most likely isn't
>>>>>>>> fully complete either since I didn't audit every driver/call path.
>>>>>>> I tested the patchset on the v3d, vc4 and vkms and all the files
>>>>>>> are generated
>>>>>>> as expected, but I'm getting the following errors on dmesg:
>>>>>>>
>>>>>>> [    3.872026] debugfs: File 'v3d_ident' in directory '0'
>>>>>>> already present!
>>>>>>> [    3.872064] debugfs: File 'v3d_ident' in directory '128'
>>>>>>> already present!
>>>>>>> [    3.872078] debugfs: File 'v3d_regs' in directory '0' already
>>>>>>> present!
>>>>>>> [    3.872087] debugfs: File 'v3d_regs' in directory '128'
>>>>>>> already present!
>>>>>>> [    3.872097] debugfs: File 'measure_clock' in directory '0'
>>>>>>> already present!
>>>>>>> [    3.872105] debugfs: File 'measure_clock' in directory '128'
>>>>>>> already present!
>>>>>>> [    3.872116] debugfs: File 'bo_stats' in directory '0' already
>>>>>>> present!
>>>>>>> [    3.872124] debugfs: File 'bo_stats' in directory '128'
>>>>>>> already present!
>>>>>>>
>>>>>>> It looks like the render node is being added twice, since this
>>>>>>> doesn't happen
>>>>>>> for vc4 and vkms.
>>>>>> Thanks for the feedback and yes that's exactly what I meant with
>>>>>> that I haven't looked into all code paths.
>>>>>>
>>>>>> Could it be that v3d registers it's debugfs files from the
>>>>>> debugfs_init callback?
>>>>> Although this is true, I'm not sure if this is the reason why the files
>>>>> are
>>>>> being registered twice, as this doesn't happen to vc4, and it also uses
>>>>> the
>>>>> debugfs_init callback. I believe it is somewhat related to the fact that
>>>>> v3d is the primary node and the render node.
>>>> I see. Thanks for the hint.
>>>>
>>>>> Best Regards,
>>>>> - Maíra Canal
>>>>>
>>>>>> One alternative would be to just completely nuke support for
>>>>>> separate render node debugfs files and only add a symlink to the
>>>>>> primary node. Opinions?
>>>> What do you think of this approach? I can't come up with any reason why we
>>>> should have separate debugfs files for render nodes and I think it is pretty
>>>> much the same reason you came up with the patch for per device debugfs files
>>>> instead of per minor.
>>> They are two entirely separate devices and drivers, it doesn't make much
>>> sense to move their debugfs files to one or the other.
>> Well exactly that isn't true.
> Ok.
>
>> The primary and render node are just two file under /dev for the same
>> hardware device and driver.
>>
>> We just offer different functionality through the two interfaces, but
>> essentially there isn't any information we could expose for one which
>> isn't true for the other as well.
> I'd like to know what criteria you're using to say that they are the
> same hardware device then, because they don't share the same MMIO
> mappings, interrupts, clocks, IOMMUs, power domains, etc. They can also
> operate independently.

Well you don't seem to understand what I'm talking about.

This is about the primary and render node under /dev/dri/, not some 
separate hw device.

So you really have only one hardware device. E.g. clocks, IOMMU, power 
etc... is all the same. It's just one physical device with only one
drm_device structure.

Regards,
Christian.

>
> So unless that criterion is that they share the RAM, they cannot be
> considered the same hardware device.
>
> Maxime


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: Try to address the drm_debugfs issues
  2023-02-10 12:07               ` Christian König
@ 2023-02-10 12:18                 ` Maxime Ripard
  2023-02-10 13:10                   ` Christian König
  0 siblings, 1 reply; 50+ messages in thread
From: Maxime Ripard @ 2023-02-10 12:18 UTC (permalink / raw)
  To: Christian König
  Cc: daniel.vetter, Maíra Canal, dri-devel, mwen, mairacanal,
	wambui.karugax

On Fri, Feb 10, 2023 at 01:07:39PM +0100, Christian König wrote:
> Am 09.02.23 um 19:48 schrieb Maxime Ripard:
> > On Thu, Feb 09, 2023 at 04:52:54PM +0100, Christian König wrote:
> > > Am 09.02.23 um 15:19 schrieb Maxime Ripard:
> > > > On Thu, Feb 09, 2023 at 03:06:10PM +0100, Christian König wrote:
> > > > > Am 09.02.23 um 14:06 schrieb Maíra Canal:
> > > > > > On 2/9/23 09:13, Christian König wrote:
> > > > > > > Am 09.02.23 um 12:23 schrieb Maíra Canal:
> > > > > > > > On 2/9/23 05:18, Christian König wrote:
> > > > > > > > > Hello everyone,
> > > > > > > > > 
> > > > > > > > > the drm_debugfs has a couple of well known design problems.
> > > > > > > > > 
> > > > > > > > > Especially it wasn't possible to add files between
> > > > > > > > > initializing and registering
> > > > > > > > > of DRM devices since the underlying debugfs directory wasn't
> > > > > > > > > created yet.
> > > > > > > > > 
> > > > > > > > > The resulting necessity of the driver->debugfs_init()
> > > > > > > > > callback function is a
> > > > > > > > > mid-layering which is really frowned on since it creates a horrible
> > > > > > > > > driver->DRM->driver design layering.
> > > > > > > > > 
> > > > > > > > > The recent patch "drm/debugfs: create device-centered
> > > > > > > > > debugfs functions" tried
> > > > > > > > > to address those problem, but doesn't seem to work
> > > > > > > > > correctly. This looks like
> > > > > > > > > a misunderstanding of the call flow around
> > > > > > > > > drm_debugfs_init(), which is called
> > > > > > > > > multiple times, once for the primary and once for the render node.
> > > > > > > > > 
> > > > > > > > > So what happens now is the following:
> > > > > > > > > 
> > > > > > > > > 1. drm_dev_init() initially allocates the drm_minor objects.
> > > > > > > > > 2. ... back to the driver ...
> > > > > > > > > 3. drm_dev_register() is called.
> > > > > > > > > 
> > > > > > > > > 4. drm_debugfs_init() is called for the primary node.
> > > > > > > > > 5. drm_framebuffer_debugfs_init(), drm_client_debugfs_init() and
> > > > > > > > >       drm_atomic_debugfs_init() call drm_debugfs_add_file(s)()
> > > > > > > > > to add the files
> > > > > > > > >       for the primary node.
> > > > > > > > > 6. The driver->debugfs_init() callback is called to add
> > > > > > > > > debugfs files for the
> > > > > > > > >       primary node.
> > > > > > > > > 7. The added files are consumed and added to the primary
> > > > > > > > > node debugfs directory.
> > > > > > > > > 
> > > > > > > > > 8. drm_debugfs_init() is called for the render node.
> > > > > > > > > 9. drm_framebuffer_debugfs_init(), drm_client_debugfs_init() and
> > > > > > > > >       drm_atomic_debugfs_init() call drm_debugfs_add_file(s)()
> > > > > > > > > to add the files
> > > > > > > > >       again for the render node.
> > > > > > > > > 10. The driver->debugfs_init() callback is called to add
> > > > > > > > > debugfs files for the
> > > > > > > > >        render node.
> > > > > > > > > 11. The added files are consumed and added to the render
> > > > > > > > > node debugfs directory.
> > > > > > > > > 
> > > > > > > > > 12. Some more files are added through drm_debugfs_add_file().
> > > > > > > > > 13. drm_debugfs_late_register() add the files once more to
> > > > > > > > > the primary node
> > > > > > > > >        debugfs directory.
> > > > > > > > > 14. From this point on files added through
> > > > > > > > > drm_debugfs_add_file() are simply ignored.
> > > > > > > > > 15. ... back to the driver ...
> > > > > > > > > 
> > > > > > > > > Because of this the dev->debugfs_mutex lock is also
> > > > > > > > > completely pointless since
> > > > > > > > > any concurrent use of the interface would just randomly
> > > > > > > > > either add the files to
> > > > > > > > > the primary or render node or just not at all.
> > > > > > > > > 
> > > > > > > > > Even worse is that this implementation nails the coffin for
> > > > > > > > > removing the
> > > > > > > > > driver->debugfs_init() mid-layering because otherwise
> > > > > > > > > drivers can't control
> > > > > > > > > where their debugfs (primary/render node) are actually added.
> > > > > > > > > 
> > > > > > > > > This patch set here now tries to clean this up a bit, but
> > > > > > > > > most likely isn't
> > > > > > > > > fully complete either since I didn't audit every driver/call path.
> > > > > > > > I tested the patchset on the v3d, vc4 and vkms and all the files
> > > > > > > > are generated
> > > > > > > > as expected, but I'm getting the following errors on dmesg:
> > > > > > > > 
> > > > > > > > [    3.872026] debugfs: File 'v3d_ident' in directory '0'
> > > > > > > > already present!
> > > > > > > > [    3.872064] debugfs: File 'v3d_ident' in directory '128'
> > > > > > > > already present!
> > > > > > > > [    3.872078] debugfs: File 'v3d_regs' in directory '0' already
> > > > > > > > present!
> > > > > > > > [    3.872087] debugfs: File 'v3d_regs' in directory '128'
> > > > > > > > already present!
> > > > > > > > [    3.872097] debugfs: File 'measure_clock' in directory '0'
> > > > > > > > already present!
> > > > > > > > [    3.872105] debugfs: File 'measure_clock' in directory '128'
> > > > > > > > already present!
> > > > > > > > [    3.872116] debugfs: File 'bo_stats' in directory '0' already
> > > > > > > > present!
> > > > > > > > [    3.872124] debugfs: File 'bo_stats' in directory '128'
> > > > > > > > already present!
> > > > > > > > 
> > > > > > > > It looks like the render node is being added twice, since this
> > > > > > > > doesn't happen
> > > > > > > > for vc4 and vkms.
> > > > > > > Thanks for the feedback and yes that's exactly what I meant with
> > > > > > > that I haven't looked into all code paths.
> > > > > > > 
> > > > > > > Could it be that v3d registers it's debugfs files from the
> > > > > > > debugfs_init callback?
> > > > > > Although this is true, I'm not sure if this is the reason why the files
> > > > > > are
> > > > > > being registered twice, as this doesn't happen to vc4, and it also uses
> > > > > > the
> > > > > > debugfs_init callback. I believe it is somewhat related to the fact that
> > > > > > v3d is the primary node and the render node.
> > > > > I see. Thanks for the hint.
> > > > > 
> > > > > > Best Regards,
> > > > > > - Maíra Canal
> > > > > > 
> > > > > > > One alternative would be to just completely nuke support for
> > > > > > > separate render node debugfs files and only add a symlink to the
> > > > > > > primary node. Opinions?
> > > > > What do you think of this approach? I can't come up with any reason why we
> > > > > should have separate debugfs files for render nodes and I think it is pretty
> > > > > much the same reason you came up with the patch for per device debugfs files
> > > > > instead of per minor.
> > > > They are two entirely separate devices and drivers, it doesn't make much
> > > > sense to move their debugfs files to one or the other.
> > > Well exactly that isn't true.
> > Ok.
> > 
> > > The primary and render node are just two file under /dev for the same
> > > hardware device and driver.
> > > 
> > > We just offer different functionality through the two interfaces, but
> > > essentially there isn't any information we could expose for one which
> > > isn't true for the other as well.
> > I'd like to know what criteria you're using to say that they are the
> > same hardware device then, because they don't share the same MMIO
> > mappings, interrupts, clocks, IOMMUs, power domains, etc. They can also
> > operate independently.
> 
> Well you don't seem to understand what I'm talking about.

I would certainly like you to stop making those kinds of statements.
Apart from creating unnecessary tension, they don't bring anything to
the discussion.

> This is about the primary and render node under /dev/dri/, not some
> separate hw device.

The thing is, vc4 and v3d are both different nodes under /dev/dri and
separate hw devices.

> So you really have only one hardware device. E.g. clocks, IOMMU, power
> etc... is all the same.

Well, I mean, you can claim that all you want, but they certainly aren't
the same hardware device. Just like on virtually any !x86 SoC, the GPU
and display engines aren't the same device, and most of the time don't
even come from the same vendor.

Going back to the initial issue, one of the files exposed by the v3d
driver is the v3d registers content. It makes no sense to expose the v3d
registers into the primary (vc4) node when the hardware doesn't match,
and v3d has its own node.

Maxime

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: Try to address the drm_debugfs issues
  2023-02-10 12:18                 ` Maxime Ripard
@ 2023-02-10 13:10                   ` Christian König
  0 siblings, 0 replies; 50+ messages in thread
From: Christian König @ 2023-02-10 13:10 UTC (permalink / raw)
  To: Maxime Ripard
  Cc: daniel.vetter, Maíra Canal, dri-devel, mwen, mairacanal,
	wambui.karugax

Am 10.02.23 um 13:18 schrieb Maxime Ripard:
> [SNIP]
>> Well you don't seem to understand what I'm talking about.
> I would certainly like you to stop making those kinds of statements.
> Apart from creating unnecessary tension, they don't bring anything to
> the discussion.

Sorry for saying that. It was really not very polite of me.

It's just that you indeed seem to be talking about something completely 
different.

>> This is about the primary and render node under /dev/dri/, not some
>> separate hw device.
> The thing is, vc4 and v3d are both different nodes under /dev/dri and
> separate hw devices.
>
>> So you really have only one hardware device. E.g. clocks, IOMMU, power
>> etc... is all the same.
> Well, I mean, you can claim that all you want, but they certainly aren't
> the same hardware device. Just like on virtually any !x86 SoC, the GPU
> and display engines aren't the same device, and most of the time don't
> even come from the same vendor.

Yeah, I'm perfectly aware of that.

This is just about the primary and render node under /dev/dri. This is a 
software construct we use for access control, nothing else.

As far as I can see separate render and display hardware are a 
completely different topic. Or am I missing something?

> Going back to the initial issue, one of the files exposed by the v3d
> driver is the v3d registers content. It makes no sense to expose the v3d
> registers into the primary (vc4) node when the hardware doesn't match,
> and v3d has its own node.

But those are different primary nodes, aren't they? E.g. you have 
different /dev/dri/card0 and /dev/dri/card1 for them?

For the IOCTL level the render node is just a secure subset of the 
functionality of the primary node. So I would not expect that there is 
something different for the debugfs files.
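
As a rough illustration of the "secure subset" idea, here is a toy model
in C. It is not kernel code: toy_ioctl, TOY_RENDER_ALLOW,
toy_ioctl_permitted() and the three ioctl names are made up for this
sketch, and it deliberately ignores the authentication and master checks
that also apply on the real primary node.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag: this ioctl may also be issued on the render node. */
#define TOY_RENDER_ALLOW 0x1u

enum toy_minor_type { TOY_MINOR_PRIMARY, TOY_MINOR_RENDER };

/* Hypothetical ioctl descriptor: a name plus permission flags. */
struct toy_ioctl {
	const char *name;
	unsigned int flags;
};

/* One device, one ioctl table; both nodes dispatch into the same table. */
static const struct toy_ioctl toy_ioctls[] = {
	{ "toy_set_mode",   0 },                /* display control: primary only */
	{ "toy_submit_job", TOY_RENDER_ALLOW }, /* rendering: both nodes */
	{ "toy_create_bo",  TOY_RENDER_ALLOW }, /* buffer allocation: both nodes */
};

#define TOY_NUM_IOCTLS (sizeof(toy_ioctls) / sizeof(toy_ioctls[0]))

/* The render node only filters the table, it never adds entries to it. */
static bool toy_ioctl_permitted(const struct toy_ioctl *io,
				enum toy_minor_type type)
{
	if (type == TOY_MINOR_PRIMARY)
		return true;
	return (io->flags & TOY_RENDER_ALLOW) != 0;
}

/* Count how many ioctls are reachable through a given node type. */
static size_t toy_count_permitted(enum toy_minor_type type)
{
	size_t n = 0;

	for (size_t i = 0; i < TOY_NUM_IOCTLS; i++)
		if (toy_ioctl_permitted(&toy_ioctls[i], type))
			n++;
	return n;
}
```

In this model everything reachable through the render node is also
reachable through the primary node, which is the property the argument
above relies on.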

Regards,
Christian.

>
> Maxime


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: Try to address the drm_debugfs issues
  2023-02-09 13:06     ` Maíra Canal
  2023-02-09 14:06       ` Christian König
@ 2023-02-13 18:16       ` Stanislaw Gruszka
  2023-02-13 19:59         ` Christian König
  1 sibling, 1 reply; 50+ messages in thread
From: Stanislaw Gruszka @ 2023-02-13 18:16 UTC (permalink / raw)
  To: Maíra Canal
  Cc: Christian König, dri-devel, mwen, mairacanal, maxime,
	daniel.vetter, wambui.karugax

On Thu, Feb 09, 2023 at 10:06:25AM -0300, Maíra Canal wrote:
> > > [    3.872026] debugfs: File 'v3d_ident' in directory '0' already present!
> > > [    3.872064] debugfs: File 'v3d_ident' in directory '128' already present!
> > > [    3.872078] debugfs: File 'v3d_regs' in directory '0' already present!
> > > [    3.872087] debugfs: File 'v3d_regs' in directory '128' already present!
> > > [    3.872097] debugfs: File 'measure_clock' in directory '0' already present!
> > > [    3.872105] debugfs: File 'measure_clock' in directory '128' already present!
> > > [    3.872116] debugfs: File 'bo_stats' in directory '0' already present!
> > > [    3.872124] debugfs: File 'bo_stats' in directory '128' already present!
> > > 
> > > It looks like the render node is being added twice, since this doesn't happen
> > > for vc4 and vkms.
> > 
> > Thanks for the feedback and yes that's exactly what I meant with that I haven't looked into all code paths.
> > 
> > Could it be that v3d registers it's debugfs files from the debugfs_init callback?
> 
> Although this is true, I'm not sure if this is the reason why the files are
> being registered twice, as this doesn't happen to vc4, and it also uses the
> debugfs_init callback. I believe it is somewhat related to the fact that
> v3d is the primary node and the render node.

Yes, this seems to be because ->debugfs_init = v3d_debugfs_init() uses
drm_debugfs_add_files(), which creates files for both primary and render.
And ->debugfs_init is called via drm_minor_register() also for both
when registering.

Probably need to change debugfs_init callback to create files just
for one minor. And if we don't want to use minor pointer directly in
drivers, the callback can take debugfs dir as argument.
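
The double registration can be reproduced with a small userspace toy
model. This is only a sketch under stated assumptions, not kernel code:
every toy_* name is invented here, a "directory" is just an array of
names, and the only behaviour borrowed from the thread is that the
driver callback runs once per minor while the device-centered add
helper targets both minors.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define TOY_MAX_FILES 16

/* Toy stand-in for one debugfs directory such as "0" or "128". */
struct toy_dir {
	const char *name;
	const char *files[TOY_MAX_FILES];
	int nfiles;
	int duplicates; /* number of "already present!" collisions */
};

/* Toy stand-in for a device with a primary and a render minor. */
struct toy_device {
	struct toy_dir primary;
	struct toy_dir render;
};

/* Create a file; report and count a collision instead of adding it twice. */
static void toy_create_file(struct toy_dir *dir, const char *file)
{
	for (int i = 0; i < dir->nfiles; i++) {
		if (strcmp(dir->files[i], file) == 0) {
			printf("debugfs: File '%s' in directory '%s' already present!\n",
			       file, dir->name);
			dir->duplicates++;
			return;
		}
	}
	assert(dir->nfiles < TOY_MAX_FILES);
	dir->files[dir->nfiles++] = file;
}

/* Device-centered helper: adds the file to both minors' directories. */
static void toy_add_file(struct toy_device *dev, const char *file)
{
	toy_create_file(&dev->primary, file);
	toy_create_file(&dev->render, file);
}

/* Driver callback that uses the device-centered helper. */
static void toy_driver_debugfs_init(struct toy_device *dev)
{
	static const char *const names[] = {
		"v3d_ident", "v3d_regs", "measure_clock", "bo_stats",
	};

	for (size_t i = 0; i < sizeof(names) / sizeof(names[0]); i++)
		toy_add_file(dev, names[i]);
}

/* Registration invokes the callback once per minor; returns collisions. */
static int toy_register(struct toy_device *dev)
{
	toy_driver_debugfs_init(dev); /* for the primary minor */
	toy_driver_debugfs_init(dev); /* for the render minor */
	return dev->primary.duplicates + dev->render.duplicates;
}
```

Calling toy_register() on a device whose directories are named "0" and
"128" yields eight collisions, four per directory, which matches the
number of warnings in the dmesg excerpt quoted above.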

Regards
Stanislaw

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: Try to address the drm_debugfs issues
  2023-02-13 18:16       ` Stanislaw Gruszka
@ 2023-02-13 19:59         ` Christian König
  0 siblings, 0 replies; 50+ messages in thread
From: Christian König @ 2023-02-13 19:59 UTC (permalink / raw)
  To: Stanislaw Gruszka, Maíra Canal
  Cc: daniel.vetter, dri-devel, mwen, mairacanal, maxime, wambui.karugax

Am 13.02.23 um 19:16 schrieb Stanislaw Gruszka:
> On Thu, Feb 09, 2023 at 10:06:25AM -0300, Maíra Canal wrote:
>>>> [    3.872026] debugfs: File 'v3d_ident' in directory '0' already present!
>>>> [    3.872064] debugfs: File 'v3d_ident' in directory '128' already present!
>>>> [    3.872078] debugfs: File 'v3d_regs' in directory '0' already present!
>>>> [    3.872087] debugfs: File 'v3d_regs' in directory '128' already present!
>>>> [    3.872097] debugfs: File 'measure_clock' in directory '0' already present!
>>>> [    3.872105] debugfs: File 'measure_clock' in directory '128' already present!
>>>> [    3.872116] debugfs: File 'bo_stats' in directory '0' already present!
>>>> [    3.872124] debugfs: File 'bo_stats' in directory '128' already present!
>>>>
>>>> It looks like the render node is being added twice, since this doesn't happen
>>>> for vc4 and vkms.
>>> Thanks for the feedback and yes that's exactly what I meant with that I haven't looked into all code paths.
>>>
>>> Could it be that v3d registers it's debugfs files from the debugfs_init callback?
>> Although this is true, I'm not sure if this is the reason why the files are
>> being registered twice, as this doesn't happen to vc4, and it also uses the
>> debugfs_init callback. I believe it is somewhat related to the fact that
>> v3d is the primary node and the render node.
> Yes, this seems to be because ->debugfs_init = v3d_debugfs_init() uses
> drm_debugfs_add_files(), which creates files for both primary and render.
> And ->debugfs_init is called via drm_minor_register() also for both
> when registering.
>
> Probably need to change debugfs_init callback to create files just
> for one minor. And if we don't want to use minor pointer directly in
> drivers, the callback can take debugfs dir as argument.

Well the intention of Maira's and my work is to get rid of the callback 
altogether.

So far nobody came up with an argument why we should keep the 
distinction of the debugfs directories into primary and render node. I 
will just go ahead and remove that.

This way we will have the callback used only once and can slowly 
deprecate it.

Regards,
Christian.

>
> Regards
> Stanislaw


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: Try to address the drm_debugfs issues
  2023-02-09  8:18 Try to address the drm_debugfs issues Christian König
                   ` (3 preceding siblings ...)
  2023-02-09 11:23 ` Try to address the drm_debugfs issues Maíra Canal
@ 2023-02-14  8:59 ` Stanislaw Gruszka
  2023-02-14  9:28   ` Christian König
  4 siblings, 1 reply; 50+ messages in thread
From: Stanislaw Gruszka @ 2023-02-14  8:59 UTC (permalink / raw)
  To: Christian König
  Cc: jacek.lawrynowicz, Jeffrey Hugo, daniel.vetter, Oded Gabbay,
	mcanal, dri-devel, mwen, maxime, wambui.karugax

On Thu, Feb 09, 2023 at 09:18:35AM +0100, Christian König wrote:
> Hello everyone,
> 
> the drm_debugfs has a couple of well known design problems.
> 
> Especially it wasn't possible to add files between initializing and registering
> of DRM devices since the underlying debugfs directory wasn't created yet.
> 
> The resulting necessity of the driver->debugfs_init() callback function is a
> mid-layering which is really frowned on since it creates a horrible
> driver->DRM->driver design layering.
> 
> The recent patch "drm/debugfs: create device-centered debugfs functions" tried
> to address those problem, but doesn't seem to work correctly. This looks like
> a misunderstanding of the call flow around drm_debugfs_init(), which is called
> multiple times, once for the primary and once for the render node.
> 
> So what happens now is the following:
> 
> 1. drm_dev_init() initially allocates the drm_minor objects.
> 2. ... back to the driver ...
> 3. drm_dev_register() is called.
> 
> 4. drm_debugfs_init() is called for the primary node.
> 5. drm_framebuffer_debugfs_init(), drm_client_debugfs_init() and
>    drm_atomic_debugfs_init() call drm_debugfs_add_file(s)() to add the files
>    for the primary node.
> 6. The driver->debugfs_init() callback is called to add debugfs files for the
>    primary node.
> 7. The added files are consumed and added to the primary node debugfs directory.
> 
> 8. drm_debugfs_init() is called for the render node.
> 9. drm_framebuffer_debugfs_init(), drm_client_debugfs_init() and
>    drm_atomic_debugfs_init() call drm_debugfs_add_file(s)() to add the files
>    again for the render node.
> 10. The driver->debugfs_init() callback is called to add debugfs files for the
>     render node.
> 11. The added files are consumed and added to the render node debugfs directory.
> 
> 12. Some more files are added through drm_debugfs_add_file().
> 13. drm_debugfs_late_register() add the files once more to the primary node
>     debugfs directory.
> 14. From this point on files added through drm_debugfs_add_file() are simply ignored.
> 15. ... back to the driver ...
> 
> Because of this the dev->debugfs_mutex lock is also completely pointless since
> any concurrent use of the interface would just randomly either add the files to
> the primary or render node or just not at all.
> 
> Even worse is that this implementation nails the coffin for removing the
> driver->debugfs_init() mid-layering because otherwise drivers can't control
> where their debugfs (primary/render node) are actually added.
> 
> This patch set here now tries to clean this up a bit, but most likely isn't
> fully complete either since I didn't audit every driver/call path.
> 
> Please comment/discuss.

What is the end goal here regarding debugfs in DRM? My understanding is that
the direction is to get rid of the debugfs_init callback, as described in:
https://cgit.freedesktop.org/drm/drm-misc/tree/Documentation/gpu/todo.rst#n511
and also to make it driver/device-centric instead of minor-centric, as
described here:
https://cgit.freedesktop.org/drm/drm-misc/commit/?id=99845faae7099cd704ebf67514c1157c26960a

I'm asking from the accel point of view. We can make things there look the
way they should eventually look for DRM, since currently no drivers have
established their interfaces and they can still be changed.

Does driver/device-centric mean we should use drm_dev->unique for the debugfs
dir entry name instead of the minor?
Or perhaps we should have 2 separate dir entries: one (old dri/minor/)
for device drm debugfs files and another one for driver-specific files?

Also, what about sysfs? Should we do something with accel_sysfs_device_minor?

Regards
Stanislaw

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: Try to address the drm_debugfs issues
  2023-02-14  8:59 ` Stanislaw Gruszka
@ 2023-02-14  9:28   ` Christian König
  2023-02-14 11:46     ` Stanislaw Gruszka
  0 siblings, 1 reply; 50+ messages in thread
From: Christian König @ 2023-02-14  9:28 UTC (permalink / raw)
  To: Stanislaw Gruszka
  Cc: jacek.lawrynowicz, Jeffrey Hugo, daniel.vetter, Oded Gabbay,
	mcanal, dri-devel, mwen, maxime, wambui.karugax

Am 14.02.23 um 09:59 schrieb Stanislaw Gruszka:
> On Thu, Feb 09, 2023 at 09:18:35AM +0100, Christian König wrote:
>> Hello everyone,
>>
>> the drm_debugfs has a couple of well known design problems.
>>
>> Especially it wasn't possible to add files between initializing and registering
>> of DRM devices since the underlying debugfs directory wasn't created yet.
>>
>> The resulting necessity of the driver->debugfs_init() callback function is a
>> mid-layering which is really frowned on since it creates a horrible
>> driver->DRM->driver design layering.
>>
>> The recent patch "drm/debugfs: create device-centered debugfs functions" tried
>> to address those problem, but doesn't seem to work correctly. This looks like
>> a misunderstanding of the call flow around drm_debugfs_init(), which is called
>> multiple times, once for the primary and once for the render node.
>>
>> So what happens now is the following:
>>
>> 1. drm_dev_init() initially allocates the drm_minor objects.
>> 2. ... back to the driver ...
>> 3. drm_dev_register() is called.
>>
>> 4. drm_debugfs_init() is called for the primary node.
>> 5. drm_framebuffer_debugfs_init(), drm_client_debugfs_init() and
>>     drm_atomic_debugfs_init() call drm_debugfs_add_file(s)() to add the files
>>     for the primary node.
>> 6. The driver->debugfs_init() callback is called to add debugfs files for the
>>     primary node.
>> 7. The added files are consumed and added to the primary node debugfs directory.
>>
>> 8. drm_debugfs_init() is called for the render node.
>> 9. drm_framebuffer_debugfs_init(), drm_client_debugfs_init() and
>>     drm_atomic_debugfs_init() call drm_debugfs_add_file(s)() to add the files
>>     again for the render node.
>> 10. The driver->debugfs_init() callback is called to add debugfs files for the
>>      render node.
>> 11. The added files are consumed and added to the render node debugfs directory.
>>
>> 12. Some more files are added through drm_debugfs_add_file().
>> 13. drm_debugfs_late_register() add the files once more to the primary node
>>      debugfs directory.
>> 14. From this point on files added through drm_debugfs_add_file() are simply ignored.
>> 15. ... back to the driver ...
>>
>> Because of this the dev->debugfs_mutex lock is also completely pointless since
>> any concurrent use of the interface would just randomly either add the files to
>> the primary or render node or just not at all.
>>
>> Even worse is that this implementation nails the coffin for removing the
>> driver->debugfs_init() mid-layering because otherwise drivers can't control
>> where their debugfs (primary/render node) are actually added.
>>
>> This patch set here now tries to clean this up a bit, but most likely isn't
>> fully complete either since I didn't audit every driver/call path.
>>
>> Please comment/discuss.
> What is end goal here regarding debugfs in DRM ? My undersigning is that
> the direction is get rid of debugfs_init callback as described in:
> https://cgit.freedesktop.org/drm/drm-misc/tree/Documentation/gpu/todo.rst#n511
> and also make it driver/device-centric instead of minor-centric as
> described here:
> https://cgit.freedesktop.org/drm/drm-misc/commit/?id=99845faae7099cd704ebf67514c1157c26960a	

Well my main goal is to get rid of the debugfs_init() mid-layering in 
the mid term, everything else is just nice to have.

> I'm asking from accel point of view. We can make things there as they
> should look like at the end for DRM, since currently no drivers have
> established their interfaces and they can be changed.
>
> Is drivers/device-centric mean we should use drm_dev->unique for debugfs
> dir entry name instead of minor ?

Oh, good idea! That would also finally make it a bit less problematic to 
figure out which PCI or platform device corresponds to which debugfs 
directory.

The only potential problem I see is that we would need to rename the 
directory should a driver ever decide to set drm_dev->unique to 
something other than the default. But a quick check shows no users of 
drm_dev_set_unique(), so we could potentially just unexport the function.

> Or perhaps we should have 2 separate dir entries: one (old dri/minor/)
> for device drm debugfs files and other one for driver specific files ?

How about we just create symlinks between the old and the new directory 
for now which we remove after everything has settled again?

> Also what regarding sysfs ? Should we do something with accel_sysfs_device_minor ?

I see sysfs as a different and probably even more complicated topic.

Regards,
Christian.

>
> Regards
> Stanislaw


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: Try to address the drm_debugfs issues
  2023-02-14  9:28   ` Christian König
@ 2023-02-14 11:46     ` Stanislaw Gruszka
  0 siblings, 0 replies; 50+ messages in thread
From: Stanislaw Gruszka @ 2023-02-14 11:46 UTC (permalink / raw)
  To: Christian König
  Cc: Jeffrey Hugo, daniel.vetter, Oded Gabbay, mcanal, dri-devel,
	mwen, jacek.lawrynowicz, wambui.karugax, maxime

On Tue, Feb 14, 2023 at 10:28:24AM +0100, Christian König wrote:
> Am 14.02.23 um 09:59 schrieb Stanislaw Gruszka:
> > On Thu, Feb 09, 2023 at 09:18:35AM +0100, Christian König wrote:
> > > Hello everyone,
> > > 
> > > the drm_debugfs has a couple of well known design problems.
> > > 
> > > Especially it wasn't possible to add files between initializing and registering
> > > of DRM devices since the underlying debugfs directory wasn't created yet.
> > > 
> > > The resulting necessity of the driver->debugfs_init() callback function is a
> > > mid-layering which is really frowned on since it creates a horrible
> > > driver->DRM->driver design layering.
> > > 
> > > The recent patch "drm/debugfs: create device-centered debugfs functions" tried
> > > to address those problem, but doesn't seem to work correctly. This looks like
> > > a misunderstanding of the call flow around drm_debugfs_init(), which is called
> > > multiple times, once for the primary and once for the render node.
> > > 
> > > So what happens now is the following:
> > > 
> > > 1. drm_dev_init() initially allocates the drm_minor objects.
> > > 2. ... back to the driver ...
> > > 3. drm_dev_register() is called.
> > > 
> > > 4. drm_debugfs_init() is called for the primary node.
> > > 5. drm_framebuffer_debugfs_init(), drm_client_debugfs_init() and
> > >     drm_atomic_debugfs_init() call drm_debugfs_add_file(s)() to add the files
> > >     for the primary node.
> > > 6. The driver->debugfs_init() callback is called to add debugfs files for the
> > >     primary node.
> > > 7. The added files are consumed and added to the primary node debugfs directory.
> > > 
> > > 8. drm_debugfs_init() is called for the render node.
> > > 9. drm_framebuffer_debugfs_init(), drm_client_debugfs_init() and
> > >     drm_atomic_debugfs_init() call drm_debugfs_add_file(s)() to add the files
> > >     again for the render node.
> > > 10. The driver->debugfs_init() callback is called to add debugfs files for the
> > >      render node.
> > > 11. The added files are consumed and added to the render node debugfs directory.
> > > 
> > > 12. Some more files are added through drm_debugfs_add_file().
> > > 13. drm_debugfs_late_register() add the files once more to the primary node
> > >      debugfs directory.
> > > 14. From this point on files added through drm_debugfs_add_file() are simply ignored.
> > > 15. ... back to the driver ...
> > > 
> > > Because of this the dev->debugfs_mutex lock is also completely pointless since
> > > any concurrent use of the interface would just randomly either add the files to
> > > the primary or render node or just not at all.
> > > 
> > > Even worse is that this implementation nails the coffin for removing the
> > > driver->debugfs_init() mid-layering because otherwise drivers can't control
> > > where their debugfs (primary/render node) are actually added.
> > > 
> > > This patch set here now tries to clean this up a bit, but most likely isn't
> > > fully complete either since I didn't audit every driver/call path.
> > > 
> > > Please comment/discuss.
> > What is end goal here regarding debugfs in DRM ? My undersigning is that
> > the direction is get rid of debugfs_init callback as described in:
> > https://cgit.freedesktop.org/drm/drm-misc/tree/Documentation/gpu/todo.rst#n511
> > and also make it driver/device-centric instead of minor-centric as
> > described here:
> > https://cgit.freedesktop.org/drm/drm-misc/commit/?id=99845faae7099cd704ebf67514c1157c26960a	
> 
> Well my main goal is to get rid of the debugfs_init() mid-layering in the
> mid term, everything else is just nice to have.
> 
> > I'm asking from accel point of view. We can make things there as they
> > should look like at the end for DRM, since currently no drivers have
> > established their interfaces and they can be changed.
> > 
> > Is drivers/device-centric mean we should use drm_dev->unique for debugfs
> > dir entry name instead of minor ?
> 
> Oh, good idea! That would also finally make it a bit less problematic to
> figure out which PCI or platform device corresponds to which debugfs
> directory.
> 
> Only potential problem I see is that we would need to rename the directory
> should a driver every decide to set drm_dev->unique to something else than
> the default. But a quick check shows no users of drm_dev_set_unique(), so we
> could potentially just unexport the function
>
> > Or perhaps we should have 2 separate dir entries: one (old dri/minor/)
> > for device drm debugfs files and other one for driver specific files ?
> 
> How about we just create symlinks between the old and the new directory for
> now which we remove after everything has settled again?

Yes, that would make perfect sense. 

However, my idea was a bit different: that we have separate directories,
one for DRM-specific debugfs files (i.e. clients, framebuffer, gem, ...)
and another one for driver-specific files (registers, or whatever an
individual driver needs for debugging). I'm just considering different options.

> > Also what regarding sysfs ? Should we do something with accel_sysfs_device_minor ?
> 
> I see sysfs as a different and probably even more complicated topic.

I wish we had some clear guidance on how things should be done regarding
sysfs. But I guess we can stick with accel_sysfs_device_minor for accel
as it is currently, and make changes along with the whole of DRM.

Regards
Stanislaw

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 1/3] drm/debugfs: separate debugfs creation into init and register
  2023-02-09  8:18 ` [PATCH 1/3] drm/debugfs: separate debugfs creation into init and register Christian König
@ 2023-02-14 11:56   ` Stanislaw Gruszka
  0 siblings, 0 replies; 50+ messages in thread
From: Stanislaw Gruszka @ 2023-02-14 11:56 UTC (permalink / raw)
  To: Christian König
  Cc: daniel.vetter, mcanal, dri-devel, mwen, mairacanal, maxime,
	wambui.karugax

On Thu, Feb 09, 2023 at 09:18:36AM +0100, Christian König wrote:
> diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
> index c6eb8972451a..88ce22c04672 100644
> --- a/drivers/gpu/drm/drm_drv.c
> +++ b/drivers/gpu/drm/drm_drv.c
> @@ -156,6 +156,10 @@ static int drm_minor_alloc(struct drm_device *dev, unsigned int type)
>  	if (IS_ERR(minor->kdev))
>  		return PTR_ERR(minor->kdev);
>  
> +	r = drm_debugfs_init(minor, minor->index, drm_debugfs_root);
> +	if (r)
> +		return r;
> +
>  	*drm_minor_get_slot(dev, type) = minor;
>  	return 0;
>  }
> @@ -172,15 +176,10 @@ static int drm_minor_register(struct drm_device *dev, unsigned int type)
>  	if (!minor)
>  		return 0;
>  
> -	if (minor->type == DRM_MINOR_ACCEL) {
> +	if (minor->type == DRM_MINOR_ACCEL)
>  		accel_debugfs_init(minor, minor->index);

Please move this to drm_minor_alloc() as well. Or perhaps make
conditional code for DRM_MINOR_ACCEL inside drm_debugfs_init().

Regards
Stanislaw

> -	} else {
> -		ret = drm_debugfs_init(minor, minor->index, drm_debugfs_root);
> -		if (ret) {
> -			DRM_ERROR("DRM: Failed to initialize /sys/kernel/debug/dri.\n");
> -			goto err_debugfs;
> -		}
> -	}
> +	else
> +		drm_debugfs_register(minor);

>  
>  	ret = device_add(minor->kdev);
>  	if (ret)
> diff --git a/drivers/gpu/drm/drm_internal.h b/drivers/gpu/drm/drm_internal.h
> index ed2103ee272c..332fb65a935a 100644
> --- a/drivers/gpu/drm/drm_internal.h
> +++ b/drivers/gpu/drm/drm_internal.h
> @@ -185,6 +185,7 @@ int drm_gem_dumb_destroy(struct drm_file *file, struct drm_device *dev,
>  #if defined(CONFIG_DEBUG_FS)
>  int drm_debugfs_init(struct drm_minor *minor, int minor_id,
>  		     struct dentry *root);
> +void drm_debugfs_register(struct drm_minor *minor);
>  void drm_debugfs_cleanup(struct drm_minor *minor);
>  void drm_debugfs_late_register(struct drm_device *dev);
>  void drm_debugfs_connector_add(struct drm_connector *connector);
> -- 
> 2.34.1
> 

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 3/3] drm/debugfs: remove dev->debugfs_list and debugfs_mutex
  2023-02-09  8:18 ` [PATCH 3/3] drm/debugfs: remove dev->debugfs_list and debugfs_mutex Christian König
@ 2023-02-14 12:19   ` Stanislaw Gruszka
  2023-02-14 12:46     ` Stanislaw Gruszka
  2023-02-16 11:33   ` Daniel Vetter
  1 sibling, 1 reply; 50+ messages in thread
From: Stanislaw Gruszka @ 2023-02-14 12:19 UTC (permalink / raw)
  To: Christian König
  Cc: jacek.lawrynowicz, Jeffrey Hugo, daniel.vetter, Oded Gabbay,
	mcanal, dri-devel, mwen, mairacanal, maxime, wambui.karugax

On Thu, Feb 09, 2023 at 09:18:38AM +0100, Christian König wrote:
> -void drm_debugfs_late_register(struct drm_device *dev)
> -{
> -	struct drm_minor *minor = dev->primary;
> -	struct drm_debugfs_entry *entry, *tmp;
> -
> -	if (!minor)
> -		return;
> -
> -	list_for_each_entry_safe(entry, tmp, &dev->debugfs_list, list) {
> -		debugfs_create_file(entry->file.name, 0444,
> -				    minor->debugfs_root, entry, &drm_debugfs_entry_fops);
> -		list_del(&entry->list);
> -	}
>  }
>  
>  int drm_debugfs_remove_files(const struct drm_info_list *files, int count,
> @@ -343,9 +321,13 @@ void drm_debugfs_add_file(struct drm_device *dev, const char *name,
>  	entry->file.data = data;
>  	entry->dev = dev;
>  
> -	mutex_lock(&dev->debugfs_mutex);
> -	list_add(&entry->list, &dev->debugfs_list);
> -	mutex_unlock(&dev->debugfs_mutex);
> +	debugfs_create_file(name, 0444, dev->primary->debugfs_root, entry,
> +			    &drm_debugfs_entry_fops);
> +
> +	/* TODO: This should probably only be a symlink */
> +	if (dev->render)
> +		debugfs_create_file(name, 0444, dev->render->debugfs_root,
> +				    entry, &drm_debugfs_entry_fops);

For accel we would need a conditional check for DRM_MINOR_ACCEL here as
well.

With this change and the one from the first patch, drm_debugfs_add_file()
should work for accel as well. We could then get rid of the debugfs_init
call in accel_debugfs_init().

However, we still need support for writable files. I think we can just
add a helper that provides the debugfs dir to drivers, i.e.:

struct dentry *accel_debugfs_dir(struct drm_device *drm) 
{
	return drm->accel->debugfs_root;
}

Then an individual accel driver could create files with different permissions there.

Regards
Stanislaw


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 3/3] drm/debugfs: remove dev->debugfs_list and debugfs_mutex
  2023-02-14 12:19   ` Stanislaw Gruszka
@ 2023-02-14 12:46     ` Stanislaw Gruszka
  0 siblings, 0 replies; 50+ messages in thread
From: Stanislaw Gruszka @ 2023-02-14 12:46 UTC (permalink / raw)
  To: Christian König
  Cc: Jeffrey Hugo, daniel.vetter, Oded Gabbay, mcanal, dri-devel,
	mwen, mairacanal, jacek.lawrynowicz, wambui.karugax, maxime

On Tue, Feb 14, 2023 at 01:19:51PM +0100, Stanislaw Gruszka wrote:
> On Thu, Feb 09, 2023 at 09:18:38AM +0100, Christian König wrote:
> > -void drm_debugfs_late_register(struct drm_device *dev)
> > -{
> > -	struct drm_minor *minor = dev->primary;
> > -	struct drm_debugfs_entry *entry, *tmp;
> > -
> > -	if (!minor)
> > -		return;
> > -
> > -	list_for_each_entry_safe(entry, tmp, &dev->debugfs_list, list) {
> > -		debugfs_create_file(entry->file.name, 0444,
> > -				    minor->debugfs_root, entry, &drm_debugfs_entry_fops);
> > -		list_del(&entry->list);
> > -	}
> >  }
> >  
> >  int drm_debugfs_remove_files(const struct drm_info_list *files, int count,
> > @@ -343,9 +321,13 @@ void drm_debugfs_add_file(struct drm_device *dev, const char *name,
> >  	entry->file.data = data;
> >  	entry->dev = dev;
> >  
> > -	mutex_lock(&dev->debugfs_mutex);
> > -	list_add(&entry->list, &dev->debugfs_list);
> > -	mutex_unlock(&dev->debugfs_mutex);
> > +	debugfs_create_file(name, 0444, dev->primary->debugfs_root, entry,
> > +			    &drm_debugfs_entry_fops);
> > +
> > +	/* TODO: This should probably only be a symlink */
> > +	if (dev->render)
> > +		debugfs_create_file(name, 0444, dev->render->debugfs_root,
> > +				    entry, &drm_debugfs_entry_fops);
> 
> For accel we would need conditional check for DRM_MINOR_ACCEL here as
> well.

Actually my comment makes no sense, since we do not have a minor pointer
here. What is needed is additional dev->accel code like that for
dev->render, and perhaps dev->primary should be made conditional as well.

Alternatively we can just create a separate helper: accel_debugfs_add_file.
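
The first option above (treat dev->accel like dev->render and make
dev->primary conditional too) can be illustrated with a small userspace
model. This is only a sketch under stated assumptions, not kernel code: the
model_* names are hypothetical, and "creating a file" is reduced to bumping
a counter on each minor that actually exists.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a minor with its own debugfs directory. */
struct model_minor {
	int n_files;
};

/* Hypothetical stand-in for a device; any of the minors may be NULL. */
struct model_device {
	struct model_minor *primary;
	struct model_minor *render;
	struct model_minor *accel;
};

/*
 * Add one file to every minor that exists and return how many
 * directories received it. No minor is assumed to be present, so the
 * primary minor is checked just like render and accel.
 */
static int model_add_file_all(struct model_device *dev)
{
	struct model_minor *minors[] = { dev->primary, dev->render, dev->accel };
	int created = 0;

	for (size_t i = 0; i < sizeof(minors) / sizeof(minors[0]); i++) {
		if (!minors[i])
			continue;
		minors[i]->n_files++;
		created++;
	}
	return created;
}
```

With only an accel minor present the model creates exactly one file, which
is the case the unconditional dev->primary dereference in the patch would
not handle.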

> With this change and one from first patch, drm_debugfs_add_file() should
> work for accel as well. We could get rid of debugfs_init from accel_debugfs_init().
> 
> However we still need support for writable files. I think we can just
> add helper for providing debugfs dir to drivers i.e:
> 
> struct dentry *accel_debugfs_dir(struct drm_device *drm) 
> {
> 	return drm->accel->debugfs_root;
> }

or just this :-)

Regards
Stanislaw


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 3/3] drm/debugfs: remove dev->debugfs_list and debugfs_mutex
  2023-02-09  8:18 ` [PATCH 3/3] drm/debugfs: remove dev->debugfs_list and debugfs_mutex Christian König
  2023-02-14 12:19   ` Stanislaw Gruszka
@ 2023-02-16 11:33   ` Daniel Vetter
  2023-02-16 11:37     ` Daniel Vetter
                       ` (2 more replies)
  1 sibling, 3 replies; 50+ messages in thread
From: Daniel Vetter @ 2023-02-16 11:33 UTC (permalink / raw)
  To: Christian König
  Cc: daniel.vetter, mcanal, dri-devel, mwen, mairacanal, maxime,
	wambui.karugax

On Thu, Feb 09, 2023 at 09:18:38AM +0100, Christian König wrote:
> The mutex was completely pointless in the first place since any
> parallel adding of files to this list would result in random
> behavior since the list is filled and consumed multiple times.
> 
> Completely drop that approach and just create the files directly.
> 
> This also re-adds the debugfs files to the render node directory and
> removes drm_debugfs_late_register().
> 
> Signed-off-by: Christian König <christian.koenig@amd.com>
> ---
>  drivers/gpu/drm/drm_debugfs.c     | 32 +++++++------------------------
>  drivers/gpu/drm/drm_drv.c         |  3 ---
>  drivers/gpu/drm/drm_internal.h    |  5 -----
>  drivers/gpu/drm/drm_mode_config.c |  2 --
>  include/drm/drm_device.h          | 15 ---------------
>  5 files changed, 7 insertions(+), 50 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_debugfs.c b/drivers/gpu/drm/drm_debugfs.c
> index 558e3a7271a5..a40288e67264 100644
> --- a/drivers/gpu/drm/drm_debugfs.c
> +++ b/drivers/gpu/drm/drm_debugfs.c
> @@ -246,31 +246,9 @@ void drm_debugfs_dev_register(struct drm_device *dev)
>  void drm_debugfs_minor_register(struct drm_minor *minor)
>  {
>  	struct drm_device *dev = minor->dev;
> -	struct drm_debugfs_entry *entry, *tmp;
>  
>  	if (dev->driver->debugfs_init)
>  		dev->driver->debugfs_init(minor);
> -
> -	list_for_each_entry_safe(entry, tmp, &dev->debugfs_list, list) {
> -		debugfs_create_file(entry->file.name, 0444,
> -				    minor->debugfs_root, entry, &drm_debugfs_entry_fops);
> -		list_del(&entry->list);
> -	}
> -}
> -
> -void drm_debugfs_late_register(struct drm_device *dev)
> -{
> -	struct drm_minor *minor = dev->primary;
> -	struct drm_debugfs_entry *entry, *tmp;
> -
> -	if (!minor)
> -		return;
> -
> -	list_for_each_entry_safe(entry, tmp, &dev->debugfs_list, list) {
> -		debugfs_create_file(entry->file.name, 0444,
> -				    minor->debugfs_root, entry, &drm_debugfs_entry_fops);
> -		list_del(&entry->list);
> -	}
>  }
>  
>  int drm_debugfs_remove_files(const struct drm_info_list *files, int count,
> @@ -343,9 +321,13 @@ void drm_debugfs_add_file(struct drm_device *dev, const char *name,
>  	entry->file.data = data;
>  	entry->dev = dev;
>  
> -	mutex_lock(&dev->debugfs_mutex);
> -	list_add(&entry->list, &dev->debugfs_list);
> -	mutex_unlock(&dev->debugfs_mutex);
> +	debugfs_create_file(name, 0444, dev->primary->debugfs_root, entry,
> +			    &drm_debugfs_entry_fops);
> +
> +	/* TODO: This should probably only be a symlink */
> +	if (dev->render)
> +		debugfs_create_file(name, 0444, dev->render->debugfs_root,
> +				    entry, &drm_debugfs_entry_fops);

Nope. You are fundamentally missing the point of all this, which is:

- drivers create debugfs files whenever they want to, as long as it's
  _before_ drm_dev_register is called.

- drm_dev_register will set them all up.

This is necessary because otherwise you have the potential for some nice
oops and stuff when userspace tries to access these files before the
driver is ready.

Note that with sysfs all this infrastructure already exists, which is why
you can create sysfs files whenever you feel like, and things won't go
boom.

So yeah we need the list.

This also means that we really should not create the debugfs directories
_before_ drm_dev_register is called. That's just fundamentally not how
device interface setup should work:

1. you allocate structs and stuff
2. you fully init everything
3. you register interfaces so they become userspace visible
-Daniel
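
The ordering argued for above (queue files while the driver initializes,
make them visible only at registration time) can be sketched as a userspace
model. This is an illustration under stated assumptions, not the DRM
implementation: the deferred_* names and the fixed-size arrays are
hypothetical, and rejecting additions after registration is a choice made
for this sketch only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEFERRED_MAX_FILES 8

/* Hypothetical device model: a pending list plus what userspace can see. */
struct deferred_dev {
	const char *pending[DEFERRED_MAX_FILES];
	size_t n_pending;
	const char *visible[DEFERRED_MAX_FILES];
	size_t n_visible;
	int registered;
};

/*
 * Steps 1 and 2: the driver queues a file while it is still initializing.
 * Nothing becomes visible yet. Returns 0 on success, -1 if the device is
 * already registered or the pending list is full.
 */
static int deferred_add_file(struct deferred_dev *dev, const char *name)
{
	if (dev->registered || dev->n_pending == DEFERRED_MAX_FILES)
		return -1;
	dev->pending[dev->n_pending++] = name;
	return 0;
}

/* Step 3: registration publishes everything that was queued, exactly once. */
static void deferred_register(struct deferred_dev *dev)
{
	for (size_t i = 0; i < dev->n_pending; i++)
		dev->visible[dev->n_visible++] = dev->pending[i];
	dev->n_pending = 0;
	dev->registered = 1;
}

/* Returns 1 if "userspace" could open the file, 0 otherwise. */
static int deferred_is_visible(const struct deferred_dev *dev, const char *name)
{
	for (size_t i = 0; i < dev->n_visible; i++)
		if (strcmp(dev->visible[i], name) == 0)
			return 1;
	return 0;
}
```

In this model a file added before registration cannot be opened until
deferred_register() has run, which is the property that protects a
half-initialized driver from early userspace access.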

>  }
>  EXPORT_SYMBOL(drm_debugfs_add_file);
>  
> diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
> index 2cbe028e548c..e7b88b65866c 100644
> --- a/drivers/gpu/drm/drm_drv.c
> +++ b/drivers/gpu/drm/drm_drv.c
> @@ -597,7 +597,6 @@ static void drm_dev_init_release(struct drm_device *dev, void *res)
>  	mutex_destroy(&dev->clientlist_mutex);
>  	mutex_destroy(&dev->filelist_mutex);
>  	mutex_destroy(&dev->struct_mutex);
> -	mutex_destroy(&dev->debugfs_mutex);
>  	drm_legacy_destroy_members(dev);
>  }
>  
> @@ -638,14 +637,12 @@ static int drm_dev_init(struct drm_device *dev,
>  	INIT_LIST_HEAD(&dev->filelist_internal);
>  	INIT_LIST_HEAD(&dev->clientlist);
>  	INIT_LIST_HEAD(&dev->vblank_event_list);
> -	INIT_LIST_HEAD(&dev->debugfs_list);
>  
>  	spin_lock_init(&dev->event_lock);
>  	mutex_init(&dev->struct_mutex);
>  	mutex_init(&dev->filelist_mutex);
>  	mutex_init(&dev->clientlist_mutex);
>  	mutex_init(&dev->master_mutex);
> -	mutex_init(&dev->debugfs_mutex);
>  
>  	ret = drmm_add_action_or_reset(dev, drm_dev_init_release, NULL);
>  	if (ret)
> diff --git a/drivers/gpu/drm/drm_internal.h b/drivers/gpu/drm/drm_internal.h
> index 5ff7bf88f162..e215d00ba65c 100644
> --- a/drivers/gpu/drm/drm_internal.h
> +++ b/drivers/gpu/drm/drm_internal.h
> @@ -188,7 +188,6 @@ int drm_debugfs_init(struct drm_minor *minor, int minor_id,
>  void drm_debugfs_dev_register(struct drm_device *dev);
>  void drm_debugfs_minor_register(struct drm_minor *minor);
>  void drm_debugfs_cleanup(struct drm_minor *minor);
> -void drm_debugfs_late_register(struct drm_device *dev);
>  void drm_debugfs_connector_add(struct drm_connector *connector);
>  void drm_debugfs_connector_remove(struct drm_connector *connector);
>  void drm_debugfs_crtc_add(struct drm_crtc *crtc);
> @@ -205,10 +204,6 @@ static inline void drm_debugfs_cleanup(struct drm_minor *minor)
>  {
>  }
>  
> -static inline void drm_debugfs_late_register(struct drm_device *dev)
> -{
> -}
> -
>  static inline void drm_debugfs_connector_add(struct drm_connector *connector)
>  {
>  }
> diff --git a/drivers/gpu/drm/drm_mode_config.c b/drivers/gpu/drm/drm_mode_config.c
> index 87eb591fe9b5..8525ef851540 100644
> --- a/drivers/gpu/drm/drm_mode_config.c
> +++ b/drivers/gpu/drm/drm_mode_config.c
> @@ -54,8 +54,6 @@ int drm_modeset_register_all(struct drm_device *dev)
>  	if (ret)
>  		goto err_connector;
>  
> -	drm_debugfs_late_register(dev);
> -
>  	return 0;
>  
>  err_connector:
> diff --git a/include/drm/drm_device.h b/include/drm/drm_device.h
> index 7cf4afae2e79..900ad7478dd8 100644
> --- a/include/drm/drm_device.h
> +++ b/include/drm/drm_device.h
> @@ -311,21 +311,6 @@ struct drm_device {
>  	 */
>  	struct drm_fb_helper *fb_helper;
>  
> -	/**
> -	 * @debugfs_mutex:
> -	 *
> -	 * Protects &debugfs_list access.
> -	 */
> -	struct mutex debugfs_mutex;
> -
> -	/**
> -	 * @debugfs_list:
> -	 *
> -	 * List of debugfs files to be created by the DRM device. The files
> -	 * must be added during drm_dev_register().
> -	 */
> -	struct list_head debugfs_list;
> -
>  	/* Everything below here is for legacy driver, never use! */
>  	/* private: */
>  #if IS_ENABLED(CONFIG_DRM_LEGACY)
> -- 
> 2.34.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: Try to address the drm_debugfs issues
  2023-02-09 14:06       ` Christian König
  2023-02-09 14:19         ` Maxime Ripard
@ 2023-02-16 11:34         ` Daniel Vetter
  2023-02-16 16:31           ` Christian König
  1 sibling, 1 reply; 50+ messages in thread
From: Daniel Vetter @ 2023-02-16 11:34 UTC (permalink / raw)
  To: Christian König
  Cc: daniel.vetter, Maíra Canal, dri-devel, mwen, mairacanal,
	maxime, wambui.karugax

On Thu, Feb 09, 2023 at 03:06:10PM +0100, Christian König wrote:
> Am 09.02.23 um 14:06 schrieb Maíra Canal:
> > On 2/9/23 09:13, Christian König wrote:
> > > Am 09.02.23 um 12:23 schrieb Maíra Canal:
> > > > On 2/9/23 05:18, Christian König wrote:
> > > > > Hello everyone,
> > > > > 
> > > > > the drm_debugfs has a couple of well known design problems.
> > > > > 
> > > > > Especially it wasn't possible to add files between
> > > > > initializing and registering
> > > > > of DRM devices since the underlying debugfs directory wasn't
> > > > > created yet.
> > > > > 
> > > > > The resulting necessity of the driver->debugfs_init()
> > > > > callback function is a
> > > > > mid-layering which is really frowned on since it creates a horrible
> > > > > driver->DRM->driver design layering.
> > > > > 
> > > > > The recent patch "drm/debugfs: create device-centered
> > > > > debugfs functions" tried
> > > > > to address those problem, but doesn't seem to work
> > > > > correctly. This looks like
> > > > > a misunderstanding of the call flow around
> > > > > drm_debugfs_init(), which is called
> > > > > multiple times, once for the primary and once for the render node.
> > > > > 
> > > > > So what happens now is the following:
> > > > > 
> > > > > 1. drm_dev_init() initially allocates the drm_minor objects.
> > > > > 2. ... back to the driver ...
> > > > > 3. drm_dev_register() is called.
> > > > > 
> > > > > 4. drm_debugfs_init() is called for the primary node.
> > > > > 5. drm_framebuffer_debugfs_init(), drm_client_debugfs_init() and
> > > > >     drm_atomic_debugfs_init() call drm_debugfs_add_file(s)()
> > > > > to add the files
> > > > >     for the primary node.
> > > > > 6. The driver->debugfs_init() callback is called to add
> > > > > debugfs files for the
> > > > >     primary node.
> > > > > 7. The added files are consumed and added to the primary
> > > > > node debugfs directory.
> > > > > 
> > > > > 8. drm_debugfs_init() is called for the render node.
> > > > > 9. drm_framebuffer_debugfs_init(), drm_client_debugfs_init() and
> > > > >     drm_atomic_debugfs_init() call drm_debugfs_add_file(s)()
> > > > > to add the files
> > > > >     again for the render node.
> > > > > 10. The driver->debugfs_init() callback is called to add
> > > > > debugfs files for the
> > > > >      render node.
> > > > > 11. The added files are consumed and added to the render
> > > > > node debugfs directory.
> > > > > 
> > > > > 12. Some more files are added through drm_debugfs_add_file().
> > > > > 13. drm_debugfs_late_register() add the files once more to
> > > > > the primary node
> > > > >      debugfs directory.
> > > > > 14. From this point on files added through
> > > > > drm_debugfs_add_file() are simply ignored.
> > > > > 15. ... back to the driver ...
> > > > > 
> > > > > Because of this the dev->debugfs_mutex lock is also
> > > > > completely pointless since
> > > > > any concurrent use of the interface would just randomly
> > > > > either add the files to
> > > > > the primary or render node or just not at all.
> > > > > 
> > > > > Even worse is that this implementation nails the coffin for
> > > > > removing the
> > > > > driver->debugfs_init() mid-layering because otherwise
> > > > > drivers can't control
> > > > > where their debugfs (primary/render node) are actually added.
> > > > > 
> > > > > This patch set here now tries to clean this up a bit, but
> > > > > most likely isn't
> > > > > fully complete either since I didn't audit every driver/call path.
> > > > 
> > > > I tested the patchset on the v3d, vc4 and vkms and all the files
> > > > are generated
> > > > as expected, but I'm getting the following errors on dmesg:
> > > > 
> > > > [    3.872026] debugfs: File 'v3d_ident' in directory '0'
> > > > already present!
> > > > [    3.872064] debugfs: File 'v3d_ident' in directory '128'
> > > > already present!
> > > > [    3.872078] debugfs: File 'v3d_regs' in directory '0' already
> > > > present!
> > > > [    3.872087] debugfs: File 'v3d_regs' in directory '128'
> > > > already present!
> > > > [    3.872097] debugfs: File 'measure_clock' in directory '0'
> > > > already present!
> > > > [    3.872105] debugfs: File 'measure_clock' in directory '128'
> > > > already present!
> > > > [    3.872116] debugfs: File 'bo_stats' in directory '0' already
> > > > present!
> > > > [    3.872124] debugfs: File 'bo_stats' in directory '128'
> > > > already present!
> > > > 
> > > > It looks like the render node is being added twice, since this
> > > > doesn't happen
> > > > for vc4 and vkms.
> > > 
> > > Thanks for the feedback and yes that's exactly what I meant with
> > > that I haven't looked into all code paths.
> > > 
> > > Could it be that v3d registers it's debugfs files from the
> > > debugfs_init callback?
> > 
> > Although this is true, I'm not sure if this is the reason why the files
> > are
> > being registered twice, as this doesn't happen to vc4, and it also uses
> > the
> > debugfs_init callback. I believe it is somewhat related to the fact that
> > v3d is the primary node and the render node.
> 
> I see. Thanks for the hint.
> 
> > 
> > Best Regards,
> > - Maíra Canal
> > 
> > > 
> > > One alternative would be to just completely nuke support for
> > > separate render node debugfs files and only add a symlink to the
> > > primary node. Opinions?
> 
> What do you think of this approach? I can't come up with any reason why we
> should have separate debugfs files for render nodes and I think it is pretty
> much the same reason you came up with the patch for per device debugfs files
> instead of per minor.

Yeah I think best is to symlink around a bit for compat. I thought we
were doing that already, and you can't actually create debugfs files on
render nodes? Or did I only dream about this?
-Daniel

> 
> Regards,
> Christian.
> 
> > > 
> > > Regards,
> > > Christian.
> > > 
> > > > 
> > > > Otherwise, the patchset looks good to me, but maybe Daniel has
> > > > some other
> > > > thoughts about it.
> > > > 
> > > > Best Regards,
> > > > - Maíra Canal
> > > > 
> > > > > 
> > > > > Please comment/discuss.
> > > > > 
> > > > > Cheers,
> > > > > Christian.
> > > > > 
> > > > > 
> > > 
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 3/3] drm/debugfs: remove dev->debugfs_list and debugfs_mutex
  2023-02-16 11:33   ` Daniel Vetter
@ 2023-02-16 11:37     ` Daniel Vetter
  2023-02-16 16:00     ` Christian König
  2023-02-16 16:37     ` Stanislaw Gruszka
  2 siblings, 0 replies; 50+ messages in thread
From: Daniel Vetter @ 2023-02-16 11:37 UTC (permalink / raw)
  To: Christian König
  Cc: daniel.vetter, mcanal, dri-devel, mwen, mairacanal, maxime,
	wambui.karugax

On Thu, Feb 16, 2023 at 12:33:08PM +0100, Daniel Vetter wrote:
> On Thu, Feb 09, 2023 at 09:18:38AM +0100, Christian König wrote:
> > The mutex was completely pointless in the first place since any
> > parallel adding of files to this list would result in random
> > behavior since the list is filled and consumed multiple times.
> > 
> > Completely drop that approach and just create the files directly.
> > 
> > This also re-adds the debugfs files to the render node directory and
> > removes drm_debugfs_late_register().
> > 
> > Signed-off-by: Christian König <christian.koenig@amd.com>
> > ---
> >  drivers/gpu/drm/drm_debugfs.c     | 32 +++++++------------------------
> >  drivers/gpu/drm/drm_drv.c         |  3 ---
> >  drivers/gpu/drm/drm_internal.h    |  5 -----
> >  drivers/gpu/drm/drm_mode_config.c |  2 --
> >  include/drm/drm_device.h          | 15 ---------------
> >  5 files changed, 7 insertions(+), 50 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/drm_debugfs.c b/drivers/gpu/drm/drm_debugfs.c
> > index 558e3a7271a5..a40288e67264 100644
> > --- a/drivers/gpu/drm/drm_debugfs.c
> > +++ b/drivers/gpu/drm/drm_debugfs.c
> > @@ -246,31 +246,9 @@ void drm_debugfs_dev_register(struct drm_device *dev)
> >  void drm_debugfs_minor_register(struct drm_minor *minor)
> >  {
> >  	struct drm_device *dev = minor->dev;
> > -	struct drm_debugfs_entry *entry, *tmp;
> >  
> >  	if (dev->driver->debugfs_init)
> >  		dev->driver->debugfs_init(minor);
> > -
> > -	list_for_each_entry_safe(entry, tmp, &dev->debugfs_list, list) {
> > -		debugfs_create_file(entry->file.name, 0444,
> > -				    minor->debugfs_root, entry, &drm_debugfs_entry_fops);
> > -		list_del(&entry->list);
> > -	}
> > -}
> > -
> > -void drm_debugfs_late_register(struct drm_device *dev)
> > -{
> > -	struct drm_minor *minor = dev->primary;
> > -	struct drm_debugfs_entry *entry, *tmp;
> > -
> > -	if (!minor)
> > -		return;
> > -
> > -	list_for_each_entry_safe(entry, tmp, &dev->debugfs_list, list) {
> > -		debugfs_create_file(entry->file.name, 0444,
> > -				    minor->debugfs_root, entry, &drm_debugfs_entry_fops);
> > -		list_del(&entry->list);
> > -	}
> >  }
> >  
> >  int drm_debugfs_remove_files(const struct drm_info_list *files, int count,
> > @@ -343,9 +321,13 @@ void drm_debugfs_add_file(struct drm_device *dev, const char *name,
> >  	entry->file.data = data;
> >  	entry->dev = dev;
> >  
> > -	mutex_lock(&dev->debugfs_mutex);
> > -	list_add(&entry->list, &dev->debugfs_list);
> > -	mutex_unlock(&dev->debugfs_mutex);
> > +	debugfs_create_file(name, 0444, dev->primary->debugfs_root, entry,
> > +			    &drm_debugfs_entry_fops);
> > +
> > +	/* TODO: This should probably only be a symlink */
> > +	if (dev->render)
> > +		debugfs_create_file(name, 0444, dev->render->debugfs_root,
> > +				    entry, &drm_debugfs_entry_fops);
> 
> Nope. You are fundamentally missing the point of all this, which is:
> 
> - drivers create debugfs files whenever they want to, as long as it's
>   _before_ drm_dev_register is called.
> 
> - drm_dev_register will set them all up.
> 
> This is necessary because otherwise you have the potential for some nice
> oops and stuff when userspace tries to access these files before the
> driver is ready.
> 
> Note that with sysfs all this infrastructure already exists, which is why
> you can create sysfs files whenever you feel like, and things won't go
> boom.
> 
> So yeah we need the list.
> 
> This also means that we really should not create the debugfs directories
> _before_ drm_dev_register is called. That's just fundamentally not how
> device interface setup should work:
> 
> 1. you allocate structs and stuff
> 2. you fully init everything
> 3. you register interfaces so they become userspace visible

What I forgot to add: The mutex seems superfluous and could probably be
removed. But we will need the mutex once this infra is extended to other
drm things like connector/crtc debugfs files, because you can hotplug
connectors. But maybe the mutex isn't even needed in that case (since for
a single object you still should not multi-thread anything).

So removing the mutex here seems like a reasonable thing to do, but
fundamentally the list and the entire delayed debugfs setup must stay.
Otherwise we cannot remove the entire debugfs_init midlayer mess without
creating huge amounts of driver bugs in the init sequencing.
-Daniel
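
The delayed setup argued for above (queue entries before registration,
publish them all at registration) can be sketched as a small userspace
model. This is only an illustration under stated assumptions: model_dev,
model_add_file() and model_dev_register() are hypothetical names, not DRM
or debugfs API, and rejecting late additions is a choice made for this
model only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MODEL_MAX_FILES 16

/* Hypothetical userspace model; none of these names are kernel API. */
struct pending_entry {
	const char *name;
	struct pending_entry *next;
};

struct model_dev {
	struct pending_entry *pending;        /* queued, not yet visible */
	const char *visible[MODEL_MAX_FILES]; /* "files" userspace can see */
	size_t n_visible;
	int registered;
};

/* Queue a file; nothing becomes visible until model_dev_register(). */
static int model_add_file(struct model_dev *dev, const char *name)
{
	struct pending_entry *e;

	assert(dev && name);
	if (dev->registered)
		return -1; /* assumption of this model: too late */

	e = malloc(sizeof(*e));
	if (!e)
		return -1;

	e->name = name;
	e->next = dev->pending; /* pushed at the head, so order is LIFO */
	dev->pending = e;
	return 0;
}

/* Publish everything queued so far, once the device is fully set up. */
static void model_dev_register(struct model_dev *dev)
{
	assert(dev);
	while (dev->pending) {
		struct pending_entry *e = dev->pending;

		dev->pending = e->next;
		if (dev->n_visible < MODEL_MAX_FILES)
			dev->visible[dev->n_visible++] = e->name;
		free(e);
	}
	dev->registered = 1;
}
```

In this model a driver may call model_add_file() at any point during its
init, and userspace cannot observe any file before model_dev_register()
has run, which is the ordering property the list is meant to provide.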


> -Daniel
> 
> >  }
> >  EXPORT_SYMBOL(drm_debugfs_add_file);
> >  
> > diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
> > index 2cbe028e548c..e7b88b65866c 100644
> > --- a/drivers/gpu/drm/drm_drv.c
> > +++ b/drivers/gpu/drm/drm_drv.c
> > @@ -597,7 +597,6 @@ static void drm_dev_init_release(struct drm_device *dev, void *res)
> >  	mutex_destroy(&dev->clientlist_mutex);
> >  	mutex_destroy(&dev->filelist_mutex);
> >  	mutex_destroy(&dev->struct_mutex);
> > -	mutex_destroy(&dev->debugfs_mutex);
> >  	drm_legacy_destroy_members(dev);
> >  }
> >  
> > @@ -638,14 +637,12 @@ static int drm_dev_init(struct drm_device *dev,
> >  	INIT_LIST_HEAD(&dev->filelist_internal);
> >  	INIT_LIST_HEAD(&dev->clientlist);
> >  	INIT_LIST_HEAD(&dev->vblank_event_list);
> > -	INIT_LIST_HEAD(&dev->debugfs_list);
> >  
> >  	spin_lock_init(&dev->event_lock);
> >  	mutex_init(&dev->struct_mutex);
> >  	mutex_init(&dev->filelist_mutex);
> >  	mutex_init(&dev->clientlist_mutex);
> >  	mutex_init(&dev->master_mutex);
> > -	mutex_init(&dev->debugfs_mutex);
> >  
> >  	ret = drmm_add_action_or_reset(dev, drm_dev_init_release, NULL);
> >  	if (ret)
> > diff --git a/drivers/gpu/drm/drm_internal.h b/drivers/gpu/drm/drm_internal.h
> > index 5ff7bf88f162..e215d00ba65c 100644
> > --- a/drivers/gpu/drm/drm_internal.h
> > +++ b/drivers/gpu/drm/drm_internal.h
> > @@ -188,7 +188,6 @@ int drm_debugfs_init(struct drm_minor *minor, int minor_id,
> >  void drm_debugfs_dev_register(struct drm_device *dev);
> >  void drm_debugfs_minor_register(struct drm_minor *minor);
> >  void drm_debugfs_cleanup(struct drm_minor *minor);
> > -void drm_debugfs_late_register(struct drm_device *dev);
> >  void drm_debugfs_connector_add(struct drm_connector *connector);
> >  void drm_debugfs_connector_remove(struct drm_connector *connector);
> >  void drm_debugfs_crtc_add(struct drm_crtc *crtc);
> > @@ -205,10 +204,6 @@ static inline void drm_debugfs_cleanup(struct drm_minor *minor)
> >  {
> >  }
> >  
> > -static inline void drm_debugfs_late_register(struct drm_device *dev)
> > -{
> > -}
> > -
> >  static inline void drm_debugfs_connector_add(struct drm_connector *connector)
> >  {
> >  }
> > diff --git a/drivers/gpu/drm/drm_mode_config.c b/drivers/gpu/drm/drm_mode_config.c
> > index 87eb591fe9b5..8525ef851540 100644
> > --- a/drivers/gpu/drm/drm_mode_config.c
> > +++ b/drivers/gpu/drm/drm_mode_config.c
> > @@ -54,8 +54,6 @@ int drm_modeset_register_all(struct drm_device *dev)
> >  	if (ret)
> >  		goto err_connector;
> >  
> > -	drm_debugfs_late_register(dev);
> > -
> >  	return 0;
> >  
> >  err_connector:
> > diff --git a/include/drm/drm_device.h b/include/drm/drm_device.h
> > index 7cf4afae2e79..900ad7478dd8 100644
> > --- a/include/drm/drm_device.h
> > +++ b/include/drm/drm_device.h
> > @@ -311,21 +311,6 @@ struct drm_device {
> >  	 */
> >  	struct drm_fb_helper *fb_helper;
> >  
> > -	/**
> > -	 * @debugfs_mutex:
> > -	 *
> > -	 * Protects &debugfs_list access.
> > -	 */
> > -	struct mutex debugfs_mutex;
> > -
> > -	/**
> > -	 * @debugfs_list:
> > -	 *
> > -	 * List of debugfs files to be created by the DRM device. The files
> > -	 * must be added during drm_dev_register().
> > -	 */
> > -	struct list_head debugfs_list;
> > -
> >  	/* Everything below here is for legacy driver, never use! */
> >  	/* private: */
> >  #if IS_ENABLED(CONFIG_DRM_LEGACY)
> > -- 
> > 2.34.1
> > 
> 
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


* Re: [PATCH 3/3] drm/debugfs: remove dev->debugfs_list and debugfs_mutex
  2023-02-16 11:33   ` Daniel Vetter
  2023-02-16 11:37     ` Daniel Vetter
@ 2023-02-16 16:00     ` Christian König
  2023-02-16 16:46       ` Jani Nikula
  2023-02-16 16:37     ` Stanislaw Gruszka
  2 siblings, 1 reply; 50+ messages in thread
From: Christian König @ 2023-02-16 16:00 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: daniel.vetter, mcanal, dri-devel, mwen, mairacanal, maxime,
	wambui.karugax

Am 16.02.23 um 12:33 schrieb Daniel Vetter:
> On Thu, Feb 09, 2023 at 09:18:38AM +0100, Christian König wrote:
>> The mutex was completely pointless in the first place since any
>> parallel adding of files to this list would result in random
>> behavior since the list is filled and consumed multiple times.
>>
>> Completely drop that approach and just create the files directly.
>>
>> This also re-adds the debugfs files to the render node directory and
>> removes drm_debugfs_late_register().
>>
>> Signed-off-by: Christian König <christian.koenig@amd.com>
>> ---
>>   drivers/gpu/drm/drm_debugfs.c     | 32 +++++++------------------------
>>   drivers/gpu/drm/drm_drv.c         |  3 ---
>>   drivers/gpu/drm/drm_internal.h    |  5 -----
>>   drivers/gpu/drm/drm_mode_config.c |  2 --
>>   include/drm/drm_device.h          | 15 ---------------
>>   5 files changed, 7 insertions(+), 50 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/drm_debugfs.c b/drivers/gpu/drm/drm_debugfs.c
>> index 558e3a7271a5..a40288e67264 100644
>> --- a/drivers/gpu/drm/drm_debugfs.c
>> +++ b/drivers/gpu/drm/drm_debugfs.c
>> @@ -246,31 +246,9 @@ void drm_debugfs_dev_register(struct drm_device *dev)
>>   void drm_debugfs_minor_register(struct drm_minor *minor)
>>   {
>>   	struct drm_device *dev = minor->dev;
>> -	struct drm_debugfs_entry *entry, *tmp;
>>   
>>   	if (dev->driver->debugfs_init)
>>   		dev->driver->debugfs_init(minor);
>> -
>> -	list_for_each_entry_safe(entry, tmp, &dev->debugfs_list, list) {
>> -		debugfs_create_file(entry->file.name, 0444,
>> -				    minor->debugfs_root, entry, &drm_debugfs_entry_fops);
>> -		list_del(&entry->list);
>> -	}
>> -}
>> -
>> -void drm_debugfs_late_register(struct drm_device *dev)
>> -{
>> -	struct drm_minor *minor = dev->primary;
>> -	struct drm_debugfs_entry *entry, *tmp;
>> -
>> -	if (!minor)
>> -		return;
>> -
>> -	list_for_each_entry_safe(entry, tmp, &dev->debugfs_list, list) {
>> -		debugfs_create_file(entry->file.name, 0444,
>> -				    minor->debugfs_root, entry, &drm_debugfs_entry_fops);
>> -		list_del(&entry->list);
>> -	}
>>   }
>>   
>>   int drm_debugfs_remove_files(const struct drm_info_list *files, int count,
>> @@ -343,9 +321,13 @@ void drm_debugfs_add_file(struct drm_device *dev, const char *name,
>>   	entry->file.data = data;
>>   	entry->dev = dev;
>>   
>> -	mutex_lock(&dev->debugfs_mutex);
>> -	list_add(&entry->list, &dev->debugfs_list);
>> -	mutex_unlock(&dev->debugfs_mutex);
>> +	debugfs_create_file(name, 0444, dev->primary->debugfs_root, entry,
>> +			    &drm_debugfs_entry_fops);
>> +
>> +	/* TODO: This should probably only be a symlink */
>> +	if (dev->render)
>> +		debugfs_create_file(name, 0444, dev->render->debugfs_root,
>> +				    entry, &drm_debugfs_entry_fops);
> Nope. You are fundamentally missing the point of all this, which is:
>
> - drivers create debugfs files whenever they want to, as long as it's
>    _before_ drm_dev_register is called.
>
> - drm_dev_register will set them all up.
>
> This is necessary because otherwise you have the potential for some nice
> oops and stuff when userspace tries to access these files before the
> driver is ready.
>
> Note that with sysfs all this infrastructure already exists, which is why
> you can create sysfs files whenever you feel like, and things won't go
> boom.

Well yeah, I've considered that. I just don't think it's a good idea for
debugfs.

debugfs is meant to be a helper for debugging things, and that especially
includes the time between drm_dev_init() and drm_dev_register(), because
that's where we probe the hardware and try to get it working.

Not having the debugfs files, which allow for things like hardware
register access and reading internal state, during that time is a really,
and I mean REALLY, bad idea. This is essentially what we have those files
for.

> So yeah we need the list.
>
> This also means that we really should not create the debugfs directories
> _before_ drm_dev_register is called. That's just fundamentally not how
> device interface setup should work:
>
> 1. you allocate structs and stuff
> 2. you fully init everything
> 3. you register interfaces so they become userspace visible

How about we create the debugfs directory early and only delay the files
registered through this drm_debugfs interface until registration time?

This way drivers can still decide whether they want the files available
immediately or only after registration.

What drivers currently do is something like radeon: setting an
accel_working flag and registering anyway, even if half the hardware
doesn't work.

Regards,
Christian.
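
To make the proposal above concrete, here is a minimal userspace sketch,
assuming a split between an "immediate" and a "deferred" path. All names
(split_dev, split_add_file_now(), split_add_file_deferred(),
split_dev_register()) are hypothetical and only model the idea. They are
not existing DRM or debugfs functions.

```c
#include <assert.h>
#include <stddef.h>

#define SPLIT_MAX_FILES 8

/* Hypothetical userspace model of the proposal; not the DRM API. */
struct split_dev {
	int dir_created;                       /* directory exists from init on */
	int registered;
	const char *visible[SPLIT_MAX_FILES];  /* files already visible */
	size_t n_visible;
	const char *deferred[SPLIT_MAX_FILES]; /* files waiting for register */
	size_t n_deferred;
};

/* The directory is created at init time, before any hardware probing. */
static void split_dev_init(struct split_dev *dev)
{
	*dev = (struct split_dev){ .dir_created = 1 };
}

/* Driver opts in to early visibility, e.g. register dumps during probe. */
static int split_add_file_now(struct split_dev *dev, const char *name)
{
	assert(dev->dir_created);
	if (dev->n_visible == SPLIT_MAX_FILES)
		return -1;
	dev->visible[dev->n_visible++] = name;
	return 0;
}

/* Deferred path: the file only shows up once the device is registered. */
static int split_add_file_deferred(struct split_dev *dev, const char *name)
{
	if (dev->registered)
		return split_add_file_now(dev, name);
	if (dev->n_deferred == SPLIT_MAX_FILES)
		return -1;
	dev->deferred[dev->n_deferred++] = name;
	return 0;
}

/* Registration publishes whatever was deferred until now. */
static void split_dev_register(struct split_dev *dev)
{
	for (size_t i = 0; i < dev->n_deferred; i++)
		(void)split_add_file_now(dev, dev->deferred[i]);
	dev->n_deferred = 0;
	dev->registered = 1;
}
```

The point of the split is that the driver, not the core, picks the
visibility of each file: a register dump needed while bringing up the
hardware goes through the immediate path, everything else waits for
registration.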

> -Daniel
>
>>   }
>>   EXPORT_SYMBOL(drm_debugfs_add_file);
>>   
>> diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
>> index 2cbe028e548c..e7b88b65866c 100644
>> --- a/drivers/gpu/drm/drm_drv.c
>> +++ b/drivers/gpu/drm/drm_drv.c
>> @@ -597,7 +597,6 @@ static void drm_dev_init_release(struct drm_device *dev, void *res)
>>   	mutex_destroy(&dev->clientlist_mutex);
>>   	mutex_destroy(&dev->filelist_mutex);
>>   	mutex_destroy(&dev->struct_mutex);
>> -	mutex_destroy(&dev->debugfs_mutex);
>>   	drm_legacy_destroy_members(dev);
>>   }
>>   
>> @@ -638,14 +637,12 @@ static int drm_dev_init(struct drm_device *dev,
>>   	INIT_LIST_HEAD(&dev->filelist_internal);
>>   	INIT_LIST_HEAD(&dev->clientlist);
>>   	INIT_LIST_HEAD(&dev->vblank_event_list);
>> -	INIT_LIST_HEAD(&dev->debugfs_list);
>>   
>>   	spin_lock_init(&dev->event_lock);
>>   	mutex_init(&dev->struct_mutex);
>>   	mutex_init(&dev->filelist_mutex);
>>   	mutex_init(&dev->clientlist_mutex);
>>   	mutex_init(&dev->master_mutex);
>> -	mutex_init(&dev->debugfs_mutex);
>>   
>>   	ret = drmm_add_action_or_reset(dev, drm_dev_init_release, NULL);
>>   	if (ret)
>> diff --git a/drivers/gpu/drm/drm_internal.h b/drivers/gpu/drm/drm_internal.h
>> index 5ff7bf88f162..e215d00ba65c 100644
>> --- a/drivers/gpu/drm/drm_internal.h
>> +++ b/drivers/gpu/drm/drm_internal.h
>> @@ -188,7 +188,6 @@ int drm_debugfs_init(struct drm_minor *minor, int minor_id,
>>   void drm_debugfs_dev_register(struct drm_device *dev);
>>   void drm_debugfs_minor_register(struct drm_minor *minor);
>>   void drm_debugfs_cleanup(struct drm_minor *minor);
>> -void drm_debugfs_late_register(struct drm_device *dev);
>>   void drm_debugfs_connector_add(struct drm_connector *connector);
>>   void drm_debugfs_connector_remove(struct drm_connector *connector);
>>   void drm_debugfs_crtc_add(struct drm_crtc *crtc);
>> @@ -205,10 +204,6 @@ static inline void drm_debugfs_cleanup(struct drm_minor *minor)
>>   {
>>   }
>>   
>> -static inline void drm_debugfs_late_register(struct drm_device *dev)
>> -{
>> -}
>> -
>>   static inline void drm_debugfs_connector_add(struct drm_connector *connector)
>>   {
>>   }
>> diff --git a/drivers/gpu/drm/drm_mode_config.c b/drivers/gpu/drm/drm_mode_config.c
>> index 87eb591fe9b5..8525ef851540 100644
>> --- a/drivers/gpu/drm/drm_mode_config.c
>> +++ b/drivers/gpu/drm/drm_mode_config.c
>> @@ -54,8 +54,6 @@ int drm_modeset_register_all(struct drm_device *dev)
>>   	if (ret)
>>   		goto err_connector;
>>   
>> -	drm_debugfs_late_register(dev);
>> -
>>   	return 0;
>>   
>>   err_connector:
>> diff --git a/include/drm/drm_device.h b/include/drm/drm_device.h
>> index 7cf4afae2e79..900ad7478dd8 100644
>> --- a/include/drm/drm_device.h
>> +++ b/include/drm/drm_device.h
>> @@ -311,21 +311,6 @@ struct drm_device {
>>   	 */
>>   	struct drm_fb_helper *fb_helper;
>>   
>> -	/**
>> -	 * @debugfs_mutex:
>> -	 *
>> -	 * Protects &debugfs_list access.
>> -	 */
>> -	struct mutex debugfs_mutex;
>> -
>> -	/**
>> -	 * @debugfs_list:
>> -	 *
>> -	 * List of debugfs files to be created by the DRM device. The files
>> -	 * must be added during drm_dev_register().
>> -	 */
>> -	struct list_head debugfs_list;
>> -
>>   	/* Everything below here is for legacy driver, never use! */
>>   	/* private: */
>>   #if IS_ENABLED(CONFIG_DRM_LEGACY)
>> -- 
>> 2.34.1
>>



* Re: Try to address the drm_debugfs issues
  2023-02-16 11:34         ` Daniel Vetter
@ 2023-02-16 16:31           ` Christian König
  2023-02-16 19:57             ` Daniel Vetter
  0 siblings, 1 reply; 50+ messages in thread
From: Christian König @ 2023-02-16 16:31 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: daniel.vetter, Maíra Canal, dri-devel, mwen, mairacanal,
	maxime, wambui.karugax



Am 16.02.23 um 12:34 schrieb Daniel Vetter:
> On Thu, Feb 09, 2023 at 03:06:10PM +0100, Christian König wrote:
>> Am 09.02.23 um 14:06 schrieb Maíra Canal:
>>> On 2/9/23 09:13, Christian König wrote:
>>>> Am 09.02.23 um 12:23 schrieb Maíra Canal:
>>>>> On 2/9/23 05:18, Christian König wrote:
>>>>>> Hello everyone,
>>>>>>
>>>>>> the drm_debugfs has a couple of well known design problems.
>>>>>>
>>>>>> Especially it wasn't possible to add files between
>>>>>> initializing and registering
>>>>>> of DRM devices since the underlying debugfs directory wasn't
>>>>>> created yet.
>>>>>>
>>>>>> The resulting necessity of the driver->debugfs_init()
>>>>>> callback function is a
>>>>>> mid-layering which is really frowned on since it creates a horrible
>>>>>> driver->DRM->driver design layering.
>>>>>>
>>>>>> The recent patch "drm/debugfs: create device-centered
>>>>>> debugfs functions" tried
>>>>>> to address those problem, but doesn't seem to work
>>>>>> correctly. This looks like
>>>>>> a misunderstanding of the call flow around
>>>>>> drm_debugfs_init(), which is called
>>>>>> multiple times, once for the primary and once for the render node.
>>>>>>
>>>>>> So what happens now is the following:
>>>>>>
>>>>>> 1. drm_dev_init() initially allocates the drm_minor objects.
>>>>>> 2. ... back to the driver ...
>>>>>> 3. drm_dev_register() is called.
>>>>>>
>>>>>> 4. drm_debugfs_init() is called for the primary node.
>>>>>> 5. drm_framebuffer_debugfs_init(), drm_client_debugfs_init() and
>>>>>>      drm_atomic_debugfs_init() call drm_debugfs_add_file(s)()
>>>>>> to add the files
>>>>>>      for the primary node.
>>>>>> 6. The driver->debugfs_init() callback is called to add
>>>>>> debugfs files for the
>>>>>>      primary node.
>>>>>> 7. The added files are consumed and added to the primary
>>>>>> node debugfs directory.
>>>>>>
>>>>>> 8. drm_debugfs_init() is called for the render node.
>>>>>> 9. drm_framebuffer_debugfs_init(), drm_client_debugfs_init() and
>>>>>>      drm_atomic_debugfs_init() call drm_debugfs_add_file(s)()
>>>>>> to add the files
>>>>>>      again for the render node.
>>>>>> 10. The driver->debugfs_init() callback is called to add
>>>>>> debugfs files for the
>>>>>>       render node.
>>>>>> 11. The added files are consumed and added to the render
>>>>>> node debugfs directory.
>>>>>>
>>>>>> 12. Some more files are added through drm_debugfs_add_file().
>>>>>> 13. drm_debugfs_late_register() add the files once more to
>>>>>> the primary node
>>>>>>       debugfs directory.
>>>>>> 14. From this point on files added through
>>>>>> drm_debugfs_add_file() are simply ignored.
>>>>>> 15. ... back to the driver ...
>>>>>>
>>>>>> Because of this the dev->debugfs_mutex lock is also
>>>>>> completely pointless since
>>>>>> any concurrent use of the interface would just randomly
>>>>>> either add the files to
>>>>>> the primary or render node or just not at all.
>>>>>>
>>>>>> Even worse is that this implementation nails the coffin for
>>>>>> removing the
>>>>>> driver->debugfs_init() mid-layering because otherwise
>>>>>> drivers can't control
>>>>>> where their debugfs (primary/render node) are actually added.
>>>>>>
>>>>>> This patch set here now tries to clean this up a bit, but
>>>>>> most likely isn't
>>>>>> fully complete either since I didn't audit every driver/call path.
>>>>> I tested the patchset on the v3d, vc4 and vkms and all the files
>>>>> are generated
>>>>> as expected, but I'm getting the following errors on dmesg:
>>>>>
>>>>> [    3.872026] debugfs: File 'v3d_ident' in directory '0'
>>>>> already present!
>>>>> [    3.872064] debugfs: File 'v3d_ident' in directory '128'
>>>>> already present!
>>>>> [    3.872078] debugfs: File 'v3d_regs' in directory '0' already
>>>>> present!
>>>>> [    3.872087] debugfs: File 'v3d_regs' in directory '128'
>>>>> already present!
>>>>> [    3.872097] debugfs: File 'measure_clock' in directory '0'
>>>>> already present!
>>>>> [    3.872105] debugfs: File 'measure_clock' in directory '128'
>>>>> already present!
>>>>> [    3.872116] debugfs: File 'bo_stats' in directory '0' already
>>>>> present!
>>>>> [    3.872124] debugfs: File 'bo_stats' in directory '128'
>>>>> already present!
>>>>>
>>>>> It looks like the render node is being added twice, since this
>>>>> doesn't happen
>>>>> for vc4 and vkms.
>>>> Thanks for the feedback and yes that's exactly what I meant with
>>>> that I haven't looked into all code paths.
>>>>
>>>> Could it be that v3d registers its debugfs files from the
>>>> debugfs_init callback?
>>> Although this is true, I'm not sure if this is the reason why the files
>>> are
>>> being registered twice, as this doesn't happen to vc4, and it also uses
>>> the
>>> debugfs_init callback. I believe it is somewhat related to the fact that
>>> v3d is the primary node and the render node.
>> I see. Thanks for the hint.
>>
>>> Best Regards,
>>> - Maíra Canal
>>>
>>>> One alternative would be to just completely nuke support for
>>>> separate render node debugfs files and only add a symlink to the
>>>> primary node. Opinions?
>> What do you think of this approach? I can't come up with any reason why we
>> should have separate debugfs files for render nodes and I think it is pretty
>> much the same reason you came up with the patch for per device debugfs files
>> instead of per minor.
> Yeah, I think the best option is to symlink around a bit for compat. I
> thought we were doing that already, and that you can't actually create
> debugfs files on render nodes? Or did I only dream about this?

No, we still have that distinction around unfortunately.

That's why this went boom for me in the first place.

Christian.

> -Daniel
>
>> Regards,
>> Christian.
>>
>>>> Regards,
>>>> Christian.
>>>>
>>>>> Otherwise, the patchset looks good to me, but maybe Daniel has
>>>>> some other
>>>>> thoughts about it.
>>>>>
>>>>> Best Regards,
>>>>> - Maíra Canal
>>>>>
>>>>>> Please comment/discuss.
>>>>>>
>>>>>> Cheers,
>>>>>> Christian.
>>>>>>
>>>>>>



* Re: [PATCH 3/3] drm/debugfs: remove dev->debugfs_list and debugfs_mutex
  2023-02-16 11:33   ` Daniel Vetter
  2023-02-16 11:37     ` Daniel Vetter
  2023-02-16 16:00     ` Christian König
@ 2023-02-16 16:37     ` Stanislaw Gruszka
  2023-02-16 17:06       ` Jani Nikula
  2 siblings, 1 reply; 50+ messages in thread
From: Stanislaw Gruszka @ 2023-02-16 16:37 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Christian König, mcanal, dri-devel, mwen, mairacanal,
	maxime, daniel.vetter, wambui.karugax

On Thu, Feb 16, 2023 at 12:33:08PM +0100, Daniel Vetter wrote:
> On Thu, Feb 09, 2023 at 09:18:38AM +0100, Christian König wrote:
> > The mutex was completely pointless in the first place since any
> > parallel adding of files to this list would result in random
> > behavior since the list is filled and consumed multiple times.
> > 
> > Completely drop that approach and just create the files directly.
> > 
> > This also re-adds the debugfs files to the render node directory and
> > removes drm_debugfs_late_register().
> > 
> > Signed-off-by: Christian König <christian.koenig@amd.com>
> > ---
> >  drivers/gpu/drm/drm_debugfs.c     | 32 +++++++------------------------
> >  drivers/gpu/drm/drm_drv.c         |  3 ---
> >  drivers/gpu/drm/drm_internal.h    |  5 -----
> >  drivers/gpu/drm/drm_mode_config.c |  2 --
> >  include/drm/drm_device.h          | 15 ---------------
> >  5 files changed, 7 insertions(+), 50 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/drm_debugfs.c b/drivers/gpu/drm/drm_debugfs.c
> > index 558e3a7271a5..a40288e67264 100644
> > --- a/drivers/gpu/drm/drm_debugfs.c
> > +++ b/drivers/gpu/drm/drm_debugfs.c
> > @@ -246,31 +246,9 @@ void drm_debugfs_dev_register(struct drm_device *dev)
> >  void drm_debugfs_minor_register(struct drm_minor *minor)
> >  {
> >  	struct drm_device *dev = minor->dev;
> > -	struct drm_debugfs_entry *entry, *tmp;
> >  
> >  	if (dev->driver->debugfs_init)
> >  		dev->driver->debugfs_init(minor);
> > -
> > -	list_for_each_entry_safe(entry, tmp, &dev->debugfs_list, list) {
> > -		debugfs_create_file(entry->file.name, 0444,
> > -				    minor->debugfs_root, entry, &drm_debugfs_entry_fops);
> > -		list_del(&entry->list);
> > -	}
> > -}
> > -
> > -void drm_debugfs_late_register(struct drm_device *dev)
> > -{
> > -	struct drm_minor *minor = dev->primary;
> > -	struct drm_debugfs_entry *entry, *tmp;
> > -
> > -	if (!minor)
> > -		return;
> > -
> > -	list_for_each_entry_safe(entry, tmp, &dev->debugfs_list, list) {
> > -		debugfs_create_file(entry->file.name, 0444,
> > -				    minor->debugfs_root, entry, &drm_debugfs_entry_fops);
> > -		list_del(&entry->list);
> > -	}
> >  }
> >  
> >  int drm_debugfs_remove_files(const struct drm_info_list *files, int count,
> > @@ -343,9 +321,13 @@ void drm_debugfs_add_file(struct drm_device *dev, const char *name,
> >  	entry->file.data = data;
> >  	entry->dev = dev;
> >  
> > -	mutex_lock(&dev->debugfs_mutex);
> > -	list_add(&entry->list, &dev->debugfs_list);
> > -	mutex_unlock(&dev->debugfs_mutex);
> > +	debugfs_create_file(name, 0444, dev->primary->debugfs_root, entry,
> > +			    &drm_debugfs_entry_fops);
> > +
> > +	/* TODO: This should probably only be a symlink */
> > +	if (dev->render)
> > +		debugfs_create_file(name, 0444, dev->render->debugfs_root,
> > +				    entry, &drm_debugfs_entry_fops);
> 
> Nope. You are fundamentally missing the point of all this, which is:
> 
> - drivers create debugfs files whenever they want to, as long as it's
>   _before_ drm_dev_register is called.
> 
> - drm_dev_register will set them all up.
> 
> This is necessary because otherwise you have the potential for some nice
> oops and stuff when userspace tries to access these files before the
> driver is ready.

But shouldn't this be the driver's responsibility: to call
drm_debugfs_add_file() only once it is ready to handle operations on the
added file?

Regards
Stanislaw

> Note that with sysfs all this infrastructure already exists, which is why
> you can create sysfs files whenever you feel like, and things won't go
> boom.
> 
> So yeah we need the list.
> 
> This also means that we really should not create the debugfs directories
> _before_ drm_dev_register is called. That's just fundamentally not how
> device interface setup should work:
> 
> 1. you allocate structs and stuff
> 2. you fully init everything
> 3. you register interfaces so they become userspace visible
> -Daniel
> 
> >  }
> >  EXPORT_SYMBOL(drm_debugfs_add_file);
> >  
> > diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
> > index 2cbe028e548c..e7b88b65866c 100644
> > --- a/drivers/gpu/drm/drm_drv.c
> > +++ b/drivers/gpu/drm/drm_drv.c
> > @@ -597,7 +597,6 @@ static void drm_dev_init_release(struct drm_device *dev, void *res)
> >  	mutex_destroy(&dev->clientlist_mutex);
> >  	mutex_destroy(&dev->filelist_mutex);
> >  	mutex_destroy(&dev->struct_mutex);
> > -	mutex_destroy(&dev->debugfs_mutex);
> >  	drm_legacy_destroy_members(dev);
> >  }
> >  
> > @@ -638,14 +637,12 @@ static int drm_dev_init(struct drm_device *dev,
> >  	INIT_LIST_HEAD(&dev->filelist_internal);
> >  	INIT_LIST_HEAD(&dev->clientlist);
> >  	INIT_LIST_HEAD(&dev->vblank_event_list);
> > -	INIT_LIST_HEAD(&dev->debugfs_list);
> >  
> >  	spin_lock_init(&dev->event_lock);
> >  	mutex_init(&dev->struct_mutex);
> >  	mutex_init(&dev->filelist_mutex);
> >  	mutex_init(&dev->clientlist_mutex);
> >  	mutex_init(&dev->master_mutex);
> > -	mutex_init(&dev->debugfs_mutex);
> >  
> >  	ret = drmm_add_action_or_reset(dev, drm_dev_init_release, NULL);
> >  	if (ret)
> > diff --git a/drivers/gpu/drm/drm_internal.h b/drivers/gpu/drm/drm_internal.h
> > index 5ff7bf88f162..e215d00ba65c 100644
> > --- a/drivers/gpu/drm/drm_internal.h
> > +++ b/drivers/gpu/drm/drm_internal.h
> > @@ -188,7 +188,6 @@ int drm_debugfs_init(struct drm_minor *minor, int minor_id,
> >  void drm_debugfs_dev_register(struct drm_device *dev);
> >  void drm_debugfs_minor_register(struct drm_minor *minor);
> >  void drm_debugfs_cleanup(struct drm_minor *minor);
> > -void drm_debugfs_late_register(struct drm_device *dev);
> >  void drm_debugfs_connector_add(struct drm_connector *connector);
> >  void drm_debugfs_connector_remove(struct drm_connector *connector);
> >  void drm_debugfs_crtc_add(struct drm_crtc *crtc);
> > @@ -205,10 +204,6 @@ static inline void drm_debugfs_cleanup(struct drm_minor *minor)
> >  {
> >  }
> >  
> > -static inline void drm_debugfs_late_register(struct drm_device *dev)
> > -{
> > -}
> > -
> >  static inline void drm_debugfs_connector_add(struct drm_connector *connector)
> >  {
> >  }
> > diff --git a/drivers/gpu/drm/drm_mode_config.c b/drivers/gpu/drm/drm_mode_config.c
> > index 87eb591fe9b5..8525ef851540 100644
> > --- a/drivers/gpu/drm/drm_mode_config.c
> > +++ b/drivers/gpu/drm/drm_mode_config.c
> > @@ -54,8 +54,6 @@ int drm_modeset_register_all(struct drm_device *dev)
> >  	if (ret)
> >  		goto err_connector;
> >  
> > -	drm_debugfs_late_register(dev);
> > -
> >  	return 0;
> >  
> >  err_connector:
> > diff --git a/include/drm/drm_device.h b/include/drm/drm_device.h
> > index 7cf4afae2e79..900ad7478dd8 100644
> > --- a/include/drm/drm_device.h
> > +++ b/include/drm/drm_device.h
> > @@ -311,21 +311,6 @@ struct drm_device {
> >  	 */
> >  	struct drm_fb_helper *fb_helper;
> >  
> > -	/**
> > -	 * @debugfs_mutex:
> > -	 *
> > -	 * Protects &debugfs_list access.
> > -	 */
> > -	struct mutex debugfs_mutex;
> > -
> > -	/**
> > -	 * @debugfs_list:
> > -	 *
> > -	 * List of debugfs files to be created by the DRM device. The files
> > -	 * must be added during drm_dev_register().
> > -	 */
> > -	struct list_head debugfs_list;
> > -
> >  	/* Everything below here is for legacy driver, never use! */
> >  	/* private: */
> >  #if IS_ENABLED(CONFIG_DRM_LEGACY)
> > -- 
> > 2.34.1
> > 
> 
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch


* Re: [PATCH 3/3] drm/debugfs: remove dev->debugfs_list and debugfs_mutex
  2023-02-16 16:00     ` Christian König
@ 2023-02-16 16:46       ` Jani Nikula
  2023-02-16 16:56         ` Christian König
  0 siblings, 1 reply; 50+ messages in thread
From: Jani Nikula @ 2023-02-16 16:46 UTC (permalink / raw)
  To: Christian König, Daniel Vetter
  Cc: daniel.vetter, mcanal, dri-devel, mwen, mairacanal, maxime,
	wambui.karugax

On Thu, 16 Feb 2023, Christian König <christian.koenig@amd.com> wrote:
> Am 16.02.23 um 12:33 schrieb Daniel Vetter:
>> On Thu, Feb 09, 2023 at 09:18:38AM +0100, Christian König wrote:
>>> The mutex was completely pointless in the first place since any
>>> parallel adding of files to this list would result in random
>>> behavior since the list is filled and consumed multiple times.
>>>
>>> Completely drop that approach and just create the files directly.
>>>
>>> This also re-adds the debugfs files to the render node directory and
>>> removes drm_debugfs_late_register().
>>>
>>> Signed-off-by: Christian König <christian.koenig@amd.com>
>>> ---
>>>   drivers/gpu/drm/drm_debugfs.c     | 32 +++++++------------------------
>>>   drivers/gpu/drm/drm_drv.c         |  3 ---
>>>   drivers/gpu/drm/drm_internal.h    |  5 -----
>>>   drivers/gpu/drm/drm_mode_config.c |  2 --
>>>   include/drm/drm_device.h          | 15 ---------------
>>>   5 files changed, 7 insertions(+), 50 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/drm_debugfs.c b/drivers/gpu/drm/drm_debugfs.c
>>> index 558e3a7271a5..a40288e67264 100644
>>> --- a/drivers/gpu/drm/drm_debugfs.c
>>> +++ b/drivers/gpu/drm/drm_debugfs.c
>>> @@ -246,31 +246,9 @@ void drm_debugfs_dev_register(struct drm_device *dev)
>>>   void drm_debugfs_minor_register(struct drm_minor *minor)
>>>   {
>>>   	struct drm_device *dev = minor->dev;
>>> -	struct drm_debugfs_entry *entry, *tmp;
>>>   
>>>   	if (dev->driver->debugfs_init)
>>>   		dev->driver->debugfs_init(minor);
>>> -
>>> -	list_for_each_entry_safe(entry, tmp, &dev->debugfs_list, list) {
>>> -		debugfs_create_file(entry->file.name, 0444,
>>> -				    minor->debugfs_root, entry, &drm_debugfs_entry_fops);
>>> -		list_del(&entry->list);
>>> -	}
>>> -}
>>> -
>>> -void drm_debugfs_late_register(struct drm_device *dev)
>>> -{
>>> -	struct drm_minor *minor = dev->primary;
>>> -	struct drm_debugfs_entry *entry, *tmp;
>>> -
>>> -	if (!minor)
>>> -		return;
>>> -
>>> -	list_for_each_entry_safe(entry, tmp, &dev->debugfs_list, list) {
>>> -		debugfs_create_file(entry->file.name, 0444,
>>> -				    minor->debugfs_root, entry, &drm_debugfs_entry_fops);
>>> -		list_del(&entry->list);
>>> -	}
>>>   }
>>>   
>>>   int drm_debugfs_remove_files(const struct drm_info_list *files, int count,
>>> @@ -343,9 +321,13 @@ void drm_debugfs_add_file(struct drm_device *dev, const char *name,
>>>   	entry->file.data = data;
>>>   	entry->dev = dev;
>>>   
>>> -	mutex_lock(&dev->debugfs_mutex);
>>> -	list_add(&entry->list, &dev->debugfs_list);
>>> -	mutex_unlock(&dev->debugfs_mutex);
>>> +	debugfs_create_file(name, 0444, dev->primary->debugfs_root, entry,
>>> +			    &drm_debugfs_entry_fops);
>>> +
>>> +	/* TODO: This should probably only be a symlink */
>>> +	if (dev->render)
>>> +		debugfs_create_file(name, 0444, dev->render->debugfs_root,
>>> +				    entry, &drm_debugfs_entry_fops);
>> Nope. You are fundamentally missing the point of all this, which is:
>>
>> - drivers create debugfs files whenever they want to, as long as it's
>>    _before_ drm_dev_register is called.
>>
>> - drm_dev_register will set them all up.
>>
>> This is necessary because otherwise you have the potential for some nice
>> oops and stuff when userspace tries to access these files before the
>> driver is ready.
>>
>> Note that with sysfs all this infrastructure already exists, which is why
>> you can create sysfs files whenever you feel like, and things won't go
>> boom.
>
> Well Yeah I've considered that, I just don't think it's a good idea for 
> debugfs.
>
> debugfs is meant to be a helper for debugging things and that especially 
> includes the time between drm_dev_init() and drm_dev_register() because 
> that's where we probe the hardware and try to get it working.
>
> Not having the debugfs files which allows for things like hardware 
> register access and reading internal state during that is a really and I 
> mean REALLY bad idea. This is essentially what we have those files for.

So you mean you want to have early debugfs so you can have some script
hammering the debugfs to get info out between init and register during
probe?

I just think registering debugfs before everything is ready is a recipe
for disaster. Every debugfs file would need to check all the conditions
it depends on across all of the probe stages. It'll be difficult to get
right, and you'll get cargo-culted checks copy-pasted all over the
place.


BR,
Jani.


>
>> So yeah we need the list.
>>
>> This also means that we really should not create the debugfs directories
>> _before_ drm_dev_register is called. That's just fundamentally not how
>> device interface setup should work:
>>
>> 1. you allocate structs and stuff
>> 2. you fully init everything
>> 3. you register interfaces so they become userspace visible
>
> How about we create the debugfs directory early and only delay the files 
> registered through this drm_debugfs interface until registration time?
>
> This way drivers can still decide if they want the files available 
> immediately or only after registration.
>
> What drivers currently do is like radeon setting an accel_working flag 
> and registering anyway even if half the hardware doesn't work.
>
> Regards,
> Christian.
>
>> -Daniel
>>
>>>   }
>>>   EXPORT_SYMBOL(drm_debugfs_add_file);
>>>   
>>> diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
>>> index 2cbe028e548c..e7b88b65866c 100644
>>> --- a/drivers/gpu/drm/drm_drv.c
>>> +++ b/drivers/gpu/drm/drm_drv.c
>>> @@ -597,7 +597,6 @@ static void drm_dev_init_release(struct drm_device *dev, void *res)
>>>   	mutex_destroy(&dev->clientlist_mutex);
>>>   	mutex_destroy(&dev->filelist_mutex);
>>>   	mutex_destroy(&dev->struct_mutex);
>>> -	mutex_destroy(&dev->debugfs_mutex);
>>>   	drm_legacy_destroy_members(dev);
>>>   }
>>>   
>>> @@ -638,14 +637,12 @@ static int drm_dev_init(struct drm_device *dev,
>>>   	INIT_LIST_HEAD(&dev->filelist_internal);
>>>   	INIT_LIST_HEAD(&dev->clientlist);
>>>   	INIT_LIST_HEAD(&dev->vblank_event_list);
>>> -	INIT_LIST_HEAD(&dev->debugfs_list);
>>>   
>>>   	spin_lock_init(&dev->event_lock);
>>>   	mutex_init(&dev->struct_mutex);
>>>   	mutex_init(&dev->filelist_mutex);
>>>   	mutex_init(&dev->clientlist_mutex);
>>>   	mutex_init(&dev->master_mutex);
>>> -	mutex_init(&dev->debugfs_mutex);
>>>   
>>>   	ret = drmm_add_action_or_reset(dev, drm_dev_init_release, NULL);
>>>   	if (ret)
>>> diff --git a/drivers/gpu/drm/drm_internal.h b/drivers/gpu/drm/drm_internal.h
>>> index 5ff7bf88f162..e215d00ba65c 100644
>>> --- a/drivers/gpu/drm/drm_internal.h
>>> +++ b/drivers/gpu/drm/drm_internal.h
>>> @@ -188,7 +188,6 @@ int drm_debugfs_init(struct drm_minor *minor, int minor_id,
>>>   void drm_debugfs_dev_register(struct drm_device *dev);
>>>   void drm_debugfs_minor_register(struct drm_minor *minor);
>>>   void drm_debugfs_cleanup(struct drm_minor *minor);
>>> -void drm_debugfs_late_register(struct drm_device *dev);
>>>   void drm_debugfs_connector_add(struct drm_connector *connector);
>>>   void drm_debugfs_connector_remove(struct drm_connector *connector);
>>>   void drm_debugfs_crtc_add(struct drm_crtc *crtc);
>>> @@ -205,10 +204,6 @@ static inline void drm_debugfs_cleanup(struct drm_minor *minor)
>>>   {
>>>   }
>>>   
>>> -static inline void drm_debugfs_late_register(struct drm_device *dev)
>>> -{
>>> -}
>>> -
>>>   static inline void drm_debugfs_connector_add(struct drm_connector *connector)
>>>   {
>>>   }
>>> diff --git a/drivers/gpu/drm/drm_mode_config.c b/drivers/gpu/drm/drm_mode_config.c
>>> index 87eb591fe9b5..8525ef851540 100644
>>> --- a/drivers/gpu/drm/drm_mode_config.c
>>> +++ b/drivers/gpu/drm/drm_mode_config.c
>>> @@ -54,8 +54,6 @@ int drm_modeset_register_all(struct drm_device *dev)
>>>   	if (ret)
>>>   		goto err_connector;
>>>   
>>> -	drm_debugfs_late_register(dev);
>>> -
>>>   	return 0;
>>>   
>>>   err_connector:
>>> diff --git a/include/drm/drm_device.h b/include/drm/drm_device.h
>>> index 7cf4afae2e79..900ad7478dd8 100644
>>> --- a/include/drm/drm_device.h
>>> +++ b/include/drm/drm_device.h
>>> @@ -311,21 +311,6 @@ struct drm_device {
>>>   	 */
>>>   	struct drm_fb_helper *fb_helper;
>>>   
>>> -	/**
>>> -	 * @debugfs_mutex:
>>> -	 *
>>> -	 * Protects &debugfs_list access.
>>> -	 */
>>> -	struct mutex debugfs_mutex;
>>> -
>>> -	/**
>>> -	 * @debugfs_list:
>>> -	 *
>>> -	 * List of debugfs files to be created by the DRM device. The files
>>> -	 * must be added during drm_dev_register().
>>> -	 */
>>> -	struct list_head debugfs_list;
>>> -
>>>   	/* Everything below here is for legacy driver, never use! */
>>>   	/* private: */
>>>   #if IS_ENABLED(CONFIG_DRM_LEGACY)
>>> -- 
>>> 2.34.1
>>>
>

-- 
Jani Nikula, Intel Open Source Graphics Center


* Re: [PATCH 3/3] drm/debugfs: remove dev->debugfs_list and debugfs_mutex
  2023-02-16 16:46       ` Jani Nikula
@ 2023-02-16 16:56         ` Christian König
  2023-02-16 17:08           ` Jani Nikula
  0 siblings, 1 reply; 50+ messages in thread
From: Christian König @ 2023-02-16 16:56 UTC (permalink / raw)
  To: Jani Nikula, Daniel Vetter
  Cc: daniel.vetter, mcanal, dri-devel, mwen, mairacanal, maxime,
	wambui.karugax

Am 16.02.23 um 17:46 schrieb Jani Nikula:
> On Thu, 16 Feb 2023, Christian König <christian.koenig@amd.com> wrote:
>> Am 16.02.23 um 12:33 schrieb Daniel Vetter:
>>> On Thu, Feb 09, 2023 at 09:18:38AM +0100, Christian König wrote:
>>>> The mutex was completely pointless in the first place since any
>>>> parallel adding of files to this list would result in random
>>>> behavior since the list is filled and consumed multiple times.
>>>>
>>>> Completely drop that approach and just create the files directly.
>>>>
>>>> This also re-adds the debugfs files to the render node directory and
>>>> removes drm_debugfs_late_register().
>>>>
>>>> Signed-off-by: Christian König <christian.koenig@amd.com>
>>>> ---
>>>>    drivers/gpu/drm/drm_debugfs.c     | 32 +++++++------------------------
>>>>    drivers/gpu/drm/drm_drv.c         |  3 ---
>>>>    drivers/gpu/drm/drm_internal.h    |  5 -----
>>>>    drivers/gpu/drm/drm_mode_config.c |  2 --
>>>>    include/drm/drm_device.h          | 15 ---------------
>>>>    5 files changed, 7 insertions(+), 50 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/drm_debugfs.c b/drivers/gpu/drm/drm_debugfs.c
>>>> index 558e3a7271a5..a40288e67264 100644
>>>> --- a/drivers/gpu/drm/drm_debugfs.c
>>>> +++ b/drivers/gpu/drm/drm_debugfs.c
>>>> @@ -246,31 +246,9 @@ void drm_debugfs_dev_register(struct drm_device *dev)
>>>>    void drm_debugfs_minor_register(struct drm_minor *minor)
>>>>    {
>>>>    	struct drm_device *dev = minor->dev;
>>>> -	struct drm_debugfs_entry *entry, *tmp;
>>>>    
>>>>    	if (dev->driver->debugfs_init)
>>>>    		dev->driver->debugfs_init(minor);
>>>> -
>>>> -	list_for_each_entry_safe(entry, tmp, &dev->debugfs_list, list) {
>>>> -		debugfs_create_file(entry->file.name, 0444,
>>>> -				    minor->debugfs_root, entry, &drm_debugfs_entry_fops);
>>>> -		list_del(&entry->list);
>>>> -	}
>>>> -}
>>>> -
>>>> -void drm_debugfs_late_register(struct drm_device *dev)
>>>> -{
>>>> -	struct drm_minor *minor = dev->primary;
>>>> -	struct drm_debugfs_entry *entry, *tmp;
>>>> -
>>>> -	if (!minor)
>>>> -		return;
>>>> -
>>>> -	list_for_each_entry_safe(entry, tmp, &dev->debugfs_list, list) {
>>>> -		debugfs_create_file(entry->file.name, 0444,
>>>> -				    minor->debugfs_root, entry, &drm_debugfs_entry_fops);
>>>> -		list_del(&entry->list);
>>>> -	}
>>>>    }
>>>>    
>>>>    int drm_debugfs_remove_files(const struct drm_info_list *files, int count,
>>>> @@ -343,9 +321,13 @@ void drm_debugfs_add_file(struct drm_device *dev, const char *name,
>>>>    	entry->file.data = data;
>>>>    	entry->dev = dev;
>>>>    
>>>> -	mutex_lock(&dev->debugfs_mutex);
>>>> -	list_add(&entry->list, &dev->debugfs_list);
>>>> -	mutex_unlock(&dev->debugfs_mutex);
>>>> +	debugfs_create_file(name, 0444, dev->primary->debugfs_root, entry,
>>>> +			    &drm_debugfs_entry_fops);
>>>> +
>>>> +	/* TODO: This should probably only be a symlink */
>>>> +	if (dev->render)
>>>> +		debugfs_create_file(name, 0444, dev->render->debugfs_root,
>>>> +				    entry, &drm_debugfs_entry_fops);
>>> Nope. You are fundamentally missing the point of all this, which is:
>>>
>>> - drivers create debugfs files whenever they want to, as long as it's
>>>     _before_ drm_dev_register is called.
>>>
>>> - drm_dev_register will set them all up.
>>>
>>> This is necessary because otherwise you have the potential for some nice
>>> oops and stuff when userspace tries to access these files before the
>>> driver is ready.
>>>
>>> Note that with sysfs all this infrastructure already exists, which is why
>>> you can create sysfs files whenever you feel like, and things won't go
>>> boom.
>> Well Yeah I've considered that, I just don't think it's a good idea for
>> debugfs.
>>
>> debugfs is meant to be a helper for debugging things and that especially
>> includes the time between drm_dev_init() and drm_dev_register() because
>> that's where we probe the hardware and try to get it working.
>>
>> Not having the debugfs files which allows for things like hardware
>> register access and reading internal state during that is a really and I
>> mean REALLY bad idea. This is essentially what we have those files for.
> So you mean you want to have early debugfs so you can have some script
> hammering the debugfs to get info out between init and register during
> probe?

Well, not hammering. What we usually do in bringup is set the firmware
timeout to infinity, and the driver then sits and waits for the hw.

The tool used to access registers currently goes directly through the PCI
BAR, but that's a bad idea for registers that need a lock held during
access (like index/data pairs).
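
To illustrate the index/data point, here is a minimal userspace sketch. It
is not driver code: struct model_indirect and model_indirect_read() are
hypothetical names, and a pthread mutex stands in for whatever lock the
driver would take. A tool poking the BAR directly bypasses that lock and
can change the index between the driver's index write and its data read.

```c
#include <pthread.h>
#include <stdint.h>

#define MODEL_NUM_REGS 8

/* Hypothetical model of an index/data register pair (not a real API). */
struct model_indirect {
	pthread_mutex_t lock;             /* serializes index + data access */
	uint32_t index;                   /* models the INDEX register */
	uint32_t backing[MODEL_NUM_REGS]; /* registers reachable via DATA */
};

/*
 * Read one indirect register. The index write and the data read have to
 * happen under the same lock, otherwise a concurrent accessor could
 * change the index in between and the wrong register would be returned.
 * Returns 0 on success, -1 if the index is out of range.
 */
static int model_indirect_read(struct model_indirect *r, uint32_t reg,
			       uint32_t *val)
{
	if (reg >= MODEL_NUM_REGS)
		return -1;

	pthread_mutex_lock(&r->lock);
	r->index = reg;                /* select the register */
	*val = r->backing[r->index];   /* read it through "DATA" */
	pthread_mutex_unlock(&r->lock);

	return 0;
}
```

A debugfs file that goes through the driver would take the same lock as
every other accessor, which is the argument for having it available early.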

>
> I just think registering debugfs before everything is ready is a recipe
> for disaster. All of the debugfs needs to check all the conditions that
> they need across all of the probe stages. It'll be difficult to get it
> right. And you'll get cargo culted checks copy pasted all over the
> place.

Yeah, but it's debugfs. That is not supposed to work under all conditions.

Just try to read amdgpu_regs at a nonexistent register index. This will
just hang or reboot your box immediately on APUs.

Regards,
Christian.

>
>
> BR,
> Jani.
>
>
>>> So yeah we need the list.
>>>
>>> This also means that we really should not create the debugfs directories
>>> _before_ drm_dev_register is called. That's just fundamentally not how
>>> device interface setup should work:
>>>
>>> 1. you allocate structs and stuff
>>> 2. you fully init everything
>>> 3. you register interfaces so they become userspace visible
>> How about we create the debugfs directory early and only delay the files
>> registered through this drm_debugfs interface until registration time?
>>
>> This way drivers can still decide if they want the files available
>> immediately or only after registration.
>>
>> What drivers currently do is like radeon setting an accel_working flag
>> and registering anyway even if half the hardware doesn't work.
>>
>> Regards,
>> Christian.
>>
>>> -Daniel
>>>
>>>>    }
>>>>    EXPORT_SYMBOL(drm_debugfs_add_file);
>>>>    
>>>> diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
>>>> index 2cbe028e548c..e7b88b65866c 100644
>>>> --- a/drivers/gpu/drm/drm_drv.c
>>>> +++ b/drivers/gpu/drm/drm_drv.c
>>>> @@ -597,7 +597,6 @@ static void drm_dev_init_release(struct drm_device *dev, void *res)
>>>>    	mutex_destroy(&dev->clientlist_mutex);
>>>>    	mutex_destroy(&dev->filelist_mutex);
>>>>    	mutex_destroy(&dev->struct_mutex);
>>>> -	mutex_destroy(&dev->debugfs_mutex);
>>>>    	drm_legacy_destroy_members(dev);
>>>>    }
>>>>    
>>>> @@ -638,14 +637,12 @@ static int drm_dev_init(struct drm_device *dev,
>>>>    	INIT_LIST_HEAD(&dev->filelist_internal);
>>>>    	INIT_LIST_HEAD(&dev->clientlist);
>>>>    	INIT_LIST_HEAD(&dev->vblank_event_list);
>>>> -	INIT_LIST_HEAD(&dev->debugfs_list);
>>>>    
>>>>    	spin_lock_init(&dev->event_lock);
>>>>    	mutex_init(&dev->struct_mutex);
>>>>    	mutex_init(&dev->filelist_mutex);
>>>>    	mutex_init(&dev->clientlist_mutex);
>>>>    	mutex_init(&dev->master_mutex);
>>>> -	mutex_init(&dev->debugfs_mutex);
>>>>    
>>>>    	ret = drmm_add_action_or_reset(dev, drm_dev_init_release, NULL);
>>>>    	if (ret)
>>>> diff --git a/drivers/gpu/drm/drm_internal.h b/drivers/gpu/drm/drm_internal.h
>>>> index 5ff7bf88f162..e215d00ba65c 100644
>>>> --- a/drivers/gpu/drm/drm_internal.h
>>>> +++ b/drivers/gpu/drm/drm_internal.h
>>>> @@ -188,7 +188,6 @@ int drm_debugfs_init(struct drm_minor *minor, int minor_id,
>>>>    void drm_debugfs_dev_register(struct drm_device *dev);
>>>>    void drm_debugfs_minor_register(struct drm_minor *minor);
>>>>    void drm_debugfs_cleanup(struct drm_minor *minor);
>>>> -void drm_debugfs_late_register(struct drm_device *dev);
>>>>    void drm_debugfs_connector_add(struct drm_connector *connector);
>>>>    void drm_debugfs_connector_remove(struct drm_connector *connector);
>>>>    void drm_debugfs_crtc_add(struct drm_crtc *crtc);
>>>> @@ -205,10 +204,6 @@ static inline void drm_debugfs_cleanup(struct drm_minor *minor)
>>>>    {
>>>>    }
>>>>    
>>>> -static inline void drm_debugfs_late_register(struct drm_device *dev)
>>>> -{
>>>> -}
>>>> -
>>>>    static inline void drm_debugfs_connector_add(struct drm_connector *connector)
>>>>    {
>>>>    }
>>>> diff --git a/drivers/gpu/drm/drm_mode_config.c b/drivers/gpu/drm/drm_mode_config.c
>>>> index 87eb591fe9b5..8525ef851540 100644
>>>> --- a/drivers/gpu/drm/drm_mode_config.c
>>>> +++ b/drivers/gpu/drm/drm_mode_config.c
>>>> @@ -54,8 +54,6 @@ int drm_modeset_register_all(struct drm_device *dev)
>>>>    	if (ret)
>>>>    		goto err_connector;
>>>>    
>>>> -	drm_debugfs_late_register(dev);
>>>> -
>>>>    	return 0;
>>>>    
>>>>    err_connector:
>>>> diff --git a/include/drm/drm_device.h b/include/drm/drm_device.h
>>>> index 7cf4afae2e79..900ad7478dd8 100644
>>>> --- a/include/drm/drm_device.h
>>>> +++ b/include/drm/drm_device.h
>>>> @@ -311,21 +311,6 @@ struct drm_device {
>>>>    	 */
>>>>    	struct drm_fb_helper *fb_helper;
>>>>    
>>>> -	/**
>>>> -	 * @debugfs_mutex:
>>>> -	 *
>>>> -	 * Protects &debugfs_list access.
>>>> -	 */
>>>> -	struct mutex debugfs_mutex;
>>>> -
>>>> -	/**
>>>> -	 * @debugfs_list:
>>>> -	 *
>>>> -	 * List of debugfs files to be created by the DRM device. The files
>>>> -	 * must be added during drm_dev_register().
>>>> -	 */
>>>> -	struct list_head debugfs_list;
>>>> -
>>>>    	/* Everything below here is for legacy driver, never use! */
>>>>    	/* private: */
>>>>    #if IS_ENABLED(CONFIG_DRM_LEGACY)
>>>> -- 
>>>> 2.34.1
>>>>



* Re: [PATCH 3/3] drm/debugfs: remove dev->debugfs_list and debugfs_mutex
  2023-02-16 16:37     ` Stanislaw Gruszka
@ 2023-02-16 17:06       ` Jani Nikula
  2023-02-16 19:56         ` Daniel Vetter
  2023-02-17 10:35         ` Stanislaw Gruszka
  0 siblings, 2 replies; 50+ messages in thread
From: Jani Nikula @ 2023-02-16 17:06 UTC (permalink / raw)
  To: Stanislaw Gruszka, Daniel Vetter
  Cc: Christian König, mcanal, dri-devel, mwen, mairacanal,
	maxime, daniel.vetter, wambui.karugax

On Thu, 16 Feb 2023, Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> wrote:
> On Thu, Feb 16, 2023 at 12:33:08PM +0100, Daniel Vetter wrote:
>> On Thu, Feb 09, 2023 at 09:18:38AM +0100, Christian König wrote:
>> > The mutex was completely pointless in the first place since any
>> > parallel adding of files to this list would result in random
>> > behavior since the list is filled and consumed multiple times.
>> > 
>> > Completely drop that approach and just create the files directly.
>> > 
>> > This also re-adds the debugfs files to the render node directory and
>> > removes drm_debugfs_late_register().
>> > 
>> > Signed-off-by: Christian König <christian.koenig@amd.com>
>> > ---
>> >  drivers/gpu/drm/drm_debugfs.c     | 32 +++++++------------------------
>> >  drivers/gpu/drm/drm_drv.c         |  3 ---
>> >  drivers/gpu/drm/drm_internal.h    |  5 -----
>> >  drivers/gpu/drm/drm_mode_config.c |  2 --
>> >  include/drm/drm_device.h          | 15 ---------------
>> >  5 files changed, 7 insertions(+), 50 deletions(-)
>> > 
>> > diff --git a/drivers/gpu/drm/drm_debugfs.c b/drivers/gpu/drm/drm_debugfs.c
>> > index 558e3a7271a5..a40288e67264 100644
>> > --- a/drivers/gpu/drm/drm_debugfs.c
>> > +++ b/drivers/gpu/drm/drm_debugfs.c
>> > @@ -246,31 +246,9 @@ void drm_debugfs_dev_register(struct drm_device *dev)
>> >  void drm_debugfs_minor_register(struct drm_minor *minor)
>> >  {
>> >  	struct drm_device *dev = minor->dev;
>> > -	struct drm_debugfs_entry *entry, *tmp;
>> >  
>> >  	if (dev->driver->debugfs_init)
>> >  		dev->driver->debugfs_init(minor);
>> > -
>> > -	list_for_each_entry_safe(entry, tmp, &dev->debugfs_list, list) {
>> > -		debugfs_create_file(entry->file.name, 0444,
>> > -				    minor->debugfs_root, entry, &drm_debugfs_entry_fops);
>> > -		list_del(&entry->list);
>> > -	}
>> > -}
>> > -
>> > -void drm_debugfs_late_register(struct drm_device *dev)
>> > -{
>> > -	struct drm_minor *minor = dev->primary;
>> > -	struct drm_debugfs_entry *entry, *tmp;
>> > -
>> > -	if (!minor)
>> > -		return;
>> > -
>> > -	list_for_each_entry_safe(entry, tmp, &dev->debugfs_list, list) {
>> > -		debugfs_create_file(entry->file.name, 0444,
>> > -				    minor->debugfs_root, entry, &drm_debugfs_entry_fops);
>> > -		list_del(&entry->list);
>> > -	}
>> >  }
>> >  
>> >  int drm_debugfs_remove_files(const struct drm_info_list *files, int count,
>> > @@ -343,9 +321,13 @@ void drm_debugfs_add_file(struct drm_device *dev, const char *name,
>> >  	entry->file.data = data;
>> >  	entry->dev = dev;
>> >  
>> > -	mutex_lock(&dev->debugfs_mutex);
>> > -	list_add(&entry->list, &dev->debugfs_list);
>> > -	mutex_unlock(&dev->debugfs_mutex);
>> > +	debugfs_create_file(name, 0444, dev->primary->debugfs_root, entry,
>> > +			    &drm_debugfs_entry_fops);
>> > +
>> > +	/* TODO: This should probably only be a symlink */
>> > +	if (dev->render)
>> > +		debugfs_create_file(name, 0444, dev->render->debugfs_root,
>> > +				    entry, &drm_debugfs_entry_fops);
>> 
>> Nope. You are fundamentally missing the point of all this, which is:
>> 
>> - drivers create debugfs files whenever they want to, as long as it's
>>   _before_ drm_dev_register is called.
>> 
>> - drm_dev_register will set them all up.
>> 
>> This is necessary because otherwise you have the potential for some nice
>> oops and stuff when userspace tries to access these files before the
>> driver is ready.
>
> But shouldn't this be the driver's responsibility: call drm_debugfs_add_file()
> only once you are ready to handle operations on the added file?

In theory, yes, but in practice it's pretty hard for a non-trivial
driver to maintain that all the conditions are met.

In i915 we call debugfs register all over the place only after we've
called drm_dev_register(), because it's the only sane way. But it means
we need the init and register separated everywhere, instead of init
adding files to a list to be registered later.
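
For reference, the "init adds files to a list, register creates them"
pattern can be sketched as below. This is a simplified userspace model
under stated assumptions, not the drm_debugfs code: struct model_device,
model_add_file() and model_register() are hypothetical names, and a
counter stands in for actually creating debugfs files.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical model of a queued debugfs entry (not a kernel type). */
struct model_entry {
	const char *name;
	struct model_entry *next;
};

/* Hypothetical model of a device with deferred file creation. */
struct model_device {
	struct model_entry *pending; /* files queued before registration */
	int registered;              /* set once userspace can see the device */
	int visible_files;           /* files userspace could open right now */
};

/*
 * Before registration the file is only queued; after registration it is
 * created immediately. Returns 0 on success, -1 on allocation failure.
 */
static int model_add_file(struct model_device *dev, const char *name)
{
	struct model_entry *e;

	if (dev->registered) {
		dev->visible_files++;
		return 0;
	}

	e = malloc(sizeof(*e));
	if (!e)
		return -1;

	e->name = name;
	e->next = dev->pending;
	dev->pending = e;
	return 0;
}

/* Consume the queue exactly once and make all queued files visible. */
static void model_register(struct model_device *dev)
{
	struct model_entry *e = dev->pending;

	assert(!dev->registered);

	while (e) {
		struct model_entry *next = e->next;

		dev->visible_files++; /* stands in for creating the file */
		free(e);
		e = next;
	}

	dev->pending = NULL;
	dev->registered = 1;
}
```

In this model nothing is visible until model_register() runs, so a file
handler never executes against a half-initialized device. The price is
that files cannot be used for debugging during probe.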

BR,
Jani.



>
> Regards
> Stanislaw
>
>> Note that with sysfs all this infrastructure already exists, which is why
>> you can create sysfs files whenever you feel like, and things won't go
>> boom.
>> 
>> So yeah we need the list.
>> 
>> This also means that we really should not create the debugfs directories
>> _before_ drm_dev_register is called. That's just fundamentally not how
>> device interface setup should work:
>> 
>> 1. you allocate structs and stuff
>> 2. you fully init everything
>> 3. you register interfaces so they become userspace visible
>> -Daniel
>> 
>> >  }
>> >  EXPORT_SYMBOL(drm_debugfs_add_file);
>> >  
>> > diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
>> > index 2cbe028e548c..e7b88b65866c 100644
>> > --- a/drivers/gpu/drm/drm_drv.c
>> > +++ b/drivers/gpu/drm/drm_drv.c
>> > @@ -597,7 +597,6 @@ static void drm_dev_init_release(struct drm_device *dev, void *res)
>> >  	mutex_destroy(&dev->clientlist_mutex);
>> >  	mutex_destroy(&dev->filelist_mutex);
>> >  	mutex_destroy(&dev->struct_mutex);
>> > -	mutex_destroy(&dev->debugfs_mutex);
>> >  	drm_legacy_destroy_members(dev);
>> >  }
>> >  
>> > @@ -638,14 +637,12 @@ static int drm_dev_init(struct drm_device *dev,
>> >  	INIT_LIST_HEAD(&dev->filelist_internal);
>> >  	INIT_LIST_HEAD(&dev->clientlist);
>> >  	INIT_LIST_HEAD(&dev->vblank_event_list);
>> > -	INIT_LIST_HEAD(&dev->debugfs_list);
>> >  
>> >  	spin_lock_init(&dev->event_lock);
>> >  	mutex_init(&dev->struct_mutex);
>> >  	mutex_init(&dev->filelist_mutex);
>> >  	mutex_init(&dev->clientlist_mutex);
>> >  	mutex_init(&dev->master_mutex);
>> > -	mutex_init(&dev->debugfs_mutex);
>> >  
>> >  	ret = drmm_add_action_or_reset(dev, drm_dev_init_release, NULL);
>> >  	if (ret)
>> > diff --git a/drivers/gpu/drm/drm_internal.h b/drivers/gpu/drm/drm_internal.h
>> > index 5ff7bf88f162..e215d00ba65c 100644
>> > --- a/drivers/gpu/drm/drm_internal.h
>> > +++ b/drivers/gpu/drm/drm_internal.h
>> > @@ -188,7 +188,6 @@ int drm_debugfs_init(struct drm_minor *minor, int minor_id,
>> >  void drm_debugfs_dev_register(struct drm_device *dev);
>> >  void drm_debugfs_minor_register(struct drm_minor *minor);
>> >  void drm_debugfs_cleanup(struct drm_minor *minor);
>> > -void drm_debugfs_late_register(struct drm_device *dev);
>> >  void drm_debugfs_connector_add(struct drm_connector *connector);
>> >  void drm_debugfs_connector_remove(struct drm_connector *connector);
>> >  void drm_debugfs_crtc_add(struct drm_crtc *crtc);
>> > @@ -205,10 +204,6 @@ static inline void drm_debugfs_cleanup(struct drm_minor *minor)
>> >  {
>> >  }
>> >  
>> > -static inline void drm_debugfs_late_register(struct drm_device *dev)
>> > -{
>> > -}
>> > -
>> >  static inline void drm_debugfs_connector_add(struct drm_connector *connector)
>> >  {
>> >  }
>> > diff --git a/drivers/gpu/drm/drm_mode_config.c b/drivers/gpu/drm/drm_mode_config.c
>> > index 87eb591fe9b5..8525ef851540 100644
>> > --- a/drivers/gpu/drm/drm_mode_config.c
>> > +++ b/drivers/gpu/drm/drm_mode_config.c
>> > @@ -54,8 +54,6 @@ int drm_modeset_register_all(struct drm_device *dev)
>> >  	if (ret)
>> >  		goto err_connector;
>> >  
>> > -	drm_debugfs_late_register(dev);
>> > -
>> >  	return 0;
>> >  
>> >  err_connector:
>> > diff --git a/include/drm/drm_device.h b/include/drm/drm_device.h
>> > index 7cf4afae2e79..900ad7478dd8 100644
>> > --- a/include/drm/drm_device.h
>> > +++ b/include/drm/drm_device.h
>> > @@ -311,21 +311,6 @@ struct drm_device {
>> >  	 */
>> >  	struct drm_fb_helper *fb_helper;
>> >  
>> > -	/**
>> > -	 * @debugfs_mutex:
>> > -	 *
>> > -	 * Protects &debugfs_list access.
>> > -	 */
>> > -	struct mutex debugfs_mutex;
>> > -
>> > -	/**
>> > -	 * @debugfs_list:
>> > -	 *
>> > -	 * List of debugfs files to be created by the DRM device. The files
>> > -	 * must be added during drm_dev_register().
>> > -	 */
>> > -	struct list_head debugfs_list;
>> > -
>> >  	/* Everything below here is for legacy driver, never use! */
>> >  	/* private: */
>> >  #if IS_ENABLED(CONFIG_DRM_LEGACY)
>> > -- 
>> > 2.34.1
>> > 
>> 
>> -- 
>> Daniel Vetter
>> Software Engineer, Intel Corporation
>> http://blog.ffwll.ch

-- 
Jani Nikula, Intel Open Source Graphics Center


* Re: [PATCH 3/3] drm/debugfs: remove dev->debugfs_list and debugfs_mutex
  2023-02-16 16:56         ` Christian König
@ 2023-02-16 17:08           ` Jani Nikula
  2023-02-16 19:54             ` Daniel Vetter
  0 siblings, 1 reply; 50+ messages in thread
From: Jani Nikula @ 2023-02-16 17:08 UTC (permalink / raw)
  To: Christian König, Daniel Vetter
  Cc: daniel.vetter, mcanal, dri-devel, mwen, mairacanal, maxime,
	wambui.karugax

On Thu, 16 Feb 2023, Christian König <christian.koenig@amd.com> wrote:
> Am 16.02.23 um 17:46 schrieb Jani Nikula:
>> On Thu, 16 Feb 2023, Christian König <christian.koenig@amd.com> wrote:
>>> Am 16.02.23 um 12:33 schrieb Daniel Vetter:
>>>> On Thu, Feb 09, 2023 at 09:18:38AM +0100, Christian König wrote:
>>>>> The mutex was completely pointless in the first place since any
>>>>> parallel adding of files to this list would result in random
>>>>> behavior since the list is filled and consumed multiple times.
>>>>>
>>>>> Completely drop that approach and just create the files directly.
>>>>>
>>>>> This also re-adds the debugfs files to the render node directory and
>>>>> removes drm_debugfs_late_register().
>>>>>
>>>>> Signed-off-by: Christian König <christian.koenig@amd.com>
>>>>> ---
>>>>>    drivers/gpu/drm/drm_debugfs.c     | 32 +++++++------------------------
>>>>>    drivers/gpu/drm/drm_drv.c         |  3 ---
>>>>>    drivers/gpu/drm/drm_internal.h    |  5 -----
>>>>>    drivers/gpu/drm/drm_mode_config.c |  2 --
>>>>>    include/drm/drm_device.h          | 15 ---------------
>>>>>    5 files changed, 7 insertions(+), 50 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/drm_debugfs.c b/drivers/gpu/drm/drm_debugfs.c
>>>>> index 558e3a7271a5..a40288e67264 100644
>>>>> --- a/drivers/gpu/drm/drm_debugfs.c
>>>>> +++ b/drivers/gpu/drm/drm_debugfs.c
>>>>> @@ -246,31 +246,9 @@ void drm_debugfs_dev_register(struct drm_device *dev)
>>>>>    void drm_debugfs_minor_register(struct drm_minor *minor)
>>>>>    {
>>>>>    	struct drm_device *dev = minor->dev;
>>>>> -	struct drm_debugfs_entry *entry, *tmp;
>>>>>    
>>>>>    	if (dev->driver->debugfs_init)
>>>>>    		dev->driver->debugfs_init(minor);
>>>>> -
>>>>> -	list_for_each_entry_safe(entry, tmp, &dev->debugfs_list, list) {
>>>>> -		debugfs_create_file(entry->file.name, 0444,
>>>>> -				    minor->debugfs_root, entry, &drm_debugfs_entry_fops);
>>>>> -		list_del(&entry->list);
>>>>> -	}
>>>>> -}
>>>>> -
>>>>> -void drm_debugfs_late_register(struct drm_device *dev)
>>>>> -{
>>>>> -	struct drm_minor *minor = dev->primary;
>>>>> -	struct drm_debugfs_entry *entry, *tmp;
>>>>> -
>>>>> -	if (!minor)
>>>>> -		return;
>>>>> -
>>>>> -	list_for_each_entry_safe(entry, tmp, &dev->debugfs_list, list) {
>>>>> -		debugfs_create_file(entry->file.name, 0444,
>>>>> -				    minor->debugfs_root, entry, &drm_debugfs_entry_fops);
>>>>> -		list_del(&entry->list);
>>>>> -	}
>>>>>    }
>>>>>    
>>>>>    int drm_debugfs_remove_files(const struct drm_info_list *files, int count,
>>>>> @@ -343,9 +321,13 @@ void drm_debugfs_add_file(struct drm_device *dev, const char *name,
>>>>>    	entry->file.data = data;
>>>>>    	entry->dev = dev;
>>>>>    
>>>>> -	mutex_lock(&dev->debugfs_mutex);
>>>>> -	list_add(&entry->list, &dev->debugfs_list);
>>>>> -	mutex_unlock(&dev->debugfs_mutex);
>>>>> +	debugfs_create_file(name, 0444, dev->primary->debugfs_root, entry,
>>>>> +			    &drm_debugfs_entry_fops);
>>>>> +
>>>>> +	/* TODO: This should probably only be a symlink */
>>>>> +	if (dev->render)
>>>>> +		debugfs_create_file(name, 0444, dev->render->debugfs_root,
>>>>> +				    entry, &drm_debugfs_entry_fops);
>>>> Nope. You are fundamentally missing the point of all this, which is:
>>>>
>>>> - drivers create debugfs files whenever they want to, as long as it's
>>>>     _before_ drm_dev_register is called.
>>>>
>>>> - drm_dev_register will set them all up.
>>>>
>>>> This is necessary because otherwise you have the potential for some nice
>>>> oops and stuff when userspace tries to access these files before the
>>>> driver is ready.
>>>>
>>>> Note that with sysfs all this infrastructure already exists, which is why
>>>> you can create sysfs files whenever you feel like, and things won't go
>>>> boom.
>>> Well Yeah I've considered that, I just don't think it's a good idea for
>>> debugfs.
>>>
>>> debugfs is meant to be a helper for debugging things and that especially
>>> includes the time between drm_dev_init() and drm_dev_register() because
>>> that's where we probe the hardware and try to get it working.
>>>
>>> Not having the debugfs files which allows for things like hardware
>>> register access and reading internal state during that is a really and I
>>> mean REALLY bad idea. This is essentially what we have those files for.
>> So you mean you want to have early debugfs so you can have some script
>> hammering the debugfs to get info out between init and register during
>> probe?
>
> Well not hammering. What we usually do in bringup is to set firmware 
> timeout to infinity and the driver then sits and waits for the hw.
>
> The tool used to access registers then goes directly through the PCI bar 
> at the moment, but that's essentially a bad idea for registers which you 
> grab a lock for to access (like index/data).
>
>>
>> I just think registering debugfs before everything is ready is a recipe
>> for disaster. All of the debugfs needs to check all the conditions that
>> they need across all of the probe stages. It'll be difficult to get it
>> right. And you'll get cargo culted checks copy pasted all over the
>> place.
>
> Yeah, but it's debugfs. That is not supposed to work under all conditions.
>
> Just try to read amdgpu_regs on a not existing register index. This will 
> just hang or reboot your box immediately on APUs.

I'm firmly in the camp that debugfs does not need to work under all
conditions, but that it must fail gracefully instead of crashing.
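
To illustrate what "fail gracefully" could look like, here is a minimal
userspace sketch of a register-read helper behind a debugfs file. Everything
in it (struct demo_dev, demo_debugfs_read_reg(), the 16-entry register array,
the ready flag) is hypothetical and not taken from amdgpu or the DRM core; the
point is only that a bad index or a not-yet-ready device yields an error code
instead of touching the hardware.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register file; real hardware would be MMIO behind a lock. */
#define DEMO_NUM_REGS 16u

struct demo_dev {
	int ready;			/* set once probing has finished */
	uint32_t regs[DEMO_NUM_REGS];	/* stand-in for device registers */
};

/*
 * Read one register on behalf of a debugfs file.
 *
 * Returns 0 and stores the value in *val on success, -EAGAIN while the
 * device is not ready yet, and -EINVAL for an index that does not exist.
 */
int demo_debugfs_read_reg(const struct demo_dev *dev, size_t index,
			  uint32_t *val)
{
	/* NULL pointers are a caller bug, not user input. */
	assert(dev != NULL && val != NULL);

	if (!dev->ready)
		return -EAGAIN;

	/* The index comes from userspace, so validate before accessing. */
	if (index >= DEMO_NUM_REGS)
		return -EINVAL;

	*val = dev->regs[index];
	return 0;
}
```

The check on the index is the part that matters: the value is controlled by
whoever reads the file, so it is rejected before anything is dereferenced.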


BR,
Jani.


>
> Regards,
> Christian.
>
>>
>>
>> BR,
>> Jani.
>>
>>
>>>> So yeah we need the list.
>>>>
>>>> This also means that we really should not create the debugfs directories
>>>> _before_ drm_dev_register is called. That's just fundamentally not how
>>>> device interface setup should work:
>>>>
>>>> 1. you allocate stucts and stuff
>>>> 2. you fully init everything
>>>> 3. you register interfaces so they become userspace visible
>>> How about we create the debugfs directory early and only delay the files
>>> registered through this drm_debugfs interface until registration time?
>>>
>>> This way drivers can still decide if they want the files available
>>> immediately or only after registration.
>>>
>>> What drivers currently do is like radeon setting an accel_working flag
>>> and registering anyway even if half the hardware doesn't work.
>>>
>>> Regards,
>>> Christian.
>>>
>>>> -Daniel
>>>>
>>>>>    }
>>>>>    EXPORT_SYMBOL(drm_debugfs_add_file);
>>>>>    
>>>>> diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
>>>>> index 2cbe028e548c..e7b88b65866c 100644
>>>>> --- a/drivers/gpu/drm/drm_drv.c
>>>>> +++ b/drivers/gpu/drm/drm_drv.c
>>>>> @@ -597,7 +597,6 @@ static void drm_dev_init_release(struct drm_device *dev, void *res)
>>>>>    	mutex_destroy(&dev->clientlist_mutex);
>>>>>    	mutex_destroy(&dev->filelist_mutex);
>>>>>    	mutex_destroy(&dev->struct_mutex);
>>>>> -	mutex_destroy(&dev->debugfs_mutex);
>>>>>    	drm_legacy_destroy_members(dev);
>>>>>    }
>>>>>    
>>>>> @@ -638,14 +637,12 @@ static int drm_dev_init(struct drm_device *dev,
>>>>>    	INIT_LIST_HEAD(&dev->filelist_internal);
>>>>>    	INIT_LIST_HEAD(&dev->clientlist);
>>>>>    	INIT_LIST_HEAD(&dev->vblank_event_list);
>>>>> -	INIT_LIST_HEAD(&dev->debugfs_list);
>>>>>    
>>>>>    	spin_lock_init(&dev->event_lock);
>>>>>    	mutex_init(&dev->struct_mutex);
>>>>>    	mutex_init(&dev->filelist_mutex);
>>>>>    	mutex_init(&dev->clientlist_mutex);
>>>>>    	mutex_init(&dev->master_mutex);
>>>>> -	mutex_init(&dev->debugfs_mutex);
>>>>>    
>>>>>    	ret = drmm_add_action_or_reset(dev, drm_dev_init_release, NULL);
>>>>>    	if (ret)
>>>>> diff --git a/drivers/gpu/drm/drm_internal.h b/drivers/gpu/drm/drm_internal.h
>>>>> index 5ff7bf88f162..e215d00ba65c 100644
>>>>> --- a/drivers/gpu/drm/drm_internal.h
>>>>> +++ b/drivers/gpu/drm/drm_internal.h
>>>>> @@ -188,7 +188,6 @@ int drm_debugfs_init(struct drm_minor *minor, int minor_id,
>>>>>    void drm_debugfs_dev_register(struct drm_device *dev);
>>>>>    void drm_debugfs_minor_register(struct drm_minor *minor);
>>>>>    void drm_debugfs_cleanup(struct drm_minor *minor);
>>>>> -void drm_debugfs_late_register(struct drm_device *dev);
>>>>>    void drm_debugfs_connector_add(struct drm_connector *connector);
>>>>>    void drm_debugfs_connector_remove(struct drm_connector *connector);
>>>>>    void drm_debugfs_crtc_add(struct drm_crtc *crtc);
>>>>> @@ -205,10 +204,6 @@ static inline void drm_debugfs_cleanup(struct drm_minor *minor)
>>>>>    {
>>>>>    }
>>>>>    
>>>>> -static inline void drm_debugfs_late_register(struct drm_device *dev)
>>>>> -{
>>>>> -}
>>>>> -
>>>>>    static inline void drm_debugfs_connector_add(struct drm_connector *connector)
>>>>>    {
>>>>>    }
>>>>> diff --git a/drivers/gpu/drm/drm_mode_config.c b/drivers/gpu/drm/drm_mode_config.c
>>>>> index 87eb591fe9b5..8525ef851540 100644
>>>>> --- a/drivers/gpu/drm/drm_mode_config.c
>>>>> +++ b/drivers/gpu/drm/drm_mode_config.c
>>>>> @@ -54,8 +54,6 @@ int drm_modeset_register_all(struct drm_device *dev)
>>>>>    	if (ret)
>>>>>    		goto err_connector;
>>>>>    
>>>>> -	drm_debugfs_late_register(dev);
>>>>> -
>>>>>    	return 0;
>>>>>    
>>>>>    err_connector:
>>>>> diff --git a/include/drm/drm_device.h b/include/drm/drm_device.h
>>>>> index 7cf4afae2e79..900ad7478dd8 100644
>>>>> --- a/include/drm/drm_device.h
>>>>> +++ b/include/drm/drm_device.h
>>>>> @@ -311,21 +311,6 @@ struct drm_device {
>>>>>    	 */
>>>>>    	struct drm_fb_helper *fb_helper;
>>>>>    
>>>>> -	/**
>>>>> -	 * @debugfs_mutex:
>>>>> -	 *
>>>>> -	 * Protects &debugfs_list access.
>>>>> -	 */
>>>>> -	struct mutex debugfs_mutex;
>>>>> -
>>>>> -	/**
>>>>> -	 * @debugfs_list:
>>>>> -	 *
>>>>> -	 * List of debugfs files to be created by the DRM device. The files
>>>>> -	 * must be added during drm_dev_register().
>>>>> -	 */
>>>>> -	struct list_head debugfs_list;
>>>>> -
>>>>>    	/* Everything below here is for legacy driver, never use! */
>>>>>    	/* private: */
>>>>>    #if IS_ENABLED(CONFIG_DRM_LEGACY)
>>>>> -- 
>>>>> 2.34.1
>>>>>
>

-- 
Jani Nikula, Intel Open Source Graphics Center

* Re: [PATCH 3/3] drm/debugfs: remove dev->debugfs_list and debugfs_mutex
  2023-02-16 17:08           ` Jani Nikula
@ 2023-02-16 19:54             ` Daniel Vetter
  2023-02-17  9:22               ` Christian König
  0 siblings, 1 reply; 50+ messages in thread
From: Daniel Vetter @ 2023-02-16 19:54 UTC (permalink / raw)
  To: Jani Nikula
  Cc: daniel.vetter, mcanal, dri-devel, mwen, mairacanal, maxime,
	Christian König, wambui.karugax

On Thu, Feb 16, 2023 at 07:08:49PM +0200, Jani Nikula wrote:
> On Thu, 16 Feb 2023, Christian König <christian.koenig@amd.com> wrote:
> > Am 16.02.23 um 17:46 schrieb Jani Nikula:
> >> On Thu, 16 Feb 2023, Christian König <christian.koenig@amd.com> wrote:
> >>> Am 16.02.23 um 12:33 schrieb Daniel Vetter:
> >>>> On Thu, Feb 09, 2023 at 09:18:38AM +0100, Christian König wrote:
> >>>>> The mutex was completely pointless in the first place since any
> >>>>> parallel adding of files to this list would result in random
> >>>>> behavior since the list is filled and consumed multiple times.
> >>>>>
> >>>>> Completely drop that approach and just create the files directly.
> >>>>>
> >>>>> This also re-adds the debugfs files to the render node directory and
> >>>>> removes drm_debugfs_late_register().
> >>>>>
> >>>>> Signed-off-by: Christian König <christian.koenig@amd.com>
> >>>>> ---
> >>>>>    drivers/gpu/drm/drm_debugfs.c     | 32 +++++++------------------------
> >>>>>    drivers/gpu/drm/drm_drv.c         |  3 ---
> >>>>>    drivers/gpu/drm/drm_internal.h    |  5 -----
> >>>>>    drivers/gpu/drm/drm_mode_config.c |  2 --
> >>>>>    include/drm/drm_device.h          | 15 ---------------
> >>>>>    5 files changed, 7 insertions(+), 50 deletions(-)
> >>>>>
> >>>>> diff --git a/drivers/gpu/drm/drm_debugfs.c b/drivers/gpu/drm/drm_debugfs.c
> >>>>> index 558e3a7271a5..a40288e67264 100644
> >>>>> --- a/drivers/gpu/drm/drm_debugfs.c
> >>>>> +++ b/drivers/gpu/drm/drm_debugfs.c
> >>>>> @@ -246,31 +246,9 @@ void drm_debugfs_dev_register(struct drm_device *dev)
> >>>>>    void drm_debugfs_minor_register(struct drm_minor *minor)
> >>>>>    {
> >>>>>    	struct drm_device *dev = minor->dev;
> >>>>> -	struct drm_debugfs_entry *entry, *tmp;
> >>>>>    
> >>>>>    	if (dev->driver->debugfs_init)
> >>>>>    		dev->driver->debugfs_init(minor);
> >>>>> -
> >>>>> -	list_for_each_entry_safe(entry, tmp, &dev->debugfs_list, list) {
> >>>>> -		debugfs_create_file(entry->file.name, 0444,
> >>>>> -				    minor->debugfs_root, entry, &drm_debugfs_entry_fops);
> >>>>> -		list_del(&entry->list);
> >>>>> -	}
> >>>>> -}
> >>>>> -
> >>>>> -void drm_debugfs_late_register(struct drm_device *dev)
> >>>>> -{
> >>>>> -	struct drm_minor *minor = dev->primary;
> >>>>> -	struct drm_debugfs_entry *entry, *tmp;
> >>>>> -
> >>>>> -	if (!minor)
> >>>>> -		return;
> >>>>> -
> >>>>> -	list_for_each_entry_safe(entry, tmp, &dev->debugfs_list, list) {
> >>>>> -		debugfs_create_file(entry->file.name, 0444,
> >>>>> -				    minor->debugfs_root, entry, &drm_debugfs_entry_fops);
> >>>>> -		list_del(&entry->list);
> >>>>> -	}
> >>>>>    }
> >>>>>    
> >>>>>    int drm_debugfs_remove_files(const struct drm_info_list *files, int count,
> >>>>> @@ -343,9 +321,13 @@ void drm_debugfs_add_file(struct drm_device *dev, const char *name,
> >>>>>    	entry->file.data = data;
> >>>>>    	entry->dev = dev;
> >>>>>    
> >>>>> -	mutex_lock(&dev->debugfs_mutex);
> >>>>> -	list_add(&entry->list, &dev->debugfs_list);
> >>>>> -	mutex_unlock(&dev->debugfs_mutex);
> >>>>> +	debugfs_create_file(name, 0444, dev->primary->debugfs_root, entry,
> >>>>> +			    &drm_debugfs_entry_fops);
> >>>>> +
> >>>>> +	/* TODO: This should probably only be a symlink */
> >>>>> +	if (dev->render)
> >>>>> +		debugfs_create_file(name, 0444, dev->render->debugfs_root,
> >>>>> +				    entry, &drm_debugfs_entry_fops);
> >>>> Nope. You are fundamentally missing the point of all this, which is:
> >>>>
> >>>> - drivers create debugfs files whenever they want to, as long as it's
> >>>>     _before_ drm_dev_register is called.
> >>>>
> >>>> - drm_dev_register will set them all up.
> >>>>
> >>>> This is necessary because otherwise you have the potential for some nice
> >>>> oops and stuff when userspace tries to access these files before the
> >>>> driver is ready.
> >>>>
> >>>> Note that with sysfs all this infrastructure already exists, which is why
> >>>> you can create sysfs files whenever you feel like, and things won't go
> >>>> boom.
> >>> Well Yeah I've considered that, I just don't think it's a good idea for
> >>> debugfs.
> >>>
> >>> debugfs is meant to be a helper for debugging things and that especially
> >>> includes the time between drm_dev_init() and drm_dev_register() because
> >>> that's where we probe the hardware and try to get it working.
> >>>
> >>> Not having the debugfs files which allows for things like hardware
> >>> register access and reading internal state during that is a really and I
> >>> mean REALLY bad idea. This is essentially what we have those files for.
> >> So you mean you want to have early debugfs so you can have some script
> >> hammering the debugfs to get info out between init and register during
> >> probe?
> >
> > Well not hammering. What we usually do in bringup is to set firmware 
> > timeout to infinity and the driver then sits and waits for the hw.
> >
> > The tool used to access registers then goes directly through the PCI bar 
> > at the moment, but that's essentially a bad idea for registers which you 
> > grab a lock for to access (like index/data).
> >
> >>
> >> I just think registering debugfs before everything is ready is a recipe
> >> for disaster. All of the debugfs needs to check all the conditions that
> >> they need across all of the probe stages. It'll be difficult to get it
> >> right. And you'll get cargo culted checks copy pasted all over the
> >> place.
> >
> > Yeah, but it's debugfs. That is not supposed to work under all conditions.
> >
> > Just try to read amdgpu_regs on a not existing register index. This will 
> > just hang or reboot your box immediately on APUs.
> 
> I'm firmly in the camp that debugfs does not need to work under all
> conditions, but that it must fail gracefully instead of crashing.

Yeah I mean once we talk bring-up, you can just hand-roll the necessary
bring-up debugfs things that you need to work before the driver is ready to
do anything.

But bring-up debugfs fun is rather special, same way pre-silicon support
tends to be rather special. Shipping that in distros does not sound like a
good idea at all to me.
-Daniel

> 
> 
> BR,
> Jani.
> 
> 
> >
> > Regards,
> > Christian.
> >
> >>
> >>
> >> BR,
> >> Jani.
> >>
> >>
> >>>> So yeah we need the list.
> >>>>
> >>>> This also means that we really should not create the debugfs directories
> >>>> _before_ drm_dev_register is called. That's just fundamentally not how
> >>>> device interface setup should work:
> >>>>
> >>>> 1. you allocate stucts and stuff
> >>>> 2. you fully init everything
> >>>> 3. you register interfaces so they become userspace visible
> >>> How about we create the debugfs directory early and only delay the files
> >>> registered through this drm_debugfs interface until registration time?
> >>>
> >>> This way drivers can still decide if they want the files available
> >>> immediately or only after registration.
> >>>
> >>> What drivers currently do is like radeon setting an accel_working flag
> >>> and registering anyway even if half the hardware doesn't work.
> >>>
> >>> Regards,
> >>> Christian.
> >>>
> >>>> -Daniel
> >>>>
> >>>>>    }
> >>>>>    EXPORT_SYMBOL(drm_debugfs_add_file);
> >>>>>    
> >>>>> diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
> >>>>> index 2cbe028e548c..e7b88b65866c 100644
> >>>>> --- a/drivers/gpu/drm/drm_drv.c
> >>>>> +++ b/drivers/gpu/drm/drm_drv.c
> >>>>> @@ -597,7 +597,6 @@ static void drm_dev_init_release(struct drm_device *dev, void *res)
> >>>>>    	mutex_destroy(&dev->clientlist_mutex);
> >>>>>    	mutex_destroy(&dev->filelist_mutex);
> >>>>>    	mutex_destroy(&dev->struct_mutex);
> >>>>> -	mutex_destroy(&dev->debugfs_mutex);
> >>>>>    	drm_legacy_destroy_members(dev);
> >>>>>    }
> >>>>>    
> >>>>> @@ -638,14 +637,12 @@ static int drm_dev_init(struct drm_device *dev,
> >>>>>    	INIT_LIST_HEAD(&dev->filelist_internal);
> >>>>>    	INIT_LIST_HEAD(&dev->clientlist);
> >>>>>    	INIT_LIST_HEAD(&dev->vblank_event_list);
> >>>>> -	INIT_LIST_HEAD(&dev->debugfs_list);
> >>>>>    
> >>>>>    	spin_lock_init(&dev->event_lock);
> >>>>>    	mutex_init(&dev->struct_mutex);
> >>>>>    	mutex_init(&dev->filelist_mutex);
> >>>>>    	mutex_init(&dev->clientlist_mutex);
> >>>>>    	mutex_init(&dev->master_mutex);
> >>>>> -	mutex_init(&dev->debugfs_mutex);
> >>>>>    
> >>>>>    	ret = drmm_add_action_or_reset(dev, drm_dev_init_release, NULL);
> >>>>>    	if (ret)
> >>>>> diff --git a/drivers/gpu/drm/drm_internal.h b/drivers/gpu/drm/drm_internal.h
> >>>>> index 5ff7bf88f162..e215d00ba65c 100644
> >>>>> --- a/drivers/gpu/drm/drm_internal.h
> >>>>> +++ b/drivers/gpu/drm/drm_internal.h
> >>>>> @@ -188,7 +188,6 @@ int drm_debugfs_init(struct drm_minor *minor, int minor_id,
> >>>>>    void drm_debugfs_dev_register(struct drm_device *dev);
> >>>>>    void drm_debugfs_minor_register(struct drm_minor *minor);
> >>>>>    void drm_debugfs_cleanup(struct drm_minor *minor);
> >>>>> -void drm_debugfs_late_register(struct drm_device *dev);
> >>>>>    void drm_debugfs_connector_add(struct drm_connector *connector);
> >>>>>    void drm_debugfs_connector_remove(struct drm_connector *connector);
> >>>>>    void drm_debugfs_crtc_add(struct drm_crtc *crtc);
> >>>>> @@ -205,10 +204,6 @@ static inline void drm_debugfs_cleanup(struct drm_minor *minor)
> >>>>>    {
> >>>>>    }
> >>>>>    
> >>>>> -static inline void drm_debugfs_late_register(struct drm_device *dev)
> >>>>> -{
> >>>>> -}
> >>>>> -
> >>>>>    static inline void drm_debugfs_connector_add(struct drm_connector *connector)
> >>>>>    {
> >>>>>    }
> >>>>> diff --git a/drivers/gpu/drm/drm_mode_config.c b/drivers/gpu/drm/drm_mode_config.c
> >>>>> index 87eb591fe9b5..8525ef851540 100644
> >>>>> --- a/drivers/gpu/drm/drm_mode_config.c
> >>>>> +++ b/drivers/gpu/drm/drm_mode_config.c
> >>>>> @@ -54,8 +54,6 @@ int drm_modeset_register_all(struct drm_device *dev)
> >>>>>    	if (ret)
> >>>>>    		goto err_connector;
> >>>>>    
> >>>>> -	drm_debugfs_late_register(dev);
> >>>>> -
> >>>>>    	return 0;
> >>>>>    
> >>>>>    err_connector:
> >>>>> diff --git a/include/drm/drm_device.h b/include/drm/drm_device.h
> >>>>> index 7cf4afae2e79..900ad7478dd8 100644
> >>>>> --- a/include/drm/drm_device.h
> >>>>> +++ b/include/drm/drm_device.h
> >>>>> @@ -311,21 +311,6 @@ struct drm_device {
> >>>>>    	 */
> >>>>>    	struct drm_fb_helper *fb_helper;
> >>>>>    
> >>>>> -	/**
> >>>>> -	 * @debugfs_mutex:
> >>>>> -	 *
> >>>>> -	 * Protects &debugfs_list access.
> >>>>> -	 */
> >>>>> -	struct mutex debugfs_mutex;
> >>>>> -
> >>>>> -	/**
> >>>>> -	 * @debugfs_list:
> >>>>> -	 *
> >>>>> -	 * List of debugfs files to be created by the DRM device. The files
> >>>>> -	 * must be added during drm_dev_register().
> >>>>> -	 */
> >>>>> -	struct list_head debugfs_list;
> >>>>> -
> >>>>>    	/* Everything below here is for legacy driver, never use! */
> >>>>>    	/* private: */
> >>>>>    #if IS_ENABLED(CONFIG_DRM_LEGACY)
> >>>>> -- 
> >>>>> 2.34.1
> >>>>>
> >
> 
> -- 
> Jani Nikula, Intel Open Source Graphics Center

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

* Re: [PATCH 3/3] drm/debugfs: remove dev->debugfs_list and debugfs_mutex
  2023-02-16 17:06       ` Jani Nikula
@ 2023-02-16 19:56         ` Daniel Vetter
  2023-02-17 10:35         ` Stanislaw Gruszka
  1 sibling, 0 replies; 50+ messages in thread
From: Daniel Vetter @ 2023-02-16 19:56 UTC (permalink / raw)
  To: Jani Nikula
  Cc: mairacanal, Christian König, mcanal, dri-devel, mwen,
	Stanislaw Gruszka, maxime, daniel.vetter, wambui.karugax

On Thu, Feb 16, 2023 at 07:06:46PM +0200, Jani Nikula wrote:
> On Thu, 16 Feb 2023, Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> wrote:
> > On Thu, Feb 16, 2023 at 12:33:08PM +0100, Daniel Vetter wrote:
> >> On Thu, Feb 09, 2023 at 09:18:38AM +0100, Christian König wrote:
> >> > The mutex was completely pointless in the first place since any
> >> > parallel adding of files to this list would result in random
> >> > behavior since the list is filled and consumed multiple times.
> >> > 
> >> > Completely drop that approach and just create the files directly.
> >> > 
> >> > This also re-adds the debugfs files to the render node directory and
> >> > removes drm_debugfs_late_register().
> >> > 
> >> > Signed-off-by: Christian König <christian.koenig@amd.com>
> >> > ---
> >> >  drivers/gpu/drm/drm_debugfs.c     | 32 +++++++------------------------
> >> >  drivers/gpu/drm/drm_drv.c         |  3 ---
> >> >  drivers/gpu/drm/drm_internal.h    |  5 -----
> >> >  drivers/gpu/drm/drm_mode_config.c |  2 --
> >> >  include/drm/drm_device.h          | 15 ---------------
> >> >  5 files changed, 7 insertions(+), 50 deletions(-)
> >> > 
> >> > diff --git a/drivers/gpu/drm/drm_debugfs.c b/drivers/gpu/drm/drm_debugfs.c
> >> > index 558e3a7271a5..a40288e67264 100644
> >> > --- a/drivers/gpu/drm/drm_debugfs.c
> >> > +++ b/drivers/gpu/drm/drm_debugfs.c
> >> > @@ -246,31 +246,9 @@ void drm_debugfs_dev_register(struct drm_device *dev)
> >> >  void drm_debugfs_minor_register(struct drm_minor *minor)
> >> >  {
> >> >  	struct drm_device *dev = minor->dev;
> >> > -	struct drm_debugfs_entry *entry, *tmp;
> >> >  
> >> >  	if (dev->driver->debugfs_init)
> >> >  		dev->driver->debugfs_init(minor);
> >> > -
> >> > -	list_for_each_entry_safe(entry, tmp, &dev->debugfs_list, list) {
> >> > -		debugfs_create_file(entry->file.name, 0444,
> >> > -				    minor->debugfs_root, entry, &drm_debugfs_entry_fops);
> >> > -		list_del(&entry->list);
> >> > -	}
> >> > -}
> >> > -
> >> > -void drm_debugfs_late_register(struct drm_device *dev)
> >> > -{
> >> > -	struct drm_minor *minor = dev->primary;
> >> > -	struct drm_debugfs_entry *entry, *tmp;
> >> > -
> >> > -	if (!minor)
> >> > -		return;
> >> > -
> >> > -	list_for_each_entry_safe(entry, tmp, &dev->debugfs_list, list) {
> >> > -		debugfs_create_file(entry->file.name, 0444,
> >> > -				    minor->debugfs_root, entry, &drm_debugfs_entry_fops);
> >> > -		list_del(&entry->list);
> >> > -	}
> >> >  }
> >> >  
> >> >  int drm_debugfs_remove_files(const struct drm_info_list *files, int count,
> >> > @@ -343,9 +321,13 @@ void drm_debugfs_add_file(struct drm_device *dev, const char *name,
> >> >  	entry->file.data = data;
> >> >  	entry->dev = dev;
> >> >  
> >> > -	mutex_lock(&dev->debugfs_mutex);
> >> > -	list_add(&entry->list, &dev->debugfs_list);
> >> > -	mutex_unlock(&dev->debugfs_mutex);
> >> > +	debugfs_create_file(name, 0444, dev->primary->debugfs_root, entry,
> >> > +			    &drm_debugfs_entry_fops);
> >> > +
> >> > +	/* TODO: This should probably only be a symlink */
> >> > +	if (dev->render)
> >> > +		debugfs_create_file(name, 0444, dev->render->debugfs_root,
> >> > +				    entry, &drm_debugfs_entry_fops);
> >> 
> >> Nope. You are fundamentally missing the point of all this, which is:
> >> 
> >> - drivers create debugfs files whenever they want to, as long as it's
> >>   _before_ drm_dev_register is called.
> >> 
> >> - drm_dev_register will set them all up.
> >> 
> >> This is necessary because otherwise you have the potential for some nice
> >> oops and stuff when userspace tries to access these files before the
> >> driver is ready.
> >
> > But shouldn't this be the driver's responsibility: call drm_debugfs_add_file()
> > whenever you are ready to handle operations on the added file?
> 
> In theory, yes, but in practice it's pretty hard for a non-trivial
> driver to maintain that all the conditions are met.
> 
> In i915 we call debugfs register all over the place only after we've
> called drm_dev_register(), because it's the only sane way. But it means
> we need the init and register separated everywhere, instead of init
> adding files to a list to be registered later.

Yup, it just forces a ton of boilerplate on drivers for no gain.

Like devm_* and drmm_* are also not needed in the strict sense, and they
are all optional. But you're a fool for not using them when you can.

Same thing with these debugfs helpers here, you can outright bypass them,
and then end up doing what amdgpu/i915 currently do: A massive and
somewhat fragile parallel function call hierarchy.

Which is just not a very nice thing to be forced into.
-Daniel
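
For readers following along, the "add to a list before register, set everything
up at register time" model can be sketched in plain userspace C as below. This
is only an illustration under stated assumptions: struct deferred_dev,
deferred_add_file() and deferred_register() are hypothetical names, not the DRM
API, and returning -EBUSY for additions after registration is a choice made for
this sketch rather than what the current helpers do.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define DEFERRED_MAX_FILES 8

/* Hypothetical device: files are queued first and published later. */
struct deferred_dev {
	int registered;				/* set by deferred_register() */
	size_t n_pending;
	const char *pending[DEFERRED_MAX_FILES];	/* queued, not visible */
	size_t n_visible;
	const char *visible[DEFERRED_MAX_FILES];	/* stand-in for debugfs */
};

/* Queue a file; callable any time before registration. */
int deferred_add_file(struct deferred_dev *dev, const char *name)
{
	assert(dev != NULL && name != NULL);

	if (dev->registered)
		return -EBUSY;	/* too late, report it instead of ignoring */
	if (dev->n_pending == DEFERRED_MAX_FILES)
		return -ENOSPC;

	dev->pending[dev->n_pending++] = name;
	return 0;
}

/* Publish everything queued so far, exactly once. */
void deferred_register(struct deferred_dev *dev)
{
	size_t i;

	assert(dev != NULL);

	for (i = 0; i < dev->n_pending; i++)
		dev->visible[dev->n_visible++] = dev->pending[i];

	dev->n_pending = 0;
	dev->registered = 1;
}
```

With this shape a driver's init code only calls deferred_add_file(); nothing
becomes visible to userspace until deferred_register() runs, so no separate
init/register call hierarchy is needed on the driver side.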

> BR,
> Jani.
> 
> 
> 
> >
> > Regards
> > Stanislaw
> >
> >> Note that with sysfs all this infrastructure already exists, which is why
> >> you can create sysfs files whenever you feel like, and things won't go
> >> boom.
> >> 
> >> So yeah we need the list.
> >> 
> >> This also means that we really should not create the debugfs directories
> >> _before_ drm_dev_register is called. That's just fundamentally not how
> >> device interface setup should work:
> >> 
> >> 1. you allocate stucts and stuff
> >> 2. you fully init everything
> >> 3. you register interfaces so they become userspace visible
> >> -Daniel
> >> 
> >> >  }
> >> >  EXPORT_SYMBOL(drm_debugfs_add_file);
> >> >  
> >> > diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
> >> > index 2cbe028e548c..e7b88b65866c 100644
> >> > --- a/drivers/gpu/drm/drm_drv.c
> >> > +++ b/drivers/gpu/drm/drm_drv.c
> >> > @@ -597,7 +597,6 @@ static void drm_dev_init_release(struct drm_device *dev, void *res)
> >> >  	mutex_destroy(&dev->clientlist_mutex);
> >> >  	mutex_destroy(&dev->filelist_mutex);
> >> >  	mutex_destroy(&dev->struct_mutex);
> >> > -	mutex_destroy(&dev->debugfs_mutex);
> >> >  	drm_legacy_destroy_members(dev);
> >> >  }
> >> >  
> >> > @@ -638,14 +637,12 @@ static int drm_dev_init(struct drm_device *dev,
> >> >  	INIT_LIST_HEAD(&dev->filelist_internal);
> >> >  	INIT_LIST_HEAD(&dev->clientlist);
> >> >  	INIT_LIST_HEAD(&dev->vblank_event_list);
> >> > -	INIT_LIST_HEAD(&dev->debugfs_list);
> >> >  
> >> >  	spin_lock_init(&dev->event_lock);
> >> >  	mutex_init(&dev->struct_mutex);
> >> >  	mutex_init(&dev->filelist_mutex);
> >> >  	mutex_init(&dev->clientlist_mutex);
> >> >  	mutex_init(&dev->master_mutex);
> >> > -	mutex_init(&dev->debugfs_mutex);
> >> >  
> >> >  	ret = drmm_add_action_or_reset(dev, drm_dev_init_release, NULL);
> >> >  	if (ret)
> >> > diff --git a/drivers/gpu/drm/drm_internal.h b/drivers/gpu/drm/drm_internal.h
> >> > index 5ff7bf88f162..e215d00ba65c 100644
> >> > --- a/drivers/gpu/drm/drm_internal.h
> >> > +++ b/drivers/gpu/drm/drm_internal.h
> >> > @@ -188,7 +188,6 @@ int drm_debugfs_init(struct drm_minor *minor, int minor_id,
> >> >  void drm_debugfs_dev_register(struct drm_device *dev);
> >> >  void drm_debugfs_minor_register(struct drm_minor *minor);
> >> >  void drm_debugfs_cleanup(struct drm_minor *minor);
> >> > -void drm_debugfs_late_register(struct drm_device *dev);
> >> >  void drm_debugfs_connector_add(struct drm_connector *connector);
> >> >  void drm_debugfs_connector_remove(struct drm_connector *connector);
> >> >  void drm_debugfs_crtc_add(struct drm_crtc *crtc);
> >> > @@ -205,10 +204,6 @@ static inline void drm_debugfs_cleanup(struct drm_minor *minor)
> >> >  {
> >> >  }
> >> >  
> >> > -static inline void drm_debugfs_late_register(struct drm_device *dev)
> >> > -{
> >> > -}
> >> > -
> >> >  static inline void drm_debugfs_connector_add(struct drm_connector *connector)
> >> >  {
> >> >  }
> >> > diff --git a/drivers/gpu/drm/drm_mode_config.c b/drivers/gpu/drm/drm_mode_config.c
> >> > index 87eb591fe9b5..8525ef851540 100644
> >> > --- a/drivers/gpu/drm/drm_mode_config.c
> >> > +++ b/drivers/gpu/drm/drm_mode_config.c
> >> > @@ -54,8 +54,6 @@ int drm_modeset_register_all(struct drm_device *dev)
> >> >  	if (ret)
> >> >  		goto err_connector;
> >> >  
> >> > -	drm_debugfs_late_register(dev);
> >> > -
> >> >  	return 0;
> >> >  
> >> >  err_connector:
> >> > diff --git a/include/drm/drm_device.h b/include/drm/drm_device.h
> >> > index 7cf4afae2e79..900ad7478dd8 100644
> >> > --- a/include/drm/drm_device.h
> >> > +++ b/include/drm/drm_device.h
> >> > @@ -311,21 +311,6 @@ struct drm_device {
> >> >  	 */
> >> >  	struct drm_fb_helper *fb_helper;
> >> >  
> >> > -	/**
> >> > -	 * @debugfs_mutex:
> >> > -	 *
> >> > -	 * Protects &debugfs_list access.
> >> > -	 */
> >> > -	struct mutex debugfs_mutex;
> >> > -
> >> > -	/**
> >> > -	 * @debugfs_list:
> >> > -	 *
> >> > -	 * List of debugfs files to be created by the DRM device. The files
> >> > -	 * must be added during drm_dev_register().
> >> > -	 */
> >> > -	struct list_head debugfs_list;
> >> > -
> >> >  	/* Everything below here is for legacy driver, never use! */
> >> >  	/* private: */
> >> >  #if IS_ENABLED(CONFIG_DRM_LEGACY)
> >> > -- 
> >> > 2.34.1
> >> > 
> >> 
> >> -- 
> >> Daniel Vetter
> >> Software Engineer, Intel Corporation
> >> http://blog.ffwll.ch
> 
> -- 
> Jani Nikula, Intel Open Source Graphics Center

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

* Re: Try to address the drm_debugfs issues
  2023-02-16 16:31           ` Christian König
@ 2023-02-16 19:57             ` Daniel Vetter
  0 siblings, 0 replies; 50+ messages in thread
From: Daniel Vetter @ 2023-02-16 19:57 UTC (permalink / raw)
  To: Christian König
  Cc: daniel.vetter, Maíra Canal, dri-devel, mwen, mairacanal,
	maxime, wambui.karugax

On Thu, Feb 16, 2023 at 05:31:50PM +0100, Christian König wrote:
> 
> 
> Am 16.02.23 um 12:34 schrieb Daniel Vetter:
> > On Thu, Feb 09, 2023 at 03:06:10PM +0100, Christian König wrote:
> > > Am 09.02.23 um 14:06 schrieb Maíra Canal:
> > > > On 2/9/23 09:13, Christian König wrote:
> > > > > Am 09.02.23 um 12:23 schrieb Maíra Canal:
> > > > > > On 2/9/23 05:18, Christian König wrote:
> > > > > > > Hello everyone,
> > > > > > > 
> > > > > > > the drm_debugfs has a couple of well known design problems.
> > > > > > > 
> > > > > > > Especially it wasn't possible to add files between
> > > > > > > initializing and registering
> > > > > > > of DRM devices since the underlying debugfs directory wasn't
> > > > > > > created yet.
> > > > > > > 
> > > > > > > The resulting necessity of the driver->debugfs_init()
> > > > > > > callback function is a
> > > > > > > mid-layering which is really frowned on since it creates a horrible
> > > > > > > driver->DRM->driver design layering.
> > > > > > > 
> > > > > > > The recent patch "drm/debugfs: create device-centered
> > > > > > > debugfs functions" tried
> > > > > > > > to address those problems, but doesn't seem to work
> > > > > > > correctly. This looks like
> > > > > > > a misunderstanding of the call flow around
> > > > > > > drm_debugfs_init(), which is called
> > > > > > > multiple times, once for the primary and once for the render node.
> > > > > > > 
> > > > > > > So what happens now is the following:
> > > > > > > 
> > > > > > > 1. drm_dev_init() initially allocates the drm_minor objects.
> > > > > > > 2. ... back to the driver ...
> > > > > > > 3. drm_dev_register() is called.
> > > > > > > 
> > > > > > > 4. drm_debugfs_init() is called for the primary node.
> > > > > > > 5. drm_framebuffer_debugfs_init(), drm_client_debugfs_init() and
> > > > > > >      drm_atomic_debugfs_init() call drm_debugfs_add_file(s)()
> > > > > > > to add the files
> > > > > > >      for the primary node.
> > > > > > > 6. The driver->debugfs_init() callback is called to add
> > > > > > > debugfs files for the
> > > > > > >      primary node.
> > > > > > > 7. The added files are consumed and added to the primary
> > > > > > > node debugfs directory.
> > > > > > > 
> > > > > > > 8. drm_debugfs_init() is called for the render node.
> > > > > > > 9. drm_framebuffer_debugfs_init(), drm_client_debugfs_init() and
> > > > > > >      drm_atomic_debugfs_init() call drm_debugfs_add_file(s)()
> > > > > > > to add the files
> > > > > > >      again for the render node.
> > > > > > > 10. The driver->debugfs_init() callback is called to add
> > > > > > > debugfs files for the
> > > > > > >       render node.
> > > > > > > 11. The added files are consumed and added to the render
> > > > > > > node debugfs directory.
> > > > > > > 
> > > > > > > 12. Some more files are added through drm_debugfs_add_file().
> > > > > > > 13. drm_debugfs_late_register() add the files once more to
> > > > > > > the primary node
> > > > > > >       debugfs directory.
> > > > > > > 14. From this point on files added through
> > > > > > > drm_debugfs_add_file() are simply ignored.
> > > > > > > 15. ... back to the driver ...
> > > > > > > 
> > > > > > > Because of this the dev->debugfs_mutex lock is also
> > > > > > > completely pointless since
> > > > > > > any concurrent use of the interface would just randomly
> > > > > > > either add the files to
> > > > > > > the primary or render node or just not at all.
> > > > > > > 
> > > > > > > Even worse is that this implementation nails the coffin for
> > > > > > > removing the
> > > > > > > driver->debugfs_init() mid-layering because otherwise
> > > > > > > drivers can't control
> > > > > > > where their debugfs (primary/render node) are actually added.
> > > > > > > 
> > > > > > > This patch set here now tries to clean this up a bit, but
> > > > > > > most likely isn't
> > > > > > > fully complete either since I didn't audit every driver/call path.
> > > > > > I tested the patchset on the v3d, vc4 and vkms and all the files
> > > > > > are generated
> > > > > > as expected, but I'm getting the following errors on dmesg:
> > > > > > 
> > > > > > [    3.872026] debugfs: File 'v3d_ident' in directory '0'
> > > > > > already present!
> > > > > > [    3.872064] debugfs: File 'v3d_ident' in directory '128'
> > > > > > already present!
> > > > > > [    3.872078] debugfs: File 'v3d_regs' in directory '0' already
> > > > > > present!
> > > > > > [    3.872087] debugfs: File 'v3d_regs' in directory '128'
> > > > > > already present!
> > > > > > [    3.872097] debugfs: File 'measure_clock' in directory '0'
> > > > > > already present!
> > > > > > [    3.872105] debugfs: File 'measure_clock' in directory '128'
> > > > > > already present!
> > > > > > [    3.872116] debugfs: File 'bo_stats' in directory '0' already
> > > > > > present!
> > > > > > [    3.872124] debugfs: File 'bo_stats' in directory '128'
> > > > > > already present!
> > > > > > 
> > > > > > It looks like the render node is being added twice, since this
> > > > > > doesn't happen
> > > > > > for vc4 and vkms.
> > > > > Thanks for the feedback and yes that's exactly what I meant with
> > > > > that I haven't looked into all code paths.
> > > > > 
> > > > > > Could it be that v3d registers its debugfs files from the
> > > > > debugfs_init callback?
> > > > Although this is true, I'm not sure if this is the reason why the files
> > > > are
> > > > being registered twice, as this doesn't happen to vc4, and it also uses
> > > > the
> > > > debugfs_init callback. I believe it is somewhat related to the fact that
> > > > v3d is the primary node and the render node.
> > > I see. Thanks for the hint.
> > > 
> > > > Best Regards,
> > > > - Maíra Canal
> > > > 
> > > > > One alternative would be to just completely nuke support for
> > > > > separate render node debugfs files and only add a symlink to the
> > > > > primary node. Opinions?
> > > What do you think of this approach? I can't come up with any reason why we
> > > should have separate debugfs files for render nodes and I think it is pretty
> > > much the same reason you came up with the patch for per device debugfs files
> > > instead of per minor.
> > Yeah I think best is to symlink around a bit for compat. I thought we
> > where doing that already, and you can't actually create debugfs files on
> > render nodes? Or did I only dream about this?
> 
> No, we still have that distinction around unfortunately.
> 
> That's why this went boom for me in the first place.

I guess time to land that? Or should we do this as part of the conversion
and just change the new add_file helpers to only instantiate on the
primary node until all the old users are gone?
-Daniel

> 
> Christian.
> 
> > -Daniel
> > 
> > > Regards,
> > > Christian.
> > > 
> > > > > Regards,
> > > > > Christian.
> > > > > 
> > > > > > Otherwise, the patchset looks good to me, but maybe Daniel has
> > > > > > some other
> > > > > > thoughts about it.
> > > > > > 
> > > > > > Best Regards,
> > > > > > - Maíra Canal
> > > > > > 
> > > > > > > Please comment/discuss.
> > > > > > > 
> > > > > > > Cheers,
> > > > > > > Christian.
> > > > > > > 
> > > > > > > 
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


* Re: [PATCH 3/3] drm/debugfs: remove dev->debugfs_list and debugfs_mutex
  2023-02-16 19:54             ` Daniel Vetter
@ 2023-02-17  9:22               ` Christian König
  2023-02-17 10:01                 ` Stanislaw Gruszka
  0 siblings, 1 reply; 50+ messages in thread
From: Christian König @ 2023-02-17  9:22 UTC (permalink / raw)
  To: Daniel Vetter, Jani Nikula
  Cc: daniel.vetter, mcanal, dri-devel, mwen, mairacanal, maxime,
	wambui.karugax

Am 16.02.23 um 20:54 schrieb Daniel Vetter:
> On Thu, Feb 16, 2023 at 07:08:49PM +0200, Jani Nikula wrote:
>> On Thu, 16 Feb 2023, Christian König <christian.koenig@amd.com> wrote:
>>> Am 16.02.23 um 17:46 schrieb Jani Nikula:
>>>> On Thu, 16 Feb 2023, Christian König <christian.koenig@amd.com> wrote:
>>>>> Am 16.02.23 um 12:33 schrieb Daniel Vetter:
>>>>>> On Thu, Feb 09, 2023 at 09:18:38AM +0100, Christian König wrote:
>>>>>>> The mutex was completely pointless in the first place since any
>>>>>>> parallel adding of files to this list would result in random
>>>>>>> behavior since the list is filled and consumed multiple times.
>>>>>>>
>>>>>>> Completely drop that approach and just create the files directly.
>>>>>>>
>>>>>>> This also re-adds the debugfs files to the render node directory and
>>>>>>> removes drm_debugfs_late_register().
>>>>>>>
>>>>>>> Signed-off-by: Christian König <christian.koenig@amd.com>
>>>>>>> ---
>>>>>>>     drivers/gpu/drm/drm_debugfs.c     | 32 +++++++------------------------
>>>>>>>     drivers/gpu/drm/drm_drv.c         |  3 ---
>>>>>>>     drivers/gpu/drm/drm_internal.h    |  5 -----
>>>>>>>     drivers/gpu/drm/drm_mode_config.c |  2 --
>>>>>>>     include/drm/drm_device.h          | 15 ---------------
>>>>>>>     5 files changed, 7 insertions(+), 50 deletions(-)
>>>>>>>
>>>>>>> diff --git a/drivers/gpu/drm/drm_debugfs.c b/drivers/gpu/drm/drm_debugfs.c
>>>>>>> index 558e3a7271a5..a40288e67264 100644
>>>>>>> --- a/drivers/gpu/drm/drm_debugfs.c
>>>>>>> +++ b/drivers/gpu/drm/drm_debugfs.c
>>>>>>> @@ -246,31 +246,9 @@ void drm_debugfs_dev_register(struct drm_device *dev)
>>>>>>>     void drm_debugfs_minor_register(struct drm_minor *minor)
>>>>>>>     {
>>>>>>>     	struct drm_device *dev = minor->dev;
>>>>>>> -	struct drm_debugfs_entry *entry, *tmp;
>>>>>>>     
>>>>>>>     	if (dev->driver->debugfs_init)
>>>>>>>     		dev->driver->debugfs_init(minor);
>>>>>>> -
>>>>>>> -	list_for_each_entry_safe(entry, tmp, &dev->debugfs_list, list) {
>>>>>>> -		debugfs_create_file(entry->file.name, 0444,
>>>>>>> -				    minor->debugfs_root, entry, &drm_debugfs_entry_fops);
>>>>>>> -		list_del(&entry->list);
>>>>>>> -	}
>>>>>>> -}
>>>>>>> -
>>>>>>> -void drm_debugfs_late_register(struct drm_device *dev)
>>>>>>> -{
>>>>>>> -	struct drm_minor *minor = dev->primary;
>>>>>>> -	struct drm_debugfs_entry *entry, *tmp;
>>>>>>> -
>>>>>>> -	if (!minor)
>>>>>>> -		return;
>>>>>>> -
>>>>>>> -	list_for_each_entry_safe(entry, tmp, &dev->debugfs_list, list) {
>>>>>>> -		debugfs_create_file(entry->file.name, 0444,
>>>>>>> -				    minor->debugfs_root, entry, &drm_debugfs_entry_fops);
>>>>>>> -		list_del(&entry->list);
>>>>>>> -	}
>>>>>>>     }
>>>>>>>     
>>>>>>>     int drm_debugfs_remove_files(const struct drm_info_list *files, int count,
>>>>>>> @@ -343,9 +321,13 @@ void drm_debugfs_add_file(struct drm_device *dev, const char *name,
>>>>>>>     	entry->file.data = data;
>>>>>>>     	entry->dev = dev;
>>>>>>>     
>>>>>>> -	mutex_lock(&dev->debugfs_mutex);
>>>>>>> -	list_add(&entry->list, &dev->debugfs_list);
>>>>>>> -	mutex_unlock(&dev->debugfs_mutex);
>>>>>>> +	debugfs_create_file(name, 0444, dev->primary->debugfs_root, entry,
>>>>>>> +			    &drm_debugfs_entry_fops);
>>>>>>> +
>>>>>>> +	/* TODO: This should probably only be a symlink */
>>>>>>> +	if (dev->render)
>>>>>>> +		debugfs_create_file(name, 0444, dev->render->debugfs_root,
>>>>>>> +				    entry, &drm_debugfs_entry_fops);
>>>>>> Nope. You are fundamentally missing the point of all this, which is:
>>>>>>
>>>>>> - drivers create debugfs files whenever they want to, as long as it's
>>>>>>      _before_ drm_dev_register is called.
>>>>>>
>>>>>> - drm_dev_register will set them all up.
>>>>>>
>>>>>> This is necessary because otherwise you have the potential for some nice
>>>>>> oops and stuff when userspace tries to access these files before the
>>>>>> driver is ready.
>>>>>>
>>>>>> Note that with sysfs all this infrastructure already exists, which is why
>>>>>> you can create sysfs files whenever you feel like, and things wont go
>>>>>> boom.
>>>>> Well Yeah I've considered that, I just don't think it's a good idea for
>>>>> debugfs.
>>>>>
>>>>> debugfs is meant to be a helper for debugging things and that especially
>>>>> includes the time between drm_dev_init() and drm_dev_register() because
>>>>> that's where we probe the hardware and try to get it working.
>>>>>
>>>>> Not having the debugfs files which allows for things like hardware
>>>>> register access and reading internal state during that is a really and I
>>>>> mean REALLY bad idea. This is essentially what we have those files for.
>>>> So you mean you want to have early debugfs so you can have some script
>>>> hammering the debugfs to get info out between init and register during
>>>> probe?
>>> Well not hammering. What we usually do in bringup is to set firmware
>>> timeout to infinity and the driver then sits and waits for the hw.
>>>
>>> The tool used to access registers then goes directly through the PCI bar
>>> at the moment, but that's essentially a bad idea for registers which you
>>> grab a lock for to access (like index/data).
>>>
>>>> I just think registering debugfs before everything is ready is a recipe
>>>> for disaster. All of the debugfs needs to check all the conditions that
>>>> they need across all of the probe stages. It'll be difficult to get it
>>>> right. And you'll get cargo culted checks copy pasted all over the
>>>> place.
>>> Yeah, but it's debugfs. That is not supposed to work under all conditions.
>>>
>>> Just try to read amdgpu_regs on a not existing register index. This will
>>> just hang or reboot your box immediately on APUs.
>> I'm firmly in the camp that debugfs does not need to work under all
>> conditions, but that it must fail gracefully instead of crashing.
> Yeah I mean once we talk bring-up, you can just hand-roll the necessary
> bring debugfs things that you need to work before the driver is ready to
> do anything.
>
> But bring-up debugfs fun is rather special, same way pre-silicon support
> tends to be rather special. Shipping that in distros does not sound like a
> good idea at all to me.

Yeah, that's indeed a really good point.

I can't remember how often I had to note that module parameters would 
also be used by end users.

How about if we create the debugfs directory with a "." as name prefix
first and then rename it as soon as the device is registered?
Alternatively we could clear the i_mode of the directory.

If a power user or engineer wants to debug startup problems, it
should be trivial to work around that from userspace, and if people do
such things they should also know the potential consequences.
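
To illustrate the rename idea, here is a minimal userspace sketch. It
assumes a hypothetical model_minor structure and helper; neither exists in
DRM or debugfs. The directory name carries a leading "." until the device
is registered. In the kernel the switch would presumably be a rename of
the debugfs directory at registration time rather than a string helper.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical model for illustration only; not a DRM or debugfs type. */
struct model_minor {
	int index;        /* e.g. 0 for the primary node, 128 for render */
	bool registered;  /* set once the device has been registered */
};

/*
 * Write the directory name for this minor into buf. Before registration
 * the name is prefixed with "." so that it is hidden from a plain listing.
 * Returns the value of snprintf().
 */
static int model_debugfs_dir_name(const struct model_minor *minor,
				  char *buf, size_t len)
{
	return snprintf(buf, len, "%s%d",
			minor->registered ? "" : ".", minor->index);
}
```

Note that hiding is only a convention here: anyone who knows the name can
still open files below the dot-prefixed directory, which is the intended
escape hatch for bring-up work.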

Christian.



> -Daniel
>
>>
>> BR,
>> Jani.
>>
>>



* Re: [PATCH 3/3] drm/debugfs: remove dev->debugfs_list and debugfs_mutex
  2023-02-17  9:22               ` Christian König
@ 2023-02-17 10:01                 ` Stanislaw Gruszka
  2023-02-17 19:38                   ` Daniel Vetter
  0 siblings, 1 reply; 50+ messages in thread
From: Stanislaw Gruszka @ 2023-02-17 10:01 UTC (permalink / raw)
  To: Christian König
  Cc: daniel.vetter, mcanal, dri-devel, mwen, mairacanal,
	wambui.karugax, maxime

On Fri, Feb 17, 2023 at 10:22:25AM +0100, Christian König wrote:
> Am 16.02.23 um 20:54 schrieb Daniel Vetter:
> > On Thu, Feb 16, 2023 at 07:08:49PM +0200, Jani Nikula wrote:
> > > On Thu, 16 Feb 2023, Christian König <christian.koenig@amd.com> wrote:
> > > > Am 16.02.23 um 17:46 schrieb Jani Nikula:
> > > > > On Thu, 16 Feb 2023, Christian König <christian.koenig@amd.com> wrote:
> > > > > > Am 16.02.23 um 12:33 schrieb Daniel Vetter:
> > > > > > > On Thu, Feb 09, 2023 at 09:18:38AM +0100, Christian König wrote:
> > > > > > > > The mutex was completely pointless in the first place since any
> > > > > > > > parallel adding of files to this list would result in random
> > > > > > > > behavior since the list is filled and consumed multiple times.
> > > > > > > > 
> > > > > > > > Completely drop that approach and just create the files directly.
> > > > > > > > 
> > > > > > > > This also re-adds the debugfs files to the render node directory and
> > > > > > > > removes drm_debugfs_late_register().
> > > > > > > > 
> > > > > > > > Signed-off-by: Christian König <christian.koenig@amd.com>
> > > > > > > > ---
> > > > > > > >     drivers/gpu/drm/drm_debugfs.c     | 32 +++++++------------------------
> > > > > > > >     drivers/gpu/drm/drm_drv.c         |  3 ---
> > > > > > > >     drivers/gpu/drm/drm_internal.h    |  5 -----
> > > > > > > >     drivers/gpu/drm/drm_mode_config.c |  2 --
> > > > > > > >     include/drm/drm_device.h          | 15 ---------------
> > > > > > > >     5 files changed, 7 insertions(+), 50 deletions(-)
> > > > > > > > 
> > > > > > > > diff --git a/drivers/gpu/drm/drm_debugfs.c b/drivers/gpu/drm/drm_debugfs.c
> > > > > > > > index 558e3a7271a5..a40288e67264 100644
> > > > > > > > --- a/drivers/gpu/drm/drm_debugfs.c
> > > > > > > > +++ b/drivers/gpu/drm/drm_debugfs.c
> > > > > > > > @@ -246,31 +246,9 @@ void drm_debugfs_dev_register(struct drm_device *dev)
> > > > > > > >     void drm_debugfs_minor_register(struct drm_minor *minor)
> > > > > > > >     {
> > > > > > > >     	struct drm_device *dev = minor->dev;
> > > > > > > > -	struct drm_debugfs_entry *entry, *tmp;
> > > > > > > >     	if (dev->driver->debugfs_init)
> > > > > > > >     		dev->driver->debugfs_init(minor);
> > > > > > > > -
> > > > > > > > -	list_for_each_entry_safe(entry, tmp, &dev->debugfs_list, list) {
> > > > > > > > -		debugfs_create_file(entry->file.name, 0444,
> > > > > > > > -				    minor->debugfs_root, entry, &drm_debugfs_entry_fops);
> > > > > > > > -		list_del(&entry->list);
> > > > > > > > -	}
> > > > > > > > -}
> > > > > > > > -
> > > > > > > > -void drm_debugfs_late_register(struct drm_device *dev)
> > > > > > > > -{
> > > > > > > > -	struct drm_minor *minor = dev->primary;
> > > > > > > > -	struct drm_debugfs_entry *entry, *tmp;
> > > > > > > > -
> > > > > > > > -	if (!minor)
> > > > > > > > -		return;
> > > > > > > > -
> > > > > > > > -	list_for_each_entry_safe(entry, tmp, &dev->debugfs_list, list) {
> > > > > > > > -		debugfs_create_file(entry->file.name, 0444,
> > > > > > > > -				    minor->debugfs_root, entry, &drm_debugfs_entry_fops);
> > > > > > > > -		list_del(&entry->list);
> > > > > > > > -	}
> > > > > > > >     }
> > > > > > > >     int drm_debugfs_remove_files(const struct drm_info_list *files, int count,
> > > > > > > > @@ -343,9 +321,13 @@ void drm_debugfs_add_file(struct drm_device *dev, const char *name,
> > > > > > > >     	entry->file.data = data;
> > > > > > > >     	entry->dev = dev;
> > > > > > > > -	mutex_lock(&dev->debugfs_mutex);
> > > > > > > > -	list_add(&entry->list, &dev->debugfs_list);
> > > > > > > > -	mutex_unlock(&dev->debugfs_mutex);
> > > > > > > > +	debugfs_create_file(name, 0444, dev->primary->debugfs_root, entry,
> > > > > > > > +			    &drm_debugfs_entry_fops);
> > > > > > > > +
> > > > > > > > +	/* TODO: This should probably only be a symlink */
> > > > > > > > +	if (dev->render)
> > > > > > > > +		debugfs_create_file(name, 0444, dev->render->debugfs_root,
> > > > > > > > +				    entry, &drm_debugfs_entry_fops);
> > > > > > > Nope. You are fundamentally missing the point of all this, which is:
> > > > > > > 
> > > > > > > - drivers create debugfs files whenever they want to, as long as it's
> > > > > > >      _before_ drm_dev_register is called.
> > > > > > > 
> > > > > > > - drm_dev_register will set them all up.
> > > > > > > 
> > > > > > > This is necessary because otherwise you have the potential for some nice
> > > > > > > oops and stuff when userspace tries to access these files before the
> > > > > > > driver is ready.
> > > > > > > 
> > > > > > > Note that with sysfs all this infrastructure already exists, which is why
> > > > > > > you can create sysfs files whenever you feel like, and things wont go
> > > > > > > boom.
> > > > > > Well Yeah I've considered that, I just don't think it's a good idea for
> > > > > > debugfs.
> > > > > > 
> > > > > > debugfs is meant to be a helper for debugging things and that especially
> > > > > > includes the time between drm_dev_init() and drm_dev_register() because
> > > > > > that's where we probe the hardware and try to get it working.
> > > > > > 
> > > > > > Not having the debugfs files which allows for things like hardware
> > > > > > register access and reading internal state during that is a really and I
> > > > > > mean REALLY bad idea. This is essentially what we have those files for.
> > > > > So you mean you want to have early debugfs so you can have some script
> > > > > hammering the debugfs to get info out between init and register during
> > > > > probe?
> > > > Well not hammering. What we usually do in bringup is to set firmware
> > > > timeout to infinity and the driver then sits and waits for the hw.
> > > > 
> > > > The tool used to access registers then goes directly through the PCI bar
> > > > at the moment, but that's essentially a bad idea for registers which you
> > > > grab a lock for to access (like index/data).
> > > > 
> > > > > I just think registering debugfs before everything is ready is a recipe
> > > > > for disaster. All of the debugfs needs to check all the conditions that
> > > > > they need across all of the probe stages. It'll be difficult to get it
> > > > > right. And you'll get cargo culted checks copy pasted all over the
> > > > > place.
> > > > Yeah, but it's debugfs. That is not supposed to work under all conditions.
> > > > 
> > > > Just try to read amdgpu_regs on a not existing register index. This will
> > > > just hang or reboot your box immediately on APUs.
> > > I'm firmly in the camp that debugfs does not need to work under all
> > > conditions, but that it must fail gracefully instead of crashing.
> > Yeah I mean once we talk bring-up, you can just hand-roll the necessary
> > bring debugfs things that you need to work before the driver is ready to
> > do anything.
> > 
> > But bring-up debugfs fun is rather special, same way pre-silicon support
> > tends to be rather special. Shipping that in distros does not sound like a
> > good idea at all to me.
> 
> Yeah, that's indeed a really good point.
> 
> I can't remember how often I had to note that module parameters would also
> be used by end users.
> 
> How about if the create the debugfs directory with a "." as name prefix
> first and then rename it as soon as the device is registered?

Good idea. Or the directory could be named after drm_dev->unique and be
created at allocation time, with a link named after the minor created
during registration. That would mean the minor link is safe to use, while
the unique-named directory is potentially dangerous before registration.

> Alternatively
> we could clear the i_mode of the directory.

I checked that yesterday and this does not prevent the root user from
accessing the file. Perhaps there is some other smart way to block
root access in the VFS just by modifying an inode field, but a plain
'chmod 0000 file' does not prevent it.
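
A quick way to see this from userspace is sketched below. It uses an
ordinary temporary file rather than debugfs, so it is only an analogy, and
the helper name is made up. Root normally holds CAP_DAC_OVERRIDE, which
bypasses the read/write permission bits, so the open is expected to
succeed for root and to be refused for an unprivileged user.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Create a temporary file, clear all of its permission bits and try to
 * open it for reading. Returns 1 if the open succeeded, 0 if it was
 * refused, and -1 if the setup itself failed.
 * Hypothetical helper for illustration; not part of any library.
 */
static int open_succeeds_after_chmod_0000(void)
{
	char path[] = "/tmp/chmod0000-XXXXXX";
	int fd, ret;

	fd = mkstemp(path);
	if (fd < 0)
		return -1;
	close(fd);

	if (chmod(path, 0) < 0) {
		unlink(path);
		return -1;
	}

	fd = open(path, O_RDONLY);
	ret = fd >= 0;
	if (fd >= 0)
		close(fd);

	unlink(path);
	return ret;
}
```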

> If a power user or engineer wants to debug startup problems stuff it should
> be trivial to work around that from userspace, and if people do such things
> they should also know the potential consequences.

Fully agree.

Regards
Stanislaw



* Re: [PATCH 3/3] drm/debugfs: remove dev->debugfs_list and debugfs_mutex
  2023-02-16 17:06       ` Jani Nikula
  2023-02-16 19:56         ` Daniel Vetter
@ 2023-02-17 10:35         ` Stanislaw Gruszka
  2023-02-17 10:49           ` Jani Nikula
  1 sibling, 1 reply; 50+ messages in thread
From: Stanislaw Gruszka @ 2023-02-17 10:35 UTC (permalink / raw)
  To: Jani Nikula
  Cc: Christian König, mcanal, dri-devel, mwen, mairacanal,
	maxime, daniel.vetter, wambui.karugax

On Thu, Feb 16, 2023 at 07:06:46PM +0200, Jani Nikula wrote:
> >
> > But should not this the driver responsibility, call drm_debugfs_add_file()
> > whenever you are ready to handle operations on added file ?
> 
> In theory, yes, but in practice it's pretty hard for a non-trivial
> driver to maintain that all the conditions are met.

Hmmm... 

> In i915 we call debugfs register all over the place only after we've
> called drm_dev_register(), because it's the only sane way. But it means
> we need the init and register separated everywhere, instead of init
> adding files to a list to be registered later.

Isn't it done this way in i915 only because it was not possible
(and still isn't) to call drm_debugfs_create_file() before registration?

I think it should be OK for an i915 subsystem to create its debugfs
files and allow access to them right after that subsystem's init.

Or are there some complex dependencies between i915 subsystems,
such that reading registers from one subsystem will corrupt some
other subsystem that has not finished initialization yet?

Regards
Stanislaw


* Re: [PATCH 3/3] drm/debugfs: remove dev->debugfs_list and debugfs_mutex
  2023-02-17 10:35         ` Stanislaw Gruszka
@ 2023-02-17 10:49           ` Jani Nikula
  2023-02-17 11:36             ` Stanislaw Gruszka
  0 siblings, 1 reply; 50+ messages in thread
From: Jani Nikula @ 2023-02-17 10:49 UTC (permalink / raw)
  To: Stanislaw Gruszka
  Cc: Christian König, mcanal, dri-devel, mwen, mairacanal,
	maxime, daniel.vetter, wambui.karugax

On Fri, 17 Feb 2023, Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> wrote:
> On Thu, Feb 16, 2023 at 07:06:46PM +0200, Jani Nikula wrote:
>> >
>> > But should not this the driver responsibility, call drm_debugfs_add_file()
>> > whenever you are ready to handle operations on added file ?
>> 
>> In theory, yes, but in practice it's pretty hard for a non-trivial
>> driver to maintain that all the conditions are met.
>
> Hmmm... 
>
>> In i915 we call debugfs register all over the place only after we've
>> called drm_dev_register(), because it's the only sane way. But it means
>> we need the init and register separated everywhere, instead of init
>> adding files to a list to be registered later.
>
> Isn't this done this way in i915 only because it was not possible
> (and still isn't) to call drm_debugfs_create_file() before registration ?
>
> I think it's should be ok by i915 subsystem to create it's debugfs
> files and allow to access to them just after that subsystem init.
>
> Or there are some complex dependencies between i915 subsystems,
> that reading registers from one subsystem will corrupt some
> other subsystem that did non finish initialization yet?

That's the point. It's really hard to figure it all out. Why bother?

BR,
Jani.


>
> Regards
> Stanislaw

-- 
Jani Nikula, Intel Open Source Graphics Center


* Re: [PATCH 3/3] drm/debugfs: remove dev->debugfs_list and debugfs_mutex
  2023-02-17 10:49           ` Jani Nikula
@ 2023-02-17 11:36             ` Stanislaw Gruszka
  2023-02-17 11:54               ` Christian König
  0 siblings, 1 reply; 50+ messages in thread
From: Stanislaw Gruszka @ 2023-02-17 11:36 UTC (permalink / raw)
  To: Jani Nikula
  Cc: Christian König, mcanal, dri-devel, mwen, mairacanal,
	maxime, daniel.vetter, wambui.karugax

On Fri, Feb 17, 2023 at 12:49:41PM +0200, Jani Nikula wrote:
> On Fri, 17 Feb 2023, Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> wrote:
> > On Thu, Feb 16, 2023 at 07:06:46PM +0200, Jani Nikula wrote:
> >> >
> >> > But should not this the driver responsibility, call drm_debugfs_add_file()
> >> > whenever you are ready to handle operations on added file ?
> >> 
> >> In theory, yes, but in practice it's pretty hard for a non-trivial
> >> driver to maintain that all the conditions are met.
> >
> > Hmmm... 
> >
> >> In i915 we call debugfs register all over the place only after we've
> >> called drm_dev_register(), because it's the only sane way. But it means
> >> we need the init and register separated everywhere, instead of init
> >> adding files to a list to be registered later.
> >
> > Isn't this done this way in i915 only because it was not possible
> > (and still isn't) to call drm_debugfs_create_file() before registration ?
> >
> > I think it's should be ok by i915 subsystem to create it's debugfs
> > files and allow to access to them just after that subsystem init.
> >
> > Or there are some complex dependencies between i915 subsystems,
> > that reading registers from one subsystem will corrupt some
> > other subsystem that did non finish initialization yet?
> 
> That's the point. It's really hard to figure it all out. Why bother?

I see. 

I just hope we can get something simpler to limit debugfs access
before registration: a unix hidden file, permissions or some other way.
The current drm_debugfs_add_file() implementation looks
really over-convoluted to me.
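
For reference, the staging scheme under discussion can be modelled roughly
as follows. This is a userspace sketch with invented names (model_dev,
model_add_file, model_register), not the DRM implementation: files are
only recorded when added, and become visible when registration runs.

```c
#include <stddef.h>

#define MODEL_MAX_FILES 8

/* Hypothetical model of a device; this is not struct drm_device. */
struct model_dev {
	const char *staged[MODEL_MAX_FILES];   /* added, not yet visible */
	size_t nr_staged;
	const char *visible[MODEL_MAX_FILES];  /* visible to userspace */
	size_t nr_visible;
};

/* Driver side: may be called at any time before registration. */
static int model_add_file(struct model_dev *dev, const char *name)
{
	if (dev->nr_staged == MODEL_MAX_FILES)
		return -1;
	dev->staged[dev->nr_staged++] = name;
	return 0;
}

/* Core side: registration publishes everything that was staged so far. */
static void model_register(struct model_dev *dev)
{
	size_t i;

	for (i = 0; i < dev->nr_staged && dev->nr_visible < MODEL_MAX_FILES; i++)
		dev->visible[dev->nr_visible++] = dev->staged[i];
	dev->nr_staged = 0;
}
```

The staging buys safety before registration at the cost of an extra list
and a second code path, which is the complexity being questioned here.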

Regards
Stanislaw



* Re: [PATCH 3/3] drm/debugfs: remove dev->debugfs_list and debugfs_mutex
  2023-02-17 11:36             ` Stanislaw Gruszka
@ 2023-02-17 11:54               ` Christian König
  2023-02-17 12:37                 ` Jani Nikula
  0 siblings, 1 reply; 50+ messages in thread
From: Christian König @ 2023-02-17 11:54 UTC (permalink / raw)
  To: Stanislaw Gruszka, Jani Nikula
  Cc: daniel.vetter, mcanal, dri-devel, mwen, mairacanal, maxime,
	wambui.karugax

Am 17.02.23 um 12:36 schrieb Stanislaw Gruszka:
> On Fri, Feb 17, 2023 at 12:49:41PM +0200, Jani Nikula wrote:
>> On Fri, 17 Feb 2023, Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> wrote:
>>> On Thu, Feb 16, 2023 at 07:06:46PM +0200, Jani Nikula wrote:
>>>>> But should not this the driver responsibility, call drm_debugfs_add_file()
>>>>> whenever you are ready to handle operations on added file ?
>>>> In theory, yes, but in practice it's pretty hard for a non-trivial
>>>> driver to maintain that all the conditions are met.
>>> Hmmm...
>>>
>>>> In i915 we call debugfs register all over the place only after we've
>>>> called drm_dev_register(), because it's the only sane way. But it means
>>>> we need the init and register separated everywhere, instead of init
>>>> adding files to a list to be registered later.
>>> Isn't this done this way in i915 only because it was not possible
>>> (and still isn't) to call drm_debugfs_create_file() before registration ?
>>>
>>> I think it's should be ok by i915 subsystem to create it's debugfs
>>> files and allow to access to them just after that subsystem init.
>>>
>>> Or there are some complex dependencies between i915 subsystems,
>>> that reading registers from one subsystem will corrupt some
>>> other subsystem that did non finish initialization yet?
>> That's the point. It's really hard to figure it all out. Why bother?
> I see.
>
> Just hope we could get something simpler to limit debugfs access
> before registration: unix hidden file, permissions or other way.
> Because current drm_debufs_add_file() implementation looks
> really over convoluted to me.

Completely agree.

We have intentionally removed exactly that approach from radeon because
it just led to an overall bad driver design and more problems than it
solved.

If i915 has such structural problems then I strongly suggest solving
them inside i915 and not making common code out of that. This just
encourages others to follow that lead.

Regards,
Christian.

>
> Regards
> Stanislaw
>



* Re: [PATCH 3/3] drm/debugfs: remove dev->debugfs_list and debugfs_mutex
  2023-02-17 11:54               ` Christian König
@ 2023-02-17 12:37                 ` Jani Nikula
  2023-02-17 15:55                   ` Christian König
  0 siblings, 1 reply; 50+ messages in thread
From: Jani Nikula @ 2023-02-17 12:37 UTC (permalink / raw)
  To: Christian König, Stanislaw Gruszka
  Cc: daniel.vetter, mcanal, dri-devel, mwen, mairacanal, maxime,
	wambui.karugax

On Fri, 17 Feb 2023, Christian König <ckoenig.leichtzumerken@gmail.com> wrote:
> If i915 have such structural problems then I strongly suggest to solve 
> them inside i915 and not make common code out of that.

All other things aside, that's just a completely unnecessary and
unhelpful remark.


BR,
Jani.


-- 
Jani Nikula, Intel Open Source Graphics Center


* Re: [PATCH 3/3] drm/debugfs: remove dev->debugfs_list and debugfs_mutex
  2023-02-17 12:37                 ` Jani Nikula
@ 2023-02-17 15:55                   ` Christian König
  2023-02-17 19:42                     ` Daniel Vetter
  0 siblings, 1 reply; 50+ messages in thread
From: Christian König @ 2023-02-17 15:55 UTC (permalink / raw)
  To: Jani Nikula, Stanislaw Gruszka
  Cc: daniel.vetter, mcanal, dri-devel, mwen, mairacanal, maxime,
	wambui.karugax

Am 17.02.23 um 13:37 schrieb Jani Nikula:
> On Fri, 17 Feb 2023, Christian König <ckoenig.leichtzumerken@gmail.com> wrote:
>> If i915 have such structural problems then I strongly suggest to solve
>> them inside i915 and not make common code out of that.
> All other things aside, that's just a completely unnecessary and
> unhelpful remark.

Sorry, but why?

We have gone through the same problems on radeon and it was massively
painful; what I'm trying to do here is to prevent others from using this
bad design as well. And yes, I think devm_ and drmm_ are a bit questionable
in that regard as well.

The goal is not to make it as simple as possible to write a driver, but 
rather as defensive as possible. In other words automatically releasing 
memory when an object is destroyed might be helpful, but it isn't 
automatically a good idea.

What can easily happen, for example, is that you run into use-after-free
situations when object references are torn down, e.g. the parent is freed
before the child.
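
One way to make that ordering explicit is for the child to hold its own
reference on the parent. The following is a minimal userspace sketch with
hypothetical types and helpers (model_parent, model_child); it is not how
devm_ or drmm_ work, just an illustration of the lifetime rule.

```c
#include <stdlib.h>

/* Hypothetical objects for illustration only. */
struct model_parent {
	int refcount;
	int *freed_flag;  /* set to 1 when the parent is actually released */
};

struct model_child {
	struct model_parent *parent;
};

static struct model_parent *model_parent_create(int *freed_flag)
{
	struct model_parent *p = malloc(sizeof(*p));

	if (!p)
		return NULL;
	p->refcount = 1;
	p->freed_flag = freed_flag;
	*freed_flag = 0;
	return p;
}

/* Drop one reference; the parent is released when the last one goes. */
static void model_parent_put(struct model_parent *p)
{
	if (--p->refcount == 0) {
		*p->freed_flag = 1;
		free(p);
	}
}

static struct model_child *model_child_create(struct model_parent *p)
{
	struct model_child *c = malloc(sizeof(*c));

	if (!c)
		return NULL;
	p->refcount++;  /* the child keeps the parent alive */
	c->parent = p;
	return c;
}

/* Free the child first, then drop its reference on the parent. */
static void model_child_destroy(struct model_child *c)
{
	struct model_parent *p = c->parent;

	free(c);
	model_parent_put(p);
}
```

With this, dropping the creator's reference on the parent while a child
still exists does not free the parent; it goes away only once the child
is destroyed.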

Regards,
Christian.

>
>
> BR,
> Jani.
>
>



* Re: [PATCH 3/3] drm/debugfs: remove dev->debugfs_list and debugfs_mutex
  2023-02-17 10:01                 ` Stanislaw Gruszka
@ 2023-02-17 19:38                   ` Daniel Vetter
  2023-02-17 19:55                     ` Christian König
  2023-02-22 13:33                     ` Stanislaw Gruszka
  0 siblings, 2 replies; 50+ messages in thread
From: Daniel Vetter @ 2023-02-17 19:38 UTC (permalink / raw)
  To: Stanislaw Gruszka
  Cc: daniel.vetter, mcanal, mwen, mairacanal, dri-devel,
	wambui.karugax, Christian König, maxime

On Fri, Feb 17, 2023 at 11:01:18AM +0100, Stanislaw Gruszka wrote:
> On Fri, Feb 17, 2023 at 10:22:25AM +0100, Christian König wrote:
> > Am 16.02.23 um 20:54 schrieb Daniel Vetter:
> > > On Thu, Feb 16, 2023 at 07:08:49PM +0200, Jani Nikula wrote:
> > > > On Thu, 16 Feb 2023, Christian König <christian.koenig@amd.com> wrote:
> > > > > Am 16.02.23 um 17:46 schrieb Jani Nikula:
> > > > > > On Thu, 16 Feb 2023, Christian König <christian.koenig@amd.com> wrote:
> > > > > > > Am 16.02.23 um 12:33 schrieb Daniel Vetter:
> > > > > > > > On Thu, Feb 09, 2023 at 09:18:38AM +0100, Christian König wrote:
> > > > > > > > > The mutex was completely pointless in the first place since any
> > > > > > > > > parallel adding of files to this list would result in random
> > > > > > > > > behavior since the list is filled and consumed multiple times.
> > > > > > > > > 
> > > > > > > > > Completely drop that approach and just create the files directly.
> > > > > > > > > 
> > > > > > > > > This also re-adds the debugfs files to the render node directory and
> > > > > > > > > removes drm_debugfs_late_register().
> > > > > > > > > 
> > > > > > > > > Signed-off-by: Christian König <christian.koenig@amd.com>
> > > > > > > > > ---
> > > > > > > > >     drivers/gpu/drm/drm_debugfs.c     | 32 +++++++------------------------
> > > > > > > > >     drivers/gpu/drm/drm_drv.c         |  3 ---
> > > > > > > > >     drivers/gpu/drm/drm_internal.h    |  5 -----
> > > > > > > > >     drivers/gpu/drm/drm_mode_config.c |  2 --
> > > > > > > > >     include/drm/drm_device.h          | 15 ---------------
> > > > > > > > >     5 files changed, 7 insertions(+), 50 deletions(-)
> > > > > > > > > 
> > > > > > > > > diff --git a/drivers/gpu/drm/drm_debugfs.c b/drivers/gpu/drm/drm_debugfs.c
> > > > > > > > > index 558e3a7271a5..a40288e67264 100644
> > > > > > > > > --- a/drivers/gpu/drm/drm_debugfs.c
> > > > > > > > > +++ b/drivers/gpu/drm/drm_debugfs.c
> > > > > > > > > @@ -246,31 +246,9 @@ void drm_debugfs_dev_register(struct drm_device *dev)
> > > > > > > > >     void drm_debugfs_minor_register(struct drm_minor *minor)
> > > > > > > > >     {
> > > > > > > > >     	struct drm_device *dev = minor->dev;
> > > > > > > > > -	struct drm_debugfs_entry *entry, *tmp;
> > > > > > > > >     	if (dev->driver->debugfs_init)
> > > > > > > > >     		dev->driver->debugfs_init(minor);
> > > > > > > > > -
> > > > > > > > > -	list_for_each_entry_safe(entry, tmp, &dev->debugfs_list, list) {
> > > > > > > > > -		debugfs_create_file(entry->file.name, 0444,
> > > > > > > > > -				    minor->debugfs_root, entry, &drm_debugfs_entry_fops);
> > > > > > > > > -		list_del(&entry->list);
> > > > > > > > > -	}
> > > > > > > > > -}
> > > > > > > > > -
> > > > > > > > > -void drm_debugfs_late_register(struct drm_device *dev)
> > > > > > > > > -{
> > > > > > > > > -	struct drm_minor *minor = dev->primary;
> > > > > > > > > -	struct drm_debugfs_entry *entry, *tmp;
> > > > > > > > > -
> > > > > > > > > -	if (!minor)
> > > > > > > > > -		return;
> > > > > > > > > -
> > > > > > > > > -	list_for_each_entry_safe(entry, tmp, &dev->debugfs_list, list) {
> > > > > > > > > -		debugfs_create_file(entry->file.name, 0444,
> > > > > > > > > -				    minor->debugfs_root, entry, &drm_debugfs_entry_fops);
> > > > > > > > > -		list_del(&entry->list);
> > > > > > > > > -	}
> > > > > > > > >     }
> > > > > > > > >     int drm_debugfs_remove_files(const struct drm_info_list *files, int count,
> > > > > > > > > @@ -343,9 +321,13 @@ void drm_debugfs_add_file(struct drm_device *dev, const char *name,
> > > > > > > > >     	entry->file.data = data;
> > > > > > > > >     	entry->dev = dev;
> > > > > > > > > -	mutex_lock(&dev->debugfs_mutex);
> > > > > > > > > -	list_add(&entry->list, &dev->debugfs_list);
> > > > > > > > > -	mutex_unlock(&dev->debugfs_mutex);
> > > > > > > > > +	debugfs_create_file(name, 0444, dev->primary->debugfs_root, entry,
> > > > > > > > > +			    &drm_debugfs_entry_fops);
> > > > > > > > > +
> > > > > > > > > +	/* TODO: This should probably only be a symlink */
> > > > > > > > > +	if (dev->render)
> > > > > > > > > +		debugfs_create_file(name, 0444, dev->render->debugfs_root,
> > > > > > > > > +				    entry, &drm_debugfs_entry_fops);
> > > > > > > > Nope. You are fundamentally missing the point of all this, which is:
> > > > > > > > 
> > > > > > > > - drivers create debugfs files whenever they want to, as long as it's
> > > > > > > >      _before_ drm_dev_register is called.
> > > > > > > > 
> > > > > > > > - drm_dev_register will set them all up.
> > > > > > > > 
> > > > > > > > This is necessary because otherwise you have the potential for some nice
> > > > > > > > oops and stuff when userspace tries to access these files before the
> > > > > > > > driver is ready.
> > > > > > > > 
> > > > > > > > Note that with sysfs all this infrastructure already exists, which is why
> > > > > > > > you can create sysfs files whenever you feel like, and things wont go
> > > > > > > > boom.
> > > > > > > Well Yeah I've considered that, I just don't think it's a good idea for
> > > > > > > debugfs.
> > > > > > > 
> > > > > > > debugfs is meant to be a helper for debugging things and that especially
> > > > > > > includes the time between drm_dev_init() and drm_dev_register() because
> > > > > > > that's where we probe the hardware and try to get it working.
> > > > > > > 
> > > > > > > Not having the debugfs files which allows for things like hardware
> > > > > > > register access and reading internal state during that is a really and I
> > > > > > > mean REALLY bad idea. This is essentially what we have those files for.
> > > > > > So you mean you want to have early debugfs so you can have some script
> > > > > > hammering the debugfs to get info out between init and register during
> > > > > > probe?
> > > > > Well not hammering. What we usually do in bringup is to set firmware
> > > > > timeout to infinity and the driver then sits and waits for the hw.
> > > > > 
> > > > > The tool used to access registers then goes directly through the PCI bar
> > > > > at the moment, but that's essentially a bad idea for registers which you
> > > > > grab a lock for to access (like index/data).
> > > > > 
> > > > > > I just think registering debugfs before everything is ready is a recipe
> > > > > > for disaster. All of the debugfs needs to check all the conditions that
> > > > > > they need across all of the probe stages. It'll be difficult to get it
> > > > > > right. And you'll get cargo culted checks copy pasted all over the
> > > > > > place.
> > > > > Yeah, but it's debugfs. That is not supposed to work under all conditions.
> > > > > 
> > > > > Just try to read amdgpu_regs on a not existing register index. This will
> > > > > just hang or reboot your box immediately on APUs.
> > > > I'm firmly in the camp that debugfs does not need to work under all
> > > > conditions, but that it must fail gracefully instead of crashing.
> > > Yeah I mean once we talk bring-up, you can just hand-roll the necessary
> > > bring debugfs things that you need to work before the driver is ready to
> > > do anything.
> > > 
> > > But bring-up debugfs fun is rather special, same way pre-silicon support
> > > tends to be rather special. Shipping that in distros does not sound like a
> > > good idea at all to me.
> > 
> > Yeah, that's indeed a really good point.
> > 
> > I can't remember how often I had to note that module parameters would also
> > be used by end users.
> > 
> > How about if the create the debugfs directory with a "." as name prefix
> > first and then rename it as soon as the device is registered?
> 
> Good idea. Or the dir could have this drm_dev->unique name and be created
> during alloc, and link in minor created during registration. That would
> mean minor link is safe to use and unique potentially dangerous before
> registration.
> 
> > Alternatively
> > we could clear the i_mode of the directory.
> 
> I checked that yesterday and this does not prevent to access the file
> for root user. Perhaps there is other smart way for blocking
> root access in vfs just by modifying some inode field, but just
> 'chmod 0000 file' does not prevent that.
> 
> > If a power user or engineer wants to debug startup problems stuff it should
> > be trivial to work around that from userspace, and if people do such things
> > they should also know the potential consequences.
> 
> Fully agree.

So what about a drm module option instead (that taints the kernel as usual
for these), which:
- registers the debugfs dir right away
- registers any debugfs files as soon as they get populated, instead of
  postponing until drm_dev_register

It would only neatly work with the add_file stuff, but I guess drivers
could still hand-roll this if needed.
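
For illustration only, the proposed behaviour can be modelled in plain userspace C. This is a rough sketch, not DRM code: struct model_dev, the early_debugfs flag and all model_* functions are hypothetical names that exist only in this example, and the tainted flag merely stands in for a real kernel taint.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_FILES 8

/* Hypothetical stand-in for a DRM device; not the real struct drm_device. */
struct model_dev {
	bool registered;     /* set once model_dev_register() has run */
	bool early_debugfs;  /* the "I'm debugging stuff, please stand back" knob */
	bool tainted;        /* models tainting the kernel when the knob is set */
	const char *pending[MAX_FILES];
	size_t n_pending;
	const char *visible[MAX_FILES];
	size_t n_visible;
};

static void model_dev_init(struct model_dev *dev, bool early_debugfs)
{
	*dev = (struct model_dev){ .early_debugfs = early_debugfs };
	if (early_debugfs)
		dev->tainted = true;
}

/* Default: queue the file until registration. With the knob: expose it now. */
static void model_add_file(struct model_dev *dev, const char *name)
{
	if (dev->registered || dev->early_debugfs) {
		assert(dev->n_visible < MAX_FILES);
		dev->visible[dev->n_visible++] = name;
	} else {
		assert(dev->n_pending < MAX_FILES);
		dev->pending[dev->n_pending++] = name;
	}
}

/* Registration makes every queued file visible in one go. */
static void model_dev_register(struct model_dev *dev)
{
	for (size_t i = 0; i < dev->n_pending; i++) {
		assert(dev->n_visible < MAX_FILES);
		dev->visible[dev->n_visible++] = dev->pending[i];
	}
	dev->n_pending = 0;
	dev->registered = true;
}
```

Without the knob nothing is visible before model_dev_register(); with it, files show up as soon as they are added and the device is marked tainted.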

I think funny games with trying to hide the files while not hiding them is
not a great idea, and explicit "I'm debugging stuff, please stand back"
knob sounds much better to me.
-Daniel

> 
> Regards
> Stanislaw
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


* Re: [PATCH 3/3] drm/debugfs: remove dev->debugfs_list and debugfs_mutex
  2023-02-17 15:55                   ` Christian König
@ 2023-02-17 19:42                     ` Daniel Vetter
  2023-02-17 19:49                       ` Christian König
  0 siblings, 1 reply; 50+ messages in thread
From: Daniel Vetter @ 2023-02-17 19:42 UTC (permalink / raw)
  To: Christian König
  Cc: mairacanal, daniel.vetter, mcanal, dri-devel, mwen,
	Stanislaw Gruszka, wambui.karugax, maxime

On Fri, Feb 17, 2023 at 04:55:27PM +0100, Christian König wrote:
> Am 17.02.23 um 13:37 schrieb Jani Nikula:
> > On Fri, 17 Feb 2023, Christian König <ckoenig.leichtzumerken@gmail.com> wrote:
> > > If i915 have such structural problems then I strongly suggest to solve
> > > them inside i915 and not make common code out of that.
> > All other things aside, that's just a completely unnecessary and
> > unhelpful remark.
> 
> Sorry, but why?
> 
> We have gone through the same problems on radeon and it was massively
> painful, what I try here is to prevent others from using this bad design as
> well. And yes I think devm_ and drmm_ is a bit questionable in that regard
> as well.
> 
> The goal is not to make it as simple as possible to write a driver, but
> rather as defensive as possible. In other words automatically releasing
> memory when an object is destroyed might be helpful, but it isn't
> automatically a good idea.
> 
> What can easily happen for example is that you run into use after free
> situations on object reference decommissions, e.g. parent is freed before
> child for example.

I know that radeon/amd are going different paths on this, but I think it's
also very clear that you're not really representing the consensus here.
For smaller drivers especially there really isn't anyone arguing against
devm/drmm.

Similar for uapi interfaces that just do the right thing and prevent
races. You're the very first one who argued this is a good thing to have.
kernfs/kobj/sysfs people spend endless amounts of engineering effort on trying
to build something that's impossible to get wrong, or at least get as close
to that as feasible.

I mean the entire rust endeavour flies under that flag too.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


* Re: [PATCH 3/3] drm/debugfs: remove dev->debugfs_list and debugfs_mutex
  2023-02-17 19:42                     ` Daniel Vetter
@ 2023-02-17 19:49                       ` Christian König
  0 siblings, 0 replies; 50+ messages in thread
From: Christian König @ 2023-02-17 19:49 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: mairacanal, daniel.vetter, mcanal, dri-devel, mwen,
	Stanislaw Gruszka, wambui.karugax, maxime

Am 17.02.23 um 20:42 schrieb Daniel Vetter:
> On Fri, Feb 17, 2023 at 04:55:27PM +0100, Christian König wrote:
>> Am 17.02.23 um 13:37 schrieb Jani Nikula:
>>> On Fri, 17 Feb 2023, Christian König <ckoenig.leichtzumerken@gmail.com> wrote:
>>>> If i915 have such structural problems then I strongly suggest to solve
>>>> them inside i915 and not make common code out of that.
>>> All other things aside, that's just a completely unnecessary and
>>> unhelpful remark.
>> Sorry, but why?
>>
>> We have gone through the same problems on radeon and it was massively
>> painful, what I try here is to prevent others from using this bad design as
>> well. And yes I think devm_ and drmm_ is a bit questionable in that regard
>> as well.
>>
>> The goal is not to make it as simple as possible to write a driver, but
>> rather as defensive as possible. In other words automatically releasing
>> memory when an object is destroyed might be helpful, but it isn't
>> automatically a good idea.
>>
>> What can easily happen for example is that you run into use after free
>> situations on object reference decommissions, e.g. parent is freed before
>> child for example.
> I know that radeon/amd are going different paths on this, but I think it's
> also very clear that you're not really representing the consensus here.
> For smaller drivers especially there really isn't anyone arguing against
> devm/drmm.

Which I completely agree on. It's just that we shouldn't promote it as 
"Hey this magically makes everything work in your very complex use case".

It can be a good tool to have, and it makes sense in a lot of use cases,
but everybody using it should always keep its downsides in mind as well.

> Similar for uapi interfaces that just do the right thing and prevent
> races. You're the very first one who argued this is a good thing to have.
> kernfs/kobj/sysfs people spend endless amounts of engineer on trying to
> build something that's impossible to get wrong, or at least get as close
> to that as feasible.

Yeah, for kernfs/kobj/sysfs it does make complete sense because userspace
tools sometimes actually wait for those files to appear.

I just find it extremely questionable for debugfs.

Regards,
Christian.

> I mean the entire rust endeavour flies under that flag too.
> -Daniel



* Re: [PATCH 3/3] drm/debugfs: remove dev->debugfs_list and debugfs_mutex
  2023-02-17 19:38                   ` Daniel Vetter
@ 2023-02-17 19:55                     ` Christian König
  2023-02-22 13:33                     ` Stanislaw Gruszka
  1 sibling, 0 replies; 50+ messages in thread
From: Christian König @ 2023-02-17 19:55 UTC (permalink / raw)
  To: Daniel Vetter, Stanislaw Gruszka
  Cc: daniel.vetter, mcanal, dri-devel, mwen, mairacanal,
	wambui.karugax, maxime

Am 17.02.23 um 20:38 schrieb Daniel Vetter:
> On Fri, Feb 17, 2023 at 11:01:18AM +0100, Stanislaw Gruszka wrote:
>> On Fri, Feb 17, 2023 at 10:22:25AM +0100, Christian König wrote:
>>> Am 16.02.23 um 20:54 schrieb Daniel Vetter:
>>>> On Thu, Feb 16, 2023 at 07:08:49PM +0200, Jani Nikula wrote:
>>>>> On Thu, 16 Feb 2023, Christian König <christian.koenig@amd.com> wrote:
>>>>>> Am 16.02.23 um 17:46 schrieb Jani Nikula:
>>>>>>> On Thu, 16 Feb 2023, Christian König <christian.koenig@amd.com> wrote:
>>>>>>>> Am 16.02.23 um 12:33 schrieb Daniel Vetter:
>>>>>>>>> On Thu, Feb 09, 2023 at 09:18:38AM +0100, Christian König wrote:
>>>>>>>>>> The mutex was completely pointless in the first place since any
>>>>>>>>>> parallel adding of files to this list would result in random
>>>>>>>>>> behavior since the list is filled and consumed multiple times.
>>>>>>>>>>
>>>>>>>>>> Completely drop that approach and just create the files directly.
>>>>>>>>>>
>>>>>>>>>> This also re-adds the debugfs files to the render node directory and
>>>>>>>>>> removes drm_debugfs_late_register().
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Christian König <christian.koenig@amd.com>
>>>>>>>>>> ---
>>>>>>>>>>      drivers/gpu/drm/drm_debugfs.c     | 32 +++++++------------------------
>>>>>>>>>>      drivers/gpu/drm/drm_drv.c         |  3 ---
>>>>>>>>>>      drivers/gpu/drm/drm_internal.h    |  5 -----
>>>>>>>>>>      drivers/gpu/drm/drm_mode_config.c |  2 --
>>>>>>>>>>      include/drm/drm_device.h          | 15 ---------------
>>>>>>>>>>      5 files changed, 7 insertions(+), 50 deletions(-)
>>>>>>>>>>
>>>>>>>>>> diff --git a/drivers/gpu/drm/drm_debugfs.c b/drivers/gpu/drm/drm_debugfs.c
>>>>>>>>>> index 558e3a7271a5..a40288e67264 100644
>>>>>>>>>> --- a/drivers/gpu/drm/drm_debugfs.c
>>>>>>>>>> +++ b/drivers/gpu/drm/drm_debugfs.c
>>>>>>>>>> @@ -246,31 +246,9 @@ void drm_debugfs_dev_register(struct drm_device *dev)
>>>>>>>>>>      void drm_debugfs_minor_register(struct drm_minor *minor)
>>>>>>>>>>      {
>>>>>>>>>>      	struct drm_device *dev = minor->dev;
>>>>>>>>>> -	struct drm_debugfs_entry *entry, *tmp;
>>>>>>>>>>      	if (dev->driver->debugfs_init)
>>>>>>>>>>      		dev->driver->debugfs_init(minor);
>>>>>>>>>> -
>>>>>>>>>> -	list_for_each_entry_safe(entry, tmp, &dev->debugfs_list, list) {
>>>>>>>>>> -		debugfs_create_file(entry->file.name, 0444,
>>>>>>>>>> -				    minor->debugfs_root, entry, &drm_debugfs_entry_fops);
>>>>>>>>>> -		list_del(&entry->list);
>>>>>>>>>> -	}
>>>>>>>>>> -}
>>>>>>>>>> -
>>>>>>>>>> -void drm_debugfs_late_register(struct drm_device *dev)
>>>>>>>>>> -{
>>>>>>>>>> -	struct drm_minor *minor = dev->primary;
>>>>>>>>>> -	struct drm_debugfs_entry *entry, *tmp;
>>>>>>>>>> -
>>>>>>>>>> -	if (!minor)
>>>>>>>>>> -		return;
>>>>>>>>>> -
>>>>>>>>>> -	list_for_each_entry_safe(entry, tmp, &dev->debugfs_list, list) {
>>>>>>>>>> -		debugfs_create_file(entry->file.name, 0444,
>>>>>>>>>> -				    minor->debugfs_root, entry, &drm_debugfs_entry_fops);
>>>>>>>>>> -		list_del(&entry->list);
>>>>>>>>>> -	}
>>>>>>>>>>      }
>>>>>>>>>>      int drm_debugfs_remove_files(const struct drm_info_list *files, int count,
>>>>>>>>>> @@ -343,9 +321,13 @@ void drm_debugfs_add_file(struct drm_device *dev, const char *name,
>>>>>>>>>>      	entry->file.data = data;
>>>>>>>>>>      	entry->dev = dev;
>>>>>>>>>> -	mutex_lock(&dev->debugfs_mutex);
>>>>>>>>>> -	list_add(&entry->list, &dev->debugfs_list);
>>>>>>>>>> -	mutex_unlock(&dev->debugfs_mutex);
>>>>>>>>>> +	debugfs_create_file(name, 0444, dev->primary->debugfs_root, entry,
>>>>>>>>>> +			    &drm_debugfs_entry_fops);
>>>>>>>>>> +
>>>>>>>>>> +	/* TODO: This should probably only be a symlink */
>>>>>>>>>> +	if (dev->render)
>>>>>>>>>> +		debugfs_create_file(name, 0444, dev->render->debugfs_root,
>>>>>>>>>> +				    entry, &drm_debugfs_entry_fops);
>>>>>>>>> Nope. You are fundamentally missing the point of all this, which is:
>>>>>>>>>
>>>>>>>>> - drivers create debugfs files whenever they want to, as long as it's
>>>>>>>>>       _before_ drm_dev_register is called.
>>>>>>>>>
>>>>>>>>> - drm_dev_register will set them all up.
>>>>>>>>>
>>>>>>>>> This is necessary because otherwise you have the potential for some nice
>>>>>>>>> oops and stuff when userspace tries to access these files before the
>>>>>>>>> driver is ready.
>>>>>>>>>
>>>>>>>>> Note that with sysfs all this infrastructure already exists, which is why
>>>>>>>>> you can create sysfs files whenever you feel like, and things wont go
>>>>>>>>> boom.
>>>>>>>> Well Yeah I've considered that, I just don't think it's a good idea for
>>>>>>>> debugfs.
>>>>>>>>
>>>>>>>> debugfs is meant to be a helper for debugging things and that especially
>>>>>>>> includes the time between drm_dev_init() and drm_dev_register() because
>>>>>>>> that's where we probe the hardware and try to get it working.
>>>>>>>>
>>>>>>>> Not having the debugfs files which allows for things like hardware
>>>>>>>> register access and reading internal state during that is a really and I
>>>>>>>> mean REALLY bad idea. This is essentially what we have those files for.
>>>>>>> So you mean you want to have early debugfs so you can have some script
>>>>>>> hammering the debugfs to get info out between init and register during
>>>>>>> probe?
>>>>>> Well not hammering. What we usually do in bringup is to set firmware
>>>>>> timeout to infinity and the driver then sits and waits for the hw.
>>>>>>
>>>>>> The tool used to access registers then goes directly through the PCI bar
>>>>>> at the moment, but that's essentially a bad idea for registers which you
>>>>>> grab a lock for to access (like index/data).
>>>>>>
>>>>>>> I just think registering debugfs before everything is ready is a recipe
>>>>>>> for disaster. All of the debugfs needs to check all the conditions that
>>>>>>> they need across all of the probe stages. It'll be difficult to get it
>>>>>>> right. And you'll get cargo culted checks copy pasted all over the
>>>>>>> place.
>>>>>> Yeah, but it's debugfs. That is not supposed to work under all conditions.
>>>>>>
>>>>>> Just try to read amdgpu_regs on a not existing register index. This will
>>>>>> just hang or reboot your box immediately on APUs.
>>>>> I'm firmly in the camp that debugfs does not need to work under all
>>>>> conditions, but that it must fail gracefully instead of crashing.
>>>> Yeah I mean once we talk bring-up, you can just hand-roll the necessary
>>>> bring debugfs things that you need to work before the driver is ready to
>>>> do anything.
>>>>
>>>> But bring-up debugfs fun is rather special, same way pre-silicon support
>>>> tends to be rather special. Shipping that in distros does not sound like a
>>>> good idea at all to me.
>>> Yeah, that's indeed a really good point.
>>>
>>> I can't remember how often I had to note that module parameters would also
>>> be used by end users.
>>>
>>> How about if the create the debugfs directory with a "." as name prefix
>>> first and then rename it as soon as the device is registered?
>> Good idea. Or the dir could have this drm_dev->unique name and be created
>> during alloc, and link in minor created during registration. That would
>> mean minor link is safe to use and unique potentially dangerous before
>> registration.
>>
>>> Alternatively
>>> we could clear the i_mode of the directory.
>> I checked that yesterday and this does not prevent to access the file
>> for root user. Perhaps there is other smart way for blocking
>> root access in vfs just by modifying some inode field, but just
>> 'chmod 0000 file' does not prevent that.
>>
>>> If a power user or engineer wants to debug startup problems stuff it should
>>> be trivial to work around that from userspace, and if people do such things
>>> they should also know the potential consequences.
>> Fully agree.
> So what about a drm module option instead (that taints the kernel as usual
> for these), which:
> - registers the debugfs dir right away
> - registers any debugfs files as soon as they get populated, instead of
>    postponing until drm_dev_register

Yeah, works for me as well.

> It would only neatly work with the add_file stuff, but I guess drivers
> could still hand-roll this if needed.
>
> I think funny games with trying to hide the files while not hiding them is
> not a great idea, and explicit "I'm debugging stuff, please stand back"
> knob sounds much better to me.

Well the challenge is that we have to consider the whole spectrum of end
users for this. This ranges from the grandmother who just tries every
possible random knob to get her printer working again, through the script
kiddie, all the way to the power users and engineers.

Some option to assign an experience level to module parameters would be
rather helpful.

Christian.

> -Daniel
>
>> Regards
>> Stanislaw
>>



* Re: [PATCH 3/3] drm/debugfs: remove dev->debugfs_list and debugfs_mutex
  2023-02-17 19:38                   ` Daniel Vetter
  2023-02-17 19:55                     ` Christian König
@ 2023-02-22 13:33                     ` Stanislaw Gruszka
  1 sibling, 0 replies; 50+ messages in thread
From: Stanislaw Gruszka @ 2023-02-22 13:33 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: daniel.vetter, mcanal, dri-devel, mwen, mairacanal,
	wambui.karugax, Christian König, maxime

On Fri, Feb 17, 2023 at 08:38:28PM +0100, Daniel Vetter wrote:
> > > > > I'm firmly in the camp that debugfs does not need to work under all
> > > > > conditions, but that it must fail gracefully instead of crashing.
> > > > Yeah I mean once we talk bring-up, you can just hand-roll the necessary
> > > > bring debugfs things that you need to work before the driver is ready to
> > > > do anything.
> > > > 
> > > > But bring-up debugfs fun is rather special, same way pre-silicon support
> > > > tends to be rather special. Shipping that in distros does not sound like a
> > > > good idea at all to me.
> > > 
> > > Yeah, that's indeed a really good point.
> > > 
> > > I can't remember how often I had to note that module parameters would also
> > > be used by end users.
> > > 
> > > How about if the create the debugfs directory with a "." as name prefix
> > > first and then rename it as soon as the device is registered?
> > 
> > Good idea. Or the dir could have this drm_dev->unique name and be created
> > during alloc, and link in minor created during registration. That would
> > mean minor link is safe to use and unique potentially dangerous before
> > registration.
> > 
> > > Alternatively
> > > we could clear the i_mode of the directory.
> > 
> > I checked that yesterday and this does not prevent to access the file
> > for root user. Perhaps there is other smart way for blocking
> > root access in vfs just by modifying some inode field, but just
> > 'chmod 0000 file' does not prevent that.
> > 
> > > If a power user or engineer wants to debug startup problems stuff it should
> > > be trivial to work around that from userspace, and if people do such things
> > > they should also know the potential consequences.
> > 
> > Fully agree.
> 
> So what about a drm module option instead (that taints the kernel as usual
> for these), which:
> - registers the debugfs dir right away
> - registers any debugfs files as soon as they get populated, instead of
>   postponing until drm_dev_register
> 
> It would only neatly work with the add_file stuff, but I guess drivers
> could still hand-roll this if needed.
> 
> I think funny games with trying to hide the files while not hiding them is
> not a great idea, and explicit "I'm debugging stuff, please stand back"
> knob sounds much better to me.

I prepared a debugfs patch that allows creating an inaccessible directory
and publishing it once everything is ready. I hope it will be accepted
by Greg KH so that we can use it to make drm_debugfs_* simpler.

It would be nice if someone could test it and/or comment
before I post it further.

Thanks
Stanislaw

From 6bb4d38d90428904ac59a2717970697621a32a79 Mon Sep 17 00:00:00 2001
From: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Date: Tue, 21 Feb 2023 10:39:47 +0100
Subject: [PATCH] debugfs: introduce support for partially-initialized drivers

The i915 driver, among others, includes multiple subsystems that create
debugfs files in different parts of the code. It's important that these
files are not accessed before the driver is fully initialized, as doing
so could cause issues.

This patch adds support for creating a debugfs directory that will
prevent access to its files until a certain point in initialization is
reached, at which point the driver can signal that it's safe to access
the directory. This ensures that debugfs files are accessed only when
it's safe to do so.

Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
---
 fs/debugfs/inode.c      | 59 ++++++++++++++++++++++++++++++++++++++---
 fs/debugfs/internal.h   |  7 +++++
 include/linux/debugfs.h |  3 +++
 3 files changed, 66 insertions(+), 3 deletions(-)

diff --git a/fs/debugfs/inode.c b/fs/debugfs/inode.c
index 2e8e112b1993..04b88a5fab61 100644
--- a/fs/debugfs/inode.c
+++ b/fs/debugfs/inode.c
@@ -55,12 +55,23 @@ static int debugfs_setattr(struct user_namespace *mnt_userns,
 	return simple_setattr(&init_user_ns, dentry, ia);
 }
 
+static int debugfs_permission(struct user_namespace *mnt_userns, struct inode *inode, int mask)
+{
+	unsigned long priv = (unsigned long) inode->i_private;
+
+	if (S_ISDIR(inode->i_mode) && (priv & DEBUGFS_DIR_PREPARING))
+		return (priv & DEBUGFS_ALLOW_CREATE) ? 0 : -EPERM;
+
+	return generic_permission(mnt_userns, inode, mask);
+}
+
 static const struct inode_operations debugfs_file_inode_operations = {
 	.setattr	= debugfs_setattr,
 };
 static const struct inode_operations debugfs_dir_inode_operations = {
 	.lookup		= simple_lookup,
 	.setattr	= debugfs_setattr,
+	.permission	= debugfs_permission,
 };
 static const struct inode_operations debugfs_symlink_inode_operations = {
 	.get_link	= simple_get_link,
@@ -340,6 +351,7 @@ EXPORT_SYMBOL_GPL(debugfs_lookup);
 static struct dentry *start_creating(const char *name, struct dentry *parent)
 {
 	struct dentry *dentry;
+	unsigned long priv;
 	int error;
 
 	if (!(debugfs_allow & DEBUGFS_ALLOW_API))
@@ -369,10 +381,20 @@ static struct dentry *start_creating(const char *name, struct dentry *parent)
 		parent = debugfs_mount->mnt_root;
 
 	inode_lock(d_inode(parent));
-	if (unlikely(IS_DEADDIR(d_inode(parent))))
+	if (unlikely(IS_DEADDIR(d_inode(parent)))) {
 		dentry = ERR_PTR(-ENOENT);
-	else
+	} else {
+		priv = (unsigned long) d_inode(parent)->i_private;
+
+		priv |= DEBUGFS_ALLOW_CREATE;
+		d_inode(parent)->i_private = (void *) priv;
+
 		dentry = lookup_one_len(name, parent, strlen(name));
+
+		priv &= ~DEBUGFS_ALLOW_CREATE;
+		d_inode(parent)->i_private = (void *) priv;
+	}
+
 	if (!IS_ERR(dentry) && d_really_is_positive(dentry)) {
 		if (d_is_dir(dentry))
 			pr_err("Directory '%s' with parent '%s' already present!\n",
@@ -585,7 +607,9 @@ EXPORT_SYMBOL_GPL(debugfs_create_file_size);
  * passed to them could be an error and they don't crash in that case.
  * Drivers should generally work fine even if debugfs fails to init anyway.
  */
-struct dentry *debugfs_create_dir(const char *name, struct dentry *parent)
+
+static struct dentry *__debugfs_create_dir(const char *name, struct dentry *parent,
+					   bool preparing)
 {
 	struct dentry *dentry = start_creating(name, parent);
 	struct inode *inode;
@@ -605,6 +629,9 @@ struct dentry *debugfs_create_dir(const char *name, struct dentry *parent)
 		return failed_creating(dentry);
 	}
 
+	if (preparing)
+		inode->i_private = (void *) DEBUGFS_DIR_PREPARING;
+
 	inode->i_mode = S_IFDIR | S_IRWXU | S_IRUGO | S_IXUGO;
 	inode->i_op = &debugfs_dir_inode_operations;
 	inode->i_fop = &simple_dir_operations;
@@ -616,8 +643,34 @@ struct dentry *debugfs_create_dir(const char *name, struct dentry *parent)
 	fsnotify_mkdir(d_inode(dentry->d_parent), dentry);
 	return end_creating(dentry);
 }
+
+struct dentry *debugfs_create_dir(const char *name, struct dentry *parent)
+{
+	return __debugfs_create_dir(name, parent, false);
+}
 EXPORT_SYMBOL_GPL(debugfs_create_dir);
 
+struct dentry *debugfs_prepare_dir(const char *name, struct dentry *parent)
+{
+	return __debugfs_create_dir(name, parent, true);
+}
+EXPORT_SYMBOL_GPL(debugfs_prepare_dir);
+
+void debugfs_publish_dir(struct dentry *dir)
+{
+	struct inode *inode;
+
+	if (!debugfs_initialized() || IS_ERR(dir))
+		return;
+
+	inode = d_inode(dir);
+
+	inode_lock(inode);
+	inode->i_private = NULL;
+	inode_unlock(inode);
+}
+EXPORT_SYMBOL_GPL(debugfs_publish_dir);
+
 /**
  * debugfs_create_automount - create automount point in the debugfs filesystem
  * @name: a pointer to a string containing the name of the file to create.
diff --git a/fs/debugfs/internal.h b/fs/debugfs/internal.h
index 92af8ae31313..47c795756bec 100644
--- a/fs/debugfs/internal.h
+++ b/fs/debugfs/internal.h
@@ -33,6 +33,13 @@ struct debugfs_fsdata {
 #define DEBUGFS_ALLOW_API	BIT(0)
 #define DEBUGFS_ALLOW_MOUNT	BIT(1)
 
+/*
+ * Inode private flags that limit access to a directory,
+ * which may not be fully propagated to the requested files.
+ */
+#define DEBUGFS_DIR_PREPARING	BIT(0)
+#define DEBUGFS_ALLOW_CREATE	BIT(1)
+
 #ifdef CONFIG_DEBUG_FS_ALLOW_ALL
 #define DEFAULT_DEBUGFS_ALLOW_BITS (DEBUGFS_ALLOW_MOUNT | DEBUGFS_ALLOW_API)
 #endif
diff --git a/include/linux/debugfs.h b/include/linux/debugfs.h
index ea2d919fd9c7..8a080270ac1c 100644
--- a/include/linux/debugfs.h
+++ b/include/linux/debugfs.h
@@ -86,6 +86,9 @@ void debugfs_create_file_size(const char *name, umode_t mode,
 
 struct dentry *debugfs_create_dir(const char *name, struct dentry *parent);
 
+struct dentry *debugfs_prepare_dir(const char *name, struct dentry *parent);
+void debugfs_publish_dir(struct dentry *dir);
+
 struct dentry *debugfs_create_symlink(const char *name, struct dentry *parent,
 				      const char *dest);
 
-- 
2.25.1
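
As a reading aid, the permission logic of the patch above can be approximated in userspace C. This is only a sketch under simplifying assumptions: struct model_dir and the model_* functions are hypothetical, the priv field stands in for inode->i_private, and the non-preparing path simply returns 0 where the patch calls generic_permission().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Same bit layout as DEBUGFS_DIR_PREPARING / DEBUGFS_ALLOW_CREATE in the patch. */
#define MODEL_DIR_PREPARING (1UL << 0)
#define MODEL_ALLOW_CREATE  (1UL << 1)

/* Hypothetical stand-in for a directory inode; priv models inode->i_private. */
struct model_dir {
	unsigned long priv;
};

/* Mirrors the shape of debugfs_permission(): deny while preparing. */
static int model_permission(const struct model_dir *dir)
{
	if (dir->priv & MODEL_DIR_PREPARING)
		return (dir->priv & MODEL_ALLOW_CREATE) ? 0 : -EPERM;
	return 0; /* simplification: the patch defers to generic_permission() */
}

/* Models debugfs_prepare_dir(): the directory starts out inaccessible. */
static struct model_dir model_prepare_dir(void)
{
	return (struct model_dir){ .priv = MODEL_DIR_PREPARING };
}

/*
 * Models start_creating(): in-kernel file creation briefly raises
 * ALLOW_CREATE so its own lookup passes the permission check.
 */
static int model_create_file(struct model_dir *dir)
{
	int ret;

	assert(dir != NULL);
	dir->priv |= MODEL_ALLOW_CREATE;
	ret = model_permission(dir);
	dir->priv &= ~MODEL_ALLOW_CREATE;
	return ret;
}

/* Models debugfs_publish_dir(): clear the flags, normal checks apply again. */
static void model_publish_dir(struct model_dir *dir)
{
	assert(dir != NULL);
	dir->priv = 0;
}
```

In this model a lookup from outside fails with -EPERM until model_publish_dir() is called, while model_create_file() succeeds throughout, which is the behaviour the commit message describes.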




end of thread, other threads:[~2023-02-22 13:33 UTC | newest]

Thread overview: 50+ messages
2023-02-09  8:18 Try to address the drm_debugfs issues Christian König
2023-02-09  8:18 ` [PATCH 1/3] drm/debugfs: separate debugfs creation into init and register Christian König
2023-02-14 11:56   ` Stanislaw Gruszka
2023-02-09  8:18 ` [PATCH 2/3] drm/debugfs: split registration into dev and minor Christian König
2023-02-09 11:12   ` Maíra Canal
2023-02-09 12:03     ` Christian König
2023-02-09  8:18 ` [PATCH 3/3] drm/debugfs: remove dev->debugfs_list and debugfs_mutex Christian König
2023-02-14 12:19   ` Stanislaw Gruszka
2023-02-14 12:46     ` Stanislaw Gruszka
2023-02-16 11:33   ` Daniel Vetter
2023-02-16 11:37     ` Daniel Vetter
2023-02-16 16:00     ` Christian König
2023-02-16 16:46       ` Jani Nikula
2023-02-16 16:56         ` Christian König
2023-02-16 17:08           ` Jani Nikula
2023-02-16 19:54             ` Daniel Vetter
2023-02-17  9:22               ` Christian König
2023-02-17 10:01                 ` Stanislaw Gruszka
2023-02-17 19:38                   ` Daniel Vetter
2023-02-17 19:55                     ` Christian König
2023-02-22 13:33                     ` Stanislaw Gruszka
2023-02-16 16:37     ` Stanislaw Gruszka
2023-02-16 17:06       ` Jani Nikula
2023-02-16 19:56         ` Daniel Vetter
2023-02-17 10:35         ` Stanislaw Gruszka
2023-02-17 10:49           ` Jani Nikula
2023-02-17 11:36             ` Stanislaw Gruszka
2023-02-17 11:54               ` Christian König
2023-02-17 12:37                 ` Jani Nikula
2023-02-17 15:55                   ` Christian König
2023-02-17 19:42                     ` Daniel Vetter
2023-02-17 19:49                       ` Christian König
2023-02-09 11:23 ` Try to address the drm_debugfs issues Maíra Canal
2023-02-09 12:13   ` Christian König
2023-02-09 13:06     ` Maíra Canal
2023-02-09 14:06       ` Christian König
2023-02-09 14:19         ` Maxime Ripard
2023-02-09 15:52           ` Christian König
2023-02-09 18:48             ` Maxime Ripard
2023-02-10 12:07               ` Christian König
2023-02-10 12:18                 ` Maxime Ripard
2023-02-10 13:10                   ` Christian König
2023-02-16 11:34         ` Daniel Vetter
2023-02-16 16:31           ` Christian König
2023-02-16 19:57             ` Daniel Vetter
2023-02-13 18:16       ` Stanislaw Gruszka
2023-02-13 19:59         ` Christian König
2023-02-14  8:59 ` Stanislaw Gruszka
2023-02-14  9:28   ` Christian König
2023-02-14 11:46     ` Stanislaw Gruszka
