linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFT PATCH 00/15] drm: non-drm-misc drivers call drm_atomic_helper_shutdown() at the right times
@ 2023-09-01 23:41 Douglas Anderson
  2023-09-01 23:41 ` [RFT PATCH 01/15] drm/armada: Call drm_atomic_helper_shutdown() at shutdown time Douglas Anderson
                   ` (14 more replies)
  0 siblings, 15 replies; 59+ messages in thread
From: Douglas Anderson @ 2023-09-01 23:41 UTC (permalink / raw)
  To: dri-devel, Maxime Ripard
  Cc: Douglas Anderson, Baolin Wang, Bokun.Zhang, Hawking.Zhang,
	James.Zhu, Sascha Hauer, Victor.Zhao, Xinhui.Pan, YiPeng.Chai,
	abrodkin, airlied, alexander.deucher, alim.akhtar, amd-gfx,
	angelogioacchino.delregno, anitha.chrisanthus, biju.das.jz,
	bskeggs, christian.koenig, chunkuang.hu, daniel, edmund.j.dea,
	festevam, geert+renesas, inki.dae, jonathanh, kernel, kherbst,
	kieran.bingham+renesas, krzysztof.kozlowski, kyungmin.park,
	l.stach, laurent.pinchart, laurentiu.palcu, le.ma, lijo.lazar,
	linux-arm-kernel, linux-imx, linux-kernel, linux-mediatek,
	linux-mips, linux-renesas-soc, linux-samsung-soc, linux-tegra,
	linux, liviu.dudau, lyude, maarten.lankhorst, mario.limonciello,
	matthias.bgg, mdaenzer, mperttunen, nouveau, orsonzhai, p.zabel,
	patrik.r.jakobsson, paul, rfoss, robh, sam, shawnguo,
	shiwu.zhang, srinivasan.shanmugam, steven.price, sw0312.kim,
	thierry.reding, tzimmermann, zhang.lyra


NOTE: in order to avoid email sending limits on the cover letter, I've
split this patch series in two. Patches that target drm-misc and ones
that don't. The cover letter of the two is identical other than this note.

This patch series came about after a _long_ discussion between me and
Maxime Ripard in response to a different patch I sent out [1]. As part
of that discussion, we realized that it would be good if DRM drivers
consistently called drm_atomic_helper_shutdown() properly at shutdown
and driver remove time as it's documented that they should do. The
eventual goal of this would be to enable removing some hacky code from
panel drivers where they had to hook into shutdown themselves because
the DRM driver wasn't calling them.

It turns out that quite a lot of drivers seemed to be missing
drm_atomic_helper_shutdown() in one or both places that it was
supposed to be. This patch series attempts to fix all the drivers that
I was able to identify.

NOTE: fixing this wasn't exactly cookie cutter. Each driver has its
own unique way of setting itself up and tearing itself down. Some
drivers also use the component model, which adds extra fun. I've made
my best guess at solving this and I've run a bunch of compile tests
(specifically, allmodconfig for amd64, arm64, and powerpc). That being
said, these code changes are not totally trivial and I've done zero
real testing on them. Making these patches was also a little mind
numbing and I'm certain my eyes glazed over at several points when
writing them. What I'm trying to say is to please double-check that I
didn't do anything too silly, like cast your driver's drvdata to the
wrong type. Even better, test these patches!

I've organized this series like this:
1. One patch for all simple cases of just needing a call at shutdown
   time for drivers that go through drm-misc.
2. A separate patch for "drm/vc4", even though it goes through
   drm-misc, since I wanted to leave an extra note for that one.
3. Patches for drivers that just needed a call at shutdown time for
   drivers that _don't_ go through drm-misc.
4. Patches for the few drivers that had the call at shutdown time but
   lacked it at remove time.
5. One patch for all simple cases of needing a call at shutdown and
   remove (or unbind) time for drivers that go through drm-misc.
6. A separate patch for "drm/hisilicon/kirin", even though it goes
   through drm-misc, since I wanted to leave an extra note for that
   one.
7. Patches for drivers that needed a call at shutdown and remove (or
   unbind) time for drivers that _don't_ go through drm-misc.

I've labeled this patch series as RFT (request for testing) to help
call attention to the fact that I didn't personally test any of these
patches.

If you're a maintainer of one of these drivers and you think that the
patch for your driver looks fabulous, you've tested it, and you'd like
to land it right away then please do. For non-drm-misc drivers there
are no dependencies here. Some of the drm-misc drivers depend on the
first patch, AKA ("drm/atomic-helper: drm_atomic_helper_shutdown(NULL)
should be a noop"). I've tried to call this out but it also should be
obvious once you know to look for it.

I'd like to call out a few drivers that I _didn't_ fix in this series
and why. If any of these drivers should be fixed then please yell.
- DRM driers backed by usb_driver (like gud, gm12u320, udl): I didn't
  add the call to drm_atomic_helper_shutdown() at shutdown time
  because there's no ".shutdown" callback for them USB drivers. Given
  that USB is hotpluggable, I'm assuming that they are robust against
  this and the special shutdown callback isn't needed.
- ofdrm and simpledrm: These didn't have drm_atomic_helper_shutdown()
  in either shutdown or remove, but I didn't add it. I think that's OK
  since they're sorta special and not really directly controlling
  hardware power sequencing.
- virtio, vkms, vmwgfx, xen: I believe these are all virtual (thus
  they wouldn't directly drive a panel) and adding the shutdown
  didn't look straightforward, so I skipped them.

I've let each patch in the series get CCed straight from
get_maintainer. That means not everyone will have received every patch
but everyone should be on the cover letter. I know some people dislike
this but when touching this many drivers there's not much
choice. dri-devel and lkml have been CCed and lore/lei exist, so
hopefully that's enough for folks. I'm happy to add people to the
whole series for future posts.

[1] https://lore.kernel.org/lkml/20230804140605.RFC.4.I930069a32baab6faf46d6b234f89613b5cec0f14@changeid


Douglas Anderson (15):
  drm/armada: Call drm_atomic_helper_shutdown() at shutdown time
  drm/imx/dcss: Call drm_atomic_helper_shutdown() at shutdown time
  drm/ingenic: Call drm_atomic_helper_shutdown() at shutdown time
  drm/kmb: Call drm_atomic_helper_shutdown() at shutdown time
  drm/mediatek: Call drm_atomic_helper_shutdown() at shutdown time
  drm/nouveau: Call drm_atomic_helper_shutdown() or equiv at shutdown
    time
  drm/tegra: Call drm_atomic_helper_shutdown() at shutdown time
  drm/arcpgu: Call drm_atomic_helper_shutdown() at shutdown time
  drm/amdgpu: Call drm_atomic_helper_shutdown() at shutdown time
  drm/sprd: Call drm_atomic_helper_shutdown() at remove time
  drm/exynos: Call drm_atomic_helper_shutdown() at shutdown/unbind time
  drm/gma500: Call drm_helper_force_disable_all() at shutdown/remove
    time
  drm/imx/ipuv3: Call drm_atomic_helper_shutdown() at shutdown/unbind
    time
  drm/radeon: Call drm_helper_force_disable_all() at shutdown/remove
    time
  drm/renesas/shmobile: Call drm_helper_force_disable_all() at
    shutdown/remove time

 drivers/gpu/drm/amd/amdgpu/amdgpu.h           |  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c    | 10 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c       |  2 +
 drivers/gpu/drm/armada/armada_drv.c           |  8 +++
 drivers/gpu/drm/exynos/exynos_drm_drv.c       | 11 ++++
 drivers/gpu/drm/gma500/psb_drv.c              |  8 +++
 drivers/gpu/drm/imx/dcss/dcss-drv.c           |  8 +++
 drivers/gpu/drm/imx/dcss/dcss-kms.c           |  7 ++
 drivers/gpu/drm/imx/dcss/dcss-kms.h           |  1 +
 drivers/gpu/drm/imx/ipuv3/imx-drm-core.c      | 11 ++++
 drivers/gpu/drm/ingenic/ingenic-drm-drv.c     | 66 ++++++++++++-------
 drivers/gpu/drm/kmb/kmb_drv.c                 |  6 ++
 drivers/gpu/drm/mediatek/mtk_drm_drv.c        |  9 +++
 drivers/gpu/drm/nouveau/nouveau_display.c     |  9 +++
 drivers/gpu/drm/nouveau/nouveau_display.h     |  1 +
 drivers/gpu/drm/nouveau/nouveau_drm.c         | 13 ++++
 drivers/gpu/drm/nouveau/nouveau_drv.h         |  1 +
 drivers/gpu/drm/nouveau/nouveau_platform.c    |  6 ++
 drivers/gpu/drm/radeon/radeon_drv.c           |  7 +-
 .../gpu/drm/renesas/shmobile/shmob_drm_drv.c  | 10 +++
 drivers/gpu/drm/sprd/sprd_drm.c               |  4 +-
 drivers/gpu/drm/tegra/drm.c                   |  6 ++
 drivers/gpu/drm/tiny/arcpgu.c                 |  6 ++
 23 files changed, 187 insertions(+), 24 deletions(-)

-- 
2.42.0.283.g2d96d420d3-goog


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [RFT PATCH 01/15] drm/armada: Call drm_atomic_helper_shutdown() at shutdown time
  2023-09-01 23:41 [RFT PATCH 00/15] drm: non-drm-misc drivers call drm_atomic_helper_shutdown() at the right times Douglas Anderson
@ 2023-09-01 23:41 ` Douglas Anderson
  2023-09-03 15:53   ` Russell King (Oracle)
                     ` (2 more replies)
  2023-09-01 23:41 ` [RFT PATCH 02/15] drm/imx/dcss: " Douglas Anderson
                   ` (13 subsequent siblings)
  14 siblings, 3 replies; 59+ messages in thread
From: Douglas Anderson @ 2023-09-01 23:41 UTC (permalink / raw)
  To: dri-devel, Maxime Ripard
  Cc: Douglas Anderson, airlied, daniel, linux-kernel, linux

Based on grepping through the source code this driver appears to be
missing a call to drm_atomic_helper_shutdown() at system shutdown
time. Among other things, this means that if a panel is in use that it
won't be cleanly powered off at system shutdown time.

The fact that we should call drm_atomic_helper_shutdown() in the case
of OS shutdown/restart comes straight out of the kernel doc "driver
instance overview" in drm_drv.c.

This driver was fairly easy to update. The drm_device is stored in the
drvdata so we just have to make sure the drvdata is NULL whenever the
device is not bound. To make things simpler,
drm_atomic_helper_shutdown() has been modified to consider a NULL
drm_device as a noop in the patch ("drm/atomic-helper:
drm_atomic_helper_shutdown(NULL) should be a noop").

Suggested-by: Maxime Ripard <mripard@kernel.org>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
---
This commit is only compile-time tested.

 drivers/gpu/drm/armada/armada_drv.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/gpu/drm/armada/armada_drv.c b/drivers/gpu/drm/armada/armada_drv.c
index e8d2fe955909..fa1c67598706 100644
--- a/drivers/gpu/drm/armada/armada_drv.c
+++ b/drivers/gpu/drm/armada/armada_drv.c
@@ -148,6 +148,7 @@ static int armada_drm_bind(struct device *dev)
  err_kms:
 	drm_mode_config_cleanup(&priv->drm);
 	drm_mm_takedown(&priv->linear);
+	dev_set_drvdata(dev, NULL);
 	return ret;
 }
 
@@ -166,6 +167,7 @@ static void armada_drm_unbind(struct device *dev)
 
 	drm_mode_config_cleanup(&priv->drm);
 	drm_mm_takedown(&priv->linear);
+	dev_set_drvdata(dev, NULL);
 }
 
 static void armada_add_endpoints(struct device *dev,
@@ -230,6 +232,11 @@ static int armada_drm_remove(struct platform_device *pdev)
 	return 0;
 }
 
+static void armada_drm_shutdown(struct platform_device *pdev)
+{
+	drm_atomic_helper_shutdown(platform_get_drvdata(pdev));
+}
+
 static const struct platform_device_id armada_drm_platform_ids[] = {
 	{
 		.name		= "armada-drm",
@@ -243,6 +250,7 @@ MODULE_DEVICE_TABLE(platform, armada_drm_platform_ids);
 static struct platform_driver armada_drm_platform_driver = {
 	.probe	= armada_drm_probe,
 	.remove	= armada_drm_remove,
+	.shutdown = armada_drm_shutdown,
 	.driver	= {
 		.name	= "armada-drm",
 	},
-- 
2.42.0.283.g2d96d420d3-goog


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [RFT PATCH 02/15] drm/imx/dcss: Call drm_atomic_helper_shutdown() at shutdown time
  2023-09-01 23:41 [RFT PATCH 00/15] drm: non-drm-misc drivers call drm_atomic_helper_shutdown() at the right times Douglas Anderson
  2023-09-01 23:41 ` [RFT PATCH 01/15] drm/armada: Call drm_atomic_helper_shutdown() at shutdown time Douglas Anderson
@ 2023-09-01 23:41 ` Douglas Anderson
  2023-09-04  7:36   ` Maxime Ripard
  2023-09-01 23:41 ` [RFT PATCH 03/15] drm/ingenic: " Douglas Anderson
                   ` (12 subsequent siblings)
  14 siblings, 1 reply; 59+ messages in thread
From: Douglas Anderson @ 2023-09-01 23:41 UTC (permalink / raw)
  To: dri-devel, Maxime Ripard
  Cc: Douglas Anderson, Sascha Hauer, airlied, daniel, festevam,
	kernel, l.stach, laurentiu.palcu, linux-arm-kernel, linux-imx,
	linux-kernel, shawnguo

Based on grepping through the source code this driver appears to be
missing a call to drm_atomic_helper_shutdown() at system shutdown
time. Among other things, this means that if a panel is in use that it
won't be cleanly powered off at system shutdown time.

The fact that we should call drm_atomic_helper_shutdown() in the case
of OS shutdown/restart comes straight out of the kernel doc "driver
instance overview" in drm_drv.c.

Suggested-by: Maxime Ripard <mripard@kernel.org>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
---
This commit is only compile-time tested.

 drivers/gpu/drm/imx/dcss/dcss-drv.c | 8 ++++++++
 drivers/gpu/drm/imx/dcss/dcss-kms.c | 7 +++++++
 drivers/gpu/drm/imx/dcss/dcss-kms.h | 1 +
 3 files changed, 16 insertions(+)

diff --git a/drivers/gpu/drm/imx/dcss/dcss-drv.c b/drivers/gpu/drm/imx/dcss/dcss-drv.c
index c68b0d93ae9e..b61cec0cc79d 100644
--- a/drivers/gpu/drm/imx/dcss/dcss-drv.c
+++ b/drivers/gpu/drm/imx/dcss/dcss-drv.c
@@ -92,6 +92,13 @@ static int dcss_drv_platform_remove(struct platform_device *pdev)
 	return 0;
 }
 
+static void dcss_drv_platform_shutdown(struct platform_device *pdev)
+{
+	struct dcss_drv *mdrv = dev_get_drvdata(&pdev->dev);
+
+	dcss_kms_shutdown(mdrv->kms);
+}
+
 static struct dcss_type_data dcss_types[] = {
 	[DCSS_IMX8MQ] = {
 		.name = "DCSS_IMX8MQ",
@@ -114,6 +121,7 @@ MODULE_DEVICE_TABLE(of, dcss_of_match);
 static struct platform_driver dcss_platform_driver = {
 	.probe	= dcss_drv_platform_probe,
 	.remove	= dcss_drv_platform_remove,
+	.shutdown = dcss_drv_platform_shutdown,
 	.driver	= {
 		.name = "imx-dcss",
 		.of_match_table	= dcss_of_match,
diff --git a/drivers/gpu/drm/imx/dcss/dcss-kms.c b/drivers/gpu/drm/imx/dcss/dcss-kms.c
index 896de946f8df..d0ea4e97cded 100644
--- a/drivers/gpu/drm/imx/dcss/dcss-kms.c
+++ b/drivers/gpu/drm/imx/dcss/dcss-kms.c
@@ -172,3 +172,10 @@ void dcss_kms_detach(struct dcss_kms_dev *kms)
 	dcss_crtc_deinit(&kms->crtc, drm);
 	drm->dev_private = NULL;
 }
+
+void dcss_kms_shutdown(struct dcss_kms_dev *kms)
+{
+	struct drm_device *drm = &kms->base;
+
+	drm_atomic_helper_shutdown(drm);
+}
diff --git a/drivers/gpu/drm/imx/dcss/dcss-kms.h b/drivers/gpu/drm/imx/dcss/dcss-kms.h
index dfe5dd99eea3..62521c1fd6d2 100644
--- a/drivers/gpu/drm/imx/dcss/dcss-kms.h
+++ b/drivers/gpu/drm/imx/dcss/dcss-kms.h
@@ -34,6 +34,7 @@ struct dcss_kms_dev {
 
 struct dcss_kms_dev *dcss_kms_attach(struct dcss_dev *dcss);
 void dcss_kms_detach(struct dcss_kms_dev *kms);
+void dcss_kms_shutdown(struct dcss_kms_dev *kms);
 int dcss_crtc_init(struct dcss_crtc *crtc, struct drm_device *drm);
 void dcss_crtc_deinit(struct dcss_crtc *crtc, struct drm_device *drm);
 struct dcss_plane *dcss_plane_init(struct drm_device *drm,
-- 
2.42.0.283.g2d96d420d3-goog


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [RFT PATCH 03/15] drm/ingenic: Call drm_atomic_helper_shutdown() at shutdown time
  2023-09-01 23:41 [RFT PATCH 00/15] drm: non-drm-misc drivers call drm_atomic_helper_shutdown() at the right times Douglas Anderson
  2023-09-01 23:41 ` [RFT PATCH 01/15] drm/armada: Call drm_atomic_helper_shutdown() at shutdown time Douglas Anderson
  2023-09-01 23:41 ` [RFT PATCH 02/15] drm/imx/dcss: " Douglas Anderson
@ 2023-09-01 23:41 ` Douglas Anderson
  2023-09-04  7:36   ` Maxime Ripard
                     ` (2 more replies)
  2023-09-01 23:41 ` [RFT PATCH 04/15] drm/kmb: " Douglas Anderson
                   ` (11 subsequent siblings)
  14 siblings, 3 replies; 59+ messages in thread
From: Douglas Anderson @ 2023-09-01 23:41 UTC (permalink / raw)
  To: dri-devel, Maxime Ripard
  Cc: Douglas Anderson, airlied, daniel, linux-kernel, linux-mips, paul

Based on grepping through the source code this driver appears to be
missing a call to drm_atomic_helper_shutdown() at system shutdown
time. Among other things, this means that if a panel is in use that it
won't be cleanly powered off at system shutdown time.

The fact that we should call drm_atomic_helper_shutdown() in the case
of OS shutdown/restart comes straight out of the kernel doc "driver
instance overview" in drm_drv.c.

Since this driver uses the component model and shutdown happens at the
base driver, we communicate whether we have to call
drm_atomic_helper_shutdown() by seeing if drvdata is non-NULL.

Suggested-by: Maxime Ripard <mripard@kernel.org>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
---
This commit is only compile-time tested.

NOTE: this patch touches a lot more than other similar patches since
the bind() function is long and we want to make sure that we unset the
drvdata if bind() fails.

While making this patch, I noticed that the bind() function of this
driver is using "devm" and thus assumes it doesn't need to do much
explicit error handling. That's actually a bug. As per kernel docs [1]
"the lifetime of the aggregate driver does not align with any of the
underlying struct device instances. Therefore devm cannot be used and
all resources acquired or allocated in this callback must be
explicitly released in the unbind callback". Fixing that is outside
the scope of this commit.

[1] https://docs.kernel.org/driver-api/component.html

 drivers/gpu/drm/ingenic/ingenic-drm-drv.c | 66 +++++++++++++++--------
 1 file changed, 44 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/ingenic/ingenic-drm-drv.c b/drivers/gpu/drm/ingenic/ingenic-drm-drv.c
index 8dbd4847d3a6..51995a0cd568 100644
--- a/drivers/gpu/drm/ingenic/ingenic-drm-drv.c
+++ b/drivers/gpu/drm/ingenic/ingenic-drm-drv.c
@@ -1130,7 +1130,7 @@ static int ingenic_drm_bind(struct device *dev, bool has_components)
 
 	ret = drmm_mode_config_init(drm);
 	if (ret)
-		return ret;
+		goto err_drvdata;
 
 	drm->mode_config.min_width = 0;
 	drm->mode_config.min_height = 0;
@@ -1142,7 +1142,8 @@ static int ingenic_drm_bind(struct device *dev, bool has_components)
 	base = devm_platform_get_and_ioremap_resource(pdev, 0, &res);
 	if (IS_ERR(base)) {
 		dev_err(dev, "Failed to get memory resource\n");
-		return PTR_ERR(base);
+		ret = PTR_ERR(base);
+		goto err_drvdata;
 	}
 
 	regmap_config = ingenic_drm_regmap_config;
@@ -1151,33 +1152,40 @@ static int ingenic_drm_bind(struct device *dev, bool has_components)
 					  &regmap_config);
 	if (IS_ERR(priv->map)) {
 		dev_err(dev, "Failed to create regmap\n");
-		return PTR_ERR(priv->map);
+		ret = PTR_ERR(priv->map);
+		goto err_drvdata;
 	}
 
 	irq = platform_get_irq(pdev, 0);
-	if (irq < 0)
-		return irq;
+	if (irq < 0) {
+		ret = irq;
+		goto err_drvdata;
+	}
 
 	if (soc_info->needs_dev_clk) {
 		priv->lcd_clk = devm_clk_get(dev, "lcd");
 		if (IS_ERR(priv->lcd_clk)) {
 			dev_err(dev, "Failed to get lcd clock\n");
-			return PTR_ERR(priv->lcd_clk);
+			ret = PTR_ERR(priv->lcd_clk);
+			goto err_drvdata;
 		}
 	}
 
 	priv->pix_clk = devm_clk_get(dev, "lcd_pclk");
 	if (IS_ERR(priv->pix_clk)) {
 		dev_err(dev, "Failed to get pixel clock\n");
-		return PTR_ERR(priv->pix_clk);
+		ret = PTR_ERR(priv->pix_clk);
+		goto err_drvdata;
 	}
 
 	priv->dma_hwdescs = dmam_alloc_coherent(dev,
 						sizeof(*priv->dma_hwdescs),
 						&priv->dma_hwdescs_phys,
 						GFP_KERNEL);
-	if (!priv->dma_hwdescs)
-		return -ENOMEM;
+	if (!priv->dma_hwdescs) {
+		ret = -ENOMEM;
+		goto err_drvdata;
+	}
 
 	/* Configure DMA hwdesc for foreground0 plane */
 	ingenic_drm_configure_hwdesc_plane(priv, 0);
@@ -1199,7 +1207,7 @@ static int ingenic_drm_bind(struct device *dev, bool has_components)
 				       NULL, DRM_PLANE_TYPE_PRIMARY, NULL);
 	if (ret) {
 		dev_err(dev, "Failed to register plane: %i\n", ret);
-		return ret;
+		goto err_drvdata;
 	}
 
 	if (soc_info->map_noncoherent)
@@ -1211,7 +1219,7 @@ static int ingenic_drm_bind(struct device *dev, bool has_components)
 					NULL, &ingenic_drm_crtc_funcs, NULL);
 	if (ret) {
 		dev_err(dev, "Failed to init CRTC: %i\n", ret);
-		return ret;
+		goto err_drvdata;
 	}
 
 	drm_crtc_enable_color_mgmt(&priv->crtc, 0, false,
@@ -1230,7 +1238,7 @@ static int ingenic_drm_bind(struct device *dev, bool has_components)
 		if (ret) {
 			dev_err(dev, "Failed to register overlay plane: %i\n",
 				ret);
-			return ret;
+			goto err_drvdata;
 		}
 
 		if (soc_info->map_noncoherent)
@@ -1241,17 +1249,18 @@ static int ingenic_drm_bind(struct device *dev, bool has_components)
 			if (ret) {
 				if (ret != -EPROBE_DEFER)
 					dev_err(dev, "Failed to bind components: %i\n", ret);
-				return ret;
+				goto err_drvdata;
 			}
 
 			ret = devm_add_action_or_reset(dev, ingenic_drm_unbind_all, priv);
 			if (ret)
-				return ret;
+				goto err_drvdata;
 
 			priv->ipu_plane = drm_plane_from_index(drm, 2);
 			if (!priv->ipu_plane) {
 				dev_err(dev, "Failed to retrieve IPU plane\n");
-				return -EINVAL;
+				ret = -EINVAL;
+				goto err_drvdata;
 			}
 		}
 	}
@@ -1263,7 +1272,7 @@ static int ingenic_drm_bind(struct device *dev, bool has_components)
 				break; /* we're done */
 			if (ret != -EPROBE_DEFER)
 				dev_err(dev, "Failed to get bridge handle\n");
-			return ret;
+			goto err_drvdata;
 		}
 
 		if (panel)
@@ -1275,7 +1284,7 @@ static int ingenic_drm_bind(struct device *dev, bool has_components)
 		if (IS_ERR(ib)) {
 			ret = PTR_ERR(ib);
 			dev_err(dev, "Failed to init encoder: %d\n", ret);
-			return ret;
+			goto err_drvdata;
 		}
 
 		encoder = &ib->encoder;
@@ -1290,13 +1299,14 @@ static int ingenic_drm_bind(struct device *dev, bool has_components)
 					DRM_BRIDGE_ATTACH_NO_CONNECTOR);
 		if (ret) {
 			dev_err(dev, "Unable to attach bridge\n");
-			return ret;
+			goto err_drvdata;
 		}
 
 		connector = drm_bridge_connector_init(drm, encoder);
 		if (IS_ERR(connector)) {
 			dev_err(dev, "Unable to init connector\n");
-			return PTR_ERR(connector);
+			ret = PTR_ERR(connector);
+			goto err_drvdata;
 		}
 
 		drm_connector_attach_encoder(connector, encoder);
@@ -1313,13 +1323,13 @@ static int ingenic_drm_bind(struct device *dev, bool has_components)
 	ret = devm_request_irq(dev, irq, ingenic_drm_irq_handler, 0, drm->driver->name, drm);
 	if (ret) {
 		dev_err(dev, "Unable to install IRQ handler\n");
-		return ret;
+		goto err_drvdata;
 	}
 
 	ret = drm_vblank_init(drm, 1);
 	if (ret) {
 		dev_err(dev, "Failed calling drm_vblank_init()\n");
-		return ret;
+		goto err_drvdata;
 	}
 
 	drm_mode_config_reset(drm);
@@ -1327,7 +1337,7 @@ static int ingenic_drm_bind(struct device *dev, bool has_components)
 	ret = clk_prepare_enable(priv->pix_clk);
 	if (ret) {
 		dev_err(dev, "Unable to start pixel clock\n");
-		return ret;
+		goto err_drvdata;
 	}
 
 	if (priv->lcd_clk) {
@@ -1402,6 +1412,8 @@ static int ingenic_drm_bind(struct device *dev, bool has_components)
 		clk_disable_unprepare(priv->lcd_clk);
 err_pixclk_disable:
 	clk_disable_unprepare(priv->pix_clk);
+err_drvdata:
+	platform_set_drvdata(pdev, NULL);
 	return ret;
 }
 
@@ -1422,6 +1434,7 @@ static void ingenic_drm_unbind(struct device *dev)
 
 	drm_dev_unregister(&priv->drm);
 	drm_atomic_helper_shutdown(&priv->drm);
+	dev_set_drvdata(dev, NULL);
 }
 
 static const struct component_master_ops ingenic_master_ops = {
@@ -1461,6 +1474,14 @@ static int ingenic_drm_remove(struct platform_device *pdev)
 	return 0;
 }
 
+static void ingenic_drm_shutdown(struct platform_device *pdev)
+{
+	struct ingenic_drm *priv = platform_get_drvdata(pdev);
+
+	if (priv)
+		drm_atomic_helper_shutdown(&priv->drm);
+}
+
 static int ingenic_drm_suspend(struct device *dev)
 {
 	struct ingenic_drm *priv = dev_get_drvdata(dev);
@@ -1612,6 +1633,7 @@ static struct platform_driver ingenic_drm_driver = {
 	},
 	.probe = ingenic_drm_probe,
 	.remove = ingenic_drm_remove,
+	.shutdown = ingenic_drm_shutdown,
 };
 
 static int ingenic_drm_init(void)
-- 
2.42.0.283.g2d96d420d3-goog


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [RFT PATCH 04/15] drm/kmb: Call drm_atomic_helper_shutdown() at shutdown time
  2023-09-01 23:41 [RFT PATCH 00/15] drm: non-drm-misc drivers call drm_atomic_helper_shutdown() at the right times Douglas Anderson
                   ` (2 preceding siblings ...)
  2023-09-01 23:41 ` [RFT PATCH 03/15] drm/ingenic: " Douglas Anderson
@ 2023-09-01 23:41 ` Douglas Anderson
  2023-09-04  7:26   ` Maxime Ripard
  2023-09-01 23:41 ` [RFT PATCH 05/15] drm/mediatek: " Douglas Anderson
                   ` (10 subsequent siblings)
  14 siblings, 1 reply; 59+ messages in thread
From: Douglas Anderson @ 2023-09-01 23:41 UTC (permalink / raw)
  To: dri-devel, Maxime Ripard
  Cc: Douglas Anderson, airlied, anitha.chrisanthus, daniel,
	edmund.j.dea, linux-kernel

Based on grepping through the source code this driver appears to be
missing a call to drm_atomic_helper_shutdown() at system shutdown
time. Among other things, this means that if a panel is in use that it
won't be cleanly powered off at system shutdown time.

The fact that we should call drm_atomic_helper_shutdown() in the case
of OS shutdown/restart comes straight out of the kernel doc "driver
instance overview" in drm_drv.c.

Suggested-by: Maxime Ripard <mripard@kernel.org>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
---
This commit is only compile-time tested.

 drivers/gpu/drm/kmb/kmb_drv.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/kmb/kmb_drv.c b/drivers/gpu/drm/kmb/kmb_drv.c
index 24035b53441c..af9bd34fefc0 100644
--- a/drivers/gpu/drm/kmb/kmb_drv.c
+++ b/drivers/gpu/drm/kmb/kmb_drv.c
@@ -476,6 +476,11 @@ static int kmb_remove(struct platform_device *pdev)
 	return 0;
 }
 
+static void kmb_shutdown(struct platform_device *pdev)
+{
+	drm_atomic_helper_shutdown(platform_get_drvdata(pdev));
+}
+
 static int kmb_probe(struct platform_device *pdev)
 {
 	struct device *dev = get_device(&pdev->dev);
@@ -622,6 +627,7 @@ static SIMPLE_DEV_PM_OPS(kmb_pm_ops, kmb_pm_suspend, kmb_pm_resume);
 static struct platform_driver kmb_platform_driver = {
 	.probe = kmb_probe,
 	.remove = kmb_remove,
+	.shutdown = kmb_shutdown,
 	.driver = {
 		.name = "kmb-drm",
 		.pm = &kmb_pm_ops,
-- 
2.42.0.283.g2d96d420d3-goog


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [RFT PATCH 05/15] drm/mediatek: Call drm_atomic_helper_shutdown() at shutdown time
  2023-09-01 23:41 [RFT PATCH 00/15] drm: non-drm-misc drivers call drm_atomic_helper_shutdown() at the right times Douglas Anderson
                   ` (3 preceding siblings ...)
  2023-09-01 23:41 ` [RFT PATCH 04/15] drm/kmb: " Douglas Anderson
@ 2023-09-01 23:41 ` Douglas Anderson
  2023-09-04  7:27   ` Maxime Ripard
  2023-09-08 11:51   ` Fei Shao
  2023-09-01 23:41 ` [RFT PATCH 06/15] drm/nouveau: Call drm_atomic_helper_shutdown() or equiv " Douglas Anderson
                   ` (9 subsequent siblings)
  14 siblings, 2 replies; 59+ messages in thread
From: Douglas Anderson @ 2023-09-01 23:41 UTC (permalink / raw)
  To: dri-devel, Maxime Ripard
  Cc: Douglas Anderson, airlied, angelogioacchino.delregno,
	chunkuang.hu, daniel, linux-arm-kernel, linux-kernel,
	linux-mediatek, matthias.bgg, p.zabel

Based on grepping through the source code this driver appears to be
missing a call to drm_atomic_helper_shutdown() at system shutdown
time. Among other things, this means that if a panel is in use that it
won't be cleanly powered off at system shutdown time.

The fact that we should call drm_atomic_helper_shutdown() in the case
of OS shutdown/restart comes straight out of the kernel doc "driver
instance overview" in drm_drv.c.

This driver users the component model and shutdown happens in the base
driver. The "drvdata" for this driver will always be valid if
shutdown() is called and we know that if the "drm" pointer in our
private data is non-NULL then we need to call
drm_atomic_helper_shutdown(). Technically with a previous patch,
("drm/atomic-helper: drm_atomic_helper_shutdown(NULL) should be a
noop"), we don't actually need to check to see if our "drm" pointer is
NULL before calling drm_atomic_helper_shutdown(). We'll leave the "if"
test in, though, so that this patch can land without any
dependencies. It could potentially be removed later.

Suggested-by: Maxime Ripard <mripard@kernel.org>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
---
This commit is only compile-time tested.

 drivers/gpu/drm/mediatek/mtk_drm_drv.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/gpu/drm/mediatek/mtk_drm_drv.c b/drivers/gpu/drm/mediatek/mtk_drm_drv.c
index 961715dd5b11..8b1c9c992ca8 100644
--- a/drivers/gpu/drm/mediatek/mtk_drm_drv.c
+++ b/drivers/gpu/drm/mediatek/mtk_drm_drv.c
@@ -921,6 +921,14 @@ static int mtk_drm_remove(struct platform_device *pdev)
 	return 0;
 }
 
+static void mtk_drm_shutdown(struct platform_device *pdev)
+{
+	struct mtk_drm_private *private = platform_get_drvdata(pdev);
+
+	if (private->drm)
+		drm_atomic_helper_shutdown(private->drm);
+}
+
 static int mtk_drm_sys_prepare(struct device *dev)
 {
 	struct mtk_drm_private *private = dev_get_drvdata(dev);
@@ -952,6 +960,7 @@ static const struct dev_pm_ops mtk_drm_pm_ops = {
 static struct platform_driver mtk_drm_platform_driver = {
 	.probe	= mtk_drm_probe,
 	.remove	= mtk_drm_remove,
+	.shutdown = mtk_drm_shutdown,
 	.driver	= {
 		.name	= "mediatek-drm",
 		.pm     = &mtk_drm_pm_ops,
-- 
2.42.0.283.g2d96d420d3-goog


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [RFT PATCH 06/15] drm/nouveau: Call drm_atomic_helper_shutdown() or equiv at shutdown time
  2023-09-01 23:41 [RFT PATCH 00/15] drm: non-drm-misc drivers call drm_atomic_helper_shutdown() at the right times Douglas Anderson
                   ` (4 preceding siblings ...)
  2023-09-01 23:41 ` [RFT PATCH 05/15] drm/mediatek: " Douglas Anderson
@ 2023-09-01 23:41 ` Douglas Anderson
  2023-09-04  7:27   ` Maxime Ripard
  2023-09-01 23:41 ` [RFT PATCH 07/15] drm/tegra: Call drm_atomic_helper_shutdown() " Douglas Anderson
                   ` (8 subsequent siblings)
  14 siblings, 1 reply; 59+ messages in thread
From: Douglas Anderson @ 2023-09-01 23:41 UTC (permalink / raw)
  To: dri-devel, Maxime Ripard
  Cc: Douglas Anderson, airlied, bskeggs, daniel, kherbst,
	linux-kernel, lyude, nouveau

Based on grepping through the source code this driver appears to be
missing a call to drm_atomic_helper_shutdown() (or
drm_helper_force_disable_all() if not using atomic) at system shutdown
time. Among other things, this means that if a panel is in use that it
won't be cleanly powered off at system shutdown time.

The fact that we should call drm_atomic_helper_shutdown() in the case
of OS shutdown/restart comes straight out of the kernel doc "driver
instance overview" in drm_drv.c.

Suggested-by: Maxime Ripard <mripard@kernel.org>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
---
This commit is only compile-time tested. I made my best guess about
how to fit this into the existing code. If someone wishes a different
style, please yell.

 drivers/gpu/drm/nouveau/nouveau_display.c  |  9 +++++++++
 drivers/gpu/drm/nouveau/nouveau_display.h  |  1 +
 drivers/gpu/drm/nouveau/nouveau_drm.c      | 13 +++++++++++++
 drivers/gpu/drm/nouveau/nouveau_drv.h      |  1 +
 drivers/gpu/drm/nouveau/nouveau_platform.c |  6 ++++++
 5 files changed, 30 insertions(+)

diff --git a/drivers/gpu/drm/nouveau/nouveau_display.c b/drivers/gpu/drm/nouveau/nouveau_display.c
index 99977e5fe716..04b3a3c0b5a5 100644
--- a/drivers/gpu/drm/nouveau/nouveau_display.c
+++ b/drivers/gpu/drm/nouveau/nouveau_display.c
@@ -642,6 +642,15 @@ nouveau_display_fini(struct drm_device *dev, bool suspend, bool runtime)
 	disp->fini(dev, runtime, suspend);
 }
 
+void
+nouveau_display_shutdown(struct drm_device *dev)
+{
+	if (drm_drv_uses_atomic_modeset(dev))
+		drm_atomic_helper_shutdown(dev);
+	else
+		drm_helper_force_disable_all(dev);
+}
+
 static void
 nouveau_display_create_properties(struct drm_device *dev)
 {
diff --git a/drivers/gpu/drm/nouveau/nouveau_display.h b/drivers/gpu/drm/nouveau/nouveau_display.h
index 2ab2ddb1eadf..9df62e833cda 100644
--- a/drivers/gpu/drm/nouveau/nouveau_display.h
+++ b/drivers/gpu/drm/nouveau/nouveau_display.h
@@ -47,6 +47,7 @@ void nouveau_display_destroy(struct drm_device *dev);
 int  nouveau_display_init(struct drm_device *dev, bool resume, bool runtime);
 void nouveau_display_hpd_resume(struct drm_device *dev);
 void nouveau_display_fini(struct drm_device *dev, bool suspend, bool runtime);
+void nouveau_display_shutdown(struct drm_device *dev);
 int  nouveau_display_suspend(struct drm_device *dev, bool runtime);
 void nouveau_display_resume(struct drm_device *dev, bool runtime);
 int  nouveau_display_vblank_enable(struct drm_crtc *crtc);
diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c b/drivers/gpu/drm/nouveau/nouveau_drm.c
index 4396f501b16a..d8e8160cdf35 100644
--- a/drivers/gpu/drm/nouveau/nouveau_drm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_drm.c
@@ -879,6 +879,18 @@ nouveau_drm_remove(struct pci_dev *pdev)
 	pci_disable_device(pdev);
 }
 
+void
+nouveau_drm_device_shutdown(struct drm_device *dev)
+{
+	nouveau_display_shutdown(dev);
+}
+
+static void
+nouveau_drm_shutdown(struct pci_dev *pdev)
+{
+	nouveau_drm_device_shutdown(pci_get_drvdata(pdev));
+}
+
 static int
 nouveau_do_suspend(struct drm_device *dev, bool runtime)
 {
@@ -1343,6 +1355,7 @@ nouveau_drm_pci_driver = {
 	.id_table = nouveau_drm_pci_table,
 	.probe = nouveau_drm_probe,
 	.remove = nouveau_drm_remove,
+	.shutdown = nouveau_drm_shutdown,
 	.driver.pm = &nouveau_pm_ops,
 };
 
diff --git a/drivers/gpu/drm/nouveau/nouveau_drv.h b/drivers/gpu/drm/nouveau/nouveau_drv.h
index 3666a7403e47..aa936cabb6cf 100644
--- a/drivers/gpu/drm/nouveau/nouveau_drv.h
+++ b/drivers/gpu/drm/nouveau/nouveau_drv.h
@@ -327,6 +327,7 @@ struct drm_device *
 nouveau_platform_device_create(const struct nvkm_device_tegra_func *,
 			       struct platform_device *, struct nvkm_device **);
 void nouveau_drm_device_remove(struct drm_device *dev);
+void nouveau_drm_device_shutdown(struct drm_device *dev);
 
 #define NV_PRINTK(l,c,f,a...) do {                                             \
 	struct nouveau_cli *_cli = (c);                                        \
diff --git a/drivers/gpu/drm/nouveau/nouveau_platform.c b/drivers/gpu/drm/nouveau/nouveau_platform.c
index 23cd43a7fd19..b2e82a96411c 100644
--- a/drivers/gpu/drm/nouveau/nouveau_platform.c
+++ b/drivers/gpu/drm/nouveau/nouveau_platform.c
@@ -50,6 +50,11 @@ static int nouveau_platform_remove(struct platform_device *pdev)
 	return 0;
 }
 
+static void nouveau_platform_shutdown(struct platform_device *pdev)
+{
+	nouveau_drm_device_shutdown(platform_get_drvdata(pdev));
+}
+
 #if IS_ENABLED(CONFIG_OF)
 static const struct nvkm_device_tegra_func gk20a_platform_data = {
 	.iommu_bit = 34,
@@ -94,4 +99,5 @@ struct platform_driver nouveau_platform_driver = {
 	},
 	.probe = nouveau_platform_probe,
 	.remove = nouveau_platform_remove,
+	.shutdown = nouveau_platform_shutdown,
 };
-- 
2.42.0.283.g2d96d420d3-goog


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [RFT PATCH 07/15] drm/tegra: Call drm_atomic_helper_shutdown() at shutdown time
  2023-09-01 23:41 [RFT PATCH 00/15] drm: non-drm-misc drivers call drm_atomic_helper_shutdown() at the right times Douglas Anderson
                   ` (5 preceding siblings ...)
  2023-09-01 23:41 ` [RFT PATCH 06/15] drm/nouveau: Call drm_atomic_helper_shutdown() or equiv " Douglas Anderson
@ 2023-09-01 23:41 ` Douglas Anderson
  2023-09-04  7:28   ` Maxime Ripard
  2023-09-01 23:41 ` [RFT PATCH 08/15] drm/arcpgu: " Douglas Anderson
                   ` (7 subsequent siblings)
  14 siblings, 1 reply; 59+ messages in thread
From: Douglas Anderson @ 2023-09-01 23:41 UTC (permalink / raw)
  To: dri-devel, Maxime Ripard
  Cc: Douglas Anderson, airlied, daniel, jonathanh, linux-kernel,
	linux-tegra, mperttunen, thierry.reding

Based on grepping through the source code this driver appears to be
missing a call to drm_atomic_helper_shutdown() at system shutdown
time. Among other things, this means that if a panel is in use that it
won't be cleanly powered off at system shutdown time.

The fact that we should call drm_atomic_helper_shutdown() in the case
of OS shutdown/restart comes straight out of the kernel doc "driver
instance overview" in drm_drv.c.

Suggested-by: Maxime Ripard <mripard@kernel.org>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
---
This commit is only compile-time tested.

 drivers/gpu/drm/tegra/drm.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c
index ff36171c8fb7..ce2d4153f7bd 100644
--- a/drivers/gpu/drm/tegra/drm.c
+++ b/drivers/gpu/drm/tegra/drm.c
@@ -1312,6 +1312,11 @@ static int host1x_drm_remove(struct host1x_device *dev)
 	return 0;
 }
 
+static void host1x_drm_shutdown(struct host1x_device *dev)
+{
+	drm_atomic_helper_shutdown(dev_get_drvdata(&dev->dev));
+}
+
 #ifdef CONFIG_PM_SLEEP
 static int host1x_drm_suspend(struct device *dev)
 {
@@ -1380,6 +1385,7 @@ static struct host1x_driver host1x_drm_driver = {
 	},
 	.probe = host1x_drm_probe,
 	.remove = host1x_drm_remove,
+	.shutdown = host1x_drm_shutdown,
 	.subdevs = host1x_drm_subdevs,
 };
 
-- 
2.42.0.283.g2d96d420d3-goog


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [RFT PATCH 08/15] drm/arcpgu: Call drm_atomic_helper_shutdown() at shutdown time
  2023-09-01 23:41 [RFT PATCH 00/15] drm: non-drm-misc drivers call drm_atomic_helper_shutdown() at the right times Douglas Anderson
                   ` (6 preceding siblings ...)
  2023-09-01 23:41 ` [RFT PATCH 07/15] drm/tegra: Call drm_atomic_helper_shutdown() " Douglas Anderson
@ 2023-09-01 23:41 ` Douglas Anderson
  2023-09-04  7:28   ` Maxime Ripard
  2023-09-01 23:41 ` [RFT PATCH 09/15] drm/amdgpu: " Douglas Anderson
                   ` (6 subsequent siblings)
  14 siblings, 1 reply; 59+ messages in thread
From: Douglas Anderson @ 2023-09-01 23:41 UTC (permalink / raw)
  To: dri-devel, Maxime Ripard
  Cc: Douglas Anderson, abrodkin, airlied, daniel, linux-kernel

Based on grepping through the source code this driver appears to be
missing a call to drm_atomic_helper_shutdown() at system shutdown
time. Among other things, this means that if a panel is in use that it
won't be cleanly powered off at system shutdown time.

The fact that we should call drm_atomic_helper_shutdown() in the case
of OS shutdown/restart comes straight out of the kernel doc "driver
instance overview" in drm_drv.c.

Suggested-by: Maxime Ripard <mripard@kernel.org>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
---
This commit is only compile-time tested.

 drivers/gpu/drm/tiny/arcpgu.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/tiny/arcpgu.c b/drivers/gpu/drm/tiny/arcpgu.c
index e5b10e41554a..c1e851c982e4 100644
--- a/drivers/gpu/drm/tiny/arcpgu.c
+++ b/drivers/gpu/drm/tiny/arcpgu.c
@@ -414,6 +414,11 @@ static int arcpgu_remove(struct platform_device *pdev)
 	return 0;
 }
 
+static void arcpgu_shutdown(struct platform_device *pdev)
+{
+	drm_atomic_helper_shutdown(platform_get_drvdata(pdev));
+}
+
 static const struct of_device_id arcpgu_of_table[] = {
 	{.compatible = "snps,arcpgu"},
 	{}
@@ -424,6 +429,7 @@ MODULE_DEVICE_TABLE(of, arcpgu_of_table);
 static struct platform_driver arcpgu_platform_driver = {
 	.probe = arcpgu_probe,
 	.remove = arcpgu_remove,
+	.shutdown = arcpgu_shutdown,
 	.driver = {
 		   .name = "arcpgu",
 		   .of_match_table = arcpgu_of_table,
-- 
2.42.0.283.g2d96d420d3-goog


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [RFT PATCH 09/15] drm/amdgpu: Call drm_atomic_helper_shutdown() at shutdown time
  2023-09-01 23:41 [RFT PATCH 00/15] drm: non-drm-misc drivers call drm_atomic_helper_shutdown() at the right times Douglas Anderson
                   ` (7 preceding siblings ...)
  2023-09-01 23:41 ` [RFT PATCH 08/15] drm/arcpgu: " Douglas Anderson
@ 2023-09-01 23:41 ` Douglas Anderson
  2023-09-01 23:41 ` [RFT PATCH 10/15] drm/sprd: Call drm_atomic_helper_shutdown() at remove time Douglas Anderson
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 59+ messages in thread
From: Douglas Anderson @ 2023-09-01 23:41 UTC (permalink / raw)
  To: dri-devel, Maxime Ripard
  Cc: Douglas Anderson, Bokun.Zhang, Hawking.Zhang, James.Zhu,
	Victor.Zhao, Xinhui.Pan, YiPeng.Chai, airlied, alexander.deucher,
	amd-gfx, christian.koenig, daniel, le.ma, lijo.lazar,
	linux-kernel, maarten.lankhorst, mario.limonciello, mdaenzer,
	shiwu.zhang, srinivasan.shanmugam, tzimmermann

Based on grepping through the source code this driver appears to be
missing a call to drm_atomic_helper_shutdown() at system shutdown
time. Among other things, this means that if a panel is in use that it
won't be cleanly powered off at system shutdown time.

The fact that we should call drm_atomic_helper_shutdown() in the case
of OS shutdown/restart comes straight out of the kernel doc "driver
instance overview" in drm_drv.c.

Suggested-by: Maxime Ripard <mripard@kernel.org>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
---
This commit is only compile-time tested.

...and further, I'd say that this patch is more of a plea for help
than a patch I think is actually right. I'm _fairly_ certain that
drm/amdgpu needs this call at shutdown time but the logic is a bit
hard for me to follow. I'd appreciate if anyone who actually knows
what this should look like could illuminate me, or perhaps even just
post a patch themselves!

 drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 10 ++++++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c    |  2 ++
 3 files changed, 13 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 8f2255b3a38a..cfcff0b37466 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -1104,6 +1104,7 @@ static inline struct amdgpu_device *amdgpu_ttm_adev(struct ttm_device *bdev)
 int amdgpu_device_init(struct amdgpu_device *adev,
 		       uint32_t flags);
 void amdgpu_device_fini_hw(struct amdgpu_device *adev);
+void amdgpu_device_shutdown_hw(struct amdgpu_device *adev);
 void amdgpu_device_fini_sw(struct amdgpu_device *adev);
 
 int amdgpu_gpu_wait_for_idle(struct amdgpu_device *adev);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index a2cdde0ca0a7..fa5925c2092d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -4247,6 +4247,16 @@ void amdgpu_device_fini_hw(struct amdgpu_device *adev)
 
 }
 
+void amdgpu_device_shutdown_hw(struct amdgpu_device *adev)
+{
+	if (adev->mode_info.mode_config_initialized) {
+		if (!drm_drv_uses_atomic_modeset(adev_to_drm(adev)))
+			drm_helper_force_disable_all(adev_to_drm(adev));
+		else
+			drm_atomic_helper_shutdown(adev_to_drm(adev));
+	}
+}
+
 void amdgpu_device_fini_sw(struct amdgpu_device *adev)
 {
 	int idx;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index e90f730eb715..3a7cbff111d1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -2333,6 +2333,8 @@ amdgpu_pci_shutdown(struct pci_dev *pdev)
 	struct drm_device *dev = pci_get_drvdata(pdev);
 	struct amdgpu_device *adev = drm_to_adev(dev);
 
+	amdgpu_device_shutdown_hw(adev);
+
 	if (amdgpu_ras_intr_triggered())
 		return;
 
-- 
2.42.0.283.g2d96d420d3-goog


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [RFT PATCH 10/15] drm/sprd: Call drm_atomic_helper_shutdown() at remove time
  2023-09-01 23:41 [RFT PATCH 00/15] drm: non-drm-misc drivers call drm_atomic_helper_shutdown() at the right times Douglas Anderson
                   ` (8 preceding siblings ...)
  2023-09-01 23:41 ` [RFT PATCH 09/15] drm/amdgpu: " Douglas Anderson
@ 2023-09-01 23:41 ` Douglas Anderson
  2023-09-04  7:54   ` Maxime Ripard
  2023-09-01 23:41 ` [RFT PATCH 11/15] drm/exynos: Call drm_atomic_helper_shutdown() at shutdown/unbind time Douglas Anderson
                   ` (4 subsequent siblings)
  14 siblings, 1 reply; 59+ messages in thread
From: Douglas Anderson @ 2023-09-01 23:41 UTC (permalink / raw)
  To: dri-devel, Maxime Ripard
  Cc: Douglas Anderson, Baolin Wang, airlied, daniel, linux-kernel,
	liviu.dudau, orsonzhai, rfoss, robh, sam, steven.price,
	tzimmermann, zhang.lyra

Based on grepping through the source code, this driver appears to be
missing a call to drm_atomic_helper_shutdown() at remove time. Let's
add it.

The fact that we should call drm_atomic_helper_shutdown() in the case
of OS driver remove comes straight out of the kernel doc "driver
instance overview" in drm_drv.c.

While at it, let's also fix it so that if the driver's bind fails or
if a driver gets unbound that the drvdata gets set to NULL. This will
make sure we can't get confused during a later shutdown().

Suggested-by: Maxime Ripard <mripard@kernel.org>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
---
This commit is only compile-time tested.

While making this patch, I noticed that the bind() function of this
driver is using "devm". That's actually a bug. As per kernel docs [1]
"the lifetime of the aggregate driver does not align with any of the
underlying struct device instances. Therefore devm cannot be used and
all resources acquired or allocated in this callback must be
explicitly released in the unbind callback". Fixing that is outside
the scope of this commit.

[1] https://docs.kernel.org/driver-api/component.html

 drivers/gpu/drm/sprd/sprd_drm.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/sprd/sprd_drm.c b/drivers/gpu/drm/sprd/sprd_drm.c
index 0aa39156f2fa..86a175116140 100644
--- a/drivers/gpu/drm/sprd/sprd_drm.c
+++ b/drivers/gpu/drm/sprd/sprd_drm.c
@@ -114,6 +114,7 @@ static int sprd_drm_bind(struct device *dev)
 	drm_kms_helper_poll_fini(drm);
 err_unbind_all:
 	component_unbind_all(drm->dev, drm);
+	platform_set_drvdata(pdev, NULL);
 	return ret;
 }
 
@@ -122,10 +123,11 @@ static void sprd_drm_unbind(struct device *dev)
 	struct drm_device *drm = dev_get_drvdata(dev);
 
 	drm_dev_unregister(drm);
-
 	drm_kms_helper_poll_fini(drm);
+	drm_atomic_helper_shutdown(drm);
 
 	component_unbind_all(drm->dev, drm);
+	dev_set_drvdata(dev, NULL);
 }
 
 static const struct component_master_ops drm_component_ops = {
-- 
2.42.0.283.g2d96d420d3-goog


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [RFT PATCH 11/15] drm/exynos: Call drm_atomic_helper_shutdown() at shutdown/unbind time
  2023-09-01 23:41 [RFT PATCH 00/15] drm: non-drm-misc drivers call drm_atomic_helper_shutdown() at the right times Douglas Anderson
                   ` (9 preceding siblings ...)
  2023-09-01 23:41 ` [RFT PATCH 10/15] drm/sprd: Call drm_atomic_helper_shutdown() at remove time Douglas Anderson
@ 2023-09-01 23:41 ` Douglas Anderson
  2023-09-04  7:54   ` Maxime Ripard
  2023-09-01 23:41 ` [RFT PATCH 12/15] drm/gma500: Call drm_helper_force_disable_all() at shutdown/remove time Douglas Anderson
                   ` (3 subsequent siblings)
  14 siblings, 1 reply; 59+ messages in thread
From: Douglas Anderson @ 2023-09-01 23:41 UTC (permalink / raw)
  To: dri-devel, Maxime Ripard
  Cc: Douglas Anderson, airlied, alim.akhtar, daniel, inki.dae,
	krzysztof.kozlowski, kyungmin.park, linux-arm-kernel,
	linux-kernel, linux-samsung-soc, sw0312.kim

Based on grepping through the source code this driver appears to be
missing a call to drm_atomic_helper_shutdown() at system shutdown time
and at driver unbind time. Among other things, this means that if a
panel is in use that it won't be cleanly powered off at system
shutdown time.

The fact that we should call drm_atomic_helper_shutdown() in the case
of OS shutdown/restart and at driver remove (or unbind) time comes
straight out of the kernel doc "driver instance overview" in
drm_drv.c.

A few notes about this fix:
- When adding drm_atomic_helper_shutdown() to the unbind path, I added
  it after drm_kms_helper_poll_fini() since that's when other drivers
  seemed to have it.
- Technically with a previous patch, ("drm/atomic-helper:
  drm_atomic_helper_shutdown(NULL) should be a noop"), we don't
  actually need to check to see if our "drm" pointer is NULL before
  calling drm_atomic_helper_shutdown(). We'll leave the "if" test in,
  though, so that this patch can land without any dependencies. It
  could potentially be removed later.
- This patch also makes sure to set the drvdata to NULL in the case of
  bind errors to make sure that shutdown can't access freed data.

Suggested-by: Maxime Ripard <mripard@kernel.org>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
---
This commit is only compile-time tested.

 drivers/gpu/drm/exynos/exynos_drm_drv.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/drivers/gpu/drm/exynos/exynos_drm_drv.c b/drivers/gpu/drm/exynos/exynos_drm_drv.c
index 8399256cb5c9..5380fb6c55ae 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_drv.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_drv.c
@@ -300,6 +300,7 @@ static int exynos_drm_bind(struct device *dev)
 	drm_mode_config_cleanup(drm);
 	exynos_drm_cleanup_dma(drm);
 	kfree(private);
+	dev_set_drvdata(dev, NULL);
 err_free_drm:
 	drm_dev_put(drm);
 
@@ -313,6 +314,7 @@ static void exynos_drm_unbind(struct device *dev)
 	drm_dev_unregister(drm);
 
 	drm_kms_helper_poll_fini(drm);
+	drm_atomic_helper_shutdown(drm);
 
 	component_unbind_all(drm->dev, drm);
 	drm_mode_config_cleanup(drm);
@@ -350,9 +352,18 @@ static int exynos_drm_platform_remove(struct platform_device *pdev)
 	return 0;
 }
 
+static void exynos_drm_platform_shutdown(struct platform_device *pdev)
+{
+	struct drm_device *drm = platform_get_drvdata(pdev);
+
+	if (drm)
+		drm_atomic_helper_shutdown(drm);
+}
+
 static struct platform_driver exynos_drm_platform_driver = {
 	.probe	= exynos_drm_platform_probe,
 	.remove	= exynos_drm_platform_remove,
+	.shutdown = exynos_drm_platform_shutdown,
 	.driver	= {
 		.name	= "exynos-drm",
 		.pm	= &exynos_drm_pm_ops,
-- 
2.42.0.283.g2d96d420d3-goog


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [RFT PATCH 12/15] drm/gma500: Call drm_helper_force_disable_all() at shutdown/remove time
  2023-09-01 23:41 [RFT PATCH 00/15] drm: non-drm-misc drivers call drm_atomic_helper_shutdown() at the right times Douglas Anderson
                   ` (10 preceding siblings ...)
  2023-09-01 23:41 ` [RFT PATCH 11/15] drm/exynos: Call drm_atomic_helper_shutdown() at shutdown/unbind time Douglas Anderson
@ 2023-09-01 23:41 ` Douglas Anderson
  2023-09-04  7:29   ` Maxime Ripard
  2023-09-01 23:41 ` [RFT PATCH 13/15] drm/imx/ipuv3: Call drm_atomic_helper_shutdown() at shutdown/unbind time Douglas Anderson
                   ` (2 subsequent siblings)
  14 siblings, 1 reply; 59+ messages in thread
From: Douglas Anderson @ 2023-09-01 23:41 UTC (permalink / raw)
  To: dri-devel, Maxime Ripard
  Cc: Douglas Anderson, airlied, daniel, linux-kernel, patrik.r.jakobsson

Based on grepping through the source code, this driver appears to be
missing a call to drm_atomic_helper_shutdown(), or in this case the
non-atomic equivalent drm_helper_force_disable_all(), at system
shutdown time and at driver remove time. This is important because
drm_helper_force_disable_all() will cause panels to get disabled
cleanly which may be important for their power sequencing. Future
changes will remove any custom powering off in individual panel
drivers so the DRM drivers need to start getting this right.

The fact that we should call drm_atomic_helper_shutdown(), or in this
case the non-atomic equivalent drm_helper_force_disable_all(), in the
case of OS shutdown/restart comes straight out of the kernel doc
"driver instance overview" in drm_drv.c.

Suggested-by: Maxime Ripard <mripard@kernel.org>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
---
This commit is only compile-time tested.

 drivers/gpu/drm/gma500/psb_drv.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/gpu/drm/gma500/psb_drv.c b/drivers/gpu/drm/gma500/psb_drv.c
index 8b64f61ffaf9..a5a399bbe8f5 100644
--- a/drivers/gpu/drm/gma500/psb_drv.c
+++ b/drivers/gpu/drm/gma500/psb_drv.c
@@ -20,6 +20,7 @@
 #include <acpi/video.h>
 
 #include <drm/drm.h>
+#include <drm/drm_crtc_helper.h>
 #include <drm/drm_drv.h>
 #include <drm/drm_file.h>
 #include <drm/drm_ioctl.h>
@@ -485,6 +486,12 @@ static void psb_pci_remove(struct pci_dev *pdev)
 	struct drm_device *dev = pci_get_drvdata(pdev);
 
 	drm_dev_unregister(dev);
+	drm_helper_force_disable_all(dev);
+}
+
+static void psb_pci_shutdown(struct pci_dev *pdev)
+{
+	drm_helper_force_disable_all(pci_get_drvdata(pdev));
 }
 
 static DEFINE_RUNTIME_DEV_PM_OPS(psb_pm_ops, gma_power_suspend, gma_power_resume, NULL);
@@ -521,6 +528,7 @@ static struct pci_driver psb_pci_driver = {
 	.id_table = pciidlist,
 	.probe = psb_pci_probe,
 	.remove = psb_pci_remove,
+	.shutdown = psb_pci_shutdown,
 	.driver.pm = &psb_pm_ops,
 };
 
-- 
2.42.0.283.g2d96d420d3-goog


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [RFT PATCH 13/15] drm/imx/ipuv3: Call drm_atomic_helper_shutdown() at shutdown/unbind time
  2023-09-01 23:41 [RFT PATCH 00/15] drm: non-drm-misc drivers call drm_atomic_helper_shutdown() at the right times Douglas Anderson
                   ` (11 preceding siblings ...)
  2023-09-01 23:41 ` [RFT PATCH 12/15] drm/gma500: Call drm_helper_force_disable_all() at shutdown/remove time Douglas Anderson
@ 2023-09-01 23:41 ` Douglas Anderson
  2023-09-04  7:55   ` Maxime Ripard
                     ` (2 more replies)
  2023-09-01 23:41 ` [RFT PATCH 14/15] drm/radeon: Call drm_helper_force_disable_all() at shutdown/remove time Douglas Anderson
  2023-09-01 23:41 ` [RFT PATCH 15/15] drm/renesas/shmobile: " Douglas Anderson
  14 siblings, 3 replies; 59+ messages in thread
From: Douglas Anderson @ 2023-09-01 23:41 UTC (permalink / raw)
  To: dri-devel, Maxime Ripard
  Cc: Douglas Anderson, Sascha Hauer, airlied, daniel, festevam,
	kernel, linux-arm-kernel, linux-imx, linux-kernel, p.zabel,
	shawnguo

Based on grepping through the source code this driver appears to be
missing a call to drm_atomic_helper_shutdown() at system shutdown time
and at driver unbind time. Among other things, this means that if a
panel is in use that it won't be cleanly powered off at system
shutdown time.

The fact that we should call drm_atomic_helper_shutdown() in the case
of OS shutdown/restart and at driver remove (or unbind) time comes
straight out of the kernel doc "driver instance overview" in
drm_drv.c.

A few notes about this fix:
- When adding drm_atomic_helper_shutdown() to the unbind path, I added
  it after drm_kms_helper_poll_fini() since that's when other drivers
  seemed to have it.
- Technically with a previous patch, ("drm/atomic-helper:
  drm_atomic_helper_shutdown(NULL) should be a noop"), we don't
  actually need to check to see if our "drm" pointer is NULL before
  calling drm_atomic_helper_shutdown(). We'll leave the "if" test in,
  though, so that this patch can land without any dependencies. It
  could potentially be removed later.
- This patch also makes sure to set the drvdata to NULL in the case of
  bind errors to make sure that shutdown can't access freed data.

Suggested-by: Maxime Ripard <mripard@kernel.org>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
---
This commit is only compile-time tested.

 drivers/gpu/drm/imx/ipuv3/imx-drm-core.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/drivers/gpu/drm/imx/ipuv3/imx-drm-core.c b/drivers/gpu/drm/imx/ipuv3/imx-drm-core.c
index 4a866ac60fff..4c8bc49758a7 100644
--- a/drivers/gpu/drm/imx/ipuv3/imx-drm-core.c
+++ b/drivers/gpu/drm/imx/ipuv3/imx-drm-core.c
@@ -257,6 +257,7 @@ static int imx_drm_bind(struct device *dev)
 	drm_kms_helper_poll_fini(drm);
 	component_unbind_all(drm->dev, drm);
 err_kms:
+	dev_set_drvdata(dev, NULL);
 	drm_dev_put(drm);
 
 	return ret;
@@ -269,6 +270,7 @@ static void imx_drm_unbind(struct device *dev)
 	drm_dev_unregister(drm);
 
 	drm_kms_helper_poll_fini(drm);
+	drm_atomic_helper_shutdown(drm);
 
 	component_unbind_all(drm->dev, drm);
 
@@ -298,6 +300,14 @@ static int imx_drm_platform_remove(struct platform_device *pdev)
 	return 0;
 }
 
+static void imx_drm_platform_shutdown(struct platform_device *pdev)
+{
+	struct drm_device *drm = platform_get_drvdata(pdev);
+
+	if (drm)
+		drm_atomic_helper_shutdown(platform_get_drvdata(pdev));
+}
+
 #ifdef CONFIG_PM_SLEEP
 static int imx_drm_suspend(struct device *dev)
 {
@@ -325,6 +335,7 @@ MODULE_DEVICE_TABLE(of, imx_drm_dt_ids);
 static struct platform_driver imx_drm_pdrv = {
 	.probe		= imx_drm_platform_probe,
 	.remove		= imx_drm_platform_remove,
+	.shutdown	= imx_drm_platform_shutdown,
 	.driver		= {
 		.name	= "imx-drm",
 		.pm	= &imx_drm_pm_ops,
-- 
2.42.0.283.g2d96d420d3-goog


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [RFT PATCH 14/15] drm/radeon: Call drm_helper_force_disable_all() at shutdown/remove time
  2023-09-01 23:41 [RFT PATCH 00/15] drm: non-drm-misc drivers call drm_atomic_helper_shutdown() at the right times Douglas Anderson
                   ` (12 preceding siblings ...)
  2023-09-01 23:41 ` [RFT PATCH 13/15] drm/imx/ipuv3: Call drm_atomic_helper_shutdown() at shutdown/unbind time Douglas Anderson
@ 2023-09-01 23:41 ` Douglas Anderson
  2023-09-04  7:30   ` Maxime Ripard
  2023-09-01 23:41 ` [RFT PATCH 15/15] drm/renesas/shmobile: " Douglas Anderson
  14 siblings, 1 reply; 59+ messages in thread
From: Douglas Anderson @ 2023-09-01 23:41 UTC (permalink / raw)
  To: dri-devel, Maxime Ripard
  Cc: Douglas Anderson, Xinhui.Pan, airlied, alexander.deucher,
	amd-gfx, christian.koenig, daniel, linux-kernel

Based on grepping through the source code, this driver appears to be
missing a call to drm_atomic_helper_shutdown(), or in this case the
non-atomic equivalent drm_helper_force_disable_all(), at system
shutdown time and at driver remove time. This is important because
drm_helper_force_disable_all() will cause panels to get disabled
cleanly which may be important for their power sequencing. Future
changes will remove any custom powering off in individual panel
drivers so the DRM drivers need to start getting this right.

The fact that we should call drm_atomic_helper_shutdown(), or in this
case the non-atomic equivalent drm_helper_force_disable_all(), in the
case of OS shutdown/restart comes straight out of the kernel doc
"driver instance overview" in drm_drv.c.

NOTE: in order to get things inserted in the right place, I had to
replace the old/deprecated drm_put_dev() function with the equivalent
new calls.

Suggested-by: Maxime Ripard <mripard@kernel.org>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
---
I honestly have no idea if I got this patch right. The shutdown()
function already had some special case logic for PPC, Loongson, and
VMs and I don't 100% for sure know how this interacts with
those. Everything here is just compile tested.

 drivers/gpu/drm/radeon/radeon_drv.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/radeon/radeon_drv.c b/drivers/gpu/drm/radeon/radeon_drv.c
index 39cdede460b5..67995ea24852 100644
--- a/drivers/gpu/drm/radeon/radeon_drv.c
+++ b/drivers/gpu/drm/radeon/radeon_drv.c
@@ -38,6 +38,7 @@
 #include <linux/pci.h>
 
 #include <drm/drm_aperture.h>
+#include <drm/drm_crtc_helper.h>
 #include <drm/drm_drv.h>
 #include <drm/drm_file.h>
 #include <drm/drm_gem.h>
@@ -357,7 +358,9 @@ radeon_pci_remove(struct pci_dev *pdev)
 {
 	struct drm_device *dev = pci_get_drvdata(pdev);
 
-	drm_put_dev(dev);
+	drm_dev_unregister(dev);
+	drm_helper_force_disable_all(dev);
+	drm_dev_put(dev);
 }
 
 static void
@@ -368,6 +371,8 @@ radeon_pci_shutdown(struct pci_dev *pdev)
 	 */
 	if (radeon_device_is_virtual())
 		radeon_pci_remove(pdev);
+	else
+		drm_helper_force_disable_all(pci_get_drvdata(pdev));
 
 #if defined(CONFIG_PPC64) || defined(CONFIG_MACH_LOONGSON64)
 	/*
-- 
2.42.0.283.g2d96d420d3-goog


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [RFT PATCH 15/15] drm/renesas/shmobile: Call drm_helper_force_disable_all() at shutdown/remove time
  2023-09-01 23:41 [RFT PATCH 00/15] drm: non-drm-misc drivers call drm_atomic_helper_shutdown() at the right times Douglas Anderson
                   ` (13 preceding siblings ...)
  2023-09-01 23:41 ` [RFT PATCH 14/15] drm/radeon: Call drm_helper_force_disable_all() at shutdown/remove time Douglas Anderson
@ 2023-09-01 23:41 ` Douglas Anderson
  2023-09-04  7:27   ` Geert Uytterhoeven
  14 siblings, 1 reply; 59+ messages in thread
From: Douglas Anderson @ 2023-09-01 23:41 UTC (permalink / raw)
  To: dri-devel, Maxime Ripard
  Cc: Douglas Anderson, airlied, biju.das.jz, daniel, geert+renesas,
	kieran.bingham+renesas, laurent.pinchart, linux-kernel,
	linux-renesas-soc, paul, tzimmermann

Based on grepping through the source code, this driver appears to be
missing a call to drm_atomic_helper_shutdown(), or in this case the
non-atomic equivalent drm_helper_force_disable_all(), at system
shutdown time and at driver remove time. This is important because
drm_helper_force_disable_all() will cause panels to get disabled
cleanly which may be important for their power sequencing. Future
changes will remove any custom powering off in individual panel
drivers so the DRM drivers need to start getting this right.

The fact that we should call drm_atomic_helper_shutdown(), or in this
case the non-atomic equivalent drm_helper_force_disable_all(), in the
case of OS shutdown/restart comes straight out of the kernel doc
"driver instance overview" in drm_drv.c.

Suggested-by: Maxime Ripard <mripard@kernel.org>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
---
This commit is only compile-time tested.

 drivers/gpu/drm/renesas/shmobile/shmob_drm_drv.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/gpu/drm/renesas/shmobile/shmob_drm_drv.c b/drivers/gpu/drm/renesas/shmobile/shmob_drm_drv.c
index 30493ce87419..d6dd46c925c5 100644
--- a/drivers/gpu/drm/renesas/shmobile/shmob_drm_drv.c
+++ b/drivers/gpu/drm/renesas/shmobile/shmob_drm_drv.c
@@ -15,6 +15,7 @@
 #include <linux/pm.h>
 #include <linux/slab.h>
 
+#include <drm/drm_crtc_helper.h>
 #include <drm/drm_drv.h>
 #include <drm/drm_fbdev_generic.h>
 #include <drm/drm_gem_dma_helper.h>
@@ -179,12 +180,20 @@ static int shmob_drm_remove(struct platform_device *pdev)
 
 	drm_dev_unregister(ddev);
 	drm_kms_helper_poll_fini(ddev);
+	drm_helper_force_disable_all(ddev);
 	free_irq(sdev->irq, ddev);
 	drm_dev_put(ddev);
 
 	return 0;
 }
 
+static void shmob_drm_shutdown(struct platform_device *pdev)
+{
+	struct shmob_drm_device *sdev = platform_get_drvdata(pdev);
+
+	drm_helper_force_disable_all(sdev->ddev);
+}
+
 static int shmob_drm_probe(struct platform_device *pdev)
 {
 	struct shmob_drm_platform_data *pdata = pdev->dev.platform_data;
@@ -289,6 +298,7 @@ static int shmob_drm_probe(struct platform_device *pdev)
 static struct platform_driver shmob_drm_platform_driver = {
 	.probe		= shmob_drm_probe,
 	.remove		= shmob_drm_remove,
+	.shutdown	= shmob_drm_shutdown,
 	.driver		= {
 		.name	= "shmob-drm",
 		.pm	= pm_sleep_ptr(&shmob_drm_pm_ops),
-- 
2.42.0.283.g2d96d420d3-goog


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* Re: [RFT PATCH 01/15] drm/armada: Call drm_atomic_helper_shutdown() at shutdown time
  2023-09-01 23:41 ` [RFT PATCH 01/15] drm/armada: Call drm_atomic_helper_shutdown() at shutdown time Douglas Anderson
@ 2023-09-03 15:53   ` Russell King (Oracle)
  2023-09-04  7:36     ` Maxime Ripard
  2023-09-05 14:23     ` Doug Anderson
  2023-09-04  7:53   ` Maxime Ripard
  2023-09-04  7:56   ` Russell King (Oracle)
  2 siblings, 2 replies; 59+ messages in thread
From: Russell King (Oracle) @ 2023-09-03 15:53 UTC (permalink / raw)
  To: Douglas Anderson; +Cc: dri-devel, Maxime Ripard, airlied, daniel, linux-kernel

On Fri, Sep 01, 2023 at 04:41:12PM -0700, Douglas Anderson wrote:
> Based on grepping through the source code this driver appears to be
> missing a call to drm_atomic_helper_shutdown() at system shutdown
> time. Among other things, this means that if a panel is in use that it
> won't be cleanly powered off at system shutdown time.
> 
> The fact that we should call drm_atomic_helper_shutdown() in the case
> of OS shutdown/restart comes straight out of the kernel doc "driver
> instance overview" in drm_drv.c.
> 
> This driver was fairly easy to update. The drm_device is stored in the
> drvdata so we just have to make sure the drvdata is NULL whenever the
> device is not bound.

... and there I think you have a misunderstanding of the driver model.
Please have a look at device_unbind_cleanup() which will be called if
probe fails, or when the device is removed (in other words, when it is
not bound to a driver.)

Also, devices which aren't bound to a driver won't have their shutdown
method called (because there is no driver currently bound to that
device.) So, ->probe must have completed successfully, and ->remove
must not have been called for that device.

So, I think that all these dev_set_drvdata(dev, NULL) that you're
adding are just asking for a kernel janitor to come along later and
remove them because they serve no purpose... so best not introduce
them in the first place.

Thanks.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFT PATCH 04/15] drm/kmb: Call drm_atomic_helper_shutdown() at shutdown time
  2023-09-01 23:41 ` [RFT PATCH 04/15] drm/kmb: " Douglas Anderson
@ 2023-09-04  7:26   ` Maxime Ripard
  0 siblings, 0 replies; 59+ messages in thread
From: Maxime Ripard @ 2023-09-04  7:26 UTC (permalink / raw)
  To: Douglas Anderson
  Cc: airlied, anitha.chrisanthus, daniel, dri-devel, edmund.j.dea,
	linux-kernel, Maxime Ripard

On Fri, 1 Sep 2023 16:41:15 -0700, Douglas Anderson wrote:
> Based on grepping through the source code this driver appears to be
> missing a call to drm_atomic_helper_shutdown() at system shutdown
> time. Among other things, this means that if a panel is in use that it
> won't be cleanly powered off at system shutdown time.
> 
> 
> [ ... ]

Reviewed-by: Maxime Ripard <mripard@kernel.org>

Thanks!
Maxime

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFT PATCH 05/15] drm/mediatek: Call drm_atomic_helper_shutdown() at shutdown time
  2023-09-01 23:41 ` [RFT PATCH 05/15] drm/mediatek: " Douglas Anderson
@ 2023-09-04  7:27   ` Maxime Ripard
  2023-09-08 11:51   ` Fei Shao
  1 sibling, 0 replies; 59+ messages in thread
From: Maxime Ripard @ 2023-09-04  7:27 UTC (permalink / raw)
  To: Douglas Anderson
  Cc: airlied, angelogioacchino.delregno, chunkuang.hu, daniel,
	dri-devel, linux-arm-kernel, linux-kernel, linux-mediatek,
	matthias.bgg, p.zabel, Maxime Ripard

On Fri, 1 Sep 2023 16:41:16 -0700, Douglas Anderson wrote:
> Based on grepping through the source code this driver appears to be
> missing a call to drm_atomic_helper_shutdown() at system shutdown
> time. Among other things, this means that if a panel is in use that it
> won't be cleanly powered off at system shutdown time.
> 
> 
> [ ... ]

Reviewed-by: Maxime Ripard <mripard@kernel.org>

Thanks!
Maxime

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFT PATCH 06/15] drm/nouveau: Call drm_atomic_helper_shutdown() or equiv at shutdown time
  2023-09-01 23:41 ` [RFT PATCH 06/15] drm/nouveau: Call drm_atomic_helper_shutdown() or equiv " Douglas Anderson
@ 2023-09-04  7:27   ` Maxime Ripard
  0 siblings, 0 replies; 59+ messages in thread
From: Maxime Ripard @ 2023-09-04  7:27 UTC (permalink / raw)
  To: Douglas Anderson
  Cc: airlied, bskeggs, daniel, dri-devel, kherbst, linux-kernel,
	lyude, nouveau, Maxime Ripard

On Fri, 1 Sep 2023 16:41:17 -0700, Douglas Anderson wrote:
> Based on grepping through the source code this driver appears to be
> missing a call to drm_atomic_helper_shutdown() (or
> drm_helper_force_disable_all() if not using atomic) at system shutdown
> time. Among other things, this means that if a panel is in use that it
> won't be cleanly powered off at system shutdown time.
> 
> [ ... ]

Reviewed-by: Maxime Ripard <mripard@kernel.org>

Thanks!
Maxime

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFT PATCH 15/15] drm/renesas/shmobile: Call drm_helper_force_disable_all() at shutdown/remove time
  2023-09-01 23:41 ` [RFT PATCH 15/15] drm/renesas/shmobile: " Douglas Anderson
@ 2023-09-04  7:27   ` Geert Uytterhoeven
  2023-09-05 21:10     ` Doug Anderson
  0 siblings, 1 reply; 59+ messages in thread
From: Geert Uytterhoeven @ 2023-09-04  7:27 UTC (permalink / raw)
  To: Douglas Anderson
  Cc: dri-devel, Maxime Ripard, airlied, biju.das.jz, daniel,
	kieran.bingham+renesas, laurent.pinchart, linux-kernel,
	linux-renesas-soc, paul, tzimmermann

Hi Douglas,

On Sat, Sep 2, 2023 at 1:42 AM Douglas Anderson <dianders@chromium.org> wrote:
> Based on grepping through the source code, this driver appears to be
> missing a call to drm_atomic_helper_shutdown(), or in this case the
> non-atomic equivalent drm_helper_force_disable_all(), at system
> shutdown time and at driver remove time. This is important because
> drm_helper_force_disable_all() will cause panels to get disabled
> cleanly which may be important for their power sequencing. Future
> changes will remove any custom powering off in individual panel
> drivers so the DRM drivers need to start getting this right.
>
> The fact that we should call drm_atomic_helper_shutdown(), or in this
> case the non-atomic equivalent drm_helper_force_disable_all(), in the
> case of OS shutdown/restart comes straight out of the kernel doc
> "driver instance overview" in drm_drv.c.
>
> Suggested-by: Maxime Ripard <mripard@kernel.org>
> Signed-off-by: Douglas Anderson <dianders@chromium.org>

Thanks for your patch!

> --- a/drivers/gpu/drm/renesas/shmobile/shmob_drm_drv.c
> +++ b/drivers/gpu/drm/renesas/shmobile/shmob_drm_drv.c
> @@ -179,12 +180,20 @@ static int shmob_drm_remove(struct platform_device *pdev)
>
>         drm_dev_unregister(ddev);
>         drm_kms_helper_poll_fini(ddev);
> +       drm_helper_force_disable_all(ddev);

After "[PATCH v3 36/41] drm: renesas: shmobile: Atomic conversion part
1"[1], this function will already call drm_atomic_helper_shutdown()...

>         free_irq(sdev->irq, ddev);
>         drm_dev_put(ddev);
>
>         return 0;
>  }
>
> +static void shmob_drm_shutdown(struct platform_device *pdev)
> +{
> +       struct shmob_drm_device *sdev = platform_get_drvdata(pdev);
> +
> +       drm_helper_force_disable_all(sdev->ddev);

... and this should be replaced by a call to drm_atomic_helper_shutdown().

[1] https://lore.kernel.org/dri-devel/fd7a6702490bd431f314d6591551bb39e77e3304.1692178020.git.geert+renesas@glider.be

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFT PATCH 07/15] drm/tegra: Call drm_atomic_helper_shutdown() at shutdown time
  2023-09-01 23:41 ` [RFT PATCH 07/15] drm/tegra: Call drm_atomic_helper_shutdown() " Douglas Anderson
@ 2023-09-04  7:28   ` Maxime Ripard
  0 siblings, 0 replies; 59+ messages in thread
From: Maxime Ripard @ 2023-09-04  7:28 UTC (permalink / raw)
  To: Douglas Anderson
  Cc: airlied, daniel, dri-devel, jonathanh, linux-kernel, linux-tegra,
	mperttunen, thierry.reding, Maxime Ripard

On Fri, 1 Sep 2023 16:41:18 -0700, Douglas Anderson wrote:
> Based on grepping through the source code this driver appears to be
> missing a call to drm_atomic_helper_shutdown() at system shutdown
> time. Among other things, this means that if a panel is in use that it
> won't be cleanly powered off at system shutdown time.
> 
> 
> [ ... ]

Reviewed-by: Maxime Ripard <mripard@kernel.org>

Thanks!
Maxime

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFT PATCH 08/15] drm/arcpgu: Call drm_atomic_helper_shutdown() at shutdown time
  2023-09-01 23:41 ` [RFT PATCH 08/15] drm/arcpgu: " Douglas Anderson
@ 2023-09-04  7:28   ` Maxime Ripard
  0 siblings, 0 replies; 59+ messages in thread
From: Maxime Ripard @ 2023-09-04  7:28 UTC (permalink / raw)
  To: Douglas Anderson
  Cc: abrodkin, airlied, daniel, dri-devel, linux-kernel, Maxime Ripard

On Fri, 1 Sep 2023 16:41:19 -0700, Douglas Anderson wrote:
> Based on grepping through the source code this driver appears to be
> missing a call to drm_atomic_helper_shutdown() at system shutdown
> time. Among other things, this means that if a panel is in use that it
> won't be cleanly powered off at system shutdown time.
> 
> 
> [ ... ]

Reviewed-by: Maxime Ripard <mripard@kernel.org>

Thanks!
Maxime

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFT PATCH 12/15] drm/gma500: Call drm_helper_force_disable_all() at shutdown/remove time
  2023-09-01 23:41 ` [RFT PATCH 12/15] drm/gma500: Call drm_helper_force_disable_all() at shutdown/remove time Douglas Anderson
@ 2023-09-04  7:29   ` Maxime Ripard
  0 siblings, 0 replies; 59+ messages in thread
From: Maxime Ripard @ 2023-09-04  7:29 UTC (permalink / raw)
  To: Douglas Anderson
  Cc: airlied, daniel, dri-devel, linux-kernel, patrik.r.jakobsson,
	Maxime Ripard

On Fri, 1 Sep 2023 16:41:23 -0700, Douglas Anderson wrote:
> Based on grepping through the source code, this driver appears to be
> missing a call to drm_atomic_helper_shutdown(), or in this case the
> non-atomic equivalent drm_helper_force_disable_all(), at system
> shutdown time and at driver remove time. This is important because
> drm_helper_force_disable_all() will cause panels to get disabled
> 
> [ ... ]

Reviewed-by: Maxime Ripard <mripard@kernel.org>

Thanks!
Maxime

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFT PATCH 14/15] drm/radeon: Call drm_helper_force_disable_all() at shutdown/remove time
  2023-09-01 23:41 ` [RFT PATCH 14/15] drm/radeon: Call drm_helper_force_disable_all() at shutdown/remove time Douglas Anderson
@ 2023-09-04  7:30   ` Maxime Ripard
  0 siblings, 0 replies; 59+ messages in thread
From: Maxime Ripard @ 2023-09-04  7:30 UTC (permalink / raw)
  To: Douglas Anderson
  Cc: Xinhui.Pan, airlied, alexander.deucher, amd-gfx,
	christian.koenig, daniel, dri-devel, linux-kernel, Maxime Ripard

On Fri, 1 Sep 2023 16:41:25 -0700, Douglas Anderson wrote:
> Based on grepping through the source code, this driver appears to be
> missing a call to drm_atomic_helper_shutdown(), or in this case the
> non-atomic equivalent drm_helper_force_disable_all(), at system
> shutdown time and at driver remove time. This is important because
> drm_helper_force_disable_all() will cause panels to get disabled
> 
> [ ... ]

Reviewed-by: Maxime Ripard <mripard@kernel.org>

Thanks!
Maxime

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFT PATCH 01/15] drm/armada: Call drm_atomic_helper_shutdown() at shutdown time
  2023-09-03 15:53   ` Russell King (Oracle)
@ 2023-09-04  7:36     ` Maxime Ripard
  2023-09-04  7:55       ` Russell King (Oracle)
  2023-09-05 14:23     ` Doug Anderson
  1 sibling, 1 reply; 59+ messages in thread
From: Maxime Ripard @ 2023-09-04  7:36 UTC (permalink / raw)
  To: Russell King (Oracle)
  Cc: Douglas Anderson, dri-devel, airlied, daniel, linux-kernel

On Sun, Sep 03, 2023 at 04:53:42PM +0100, Russell King (Oracle) wrote:
> On Fri, Sep 01, 2023 at 04:41:12PM -0700, Douglas Anderson wrote:
> > Based on grepping through the source code this driver appears to be
> > missing a call to drm_atomic_helper_shutdown() at system shutdown
> > time. Among other things, this means that if a panel is in use that it
> > won't be cleanly powered off at system shutdown time.
> > 
> > The fact that we should call drm_atomic_helper_shutdown() in the case
> > of OS shutdown/restart comes straight out of the kernel doc "driver
> > instance overview" in drm_drv.c.
> > 
> > This driver was fairly easy to update. The drm_device is stored in the
> > drvdata so we just have to make sure the drvdata is NULL whenever the
> > device is not bound.
> 
> ... and there I think you have a misunderstanding of the driver model.
> Please have a look at device_unbind_cleanup() which will be called if
> probe fails, or when the device is removed (in other words, when it is
> not bound to a driver.)
> 
> Also, devices which aren't bound to a driver won't have their shutdown
> method called (because there is no driver currently bound to that
> device.) So, ->probe must have completed successfully, and ->remove
> must not have been called for that device.
> 
> So, I think that all these dev_set_drvdata(dev, NULL) that you're
> adding are just asking for a kernel janitor to come along later and
> remove them because they serve no purpose... so best not introduce
> them in the first place.

What would that hypothetical janitor clean up exactly? Code making sure
that there's no dangling pointer? Doesn't look very wise to me.

Maxime

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFT PATCH 03/15] drm/ingenic: Call drm_atomic_helper_shutdown() at shutdown time
  2023-09-01 23:41 ` [RFT PATCH 03/15] drm/ingenic: " Douglas Anderson
@ 2023-09-04  7:36   ` Maxime Ripard
  2023-09-04  9:15   ` Paul Cercueil
  2023-09-13 18:21   ` Doug Anderson
  2 siblings, 0 replies; 59+ messages in thread
From: Maxime Ripard @ 2023-09-04  7:36 UTC (permalink / raw)
  To: Douglas Anderson
  Cc: airlied, daniel, dri-devel, linux-kernel, linux-mips, paul,
	Maxime Ripard

On Fri, 1 Sep 2023 16:41:14 -0700, Douglas Anderson wrote:
> Based on grepping through the source code this driver appears to be
> missing a call to drm_atomic_helper_shutdown() at system shutdown
> time. Among other things, this means that if a panel is in use that it
> won't be cleanly powered off at system shutdown time.
> 
> 
> [ ... ]

Reviewed-by: Maxime Ripard <mripard@kernel.org>

Thanks!
Maxime

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFT PATCH 02/15] drm/imx/dcss: Call drm_atomic_helper_shutdown() at shutdown time
  2023-09-01 23:41 ` [RFT PATCH 02/15] drm/imx/dcss: " Douglas Anderson
@ 2023-09-04  7:36   ` Maxime Ripard
  0 siblings, 0 replies; 59+ messages in thread
From: Maxime Ripard @ 2023-09-04  7:36 UTC (permalink / raw)
  To: Douglas Anderson
  Cc: airlied, daniel, dri-devel, festevam, kernel, l.stach,
	laurentiu.palcu, linux-arm-kernel, linux-imx, linux-kernel,
	shawnguo, Maxime Ripard, Sascha Hauer

On Fri, 1 Sep 2023 16:41:13 -0700, Douglas Anderson wrote:
> Based on grepping through the source code this driver appears to be
> missing a call to drm_atomic_helper_shutdown() at system shutdown
> time. Among other things, this means that if a panel is in use that it
> won't be cleanly powered off at system shutdown time.
> 
> 
> [ ... ]

Reviewed-by: Maxime Ripard <mripard@kernel.org>

Thanks!
Maxime

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFT PATCH 01/15] drm/armada: Call drm_atomic_helper_shutdown() at shutdown time
  2023-09-01 23:41 ` [RFT PATCH 01/15] drm/armada: Call drm_atomic_helper_shutdown() at shutdown time Douglas Anderson
  2023-09-03 15:53   ` Russell King (Oracle)
@ 2023-09-04  7:53   ` Maxime Ripard
  2023-09-04  7:56   ` Russell King (Oracle)
  2 siblings, 0 replies; 59+ messages in thread
From: Maxime Ripard @ 2023-09-04  7:53 UTC (permalink / raw)
  To: Douglas Anderson
  Cc: airlied, daniel, dri-devel, linux-kernel, linux, Maxime Ripard

On Fri, 1 Sep 2023 16:41:12 -0700, Douglas Anderson wrote:
> Based on grepping through the source code this driver appears to be
> missing a call to drm_atomic_helper_shutdown() at system shutdown
> time. Among other things, this means that if a panel is in use that it
> won't be cleanly powered off at system shutdown time.
> 
> 
> [ ... ]

Reviewed-by: Maxime Ripard <mripard@kernel.org>

Thanks!
Maxime

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFT PATCH 10/15] drm/sprd: Call drm_atomic_helper_shutdown() at remove time
  2023-09-01 23:41 ` [RFT PATCH 10/15] drm/sprd: Call drm_atomic_helper_shutdown() at remove time Douglas Anderson
@ 2023-09-04  7:54   ` Maxime Ripard
  0 siblings, 0 replies; 59+ messages in thread
From: Maxime Ripard @ 2023-09-04  7:54 UTC (permalink / raw)
  To: Douglas Anderson
  Cc: airlied, daniel, dri-devel, linux-kernel, liviu.dudau, orsonzhai,
	rfoss, robh, sam, steven.price, tzimmermann, zhang.lyra,
	Baolin Wang, Maxime Ripard

On Fri, 1 Sep 2023 16:41:21 -0700, Douglas Anderson wrote:
> Based on grepping through the source code, this driver appears to be
> missing a call to drm_atomic_helper_shutdown() at remove time. Let's
> add it.
> 
> The fact that we should call drm_atomic_helper_shutdown() in the case
> 
> [ ... ]

Reviewed-by: Maxime Ripard <mripard@kernel.org>

Thanks!
Maxime

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFT PATCH 11/15] drm/exynos: Call drm_atomic_helper_shutdown() at shutdown/unbind time
  2023-09-01 23:41 ` [RFT PATCH 11/15] drm/exynos: Call drm_atomic_helper_shutdown() at shutdown/unbind time Douglas Anderson
@ 2023-09-04  7:54   ` Maxime Ripard
  0 siblings, 0 replies; 59+ messages in thread
From: Maxime Ripard @ 2023-09-04  7:54 UTC (permalink / raw)
  To: Douglas Anderson
  Cc: airlied, alim.akhtar, daniel, dri-devel, inki.dae,
	krzysztof.kozlowski, kyungmin.park, linux-arm-kernel,
	linux-kernel, linux-samsung-soc, sw0312.kim, Maxime Ripard

On Fri, 1 Sep 2023 16:41:22 -0700, Douglas Anderson wrote:
> Based on grepping through the source code this driver appears to be
> missing a call to drm_atomic_helper_shutdown() at system shutdown time
> and at driver unbind time. Among other things, this means that if a
> panel is in use that it won't be cleanly powered off at system
> shutdown time.
> 
> [ ... ]

Reviewed-by: Maxime Ripard <mripard@kernel.org>

Thanks!
Maxime

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFT PATCH 13/15] drm/imx/ipuv3: Call drm_atomic_helper_shutdown() at shutdown/unbind time
  2023-09-01 23:41 ` [RFT PATCH 13/15] drm/imx/ipuv3: Call drm_atomic_helper_shutdown() at shutdown/unbind time Douglas Anderson
@ 2023-09-04  7:55   ` Maxime Ripard
  2023-09-04  8:30   ` Philipp Zabel
  2023-09-13 18:21   ` Doug Anderson
  2 siblings, 0 replies; 59+ messages in thread
From: Maxime Ripard @ 2023-09-04  7:55 UTC (permalink / raw)
  To: Douglas Anderson
  Cc: airlied, daniel, dri-devel, festevam, kernel, linux-arm-kernel,
	linux-imx, linux-kernel, p.zabel, shawnguo, Maxime Ripard,
	Sascha Hauer

On Fri, 1 Sep 2023 16:41:24 -0700, Douglas Anderson wrote:
> Based on grepping through the source code this driver appears to be
> missing a call to drm_atomic_helper_shutdown() at system shutdown time
> and at driver unbind time. Among other things, this means that if a
> panel is in use that it won't be cleanly powered off at system
> shutdown time.
> 
> [ ... ]

Reviewed-by: Maxime Ripard <mripard@kernel.org>

Thanks!
Maxime

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFT PATCH 01/15] drm/armada: Call drm_atomic_helper_shutdown() at shutdown time
  2023-09-04  7:36     ` Maxime Ripard
@ 2023-09-04  7:55       ` Russell King (Oracle)
  2023-09-04 15:18         ` Maxime Ripard
  0 siblings, 1 reply; 59+ messages in thread
From: Russell King (Oracle) @ 2023-09-04  7:55 UTC (permalink / raw)
  To: Maxime Ripard, Greg Kroah-Hartman
  Cc: Douglas Anderson, dri-devel, airlied, daniel, linux-kernel

On Mon, Sep 04, 2023 at 09:36:10AM +0200, Maxime Ripard wrote:
> On Sun, Sep 03, 2023 at 04:53:42PM +0100, Russell King (Oracle) wrote:
> > On Fri, Sep 01, 2023 at 04:41:12PM -0700, Douglas Anderson wrote:
> > > Based on grepping through the source code this driver appears to be
> > > missing a call to drm_atomic_helper_shutdown() at system shutdown
> > > time. Among other things, this means that if a panel is in use that it
> > > won't be cleanly powered off at system shutdown time.
> > > 
> > > The fact that we should call drm_atomic_helper_shutdown() in the case
> > > of OS shutdown/restart comes straight out of the kernel doc "driver
> > > instance overview" in drm_drv.c.
> > > 
> > > This driver was fairly easy to update. The drm_device is stored in the
> > > drvdata so we just have to make sure the drvdata is NULL whenever the
> > > device is not bound.
> > 
> > ... and there I think you have a misunderstanding of the driver model.
> > Please have a look at device_unbind_cleanup() which will be called if
> > probe fails, or when the device is removed (in other words, when it is
> > not bound to a driver.)
> > 
> > Also, devices which aren't bound to a driver won't have their shutdown
> > method called (because there is no driver currently bound to that
> > device.) So, ->probe must have completed successfully, and ->remove
> > must not have been called for that device.
> > 
> > So, I think that all these dev_set_drvdata(dev, NULL) that you're
> > adding are just asking for a kernel janitor to come along later and
> > remove them because they serve no purpose... so best not introduce
> > them in the first place.
> 
> What would that hypothetical janitor clean up exactly? Code making sure
> that there's no dangling pointer? Doesn't look very wise to me.

How can there be a dangling pointer when the driver core removes the
pointer for the driver in these cases?

If I were to accept the argument that the driver core _might_ "forget"
to NULL out this pointer, then that argument by extension also means
that no one should make use of the devm_* stuff either, just in case
the driver core forgets to release that stuff. Best have every driver
manually release those resources. Nope, that doesn't work, because
driver authors tend to write buggy cleanup paths.

There are janitors that go around removing this stuff, and janitorial
patches tend to keep coming even if one says nak at any particular
point... and yes, janitors do go around removing this unnecessary
junk from drivers.

You will find examples of this removal in commit
ec3b1ce2ca34, 5cdade2d77dd, c7cb175bb1ef

Moreover:

7efb10383181

is also removing unnecessary driver code. Testing for driver data being
NULL when we know that a _successful_ probe has happened (because
->remove won't be called unless we were successful) and the probe
always sets drvdata non-NULL is also useless code.

If ultimately you don't trust the driver model to do what it's been
doing for more than the last decade, then I wonder whether you should
be trusting the kernel to manage your hardware!

Anyway, I've said no to this patch for a driver that I'm marked as
maintainer for, so at least do not make the changes I am objecting to
to that driver. Thanks.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFT PATCH 01/15] drm/armada: Call drm_atomic_helper_shutdown() at shutdown time
  2023-09-01 23:41 ` [RFT PATCH 01/15] drm/armada: Call drm_atomic_helper_shutdown() at shutdown time Douglas Anderson
  2023-09-03 15:53   ` Russell King (Oracle)
  2023-09-04  7:53   ` Maxime Ripard
@ 2023-09-04  7:56   ` Russell King (Oracle)
  2 siblings, 0 replies; 59+ messages in thread
From: Russell King (Oracle) @ 2023-09-04  7:56 UTC (permalink / raw)
  To: Douglas Anderson; +Cc: dri-devel, Maxime Ripard, airlied, daniel, linux-kernel

On Fri, Sep 01, 2023 at 04:41:12PM -0700, Douglas Anderson wrote:
> Based on grepping through the source code this driver appears to be
> missing a call to drm_atomic_helper_shutdown() at system shutdown
> time. Among other things, this means that if a panel is in use that it
> won't be cleanly powered off at system shutdown time.
> 
> The fact that we should call drm_atomic_helper_shutdown() in the case
> of OS shutdown/restart comes straight out of the kernel doc "driver
> instance overview" in drm_drv.c.
> 
> This driver was fairly easy to update. The drm_device is stored in the
> drvdata so we just have to make sure the drvdata is NULL whenever the
> device is not bound. To make things simpler,
> drm_atomic_helper_shutdown() has been modified to consider a NULL
> drm_device as a noop in the patch ("drm/atomic-helper:
> drm_atomic_helper_shutdown(NULL) should be a noop").
> 
> Suggested-by: Maxime Ripard <mripard@kernel.org>
> Signed-off-by: Douglas Anderson <dianders@chromium.org>

As the listed driver author for the reasons I've stated - NAK.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFT PATCH 13/15] drm/imx/ipuv3: Call drm_atomic_helper_shutdown() at shutdown/unbind time
  2023-09-01 23:41 ` [RFT PATCH 13/15] drm/imx/ipuv3: Call drm_atomic_helper_shutdown() at shutdown/unbind time Douglas Anderson
  2023-09-04  7:55   ` Maxime Ripard
@ 2023-09-04  8:30   ` Philipp Zabel
  2023-09-05 20:29     ` Doug Anderson
  2023-09-13 18:21   ` Doug Anderson
  2 siblings, 1 reply; 59+ messages in thread
From: Philipp Zabel @ 2023-09-04  8:30 UTC (permalink / raw)
  To: Douglas Anderson, dri-devel, Maxime Ripard
  Cc: Sascha Hauer, airlied, daniel, festevam, kernel,
	linux-arm-kernel, linux-imx, linux-kernel, shawnguo

On Fr, 2023-09-01 at 16:41 -0700, Douglas Anderson wrote:
> Based on grepping through the source code this driver appears to be
> missing a call to drm_atomic_helper_shutdown() at system shutdown time
> and at driver unbind time. Among other things, this means that if a
> panel is in use that it won't be cleanly powered off at system
> shutdown time.
> 
> The fact that we should call drm_atomic_helper_shutdown() in the case
> of OS shutdown/restart and at driver remove (or unbind) time comes
> straight out of the kernel doc "driver instance overview" in
> drm_drv.c.
> 
> A few notes about this fix:
> - When adding drm_atomic_helper_shutdown() to the unbind path, I added
>   it after drm_kms_helper_poll_fini() since that's when other drivers
>   seemed to have it.
> - Technically with a previous patch, ("drm/atomic-helper:
>   drm_atomic_helper_shutdown(NULL) should be a noop"), we don't
>   actually need to check to see if our "drm" pointer is NULL before
>   calling drm_atomic_helper_shutdown(). We'll leave the "if" test in,
>   though, so that this patch can land without any dependencies. It
>   could potentially be removed later.
> - This patch also makes sure to set the drvdata to NULL in the case of
>   bind errors to make sure that shutdown can't access freed data.
> 
> Suggested-by: Maxime Ripard <mripard@kernel.org>
> Signed-off-by: Douglas Anderson <dianders@chromium.org>

Thank you,
Tested-by: Philipp Zabel <p.zabel@pengutronix.de>

regards
Philipp

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFT PATCH 03/15] drm/ingenic: Call drm_atomic_helper_shutdown() at shutdown time
  2023-09-01 23:41 ` [RFT PATCH 03/15] drm/ingenic: " Douglas Anderson
  2023-09-04  7:36   ` Maxime Ripard
@ 2023-09-04  9:15   ` Paul Cercueil
  2023-09-05 20:16     ` Doug Anderson
  2023-09-13 18:21   ` Doug Anderson
  2 siblings, 1 reply; 59+ messages in thread
From: Paul Cercueil @ 2023-09-04  9:15 UTC (permalink / raw)
  To: Douglas Anderson, dri-devel, Maxime Ripard
  Cc: airlied, daniel, linux-kernel, linux-mips

Hi Douglas,

Le vendredi 01 septembre 2023 à 16:41 -0700, Douglas Anderson a écrit :
> Based on grepping through the source code this driver appears to be
> missing a call to drm_atomic_helper_shutdown() at system shutdown
> time. Among other things, this means that if a panel is in use that
> it
> won't be cleanly powered off at system shutdown time.
> 
> The fact that we should call drm_atomic_helper_shutdown() in the case
> of OS shutdown/restart comes straight out of the kernel doc "driver
> instance overview" in drm_drv.c.
> 
> Since this driver uses the component model and shutdown happens at
> the
> base driver, we communicate whether we have to call
> drm_atomic_helper_shutdown() by seeing if drvdata is non-NULL.
> 
> Suggested-by: Maxime Ripard <mripard@kernel.org>
> Signed-off-by: Douglas Anderson <dianders@chromium.org>

LGTM.
Acked-by: Paul Cercueil <paul@crapouillou.net>

> ---
> This commit is only compile-time tested.
> 
> NOTE: this patch touches a lot more than other similar patches since
> the bind() function is long and we want to make sure that we unset
> the
> drvdata if bind() fails.
> 
> While making this patch, I noticed that the bind() function of this
> driver is using "devm" and thus assumes it doesn't need to do much
> explicit error handling. That's actually a bug. As per kernel docs
> [1]
> "the lifetime of the aggregate driver does not align with any of the
> underlying struct device instances. Therefore devm cannot be used and
> all resources acquired or allocated in this callback must be
> explicitly released in the unbind callback". Fixing that is outside
> the scope of this commit.
> 
> [1] https://docs.kernel.org/driver-api/component.html
> 

Noted, thanks.

Cheers,
-Paul

>  drivers/gpu/drm/ingenic/ingenic-drm-drv.c | 66 +++++++++++++++------
> --
>  1 file changed, 44 insertions(+), 22 deletions(-)
> 
> diff --git a/drivers/gpu/drm/ingenic/ingenic-drm-drv.c
> b/drivers/gpu/drm/ingenic/ingenic-drm-drv.c
> index 8dbd4847d3a6..51995a0cd568 100644
> --- a/drivers/gpu/drm/ingenic/ingenic-drm-drv.c
> +++ b/drivers/gpu/drm/ingenic/ingenic-drm-drv.c
> @@ -1130,7 +1130,7 @@ static int ingenic_drm_bind(struct device *dev,
> bool has_components)
>  
>         ret = drmm_mode_config_init(drm);
>         if (ret)
> -               return ret;
> +               goto err_drvdata;
>  
>         drm->mode_config.min_width = 0;
>         drm->mode_config.min_height = 0;
> @@ -1142,7 +1142,8 @@ static int ingenic_drm_bind(struct device *dev,
> bool has_components)
>         base = devm_platform_get_and_ioremap_resource(pdev, 0, &res);
>         if (IS_ERR(base)) {
>                 dev_err(dev, "Failed to get memory resource\n");
> -               return PTR_ERR(base);
> +               ret = PTR_ERR(base);
> +               goto err_drvdata;
>         }
>  
>         regmap_config = ingenic_drm_regmap_config;
> @@ -1151,33 +1152,40 @@ static int ingenic_drm_bind(struct device
> *dev, bool has_components)
>                                           &regmap_config);
>         if (IS_ERR(priv->map)) {
>                 dev_err(dev, "Failed to create regmap\n");
> -               return PTR_ERR(priv->map);
> +               ret = PTR_ERR(priv->map);
> +               goto err_drvdata;
>         }
>  
>         irq = platform_get_irq(pdev, 0);
> -       if (irq < 0)
> -               return irq;
> +       if (irq < 0) {
> +               ret = irq;
> +               goto err_drvdata;
> +       }
>  
>         if (soc_info->needs_dev_clk) {
>                 priv->lcd_clk = devm_clk_get(dev, "lcd");
>                 if (IS_ERR(priv->lcd_clk)) {
>                         dev_err(dev, "Failed to get lcd clock\n");
> -                       return PTR_ERR(priv->lcd_clk);
> +                       ret = PTR_ERR(priv->lcd_clk);
> +                       goto err_drvdata;
>                 }
>         }
>  
>         priv->pix_clk = devm_clk_get(dev, "lcd_pclk");
>         if (IS_ERR(priv->pix_clk)) {
>                 dev_err(dev, "Failed to get pixel clock\n");
> -               return PTR_ERR(priv->pix_clk);
> +               ret = PTR_ERR(priv->pix_clk);
> +               goto err_drvdata;
>         }
>  
>         priv->dma_hwdescs = dmam_alloc_coherent(dev,
>                                                 sizeof(*priv-
> >dma_hwdescs),
>                                                 &priv-
> >dma_hwdescs_phys,
>                                                 GFP_KERNEL);
> -       if (!priv->dma_hwdescs)
> -               return -ENOMEM;
> +       if (!priv->dma_hwdescs) {
> +               ret = -ENOMEM;
> +               goto err_drvdata;
> +       }
>  
>         /* Configure DMA hwdesc for foreground0 plane */
>         ingenic_drm_configure_hwdesc_plane(priv, 0);
> @@ -1199,7 +1207,7 @@ static int ingenic_drm_bind(struct device *dev,
> bool has_components)
>                                        NULL, DRM_PLANE_TYPE_PRIMARY,
> NULL);
>         if (ret) {
>                 dev_err(dev, "Failed to register plane: %i\n", ret);
> -               return ret;
> +               goto err_drvdata;
>         }
>  
>         if (soc_info->map_noncoherent)
> @@ -1211,7 +1219,7 @@ static int ingenic_drm_bind(struct device *dev,
> bool has_components)
>                                         NULL,
> &ingenic_drm_crtc_funcs, NULL);
>         if (ret) {
>                 dev_err(dev, "Failed to init CRTC: %i\n", ret);
> -               return ret;
> +               goto err_drvdata;
>         }
>  
>         drm_crtc_enable_color_mgmt(&priv->crtc, 0, false,
> @@ -1230,7 +1238,7 @@ static int ingenic_drm_bind(struct device *dev,
> bool has_components)
>                 if (ret) {
>                         dev_err(dev, "Failed to register overlay
> plane: %i\n",
>                                 ret);
> -                       return ret;
> +                       goto err_drvdata;
>                 }
>  
>                 if (soc_info->map_noncoherent)
> @@ -1241,17 +1249,18 @@ static int ingenic_drm_bind(struct device
> *dev, bool has_components)
>                         if (ret) {
>                                 if (ret != -EPROBE_DEFER)
>                                         dev_err(dev, "Failed to bind
> components: %i\n", ret);
> -                               return ret;
> +                               goto err_drvdata;
>                         }
>  
>                         ret = devm_add_action_or_reset(dev,
> ingenic_drm_unbind_all, priv);
>                         if (ret)
> -                               return ret;
> +                               goto err_drvdata;
>  
>                         priv->ipu_plane = drm_plane_from_index(drm,
> 2);
>                         if (!priv->ipu_plane) {
>                                 dev_err(dev, "Failed to retrieve IPU
> plane\n");
> -                               return -EINVAL;
> +                               ret = -EINVAL;
> +                               goto err_drvdata;
>                         }
>                 }
>         }
> @@ -1263,7 +1272,7 @@ static int ingenic_drm_bind(struct device *dev,
> bool has_components)
>                                 break; /* we're done */
>                         if (ret != -EPROBE_DEFER)
>                                 dev_err(dev, "Failed to get bridge
> handle\n");
> -                       return ret;
> +                       goto err_drvdata;
>                 }
>  
>                 if (panel)
> @@ -1275,7 +1284,7 @@ static int ingenic_drm_bind(struct device *dev,
> bool has_components)
>                 if (IS_ERR(ib)) {
>                         ret = PTR_ERR(ib);
>                         dev_err(dev, "Failed to init encoder: %d\n",
> ret);
> -                       return ret;
> +                       goto err_drvdata;
>                 }
>  
>                 encoder = &ib->encoder;
> @@ -1290,13 +1299,14 @@ static int ingenic_drm_bind(struct device
> *dev, bool has_components)
>                                         DRM_BRIDGE_ATTACH_NO_CONNECTO
> R);
>                 if (ret) {
>                         dev_err(dev, "Unable to attach bridge\n");
> -                       return ret;
> +                       goto err_drvdata;
>                 }
>  
>                 connector = drm_bridge_connector_init(drm, encoder);
>                 if (IS_ERR(connector)) {
>                         dev_err(dev, "Unable to init connector\n");
> -                       return PTR_ERR(connector);
> +                       ret = PTR_ERR(connector);
> +                       goto err_drvdata;
>                 }
>  
>                 drm_connector_attach_encoder(connector, encoder);
> @@ -1313,13 +1323,13 @@ static int ingenic_drm_bind(struct device
> *dev, bool has_components)
>         ret = devm_request_irq(dev, irq, ingenic_drm_irq_handler, 0,
> drm->driver->name, drm);
>         if (ret) {
>                 dev_err(dev, "Unable to install IRQ handler\n");
> -               return ret;
> +               goto err_drvdata;
>         }
>  
>         ret = drm_vblank_init(drm, 1);
>         if (ret) {
>                 dev_err(dev, "Failed calling drm_vblank_init()\n");
> -               return ret;
> +               goto err_drvdata;
>         }
>  
>         drm_mode_config_reset(drm);
> @@ -1327,7 +1337,7 @@ static int ingenic_drm_bind(struct device *dev,
> bool has_components)
>         ret = clk_prepare_enable(priv->pix_clk);
>         if (ret) {
>                 dev_err(dev, "Unable to start pixel clock\n");
> -               return ret;
> +               goto err_drvdata;
>         }
>  
>         if (priv->lcd_clk) {
> @@ -1402,6 +1412,8 @@ static int ingenic_drm_bind(struct device *dev,
> bool has_components)
>                 clk_disable_unprepare(priv->lcd_clk);
>  err_pixclk_disable:
>         clk_disable_unprepare(priv->pix_clk);
> +err_drvdata:
> +       platform_set_drvdata(pdev, NULL);
>         return ret;
>  }
>  
> @@ -1422,6 +1434,7 @@ static void ingenic_drm_unbind(struct device
> *dev)
>  
>         drm_dev_unregister(&priv->drm);
>         drm_atomic_helper_shutdown(&priv->drm);
> +       dev_set_drvdata(dev, NULL);
>  }
>  
>  static const struct component_master_ops ingenic_master_ops = {
> @@ -1461,6 +1474,14 @@ static int ingenic_drm_remove(struct
> platform_device *pdev)
>         return 0;
>  }
>  
> +static void ingenic_drm_shutdown(struct platform_device *pdev)
> +{
> +       struct ingenic_drm *priv = platform_get_drvdata(pdev);
> +
> +       if (priv)
> +               drm_atomic_helper_shutdown(&priv->drm);
> +}
> +
>  static int ingenic_drm_suspend(struct device *dev)
>  {
>         struct ingenic_drm *priv = dev_get_drvdata(dev);
> @@ -1612,6 +1633,7 @@ static struct platform_driver
> ingenic_drm_driver = {
>         },
>         .probe = ingenic_drm_probe,
>         .remove = ingenic_drm_remove,
> +       .shutdown = ingenic_drm_shutdown,
>  };
>  
>  static int ingenic_drm_init(void)


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFT PATCH 01/15] drm/armada: Call drm_atomic_helper_shutdown() at shutdown time
  2023-09-04  7:55       ` Russell King (Oracle)
@ 2023-09-04 15:18         ` Maxime Ripard
  0 siblings, 0 replies; 59+ messages in thread
From: Maxime Ripard @ 2023-09-04 15:18 UTC (permalink / raw)
  To: Russell King (Oracle)
  Cc: Greg Kroah-Hartman, Douglas Anderson, dri-devel, airlied, daniel,
	linux-kernel

On Mon, Sep 04, 2023 at 08:55:43AM +0100, Russell King (Oracle) wrote:
> On Mon, Sep 04, 2023 at 09:36:10AM +0200, Maxime Ripard wrote:
> > On Sun, Sep 03, 2023 at 04:53:42PM +0100, Russell King (Oracle) wrote:
> > > On Fri, Sep 01, 2023 at 04:41:12PM -0700, Douglas Anderson wrote:
> > > > Based on grepping through the source code this driver appears to be
> > > > missing a call to drm_atomic_helper_shutdown() at system shutdown
> > > > time. Among other things, this means that if a panel is in use that it
> > > > won't be cleanly powered off at system shutdown time.
> > > > 
> > > > The fact that we should call drm_atomic_helper_shutdown() in the case
> > > > of OS shutdown/restart comes straight out of the kernel doc "driver
> > > > instance overview" in drm_drv.c.
> > > > 
> > > > This driver was fairly easy to update. The drm_device is stored in the
> > > > drvdata so we just have to make sure the drvdata is NULL whenever the
> > > > device is not bound.
> > > 
> > > ... and there I think you have a misunderstanding of the driver model.
> > > Please have a look at device_unbind_cleanup() which will be called if
> > > probe fails, or when the device is removed (in other words, when it is
> > > not bound to a driver.)
> > > 
> > > Also, devices which aren't bound to a driver won't have their shutdown
> > > method called (because there is no driver currently bound to that
> > > device.) So, ->probe must have completed successfully, and ->remove
> > > must not have been called for that device.
> > > 
> > > So, I think that all these dev_set_drvdata(dev, NULL) that you're
> > > adding are just asking for a kernel janitor to come along later and
> > > remove them because they serve no purpose... so best not introduce
> > > them in the first place.
> > 
> > What would that hypothetical janitor clean up exactly? Code making sure
> > that there's no dangling pointer? Doesn't look very wise to me.
> 
> How can there be a dangling pointer when the driver core removes the
> pointer for the driver in these cases?

You can still access that pointer from remove after the call to
component_del(). It's unlikely, sure, but the issue is still there.

> If I were to accept the argument that the driver core _might_ "forget"
> to NULL out this pointer, then that argument by extension also means
> that no one should make use of the devm_* stuff either, just in case
> the driver core forgets to release that stuff. Best have every driver
> manually release those resources.

It's funny that you go for that argument, because using devm is known to
be a design issue in KMS (and the rest of the kernel to some extent), so
yeah, I very much agree with you there.

> Nope, that doesn't work, because driver authors tend to write buggy
> cleanup paths.

And using devm is just as buggy for a KMS driver. We even discourage its
use in the documentation.

But really, I'm not sure what your point is there. Does devm lead to
bugs? Sure. Is it less buggy that manually unrolling an exit path by
hand? Probably. I seriously don't get the relation to some code clearing
a pointer after it's been freed though.

> There are janitors that go around removing this stuff, and janitorial
> patches tend to keep coming even if one says nak at any particular
> point... and yes, janitors do go around removing this unnecessary
> junk from drivers.
> 
> You will find examples of this removal in commit
> ec3b1ce2ca34, 5cdade2d77dd, c7cb175bb1ef

Other people doing it doesn't make it right (or wrong). And really, I'm
not arguing that it's right, I'm saying that it's not wrong.

It's probably being over-cautious, especially in that driver, but it's
not wrong.

> Moreover:
> 
> 7efb10383181
> 
> is also removing unnecessary driver code. Testing for driver data being
> NULL when we know that a _successful_ probe has happened (because
> ->remove won't be called unless we were successful) and the probe
> always sets drvdata non-NULL is also useless code.

Again, I fail to see what the relationship is there.

> If ultimately you don't trust the driver model to do what it's been
> doing for more than the last decade, then I wonder whether you should
> be trusting the kernel to manage your hardware!

It's not the kernel driver model that I don't trust, it's C's (lack of)
memory safety and management. And like you said yourself, "driver
authors tend to write buggy"

> Anyway, I've said no to this patch for a driver that I'm marked as
> maintainer for, so at least do not make the changes I am objecting to
> to that driver. Thanks.

You're entitled to that opinion indeed.

Maxime

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFT PATCH 01/15] drm/armada: Call drm_atomic_helper_shutdown() at shutdown time
  2023-09-03 15:53   ` Russell King (Oracle)
  2023-09-04  7:36     ` Maxime Ripard
@ 2023-09-05 14:23     ` Doug Anderson
  2023-09-13 15:34       ` Doug Anderson
  1 sibling, 1 reply; 59+ messages in thread
From: Doug Anderson @ 2023-09-05 14:23 UTC (permalink / raw)
  To: Russell King (Oracle)
  Cc: dri-devel, Maxime Ripard, airlied, daniel, linux-kernel

Hi,

On Sun, Sep 3, 2023 at 8:53 AM Russell King (Oracle)
<linux@armlinux.org.uk> wrote:
>
> On Fri, Sep 01, 2023 at 04:41:12PM -0700, Douglas Anderson wrote:
> > Based on grepping through the source code this driver appears to be
> > missing a call to drm_atomic_helper_shutdown() at system shutdown
> > time. Among other things, this means that if a panel is in use that it
> > won't be cleanly powered off at system shutdown time.
> >
> > The fact that we should call drm_atomic_helper_shutdown() in the case
> > of OS shutdown/restart comes straight out of the kernel doc "driver
> > instance overview" in drm_drv.c.
> >
> > This driver was fairly easy to update. The drm_device is stored in the
> > drvdata so we just have to make sure the drvdata is NULL whenever the
> > device is not bound.
>
> ... and there I think you have a misunderstanding of the driver model.
> Please have a look at device_unbind_cleanup() which will be called if
> probe fails, or when the device is removed (in other words, when it is
> not bound to a driver.)

...and there I think you didn't read this patch closely enough and
perhaps that you have a misunderstanding of the component model.
Please have a look at the difference between armada_drm_unbind() and
armada_drm_remove() and also check which of those two functions is
being modified by my patch. Were this patch adding a call to
"dev_set_drvdata(dev, NULL)" in armada_drm_remove() then your NAK
would be justified. However, I am not aware of anything in the
component unbind path nor in the failure case of component bind that
would NULL the drvdata.

Kindly look at the patch a second time with this in mind.

-Doug

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFT PATCH 03/15] drm/ingenic: Call drm_atomic_helper_shutdown() at shutdown time
  2023-09-04  9:15   ` Paul Cercueil
@ 2023-09-05 20:16     ` Doug Anderson
  2023-09-06  8:39       ` Maxime Ripard
  2023-09-13 16:25       ` Doug Anderson
  0 siblings, 2 replies; 59+ messages in thread
From: Doug Anderson @ 2023-09-05 20:16 UTC (permalink / raw)
  To: Paul Cercueil
  Cc: dri-devel, Maxime Ripard, airlied, daniel, linux-kernel, linux-mips

Paul,

On Mon, Sep 4, 2023 at 2:15 AM Paul Cercueil <paul@crapouillou.net> wrote:
>
> Hi Douglas,
>
> Le vendredi 01 septembre 2023 à 16:41 -0700, Douglas Anderson a écrit :
> > Based on grepping through the source code this driver appears to be
> > missing a call to drm_atomic_helper_shutdown() at system shutdown
> > time. Among other things, this means that if a panel is in use that
> > it
> > won't be cleanly powered off at system shutdown time.
> >
> > The fact that we should call drm_atomic_helper_shutdown() in the case
> > of OS shutdown/restart comes straight out of the kernel doc "driver
> > instance overview" in drm_drv.c.
> >
> > Since this driver uses the component model and shutdown happens at
> > the
> > base driver, we communicate whether we have to call
> > drm_atomic_helper_shutdown() by seeing if drvdata is non-NULL.
> >
> > Suggested-by: Maxime Ripard <mripard@kernel.org>
> > Signed-off-by: Douglas Anderson <dianders@chromium.org>
>
> LGTM.
> Acked-by: Paul Cercueil <paul@crapouillou.net>

Thanks for the Ack! Would you expect this patch to land through
"drm-misc", or do you expect it to go through some other tree?
Running:

./scripts/get_maintainer.pl --scm -f drivers/gpu/drm/ingenic/ingenic-drm-drv.c

...does not show that this driver normally goes through drm-misc, but
it also doesn't show that it goes through any other tree so maybe it's
just an artifact of the way it's tagged in the MAINTAINERS file? If
it's fine for this to go through drm-misc, I'll probably land it (with
your Ack and Maxime's Review) sooner rather than later just to make
this patch series less unwieldy.


> > ---
> > This commit is only compile-time tested.
> >
> > NOTE: this patch touches a lot more than other similar patches since
> > the bind() function is long and we want to make sure that we unset
> > the
> > drvdata if bind() fails.
> >
> > While making this patch, I noticed that the bind() function of this
> > driver is using "devm" and thus assumes it doesn't need to do much
> > explicit error handling. That's actually a bug. As per kernel docs
> > [1]
> > "the lifetime of the aggregate driver does not align with any of the
> > underlying struct device instances. Therefore devm cannot be used and
> > all resources acquired or allocated in this callback must be
> > explicitly released in the unbind callback". Fixing that is outside
> > the scope of this commit.
> >
> > [1] https://docs.kernel.org/driver-api/component.html
> >
>
> Noted, thanks.

FWIW, I think that at least a few other DRM drivers handle this by
doing some of their resource allocation / acquiring in the probe()
function and then only doing things in the bind() that absolutely need
to be in the bind. ;-)


-Doug

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFT PATCH 13/15] drm/imx/ipuv3: Call drm_atomic_helper_shutdown() at shutdown/unbind time
  2023-09-04  8:30   ` Philipp Zabel
@ 2023-09-05 20:29     ` Doug Anderson
  2023-09-06  5:47       ` Philipp Zabel
  0 siblings, 1 reply; 59+ messages in thread
From: Doug Anderson @ 2023-09-05 20:29 UTC (permalink / raw)
  To: Philipp Zabel
  Cc: dri-devel, Maxime Ripard, Sascha Hauer, airlied, daniel,
	festevam, kernel, linux-arm-kernel, linux-imx, linux-kernel,
	shawnguo

Hi,

On Mon, Sep 4, 2023 at 1:30 AM Philipp Zabel <p.zabel@pengutronix.de> wrote:
>
> On Fr, 2023-09-01 at 16:41 -0700, Douglas Anderson wrote:
> > Based on grepping through the source code this driver appears to be
> > missing a call to drm_atomic_helper_shutdown() at system shutdown time
> > and at driver unbind time. Among other things, this means that if a
> > panel is in use that it won't be cleanly powered off at system
> > shutdown time.
> >
> > The fact that we should call drm_atomic_helper_shutdown() in the case
> > of OS shutdown/restart and at driver remove (or unbind) time comes
> > straight out of the kernel doc "driver instance overview" in
> > drm_drv.c.
> >
> > A few notes about this fix:
> > - When adding drm_atomic_helper_shutdown() to the unbind path, I added
> >   it after drm_kms_helper_poll_fini() since that's when other drivers
> >   seemed to have it.
> > - Technically with a previous patch, ("drm/atomic-helper:
> >   drm_atomic_helper_shutdown(NULL) should be a noop"), we don't
> >   actually need to check to see if our "drm" pointer is NULL before
> >   calling drm_atomic_helper_shutdown(). We'll leave the "if" test in,
> >   though, so that this patch can land without any dependencies. It
> >   could potentially be removed later.
> > - This patch also makes sure to set the drvdata to NULL in the case of
> >   bind errors to make sure that shutdown can't access freed data.
> >
> > Suggested-by: Maxime Ripard <mripard@kernel.org>
> > Signed-off-by: Douglas Anderson <dianders@chromium.org>
>
> Thank you,
> Tested-by: Philipp Zabel <p.zabel@pengutronix.de>

Thanks! I notice that:

./scripts/get_maintainer.pl --scm -f drivers/gpu/drm/imx/ipuv3/imx-drm-core.c

Doesn't say drm-misc but also when I look at the MAINTAINERS file and
find the section for "DRM DRIVERS FOR FREESCALE IMX" it doesn't
explicitly list a different git tree. I guess the "shawnguo" git tree
listed by get_maintainer.pl is just from regex matching?

Would you expect this to go through drm-misc? If so, I'll probably
land it sooner rather than later. I can also post up a patch making it
obvious that "DRM DRIVERS FOR FREESCALE IMX" goes through drm-misc if
you don't object.

Thanks!

-Doug

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFT PATCH 15/15] drm/renesas/shmobile: Call drm_helper_force_disable_all() at shutdown/remove time
  2023-09-04  7:27   ` Geert Uytterhoeven
@ 2023-09-05 21:10     ` Doug Anderson
  0 siblings, 0 replies; 59+ messages in thread
From: Doug Anderson @ 2023-09-05 21:10 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: dri-devel, Maxime Ripard, airlied, biju.das.jz, daniel,
	kieran.bingham+renesas, laurent.pinchart, linux-kernel,
	linux-renesas-soc, paul, tzimmermann

Hi,

On Mon, Sep 4, 2023 at 12:28 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
>
> Hi Douglas,
>
> On Sat, Sep 2, 2023 at 1:42 AM Douglas Anderson <dianders@chromium.org> wrote:
> > Based on grepping through the source code, this driver appears to be
> > missing a call to drm_atomic_helper_shutdown(), or in this case the
> > non-atomic equivalent drm_helper_force_disable_all(), at system
> > shutdown time and at driver remove time. This is important because
> > drm_helper_force_disable_all() will cause panels to get disabled
> > cleanly which may be important for their power sequencing. Future
> > changes will remove any custom powering off in individual panel
> > drivers so the DRM drivers need to start getting this right.
> >
> > The fact that we should call drm_atomic_helper_shutdown(), or in this
> > case the non-atomic equivalent drm_helper_force_disable_all(), in the
> > case of OS shutdown/restart comes straight out of the kernel doc
> > "driver instance overview" in drm_drv.c.
> >
> > Suggested-by: Maxime Ripard <mripard@kernel.org>
> > Signed-off-by: Douglas Anderson <dianders@chromium.org>
>
> Thanks for your patch!
>
> > --- a/drivers/gpu/drm/renesas/shmobile/shmob_drm_drv.c
> > +++ b/drivers/gpu/drm/renesas/shmobile/shmob_drm_drv.c
> > @@ -179,12 +180,20 @@ static int shmob_drm_remove(struct platform_device *pdev)
> >
> >         drm_dev_unregister(ddev);
> >         drm_kms_helper_poll_fini(ddev);
> > +       drm_helper_force_disable_all(ddev);
>
> After "[PATCH v3 36/41] drm: renesas: shmobile: Atomic conversion part
> 1"[1], this function will already call drm_atomic_helper_shutdown()...
>
> >         free_irq(sdev->irq, ddev);
> >         drm_dev_put(ddev);
> >
> >         return 0;
> >  }
> >
> > +static void shmob_drm_shutdown(struct platform_device *pdev)
> > +{
> > +       struct shmob_drm_device *sdev = platform_get_drvdata(pdev);
> > +
> > +       drm_helper_force_disable_all(sdev->ddev);
>
> ... and this should be replaced by a call to drm_atomic_helper_shutdown().
>
> [1] https://lore.kernel.org/dri-devel/fd7a6702490bd431f314d6591551bb39e77e3304.1692178020.git.geert+renesas@glider.be

Ah, thanks! I will put this patch on hold and check back in a few
weeks to see how things are looking. If you wanted to fold it into
your series I certainly wouldn't object to it, but if not that's fine
too. ;-)

-Doug

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFT PATCH 13/15] drm/imx/ipuv3: Call drm_atomic_helper_shutdown() at shutdown/unbind time
  2023-09-05 20:29     ` Doug Anderson
@ 2023-09-06  5:47       ` Philipp Zabel
  2023-09-06 14:30         ` Doug Anderson
  0 siblings, 1 reply; 59+ messages in thread
From: Philipp Zabel @ 2023-09-06  5:47 UTC (permalink / raw)
  To: Doug Anderson
  Cc: dri-devel, Maxime Ripard, Sascha Hauer, airlied, daniel,
	festevam, kernel, linux-arm-kernel, linux-imx, linux-kernel,
	shawnguo

Hi,

On Di, 2023-09-05 at 13:29 -0700, Doug Anderson wrote:
> Hi,
> 
> On Mon, Sep 4, 2023 at 1:30 AM Philipp Zabel <p.zabel@pengutronix.de> wrote:
> > 
> > On Fr, 2023-09-01 at 16:41 -0700, Douglas Anderson wrote:
> > > Based on grepping through the source code this driver appears to be
> > > missing a call to drm_atomic_helper_shutdown() at system shutdown time
> > > and at driver unbind time. Among other things, this means that if a
> > > panel is in use that it won't be cleanly powered off at system
> > > shutdown time.
> > > 
> > > The fact that we should call drm_atomic_helper_shutdown() in the case
> > > of OS shutdown/restart and at driver remove (or unbind) time comes
> > > straight out of the kernel doc "driver instance overview" in
> > > drm_drv.c.
> > > 
> > > A few notes about this fix:
> > > - When adding drm_atomic_helper_shutdown() to the unbind path, I added
> > >   it after drm_kms_helper_poll_fini() since that's when other drivers
> > >   seemed to have it.
> > > - Technically with a previous patch, ("drm/atomic-helper:
> > >   drm_atomic_helper_shutdown(NULL) should be a noop"), we don't
> > >   actually need to check to see if our "drm" pointer is NULL before
> > >   calling drm_atomic_helper_shutdown(). We'll leave the "if" test in,
> > >   though, so that this patch can land without any dependencies. It
> > >   could potentially be removed later.
> > > - This patch also makes sure to set the drvdata to NULL in the case of
> > >   bind errors to make sure that shutdown can't access freed data.
> > > 
> > > Suggested-by: Maxime Ripard <mripard@kernel.org>
> > > Signed-off-by: Douglas Anderson <dianders@chromium.org>
> > 
> > Thank you,
> > Tested-by: Philipp Zabel <p.zabel@pengutronix.de>
> 
> Thanks! I notice that:
> 
> ./scripts/get_maintainer.pl --scm -f drivers/gpu/drm/imx/ipuv3/imx-drm-core.c
> 
> Doesn't say drm-misc but also when I look at the MAINTAINERS file and
> find the section for "DRM DRIVERS FOR FREESCALE IMX"

That should probably say "IMX5/6" nowadays. There are a lot more i.MX
that do not use IPUv3 than those that do.

> it doesn't explicitly list a different git tree.

I used to send pull requests from git.pengutronix.de/git/pza/linux,
same as for the reset controller framework. I might still have to do
that for changes in drivers/gpu/ipu-v3 that need coordination between
drm and v4l2, but usually pure drm/imx/ipuv3 changes are pushed to drm-
misc.

> I guess the "shawnguo" git tree listed by get_maintainer.pl is just
> from regex matching?

The "N: imx" pattern in "ARM/FREESCALE IMX / MXC ARM ARCHITECTURE", I
think.

> Would you expect this to go through drm-misc? If so, I'll probably
> land it sooner rather than later. I can also post up a patch making it
> obvious that "DRM DRIVERS FOR FREESCALE IMX" goes through drm-misc if
> you don't object.

Yes, both would be great.

regards
Philipp

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFT PATCH 03/15] drm/ingenic: Call drm_atomic_helper_shutdown() at shutdown time
  2023-09-05 20:16     ` Doug Anderson
@ 2023-09-06  8:39       ` Maxime Ripard
  2023-09-13 16:23         ` Doug Anderson
  2023-09-13 16:25       ` Doug Anderson
  1 sibling, 1 reply; 59+ messages in thread
From: Maxime Ripard @ 2023-09-06  8:39 UTC (permalink / raw)
  To: Doug Anderson
  Cc: Paul Cercueil, dri-devel, airlied, daniel, linux-kernel, linux-mips

[-- Attachment #1: Type: text/plain, Size: 1584 bytes --]

On Tue, Sep 05, 2023 at 01:16:08PM -0700, Doug Anderson wrote:
> > > ---
> > > This commit is only compile-time tested.
> > >
> > > NOTE: this patch touches a lot more than other similar patches since
> > > the bind() function is long and we want to make sure that we unset
> > > the
> > > drvdata if bind() fails.
> > >
> > > While making this patch, I noticed that the bind() function of this
> > > driver is using "devm" and thus assumes it doesn't need to do much
> > > explicit error handling. That's actually a bug. As per kernel docs
> > > [1]
> > > "the lifetime of the aggregate driver does not align with any of the
> > > underlying struct device instances. Therefore devm cannot be used and
> > > all resources acquired or allocated in this callback must be
> > > explicitly released in the unbind callback". Fixing that is outside
> > > the scope of this commit.
> > >
> > > [1] https://docs.kernel.org/driver-api/component.html
> > >
> >
> > Noted, thanks.
> 
> FWIW, I think that at least a few other DRM drivers handle this by
> doing some of their resource allocation / acquiring in the probe()
> function and then only doing things in the bind() that absolutely need
> to be in the bind. ;-)

That doesn't change much. The fundamental issue is that the DRM device
sticks around until the last application that has an open fd to it
closes it.

So it doesn't have any relationship with the unbind/remove timing, and
for all we know it can be there indefinitely, while the application
continues to interact with the driver.

Maxime

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFT PATCH 13/15] drm/imx/ipuv3: Call drm_atomic_helper_shutdown() at shutdown/unbind time
  2023-09-06  5:47       ` Philipp Zabel
@ 2023-09-06 14:30         ` Doug Anderson
  0 siblings, 0 replies; 59+ messages in thread
From: Doug Anderson @ 2023-09-06 14:30 UTC (permalink / raw)
  To: Philipp Zabel
  Cc: dri-devel, Maxime Ripard, Sascha Hauer, airlied, daniel,
	festevam, kernel, linux-arm-kernel, linux-imx, linux-kernel,
	shawnguo

Hi,

On Tue, Sep 5, 2023 at 10:47 PM Philipp Zabel <p.zabel@pengutronix.de> wrote:
>
> Hi,
>
> On Di, 2023-09-05 at 13:29 -0700, Doug Anderson wrote:
> > Hi,
> >
> > On Mon, Sep 4, 2023 at 1:30 AM Philipp Zabel <p.zabel@pengutronix.de> wrote:
> > >
> > > On Fr, 2023-09-01 at 16:41 -0700, Douglas Anderson wrote:
> > > > Based on grepping through the source code this driver appears to be
> > > > missing a call to drm_atomic_helper_shutdown() at system shutdown time
> > > > and at driver unbind time. Among other things, this means that if a
> > > > panel is in use that it won't be cleanly powered off at system
> > > > shutdown time.
> > > >
> > > > The fact that we should call drm_atomic_helper_shutdown() in the case
> > > > of OS shutdown/restart and at driver remove (or unbind) time comes
> > > > straight out of the kernel doc "driver instance overview" in
> > > > drm_drv.c.
> > > >
> > > > A few notes about this fix:
> > > > - When adding drm_atomic_helper_shutdown() to the unbind path, I added
> > > >   it after drm_kms_helper_poll_fini() since that's when other drivers
> > > >   seemed to have it.
> > > > - Technically with a previous patch, ("drm/atomic-helper:
> > > >   drm_atomic_helper_shutdown(NULL) should be a noop"), we don't
> > > >   actually need to check to see if our "drm" pointer is NULL before
> > > >   calling drm_atomic_helper_shutdown(). We'll leave the "if" test in,
> > > >   though, so that this patch can land without any dependencies. It
> > > >   could potentially be removed later.
> > > > - This patch also makes sure to set the drvdata to NULL in the case of
> > > >   bind errors to make sure that shutdown can't access freed data.
> > > >
> > > > Suggested-by: Maxime Ripard <mripard@kernel.org>
> > > > Signed-off-by: Douglas Anderson <dianders@chromium.org>
> > >
> > > Thank you,
> > > Tested-by: Philipp Zabel <p.zabel@pengutronix.de>
> >
> > Thanks! I notice that:
> >
> > ./scripts/get_maintainer.pl --scm -f drivers/gpu/drm/imx/ipuv3/imx-drm-core.c
> >
> > Doesn't say drm-misc but also when I look at the MAINTAINERS file and
> > find the section for "DRM DRIVERS FOR FREESCALE IMX"
>
> That should probably say "IMX5/6" nowadays. There are a lot more i.MX
> that do not use IPUv3 than those that do.
>
> > it doesn't explicitly list a different git tree.
>
> I used to send pull requests from git.pengutronix.de/git/pza/linux,
> same as for the reset controller framework. I might still have to do
> that for changes in drivers/gpu/ipu-v3 that need coordination between
> drm and v4l2, but usually pure drm/imx/ipuv3 changes are pushed to drm-
> misc.
>
> > I guess the "shawnguo" git tree listed by get_maintainer.pl is just
> > from regex matching?
>
> The "N: imx" pattern in "ARM/FREESCALE IMX / MXC ARM ARCHITECTURE", I
> think.
>
> > Would you expect this to go through drm-misc? If so, I'll probably
> > land it sooner rather than later. I can also post up a patch making it
> > obvious that "DRM DRIVERS FOR FREESCALE IMX" goes through drm-misc if
> > you don't object.
>
> Yes, both would be great.

Maintainers update posted at:

https://lore.kernel.org/r/20230906072803.1.Idef7e77e8961cbeb8625183eec9db0356b2eccd0@changeid

I'll aim to land ${SUBJECT} patch early next week unless there are any
objections.

-Doug

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFT PATCH 05/15] drm/mediatek: Call drm_atomic_helper_shutdown() at shutdown time
  2023-09-01 23:41 ` [RFT PATCH 05/15] drm/mediatek: " Douglas Anderson
  2023-09-04  7:27   ` Maxime Ripard
@ 2023-09-08 11:51   ` Fei Shao
  2023-09-11 16:10     ` Doug Anderson
  1 sibling, 1 reply; 59+ messages in thread
From: Fei Shao @ 2023-09-08 11:51 UTC (permalink / raw)
  To: Douglas Anderson
  Cc: dri-devel, Maxime Ripard, airlied, angelogioacchino.delregno,
	chunkuang.hu, daniel, linux-arm-kernel, linux-kernel,
	linux-mediatek, matthias.bgg, p.zabel

Hi,

On Sat, Sep 2, 2023 at 7:42 AM Douglas Anderson <dianders@chromium.org> wrote:
...<snip>
> @@ -952,6 +960,7 @@ static const struct dev_pm_ops mtk_drm_pm_ops = {
>  static struct platform_driver mtk_drm_platform_driver = {
>         .probe  = mtk_drm_probe,
>         .remove = mtk_drm_remove,

I think this patch, and perhaps some others in this series, will have
a trivial conflict to Uwe's work about the remove callback conversion
e.g. [1], so you might want to rebase the series onto the latest
linux-next.

On the other hand, I tested this patch on MT8195 and MT8188
Chromebooks and I don't see issues during boot / reboot, so

Reviewed-by: Fei Shao <fshao@chromium.org>
Tested-by: Fei Shao <fshao@chromium.org>

[1]: https://git.kernel.org/pub/scm/linux/kernel/git/chunkuang.hu/linux.git/commit/?h=mediatek-drm-next&id=b3af12a0b46888340e024ba8b231605bcf2d0ab3



> +       .shutdown = mtk_drm_shutdown,
>         .driver = {
>                 .name   = "mediatek-drm",
>                 .pm     = &mtk_drm_pm_ops,
> --
> 2.42.0.283.g2d96d420d3-goog
>
>

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFT PATCH 05/15] drm/mediatek: Call drm_atomic_helper_shutdown() at shutdown time
  2023-09-08 11:51   ` Fei Shao
@ 2023-09-11 16:10     ` Doug Anderson
  2023-09-12  6:28       ` Fei Shao
  0 siblings, 1 reply; 59+ messages in thread
From: Doug Anderson @ 2023-09-11 16:10 UTC (permalink / raw)
  To: Fei Shao
  Cc: dri-devel, Maxime Ripard, airlied, angelogioacchino.delregno,
	chunkuang.hu, daniel, linux-arm-kernel, linux-kernel,
	linux-mediatek, matthias.bgg, p.zabel

Hi,

On Fri, Sep 8, 2023 at 4:51 AM Fei Shao <fshao@chromium.org> wrote:
>
> Hi,
>
> On Sat, Sep 2, 2023 at 7:42 AM Douglas Anderson <dianders@chromium.org> wrote:
> ...<snip>
> > @@ -952,6 +960,7 @@ static const struct dev_pm_ops mtk_drm_pm_ops = {
> >  static struct platform_driver mtk_drm_platform_driver = {
> >         .probe  = mtk_drm_probe,
> >         .remove = mtk_drm_remove,
>
> I think this patch, and perhaps some others in this series, will have
> a trivial conflict to Uwe's work about the remove callback conversion
> e.g. [1], so you might want to rebase the series onto the latest
> linux-next.
>
> On the other hand, I tested this patch on MT8195 and MT8188
> Chromebooks and I don't see issues during boot / reboot, so
>
> Reviewed-by: Fei Shao <fshao@chromium.org>
> Tested-by: Fei Shao <fshao@chromium.org>
>
> [1]: https://git.kernel.org/pub/scm/linux/kernel/git/chunkuang.hu/linux.git/commit/?h=mediatek-drm-next&id=b3af12a0b46888340e024ba8b231605bcf2d0ab3

That makes sense. I had based this series on drm-misc-next which
didn't have those, but now that a new -rc1 is out it then
drm-misc-next should rebase shortly. I'll make sure that the next
version includes Uwe's changes as much as possible.

That being said, I also wouldn't object if the maintainer of this DRM
driver wanted to resolve conflicts themselves and land the patch
without me needing to resend. The conflict is trivial, there are no
dependencies and no reason to land the series all at once, so landing
this patch early would mean less spam for the maintainer since they
would no longer get CCed on future versions. :-P Just sayin...

-Doug

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFT PATCH 05/15] drm/mediatek: Call drm_atomic_helper_shutdown() at shutdown time
  2023-09-11 16:10     ` Doug Anderson
@ 2023-09-12  6:28       ` Fei Shao
  0 siblings, 0 replies; 59+ messages in thread
From: Fei Shao @ 2023-09-12  6:28 UTC (permalink / raw)
  To: Doug Anderson
  Cc: dri-devel, Maxime Ripard, airlied, angelogioacchino.delregno,
	chunkuang.hu, daniel, linux-arm-kernel, linux-kernel,
	linux-mediatek, matthias.bgg, p.zabel

On Tue, Sep 12, 2023 at 12:11 AM Doug Anderson <dianders@chromium.org> wrote:
>
[...]
>
> That makes sense. I had based this series on drm-misc-next which
> didn't have those, but now that a new -rc1 is out it then
> drm-misc-next should rebase shortly. I'll make sure that the next
> version includes Uwe's changes as much as possible.
>
> That being said, I also wouldn't object if the maintainer of this DRM
> driver wanted to resolve conflicts themselves and land the patch
> without me needing to resend. The conflict is trivial, there are no
> dependencies and no reason to land the series all at once, so landing
> this patch early would mean less spam for the maintainer since they
> would no longer get CCed on future versions. :-P Just sayin...

Oh then feel free to ignore that if the changes weren't in the tree at
that time... It was just a gentle reminder. Thanks for clarifying.  :)

Regards,
Fei

>
> -Doug

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFT PATCH 01/15] drm/armada: Call drm_atomic_helper_shutdown() at shutdown time
  2023-09-05 14:23     ` Doug Anderson
@ 2023-09-13 15:34       ` Doug Anderson
  2023-09-20 18:03         ` Doug Anderson
  0 siblings, 1 reply; 59+ messages in thread
From: Doug Anderson @ 2023-09-13 15:34 UTC (permalink / raw)
  To: Russell King (Oracle)
  Cc: dri-devel, Maxime Ripard, airlied, daniel, linux-kernel

Hi,

On Tue, Sep 5, 2023 at 7:23 AM Doug Anderson <dianders@chromium.org> wrote:
>
> Hi,
>
> On Sun, Sep 3, 2023 at 8:53 AM Russell King (Oracle)
> <linux@armlinux.org.uk> wrote:
> >
> > On Fri, Sep 01, 2023 at 04:41:12PM -0700, Douglas Anderson wrote:
> > > Based on grepping through the source code this driver appears to be
> > > missing a call to drm_atomic_helper_shutdown() at system shutdown
> > > time. Among other things, this means that if a panel is in use that it
> > > won't be cleanly powered off at system shutdown time.
> > >
> > > The fact that we should call drm_atomic_helper_shutdown() in the case
> > > of OS shutdown/restart comes straight out of the kernel doc "driver
> > > instance overview" in drm_drv.c.
> > >
> > > This driver was fairly easy to update. The drm_device is stored in the
> > > drvdata so we just have to make sure the drvdata is NULL whenever the
> > > device is not bound.
> >
> > ... and there I think you have a misunderstanding of the driver model.
> > Please have a look at device_unbind_cleanup() which will be called if
> > probe fails, or when the device is removed (in other words, when it is
> > not bound to a driver.)
>
> ...and there I think you didn't read this patch closely enough and
> perhaps that you have a misunderstanding of the component model.
> Please have a look at the difference between armada_drm_unbind() and
> armada_drm_remove() and also check which of those two functions is
> being modified by my patch. Were this patch adding a call to
> "dev_set_drvdata(dev, NULL)" in armada_drm_remove() then your NAK
> would be justified. However, I am not aware of anything in the
> component unbind path nor in the failure case of component bind that
> would NULL the drvdata.
>
> Kindly look at the patch a second time with this in mind.

Since I didn't see any further response, I'll assume that my
explanation here has addressed your concerns. If not, I can re-post it
without NULLing the "drvdata". While I still believe this is unsafe in
some corner cases because of the component model used by this driver,
at least it would get the shutdown call in.

In any case, what's the process for landing patches to this driver?
Running `./scripts/get_maintainer.pl --scm -f
drivers/gpu/drm/armada/armada_drv.c` seems to indicate that this
should go through the git tree:

git git://git.armlinux.org.uk/~rmk/linux-arm.git drm-armada-devel

...but it doesn't appear that recent changes to this driver have gone
that way. Should this land through drm-misc?

-Doug

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFT PATCH 03/15] drm/ingenic: Call drm_atomic_helper_shutdown() at shutdown time
  2023-09-06  8:39       ` Maxime Ripard
@ 2023-09-13 16:23         ` Doug Anderson
  2023-09-14  8:14           ` Maxime Ripard
  0 siblings, 1 reply; 59+ messages in thread
From: Doug Anderson @ 2023-09-13 16:23 UTC (permalink / raw)
  To: Maxime Ripard
  Cc: Paul Cercueil, dri-devel, airlied, daniel, linux-kernel, linux-mips

Hi,

On Wed, Sep 6, 2023 at 1:39 AM Maxime Ripard <mripard@kernel.org> wrote:
>
> On Tue, Sep 05, 2023 at 01:16:08PM -0700, Doug Anderson wrote:
> > > > ---
> > > > This commit is only compile-time tested.
> > > >
> > > > NOTE: this patch touches a lot more than other similar patches since
> > > > the bind() function is long and we want to make sure that we unset
> > > > the
> > > > drvdata if bind() fails.
> > > >
> > > > While making this patch, I noticed that the bind() function of this
> > > > driver is using "devm" and thus assumes it doesn't need to do much
> > > > explicit error handling. That's actually a bug. As per kernel docs
> > > > [1]
> > > > "the lifetime of the aggregate driver does not align with any of the
> > > > underlying struct device instances. Therefore devm cannot be used and
> > > > all resources acquired or allocated in this callback must be
> > > > explicitly released in the unbind callback". Fixing that is outside
> > > > the scope of this commit.
> > > >
> > > > [1] https://docs.kernel.org/driver-api/component.html
> > > >
> > >
> > > Noted, thanks.
> >
> > FWIW, I think that at least a few other DRM drivers handle this by
> > doing some of their resource allocation / acquiring in the probe()
> > function and then only doing things in the bind() that absolutely need
> > to be in the bind. ;-)
>
> That doesn't change much. The fundamental issue is that the DRM device
> sticks around until the last application that has an open fd to it
> closes it.
>
> So it doesn't have any relationship with the unbind/remove timing, and
> for all we know it can be there indefinitely, while the application
> continues to interact with the driver.

I spent some time thinking about similar issues recently and, assuming
my understanding is correct, I'd at least partially disagree.

Specifically, I _think_ the only thing that's truly required to remain
valid until userspace closes the last open "fd" is the memory for the
"struct drm_device" itself, right? My understanding is that this is
similar to how "struct device" works. The memory backing a "struct
device" has to live until the last client releases a reference to it
even if everything else about a device has gone away. So if it was all
working perfectly then if the Linux driver backing the "struct
drm_device" goes away then we'd release resources and NULL out a bunch
of stuff in the "struct drm_device" but still keep the actual "struct
drm_device" around since userspace still has a reference. Pretty much
all userspace calls would fail, but at least they wouldn't crash. Is
that roughly the gist?

Assuming that's correct, then _most_ of the resource acquiring /
memory allocation can still happen in the device probe() routine and
can still use devm as long as we do something to ensure that any
resources released are no longer pointed to by anything in the "struct
drm_device".

To make it concrete, I think we want this (feel free to correct). For
simplicity, I'm assuming a driver that _doesn't_ use the component
framework:

a) Linux driver probe() happens. The "struct drm_device" is allocated
in probe() by devm_drm_dev_alloc(). This takes a reference to the
"struct drm_device". The device also acquires resources / allocates
memory.

b) Userspace acquires a reference to the "struct drm_device". Refcount
is now 2 (one from userspace, one from the Linux driver).

c) The Linux driver unbinds, presumably because userspace requested
it. From earlier I think we decided that we can't (by design) block
unbind. Once unbind happens then we shouldn't try to keep operating
the device and the driver should stop running. As part of the unbind,
the remove() is called and also "devm" resources are deallocated. If
any of the things freed are pointed to by the "struct drm_device" then
the code needs to NULL them out at this time. Also we should make sure
that any callback functions that userspace could cause to be invoked
return errors. Our code could go away at any point here since
userspace could "rmmod" our module.

d) Eventually userspace releases the reference and the "struct
drm_device" memory gets automatically freed because it was allocated
by devm_drm_dev_alloc()


NOTE: potentially some things could be allocated / managed by
drmm_xyz() function, like drmm_kmalloc() and that could simplify some
things. However, it's not a panacea for everything. Specifically once
the Linux driver unbind finishes then the device isn't functional
anymore.



-Doug

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFT PATCH 03/15] drm/ingenic: Call drm_atomic_helper_shutdown() at shutdown time
  2023-09-05 20:16     ` Doug Anderson
  2023-09-06  8:39       ` Maxime Ripard
@ 2023-09-13 16:25       ` Doug Anderson
  2023-09-13 18:00         ` Paul Cercueil
  1 sibling, 1 reply; 59+ messages in thread
From: Doug Anderson @ 2023-09-13 16:25 UTC (permalink / raw)
  To: Paul Cercueil
  Cc: dri-devel, Maxime Ripard, airlied, daniel, linux-kernel, linux-mips

Hi,

On Tue, Sep 5, 2023 at 1:16 PM Doug Anderson <dianders@chromium.org> wrote:
>
> Paul,
>
> On Mon, Sep 4, 2023 at 2:15 AM Paul Cercueil <paul@crapouillou.net> wrote:
> >
> > Hi Douglas,
> >
> > Le vendredi 01 septembre 2023 à 16:41 -0700, Douglas Anderson a écrit :
> > > Based on grepping through the source code this driver appears to be
> > > missing a call to drm_atomic_helper_shutdown() at system shutdown
> > > time. Among other things, this means that if a panel is in use that
> > > it
> > > won't be cleanly powered off at system shutdown time.
> > >
> > > The fact that we should call drm_atomic_helper_shutdown() in the case
> > > of OS shutdown/restart comes straight out of the kernel doc "driver
> > > instance overview" in drm_drv.c.
> > >
> > > Since this driver uses the component model and shutdown happens at
> > > the
> > > base driver, we communicate whether we have to call
> > > drm_atomic_helper_shutdown() by seeing if drvdata is non-NULL.
> > >
> > > Suggested-by: Maxime Ripard <mripard@kernel.org>
> > > Signed-off-by: Douglas Anderson <dianders@chromium.org>
> >
> > LGTM.
> > Acked-by: Paul Cercueil <paul@crapouillou.net>
>
> Thanks for the Ack! Would you expect this patch to land through
> "drm-misc", or do you expect it to go through some other tree?
> Running:
>
> ./scripts/get_maintainer.pl --scm -f drivers/gpu/drm/ingenic/ingenic-drm-drv.c
>
> ...does not show that this driver normally goes through drm-misc, but
> it also doesn't show that it goes through any other tree so maybe it's
> just an artifact of the way it's tagged in the MAINTAINERS file? If
> it's fine for this to go through drm-misc, I'll probably land it (with
> your Ack and Maxime's Review) sooner rather than later just to make
> this patch series less unwieldy.
>
>
> > > ---
> > > This commit is only compile-time tested.
> > >
> > > NOTE: this patch touches a lot more than other similar patches since
> > > the bind() function is long and we want to make sure that we unset
> > > the
> > > drvdata if bind() fails.
> > >
> > > While making this patch, I noticed that the bind() function of this
> > > driver is using "devm" and thus assumes it doesn't need to do much
> > > explicit error handling. That's actually a bug. As per kernel docs
> > > [1]
> > > "the lifetime of the aggregate driver does not align with any of the
> > > underlying struct device instances. Therefore devm cannot be used and
> > > all resources acquired or allocated in this callback must be
> > > explicitly released in the unbind callback". Fixing that is outside
> > > the scope of this commit.
> > >
> > > [1] https://docs.kernel.org/driver-api/component.html
> > >
> >
> > Noted, thanks.
>
> FWIW, I think that at least a few other DRM drivers handle this by
> doing some of their resource allocation / acquiring in the probe()
> function and then only doing things in the bind() that absolutely need
> to be in the bind. ;-)

I've been collecting patches that are ready to land in drm-misc but,
right now, I'm not taking this patch since I didn't get any
clarification of whether it should land through drm-misc or somewhere
else.

-Doug

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFT PATCH 03/15] drm/ingenic: Call drm_atomic_helper_shutdown() at shutdown time
  2023-09-13 16:25       ` Doug Anderson
@ 2023-09-13 18:00         ` Paul Cercueil
  0 siblings, 0 replies; 59+ messages in thread
From: Paul Cercueil @ 2023-09-13 18:00 UTC (permalink / raw)
  To: Doug Anderson
  Cc: dri-devel, Maxime Ripard, airlied, daniel, linux-kernel, linux-mips

Hi Doug,

Le mercredi 13 septembre 2023 à 09:25 -0700, Doug Anderson a écrit :
> Hi,
> 
> On Tue, Sep 5, 2023 at 1:16 PM Doug Anderson <dianders@chromium.org>
> wrote:
> > 
> > Paul,
> > 
> > On Mon, Sep 4, 2023 at 2:15 AM Paul Cercueil <paul@crapouillou.net>
> > wrote:
> > > 
> > > Hi Douglas,
> > > 
> > > Le vendredi 01 septembre 2023 à 16:41 -0700, Douglas Anderson a
> > > écrit :
> > > > Based on grepping through the source code this driver appears
> > > > to be
> > > > missing a call to drm_atomic_helper_shutdown() at system
> > > > shutdown
> > > > time. Among other things, this means that if a panel is in use
> > > > that
> > > > it
> > > > won't be cleanly powered off at system shutdown time.
> > > > 
> > > > The fact that we should call drm_atomic_helper_shutdown() in
> > > > the case
> > > > of OS shutdown/restart comes straight out of the kernel doc
> > > > "driver
> > > > instance overview" in drm_drv.c.
> > > > 
> > > > Since this driver uses the component model and shutdown happens
> > > > at
> > > > the
> > > > base driver, we communicate whether we have to call
> > > > drm_atomic_helper_shutdown() by seeing if drvdata is non-NULL.
> > > > 
> > > > Suggested-by: Maxime Ripard <mripard@kernel.org>
> > > > Signed-off-by: Douglas Anderson <dianders@chromium.org>
> > > 
> > > LGTM.
> > > Acked-by: Paul Cercueil <paul@crapouillou.net>
> > 
> > Thanks for the Ack! Would you expect this patch to land through
> > "drm-misc", or do you expect it to go through some other tree?
> > Running:
> > 
> > ./scripts/get_maintainer.pl --scm -f
> > drivers/gpu/drm/ingenic/ingenic-drm-drv.c
> > 
> > ...does not show that this driver normally goes through drm-misc,
> > but
> > it also doesn't show that it goes through any other tree so maybe
> > it's
> > just an artifact of the way it's tagged in the MAINTAINERS file? If
> > it's fine for this to go through drm-misc, I'll probably land it
> > (with
> > your Ack and Maxime's Review) sooner rather than later just to make
> > this patch series less unwieldy.
> > 
> > 
> > > > ---
> > > > This commit is only compile-time tested.
> > > > 
> > > > NOTE: this patch touches a lot more than other similar patches
> > > > since
> > > > the bind() function is long and we want to make sure that we
> > > > unset
> > > > the
> > > > drvdata if bind() fails.
> > > > 
> > > > While making this patch, I noticed that the bind() function of
> > > > this
> > > > driver is using "devm" and thus assumes it doesn't need to do
> > > > much
> > > > explicit error handling. That's actually a bug. As per kernel
> > > > docs
> > > > [1]
> > > > "the lifetime of the aggregate driver does not align with any
> > > > of the
> > > > underlying struct device instances. Therefore devm cannot be
> > > > used and
> > > > all resources acquired or allocated in this callback must be
> > > > explicitly released in the unbind callback". Fixing that is
> > > > outside
> > > > the scope of this commit.
> > > > 
> > > > [1] https://docs.kernel.org/driver-api/component.html
> > > > 
> > > 
> > > Noted, thanks.
> > 
> > FWIW, I think that at least a few other DRM drivers handle this by
> > doing some of their resource allocation / acquiring in the probe()
> > function and then only doing things in the bind() that absolutely
> > need
> > to be in the bind. ;-)
> 
> I've been collecting patches that are ready to land in drm-misc but,
> right now, I'm not taking this patch since I didn't get any
> clarification of whether it should land through drm-misc or somewhere
> else.

Sorry, you can take it in drm-misc, yes.

Cheers,
-Paul

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFT PATCH 03/15] drm/ingenic: Call drm_atomic_helper_shutdown() at shutdown time
  2023-09-01 23:41 ` [RFT PATCH 03/15] drm/ingenic: " Douglas Anderson
  2023-09-04  7:36   ` Maxime Ripard
  2023-09-04  9:15   ` Paul Cercueil
@ 2023-09-13 18:21   ` Doug Anderson
  2 siblings, 0 replies; 59+ messages in thread
From: Doug Anderson @ 2023-09-13 18:21 UTC (permalink / raw)
  To: dri-devel, Maxime Ripard; +Cc: airlied, daniel, linux-kernel, linux-mips, paul

Hi,

On Fri, Sep 1, 2023 at 4:42 PM Douglas Anderson <dianders@chromium.org> wrote:
>
> Based on grepping through the source code this driver appears to be
> missing a call to drm_atomic_helper_shutdown() at system shutdown
> time. Among other things, this means that if a panel is in use that it
> won't be cleanly powered off at system shutdown time.
>
> The fact that we should call drm_atomic_helper_shutdown() in the case
> of OS shutdown/restart comes straight out of the kernel doc "driver
> instance overview" in drm_drv.c.
>
> Since this driver uses the component model and shutdown happens at the
> base driver, we communicate whether we have to call
> drm_atomic_helper_shutdown() by seeing if drvdata is non-NULL.
>
> Suggested-by: Maxime Ripard <mripard@kernel.org>
> Signed-off-by: Douglas Anderson <dianders@chromium.org>
> ---
> This commit is only compile-time tested.
>
> NOTE: this patch touches a lot more than other similar patches since
> the bind() function is long and we want to make sure that we unset the
> drvdata if bind() fails.
>
> While making this patch, I noticed that the bind() function of this
> driver is using "devm" and thus assumes it doesn't need to do much
> explicit error handling. That's actually a bug. As per kernel docs [1]
> "the lifetime of the aggregate driver does not align with any of the
> underlying struct device instances. Therefore devm cannot be used and
> all resources acquired or allocated in this callback must be
> explicitly released in the unbind callback". Fixing that is outside
> the scope of this commit.
>
> [1] https://docs.kernel.org/driver-api/component.html
>
>  drivers/gpu/drm/ingenic/ingenic-drm-drv.c | 66 +++++++++++++++--------
>  1 file changed, 44 insertions(+), 22 deletions(-)

[ ... cut ... ]

> @@ -1612,6 +1633,7 @@ static struct platform_driver ingenic_drm_driver = {
>         },
>         .probe = ingenic_drm_probe,
>         .remove = ingenic_drm_remove,
> +       .shutdown = ingenic_drm_shutdown,

I resolved the trivial conflict with commit 2b9b0a9fc548
("drm/ingenic: Convert to platform remove callback returning void"),
then pushed to drm-misc-next:

c3ca98396ffa (HEAD -> drm-misc-next) drm/ingenic: Call
drm_atomic_helper_shutdown() at shutdown time

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFT PATCH 13/15] drm/imx/ipuv3: Call drm_atomic_helper_shutdown() at shutdown/unbind time
  2023-09-01 23:41 ` [RFT PATCH 13/15] drm/imx/ipuv3: Call drm_atomic_helper_shutdown() at shutdown/unbind time Douglas Anderson
  2023-09-04  7:55   ` Maxime Ripard
  2023-09-04  8:30   ` Philipp Zabel
@ 2023-09-13 18:21   ` Doug Anderson
  2 siblings, 0 replies; 59+ messages in thread
From: Doug Anderson @ 2023-09-13 18:21 UTC (permalink / raw)
  To: dri-devel, Maxime Ripard
  Cc: Sascha Hauer, airlied, daniel, festevam, kernel,
	linux-arm-kernel, linux-imx, linux-kernel, p.zabel, shawnguo

Hi,

On Fri, Sep 1, 2023 at 4:42 PM Douglas Anderson <dianders@chromium.org> wrote:
>
> Based on grepping through the source code this driver appears to be
> missing a call to drm_atomic_helper_shutdown() at system shutdown time
> and at driver unbind time. Among other things, this means that if a
> panel is in use that it won't be cleanly powered off at system
> shutdown time.
>
> The fact that we should call drm_atomic_helper_shutdown() in the case
> of OS shutdown/restart and at driver remove (or unbind) time comes
> straight out of the kernel doc "driver instance overview" in
> drm_drv.c.
>
> A few notes about this fix:
> - When adding drm_atomic_helper_shutdown() to the unbind path, I added
>   it after drm_kms_helper_poll_fini() since that's when other drivers
>   seemed to have it.
> - Technically with a previous patch, ("drm/atomic-helper:
>   drm_atomic_helper_shutdown(NULL) should be a noop"), we don't
>   actually need to check to see if our "drm" pointer is NULL before
>   calling drm_atomic_helper_shutdown(). We'll leave the "if" test in,
>   though, so that this patch can land without any dependencies. It
>   could potentially be removed later.
> - This patch also makes sure to set the drvdata to NULL in the case of
>   bind errors to make sure that shutdown can't access freed data.
>
> Suggested-by: Maxime Ripard <mripard@kernel.org>
> Signed-off-by: Douglas Anderson <dianders@chromium.org>
> ---
> This commit is only compile-time tested.
>
>  drivers/gpu/drm/imx/ipuv3/imx-drm-core.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
>
> diff --git a/drivers/gpu/drm/imx/ipuv3/imx-drm-core.c b/drivers/gpu/drm/imx/ipuv3/imx-drm-core.c
> index 4a866ac60fff..4c8bc49758a7 100644
> --- a/drivers/gpu/drm/imx/ipuv3/imx-drm-core.c
> +++ b/drivers/gpu/drm/imx/ipuv3/imx-drm-core.c
> @@ -257,6 +257,7 @@ static int imx_drm_bind(struct device *dev)
>         drm_kms_helper_poll_fini(drm);
>         component_unbind_all(drm->dev, drm);
>  err_kms:
> +       dev_set_drvdata(dev, NULL);
>         drm_dev_put(drm);
>
>         return ret;
> @@ -269,6 +270,7 @@ static void imx_drm_unbind(struct device *dev)
>         drm_dev_unregister(drm);
>
>         drm_kms_helper_poll_fini(drm);
> +       drm_atomic_helper_shutdown(drm);
>
>         component_unbind_all(drm->dev, drm);
>
> @@ -298,6 +300,14 @@ static int imx_drm_platform_remove(struct platform_device *pdev)
>         return 0;
>  }
>
> +static void imx_drm_platform_shutdown(struct platform_device *pdev)
> +{
> +       struct drm_device *drm = platform_get_drvdata(pdev);
> +
> +       if (drm)
> +               drm_atomic_helper_shutdown(platform_get_drvdata(pdev));

Since this is now landing through the drm-misc-next tree, I got rid of
the check for NULL when applying and landed this after the patch
("drm/atomic-helper: drm_atomic_helper_shutdown(NULL) should be a
noop").


> @@ -325,6 +335,7 @@ MODULE_DEVICE_TABLE(of, imx_drm_dt_ids);
>  static struct platform_driver imx_drm_pdrv = {
>         .probe          = imx_drm_platform_probe,
>         .remove         = imx_drm_platform_remove,
> +       .shutdown       = imx_drm_platform_shutdown,

There was a trivial context conflict with commit 3095f1122203
("drm/imx/ipuv3: Convert to platform remove callback returning void")
that I resolved while applying.

I've now pushed to drm-misc-next:

02680d71dea8 drm/imx/ipuv3: Call drm_atomic_helper_shutdown() at
shutdown/unbind time

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFT PATCH 03/15] drm/ingenic: Call drm_atomic_helper_shutdown() at shutdown time
  2023-09-13 16:23         ` Doug Anderson
@ 2023-09-14  8:14           ` Maxime Ripard
  2023-09-14 22:29             ` Doug Anderson
  0 siblings, 1 reply; 59+ messages in thread
From: Maxime Ripard @ 2023-09-14  8:14 UTC (permalink / raw)
  To: Doug Anderson
  Cc: Paul Cercueil, dri-devel, airlied, daniel, linux-kernel, linux-mips

[-- Attachment #1: Type: text/plain, Size: 7773 bytes --]

Hi,

On Wed, Sep 13, 2023 at 09:23:29AM -0700, Doug Anderson wrote:
> On Wed, Sep 6, 2023 at 1:39 AM Maxime Ripard <mripard@kernel.org> wrote:
> > On Tue, Sep 05, 2023 at 01:16:08PM -0700, Doug Anderson wrote:
> > > > > This commit is only compile-time tested.
> > > > >
> > > > > NOTE: this patch touches a lot more than other similar patches since
> > > > > the bind() function is long and we want to make sure that we unset
> > > > > the
> > > > > drvdata if bind() fails.
> > > > >
> > > > > While making this patch, I noticed that the bind() function of this
> > > > > driver is using "devm" and thus assumes it doesn't need to do much
> > > > > explicit error handling. That's actually a bug. As per kernel docs
> > > > > [1]
> > > > > "the lifetime of the aggregate driver does not align with any of the
> > > > > underlying struct device instances. Therefore devm cannot be used and
> > > > > all resources acquired or allocated in this callback must be
> > > > > explicitly released in the unbind callback". Fixing that is outside
> > > > > the scope of this commit.
> > > > >
> > > > > [1] https://docs.kernel.org/driver-api/component.html
> > > > >
> > > >
> > > > Noted, thanks.
> > >
> > > FWIW, I think that at least a few other DRM drivers handle this by
> > > doing some of their resource allocation / acquiring in the probe()
> > > function and then only doing things in the bind() that absolutely need
> > > to be in the bind. ;-)
> >
> > That doesn't change much. The fundamental issue is that the DRM device
> > sticks around until the last application that has an open fd to it
> > closes it.
> >
> > So it doesn't have any relationship with the unbind/remove timing, and
> > for all we know it can be there indefinitely, while the application
> > continues to interact with the driver.
> 
> I spent some time thinking about similar issues recently and, assuming
> my understanding is correct, I'd at least partially disagree.
> 
> Specifically, I _think_ the only thing that's truly required to remain
> valid until userspace closes the last open "fd" is the memory for the
> "struct drm_device" itself, right? My understanding is that this is
> similar to how "struct device" works. The memory backing a "struct
> device" has to live until the last client releases a reference to it
> even if everything else about a device has gone away. So if it was all
> working perfectly then if the Linux driver backing the "struct
> drm_device" goes away then we'd release resources and NULL out a bunch
> of stuff in the "struct drm_device" but still keep the actual "struct
> drm_device" around since userspace still has a reference. Pretty much
> all userspace calls would fail, but at least they wouldn't crash. Is
> that roughly the gist?

Yes, but also, no.

In the spirit, you're right. However, there's three things interfering
here:

  - You don't always have a match between device and KMS entity. Display
    pipelines are usually multiple devices working together, and while
    you probably have a 1:1 relationship with bridges and panels (and to
    some extent encoders/connectors), the planes and framebuffers for
    example are a mess :) So, if the device backing the planes is to be
    removed, what are you removing exactly? All of the planes and
    framebuffers? Do you free the buffers allocated by the userspace
    (that it might still use?)?

  - In addition to that, KMS doesn't deal with individual entities being
    hotplugged so neither the subsystem nor the application expect to
    have a connector being removed.

  - ioctl's aren't filtered once the device is starting to get removed
    on most drivers.

So due to 1 and 2, we can't really partially remove components unless
the application is aware of it, and it doesn't expect to. And most
drivers still allow (probably unwillingly though) the application to
call ioctls once the DRM device has lost at least one of its backing
devices.

> Assuming that's correct, then _most_ of the resource acquiring /
> memory allocation can still happen in the device probe() routine and
> can still use devm as long as we do something to ensure that any
> resources released are no longer pointed to by anything in the "struct
> drm_device".
> 
> To make it concrete, I think we want this (feel free to correct). For
> simplicity, I'm assuming a driver that _doesn't_ use the component
> framework:
> 
> a) Linux driver probe() happens. The "struct drm_device" is allocated
> in probe() by devm_drm_dev_alloc(). This takes a reference to the
> "struct drm_device". The device also acquires resources / allocates
> memory.

You need to differentiate resources and allocations there. Resources can
be expected to go away at the same time than the device, so using devm
is fine. Allocations are largely disconnected from the device lifetime,
and using devm leads to UAF.

> b) Userspace acquires a reference to the "struct drm_device". Refcount
> is now 2 (one from userspace, one from the Linux driver).
> 
> c) The Linux driver unbinds, presumably because userspace requested
> it. From earlier I think we decided that we can't (by design) block
> unbind. Once unbind happens then we shouldn't try to keep operating
> the device

That part is correct, because the resources aren't there anymore.

> the driver should stop running.

But for the reasons above, the driver needs to still operate (in a
degraded mode).

> As part of the unbind, the remove() is called and also "devm"
> resources are deallocated. If any of the things freed are pointed to
> by the "struct drm_device" then the code needs to NULL them out at
> this time.

Right, we also need to make sure we don't access any of the resources
that got freed. This is typically done by protecting all the accesses
with drm_dev_enter/drm_dev_exit.

> Also we should make sure that any callback functions that userspace
> could cause to be invoked return errors.

That would prevent any new ioctl from occuring after the device has been
removed, but that doesn't fix the race condition if it's removed while
there's a commit happening. This is further complicated by the fact that
commits can be queued (so you would have multiple submitted already) or
made asynchronous.

> Our code could go away at any point here since userspace could "rmmod"
> our module.

Yeah, we probably have a bug there. Boris also reported something like
that recently where if you add an action with drmm_add_action, and then
remove the module, the function would have been free'd by the time it
executes.

> d) Eventually userspace releases the reference and the "struct
> drm_device" memory gets automatically freed because it was allocated
> by devm_drm_dev_alloc()

It was allocated by devm_drm_dev_alloc() but wasn't by devm_kzalloc().
devm_drm_dev_alloc() will "only" register an action to put back its
reference, but any application that opens the DRM device file will take
a reference as well (through drm_minor_acquire()).

So it's not freed at device_release_all() time, but when the last
reference is given back which could happen much later.

> NOTE: potentially some things could be allocated / managed by
> drmm_xyz() function, like drmm_kmalloc() and that could simplify some
> things.

The general rule is that any allocation needed for the framework
interactions need to be allocated by drmm, any allocation/resource
needed to operate the device need to be allocated by devm.

> However, it's not a panacea for everything. Specifically once
> the Linux driver unbind finishes then the device isn't functional
> anymore.

What's wrong with it then?

Maxime

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFT PATCH 03/15] drm/ingenic: Call drm_atomic_helper_shutdown() at shutdown time
  2023-09-14  8:14           ` Maxime Ripard
@ 2023-09-14 22:29             ` Doug Anderson
  2023-09-19  9:33               ` Maxime Ripard
  0 siblings, 1 reply; 59+ messages in thread
From: Doug Anderson @ 2023-09-14 22:29 UTC (permalink / raw)
  To: Maxime Ripard
  Cc: Paul Cercueil, dri-devel, airlied, daniel, linux-kernel, linux-mips

Hi,

On Thu, Sep 14, 2023 at 1:14 AM Maxime Ripard <mripard@kernel.org> wrote:
>
> > > So it doesn't have any relationship with the unbind/remove timing, and
> > > for all we know it can be there indefinitely, while the application
> > > continues to interact with the driver.
> >
> > I spent some time thinking about similar issues recently and, assuming
> > my understanding is correct, I'd at least partially disagree.
> >
> > Specifically, I _think_ the only thing that's truly required to remain
> > valid until userspace closes the last open "fd" is the memory for the
> > "struct drm_device" itself, right? My understanding is that this is
> > similar to how "struct device" works. The memory backing a "struct
> > device" has to live until the last client releases a reference to it
> > even if everything else about a device has gone away. So if it was all
> > working perfectly then if the Linux driver backing the "struct
> > drm_device" goes away then we'd release resources and NULL out a bunch
> > of stuff in the "struct drm_device" but still keep the actual "struct
> > drm_device" around since userspace still has a reference. Pretty much
> > all userspace calls would fail, but at least they wouldn't crash. Is
> > that roughly the gist?
>
> Yes, but also, no.
>
> In the spirit, you're right. However, there's three things interfering
> here:
>
>   - You don't always have a match between device and KMS entity. Display
>     pipelines are usually multiple devices working together, and while
>     you probably have a 1:1 relationship with bridges and panels (and to
>     some extent encoders/connectors), the planes and framebuffers for
>     example are a mess :) So, if the device backing the planes is to be
>     removed, what are you removing exactly? All of the planes and
>     framebuffers? Do you free the buffers allocated by the userspace
>     (that it might still use?)?
>
>   - In addition to that, KMS doesn't deal with individual entities being
>     hotplugged so neither the subsystem nor the application expect to
>     have a connector being removed.
>
>   - ioctl's aren't filtered once the device is starting to get removed
>     on most drivers.
>
> So due to 1 and 2, we can't really partially remove components unless
> the application is aware of it, and it doesn't expect to. And most
> drivers still allow (probably unwillingly though) the application to
> call ioctls once the DRM device has lost at least one of its backing
> devices.

We "can't", but we "can", right? Userspace can freely unbind a driver.
Unless you want to dig into if the community would allow a driver to
block "unbind" then we have to, at the very least, not crash the
kernel when userspace does this. Ideally we'd have something more
elegant than just "don't crash the kernel", but at least we shouldn't
crash.


> > Assuming that's correct, then _most_ of the resource acquiring /
> > memory allocation can still happen in the device probe() routine and
> > can still use devm as long as we do something to ensure that any
> > resources released are no longer pointed to by anything in the "struct
> > drm_device".
> >
> > To make it concrete, I think we want this (feel free to correct). For
> > simplicity, I'm assuming a driver that _doesn't_ use the component
> > framework:
> >
> > a) Linux driver probe() happens. The "struct drm_device" is allocated
> > in probe() by devm_drm_dev_alloc(). This takes a reference to the
> > "struct drm_device". The device also acquires resources / allocates
> > memory.
>
> You need to differentiate resources and allocations there. Resources can
> be expected to go away at the same time than the device, so using devm
> is fine. Allocations are largely disconnected from the device lifetime,
> and using devm leads to UAF.

Right. I think my original point was looking at "ingenic-drm-drv.c".
Much of the "devm" stuff there is resources and those specific things
could be moved to probe() instead of bind(), right?

For allocations, I think you'd have to look at each allocation. If the
allocation needed to live as long as the "struct drm_device" then devm
is clearly the wrong choice. ...but not every allocation needs to live
that long. Also, even if in the "simple" case allocations need to live
as long as a "struct drm_device", it's possible that there are some
cases where there's only an indirect reference to the memory. In that
case, you could NULL out the indirect reference and then free it.
Obviously someone would need to take care here.


> > b) Userspace acquires a reference to the "struct drm_device". Refcount
> > is now 2 (one from userspace, one from the Linux driver).
> >
> > c) The Linux driver unbinds, presumably because userspace requested
> > it. From earlier I think we decided that we can't (by design) block
> > unbind. Once unbind happens then we shouldn't try to keep operating
> > the device
>
> That part is correct, because the resources aren't there anymore.
>
> > the driver should stop running.
>
> But for the reasons above, the driver needs to still operate (in a
> degraded mode).

So I think here is where the disconnect is from our viewpoints. IMO
when a Linux driver is unbound then it makes no sense to try to
operate the device in "a degraded mode". When a Linux driver is
unbound then it should be releasing all of the resources from the
device (iomaps, IRQs, regulators, GPIOs, etc). That's just what
unbinding a driver is supposed to do.

I understand what you're saying above about display pipelines being
multiple Linux drivers working together and that it doesn't make lots
of sense to just unbind a random Linux device driver in the middle of
things. ...and I don't really have a simple/great answer for how to do
something super elegant if userspace tries to just randomly unbind one
of the many drivers in an active display pipeline.


> > As part of the unbind, the remove() is called and also "devm"
> > resources are deallocated. If any of the things freed are pointed to
> > by the "struct drm_device" then the code needs to NULL them out at
> > this time.
>
> Right, we also need to make sure we don't access any of the resources
> that got freed. This is typically done by protecting all the accesses
> with drm_dev_enter/drm_dev_exit.
>
> > Also we should make sure that any callback functions that userspace
> > could cause to be invoked return errors.
>
> That would prevent any new ioctl from occuring after the device has been
> removed, but that doesn't fix the race condition if it's removed while
> there's a commit happening. This is further complicated by the fact that
> commits can be queued (so you would have multiple submitted already) or
> made asynchronous.

I guess I would have expected that the remove() callback in the device
would prevent new commits from starting and then block waiting until
any in-progress commits were finished? ...kinda like how drivers call
del_timer_sync() in their remove functions...


> > Our code could go away at any point here since userspace could "rmmod"
> > our module.
>
> Yeah, we probably have a bug there. Boris also reported something like
> that recently where if you add an action with drmm_add_action, and then
> remove the module, the function would have been free'd by the time it
> executes.

I'm fairly certain that you can prevent a module from being unloaded
by just grabbing a refcount to it. However, I'm not sure that's the
right solution. If we're trying to run driver code after a driver has
been unbound then, IMO, that's the bug.


> > However, it's not a panacea for everything. Specifically once
> > the Linux driver unbind finishes then the device isn't functional
> > anymore.
>
> What's wrong with it then?

I'm mostly just saying don't just search-and-replace "devm" with
"drmm" in your driver and call it done. You need to think carefully
about which things are which lifetime.

---

Ironically, while digging into this I'm tempted to take back my
original request. Despite the kernel docs I pointed at [1], it
actually looks like it might be fine to use "devm" within a
component's bind() function. In try_to_bring_up_aggregate_device() it
seems like the code is opening up a nested "devres" group specifically
to allow this to work. A little bit of testing that I did with this
shows that, indeed, the nesting seems to be working. Am I missing
something here?

[1] https://docs.kernel.org/driver-api/component.html

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFT PATCH 03/15] drm/ingenic: Call drm_atomic_helper_shutdown() at shutdown time
  2023-09-14 22:29             ` Doug Anderson
@ 2023-09-19  9:33               ` Maxime Ripard
  0 siblings, 0 replies; 59+ messages in thread
From: Maxime Ripard @ 2023-09-19  9:33 UTC (permalink / raw)
  To: Doug Anderson
  Cc: Paul Cercueil, dri-devel, airlied, daniel, linux-kernel, linux-mips

[-- Attachment #1: Type: text/plain, Size: 10689 bytes --]

On Thu, Sep 14, 2023 at 03:29:16PM -0700, Doug Anderson wrote:
> Hi,
> 
> On Thu, Sep 14, 2023 at 1:14 AM Maxime Ripard <mripard@kernel.org> wrote:
> >
> > > > So it doesn't have any relationship with the unbind/remove timing, and
> > > > for all we know it can be there indefinitely, while the application
> > > > continues to interact with the driver.
> > >
> > > I spent some time thinking about similar issues recently and, assuming
> > > my understanding is correct, I'd at least partially disagree.
> > >
> > > Specifically, I _think_ the only thing that's truly required to remain
> > > valid until userspace closes the last open "fd" is the memory for the
> > > "struct drm_device" itself, right? My understanding is that this is
> > > similar to how "struct device" works. The memory backing a "struct
> > > device" has to live until the last client releases a reference to it
> > > even if everything else about a device has gone away. So if it was all
> > > working perfectly then if the Linux driver backing the "struct
> > > drm_device" goes away then we'd release resources and NULL out a bunch
> > > of stuff in the "struct drm_device" but still keep the actual "struct
> > > drm_device" around since userspace still has a reference. Pretty much
> > > all userspace calls would fail, but at least they wouldn't crash. Is
> > > that roughly the gist?
> >
> > Yes, but also, no.
> >
> > In the spirit, you're right. However, there's three things interfering
> > here:
> >
> >   - You don't always have a match between device and KMS entity. Display
> >     pipelines are usually multiple devices working together, and while
> >     you probably have a 1:1 relationship with bridges and panels (and to
> >     some extent encoders/connectors), the planes and framebuffers for
> >     example are a mess :) So, if the device backing the planes is to be
> >     removed, what are you removing exactly? All of the planes and
> >     framebuffers? Do you free the buffers allocated by the userspace
> >     (that it might still use?)?
> >
> >   - In addition to that, KMS doesn't deal with individual entities being
> >     hotplugged so neither the subsystem nor the application expect to
> >     have a connector being removed.
> >
> >   - ioctl's aren't filtered once the device is starting to get removed
> >     on most drivers.
> >
> > So due to 1 and 2, we can't really partially remove components unless
> > the application is aware of it, and it doesn't expect to. And most
> > drivers still allow (probably unwillingly though) the application to
> > call ioctls once the DRM device has lost at least one of its backing
> > devices.
> 
> We "can't", but we "can", right? Userspace can freely unbind a driver.
> Unless you want to dig into if the community would allow a driver to
> block "unbind" then we have to, at the very least, not crash the
> kernel when userspace does this. Ideally we'd have something more
> elegant than just "don't crash the kernel", but at least we shouldn't
> crash.

I'm not sure what you mean here, sorry

> > > Assuming that's correct, then _most_ of the resource acquiring /
> > > memory allocation can still happen in the device probe() routine and
> > > can still use devm as long as we do something to ensure that any
> > > resources released are no longer pointed to by anything in the "struct
> > > drm_device".
> > >
> > > To make it concrete, I think we want this (feel free to correct). For
> > > simplicity, I'm assuming a driver that _doesn't_ use the component
> > > framework:
> > >
> > > a) Linux driver probe() happens. The "struct drm_device" is allocated
> > > in probe() by devm_drm_dev_alloc(). This takes a reference to the
> > > "struct drm_device". The device also acquires resources / allocates
> > > memory.
> >
> > You need to differentiate resources and allocations there. Resources can
> > be expected to go away at the same time than the device, so using devm
> > is fine. Allocations are largely disconnected from the device lifetime,
> > and using devm leads to UAF.
> 
> Right. I think my original point was looking at "ingenic-drm-drv.c".
> Much of the "devm" stuff there is resources and those specific things
> could be moved to probe() instead of bind(), right?

It depends. The registers, clock, regmap allocations are fine. The panel
isn't for example.

> For allocations, I think you'd have to look at each allocation. If the
> allocation needed to live as long as the "struct drm_device" then devm
> is clearly the wrong choice. ...but not every allocation needs to live
> that long.

Most of the allocations are in a KMS driver though? At least all the
structures that store either planes, crtcs, encoders, connectors, panels
or bridges (plus their state) need to be allocated through drmm.

> Also, even if in the "simple" case allocations need to live as long as
> a "struct drm_device", it's possible that there are some cases where
> there's only an indirect reference to the memory. In that case, you
> could NULL out the indirect reference and then free it. Obviously
> someone would need to take care here.

I guess we could, but it would be fairly hard to do since if we clear a
connector, we would need to clear that particular allocation, but also
from all the states that reference it, and the entities that store a
pointer to it somehow (some of them possibly in drivers). It's not super
valuable anyway since the current expectation is that it's all or
nothing, if you remove one connector you are expected to remove the
whole KMS driver.

> > > b) Userspace acquires a reference to the "struct drm_device". Refcount
> > > is now 2 (one from userspace, one from the Linux driver).
> > >
> > > c) The Linux driver unbinds, presumably because userspace requested
> > > it. From earlier I think we decided that we can't (by design) block
> > > unbind. Once unbind happens then we shouldn't try to keep operating
> > > the device
> >
> > That part is correct, because the resources aren't there anymore.
> >
> > > the driver should stop running.
> >
> > But for the reasons above, the driver needs to still operate (in a
> > degraded mode).
> 
> So I think here is where the disconnect is from our viewpoints. IMO
> when a Linux driver is unbound then it makes no sense to try to
> operate the device in "a degraded mode". When a Linux driver is
> unbound then it should be releasing all of the resources from the
> device (iomaps, IRQs, regulators, GPIOs, etc). That's just what
> unbinding a driver is supposed to do.

I guess we agree on that part.

> I understand what you're saying above about display pipelines being
> multiple Linux drivers working together and that it doesn't make lots
> of sense to just unbind a random Linux device driver in the middle of
> things.

That's not what I'm saying though. What I'm saying is that if we remove
one, everything must go. And that's what ingenic is doing btw. But all
the allocations still need to stay until the last fd is closed.

> ...and I don't really have a simple/great answer for how to do
> something super elegant if userspace tries to just randomly unbind one
> of the many drivers in an active display pipeline.

That's not a concern. I know vc4 handles that just fine, probably others
too.

> > > As part of the unbind, the remove() is called and also "devm"
> > > resources are deallocated. If any of the things freed are pointed to
> > > by the "struct drm_device" then the code needs to NULL them out at
> > > this time.
> >
> > Right, we also need to make sure we don't access any of the resources
> > that got freed. This is typically done by protecting all the accesses
> > with drm_dev_enter/drm_dev_exit.
> >
> > > Also we should make sure that any callback functions that userspace
> > > could cause to be invoked return errors.
> >
> > That would prevent any new ioctl from occuring after the device has been
> > removed, but that doesn't fix the race condition if it's removed while
> > there's a commit happening. This is further complicated by the fact that
> > commits can be queued (so you would have multiple submitted already) or
> > made asynchronous.
> 
> I guess I would have expected that the remove() callback in the device
> would prevent new commits from starting and then block waiting until
> any in-progress commits were finished? ...kinda like how drivers call
> del_timer_sync() in their remove functions...
> 
> 
> > > Our code could go away at any point here since userspace could "rmmod"
> > > our module.
> >
> > Yeah, we probably have a bug there. Boris also reported something like
> > that recently where if you add an action with drmm_add_action, and then
> > remove the module, the function would have been free'd by the time it
> > executes.
> 
> I'm fairly certain that you can prevent a module from being unloaded
> by just grabbing a refcount to it. However, I'm not sure that's the
> right solution. If we're trying to run driver code after a driver has
> been unbound then, IMO, that's the bug.

init, exit and probe run while the device in unbound.

> > > However, it's not a panacea for everything. Specifically once
> > > the Linux driver unbind finishes then the device isn't functional
> > > anymore.
> >
> > What's wrong with it then?
> 
> I'm mostly just saying don't just search-and-replace "devm" with
> "drmm" in your driver and call it done. You need to think carefully
> about which things are which lifetime.

Sure, where did I say anything different? For vc4, it took me a ~60
patches to do the conversion, so yeah, it's not just a sed call.

> Ironically, while digging into this I'm tempted to take back my
> original request. Despite the kernel docs I pointed at [1], it
> actually looks like it might be fine to use "devm" within a
> component's bind() function. In try_to_bring_up_aggregate_device() it
> seems like the code is opening up a nested "devres" group specifically
> to allow this to work. A little bit of testing that I did with this
> shows that, indeed, the nesting seems to be working. Am I missing
> something here?

I don't think we're on the same page, because I also don't know why
that's relevant in that particular context?

Sure, you can use devm in a component framework driver. The limitations
I'm talking about have nothing to do with the component framework but
rather between devm and KMS. So all the issues I brought up are still
very much relevant for a single device doing devm_ allocations at probe.

Maxime

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFT PATCH 01/15] drm/armada: Call drm_atomic_helper_shutdown() at shutdown time
  2023-09-13 15:34       ` Doug Anderson
@ 2023-09-20 18:03         ` Doug Anderson
  2023-09-20 18:58           ` Russell King (Oracle)
  0 siblings, 1 reply; 59+ messages in thread
From: Doug Anderson @ 2023-09-20 18:03 UTC (permalink / raw)
  To: Maxime Ripard
  Cc: dri-devel, airlied, daniel, linux-kernel, Russell King (Oracle)

Maxime,

On Wed, Sep 13, 2023 at 8:34 AM Doug Anderson <dianders@chromium.org> wrote:
>
> Hi,
>
> On Tue, Sep 5, 2023 at 7:23 AM Doug Anderson <dianders@chromium.org> wrote:
> >
> > Hi,
> >
> > On Sun, Sep 3, 2023 at 8:53 AM Russell King (Oracle)
> > <linux@armlinux.org.uk> wrote:
> > >
> > > On Fri, Sep 01, 2023 at 04:41:12PM -0700, Douglas Anderson wrote:
> > > > Based on grepping through the source code this driver appears to be
> > > > missing a call to drm_atomic_helper_shutdown() at system shutdown
> > > > time. Among other things, this means that if a panel is in use that it
> > > > won't be cleanly powered off at system shutdown time.
> > > >
> > > > The fact that we should call drm_atomic_helper_shutdown() in the case
> > > > of OS shutdown/restart comes straight out of the kernel doc "driver
> > > > instance overview" in drm_drv.c.
> > > >
> > > > This driver was fairly easy to update. The drm_device is stored in the
> > > > drvdata so we just have to make sure the drvdata is NULL whenever the
> > > > device is not bound.
> > >
> > > ... and there I think you have a misunderstanding of the driver model.
> > > Please have a look at device_unbind_cleanup() which will be called if
> > > probe fails, or when the device is removed (in other words, when it is
> > > not bound to a driver.)
> >
> > ...and there I think you didn't read this patch closely enough and
> > perhaps that you have a misunderstanding of the component model.
> > Please have a look at the difference between armada_drm_unbind() and
> > armada_drm_remove() and also check which of those two functions is
> > being modified by my patch. Were this patch adding a call to
> > "dev_set_drvdata(dev, NULL)" in armada_drm_remove() then your NAK
> > would be justified. However, I am not aware of anything in the
> > component unbind path nor in the failure case of component bind that
> > would NULL the drvdata.
> >
> > Kindly look at the patch a second time with this in mind.
>
> Since I didn't see any further response, I'll assume that my
> explanation here has addressed your concerns. If not, I can re-post it
> without NULLing the "drvdata". While I still believe this is unsafe in
> some corner cases because of the component model used by this driver,
> at least it would get the shutdown call in.
>
> In any case, what's the process for landing patches to this driver?
> Running `./scripts/get_maintainer.pl --scm -f
> drivers/gpu/drm/armada/armada_drv.c` seems to indicate that this
> should go through the git tree:
>
> git git://git.armlinux.org.uk/~rmk/linux-arm.git drm-armada-devel
>
> ...but it doesn't appear that recent changes to this driver have gone
> that way. Should this land through drm-misc?

Do you have any advice here? Should I land this through drm-misc-next,
put it on ice for a while, or resend without the calls to NULL our the
drvdata?

Thanks!

-Doug

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFT PATCH 01/15] drm/armada: Call drm_atomic_helper_shutdown() at shutdown time
  2023-09-20 18:03         ` Doug Anderson
@ 2023-09-20 18:58           ` Russell King (Oracle)
  2023-09-21 18:46             ` Doug Anderson
  0 siblings, 1 reply; 59+ messages in thread
From: Russell King (Oracle) @ 2023-09-20 18:58 UTC (permalink / raw)
  To: Doug Anderson; +Cc: Maxime Ripard, dri-devel, airlied, daniel, linux-kernel

On Wed, Sep 20, 2023 at 11:03:32AM -0700, Doug Anderson wrote:
> Maxime,
> 
> On Wed, Sep 13, 2023 at 8:34 AM Doug Anderson <dianders@chromium.org> wrote:
> >
> > Hi,
> >
> > On Tue, Sep 5, 2023 at 7:23 AM Doug Anderson <dianders@chromium.org> wrote:
> > >
> > > Hi,
> > >
> > > On Sun, Sep 3, 2023 at 8:53 AM Russell King (Oracle)
> > > <linux@armlinux.org.uk> wrote:
> > > >
> > > > On Fri, Sep 01, 2023 at 04:41:12PM -0700, Douglas Anderson wrote:
> > > > > Based on grepping through the source code this driver appears to be
> > > > > missing a call to drm_atomic_helper_shutdown() at system shutdown
> > > > > time. Among other things, this means that if a panel is in use that it
> > > > > won't be cleanly powered off at system shutdown time.
> > > > >
> > > > > The fact that we should call drm_atomic_helper_shutdown() in the case
> > > > > of OS shutdown/restart comes straight out of the kernel doc "driver
> > > > > instance overview" in drm_drv.c.
> > > > >
> > > > > This driver was fairly easy to update. The drm_device is stored in the
> > > > > drvdata so we just have to make sure the drvdata is NULL whenever the
> > > > > device is not bound.
> > > >
> > > > ... and there I think you have a misunderstanding of the driver model.
> > > > Please have a look at device_unbind_cleanup() which will be called if
> > > > probe fails, or when the device is removed (in other words, when it is
> > > > not bound to a driver.)
> > >
> > > ...and there I think you didn't read this patch closely enough and
> > > perhaps that you have a misunderstanding of the component model.
> > > Please have a look at the difference between armada_drm_unbind() and
> > > armada_drm_remove() and also check which of those two functions is
> > > being modified by my patch. Were this patch adding a call to
> > > "dev_set_drvdata(dev, NULL)" in armada_drm_remove() then your NAK
> > > would be justified. However, I am not aware of anything in the
> > > component unbind path nor in the failure case of component bind that
> > > would NULL the drvdata.
> > >
> > > Kindly look at the patch a second time with this in mind.
> >
> > Since I didn't see any further response, I'll assume that my
> > explanation here has addressed your concerns. If not, I can re-post it
> > without NULLing the "drvdata". While I still believe this is unsafe in
> > some corner cases because of the component model used by this driver,
> > at least it would get the shutdown call in.
> >
> > In any case, what's the process for landing patches to this driver?
> > Running `./scripts/get_maintainer.pl --scm -f
> > drivers/gpu/drm/armada/armada_drv.c` seems to indicate that this
> > should go through the git tree:
> >
> > git git://git.armlinux.org.uk/~rmk/linux-arm.git drm-armada-devel
> >
> > ...but it doesn't appear that recent changes to this driver have gone
> > that way. Should this land through drm-misc?
> 
> Do you have any advice here? Should I land this through drm-misc-next,
> put it on ice for a while, or resend without the calls to NULL our the
> drvdata?

Sorry, I haven't had a chance to look at it, but I think you're probably
right, so I withdraw my objection. Please take it through drm-misc for
the time being. Thanks.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFT PATCH 01/15] drm/armada: Call drm_atomic_helper_shutdown() at shutdown time
  2023-09-20 18:58           ` Russell King (Oracle)
@ 2023-09-21 18:46             ` Doug Anderson
  0 siblings, 0 replies; 59+ messages in thread
From: Doug Anderson @ 2023-09-21 18:46 UTC (permalink / raw)
  To: Russell King (Oracle)
  Cc: Maxime Ripard, dri-devel, airlied, daniel, linux-kernel

Hi,

On Wed, Sep 20, 2023 at 11:58 AM Russell King (Oracle)
<linux@armlinux.org.uk> wrote:
>
> On Wed, Sep 20, 2023 at 11:03:32AM -0700, Doug Anderson wrote:
> > Maxime,
> >
> > On Wed, Sep 13, 2023 at 8:34 AM Doug Anderson <dianders@chromium.org> wrote:
> > >
> > > Hi,
> > >
> > > On Tue, Sep 5, 2023 at 7:23 AM Doug Anderson <dianders@chromium.org> wrote:
> > > >
> > > > Hi,
> > > >
> > > > On Sun, Sep 3, 2023 at 8:53 AM Russell King (Oracle)
> > > > <linux@armlinux.org.uk> wrote:
> > > > >
> > > > > On Fri, Sep 01, 2023 at 04:41:12PM -0700, Douglas Anderson wrote:
> > > > > > Based on grepping through the source code this driver appears to be
> > > > > > missing a call to drm_atomic_helper_shutdown() at system shutdown
> > > > > > time. Among other things, this means that if a panel is in use that it
> > > > > > won't be cleanly powered off at system shutdown time.
> > > > > >
> > > > > > The fact that we should call drm_atomic_helper_shutdown() in the case
> > > > > > of OS shutdown/restart comes straight out of the kernel doc "driver
> > > > > > instance overview" in drm_drv.c.
> > > > > >
> > > > > > This driver was fairly easy to update. The drm_device is stored in the
> > > > > > drvdata so we just have to make sure the drvdata is NULL whenever the
> > > > > > device is not bound.
> > > > >
> > > > > ... and there I think you have a misunderstanding of the driver model.
> > > > > Please have a look at device_unbind_cleanup() which will be called if
> > > > > probe fails, or when the device is removed (in other words, when it is
> > > > > not bound to a driver.)
> > > >
> > > > ...and there I think you didn't read this patch closely enough and
> > > > perhaps that you have a misunderstanding of the component model.
> > > > Please have a look at the difference between armada_drm_unbind() and
> > > > armada_drm_remove() and also check which of those two functions is
> > > > being modified by my patch. Were this patch adding a call to
> > > > "dev_set_drvdata(dev, NULL)" in armada_drm_remove() then your NAK
> > > > would be justified. However, I am not aware of anything in the
> > > > component unbind path nor in the failure case of component bind that
> > > > would NULL the drvdata.
> > > >
> > > > Kindly look at the patch a second time with this in mind.
> > >
> > > Since I didn't see any further response, I'll assume that my
> > > explanation here has addressed your concerns. If not, I can re-post it
> > > without NULLing the "drvdata". While I still believe this is unsafe in
> > > some corner cases because of the component model used by this driver,
> > > at least it would get the shutdown call in.
> > >
> > > In any case, what's the process for landing patches to this driver?
> > > Running `./scripts/get_maintainer.pl --scm -f
> > > drivers/gpu/drm/armada/armada_drv.c` seems to indicate that this
> > > should go through the git tree:
> > >
> > > git git://git.armlinux.org.uk/~rmk/linux-arm.git drm-armada-devel
> > >
> > > ...but it doesn't appear that recent changes to this driver have gone
> > > that way. Should this land through drm-misc?
> >
> > Do you have any advice here? Should I land this through drm-misc-next,
> > put it on ice for a while, or resend without the calls to NULL our the
> > drvdata?
>
> Sorry, I haven't had a chance to look at it, but I think you're probably
> right, so I withdraw my objection. Please take it through drm-misc for
> the time being. Thanks.

Pushed to drm-misc-next:

c478768ce807 drm/armada: Call drm_atomic_helper_shutdown() at shutdown time

^ permalink raw reply	[flat|nested] 59+ messages in thread

end of thread, other threads:[~2023-09-21 18:46 UTC | newest]

Thread overview: 59+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-09-01 23:41 [RFT PATCH 00/15] drm: non-drm-misc drivers call drm_atomic_helper_shutdown() at the right times Douglas Anderson
2023-09-01 23:41 ` [RFT PATCH 01/15] drm/armada: Call drm_atomic_helper_shutdown() at shutdown time Douglas Anderson
2023-09-03 15:53   ` Russell King (Oracle)
2023-09-04  7:36     ` Maxime Ripard
2023-09-04  7:55       ` Russell King (Oracle)
2023-09-04 15:18         ` Maxime Ripard
2023-09-05 14:23     ` Doug Anderson
2023-09-13 15:34       ` Doug Anderson
2023-09-20 18:03         ` Doug Anderson
2023-09-20 18:58           ` Russell King (Oracle)
2023-09-21 18:46             ` Doug Anderson
2023-09-04  7:53   ` Maxime Ripard
2023-09-04  7:56   ` Russell King (Oracle)
2023-09-01 23:41 ` [RFT PATCH 02/15] drm/imx/dcss: " Douglas Anderson
2023-09-04  7:36   ` Maxime Ripard
2023-09-01 23:41 ` [RFT PATCH 03/15] drm/ingenic: " Douglas Anderson
2023-09-04  7:36   ` Maxime Ripard
2023-09-04  9:15   ` Paul Cercueil
2023-09-05 20:16     ` Doug Anderson
2023-09-06  8:39       ` Maxime Ripard
2023-09-13 16:23         ` Doug Anderson
2023-09-14  8:14           ` Maxime Ripard
2023-09-14 22:29             ` Doug Anderson
2023-09-19  9:33               ` Maxime Ripard
2023-09-13 16:25       ` Doug Anderson
2023-09-13 18:00         ` Paul Cercueil
2023-09-13 18:21   ` Doug Anderson
2023-09-01 23:41 ` [RFT PATCH 04/15] drm/kmb: " Douglas Anderson
2023-09-04  7:26   ` Maxime Ripard
2023-09-01 23:41 ` [RFT PATCH 05/15] drm/mediatek: " Douglas Anderson
2023-09-04  7:27   ` Maxime Ripard
2023-09-08 11:51   ` Fei Shao
2023-09-11 16:10     ` Doug Anderson
2023-09-12  6:28       ` Fei Shao
2023-09-01 23:41 ` [RFT PATCH 06/15] drm/nouveau: Call drm_atomic_helper_shutdown() or equiv " Douglas Anderson
2023-09-04  7:27   ` Maxime Ripard
2023-09-01 23:41 ` [RFT PATCH 07/15] drm/tegra: Call drm_atomic_helper_shutdown() " Douglas Anderson
2023-09-04  7:28   ` Maxime Ripard
2023-09-01 23:41 ` [RFT PATCH 08/15] drm/arcpgu: " Douglas Anderson
2023-09-04  7:28   ` Maxime Ripard
2023-09-01 23:41 ` [RFT PATCH 09/15] drm/amdgpu: " Douglas Anderson
2023-09-01 23:41 ` [RFT PATCH 10/15] drm/sprd: Call drm_atomic_helper_shutdown() at remove time Douglas Anderson
2023-09-04  7:54   ` Maxime Ripard
2023-09-01 23:41 ` [RFT PATCH 11/15] drm/exynos: Call drm_atomic_helper_shutdown() at shutdown/unbind time Douglas Anderson
2023-09-04  7:54   ` Maxime Ripard
2023-09-01 23:41 ` [RFT PATCH 12/15] drm/gma500: Call drm_helper_force_disable_all() at shutdown/remove time Douglas Anderson
2023-09-04  7:29   ` Maxime Ripard
2023-09-01 23:41 ` [RFT PATCH 13/15] drm/imx/ipuv3: Call drm_atomic_helper_shutdown() at shutdown/unbind time Douglas Anderson
2023-09-04  7:55   ` Maxime Ripard
2023-09-04  8:30   ` Philipp Zabel
2023-09-05 20:29     ` Doug Anderson
2023-09-06  5:47       ` Philipp Zabel
2023-09-06 14:30         ` Doug Anderson
2023-09-13 18:21   ` Doug Anderson
2023-09-01 23:41 ` [RFT PATCH 14/15] drm/radeon: Call drm_helper_force_disable_all() at shutdown/remove time Douglas Anderson
2023-09-04  7:30   ` Maxime Ripard
2023-09-01 23:41 ` [RFT PATCH 15/15] drm/renesas/shmobile: " Douglas Anderson
2023-09-04  7:27   ` Geert Uytterhoeven
2023-09-05 21:10     ` Doug Anderson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).