dri-devel.lists.freedesktop.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] drm: Prevent modeset helpers to access an uninitialized drm_mode_config
@ 2022-07-24 12:37 Javier Martinez Canillas
  2022-07-24 18:24 ` Thomas Zimmermann
  0 siblings, 1 reply; 6+ messages in thread
From: Javier Martinez Canillas @ 2022-07-24 12:37 UTC (permalink / raw)
  To: linux-kernel
  Cc: Thomas Zimmermann, David Airlie, Javier Martinez Canillas,
	dri-devel, Dmitry Baryshkov

DRM drivers initialize the mode configuration with drmm_mode_config_init()
and that function (among other things) initializes mutexes that are later
used by modeset helpers.

But the helpers should only attempt to grab those locks if the mode config
was properly initialized. Otherwise it can lead to kernel oops. An example
is when a DRM driver using the component framework does not initialize the
drm_mode_config, because its .bind callback was not being executed due one
of its expected sub-devices' driver failing to probe.

Some drivers check the struct drm_driver.registered field as an indication
on whether their .shutdown callback should call helpers to tearn down the
mode configuration or not, but most drivers just assume that it is always
safe to call helpers such as drm_atomic_helper_shutdown() during shutdown.

Let make the DRM core more robust and prevent this to happen, by marking a
struct drm_mode_config as initialized during drmm_mode_config_init(). that
way helpers can check for it and not attempt to grab uninitialized mutexes.

Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
---

 drivers/gpu/drm/drm_mode_config.c  | 4 ++++
 drivers/gpu/drm/drm_modeset_lock.c | 6 ++++++
 include/drm/drm_mode_config.h      | 8 ++++++++
 3 files changed, 18 insertions(+)

diff --git a/drivers/gpu/drm/drm_mode_config.c b/drivers/gpu/drm/drm_mode_config.c
index 59b34f07cfce..db649f97120b 100644
--- a/drivers/gpu/drm/drm_mode_config.c
+++ b/drivers/gpu/drm/drm_mode_config.c
@@ -456,6 +456,8 @@ int drmm_mode_config_init(struct drm_device *dev)
 		dma_resv_fini(&resv);
 	}
 
+	dev->mode_config.initialized = true;
+
 	return drmm_add_action_or_reset(dev, drm_mode_config_init_release,
 					NULL);
 }
@@ -549,6 +551,8 @@ void drm_mode_config_cleanup(struct drm_device *dev)
 	idr_destroy(&dev->mode_config.tile_idr);
 	idr_destroy(&dev->mode_config.object_idr);
 	drm_modeset_lock_fini(&dev->mode_config.connection_mutex);
+
+	dev->mode_config.initialized = false;
 }
 EXPORT_SYMBOL(drm_mode_config_cleanup);
 
diff --git a/drivers/gpu/drm/drm_modeset_lock.c b/drivers/gpu/drm/drm_modeset_lock.c
index 918065982db4..d6a81cb88123 100644
--- a/drivers/gpu/drm/drm_modeset_lock.c
+++ b/drivers/gpu/drm/drm_modeset_lock.c
@@ -444,6 +444,9 @@ EXPORT_SYMBOL(drm_modeset_unlock);
  *
  * See also: DRM_MODESET_LOCK_ALL_BEGIN() and DRM_MODESET_LOCK_ALL_END()
  *
+ * This function must only be called after drmm_mode_config_init(), since it
+ * takes locks that are initialized as part of the initial mode configuration.
+ *
  * Returns: 0 on success or a negative error-code on failure.
  */
 int drm_modeset_lock_all_ctx(struct drm_device *dev,
@@ -454,6 +457,9 @@ int drm_modeset_lock_all_ctx(struct drm_device *dev,
 	struct drm_plane *plane;
 	int ret;
 
+	if (WARN_ON(!dev->mode_config.initialized))
+		return -EINVAL;
+
 	ret = drm_modeset_lock(&dev->mode_config.connection_mutex, ctx);
 	if (ret)
 		return ret;
diff --git a/include/drm/drm_mode_config.h b/include/drm/drm_mode_config.h
index 6b5e01295348..d2e1a6d7dcc2 100644
--- a/include/drm/drm_mode_config.h
+++ b/include/drm/drm_mode_config.h
@@ -954,6 +954,14 @@ struct drm_mode_config {
 	struct drm_atomic_state *suspend_state;
 
 	const struct drm_mode_config_helper_funcs *helper_private;
+
+	/**
+	 * @initialized:
+	 *
+	 * Internally used by modeset helpers such as drm_modeset_lock_all_ctx()
+	 * to determine if the mode configuration has been properly initialized.
+	 */
+	bool initialized;
 };
 
 int __must_check drmm_mode_config_init(struct drm_device *dev);
-- 
2.37.1


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] drm: Prevent modeset helpers to access an uninitialized drm_mode_config
  2022-07-24 12:37 [PATCH] drm: Prevent modeset helpers to access an uninitialized drm_mode_config Javier Martinez Canillas
@ 2022-07-24 18:24 ` Thomas Zimmermann
  2022-07-24 18:41   ` Javier Martinez Canillas
  0 siblings, 1 reply; 6+ messages in thread
From: Thomas Zimmermann @ 2022-07-24 18:24 UTC (permalink / raw)
  To: Javier Martinez Canillas, linux-kernel
  Cc: David Airlie, Dmitry Baryshkov, dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 4382 bytes --]

Hi Javier

Am 24.07.22 um 14:37 schrieb Javier Martinez Canillas:
> DRM drivers initialize the mode configuration with drmm_mode_config_init()
> and that function (among other things) initializes mutexes that are later
> used by modeset helpers.
> 
> But the helpers should only attempt to grab those locks if the mode config
> was properly initialized. Otherwise it can lead to kernel oops. An example
> is when a DRM driver using the component framework does not initialize the
> drm_mode_config, because its .bind callback was not being executed due one
> of its expected sub-devices' driver failing to probe.
> 
> Some drivers check the struct drm_driver.registered field as an indication
> on whether their .shutdown callback should call helpers to tearn down the
> mode configuration or not, but most drivers just assume that it is always
> safe to call helpers such as drm_atomic_helper_shutdown() during shutdown.
> 
> Let make the DRM core more robust and prevent this to happen, by marking a
> struct drm_mode_config as initialized during drmm_mode_config_init(). that
> way helpers can check for it and not attempt to grab uninitialized mutexes.

I disagree. This patch looks like cargo-cult programming and entirely 
arbitrary.  The solution here is to fix drivers.  The actual test to 
perform is to instrument the mutex implementation to detect 
uninitialized mutexes.

Best regards
Thomas

> 
> Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
> ---
> 
>   drivers/gpu/drm/drm_mode_config.c  | 4 ++++
>   drivers/gpu/drm/drm_modeset_lock.c | 6 ++++++
>   include/drm/drm_mode_config.h      | 8 ++++++++
>   3 files changed, 18 insertions(+)
> 
> diff --git a/drivers/gpu/drm/drm_mode_config.c b/drivers/gpu/drm/drm_mode_config.c
> index 59b34f07cfce..db649f97120b 100644
> --- a/drivers/gpu/drm/drm_mode_config.c
> +++ b/drivers/gpu/drm/drm_mode_config.c
> @@ -456,6 +456,8 @@ int drmm_mode_config_init(struct drm_device *dev)
>   		dma_resv_fini(&resv);
>   	}
>   
> +	dev->mode_config.initialized = true;
> +
>   	return drmm_add_action_or_reset(dev, drm_mode_config_init_release,
>   					NULL);
>   }
> @@ -549,6 +551,8 @@ void drm_mode_config_cleanup(struct drm_device *dev)
>   	idr_destroy(&dev->mode_config.tile_idr);
>   	idr_destroy(&dev->mode_config.object_idr);
>   	drm_modeset_lock_fini(&dev->mode_config.connection_mutex);
> +
> +	dev->mode_config.initialized = false;
>   }
>   EXPORT_SYMBOL(drm_mode_config_cleanup);
>   
> diff --git a/drivers/gpu/drm/drm_modeset_lock.c b/drivers/gpu/drm/drm_modeset_lock.c
> index 918065982db4..d6a81cb88123 100644
> --- a/drivers/gpu/drm/drm_modeset_lock.c
> +++ b/drivers/gpu/drm/drm_modeset_lock.c
> @@ -444,6 +444,9 @@ EXPORT_SYMBOL(drm_modeset_unlock);
>    *
>    * See also: DRM_MODESET_LOCK_ALL_BEGIN() and DRM_MODESET_LOCK_ALL_END()
>    *
> + * This function must only be called after drmm_mode_config_init(), since it
> + * takes locks that are initialized as part of the initial mode configuration.
> + *
>    * Returns: 0 on success or a negative error-code on failure.
>    */
>   int drm_modeset_lock_all_ctx(struct drm_device *dev,
> @@ -454,6 +457,9 @@ int drm_modeset_lock_all_ctx(struct drm_device *dev,
>   	struct drm_plane *plane;
>   	int ret;
>   
> +	if (WARN_ON(!dev->mode_config.initialized))
> +		return -EINVAL;
> +
>   	ret = drm_modeset_lock(&dev->mode_config.connection_mutex, ctx);
>   	if (ret)
>   		return ret;
> diff --git a/include/drm/drm_mode_config.h b/include/drm/drm_mode_config.h
> index 6b5e01295348..d2e1a6d7dcc2 100644
> --- a/include/drm/drm_mode_config.h
> +++ b/include/drm/drm_mode_config.h
> @@ -954,6 +954,14 @@ struct drm_mode_config {
>   	struct drm_atomic_state *suspend_state;
>   
>   	const struct drm_mode_config_helper_funcs *helper_private;
> +
> +	/**
> +	 * @initialized:
> +	 *
> +	 * Internally used by modeset helpers such as drm_modeset_lock_all_ctx()
> +	 * to determine if the mode configuration has been properly initialized.
> +	 */
> +	bool initialized;
>   };
>   
>   int __must_check drmm_mode_config_init(struct drm_device *dev);

-- 
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Ivo Totev

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 840 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] drm: Prevent modeset helpers to access an uninitialized drm_mode_config
  2022-07-24 18:24 ` Thomas Zimmermann
@ 2022-07-24 18:41   ` Javier Martinez Canillas
  2022-07-25  7:12     ` Thomas Zimmermann
  0 siblings, 1 reply; 6+ messages in thread
From: Javier Martinez Canillas @ 2022-07-24 18:41 UTC (permalink / raw)
  To: Thomas Zimmermann, linux-kernel; +Cc: David Airlie, Dmitry Baryshkov, dri-devel

Hello Thomas,

Thanks for your feedback.

On 7/24/22 20:24, Thomas Zimmermann wrote:
> Hi Javier
> 
> Am 24.07.22 um 14:37 schrieb Javier Martinez Canillas:
>> DRM drivers initialize the mode configuration with drmm_mode_config_init()
>> and that function (among other things) initializes mutexes that are later
>> used by modeset helpers.
>>
>> But the helpers should only attempt to grab those locks if the mode config
>> was properly initialized. Otherwise it can lead to kernel oops. An example
>> is when a DRM driver using the component framework does not initialize the
>> drm_mode_config, because its .bind callback was not being executed due one
>> of its expected sub-devices' driver failing to probe.
>>
>> Some drivers check the struct drm_driver.registered field as an indication
>> on whether their .shutdown callback should call helpers to tearn down the
>> mode configuration or not, but most drivers just assume that it is always
>> safe to call helpers such as drm_atomic_helper_shutdown() during shutdown.
>>
>> Let make the DRM core more robust and prevent this to happen, by marking a
>> struct drm_mode_config as initialized during drmm_mode_config_init(). that
>> way helpers can check for it and not attempt to grab uninitialized mutexes.
> 
> I disagree. This patch looks like cargo-cult programming and entirely 
> arbitrary.  The solution here is to fix drivers.  The actual test to 
> perform is to instrument the mutex implementation to detect 
> uninitialized mutexes.
>

While I do agree that drivers should be fixed, IMO we should try to make it
hard for the kernel to crash. We already have checks in other DRM helpers to
avoid accessing uninitialized data, so I don't see why we couldn't do the
same here.

I wrote this patch after fixing a bug in the drm/msm driver [0]. By looking
at how other drivers handled this case, I'm pretty sure that they have the
same problem. A warning is much better than a kernel crash during shutdown.

[0]: https://patchwork.kernel.org/project/dri-devel/patch/20220724111327.1195693-1-javierm@redhat.com/
 
> Best regards
> Thomas
> 

-- 
Best regards,

Javier Martinez Canillas
Linux Engineering
Red Hat


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] drm: Prevent modeset helpers to access an uninitialized drm_mode_config
  2022-07-24 18:41   ` Javier Martinez Canillas
@ 2022-07-25  7:12     ` Thomas Zimmermann
  2022-07-25  8:28       ` Javier Martinez Canillas
  0 siblings, 1 reply; 6+ messages in thread
From: Thomas Zimmermann @ 2022-07-25  7:12 UTC (permalink / raw)
  To: Javier Martinez Canillas, linux-kernel
  Cc: David Airlie, Dmitry Baryshkov, dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 4347 bytes --]

Hi Javier

Am 24.07.22 um 20:41 schrieb Javier Martinez Canillas:
> Hello Thomas,
> 
> Thanks for your feedback.
> 
> On 7/24/22 20:24, Thomas Zimmermann wrote:
>> Hi Javier
>>
>> Am 24.07.22 um 14:37 schrieb Javier Martinez Canillas:
>>> DRM drivers initialize the mode configuration with drmm_mode_config_init()
>>> and that function (among other things) initializes mutexes that are later
>>> used by modeset helpers.
>>>
>>> But the helpers should only attempt to grab those locks if the mode config
>>> was properly initialized. Otherwise it can lead to kernel oops. An example
>>> is when a DRM driver using the component framework does not initialize the
>>> drm_mode_config, because its .bind callback was not being executed due one
>>> of its expected sub-devices' driver failing to probe.
>>>
>>> Some drivers check the struct drm_driver.registered field as an indication
>>> on whether their .shutdown callback should call helpers to tearn down the
>>> mode configuration or not, but most drivers just assume that it is always
>>> safe to call helpers such as drm_atomic_helper_shutdown() during shutdown.
>>>
>>> Let make the DRM core more robust and prevent this to happen, by marking a
>>> struct drm_mode_config as initialized during drmm_mode_config_init(). that
>>> way helpers can check for it and not attempt to grab uninitialized mutexes.
>>
>> I disagree. This patch looks like cargo-cult programming and entirely
>> arbitrary.  The solution here is to fix drivers.  The actual test to
>> perform is to instrument the mutex implementation to detect
>> uninitialized mutexes.
>>
> 
> While I do agree that drivers should be fixed, IMO we should try to make it
> hard for the kernel to crash. We already have checks in other DRM helpers to
> avoid accessing uninitialized data, so I don't see why we couldn't do the
> same here.

Code should stand on its own merits, instead of doing something because 
something else does it. The latter is what is referred to as cargo-cult 
programming.

Doing sanity checks on values is not a problem, but putting flag 
variables throughout the code to question other code's state is. That's 
not 'The Way of the C.' There's also the problem that a good part of 
struct drm_mode_config's initialization is open-coded in drivers. So the 
meaning of is_initialized is somewhat fuzzy.

> 
> I wrote this patch after fixing a bug in the drm/msm driver [0]. By looking
> at how other drivers handled this case, I'm pretty sure that they have the
> same problem. A warning is much better than a kernel crash during shutdown.
> 
> [0]: https://patchwork.kernel.org/project/dri-devel/patch/20220724111327.1195693-1-javierm@redhat.com/

I see. I wasn't aware that missing mode_config_init() is a problem. From 
the linked URL, I cannot really understand how it's related. msm appears 
to be calling drm_mode_config_init(), right? The idiomatic solution 
would be to convert msm to managed code. But that's an entirely 
different patchset, of course. (I only took a brief look at the link TBH.)

Here's a suggestion on how to construct the mode-config code in order to 
make it hard to misuse:  Driver currently open-code the initialization 
of many fields in drm_mode_config. Expand the arguments of 
drm_mode_config_init() to take the pointer to the drm_mode_config_funcs. 
These functions are essential to do anything, so it's a good candidate 
for an argument. Drivers are easily converted the the new interface 
AFAICT.  After the conversion, add a test to drm_mode_config_reset() 
that tests for the funcs to be set.  drm_mode_config_reset() is also 
essential during initialization or the driver will fail immediately on 
the first modeset operation. That gives a test for an initialized 
mode_config without adding extra fields.

As a bit of a sidenote: we should consider making 
drm_mode_config_reset() and the reset callbacks return errors. The reset 
functions allocate memory for states and if this fails, we have no way 
of reporting the failure.

Best regards
Thomas



>   
>> Best regards
>> Thomas
>>
> 

-- 
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Ivo Totev

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 840 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] drm: Prevent modeset helpers to access an uninitialized drm_mode_config
  2022-07-25  7:12     ` Thomas Zimmermann
@ 2022-07-25  8:28       ` Javier Martinez Canillas
  2022-09-06 19:23         ` Daniel Vetter
  0 siblings, 1 reply; 6+ messages in thread
From: Javier Martinez Canillas @ 2022-07-25  8:28 UTC (permalink / raw)
  To: Thomas Zimmermann, linux-kernel; +Cc: David Airlie, Dmitry Baryshkov, dri-devel

Hello Thomas,

On 7/25/22 09:12, Thomas Zimmermann wrote:

[...]

>>>>
>>>> Let make the DRM core more robust and prevent this to happen, by marking a
>>>> struct drm_mode_config as initialized during drmm_mode_config_init(). that
>>>> way helpers can check for it and not attempt to grab uninitialized mutexes.
>>>
>>> I disagree. This patch looks like cargo-cult programming and entirely
>>> arbitrary.  The solution here is to fix drivers.  The actual test to
>>> perform is to instrument the mutex implementation to detect
>>> uninitialized mutexes.
>>>
>>
>> While I do agree that drivers should be fixed, IMO we should try to make it
>> hard for the kernel to crash. We already have checks in other DRM helpers to
>> avoid accessing uninitialized data, so I don't see why we couldn't do the
>> same here.
> 
> Code should stand on its own merits, instead of doing something because 
> something else does it. The latter is what is referred to as cargo-cult 
> programming.
>

Yes, I'm familiar with the concept but was thinking about a different paradigm
when writing this patch that is defensive programming. The question is whether
makes sense for the DRM core to defense from buggy drivers. My opinion is that
if possible it should and we should prevent kernel panics at all costs.

Otherwise, let's say a reboot may crash and never finish unless you have a HW
watchdog or something to bring the system again into a functional state.

But I agree that this particular patch is not nice and in fact I considered to
post it as a RFC at the beginning. 
 
> Doing sanity checks on values is not a problem, but putting flag 
> variables throughout the code to question other code's state is. That's 
> not 'The Way of the C.' There's also the problem that a good part of 
> struct drm_mode_config's initialization is open-coded in drivers. So the 
> meaning of is_initialized is somewhat fuzzy.
>

That is true. It is meant to signal that at least drm_mode_config_init() was
called by the driver.

>>
>> I wrote this patch after fixing a bug in the drm/msm driver [0]. By looking
>> at how other drivers handled this case, I'm pretty sure that they have the
>> same problem. A warning is much better than a kernel crash during shutdown.
>>
>> [0]: https://patchwork.kernel.org/project/dri-devel/patch/20220724111327.1195693-1-javierm@redhat.com/
> 
> I see. I wasn't aware that missing mode_config_init() is a problem. From 
> the linked URL, I cannot really understand how it's related. msm appears 
> to be calling drm_mode_config_init(), right? The idiomatic solution 
> would be to convert msm to managed code. But that's an entirely 
> different patchset, of course. (I only took a brief look at the link TBH.)
> 

The problem as mentioned is that drivers could call the modeset helpers even
when that init function was never called. In this case due the .bind callback
not being executed because a probe failed for a driver of a expected sub-device
failed.

The .shutdown callback is registered when the platform_driver is registered
and that happens at module_init(), so the shutdown would be executed regardless
of the probe (or bind) doing a proper initialization.

I think we need some way to signal drivers about that. Even better we could use
it in the core helpers to not attempt to access struct drm_device data if the
DRM device was not properly initialized.

Maybe the drm_device .registered field is enough and we could reuse that ?

> Here's a suggestion on how to construct the mode-config code in order to 
> make it hard to misuse:  Driver currently open-code the initialization 
> of many fields in drm_mode_config. Expand the arguments of 
> drm_mode_config_init() to take the pointer to the drm_mode_config_funcs. 
> These functions are essential to do anything, so it's a good candidate 
> for an argument. Drivers are easily converted the the new interface 
> AFAICT.  After the conversion, add a test to drm_mode_config_reset() 
> that tests for the funcs to be set.  drm_mode_config_reset() is also 
> essential during initialization or the driver will fail immediately on 
> the first modeset operation. That gives a test for an initialized 
> mode_config without adding extra fields.
>

I agree that all these are nice things to do but I can't see how it could help
in this case. If all these are still done in the .bind callback then it might
never be executed and the .shutdown callback would still happily call to the
drm_atomic_helper_shutdown() which will try to grab uninitialized mutexes and
crash the kernel.

Yes, I already posted a patch for msm/drm to prevent this but as mentioned a
lot of drivers still have that issue. We could audit all drivers and fix them
but I think we need to make the core more robust.

> As a bit of a sidenote: we should consider making 
> drm_mode_config_reset() and the reset callbacks return errors. The reset 
> functions allocate memory for states and if this fails, we have no way 
> of reporting the failure.
>

Yes, agreed.
 
> Best regards
> Thomas
> 

-- 
Best regards,

Javier Martinez Canillas
Linux Engineering
Red Hat


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] drm: Prevent modeset helpers to access an uninitialized drm_mode_config
  2022-07-25  8:28       ` Javier Martinez Canillas
@ 2022-09-06 19:23         ` Daniel Vetter
  0 siblings, 0 replies; 6+ messages in thread
From: Daniel Vetter @ 2022-09-06 19:23 UTC (permalink / raw)
  To: Javier Martinez Canillas
  Cc: David Airlie, Dmitry Baryshkov, dri-devel, linux-kernel,
	Thomas Zimmermann

On Mon, Jul 25, 2022 at 10:28:21AM +0200, Javier Martinez Canillas wrote:
> Hello Thomas,
> 
> On 7/25/22 09:12, Thomas Zimmermann wrote:
> 
> [...]
> 
> >>>>
> >>>> Let make the DRM core more robust and prevent this to happen, by marking a
> >>>> struct drm_mode_config as initialized during drmm_mode_config_init(). that
> >>>> way helpers can check for it and not attempt to grab uninitialized mutexes.
> >>>
> >>> I disagree. This patch looks like cargo-cult programming and entirely
> >>> arbitrary.  The solution here is to fix drivers.  The actual test to
> >>> perform is to instrument the mutex implementation to detect
> >>> uninitialized mutexes.
> >>>
> >>
> >> While I do agree that drivers should be fixed, IMO we should try to make it
> >> hard for the kernel to crash. We already have checks in other DRM helpers to
> >> avoid accessing uninitialized data, so I don't see why we couldn't do the
> >> same here.
> > 
> > Code should stand on its own merits, instead of doing something because 
> > something else does it. The latter is what is referred to as cargo-cult 
> > programming.
> >
> 
> Yes, I'm familiar with the concept but was thinking about a different paradigm
> when writing this patch that is defensive programming. The question is whether
> makes sense for the DRM core to defense from buggy drivers. My opinion is that
> if possible it should and we should prevent kernel panics at all costs.
> 
> Otherwise, let's say a reboot may crash and never finish unless you have a HW
> watchdog or something to bring the system again into a functional state.
> 
> But I agree that this particular patch is not nice and in fact I considered to
> post it as a RFC at the beginning. 

Way late on this, but because it's a bit more fundamental discussions I
figured maybe still useful when I chime in here ...

Imo the fix for this is drmm_ and aggressively removing driver cleanup
code everywhere by using drmm_ and devm_ for everything.

Trying to make these functions robust against arbitrary driver bugs is a
lost play imo, and the fundamental fix is making cleanup code fool proof
by making it all unecessary.

Ideally we'd be able to unexport all the explicit cleanup functions so
that drivers just cannot get anything wrong here. We're a long way from
that world though :-(
-Daniel


>  
> > Doing sanity checks on values is not a problem, but putting flag 
> > variables throughout the code to question other code's state is. That's 
> > not 'The Way of the C.' There's also the problem that a good part of 
> > struct drm_mode_config's initialization is open-coded in drivers. So the 
> > meaning of is_initialized is somewhat fuzzy.
> >
> 
> That is true. It is meant to signal that at least drm_mode_config_init() was
> called by the driver.
> 
> >>
> >> I wrote this patch after fixing a bug in the drm/msm driver [0]. By looking
> >> at how other drivers handled this case, I'm pretty sure that they have the
> >> same problem. A warning is much better than a kernel crash during shutdown.
> >>
> >> [0]: https://patchwork.kernel.org/project/dri-devel/patch/20220724111327.1195693-1-javierm@redhat.com/
> > 
> > I see. I wasn't aware that missing mode_config_init() is a problem. From 
> > the linked URL, I cannot really understand how it's related. msm appears 
> > to be calling drm_mode_config_init(), right? The idiomatic solution 
> > would be to convert msm to managed code. But that's an entirely 
> > different patchset, of course. (I only took a brief look at the link TBH.)
> > 
> 
> The problem as mentioned is that drivers could call the modeset helpers even
> when that init function was never called. In this case due the .bind callback
> not being executed because a probe failed for a driver of a expected sub-device
> failed.
> 
> The .shutdown callback is registered when the platform_driver is registered
> and that happens at module_init(), so the shutdown would be executed regardless
> of the probe (or bind) doing a proper initialization.
> 
> I think we need some way to signal drivers about that. Even better we could use
> it in the core helpers to not attempt to access struct drm_device data if the
> DRM device was not properly initialized.
> 
> Maybe the drm_device .registered field is enough and we could reuse that ?
> 
> > Here's a suggestion on how to construct the mode-config code in order to 
> > make it hard to misuse:  Driver currently open-code the initialization 
> > of many fields in drm_mode_config. Expand the arguments of 
> > drm_mode_config_init() to take the pointer to the drm_mode_config_funcs. 
> > These functions are essential to do anything, so it's a good candidate 
> > for an argument. Drivers are easily converted the the new interface 
> > AFAICT.  After the conversion, add a test to drm_mode_config_reset() 
> > that tests for the funcs to be set.  drm_mode_config_reset() is also 
> > essential during initialization or the driver will fail immediately on 
> > the first modeset operation. That gives a test for an initialized 
> > mode_config without adding extra fields.
> >
> 
> I agree that all these are nice things to do but I can't see how it could help
> in this case. If all these are still done in the .bind callback then it might
> never be executed and the .shutdown callback would still happily call to the
> drm_atomic_helper_shutdown() which will try to grab uninitialized mutexes and
> crash the kernel.
> 
> Yes, I already posted a patch for msm/drm to prevent this but as mentioned a
> lot of drivers still have that issue. We could audit all drivers and fix them
> but I think we need to make the core more robust.
> 
> > As a bit of a sidenote: we should consider making 
> > drm_mode_config_reset() and the reset callbacks return errors. The reset 
> > functions allocate memory for states and if this fails, we have no way 
> > of reporting the failure.
> >
> 
> Yes, agreed.
>  
> > Best regards
> > Thomas
> > 
> 
> -- 
> Best regards,
> 
> Javier Martinez Canillas
> Linux Engineering
> Red Hat
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2022-09-06 19:23 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-07-24 12:37 [PATCH] drm: Prevent modeset helpers to access an uninitialized drm_mode_config Javier Martinez Canillas
2022-07-24 18:24 ` Thomas Zimmermann
2022-07-24 18:41   ` Javier Martinez Canillas
2022-07-25  7:12     ` Thomas Zimmermann
2022-07-25  8:28       ` Javier Martinez Canillas
2022-09-06 19:23         ` Daniel Vetter

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).