dri-devel.lists.freedesktop.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] drm/msm/dp: Avoid unpowered AUX xfers that caused crashes
@ 2021-11-09 18:04 Douglas Anderson
  2021-11-09 22:46 ` khsieh
  2021-11-10  1:43 ` [Freedreno] " Abhinav Kumar
  0 siblings, 2 replies; 3+ messages in thread
From: Douglas Anderson @ 2021-11-09 18:04 UTC (permalink / raw)
  To: Rob Clark
  Cc: David Airlie, freedreno, Douglas Anderson, dri-devel,
	Stephen Boyd, Kuogee Hsieh, linux-arm-msm, Sean Paul,
	linux-kernel

If you happened to try to access `/dev/drm_dp_aux` devices provided by
the MSM DP AUX driver too early at bootup you could go boom. Let's
avoid that by only allowing AUX transfers when the controller is
powered up.

Specifically the crash that was seen (on Chrome OS 5.4 tree with
relevant backports):
  Kernel panic - not syncing: Asynchronous SError Interrupt
  CPU: 0 PID: 3131 Comm: fwupd Not tainted 5.4.144-16620-g28af11b73efb #1
  Hardware name: Google Lazor (rev3+) with KB Backlight (DT)
  Call trace:
   dump_backtrace+0x0/0x14c
   show_stack+0x20/0x2c
   dump_stack+0xac/0x124
   panic+0x150/0x390
   nmi_panic+0x80/0x94
   arm64_serror_panic+0x78/0x84
   do_serror+0x0/0x118
   do_serror+0xa4/0x118
   el1_error+0xbc/0x160
   dp_catalog_aux_write_data+0x1c/0x3c
   dp_aux_cmd_fifo_tx+0xf0/0x1b0
   dp_aux_transfer+0x1b0/0x2bc
   drm_dp_dpcd_access+0x8c/0x11c
   drm_dp_dpcd_read+0x64/0x10c
   auxdev_read_iter+0xd4/0x1c4

I did a little bit of tracing and found that:
* We register the AUX device very early at bootup.
* Power isn't actually turned on for my system until
  hpd_event_thread() -> dp_display_host_init() -> dp_power_init()
* You can see that dp_power_init() calls dp_aux_init() which is where
  we start allowing AUX channel requests to go through.

In general this patch is a bit of a bandaid but at least it gets us
out of the current state where userspace acting at the wrong time can
fully crash the system.
* I think the more proper fix (which requires quite a bit more
  changes) is to power stuff on while an AUX transfer is
  happening. This is like the solution we did for ti-sn65dsi86. This
  might be required for us to move to populating the panel via the
  DP-AUX bus.
* Another fix considered was to dynamically register / unregister. I
  tried that at <https://crrev.com/c/3169431/3> but it got
  ugly. Currently there's a bug where the pm_runtime() state isn't
  tracked properly and that causes us to just keep registering more
  and more.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
---

 drivers/gpu/drm/msm/dp/dp_aux.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/drivers/gpu/drm/msm/dp/dp_aux.c b/drivers/gpu/drm/msm/dp/dp_aux.c
index eb40d8413bca..6d36f63c3338 100644
--- a/drivers/gpu/drm/msm/dp/dp_aux.c
+++ b/drivers/gpu/drm/msm/dp/dp_aux.c
@@ -33,6 +33,7 @@ struct dp_aux_private {
 	bool read;
 	bool no_send_addr;
 	bool no_send_stop;
+	bool initted;
 	u32 offset;
 	u32 segment;
 
@@ -331,6 +332,10 @@ static ssize_t dp_aux_transfer(struct drm_dp_aux *dp_aux,
 	}
 
 	mutex_lock(&aux->mutex);
+	if (!aux->initted) {
+		ret = -EIO;
+		goto exit;
+	}
 
 	dp_aux_update_offset_and_segment(aux, msg);
 	dp_aux_transfer_helper(aux, msg, true);
@@ -380,6 +385,8 @@ static ssize_t dp_aux_transfer(struct drm_dp_aux *dp_aux,
 	}
 
 	aux->cmd_busy = false;
+
+exit:
 	mutex_unlock(&aux->mutex);
 
 	return ret;
@@ -431,8 +438,13 @@ void dp_aux_init(struct drm_dp_aux *dp_aux)
 
 	aux = container_of(dp_aux, struct dp_aux_private, dp_aux);
 
+	mutex_lock(&aux->mutex);
+
 	dp_catalog_aux_enable(aux->catalog, true);
 	aux->retry_cnt = 0;
+	aux->initted = true;
+
+	mutex_unlock(&aux->mutex);
 }
 
 void dp_aux_deinit(struct drm_dp_aux *dp_aux)
@@ -441,7 +453,12 @@ void dp_aux_deinit(struct drm_dp_aux *dp_aux)
 
 	aux = container_of(dp_aux, struct dp_aux_private, dp_aux);
 
+	mutex_lock(&aux->mutex);
+
+	aux->initted = false;
 	dp_catalog_aux_enable(aux->catalog, false);
+
+	mutex_unlock(&aux->mutex);
 }
 
 int dp_aux_register(struct drm_dp_aux *dp_aux)
-- 
2.34.0.rc0.344.g81b53c2807-goog


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] drm/msm/dp: Avoid unpowered AUX xfers that caused crashes
  2021-11-09 18:04 [PATCH] drm/msm/dp: Avoid unpowered AUX xfers that caused crashes Douglas Anderson
@ 2021-11-09 22:46 ` khsieh
  2021-11-10  1:43 ` [Freedreno] " Abhinav Kumar
  1 sibling, 0 replies; 3+ messages in thread
From: khsieh @ 2021-11-09 22:46 UTC (permalink / raw)
  To: Douglas Anderson
  Cc: David Airlie, freedreno, linux-kernel, dri-devel, Stephen Boyd,
	linux-arm-msm, Sean Paul

On 2021-11-09 10:04, Douglas Anderson wrote:
> If you happened to try to access `/dev/drm_dp_aux` devices provided by
> the MSM DP AUX driver too early at bootup you could go boom. Let's
> avoid that by only allowing AUX transfers when the controller is
> powered up.
> 
> Specifically the crash that was seen (on Chrome OS 5.4 tree with
> relevant backports):
>   Kernel panic - not syncing: Asynchronous SError Interrupt
>   CPU: 0 PID: 3131 Comm: fwupd Not tainted 5.4.144-16620-g28af11b73efb 
> #1
>   Hardware name: Google Lazor (rev3+) with KB Backlight (DT)
>   Call trace:
>    dump_backtrace+0x0/0x14c
>    show_stack+0x20/0x2c
>    dump_stack+0xac/0x124
>    panic+0x150/0x390
>    nmi_panic+0x80/0x94
>    arm64_serror_panic+0x78/0x84
>    do_serror+0x0/0x118
>    do_serror+0xa4/0x118
>    el1_error+0xbc/0x160
>    dp_catalog_aux_write_data+0x1c/0x3c
>    dp_aux_cmd_fifo_tx+0xf0/0x1b0
>    dp_aux_transfer+0x1b0/0x2bc
>    drm_dp_dpcd_access+0x8c/0x11c
>    drm_dp_dpcd_read+0x64/0x10c
>    auxdev_read_iter+0xd4/0x1c4
> 
> I did a little bit of tracing and found that:
> * We register the AUX device very early at bootup.
> * Power isn't actually turned on for my system until
>   hpd_event_thread() -> dp_display_host_init() -> dp_power_init()
> * You can see that dp_power_init() calls dp_aux_init() which is where
>   we start allowing AUX channel requests to go through.
> 
> In general this patch is a bit of a bandaid but at least it gets us
> out of the current state where userspace acting at the wrong time can
> fully crash the system.
> * I think the more proper fix (which requires quite a bit more
>   changes) is to power stuff on while an AUX transfer is
>   happening. This is like the solution we did for ti-sn65dsi86. This
>   might be required for us to move to populating the panel via the
>   DP-AUX bus.
> * Another fix considered was to dynamically register / unregister. I
>   tried that at <https://crrev.com/c/3169431/3> but it got
>   ugly. Currently there's a bug where the pm_runtime() state isn't
>   tracked properly and that causes us to just keep registering more
>   and more.
> 
> Signed-off-by: Douglas Anderson <dianders@chromium.org>
> ---
Reviewed-by: Kuogee Hsieh <quic_khsieh@quicinc.com>
> 
>  drivers/gpu/drm/msm/dp/dp_aux.c | 17 +++++++++++++++++
>  1 file changed, 17 insertions(+)
> 
> diff --git a/drivers/gpu/drm/msm/dp/dp_aux.c 
> b/drivers/gpu/drm/msm/dp/dp_aux.c
> index eb40d8413bca..6d36f63c3338 100644
> --- a/drivers/gpu/drm/msm/dp/dp_aux.c
> +++ b/drivers/gpu/drm/msm/dp/dp_aux.c
> @@ -33,6 +33,7 @@ struct dp_aux_private {
>  	bool read;
>  	bool no_send_addr;
>  	bool no_send_stop;
> +	bool initted;
>  	u32 offset;
>  	u32 segment;
> 
> @@ -331,6 +332,10 @@ static ssize_t dp_aux_transfer(struct drm_dp_aux 
> *dp_aux,
>  	}
> 
>  	mutex_lock(&aux->mutex);
> +	if (!aux->initted) {
> +		ret = -EIO;
> +		goto exit;
> +	}
> 
>  	dp_aux_update_offset_and_segment(aux, msg);
>  	dp_aux_transfer_helper(aux, msg, true);
> @@ -380,6 +385,8 @@ static ssize_t dp_aux_transfer(struct drm_dp_aux 
> *dp_aux,
>  	}
> 
>  	aux->cmd_busy = false;
> +
> +exit:
>  	mutex_unlock(&aux->mutex);
> 
>  	return ret;
> @@ -431,8 +438,13 @@ void dp_aux_init(struct drm_dp_aux *dp_aux)
> 
>  	aux = container_of(dp_aux, struct dp_aux_private, dp_aux);
> 
> +	mutex_lock(&aux->mutex);
> +
>  	dp_catalog_aux_enable(aux->catalog, true);
>  	aux->retry_cnt = 0;
> +	aux->initted = true;
> +
> +	mutex_unlock(&aux->mutex);
>  }
> 
>  void dp_aux_deinit(struct drm_dp_aux *dp_aux)
> @@ -441,7 +453,12 @@ void dp_aux_deinit(struct drm_dp_aux *dp_aux)
> 
>  	aux = container_of(dp_aux, struct dp_aux_private, dp_aux);
> 
> +	mutex_lock(&aux->mutex);
> +
> +	aux->initted = false;
>  	dp_catalog_aux_enable(aux->catalog, false);
> +
> +	mutex_unlock(&aux->mutex);
>  }
> 
>  int dp_aux_register(struct drm_dp_aux *dp_aux)

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [Freedreno] [PATCH] drm/msm/dp: Avoid unpowered AUX xfers that caused crashes
  2021-11-09 18:04 [PATCH] drm/msm/dp: Avoid unpowered AUX xfers that caused crashes Douglas Anderson
  2021-11-09 22:46 ` khsieh
@ 2021-11-10  1:43 ` Abhinav Kumar
  1 sibling, 0 replies; 3+ messages in thread
From: Abhinav Kumar @ 2021-11-10  1:43 UTC (permalink / raw)
  To: Douglas Anderson, Rob Clark
  Cc: Sean Paul, David Airlie, linux-arm-msm, linux-kernel, dri-devel,
	Stephen Boyd, Kuogee Hsieh, freedreno

Hi Doug

On 11/9/2021 10:04 AM, Douglas Anderson wrote:
> If you happened to try to access `/dev/drm_dp_aux` devices provided by
> the MSM DP AUX driver too early at bootup you could go boom. Let's
> avoid that by only allowing AUX transfers when the controller is
> powered up.
> 
> Specifically the crash that was seen (on Chrome OS 5.4 tree with
> relevant backports):
>    Kernel panic - not syncing: Asynchronous SError Interrupt
>    CPU: 0 PID: 3131 Comm: fwupd Not tainted 5.4.144-16620-g28af11b73efb #1
>    Hardware name: Google Lazor (rev3+) with KB Backlight (DT)
>    Call trace:
>     dump_backtrace+0x0/0x14c
>     show_stack+0x20/0x2c
>     dump_stack+0xac/0x124
>     panic+0x150/0x390
>     nmi_panic+0x80/0x94
>     arm64_serror_panic+0x78/0x84
>     do_serror+0x0/0x118
>     do_serror+0xa4/0x118
>     el1_error+0xbc/0x160
>     dp_catalog_aux_write_data+0x1c/0x3c
>     dp_aux_cmd_fifo_tx+0xf0/0x1b0
>     dp_aux_transfer+0x1b0/0x2bc
>     drm_dp_dpcd_access+0x8c/0x11c
>     drm_dp_dpcd_read+0x64/0x10c
>     auxdev_read_iter+0xd4/0x1c4
> 
> I did a little bit of tracing and found that:
> * We register the AUX device very early at bootup.
> * Power isn't actually turned on for my system until
>    hpd_event_thread() -> dp_display_host_init() -> dp_power_init()
> * You can see that dp_power_init() calls dp_aux_init() which is where
>    we start allowing AUX channel requests to go through.
> 
> In general this patch is a bit of a bandaid but at least it gets us
> out of the current state where userspace acting at the wrong time can
> fully crash the system.
> * I think the more proper fix (which requires quite a bit more
>    changes) is to power stuff on while an AUX transfer is
>    happening. This is like the solution we did for ti-sn65dsi86. This
>    might be required for us to move to populating the panel via the
>    DP-AUX bus.
> * Another fix considered was to dynamically register / unregister. I
>    tried that at <https://crrev.com/c/3169431/3> but it got
>    ugly. Currently there's a bug where the pm_runtime() state isn't
>    tracked properly and that causes us to just keep registering more
>    and more.
> 
> Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
> ---
> 
>   drivers/gpu/drm/msm/dp/dp_aux.c | 17 +++++++++++++++++
>   1 file changed, 17 insertions(+)
> 
> diff --git a/drivers/gpu/drm/msm/dp/dp_aux.c b/drivers/gpu/drm/msm/dp/dp_aux.c
> index eb40d8413bca..6d36f63c3338 100644
> --- a/drivers/gpu/drm/msm/dp/dp_aux.c
> +++ b/drivers/gpu/drm/msm/dp/dp_aux.c
> @@ -33,6 +33,7 @@ struct dp_aux_private {
>   	bool read;
>   	bool no_send_addr;
>   	bool no_send_stop;
> +	bool initted;
>   	u32 offset;
>   	u32 segment;
>   
> @@ -331,6 +332,10 @@ static ssize_t dp_aux_transfer(struct drm_dp_aux *dp_aux,
>   	}
>   
>   	mutex_lock(&aux->mutex);
> +	if (!aux->initted) {
> +		ret = -EIO;
> +		goto exit;
> +	}
>   
>   	dp_aux_update_offset_and_segment(aux, msg);
>   	dp_aux_transfer_helper(aux, msg, true);
> @@ -380,6 +385,8 @@ static ssize_t dp_aux_transfer(struct drm_dp_aux *dp_aux,
>   	}
>   
>   	aux->cmd_busy = false;
> +
> +exit:
>   	mutex_unlock(&aux->mutex);
>   
>   	return ret;
> @@ -431,8 +438,13 @@ void dp_aux_init(struct drm_dp_aux *dp_aux)
>   
>   	aux = container_of(dp_aux, struct dp_aux_private, dp_aux);
>   
> +	mutex_lock(&aux->mutex);
> +
>   	dp_catalog_aux_enable(aux->catalog, true);
>   	aux->retry_cnt = 0;
> +	aux->initted = true;
> +
> +	mutex_unlock(&aux->mutex);
>   }
>   
>   void dp_aux_deinit(struct drm_dp_aux *dp_aux)
> @@ -441,7 +453,12 @@ void dp_aux_deinit(struct drm_dp_aux *dp_aux)
>   
>   	aux = container_of(dp_aux, struct dp_aux_private, dp_aux);
>   
> +	mutex_lock(&aux->mutex);
> +
> +	aux->initted = false;
>   	dp_catalog_aux_enable(aux->catalog, false);
> +
> +	mutex_unlock(&aux->mutex);
>   }
>   
>   int dp_aux_register(struct drm_dp_aux *dp_aux)
> 

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2021-11-10  1:43 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-09 18:04 [PATCH] drm/msm/dp: Avoid unpowered AUX xfers that caused crashes Douglas Anderson
2021-11-09 22:46 ` khsieh
2021-11-10  1:43 ` [Freedreno] " Abhinav Kumar

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).