All of lore.kernel.org
 help / color / mirror / Atom feed
* [Intel-gfx] [PATCH v2] ALSA: hda/i915 - avoid hung task timeout in i915 wait
@ 2022-03-08 17:27 ` Kai Vehmanen
  0 siblings, 0 replies; 17+ messages in thread
From: Kai Vehmanen @ 2022-03-08 17:27 UTC (permalink / raw)
  To: alsa-devel, tiwai; +Cc: intel-gfx, Lucas De Marchi, amadeuszx.slawinski

If kernel is built with hung task detection enabled and
CONFIG_DEFAULT_HUNG_TASK_TIMEOUT set to less than 60 seconds,
snd_hdac_i915_init() will trigger the hung task timeout in case i915 is
not available and taint the kernel.

Split the 60sec wait into a loop of smaller waits to avoid this.

Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Co-developed-by: Ramalingam C <ramalingam.c@intel.com>
Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
---
 sound/hda/hdac_i915.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

Changes V1->V2:
 - address local variable naming issue raised by Amadeusz
   and use Takashi's proposal

diff --git a/sound/hda/hdac_i915.c b/sound/hda/hdac_i915.c
index 454474ac5716..aa48bed7baf7 100644
--- a/sound/hda/hdac_i915.c
+++ b/sound/hda/hdac_i915.c
@@ -143,7 +143,7 @@ static bool i915_gfx_present(void)
 int snd_hdac_i915_init(struct hdac_bus *bus)
 {
 	struct drm_audio_component *acomp;
-	int err;
+	int err, i;
 
 	if (!i915_gfx_present())
 		return -ENODEV;
@@ -159,9 +159,11 @@ int snd_hdac_i915_init(struct hdac_bus *bus)
 	if (!acomp->ops) {
 		if (!IS_ENABLED(CONFIG_MODULES) ||
 		    !request_module("i915")) {
-			/* 60s timeout */
-			wait_for_completion_timeout(&acomp->master_bind_complete,
-						    msecs_to_jiffies(60 * 1000));
+			/* max 60s timeout */
+			for (i = 0; i < 60; i++)
+				if (wait_for_completion_timeout(&acomp->master_bind_complete,
+								msecs_to_jiffies(1000)))
+					break;
 		}
 	}
 	if (!acomp->ops) {

base-commit: fd7698cf0858f8c5e659b655109cd93c2f15cdd3
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v2] ALSA: hda/i915 - avoid hung task timeout in i915 wait
@ 2022-03-08 17:27 ` Kai Vehmanen
  0 siblings, 0 replies; 17+ messages in thread
From: Kai Vehmanen @ 2022-03-08 17:27 UTC (permalink / raw)
  To: alsa-devel, tiwai
  Cc: Ramalingam C, intel-gfx, Lucas De Marchi, kai.vehmanen,
	amadeuszx.slawinski

If kernel is built with hung task detection enabled and
CONFIG_DEFAULT_HUNG_TASK_TIMEOUT set to less than 60 seconds,
snd_hdac_i915_init() will trigger the hung task timeout in case i915 is
not available and taint the kernel.

Split the 60sec wait into a loop of smaller waits to avoid this.

Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Co-developed-by: Ramalingam C <ramalingam.c@intel.com>
Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
---
 sound/hda/hdac_i915.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

Changes V1->V2:
 - address local variable naming issue raised by Amadeusz
   and use Takashi's proposal

diff --git a/sound/hda/hdac_i915.c b/sound/hda/hdac_i915.c
index 454474ac5716..aa48bed7baf7 100644
--- a/sound/hda/hdac_i915.c
+++ b/sound/hda/hdac_i915.c
@@ -143,7 +143,7 @@ static bool i915_gfx_present(void)
 int snd_hdac_i915_init(struct hdac_bus *bus)
 {
 	struct drm_audio_component *acomp;
-	int err;
+	int err, i;
 
 	if (!i915_gfx_present())
 		return -ENODEV;
@@ -159,9 +159,11 @@ int snd_hdac_i915_init(struct hdac_bus *bus)
 	if (!acomp->ops) {
 		if (!IS_ENABLED(CONFIG_MODULES) ||
 		    !request_module("i915")) {
-			/* 60s timeout */
-			wait_for_completion_timeout(&acomp->master_bind_complete,
-						    msecs_to_jiffies(60 * 1000));
+			/* max 60s timeout */
+			for (i = 0; i < 60; i++)
+				if (wait_for_completion_timeout(&acomp->master_bind_complete,
+								msecs_to_jiffies(1000)))
+					break;
 		}
 	}
 	if (!acomp->ops) {

base-commit: fd7698cf0858f8c5e659b655109cd93c2f15cdd3
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [Intel-gfx] ✗ Fi.CI.BUILD: failure for ALSA: hda/i915 - avoid hung task timeout in i915 wait (rev2)
  2022-03-08 17:27 ` Kai Vehmanen
  (?)
@ 2022-03-09  2:39 ` Patchwork
  -1 siblings, 0 replies; 17+ messages in thread
From: Patchwork @ 2022-03-09  2:39 UTC (permalink / raw)
  To: Kai Vehmanen; +Cc: intel-gfx

== Series Details ==

Series: ALSA: hda/i915 - avoid hung task timeout in i915 wait (rev2)
URL   : https://patchwork.freedesktop.org/series/101156/
State : failure

== Summary ==

Applying: ALSA: hda/i915 - avoid hung task timeout in i915 wait
Using index info to reconstruct a base tree...
M	sound/hda/hdac_i915.c
Falling back to patching base and 3-way merge...
Auto-merging sound/hda/hdac_i915.c
CONFLICT (content): Merge conflict in sound/hda/hdac_i915.c
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0001 ALSA: hda/i915 - avoid hung task timeout in i915 wait
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Intel-gfx] [PATCH v2] ALSA: hda/i915 - avoid hung task timeout in i915 wait
  2022-03-08 17:27 ` Kai Vehmanen
  (?)
  (?)
@ 2022-03-09  8:36 ` Tvrtko Ursulin
  2022-03-09  8:39     ` Kai Vehmanen
  2022-03-09  8:55     ` Takashi Iwai
  -1 siblings, 2 replies; 17+ messages in thread
From: Tvrtko Ursulin @ 2022-03-09  8:36 UTC (permalink / raw)
  To: Kai Vehmanen, alsa-devel, tiwai
  Cc: intel-gfx, Lucas De Marchi, amadeuszx.slawinski


On 08/03/2022 17:27, Kai Vehmanen wrote:
> If kernel is built with hung task detection enabled and
> CONFIG_DEFAULT_HUNG_TASK_TIMEOUT set to less than 60 seconds,
> snd_hdac_i915_init() will trigger the hung task timeout in case i915 is
> not available and taint the kernel.
> 
> Split the 60sec wait into a loop of smaller waits to avoid this.
> 
> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
> Co-developed-by: Ramalingam C <ramalingam.c@intel.com>
> Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
> Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
> ---
>   sound/hda/hdac_i915.c | 10 ++++++----
>   1 file changed, 6 insertions(+), 4 deletions(-)
> 
> Changes V1->V2:
>   - address local variable naming issue raised by Amadeusz
>     and use Takashi's proposal
> 
> diff --git a/sound/hda/hdac_i915.c b/sound/hda/hdac_i915.c
> index 454474ac5716..aa48bed7baf7 100644
> --- a/sound/hda/hdac_i915.c
> +++ b/sound/hda/hdac_i915.c
> @@ -143,7 +143,7 @@ static bool i915_gfx_present(void)
>   int snd_hdac_i915_init(struct hdac_bus *bus)
>   {
>   	struct drm_audio_component *acomp;
> -	int err;
> +	int err, i;
>   
>   	if (!i915_gfx_present())
>   		return -ENODEV;
> @@ -159,9 +159,11 @@ int snd_hdac_i915_init(struct hdac_bus *bus)
>   	if (!acomp->ops) {
>   		if (!IS_ENABLED(CONFIG_MODULES) ||
>   		    !request_module("i915")) {
> -			/* 60s timeout */

Where does this 60s come from and why is the fix to work around 
DEFAULT_HUNG_TASK_TIMEOUT in a hacky way deemed okay? For instance would 
limiting the wait here to whatever the kconfig is set to be an option?

Regards,

Tvrtko

> -			wait_for_completion_timeout(&acomp->master_bind_complete,
> -						    msecs_to_jiffies(60 * 1000));
> +			/* max 60s timeout */
> +			for (i = 0; i < 60; i++)
> +				if (wait_for_completion_timeout(&acomp->master_bind_complete,
> +								msecs_to_jiffies(1000)))
> +					break;
>   		}
>   	}
>   	if (!acomp->ops) {
> 
> base-commit: fd7698cf0858f8c5e659b655109cd93c2f15cdd3

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Intel-gfx] [PATCH v2] ALSA: hda/i915 - avoid hung task timeout in i915 wait
  2022-03-09  8:36 ` [Intel-gfx] [PATCH v2] ALSA: hda/i915 - avoid hung task timeout in i915 wait Tvrtko Ursulin
@ 2022-03-09  8:39     ` Kai Vehmanen
  2022-03-09  8:55     ` Takashi Iwai
  1 sibling, 0 replies; 17+ messages in thread
From: Kai Vehmanen @ 2022-03-09  8:39 UTC (permalink / raw)
  To: Tvrtko Ursulin
  Cc: alsa-devel, Paul Menzel, Takashi Iwai, intel-gfx,
	Lucas De Marchi, amadeuszx.slawinski

Hi,

On Wed, 9 Mar 2022, Tvrtko Ursulin wrote:

> > -			/* 60s timeout */
> 
> Where does this 60s come from and why is the fix to work around
> DEFAULT_HUNG_TASK_TIMEOUT in a hacky way deemed okay? For instance would
> limiting the wait here to whatever the kconfig is set to be an option?

this was discussed in
https://lists.freedesktop.org/archives/intel-gfx/2022-February/290821.html
... and that thread concluded it's cleaner to split the wait than try
to figure out hung-task configuration from middle of audio driver.

The 60sec timeout comes from 2019 patch "ALSA: hda: Extend i915 component 
bind timeout" to fix an issue reported by Paul Menzel (cc'ed).

This patch keeps the timeout intact.
Br, Kai

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Intel-gfx] [PATCH v2] ALSA: hda/i915 - avoid hung task timeout in i915 wait
@ 2022-03-09  8:39     ` Kai Vehmanen
  0 siblings, 0 replies; 17+ messages in thread
From: Kai Vehmanen @ 2022-03-09  8:39 UTC (permalink / raw)
  To: Tvrtko Ursulin
  Cc: alsa-devel, Paul Menzel, Kai Vehmanen, Takashi Iwai, intel-gfx,
	Lucas De Marchi, amadeuszx.slawinski

Hi,

On Wed, 9 Mar 2022, Tvrtko Ursulin wrote:

> > -			/* 60s timeout */
> 
> Where does this 60s come from and why is the fix to work around
> DEFAULT_HUNG_TASK_TIMEOUT in a hacky way deemed okay? For instance would
> limiting the wait here to whatever the kconfig is set to be an option?

this was discussed in
https://lists.freedesktop.org/archives/intel-gfx/2022-February/290821.html
... and that thread concluded it's cleaner to split the wait than try
to figure out hung-task configuration from middle of audio driver.

The 60sec timeout comes from 2019 patch "ALSA: hda: Extend i915 component 
bind timeout" to fix an issue reported by Paul Menzel (cc'ed).

This patch keeps the timeout intact.
Br, Kai

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Intel-gfx] [PATCH v2] ALSA: hda/i915 - avoid hung task timeout in i915 wait
  2022-03-09  8:36 ` [Intel-gfx] [PATCH v2] ALSA: hda/i915 - avoid hung task timeout in i915 wait Tvrtko Ursulin
@ 2022-03-09  8:55     ` Takashi Iwai
  2022-03-09  8:55     ` Takashi Iwai
  1 sibling, 0 replies; 17+ messages in thread
From: Takashi Iwai @ 2022-03-09  8:55 UTC (permalink / raw)
  To: Tvrtko Ursulin
  Cc: alsa-devel, Lucas De Marchi, intel-gfx, amadeuszx.slawinski

On Wed, 09 Mar 2022 09:36:54 +0100,
Tvrtko Ursulin wrote:
> 
> 
> On 08/03/2022 17:27, Kai Vehmanen wrote:
> > If kernel is built with hung task detection enabled and
> > CONFIG_DEFAULT_HUNG_TASK_TIMEOUT set to less than 60 seconds,
> > snd_hdac_i915_init() will trigger the hung task timeout in case i915 is
> > not available and taint the kernel.
> >
> > Split the 60sec wait into a loop of smaller waits to avoid this.
> >
> > Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
> > Co-developed-by: Ramalingam C <ramalingam.c@intel.com>
> > Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
> > Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
> > ---
> >   sound/hda/hdac_i915.c | 10 ++++++----
> >   1 file changed, 6 insertions(+), 4 deletions(-)
> >
> > Changes V1->V2:
> >   - address local variable naming issue raised by Amadeusz
> >     and use Takashi's proposal
> >
> > diff --git a/sound/hda/hdac_i915.c b/sound/hda/hdac_i915.c
> > index 454474ac5716..aa48bed7baf7 100644
> > --- a/sound/hda/hdac_i915.c
> > +++ b/sound/hda/hdac_i915.c
> > @@ -143,7 +143,7 @@ static bool i915_gfx_present(void)
> >   int snd_hdac_i915_init(struct hdac_bus *bus)
> >   {
> >   	struct drm_audio_component *acomp;
> > -	int err;
> > +	int err, i;
> >     	if (!i915_gfx_present())
> >   		return -ENODEV;
> > @@ -159,9 +159,11 @@ int snd_hdac_i915_init(struct hdac_bus *bus)
> >   	if (!acomp->ops) {
> >   		if (!IS_ENABLED(CONFIG_MODULES) ||
> >   		    !request_module("i915")) {
> > -			/* 60s timeout */
> 
> Where does this 60s come from and why is the fix to work around
> DEFAULT_HUNG_TASK_TIMEOUT in a hacky way deemed okay?

The 60s timeout comes from the fact that the binding with i915 *might*
be mandatory for HD-audio driver on some platforms, while the binding
couldn't be achieved depending on the dynamic configuration change or
any other reasons, so we don't want to block forver.  And, basically
the hung check is false-positive, and if there is a better way to mark
to skip the hung check, we'd take it. But currently this seems to be
the easiest workaround for avoiding the false-positive checks.

> For instance
> would limiting the wait here to whatever the kconfig is set to be an
> option?

No, the hunk task timeout can be changed dynamically via procfs, hence
the fixed Kconfig won't help at all.


Takashi

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Intel-gfx] [PATCH v2] ALSA: hda/i915 - avoid hung task timeout in i915 wait
@ 2022-03-09  8:55     ` Takashi Iwai
  0 siblings, 0 replies; 17+ messages in thread
From: Takashi Iwai @ 2022-03-09  8:55 UTC (permalink / raw)
  To: Tvrtko Ursulin
  Cc: alsa-devel, Lucas De Marchi, intel-gfx, Kai Vehmanen,
	amadeuszx.slawinski

On Wed, 09 Mar 2022 09:36:54 +0100,
Tvrtko Ursulin wrote:
> 
> 
> On 08/03/2022 17:27, Kai Vehmanen wrote:
> > If kernel is built with hung task detection enabled and
> > CONFIG_DEFAULT_HUNG_TASK_TIMEOUT set to less than 60 seconds,
> > snd_hdac_i915_init() will trigger the hung task timeout in case i915 is
> > not available and taint the kernel.
> >
> > Split the 60sec wait into a loop of smaller waits to avoid this.
> >
> > Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
> > Co-developed-by: Ramalingam C <ramalingam.c@intel.com>
> > Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
> > Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
> > ---
> >   sound/hda/hdac_i915.c | 10 ++++++----
> >   1 file changed, 6 insertions(+), 4 deletions(-)
> >
> > Changes V1->V2:
> >   - address local variable naming issue raised by Amadeusz
> >     and use Takashi's proposal
> >
> > diff --git a/sound/hda/hdac_i915.c b/sound/hda/hdac_i915.c
> > index 454474ac5716..aa48bed7baf7 100644
> > --- a/sound/hda/hdac_i915.c
> > +++ b/sound/hda/hdac_i915.c
> > @@ -143,7 +143,7 @@ static bool i915_gfx_present(void)
> >   int snd_hdac_i915_init(struct hdac_bus *bus)
> >   {
> >   	struct drm_audio_component *acomp;
> > -	int err;
> > +	int err, i;
> >     	if (!i915_gfx_present())
> >   		return -ENODEV;
> > @@ -159,9 +159,11 @@ int snd_hdac_i915_init(struct hdac_bus *bus)
> >   	if (!acomp->ops) {
> >   		if (!IS_ENABLED(CONFIG_MODULES) ||
> >   		    !request_module("i915")) {
> > -			/* 60s timeout */
> 
> Where does this 60s come from and why is the fix to work around
> DEFAULT_HUNG_TASK_TIMEOUT in a hacky way deemed okay?

The 60s timeout comes from the fact that the binding with i915 *might*
be mandatory for HD-audio driver on some platforms, while the binding
couldn't be achieved depending on the dynamic configuration change or
any other reasons, so we don't want to block forver.  And, basically
the hung check is false-positive, and if there is a better way to mark
to skip the hung check, we'd take it. But currently this seems to be
the easiest workaround for avoiding the false-positive checks.

> For instance
> would limiting the wait here to whatever the kconfig is set to be an
> option?

No, the hunk task timeout can be changed dynamically via procfs, hence
the fixed Kconfig won't help at all.


Takashi

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Intel-gfx] [PATCH v2] ALSA: hda/i915 - avoid hung task timeout in i915 wait
  2022-03-09  8:39     ` Kai Vehmanen
  (?)
@ 2022-03-09  9:02     ` Tvrtko Ursulin
  2022-03-09  9:23         ` Takashi Iwai
  -1 siblings, 1 reply; 17+ messages in thread
From: Tvrtko Ursulin @ 2022-03-09  9:02 UTC (permalink / raw)
  To: Kai Vehmanen
  Cc: alsa-devel, Paul Menzel, Takashi Iwai, intel-gfx,
	Lucas De Marchi, amadeuszx.slawinski


On 09/03/2022 08:39, Kai Vehmanen wrote:
> Hi,
> 
> On Wed, 9 Mar 2022, Tvrtko Ursulin wrote:
> 
>>> -			/* 60s timeout */
>>
>> Where does this 60s come from and why is the fix to work around
>> DEFAULT_HUNG_TASK_TIMEOUT in a hacky way deemed okay? For instance would
>> limiting the wait here to whatever the kconfig is set to be an option?
> 
> this was discussed in
> https://lists.freedesktop.org/archives/intel-gfx/2022-February/290821.html
> ... and that thread concluded it's cleaner to split the wait than try
> to figure out hung-task configuration from middle of audio driver.
> 
> The 60sec timeout comes from 2019 patch "ALSA: hda: Extend i915 component
> bind timeout" to fix an issue reported by Paul Menzel (cc'ed).
> 
> This patch keeps the timeout intact.

I did not spot discussion touching on the point I raised.

How about not fight the hung task detector but mark your wait context as 
"I really know what I'm doing - not stuck trust me". Maybe using 
wait_for_completion_killable_timeout would do it since 
snd_hdac_i915_init is allowed to fail with an error already?

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Intel-gfx] [PATCH v2] ALSA: hda/i915 - avoid hung task timeout in i915 wait
  2022-03-09  9:02     ` Tvrtko Ursulin
@ 2022-03-09  9:23         ` Takashi Iwai
  0 siblings, 0 replies; 17+ messages in thread
From: Takashi Iwai @ 2022-03-09  9:23 UTC (permalink / raw)
  To: Tvrtko Ursulin
  Cc: alsa-devel, Paul Menzel, intel-gfx, Lucas De Marchi, amadeuszx.slawinski

On Wed, 09 Mar 2022 10:02:13 +0100,
Tvrtko Ursulin wrote:
> 
> 
> On 09/03/2022 08:39, Kai Vehmanen wrote:
> > Hi,
> >
> > On Wed, 9 Mar 2022, Tvrtko Ursulin wrote:
> >
> >>> -			/* 60s timeout */
> >>
> >> Where does this 60s come from and why is the fix to work around
> >> DEFAULT_HUNG_TASK_TIMEOUT in a hacky way deemed okay? For instance would
> >> limiting the wait here to whatever the kconfig is set to be an option?
> >
> > this was discussed in
> > https://lists.freedesktop.org/archives/intel-gfx/2022-February/290821.html
> > ... and that thread concluded it's cleaner to split the wait than try
> > to figure out hung-task configuration from middle of audio driver.
> >
> > The 60sec timeout comes from 2019 patch "ALSA: hda: Extend i915 component
> > bind timeout" to fix an issue reported by Paul Menzel (cc'ed).
> >
> > This patch keeps the timeout intact.
> 
> I did not spot discussion touching on the point I raised.
> 
> How about not fight the hung task detector but mark your wait context
> as "I really know what I'm doing - not stuck trust me".

The question is how often this problem hits.  Basically it's a very
corner case, and I even think we may leave as is; that's a matter of
configuration, and lowering such a bar should expect some
side-effect. OTOH, if the problem happens in many cases, it's
beneficial to fix in the core part, indeed.

> Maybe using
> wait_for_completion_killable_timeout would do it since
> snd_hdac_i915_init is allowed to fail with an error already?

It makes it killable -- which is a complete behavior change.


Takashi

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Intel-gfx] [PATCH v2] ALSA: hda/i915 - avoid hung task timeout in i915 wait
@ 2022-03-09  9:23         ` Takashi Iwai
  0 siblings, 0 replies; 17+ messages in thread
From: Takashi Iwai @ 2022-03-09  9:23 UTC (permalink / raw)
  To: Tvrtko Ursulin
  Cc: alsa-devel, Paul Menzel, Kai Vehmanen, intel-gfx,
	Lucas De Marchi, amadeuszx.slawinski

On Wed, 09 Mar 2022 10:02:13 +0100,
Tvrtko Ursulin wrote:
> 
> 
> On 09/03/2022 08:39, Kai Vehmanen wrote:
> > Hi,
> >
> > On Wed, 9 Mar 2022, Tvrtko Ursulin wrote:
> >
> >>> -			/* 60s timeout */
> >>
> >> Where does this 60s come from and why is the fix to work around
> >> DEFAULT_HUNG_TASK_TIMEOUT in a hacky way deemed okay? For instance would
> >> limiting the wait here to whatever the kconfig is set to be an option?
> >
> > this was discussed in
> > https://lists.freedesktop.org/archives/intel-gfx/2022-February/290821.html
> > ... and that thread concluded it's cleaner to split the wait than try
> > to figure out hung-task configuration from middle of audio driver.
> >
> > The 60sec timeout comes from 2019 patch "ALSA: hda: Extend i915 component
> > bind timeout" to fix an issue reported by Paul Menzel (cc'ed).
> >
> > This patch keeps the timeout intact.
> 
> I did not spot discussion touching on the point I raised.
> 
> How about not fight the hung task detector but mark your wait context
> as "I really know what I'm doing - not stuck trust me".

The question is how often this problem hits.  Basically it's a very
corner case, and I even think we may leave as is; that's a matter of
configuration, and lowering such a bar should expect some
side-effect. OTOH, if the problem happens in many cases, it's
beneficial to fix in the core part, indeed.

> Maybe using
> wait_for_completion_killable_timeout would do it since
> snd_hdac_i915_init is allowed to fail with an error already?

It makes it killable -- which is a complete behavior change.


Takashi

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Intel-gfx] [PATCH v2] ALSA: hda/i915 - avoid hung task timeout in i915 wait
  2022-03-09  9:23         ` Takashi Iwai
@ 2022-03-09  9:48           ` Tvrtko Ursulin
  -1 siblings, 0 replies; 17+ messages in thread
From: Tvrtko Ursulin @ 2022-03-09  9:48 UTC (permalink / raw)
  To: Takashi Iwai
  Cc: alsa-devel, Paul Menzel, intel-gfx, Lucas De Marchi, amadeuszx.slawinski


On 09/03/2022 09:23, Takashi Iwai wrote:
> On Wed, 09 Mar 2022 10:02:13 +0100,
> Tvrtko Ursulin wrote:
>>
>>
>> On 09/03/2022 08:39, Kai Vehmanen wrote:
>>> Hi,
>>>
>>> On Wed, 9 Mar 2022, Tvrtko Ursulin wrote:
>>>
>>>>> -			/* 60s timeout */
>>>>
>>>> Where does this 60s come from and why is the fix to work around
>>>> DEFAULT_HUNG_TASK_TIMEOUT in a hacky way deemed okay? For instance would
>>>> limiting the wait here to whatever the kconfig is set to be an option?
>>>
>>> this was discussed in
>>> https://lists.freedesktop.org/archives/intel-gfx/2022-February/290821.html
>>> ... and that thread concluded it's cleaner to split the wait than try
>>> to figure out hung-task configuration from middle of audio driver.
>>>
>>> The 60sec timeout comes from 2019 patch "ALSA: hda: Extend i915 component
>>> bind timeout" to fix an issue reported by Paul Menzel (cc'ed).
>>>
>>> This patch keeps the timeout intact.
>>
>> I did not spot discussion touching on the point I raised.
>>
>> How about not fight the hung task detector but mark your wait context
>> as "I really know what I'm doing - not stuck trust me".
> 
> The question is how often this problem hits.  Basically it's a very
> corner case, and I even think we may leave as is; that's a matter of
> configuration, and lowering such a bar should expect some
> side-effect. OTOH, if the problem happens in many cases, it's
> beneficial to fix in the core part, indeed.

Yes argument you raise can be made I agree.

>> Maybe using
>> wait_for_completion_killable_timeout would do it since
>> snd_hdac_i915_init is allowed to fail with an error already?
> 
> It makes it killable -- which is a complete behavior change.

Complete behaviour change how? Isn't this something ran on probe so 
likelihood of anyone sending SIGKILL to the modprobe process is only the 
init process? And in that case what is the fundamental difference in 
init giving up before the internal 60s in HDA driver does? I don't see a 
difference. Either party decided to abort the wait and code can just 
unwind and propagate the different error codes.

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Intel-gfx] [PATCH v2] ALSA: hda/i915 - avoid hung task timeout in i915 wait
@ 2022-03-09  9:48           ` Tvrtko Ursulin
  0 siblings, 0 replies; 17+ messages in thread
From: Tvrtko Ursulin @ 2022-03-09  9:48 UTC (permalink / raw)
  To: Takashi Iwai
  Cc: alsa-devel, Paul Menzel, Kai Vehmanen, intel-gfx,
	Lucas De Marchi, amadeuszx.slawinski


On 09/03/2022 09:23, Takashi Iwai wrote:
> On Wed, 09 Mar 2022 10:02:13 +0100,
> Tvrtko Ursulin wrote:
>>
>>
>> On 09/03/2022 08:39, Kai Vehmanen wrote:
>>> Hi,
>>>
>>> On Wed, 9 Mar 2022, Tvrtko Ursulin wrote:
>>>
>>>>> -			/* 60s timeout */
>>>>
>>>> Where does this 60s come from and why is the fix to work around
>>>> DEFAULT_HUNG_TASK_TIMEOUT in a hacky way deemed okay? For instance would
>>>> limiting the wait here to whatever the kconfig is set to be an option?
>>>
>>> this was discussed in
>>> https://lists.freedesktop.org/archives/intel-gfx/2022-February/290821.html
>>> ... and that thread concluded it's cleaner to split the wait than try
>>> to figure out hung-task configuration from middle of audio driver.
>>>
>>> The 60sec timeout comes from 2019 patch "ALSA: hda: Extend i915 component
>>> bind timeout" to fix an issue reported by Paul Menzel (cc'ed).
>>>
>>> This patch keeps the timeout intact.
>>
>> I did not spot discussion touching on the point I raised.
>>
>> How about not fight the hung task detector but mark your wait context
>> as "I really know what I'm doing - not stuck trust me".
> 
> The question is how often this problem hits.  Basically it's a very
> corner case, and I even think we may leave as is; that's a matter of
> configuration, and lowering such a bar should expect some
> side-effect. OTOH, if the problem happens in many cases, it's
> beneficial to fix in the core part, indeed.

Yes argument you raise can be made I agree.

>> Maybe using
>> wait_for_completion_killable_timeout would do it since
>> snd_hdac_i915_init is allowed to fail with an error already?
> 
> It makes it killable -- which is a complete behavior change.

Complete behaviour change how? Isn't this something ran on probe so 
likelihood of anyone sending SIGKILL to the modprobe process is only the 
init process? And in that case what is the fundamental difference in 
init giving up before the internal 60s in HDA driver does? I don't see a 
difference. Either party decided to abort the wait and code can just 
unwind and propagate the different error codes.

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Intel-gfx] [PATCH v2] ALSA: hda/i915 - avoid hung task timeout in i915 wait
  2022-03-09  9:48           ` Tvrtko Ursulin
@ 2022-03-09  9:55             ` Takashi Iwai
  -1 siblings, 0 replies; 17+ messages in thread
From: Takashi Iwai @ 2022-03-09  9:55 UTC (permalink / raw)
  To: Tvrtko Ursulin
  Cc: alsa-devel, Paul Menzel, intel-gfx, Lucas De Marchi, amadeuszx.slawinski

On Wed, 09 Mar 2022 10:48:49 +0100,
Tvrtko Ursulin wrote:
> 
> 
> On 09/03/2022 09:23, Takashi Iwai wrote:
> > On Wed, 09 Mar 2022 10:02:13 +0100,
> > Tvrtko Ursulin wrote:
> >>
> >>
> >> On 09/03/2022 08:39, Kai Vehmanen wrote:
> >>> Hi,
> >>>
> >>> On Wed, 9 Mar 2022, Tvrtko Ursulin wrote:
> >>>
> >>>>> -			/* 60s timeout */
> >>>>
> >>>> Where does this 60s come from and why is the fix to work around
> >>>> DEFAULT_HUNG_TASK_TIMEOUT in a hacky way deemed okay? For instance would
> >>>> limiting the wait here to whatever the kconfig is set to be an option?
> >>>
> >>> this was discussed in
> >>> https://lists.freedesktop.org/archives/intel-gfx/2022-February/290821.html
> >>> ... and that thread concluded it's cleaner to split the wait than try
> >>> to figure out hung-task configuration from middle of audio driver.
> >>>
> >>> The 60sec timeout comes from 2019 patch "ALSA: hda: Extend i915 component
> >>> bind timeout" to fix an issue reported by Paul Menzel (cc'ed).
> >>>
> >>> This patch keeps the timeout intact.
> >>
> >> I did not spot discussion touching on the point I raised.
> >>
> >> How about not fight the hung task detector but mark your wait context
> >> as "I really know what I'm doing - not stuck trust me".
> >
> > The question is how often this problem hits.  Basically it's a very
> > corner case, and I even think we may leave as is; that's a matter of
> > configuration, and lowering such a bar should expect some
> > side-effect. OTOH, if the problem happens in many cases, it's
> > beneficial to fix in the core part, indeed.
> 
> Yes argument you raise can be made I agree.
> 
> >> Maybe using
> >> wait_for_completion_killable_timeout would do it since
> >> snd_hdac_i915_init is allowed to fail with an error already?
> >
> > It makes it killable -- which is a complete behavior change.
> 
> Complete behaviour change how? Isn't this something ran on probe so
> likelihood of anyone sending SIGKILL to the modprobe process is only
> the init process? And in that case what is the fundamental difference
> in init giving up before the internal 60s in HDA driver does? I don't
> see a difference. Either party decided to abort the wait and code can
> just unwind and propagate the different error codes.

The point is that it does change the actual behavior, and changing the
actual behavior just for working around the corner case like the above
wouldn't be justified without the proper evaluation.

That said, if this behavior change is intentional and even desired,
that's a way to go.


Takashi

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Intel-gfx] [PATCH v2] ALSA: hda/i915 - avoid hung task timeout in i915 wait
@ 2022-03-09  9:55             ` Takashi Iwai
  0 siblings, 0 replies; 17+ messages in thread
From: Takashi Iwai @ 2022-03-09  9:55 UTC (permalink / raw)
  To: Tvrtko Ursulin
  Cc: alsa-devel, Paul Menzel, Kai Vehmanen, intel-gfx,
	Lucas De Marchi, amadeuszx.slawinski

On Wed, 09 Mar 2022 10:48:49 +0100,
Tvrtko Ursulin wrote:
> 
> 
> On 09/03/2022 09:23, Takashi Iwai wrote:
> > On Wed, 09 Mar 2022 10:02:13 +0100,
> > Tvrtko Ursulin wrote:
> >>
> >>
> >> On 09/03/2022 08:39, Kai Vehmanen wrote:
> >>> Hi,
> >>>
> >>> On Wed, 9 Mar 2022, Tvrtko Ursulin wrote:
> >>>
> >>>>> -			/* 60s timeout */
> >>>>
> >>>> Where does this 60s come from and why is the fix to work around
> >>>> DEFAULT_HUNG_TASK_TIMEOUT in a hacky way deemed okay? For instance would
> >>>> limiting the wait here to whatever the kconfig is set to be an option?
> >>>
> >>> this was discussed in
> >>> https://lists.freedesktop.org/archives/intel-gfx/2022-February/290821.html
> >>> ... and that thread concluded it's cleaner to split the wait than try
> >>> to figure out hung-task configuration from middle of audio driver.
> >>>
> >>> The 60sec timeout comes from 2019 patch "ALSA: hda: Extend i915 component
> >>> bind timeout" to fix an issue reported by Paul Menzel (cc'ed).
> >>>
> >>> This patch keeps the timeout intact.
> >>
> >> I did not spot discussion touching on the point I raised.
> >>
> >> How about not fight the hung task detector but mark your wait context
> >> as "I really know what I'm doing - not stuck trust me".
> >
> > The question is how often this problem hits.  Basically it's a very
> > corner case, and I even think we may leave as is; that's a matter of
> > configuration, and lowering such a bar should expect some
> > side-effect. OTOH, if the problem happens in many cases, it's
> > beneficial to fix in the core part, indeed.
> 
> Yes argument you raise can be made I agree.
> 
> >> Maybe using
> >> wait_for_completion_killable_timeout would do it since
> >> snd_hdac_i915_init is allowed to fail with an error already?
> >
> > It makes it killable -- which is a complete behavior change.
> 
> Complete behaviour change how? Isn't this something ran on probe so
> likelihood of anyone sending SIGKILL to the modprobe process is only
> the init process? And in that case what is the fundamental difference
> in init giving up before the internal 60s in HDA driver does? I don't
> see a difference. Either party decided to abort the wait and code can
> just unwind and propagate the different error codes.

The point is that it does change the actual behavior, and changing the
actual behavior just for working around the corner case like the above
wouldn't be justified without the proper evaluation.

That said, if this behavior change is intentional and even desired,
that's a way to go.


Takashi

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Intel-gfx] [PATCH v2] ALSA: hda/i915 - avoid hung task timeout in i915 wait
  2022-03-09  9:55             ` Takashi Iwai
@ 2022-03-09 13:05               ` Kai Vehmanen
  -1 siblings, 0 replies; 17+ messages in thread
From: Kai Vehmanen @ 2022-03-09 13:05 UTC (permalink / raw)
  To: Takashi Iwai
  Cc: alsa-devel, Paul Menzel, intel-gfx, Lucas De Marchi, amadeuszx.slawinski

Hi,

On Wed, 9 Mar 2022, Takashi Iwai wrote:

>> Takashi Iwai wrote:
>>> The question is how often this problem hits.  Basically it's a very
>>> corner case, and I even think we may leave as is; that's a matter of
>>> configuration, and lowering such a bar should expect some
>>> side-effect. OTOH, if the problem happens in many cases, it's
>>> beneficial to fix in the core part, indeed.

I'm basicly helping out the intel-gfx folks here. This is now happening 
systematically in the intel-gfx CI. The hung-task timeout is configured to 
30sec (in intel-gfx CI), and there's some new hw configs where this 
happens every time (I have a separate patch in progress [1] that tries 
to detect this case and skip the init, but this will require more time as there is 
risk of breaking existing configurations).

[1] 
https://lists.freedesktop.org/archives/intel-gfx-trybot/2022-February/128278.html

Tvrtko Ursulin wrote:
> Takashi Iwai wrote:
> > Complete behaviour change how? Isn't this something ran on probe so
> > likelihood of anyone sending SIGKILL to the modprobe process is only
> > the init process? And in that case what is the fundamental difference
>
[...]
> The point is that it does change the actual behavior, and changing the
> actual behavior just for working around the corner case like the above
> wouldn't be justified without the proper evaluation.
> 
> That said, if this behavior change is intentional and even desired,
> that's a way to go.

Let me try this out and test on a few configs (with and without the 
timeout occurring) and send a V3 if this seems ok.

Br, Kai

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Intel-gfx] [PATCH v2] ALSA: hda/i915 - avoid hung task timeout in i915 wait
@ 2022-03-09 13:05               ` Kai Vehmanen
  0 siblings, 0 replies; 17+ messages in thread
From: Kai Vehmanen @ 2022-03-09 13:05 UTC (permalink / raw)
  To: Takashi Iwai
  Cc: Tvrtko Ursulin, alsa-devel, Paul Menzel, Kai Vehmanen, intel-gfx,
	Lucas De Marchi, amadeuszx.slawinski

Hi,

On Wed, 9 Mar 2022, Takashi Iwai wrote:

>> Takashi Iwai wrote:
>>> The question is how often this problem hits.  Basically it's a very
>>> corner case, and I even think we may leave as is; that's a matter of
>>> configuration, and lowering such a bar should expect some
>>> side-effect. OTOH, if the problem happens in many cases, it's
>>> beneficial to fix in the core part, indeed.

I'm basicly helping out the intel-gfx folks here. This is now happening 
systematically in the intel-gfx CI. The hung-task timeout is configured to 
30sec (in intel-gfx CI), and there's some new hw configs where this 
happens every time (I have a separate patch in progress [1] that tries 
to detect this case and skip the init, but this will require more time as there is 
risk of breaking existing configurations).

[1] 
https://lists.freedesktop.org/archives/intel-gfx-trybot/2022-February/128278.html

Tvrtko Ursulin wrote:
> Takashi Iwai wrote:
> > Complete behaviour change how? Isn't this something ran on probe so
> > likelihood of anyone sending SIGKILL to the modprobe process is only
> > the init process? And in that case what is the fundamental difference
>
[...]
> The point is that it does change the actual behavior, and changing the
> actual behavior just for working around the corner case like the above
> wouldn't be justified without the proper evaluation.
> 
> That said, if this behavior change is intentional and even desired,
> that's a way to go.

Let me try this out and test on a few configs (with and without the 
timeout occurring) and send a V3 if this seems ok.

Br, Kai

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2022-03-09 13:15 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-03-08 17:27 [Intel-gfx] [PATCH v2] ALSA: hda/i915 - avoid hung task timeout in i915 wait Kai Vehmanen
2022-03-08 17:27 ` Kai Vehmanen
2022-03-09  2:39 ` [Intel-gfx] ✗ Fi.CI.BUILD: failure for ALSA: hda/i915 - avoid hung task timeout in i915 wait (rev2) Patchwork
2022-03-09  8:36 ` [Intel-gfx] [PATCH v2] ALSA: hda/i915 - avoid hung task timeout in i915 wait Tvrtko Ursulin
2022-03-09  8:39   ` Kai Vehmanen
2022-03-09  8:39     ` Kai Vehmanen
2022-03-09  9:02     ` Tvrtko Ursulin
2022-03-09  9:23       ` Takashi Iwai
2022-03-09  9:23         ` Takashi Iwai
2022-03-09  9:48         ` Tvrtko Ursulin
2022-03-09  9:48           ` Tvrtko Ursulin
2022-03-09  9:55           ` Takashi Iwai
2022-03-09  9:55             ` Takashi Iwai
2022-03-09 13:05             ` Kai Vehmanen
2022-03-09 13:05               ` Kai Vehmanen
2022-03-09  8:55   ` Takashi Iwai
2022-03-09  8:55     ` Takashi Iwai

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.