linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 0/2] ALSA: core: Fix deadlock when shutdown a frozen userspace
@ 2022-11-28  9:10 Ricardo Ribalda
  2022-11-28  9:10 ` [PATCH v3 1/2] freezer: Add processes_frozen() Ricardo Ribalda
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Ricardo Ribalda @ 2022-11-28  9:10 UTC (permalink / raw)
  To: Takashi Iwai, Len Brown, Pierre-Louis Bossart, Kai Vehmanen,
	Ranjani Sridharan, Mark Brown, Jaroslav Kysela, Pavel Machek,
	Rafael J. Wysocki
  Cc: alsa-devel, Joel Fernandes (Google),
	Ricardo Ribalda, linux-kernel, linux-pm

Since 83bfc7e793b5 ("ASoC: SOF: core: unregister clients and machine drivers in .shutdown")
we wait for userspace to close its fds.

But that will never occur with a frozen userspace (like during kexec()).

Lets detect the frozen userpace and act accordingly.

To: Jaroslav Kysela <perex@perex.cz>
To: Takashi Iwai <tiwai@suse.com>
To: "Rafael J. Wysocki" <rafael@kernel.org>
To: Pavel Machek <pavel@ucw.cz>
To: Len Brown <len.brown@intel.com>
To: Kai Vehmanen <kai.vehmanen@linux.intel.com>
To: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
To: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
To: Mark Brown <broonie@kernel.org>
Cc: alsa-devel@alsa-project.org
Cc: linux-kernel@vger.kernel.org
Cc: "Joel Fernandes (Google)" <joel@joelfernandes.org>
Cc: linux-pm@vger.kernel.org
Signed-off-by: Ricardo Ribalda <ribalda@chromium.org>
---
Changes in v3:
- Wrap pm_freezing in a function
- Link to v2: https://lore.kernel.org/r/20221127-snd-freeze-v2-0-d8a425ea9663@chromium.org

Changes in v2:
- Only use pm_freezing if CONFIG_FREEZER 
- Link to v1: https://lore.kernel.org/r/20221127-snd-freeze-v1-0-57461a366ec2@chromium.org

---
Ricardo Ribalda (2):
      freezer: Add processes_frozen()
      ALSA: core: Fix deadlock when shutdown a frozen userspace

 include/linux/freezer.h |  2 ++
 kernel/freezer.c        | 11 +++++++++++
 sound/core/init.c       | 13 +++++++++++++
 3 files changed, 26 insertions(+)
---
base-commit: 4312098baf37ee17a8350725e6e0d0e8590252d4
change-id: 20221127-snd-freeze-1ee143228326

Best regards,
-- 
Ricardo Ribalda <ribalda@chromium.org>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v3 1/2] freezer: Add processes_frozen()
  2022-11-28  9:10 [PATCH v3 0/2] ALSA: core: Fix deadlock when shutdown a frozen userspace Ricardo Ribalda
@ 2022-11-28  9:10 ` Ricardo Ribalda
  2022-11-28  9:10 ` [PATCH v3 2/2] ALSA: core: Fix deadlock when shutdown a frozen userspace Ricardo Ribalda
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 8+ messages in thread
From: Ricardo Ribalda @ 2022-11-28  9:10 UTC (permalink / raw)
  To: Takashi Iwai, Len Brown, Pierre-Louis Bossart, Kai Vehmanen,
	Ranjani Sridharan, Mark Brown, Jaroslav Kysela, Pavel Machek,
	Rafael J. Wysocki
  Cc: alsa-devel, Joel Fernandes (Google),
	Ricardo Ribalda, linux-kernel, linux-pm

Add a way to let the drivers know if the processes are frozen.

This is needed by drivers that are waiting for processes to end on their
shutdown path.

Signed-off-by: Ricardo Ribalda <ribalda@chromium.org>
---
 include/linux/freezer.h |  2 ++
 kernel/freezer.c        | 11 +++++++++++
 2 files changed, 13 insertions(+)

diff --git a/include/linux/freezer.h b/include/linux/freezer.h
index b303472255be..ca861ace0689 100644
--- a/include/linux/freezer.h
+++ b/include/linux/freezer.h
@@ -47,6 +47,7 @@ extern int freeze_processes(void);
 extern int freeze_kernel_threads(void);
 extern void thaw_processes(void);
 extern void thaw_kernel_threads(void);
+bool processes_frozen(void);
 
 static inline bool try_to_freeze(void)
 {
@@ -80,6 +81,7 @@ static inline int freeze_processes(void) { return -ENOSYS; }
 static inline int freeze_kernel_threads(void) { return -ENOSYS; }
 static inline void thaw_processes(void) {}
 static inline void thaw_kernel_threads(void) {}
+static inline bool processes_frozen(void) { return false; }
 
 static inline bool try_to_freeze(void) { return false; }
 
diff --git a/kernel/freezer.c b/kernel/freezer.c
index 4fad0e6fca64..08828c4e12fd 100644
--- a/kernel/freezer.c
+++ b/kernel/freezer.c
@@ -233,3 +233,14 @@ bool set_freezable(void)
 	return try_to_freeze();
 }
 EXPORT_SYMBOL(set_freezable);
+
+/**
+ * processes_frozen - Check if the processes are frozen
+ *
+ * Let drivers know if they shall wait for processes.
+ */
+bool processes_frozen(void)
+{
+	return pm_freezing;
+}
+EXPORT_SYMBOL(processes_frozen);

-- 
2.38.1.584.g0f3c55d4c2-goog-b4-0.11.0-dev-696ae

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v3 2/2] ALSA: core: Fix deadlock when shutdown a frozen userspace
  2022-11-28  9:10 [PATCH v3 0/2] ALSA: core: Fix deadlock when shutdown a frozen userspace Ricardo Ribalda
  2022-11-28  9:10 ` [PATCH v3 1/2] freezer: Add processes_frozen() Ricardo Ribalda
@ 2022-11-28  9:10 ` Ricardo Ribalda
  2022-11-28  9:24 ` [PATCH v3 0/2] " Takashi Iwai
  2022-11-28  9:26 ` Oliver Neukum
  3 siblings, 0 replies; 8+ messages in thread
From: Ricardo Ribalda @ 2022-11-28  9:10 UTC (permalink / raw)
  To: Takashi Iwai, Len Brown, Pierre-Louis Bossart, Kai Vehmanen,
	Ranjani Sridharan, Mark Brown, Jaroslav Kysela, Pavel Machek,
	Rafael J. Wysocki
  Cc: alsa-devel, Joel Fernandes (Google),
	Ricardo Ribalda, linux-kernel, linux-pm

If the user space is frozen, we cannot wait for it to complete.

This fixes:

[   84.943749] Freezing user space processes ... (elapsed 0.111 seconds) done.
[  246.784446] INFO: task kexec-lite:5123 blocked for more than 122 seconds.
[  246.819035] Call Trace:
[  246.821782]  <TASK>
[  246.824186]  __schedule+0x5f9/0x1263
[  246.828231]  schedule+0x87/0xc5
[  246.831779]  snd_card_disconnect_sync+0xb5/0x127
...
[  246.889249]  snd_sof_device_shutdown+0xb4/0x150
[  246.899317]  pci_device_shutdown+0x37/0x61
[  246.903990]  device_shutdown+0x14c/0x1d6
[  246.908391]  kernel_kexec+0x45/0xb9

And:

[  246.893222] INFO: task kexec-lite:4891 blocked for more than 122 seconds.
[  246.927709] Call Trace:
[  246.930461]  <TASK>
[  246.932819]  __schedule+0x5f9/0x1263
[  246.936855]  ? fsnotify_grab_connector+0x5c/0x70
[  246.942045]  schedule+0x87/0xc5
[  246.945567]  schedule_timeout+0x49/0xf3
[  246.949877]  wait_for_completion+0x86/0xe8
[  246.954463]  snd_card_free+0x68/0x89
...
[  247.001080]  platform_device_unregister+0x12/0x35

Fixes: 83bfc7e793b5 ("ASoC: SOF: core: unregister clients and machine drivers in .shutdown")
Signed-off-by: Ricardo Ribalda <ribalda@chromium.org>
---
 sound/core/init.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/sound/core/init.c b/sound/core/init.c
index 5377f94eb211..bc038b49d4d3 100644
--- a/sound/core/init.c
+++ b/sound/core/init.c
@@ -9,6 +9,7 @@
 #include <linux/module.h>
 #include <linux/device.h>
 #include <linux/file.h>
+#include <linux/freezer.h>
 #include <linux/slab.h>
 #include <linux/time.h>
 #include <linux/ctype.h>
@@ -573,6 +574,11 @@ void snd_card_disconnect_sync(struct snd_card *card)
 		return;
 	}
 
+	if (processes_frozen()) {
+		dev_err(card->dev, "Userspace is frozen, skipping sync\n");
+		return;
+	}
+
 	spin_lock_irq(&card->files_lock);
 	wait_event_lock_irq(card->remove_sleep,
 			    list_empty(&card->files_list),
@@ -658,6 +664,13 @@ int snd_card_free(struct snd_card *card)
 	ret = snd_card_free_when_closed(card);
 	if (ret)
 		return ret;
+
+	/*
+	 * If userspace is frozen the wait from completion will never end.
+	 */
+	if (processes_frozen())
+		return 0;
+
 	/* wait, until all devices are ready for the free operation */
 	wait_for_completion(&released);
 

-- 
2.38.1.584.g0f3c55d4c2-goog-b4-0.11.0-dev-696ae

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH v3 0/2] ALSA: core: Fix deadlock when shutdown a frozen userspace
  2022-11-28  9:10 [PATCH v3 0/2] ALSA: core: Fix deadlock when shutdown a frozen userspace Ricardo Ribalda
  2022-11-28  9:10 ` [PATCH v3 1/2] freezer: Add processes_frozen() Ricardo Ribalda
  2022-11-28  9:10 ` [PATCH v3 2/2] ALSA: core: Fix deadlock when shutdown a frozen userspace Ricardo Ribalda
@ 2022-11-28  9:24 ` Takashi Iwai
  2022-11-28  9:26   ` Ricardo Ribalda
  2022-11-28  9:26 ` Oliver Neukum
  3 siblings, 1 reply; 8+ messages in thread
From: Takashi Iwai @ 2022-11-28  9:24 UTC (permalink / raw)
  To: Ricardo Ribalda
  Cc: Takashi Iwai, Len Brown, Pierre-Louis Bossart, Kai Vehmanen,
	Ranjani Sridharan, Mark Brown, Jaroslav Kysela, Pavel Machek,
	Rafael J. Wysocki, alsa-devel, Joel Fernandes (Google),
	linux-kernel, linux-pm

On Mon, 28 Nov 2022 10:10:12 +0100,
Ricardo Ribalda wrote:
> 
> Since 83bfc7e793b5 ("ASoC: SOF: core: unregister clients and machine drivers in .shutdown")
> we wait for userspace to close its fds.

IMO, the fix above brought more problem.  If you'd need to want to
avoid later accesses during shutdown, the driver should rather just
disconnect devices without waiting for the user-space completion.
And, for that, a simple call of snd_card_disconnect() should suffice.

> But that will never occur with a frozen userspace (like during kexec()).
> 
> Lets detect the frozen userpace and act accordingly.

... and skipping the user-space sync at snd_card_disconnect_sync() as
of this patch set is a dangerous move, I'm afraid.  The user-space
gets frozen also at the normal suspend/resume, and it implies that the
sync will be lost even for the normal PM, too (although it must be a
very corner case).


thanks,

Takashi

> 
> To: Jaroslav Kysela <perex@perex.cz>
> To: Takashi Iwai <tiwai@suse.com>
> To: "Rafael J. Wysocki" <rafael@kernel.org>
> To: Pavel Machek <pavel@ucw.cz>
> To: Len Brown <len.brown@intel.com>
> To: Kai Vehmanen <kai.vehmanen@linux.intel.com>
> To: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
> To: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
> To: Mark Brown <broonie@kernel.org>
> Cc: alsa-devel@alsa-project.org
> Cc: linux-kernel@vger.kernel.org
> Cc: "Joel Fernandes (Google)" <joel@joelfernandes.org>
> Cc: linux-pm@vger.kernel.org
> Signed-off-by: Ricardo Ribalda <ribalda@chromium.org>
> ---
> Changes in v3:
> - Wrap pm_freezing in a function
> - Link to v2: https://lore.kernel.org/r/20221127-snd-freeze-v2-0-d8a425ea9663@chromium.org
> 
> Changes in v2:
> - Only use pm_freezing if CONFIG_FREEZER 
> - Link to v1: https://lore.kernel.org/r/20221127-snd-freeze-v1-0-57461a366ec2@chromium.org
> 
> ---
> Ricardo Ribalda (2):
>       freezer: Add processes_frozen()
>       ALSA: core: Fix deadlock when shutdown a frozen userspace
> 
>  include/linux/freezer.h |  2 ++
>  kernel/freezer.c        | 11 +++++++++++
>  sound/core/init.c       | 13 +++++++++++++
>  3 files changed, 26 insertions(+)
> ---
> base-commit: 4312098baf37ee17a8350725e6e0d0e8590252d4
> change-id: 20221127-snd-freeze-1ee143228326
> 
> Best regards,
> -- 
> Ricardo Ribalda <ribalda@chromium.org>
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v3 0/2] ALSA: core: Fix deadlock when shutdown a frozen userspace
  2022-11-28  9:24 ` [PATCH v3 0/2] " Takashi Iwai
@ 2022-11-28  9:26   ` Ricardo Ribalda
  2022-11-28  9:53     ` Takashi Iwai
  0 siblings, 1 reply; 8+ messages in thread
From: Ricardo Ribalda @ 2022-11-28  9:26 UTC (permalink / raw)
  To: Takashi Iwai
  Cc: Takashi Iwai, Len Brown, Pierre-Louis Bossart, Kai Vehmanen,
	Ranjani Sridharan, Mark Brown, Jaroslav Kysela, Pavel Machek,
	Rafael J. Wysocki, alsa-devel, Joel Fernandes (Google),
	linux-kernel, linux-pm

Hi Takashi

Thanks for your prompt reply

On Mon, 28 Nov 2022 at 10:24, Takashi Iwai <tiwai@suse.de> wrote:
>
> On Mon, 28 Nov 2022 10:10:12 +0100,
> Ricardo Ribalda wrote:
> >
> > Since 83bfc7e793b5 ("ASoC: SOF: core: unregister clients and machine drivers in .shutdown")
> > we wait for userspace to close its fds.
>
> IMO, the fix above brought more problem.  If you'd need to want to
> avoid later accesses during shutdown, the driver should rather just
> disconnect devices without waiting for the user-space completion.
> And, for that, a simple call of snd_card_disconnect() should suffice.
>
> > But that will never occur with a frozen userspace (like during kexec()).
> >
> > Lets detect the frozen userpace and act accordingly.
>
> ... and skipping the user-space sync at snd_card_disconnect_sync() as
> of this patch set is a dangerous move, I'm afraid.  The user-space
> gets frozen also at the normal suspend/resume, and it implies that the
> sync will be lost even for the normal PM, too (although it must be a
> very corner case).
>

And what about checking kexec_in_progress instead?

Thanks!

>
> thanks,
>
> Takashi
>
> >
> > To: Jaroslav Kysela <perex@perex.cz>
> > To: Takashi Iwai <tiwai@suse.com>
> > To: "Rafael J. Wysocki" <rafael@kernel.org>
> > To: Pavel Machek <pavel@ucw.cz>
> > To: Len Brown <len.brown@intel.com>
> > To: Kai Vehmanen <kai.vehmanen@linux.intel.com>
> > To: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
> > To: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
> > To: Mark Brown <broonie@kernel.org>
> > Cc: alsa-devel@alsa-project.org
> > Cc: linux-kernel@vger.kernel.org
> > Cc: "Joel Fernandes (Google)" <joel@joelfernandes.org>
> > Cc: linux-pm@vger.kernel.org
> > Signed-off-by: Ricardo Ribalda <ribalda@chromium.org>
> > ---
> > Changes in v3:
> > - Wrap pm_freezing in a function
> > - Link to v2: https://lore.kernel.org/r/20221127-snd-freeze-v2-0-d8a425ea9663@chromium.org
> >
> > Changes in v2:
> > - Only use pm_freezing if CONFIG_FREEZER
> > - Link to v1: https://lore.kernel.org/r/20221127-snd-freeze-v1-0-57461a366ec2@chromium.org
> >
> > ---
> > Ricardo Ribalda (2):
> >       freezer: Add processes_frozen()
> >       ALSA: core: Fix deadlock when shutdown a frozen userspace
> >
> >  include/linux/freezer.h |  2 ++
> >  kernel/freezer.c        | 11 +++++++++++
> >  sound/core/init.c       | 13 +++++++++++++
> >  3 files changed, 26 insertions(+)
> > ---
> > base-commit: 4312098baf37ee17a8350725e6e0d0e8590252d4
> > change-id: 20221127-snd-freeze-1ee143228326
> >
> > Best regards,
> > --
> > Ricardo Ribalda <ribalda@chromium.org>
> >



-- 
Ricardo Ribalda

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v3 0/2] ALSA: core: Fix deadlock when shutdown a frozen userspace
  2022-11-28  9:10 [PATCH v3 0/2] ALSA: core: Fix deadlock when shutdown a frozen userspace Ricardo Ribalda
                   ` (2 preceding siblings ...)
  2022-11-28  9:24 ` [PATCH v3 0/2] " Takashi Iwai
@ 2022-11-28  9:26 ` Oliver Neukum
  3 siblings, 0 replies; 8+ messages in thread
From: Oliver Neukum @ 2022-11-28  9:26 UTC (permalink / raw)
  To: Ricardo Ribalda, Takashi Iwai, Len Brown, Pierre-Louis Bossart,
	Kai Vehmanen, Ranjani Sridharan, Mark Brown, Jaroslav Kysela,
	Pavel Machek, Rafael J. Wysocki
  Cc: alsa-devel, Joel Fernandes (Google), linux-kernel, linux-pm



On 28.11.22 10:10, Ricardo Ribalda wrote:
> Since 83bfc7e793b5 ("ASoC: SOF: core: unregister clients and machine drivers in .shutdown")
> we wait for userspace to close its fds.
> 
> But that will never occur with a frozen userspace (like during kexec()).


Hi,

that means that user space can delay an unplugging process for an
indefinite time. And that is a problem. I am afraid there is a problem
with the basic approach, even outside of times user space is frozen.

	Regards
		Oliver

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v3 0/2] ALSA: core: Fix deadlock when shutdown a frozen userspace
  2022-11-28  9:26   ` Ricardo Ribalda
@ 2022-11-28  9:53     ` Takashi Iwai
  2022-11-28 13:34       ` Ricardo Ribalda
  0 siblings, 1 reply; 8+ messages in thread
From: Takashi Iwai @ 2022-11-28  9:53 UTC (permalink / raw)
  To: Ricardo Ribalda
  Cc: Takashi Iwai, Len Brown, Pierre-Louis Bossart, Kai Vehmanen,
	Ranjani Sridharan, Mark Brown, Jaroslav Kysela, Pavel Machek,
	Rafael J. Wysocki, alsa-devel, Joel Fernandes (Google),
	linux-kernel, linux-pm

On Mon, 28 Nov 2022 10:26:36 +0100,
Ricardo Ribalda wrote:
> 
> Hi Takashi
> 
> Thanks for your prompt reply
> 
> On Mon, 28 Nov 2022 at 10:24, Takashi Iwai <tiwai@suse.de> wrote:
> >
> > On Mon, 28 Nov 2022 10:10:12 +0100,
> > Ricardo Ribalda wrote:
> > >
> > > Since 83bfc7e793b5 ("ASoC: SOF: core: unregister clients and machine drivers in .shutdown")
> > > we wait for userspace to close its fds.
> >
> > IMO, the fix above brought more problem.  If you'd need to want to
> > avoid later accesses during shutdown, the driver should rather just
> > disconnect devices without waiting for the user-space completion.
> > And, for that, a simple call of snd_card_disconnect() should suffice.
> >
> > > But that will never occur with a frozen userspace (like during kexec()).
> > >
> > > Lets detect the frozen userpace and act accordingly.
> >
> > ... and skipping the user-space sync at snd_card_disconnect_sync() as
> > of this patch set is a dangerous move, I'm afraid.  The user-space
> > gets frozen also at the normal suspend/resume, and it implies that the
> > sync will be lost even for the normal PM, too (although it must be a
> > very corner case).
> >
> 
> And what about checking kexec_in_progress instead?

I still think that the call of snd_card_disconnect_sync() itself at
shutdown is somehow wrong.  If this only comes from the SOF code path
above, we should address in that code path instead.

OTOH, you showed two code paths: one is 

[   84.943749] Freezing user space processes ... (elapsed 0.111 seconds) done.
[  246.784446] INFO: task kexec-lite:5123 blocked for more than 122 seconds.
[  246.819035] Call Trace:
[  246.821782]  <TASK>
[  246.824186]  __schedule+0x5f9/0x1263
[  246.828231]  schedule+0x87/0xc5
[  246.831779]  snd_card_disconnect_sync+0xb5/0x127
...
[  246.889249]  snd_sof_device_shutdown+0xb4/0x150
[  246.899317]  pci_device_shutdown+0x37/0x61
[  246.903990]  device_shutdown+0x14c/0x1d6
[  246.908391]  kernel_kexec+0x45/0xb9

and another is

[  246.893222] INFO: task kexec-lite:4891 blocked for more than 122 seconds.
[  246.927709] Call Trace:
[  246.930461]  <TASK>
[  246.932819]  __schedule+0x5f9/0x1263
[  246.936855]  ? fsnotify_grab_connector+0x5c/0x70
[  246.942045]  schedule+0x87/0xc5
[  246.945567]  schedule_timeout+0x49/0xf3
[  246.949877]  wait_for_completion+0x86/0xe8
[  246.954463]  snd_card_free+0x68/0x89
...
[  247.001080]  platform_device_unregister+0x12/0x35

The former is likely the SOF code path by the commit you mentioned
(but it's not 100% clear because you trimmed the stack trace), and
this should be reconsidered.

But, the latter seems to be independent from that.  If that's the code
path where the unbind is triggered before kexec, your fix might not
work, either; it could be already at the wait_event*() when kexec
starts.

Maybe a simpler workaround would be to replace it with
wait_event_killable*() stuff.  But whether we can discontinue the sync
there is still another thing to consider...


Takashi

> 
> Thanks!
> 
> >
> > thanks,
> >
> > Takashi
> >
> > >
> > > To: Jaroslav Kysela <perex@perex.cz>
> > > To: Takashi Iwai <tiwai@suse.com>
> > > To: "Rafael J. Wysocki" <rafael@kernel.org>
> > > To: Pavel Machek <pavel@ucw.cz>
> > > To: Len Brown <len.brown@intel.com>
> > > To: Kai Vehmanen <kai.vehmanen@linux.intel.com>
> > > To: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
> > > To: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
> > > To: Mark Brown <broonie@kernel.org>
> > > Cc: alsa-devel@alsa-project.org
> > > Cc: linux-kernel@vger.kernel.org
> > > Cc: "Joel Fernandes (Google)" <joel@joelfernandes.org>
> > > Cc: linux-pm@vger.kernel.org
> > > Signed-off-by: Ricardo Ribalda <ribalda@chromium.org>
> > > ---
> > > Changes in v3:
> > > - Wrap pm_freezing in a function
> > > - Link to v2: https://lore.kernel.org/r/20221127-snd-freeze-v2-0-d8a425ea9663@chromium.org
> > >
> > > Changes in v2:
> > > - Only use pm_freezing if CONFIG_FREEZER
> > > - Link to v1: https://lore.kernel.org/r/20221127-snd-freeze-v1-0-57461a366ec2@chromium.org
> > >
> > > ---
> > > Ricardo Ribalda (2):
> > >       freezer: Add processes_frozen()
> > >       ALSA: core: Fix deadlock when shutdown a frozen userspace
> > >
> > >  include/linux/freezer.h |  2 ++
> > >  kernel/freezer.c        | 11 +++++++++++
> > >  sound/core/init.c       | 13 +++++++++++++
> > >  3 files changed, 26 insertions(+)
> > > ---
> > > base-commit: 4312098baf37ee17a8350725e6e0d0e8590252d4
> > > change-id: 20221127-snd-freeze-1ee143228326
> > >
> > > Best regards,
> > > --
> > > Ricardo Ribalda <ribalda@chromium.org>
> > >
> 
> 
> 
> -- 
> Ricardo Ribalda
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v3 0/2] ALSA: core: Fix deadlock when shutdown a frozen userspace
  2022-11-28  9:53     ` Takashi Iwai
@ 2022-11-28 13:34       ` Ricardo Ribalda
  0 siblings, 0 replies; 8+ messages in thread
From: Ricardo Ribalda @ 2022-11-28 13:34 UTC (permalink / raw)
  To: Takashi Iwai
  Cc: Takashi Iwai, Len Brown, Pierre-Louis Bossart, Kai Vehmanen,
	Ranjani Sridharan, Mark Brown, Jaroslav Kysela, Pavel Machek,
	Rafael J. Wysocki, alsa-devel, Joel Fernandes (Google),
	linux-kernel, linux-pm

Hi Takashi

Thanks  for your response

On Mon, 28 Nov 2022 at 10:53, Takashi Iwai <tiwai@suse.de> wrote:
>
> On Mon, 28 Nov 2022 10:26:36 +0100,
> Ricardo Ribalda wrote:
> >
> > Hi Takashi
> >
> > Thanks for your prompt reply
> >
> > On Mon, 28 Nov 2022 at 10:24, Takashi Iwai <tiwai@suse.de> wrote:
> > >
> > > On Mon, 28 Nov 2022 10:10:12 +0100,
> > > Ricardo Ribalda wrote:
> > > >
> > > > Since 83bfc7e793b5 ("ASoC: SOF: core: unregister clients and machine drivers in .shutdown")
> > > > we wait for userspace to close its fds.
> > >
> > > IMO, the fix above brought more problem.  If you'd need to want to
> > > avoid later accesses during shutdown, the driver should rather just
> > > disconnect devices without waiting for the user-space completion.
> > > And, for that, a simple call of snd_card_disconnect() should suffice.
> > >
> > > > But that will never occur with a frozen userspace (like during kexec()).
> > > >
> > > > Lets detect the frozen userpace and act accordingly.
> > >
> > > ... and skipping the user-space sync at snd_card_disconnect_sync() as
> > > of this patch set is a dangerous move, I'm afraid.  The user-space
> > > gets frozen also at the normal suspend/resume, and it implies that the
> > > sync will be lost even for the normal PM, too (although it must be a
> > > very corner case).
> > >
> >
> > And what about checking kexec_in_progress instead?
>
> I still think that the call of snd_card_disconnect_sync() itself at
> shutdown is somehow wrong.  If this only comes from the SOF code path
> above, we should address in that code path instead.
>
> OTOH, you showed two code paths: one is
>
> [   84.943749] Freezing user space processes ... (elapsed 0.111 seconds) done.
> [  246.784446] INFO: task kexec-lite:5123 blocked for more than 122 seconds.
> [  246.819035] Call Trace:
> [  246.821782]  <TASK>
> [  246.824186]  __schedule+0x5f9/0x1263
> [  246.828231]  schedule+0x87/0xc5
> [  246.831779]  snd_card_disconnect_sync+0xb5/0x127
> ...
> [  246.889249]  snd_sof_device_shutdown+0xb4/0x150
> [  246.899317]  pci_device_shutdown+0x37/0x61
> [  246.903990]  device_shutdown+0x14c/0x1d6
> [  246.908391]  kernel_kexec+0x45/0xb9
>
> and another is
>
> [  246.893222] INFO: task kexec-lite:4891 blocked for more than 122 seconds.
> [  246.927709] Call Trace:
> [  246.930461]  <TASK>
> [  246.932819]  __schedule+0x5f9/0x1263
> [  246.936855]  ? fsnotify_grab_connector+0x5c/0x70
> [  246.942045]  schedule+0x87/0xc5
> [  246.945567]  schedule_timeout+0x49/0xf3
> [  246.949877]  wait_for_completion+0x86/0xe8
> [  246.954463]  snd_card_free+0x68/0x89
> ...
> [  247.001080]  platform_device_unregister+0x12/0x35
>
> The former is likely the SOF code path by the commit you mentioned
> (but it's not 100% clear because you trimmed the stack trace), and
> this should be reconsidered.

Let me add both traces completely:


[  247.080494] INFO: task kexec-lite:5441 blocked for more than 122 seconds.
[  247.088100]       Tainted: G     U            5.15.79-14932-g6eecac25030c #5
[  247.095984] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[  247.104745] task:kexec-lite      state:D stack:    0 pid: 5441
ppid:  5390 flags:0x00004002
[  247.114083] Call Trace:
[  247.116833]  <TASK>
[  247.119185]  __schedule+0x5f9/0x1263
[  247.123193]  schedule+0x8e/0xdb
[  247.126698]  snd_card_disconnect_sync+0xb5/0x127
[  247.131877]  ? init_wait_entry+0x31/0x31
[  247.136270]  soc_cleanup_card_resources+0x27/0x1d3
[  247.141636]  snd_soc_unbind_card+0xa6/0xfc
[  247.146218]  snd_soc_unregister_card+0x26/0x34
[  247.151192]  release_nodes+0x43/0x62
[  247.155200]  devres_release_all+0x8b/0xc4
[  247.159695]  device_release_driver_internal+0x11a/0x1be
[  247.165546]  bus_remove_device+0xd9/0x103
[  247.170038]  device_del+0x1f7/0x355
[  247.173946]  platform_device_del+0x28/0x8e
[  247.178539]  platform_device_unregister+0x12/0x35
[  247.183807]  snd_sof_device_shutdown+0x57/0x94 [snd_sof
7d15170beb0e48032711b3230201744d5e4e590d]
[  247.193742]  pci_device_shutdown+0x37/0x61
[  247.198335]  device_shutdown+0x14c/0x1d6
[  247.202735]  kernel_kexec+0x45/0xb9
[  247.206646]  __se_sys_reboot+0x173/0x1f6
[  247.211041]  ? syscall_enter_from_user_mode+0x1a6/0x1ab
[  247.216885]  do_syscall_64+0x55/0x9d
[  247.220879]  ? exit_to_user_mode_prepare+0x3c/0x8b
[  247.226245]  entry_SYSCALL_64_after_hwframe+0x61/0xcb
[  247.231889] RIP: 0033:0x79b6a8563693
[  247.235897] RSP: 002b:00007fff9ab771b8 EFLAGS: 00000202 ORIG_RAX:
00000000000000a9
[  247.244375] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 000079b6a8563693
[  247.252363] RDX: 0000000045584543 RSI: 0000000028121969 RDI: 00000000fee1dead
[  247.260337] RBP: 00007fff9ab77200 R08: 0000000000000004 R09: 00007fff9ab7930b
[  247.268323] R10: 00007fff9ab7936a R11: 0000000000000202 R12: 00000000ffffffff
[  247.276297] R13: 0000000000000004 R14: 00007fff9ab7936a R15: 0000000000000003
[  247.284272]  </TASK>
[  247.286729] task:init            state:D stack:    0 pid:    1
ppid:     0 flags:0x00004006
[  247.296079] Call Trace:
[  247.298810]  <TASK>
[  247.301155]  __schedule+0x5f9/0x1263
[  247.305163]  ? core_sys_select+0x278/0x343
[  247.309754]  schedule+0x8e/0xdb
[  247.313261]  __refrigerator+0x5e/0x97
[  247.317364]  get_signal+0x5e4/0x5e9
[  247.321280]  arch_do_signal_or_restart+0x4a/0x27a
[  247.326550]  exit_to_user_mode_loop+0x76/0xde
[  247.331433]  exit_to_user_mode_prepare+0x61/0x8b
[  247.336606]  syscall_exit_to_user_mode+0x26/0x168
[  247.341877]  do_syscall_64+0x63/0x9d
[  247.345871]  ? exit_to_user_mode_prepare+0x3c/0x8b
[  247.351227]  entry_SYSCALL_64_after_hwframe+0x61/0xcb



Then, after adding the (pm_freezing hack) in card_disconnect_sync


[  247.081334] INFO: task kexec-lite:5568 blocked for more than 122 seconds.
[  247.088940]       Tainted: G     U
5.15.79-14932-g6eecac25030c-dirty #6
[  247.097428] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[  247.106198] task:kexec-lite      state:D stack:    0 pid: 5568
ppid:  5389 flags:0x00004002
[  247.115552] Call Trace:
[  247.118297]  <TASK>
[  247.120651]  __schedule+0x5f9/0x1263
[  247.124655]  ? fsnotify_grab_connector+0x5c/0x70
[  247.129829]  schedule+0x8e/0xdb
[  247.133354]  schedule_timeout+0x49/0xf3
[  247.137655]  wait_for_completion+0x86/0xe8
[  247.142249]  snd_card_free+0x68/0x89
[  247.146258]  soc_cleanup_card_resources+0x1b5/0x1d3
[  247.151725]  snd_soc_unbind_card+0xa6/0xfc
[  247.156314]  snd_soc_unregister_card+0x26/0x34
[  247.161293]  release_nodes+0x43/0x62
[  247.165303]  devres_release_all+0x8b/0xc4
[  247.169797]  device_release_driver_internal+0x11a/0x1be
[  247.175654]  bus_remove_device+0xd9/0x103
[  247.180147]  device_del+0x1f7/0x355
[  247.184058]  platform_device_del+0x28/0x8e
[  247.188649]  platform_device_unregister+0x12/0x35
[  247.193923]  snd_sof_device_shutdown+0x57/0x94 [snd_sof
e88f872f6695f09a149f1a182cec1fe8e07d5eea]
[  247.203858]  pci_device_shutdown+0x37/0x61
[  247.208455]  device_shutdown+0x14c/0x1d6
[  247.212858]  kernel_kexec+0x45/0xb9
[  247.216759]  __se_sys_reboot+0x173/0x1f6
[  247.221157]  ? syscall_enter_from_user_mode+0x1a6/0x1ab
[  247.227016]  do_syscall_64+0x55/0x9d
[  247.231020]  ? exit_to_user_mode_prepare+0x3c/0x8b
[  247.236386]  entry_SYSCALL_64_after_hwframe+0x61/0xcb
[  247.242048] RIP: 0033:0x7d2d2634a693
[  247.246056] RSP: 002b:00007ffdceeb6338 EFLAGS: 00000202 ORIG_RAX:
00000000000000a9
[  247.254532] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007d2d2634a693
[  247.262520] RDX: 0000000045584543 RSI: 0000000028121969 RDI: 00000000fee1dead
[  247.270500] RBP: 00007ffdceeb6380 R08: 0000000000000004 R09: 00007ffdceeb8305
[  247.278489] R10: 00007ffdceeb836a R11: 0000000000000202 R12: 00000000ffffffff
[  247.286468] R13: 0000000000000004 R14: 00007ffdceeb836a R15: 0000000000000003
[  247.294458]  </TASK>
[  247.296909] task:init            state:D stack:    0 pid:    1
ppid:     0 flags:0x00004006
[  247.306253] Call Trace:
[  247.308994]  <TASK>
[  247.311337]  __schedule+0x5f9/0x1263
[  247.315330]  ? core_sys_select+0x278/0x343
[  247.319924]  schedule+0x8e/0xdb
[  247.323445]  __refrigerator+0x5e/0x97
[  247.327537]  get_signal+0x5e4/0x5e9
[  247.331449]  arch_do_signal_or_restart+0x4a/0x27a
[  247.336721]  exit_to_user_mode_loop+0x76/0xde
[  247.341602]  exit_to_user_mode_prepare+0x61/0x8b
[  247.346763]  syscall_exit_to_user_mode+0x26/0x168
[  247.352035]  do_syscall_64+0x63/0x9d
[  247.356043]  ? exit_to_user_mode_prepare+0x3c/0x8b
[  247.361410]  entry_SYSCALL_64_after_hwframe+0x61/0xcb


Simply commenting out snd_sof_machine_unregister(sdev, pdata); does
the trick though... will send a new version with just that.



>
> But, the latter seems to be independent from that.  If that's the code
> path where the unbind is triggered before kexec, your fix might not
> work, either; it could be already at the wait_event*() when kexec
> starts.
>
> Maybe a simpler workaround would be to replace it with
> wait_event_killable*() stuff.  But whether we can discontinue the sync
> there is still another thing to consider...
>
>
> Takashi
>
> >
> > Thanks!
> >
> > >
> > > thanks,
> > >
> > > Takashi
> > >
> > > >
> > > > To: Jaroslav Kysela <perex@perex.cz>
> > > > To: Takashi Iwai <tiwai@suse.com>
> > > > To: "Rafael J. Wysocki" <rafael@kernel.org>
> > > > To: Pavel Machek <pavel@ucw.cz>
> > > > To: Len Brown <len.brown@intel.com>
> > > > To: Kai Vehmanen <kai.vehmanen@linux.intel.com>
> > > > To: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
> > > > To: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
> > > > To: Mark Brown <broonie@kernel.org>
> > > > Cc: alsa-devel@alsa-project.org
> > > > Cc: linux-kernel@vger.kernel.org
> > > > Cc: "Joel Fernandes (Google)" <joel@joelfernandes.org>
> > > > Cc: linux-pm@vger.kernel.org
> > > > Signed-off-by: Ricardo Ribalda <ribalda@chromium.org>
> > > > ---
> > > > Changes in v3:
> > > > - Wrap pm_freezing in a function
> > > > - Link to v2: https://lore.kernel.org/r/20221127-snd-freeze-v2-0-d8a425ea9663@chromium.org
> > > >
> > > > Changes in v2:
> > > > - Only use pm_freezing if CONFIG_FREEZER
> > > > - Link to v1: https://lore.kernel.org/r/20221127-snd-freeze-v1-0-57461a366ec2@chromium.org
> > > >
> > > > ---
> > > > Ricardo Ribalda (2):
> > > >       freezer: Add processes_frozen()
> > > >       ALSA: core: Fix deadlock when shutdown a frozen userspace
> > > >
> > > >  include/linux/freezer.h |  2 ++
> > > >  kernel/freezer.c        | 11 +++++++++++
> > > >  sound/core/init.c       | 13 +++++++++++++
> > > >  3 files changed, 26 insertions(+)
> > > > ---
> > > > base-commit: 4312098baf37ee17a8350725e6e0d0e8590252d4
> > > > change-id: 20221127-snd-freeze-1ee143228326
> > > >
> > > > Best regards,
> > > > --
> > > > Ricardo Ribalda <ribalda@chromium.org>
> > > >
> >
> >
> >
> > --
> > Ricardo Ribalda
> >



-- 
Ricardo Ribalda

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2022-11-28 13:34 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-11-28  9:10 [PATCH v3 0/2] ALSA: core: Fix deadlock when shutdown a frozen userspace Ricardo Ribalda
2022-11-28  9:10 ` [PATCH v3 1/2] freezer: Add processes_frozen() Ricardo Ribalda
2022-11-28  9:10 ` [PATCH v3 2/2] ALSA: core: Fix deadlock when shutdown a frozen userspace Ricardo Ribalda
2022-11-28  9:24 ` [PATCH v3 0/2] " Takashi Iwai
2022-11-28  9:26   ` Ricardo Ribalda
2022-11-28  9:53     ` Takashi Iwai
2022-11-28 13:34       ` Ricardo Ribalda
2022-11-28  9:26 ` Oliver Neukum

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).