All of lore.kernel.org
 help / color / mirror / Atom feed
* [alsa-devel] [PATCH 0/3] Make pcm_ioplug check lock status before locking (fixes pcm_jack lockups)
@ 2019-09-22  3:28 Ben Russell
  2019-09-22  3:28 ` [alsa-devel] [PATCH 1/3] pcm_local: Add snd_pcm_is_locked Ben Russell
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Ben Russell @ 2019-09-22  3:28 UTC (permalink / raw)
  To: alsa-devel; +Cc: Ben Russell

This is my first time contributing a patch to a mailing list. If I've made a mess, please let me know so I can learn how to avoid it in future.

The purpose of this patchset is to fix a specific common lockup in the pcm_jack plugin (from alsa-plugins). I'm not sure if this is the right approach to take, but at the very least it should make the pcm_ioplug code a bit more resilient.

The problem is this: When using the pcm_jack plugin, if a program attempts to play audio using the SND_PCM_FORMAT_FLOAT format, said program locks up.

This should be enough to reproduce the bug:

    pcm.rawjack {
        type jack
        playback_ports {
            0 system:playback_1
            1 system:playback_2
        }
        capture_ports {
            0 system:capture_1
            1 system:capture_2
        }
    }
    
    pcm.!default {
        type plug
        slave.pcm "rawjack"
    }

What's happening is that several snd_pcm_ioplug_* functions assume that the pcm mutex is locked already. It then proceeds to unlock the mutex, call a function, and then relock the mutex. When the mutex isn't locked already, the initial unlock results in a silently ignored pthread error, and the lock results in the program eventually deadlocking as it doesn't expect the mutex to be held at that point.

Patch 2 modifies pcm_ioplug to check if the mutex is held before doing the unlock-act-lock sequence, and if the mutex is not held then it skips the unlock and lock stages. This depends on Patch 1, which adds a snd_pcm_is_locked function to give the state of the mutex.

Patch 3 is completely optional. It adds assertions which make sure that all uses of snd_pcm_lock/snd_pcm_unlock are correct. On one hand this will likely result in crashes in some of the less refined parts of the code. On the other hand, when that happens, you'll know which parts need a bit more love. I know it was useful for finding this issue in the first place.

These patches fix the problems I am having, but if you have a more suitable approach to fixing this problem then please let me know.

Regards,
Ben R

Ben Russell (3):
  pcm_local: Add snd_pcm_is_locked
  pcm_ioplug: Don't unlock+lock if we're not locked
  pcm_local: assert() when using mutexes incorrectly

 src/pcm/pcm_ioplug.c | 50 +++++++++++++++++++++++-----------
 src/pcm/pcm_local.h  | 64 ++++++++++++++++++++++++++++++++++++++------
 2 files changed, 91 insertions(+), 23 deletions(-)

-- 
2.23.0

_______________________________________________
Alsa-devel mailing list
Alsa-devel@alsa-project.org
https://mailman.alsa-project.org/mailman/listinfo/alsa-devel

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [alsa-devel] [PATCH 1/3] pcm_local: Add snd_pcm_is_locked
  2019-09-22  3:28 [alsa-devel] [PATCH 0/3] Make pcm_ioplug check lock status before locking (fixes pcm_jack lockups) Ben Russell
@ 2019-09-22  3:28 ` Ben Russell
  2019-09-22  3:28 ` [alsa-devel] [PATCH 2/3] pcm_ioplug: Don't unlock+lock if we're not locked Ben Russell
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 8+ messages in thread
From: Ben Russell @ 2019-09-22  3:28 UTC (permalink / raw)
  To: alsa-devel; +Cc: Ben Russell

This is needed in preparation for a fix to be applied to
pcm_ioplug which will be in a following commit.

Signed-off-by: Ben Russell <thematrixeatsyou@gmail.com>
---
 src/pcm/pcm_local.h | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/src/pcm/pcm_local.h b/src/pcm/pcm_local.h
index 05ed935f..649d84f2 100644
--- a/src/pcm/pcm_local.h
+++ b/src/pcm/pcm_local.h
@@ -1171,9 +1171,41 @@ static inline void snd_pcm_unlock(snd_pcm_t *pcm)
 	if (pcm->lock_enabled && pcm->need_lock)
 		pthread_mutex_unlock(&pcm->lock);
 }
+
+/*
+ * snd_pcm_is_locked() is used to determine if the lock is currently held.
+ * It is mostly provided as a workaround for pcm_ioplug, which otherwise would
+ * for some functions assume that the plugin is locked even when it isn't.
+ *
+ * If $LIBASOUND_THREAD_SAFE=0 is set then this always returns 0 as there
+ * is no mutex in that case.
+ */
+static inline int snd_pcm_is_locked(snd_pcm_t *pcm)
+{
+	int trylock_result;
+	int unlock_result;
+
+	if (pcm->lock_enabled) {
+		trylock_result = pthread_mutex_trylock(&pcm->lock);
+		assert(trylock_result == 0 || trylock_result == EBUSY);
+		if (trylock_result == 0) {
+			/* we managed to grab the lock; therefore, it wasn't locked */
+			unlock_result = pthread_mutex_unlock(&pcm->lock);
+			assert(unlock_result == 0);
+			return 0;
+		} else {
+			/* we failed to grab the lock; therefore, it was locked */
+			return 1;
+		}
+	} else {
+		return 0;
+	}
+}
+
 #else /* THREAD_SAFE_API */
 #define __snd_pcm_lock(pcm)		do {} while (0)
 #define __snd_pcm_unlock(pcm)		do {} while (0)
 #define snd_pcm_lock(pcm)		do {} while (0)
 #define snd_pcm_unlock(pcm)		do {} while (0)
+#define snd_pcm_is_locked(pcm)		0
 #endif /* THREAD_SAFE_API */
-- 
2.23.0

_______________________________________________
Alsa-devel mailing list
Alsa-devel@alsa-project.org
https://mailman.alsa-project.org/mailman/listinfo/alsa-devel

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [alsa-devel] [PATCH 2/3] pcm_ioplug: Don't unlock+lock if we're not locked
  2019-09-22  3:28 [alsa-devel] [PATCH 0/3] Make pcm_ioplug check lock status before locking (fixes pcm_jack lockups) Ben Russell
  2019-09-22  3:28 ` [alsa-devel] [PATCH 1/3] pcm_local: Add snd_pcm_is_locked Ben Russell
@ 2019-09-22  3:28 ` Ben Russell
  2019-09-22  3:28 ` [alsa-devel] [PATCH 3/3] pcm_local: assert() when using mutexes incorrectly Ben Russell
  2019-09-22 20:25 ` [alsa-devel] [PATCH 0/3] Make pcm_ioplug check lock status before locking (fixes pcm_jack lockups) Takashi Iwai
  3 siblings, 0 replies; 8+ messages in thread
From: Ben Russell @ 2019-09-22  3:28 UTC (permalink / raw)
  To: alsa-devel; +Cc: Ben Russell

This fixes a deadlock when using the pcm_jack plugin via a plug.

Some snd_pcm_ioplug_* functions assumed that they were being
called with the pcm mutex locked, and would perform an
unlock-act-lock sequence in order to allow the relevant function to
lock the mutex itself.

If the mutex was not locked before calling these functions, it
would silently fail on unlocking the mutex, and then erroneously
leave the mutex locked as it left the function.

This patch checks to see if the mutex is locked before attempting
the unlock-act-lock sequence. If the mutex is not locked when
entering these functions, then we act and leave the mutex unlocked.

Signed-off-by: Ben Russell <thematrixeatsyou@gmail.com>
---
 src/pcm/pcm_ioplug.c | 50 +++++++++++++++++++++++++++++++-------------
 1 file changed, 35 insertions(+), 15 deletions(-)

diff --git a/src/pcm/pcm_ioplug.c b/src/pcm/pcm_ioplug.c
index a437ca32..a94ecd96 100644
--- a/src/pcm/pcm_ioplug.c
+++ b/src/pcm/pcm_ioplug.c
@@ -164,9 +164,13 @@ static int snd_pcm_ioplug_prepare(snd_pcm_t *pcm)
 
 	snd_pcm_ioplug_reset(pcm);
 	if (io->data->callback->prepare) {
-		snd_pcm_unlock(pcm); /* to avoid deadlock */
-		err = io->data->callback->prepare(io->data);
-		snd_pcm_lock(pcm);
+		if (snd_pcm_is_locked(pcm)) {
+			snd_pcm_unlock(pcm); /* to avoid deadlock */
+			err = io->data->callback->prepare(io->data);
+			snd_pcm_lock(pcm);
+		} else {
+			err = io->data->callback->prepare(io->data);
+		}
 	}
 	if (err < 0)
 		return err;
@@ -463,9 +467,13 @@ static int snd_pcm_ioplug_sw_params(snd_pcm_t *pcm, snd_pcm_sw_params_t *params)
 	if (!io->data->callback->sw_params)
 		return 0;
 
-	snd_pcm_unlock(pcm); /* to avoid deadlock */
-	err = io->data->callback->sw_params(io->data, params);
-	snd_pcm_lock(pcm);
+	if (snd_pcm_is_locked(pcm)) {
+		snd_pcm_unlock(pcm); /* to avoid deadlock */
+		err = io->data->callback->sw_params(io->data, params);
+		snd_pcm_lock(pcm);
+	} else {
+		err = io->data->callback->sw_params(io->data, params);
+	}
 
 	return err;
 }
@@ -764,9 +772,13 @@ static int snd_pcm_ioplug_poll_descriptors_count(snd_pcm_t *pcm)
 	int err = 1;
 
 	if (io->data->callback->poll_descriptors_count) {
-		snd_pcm_unlock(pcm); /* to avoid deadlock */
-		err = io->data->callback->poll_descriptors_count(io->data);
-		snd_pcm_lock(pcm);
+		if (snd_pcm_is_locked(pcm)) {
+			snd_pcm_unlock(pcm); /* to avoid deadlock */
+			err = io->data->callback->poll_descriptors_count(io->data);
+			snd_pcm_lock(pcm);
+		} else {
+			err = io->data->callback->poll_descriptors_count(io->data);
+		}
 	}
 	return err;
 }
@@ -777,9 +789,13 @@ static int snd_pcm_ioplug_poll_descriptors(snd_pcm_t *pcm, struct pollfd *pfds,
 	int err;
 
 	if (io->data->callback->poll_descriptors) {
-		snd_pcm_unlock(pcm); /* to avoid deadlock */
-		err = io->data->callback->poll_descriptors(io->data, pfds, space);
-		snd_pcm_lock(pcm);
+		if (snd_pcm_is_locked(pcm)) {
+			snd_pcm_unlock(pcm); /* to avoid deadlock */
+			err = io->data->callback->poll_descriptors(io->data, pfds, space);
+			snd_pcm_lock(pcm);
+		} else {
+			err = io->data->callback->poll_descriptors(io->data, pfds, space);
+		}
 		return err;
 	}
 	if (pcm->poll_fd < 0)
@@ -799,9 +815,13 @@ static int snd_pcm_ioplug_poll_revents(snd_pcm_t *pcm, struct pollfd *pfds, unsi
 	int err;
 
 	if (io->data->callback->poll_revents) {
-		snd_pcm_unlock(pcm); /* to avoid deadlock */
-		err = io->data->callback->poll_revents(io->data, pfds, nfds, revents);
-		snd_pcm_lock(pcm);
+		if (snd_pcm_is_locked(pcm)) {
+			snd_pcm_unlock(pcm); /* to avoid deadlock */
+			err = io->data->callback->poll_revents(io->data, pfds, nfds, revents);
+			snd_pcm_lock(pcm);
+		} else {
+			err = io->data->callback->poll_revents(io->data, pfds, nfds, revents);
+		}
 	} else {
 		*revents = pfds->revents;
 		err = 0;
-- 
2.23.0

_______________________________________________
Alsa-devel mailing list
Alsa-devel@alsa-project.org
https://mailman.alsa-project.org/mailman/listinfo/alsa-devel

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [alsa-devel] [PATCH 3/3] pcm_local: assert() when using mutexes incorrectly
  2019-09-22  3:28 [alsa-devel] [PATCH 0/3] Make pcm_ioplug check lock status before locking (fixes pcm_jack lockups) Ben Russell
  2019-09-22  3:28 ` [alsa-devel] [PATCH 1/3] pcm_local: Add snd_pcm_is_locked Ben Russell
  2019-09-22  3:28 ` [alsa-devel] [PATCH 2/3] pcm_ioplug: Don't unlock+lock if we're not locked Ben Russell
@ 2019-09-22  3:28 ` Ben Russell
  2019-09-22 20:25 ` [alsa-devel] [PATCH 0/3] Make pcm_ioplug check lock status before locking (fixes pcm_jack lockups) Takashi Iwai
  3 siblings, 0 replies; 8+ messages in thread
From: Ben Russell @ 2019-09-22  3:28 UTC (permalink / raw)
  To: alsa-devel; +Cc: Ben Russell

This makes it easier to find pcm mutex handling errors.

Signed-off-by: Ben Russell <thematrixeatsyou@gmail.com>
---
 src/pcm/pcm_local.h | 32 ++++++++++++++++++++++++--------
 1 file changed, 24 insertions(+), 8 deletions(-)

diff --git a/src/pcm/pcm_local.h b/src/pcm/pcm_local.h
index 649d84f2..c1f881a6 100644
--- a/src/pcm/pcm_local.h
+++ b/src/pcm/pcm_local.h
@@ -1153,23 +1153,39 @@ static inline void sw_set_period_event(snd_pcm_sw_params_t *params, int val)
  */
 static inline void __snd_pcm_lock(snd_pcm_t *pcm)
 {
-	if (pcm->lock_enabled)
-		pthread_mutex_lock(&pcm->lock);
+	int lock_err;
+
+	if (pcm->lock_enabled) {
+		lock_err = pthread_mutex_lock(&pcm->lock);
+		assert(lock_err == 0);
+	}
 }
 static inline void __snd_pcm_unlock(snd_pcm_t *pcm)
 {
-	if (pcm->lock_enabled)
-		pthread_mutex_unlock(&pcm->lock);
+	int unlock_err;
+
+	if (pcm->lock_enabled) {
+		unlock_err = pthread_mutex_unlock(&pcm->lock);
+		assert(unlock_err == 0);
+	}
 }
 static inline void snd_pcm_lock(snd_pcm_t *pcm)
 {
-	if (pcm->lock_enabled && pcm->need_lock)
-		pthread_mutex_lock(&pcm->lock);
+	int lock_err;
+
+	if (pcm->lock_enabled && pcm->need_lock) {
+		lock_err = pthread_mutex_lock(&pcm->lock);
+		assert(lock_err == 0);
+	}
 }
 static inline void snd_pcm_unlock(snd_pcm_t *pcm)
 {
-	if (pcm->lock_enabled && pcm->need_lock)
-		pthread_mutex_unlock(&pcm->lock);
+	int unlock_err;
+
+	if (pcm->lock_enabled && pcm->need_lock) {
+		unlock_err = pthread_mutex_unlock(&pcm->lock);
+		assert(unlock_err == 0);
+	}
 }
 
 /*
-- 
2.23.0

_______________________________________________
Alsa-devel mailing list
Alsa-devel@alsa-project.org
https://mailman.alsa-project.org/mailman/listinfo/alsa-devel

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [alsa-devel] [PATCH 0/3] Make pcm_ioplug check lock status before locking (fixes pcm_jack lockups)
  2019-09-22  3:28 [alsa-devel] [PATCH 0/3] Make pcm_ioplug check lock status before locking (fixes pcm_jack lockups) Ben Russell
                   ` (2 preceding siblings ...)
  2019-09-22  3:28 ` [alsa-devel] [PATCH 3/3] pcm_local: assert() when using mutexes incorrectly Ben Russell
@ 2019-09-22 20:25 ` Takashi Iwai
  2019-09-24  7:45   ` Ben Russell
  3 siblings, 1 reply; 8+ messages in thread
From: Takashi Iwai @ 2019-09-22 20:25 UTC (permalink / raw)
  To: Ben Russell; +Cc: alsa-devel

On Sun, 22 Sep 2019 05:28:50 +0200,
Ben Russell wrote:
> 
> This is my first time contributing a patch to a mailing list. If I've made a mess, please let me know so I can learn how to avoid it in future.
> 
> The purpose of this patchset is to fix a specific common lockup in the pcm_jack plugin (from alsa-plugins). I'm not sure if this is the right approach to take, but at the very least it should make the pcm_ioplug code a bit more resilient.
> 
> The problem is this: When using the pcm_jack plugin, if a program attempts to play audio using the SND_PCM_FORMAT_FLOAT format, said program locks up.
> 
> This should be enough to reproduce the bug:
> 
>     pcm.rawjack {
>         type jack
>         playback_ports {
>             0 system:playback_1
>             1 system:playback_2
>         }
>         capture_ports {
>             0 system:capture_1
>             1 system:capture_2
>         }
>     }
>     
>     pcm.!default {
>         type plug
>         slave.pcm "rawjack"
>     }
> 
> What's happening is that several snd_pcm_ioplug_* functions assume that the pcm mutex is locked already. It then proceeds to unlock the mutex, call a function, and then relock the mutex. When the mutex isn't locked already, the initial unlock results in a silently ignored pthread error, and the lock results in the program eventually deadlocking as it doesn't expect the mutex to be held at that point.
> 
> Patch 2 modifies pcm_ioplug to check if the mutex is held before doing the unlock-act-lock sequence, and if the mutex is not held then it skips the unlock and lock stages. This depends on Patch 1, which adds a snd_pcm_is_locked function to give the state of the mutex.
> 
> Patch 3 is completely optional. It adds assertions which make sure that all uses of snd_pcm_lock/snd_pcm_unlock are correct. On one hand this will likely result in crashes in some of the less refined parts of the code. On the other hand, when that happens, you'll know which parts need a bit more love. I know it was useful for finding this issue in the first place.
> 
> These patches fix the problems I am having, but if you have a more suitable approach to fixing this problem then please let me know.

Thanks for the analysis and patches.  It's indeed a serious problem we
need to address.

The current semantics of locking in alsa-lib code assumes that the
lock/unlock never fails.  So the "right-and-quick" fix would be just
to take the patch 3 to assert() upon pthread_mutex_lock/unlock
errors.  Then we need to paper over the actual invalid locks.

I do wonder, though, exactly which code path triggers the pthread
mutex error?  You should be able to catch it via gdb after the patch.


thanks,

Takashi
_______________________________________________
Alsa-devel mailing list
Alsa-devel@alsa-project.org
https://mailman.alsa-project.org/mailman/listinfo/alsa-devel

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [alsa-devel] [PATCH 0/3] Make pcm_ioplug check lock status before locking (fixes pcm_jack lockups)
  2019-09-22 20:25 ` [alsa-devel] [PATCH 0/3] Make pcm_ioplug check lock status before locking (fixes pcm_jack lockups) Takashi Iwai
@ 2019-09-24  7:45   ` Ben Russell
  2019-09-24 10:18     ` Takashi Iwai
  0 siblings, 1 reply; 8+ messages in thread
From: Ben Russell @ 2019-09-24  7:45 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: alsa-devel

On Sun, 22 Sep 2019 22:25:48 +0200
Takashi Iwai <tiwai@suse.de> wrote:

> On Sun, 22 Sep 2019 05:28:50 +0200,
> Ben Russell wrote:
> > 
> > This is my first time contributing a patch to a mailing list. If I've made a mess, please let me know so I can learn how to avoid it in future.
> > 
> > The purpose of this patchset is to fix a specific common lockup in the pcm_jack plugin (from alsa-plugins). I'm not sure if this is the right approach to take, but at the very least it should make the pcm_ioplug code a bit more resilient.
> > 
> > The problem is this: When using the pcm_jack plugin, if a program attempts to play audio using the SND_PCM_FORMAT_FLOAT format, said program locks up.
> > 
> > This should be enough to reproduce the bug:
> > 
> >     pcm.rawjack {
> >         type jack
> >         playback_ports {
> >             0 system:playback_1
> >             1 system:playback_2
> >         }
> >         capture_ports {
> >             0 system:capture_1
> >             1 system:capture_2
> >         }
> >     }
> >     
> >     pcm.!default {
> >         type plug
> >         slave.pcm "rawjack"
> >     }
> > 
> > What's happening is that several snd_pcm_ioplug_* functions assume that the pcm mutex is locked already. It then proceeds to unlock the mutex, call a function, and then relock the mutex. When the mutex isn't locked already, the initial unlock results in a silently ignored pthread error, and the lock results in the program eventually deadlocking as it doesn't expect the mutex to be held at that point.
> > 
> > Patch 2 modifies pcm_ioplug to check if the mutex is held before doing the unlock-act-lock sequence, and if the mutex is not held then it skips the unlock and lock stages. This depends on Patch 1, which adds a snd_pcm_is_locked function to give the state of the mutex.
> > 
> > Patch 3 is completely optional. It adds assertions which make sure that all uses of snd_pcm_lock/snd_pcm_unlock are correct. On one hand this will likely result in crashes in some of the less refined parts of the code. On the other hand, when that happens, you'll know which parts need a bit more love. I know it was useful for finding this issue in the first place.
> > 
> > These patches fix the problems I am having, but if you have a more suitable approach to fixing this problem then please let me know.
> 
> Thanks for the analysis and patches.  It's indeed a serious problem we
> need to address.
> 
> The current semantics of locking in alsa-lib code assumes that the
> lock/unlock never fails.  So the "right-and-quick" fix would be just
> to take the patch 3 to assert() upon pthread_mutex_lock/unlock
> errors.  Then we need to paper over the actual invalid locks.
> 
> I do wonder, though, exactly which code path triggers the pthread
> mutex error?  You should be able to catch it via gdb after the patch.
> 
> 
> thanks,
> 
> Takashi

Sure thing. With patch #3 against commit 1d7a131f, this trips up on an assert:

$ gdb --args mpv --ao=alsa --audio-format=float RESPECOGNIZE.mp3

The relevant backtrace:

#4  0x00007ffff4b06cf3 in snd_pcm_unlock (pcm=0x5555558625c0) at pcm_local.h:1187
#5  0x00007ffff4b075b3 in snd_pcm_unlock (pcm=0x5555558625c0) at pcm_local.h:1187
	if (pcm->lock_enabled && pcm->need_lock) {
		unlock_err = pthread_mutex_unlock(&pcm->lock);
		assert(unlock_err == 0); <-- this line
	}
#6  snd_pcm_ioplug_prepare (pcm=0x5555558625c0) at pcm_ioplug.c:167
	snd_pcm_ioplug_reset(pcm);
	if (io->data->callback->prepare) {
		snd_pcm_unlock(pcm); /* to avoid deadlock */ <-- this line
		err = io->data->callback->prepare(io->data);
		snd_pcm_lock(pcm);
	}
#7  0x00007ffff4abcba0 in snd_pcm_prepare (pcm=0x5555558627c0) at pcm.c:1214
	snd_pcm_lock(pcm);
	if (pcm->fast_ops->prepare)
		err = pcm->fast_ops->prepare(pcm->fast_op_arg); <-- this line
	else
		err = -ENOSYS;
#8  0x00005555555a2efa in ?? ()

Using --audio-format=s16 does not deadlock, as it goes via pcm_plugin before it can go via pcm_ioplug (it's most likely for sample format conversion):

#0  0x00007ffff4b07522 in snd_pcm_unlock (pcm=<optimized out>) at pcm_ioplug.c:166
#1  snd_pcm_ioplug_prepare (pcm=0x555555862640) at pcm_ioplug.c:167
#2  0x00007ffff4abcba0 in snd_pcm_prepare (pcm=0x555555862640) at pcm.c:1214
#3  0x00007ffff4ad2ed7 in snd_pcm_plugin_prepare (pcm=0x555555862e50) at pcm_plugin.c:159
	snd_pcm_plugin_t *plugin = pcm->private_data;
	int err;
	err = snd_pcm_prepare(plugin->gen.slave); <-- this line
	if (err < 0)
		return err;
#4  0x00007ffff4abcba0 in snd_pcm_prepare (pcm=0x555555862840) at pcm.c:1214
#5  0x00005555555a2efa in ?? ()

Regards,
Ben R
_______________________________________________
Alsa-devel mailing list
Alsa-devel@alsa-project.org
https://mailman.alsa-project.org/mailman/listinfo/alsa-devel

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [alsa-devel] [PATCH 0/3] Make pcm_ioplug check lock status before locking (fixes pcm_jack lockups)
  2019-09-24  7:45   ` Ben Russell
@ 2019-09-24 10:18     ` Takashi Iwai
  2019-09-25  6:10       ` Ben Russell
  0 siblings, 1 reply; 8+ messages in thread
From: Takashi Iwai @ 2019-09-24 10:18 UTC (permalink / raw)
  To: Ben Russell; +Cc: alsa-devel

On Tue, 24 Sep 2019 09:45:25 +0200,
Ben Russell wrote:
> 
> On Sun, 22 Sep 2019 22:25:48 +0200
> Takashi Iwai <tiwai@suse.de> wrote:
> 
> > On Sun, 22 Sep 2019 05:28:50 +0200,
> > Ben Russell wrote:
> > > 
> > > This is my first time contributing a patch to a mailing list. If I've made a mess, please let me know so I can learn how to avoid it in future.
> > > 
> > > The purpose of this patchset is to fix a specific common lockup in the pcm_jack plugin (from alsa-plugins). I'm not sure if this is the right approach to take, but at the very least it should make the pcm_ioplug code a bit more resilient.
> > > 
> > > The problem is this: When using the pcm_jack plugin, if a program attempts to play audio using the SND_PCM_FORMAT_FLOAT format, said program locks up.
> > > 
> > > This should be enough to reproduce the bug:
> > > 
> > >     pcm.rawjack {
> > >         type jack
> > >         playback_ports {
> > >             0 system:playback_1
> > >             1 system:playback_2
> > >         }
> > >         capture_ports {
> > >             0 system:capture_1
> > >             1 system:capture_2
> > >         }
> > >     }
> > >     
> > >     pcm.!default {
> > >         type plug
> > >         slave.pcm "rawjack"
> > >     }
> > > 
> > > What's happening is that several snd_pcm_ioplug_* functions assume that the pcm mutex is locked already. It then proceeds to unlock the mutex, call a function, and then relock the mutex. When the mutex isn't locked already, the initial unlock results in a silently ignored pthread error, and the lock results in the program eventually deadlocking as it doesn't expect the mutex to be held at that point.
> > > 
> > > Patch 2 modifies pcm_ioplug to check if the mutex is held before doing the unlock-act-lock sequence, and if the mutex is not held then it skips the unlock and lock stages. This depends on Patch 1, which adds a snd_pcm_is_locked function to give the state of the mutex.
> > > 
> > > Patch 3 is completely optional. It adds assertions which make sure that all uses of snd_pcm_lock/snd_pcm_unlock are correct. On one hand this will likely result in crashes in some of the less refined parts of the code. On the other hand, when that happens, you'll know which parts need a bit more love. I know it was useful for finding this issue in the first place.
> > > 
> > > These patches fix the problems I am having, but if you have a more suitable approach to fixing this problem then please let me know.
> > 
> > Thanks for the analysis and patches.  It's indeed a serious problem we
> > need to address.
> > 
> > The current semantics of locking in alsa-lib code assumes that the
> > lock/unlock never fails.  So the "right-and-quick" fix would be just
> > to take the patch 3 to assert() upon pthread_mutex_lock/unlock
> > errors.  Then we need to paper over the actual invalid locks.
> > 
> > I do wonder, though, exactly which code path triggers the pthread
> > mutex error?  You should be able to catch it via gdb after the patch.
> > 
> > 
> > thanks,
> > 
> > Takashi
> 
> Sure thing. With patch #3 against commit 1d7a131f, this trips up on an assert:
> 
> $ gdb --args mpv --ao=alsa --audio-format=float RESPECOGNIZE.mp3
> 
> The relevant backtrace:
> 
> #4  0x00007ffff4b06cf3 in snd_pcm_unlock (pcm=0x5555558625c0) at pcm_local.h:1187
> #5  0x00007ffff4b075b3 in snd_pcm_unlock (pcm=0x5555558625c0) at pcm_local.h:1187
> 	if (pcm->lock_enabled && pcm->need_lock) {
> 		unlock_err = pthread_mutex_unlock(&pcm->lock);
> 		assert(unlock_err == 0); <-- this line
> 	}
> #6  snd_pcm_ioplug_prepare (pcm=0x5555558625c0) at pcm_ioplug.c:167
> 	snd_pcm_ioplug_reset(pcm);
> 	if (io->data->callback->prepare) {
> 		snd_pcm_unlock(pcm); /* to avoid deadlock */ <-- this line
> 		err = io->data->callback->prepare(io->data);
> 		snd_pcm_lock(pcm);
> 	}
> #7  0x00007ffff4abcba0 in snd_pcm_prepare (pcm=0x5555558627c0) at pcm.c:1214
> 	snd_pcm_lock(pcm);
> 	if (pcm->fast_ops->prepare)
> 		err = pcm->fast_ops->prepare(pcm->fast_op_arg); <-- this line
> 	else
> 		err = -ENOSYS;
> #8  0x00005555555a2efa in ?? ()
> 
> Using --audio-format=s16 does not deadlock, as it goes via pcm_plugin before it can go via pcm_ioplug (it's most likely for sample format conversion):
> 
> #0  0x00007ffff4b07522 in snd_pcm_unlock (pcm=<optimized out>) at pcm_ioplug.c:166
> #1  snd_pcm_ioplug_prepare (pcm=0x555555862640) at pcm_ioplug.c:167
> #2  0x00007ffff4abcba0 in snd_pcm_prepare (pcm=0x555555862640) at pcm.c:1214
> #3  0x00007ffff4ad2ed7 in snd_pcm_plugin_prepare (pcm=0x555555862e50) at pcm_plugin.c:159
> 	snd_pcm_plugin_t *plugin = pcm->private_data;
> 	int err;
> 	err = snd_pcm_prepare(plugin->gen.slave); <-- this line
> 	if (err < 0)
> 		return err;
> #4  0x00007ffff4abcba0 in snd_pcm_prepare (pcm=0x555555862840) at pcm.c:1214
> #5  0x00005555555a2efa in ?? ()

Thanks, after reading the traces carefully, finally I figured out what
went wrong.  In some cases, pcm->fast_op_arg isn't pcm itself but has
a different value.  Then we call snd_pcm_lock() with pcm, but call
fast_op with fast_op_arg as the PCM object.  This leads to the
unexpected behavior for the plugin that tries to unlock / relock from
the given PCM object.

So, we'd need to convert snd_pcm_lock()/unlock() with fast_op_arg or
op_arg in most places.

A test fix patch is below.  Could you test it quickly?
I'm going to traveling from tomorrow, so I'd like to have confirmation
before merging into git repo.


thanks,

Takashi

-- 8< --
From: Takashi Iwai <tiwai@suse.de>
Subject: [PATCH] pcm: Fix the wrong PCM object passed for locking/unlocking

Most of PCM API functions have snd_pcm_lock()/unlock() wraps for the
actual calls of ops, and some plugins try to unlock/relock internally
for the given PCM object.  This, unfortunately, causes a problem in
some configurations and leads to the unexpected behavior or deadlock.

The main problem is that we call snd_pcm_lock() with the given PCM
object, while calling the ops with pcm->op_arg or pcm->fast_op_arg as
the slave PCM object.  Meanwhile the plugin code assumes that the
passed PCM object is already locked, and calls snd_pcm_unlock().
This bug doesn't hit always because in most cases pcm->op_arg and
fast_op_arg are identical with pcm itself.  But in some configurations
they have different values, so the problem surfaces.

This patch is an attempt to resolve these inconsistencies.  It
replaces most of snd_pcm_lock()/unlock() calls with either pcm->op_arg
or pcm->fast_op_arg, depending on the call pattern, so that the plugin
code can safely run snd_pcm_unlock() to the given PCM object.

Reported-by: Ben Russell <thematrixeatsyou@gmail.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
---
 src/pcm/pcm.c | 114 +++++++++++++++++++++++++++++-----------------------------
 1 file changed, 57 insertions(+), 57 deletions(-)

diff --git a/src/pcm/pcm.c b/src/pcm/pcm.c
index 178d43875f00..e7519f04686b 100644
--- a/src/pcm/pcm.c
+++ b/src/pcm/pcm.c
@@ -789,7 +789,7 @@ int snd_pcm_nonblock(snd_pcm_t *pcm, int nonblock)
 	/* FIXME: __snd_pcm_lock() call below is commented out because of the
 	 * the possible deadlock in signal handler calling snd_pcm_abort()
 	 */
-	/* __snd_pcm_lock(pcm); */ /* forced lock due to pcm field change */
+	/* __snd_pcm_lock(pcm->op_arg); */ /* forced lock due to pcm field change */
 	if (pcm->ops->nonblock)
 		err = pcm->ops->nonblock(pcm->op_arg, nonblock);
 	else
@@ -809,7 +809,7 @@ int snd_pcm_nonblock(snd_pcm_t *pcm, int nonblock)
 			pcm->mode &= ~SND_PCM_NONBLOCK;
 	}
  unlock:
-	/* __snd_pcm_unlock(pcm); */ /* FIXME: see above */
+	/* __snd_pcm_unlock(pcm->op_arg); */ /* FIXME: see above */
 	return err;
 }
 
@@ -989,13 +989,13 @@ int snd_pcm_sw_params(snd_pcm_t *pcm, snd_pcm_sw_params_t *params)
 		return -EINVAL;
 	}
 #endif
-	__snd_pcm_lock(pcm); /* forced lock due to pcm field change */
+	__snd_pcm_lock(pcm->op_arg); /* forced lock due to pcm field change */
 	if (pcm->ops->sw_params)
 		err = pcm->ops->sw_params(pcm->op_arg, params);
 	else
 		err = -ENOSYS;
 	if (err < 0) {
-		__snd_pcm_unlock(pcm);
+		__snd_pcm_unlock(pcm->op_arg);
 		return err;
 	}
 	pcm->tstamp_mode = params->tstamp_mode;
@@ -1008,7 +1008,7 @@ int snd_pcm_sw_params(snd_pcm_t *pcm, snd_pcm_sw_params_t *params)
 	pcm->silence_threshold = params->silence_threshold;
 	pcm->silence_size = params->silence_size;
 	pcm->boundary = params->boundary;
-	__snd_pcm_unlock(pcm);
+	__snd_pcm_unlock(pcm->op_arg);
 	return 0;
 }
 
@@ -1025,12 +1025,12 @@ int snd_pcm_status(snd_pcm_t *pcm, snd_pcm_status_t *status)
 	int err;
 
 	assert(pcm && status);
-	snd_pcm_lock(pcm);
+	snd_pcm_lock(pcm->fast_op_arg);
 	if (pcm->fast_ops->status)
 		err = pcm->fast_ops->status(pcm->fast_op_arg, status);
 	else
 		err = -ENOSYS;
-	snd_pcm_unlock(pcm);
+	snd_pcm_unlock(pcm->fast_op_arg);
 
 	return err;
 }
@@ -1050,9 +1050,9 @@ snd_pcm_state_t snd_pcm_state(snd_pcm_t *pcm)
 	snd_pcm_state_t state;
 
 	assert(pcm);
-	snd_pcm_lock(pcm);
+	snd_pcm_lock(pcm->fast_op_arg);
 	state = __snd_pcm_state(pcm);
-	snd_pcm_unlock(pcm);
+	snd_pcm_unlock(pcm->fast_op_arg);
 	return state;
 }
 
@@ -1078,9 +1078,9 @@ int snd_pcm_hwsync(snd_pcm_t *pcm)
 		SNDMSG("PCM not set up");
 		return -EIO;
 	}
-	snd_pcm_lock(pcm);
+	snd_pcm_lock(pcm->fast_op_arg);
 	err = __snd_pcm_hwsync(pcm);
-	snd_pcm_unlock(pcm);
+	snd_pcm_unlock(pcm->fast_op_arg);
 	return err;
 }
 
@@ -1123,9 +1123,9 @@ int snd_pcm_delay(snd_pcm_t *pcm, snd_pcm_sframes_t *delayp)
 		SNDMSG("PCM not set up");
 		return -EIO;
 	}
-	snd_pcm_lock(pcm);
+	snd_pcm_lock(pcm->fast_op_arg);
 	err = __snd_pcm_delay(pcm, delayp);
-	snd_pcm_unlock(pcm);
+	snd_pcm_unlock(pcm->fast_op_arg);
 	return err;
 }
 
@@ -1181,12 +1181,12 @@ int snd_pcm_htimestamp(snd_pcm_t *pcm, snd_pcm_uframes_t *avail, snd_htimestamp_
 		SNDMSG("PCM not set up");
 		return -EIO;
 	}
-	snd_pcm_lock(pcm);
+	snd_pcm_lock(pcm->fast_op_arg);
 	if (pcm->fast_ops->htimestamp)
 		err = pcm->fast_ops->htimestamp(pcm->fast_op_arg, avail, tstamp);
 	else
 		err = -ENOSYS;
-	snd_pcm_unlock(pcm);
+	snd_pcm_unlock(pcm->fast_op_arg);
 	return err;
 }
 
@@ -1209,12 +1209,12 @@ int snd_pcm_prepare(snd_pcm_t *pcm)
 	err = bad_pcm_state(pcm, ~P_STATE(DISCONNECTED));
 	if (err < 0)
 		return err;
-	snd_pcm_lock(pcm);
+	snd_pcm_lock(pcm->fast_op_arg);
 	if (pcm->fast_ops->prepare)
 		err = pcm->fast_ops->prepare(pcm->fast_op_arg);
 	else
 		err = -ENOSYS;
-	snd_pcm_unlock(pcm);
+	snd_pcm_unlock(pcm->fast_op_arg);
 	return err;
 }
 
@@ -1236,12 +1236,12 @@ int snd_pcm_reset(snd_pcm_t *pcm)
 		SNDMSG("PCM not set up");
 		return -EIO;
 	}
-	snd_pcm_lock(pcm);
+	snd_pcm_lock(pcm->fast_op_arg);
 	if (pcm->fast_ops->reset)
 		err = pcm->fast_ops->reset(pcm->fast_op_arg);
 	else
 		err = -ENOSYS;
-	snd_pcm_unlock(pcm);
+	snd_pcm_unlock(pcm->fast_op_arg);
 	return err;
 }
 
@@ -1264,9 +1264,9 @@ int snd_pcm_start(snd_pcm_t *pcm)
 	err = bad_pcm_state(pcm, P_STATE(PREPARED));
 	if (err < 0)
 		return err;
-	snd_pcm_lock(pcm);
+	snd_pcm_lock(pcm->fast_op_arg);
 	err = __snd_pcm_start(pcm);
-	snd_pcm_unlock(pcm);
+	snd_pcm_unlock(pcm->fast_op_arg);
 	return err;
 }
 
@@ -1296,12 +1296,12 @@ int snd_pcm_drop(snd_pcm_t *pcm)
 			    P_STATE(SUSPENDED));
 	if (err < 0)
 		return err;
-	snd_pcm_lock(pcm);
+	snd_pcm_lock(pcm->fast_op_arg);
 	if (pcm->fast_ops->drop)
 		err = pcm->fast_ops->drop(pcm->fast_op_arg);
 	else
 		err = -ENOSYS;
-	snd_pcm_unlock(pcm);
+	snd_pcm_unlock(pcm->fast_op_arg);
 	return err;
 }
 
@@ -1364,12 +1364,12 @@ int snd_pcm_pause(snd_pcm_t *pcm, int enable)
 	err = bad_pcm_state(pcm, P_STATE_RUNNABLE);
 	if (err < 0)
 		return err;
-	snd_pcm_lock(pcm);
+	snd_pcm_lock(pcm->fast_op_arg);
 	if (pcm->fast_ops->pause)
 		err = pcm->fast_ops->pause(pcm->fast_op_arg, enable);
 	else
 		err = -ENOSYS;
-	snd_pcm_unlock(pcm);
+	snd_pcm_unlock(pcm->fast_op_arg);
 	return err;
 }
 
@@ -1397,12 +1397,12 @@ snd_pcm_sframes_t snd_pcm_rewindable(snd_pcm_t *pcm)
 	err = bad_pcm_state(pcm, P_STATE_RUNNABLE);
 	if (err < 0)
 		return err;
-	snd_pcm_lock(pcm);
+	snd_pcm_lock(pcm->fast_op_arg);
 	if (pcm->fast_ops->rewindable)
 		result = pcm->fast_ops->rewindable(pcm->fast_op_arg);
 	else
 		result = -ENOSYS;
-	snd_pcm_unlock(pcm);
+	snd_pcm_unlock(pcm->fast_op_arg);
 	return result;
 }
 
@@ -1430,12 +1430,12 @@ snd_pcm_sframes_t snd_pcm_rewind(snd_pcm_t *pcm, snd_pcm_uframes_t frames)
 	err = bad_pcm_state(pcm, P_STATE_RUNNABLE);
 	if (err < 0)
 		return err;
-	snd_pcm_lock(pcm);
+	snd_pcm_lock(pcm->fast_op_arg);
 	if (pcm->fast_ops->rewind)
 		result = pcm->fast_ops->rewind(pcm->fast_op_arg, frames);
 	else
 		result = -ENOSYS;
-	snd_pcm_unlock(pcm);
+	snd_pcm_unlock(pcm->fast_op_arg);
 	return result;
 }
 
@@ -1463,12 +1463,12 @@ snd_pcm_sframes_t snd_pcm_forwardable(snd_pcm_t *pcm)
 	err = bad_pcm_state(pcm, P_STATE_RUNNABLE);
 	if (err < 0)
 		return err;
-	snd_pcm_lock(pcm);
+	snd_pcm_lock(pcm->fast_op_arg);
 	if (pcm->fast_ops->forwardable)
 		result = pcm->fast_ops->forwardable(pcm->fast_op_arg);
 	else
 		result = -ENOSYS;
-	snd_pcm_unlock(pcm);
+	snd_pcm_unlock(pcm->fast_op_arg);
 	return result;
 }
 
@@ -1500,12 +1500,12 @@ snd_pcm_sframes_t snd_pcm_forward(snd_pcm_t *pcm, snd_pcm_uframes_t frames)
 	err = bad_pcm_state(pcm, P_STATE_RUNNABLE);
 	if (err < 0)
 		return err;
-	snd_pcm_lock(pcm);
+	snd_pcm_lock(pcm->fast_op_arg);
 	if (pcm->fast_ops->forward)
 		result = pcm->fast_ops->forward(pcm->fast_op_arg, frames);
 	else
 		result = -ENOSYS;
-	snd_pcm_unlock(pcm);
+	snd_pcm_unlock(pcm->fast_op_arg);
 	return result;
 }
 use_default_symbol_version(__snd_pcm_forward, snd_pcm_forward, ALSA_0.9.0rc8);
@@ -1724,9 +1724,9 @@ int snd_pcm_poll_descriptors_count(snd_pcm_t *pcm)
 	int count;
 
 	assert(pcm);
-	snd_pcm_lock(pcm);
+	snd_pcm_lock(pcm->fast_op_arg);
 	count = __snd_pcm_poll_descriptors_count(pcm);
-	snd_pcm_unlock(pcm);
+	snd_pcm_unlock(pcm->fast_op_arg);
 	return count;
 }
 
@@ -1782,9 +1782,9 @@ int snd_pcm_poll_descriptors(snd_pcm_t *pcm, struct pollfd *pfds, unsigned int s
 	int err;
 
 	assert(pcm && pfds);
-	snd_pcm_lock(pcm);
+	snd_pcm_lock(pcm->fast_op_arg);
 	err = __snd_pcm_poll_descriptors(pcm, pfds, space);
-	snd_pcm_unlock(pcm);
+	snd_pcm_unlock(pcm->fast_op_arg);
 	return err;
 }
 
@@ -1817,9 +1817,9 @@ int snd_pcm_poll_descriptors_revents(snd_pcm_t *pcm, struct pollfd *pfds, unsign
 	int err;
 
 	assert(pcm && pfds && revents);
-	snd_pcm_lock(pcm);
+	snd_pcm_lock(pcm->fast_op_arg);
 	err = __snd_pcm_poll_revents(pcm, pfds, nfds, revents);
-	snd_pcm_unlock(pcm);
+	snd_pcm_unlock(pcm->fast_op_arg);
 	return err;
 }
 
@@ -2816,9 +2816,9 @@ int snd_pcm_wait(snd_pcm_t *pcm, int timeout)
 {
 	int err;
 
-	__snd_pcm_lock(pcm); /* forced lock */
+	__snd_pcm_lock(pcm->fast_op_arg); /* forced lock */
 	err = __snd_pcm_wait_in_lock(pcm, timeout);
-	__snd_pcm_unlock(pcm);
+	__snd_pcm_unlock(pcm->fast_op_arg);
 	return err;
 }
 
@@ -2865,9 +2865,9 @@ int snd_pcm_wait_nocheck(snd_pcm_t *pcm, int timeout)
 		return -EIO;
 	}
 	do {
-		__snd_pcm_unlock(pcm);
+		__snd_pcm_unlock(pcm->fast_op_arg);
 		err_poll = poll(pfd, npfds, timeout);
-		__snd_pcm_lock(pcm);
+		__snd_pcm_lock(pcm->fast_op_arg);
 		if (err_poll < 0) {
 		        if (errno == EINTR && !PCMINABORT(pcm))
 		                continue;
@@ -2926,9 +2926,9 @@ snd_pcm_sframes_t snd_pcm_avail_update(snd_pcm_t *pcm)
 {
 	snd_pcm_sframes_t result;
 
-	snd_pcm_lock(pcm);
+	snd_pcm_lock(pcm->fast_op_arg);
 	result = __snd_pcm_avail_update(pcm);
-	snd_pcm_unlock(pcm);
+	snd_pcm_unlock(pcm->fast_op_arg);
 	return result;
 }
 
@@ -2956,13 +2956,13 @@ snd_pcm_sframes_t snd_pcm_avail(snd_pcm_t *pcm)
 		SNDMSG("PCM not set up");
 		return -EIO;
 	}
-	snd_pcm_lock(pcm);
+	snd_pcm_lock(pcm->fast_op_arg);
 	err = __snd_pcm_hwsync(pcm);
 	if (err < 0)
 		result = err;
 	else
 		result = __snd_pcm_avail_update(pcm);
-	snd_pcm_unlock(pcm);
+	snd_pcm_unlock(pcm->fast_op_arg);
 	return result;
 }
 
@@ -2989,7 +2989,7 @@ int snd_pcm_avail_delay(snd_pcm_t *pcm,
 		SNDMSG("PCM not set up");
 		return -EIO;
 	}
-	snd_pcm_lock(pcm);
+	snd_pcm_lock(pcm->fast_op_arg);
 	err = __snd_pcm_hwsync(pcm);
 	if (err < 0)
 		goto unlock;
@@ -3004,7 +3004,7 @@ int snd_pcm_avail_delay(snd_pcm_t *pcm,
 	*availp = sf;
 	err = 0;
  unlock:
-	snd_pcm_unlock(pcm);
+	snd_pcm_unlock(pcm->fast_op_arg);
 	return err;
 }
 
@@ -7192,9 +7192,9 @@ int snd_pcm_mmap_begin(snd_pcm_t *pcm,
 	err = bad_pcm_state(pcm, P_STATE_RUNNABLE);
 	if (err < 0)
 		return err;
-	snd_pcm_lock(pcm);
+	snd_pcm_lock(pcm->fast_op_arg);
 	err = __snd_pcm_mmap_begin(pcm, areas, offset, frames);
-	snd_pcm_unlock(pcm);
+	snd_pcm_unlock(pcm->fast_op_arg);
 	return err;
 }
 
@@ -7297,9 +7297,9 @@ snd_pcm_sframes_t snd_pcm_mmap_commit(snd_pcm_t *pcm,
 	err = bad_pcm_state(pcm, P_STATE_RUNNABLE);
 	if (err < 0)
 		return err;
-	snd_pcm_lock(pcm);
+	snd_pcm_lock(pcm->fast_op_arg);
 	result = __snd_pcm_mmap_commit(pcm, offset, frames);
-	snd_pcm_unlock(pcm);
+	snd_pcm_unlock(pcm->fast_op_arg);
 	return result;
 }
 
@@ -7375,7 +7375,7 @@ snd_pcm_sframes_t snd_pcm_read_areas(snd_pcm_t *pcm, const snd_pcm_channel_area_
 	if (size == 0)
 		return 0;
 
-	__snd_pcm_lock(pcm); /* forced lock */
+	__snd_pcm_lock(pcm->fast_op_arg); /* forced lock */
 	while (size > 0) {
 		snd_pcm_uframes_t frames;
 		snd_pcm_sframes_t avail;
@@ -7434,7 +7434,7 @@ snd_pcm_sframes_t snd_pcm_read_areas(snd_pcm_t *pcm, const snd_pcm_channel_area_
 		xfer += frames;
 	}
  _end:
-	__snd_pcm_unlock(pcm);
+	__snd_pcm_unlock(pcm->fast_op_arg);
 	return xfer > 0 ? (snd_pcm_sframes_t) xfer : snd_pcm_check_error(pcm, err);
 }
 
@@ -7449,7 +7449,7 @@ snd_pcm_sframes_t snd_pcm_write_areas(snd_pcm_t *pcm, const snd_pcm_channel_area
 	if (size == 0)
 		return 0;
 
-	__snd_pcm_lock(pcm); /* forced lock */
+	__snd_pcm_lock(pcm->fast_op_arg); /* forced lock */
 	while (size > 0) {
 		snd_pcm_uframes_t frames;
 		snd_pcm_sframes_t avail;
@@ -7523,7 +7523,7 @@ snd_pcm_sframes_t snd_pcm_write_areas(snd_pcm_t *pcm, const snd_pcm_channel_area
 		xfer += frames;
 	}
  _end:
-	__snd_pcm_unlock(pcm);
+	__snd_pcm_unlock(pcm->fast_op_arg);
 	return xfer > 0 ? (snd_pcm_sframes_t) xfer : snd_pcm_check_error(pcm, err);
 }
 
-- 
2.16.4

_______________________________________________
Alsa-devel mailing list
Alsa-devel@alsa-project.org
https://mailman.alsa-project.org/mailman/listinfo/alsa-devel

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [alsa-devel] [PATCH 0/3] Make pcm_ioplug check lock status before locking (fixes pcm_jack lockups)
  2019-09-24 10:18     ` Takashi Iwai
@ 2019-09-25  6:10       ` Ben Russell
  0 siblings, 0 replies; 8+ messages in thread
From: Ben Russell @ 2019-09-25  6:10 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: alsa-devel

On Tue, 24 Sep 2019 12:18:24 +0200
Takashi Iwai <tiwai@suse.de> wrote:

> On Tue, 24 Sep 2019 09:45:25 +0200,
> Ben Russell wrote:
> > 
> > On Sun, 22 Sep 2019 22:25:48 +0200
> > Takashi Iwai <tiwai@suse.de> wrote:
> > 
> > > On Sun, 22 Sep 2019 05:28:50 +0200,
> > > Ben Russell wrote:
> > > > 
> > > > This is my first time contributing a patch to a mailing list. If I've made a mess, please let me know so I can learn how to avoid it in future.
> > > > 
> > > > The purpose of this patchset is to fix a specific common lockup in the pcm_jack plugin (from alsa-plugins). I'm not sure if this is the right approach to take, but at the very least it should make the pcm_ioplug code a bit more resilient.
> > > > 
> > > > The problem is this: When using the pcm_jack plugin, if a program attempts to play audio using the SND_PCM_FORMAT_FLOAT format, said program locks up.
> > > > 
> > > > This should be enough to reproduce the bug:
> > > > 
> > > >     pcm.rawjack {
> > > >         type jack
> > > >         playback_ports {
> > > >             0 system:playback_1
> > > >             1 system:playback_2
> > > >         }
> > > >         capture_ports {
> > > >             0 system:capture_1
> > > >             1 system:capture_2
> > > >         }
> > > >     }
> > > >     
> > > >     pcm.!default {
> > > >         type plug
> > > >         slave.pcm "rawjack"
> > > >     }
> > > > 
> > > > What's happening is that several snd_pcm_ioplug_* functions assume that the pcm mutex is locked already. It then proceeds to unlock the mutex, call a function, and then relock the mutex. When the mutex isn't locked already, the initial unlock results in a silently ignored pthread error, and the lock results in the program eventually deadlocking as it doesn't expect the mutex to be held at that point.
> > > > 
> > > > Patch 2 modifies pcm_ioplug to check if the mutex is held before doing the unlock-act-lock sequence, and if the mutex is not held then it skips the unlock and lock stages. This depends on Patch 1, which adds a snd_pcm_is_locked function to give the state of the mutex.
> > > > 
> > > > Patch 3 is completely optional. It adds assertions which make sure that all uses of snd_pcm_lock/snd_pcm_unlock are correct. On one hand this will likely result in crashes in some of the less refined parts of the code. On the other hand, when that happens, you'll know which parts need a bit more love. I know it was useful for finding this issue in the first place.
> > > > 
> > > > These patches fix the problems I am having, but if you have a more suitable approach to fixing this problem then please let me know.
> > > 
> > > Thanks for the analysis and patches.  It's indeed a serious problem we
> > > need to address.
> > > 
> > > The current semantics of locking in alsa-lib code assumes that the
> > > lock/unlock never fails.  So the "right-and-quick" fix would be just
> > > to take the patch 3 to assert() upon pthread_mutex_lock/unlock
> > > errors.  Then we need to paper over the actual invalid locks.
> > > 
> > > I do wonder, though, exactly which code path triggers the pthread
> > > mutex error?  You should be able to catch it via gdb after the patch.
> > > 
> > > 
> > > thanks,
> > > 
> > > Takashi
> > 
> > Sure thing. With patch #3 against commit 1d7a131f, this trips up on an assert:
> > 
> > $ gdb --args mpv --ao=alsa --audio-format=float RESPECOGNIZE.mp3
> > 
> > The relevant backtrace:
> > 
> > #4  0x00007ffff4b06cf3 in snd_pcm_unlock (pcm=0x5555558625c0) at pcm_local.h:1187
> > #5  0x00007ffff4b075b3 in snd_pcm_unlock (pcm=0x5555558625c0) at pcm_local.h:1187
> > 	if (pcm->lock_enabled && pcm->need_lock) {
> > 		unlock_err = pthread_mutex_unlock(&pcm->lock);
> > 		assert(unlock_err == 0); <-- this line
> > 	}
> > #6  snd_pcm_ioplug_prepare (pcm=0x5555558625c0) at pcm_ioplug.c:167
> > 	snd_pcm_ioplug_reset(pcm);
> > 	if (io->data->callback->prepare) {
> > 		snd_pcm_unlock(pcm); /* to avoid deadlock */ <-- this line
> > 		err = io->data->callback->prepare(io->data);
> > 		snd_pcm_lock(pcm);
> > 	}
> > #7  0x00007ffff4abcba0 in snd_pcm_prepare (pcm=0x5555558627c0) at pcm.c:1214
> > 	snd_pcm_lock(pcm);
> > 	if (pcm->fast_ops->prepare)
> > 		err = pcm->fast_ops->prepare(pcm->fast_op_arg); <-- this line
> > 	else
> > 		err = -ENOSYS;
> > #8  0x00005555555a2efa in ?? ()
> > 
> > Using --audio-format=s16 does not deadlock, as it goes via pcm_plugin before it can go via pcm_ioplug (it's most likely for sample format conversion):
> > 
> > #0  0x00007ffff4b07522 in snd_pcm_unlock (pcm=<optimized out>) at pcm_ioplug.c:166
> > #1  snd_pcm_ioplug_prepare (pcm=0x555555862640) at pcm_ioplug.c:167
> > #2  0x00007ffff4abcba0 in snd_pcm_prepare (pcm=0x555555862640) at pcm.c:1214
> > #3  0x00007ffff4ad2ed7 in snd_pcm_plugin_prepare (pcm=0x555555862e50) at pcm_plugin.c:159
> > 	snd_pcm_plugin_t *plugin = pcm->private_data;
> > 	int err;
> > 	err = snd_pcm_prepare(plugin->gen.slave); <-- this line
> > 	if (err < 0)
> > 		return err;
> > #4  0x00007ffff4abcba0 in snd_pcm_prepare (pcm=0x555555862840) at pcm.c:1214
> > #5  0x00005555555a2efa in ?? ()
> 
> Thanks, after reading the traces carefully, finally I figured out what
> went wrong.  In some cases, pcm->fast_op_arg isn't pcm itself but has
> a different value.  Then we call snd_pcm_lock() with pcm, but call
> fast_op with fast_op_arg as the PCM object.  This leads to the
> unexpected behavior for the plugin that tries to unlock / relock from
> the given PCM object.
> 
> So, we'd need to convert snd_pcm_lock()/unlock() with fast_op_arg or
> op_arg in most places.
> 
> A test fix patch is below.  Could you test it quickly?
> I'm going to traveling from tomorrow, so I'd like to have confirmation
> before merging into git repo.
> 
> 
> thanks,
> 
> Takashi
> 
> -- 8< --

The patch works fine. mpv works for float and s16, Wine works, Mumble works for capture and playback.

Thanks for tracking this down so quickly.

Regards,
Ben R
_______________________________________________
Alsa-devel mailing list
Alsa-devel@alsa-project.org
https://mailman.alsa-project.org/mailman/listinfo/alsa-devel

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2019-09-25  6:12 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-22  3:28 [alsa-devel] [PATCH 0/3] Make pcm_ioplug check lock status before locking (fixes pcm_jack lockups) Ben Russell
2019-09-22  3:28 ` [alsa-devel] [PATCH 1/3] pcm_local: Add snd_pcm_is_locked Ben Russell
2019-09-22  3:28 ` [alsa-devel] [PATCH 2/3] pcm_ioplug: Don't unlock+lock if we're not locked Ben Russell
2019-09-22  3:28 ` [alsa-devel] [PATCH 3/3] pcm_local: assert() when using mutexes incorrectly Ben Russell
2019-09-22 20:25 ` [alsa-devel] [PATCH 0/3] Make pcm_ioplug check lock status before locking (fixes pcm_jack lockups) Takashi Iwai
2019-09-24  7:45   ` Ben Russell
2019-09-24 10:18     ` Takashi Iwai
2019-09-25  6:10       ` Ben Russell

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.