linux-kernel.vger.kernel.org archive mirror
* Random process lockup on ARM board: alsa-lib-1.0.25, FUTEX_WAIT_PRIVATE
       [not found]             ` <4F45167A.6080706@ladisch.de>
@ 2012-02-24  0:29               ` Jonathan Andrews
  2012-02-29  9:12                 ` Huang Shijie
  0 siblings, 1 reply; 10+ messages in thread
From: Jonathan Andrews @ 2012-02-24  0:29 UTC
  To: linux-kernel

Using kernel 3.2.5 with alsa-lib 1.0.25, all compiled with generic
Debian arm-linux-gnueabi toolchain.

arm-linux-gnueabi-gcc (Debian 4.3.2-1.1) 4.3.2
Was used to build kernel, alsa-lib and application.

Changing gcc version, kernel version or alsa-lib version makes the
problem worse or better, but ALL versions seem to suffer this problem. I
have also seen it once on Intel (but only once so far).

Something seems broken at a lower layer than I'm using.  I simply don't
have the skill to debug it.

The hardware is a USB cm109 audio adapter, but the problem seems to show
on more than this one driver.

The audio application writing to ALSA freezes at random intervals,
infrequently at the moment; the last freeze came after a runtime of
20H 37M 29S.  Two processes are running, one reading from the sound
device and one writing to the sound device. I am not using threading or
anything very clever, just generic ALSA functions.

This is the only diagnostic I can generate so far, as running the
application under strace slows it to the point where it no longer
functions well enough to reproduce the problem.

ARM / # strace -p 417
Process 417 attached - interrupt to quit
futex(0x175734, FUTEX_WAIT_PRIVATE, 2, NULL^C <unfinished ...>
Process 417 detached

ARM / # uname -a
Linux (none) 3.2.5 #2 Wed Feb 22 17:11:52 GMT 2012 armv4tl GNU/Linux
ARM / # uptime
 22:36:19 up 22:36,  0 users,  load average: 0.15, 0.16, 0.18
ARM / # cat /proc/cpuinfo 
Processor       : ARM920T rev 0 (v4l)
BogoMIPS        : 199.06
Features        : swp half thumb crunch 
CPU implementer : 0x41
CPU architecture: 4T
CPU variant     : 0x1
CPU part        : 0x920
CPU revision    : 0


Any help welcome.

Thanks,
Jon




* Re: Random process lockup on ARM board: alsa-lib-1.0.25, FUTEX_WAIT_PRIVATE
  2012-02-24  0:29               ` Random process lockup on ARM board: alsa-lib-1.0.25, FUTEX_WAIT_PRIVATE Jonathan Andrews
@ 2012-02-29  9:12                 ` Huang Shijie
  2012-03-07 20:07                   ` Darren Hart
  0 siblings, 1 reply; 10+ messages in thread
From: Huang Shijie @ 2012-02-29  9:12 UTC
  To: jon; +Cc: linux-kernel

Hi,

I hit a similar problem with the latest futex code.

When I play video, the processes hang at the futex.

BR
Huang Shijie


* Re: Random process lockup on ARM board: alsa-lib-1.0.25, FUTEX_WAIT_PRIVATE
  2012-02-29  9:12                 ` Huang Shijie
@ 2012-03-07 20:07                   ` Darren Hart
  2012-03-07 21:22                     ` Jonathan Andrews
  2012-03-08  2:28                     ` Huang Shijie
  0 siblings, 2 replies; 10+ messages in thread
From: Darren Hart @ 2012-03-07 20:07 UTC
  To: Huang Shijie; +Cc: jon, linux-kernel



On 02/29/2012 01:12 AM, Huang Shijie wrote:
> Hi,
> 
> I hit a similar problem with the latest futex code.
> 
> When I play video, the processes hang at the futex.

Are either of you able to bisect the kernel? At the very least can you
find two kernels where it works and where it does not?

Hanging on FUTEX_WAIT_PRIVATE can be a symptom of higher-level
problems, including userspace locking issues and race conditions.
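
As one illustration of a userspace locking issue that ends in a futex
wait: with glibc, a thread that locks a default mutex it already holds
usually blocks forever in a FUTEX_WAIT_PRIVATE call much like the one in
the strace output. The sketch below is not code from this thread, and
`relock_errorcheck_mutex` is a hypothetical name. It assumes a POSIX
threads implementation and uses an error-checking mutex, so the second
lock attempt reports EDEADLK instead of hanging.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical helper: lock an error-checking mutex twice from the same
 * thread and return the result of the second attempt. */
int relock_errorcheck_mutex(void)
{
    pthread_mutexattr_t attr;
    pthread_mutex_t m;
    int rc;

    rc = pthread_mutexattr_init(&attr);
    assert(rc == 0);
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    assert(rc == 0);
    rc = pthread_mutex_init(&m, &attr);
    assert(rc == 0);
    pthread_mutexattr_destroy(&attr);

    rc = pthread_mutex_lock(&m);    /* first lock succeeds */
    assert(rc == 0);
    rc = pthread_mutex_lock(&m);    /* relock: reports an error, no hang */

    pthread_mutex_unlock(&m);       /* release the one lock we hold */
    pthread_mutex_destroy(&m);
    return rc;
}
```

Called from a main() and linked with -pthread, the helper should return
EDEADLK. Switching a suspect mutex to this type while debugging is one
way to turn a silent hang into a visible error.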

Huang, are you also on ARM?

--
Darren

-- 
Darren Hart
Intel Open Source Technology Center
Yocto Project - Linux Kernel


* Re: Random process lockup on ARM board: alsa-lib-1.0.25, FUTEX_WAIT_PRIVATE
  2012-03-07 20:07                   ` Darren Hart
@ 2012-03-07 21:22                     ` Jonathan Andrews
  2012-03-08  3:42                       ` Darren Hart
  2012-03-08  2:28                     ` Huang Shijie
  1 sibling, 1 reply; 10+ messages in thread
From: Jonathan Andrews @ 2012-03-07 21:22 UTC
  To: Darren Hart; +Cc: Huang Shijie, linux-kernel

On Wed, 2012-03-07 at 12:07 -0800, Darren Hart wrote:
> 
> On 02/29/2012 01:12 AM, Huang Shijie wrote:
> > Hi,
> > 
> > I hit a similar problem with the latest futex code.
> > 
> > When I play video, the processes hang at the futex.
> 
> Are either of you able to bisect the kernel?
I'm not a kernel hacker; what do you mean?

>  At the very least can you
> find two kernels where it works and where it does not?
> 
> Hanging on FUTEX_WAIT_PRIVATE can be a symptom of higher-level
> problems, including userspace locking issues and race conditions.


My workload is UDP network audio.  I have compiled my code with and
without ALSA support. The version without ALSA seems to run forever; the
version with ALSA works on ARM for between a few minutes and a few
hours. On Intel the same futex stall occurs, but it may take days of
runtime.

I have two processes running. One is an RX process that takes UDP
packets from the network, mixes them, and presents them to ALSA as an
audio stream; the second process takes audio from the sound device and
transmits it as a UDP audio stream.  The two processes are independent.

My workload is atypical as I need to both transmit and receive audio via
UDP on a 24/7 basis.

So far I have experienced the problem on 3 kernels, but I have only
tried 3 kernels; it may be that all 2.6 kernels suffer.

My development PC is "Linux jonspc 2.6.32.26-175.fc12.i686 #1 SMP Wed
Dec 1 21:52:04 UTC 2010 i686 athlon i386 GNU/Linux"

My ARM board target:
ARM / # uname -a
Linux (none) 3.2.5 #2 Wed Feb 22 17:11:52 GMT 2012 armv4tl GNU/Linux

My ARM target previously ran an older kernel (2.6.36).
 
I have an strace of the process running and stalling on the PC.
The file is 2GB and it's not a fast link, sorry.
 
http://www.jonshouse.co.uk/download/a_stop.txt


Many thanks,
Jon





* Re: Random process lockup on ARM board: alsa-lib-1.0.25, FUTEX_WAIT_PRIVATE
  2012-03-07 20:07                   ` Darren Hart
  2012-03-07 21:22                     ` Jonathan Andrews
@ 2012-03-08  2:28                     ` Huang Shijie
  2012-03-08  3:36                       ` Darren Hart
  1 sibling, 1 reply; 10+ messages in thread
From: Huang Shijie @ 2012-03-08  2:28 UTC
  To: Darren Hart; +Cc: jon, linux-kernel

Hi,

On Thu, Mar 8, 2012 at 4:07 AM, Darren Hart <dvhart@linux.intel.com> wrote:
>
>
> On 02/29/2012 01:12 AM, Huang Shijie wrote:
>> Hi,
>>
>> I hit a similar problem with the latest futex code.
>>
>> When I play video, the processes hang at the futex.
>
> Are either of you able to bisect the kernel? At the very least can you

I finally found that my arch/arm/include/asm/futex.h was not the
latest, so I updated the header.

The futex issue is gone now, but a data abort issue has appeared. I am
not sure whether it is caused by the futex patch.
I am debugging it now.


BR
Huang Shijie
> find two kernels where it works and where it does not?
>
> Hanging on FUTEX_WAIT_PRIVATE can be a symptom of higher-level
> problems, including userspace locking issues and race conditions.
>
> Huang, are you also on ARM?

Yes, the Freescale i.MX6Q platform.

BR
Huang Shijie

* Re: Random process lockup on ARM board: alsa-lib-1.0.25, FUTEX_WAIT_PRIVATE
  2012-03-08  2:28                     ` Huang Shijie
@ 2012-03-08  3:36                       ` Darren Hart
  2012-03-08  4:24                         ` Huang Shijie
  0 siblings, 1 reply; 10+ messages in thread
From: Darren Hart @ 2012-03-08  3:36 UTC
  To: Huang Shijie; +Cc: jon, linux-kernel



On 03/07/2012 06:28 PM, Huang Shijie wrote:
> Hi,
> 
> On Thu, Mar 8, 2012 at 4:07 AM, Darren Hart <dvhart@linux.intel.com> wrote:
>>
>>
>> On 02/29/2012 01:12 AM, Huang Shijie wrote:
>>> Hi ,
>>>
>>> I meet a similar problem with the latest futex code.
>>>
>>> I play the video and the processes will hang at the futex.
>>
>> Are either of you able to bisect the kernel? At the very least can you
> 
> I finally found that my arch/arm/include/asm/futex.h was not the
> latest, so I updated the header.

Just make sure it matches your kernel version.

> 
> The futex issue is gone now, but a data abort issue has appeared. I am
> not sure whether it is caused by the futex patch.
> I am debugging it now.

Which APIs are you using that make the futex syscall?

--
Darren

-- 
Darren Hart
Intel Open Source Technology Center
Yocto Project - Linux Kernel


* Re: Random process lockup on ARM board: alsa-lib-1.0.25, FUTEX_WAIT_PRIVATE
  2012-03-07 21:22                     ` Jonathan Andrews
@ 2012-03-08  3:42                       ` Darren Hart
  0 siblings, 0 replies; 10+ messages in thread
From: Darren Hart @ 2012-03-08  3:42 UTC
  To: jon; +Cc: Huang Shijie, linux-kernel



On 03/07/2012 01:22 PM, Jonathan Andrews wrote:
> On Wed, 2012-03-07 at 12:07 -0800, Darren Hart wrote:
>> Are either of you able to bisect the kernel?
> I'm not a kernel hacker; what do you mean?

Google for "git bisect". It's a way to divide-and-conquer and find
exactly where the kernel stops working for you. You would need a known
working kernel though. If you don't have a known working kernel, then
this is either a long standing bug (not a regression) or, I suspect more
likely, a locking bug or race in the userspace code.

When you see your thread stuck in FUTEX_WAIT_PRIVATE, that in and of
itself is not an indication of a problem. It's waiting there to be woken
(FUTEX_WAKE_PRIVATE) by another thread in your application. So the
questions you should be asking are:

Why is it blocked?
	pthread_cond_wait()?
	pthread_mutex_lock()?

Why isn't it being woken up?
	Did you call pthread_cond_broadcast() without taking the mutex
	first?

There are other APIs that use futexes under the covers. You should be
able to sort out where in your application your threads are blocked and
thus determine which API you are using, and from there how you were
expecting that to get woken up. These calls may be being made from
within ALSA, in which case you may need to enlist the help of the ALSA
developers.
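
To make the questions above concrete, here is a minimal sketch of the
condition-variable pattern that avoids a lost wakeup. It is an
illustration only: the names `ready`, `waiter` and `condvar_demo` are
hypothetical and are not taken from Jon's application or from ALSA. The
waiter re-checks a predicate under the mutex, and the waker changes the
predicate and broadcasts while holding the same mutex.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
static int ready;   /* predicate, only touched while holding lock */

static void *waiter(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);
    /* Loop on the predicate: if the waker already ran, we never sleep,
     * and a spurious wakeup sends us back to sleep. */
    while (!ready)
        pthread_cond_wait(&cond, &lock);
    assert(ready == 1);
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Hypothetical driver: returns 0 once the waiter has been woken. */
int condvar_demo(void)
{
    pthread_t t;

    pthread_mutex_lock(&lock);
    ready = 0;
    pthread_mutex_unlock(&lock);

    if (pthread_create(&t, NULL, waiter, NULL) != 0)
        return -1;

    /* Take the mutex before changing the predicate and signalling, so
     * the wakeup cannot slip in between the waiter's check and sleep. */
    pthread_mutex_lock(&lock);
    ready = 1;
    pthread_cond_broadcast(&cond);
    pthread_mutex_unlock(&lock);

    return pthread_join(t, NULL);
}
```

If the predicate were set and the broadcast sent without the mutex, the
waiter could test `ready`, see 0, and then sleep after the broadcast had
already happened. Under strace that thread would look just like the
stuck futex wait shown earlier.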

--
Darren

-- 
Darren Hart
Intel Open Source Technology Center
Yocto Project - Linux Kernel


* Re: Random process lockup on ARM board: alsa-lib-1.0.25, FUTEX_WAIT_PRIVATE
  2012-03-08  3:36                       ` Darren Hart
@ 2012-03-08  4:24                         ` Huang Shijie
  2012-03-08  7:40                           ` Darren Hart
  0 siblings, 1 reply; 10+ messages in thread
From: Huang Shijie @ 2012-03-08  4:24 UTC
  To: Darren Hart; +Cc: jon, linux-kernel

hi,

On Thu, Mar 8, 2012 at 11:36 AM, Darren Hart <dvhart@linux.intel.com> wrote:
> Which APIs are you using that make the futex syscall?
>

futex_wait().

I cherry-picked
"df77abc ARM: 7099/1: futex: preserve oldval in SMP __futex_atomic_op"
into arch/arm/include/asm/futex.h.


BR
Huang Shijie


* Re: Random process lockup on ARM board: alsa-lib-1.0.25, FUTEX_WAIT_PRIVATE
  2012-03-08  4:24                         ` Huang Shijie
@ 2012-03-08  7:40                           ` Darren Hart
  2012-03-08  8:43                             ` Huang Shijie
  0 siblings, 1 reply; 10+ messages in thread
From: Darren Hart @ 2012-03-08  7:40 UTC
  To: Huang Shijie; +Cc: jon, linux-kernel



On 03/07/2012 08:24 PM, Huang Shijie wrote:
>> Which APIs are you using that make the futex syscall?
>>
> 
> futex_wait().

You are calling the futex syscall directly from your application?

--
Darren

-- 
Darren Hart
Intel Open Source Technology Center
Yocto Project - Linux Kernel


* Re: Random process lockup on ARM board: alsa-lib-1.0.25, FUTEX_WAIT_PRIVATE
  2012-03-08  7:40                           ` Darren Hart
@ 2012-03-08  8:43                             ` Huang Shijie
  0 siblings, 0 replies; 10+ messages in thread
From: Huang Shijie @ 2012-03-08  8:43 UTC
  To: Darren Hart; +Cc: jon, linux-kernel

On Thu, Mar 8, 2012 at 3:40 PM, Darren Hart <dvhart@linux.intel.com> wrote:
> You are calling the futex syscall directly from your application?
>

The application calls futex(); the hang I saw earlier was in futex_wait().

Huang Shijie
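
For readers unfamiliar with calling the futex syscall directly, a
minimal sketch follows. It assumes Linux with glibc, which provides no
futex() wrapper, so the call goes through syscall(2); `futex_call`,
`waker` and `futex_wait_demo` are hypothetical names and this is not
Huang's application code. The third argument to FUTEX_WAIT_PRIVATE is
the value the caller expects the futex word to hold: the kernel only
puts the caller to sleep if the word still equals it, which is what the
`2` in the strace output at the top of the thread represents.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <linux/futex.h>
#include <pthread.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

static uint32_t futex_word;   /* 0 = not ready, 1 = ready */

/* Thin wrapper around the raw syscall; timeout, uaddr2 and val3 unused. */
static long futex_call(uint32_t *uaddr, int op, uint32_t val)
{
    return syscall(SYS_futex, uaddr, op, val, NULL, NULL, 0);
}

static void *waker(void *arg)
{
    (void)arg;
    /* Publish the new value first, then wake at most one waiter. */
    __atomic_store_n(&futex_word, 1, __ATOMIC_RELEASE);
    futex_call(&futex_word, FUTEX_WAKE_PRIVATE, 1);
    return NULL;
}

/* Hypothetical driver: returns 0 once the word has been observed as 1. */
int futex_wait_demo(void)
{
    pthread_t t;

    __atomic_store_n(&futex_word, 0, __ATOMIC_RELAXED);
    if (pthread_create(&t, NULL, waker, NULL) != 0)
        return -1;

    while (__atomic_load_n(&futex_word, __ATOMIC_ACQUIRE) == 0) {
        /* Sleep only if the word is still 0; if the waker got there
         * first the kernel returns EAGAIN and we re-check the word. */
        long rc = futex_call(&futex_word, FUTEX_WAIT_PRIVATE, 0);
        if (rc == -1 && errno != EAGAIN && errno != EINTR) {
            pthread_join(t, NULL);
            return -1;
        }
    }
    assert(__atomic_load_n(&futex_word, __ATOMIC_ACQUIRE) == 1);
    pthread_join(t, NULL);
    return 0;
}
```

The compare-before-sleep step is what makes the wait safe in userspace,
and it relies on the kernel's atomic access to the futex word, which on
ARM lives in arch/arm/include/asm/futex.h.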


end of thread, other threads:[~2012-03-08  8:43 UTC | newest]

Thread overview: 10+ messages
     [not found] <1329393691.6830.20.camel@jonspc>
     [not found] ` <1329512682.29051.1.camel@jonspc>
     [not found]   ` <1329570128.6670.0.camel@jonspc>
     [not found]     ` <4F400BBF.9020707@ladisch.de>
     [not found]       ` <1329603022.1089.57.camel@jonspc>
     [not found]         ` <4F41FC28.1070605@ladisch.de>
     [not found]           ` <1329926198.22918.10.camel@jonspc>
     [not found]             ` <4F45167A.6080706@ladisch.de>
2012-02-24  0:29               ` Random process lockup on ARM board: alsa-lib-1.0.25, FUTEX_WAIT_PRIVATE Jonathan Andrews
2012-02-29  9:12                 ` Huang Shijie
2012-03-07 20:07                   ` Darren Hart
2012-03-07 21:22                     ` Jonathan Andrews
2012-03-08  3:42                       ` Darren Hart
2012-03-08  2:28                     ` Huang Shijie
2012-03-08  3:36                       ` Darren Hart
2012-03-08  4:24                         ` Huang Shijie
2012-03-08  7:40                           ` Darren Hart
2012-03-08  8:43                             ` Huang Shijie
