All of lore.kernel.org
 help / color / mirror / Atom feed
* N_PPP_SYNC ldisc BUG: sleeping function called from invalid context
@ 2009-09-30 14:50 Tilman Schmidt
  2009-09-30 16:47 ` Alan Cox
  0 siblings, 1 reply; 7+ messages in thread
From: Tilman Schmidt @ 2009-09-30 14:50 UTC (permalink / raw)
  To: linux-kernel, netdev; +Cc: Alan Cox

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

While testing pppd with capiplugin on kernel 2.6.31 (plus my CAPI and
Gigaset patches) I came across this BUG message. (Please ignore the
Tainted flag, it's from the Nvidia driver and doesn't influence what's
following.)

BUG: sleeping function called from invalid context at kernel/mutex.c:280
in_atomic(): 1, irqs_disabled(): 1, pid: 10, name: events/0
INFO: lockdep is turned off.
irq event stamp: 2034
hardirqs last  enabled at (2033): [<c0347b07>]
_spin_unlock_irqrestore+0x3c/0x6c
hardirqs last disabled at (2034): [<c03475d0>] _spin_lock_irq+0x11/0x3f
softirqs last  enabled at (2008): [<c012f598>] __do_softirq+0x114/0x11c
softirqs last disabled at (1971): [<c010593a>] do_softirq+0x69/0xc7
Pid: 10, comm: events/0 Tainted: P           2.6.31-vanilla #1
Call Trace:
 [<c0124a3c>] __might_sleep+0x10e/0x116
 [<c0346bdd>] mutex_lock_nested+0x18/0x31
 [<c026d39b>] tty_unthrottle+0x10/0x38
 [<f8dcc31f>] ppp_sync_receive+0x168/0x170 [ppp_synctty]
 [<f8fbb9ce>] handle_minor_recv+0x187/0x1cd [capi]
 [<f8fbc19b>] capi_recv_message+0x1d9/0x24e [capi]
 [<fa5be461>] recv_handler+0x56/0x6f [kernelcapi]
 [<c0138cba>] worker_thread+0x14a/0x21d
 [<c0138c9e>] ? worker_thread+0x12e/0x21d
 [<fa5be40b>] ? recv_handler+0x0/0x6f [kernelcapi]
 [<c013b375>] ? autoremove_wake_function+0x0/0x30
 [<c013b201>] kthread+0x64/0x69
 [<c0138b70>] ? worker_thread+0x0/0x21d
 [<c013b19d>] ? kthread+0x0/0x69
 [<c0103f33>] kernel_thread_helper+0x7/0x10

Turns out the ppp_sync_receive() function (drivers/net/ppp_synctty.c
line 385ff.) has a comment in front stating:

/*
 * This can now be called from hard interrupt level as well
 * as soft interrupt level or mainline.
 */

but calls tty_unthrottle() which in turn calls mutex_lock() which of
course can sleep.

That tty_unthrottle() call was already removed once by

commit a6540f731d506d9e82444cf0020e716613d4c46c
Author: Alan Cox <alan@linux.intel.com>
Date:   Mon Jun 22 18:42:29 2009 +0100

    ppp: Fix throttling bugs

but re-added by

commit 4a21b8cb3550f19f838f7c48345fbbf6a0e8536b
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Thu Jul 16 09:14:23 2009 -0700

    Revert "ppp: Fix throttling bugs"

    This reverts commit a6540f731d506d9e82444cf0020e716613d4c46c, as
    requested by Alan:

      "... as it was wrong, the pty code is now fixed and the fact this
       isn't reverted is breaking pptp setups."

Opinions?

Thanks,
Tilman

- --
Tilman Schmidt                    E-Mail: tilman@imap.cc
Bonn, Germany
Diese Nachricht besteht zu 100% aus wiederverwerteten Bits.
Ungeöffnet mindestens haltbar bis: (siehe Rückseite)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.4 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFKw3AyQ3+did9BuFsRAiZXAKCGaos/qZNTlStEP4SE90PA0ZoMAQCdFtvf
U9chE7at35y8c6CGGS1IGg0=
=Vpq4
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: N_PPP_SYNC ldisc BUG: sleeping function called from invalid context
  2009-09-30 14:50 N_PPP_SYNC ldisc BUG: sleeping function called from invalid context Tilman Schmidt
@ 2009-09-30 16:47 ` Alan Cox
  2009-09-30 18:55   ` Tilman Schmidt
  0 siblings, 1 reply; 7+ messages in thread
From: Alan Cox @ 2009-09-30 16:47 UTC (permalink / raw)
  To: Tilman Schmidt; +Cc: linux-kernel, netdev, Alan Cox

>  [<c026d39b>] tty_unthrottle+0x10/0x38
>  [<f8dcc31f>] ppp_sync_receive+0x168/0x170 [ppp_synctty]
>  [<f8fbb9ce>] handle_minor_recv+0x187/0x1cd [capi]
>  [<f8fbc19b>] capi_recv_message+0x1d9/0x24e [capi]

Really need to see the rest of the call trace to be sure

> Turns out the ppp_sync_receive() function (drivers/net/ppp_synctty.c
> line 385ff.) has a comment in front stating:
> 
> /*
>  * This can now be called from hard interrupt level as well
>  * as soft interrupt level or mainline.
>  */

Which is wrong. The flip_buffer_push -> rx processing path should never
be called from IRQ context and that was fixed for various drivers that
mis-set tty->low_latency, as well as in the PPP rework. The PPP case is
actually unrelated in many was.

> Opinions?

See how we got into that code direct from an IRQ path. The expectation of
the tty logic is that it gets processed from work queues either
specifically in driver or via tty_flip_buffer_push when tty->low_latency
= 0

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: N_PPP_SYNC ldisc BUG: sleeping function called from invalid context
  2009-09-30 16:47 ` Alan Cox
@ 2009-09-30 18:55   ` Tilman Schmidt
  2009-09-30 20:28     ` Jarek Poplawski
  2009-09-30 21:12     ` Jarek Poplawski
  0 siblings, 2 replies; 7+ messages in thread
From: Tilman Schmidt @ 2009-09-30 18:55 UTC (permalink / raw)
  To: Alan Cox; +Cc: linux-kernel, netdev, Alan Cox

[-- Attachment #1: Type: text/plain, Size: 2412 bytes --]

Alan Cox schrieb:
>>  [<c026d39b>] tty_unthrottle+0x10/0x38
>>  [<f8dcc31f>] ppp_sync_receive+0x168/0x170 [ppp_synctty]
>>  [<f8fbb9ce>] handle_minor_recv+0x187/0x1cd [capi]
>>  [<f8fbc19b>] capi_recv_message+0x1d9/0x24e [capi]
> 
> Really need to see the rest of the call trace to be sure

There wasn't more than what I posted. I had six of them, they looked all
identical, and all of them ended after the kernel_thread_helper line. 

>> Turns out the ppp_sync_receive() function (drivers/net/ppp_synctty.c
>> line 385ff.) has a comment in front stating:
>>
>> /*
>>  * This can now be called from hard interrupt level as well
>>  * as soft interrupt level or mainline.
>>  */
> 
> Which is wrong. The flip_buffer_push -> rx processing path should never
> be called from IRQ context and that was fixed for various drivers that
> mis-set tty->low_latency, as well as in the PPP rework. The PPP case is
> actually unrelated in many was.

Might be worth correcting that text then before is misleads someone.

>> Opinions?
> 
> See how we got into that code direct from an IRQ path. The expectation of
> the tty logic is that it gets processed from work queues either
> specifically in driver or via tty_flip_buffer_push when tty->low_latency
> = 0

I'm at a loss here. According to all the backtraces:

- ppp_sync_receive() was called, as the LD's receive_buf method,
  via handle_recv_skb() [drivers/isdn/capi/capi.c line 504, inlined]
  from handle_minor_recv() [drivers/isdn/capi/capi.c line 519]

- handle_minor_recv() was called from capi_recv_message()
  [drivers/isdn/capi/capi.c line 656]

- capi_recv_message() was called, as the CAPI application's
  recv_message method, from recv_handler()
  [drivers/isdn/capi/kcapi.c line 268]

- recv_handler() is never called directly. It's only scheduled
  via the work queue ap->recv_work from capi_ctr_handle_message()
  [drivers/isdn/capi/kcapi.c line 349]

Even if we don't trust the backtraces, there's not much room for
another activation path. So for all I know, the expectation of the
tty logic should have been met. The call was indeed processed from
a work queue.

Why then does mutex_lock() complain?

-- 
Tilman Schmidt                    E-Mail: tilman@imap.cc
Bonn, Germany
Diese Nachricht besteht zu 100% aus wiederverwerteten Bits.
Ungeöffnet mindestens haltbar bis: (siehe Rückseite)


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 254 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: N_PPP_SYNC ldisc BUG: sleeping function called from invalid context
  2009-09-30 18:55   ` Tilman Schmidt
@ 2009-09-30 20:28     ` Jarek Poplawski
  2009-09-30 22:00       ` Tilman Schmidt
  2009-09-30 21:12     ` Jarek Poplawski
  1 sibling, 1 reply; 7+ messages in thread
From: Jarek Poplawski @ 2009-09-30 20:28 UTC (permalink / raw)
  To: Tilman Schmidt; +Cc: Alan Cox, linux-kernel, netdev, Alan Cox

Tilman Schmidt wrote, On 09/30/2009 08:55 PM:

> Alan Cox schrieb:
>>>  [<c026d39b>] tty_unthrottle+0x10/0x38
>>>  [<f8dcc31f>] ppp_sync_receive+0x168/0x170 [ppp_synctty]
>>>  [<f8fbb9ce>] handle_minor_recv+0x187/0x1cd [capi]
>>>  [<f8fbc19b>] capi_recv_message+0x1d9/0x24e [capi]
>> Really need to see the rest of the call trace to be sure
> 
> There wasn't more than what I posted. I had six of them, they looked all
> identical, and all of them ended after the kernel_thread_helper line. 
...
> Why then does mutex_lock() complain?


Maybe it doesn't matter here, but this: 

> INFO: lockdep is turned off.
suggests there was some lockdep issue/warning earlier.


Jarek P.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: N_PPP_SYNC ldisc BUG: sleeping function called from invalid context
  2009-09-30 18:55   ` Tilman Schmidt
  2009-09-30 20:28     ` Jarek Poplawski
@ 2009-09-30 21:12     ` Jarek Poplawski
  2009-09-30 23:15       ` capi.c calls receive_buf with interrupts disabled (was: N_PPP_SYNC ldisc BUG: sleeping function called from invalid context) Tilman Schmidt
  1 sibling, 1 reply; 7+ messages in thread
From: Jarek Poplawski @ 2009-09-30 21:12 UTC (permalink / raw)
  To: Tilman Schmidt; +Cc: Alan Cox, linux-kernel, netdev, Alan Cox

Tilman Schmidt wrote, On 09/30/2009 08:55 PM:

> Alan Cox schrieb:
>>>  [<c026d39b>] tty_unthrottle+0x10/0x38
>>>  [<f8dcc31f>] ppp_sync_receive+0x168/0x170 [ppp_synctty]
>>>  [<f8fbb9ce>] handle_minor_recv+0x187/0x1cd [capi]
>>>  [<f8fbc19b>] capi_recv_message+0x1d9/0x24e [capi]
>> Really need to see the rest of the call trace to be sure
> 
> There wasn't more than what I posted. I had six of them, they looked all
> identical, and all of them ended after the kernel_thread_helper line. 
> 
>>> Turns out the ppp_sync_receive() function (drivers/net/ppp_synctty.c
>>> line 385ff.) has a comment in front stating:
>>>
>>> /*
>>>  * This can now be called from hard interrupt level as well
>>>  * as soft interrupt level or mainline.
>>>  */
>> Which is wrong. The flip_buffer_push -> rx processing path should never
>> be called from IRQ context and that was fixed for various drivers that
>> mis-set tty->low_latency, as well as in the PPP rework. The PPP case is
>> actually unrelated in many was.
> 
> Might be worth correcting that text then before is misleads someone.
> 
>>> Opinions?
>> See how we got into that code direct from an IRQ path. The expectation of
>> the tty logic is that it gets processed from work queues either
>> specifically in driver or via tty_flip_buffer_push when tty->low_latency
>> = 0
> 
> I'm at a loss here. According to all the backtraces:
> 
> - ppp_sync_receive() was called, as the LD's receive_buf method,
>   via handle_recv_skb() [drivers/isdn/capi/capi.c line 504, inlined]
>   from handle_minor_recv() [drivers/isdn/capi/capi.c line 519]
> 
> - handle_minor_recv() was called from capi_recv_message()
>   [drivers/isdn/capi/capi.c line 656]
> 
> - capi_recv_message() was called, as the CAPI application's
>   recv_message method, from recv_handler()
>   [drivers/isdn/capi/kcapi.c line 268]
> 
> - recv_handler() is never called directly. It's only scheduled
>   via the work queue ap->recv_work from capi_ctr_handle_message()
>   [drivers/isdn/capi/kcapi.c line 349]
> 
> Even if we don't trust the backtraces, there's not much room for
> another activation path. So for all I know, the expectation of the
> tty logic should have been met. The call was indeed processed from
> a work queue.
> 
> Why then does mutex_lock() complain?
 

Hmm... capi_recv_message() calls handle_minor_recv() under
spin_lock_irqsave(), doesn't it?

Jarek P.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: N_PPP_SYNC ldisc BUG: sleeping function called from invalid context
  2009-09-30 20:28     ` Jarek Poplawski
@ 2009-09-30 22:00       ` Tilman Schmidt
  0 siblings, 0 replies; 7+ messages in thread
From: Tilman Schmidt @ 2009-09-30 22:00 UTC (permalink / raw)
  To: Jarek Poplawski; +Cc: Alan Cox, linux-kernel, netdev, Alan Cox

[-- Attachment #1: Type: text/plain, Size: 557 bytes --]

Jarek Poplawski schrieb:
> Tilman Schmidt wrote, On 09/30/2009 08:55 PM:
> ...
>> Why then does mutex_lock() complain?
> 
> 
> Maybe it doesn't matter here, but this: 
> 
>> INFO: lockdep is turned off.
> suggests there was some lockdep issue/warning earlier.

That's just because some <deleted> decided that tainted kernels don't deserve lockdep.

-- 
Tilman Schmidt                    E-Mail: tilman@imap.cc
Bonn, Germany
Diese Nachricht besteht zu 100% aus wiederverwerteten Bits.
Ungeöffnet mindestens haltbar bis: (siehe Rückseite)


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 254 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* capi.c calls receive_buf with interrupts disabled (was: N_PPP_SYNC ldisc BUG: sleeping function called from invalid context)
  2009-09-30 21:12     ` Jarek Poplawski
@ 2009-09-30 23:15       ` Tilman Schmidt
  0 siblings, 0 replies; 7+ messages in thread
From: Tilman Schmidt @ 2009-09-30 23:15 UTC (permalink / raw)
  To: Jarek Poplawski
  Cc: Alan Cox, linux-kernel, netdev, Alan Cox, Michael Buesch, isdn4linux

[-- Attachment #1: Type: text/plain, Size: 2709 bytes --]

Jarek Poplawski schrieb:
> Tilman Schmidt wrote, On 09/30/2009 08:55 PM:
[...]
>> - ppp_sync_receive() was called, as the LD's receive_buf method,
>>   via handle_recv_skb() [drivers/isdn/capi/capi.c line 504, inlined]
>>   from handle_minor_recv() [drivers/isdn/capi/capi.c line 519]
>>
>> - handle_minor_recv() was called from capi_recv_message()
>>   [drivers/isdn/capi/capi.c line 656]
>>
>> - capi_recv_message() was called, as the CAPI application's
>>   recv_message method, from recv_handler()
>>   [drivers/isdn/capi/kcapi.c line 268]
>>
>> - recv_handler() is never called directly. It's only scheduled
>>   via the work queue ap->recv_work from capi_ctr_handle_message()
>>   [drivers/isdn/capi/kcapi.c line 349]
>>
>> Even if we don't trust the backtraces, there's not much room for
>> another activation path. So for all I know, the expectation of the
>> tty logic should have been met. The call was indeed processed from
>> a work queue.
>>
>> Why then does mutex_lock() complain?
> 
> Hmm... capi_recv_message() calls handle_minor_recv() under
> spin_lock_irqsave(), doesn't it?

Well spotted. Indeed it does. That explains it, of course.

The spinlock in question was added by:

commit 053b47ff249b9e0a634dae807f81465205e7c228
Author: Michael Buesch <mb@bu3sch.de>
Date:   Mon Feb 12 00:53:26 2007 -0800

    [PATCH] Workaround CAPI subsystem locking issue
    
    I think the following patch should go into the kernel, until the ISDN/CAPI
    guys create the real fix for this issue.
     
    The issue is a concurrency issue with some internal CAPI data structure
    which can crash the kernel.

    On my FritzCard DSL with the AVM driver it crashes about once a day without
    this workaround patch.  With this workaround patch it's rock-stable (at
    least on UP, but I don't see why this shouldn't work on SMP as well.  But
    maybe I'm missing something.)
    
    This workaround is kind of a sledgehammer which inserts a global lock to
    wrap around all the critical sections.  Of course, this is a scalability
    issue, if you have many ISDN/CAPI cards.  But it prevents a crash.  So I
    vote for this fix to get merged, until people come up with a better
    solution.  Better have a stable kernel that's less scalable, than a
    crashing and useless kernel.
    
So let's cc the author of that patch, and also the good people on the
isdn4linux developer mailing list ...

Any ideas for a fix?

Thanks,
Tilman

-- 
Tilman Schmidt                    E-Mail: tilman@imap.cc
Bonn, Germany
Diese Nachricht besteht zu 100% aus wiederverwerteten Bits.
Ungeöffnet mindestens haltbar bis: (siehe Rückseite)


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 254 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2009-09-30 23:16 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-09-30 14:50 N_PPP_SYNC ldisc BUG: sleeping function called from invalid context Tilman Schmidt
2009-09-30 16:47 ` Alan Cox
2009-09-30 18:55   ` Tilman Schmidt
2009-09-30 20:28     ` Jarek Poplawski
2009-09-30 22:00       ` Tilman Schmidt
2009-09-30 21:12     ` Jarek Poplawski
2009-09-30 23:15       ` capi.c calls receive_buf with interrupts disabled (was: N_PPP_SYNC ldisc BUG: sleeping function called from invalid context) Tilman Schmidt

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.