All of lore.kernel.org
 help / color / mirror / Atom feed
From: Glauber Costa <glommer@parallels.com>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Denys Fedoryshchenko <denys@visp.net.lb>,
	<netdev@vger.kernel.org>, <davem@davemloft.net>,
	<jmorris@namei.org>, <yoshfuji@linux-ipv6.org>, <kaber@trash.net>,
	<linux-kernel@vger.kernel.org>
Subject: Re: inconsistent lock/deadlock crash, vanilla 3.3.4, 32bit, tcp
Date: Mon, 30 Apr 2012 01:45:29 -0300	[thread overview]
Message-ID: <4F9E18E9.5010503@parallels.com> (raw)
In-Reply-To: <1335689052.2900.86.camel@edumazet-glaptop>

On 04/29/2012 05:44 AM, Eric Dumazet wrote:
> On Sun, 2012-04-29 at 10:40 +0200, Eric Dumazet wrote:
>> On Sun, 2012-04-29 at 10:27 +0200, Eric Dumazet wrote:
>>> On Sun, 2012-04-29 at 10:41 +0300, Denys Fedoryshchenko wrote:
>>>> I apologize for late night post, and a lot of trash left in report.
>>>> I will cleanup it up now, and send with CC to maintainers.
>>>>
>>>> Server job are proxy, with very high rate of new connections.
>>>> Deadlock at peaktime can be easily reproduced in 10-15 minutes.
>>>>
>>>> Deadlock occured on almost all 3.3-stable versions (tried 3.3.3 -
>>>> 3.3.4). It is not easy to try older kernel,
>>>> but if required i can try.
>>>> Usually also, because SYN rate very high, i can see:
>>>> [   51.612987] TCP: Possible SYN flooding on port 8080. Sending
>>>> cookies.  Check SNMP counters.
>>>>
>>>>    [  762.903868]
>>>>    [  762.903880] =================================
>>>>    [  762.903890] [ INFO: inconsistent lock state ]
>>>>    [  762.903903] 3.3.4-build-0061 #8 Not tainted
>>>>    [  762.904133] ---------------------------------
>>>>    [  762.904344] inconsistent {IN-SOFTIRQ-W} ->  {SOFTIRQ-ON-W} usage.
>>>>    [  762.904542] squid/1603 [HC0[0]:SC0[0]:HE1:SE1] takes:
>>>>    [  762.904542]  (key#3){+.?...}, at: [<c0232cc4>]
>>>> __percpu_counter_sum+0xd/0x58
>>>>    [  762.904542] {IN-SOFTIRQ-W} state was registered at:
>>>>    [  762.904542]   [<c0158b84>] __lock_acquire+0x284/0xc26
>>>>    [  762.904542]   [<c01598e8>] lock_acquire+0x71/0x85
>>>>    [  762.904542]   [<c0349765>] _raw_spin_lock+0x33/0x40
>>>>    [  762.904542]   [<c0232c93>] __percpu_counter_add+0x58/0x7c
>>>>    [  762.904542]   [<c02cfde1>] sk_clone_lock+0x1e5/0x200
>>>>    [  762.904542]   [<c0303ee4>] inet_csk_clone_lock+0xe/0x78
>>>>    [  762.904542]   [<c0315778>] tcp_create_openreq_child+0x1b/0x404
>>>>    [  762.904542]   [<c031339c>] tcp_v4_syn_recv_sock+0x32/0x1c1
>>>>    [  762.904542]   [<c031615a>] tcp_check_req+0x1fd/0x2d7
>>>>    [  762.904542]   [<c0313f77>] tcp_v4_do_rcv+0xab/0x194
>>>>    [  762.904542]   [<c03153bb>] tcp_v4_rcv+0x3b3/0x5cc
>>>>    [  762.904542]   [<c02fc0c4>] ip_local_deliver_finish+0x13a/0x1e9
>>>>    [  762.904542]   [<c02fc539>] NF_HOOK.clone.11+0x46/0x4d
>>>>    [  762.904542]   [<c02fc652>] ip_local_deliver+0x41/0x45
>>>>    [  762.904542]   [<c02fc4d1>] ip_rcv_finish+0x31a/0x33c
>>>>    [  762.904542]   [<c02fc539>] NF_HOOK.clone.11+0x46/0x4d
>>>>    [  762.904542]   [<c02fc857>] ip_rcv+0x201/0x23e
>>>>    [  762.904542]   [<c02daa3a>] __netif_receive_skb+0x319/0x368
>>>>    [  762.904542]   [<c02dac07>] netif_receive_skb+0x4e/0x7d
>>>>    [  762.904542]   [<c02dacf6>] napi_skb_finish+0x1e/0x34
>>>>    [  762.904542]   [<c02db122>] napi_gro_receive+0x20/0x24
>>>>    [  762.904542]   [<f85d1743>] e1000_receive_skb+0x3f/0x45 [e1000e]
>>>>    [  762.904542]   [<f85d3464>] e1000_clean_rx_irq+0x1f9/0x284 [e1000e]
>>>>    [  762.904542]   [<f85d3926>] e1000_clean+0x62/0x1f4 [e1000e]
>>>>    [  762.904542]   [<c02db228>] net_rx_action+0x90/0x160
>>>>    [  762.904542]   [<c012a445>] __do_softirq+0x7b/0x118
>>>>    [  762.904542] irq event stamp: 156915469
>>>>    [  762.904542] hardirqs last  enabled at (156915469): [<c019b4f4>]
>>>> __slab_alloc.clone.58.clone.63+0xc4/0x2de
>>>>    [  762.904542] hardirqs last disabled at (156915468): [<c019b452>]
>>>> __slab_alloc.clone.58.clone.63+0x22/0x2de
>>>>    [  762.904542] softirqs last  enabled at (156915466): [<c02ce677>]
>>>> lock_sock_nested+0x64/0x6c
>>>>    [  762.904542] softirqs last disabled at (156915464): [<c0349914>]
>>>> _raw_spin_lock_bh+0xe/0x45
>>>>    [  762.904542]
>>>>    [  762.904542] other info that might help us debug this:
>>>>    [  762.904542]  Possible unsafe locking scenario:
>>>>    [  762.904542]
>>>>    [  762.904542]        CPU0
>>>>    [  762.904542]        ----
>>>>    [  762.904542]   lock(key#3);
>>>>    [  762.904542]<Interrupt>
>>>>    [  762.904542]     lock(key#3);
>>>>    [  762.904542]
>>>>    [  762.904542]  *** DEADLOCK ***
>>>>    [  762.904542]
>>>>    [  762.904542] 1 lock held by squid/1603:
>>>>    [  762.904542]  #0:  (sk_lock-AF_INET){+.+.+.}, at: [<c03055c0>]
>>>> lock_sock+0xa/0xc
>>>>    [  762.904542]
>>>>    [  762.904542] stack backtrace:
>>>>    [  762.904542] Pid: 1603, comm: squid Not tainted 3.3.4-build-0061 #8
>>>>    [  762.904542] Call Trace:
>>>>    [  762.904542]  [<c0347b73>] ? printk+0x18/0x1d
>>>>    [  762.904542]  [<c015873a>] valid_state+0x1f6/0x201
>>>>    [  762.904542]  [<c0158816>] mark_lock+0xd1/0x1bb
>>>>    [  762.904542]  [<c015876b>] ? mark_lock+0x26/0x1bb
>>>>    [  762.904542]  [<c015805d>] ? check_usage_forwards+0x77/0x77
>>>>    [  762.904542]  [<c0158bf8>] __lock_acquire+0x2f8/0xc26
>>>>    [  762.904542]  [<c0159b8e>] ? mark_held_locks+0x5d/0x7b
>>>>    [  762.904542]  [<c0159cf6>] ? trace_hardirqs_on+0xb/0xd
>>>>    [  762.904542]  [<c0158dd4>] ? __lock_acquire+0x4d4/0xc26
>>>>    [  762.904542]  [<c01598e8>] lock_acquire+0x71/0x85
>>>>    [  762.904542]  [<c0232cc4>] ? __percpu_counter_sum+0xd/0x58
>>>>    [  762.904542]  [<c0349765>] _raw_spin_lock+0x33/0x40
>>>>    [  762.904542]  [<c0232cc4>] ? __percpu_counter_sum+0xd/0x58
>>>>    [  762.904542]  [<c0232cc4>] __percpu_counter_sum+0xd/0x58
>>>>    [  762.904542]  [<c02cebc4>] __sk_mem_schedule+0xdd/0x1c7
>>>>    [  762.904542]  [<c02d178d>] ? __alloc_skb+0x76/0x100
>>>>    [  762.904542]  [<c0305e8e>] sk_wmem_schedule+0x21/0x2d
>>>>    [  762.904542]  [<c0306370>] sk_stream_alloc_skb+0x42/0xaa
>>>>    [  762.904542]  [<c0306567>] tcp_sendmsg+0x18f/0x68b
>>>>    [  762.904542]  [<c031f3dc>] ? ip_fast_csum+0x30/0x30
>>>>    [  762.904542]  [<c0320193>] inet_sendmsg+0x53/0x5a
>>>>    [  762.904542]  [<c02cb633>] sock_aio_write+0xd2/0xda
>>>>    [  762.904542]  [<c015876b>] ? mark_lock+0x26/0x1bb
>>>>    [  762.904542]  [<c01a1017>] do_sync_write+0x9f/0xd9
>>>>    [  762.904542]  [<c01a2111>] ? file_free_rcu+0x2f/0x2f
>>>>    [  762.904542]  [<c01a17a1>] vfs_write+0x8f/0xab
>>>>    [  762.904542]  [<c01a284d>] ? fget_light+0x75/0x7c
>>>>    [  762.904542]  [<c01a1900>] sys_write+0x3d/0x5e
>>>>    [  762.904542]  [<c0349ec9>] syscall_call+0x7/0xb
>>>>    [  762.904542]  [<c0340000>] ? rp_sidt+0x41/0x83
>>>>
>>>
>>>
>>> OK, so when we have memory pressure we can call
>>> percpu_counter_read_positive() with SOFTIRQ enabled, and lockdep
>>> complains...
>>>
>>> This bug was probably added in 2008, in commit 1748376b6626a
>>> (net: Use a percpu_counter for sockets_allocated)
>>
>> Hmm, no this patch was fine.
>>
>> Bug was in fact added by Glauber Costa in commit 180d8cd942ce336b2c869
>> (foundations of per-cgroup memory pressure controlling.)
>>
>> Because he replaced the safe percpu_counter_read_positive() call to
>> unsafe percpu_counter_sum_positive() in
>> sk_sockets_allocated_read_positive()
>>
>> But anyway the patch I sent should fix the problem.
>
> Thinking again, I am not sure why Glauber did this change. He probably
> made a typo or something.
>

Indeed.

It wasn't my intent to change that, it was a mistake.

Very sorry.

  reply	other threads:[~2012-04-30  4:47 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-04-29  7:41 inconsistent lock/deadlock crash, vanilla 3.3.4, 32bit, tcp Denys Fedoryshchenko
2012-04-29  8:27 ` Eric Dumazet
2012-04-29  8:40   ` Eric Dumazet
2012-04-29  8:44     ` Eric Dumazet
2012-04-30  4:45       ` Glauber Costa [this message]
2012-04-30 16:49         ` David Miller
2012-04-29  8:46   ` Denys Fedoryshchenko
2012-04-29  9:21     ` Eric Dumazet
2012-04-29 21:32       ` Neal Cardwell
2012-04-30 17:38       ` David Miller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4F9E18E9.5010503@parallels.com \
    --to=glommer@parallels.com \
    --cc=davem@davemloft.net \
    --cc=denys@visp.net.lb \
    --cc=eric.dumazet@gmail.com \
    --cc=jmorris@namei.org \
    --cc=kaber@trash.net \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=yoshfuji@linux-ipv6.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.