All of lore.kernel.org
 help / color / mirror / Atom feed
From: Eric Dumazet <eric.dumazet@gmail.com>
To: Jike Song <albcamus@gmail.com>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	netdev@vger.kernel.org, David Miller <davem@davemloft.net>,
	Parag Warudkar <parag.lkml@gmail.com>
Subject: Re: [PATCH] net: Fix sock_wfree() race
Date: Wed, 09 Sep 2009 11:18:01 +0200	[thread overview]
Message-ID: <4AA772C9.9010001@gmail.com> (raw)
In-Reply-To: <df9815e70909090014w27828827j860ecee4f1c3a9e2@mail.gmail.com>

Jike Song a écrit :
> On Wed, Sep 9, 2009 at 6:49 AM, Eric Dumazet<eric.dumazet@gmail.com> wrote:
>> Eric Dumazet a écrit :
>>> Jike Song a écrit :
>>>> On Tue, Sep 8, 2009 at 3:38 PM, Eric Dumazet<eric.dumazet@gmail.com> wrote:
>>>>> We decrement a refcnt while object already freed.
>>>>>
>>>>> (SLUB DEBUG poisons the zone with 0x6B pattern)
>>>>>
>>>>> You might add this patch to trigger a WARN_ON when refcnt >= 0x60000000U
>>>>> in sk_free() : We'll see the path trying to delete an already freed sock
>>>>>
>>>>> diff --git a/net/core/sock.c b/net/core/sock.c
>>>>> index 7633422..1cb85ff 100644
>>>>> --- a/net/core/sock.c
>>>>> +++ b/net/core/sock.c
>>>>> @@ -1058,6 +1058,7 @@ static void __sk_free(struct sock *sk)
>>>>>
>>>>>  void sk_free(struct sock *sk)
>>>>>  {
>>>>> +       WARN_ON(atomic_read(&sk->sk_wmem_alloc) >= 0x60000000U);
>>>>>        /*
>>>>>         * We substract one from sk_wmem_alloc and can know if
>>>>>        * some packets are still in some tx queue.
>>>>>
>>>>>
>>>> The output of dmesg with this patch appllied is attached.
>>>>
>>>>
>>> Unfortunatly this WARN_ON was not triggered,
>>> maybe freeing comes from sock_wfree()
>>>
>>> Could you try this patch instead ?
>>>
>>> Thanks
>>>
>>> diff --git a/net/core/sock.c b/net/core/sock.c
>>> index 7633422..30469dc 100644
>>> --- a/net/core/sock.c
>>> +++ b/net/core/sock.c
>>> @@ -1058,6 +1058,7 @@ static void __sk_free(struct sock *sk)
>>>
>>>  void sk_free(struct sock *sk)
>>>  {
>>> +     WARN_ON(atomic_read(&sk->sk_wmem_alloc) >= 0x60000000U);
>>>       /*
>>>        * We substract one from sk_wmem_alloc and can know if
>>>       * some packets are still in some tx queue.
>>> @@ -1220,6 +1221,7 @@ void sock_wfree(struct sk_buff *skb)
>>>       struct sock *sk = skb->sk;
>>>       int res;
>>>
>>> +     WARN_ON(atomic_read(&sk->sk_wmem_alloc) >= 0x60000000U);
>>>       /* In case it might be waiting for more memory. */
>>>       res = atomic_sub_return(skb->truesize, &sk->sk_wmem_alloc);
>>>       if (!sock_flag(sk, SOCK_USE_WRITE_QUEUE))
>>>
>>
>> David, I believe problem could come from a race in sock_wfree()
>>
>> It used to have two atomic ops.
>>
>> One doing the atomic_sub(skb->truesize, &sk->sk_wmem_alloc);
>> then one sock_put() doing the atomic_dec_and_test(&sk->sk_refcnt)
>>
>> Now, if two cpus are both :
>>
>> CPU 1 calling sock_wfree()
>> CPU 2 calling the 'final' sock_put(),
>> CPU 1 doing sock_wfree() might call sk->sk_write_space(sk)
>> while CPU 2 is already freeing the socket.
>>
>>
>> Please note I did not test this patch, its very late here and I should get some sleep now...
>>
>> Thanks
>>
>> [PATCH] net: Fix sock_wfree() race
>>
>> Commit 2b85a34e911bf483c27cfdd124aeb1605145dc80
>> (net: No more expensive sock_hold()/sock_put() on each tx)
>> opens a window in sock_wfree() where another cpu
>> might free the socket we are working on.
>>
>> Fix is to call sk->sk_write_space(sk) only
>> while still holding a reference on sk.
>>
>> Since doing this call is done before the
>> atomic_sub(truesize, &sk->sk_wmem_alloc), we should pass truesize as
>> a bias for possible sk_wmem_alloc evaluations.
>>
>> Reported-by: Jike Song <albcamus@gmail.com>
>> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> 
> Eric, I'm unable to apply this patch neatly.  I applied it by hand,
> and did some change necessary. This patch for test is attached.
> 
> With this patch applied, when run vncviewer, the kerneloops service
> still reports kernel failure. But I can't see any in dmesg output.
> 
> 

Sorry this was a patch against net-next-2.6

We probably can do something less intrusive for linux-2.6.31

[PATCH] net: Fix sock_wfree() race

Commit 2b85a34e911bf483c27cfdd124aeb1605145dc80
(net: No more expensive sock_hold()/sock_put() on each tx)
opens a window in sock_wfree() where another cpu
might free the socket we are working on.

A possible fix is to call sk->sk_write_space(sk) only
while still holding a reference on sk.


Reported-by: Jike Song <albcamus@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---

diff --git a/net/core/sock.c b/net/core/sock.c
index 7633422..aba5cd0 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1220,10 +1220,12 @@ void sock_wfree(struct sk_buff *skb)
 	struct sock *sk = skb->sk;
 	int res;

-	/* In case it might be waiting for more memory. */
-	res = atomic_sub_return(skb->truesize, &sk->sk_wmem_alloc);
-	if (!sock_flag(sk, SOCK_USE_WRITE_QUEUE))
+	if (!sock_flag(sk, SOCK_USE_WRITE_QUEUE)) {
+		atomic_sub(skb->truesize - 1, &sk->sk_wmem_alloc);
 		sk->sk_write_space(sk);
+		res = atomic_sub_return(1, &sk->sk_wmem_alloc);
+	} else
+		res = atomic_sub_return(skb->truesize, &sk->sk_wmem_alloc);
 	/*
 	 * if sk_wmem_alloc reached 0, we are last user and should
 	 * free this sock, as sk_free() call could not do it.

  reply	other threads:[~2009-09-09  9:18 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-09-08  3:56 BUG UNIX: Poison overwritten with 2.6.31-rc6-00223-g6c30c53 Parag Warudkar
2009-09-08  4:51 ` Jike Song
2009-09-08  7:38   ` Eric Dumazet
2009-09-08  8:09     ` Jike Song
2009-09-08 12:12       ` Eric Dumazet
2009-09-08 22:49         ` [PATCH] net: Fix sock_wfree() race Eric Dumazet
2009-09-09  7:14           ` Jike Song
2009-09-09  7:14             ` Jike Song
2009-09-09  9:18             ` Eric Dumazet [this message]
2009-09-11 18:43           ` David Miller
2009-09-11 19:52             ` David Miller
2009-09-23 13:44               ` Eric Dumazet
2009-09-24 20:07                 ` Jarek Poplawski
2009-09-24 20:49                   ` Eric Dumazet
2009-09-30 23:23                     ` David Miller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4AA772C9.9010001@gmail.com \
    --to=eric.dumazet@gmail.com \
    --cc=albcamus@gmail.com \
    --cc=davem@davemloft.net \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=parag.lkml@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.