From: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> To: Vlad Yasevich <vyasevich@gmail.com>, netdev@vger.kernel.org Cc: linux-sctp@vger.kernel.org, dvyukov@google.com, eric.dumazet@gmail.com, syzkaller@googlegroups.com, kcc@google.com, glider@google.com, sasha.levin@oracle.com Subject: Re: [PATCH net] sctp: do sanity checks before migrating the asoc Date: Tue, 19 Jan 2016 17:31:59 -0200 [thread overview] Message-ID: <569E8F2F.5070906@gmail.com> (raw) In-Reply-To: <569E8280.9080701@gmail.com> Em 19-01-2016 16:37, Vlad Yasevich escreveu: > On 01/19/2016 10:59 AM, Marcelo Ricardo Leitner wrote: >> Em 19-01-2016 12:19, Vlad Yasevich escreveu: >>> On 01/15/2016 04:40 PM, Marcelo Ricardo Leitner wrote: >>>> On Fri, Jan 15, 2016 at 08:11:03PM +0100, Dmitry Vyukov wrote: >>>>> On Fri, Jan 15, 2016 at 7:46 PM, Marcelo Ricardo Leitner >>>>> <marcelo.leitner@gmail.com> wrote: >>>>>> On Wed, Dec 30, 2015 at 09:42:27PM +0100, Dmitry Vyukov wrote: >>>>>>> Hello, >>>>>>> >>>>>>> The following program leads to a leak of two sock objects: >>>>>> ... >>>>>>> >>>>>>> On commit 8513342170278468bac126640a5d2d12ffbff106 (Dec 28). >>>>>> >>>>>> I'm afraid I cannot reproduce this one? >>>>>> I enabled dynprintk at sctp_destroy_sock and it does print twice when I >>>>>> run this test app. >>>>>> Also added debugs to check association lifetime, and then it was >>>>>> destroyed. Same for endpoint. >>>>>> >>>>>> Checking with trace-cmd, both calls to sctp_close() resulted in >>>>>> sctp_destroy_sock() being called. >>>>>> >>>>>> As for sock_hold/put, they are matched too. >>>>>> >>>>>> Ideas? Log is below for double checking >>>>> >>>>> >>>>> Hummm... I can reproduce it pretty reliably. >>>>> >>>>> [ 197.459024] kmemleak: 11 new suspected memory leaks (see >>>>> /sys/kernel/debug/kmemleak) >>>>> [ 307.494874] kmemleak: 409 new suspected memory leaks (see >>>>> /sys/kernel/debug/kmemleak) >>>>> [ 549.784022] kmemleak: 125 new suspected memory leaks (see >>>>> /sys/kernel/debug/kmemleak) >>>>> >>>>> I double checked via /proc/slabinfo: >>>>> >>>>> SCTPv6 4373 4420 2368 13 8 : tunables 0 0 >>>>> 0 : slabdata 340 340 0 >>>>> >>>>> SCTPv6 starts with almost 0, but grows infinitely while I run the >>>>> program in a loop. >>>>> >>>>> Here is my SCTP related configs: >>>>> >>>>> CONFIG_IP_SCTP=y >>>>> CONFIG_NET_SCTPPROBE=y >>>>> CONFIG_SCTP_DBG_OBJCNT=y >>>>> # CONFIG_SCTP_DEFAULT_COOKIE_HMAC_MD5 is not set >>>>> # CONFIG_SCTP_DEFAULT_COOKIE_HMAC_SHA1 is not set >>>>> CONFIG_SCTP_DEFAULT_COOKIE_HMAC_NONE=y >>>>> # CONFIG_SCTP_COOKIE_HMAC_MD5 is not set >>>>> # CONFIG_SCTP_COOKIE_HMAC_SHA1 is not set >>>>> >>>>> I am on commit 67990608c8b95d2b8ccc29932376ae73d5818727 and I don't >>>>> seem to have any sctp-related changes on top. >>>> >>>> Ok, now I can. Enabled slub debugs, now I cannot see calls to >>>> sctp_destroy_sock. I see to sctp_close, but not to sctp_destroy_sock. >>>> >>>> And SCTPv6 grew by 2 sockets after the execution. >>>> >>>> Further checking, it's a race within SCTP asoc migration: >>>> thread 0 thread 1 >>>> - app creates a sock >>>> - sends a packet to itself >>>> - sctp will create an asoc and do implicit >>>> handshake >>>> - send the packet >>>> - listen() >>>> - accept() is called and >>>> that asoc is migrated >>>> - packet is delivered >>>> - skb->destructor is called, BUT: >>>> >>>> (note that if accept() is called after packet is delivered and skb is freed, it >>>> doesn't happen) >>>> >>>> static void sctp_wfree(struct sk_buff *skb) >>>> { >>>> struct sctp_chunk *chunk = skb_shinfo(skb)->destructor_arg; >>>> struct sctp_association *asoc = chunk->asoc; >>>> struct sock *sk = asoc->base.sk; >>>> ... >>>> atomic_sub(sizeof(struct sctp_chunk), &sk->sk_wmem_alloc); >>>> >>>> and it's pointing to the new socket already. So one socket gets a leak >>>> on sk_wmem_alloc and another gets a negative value: >>>> >>>> --- a/net/sctp/socket.c >>>> +++ b/net/sctp/socket.c >>>> @@ -1537,12 +1537,14 @@ static void sctp_close(struct sock *sk, long timeout) >>>> /* Hold the sock, since sk_common_release() will put sock_put() >>>> * and we have just a little more cleanup. >>>> */ >>>> + printk("%s sock_hold %p\n", __func__, sk); >>>> sock_hold(sk); >>>> sk_common_release(sk); >>>> >>>> bh_unlock_sock(sk); >>>> spin_unlock_bh(&net->sctp.addr_wq_lock); >>>> >>>> + printk("%s sock_put %p %d %d\n", __func__, sk, atomic_read(&sk->sk_refcnt), >>>> atomic_read(&sk->sk_wmem_alloc)); >>>> sock_put(sk); >>>> >>>> SCTP_DBG_OBJCNT_DEC(sock); >>>> >>>> >>>> gave me: >>>> >>>> [ 99.456944] sctp_close sock_hold ffff880137df8940 >>>> ... >>>> [ 99.457337] sctp_close sock_put ffff880137df8940 1 -247 >>>> [ 99.458313] sctp_close sock_hold ffff880137dfef00 >>>> ... >>>> [ 99.458383] sctp_close sock_put ffff880137dfef00 1 249 >>>> >>>> That's why the socket is not freed.. >>>> >>> >>> Interesting... sctp_sock_migrate() accounts for this race in the >>> receive buffer, but not the send buffer. >>> >>> On the one hand I am not crazy about the connect-to-self scenario. >>> On the other, I think to support it correctly, we should support >>> skb migrations for the send case just like we do the receive case. >> >> >> Yes, not thrilled here either about connect-to-self. >> >> But there is a big difference on how both works. For rx we can just look for wanted skbs >> in rx queue, as they aren't going anywhere, but for tx I don't think we can easily block >> sctp_wfree() call because that may be happening on another CPU (or am I mistaken here? >> sctp still doesn't have RFS but even irqbalance could affect this AFAICT) and more than >> one skb may be in transit at a time. > > The way it's done now, we wouldn't have to block sctp_wfree. Chunks are released under > lock when they are acked, so we are OK here. The tx completions will just put 1 byte back > to the socket associated with the tx'ed skb, and that should still be ok as > sctp_packet_release_owner will call sk_free(). Please let me rephrase it. I'm actually worried about the asoc->base.sk part of the story and how it's fetched in sctp_wfree(). I think we can update that sk pointer after sock_wfree() has fetched it but not used it yet, possibly leading to accounting it twice, one during migration and one on sock_wfree. In sock_wfree() it will update some sk stats like sk->sk_wmem_alloc, among others. That is, I don't see anything that would avoid that. >> The lockings for this on sctp_chunk would be pretty nasty, I think, and normal usage lets >> say wouldn't be benefit from it. Considering the possible migration, as we can't trust >> chunk->asoc right away in sctp_wfree, the lock would reside in sctp_chunk and we would >> have to go on taking locks one by one on tx queue for the migration. Ugh ;) >> > > No, the chunks manipulation is done under the socket locket so I don't think we have to > worry about a per chunk lock. We should be able to trust chunk->asoc pointer always > because each chunk holds a ref on the association. The only somewhat ugly thing > about moving tx chunks is that you have to potentially walk a lot of lists to move > things around. There are all the lists in the sctp_outqueue struct, plus the > per-transport retransmit list... Agreed, no per-chunk lock needed, maybe just one to protect sctp_ep_common.sk ? > Even though the above seems to be a PITA, my main reason for recommending this is > that can happen in normal situations too. Consider a very busy association that is > transferring a lot of a data on a 1-to-many socket. The app decides to move do a > peel-off, and we could now be stuck not being able to peel-off for a quite a while > if there is a hick-up in the network and we have to rtx multiple times. Fair point. Marcelo
WARNING: multiple messages have this Message-ID (diff)
From: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> To: Vlad Yasevich <vyasevich@gmail.com>, netdev@vger.kernel.org Cc: linux-sctp@vger.kernel.org, dvyukov@google.com, eric.dumazet@gmail.com, syzkaller@googlegroups.com, kcc@google.com, glider@google.com, sasha.levin@oracle.com Subject: Re: [PATCH net] sctp: do sanity checks before migrating the asoc Date: Tue, 19 Jan 2016 19:31:59 +0000 [thread overview] Message-ID: <569E8F2F.5070906@gmail.com> (raw) In-Reply-To: <569E8280.9080701@gmail.com> Em 19-01-2016 16:37, Vlad Yasevich escreveu: > On 01/19/2016 10:59 AM, Marcelo Ricardo Leitner wrote: >> Em 19-01-2016 12:19, Vlad Yasevich escreveu: >>> On 01/15/2016 04:40 PM, Marcelo Ricardo Leitner wrote: >>>> On Fri, Jan 15, 2016 at 08:11:03PM +0100, Dmitry Vyukov wrote: >>>>> On Fri, Jan 15, 2016 at 7:46 PM, Marcelo Ricardo Leitner >>>>> <marcelo.leitner@gmail.com> wrote: >>>>>> On Wed, Dec 30, 2015 at 09:42:27PM +0100, Dmitry Vyukov wrote: >>>>>>> Hello, >>>>>>> >>>>>>> The following program leads to a leak of two sock objects: >>>>>> ... >>>>>>> >>>>>>> On commit 8513342170278468bac126640a5d2d12ffbff106 (Dec 28). >>>>>> >>>>>> I'm afraid I cannot reproduce this one? >>>>>> I enabled dynprintk at sctp_destroy_sock and it does print twice when I >>>>>> run this test app. >>>>>> Also added debugs to check association lifetime, and then it was >>>>>> destroyed. Same for endpoint. >>>>>> >>>>>> Checking with trace-cmd, both calls to sctp_close() resulted in >>>>>> sctp_destroy_sock() being called. >>>>>> >>>>>> As for sock_hold/put, they are matched too. >>>>>> >>>>>> Ideas? Log is below for double checking >>>>> >>>>> >>>>> Hummm... I can reproduce it pretty reliably. >>>>> >>>>> [ 197.459024] kmemleak: 11 new suspected memory leaks (see >>>>> /sys/kernel/debug/kmemleak) >>>>> [ 307.494874] kmemleak: 409 new suspected memory leaks (see >>>>> /sys/kernel/debug/kmemleak) >>>>> [ 549.784022] kmemleak: 125 new suspected memory leaks (see >>>>> /sys/kernel/debug/kmemleak) >>>>> >>>>> I double checked via /proc/slabinfo: >>>>> >>>>> SCTPv6 4373 4420 2368 13 8 : tunables 0 0 >>>>> 0 : slabdata 340 340 0 >>>>> >>>>> SCTPv6 starts with almost 0, but grows infinitely while I run the >>>>> program in a loop. >>>>> >>>>> Here is my SCTP related configs: >>>>> >>>>> CONFIG_IP_SCTP=y >>>>> CONFIG_NET_SCTPPROBE=y >>>>> CONFIG_SCTP_DBG_OBJCNT=y >>>>> # CONFIG_SCTP_DEFAULT_COOKIE_HMAC_MD5 is not set >>>>> # CONFIG_SCTP_DEFAULT_COOKIE_HMAC_SHA1 is not set >>>>> CONFIG_SCTP_DEFAULT_COOKIE_HMAC_NONE=y >>>>> # CONFIG_SCTP_COOKIE_HMAC_MD5 is not set >>>>> # CONFIG_SCTP_COOKIE_HMAC_SHA1 is not set >>>>> >>>>> I am on commit 67990608c8b95d2b8ccc29932376ae73d5818727 and I don't >>>>> seem to have any sctp-related changes on top. >>>> >>>> Ok, now I can. Enabled slub debugs, now I cannot see calls to >>>> sctp_destroy_sock. I see to sctp_close, but not to sctp_destroy_sock. >>>> >>>> And SCTPv6 grew by 2 sockets after the execution. >>>> >>>> Further checking, it's a race within SCTP asoc migration: >>>> thread 0 thread 1 >>>> - app creates a sock >>>> - sends a packet to itself >>>> - sctp will create an asoc and do implicit >>>> handshake >>>> - send the packet >>>> - listen() >>>> - accept() is called and >>>> that asoc is migrated >>>> - packet is delivered >>>> - skb->destructor is called, BUT: >>>> >>>> (note that if accept() is called after packet is delivered and skb is freed, it >>>> doesn't happen) >>>> >>>> static void sctp_wfree(struct sk_buff *skb) >>>> { >>>> struct sctp_chunk *chunk = skb_shinfo(skb)->destructor_arg; >>>> struct sctp_association *asoc = chunk->asoc; >>>> struct sock *sk = asoc->base.sk; >>>> ... >>>> atomic_sub(sizeof(struct sctp_chunk), &sk->sk_wmem_alloc); >>>> >>>> and it's pointing to the new socket already. So one socket gets a leak >>>> on sk_wmem_alloc and another gets a negative value: >>>> >>>> --- a/net/sctp/socket.c >>>> +++ b/net/sctp/socket.c >>>> @@ -1537,12 +1537,14 @@ static void sctp_close(struct sock *sk, long timeout) >>>> /* Hold the sock, since sk_common_release() will put sock_put() >>>> * and we have just a little more cleanup. >>>> */ >>>> + printk("%s sock_hold %p\n", __func__, sk); >>>> sock_hold(sk); >>>> sk_common_release(sk); >>>> >>>> bh_unlock_sock(sk); >>>> spin_unlock_bh(&net->sctp.addr_wq_lock); >>>> >>>> + printk("%s sock_put %p %d %d\n", __func__, sk, atomic_read(&sk->sk_refcnt), >>>> atomic_read(&sk->sk_wmem_alloc)); >>>> sock_put(sk); >>>> >>>> SCTP_DBG_OBJCNT_DEC(sock); >>>> >>>> >>>> gave me: >>>> >>>> [ 99.456944] sctp_close sock_hold ffff880137df8940 >>>> ... >>>> [ 99.457337] sctp_close sock_put ffff880137df8940 1 -247 >>>> [ 99.458313] sctp_close sock_hold ffff880137dfef00 >>>> ... >>>> [ 99.458383] sctp_close sock_put ffff880137dfef00 1 249 >>>> >>>> That's why the socket is not freed.. >>>> >>> >>> Interesting... sctp_sock_migrate() accounts for this race in the >>> receive buffer, but not the send buffer. >>> >>> On the one hand I am not crazy about the connect-to-self scenario. >>> On the other, I think to support it correctly, we should support >>> skb migrations for the send case just like we do the receive case. >> >> >> Yes, not thrilled here either about connect-to-self. >> >> But there is a big difference on how both works. For rx we can just look for wanted skbs >> in rx queue, as they aren't going anywhere, but for tx I don't think we can easily block >> sctp_wfree() call because that may be happening on another CPU (or am I mistaken here? >> sctp still doesn't have RFS but even irqbalance could affect this AFAICT) and more than >> one skb may be in transit at a time. > > The way it's done now, we wouldn't have to block sctp_wfree. Chunks are released under > lock when they are acked, so we are OK here. The tx completions will just put 1 byte back > to the socket associated with the tx'ed skb, and that should still be ok as > sctp_packet_release_owner will call sk_free(). Please let me rephrase it. I'm actually worried about the asoc->base.sk part of the story and how it's fetched in sctp_wfree(). I think we can update that sk pointer after sock_wfree() has fetched it but not used it yet, possibly leading to accounting it twice, one during migration and one on sock_wfree. In sock_wfree() it will update some sk stats like sk->sk_wmem_alloc, among others. That is, I don't see anything that would avoid that. >> The lockings for this on sctp_chunk would be pretty nasty, I think, and normal usage lets >> say wouldn't be benefit from it. Considering the possible migration, as we can't trust >> chunk->asoc right away in sctp_wfree, the lock would reside in sctp_chunk and we would >> have to go on taking locks one by one on tx queue for the migration. Ugh ;) >> > > No, the chunks manipulation is done under the socket locket so I don't think we have to > worry about a per chunk lock. We should be able to trust chunk->asoc pointer always > because each chunk holds a ref on the association. The only somewhat ugly thing > about moving tx chunks is that you have to potentially walk a lot of lists to move > things around. There are all the lists in the sctp_outqueue struct, plus the > per-transport retransmit list... Agreed, no per-chunk lock needed, maybe just one to protect sctp_ep_common.sk ? > Even though the above seems to be a PITA, my main reason for recommending this is > that can happen in normal situations too. Consider a very busy association that is > transferring a lot of a data on a 1-to-many socket. The app decides to move do a > peel-off, and we could now be stuck not being able to peel-off for a quite a while > if there is a hick-up in the network and we have to rtx multiple times. Fair point. Marcelo
next prev parent reply other threads:[~2016-01-19 19:32 UTC|newest] Thread overview: 30+ messages / expand[flat|nested] mbox.gz Atom feed top 2015-12-30 20:42 net/sctp: sock memory leak Dmitry Vyukov 2015-12-30 20:42 ` Dmitry Vyukov 2015-12-30 20:47 ` Marcelo Ricardo Leitner 2015-12-30 20:47 ` Marcelo Ricardo Leitner 2016-01-15 18:46 ` Marcelo Ricardo Leitner 2016-01-15 18:46 ` Marcelo Ricardo Leitner 2016-01-15 19:11 ` Dmitry Vyukov 2016-01-15 19:11 ` Dmitry Vyukov 2016-01-15 21:40 ` [PATCH net] sctp: do sanity checks before migrating the asoc Marcelo Ricardo Leitner 2016-01-15 21:40 ` Marcelo Ricardo Leitner 2016-01-19 14:19 ` Vlad Yasevich 2016-01-19 14:19 ` Vlad Yasevich 2016-01-19 15:59 ` Marcelo Ricardo Leitner 2016-01-19 15:59 ` Marcelo Ricardo Leitner 2016-01-19 18:37 ` Vlad Yasevich 2016-01-19 18:37 ` Vlad Yasevich 2016-01-19 19:31 ` Marcelo Ricardo Leitner [this message] 2016-01-19 19:31 ` Marcelo Ricardo Leitner 2016-01-19 19:55 ` Vlad Yasevich 2016-01-19 19:55 ` Vlad Yasevich 2016-01-19 20:08 ` Marcelo Ricardo Leitner 2016-01-19 20:08 ` Marcelo Ricardo Leitner 2016-02-03 16:13 ` Dmitry Vyukov 2016-02-03 16:13 ` Dmitry Vyukov 2016-02-04 9:47 ` Marcelo Ricardo Leitner 2016-02-04 9:47 ` Marcelo Ricardo Leitner 2016-03-02 8:56 ` net/sctp: sock memory leak Dmitry Vyukov 2016-03-02 8:56 ` Dmitry Vyukov 2016-03-02 19:42 ` Marcelo Ricardo Leitner 2016-03-02 19:42 ` Marcelo Ricardo Leitner
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=569E8F2F.5070906@gmail.com \ --to=marcelo.leitner@gmail.com \ --cc=dvyukov@google.com \ --cc=eric.dumazet@gmail.com \ --cc=glider@google.com \ --cc=kcc@google.com \ --cc=linux-sctp@vger.kernel.org \ --cc=netdev@vger.kernel.org \ --cc=sasha.levin@oracle.com \ --cc=syzkaller@googlegroups.com \ --cc=vyasevich@gmail.com \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.