From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C8229C433EF for ; Tue, 3 May 2022 13:43:43 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236378AbiECNrO (ORCPT ); Tue, 3 May 2022 09:47:14 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35154 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236392AbiECNrM (ORCPT ); Tue, 3 May 2022 09:47:12 -0400 Received: from mail-yw1-x112d.google.com (mail-yw1-x112d.google.com [IPv6:2607:f8b0:4864:20::112d]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E89C91D32F for ; Tue, 3 May 2022 06:43:39 -0700 (PDT) Received: by mail-yw1-x112d.google.com with SMTP id 00721157ae682-2f7c424c66cso180419007b3.1 for ; Tue, 03 May 2022 06:43:39 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=GDW9NFvTVUZU0OXiK1V5vkvO7rLAmfjO6t3+PGR4Adg=; b=QgaEpgJAAyem65Q6TAKxwJJ3o1V+C7zObAPPgKlJM62+lc8DygEVsd+kYoEYkM16Fl vitMPIHaEUJ9hGhopT4e1uv51ALgp9qB8NpKutqi8oD5KNfOlzeymJCPL9wLQMqMfZfa gHgkLdEl3lrVRqMvjZeiosAb85yvrAbxdBDLHeGAfJlWEnxgVNoruJpbryO6HtlGHFdE TGDReQmhU8Jwcqp5xdkxTOvk91Uwna7rkKD1Yu7G38UuLHX1HtUBPP1n7v452Aujr25B FUUU62RlnvfoIgHhxDE7Wqxk4Cdl89PBphh7FMcF0ee2SDrJCdEsHQ1NN4+nCl89tghK g2dg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=GDW9NFvTVUZU0OXiK1V5vkvO7rLAmfjO6t3+PGR4Adg=; b=TJZkopieV2XCf0PufgwgcVXaFPPhOE0UXGZ8DJzZPg/fXNku1ICc6u0LbAhqHe3Y3U OfHx/8W3P4NGpanPPIZIusOYmFMTAxpamXaJe4F6GDh/aap52afY44RqNJ8cscctG0Q0 gHkWB1ZkHRkv+EugUIEQExo4CTKF9M9yreKjH50kTnhWXWVfJNA4av9ikkYsowI6i3GA 7TcGc4f6BzcAbIdLxRRDkpyuoB2uHci63WBZtUEmZbqzo4gU5uheCkPrkTmUlxR8UA/c Hbku9R1jiigLwLUrfnjyLL2sHgLdq+lRSdiAe1XZCqtS8CIm/qNCFanu6PTxc2YmaFO4 hRgA== X-Gm-Message-State: AOAM531ahlebqHcDi1/Vh7rJSM3p/wUPvBHrDwdO1LqO6DNNBa9GtlxQ W3Yn6k6lePP5vJkh5y6mXTmQqB0LVyUMFmPLpA70mQ== X-Google-Smtp-Source: ABdhPJyNRiMmKn7C6XCkZGQqbYy63g4zeoEzquu1MeADlJuMPOJ3hIrnxuDlqew8AFLimody0G+bq16PwL2zsWwlQHM= X-Received: by 2002:a81:1d4e:0:b0:2f7:be8b:502e with SMTP id d75-20020a811d4e000000b002f7be8b502emr15496205ywd.278.1651585418486; Tue, 03 May 2022 06:43:38 -0700 (PDT) MIME-Version: 1.0 References: <00000000000045dc96059f4d7b02@google.com> <000000000000f75af905d3ba0716@google.com> <5f90c2b8-283e-6ca5-65f9-3ea96df00984@I-love.SAKURA.ne.jp> <3b6bc24c8cd3f896dcd480ff75715a2bf9b2db06.camel@redhat.com> <8783dad64b0d41af9624f923cb4e4f03@AcuMS.aculab.com> In-Reply-To: <8783dad64b0d41af9624f923cb4e4f03@AcuMS.aculab.com> From: Eric Dumazet Date: Tue, 3 May 2022 06:43:27 -0700 Message-ID: Subject: Re: [PATCH v2] net: rds: acquire refcount on TCP sockets To: David Laight Cc: Paolo Abeni , Tetsuo Handa , Santosh Shilimkar , "David S. Miller" , Jakub Kicinski , syzbot , netdev , syzkaller-bugs , OFED mailing list Content-Type: text/plain; charset="UTF-8" Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org On Tue, May 3, 2022 at 6:27 AM David Laight wrote: > > From: Paolo Abeni > > Sent: 03 May 2022 10:03 > > > > Hello, > > > > On Mon, 2022-05-02 at 10:40 +0900, Tetsuo Handa wrote: > > > syzbot is reporting use-after-free read in tcp_retransmit_timer() [1], > > > for TCP socket used by RDS is accessing sock_net() without acquiring a > > > refcount on net namespace. Since TCP's retransmission can happen after > > > a process which created net namespace terminated, we need to explicitly > > > acquire a refcount. > > > > > > Link: https://syzkaller.appspot.com/bug?extid=694120e1002c117747ed [1] > > > Reported-by: syzbot > > > Fixes: 26abe14379f8e2fa ("net: Modify sk_alloc to not reference count the netns of kernel sockets.") > > > Fixes: 8a68173691f03661 ("net: sk_clone_lock() should only do get_net() if the parent is not a > > kernel socket") > > > Signed-off-by: Tetsuo Handa > > > Tested-by: syzbot > > > --- > > > Changes in v2: > > > Add Fixes: tag. > > > Move to inside lock_sock() section. > > > > > > I chose 26abe14379f8e2fa and 8a68173691f03661 which went to 4.2 for Fixes: tag, > > > for refcount was implicitly taken when 70041088e3b97662 ("RDS: Add TCP transport > > > to RDS") was added to 2.6.32. > > > > > > net/rds/tcp.c | 8 ++++++++ > > > 1 file changed, 8 insertions(+) > > > > > > diff --git a/net/rds/tcp.c b/net/rds/tcp.c > > > index 5327d130c4b5..2f638f8b7b1e 100644 > > > --- a/net/rds/tcp.c > > > +++ b/net/rds/tcp.c > > > @@ -495,6 +495,14 @@ void rds_tcp_tune(struct socket *sock) > > > > > > tcp_sock_set_nodelay(sock->sk); > > > lock_sock(sk); > > > + /* TCP timer functions might access net namespace even after > > > + * a process which created this net namespace terminated. > > > + */ > > > + if (!sk->sk_net_refcnt) { > > > + sk->sk_net_refcnt = 1; > > > + get_net_track(net, &sk->ns_tracker, GFP_KERNEL); > > > + sock_inuse_add(net, 1); > > > + } > > > if (rtn->sndbuf_size > 0) { > > > sk->sk_sndbuf = rtn->sndbuf_size; > > > sk->sk_userlocks |= SOCK_SNDBUF_LOCK; > > > > This looks equivalent to the fix presented here: > > > > https://lore.kernel.org/all/CANn89i+484ffqb93aQm1N-tjxxvb3WDKX0EbD7318RwRgsatjw@mail.gmail.com/ > > > > but the latter looks a more generic solution. @Tetsuo could you please > > test the above in your setup? > > Wouldn't a more generic solution be to add a flag to sock_create_kern() > so that it acquires a reference to the namespace? > This could be a bit on one of the existing parameters - like SOCK_NONBLOCK. > > I've a driver that uses __sock_create() in order to get that reference. > I'm pretty sure the extra 'security' check will never fail. > This would be silly really. Definition of a 'kernel socket' is that it does not hold a reference to the namespace. (otherwise a netns could not be destroyed by user space) A kernel layer using kernel sockets needs to properly dismantle them when a namespace is destroyed. In the RDS case, the socket was a user socket, or RDS lacked proper tracking of all the sockets so that they can be dismantled properly.