From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752054AbdINLWp (ORCPT ); Thu, 14 Sep 2017 07:22:45 -0400
Received: from mail-io0-f196.google.com ([209.85.223.196]:34626 "EHLO
	mail-io0-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751500AbdINLWn (ORCPT );
	Thu, 14 Sep 2017 07:22:43 -0400
X-Google-Smtp-Source: AOwi7QDX3DHW9MBKYQ80VvOUdGUrZD45qEMYQXv12e7lFAE/zsOy7QhVzJh19z8b0ZedrG0/Y2g112APm0APLuSyIZ8=
MIME-Version: 1.0
In-Reply-To: <20170914024046.92505-1-nixiaoming@huawei.com>
References: <20170914024046.92505-1-nixiaoming@huawei.com>
From: Willem de Bruijn
Date: Thu, 14 Sep 2017 07:22:02 -0400
Message-ID:
Subject: Re: [PATCH] net/packet: fix race condition between fanout_add and __unregister_prot_hook
To: nixiaoming
Cc: David Miller, Eric Dumazet, waltje@uwalt.nl.mugnet.org,
	gw4pts@gw4pts.ampr.org, Andrey Konovalov, Tobias Klauser,
	philip.pettersson@gmail.com, Alexander Potapenko,
	Network Development, LKML, dede.wu@huawei.com
Content-Type: text/plain; charset="UTF-8"
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Sep 13, 2017 at 10:40 PM, nixiaoming wrote:
> If fanout_add is preempted after running po-> fanout = match
> and before running __fanout_link,
> it will cause BUG_ON when __unregister_prot_hook call __fanout_unlink
>
> so, we need add mutex_lock(&fanout_mutex) to __unregister_prot_hook
> or add spin_lock(&po->bind_lock) before po-> fanout = match
>
> test on linux 4.1.42:
> ./trinity -c setsockopt -C 2 -X &
>
> BUG: failure at net/packet/af_packet.c:1414/__fanout_unlink()!
> Kernel panic - not syncing: BUG!
> CPU: 2 PID: 2271 Comm: trinity-c0 Tainted: G W O 4.1.12 #1
> Hardware name: Hisilicon PhosphorHi1382 FPGA (DT)
> Call trace:
> [] dump_backtrace+0x0/0xf8
> [] show_stack+0x20/0x28
> [] dump_stack+0xac/0xe4
> [] panic+0xf8/0x268
> [] __unregister_prot_hook+0xa0/0x144
> [] packet_set_ring+0x280/0x5b4
> [] packet_setsockopt+0x320/0x950
> [] SyS_setsockopt+0xa4/0xd4
>
> Signed-off-by: nixiaoming
> Tested-by: wudesheng
> ---
> net/packet/af_packet.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
> index 008a45c..0300146 100644
> --- a/net/packet/af_packet.c
> +++ b/net/packet/af_packet.c
> @@ -365,10 +365,12 @@ static void __unregister_prot_hook(struct sock *sk, bool sync)
>
>         po->running = 0;
>
> +       mutex_lock(&fanout_mutex);
>         if (po->fanout)
>                 __fanout_unlink(sk, po);
>         else
>                 __dev_remove_pack(&po->prot_hook);
> +       mutex_unlock(&fanout_mutex);
>
>         __sock_put(sk);

I happened to be looking at the same or a very similar race, courtesy of
syzkaller. packet_set_ring and fanout_add can race.

I believe that one bug is in fanout_add removing the socket protocol hook
and adding the fanout protocol hook without holding po->bind_lock. That
lock ensures atomic updates to po->running and the actual protocol hook.

fanout_add tests po->running without holding the lock

        if (!po->running)
                goto out;

and later unconditionally unbinds the socket protocol hook and binds the
fanout group protocol hook:

        if (refcount_read(&match->sk_ref) < PACKET_FANOUT_MAX) {
                __dev_remove_pack(&po->prot_hook);
                po->fanout = match;
                refcount_set(&match->sk_ref, refcount_read(&match->sk_ref) + 1);
                __fanout_link(sk, po);
                err = 0;
        }

This can happen after packet_set_ring has already removed the protocol
hook, causing the socket to be added to the fanout list twice.

Testing po->running again, this time while holding the bind_lock, ensures
that packet_set_ring cannot have dropped it in between:

+               spin_lock(&po->bind_lock);
+               if (!po->running) {
+                       net_err_ratelimited("fanout add, but unbound sock");
+                       err = -EFAULT;
+                       spin_unlock(&po->bind_lock);
+                       goto out;
+               }
+               __dev_remove_pack(&po->prot_hook);
                po->fanout = match;
                refcount_set(&match->sk_ref, refcount_read(&match->sk_ref) + 1);
                __fanout_link(sk, po);
+               spin_unlock(&po->bind_lock);

I verified that the reproducer logs plenty of "fanout add, but unbound
sock" messages.

I intend to send this fix after cleaning it up a bit. Will take a closer
look at your patch to see whether these are indeed the same bug report.
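
For anyone who wants to poke at the recheck-under-lock idea outside the
kernel tree, here is a rough userspace sketch. It is a model under stated
assumptions, not the actual fix: struct model_sock, model_unregister() and
model_fanout_add() are hypothetical names, a C11 atomic_flag spin loop
stands in for po->bind_lock, and a plain counter stands in for the
registered protocol hooks.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct packet_sock; not kernel code. */
struct model_sock {
	atomic_flag bind_lock;	/* models po->bind_lock */
	bool running;		/* models po->running */
	int hooks;		/* number of protocol hooks currently registered */
};

static void model_lock(struct model_sock *po)
{
	/* Spin until the flag was previously clear, i.e. we own the lock. */
	while (atomic_flag_test_and_set_explicit(&po->bind_lock,
						 memory_order_acquire))
		;
}

static void model_unlock(struct model_sock *po)
{
	atomic_flag_clear_explicit(&po->bind_lock, memory_order_release);
}

static void model_init(struct model_sock *po)
{
	atomic_flag_clear(&po->bind_lock);
	po->running = true;
	po->hooks = 1;		/* the socket's own protocol hook */
}

/* Models the packet_set_ring path dropping the protocol hook. */
static void model_unregister(struct model_sock *po)
{
	model_lock(po);
	if (po->running) {
		po->running = false;
		po->hooks--;
	}
	model_unlock(po);
}

/*
 * Models the hook swap in fanout_add. The running flag is tested while
 * the lock is held, so the test and the swap form one atomic step.
 */
static int model_fanout_add(struct model_sock *po)
{
	int err = 0;

	model_lock(po);
	if (!po->running) {
		err = -EFAULT;	/* hook already gone: do not link */
	} else {
		po->hooks--;	/* remove the socket hook */
		po->hooks++;	/* link the fanout group hook */
		assert(po->hooks == 1);	/* never registered twice */
	}
	model_unlock(po);
	return err;
}
```

Calling model_unregister() before model_fanout_add() makes the latter
return -EFAULT and leaves the hook count at 0, which is the behaviour the
recheck is meant to guarantee. Moving the running test out of the locked
region in the model reopens the same window described above.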