From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751513AbdIORl1 (ORCPT ); Fri, 15 Sep 2017 13:41:27 -0400 Received: from mail-pf0-f194.google.com ([209.85.192.194]:33881 "EHLO mail-pf0-f194.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751184AbdIORlZ (ORCPT ); Fri, 15 Sep 2017 13:41:25 -0400 X-Google-Smtp-Source: ADKCNb492r/kcxUJ75ftmO3F7ZrFZbkzouSxPaeuetU8VzKzW39xQs8gCaRBiaLTzBgEO5kNKnw5dhYqlmB3kSbSnCk= MIME-Version: 1.0 In-Reply-To: References: <20170914140722.44292-1-nixiaoming@huawei.com> From: Cong Wang Date: Fri, 15 Sep 2017 10:41:04 -0700 Message-ID: Subject: Re: [PATCH] net/packet: fix race condition between fanout_add and __unregister_prot_hook To: Willem de Bruijn Cc: nixiaoming , David Miller , Eric Dumazet , waltje@uwalt.nl.mugnet.org, gw4pts@gw4pts.ampr.org, Andrey Konovalov , Tobias Klauser , Philip Pettersson , Alexander Potapenko , Network Development , LKML , dede.wu@huawei.com Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Sep 14, 2017 at 7:35 AM, Willem de Bruijn wrote: > On Thu, Sep 14, 2017 at 10:07 AM, nixiaoming wrote: >> From: l00219569 >> >> If fanout_add is preempted after running po-> fanout = match >> and before running __fanout_link, >> it will cause BUG_ON when __unregister_prot_hook call __fanout_unlink >> >> so, we need add mutex_lock(&fanout_mutex) to __unregister_prot_hook > > The packet socket code has no shortage of locks, so there are many > ways to avoid the race condition between fanout_add and packet_set_ring. > > Another option would be to lock the socket when calling fanout_add: > > - return fanout_add(sk, val & 0xffff, val >> 16); > + lock_sock(sk); > + ret = fanout_add(sk, val & 0xffff, val >> 16); > + release_sock(sk); > + return ret; > I don't think this is an option, because __unregister_prot_hook() can be called without lock_sock(), for example in packet_notifier(). > But, for consistency, and to be able to continue to make sense of the > locking policy, we should use the most appropriate lock. This > is po->bind_lock, as it ensures atomicity between testing whether > a protocol hook is active through po->running and the actual existence > of that hook on the protocol hook list. Yeah, register_prot_hook() and unregister_prot_hook() already assume bind_lock. [...] >> out: >> mutex_unlock(&fanout_mutex); >> + spin_unlock(&po->bind_lock); > > This function can call kzalloc with GFP_KERNEL, which may sleep. It is > not correct to sleep while holding a spinlock. Which is why I take the lock > later and test po->running again. Right, no need to mention the mutex_unlock() before the spin_unlock() is clearly wrong. > > I will clean up that patch and send it for review. How about the following patch? diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c index c26172995511..f5c696a548ed 100644 --- a/net/packet/af_packet.c +++ b/net/packet/af_packet.c @@ -1754,10 +1754,14 @@ static int fanout_add(struct sock *sk, u16 id, u16 type_flags) match->prot_hook.dev == po->prot_hook.dev) { err = -ENOSPC; if (refcount_read(&match->sk_ref) < PACKET_FANOUT_MAX) { + spin_lock(&po->bind_lock); __dev_remove_pack(&po->prot_hook); - po->fanout = match; - refcount_set(&match->sk_ref, refcount_read(&match->sk_ref) + 1); - __fanout_link(sk, po); + if (po->running) { + refcount_set(&match->sk_ref, refcount_read(&match->sk_ref) + 1); + po->fanout = match; + __fanout_link(sk, po); + } + spin_unlock(&po->bind_lock); err = 0; } }