From mboxrd@z Thu Jan 1 00:00:00 1970 From: Joe Jin Subject: Re: kernel panic in skb_copy_bits Date: Fri, 28 Jun 2013 19:33:03 +0800 Message-ID: <51CD746F.7020603__36109.0254424655$1372419287$gmane$org@oracle.com> References: <51CBAA48.3080802@oracle.com> <1372311118.3301.214.camel@edumazet-glaptop> <51CD0E67.4000008@oracle.com> <1372402340.3301.229.camel@edumazet-glaptop> <1372412262.3301.251.camel@edumazet-glaptop> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <1372412262.3301.251.camel@edumazet-glaptop> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Eric Dumazet Cc: Frank Blaschka , "zheng.x.li@oracle.com" , Ian Campbell , Stefano Stabellini , "netdev@vger.kernel.org" , "linux-kernel@vger.kernel.org" , Xen Devel , Jan Beulich , "David S. Miller" List-Id: xen-devel@lists.xenproject.org Hi Eric, Thanks for your patch, I'll test it then get back to you. Regards, Joe On 06/28/13 17:37, Eric Dumazet wrote: > OK please try the following patch > > > [PATCH] neighbour: fix a race in neigh_destroy() > > There is a race in neighbour code, because neigh_destroy() uses > skb_queue_purge(&neigh->arp_queue) without holding neighbour lock, > while other parts of the code assume neighbour rwlock is what > protects arp_queue > > Convert all skb_queue_purge() calls to the __skb_queue_purge() variant > > Use __skb_queue_head_init() instead of skb_queue_head_init() > to make clear we do not use arp_queue.lock > > And hold neigh->lock in neigh_destroy() to close the race. > > Reported-by: Joe Jin > Signed-off-by: Eric Dumazet > --- > net/core/neighbour.c | 12 +++++++----- > 1 file changed, 7 insertions(+), 5 deletions(-) > > diff --git a/net/core/neighbour.c b/net/core/neighbour.c > index 2569ab2..b7de821 100644 > --- a/net/core/neighbour.c > +++ b/net/core/neighbour.c > @@ -231,7 +231,7 @@ static void neigh_flush_dev(struct neigh_table *tbl, struct net_device *dev) > we must kill timers etc. and move > it to safe state. > */ > - skb_queue_purge(&n->arp_queue); > + __skb_queue_purge(&n->arp_queue); > n->arp_queue_len_bytes = 0; > n->output = neigh_blackhole; > if (n->nud_state & NUD_VALID) > @@ -286,7 +286,7 @@ static struct neighbour *neigh_alloc(struct neigh_table *tbl, struct net_device > if (!n) > goto out_entries; > > - skb_queue_head_init(&n->arp_queue); > + __skb_queue_head_init(&n->arp_queue); > rwlock_init(&n->lock); > seqlock_init(&n->ha_lock); > n->updated = n->used = now; > @@ -708,7 +708,9 @@ void neigh_destroy(struct neighbour *neigh) > if (neigh_del_timer(neigh)) > pr_warn("Impossible event\n"); > > - skb_queue_purge(&neigh->arp_queue); > + write_lock_bh(&neigh->lock); > + __skb_queue_purge(&neigh->arp_queue); > + write_unlock_bh(&neigh->lock); > neigh->arp_queue_len_bytes = 0; > > if (dev->netdev_ops->ndo_neigh_destroy) > @@ -858,7 +860,7 @@ static void neigh_invalidate(struct neighbour *neigh) > neigh->ops->error_report(neigh, skb); > write_lock(&neigh->lock); > } > - skb_queue_purge(&neigh->arp_queue); > + __skb_queue_purge(&neigh->arp_queue); > neigh->arp_queue_len_bytes = 0; > } > > @@ -1210,7 +1212,7 @@ int neigh_update(struct neighbour *neigh, const u8 *lladdr, u8 new, > > write_lock_bh(&neigh->lock); > } > - skb_queue_purge(&neigh->arp_queue); > + __skb_queue_purge(&neigh->arp_queue); > neigh->arp_queue_len_bytes = 0; > } > out: > >