From mboxrd@z Thu Jan 1 00:00:00 1970
From: Kaiwen Xu
Subject: Re: loopback device reference count leakage
Date: Fri, 27 Jan 2017 03:15:52 +0000
Message-ID:
References:
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 8BIT
Cc: "netdev@vger.kernel.org"
To: Cong Wang
Return-path:
Received: from bay004-omc3s14.hotmail.com ([65.54.190.152]:52849
	"EHLO BAY004-OMC3S14.hotmail.com" rhost-flags-OK-OK-OK-OK)
	by vger.kernel.org with ESMTP id S1753347AbdA0DW4 (ORCPT );
	Thu, 26 Jan 2017 22:22:56 -0500
In-Reply-To:
Content-Language: en-US
Content-ID: <40D974CF939CCC429E69ED1B0402E5EB@namprd17.prod.outlook.com>
Sender: netdev-owner@vger.kernel.org
List-ID:

On Thu, Jan 26, 2017 at 05:01:38PM -0800, Cong Wang wrote:
> I'd suggest to you add some debugging printk's to the dst refcount functions,
> or maybe just inside dst_gc_task(). I think the last dst referring to
> the loopback dev is still being referred at that point, which prevents GC
> from destroying it.

Thanks for the suggestion! I will test it out.

> Meanwhile, if it would be also helpful if you can share how you managed to
> reproduce this reliably, I saw this bug in our data center before but never
> know how to reproduce it.

I used one of our applications to reproduce the issue. To be honest, I
haven't completely isolated which part of the code triggers the bug.
However, since the application basically acts as a web crawler, my
suspicion is that the bug manifests after initiating a large number of
connections to a wide range of IP addresses in a short period of time.

Hope this helps somewhat.

Thanks,
Kaiwen