All of lore.kernel.org
 help / color / mirror / Atom feed
From: David Ahern <dsahern@gmail.com>
To: Wei Wang <weiwan@google.com>, stranche@codeaurora.org
Cc: Eric Dumazet <eric.dumazet@gmail.com>,
	Martin KaFai Lau <kafai@fb.com>,
	Mahesh Bandewar <maheshb@google.com>,
	Jakub Kicinski <kuba@kernel.org>,
	Linux Kernel Network Developers <netdev@vger.kernel.org>,
	Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Subject: Re: Refcount mismatch when unregistering netdevice from kernel
Date: Tue, 8 Dec 2020 17:03:12 -0700	[thread overview]
Message-ID: <f10c733a-09f8-2c72-c333-41f9d53e1498@gmail.com> (raw)
In-Reply-To: <CAEA6p_ArQdNp=hQCjrsnAo-Xy22d44b=2KdLp7zO7E7XDA4Fog@mail.gmail.com>

On 12/8/20 2:51 PM, Wei Wang wrote:
> On Tue, Dec 8, 2020 at 11:13 AM <stranche@codeaurora.org> wrote:
>>
>> Hi Wei and Eric,
>>
>> Thanks for the replies.
>>
>> This was reported to us on the 5.4.61 kernel during a customer
>> regression suite, so we don't have an exact reproducer unfortunately.
>>  From the trace logs we've added it seems like this is happening during
>> IPv6 transport mode XFRM data transfer and the device is unregistered in
>> the middle of it, but we've been unable to reproduce it ourselves..
>> We're open to trying out and sharing debug patches if needed though.
>>
> 
> I double checked 5.4.61, and I didn't find any missing fixes in this
> area AFAICT.
> 
>>> rt6_uncached_list_flush_dev() actually tries to replace the inet6_dev
>>> with loopback_dev, and release the reference to the previous inet6_dev
>>> by calling in6_dev_put(), which is actually doing the same thing as
>>> ip6_dst_ifdown(). I don't understand why you say " a reference to the
>>> inet6_dev is simply dropped".
>>
>> Fair. I was going off the semantics used by the dst_dev_put() function
>> which calls dst_ops->ifdown() explicitly. At least in the case of
>> xfrm6_dst_ifdown() this swap of the loopback device and putting the
>> refcount seems like it could be missing a few things.
>>
> 
> Looking more into the xfrm code, I think the major difference between
> xfrm dst and non-xfrm dst is that, xfrm code creates a list of dst
> entries in one dst_entry linked by xfrm_dst_child().
> In xfrm_bundle_create(), which I believe is the main function to
> create xfrm dst bundles, it allocates the main dst entry and its
> children, and it calls xfrm_fill_dst() for each of them. So I think
> each dst in the list (including all the children) are added into the
> uncached_list.
> The difference between the current code in
> rt6_uncached_list_flush_dev() vs dst_ops->ifdown() is that, the
> current code only releases the refcnt to inet6_dev associated with the
> main dst, while xfrm6_dst_ifdown() tries to release inet6_dev
> associated with every dst linked by xfrm_dst_child(). However, since
> xfrm_bundle_create() anyway adds each child dst to the uncached list,
> I don't see how the current code could miss any refcnt.
> BTW, have you tried your previous proposed patch and confirmed it
> would fix the issue?
> 
>>> The additional refcount to the DST is also released by doing the
>>> following:
>>>                         if (rt_dev == dev) {
>>>                                 rt->dst.dev = blackhole_netdev;
>>>                                 dev_hold(rt->dst.dev);
>>>                                 dev_put(rt_dev);
>>>                         }
>>> Am I missing something?
>>
>> That dev_put() is on the actual netdevice struct, not the inet6_dev
>> associated with it. We're seeing many calls to icmp6_dst_alloc() and
>> xfrm6_fill_dst() here, both of which seem to associate a reference to
>> the inet6_dev struct with the DST in addition to the standard dev_hold()
>> on the netdevice during the dst_alloc()/dst_init().
>>
> 
> Could we further distinguish between dst added to the uncached list by
> icmp6_dst_alloc() and xfrm6_fill_dst(), and confirm which ones are the
> ones leaking reference?
> I suspect it would be the xfrm ones, but I think it is worth verifying.
> 

Finally found the reference:

tools/testing/selftests/net/l2tp.sh at one point was triggering a
refcount leak:

https://lore.kernel.org/netdev/20190801235421.8344-1-dsahern@kernel.org/

And then Colin found more problems with it:

https://lore.kernel.org/netdev/450f5abb-5fe8-158d-d267-4334e15f8e58@canonical.com/


running that on a 5.8 kernel on Ubuntu 20.10 did not trigger the
problem. Neither did Ubuntu 20.04 with 5.4.0-51-generic.

Can you run it on your 5.4 version and see?

  reply	other threads:[~2020-12-09  0:04 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-12-08  3:55 Refcount mismatch when unregistering netdevice from kernel stranche
2020-12-08 15:08 ` Eric Dumazet
2020-12-08 18:09   ` Wei Wang
2020-12-08 19:12     ` stranche
2020-12-08 21:51       ` Wei Wang
2020-12-09  0:03         ` David Ahern [this message]
2020-12-11  1:12           ` stranche
2020-12-11 16:10             ` David Ahern
2021-01-05  3:05               ` stranche
2021-01-05  4:58                 ` David Ahern
2021-01-05 19:09                   ` Wei Wang
2021-02-11 19:21                     ` Alexei Starovoitov
2021-02-12  1:28                       ` Jakub Kicinski
2021-02-12  1:44                         ` Alexei Starovoitov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=f10c733a-09f8-2c72-c333-41f9d53e1498@gmail.com \
    --to=dsahern@gmail.com \
    --cc=eric.dumazet@gmail.com \
    --cc=kafai@fb.com \
    --cc=kuba@kernel.org \
    --cc=maheshb@google.com \
    --cc=netdev@vger.kernel.org \
    --cc=stranche@codeaurora.org \
    --cc=subashab@codeaurora.org \
    --cc=weiwan@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.