All of lore.kernel.org
 help / color / mirror / Atom feed
From: Wei Wang <weiwan@google.com>
To: stranche@codeaurora.org
Cc: Eric Dumazet <eric.dumazet@gmail.com>,
	David Ahern <dsahern@gmail.com>, Martin KaFai Lau <kafai@fb.com>,
	Mahesh Bandewar <maheshb@google.com>,
	Jakub Kicinski <kuba@kernel.org>,
	Linux Kernel Network Developers <netdev@vger.kernel.org>,
	Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Subject: Re: Refcount mismatch when unregistering netdevice from kernel
Date: Tue, 8 Dec 2020 13:51:03 -0800	[thread overview]
Message-ID: <CAEA6p_ArQdNp=hQCjrsnAo-Xy22d44b=2KdLp7zO7E7XDA4Fog@mail.gmail.com> (raw)
In-Reply-To: <307c2de1a2ddbdcd0a346c57da88b394@codeaurora.org>

On Tue, Dec 8, 2020 at 11:13 AM <stranche@codeaurora.org> wrote:
>
> Hi Wei and Eric,
>
> Thanks for the replies.
>
> This was reported to us on the 5.4.61 kernel during a customer
> regression suite, so we don't have an exact reproducer unfortunately.
>  From the trace logs we've added it seems like this is happening during
> IPv6 transport mode XFRM data transfer and the device is unregistered in
> the middle of it, but we've been unable to reproduce it ourselves..
> We're open to trying out and sharing debug patches if needed though.
>

I double checked 5.4.61, and I didn't find any missing fixes in this
area AFAICT.

> > rt6_uncached_list_flush_dev() actually tries to replace the inet6_dev
> > with loopback_dev, and release the reference to the previous inet6_dev
> > by calling in6_dev_put(), which is actually doing the same thing as
> > ip6_dst_ifdown(). I don't understand why you say " a reference to the
> > inet6_dev is simply dropped".
>
> Fair. I was going off the semantics used by the dst_dev_put() function
> which calls dst_ops->ifdown() explicitly. At least in the case of
> xfrm6_dst_ifdown() this swap of the loopback device and putting the
> refcount seems like it could be missing a few things.
>

Looking more into the xfrm code, I think the major difference between
xfrm dst and non-xfrm dst is that, xfrm code creates a list of dst
entries in one dst_entry linked by xfrm_dst_child().
In xfrm_bundle_create(), which I believe is the main function to
create xfrm dst bundles, it allocates the main dst entry and its
children, and it calls xfrm_fill_dst() for each of them. So I think
each dst in the list (including all the children) are added into the
uncached_list.
The difference between the current code in
rt6_uncached_list_flush_dev() vs dst_ops->ifdown() is that, the
current code only releases the refcnt to inet6_dev associated with the
main dst, while xfrm6_dst_ifdown() tries to release inet6_dev
associated with every dst linked by xfrm_dst_child(). However, since
xfrm_bundle_create() anyway adds each child dst to the uncached list,
I don't see how the current code could miss any refcnt.
BTW, have you tried your previous proposed patch and confirmed it
would fix the issue?

> > The additional refcount to the DST is also released by doing the
> > following:
> >                         if (rt_dev == dev) {
> >                                 rt->dst.dev = blackhole_netdev;
> >                                 dev_hold(rt->dst.dev);
> >                                 dev_put(rt_dev);
> >                         }
> > Am I missing something?
>
> That dev_put() is on the actual netdevice struct, not the inet6_dev
> associated with it. We're seeing many calls to icmp6_dst_alloc() and
> xfrm6_fill_dst() here, both of which seem to associate a reference to
> the inet6_dev struct with the DST in addition to the standard dev_hold()
> on the netdevice during the dst_alloc()/dst_init().
>

Could we further distinguish between dst added to the uncached list by
icmp6_dst_alloc() and xfrm6_fill_dst(), and confirm which ones are the
ones leaking reference?
I suspect it would be the xfrm ones, but I think it is worth verifying.


> Thanks,
> Sean

  reply	other threads:[~2020-12-08 21:52 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-12-08  3:55 Refcount mismatch when unregistering netdevice from kernel stranche
2020-12-08 15:08 ` Eric Dumazet
2020-12-08 18:09   ` Wei Wang
2020-12-08 19:12     ` stranche
2020-12-08 21:51       ` Wei Wang [this message]
2020-12-09  0:03         ` David Ahern
2020-12-11  1:12           ` stranche
2020-12-11 16:10             ` David Ahern
2021-01-05  3:05               ` stranche
2021-01-05  4:58                 ` David Ahern
2021-01-05 19:09                   ` Wei Wang
2021-02-11 19:21                     ` Alexei Starovoitov
2021-02-12  1:28                       ` Jakub Kicinski
2021-02-12  1:44                         ` Alexei Starovoitov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAEA6p_ArQdNp=hQCjrsnAo-Xy22d44b=2KdLp7zO7E7XDA4Fog@mail.gmail.com' \
    --to=weiwan@google.com \
    --cc=dsahern@gmail.com \
    --cc=eric.dumazet@gmail.com \
    --cc=kafai@fb.com \
    --cc=kuba@kernel.org \
    --cc=maheshb@google.com \
    --cc=netdev@vger.kernel.org \
    --cc=stranche@codeaurora.org \
    --cc=subashab@codeaurora.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.