* Repeating "unregister_netdevice: waiting for lo to become free" caused by upstream 76da0704507bb ("ipv6: only call ip6_route_dev_notify() once for NETDEV_UNREGISTER")
@ 2018-04-23 13:08 Rafał Miłecki
  2018-04-25 14:16 ` Rafał Miłecki
  0 siblings, 1 reply; 5+ messages in thread
From: Rafał Miłecki @ 2018-04-23 13:08 UTC
  To: WANG Cong, David S. Miller, Alexey Kuznetsov, Hideaki YOSHIFUJI,
	Network Development, jeffy, David Ahern, Khlebnikov
  Cc: Greg Kroah-Hartman, Stable

Hi,

I've just updated my 4.4.x kernel and noticed a regression. Bisecting
pointed me to commit 2417da3f4d6bc ("ipv6: only call
ip6_route_dev_notify() once for NETDEV_UNREGISTER") [0], which is a
backport of upstream 76da0704507bb. That backported commit first
appeared in 4.4.103.

I use the OpenWrt/LEDE [1] distribution and LXC [2] 1.1.5. After stopping
a container I start getting these messages:
[  229.419188] unregister_netdevice: waiting for lo to become free. Usage count = 1
[  239.660408] unregister_netdevice: waiting for lo to become free. Usage count = 1
[  249.839189] unregister_netdevice: waiting for lo to become free. Usage count = 1
(...)
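
For triage, a repeating message like the one above can be picked out of a
saved kernel log. The following is a minimal Python sketch, assuming the log
was captured as plain text (for example from dmesg). The function name
parse_unregister_waits and the regular expression are illustrative only and
are not part of any tool mentioned in this thread.

```python
import re

# Matches the kernel message quoted above. The \s+ before "Usage count"
# tolerates the message being wrapped across two lines by a mail client.
WAIT_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+unregister_netdevice: waiting for "
    r"(?P<dev>\S+) to become free\.\s+Usage count = (?P<count>-?\d+)"
)


def parse_unregister_waits(log_text):
    """Return (timestamp, device, usage_count) for every matching message."""
    return [
        (float(m.group("ts")), m.group("dev"), int(m.group("count")))
        for m in WAIT_RE.finditer(log_text)
    ]


sample = """\
[  229.419188] unregister_netdevice: waiting for lo to become free. Usage count = 1
[  239.660408] unregister_netdevice: waiting for lo to become free.
Usage count = 1
"""

for ts, dev, count in parse_unregister_waits(sample):
    print(f"{ts:.6f}s: {dev} still has {count} reference(s)")
```

A usage count that stays at the same non-zero value across repeated messages
is what suggests a leaked reference rather than a slow teardown.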

Trying to start a container anyway results in the lxc-start command
hanging around network configuration. Trying to query the LXC state
afterwards results in the lxc-info command hanging too.

I tried Googling for this issue and found similar reports:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1729637
https://github.com/fnproject/fn/issues/686
https://lime-technology.com/forums/topic/66863-kernelunregister_netdevice-waiting-for-lo-to-become-free-usage-count-1/
all of them related to Docker, which is probably a similar use
case to LXC.

I couldn't find any reference to commit 76da0704507bb that would
suggest a fix for the problem I'm seeing.

Does anyone have an idea what the issue I'm seeing is about? Or, even
better, how to fix it? Can I provide any additional info that would
help?


[0] https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/commit/?h=linux-4.4.y&id=2417da3f4d6bc4fc6c77f613f0e2264090892aa5
[1] https://openwrt.org/
[2] https://linuxcontainers.org/

-- 
Rafał

* Re: Repeating "unregister_netdevice: waiting for lo to become free" caused by upstream 76da0704507bb ("ipv6: only call ip6_route_dev_notify() once for NETDEV_UNREGISTER")
  2018-04-23 13:08 Repeating "unregister_netdevice: waiting for lo to become free" caused by upstream 76da0704507bb ("ipv6: only call ip6_route_dev_notify() once for NETDEV_UNREGISTER") Rafał Miłecki
@ 2018-04-25 14:16 ` Rafał Miłecki
  2018-04-25 14:30   ` Konstantin Khlebnikov
  0 siblings, 1 reply; 5+ messages in thread
From: Rafał Miłecki @ 2018-04-25 14:16 UTC
  To: WANG Cong, David S. Miller, Alexey Kuznetsov, Hideaki YOSHIFUJI,
	Network Development, jeffy, David Ahern, Khlebnikov
  Cc: Greg Kroah-Hartman, Stable, Dan Streetman, Dan Streetman

Today I tried 4.14.34 to see if that helps. Unfortunately it doesn't. I
still experience the same problem.

From reading various reports regarding that "unregister_netdevice:
waiting for lo to become free" message it appears the problem is caused
by a leaking dst refcnt somewhere in the kernel code.

I found links to a few commits fixing leaks in various places:
4a31a6b19f9dd ("sctp: fix dst refcnt leak in sctp_v4_get_dst")
957d761cf91cd ("sctp: fix dst refcnt leak in sctp_v6_get_dst()")
4ee806d51176b ("net: tcp: close sock if net namespace is exiting")
d747a7a51b009 ("tcp: reset sk_rx_dst in tcp_disconnect()")
751eb6b6042a5 ("ipv6: addrconf: fix dev refcont leak when DAD failed")

All of the above patches are present in linux-4.4.y and are part of the
4.4.124 kernel I use. So it seems I'm facing yet another dst refcnt leak.
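
The claim above (that these fixes are already in the 4.4.124 kernel) can be
checked mechanically. What follows is a minimal sketch in Python, assuming a
local clone of the linux-stable tree and assuming that stable backports
mention the upstream commit ID in their commit message, which is the usual
convention but not guaranteed. The helper names backport_grep_cmd and
find_backports are made up for illustration.

```python
import subprocess


def backport_grep_cmd(upstream_sha, rev):
    """Build a git command that searches commit messages reachable from rev."""
    return ["git", "log", "--oneline", f"--grep={upstream_sha}", rev]


def find_backports(upstream_sha, rev, repo="."):
    """Return one-line summaries of commits in rev that mention upstream_sha.

    Assumes `repo` is a clone of linux-stable; raises CalledProcessError
    if git fails (for example when rev does not exist).
    """
    out = subprocess.run(
        backport_grep_cmd(upstream_sha, rev),
        cwd=repo, check=True, capture_output=True, text=True,
    ).stdout
    return out.splitlines()


# Example (not executed here; the repository path is hypothetical):
#   find_backports("4a31a6b19f9dd", "v4.4.124", repo="/path/to/linux-stable")
# An empty list would suggest the fix is not in that release.
```

An empty result is only a hint: a backport whose message omits the upstream
ID would be missed by this search.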

Could commit 2417da3f4d6bc ("ipv6: only call ip6_route_dev_notify() once
for NETDEV_UNREGISTER") introduce a new dst refcnt leak? Or does it only
expose an existing one?

* Re: Repeating "unregister_netdevice: waiting for lo to become free" caused by upstream 76da0704507bb ("ipv6: only call ip6_route_dev_notify() once for NETDEV_UNREGISTER")
  2018-04-25 14:16 ` Rafał Miłecki
@ 2018-04-25 14:30   ` Konstantin Khlebnikov
  2018-04-25 14:44     ` Rafał Miłecki
  0 siblings, 1 reply; 5+ messages in thread
From: Konstantin Khlebnikov @ 2018-04-25 14:30 UTC
  To: Rafał Miłecki, WANG Cong, David S. Miller,
	Alexey Kuznetsov, Hideaki YOSHIFUJI, Network Development, jeffy,
	David Ahern
  Cc: Greg Kroah-Hartman, Stable, Dan Streetman, Dan Streetman,
	Mathias Tillman


On 25.04.2018 17:16, Rafał Miłecki wrote:
> Could commit 2417da3f4d6bc ("ipv6: only call ip6_route_dev_notify() once
> for NETDEV_UNREGISTER") introduce a new dst refcnt leak? Or does it only
> expose an existing one?

Mathias Tillman reported this as "4.4.103 linux kernel regression".
The last message in that thread (which I couldn't find in the mailing list archives) said:
| As it turns out, it's due to a patch in the Turris Omnia/OpenWRT code that adds a in6_dev_get call without calling in6_dev_put.
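
The failure mode described in that message (a reference taken with
in6_dev_get and never released with in6_dev_put) can be illustrated with a
toy model. This is a minimal Python sketch, not kernel code: the RefCounted
class and the helper functions are hypothetical and only mimic the get/put
pairing rule, under the assumption that teardown may complete only once the
reference count drops to zero.

```python
class RefCounted:
    """Toy stand-in for a reference-counted kernel object (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.refcnt = 0

    def get(self):
        # Analogous in spirit to in6_dev_get(): take a reference.
        self.refcnt += 1
        return self

    def put(self):
        # Analogous in spirit to in6_dev_put(): drop a reference.
        assert self.refcnt > 0, "put() without matching get()"
        self.refcnt -= 1


def can_unregister(obj):
    """Teardown may complete only once every reference has been dropped."""
    return obj.refcnt == 0


def balanced_user(obj):
    ref = obj.get()
    try:
        pass  # ... use the object ...
    finally:
        ref.put()  # every get() is paired with a put()


def leaky_user(obj):
    obj.get()  # bug: reference taken, never released


lo = RefCounted("lo")
balanced_user(lo)
print(can_unregister(lo))  # True

leaky_user(lo)
print(can_unregister(lo), lo.refcnt)  # False 1
```

In the model, one unbalanced get() is enough to keep the count at 1
indefinitely, which matches the shape of the "Usage count = 1" message
repeating without ever reaching zero.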

* Re: Repeating "unregister_netdevice: waiting for lo to become free" caused by upstream 76da0704507bb ("ipv6: only call ip6_route_dev_notify() once for NETDEV_UNREGISTER")
  2018-04-25 14:30   ` Konstantin Khlebnikov
@ 2018-04-25 14:44     ` Rafał Miłecki
  2018-05-04  7:54       ` Rafał Miłecki
  0 siblings, 1 reply; 5+ messages in thread
From: Rafał Miłecki @ 2018-04-25 14:44 UTC
  To: Konstantin Khlebnikov, WANG Cong, David S. Miller,
	Alexey Kuznetsov, Hideaki YOSHIFUJI, Network Development, jeffy,
	David Ahern
  Cc: Greg Kroah-Hartman, Stable, Dan Streetman, Dan Streetman,
	Mathias Tillman

On 25.04.2018 16:30, Konstantin Khlebnikov wrote:
> Mathias Tillman reported this as "4.4.103 linux kernel regression".
> Last message in that thread (which I couldn't find in mailing list archives) had:
> | As it turns out, it's due to a patch in the Turris Omnia/OpenWRT code that adds a in6_dev_get call without calling in6_dev_put.

Wow, this is very helpful, thank you!

Somehow I didn't even think about OpenWrt downstream patches. Too bad
this wasn't reported to the OpenWrt community; I spent 2 days on this.
There is indeed:
target/linux/generic/patches-4.4/670-ipv6-allow-rejecting-with-source-address-failed-policy.patch
[PATCH 1/2] ipv6: allow rejecting with "source address failed policy"

I'll move the discussion of this issue to OpenWrt/LEDE now. I hope we can
sort it out.

* Re: Repeating "unregister_netdevice: waiting for lo to become free" caused by upstream 76da0704507bb ("ipv6: only call ip6_route_dev_notify() once for NETDEV_UNREGISTER")
  2018-04-25 14:44     ` Rafał Miłecki
@ 2018-05-04  7:54       ` Rafał Miłecki
  0 siblings, 0 replies; 5+ messages in thread
From: Rafał Miłecki @ 2018-05-04  7:54 UTC
  To: Konstantin Khlebnikov, WANG Cong, David S. Miller,
	Alexey Kuznetsov, Hideaki YOSHIFUJI, Network Development, jeffy,
	David Ahern
  Cc: Greg Kroah-Hartman, Stable, Dan Streetman, Dan Streetman,
	Mathias Tillman

On 25 April 2018 at 16:44, Rafał Miłecki <zajec5@gmail.com> wrote:
> Wow, this is very helpful, thank you!
>
> Somehow I didn't even think about OpenWrt downstream patches. Too bad
> this wasn't reported to the OpenWrt community, I spent 2 days on this.
> There is indeed:
> target/linux/generic/patches-4.4/670-ipv6-allow-rejecting-with-source-address-failed-policy.patch
> [PATCH 1/2] ipv6: allow rejecting with "source address failed policy"
>
> I'll move this issue discussion to the OpenWrt/LEDE now, I hope we can
> sort it out.

For reference, it has been fixed in OpenWrt/LEDE by Felix in:

1) master branch:
https://git.openwrt.org/?p=openwrt/openwrt.git;a=commitdiff;h=58f7b5b96c301176d639540df4723c798af2a999

2) lede-17.01 branch:
https://git.openwrt.org/?p=openwrt/openwrt.git;a=commitdiff;h=999bb66b20b03c753801ecebf1ec2a03c6a63c96

-- 
Rafał
