* [PATCH net] net: mld: do not use system_wq in the mld
@ 2022-07-21 12:03 Taehee Yoo
2022-07-21 14:04 ` Eric Dumazet
0 siblings, 1 reply; 9+ messages in thread
From: Taehee Yoo @ 2022-07-21 12:03 UTC (permalink / raw)
To: davem, kuba, pabeni, edumazet, yoshfuji, dsahern, netdev; +Cc: ap420073
mld works are supposed to be executed in mld_wq.
But mld_{query | report}_work() call schedule_delayed_work(),
which internally uses system_wq.
So, this can cause a reference count leak.
splat looks like:
unregister_netdevice: waiting for br1 to become free. Usage count = 2
leaked reference.
ipv6_add_dev+0x3a5/0x1070
addrconf_notify+0x4f3/0x1760
notifier_call_chain+0x9e/0x180
register_netdevice+0xd10/0x11e0
br_dev_newlink+0x27/0x100 [bridge]
__rtnl_newlink+0xd85/0x14e0
rtnl_newlink+0x5f/0x90
rtnetlink_rcv_msg+0x335/0x9a0
netlink_rcv_skb+0x121/0x350
netlink_unicast+0x439/0x710
netlink_sendmsg+0x75f/0xc00
____sys_sendmsg+0x694/0x860
___sys_sendmsg+0xe9/0x160
__sys_sendmsg+0xbe/0x150
do_syscall_64+0x3b/0x90
entry_SYSCALL_64_after_hwframe+0x63/0xcd
Fixes: f185de28d9ae ("mld: add new workqueues for process mld events")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
---
net/ipv6/mcast.c | 14 ++++++++------
1 file changed, 8 insertions(+), 6 deletions(-)
diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c
index 7f695c39d9a8..87c699d57b36 100644
--- a/net/ipv6/mcast.c
+++ b/net/ipv6/mcast.c
@@ -1522,7 +1522,6 @@ static void mld_query_work(struct work_struct *work)
 
 		if (++cnt >= MLD_MAX_QUEUE) {
 			rework = true;
-			schedule_delayed_work(&idev->mc_query_work, 0);
 			break;
 		}
 	}
@@ -1533,8 +1532,10 @@ static void mld_query_work(struct work_struct *work)
 	__mld_query_work(skb);
 	mutex_unlock(&idev->mc_lock);
 
-	if (!rework)
-		in6_dev_put(idev);
+	if (rework && queue_delayed_work(mld_wq, &idev->mc_query_work, 0))
+		return;
+
+	in6_dev_put(idev);
 }
 
 /* called with rcu_read_lock() */
@@ -1624,7 +1625,6 @@ static void mld_report_work(struct work_struct *work)
 
 		if (++cnt >= MLD_MAX_QUEUE) {
 			rework = true;
-			schedule_delayed_work(&idev->mc_report_work, 0);
 			break;
 		}
 	}
@@ -1635,8 +1635,10 @@ static void mld_report_work(struct work_struct *work)
 	__mld_report_work(skb);
 	mutex_unlock(&idev->mc_lock);
 
-	if (!rework)
-		in6_dev_put(idev);
+	if (rework && queue_delayed_work(mld_wq, &idev->mc_report_work, 0))
+		return;
+
+	in6_dev_put(idev);
 }
 
 static bool is_in(struct ifmcaddr6 *pmc, struct ip6_sf_list *psf, int type,
--
2.17.1
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH net] net: mld: do not use system_wq in the mld
2022-07-21 12:03 [PATCH net] net: mld: do not use system_wq in the mld Taehee Yoo
@ 2022-07-21 14:04 ` Eric Dumazet
2022-07-21 17:53 ` Taehee Yoo
0 siblings, 1 reply; 9+ messages in thread
From: Eric Dumazet @ 2022-07-21 14:04 UTC (permalink / raw)
To: Taehee Yoo
Cc: David Miller, Jakub Kicinski, Paolo Abeni, Hideaki YOSHIFUJI,
David Ahern, netdev
On Thu, Jul 21, 2022 at 2:03 PM Taehee Yoo <ap420073@gmail.com> wrote:
>
> mld works are supposed to be executed in mld_wq.
> But mld_{query | report}_work() calls schedule_delayed_work().
> schedule_delayed_work() internally uses system_wq.
> So, this would cause the reference count leak.
I do not think the changelog is accurate.
At least I do not understand it yet.
We can not unload the ipv6 module, so destroy_workqueue(mld_wq) is never used.
>
> splat looks like:
> unregister_netdevice: waiting for br1 to become free. Usage count = 2
> leaked reference.
> ipv6_add_dev+0x3a5/0x1070
> addrconf_notify+0x4f3/0x1760
> notifier_call_chain+0x9e/0x180
> register_netdevice+0xd10/0x11e0
> br_dev_newlink+0x27/0x100 [bridge]
> __rtnl_newlink+0xd85/0x14e0
> rtnl_newlink+0x5f/0x90
> rtnetlink_rcv_msg+0x335/0x9a0
> netlink_rcv_skb+0x121/0x350
> netlink_unicast+0x439/0x710
> netlink_sendmsg+0x75f/0xc00
> ____sys_sendmsg+0x694/0x860
> ___sys_sendmsg+0xe9/0x160
> __sys_sendmsg+0xbe/0x150
> do_syscall_64+0x3b/0x90
> entry_SYSCALL_64_after_hwframe+0x63/0xcd
>
> Fixes: f185de28d9ae ("mld: add new workqueues for process mld events")
> Signed-off-by: Taehee Yoo <ap420073@gmail.com>
> ---
> net/ipv6/mcast.c | 14 ++++++++------
> 1 file changed, 8 insertions(+), 6 deletions(-)
>
> diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c
> index 7f695c39d9a8..87c699d57b36 100644
> --- a/net/ipv6/mcast.c
> +++ b/net/ipv6/mcast.c
> @@ -1522,7 +1522,6 @@ static void mld_query_work(struct work_struct *work)
>
> if (++cnt >= MLD_MAX_QUEUE) {
> rework = true;
> - schedule_delayed_work(&idev->mc_query_work, 0);
> break;
> }
> }
> @@ -1533,8 +1532,10 @@ static void mld_query_work(struct work_struct *work)
> __mld_query_work(skb);
> mutex_unlock(&idev->mc_lock);
>
> - if (!rework)
> - in6_dev_put(idev);
> + if (rework && queue_delayed_work(mld_wq, &idev->mc_query_work, 0))
It seems the 'real issue' was that
schedule_delayed_work(&idev->mc_query_work, 0) could be a NOP
because the work queue was already scheduled ?
> + return;
> +
> + in6_dev_put(idev);
> }
>
> /* called with rcu_read_lock() */
> @@ -1624,7 +1625,6 @@ static void mld_report_work(struct work_struct *work)
>
> if (++cnt >= MLD_MAX_QUEUE) {
> rework = true;
> - schedule_delayed_work(&idev->mc_report_work, 0);
> break;
> }
> }
> @@ -1635,8 +1635,10 @@ static void mld_report_work(struct work_struct *work)
> __mld_report_work(skb);
> mutex_unlock(&idev->mc_lock);
>
> - if (!rework)
> - in6_dev_put(idev);
> + if (rework && queue_delayed_work(mld_wq, &idev->mc_report_work, 0))
> + return;
> +
> + in6_dev_put(idev);
> }
>
> static bool is_in(struct ifmcaddr6 *pmc, struct ip6_sf_list *psf, int type,
> --
> 2.17.1
>
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH net] net: mld: do not use system_wq in the mld
2022-07-21 14:04 ` Eric Dumazet
@ 2022-07-21 17:53 ` Taehee Yoo
2022-07-21 18:34 ` Eric Dumazet
0 siblings, 1 reply; 9+ messages in thread
From: Taehee Yoo @ 2022-07-21 17:53 UTC (permalink / raw)
To: Eric Dumazet
Cc: David Miller, Jakub Kicinski, Paolo Abeni, Hideaki YOSHIFUJI,
David Ahern, netdev
Hi Eric,
Thank you so much for your review!
On 7/21/22 23:04, Eric Dumazet wrote:
> On Thu, Jul 21, 2022 at 2:03 PM Taehee Yoo <ap420073@gmail.com> wrote:
>>
>> mld works are supposed to be executed in mld_wq.
>> But mld_{query | report}_work() calls schedule_delayed_work().
>> schedule_delayed_work() internally uses system_wq.
>> So, this would cause the reference count leak.
>
> I do not think the changelog is accurate.
> At least I do not understand it yet.
>
> We can not unload the ipv6 module, so destroy_workqueue(mld_wq) is never used.
>
>
>
>>
>> splat looks like:
>> unregister_netdevice: waiting for br1 to become free. Usage count = 2
>> leaked reference.
>> ipv6_add_dev+0x3a5/0x1070
>> addrconf_notify+0x4f3/0x1760
>> notifier_call_chain+0x9e/0x180
>> register_netdevice+0xd10/0x11e0
>> br_dev_newlink+0x27/0x100 [bridge]
>> __rtnl_newlink+0xd85/0x14e0
>> rtnl_newlink+0x5f/0x90
>> rtnetlink_rcv_msg+0x335/0x9a0
>> netlink_rcv_skb+0x121/0x350
>> netlink_unicast+0x439/0x710
>> netlink_sendmsg+0x75f/0xc00
>> ____sys_sendmsg+0x694/0x860
>> ___sys_sendmsg+0xe9/0x160
>> __sys_sendmsg+0xbe/0x150
>> do_syscall_64+0x3b/0x90
>> entry_SYSCALL_64_after_hwframe+0x63/0xcd
>>
>> Fixes: f185de28d9ae ("mld: add new workqueues for process mld events")
>> Signed-off-by: Taehee Yoo <ap420073@gmail.com>
>> ---
>> net/ipv6/mcast.c | 14 ++++++++------
>> 1 file changed, 8 insertions(+), 6 deletions(-)
>>
>> diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c
>> index 7f695c39d9a8..87c699d57b36 100644
>> --- a/net/ipv6/mcast.c
>> +++ b/net/ipv6/mcast.c
>> @@ -1522,7 +1522,6 @@ static void mld_query_work(struct work_struct *work)
>>
>> if (++cnt >= MLD_MAX_QUEUE) {
>> rework = true;
>> - schedule_delayed_work(&idev->mc_query_work, 0);
>> break;
>> }
>> }
>> @@ -1533,8 +1532,10 @@ static void mld_query_work(struct work_struct *work)
>> __mld_query_work(skb);
>> mutex_unlock(&idev->mc_lock);
>>
>> - if (!rework)
>> - in6_dev_put(idev);
>> + if (rework && queue_delayed_work(mld_wq, &idev->mc_query_work, 0))
>
> It seems the 'real issue' was that
> schedule_delayed_work(&idev->mc_query_work, 0) could be a NOP
> because the work queue was already scheduled ?
>
I think your assumption is right.
I tested the below scenario, which triggers the real issue:
THREAD0                                 THREAD1
mld_report_work()
                                        spin_lock_bh()
                                        if (!mod_delayed_work()) <-- queued
                                                in6_dev_hold();
                                        spin_unlock_bh()
spin_lock_bh()
schedule_delayed_work() <-- return false, already queued by THREAD1
spin_unlock_bh()
return;
// no in6_dev_put() regardless of the return value of schedule_delayed_work().
In order to check, I added printk like below.
if (++cnt >= MLD_MAX_QUEUE) {
	rework = true;
	if (!schedule_delayed_work(&idev->mc_report_work, 0))
		printk("[TEST]%s %u \n", __func__, __LINE__);
	break;
If the TEST log message is printed, the work was already queued by the
other path, which means a reference count has been leaked.
In my testing, the log message appeared exactly when the reference count
leak occurred.
So, although I tested it only for 1 hour, I'm sure that this bug comes
from not checking the return value of schedule_delayed_work().
As you said, this changelog is not correct.
system_wq and mld_wq are not related to this issue.
I would like to send a v2 patch after some more tests.
The v2 patch will change the commit message.
>
>
>> + return;
>> +
>> + in6_dev_put(idev);
>> }
>>
>> /* called with rcu_read_lock() */
>> @@ -1624,7 +1625,6 @@ static void mld_report_work(struct work_struct *work)
>>
>> if (++cnt >= MLD_MAX_QUEUE) {
>> rework = true;
>> - schedule_delayed_work(&idev->mc_report_work, 0);
>> break;
>> }
>> }
>> @@ -1635,8 +1635,10 @@ static void mld_report_work(struct work_struct *work)
>> __mld_report_work(skb);
>> mutex_unlock(&idev->mc_lock);
>>
>> - if (!rework)
>> - in6_dev_put(idev);
>> + if (rework && queue_delayed_work(mld_wq, &idev->mc_report_work, 0))
>> + return;
>> +
>> + in6_dev_put(idev);
>> }
>>
>> static bool is_in(struct ifmcaddr6 *pmc, struct ip6_sf_list *psf, int type,
>> --
>> 2.17.1
>>
Thanks a lot!
Taehee Yoo
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH net] net: mld: do not use system_wq in the mld
2022-07-21 17:53 ` Taehee Yoo
@ 2022-07-21 18:34 ` Eric Dumazet
2022-07-21 19:01 ` Taehee Yoo
0 siblings, 1 reply; 9+ messages in thread
From: Eric Dumazet @ 2022-07-21 18:34 UTC (permalink / raw)
To: Taehee Yoo
Cc: David Miller, Jakub Kicinski, Paolo Abeni, Hideaki YOSHIFUJI,
David Ahern, netdev
On Thu, Jul 21, 2022 at 7:53 PM Taehee Yoo <ap420073@gmail.com> wrote:
>
> Hi Eric,
> Thank you so much for your review!
>
...
> I think your assumption is right.
> I tested the below scenario, which occurs the real issue.
> THREAD0                                 THREAD1
> mld_report_work()
>                                         spin_lock_bh()
>                                         if (!mod_delayed_work()) <-- queued
>                                                 in6_dev_hold();
>                                         spin_unlock_bh()
> spin_lock_bh()
> schedule_delayed_work() <-- return false, already queued by THREAD1
> spin_unlock_bh()
> return;
> //no in6_dev_put() regardless return value of schedule_delayed_work().
>
> In order to check, I added printk like below.
> if (++cnt >= MLD_MAX_QUEUE) {
> 	rework = true;
> 	if (!schedule_delayed_work(&idev->mc_report_work, 0))
> 		printk("[TEST]%s %u \n", __func__, __LINE__);
> 	break;
>
>
> If the TEST log message is printed, work is already queued by other logic.
> So, it indicates a reference count is leaked.
> The result is that I can see log messages only when the reference count
> leak occurs.
> So, although I tested it only for 1 hour, I'm sure that this bug comes
> from missing check a return value of schedule_delayed_work().
>
> As you said, this changelog is not correct.
> system_wq and mld_wq are not related to this issue.
>
> I would like to send a v2 patch after some more tests.
> The v2 patch will change the commit message.
Can you describe what kind of tests you are running ?
Was it a syzbot report ?
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH net] net: mld: do not use system_wq in the mld
2022-07-21 18:34 ` Eric Dumazet
@ 2022-07-21 19:01 ` Taehee Yoo
2022-07-22 2:35 ` Hangbin Liu
0 siblings, 1 reply; 9+ messages in thread
From: Taehee Yoo @ 2022-07-21 19:01 UTC (permalink / raw)
To: Eric Dumazet
Cc: David Miller, Jakub Kicinski, Paolo Abeni, Hideaki YOSHIFUJI,
David Ahern, netdev
On 7/22/22 03:34, Eric Dumazet wrote:
> On Thu, Jul 21, 2022 at 7:53 PM Taehee Yoo <ap420073@gmail.com> wrote:
>>
>> Hi Eric,
>> Thank you so much for your review!
>>
>
> ...
>
>> I think your assumption is right.
>> I tested the below scenario, which occurs the real issue.
>> THREAD0                                 THREAD1
>> mld_report_work()
>>                                         spin_lock_bh()
>>                                         if (!mod_delayed_work()) <-- queued
>>                                                 in6_dev_hold();
>>                                         spin_unlock_bh()
>> spin_lock_bh()
>> schedule_delayed_work() <-- return false, already queued by THREAD1
>> spin_unlock_bh()
>> return;
>> //no in6_dev_put() regardless return value of schedule_delayed_work().
>>
>> In order to check, I added printk like below.
>> if (++cnt >= MLD_MAX_QUEUE) {
>> 	rework = true;
>> 	if (!schedule_delayed_work(&idev->mc_report_work, 0))
>> 		printk("[TEST]%s %u \n", __func__, __LINE__);
>> 	break;
>>
>>
>> If the TEST log message is printed, work is already queued by other logic.
>> So, it indicates a reference count is leaked.
>> The result is that I can see log messages only when the reference count
>> leak occurs.
>> So, although I tested it only for 1 hour, I'm sure that this bug comes
>> from missing check a return value of schedule_delayed_work().
>>
>> As you said, this changelog is not correct.
>> system_wq and mld_wq are not related to this issue.
>>
>> I would like to send a v2 patch after some more tests.
>> The v2 patch will change the commit message.
>
> Can you describe what kind of tests you are running ?
> Was it a syzbot report ?
I found this bug while testing another syzbot report.
(https://syzkaller.appspot.com/bug?id=ed41eaa4367b421d37aab5dee25e3f4c91ceae93)
And I can't find the same case in the syzbot reports list.
I just use some command lines and many kernel debug options such as
kmemleak, kasan, lockdep, and others.
a=ns-$RANDOM
b=ns-$RANDOM
ip netns add $a
ip netns add $b
echo 'file net/ipv6/addrconf.c +p' > /sys/kernel/debug/dynamic_debug/control
echo 'file net/ipv6/addrconf_core.c +p' > /sys/kernel/debug/dynamic_debug/control
echo 'file net/ipv6/mcast.c +p' > /sys/kernel/debug/dynamic_debug/control
ip netns exec $a sysctl net.ipv6.mld_max_msf=8024 -w
ip netns exec $b sysctl net.ipv6.mld_max_msf=8024 -w
ip netns exec $a ip link add br0 type bridge mcast_querier 1 mcast_query_interval 100 mcast_mld_version 1
ip netns exec $b ip link add br1 type bridge mcast_querier 1 mcast_query_interval 100 mcast_mld_version 1
ip netns exec $a ip link set br0 up
ip netns exec $b ip link set br1 up
for i in {0..5}
do
ip netns exec $a ip link add vetha$i type veth peer name vethb$i
ip netns exec $a ip link set vethb$i netns $b
ip netns exec $a ip link set vetha$i master br0
ip netns exec $a ip link set vetha$i up
ip netns exec $b ip link set vethb$i master br1
ip netns exec $b ip link set vethb$i up
done
sleep 10
ip netns del $a
ip netns del $b
This script is not a real use case.
It just generates *looping* packets, which contain many MLD packets
due to the bridge querier option.
I think a packet generator would be more useful to reproduce this bug.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH net] net: mld: do not use system_wq in the mld
2022-07-21 19:01 ` Taehee Yoo
@ 2022-07-22 2:35 ` Hangbin Liu
2022-07-22 4:23 ` Hangbin Liu
0 siblings, 1 reply; 9+ messages in thread
From: Hangbin Liu @ 2022-07-22 2:35 UTC (permalink / raw)
To: Taehee Yoo
Cc: Eric Dumazet, David Miller, Jakub Kicinski, Paolo Abeni,
Hideaki YOSHIFUJI, David Ahern, netdev
On Fri, Jul 22, 2022 at 04:01:13AM +0900, Taehee Yoo wrote:
>
>
> On 7/22/22 03:34, Eric Dumazet wrote:
> > On Thu, Jul 21, 2022 at 7:53 PM Taehee Yoo <ap420073@gmail.com> wrote:
> >>
> >> Hi Eric,
> >> Thank you so much for your review!
> >>
> >
> > ...
> >
> >> I think your assumption is right.
> >> I tested the below scenario, which occurs the real issue.
> >> THREAD0                                 THREAD1
> >> mld_report_work()
> >>                                         spin_lock_bh()
> >>                                         if (!mod_delayed_work()) <-- queued
> >>                                                 in6_dev_hold();
> >>                                         spin_unlock_bh()
> >> spin_lock_bh()
> >> schedule_delayed_work() <-- return false, already queued by THREAD1
> >> spin_unlock_bh()
> >> return;
> >> //no in6_dev_put() regardless return value of schedule_delayed_work().
> >>
> >> In order to check, I added printk like below.
> >> if (++cnt >= MLD_MAX_QUEUE) {
> >> 	rework = true;
> >> 	if (!schedule_delayed_work(&idev->mc_report_work, 0))
> >> 		printk("[TEST]%s %u \n", __func__, __LINE__);
> >> 	break;
> >>
> >>
> >> If the TEST log message is printed, work is already queued by other logic.
> >> So, it indicates a reference count is leaked.
> >> The result is that I can see log messages only when the reference count
> >> leak occurs.
> >> So, although I tested it only for 1 hour, I'm sure that this bug comes
> >> from missing check a return value of schedule_delayed_work().
> >>
> >> As you said, this changelog is not correct.
> >> system_wq and mld_wq are not related to this issue.
> >>
> >> I would like to send a v2 patch after some more tests.
> >> The v2 patch will change the commit message.
> >
> > Can you describe what kind of tests you are running ?
> > Was it a syzbot report ?
>
> I found this bug while testing another syzbot report.
> (https://syzkaller.appspot.com/bug?id=ed41eaa4367b421d37aab5dee25e3f4c91ceae93)
> And I can't find the same case in the syzbot reports list.
>
> I just use some command lines and many kernel debug options such as
> kmemleak, kasan, lockdep, and others.
>
Hi Taehee,
I got a similar issue with yours after Eric's 2d3916f31891
("ipv6: fix skb drops in igmp6_event_query() and igmp6_event_report()").
I used force_mld_version=1 and added a lot of IPv6 addresses to generate an
MLD report flood. Here is my reproducer:
[root@bootp-73-131-221 ~]# cat mld.sh
#!/bin/sh
ip netns add ns1
ip netns add ns2
ip netns exec ns1 sysctl -w net.ipv6.conf.all.force_mld_version=1
ip netns exec ns2 sysctl -w net.ipv6.conf.all.force_mld_version=1
ip -n ns1 link add veth0 type veth peer name veth0 netns ns2
ip -n ns1 link set veth0 up
ip -n ns2 link set veth0 up
for i in `seq 50`; do
for j in `seq 100`; do
ip -n ns1 addr add 2021:${i}::${j}/64 dev veth0
ip -n ns2 addr add 2022:${i}::${j}/64 dev veth0
done
done
modprobe -r veth
ip -a netns del
After `modprobe -r veth` we will see the refcount leak error:
[ 1382.683371] unregister_netdevice: waiting for veth0 to become free. Usage count = 2
[ 1392.931397] unregister_netdevice: waiting for veth0 to become free. Usage count = 2
[ 1402.939234] unregister_netdevice: waiting for veth0 to become free. Usage count = 2
[ 1413.179296] unregister_netdevice: waiting for veth0 to become free. Usage count = 2
I tried to debug the reason but didn't get much of a clue. Maybe this reproducer
could help you.
Thanks
Hangbin
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH net] net: mld: do not use system_wq in the mld
2022-07-22 2:35 ` Hangbin Liu
@ 2022-07-22 4:23 ` Hangbin Liu
2022-07-22 4:50 ` Taehee Yoo
0 siblings, 1 reply; 9+ messages in thread
From: Hangbin Liu @ 2022-07-22 4:23 UTC (permalink / raw)
To: Taehee Yoo
Cc: Eric Dumazet, David Miller, Jakub Kicinski, Paolo Abeni,
Hideaki YOSHIFUJI, David Ahern, netdev
On Fri, Jul 22, 2022 at 10:35:52AM +0800, Hangbin Liu wrote:
> > I found this bug while testing another syzbot report.
> > (https://syzkaller.appspot.com/bug?id=ed41eaa4367b421d37aab5dee25e3f4c91ceae93)
> > And I can't find the same case in the syzbot reports list.
> >
> > I just use some command lines and many kernel debug options such as
> > kmemleak, kasan, lockdep, and others.
> >
>
> Hi Taehee,
>
> I got a similar issue with yours after Eric's 2d3916f31891
> ("ipv6: fix skb drops in igmp6_event_query() and igmp6_event_report()").
> I use force_mld_version=1 and adding a lot of IPv6 address to generate the
> mld reports flood. Here is my reproducer:
BTW, thanks for your fix. With your patch the issue is fixed. Please feel free
to add
Tested-by: Hangbin Liu <liuhangbin@gmail.com>
Cheers
Hangbin
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH net] net: mld: do not use system_wq in the mld
2022-07-22 4:23 ` Hangbin Liu
@ 2022-07-22 4:50 ` Taehee Yoo
2022-07-22 6:16 ` Hangbin Liu
0 siblings, 1 reply; 9+ messages in thread
From: Taehee Yoo @ 2022-07-22 4:50 UTC (permalink / raw)
To: Hangbin Liu
Cc: Eric Dumazet, David Miller, Jakub Kicinski, Paolo Abeni,
Hideaki YOSHIFUJI, David Ahern, netdev
Hi Hangbin,
Thank you so much for the test and review!
On 7/22/22 13:23, Hangbin Liu wrote:
> On Fri, Jul 22, 2022 at 10:35:52AM +0800, Hangbin Liu wrote:
>>> I found this bug while testing another syzbot report.
>>> (https://syzkaller.appspot.com/bug?id=ed41eaa4367b421d37aab5dee25e3f4c91ceae93)
>>> And I can't find the same case in the syzbot reports list.
>>>
>>> I just use some command lines and many kernel debug options such as
>>> kmemleak, kasan, lockdep, and others.
>>>
>>
>> Hi Taehee,
>>
>> I got a similar issue with yours after Eric's 2d3916f31891
>> ("ipv6: fix skb drops in igmp6_event_query() and igmp6_event_report()").
>> I use force_mld_version=1 and adding a lot of IPv6 address to generate the
>> mld reports flood. Here is my reproducer:
>
> BTW, thanks for your fix. With your patch the issue is fixed. Please feel free
> to add
>
> Tested-by: Hangbin Liu <liuhangbin@gmail.com>
>
I also tested with your reproducer.
I checked that it reproduces the same issue.
[ 69.862696][ T58] [TEST]mld_report_work 1629
[ 87.129371][ T10] unregister_netdevice: waiting for veth0 to become free. Usage count = 2
[ 87.132106][ T10] leaked reference.
[ 87.133276][ T10] ipv6_add_dev+0x324/0xec0
[ 87.134724][ T10] addrconf_notify+0x481/0xd10
[ 87.136200][ T10] raw_notifier_call_chain+0xe3/0x120
[ 87.137829][ T10] call_netdevice_notifiers+0x106/0x160
[ 87.139454][ T10] register_netdevice+0x114c/0x16b0
[ 87.140380][ T10] veth_newlink+0x48b/0xa50 [veth]
[ 87.141268][ T10] rtnl_newlink+0x11a2/0x1a40
[ 87.142073][ T10] rtnetlink_rcv_msg+0x63f/0xc00
[ 87.142956][ T10] netlink_rcv_skb+0x1df/0x3e0
[ 87.143861][ T10] netlink_unicast+0x5de/0x850
[ 87.144725][ T10] netlink_sendmsg+0x6c9/0xa90
[ 87.145595][ T10] ____sys_sendmsg+0x76a/0x780
[ 87.146483][ T10] __sys_sendmsg+0x27c/0x340
[ 87.147340][ T10] do_syscall_64+0x43/0x90
[ 87.148158][ T10] entry_SYSCALL_64_after_hwframe+0x63/0xcd
The printk message and call trace are the same.
So, I'm sure that this is the same issue.
Also, I tested with my patch and your script for 1 hour, and the
reference count leak disappeared.
If you are okay, I would like to attach your reproducer script to the
commit message.
> Cheers
> Hangbin
Thank you so much
Taehee Yoo
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH net] net: mld: do not use system_wq in the mld
2022-07-22 4:50 ` Taehee Yoo
@ 2022-07-22 6:16 ` Hangbin Liu
0 siblings, 0 replies; 9+ messages in thread
From: Hangbin Liu @ 2022-07-22 6:16 UTC (permalink / raw)
To: Taehee Yoo
Cc: Eric Dumazet, David Miller, Jakub Kicinski, Paolo Abeni,
Hideaki YOSHIFUJI, David Ahern, netdev
On Fri, Jul 22, 2022 at 01:50:14PM +0900, Taehee Yoo wrote:
> If you are okay, I would like to attach your reproducer script to the commit
> message.
Sure, please feel free to do that.
Hangbin
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2022-07-22 6:16 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-07-21 12:03 [PATCH net] net: mld: do not use system_wq in the mld Taehee Yoo
2022-07-21 14:04 ` Eric Dumazet
2022-07-21 17:53 ` Taehee Yoo
2022-07-21 18:34 ` Eric Dumazet
2022-07-21 19:01 ` Taehee Yoo
2022-07-22 2:35 ` Hangbin Liu
2022-07-22 4:23 ` Hangbin Liu
2022-07-22 4:50 ` Taehee Yoo
2022-07-22 6:16 ` Hangbin Liu