* Re: INFO: rcu detected stall in wg_packet_tx_worker [not found] <00000000000005d1ab05a4351006@google.com> @ 2020-04-26 17:57 ` syzbot 2020-04-26 19:40 ` Eric Dumazet 0 siblings, 1 reply; 10+ messages in thread From: syzbot @ 2020-04-26 17:57 UTC (permalink / raw) To: Jason, davem, f.fainelli, gregkh, jason, jhs, jiri, krzk, kuba, kvalo, leon, linux-kernel, linux-kselftest, netdev, shuah, syzkaller-bugs, tglx, vivien.didelot, xiyou.wangcong syzbot has bisected this bug to: commit e7096c131e5161fa3b8e52a650d7719d2857adfd Author: Jason A. Donenfeld <Jason@zx2c4.com> Date: Sun Dec 8 23:27:34 2019 +0000 net: WireGuard secure network tunnel bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=15258fcfe00000 start commit: b2768df2 Merge branch 'for-linus' of git://git.kernel.org/.. git tree: upstream final crash: https://syzkaller.appspot.com/x/report.txt?x=17258fcfe00000 console output: https://syzkaller.appspot.com/x/log.txt?x=13258fcfe00000 kernel config: https://syzkaller.appspot.com/x/.config?x=b7a70e992f2f9b68 dashboard link: https://syzkaller.appspot.com/bug?extid=0251e883fe39e7a0cb0a userspace arch: i386 syz repro: https://syzkaller.appspot.com/x/repro.syz?x=15f5f47fe00000 C reproducer: https://syzkaller.appspot.com/x/repro.c?x=11e8efb4100000 Reported-by: syzbot+0251e883fe39e7a0cb0a@syzkaller.appspotmail.com Fixes: e7096c131e51 ("net: WireGuard secure network tunnel") For information about bisection process see: https://goo.gl/tpsmEJ#bisection ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: INFO: rcu detected stall in wg_packet_tx_worker 2020-04-26 17:57 ` INFO: rcu detected stall in wg_packet_tx_worker syzbot @ 2020-04-26 19:40 ` Eric Dumazet 2020-04-26 19:42 ` Jason A. Donenfeld 0 siblings, 1 reply; 10+ messages in thread From: Eric Dumazet @ 2020-04-26 19:40 UTC (permalink / raw) To: syzbot, Jason, davem, f.fainelli, gregkh, jhs, jiri, krzk, kuba, kvalo, leon, linux-kernel, linux-kselftest, netdev, shuah, syzkaller-bugs, tglx, vivien.didelot, xiyou.wangcong On 4/26/20 10:57 AM, syzbot wrote: > syzbot has bisected this bug to: > > commit e7096c131e5161fa3b8e52a650d7719d2857adfd > Author: Jason A. Donenfeld <Jason@zx2c4.com> > Date: Sun Dec 8 23:27:34 2019 +0000 > > net: WireGuard secure network tunnel > > bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=15258fcfe00000 > start commit: b2768df2 Merge branch 'for-linus' of git://git.kernel.org/.. > git tree: upstream > final crash: https://syzkaller.appspot.com/x/report.txt?x=17258fcfe00000 > console output: https://syzkaller.appspot.com/x/log.txt?x=13258fcfe00000 > kernel config: https://syzkaller.appspot.com/x/.config?x=b7a70e992f2f9b68 > dashboard link: https://syzkaller.appspot.com/bug?extid=0251e883fe39e7a0cb0a > userspace arch: i386 > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=15f5f47fe00000 > C reproducer: https://syzkaller.appspot.com/x/repro.c?x=11e8efb4100000 > > Reported-by: syzbot+0251e883fe39e7a0cb0a@syzkaller.appspotmail.com > Fixes: e7096c131e51 ("net: WireGuard secure network tunnel") > > For information about bisection process see: https://goo.gl/tpsmEJ#bisection > I have not looked at the repro closely, but WireGuard has some workers that might loop forever, cond_resched() might help a bit. diff --git a/drivers/net/wireguard/receive.c b/drivers/net/wireguard/receive.c index da3b782ab7d31df11e381529b144bcc494234a38..349a71e1907e081c61967c77c9f25a6ec5e57a24 100644 --- a/drivers/net/wireguard/receive.c +++ b/drivers/net/wireguard/receive.c @@ -518,6 +518,7 @@ void wg_packet_decrypt_worker(struct work_struct *work) &PACKET_CB(skb)->keypair->receiving)) ? PACKET_STATE_CRYPTED : PACKET_STATE_DEAD; wg_queue_enqueue_per_peer_napi(skb, state); + cond_resched(); } } diff --git a/drivers/net/wireguard/send.c b/drivers/net/wireguard/send.c index 7348c10cbae3db54bfcb31f23c2753185735f876..f5b88693176c84b4bfdf8c4e05071481a3ce45b5 100644 --- a/drivers/net/wireguard/send.c +++ b/drivers/net/wireguard/send.c @@ -281,6 +281,7 @@ void wg_packet_tx_worker(struct work_struct *work) wg_noise_keypair_put(keypair, false); wg_peer_put(peer); + cond_resched(); } } ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: INFO: rcu detected stall in wg_packet_tx_worker 2020-04-26 19:40 ` Eric Dumazet @ 2020-04-26 19:42 ` Jason A. Donenfeld 2020-04-26 19:52 ` Jason A. Donenfeld 2020-04-26 20:26 ` Eric Dumazet 0 siblings, 2 replies; 10+ messages in thread From: Jason A. Donenfeld @ 2020-04-26 19:42 UTC (permalink / raw) To: Eric Dumazet Cc: syzbot, David Miller, Florian Fainelli, Greg Kroah-Hartman, jhs, Jiří Pírko, Krzysztof Kozlowski, kuba, kvalo, leon, LKML, linux-kselftest, Netdev, Shuah Khan, syzkaller-bugs, Thomas Gleixner, vivien.didelot, Cong Wang On Sun, Apr 26, 2020 at 1:40 PM Eric Dumazet <eric.dumazet@gmail.com> wrote: > > > > On 4/26/20 10:57 AM, syzbot wrote: > > syzbot has bisected this bug to: > > > > commit e7096c131e5161fa3b8e52a650d7719d2857adfd > > Author: Jason A. Donenfeld <Jason@zx2c4.com> > > Date: Sun Dec 8 23:27:34 2019 +0000 > > > > net: WireGuard secure network tunnel > > > > bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=15258fcfe00000 > > start commit: b2768df2 Merge branch 'for-linus' of git://git.kernel.org/.. > > git tree: upstream > > final crash: https://syzkaller.appspot.com/x/report.txt?x=17258fcfe00000 > > console output: https://syzkaller.appspot.com/x/log.txt?x=13258fcfe00000 > > kernel config: https://syzkaller.appspot.com/x/.config?x=b7a70e992f2f9b68 > > dashboard link: https://syzkaller.appspot.com/bug?extid=0251e883fe39e7a0cb0a > > userspace arch: i386 > > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=15f5f47fe00000 > > C reproducer: https://syzkaller.appspot.com/x/repro.c?x=11e8efb4100000 > > > > Reported-by: syzbot+0251e883fe39e7a0cb0a@syzkaller.appspotmail.com > > Fixes: e7096c131e51 ("net: WireGuard secure network tunnel") > > > > For information about bisection process see: https://goo.gl/tpsmEJ#bisection > > > > I have not looked at the repro closely, but WireGuard has some workers > that might loop forever, cond_resched() might help a bit. I'm working on this right now. Having a bit difficult of a time getting it to reproduce locally... The reports show the stall happening always at: static struct sk_buff * sfq_dequeue(struct Qdisc *sch) { struct sfq_sched_data *q = qdisc_priv(sch); struct sk_buff *skb; sfq_index a, next_a; struct sfq_slot *slot; /* No active slots */ if (q->tail == NULL) return NULL; next_slot: a = q->tail->next; slot = &q->slots[a]; Which is kind of interesting, because it's not like that should block or anything, unless there's some kasan faulting happening. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: INFO: rcu detected stall in wg_packet_tx_worker 2020-04-26 19:42 ` Jason A. Donenfeld @ 2020-04-26 19:52 ` Jason A. Donenfeld 2020-04-26 19:58 ` Jason A. Donenfeld 2020-04-26 20:26 ` Eric Dumazet 1 sibling, 1 reply; 10+ messages in thread From: Jason A. Donenfeld @ 2020-04-26 19:52 UTC (permalink / raw) To: Eric Dumazet Cc: syzbot, David Miller, Florian Fainelli, Greg Kroah-Hartman, jhs, Jiří Pírko, Krzysztof Kozlowski, kuba, kvalo, leon, LKML, linux-kselftest, Netdev, Shuah Khan, syzkaller-bugs, Thomas Gleixner, vivien.didelot, Cong Wang It looks like part of the issue might be that I call udp_tunnel6_xmit_skb while holding rcu_read_lock_bh, in drivers/net/wireguard/socket.c. But I think there's good reason to do so, and udp_tunnel6_xmit_skb should be rcu safe. In fact, every.single.other user of udp_tunnel6_xmit_skb in the kernel uses it with rcu locked. So, hm... ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: INFO: rcu detected stall in wg_packet_tx_worker 2020-04-26 19:52 ` Jason A. Donenfeld @ 2020-04-26 19:58 ` Jason A. Donenfeld 0 siblings, 0 replies; 10+ messages in thread From: Jason A. Donenfeld @ 2020-04-26 19:58 UTC (permalink / raw) To: Eric Dumazet Cc: syzbot, David Miller, Florian Fainelli, Greg Kroah-Hartman, jhs, Jiří Pírko, Krzysztof Kozlowski, kuba, kvalo, leon, LKML, linux-kselftest, Netdev, Shuah Khan, syzkaller-bugs, Thomas Gleixner, vivien.didelot, Cong Wang On Sun, Apr 26, 2020 at 1:52 PM Jason A. Donenfeld <Jason@zx2c4.com> wrote: > > It looks like part of the issue might be that I call > udp_tunnel6_xmit_skb while holding rcu_read_lock_bh, in > drivers/net/wireguard/socket.c. But I think there's good reason to do > so, and udp_tunnel6_xmit_skb should be rcu safe. In fact, > every.single.other user of udp_tunnel6_xmit_skb in the kernel uses it > with rcu locked. So, hm... In the syzkaller log, it looks like several runs are hitting: run #0: crashed: INFO: rcu detected stall in netlink_sendmsg And other runs are hitting yet different functions. So actually, it's not clear that this is the fault of the call to udp_tunnel6_xmit_skb. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: INFO: rcu detected stall in wg_packet_tx_worker 2020-04-26 19:42 ` Jason A. Donenfeld 2020-04-26 19:52 ` Jason A. Donenfeld @ 2020-04-26 20:26 ` Eric Dumazet 2020-04-26 20:38 ` Eric Dumazet 1 sibling, 1 reply; 10+ messages in thread From: Eric Dumazet @ 2020-04-26 20:26 UTC (permalink / raw) To: Jason A. Donenfeld, Eric Dumazet Cc: syzbot, David Miller, Florian Fainelli, Greg Kroah-Hartman, jhs, Jiří Pírko, Krzysztof Kozlowski, kuba, kvalo, leon, LKML, linux-kselftest, Netdev, Shuah Khan, syzkaller-bugs, Thomas Gleixner, vivien.didelot, Cong Wang On 4/26/20 12:42 PM, Jason A. Donenfeld wrote: > On Sun, Apr 26, 2020 at 1:40 PM Eric Dumazet <eric.dumazet@gmail.com> wrote: >> >> >> >> On 4/26/20 10:57 AM, syzbot wrote: >>> syzbot has bisected this bug to: >>> >>> commit e7096c131e5161fa3b8e52a650d7719d2857adfd >>> Author: Jason A. Donenfeld <Jason@zx2c4.com> >>> Date: Sun Dec 8 23:27:34 2019 +0000 >>> >>> net: WireGuard secure network tunnel >>> >>> bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=15258fcfe00000 >>> start commit: b2768df2 Merge branch 'for-linus' of git://git.kernel.org/.. >>> git tree: upstream >>> final crash: https://syzkaller.appspot.com/x/report.txt?x=17258fcfe00000 >>> console output: https://syzkaller.appspot.com/x/log.txt?x=13258fcfe00000 >>> kernel config: https://syzkaller.appspot.com/x/.config?x=b7a70e992f2f9b68 >>> dashboard link: https://syzkaller.appspot.com/bug?extid=0251e883fe39e7a0cb0a >>> userspace arch: i386 >>> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=15f5f47fe00000 >>> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=11e8efb4100000 >>> >>> Reported-by: syzbot+0251e883fe39e7a0cb0a@syzkaller.appspotmail.com >>> Fixes: e7096c131e51 ("net: WireGuard secure network tunnel") >>> >>> For information about bisection process see: https://goo.gl/tpsmEJ#bisection >>> >> >> I have not looked at the repro closely, but WireGuard has some workers >> that might loop forever, cond_resched() might help a bit. > > I'm working on this right now. Having a bit difficult of a time > getting it to reproduce locally... > > The reports show the stall happening always at: > > static struct sk_buff * > sfq_dequeue(struct Qdisc *sch) > { > struct sfq_sched_data *q = qdisc_priv(sch); > struct sk_buff *skb; > sfq_index a, next_a; > struct sfq_slot *slot; > > /* No active slots */ > if (q->tail == NULL) > return NULL; > > next_slot: > a = q->tail->next; > slot = &q->slots[a]; > > Which is kind of interesting, because it's not like that should block > or anything, unless there's some kasan faulting happening. > I am not really sure WireGuard is involved, the repro does not rely on it anyway. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: INFO: rcu detected stall in wg_packet_tx_worker 2020-04-26 20:26 ` Eric Dumazet @ 2020-04-26 20:38 ` Eric Dumazet 2020-04-26 20:46 ` Jason A. Donenfeld 0 siblings, 1 reply; 10+ messages in thread From: Eric Dumazet @ 2020-04-26 20:38 UTC (permalink / raw) To: Jason A. Donenfeld Cc: syzbot, David Miller, Florian Fainelli, Greg Kroah-Hartman, jhs, Jiří Pírko, Krzysztof Kozlowski, kuba, kvalo, leon, LKML, linux-kselftest, Netdev, Shuah Khan, syzkaller-bugs, Thomas Gleixner, vivien.didelot, Cong Wang On 4/26/20 1:26 PM, Eric Dumazet wrote: > > > On 4/26/20 12:42 PM, Jason A. Donenfeld wrote: >> On Sun, Apr 26, 2020 at 1:40 PM Eric Dumazet <eric.dumazet@gmail.com> wrote: >>> >>> >>> >>> On 4/26/20 10:57 AM, syzbot wrote: >>>> syzbot has bisected this bug to: >>>> >>>> commit e7096c131e5161fa3b8e52a650d7719d2857adfd >>>> Author: Jason A. Donenfeld <Jason@zx2c4.com> >>>> Date: Sun Dec 8 23:27:34 2019 +0000 >>>> >>>> net: WireGuard secure network tunnel >>>> >>>> bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=15258fcfe00000 >>>> start commit: b2768df2 Merge branch 'for-linus' of git://git.kernel.org/.. >>>> git tree: upstream >>>> final crash: https://syzkaller.appspot.com/x/report.txt?x=17258fcfe00000 >>>> console output: https://syzkaller.appspot.com/x/log.txt?x=13258fcfe00000 >>>> kernel config: https://syzkaller.appspot.com/x/.config?x=b7a70e992f2f9b68 >>>> dashboard link: https://syzkaller.appspot.com/bug?extid=0251e883fe39e7a0cb0a >>>> userspace arch: i386 >>>> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=15f5f47fe00000 >>>> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=11e8efb4100000 >>>> >>>> Reported-by: syzbot+0251e883fe39e7a0cb0a@syzkaller.appspotmail.com >>>> Fixes: e7096c131e51 ("net: WireGuard secure network tunnel") >>>> >>>> For information about bisection process see: https://goo.gl/tpsmEJ#bisection >>>> >>> >>> I have not looked at the repro closely, but WireGuard has some workers >>> that might loop forever, cond_resched() might help a bit. >> >> I'm working on this right now. Having a bit difficult of a time >> getting it to reproduce locally... >> >> The reports show the stall happening always at: >> >> static struct sk_buff * >> sfq_dequeue(struct Qdisc *sch) >> { >> struct sfq_sched_data *q = qdisc_priv(sch); >> struct sk_buff *skb; >> sfq_index a, next_a; >> struct sfq_slot *slot; >> >> /* No active slots */ >> if (q->tail == NULL) >> return NULL; >> >> next_slot: >> a = q->tail->next; >> slot = &q->slots[a]; >> >> Which is kind of interesting, because it's not like that should block >> or anything, unless there's some kasan faulting happening. >> > > I am not really sure WireGuard is involved, the repro does not rely on it anyway. > Yes, do not spend too much time on this. syzbot found its way into crazy qdisc settings these last days. ( I sent a patch yesterday for choke qdisc, it seems similar checks are needed in sfq ) ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: INFO: rcu detected stall in wg_packet_tx_worker 2020-04-26 20:38 ` Eric Dumazet @ 2020-04-26 20:46 ` Jason A. Donenfeld 2020-04-26 21:53 ` Eric Dumazet 0 siblings, 1 reply; 10+ messages in thread From: Jason A. Donenfeld @ 2020-04-26 20:46 UTC (permalink / raw) To: Eric Dumazet Cc: syzbot, David Miller, Florian Fainelli, Greg Kroah-Hartman, jhs, Jiří Pírko, Krzysztof Kozlowski, kuba, kvalo, leon, LKML, linux-kselftest, Netdev, Shuah Khan, syzkaller-bugs, Thomas Gleixner, vivien.didelot, Cong Wang On Sun, Apr 26, 2020 at 2:38 PM Eric Dumazet <eric.dumazet@gmail.com> wrote: > > > > On 4/26/20 1:26 PM, Eric Dumazet wrote: > > > > > > On 4/26/20 12:42 PM, Jason A. Donenfeld wrote: > >> On Sun, Apr 26, 2020 at 1:40 PM Eric Dumazet <eric.dumazet@gmail.com> wrote: > >>> > >>> > >>> > >>> On 4/26/20 10:57 AM, syzbot wrote: > >>>> syzbot has bisected this bug to: > >>>> > >>>> commit e7096c131e5161fa3b8e52a650d7719d2857adfd > >>>> Author: Jason A. Donenfeld <Jason@zx2c4.com> > >>>> Date: Sun Dec 8 23:27:34 2019 +0000 > >>>> > >>>> net: WireGuard secure network tunnel > >>>> > >>>> bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=15258fcfe00000 > >>>> start commit: b2768df2 Merge branch 'for-linus' of git://git.kernel.org/.. > >>>> git tree: upstream > >>>> final crash: https://syzkaller.appspot.com/x/report.txt?x=17258fcfe00000 > >>>> console output: https://syzkaller.appspot.com/x/log.txt?x=13258fcfe00000 > >>>> kernel config: https://syzkaller.appspot.com/x/.config?x=b7a70e992f2f9b68 > >>>> dashboard link: https://syzkaller.appspot.com/bug?extid=0251e883fe39e7a0cb0a > >>>> userspace arch: i386 > >>>> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=15f5f47fe00000 > >>>> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=11e8efb4100000 > >>>> > >>>> Reported-by: syzbot+0251e883fe39e7a0cb0a@syzkaller.appspotmail.com > >>>> Fixes: e7096c131e51 ("net: WireGuard secure network tunnel") > >>>> > >>>> For information about bisection process see: https://goo.gl/tpsmEJ#bisection > >>>> > >>> > >>> I have not looked at the repro closely, but WireGuard has some workers > >>> that might loop forever, cond_resched() might help a bit. > >> > >> I'm working on this right now. Having a bit difficult of a time > >> getting it to reproduce locally... > >> > >> The reports show the stall happening always at: > >> > >> static struct sk_buff * > >> sfq_dequeue(struct Qdisc *sch) > >> { > >> struct sfq_sched_data *q = qdisc_priv(sch); > >> struct sk_buff *skb; > >> sfq_index a, next_a; > >> struct sfq_slot *slot; > >> > >> /* No active slots */ > >> if (q->tail == NULL) > >> return NULL; > >> > >> next_slot: > >> a = q->tail->next; > >> slot = &q->slots[a]; > >> > >> Which is kind of interesting, because it's not like that should block > >> or anything, unless there's some kasan faulting happening. > >> > > > > I am not really sure WireGuard is involved, the repro does not rely on it anyway. > > > > Yes, do not spend too much time on this. > > syzbot found its way into crazy qdisc settings these last days. > > ( I sent a patch yesterday for choke qdisc, it seems similar checks are needed in sfq ) Ah, whew, okay. I had just begun instrumenting sfq (the highly technical term for "adding printks everywhere") to figure out what's going on. Looks like you've got a handle on it, so I'll let you have at it. On the brighter side, it seems like Dmitry's and my effort to get full coverage of WireGuard has paid off in the sense that tons of packets wind up being shoveled through it in one way or another, which is good. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: INFO: rcu detected stall in wg_packet_tx_worker 2020-04-26 20:46 ` Jason A. Donenfeld @ 2020-04-26 21:53 ` Eric Dumazet 2020-05-04 11:23 ` Jason A. Donenfeld 0 siblings, 1 reply; 10+ messages in thread From: Eric Dumazet @ 2020-04-26 21:53 UTC (permalink / raw) To: Jason A. Donenfeld, Eric Dumazet Cc: syzbot, David Miller, Florian Fainelli, Greg Kroah-Hartman, jhs, Jiří Pírko, Krzysztof Kozlowski, kuba, kvalo, leon, LKML, linux-kselftest, Netdev, Shuah Khan, syzkaller-bugs, Thomas Gleixner, vivien.didelot, Cong Wang On 4/26/20 1:46 PM, Jason A. Donenfeld wrote: > > Ah, whew, okay. I had just begun instrumenting sfq (the highly > technical term for "adding printks everywhere") to figure out what's > going on. Looks like you've got a handle on it, so I'll let you have > at it. Yes, syzbot manages to put a zero in q->scaled_quantum I will send a fix. > > On the brighter side, it seems like Dmitry's and my effort to get full > coverage of WireGuard has paid off in the sense that tons of packets > wind up being shoveled through it in one way or another, which is > good. > Sure ! ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: INFO: rcu detected stall in wg_packet_tx_worker 2020-04-26 21:53 ` Eric Dumazet @ 2020-05-04 11:23 ` Jason A. Donenfeld 0 siblings, 0 replies; 10+ messages in thread From: Jason A. Donenfeld @ 2020-05-04 11:23 UTC (permalink / raw) To: Eric Dumazet Cc: syzbot, David Miller, Florian Fainelli, Greg Kroah-Hartman, Jamal Hadi Salim, Jiří Pírko, Krzysztof Kozlowski, Jakub Kicinski, kvalo, leon, LKML, linux-kselftest, Netdev, Shuah Khan, syzkaller-bugs, Thomas Gleixner, vivien.didelot, Cong Wang So in spite of this Syzkaller bug being unrelated in the end, I've continued to think about the stacktrace a bit, and combined with some other [potentially false alarm] bug reports I'm trying to wrap my head around, I'm a bit a curious about ideal usage for the udp_tunnel API. All the uses I've seen in the kernel (including wireguard) follow this pattern: rcu_read_lock_bh(); sock = rcu_dereference(obj->sock); ... udp_tunnel_xmit_skb(..., sock, ...); rcu_read_unlock_bh(); udp_tunnel_xmit_skb calls iptunnel_xmit, which winds up in the usual ip_local_out path, which eventually winds up calling some other devices' ndo_xmit, or gets queued up in a qdisc. Calls to udp_tunnel_xmit_skb aren't exactly cheap. So I wonder: is holding the rcu lock for all that time really a good thing? A different pattern that avoids holding the rcu lock would be: rcu_read_lock_bh(); sock = rcu_dereference(obj->sock); sock_hold(sock); rcu_read_unlock_bh(); ... udp_tunnel_xmit_skb(..., sock, ...); sock_put(sock); This seems better, but I wonder if it has some drawbacks too. For example, sock_put has some comment that warns against incrementing it in response to forwarded packets. And if this isn't necessary to do, it's marginally more costly than the first pattern. Any opinions about this? Jason ^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2020-05-04 11:23 UTC | newest] Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- [not found] <00000000000005d1ab05a4351006@google.com> 2020-04-26 17:57 ` INFO: rcu detected stall in wg_packet_tx_worker syzbot 2020-04-26 19:40 ` Eric Dumazet 2020-04-26 19:42 ` Jason A. Donenfeld 2020-04-26 19:52 ` Jason A. Donenfeld 2020-04-26 19:58 ` Jason A. Donenfeld 2020-04-26 20:26 ` Eric Dumazet 2020-04-26 20:38 ` Eric Dumazet 2020-04-26 20:46 ` Jason A. Donenfeld 2020-04-26 21:53 ` Eric Dumazet 2020-05-04 11:23 ` Jason A. Donenfeld
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).