* [PATCH net] rxrpc: Restore removed timer deletion
From: David Howells @ 2022-04-13 10:16 UTC (permalink / raw)
To: netdev; +Cc: Eric Dumazet, Marc Dionne, linux-afs, dhowells, linux-kernel
A recent patch[1] from Eric Dumazet flipped the order in which the
keepalive timer and the keepalive worker were cancelled in order to fix a
syzbot reported issue[2]. Unfortunately, this enables the mirror image bug
whereby the timer races with rxrpc_exit_net(), restarting the worker after
it has been cancelled:
        CPU 1           CPU 2
        =============== =====================
                        if (rxnet->live)
                        <INTERRUPT>
        rxnet->live = false;
        cancel_work_sync(&rxnet->peer_keepalive_work);
                        rxrpc_queue_work(&rxnet->peer_keepalive_work);
        del_timer_sync(&rxnet->peer_keepalive_timer);
Fix this by restoring the removed del_timer_sync() so that we try to remove
the timer twice. If the timer runs again, it should see ->live == false
and not restart the worker.
Fixes: 1946014ca3b1 ("rxrpc: fix a race in rxrpc_exit_net()")
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Eric Dumazet <edumazet@google.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
Link: https://lore.kernel.org/r/20220404183439.3537837-1-eric.dumazet@gmail.com/ [1]
Link: https://syzkaller.appspot.com/bug?extid=724378c4bb58f703b09a [2]
---
 net/rxrpc/net_ns.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/rxrpc/net_ns.c b/net/rxrpc/net_ns.c
index f15d6942da45..cc7e30733feb 100644
--- a/net/rxrpc/net_ns.c
+++ b/net/rxrpc/net_ns.c
@@ -113,7 +113,9 @@ static __net_exit void rxrpc_exit_net(struct net *net)
 	struct rxrpc_net *rxnet = rxrpc_net(net);
 
 	rxnet->live = false;
+	del_timer_sync(&rxnet->peer_keepalive_timer);
 	cancel_work_sync(&rxnet->peer_keepalive_work);
+	/* Remove the timer again as the worker may have restarted it. */
 	del_timer_sync(&rxnet->peer_keepalive_timer);
 	rxrpc_destroy_all_calls(rxnet);
 	rxrpc_destroy_all_connections(rxnet);
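
The interleaving described in the commit message can be replayed in a small userspace model. This is only a sketch: `struct model_net` and every `model_*` function are hypothetical stand-ins, not kernel APIs, and the model assumes that `del_timer_sync()` waits for an in-flight handler to finish and that `cancel_work_sync()` leaves the work neither queued nor running. It replays the one case where the timer handler has already passed its `->live` check when teardown begins.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: these flags stand in for kernel timer/work state. */
struct model_net {
	bool live;           /* mirrors rxnet->live */
	bool timer_armed;    /* keepalive timer is pending */
	bool handler_midway; /* handler passed the ->live check, work not yet queued */
	bool work_queued;    /* keepalive work item is queued */
};

/* First half of the timer handler: test ->live, then get "interrupted". */
static void model_timer_check(struct model_net *n)
{
	if (!n->timer_armed)
		return;
	n->timer_armed = false;
	if (n->live)
		n->handler_midway = true;
}

/* Second half of the timer handler: queue the worker. */
static void model_timer_finish(struct model_net *n)
{
	if (n->handler_midway) {
		n->handler_midway = false;
		n->work_queued = true;
	}
}

/* Modelled on del_timer_sync(): deactivate, then let a running handler finish. */
static void model_del_timer_sync(struct model_net *n)
{
	n->timer_armed = false;
	model_timer_finish(n);
}

/* Modelled on cancel_work_sync(): afterwards the work is not queued. */
static void model_cancel_work_sync(struct model_net *n)
{
	n->work_queued = false;
}

/* Returns true if something is still pending after teardown (the bug). */
bool model_exit_net(bool delete_timer_first)
{
	struct model_net n = { .live = true, .timer_armed = true };

	model_timer_check(&n);            /* CPU 2: if (rxnet->live) <INTERRUPT> */
	n.live = false;                   /* CPU 1 */
	if (delete_timer_first)
		model_del_timer_sync(&n); /* the restored call */
	model_cancel_work_sync(&n);
	model_del_timer_sync(&n);         /* handler completes here at the latest */
	return n.work_queued || n.timer_armed;
}

/* Self-check: only the cancel-then-delete ordering leaves work queued. */
void model_exit_net_self_check(void)
{
	assert(model_exit_net(false));
	assert(!model_exit_net(true));
}
```

In this model, `model_exit_net(false)` leaves the work item queued after teardown, because the handler finishes only once the worker has already been cancelled. `model_exit_net(true)` drains the handler first, so the following cancel catches the work it queued.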
* Re: [PATCH net] rxrpc: Restore removed timer deletion
From: Eric Dumazet @ 2022-04-13 17:14 UTC (permalink / raw)
To: David Howells; +Cc: netdev, Marc Dionne, linux-afs, linux-kernel
On Wed, Apr 13, 2022 at 3:16 AM David Howells <dhowells@redhat.com> wrote:
>
> A recent patch[1] from Eric Dumazet flipped the order in which the
> keepalive timer and the keepalive worker were cancelled in order to fix a
> syzbot reported issue[2]. Unfortunately, this enables the mirror image bug
> whereby the timer races with rxrpc_exit_net(), restarting the worker after
> it has been cancelled:
>
>         CPU 1           CPU 2
>         =============== =====================
>                         if (rxnet->live)
>                         <INTERRUPT>
>         rxnet->live = false;
>         cancel_work_sync(&rxnet->peer_keepalive_work);
>                         rxrpc_queue_work(&rxnet->peer_keepalive_work);
>         del_timer_sync(&rxnet->peer_keepalive_timer);
>
> Fix this by restoring the removed del_timer_sync() so that we try to remove
> the timer twice. If the timer runs again, it should see ->live == false
> and not restart the worker.
>
> Fixes: 1946014ca3b1 ("rxrpc: fix a race in rxrpc_exit_net()")
> Signed-off-by: David Howells <dhowells@redhat.com>
> cc: Eric Dumazet <edumazet@google.com>
> cc: Marc Dionne <marc.dionne@auristor.com>
> cc: linux-afs@lists.infradead.org
> Link: https://lore.kernel.org/r/20220404183439.3537837-1-eric.dumazet@gmail.com/ [1]
> Link: https://syzkaller.appspot.com/bug?extid=724378c4bb58f703b09a [2]
> ---
>
> net/rxrpc/net_ns.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/net/rxrpc/net_ns.c b/net/rxrpc/net_ns.c
> index f15d6942da45..cc7e30733feb 100644
> --- a/net/rxrpc/net_ns.c
> +++ b/net/rxrpc/net_ns.c
> @@ -113,7 +113,9 @@ static __net_exit void rxrpc_exit_net(struct net *net)
>  	struct rxrpc_net *rxnet = rxrpc_net(net);
>
>  	rxnet->live = false;
> +	del_timer_sync(&rxnet->peer_keepalive_timer);
>  	cancel_work_sync(&rxnet->peer_keepalive_work);
> +	/* Remove the timer again as the worker may have restarted it. */
>  	del_timer_sync(&rxnet->peer_keepalive_timer);
>  	rxrpc_destroy_all_calls(rxnet);
>  	rxrpc_destroy_all_connections(rxnet);
>
>
ok... so we have a timer and a work queue, both activating each other
in kind of a ping pong ?
Any particular reason not using delayed works ?
Thanks.
* Re: [PATCH net] rxrpc: Restore removed timer deletion
From: David Howells @ 2022-04-13 17:41 UTC (permalink / raw)
To: Eric Dumazet; +Cc: dhowells, netdev, Marc Dionne, linux-afs, linux-kernel
Eric Dumazet <edumazet@google.com> wrote:
> ok... so we have a timer and a work queue, both activating each other
> in kind of a ping pong ?
Yes. I want to emit regular keepalive pokes.
> Any particular reason not using delayed works ?
Because there's a race between starting the keepalive timer when a new peer is
added and when the keepalive worker is resetting the timer for the next peer
in the list. This is why I'm using timer_reduce(). delayed_work doesn't
currently have such a facility. It's not simple to add because
try_to_grab_pending() as called from mod_delayed_work_on() cancels the timer -
which is not what I want it to do.
David
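
The difference between a "reduce only" update and an unconditional re-arm can be sketched in userspace. This is an illustrative model under stated assumptions: `struct model_timer` and the `model_*` functions are hypothetical, and jiffies wraparound is ignored (the kernel compares times with wrap-safe helpers). `model_timer_reduce()` mimics the documented behaviour of `timer_reduce()`: it arms an idle timer and otherwise only moves the expiry earlier. `model_mod_timer()` mimics a cancel-and-re-arm update, where the last writer wins.

```c
#include <stdbool.h>

/* Userspace stand-in for a kernel timer: just a pending flag and an expiry. */
struct model_timer {
	bool pending;
	unsigned long expires;
};

/* Unconditional re-arm: whoever writes last decides the expiry. */
void model_mod_timer(struct model_timer *t, unsigned long expires)
{
	t->pending = true;
	t->expires = expires;
}

/* Reduce-only update: arm if idle, otherwise only move the expiry earlier. */
void model_timer_reduce(struct model_timer *t, unsigned long expires)
{
	if (!t->pending || expires < t->expires) {
		t->pending = true;
		t->expires = expires;
	}
}

/*
 * Two updaters race: a newly added peer wants a keepalive at t=5, while the
 * worker schedules the next peer in the list at t=20.  Returns the expiry
 * the timer ends up with for the given update primitive and ordering.
 */
unsigned long model_race(bool use_reduce, bool worker_last)
{
	struct model_timer t = { .pending = false, .expires = 0 };
	unsigned long first = worker_last ? 5 : 20;
	unsigned long second = worker_last ? 20 : 5;

	if (use_reduce) {
		model_timer_reduce(&t, first);
		model_timer_reduce(&t, second);
	} else {
		model_mod_timer(&t, first);
		model_mod_timer(&t, second);
	}
	return t.expires;
}
```

With the reduce-only update, both orderings end at the earlier expiry, so the new peer's keepalive is not pushed back. With the unconditional re-arm, the result depends on which updater runs last. The sketch shows only the expiry semantics and does not model the workqueue internals mentioned above.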
* Re: [PATCH net] rxrpc: Restore removed timer deletion
From: Eric Dumazet @ 2022-04-13 17:53 UTC (permalink / raw)
To: David Howells; +Cc: netdev, Marc Dionne, linux-afs, linux-kernel
On Wed, Apr 13, 2022 at 10:41 AM David Howells <dhowells@redhat.com> wrote:
>
> Eric Dumazet <edumazet@google.com> wrote:
>
> > ok... so we have a timer and a work queue, both activating each other
> > in kind of a ping pong ?
>
> Yes. I want to emit regular keepalive pokes.
>
> > Any particular reason not using delayed works ?
>
> Because there's a race between starting the keepalive timer when a new peer is
> added and when the keepalive worker is resetting the timer for the next peer
> in the list. This is why I'm using timer_reduce(). delayed_work doesn't
> currently have such a facility. It's not simple to add because
> try_to_grab_pending() as called from mod_delayed_work_on() cancels the timer -
> which is not what I want it to do.
>
SGTM, thanks !
Reviewed-by: Eric Dumazet <edumazet@google.com>
* Re: [PATCH net] rxrpc: Restore removed timer deletion
From: patchwork-bot+netdevbpf @ 2022-04-15 10:00 UTC (permalink / raw)
To: David Howells; +Cc: netdev, edumazet, marc.dionne, linux-afs, linux-kernel
Hello:
This patch was applied to netdev/net.git (master)
by David S. Miller <davem@davemloft.net>:
On Wed, 13 Apr 2022 11:16:25 +0100 you wrote:
> A recent patch[1] from Eric Dumazet flipped the order in which the
> keepalive timer and the keepalive worker were cancelled in order to fix a
> syzbot reported issue[2]. Unfortunately, this enables the mirror image bug
> whereby the timer races with rxrpc_exit_net(), restarting the worker after
> it has been cancelled:
>
>         CPU 1           CPU 2
>         =============== =====================
>                         if (rxnet->live)
>                         <INTERRUPT>
>         rxnet->live = false;
>         cancel_work_sync(&rxnet->peer_keepalive_work);
>                         rxrpc_queue_work(&rxnet->peer_keepalive_work);
>         del_timer_sync(&rxnet->peer_keepalive_timer);
>
> [...]
Here is the summary with links:
- [net] rxrpc: Restore removed timer deletion
https://git.kernel.org/netdev/net/c/ee3b0826b476
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html