All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 net] af_unix: Clear stale u->oob_skb.
@ 2024-04-05 22:10 Kuniyuki Iwashima
  2024-04-06  5:48 ` Eric Dumazet
  2024-04-09  3:30 ` patchwork-bot+netdevbpf
  0 siblings, 2 replies; 5+ messages in thread
From: Kuniyuki Iwashima @ 2024-04-05 22:10 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: Rao shoaib, Kuniyuki Iwashima, Kuniyuki Iwashima, netdev,
	syzbot+7f7f201cc2668a8fd169

syzkaller started to report deadlock of unix_gc_lock after commit
4090fa373f0e ("af_unix: Replace garbage collection algorithm."), but
it just uncovers the bug that has been there since commit 314001f0bf92
("af_unix: Add OOB support").

The repro basically does the following.

  from socket import *
  from array import array

  c1, c2 = socketpair(AF_UNIX, SOCK_STREAM)
  c1.sendmsg([b'a'], [(SOL_SOCKET, SCM_RIGHTS, array("i", [c2.fileno()]))], MSG_OOB)
  c2.recv(1)  # blocked as no normal data in recv queue

  c2.close()  # done async and unblock recv()
  c1.close()  # done async and trigger GC

A socket sends its file descriptor to itself as OOB data and tries to
receive normal data, but finally recv() fails due to async close().

The problem here is wrong handling of OOB skb in manage_oob().  When
recvmsg() is called without MSG_OOB, manage_oob() is called to check
if the peeked skb is OOB skb.  In such a case, manage_oob() pops it
out of the receive queue but does not clear unix_sock(sk)->oob_skb.
This is wrong in terms of uAPI.

Let's say we send "hello" with MSG_OOB, and "world" without MSG_OOB.
The 'o' is handled as OOB data.  When recv() is called twice without
MSG_OOB, the OOB data should be lost.

  >>> from socket import *
  >>> c1, c2 = socketpair(AF_UNIX, SOCK_STREAM, 0)
  >>> c1.send(b'hello', MSG_OOB)  # 'o' is OOB data
  5
  >>> c1.send(b'world')
  5
  >>> c2.recv(5)  # OOB data is not received
  b'hell'
  >>> c2.recv(5)  # OOB date is skipped
  b'world'
  >>> c2.recv(5, MSG_OOB)  # This should return an error
  b'o'

In the same situation, TCP actually returns -EINVAL for the last
recv().

Also, if we do not clear unix_sk(sk)->oob_skb, unix_poll() always set
EPOLLPRI even though the data has passed through by previous recv().

To avoid these issues, we must clear unix_sk(sk)->oob_skb when dequeuing
it from recv queue.

The reason why the old GC did not trigger the deadlock is because the
old GC relied on the receive queue to detect the loop.

When it is triggered, the socket with OOB data is marked as GC candidate
because file refcount == inflight count (1).  However, after traversing
all inflight sockets, the socket still has a positive inflight count (1),
thus the socket is excluded from candidates.  Then, the old GC lose the
chance to garbage-collect the socket.

With the old GC, the repro continues to create true garbage that will
never be freed nor detected by kmemleak as it's linked to the global
inflight list.  That's why we couldn't even notice the issue.

Fixes: 314001f0bf92 ("af_unix: Add OOB support")
Reported-by: syzbot+7f7f201cc2668a8fd169@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=7f7f201cc2668a8fd169
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
v2: Use skb_unref() instead of kfree_skb()
v1: https://lore.kernel.org/netdev/20240405204145.93169-1-kuniyu@amazon.com/
---
 net/unix/af_unix.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 5b41e2321209..d032eb5fa6df 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -2665,7 +2665,9 @@ static struct sk_buff *manage_oob(struct sk_buff *skb, struct sock *sk,
 				}
 			} else if (!(flags & MSG_PEEK)) {
 				skb_unlink(skb, &sk->sk_receive_queue);
-				consume_skb(skb);
+				WRITE_ONCE(u->oob_skb, NULL);
+				if (!WARN_ON_ONCE(skb_unref(skb)))
+					kfree_skb(skb);
 				skb = skb_peek(&sk->sk_receive_queue);
 			}
 		}
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH v2 net] af_unix: Clear stale u->oob_skb.
  2024-04-05 22:10 [PATCH v2 net] af_unix: Clear stale u->oob_skb Kuniyuki Iwashima
@ 2024-04-06  5:48 ` Eric Dumazet
  2024-04-09  3:30 ` patchwork-bot+netdevbpf
  1 sibling, 0 replies; 5+ messages in thread
From: Eric Dumazet @ 2024-04-06  5:48 UTC (permalink / raw)
  To: Kuniyuki Iwashima
  Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, Rao shoaib,
	Kuniyuki Iwashima, netdev, syzbot+7f7f201cc2668a8fd169

On Sat, Apr 6, 2024 at 12:11 AM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
>
> syzkaller started to report deadlock of unix_gc_lock after commit
> 4090fa373f0e ("af_unix: Replace garbage collection algorithm."), but
> it just uncovers the bug that has been there since commit 314001f0bf92
> ("af_unix: Add OOB support").
>
> The repro basically does the following.
>
>   from socket import *
>   from array import array
>
>   c1, c2 = socketpair(AF_UNIX, SOCK_STREAM)
>   c1.sendmsg([b'a'], [(SOL_SOCKET, SCM_RIGHTS, array("i", [c2.fileno()]))], MSG_OOB)
>   c2.recv(1)  # blocked as no normal data in recv queue
>
>   c2.close()  # done async and unblock recv()
>   c1.close()  # done async and trigger GC
>
> A socket sends its file descriptor to itself as OOB data and tries to
> receive normal data, but finally recv() fails due to async close().
>
> The problem here is wrong handling of OOB skb in manage_oob().  When
> recvmsg() is called without MSG_OOB, manage_oob() is called to check
> if the peeked skb is OOB skb.  In such a case, manage_oob() pops it
> out of the receive queue but does not clear unix_sock(sk)->oob_skb.
> This is wrong in terms of uAPI.
>
> Let's say we send "hello" with MSG_OOB, and "world" without MSG_OOB.
> The 'o' is handled as OOB data.  When recv() is called twice without
> MSG_OOB, the OOB data should be lost.
>
>   >>> from socket import *
>   >>> c1, c2 = socketpair(AF_UNIX, SOCK_STREAM, 0)
>   >>> c1.send(b'hello', MSG_OOB)  # 'o' is OOB data
>   5
>   >>> c1.send(b'world')
>   5
>   >>> c2.recv(5)  # OOB data is not received
>   b'hell'
>   >>> c2.recv(5)  # OOB date is skipped
>   b'world'
>   >>> c2.recv(5, MSG_OOB)  # This should return an error
>   b'o'
>
> In the same situation, TCP actually returns -EINVAL for the last
> recv().
>
> Also, if we do not clear unix_sk(sk)->oob_skb, unix_poll() always set
> EPOLLPRI even though the data has passed through by previous recv().
>
> To avoid these issues, we must clear unix_sk(sk)->oob_skb when dequeuing
> it from recv queue.
>
> The reason why the old GC did not trigger the deadlock is because the
> old GC relied on the receive queue to detect the loop.
>
> When it is triggered, the socket with OOB data is marked as GC candidate
> because file refcount == inflight count (1).  However, after traversing
> all inflight sockets, the socket still has a positive inflight count (1),
> thus the socket is excluded from candidates.  Then, the old GC lose the
> chance to garbage-collect the socket.
>
> With the old GC, the repro continues to create true garbage that will
> never be freed nor detected by kmemleak as it's linked to the global
> inflight list.  That's why we couldn't even notice the issue.
>
> Fixes: 314001f0bf92 ("af_unix: Add OOB support")
> Reported-by: syzbot+7f7f201cc2668a8fd169@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=7f7f201cc2668a8fd169
> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>

Reviewed-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v2 net] af_unix: Clear stale u->oob_skb.
  2024-04-05 22:10 [PATCH v2 net] af_unix: Clear stale u->oob_skb Kuniyuki Iwashima
  2024-04-06  5:48 ` Eric Dumazet
@ 2024-04-09  3:30 ` patchwork-bot+netdevbpf
  1 sibling, 0 replies; 5+ messages in thread
From: patchwork-bot+netdevbpf @ 2024-04-09  3:30 UTC (permalink / raw)
  To: Kuniyuki Iwashima
  Cc: davem, edumazet, kuba, pabeni, rao.shoaib, kuni1840, netdev,
	syzbot+7f7f201cc2668a8fd169

Hello:

This patch was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Fri, 5 Apr 2024 15:10:57 -0700 you wrote:
> syzkaller started to report deadlock of unix_gc_lock after commit
> 4090fa373f0e ("af_unix: Replace garbage collection algorithm."), but
> it just uncovers the bug that has been there since commit 314001f0bf92
> ("af_unix: Add OOB support").
> 
> The repro basically does the following.
> 
> [...]

Here is the summary with links:
  - [v2,net] af_unix: Clear stale u->oob_skb.
    https://git.kernel.org/netdev/net/c/b46f4eaa4f0e

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v2 net] af_unix: Clear stale u->oob_skb.
  2024-04-05 23:01 ` Kuniyuki Iwashima
@ 2024-04-05 23:28   ` Rao Shoaib
  0 siblings, 0 replies; 5+ messages in thread
From: Rao Shoaib @ 2024-04-05 23:28 UTC (permalink / raw)
  To: Kuniyuki Iwashima
  Cc: davem, edumazet, kuba, kuni1840, netdev, pabeni,
	syzbot+7f7f201cc2668a8fd169



On 4/5/24 16:01, Kuniyuki Iwashima wrote:
> From: Rao Shoaib <rao.shoaib@oracle.com>
> Date: Fri, 5 Apr 2024 15:43:26 -0700
> [...]
>>> diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
>>> index 5b41e2321209..d032eb5fa6df 100644
>>> --- a/net/unix/af_unix.c
>>> +++ b/net/unix/af_unix.c
>>> @@ -2665,7 +2665,9 @@ static struct sk_buff *manage_oob(struct sk_buff *skb, struct sock *sk,
>>>   				}
>>>   			} else if (!(flags & MSG_PEEK)) {
>>>   				skb_unlink(skb, &sk->sk_receive_queue);
>>> -				consume_skb(skb);
>>> +				WRITE_ONCE(u->oob_skb, NULL);
>>> +				if (!WARN_ON_ONCE(skb_unref(skb)))
>>> +					kfree_skb(skb);
>>>   				skb = skb_peek(&sk->sk_receive_queue);
>>>   			}
>>>   		}
>>
>> Isn't  consume_skb doing the same thing() ? .
> 
> Only if you disable CONFIG_TRACEPOINTS, otherwise each function
> uses different tracepoints, trace_consume_skb() vs trace_kfree_skb().
> 
> Here, we clearly drop the skb that's not received by user, so
> kfree_skb() should be used.

I won't get into the semantics of freed or consumed. So fine go with it. 
However if I was tracing I would just trace kfree_skb.
> 
> 
>> The only action needed is to clear out u->oob_skb -- No?
> 
> Also, queue_oob() now calls skb_get() and holds another refcnt,
> so skb_unref() is needed.

A quick look, does not show me that the proposed fix does anything 
different wrt to the refcnt. consume_skb() also calls skb_unref() and 
kfree_skb().

I am fine with this.

Reviewed-by: Rao shoaib <rao.shoaib@oracle.com>

Shoaib


> 
> BTW, do you know if MSG_OOB is actually used in production ?
> I don't know any real user other than syzbot.
> 
> P.S.
> Please send mail in plain text format as mailing list drops HTML.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v2 net] af_unix: Clear stale u->oob_skb.
       [not found] <ae5def68-6cc0-4cfe-a031-fefb103e854c@oracle.com>
@ 2024-04-05 23:01 ` Kuniyuki Iwashima
  2024-04-05 23:28   ` Rao Shoaib
  0 siblings, 1 reply; 5+ messages in thread
From: Kuniyuki Iwashima @ 2024-04-05 23:01 UTC (permalink / raw)
  To: rao.shoaib
  Cc: kuniyu, davem, edumazet, kuba, kuni1840, netdev, pabeni,
	syzbot+7f7f201cc2668a8fd169

From: Rao Shoaib <rao.shoaib@oracle.com>
Date: Fri, 5 Apr 2024 15:43:26 -0700
[...]
> > diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
> > index 5b41e2321209..d032eb5fa6df 100644
> > --- a/net/unix/af_unix.c
> > +++ b/net/unix/af_unix.c
> > @@ -2665,7 +2665,9 @@ static struct sk_buff *manage_oob(struct sk_buff *skb, struct sock *sk,
> >  				}
> >  			} else if (!(flags & MSG_PEEK)) {
> >  				skb_unlink(skb, &sk->sk_receive_queue);
> > -				consume_skb(skb);
> > +				WRITE_ONCE(u->oob_skb, NULL);
> > +				if (!WARN_ON_ONCE(skb_unref(skb)))
> > +					kfree_skb(skb);
> >  				skb = skb_peek(&sk->sk_receive_queue);
> >  			}
> >  		}
> 
> Isn't  consume_skb doing the same thing() ? .

Only if you disable CONFIG_TRACEPOINTS, otherwise each function
uses different tracepoints, trace_consume_skb() vs trace_kfree_skb().

Here, we clearly drop the skb that's not received by user, so
kfree_skb() should be used.


> The only action needed is to clear out u->oob_skb -- No?

Also, queue_oob() now calls skb_get() and holds another refcnt,
so skb_unref() is needed.

BTW, do you know if MSG_OOB is actually used in production ?
I don't know any real user other than syzbot.

P.S.
Please send mail in plain text format as mailing list drops HTML.

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2024-04-09  3:30 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-04-05 22:10 [PATCH v2 net] af_unix: Clear stale u->oob_skb Kuniyuki Iwashima
2024-04-06  5:48 ` Eric Dumazet
2024-04-09  3:30 ` patchwork-bot+netdevbpf
     [not found] <ae5def68-6cc0-4cfe-a031-fefb103e854c@oracle.com>
2024-04-05 23:01 ` Kuniyuki Iwashima
2024-04-05 23:28   ` Rao Shoaib

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.