All of lore.kernel.org
 help / color / mirror / Atom feed
From: Kuniyuki Iwashima <kuniyu@amazon.com>
To: "David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>
Cc: Kuniyuki Iwashima <kuniyu@amazon.com>,
	Kuniyuki Iwashima <kuni1840@gmail.com>, <netdev@vger.kernel.org>
Subject: [PATCH v5 net-next 08/15] af_unix: Fix up unix_edge.successor for embryo socket.
Date: Mon, 25 Mar 2024 13:24:18 -0700	[thread overview]
Message-ID: <20240325202425.60930-9-kuniyu@amazon.com> (raw)
In-Reply-To: <20240325202425.60930-1-kuniyu@amazon.com>

To garbage collect inflight AF_UNIX sockets, we must define the
cyclic reference appropriately.  This is a bit tricky if the loop
consists of embryo sockets.

Suppose that the fd of AF_UNIX socket A is passed to D and the fd B
to C and that C and D are embryo sockets of A and B, respectively.
It may appear that there are two separate graphs, A (-> D) and
B (-> C), but this is not correct.

     A --. .-- B
          X
     C <-' `-> D

Now, D holds A's refcount, and C has B's refcount, so unix_release()
will never be called for A and B when we close() them.  However, no
one can call close() for D and C to free skbs holding refcounts of A
and B because C/D is in A/B's receive queue, which should have been
purged by unix_release() for A and B.

So, here's another type of cyclic reference.  When a fd of an AF_UNIX
socket is passed to an embryo socket, the reference is indirectly held
by its parent listening socket.

  .-> A                            .-> B
  |   `- sk_receive_queue          |   `- sk_receive_queue
  |      `- skb                    |      `- skb
  |         `- sk == C             |         `- sk == D
  |            `- sk_receive_queue |           `- sk_receive_queue
  |               `- skb +---------'               `- skb +-.
  |                                                         |
  `---------------------------------------------------------'

Technically, the graph must be denoted as A <-> B instead of A (-> D)
and B (-> C) to find such a cyclic reference without touching each
socket's receive queue.

  .-> A --. .-- B <-.
  |        X        |  ==  A <-> B
  `-- C <-' `-> D --'

We apply this fixup during GC by fetching the real successor by
unix_edge_successor().

When we call accept(), we clear unix_sock.listener under unix_gc_lock
not to confuse GC.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
 include/net/af_unix.h |  1 +
 net/unix/af_unix.c    |  2 +-
 net/unix/garbage.c    | 20 +++++++++++++++++++-
 3 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/include/net/af_unix.h b/include/net/af_unix.h
index dc7469191195..414463803b7e 100644
--- a/include/net/af_unix.h
+++ b/include/net/af_unix.h
@@ -24,6 +24,7 @@ void unix_inflight(struct user_struct *user, struct file *fp);
 void unix_notinflight(struct user_struct *user, struct file *fp);
 void unix_add_edges(struct scm_fp_list *fpl, struct unix_sock *receiver);
 void unix_del_edges(struct scm_fp_list *fpl);
+void unix_update_edges(struct unix_sock *receiver);
 int unix_prepare_fpl(struct scm_fp_list *fpl);
 void unix_destroy_fpl(struct scm_fp_list *fpl);
 void unix_gc(void);
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index af74e7ebc35a..ae77e2dc0dae 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -1721,7 +1721,7 @@ static int unix_accept(struct socket *sock, struct socket *newsock, int flags,
 	}
 
 	tsk = skb->sk;
-	unix_sk(tsk)->listener = NULL;
+	unix_update_edges(unix_sk(tsk));
 	skb_free_datagram(sk, skb);
 	wake_up_interruptible(&unix_sk(sk)->peer_wait);
 
diff --git a/net/unix/garbage.c b/net/unix/garbage.c
index 33aadaa35346..8d0912c1d01a 100644
--- a/net/unix/garbage.c
+++ b/net/unix/garbage.c
@@ -101,6 +101,17 @@ struct unix_sock *unix_get_socket(struct file *filp)
 	return NULL;
 }
 
+static struct unix_vertex *unix_edge_successor(struct unix_edge *edge)
+{
+	/* If an embryo socket has a fd,
+	 * the listener indirectly holds the fd's refcnt.
+	 */
+	if (edge->successor->listener)
+		return unix_sk(edge->successor->listener)->vertex;
+
+	return edge->successor->vertex;
+}
+
 static LIST_HEAD(unix_unvisited_vertices);
 
 enum unix_vertex_index {
@@ -209,6 +220,13 @@ void unix_del_edges(struct scm_fp_list *fpl)
 	fpl->inflight = false;
 }
 
+void unix_update_edges(struct unix_sock *receiver)
+{
+	spin_lock(&unix_gc_lock);
+	receiver->listener = NULL;
+	spin_unlock(&unix_gc_lock);
+}
+
 int unix_prepare_fpl(struct scm_fp_list *fpl)
 {
 	struct unix_vertex *vertex;
@@ -268,7 +286,7 @@ static void __unix_walk_scc(struct unix_vertex *vertex)
 
 	/* Explore neighbour vertices (receivers of the current vertex's fd). */
 	list_for_each_entry(edge, &vertex->edges, vertex_entry) {
-		struct unix_vertex *next_vertex = edge->successor->vertex;
+		struct unix_vertex *next_vertex = unix_edge_successor(edge);
 
 		if (!next_vertex)
 			continue;
-- 
2.30.2


  parent reply	other threads:[~2024-03-25 20:27 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-03-25 20:24 [PATCH v5 net-next 00/15] af_unix: Rework GC Kuniyuki Iwashima
2024-03-25 20:24 ` [PATCH v5 net-next 01/15] af_unix: Allocate struct unix_vertex for each inflight AF_UNIX fd Kuniyuki Iwashima
2024-03-25 20:24 ` [PATCH v5 net-next 02/15] af_unix: Allocate struct unix_edge " Kuniyuki Iwashima
2024-03-25 20:24 ` [PATCH v5 net-next 03/15] af_unix: Link struct unix_edge when queuing skb Kuniyuki Iwashima
2024-03-25 20:24 ` [PATCH v5 net-next 04/15] af_unix: Bulk update unix_tot_inflight/unix_inflight " Kuniyuki Iwashima
2024-03-25 20:24 ` [PATCH v5 net-next 05/15] af_unix: Iterate all vertices by DFS Kuniyuki Iwashima
2024-03-25 20:24 ` [PATCH v5 net-next 06/15] af_unix: Detect Strongly Connected Components Kuniyuki Iwashima
2024-03-25 20:24 ` [PATCH v5 net-next 07/15] af_unix: Save listener for embryo socket Kuniyuki Iwashima
2024-03-25 20:24 ` Kuniyuki Iwashima [this message]
2024-03-25 20:24 ` [PATCH v5 net-next 09/15] af_unix: Save O(n) setup of Tarjan's algo Kuniyuki Iwashima
2024-03-25 20:24 ` [PATCH v5 net-next 10/15] af_unix: Skip GC if no cycle exists Kuniyuki Iwashima
2024-03-25 20:24 ` [PATCH v5 net-next 11/15] af_unix: Avoid Tarjan's algorithm if unnecessary Kuniyuki Iwashima
2024-03-25 20:24 ` [PATCH v5 net-next 12/15] af_unix: Assign a unique index to SCC Kuniyuki Iwashima
2024-03-25 20:24 ` [PATCH v5 net-next 13/15] af_unix: Detect dead SCC Kuniyuki Iwashima
2024-03-25 20:24 ` [PATCH v5 net-next 14/15] af_unix: Replace garbage collection algorithm Kuniyuki Iwashima
2024-03-25 20:24 ` [PATCH v5 net-next 15/15] selftest: af_unix: Test GC for SCM_RIGHTS Kuniyuki Iwashima
2024-03-29  9:55 ` [PATCH v5 net-next 00/15] af_unix: Rework GC Paolo Abeni
2024-03-29 15:50 ` patchwork-bot+netdevbpf

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240325202425.60930-9-kuniyu@amazon.com \
    --to=kuniyu@amazon.com \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=kuba@kernel.org \
    --cc=kuni1840@gmail.com \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.