All of lore.kernel.org
 help / color / mirror / Atom feed
From: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
To: "David S. Miller" <davem@davemloft.net>,
	Jakub Kicinski <kuba@kernel.org>
Cc: Kuniyuki Iwashima <kuniyu@amazon.co.jp>,
	Kuniyuki Iwashima <kuni1840@gmail.com>,
	Benjamin Herrenschmidt <benh@amazon.com>,
	<netdev@vger.kernel.org>
Subject: [PATCH net-next 13/13] af_unix: Relax race in unix_autobind().
Date: Sat, 6 Nov 2021 18:17:12 +0900	[thread overview]
Message-ID: <20211106091712.15206-14-kuniyu@amazon.co.jp> (raw)
In-Reply-To: <20211106091712.15206-1-kuniyu@amazon.co.jp>

When we bind an AF_UNIX socket without a name specified, the kernel selects
an available one from 0x00000 to 0xFFFFF.  unix_autobind() starts searching
from a number in the 'static' variable and increments it after acquiring
two locks.

If multiple processes try autobind, they obtain the same lock and check if
a socket in the hash list has the same name.  If not, one process uses it,
and all except one end up retrying the _next_ number (actually not, it may
be incremented by the other processes).  The more we autobind sockets in
parallel, the longer the latency gets.  We can avoid such a race by
searching for a name from a random number.

These show latency in unix_autobind() while 64 CPUs are simultaneously
autobind-ing 1024 sockets for each.

  Without this patch:

     usec          : count     distribution
        0          : 1176     |***                                     |
        2          : 3655     |***********                             |
        4          : 4094     |*************                           |
        6          : 3831     |************                            |
        8          : 3829     |************                            |
        10         : 3844     |************                            |
        12         : 3638     |***********                             |
        14         : 2992     |*********                               |
        16         : 2485     |*******                                 |
        18         : 2230     |*******                                 |
        20         : 2095     |******                                  |
        22         : 1853     |*****                                   |
        24         : 1827     |*****                                   |
        26         : 1677     |*****                                   |
        28         : 1473     |****                                    |
        30         : 1573     |*****                                   |
        32         : 1417     |****                                    |
        34         : 1385     |****                                    |
        36         : 1345     |****                                    |
        38         : 1344     |****                                    |
        40         : 1200     |***                                     |

  With this patch:

     usec          : count     distribution
        0          : 1855     |******                                  |
        2          : 6464     |*********************                   |
        4          : 9936     |********************************        |
        6          : 12107    |****************************************|
        8          : 10441    |**********************************      |
        10         : 7264     |***********************                 |
        12         : 4254     |**************                          |
        14         : 2538     |********                                |
        16         : 1596     |*****                                   |
        18         : 1088     |***                                     |
        20         : 800      |**                                      |
        22         : 670      |**                                      |
        24         : 601      |*                                       |
        26         : 562      |*                                       |
        28         : 525      |*                                       |
        30         : 446      |*                                       |
        32         : 378      |*                                       |
        34         : 337      |*                                       |
        36         : 317      |*                                       |
        38         : 314      |*                                       |
        40         : 298      |                                        |

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
---
 net/unix/af_unix.c | 21 +++++++++++----------
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 643f0358bf7a..55d570b23475 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -1075,8 +1075,7 @@ static int unix_autobind(struct sock *sk)
 	unsigned int new_hash, old_hash = sk->sk_hash;
 	struct unix_sock *u = unix_sk(sk);
 	struct unix_address *addr;
-	unsigned int retries = 0;
-	static u32 ordernum = 1;
+	u32 initnum, ordernum;
 	int err;
 
 	err = mutex_lock_interruptible(&u->bindlock);
@@ -1091,31 +1090,33 @@ static int unix_autobind(struct sock *sk)
 	if (!addr)
 		goto out;
 
+	addr->len = offsetof(struct sockaddr_un, sun_path) + 6;
 	addr->name->sun_family = AF_UNIX;
 	refcount_set(&addr->refcnt, 1);
 
+	initnum = ordernum = prandom_u32();
 retry:
-	addr->len = sprintf(addr->name->sun_path + 1, "%05x", ordernum) +
-		offsetof(struct sockaddr_un, sun_path) + 1;
+	ordernum = (ordernum + 1) & 0xFFFFF;
+	sprintf(addr->name->sun_path + 1, "%05x", ordernum);
 
 	new_hash = unix_abstract_hash(addr->name, addr->len, sk->sk_type);
 	unix_table_double_lock(old_hash, new_hash);
-	ordernum = (ordernum+1)&0xFFFFF;
 
 	if (__unix_find_socket_byname(sock_net(sk), addr->name, addr->len, new_hash)) {
 		unix_table_double_unlock(old_hash, new_hash);
 
-		/*
-		 * __unix_find_socket_byname() may take long time if many names
+		/* __unix_find_socket_byname() may take long time if many names
 		 * are already in use.
 		 */
 		cond_resched();
-		/* Give up if all names seems to be in use. */
-		if (retries++ == 0xFFFFF) {
+
+		if (ordernum == initnum) {
+			/* Give up if all names seems to be in use. */
 			err = -ENOSPC;
-			kfree(addr);
+			unix_release_addr(addr);
 			goto out;
 		}
+
 		goto retry;
 	}
 
-- 
2.30.2


  parent reply	other threads:[~2021-11-06  9:20 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-11-06  9:16 [PATCH net-next 00/13] af_unix: Replace unix_table_lock with per-hash locks Kuniyuki Iwashima
2021-11-06  9:17 ` [PATCH net-next 01/13] af_unix: Use offsetof() instead of sizeof() Kuniyuki Iwashima
2021-11-06  9:17 ` [PATCH net-next 02/13] af_unix: Pass struct sock to unix_autobind() Kuniyuki Iwashima
2021-11-06  9:17 ` [PATCH net-next 03/13] af_unix: Factorise unix_find_other() based on address types Kuniyuki Iwashima
2021-11-06  9:17 ` [PATCH net-next 04/13] af_unix: Return an error as a pointer in unix_find_other() Kuniyuki Iwashima
2021-11-06  9:17 ` [PATCH net-next 05/13] af_unix: Cut unix_validate_addr() out of unix_mkname() Kuniyuki Iwashima
2021-11-06  9:17 ` [PATCH net-next 06/13] af_unix: Copy unix_mkname() into unix_find_(bsd|abstract)() Kuniyuki Iwashima
2021-11-06  9:17 ` [PATCH net-next 07/13] af_unix: Remove unix_mkname() Kuniyuki Iwashima
2021-11-06  9:17 ` [PATCH net-next 08/13] af_unix: Allocate unix_address in unix_bind_(bsd|abstract)() Kuniyuki Iwashima
2021-11-06  9:17 ` [PATCH net-next 09/13] af_unix: Remove UNIX_ABSTRACT() macro and test sun_path[0] instead Kuniyuki Iwashima
2021-11-06  9:17 ` [PATCH net-next 10/13] af_unix: Add helpers to calculate hashes Kuniyuki Iwashima
2021-11-06  9:17 ` [PATCH net-next 11/13] af_unix: Save hash in sk_hash Kuniyuki Iwashima
2021-11-06  9:17 ` [PATCH net-next 12/13] af_unix: Replace the big lock with small locks Kuniyuki Iwashima
2021-11-10  6:39   ` [af_unix] 95e381b609: WARNING:possible_recursive_locking_detected kernel test robot
2021-11-10  6:39     ` kernel test robot
2021-11-06  9:17 ` Kuniyuki Iwashima [this message]
2021-11-08 23:10   ` [PATCH net-next 13/13] af_unix: Relax race in unix_autobind() Eric Dumazet
2021-11-09  2:24     ` Kuniyuki Iwashima

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20211106091712.15206-14-kuniyu@amazon.co.jp \
    --to=kuniyu@amazon.co.jp \
    --cc=benh@amazon.com \
    --cc=davem@davemloft.net \
    --cc=kuba@kernel.org \
    --cc=kuni1840@gmail.com \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.