linux-kernel.vger.kernel.org archive mirror
* [PATCH] fix race in AF_UNIX
@ 2007-06-02 21:50 Miklos Szeredi
  2007-06-02 22:11 ` Arnaldo Carvalho de Melo
                   ` (2 more replies)
  0 siblings, 3 replies; 47+ messages in thread
From: Miklos Szeredi @ 2007-06-02 21:50 UTC (permalink / raw)
  To: akpm; +Cc: netdev, linux-kernel

From: Miklos Szeredi <mszeredi@suse.cz>

A recv() on an AF_UNIX, SOCK_STREAM socket can race with a
send()+close() on the peer, causing recv() to return zero, even though
the sent data should be received.

This happens if the send() and the close() are performed between
skb_dequeue() and checking sk->sk_shutdown in unix_stream_recvmsg():

process A  skb_dequeue() returns NULL, there's no data in the socket queue
process B  new data is inserted onto the queue by unix_stream_sendmsg()
process B  sk->sk_shutdown is set to SHUTDOWN_MASK by unix_release_sock()
process A  sk->sk_shutdown is checked, unix_stream_recvmsg() returns zero

I'm surprised nobody noticed this; it's not hard to trigger.  Maybe
it's just (un)luck with the timing.

It's possible to work around this bug in userspace, by retrying the
recv() once in case of a zero return value.
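
For illustration, the workaround amounts to something like this
hypothetical wrapper (userspace sketch, not part of the patch):

	#include <sys/types.h>
	#include <sys/socket.h>

	/* Retry recv() once on a zero return, in case the zero came
	 * from the race described above rather than from a real EOF.
	 * After a genuine EOF the second recv() just returns zero
	 * again, so the retry is harmless. */
	static ssize_t recv_retry_once(int fd, void *buf, size_t len, int flags)
	{
		ssize_t n = recv(fd, buf, len, flags);

		if (n == 0)
			n = recv(fd, buf, len, flags);
		return n;
	}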

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
---

Index: linux-2.6.22-rc2/net/unix/af_unix.c
===================================================================
--- linux-2.6.22-rc2.orig/net/unix/af_unix.c	2007-06-02 23:45:47.000000000 +0200
+++ linux-2.6.22-rc2/net/unix/af_unix.c	2007-06-02 23:45:49.000000000 +0200
@@ -1711,20 +1711,23 @@ static int unix_stream_recvmsg(struct ki
 		int chunk;
 		struct sk_buff *skb;
 
+		unix_state_rlock(sk);
 		skb = skb_dequeue(&sk->sk_receive_queue);
 		if (skb==NULL)
 		{
 			if (copied >= target)
-				break;
+				goto unlock;
 
 			/*
 			 *	POSIX 1003.1g mandates this order.
 			 */
 
 			if ((err = sock_error(sk)) != 0)
-				break;
+				goto unlock;
 			if (sk->sk_shutdown & RCV_SHUTDOWN)
-				break;
+				goto unlock;
+
+			unix_state_runlock(sk);
 			err = -EAGAIN;
 			if (!timeo)
 				break;
@@ -1738,7 +1741,11 @@ static int unix_stream_recvmsg(struct ki
 			}
 			mutex_lock(&u->readlock);
 			continue;
+ unlock:
+			unix_state_runlock(sk);
+			break;
 		}
+		unix_state_runlock(sk);
 
 		if (check_creds) {
 			/* Never glue messages from different writers */


* Re: [PATCH] fix race in AF_UNIX
  2007-06-02 21:50 [PATCH] fix race in AF_UNIX Miklos Szeredi
@ 2007-06-02 22:11 ` Arnaldo Carvalho de Melo
  2007-06-04  9:45 ` Miklos Szeredi
  2007-06-04  9:53 ` Miklos Szeredi
  2 siblings, 0 replies; 47+ messages in thread
From: Arnaldo Carvalho de Melo @ 2007-06-02 22:11 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: akpm, netdev, linux-kernel

On 6/2/07, Miklos Szeredi <miklos@szeredi.hu> wrote:
> From: Miklos Szeredi <mszeredi@suse.cz>
>
> A recv() on an AF_UNIX, SOCK_STREAM socket can race with a
> send()+close() on the peer, causing recv() to return zero, even though
> the sent data should be received.
>
> This happens if the send() and the close() are performed between
> skb_dequeue() and checking sk->sk_shutdown in unix_stream_recvmsg():
>
> process A  skb_dequeue() returns NULL, there's no data in the socket queue
> process B  new data is inserted onto the queue by unix_stream_sendmsg()
> process B  sk->sk_shutdown is set to SHUTDOWN_MASK by unix_release_sock()
> process A  sk->sk_shutdown is checked, unix_stream_recvmsg() returns zero
>
> I'm surprised nobody noticed this; it's not hard to trigger.  Maybe
> it's just (un)luck with the timing.
>
> It's possible to work around this bug in userspace, by retrying the
> recv() once in case of a zero return value.
>
> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
> ---
>
> Index: linux-2.6.22-rc2/net/unix/af_unix.c
> ===================================================================
> --- linux-2.6.22-rc2.orig/net/unix/af_unix.c    2007-06-02 23:45:47.000000000 +0200
> +++ linux-2.6.22-rc2/net/unix/af_unix.c 2007-06-02 23:45:49.000000000 +0200
> @@ -1711,20 +1711,23 @@ static int unix_stream_recvmsg(struct ki
>                 int chunk;
>                 struct sk_buff *skb;
>
> +               unix_state_rlock(sk);

this function doesn't exist anymore, see:

http://git.kernel.org/?p=linux/kernel/git/davem/net-2.6.git;a=commitdiff;h=e97b6936b03dd0a62991f361c048cca38ac00198

There is also another AF_UNIX patch, also related to a race.

- Arnaldo


* Re: [PATCH] fix race in AF_UNIX
  2007-06-02 21:50 [PATCH] fix race in AF_UNIX Miklos Szeredi
  2007-06-02 22:11 ` Arnaldo Carvalho de Melo
@ 2007-06-04  9:45 ` Miklos Szeredi
  2007-06-05  7:02   ` David Miller
  2007-06-04  9:53 ` Miklos Szeredi
  2 siblings, 1 reply; 47+ messages in thread
From: Miklos Szeredi @ 2007-06-04  9:45 UTC (permalink / raw)
  To: netdev; +Cc: linux-kernel

> A recv() on an AF_UNIX, SOCK_STREAM socket can race with a
> send()+close() on the peer, causing recv() to return zero, even though
> the sent data should be received.
> 
> This happens if the send() and the close() are performed between
> skb_dequeue() and checking sk->sk_shutdown in unix_stream_recvmsg():
>
> process A  skb_dequeue() returns NULL, there's no data in the socket queue
> process B  new data is inserted onto the queue by unix_stream_sendmsg()
> process B  sk->sk_shutdown is set to SHUTDOWN_MASK by unix_release_sock()
> process A  sk->sk_shutdown is checked, unix_stream_recvmsg() returns zero

This is only part of the story.  It turns out there are other races
involving the garbage collector that can throw away perfectly good
packets with AF_UNIX sockets in them.

The problems arise when a socket goes from installed to in-flight or
vice versa during garbage collection.  Since gc is done with a
spinlock held, this only shows up on SMP.
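
To make the terminology concrete: a socket is "in-flight" while its
fd sits in a queued SCM_RIGHTS message, and "installed" once it is
reachable from a file descriptor.  A minimal userspace sketch of how
a socket gets in-flight (hypothetical helper, error handling
omitted):

	#include <string.h>
	#include <sys/types.h>
	#include <sys/socket.h>
	#include <sys/uio.h>

	/* Queue fd_to_pass on an AF_UNIX socket: from this sendmsg()
	 * until the peer's recvmsg() installs the fd, the passed
	 * socket is "in flight". */
	static ssize_t send_fd(int sock, int fd_to_pass)
	{
		char dummy = 'x';
		struct iovec iov = { &dummy, 1 };
		union {
			struct cmsghdr cm;
			char control[CMSG_SPACE(sizeof(int))];
		} u;
		struct msghdr msg;
		struct cmsghdr *cmsg;

		memset(&msg, 0, sizeof(msg));
		msg.msg_iov = &iov;
		msg.msg_iovlen = 1;
		msg.msg_control = u.control;
		msg.msg_controllen = sizeof(u.control);

		cmsg = CMSG_FIRSTHDR(&msg);
		cmsg->cmsg_level = SOL_SOCKET;
		cmsg->cmsg_type = SCM_RIGHTS;
		cmsg->cmsg_len = CMSG_LEN(sizeof(int));
		memcpy(CMSG_DATA(cmsg), &fd_to_pass, sizeof(int));

		return sendmsg(sock, &msg, 0);
	}

The receiver's recvmsg() dequeues the skb and then installs the fd;
it's exactly that transition the gc can race with.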

The following patch fixes it for me, but it's possibly the wrong
approach.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
---

Index: linux-2.6.22-rc2/net/unix/garbage.c
===================================================================
--- linux-2.6.22-rc2.orig/net/unix/garbage.c	2007-06-03 23:58:11.000000000 +0200
+++ linux-2.6.22-rc2/net/unix/garbage.c	2007-06-04 11:39:42.000000000 +0200
@@ -90,6 +90,7 @@
 static struct sock *gc_current = GC_HEAD; /* stack of objects to mark */
 
 atomic_t unix_tot_inflight = ATOMIC_INIT(0);
+DECLARE_RWSEM(unix_gc_sem);
 
 
 static struct sock *unix_get_socket(struct file *filp)
@@ -169,7 +170,7 @@ static void maybe_unmark_and_push(struct
 
 void unix_gc(void)
 {
-	static DEFINE_MUTEX(unix_gc_sem);
+	static DEFINE_MUTEX(unix_gc_local_lock);
 	int i;
 	struct sock *s;
 	struct sk_buff_head hitlist;
@@ -179,9 +180,22 @@ void unix_gc(void)
 	 *	Avoid a recursive GC.
 	 */
 
-	if (!mutex_trylock(&unix_gc_sem))
+	if (!mutex_trylock(&unix_gc_local_lock))
 		return;
 
+
+	/*
+	 * unix_gc_sem protects against sockets going from in-flight to
+	 * installed
+	 *
+	 * Can't sleep on this, because skb_recv_datagram could be
+	 * waiting for a packet that is to be sent by the thread which
+	 * invoked the gc
+	 */
+	if (!down_write_trylock(&unix_gc_sem)) {
+		mutex_unlock(&unix_gc_local_lock);
+		return;
+	}
 	spin_lock(&unix_table_lock);
 
 	forall_unix_sockets(i, s)
@@ -207,8 +221,6 @@ void unix_gc(void)
 
 	forall_unix_sockets(i, s)
 	{
-		int open_count = 0;
-
 		/*
 		 *	If all instances of the descriptor are not
 		 *	in flight we are in use.
@@ -218,10 +230,20 @@ void unix_gc(void)
 		 *	In this case (see unix_create1()) we set artificial
 		 *	negative inflight counter to close race window.
 		 *	It is trick of course and dirty one.
+		 *
+		 *	Get the inflight counter first, then the open
+		 *	counter.  This avoids problems if racing with
+		 *	sendmsg
+		 *
+		 *	If just created socket is not yet attached to
+		 *	a file descriptor, assume open_count of 1
 		 */
+		int inflight_count = atomic_read(&unix_sk(s)->inflight);
+		int open_count = 1;
+
 		if (s->sk_socket && s->sk_socket->file)
 			open_count = file_count(s->sk_socket->file);
-		if (open_count > atomic_read(&unix_sk(s)->inflight))
+		if (open_count > inflight_count)
 			maybe_unmark_and_push(s);
 	}
 
@@ -302,11 +324,12 @@ void unix_gc(void)
 		u->gc_tree = GC_ORPHAN;
 	}
 	spin_unlock(&unix_table_lock);
+	up_write(&unix_gc_sem);
 
 	/*
 	 *	Here we are. Hitlist is filled. Die.
 	 */
 
 	__skb_queue_purge(&hitlist);
-	mutex_unlock(&unix_gc_sem);
+	mutex_unlock(&unix_gc_local_lock);
 }
Index: linux-2.6.22-rc2/include/net/af_unix.h
===================================================================
--- linux-2.6.22-rc2.orig/include/net/af_unix.h	2007-04-26 05:08:32.000000000 +0200
+++ linux-2.6.22-rc2/include/net/af_unix.h	2007-06-04 09:13:56.000000000 +0200
@@ -14,6 +14,7 @@ extern void unix_gc(void);
 
 extern struct hlist_head unix_socket_table[UNIX_HASH_SIZE + 1];
 extern spinlock_t unix_table_lock;
+extern struct rw_semaphore unix_gc_sem;
 
 extern atomic_t unix_tot_inflight;
 
Index: linux-2.6.22-rc2/net/unix/af_unix.c
===================================================================
--- linux-2.6.22-rc2.orig/net/unix/af_unix.c	2007-06-03 23:58:11.000000000 +0200
+++ linux-2.6.22-rc2/net/unix/af_unix.c	2007-06-04 11:04:15.000000000 +0200
@@ -1572,6 +1572,7 @@ static int unix_dgram_recvmsg(struct kio
 
 	msg->msg_namelen = 0;
 
+	down_read(&unix_gc_sem);
 	mutex_lock(&u->readlock);
 
 	skb = skb_recv_datagram(sk, flags, noblock, &err);
@@ -1629,6 +1630,7 @@ out_free:
 	skb_free_datagram(sk,skb);
 out_unlock:
 	mutex_unlock(&u->readlock);
+	up_read(&unix_gc_sem);
 out:
 	return err;
 }
@@ -1704,6 +1706,7 @@ static int unix_stream_recvmsg(struct ki
 		memset(&tmp_scm, 0, sizeof(tmp_scm));
 	}
 
+	down_read(&unix_gc_sem);
 	mutex_lock(&u->readlock);
 
 	do
@@ -1732,6 +1735,7 @@ static int unix_stream_recvmsg(struct ki
 			if (!timeo)
 				break;
 			mutex_unlock(&u->readlock);
+			up_read(&unix_gc_sem);
 
 			timeo = unix_stream_data_wait(sk, timeo);
 
@@ -1739,6 +1743,7 @@ static int unix_stream_recvmsg(struct ki
 				err = sock_intr_errno(timeo);
 				goto out;
 			}
+			down_read(&unix_gc_sem);
 			mutex_lock(&u->readlock);
 			continue;
  unlock:
@@ -1810,6 +1815,7 @@ static int unix_stream_recvmsg(struct ki
 	} while (size);
 
 	mutex_unlock(&u->readlock);
+	up_read(&unix_gc_sem);
 	scm_recv(sock, msg, siocb->scm, flags);
 out:
 	return copied ? : err;


* Re: [PATCH] fix race in AF_UNIX
  2007-06-02 21:50 [PATCH] fix race in AF_UNIX Miklos Szeredi
  2007-06-02 22:11 ` Arnaldo Carvalho de Melo
  2007-06-04  9:45 ` Miklos Szeredi
@ 2007-06-04  9:53 ` Miklos Szeredi
  2 siblings, 0 replies; 47+ messages in thread
From: Miklos Szeredi @ 2007-06-04  9:53 UTC (permalink / raw)
  To: netdev; +Cc: akpm, linux-kernel

(resend, sorry, fscked up the address list)

> A recv() on an AF_UNIX, SOCK_STREAM socket can race with a
> send()+close() on the peer, causing recv() to return zero, even though
> the sent data should be received.
> 
> This happens if the send() and the close() are performed between
> skb_dequeue() and checking sk->sk_shutdown in unix_stream_recvmsg():
>
> process A  skb_dequeue() returns NULL, there's no data in the socket queue
> process B  new data is inserted onto the queue by unix_stream_sendmsg()
> process B  sk->sk_shutdown is set to SHUTDOWN_MASK by unix_release_sock()
> process A  sk->sk_shutdown is checked, unix_stream_recvmsg() returns zero

This is only part of the story.  It turns out there are other races
involving the garbage collector that can throw away perfectly good
packets with AF_UNIX sockets in them.

The problems arise when a socket goes from installed to in-flight or
vice versa during garbage collection.  Since gc is done with a
spinlock held, this only shows up on SMP.

The following patch fixes it for me, but it's possibly the wrong
approach.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
---

Index: linux-2.6.22-rc2/net/unix/garbage.c
===================================================================
--- linux-2.6.22-rc2.orig/net/unix/garbage.c	2007-06-03 23:58:11.000000000 +0200
+++ linux-2.6.22-rc2/net/unix/garbage.c	2007-06-04 11:39:42.000000000 +0200
@@ -90,6 +90,7 @@
 static struct sock *gc_current = GC_HEAD; /* stack of objects to mark */
 
 atomic_t unix_tot_inflight = ATOMIC_INIT(0);
+DECLARE_RWSEM(unix_gc_sem);
 
 
 static struct sock *unix_get_socket(struct file *filp)
@@ -169,7 +170,7 @@ static void maybe_unmark_and_push(struct
 
 void unix_gc(void)
 {
-	static DEFINE_MUTEX(unix_gc_sem);
+	static DEFINE_MUTEX(unix_gc_local_lock);
 	int i;
 	struct sock *s;
 	struct sk_buff_head hitlist;
@@ -179,9 +180,22 @@ void unix_gc(void)
 	 *	Avoid a recursive GC.
 	 */
 
-	if (!mutex_trylock(&unix_gc_sem))
+	if (!mutex_trylock(&unix_gc_local_lock))
 		return;
 
+
+	/*
+	 * unix_gc_sem protects against sockets going from in-flight to
+	 * installed
+	 *
+	 * Can't sleep on this, because skb_recv_datagram could be
+	 * waiting for a packet that is to be sent by the thread which
+	 * invoked the gc
+	 */
+	if (!down_write_trylock(&unix_gc_sem)) {
+		mutex_unlock(&unix_gc_local_lock);
+		return;
+	}
 	spin_lock(&unix_table_lock);
 
 	forall_unix_sockets(i, s)
@@ -207,8 +221,6 @@ void unix_gc(void)
 
 	forall_unix_sockets(i, s)
 	{
-		int open_count = 0;
-
 		/*
 		 *	If all instances of the descriptor are not
 		 *	in flight we are in use.
@@ -218,10 +230,20 @@ void unix_gc(void)
 		 *	In this case (see unix_create1()) we set artificial
 		 *	negative inflight counter to close race window.
 		 *	It is trick of course and dirty one.
+		 *
+		 *	Get the inflight counter first, then the open
+		 *	counter.  This avoids problems if racing with
+		 *	sendmsg
+		 *
+		 *	If just created socket is not yet attached to
+		 *	a file descriptor, assume open_count of 1
 		 */
+		int inflight_count = atomic_read(&unix_sk(s)->inflight);
+		int open_count = 1;
+
 		if (s->sk_socket && s->sk_socket->file)
 			open_count = file_count(s->sk_socket->file);
-		if (open_count > atomic_read(&unix_sk(s)->inflight))
+		if (open_count > inflight_count)
 			maybe_unmark_and_push(s);
 	}
 
@@ -302,11 +324,12 @@ void unix_gc(void)
 		u->gc_tree = GC_ORPHAN;
 	}
 	spin_unlock(&unix_table_lock);
+	up_write(&unix_gc_sem);
 
 	/*
 	 *	Here we are. Hitlist is filled. Die.
 	 */
 
 	__skb_queue_purge(&hitlist);
-	mutex_unlock(&unix_gc_sem);
+	mutex_unlock(&unix_gc_local_lock);
 }
Index: linux-2.6.22-rc2/include/net/af_unix.h
===================================================================
--- linux-2.6.22-rc2.orig/include/net/af_unix.h	2007-04-26 05:08:32.000000000 +0200
+++ linux-2.6.22-rc2/include/net/af_unix.h	2007-06-04 09:13:56.000000000 +0200
@@ -14,6 +14,7 @@ extern void unix_gc(void);
 
 extern struct hlist_head unix_socket_table[UNIX_HASH_SIZE + 1];
 extern spinlock_t unix_table_lock;
+extern struct rw_semaphore unix_gc_sem;
 
 extern atomic_t unix_tot_inflight;
 
Index: linux-2.6.22-rc2/net/unix/af_unix.c
===================================================================
--- linux-2.6.22-rc2.orig/net/unix/af_unix.c	2007-06-03 23:58:11.000000000 +0200
+++ linux-2.6.22-rc2/net/unix/af_unix.c	2007-06-04 11:04:15.000000000 +0200
@@ -1572,6 +1572,7 @@ static int unix_dgram_recvmsg(struct kio
 
 	msg->msg_namelen = 0;
 
+	down_read(&unix_gc_sem);
 	mutex_lock(&u->readlock);
 
 	skb = skb_recv_datagram(sk, flags, noblock, &err);
@@ -1629,6 +1630,7 @@ out_free:
 	skb_free_datagram(sk,skb);
 out_unlock:
 	mutex_unlock(&u->readlock);
+	up_read(&unix_gc_sem);
 out:
 	return err;
 }
@@ -1704,6 +1706,7 @@ static int unix_stream_recvmsg(struct ki
 		memset(&tmp_scm, 0, sizeof(tmp_scm));
 	}
 
+	down_read(&unix_gc_sem);
 	mutex_lock(&u->readlock);
 
 	do
@@ -1732,6 +1735,7 @@ static int unix_stream_recvmsg(struct ki
 			if (!timeo)
 				break;
 			mutex_unlock(&u->readlock);
+			up_read(&unix_gc_sem);
 
 			timeo = unix_stream_data_wait(sk, timeo);
 
@@ -1739,6 +1743,7 @@ static int unix_stream_recvmsg(struct ki
 				err = sock_intr_errno(timeo);
 				goto out;
 			}
+			down_read(&unix_gc_sem);
 			mutex_lock(&u->readlock);
 			continue;
  unlock:
@@ -1810,6 +1815,7 @@ static int unix_stream_recvmsg(struct ki
 	} while (size);
 
 	mutex_unlock(&u->readlock);
+	up_read(&unix_gc_sem);
 	scm_recv(sock, msg, siocb->scm, flags);
 out:
 	return copied ? : err;



* Re: [PATCH] fix race in AF_UNIX
  2007-06-04  9:45 ` Miklos Szeredi
@ 2007-06-05  7:02   ` David Miller
  2007-06-05  7:42     ` Miklos Szeredi
  2007-06-06  0:31     ` David Miller
  0 siblings, 2 replies; 47+ messages in thread
From: David Miller @ 2007-06-05  7:02 UTC (permalink / raw)
  To: miklos; +Cc: netdev, linux-kernel

From: Miklos Szeredi <miklos@szeredi.hu>
Date: Mon, 04 Jun 2007 11:45:32 +0200

> > A recv() on an AF_UNIX, SOCK_STREAM socket can race with a
> > send()+close() on the peer, causing recv() to return zero, even though
> > the sent data should be received.
> > 
> > This happens if the send() and the close() are performed between
> > skb_dequeue() and checking sk->sk_shutdown in unix_stream_recvmsg():
> >
> > process A  skb_dequeue() returns NULL, there's no data in the socket queue
> > process B  new data is inserted onto the queue by unix_stream_sendmsg()
> > process B  sk->sk_shutdown is set to SHUTDOWN_MASK by unix_release_sock()
> > process A  sk->sk_shutdown is checked, unix_stream_recvmsg() returns zero
> 
> This is only part of the story.  It turns out there are other races
> involving the garbage collector that can throw away perfectly good
> packets with AF_UNIX sockets in them.
> 
> The problems arise when a socket goes from installed to in-flight or
> vica versa during garbage collection.  Since gc is done with a
> spinlock held, this only shows up on SMP.
> 
> The following patch fixes it for me, but it's possibly the wrong
> approach.
> 
> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>

I haven't seen a repost of the first patch, which is necessary because
that first patch doesn't apply to the current tree.  Please don't
ignore Arnaldo's feedback like that, or else I'll ignore you just the
same. :-)



* Re: [PATCH] fix race in AF_UNIX
  2007-06-05  7:02   ` David Miller
@ 2007-06-05  7:42     ` Miklos Szeredi
  2007-06-05  7:55       ` David Miller
  2007-06-05 20:11       ` David Miller
  2007-06-06  0:31     ` David Miller
  1 sibling, 2 replies; 47+ messages in thread
From: Miklos Szeredi @ 2007-06-05  7:42 UTC (permalink / raw)
  To: davem; +Cc: netdev, linux-kernel

> > > A recv() on an AF_UNIX, SOCK_STREAM socket can race with a
> > > send()+close() on the peer, causing recv() to return zero, even though
> > > the sent data should be received.
> > > 
> > > This happens if the send() and the close() are performed between
> > > skb_dequeue() and checking sk->sk_shutdown in unix_stream_recvmsg():
> > >
> > > process A  skb_dequeue() returns NULL, there's no data in the socket queue
> > > process B  new data is inserted onto the queue by unix_stream_sendmsg()
> > > process B  sk->sk_shutdown is set to SHUTDOWN_MASK by unix_release_sock()
> > > process A  sk->sk_shutdown is checked, unix_stream_recvmsg() returns zero
> > 
> > This is only part of the story.  It turns out there are other races
> > involving the garbage collector that can throw away perfectly good
> > packets with AF_UNIX sockets in them.
> > 
> > The problems arise when a socket goes from installed to in-flight or
> > vice versa during garbage collection.  Since gc is done with a
> > spinlock held, this only shows up on SMP.
> > 
> > The following patch fixes it for me, but it's possibly the wrong
> > approach.
> > 
> > Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
> 
> I haven't seen a repost of the first patch, which is necessary because
> that first patch doesn't apply to the current tree.  Please don't
> ignore Arnaldo's feedback like that, or else I'll ignore you just the
> same. :-)

I just want to win the "who's laziest?" league.  It would take me
about 5 minutes to get the netdev tree and test compile the change.
Of which 5 seconds would be actually updating the patch.  I
thought it was OK to pass that 5 seconds' worth of hard work to you in
order to save the rest ;)

Anyway here's the updated (but not compile tested) patch.

Thanks,
Miklos

From: Miklos Szeredi <mszeredi@suse.cz>

A recv() on an AF_UNIX, SOCK_STREAM socket can race with a
send()+close() on the peer, causing recv() to return zero, even though
the sent data should be received.

This happens if the send() and the close() are performed between
skb_dequeue() and checking sk->sk_shutdown in unix_stream_recvmsg():

process A  skb_dequeue() returns NULL, there's no data in the socket queue
process B  new data is inserted onto the queue by unix_stream_sendmsg()
process B  sk->sk_shutdown is set to SHUTDOWN_MASK by unix_release_sock()
process A  sk->sk_shutdown is checked, unix_stream_recvmsg() returns zero

I'm surprised nobody noticed this; it's not hard to trigger.  Maybe
it's just (un)luck with the timing.

It's possible to work around this bug in userspace, by retrying the
recv() once in case of a zero return value.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
---

Index: linux-2.6.22-rc2/net/unix/af_unix.c
===================================================================
--- linux-2.6.22-rc2.orig/net/unix/af_unix.c	2007-06-02 23:45:47.000000000 +0200
+++ linux-2.6.22-rc2/net/unix/af_unix.c	2007-06-02 23:45:49.000000000 +0200
@@ -1711,20 +1711,23 @@ static int unix_stream_recvmsg(struct ki
 		int chunk;
 		struct sk_buff *skb;
 
+		unix_state_lock(sk);
 		skb = skb_dequeue(&sk->sk_receive_queue);
 		if (skb==NULL)
 		{
 			if (copied >= target)
-				break;
+				goto unlock;
 
 			/*
 			 *	POSIX 1003.1g mandates this order.
 			 */
 
 			if ((err = sock_error(sk)) != 0)
-				break;
+				goto unlock;
 			if (sk->sk_shutdown & RCV_SHUTDOWN)
-				break;
+				goto unlock;
+
+			unix_state_unlock(sk);
 			err = -EAGAIN;
 			if (!timeo)
 				break;
@@ -1738,7 +1741,11 @@ static int unix_stream_recvmsg(struct ki
 			}
 			mutex_lock(&u->readlock);
 			continue;
+ unlock:
+			unix_state_unlock(sk);
+			break;
 		}
+		unix_state_unlock(sk);
 
 		if (check_creds) {
 			/* Never glue messages from different writers */



* Re: [PATCH] fix race in AF_UNIX
  2007-06-05  7:42     ` Miklos Szeredi
@ 2007-06-05  7:55       ` David Miller
  2007-06-05  8:11         ` Miklos Szeredi
  2007-06-05 20:11       ` David Miller
  1 sibling, 1 reply; 47+ messages in thread
From: David Miller @ 2007-06-05  7:55 UTC (permalink / raw)
  To: miklos; +Cc: netdev, linux-kernel

From: Miklos Szeredi <miklos@szeredi.hu>
Date: Tue, 05 Jun 2007 09:42:41 +0200

> I just want to win the "who's laziest?" league.  It would take me
> about 5 minutes to get the netdev tree and test compile the change.
> Of which 5 seconds would be actually updating the patch.  I
> thought it was OK to pass that 5 seconds' worth of hard work to you in
> order to save the rest ;)

That tradeoff is fine if, in return, you'll do the rest of the
networking subsystem maintainership work I need to do. :-)



* Re: [PATCH] fix race in AF_UNIX
  2007-06-05  7:55       ` David Miller
@ 2007-06-05  8:11         ` Miklos Szeredi
  2007-06-05  8:19           ` David Miller
  0 siblings, 1 reply; 47+ messages in thread
From: Miklos Szeredi @ 2007-06-05  8:11 UTC (permalink / raw)
  To: davem; +Cc: netdev, linux-kernel

> > I just want to win the "who's laziest?" league.  It would take me
> > about 5 minutes to get the netdev tree and test compile the change.
> > Of which 5 seconds would be actually updating the patch.  I
> > thought it was OK to pass that 5 seconds' worth of hard work to you in
> > order to save the rest ;)
> 
> That tradeoff is fine if, in return, you'll do the rest of the
> networking subsystem maintainership work I need to do. :-)

Well, I _did_ save you quite a bit of time by tracking down these
bugs.  That 5 sec of dedication in exchange would have really made me
feel good ;(

Miklos


* Re: [PATCH] fix race in AF_UNIX
  2007-06-05  8:11         ` Miklos Szeredi
@ 2007-06-05  8:19           ` David Miller
  0 siblings, 0 replies; 47+ messages in thread
From: David Miller @ 2007-06-05  8:19 UTC (permalink / raw)
  To: miklos; +Cc: netdev, linux-kernel

From: Miklos Szeredi <miklos@szeredi.hu>
Date: Tue, 05 Jun 2007 10:11:56 +0200

> > > I just want to win the "who's laziest?" league.  It would take me
> > > about 5 minutes to get the netdev tree and test compile the change.
> > > Of which 5 seconds would be actually updating the patch.  I
> > > thought it was OK to pass that 5 seconds' worth of hard work to you in
> > > order to save the rest ;)
> > 
> > That tradeoff is fine if, in return, you'll do the rest of the
> > networking subsystem maintainership work I need to do. :-)
> 
> Well, I _did_ save you quite a bit of time by tracking down these
> bugs.  That 5 sec of dedication in exchange would have really made me
> feel good ;(

The only reason I can process as many patches as I can every day is
that I depend upon the end-nodes (that's you) doing most of the
time-intensive work so that I can concentrate on reviewing patches
for correctness and proper implementation.

In any event, thanks for respinning the patch; it's late here, so
I'll review it tomorrow.



* Re: [PATCH] fix race in AF_UNIX
  2007-06-05  7:42     ` Miklos Szeredi
  2007-06-05  7:55       ` David Miller
@ 2007-06-05 20:11       ` David Miller
  1 sibling, 0 replies; 47+ messages in thread
From: David Miller @ 2007-06-05 20:11 UTC (permalink / raw)
  To: miklos; +Cc: netdev, linux-kernel

From: Miklos Szeredi <miklos@szeredi.hu>
Date: Tue, 05 Jun 2007 09:42:41 +0200

> From: Miklos Szeredi <mszeredi@suse.cz>
> 
> A recv() on an AF_UNIX, SOCK_STREAM socket can race with a
> send()+close() on the peer, causing recv() to return zero, even though
> the sent data should be received.
> 
> This happens if the send() and the close() are performed between
> skb_dequeue() and checking sk->sk_shutdown in unix_stream_recvmsg():
>
> process A  skb_dequeue() returns NULL, there's no data in the socket queue
> process B  new data is inserted onto the queue by unix_stream_sendmsg()
> process B  sk->sk_shutdown is set to SHUTDOWN_MASK by unix_release_sock()
> process A  sk->sk_shutdown is checked, unix_stream_recvmsg() returns zero
> 
> I'm surprised nobody noticed this; it's not hard to trigger.  Maybe
> it's just (un)luck with the timing.
> 
> It's possible to work around this bug in userspace, by retrying the
> recv() once in case of a zero return value.
> 
> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>

Looks good, applied.

Another way to fix this would have been to take the u->readlock
in unix_release() and unix_shutdown(), but that would have hurt
unix_dgram_recvmsg(), which doesn't need to handle this race.


* Re: [PATCH] fix race in AF_UNIX
  2007-06-05  7:02   ` David Miller
  2007-06-05  7:42     ` Miklos Szeredi
@ 2007-06-06  0:31     ` David Miller
  2007-06-06  5:26       ` Miklos Szeredi
  1 sibling, 1 reply; 47+ messages in thread
From: David Miller @ 2007-06-06  0:31 UTC (permalink / raw)
  To: miklos; +Cc: netdev, linux-kernel

From: David Miller <davem@davemloft.net>
Date: Tue, 05 Jun 2007 00:02:47 -0700 (PDT)

> From: Miklos Szeredi <miklos@szeredi.hu>
> Date: Mon, 04 Jun 2007 11:45:32 +0200
> 
> > > A recv() on an AF_UNIX, SOCK_STREAM socket can race with a
> > > send()+close() on the peer, causing recv() to return zero, even though
> > > the sent data should be received.
> > > 
> > This happens if the send() and the close() are performed between
> > skb_dequeue() and checking sk->sk_shutdown in unix_stream_recvmsg():
> >
> > process A  skb_dequeue() returns NULL, there's no data in the socket queue
> > process B  new data is inserted onto the queue by unix_stream_sendmsg()
> > process B  sk->sk_shutdown is set to SHUTDOWN_MASK by unix_release_sock()
> > process A  sk->sk_shutdown is checked, unix_stream_recvmsg() returns zero
> > 
> > This is only part of the story.  It turns out there are other races
> > involving the garbage collector that can throw away perfectly good
> > packets with AF_UNIX sockets in them.
> > 
> > The problems arise when a socket goes from installed to in-flight or
> > vice versa during garbage collection.  Since gc is done with a
> > spinlock held, this only shows up on SMP.
> > 
> > The following patch fixes it for me, but it's possibly the wrong
> > approach.
> > 
> > Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>

Concerning this specific patch I think we need to rethink it
a bit.

Holding a global mutex over recvmsg() calls under AF_UNIX is pretty
much a non-starter; this will kill performance for multi-threaded
apps.

One possible solution is for the garbage collection code to hold the
u->readlock while processing a socket, but be careful about deadlocks.

Anyone want to give that a try?


* Re: [PATCH] fix race in AF_UNIX
  2007-06-06  0:31     ` David Miller
@ 2007-06-06  5:26       ` Miklos Szeredi
  2007-06-06  5:41         ` David Miller
  0 siblings, 1 reply; 47+ messages in thread
From: Miklos Szeredi @ 2007-06-06  5:26 UTC (permalink / raw)
  To: davem; +Cc: netdev, linux-kernel

> > From: Miklos Szeredi <miklos@szeredi.hu>
> > Date: Mon, 04 Jun 2007 11:45:32 +0200
> > 
> > > > A recv() on an AF_UNIX, SOCK_STREAM socket can race with a
> > > > send()+close() on the peer, causing recv() to return zero, even though
> > > > the sent data should be received.
> > > > 
> > > > This happens if the send() and the close() are performed between
> > > > skb_dequeue() and checking sk->sk_shutdown in unix_stream_recvmsg():
> > > >
> > > > process A  skb_dequeue() returns NULL, there's no data in the socket queue
> > > > process B  new data is inserted onto the queue by unix_stream_sendmsg()
> > > > process B  sk->sk_shutdown is set to SHUTDOWN_MASK by unix_release_sock()
> > > > process A  sk->sk_shutdown is checked, unix_stream_recvmsg() returns zero
> > > 
> > > This is only part of the story.  It turns out there are other races
> > > involving the garbage collector that can throw away perfectly good
> > > packets with AF_UNIX sockets in them.
> > > 
> > > The problems arise when a socket goes from installed to in-flight or
> > > vice versa during garbage collection.  Since gc is done with a
> > > spinlock held, this only shows up on SMP.
> > > 
> > > The following patch fixes it for me, but it's possibly the wrong
> > > approach.
> > > 
> > > Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
> 
> Concerning this specific patch I think we need to rethink it
> a bit.
> 
> Holding a global mutex over recvmsg() calls under AF_UNIX is pretty
> much a non-starter; this will kill performance for multi-threaded
> apps.

That's an rwsem held for read.  It's held for write in unix_gc() only
for a short duration, and unix_gc() should only rarely be called.  So
I don't think there's any performance problem here.

> 
> One possible solution is for the garbage collection code to hold the
> u->readlock while processing a socket, but be careful about deadlocks.

That would have exactly the same effect.  Only the code would be more
complicated.

Miklos


* Re: [PATCH] fix race in AF_UNIX
  2007-06-06  5:26       ` Miklos Szeredi
@ 2007-06-06  5:41         ` David Miller
  2007-06-06  8:08           ` Miklos Szeredi
  0 siblings, 1 reply; 47+ messages in thread
From: David Miller @ 2007-06-06  5:41 UTC (permalink / raw)
  To: miklos; +Cc: netdev, linux-kernel

From: Miklos Szeredi <miklos@szeredi.hu>
Date: Wed, 06 Jun 2007 07:26:52 +0200

> > Holding a global mutex over recvmsg() calls under AF_UNIX is pretty
> > much a non-starter; this will kill performance for multi-threaded
> > apps.
> 
> That's an rwsem held for read.  It's held for write in unix_gc() only
> for a short duration, and unix_gc() should only rarely be called.  So
> I don't think there's any performance problem here.

It pulls a non-local cacheline into the local thread; that's extremely
expensive on SMP.

If everyone starts grabbing this thing during recvmsg() it's going to
become a really hot lock and kill performance, even if it's a read
side lock being taken.

That's why I said we need to investigate solutions involving
u->readlock, which already has to be taken and is local to the socket.


* Re: [PATCH] fix race in AF_UNIX
  2007-06-06  5:41         ` David Miller
@ 2007-06-06  8:08           ` Miklos Szeredi
  2007-06-06  8:12             ` David Miller
  2007-06-08  1:47             ` David Miller
  0 siblings, 2 replies; 47+ messages in thread
From: Miklos Szeredi @ 2007-06-06  8:08 UTC (permalink / raw)
  To: davem; +Cc: netdev, linux-kernel

> > > Holding a global mutex over recvmsg() calls under AF_UNIX is pretty
> > > much a non-starter; this will kill performance for multi-threaded
> > > apps.
> > 
> > That's an rwsem held for read.  It's held for write in unix_gc() only
> > for a short duration, and unix_gc() should only rarely be called.  So
> > I don't think there's any performance problem here.
> 
> > It pulls a non-local cacheline into the local thread; that's extremely
> expensive on SMP.

OK, here's an updated patch that uses ->readlock and passes my
testing.

----
From: Miklos Szeredi <mszeredi@suse.cz>

There are races involving the garbage collector that can throw away
perfectly good packets with AF_UNIX sockets in them.

The problems arise when a socket goes from installed to in-flight or
vice versa during garbage collection.  Since gc is done with a
spinlock held, this only shows up on SMP.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
---

Index: linux-2.6.22-rc2/net/unix/garbage.c
===================================================================
--- linux-2.6.22-rc2.orig/net/unix/garbage.c	2007-06-03 23:58:11.000000000 +0200
+++ linux-2.6.22-rc2/net/unix/garbage.c	2007-06-06 09:48:36.000000000 +0200
@@ -186,7 +186,21 @@ void unix_gc(void)
 
 	forall_unix_sockets(i, s)
 	{
-		unix_sk(s)->gc_tree = GC_ORPHAN;
+		struct unix_sock *u = unix_sk(s);
+
+		u->gc_tree = GC_ORPHAN;
+
+		/*
+		 * Hold ->readlock to protect against sockets going from
+		 * in-flight to installed
+		 *
+		 * Can't sleep on this, because
+		 *   a) we are under spinlock
+		 *   b) skb_recv_datagram() could be waiting for a packet that
+		 *      is to be sent by this thread
+		 */
+		if (!mutex_trylock(&u->readlock))
+			goto lock_failed;
 	}
 	/*
 	 *	Everything is now marked
@@ -207,8 +221,6 @@ void unix_gc(void)
 
 	forall_unix_sockets(i, s)
 	{
-		int open_count = 0;
-
 		/*
 		 *	If all instances of the descriptor are not
 		 *	in flight we are in use.
@@ -218,10 +230,20 @@ void unix_gc(void)
 		 *	In this case (see unix_create1()) we set artificial
 		 *	negative inflight counter to close race window.
 		 *	It is trick of course and dirty one.
+		 *
+		 *	Get the inflight counter first, then the open
+		 *	counter.  This avoids problems if racing with
+		 *	sendmsg
+		 *
+		 *	If just created socket is not yet attached to
+		 *	a file descriptor, assume open_count of 1
 		 */
+		int inflight_count = atomic_read(&unix_sk(s)->inflight);
+		int open_count = 1;
+
 		if (s->sk_socket && s->sk_socket->file)
 			open_count = file_count(s->sk_socket->file);
-		if (open_count > atomic_read(&unix_sk(s)->inflight))
+		if (open_count > inflight_count)
 			maybe_unmark_and_push(s);
 	}
 
@@ -300,6 +322,7 @@ void unix_gc(void)
 			spin_unlock(&s->sk_receive_queue.lock);
 		}
 		u->gc_tree = GC_ORPHAN;
+		mutex_unlock(&u->readlock);
 	}
 	spin_unlock(&unix_table_lock);
 
@@ -309,4 +332,19 @@ void unix_gc(void)
 
 	__skb_queue_purge(&hitlist);
 	mutex_unlock(&unix_gc_sem);
+	return;
+
+ lock_failed:
+	{
+		struct sock *s1;
+		forall_unix_sockets(i, s1) {
+			if (s1 == s) {
+				spin_unlock(&unix_table_lock);
+				mutex_unlock(&unix_gc_sem);
+				return;
+			}
+			mutex_unlock(&unix_sk(s1)->readlock);
+		}
+		BUG();
+	}
 }



* Re: [PATCH] fix race in AF_UNIX
  2007-06-06  8:08           ` Miklos Szeredi
@ 2007-06-06  8:12             ` David Miller
  2007-06-08  1:47             ` David Miller
  1 sibling, 0 replies; 47+ messages in thread
From: David Miller @ 2007-06-06  8:12 UTC (permalink / raw)
  To: miklos; +Cc: netdev, linux-kernel

From: Miklos Szeredi <miklos@szeredi.hu>
Date: Wed, 06 Jun 2007 10:08:29 +0200

> > > > Holding a global mutex over recvmsg() calls under AF_UNIX is pretty
> > > > much a non-starter; this will kill performance for multi-threaded
> > > > apps.
> > > 
> > > That's an rwsem held for read.  It's held for write in unix_gc() only
> > > for a short duration, and unix_gc() should only rarely be called.  So
> > > I don't think there's any performance problem here.
> > 
> > It pulls a non-local cacheline into the local thread; that's extremely
> > expensive on SMP.
> 
> OK, here's an updated patch that uses ->readlock and passes my
> testing.

Thanks a lot, I'll review this as soon as possible unless
someone else beats me to it :)


* Re: [PATCH] fix race in AF_UNIX
  2007-06-06  8:08           ` Miklos Szeredi
  2007-06-06  8:12             ` David Miller
@ 2007-06-08  1:47             ` David Miller
  2007-06-11  9:57               ` Miklos Szeredi
  1 sibling, 1 reply; 47+ messages in thread
From: David Miller @ 2007-06-08  1:47 UTC (permalink / raw)
  To: miklos; +Cc: netdev, linux-kernel

From: Miklos Szeredi <miklos@szeredi.hu>
Date: Wed, 06 Jun 2007 10:08:29 +0200

> There are races involving the garbage collector that can throw away
> perfectly good packets with AF_UNIX sockets in them.
> 
> The problems arise when a socket goes from installed to in-flight or
> vice versa during garbage collection.  Since gc is done with a
> spinlock held, this only shows up on SMP.
> 
> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>

I'm going to hold off on this one for now.

Holding all of the read locks kind of defeats the purpose of using
the per-socket lock.

Can't you just lock purely around the receive queue operation?


* Re: [PATCH] fix race in AF_UNIX
  2007-06-08  1:47             ` David Miller
@ 2007-06-11  9:57               ` Miklos Szeredi
  2007-06-18  7:49                 ` Miklos Szeredi
  2007-06-21 15:18                 ` Eric W. Biederman
  0 siblings, 2 replies; 47+ messages in thread
From: Miklos Szeredi @ 2007-06-11  9:57 UTC (permalink / raw)
  To: davem; +Cc: viro, alan, netdev, linux-kernel

[CC'd Al Viro and Alan Cox, restored patch]

> > There are races involving the garbage collector that can throw away
> > perfectly good packets with AF_UNIX sockets in them.
> > 
> > The problems arise when a socket goes from installed to in-flight or
> > vice versa during garbage collection.  Since gc is done with a
> > spinlock held, this only shows up on SMP.
> > 
> > Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
> 
> I'm going to hold off on this one for now.
> 
> Holding all of the read locks kind of defeats the purpose of using
> the per-socket lock.
> 
> Can't you just lock purely around the receive queue operation?

That's already protected by the receive queue spinlock.  The race,
however, happens _between_ pushing the root set and the marking of the
in-flight but reachable sockets.

If in that space any of the AF_UNIX sockets goes from in-flight to
installed into a file descriptor, the garbage collector can miss it.
If we want to protect against this using unix_sk(s)->readlock, then we
have to hold all of them for the duration of the marking.

Al, Alan, you have more experience with this piece of code.  Do you
have better ideas about how to fix this?

Thanks,
Miklos

> Index: linux-2.6.22-rc2/net/unix/garbage.c
> ===================================================================
> --- linux-2.6.22-rc2.orig/net/unix/garbage.c	2007-06-03 23:58:11.000000000 +0200
> +++ linux-2.6.22-rc2/net/unix/garbage.c	2007-06-06 09:48:36.000000000 +0200
> @@ -186,7 +186,21 @@ void unix_gc(void)
>  
>  	forall_unix_sockets(i, s)
>  	{
> -		unix_sk(s)->gc_tree = GC_ORPHAN;
> +		struct unix_sock *u = unix_sk(s);
> +
> +		u->gc_tree = GC_ORPHAN;
> +
> +		/*
> +		 * Hold ->readlock to protect against sockets going from
> +		 * in-flight to installed
> +		 *
> +		 * Can't sleep on this, because
> +		 *   a) we are under spinlock
> +		 *   b) skb_recv_datagram() could be waiting for a packet that
> +		 *      is to be sent by this thread
> +		 */
> +		if (!mutex_trylock(&u->readlock))
> +			goto lock_failed;
>  	}
>  	/*
>  	 *	Everything is now marked
> @@ -207,8 +221,6 @@ void unix_gc(void)
>  
>  	forall_unix_sockets(i, s)
>  	{
> -		int open_count = 0;
> -
>  		/*
>  		 *	If all instances of the descriptor are not
>  		 *	in flight we are in use.
> @@ -218,10 +230,20 @@ void unix_gc(void)
>  		 *	In this case (see unix_create1()) we set artificial
>  		 *	negative inflight counter to close race window.
>  		 *	It is trick of course and dirty one.
> +		 *
> +		 *	Get the inflight counter first, then the open
> +		 *	counter.  This avoids problems if racing with
> +		 *	sendmsg
> +		 *
> +		 *	If just created socket is not yet attached to
> +		 *	a file descriptor, assume open_count of 1
>  		 */
> +		int inflight_count = atomic_read(&unix_sk(s)->inflight);
> +		int open_count = 1;
> +
>  		if (s->sk_socket && s->sk_socket->file)
>  			open_count = file_count(s->sk_socket->file);
> -		if (open_count > atomic_read(&unix_sk(s)->inflight))
> +		if (open_count > inflight_count)
>  			maybe_unmark_and_push(s);
>  	}
>  
> @@ -300,6 +322,7 @@ void unix_gc(void)
>  			spin_unlock(&s->sk_receive_queue.lock);
>  		}
>  		u->gc_tree = GC_ORPHAN;
> +		mutex_unlock(&u->readlock);
>  	}
>  	spin_unlock(&unix_table_lock);
>  
> @@ -309,4 +332,19 @@ void unix_gc(void)
>  
>  	__skb_queue_purge(&hitlist);
>  	mutex_unlock(&unix_gc_sem);
> +	return;
> +
> + lock_failed:
> +	{
> +		struct sock *s1;
> +		forall_unix_sockets(i, s1) {
> +			if (s1 == s) {
> +				spin_unlock(&unix_table_lock);
> +				mutex_unlock(&unix_gc_sem);
> +				return;
> +			}
> +			mutex_unlock(&unix_sk(s1)->readlock);
> +		}
> +		BUG();
> +	}
>  }
> 


* Re: [PATCH] fix race in AF_UNIX
  2007-06-11  9:57               ` Miklos Szeredi
@ 2007-06-18  7:49                 ` Miklos Szeredi
  2007-06-18  7:57                   ` David Miller
  2007-06-21 15:18                 ` Eric W. Biederman
  1 sibling, 1 reply; 47+ messages in thread
From: Miklos Szeredi @ 2007-06-18  7:49 UTC (permalink / raw)
  To: davem; +Cc: viro, alan, netdev, linux-kernel

Ping Dave,

Since there don't seem to be any new ideas forthcoming, can we
please decide on either one of my two submitted patches?

Thanks,
Miklos

> [CC'd Al Viro and Alan Cox, restored patch]
> 
> > > There are races involving the garbage collector that can throw away
> > > perfectly good packets with AF_UNIX sockets in them.
> > > 
> > > The problems arise when a socket goes from installed to in-flight or
> > > vice versa during garbage collection.  Since gc is done with a
> > > spinlock held, this only shows up on SMP.
> > > 
> > > Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
> > 
> > I'm going to hold off on this one for now.
> > 
> > Holding all of the read locks kind of defeats the purpose of using
> > the per-socket lock.
> > 
> > Can't you just lock purely around the receive queue operation?
> 
> That's already protected by the receive queue spinlock.  The race,
> however, happens _between_ pushing the root set and the marking of the
> in-flight but reachable sockets.
> 
> If in that space any of the AF_UNIX sockets goes from in-flight to
> installed into a file descriptor, the garbage collector can miss it.
> If we want to protect against this using unix_sk(s)->readlock, then we
> have to hold all of them for the duration of the marking.
> 
> Al, Alan, you have more experience with this piece of code.  Do you
> have better ideas about how to fix this?
> 
> Thanks,
> Miklos
> 
> > Index: linux-2.6.22-rc2/net/unix/garbage.c
> > ===================================================================
> > --- linux-2.6.22-rc2.orig/net/unix/garbage.c	2007-06-03 23:58:11.000000000 +0200
> > +++ linux-2.6.22-rc2/net/unix/garbage.c	2007-06-06 09:48:36.000000000 +0200
> > @@ -186,7 +186,21 @@ void unix_gc(void)
> >  
> >  	forall_unix_sockets(i, s)
> >  	{
> > -		unix_sk(s)->gc_tree = GC_ORPHAN;
> > +		struct unix_sock *u = unix_sk(s);
> > +
> > +		u->gc_tree = GC_ORPHAN;
> > +
> > +		/*
> > +		 * Hold ->readlock to protect against sockets going from
> > +		 * in-flight to installed
> > +		 *
> > +		 * Can't sleep on this, because
> > +		 *   a) we are under spinlock
> > +		 *   b) skb_recv_datagram() could be waiting for a packet that
> > +		 *      is to be sent by this thread
> > +		 */
> > +		if (!mutex_trylock(&u->readlock))
> > +			goto lock_failed;
> >  	}
> >  	/*
> >  	 *	Everything is now marked
> > @@ -207,8 +221,6 @@ void unix_gc(void)
> >  
> >  	forall_unix_sockets(i, s)
> >  	{
> > -		int open_count = 0;
> > -
> >  		/*
> >  		 *	If all instances of the descriptor are not
> >  		 *	in flight we are in use.
> > @@ -218,10 +230,20 @@ void unix_gc(void)
> >  		 *	In this case (see unix_create1()) we set artificial
> >  		 *	negative inflight counter to close race window.
> >  		 *	It is trick of course and dirty one.
> > +		 *
> > +		 *	Get the inflight counter first, then the open
> > +		 *	counter.  This avoids problems if racing with
> > +		 *	sendmsg
> > +		 *
> > +		 *	If just created socket is not yet attached to
> > +		 *	a file descriptor, assume open_count of 1
> >  		 */
> > +		int inflight_count = atomic_read(&unix_sk(s)->inflight);
> > +		int open_count = 1;
> > +
> >  		if (s->sk_socket && s->sk_socket->file)
> >  			open_count = file_count(s->sk_socket->file);
> > -		if (open_count > atomic_read(&unix_sk(s)->inflight))
> > +		if (open_count > inflight_count)
> >  			maybe_unmark_and_push(s);
> >  	}
> >  
> > @@ -300,6 +322,7 @@ void unix_gc(void)
> >  			spin_unlock(&s->sk_receive_queue.lock);
> >  		}
> >  		u->gc_tree = GC_ORPHAN;
> > +		mutex_unlock(&u->readlock);
> >  	}
> >  	spin_unlock(&unix_table_lock);
> >  
> > @@ -309,4 +332,19 @@ void unix_gc(void)
> >  
> >  	__skb_queue_purge(&hitlist);
> >  	mutex_unlock(&unix_gc_sem);
> > +	return;
> > +
> > + lock_failed:
> > +	{
> > +		struct sock *s1;
> > +		forall_unix_sockets(i, s1) {
> > +			if (s1 == s) {
> > +				spin_unlock(&unix_table_lock);
> > +				mutex_unlock(&unix_gc_sem);
> > +				return;
> > +			}
> > +			mutex_unlock(&unix_sk(s1)->readlock);
> > +		}
> > +		BUG();
> > +	}
> >  }
> > 


* Re: [PATCH] fix race in AF_UNIX
  2007-06-18  7:49                 ` Miklos Szeredi
@ 2007-06-18  7:57                   ` David Miller
  2007-06-18  8:20                     ` Miklos Szeredi
  0 siblings, 1 reply; 47+ messages in thread
From: David Miller @ 2007-06-18  7:57 UTC (permalink / raw)
  To: miklos; +Cc: viro, alan, netdev, linux-kernel

From: Miklos Szeredi <miklos@szeredi.hu>
Date: Mon, 18 Jun 2007 09:49:32 +0200

> Ping Dave,
> 
> Since there don't seem to be any new ideas forthcoming, can we
> please decide on either one of my two submitted patches?

You just need to be patient; we can't just put a bad patch
in because a better one has not been suggested yet :-)


* Re: [PATCH] fix race in AF_UNIX
  2007-06-18  7:57                   ` David Miller
@ 2007-06-18  8:20                     ` Miklos Szeredi
  2007-06-18  9:18                       ` David Miller
  0 siblings, 1 reply; 47+ messages in thread
From: Miklos Szeredi @ 2007-06-18  8:20 UTC (permalink / raw)
  To: davem; +Cc: viro, alan, netdev, linux-kernel

> From: Miklos Szeredi <miklos@szeredi.hu>
> Date: Mon, 18 Jun 2007 09:49:32 +0200
> 
> > Ping Dave,
> > 
> > Since there don't seem to be any new ideas forthcoming, can we
> > please decide on either one of my two submitted patches?
> 
> You just need to be patient; we can't just put a bad patch
> in because a better one has not been suggested yet :-)

And is anyone working on a better patch?

Those patches aren't "bad" in the correctness sense.  So IMO any one
of them is better than having that bug in there.

Miklos


* Re: [PATCH] fix race in AF_UNIX
  2007-06-18  8:20                     ` Miklos Szeredi
@ 2007-06-18  9:18                       ` David Miller
  2007-06-18  9:29                         ` Miklos Szeredi
  0 siblings, 1 reply; 47+ messages in thread
From: David Miller @ 2007-06-18  9:18 UTC (permalink / raw)
  To: miklos; +Cc: viro, alan, netdev, linux-kernel

From: Miklos Szeredi <miklos@szeredi.hu>
Date: Mon, 18 Jun 2007 10:20:23 +0200

> And is anyone working on a better patch?

I have no idea.

> Those patches aren't "bad" in the correctness sense.  So IMO any one
> of them is better, than having that bug in there.

You're adding a very serious performance regression, which is
about as bad as the bug itself.

It can wait for a more appropriate fix.


* Re: [PATCH] fix race in AF_UNIX
  2007-06-18  9:18                       ` David Miller
@ 2007-06-18  9:29                         ` Miklos Szeredi
  2007-06-18  9:35                           ` David Miller
  2007-06-18 11:47                           ` Alan Cox
  0 siblings, 2 replies; 47+ messages in thread
From: Miklos Szeredi @ 2007-06-18  9:29 UTC (permalink / raw)
  To: davem; +Cc: akpm, viro, alan, netdev, linux-kernel

> > And is anyone working on a better patch?
> 
> I have no idea.
> 
> > Those patches aren't "bad" in the correctness sense.  So IMO any one
> > of them is better than having that bug in there.
> 
> You're adding a very serious performance regression, which is
> about as bad as the bug itself.

No, correctness always trumps performance.  Lost packets on an AF_UNIX
socket are _unacceptable_, and this is definitely not a theoretical
problem.

And BTW my second patch does _not_ have the performance problems you
are arguing about, it's just plain ugly.  But hey, if you can't live
with ugly code, go and fix it.

> It can wait for a more appropriate fix.

Now _please_ be a bit more constructive.

Do you want me to send the patch to Andrew instead?  His attitude
towards bugfixes is rather better ;)

Miklos


* Re: [PATCH] fix race in AF_UNIX
  2007-06-18  9:29                         ` Miklos Szeredi
@ 2007-06-18  9:35                           ` David Miller
  2007-06-18  9:44                             ` Miklos Szeredi
  2007-06-18 11:47                           ` Alan Cox
  1 sibling, 1 reply; 47+ messages in thread
From: David Miller @ 2007-06-18  9:35 UTC (permalink / raw)
  To: miklos; +Cc: akpm, viro, alan, netdev, linux-kernel

From: Miklos Szeredi <miklos@szeredi.hu>
Date: Mon, 18 Jun 2007 11:29:52 +0200

> > > And is anyone working on a better patch?
> > 
> > I have no idea.
> > 
> > > Those patches aren't "bad" in the correctness sense.  So IMO any one
> > > of them is better than having that bug in there.
> > 
> > You're adding a very serious performance regression, which is
> > about as bad as the bug itself.
> 
> No, correctness always trumps performance.

To a point.  There is no black and white in this world.

> Lost packets on an AF_UNIX socket are _unacceptable_, and this is
> definitely not a theoretical problem.

A lot of people will consider having all of their AF_UNIX sockets on
their 64 cpu system just stop when garbage collection runs to be
unacceptable as well.

Secondarily, this bug has been around for years and nobody noticed.
The world will not explode if this bug takes a few more days or
even a week to work out.  Let's do it right instead of ramming
arbitrary turds into the kernel.

> Do you want me to send the patch to Andrew instead?  His attitude
> towards bugfixes is rather better ;)

When I explain the ramifications of your patch to him, I'm pretty
sure he'll agree with me.


* Re: [PATCH] fix race in AF_UNIX
  2007-06-18  9:35                           ` David Miller
@ 2007-06-18  9:44                             ` Miklos Szeredi
  2007-06-18  9:48                               ` David Miller
  2007-06-18 10:32                               ` Thomas Graf
  0 siblings, 2 replies; 47+ messages in thread
From: Miklos Szeredi @ 2007-06-18  9:44 UTC (permalink / raw)
  To: davem; +Cc: akpm, viro, alan, netdev, linux-kernel

> > > > And is anyone working on a better patch?
> > > 
> > > I have no idea.
> > > 
> > > > Those patches aren't "bad" in the correctness sense.  So IMO any one
> > > > of them is better than having that bug in there.
> > > 
> > > You're adding a very serious performance regression, which is
> > > about as bad as the bug itself.
> > 
> > No, correctness always trumps performance.
> 
> To a point.  There is no black and white in this world.
> 
> > Lost packets on an AF_UNIX socket are _unacceptable_, and this is
> > definitely not a theoretical problem.
> 
> A lot of people will consider having all of their AF_UNIX sockets on
> their 64 cpu system just stop when garbage collection runs to be
> unacceptable as well.

Garbage collection only ever happens if the app is sending AF_UNIX
sockets over AF_UNIX sockets.  Which is a rather rare case.  And which
is basically why this bug went unnoticed for so long.

So my second patch only affects the performance of _exactly_ those
apps which might well be bitten by the bug itself.
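
For reference, the trigger condition is just passing unix sockets
through unix sockets; a contrived sketch, using a send_fd()-style
SCM_RIGHTS helper like the one earlier in this thread:

	#include <sys/types.h>
	#include <sys/socket.h>
	#include <unistd.h>

	ssize_t send_fd(int sock, int fd_to_pass);	/* SCM_RIGHTS helper */

	int main(void)
	{
		int sv[2];

		socketpair(AF_UNIX, SOCK_DGRAM, 0, sv);
		send_fd(sv[0], sv[1]);	/* sv[1] lands on its own queue */
		close(sv[0]);
		close(sv[1]);		/* sv[1] is now referenced only by
					 * the skb in its own receive queue:
					 * a cycle that only unix_gc() can
					 * reclaim */
		return 0;
	}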

> Secondarily, this bug has been around for years and nobody noticed.
> The world will not explode if this bug takes a few more days or
> even a week to work out.  Let's do it right instead of ramming
> arbitrary turds into the kernel.

Fine, but just wishing for a bug to get fixed won't accomplish anything.
I've spent a fair amount of time debugging this thing, and I'm out of
ideas.  Really.  So unless somebody steps up to look at this, it won't
_ever_ get fixed.

Miklos


* Re: [PATCH] fix race in AF_UNIX
  2007-06-18  9:44                             ` Miklos Szeredi
@ 2007-06-18  9:48                               ` David Miller
  2007-06-18  9:55                                 ` Miklos Szeredi
  2007-06-18 10:32                               ` Thomas Graf
  1 sibling, 1 reply; 47+ messages in thread
From: David Miller @ 2007-06-18  9:48 UTC (permalink / raw)
  To: miklos; +Cc: akpm, viro, alan, netdev, linux-kernel

From: Miklos Szeredi <miklos@szeredi.hu>
Date: Mon, 18 Jun 2007 11:44:07 +0200

> > Secondarily, this bug has been around for years and nobody noticed.
> > The world will not explode if this bug takes a few more days or
> > even a week to work out.  Let's do it right instead of ramming
> > arbitrary turds into the kernel.
> 
> Fine, but just wishing for a bug to get fixed won't accomplish anything.
> I've spent a fair amount of time debugging this thing, and I'm out of
> ideas.  Really.  So unless somebody steps up to look at this, it won't
> _ever_ get fixed.

Someone just needs to find a way to only lock the socket as it is
being operated upon.

The race you are dealing with is rather simple: the queue check
and the state check need to be done atomically.  The only chore
is to find a way to make that happen in the context of what the
garbage collector is trying to do.

I'm not even convinced that your most recent attempt is deadlock free.
Locking multiple objects the same way all at once like that is
something that needs to be seriously audited.


* Re: [PATCH] fix race in AF_UNIX
  2007-06-18  9:48                               ` David Miller
@ 2007-06-18  9:55                                 ` Miklos Szeredi
  2007-06-18  9:59                                   ` David Miller
  0 siblings, 1 reply; 47+ messages in thread
From: Miklos Szeredi @ 2007-06-18  9:55 UTC (permalink / raw)
  To: davem; +Cc: akpm, viro, alan, netdev, linux-kernel

> > > Secondarily, this bug has been around for years and nobody noticed.
> > > The world will not explode if this bug takes a few more days or
> > > even a week to work out.  Let's do it right instead of ramming
> > > arbitrary turds into the kernel.
> > 
> > Fine, but just wishing a bug to get fixed won't accomplish anything.
> > I've spent a fair amount of time debugging this thing, and I'm out of
> > ideas.  Really.  So unless somebody steps up to look at this, it won't
> > _ever_ get fixed.
> 
> Someone just needs to find a way to only lock the socket as it is
> being operated upon.

No, you are still not getting it.  The problem is that the socket
needs to be locked for the _whole_ of the garbage collection.  This is
because the gc is done in two phases: in the first phase, sockets
which are installed into file descriptors are collected as a "root
set"; then the set is expanded by iterating over everything already in
it and adding whatever is reachable.  If a socket moves from in-flight
to installed _during_ the gc, then it can miss being collected.  So
the socket must be locked for the duration of _both_ phases.
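
To make the two phases concrete, here is a rough sketch of the mark
algorithm (every name in it - for_each_unix_socket(), u->installed,
push()/pop() - is made up; this is the shape of the thing, not the
actual net/unix/garbage.c code):

static void unix_gc_sketch(void)
{
        struct unix_sock *u;

        /* Phase 1: collect the root set - every unix socket that is
         * currently installed in some process's file table. */
        for_each_unix_socket(u)
                if (u->installed)
                        push(u);

        /*
         * The window: a socket received exactly here moves from
         * in-flight to installed after phase 1 has passed it by,
         * so it never gets pushed, and never gets marked.
         */

        /* Phase 2: expand the set, marking everything reachable
         * from the receive queues of already-pushed sockets. */
        while ((u = pop()) != NULL)
                scan_receive_queue_and_push(u);
}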

> The race you are dealing with is rather simple: the queue check
> and the state check need to be done atomically.  The only chore
> is to find a way to make that happen in the context of what the
> garbage collector is trying to do.
> 
> I'm not even convinced that your most recent attempt is deadlock free.
> Locking multiple objects the same way all at once like that is
> something that needs to be seriously audited.

It's doing trylocks and releasing all those locks within the same
spinlocked region; how the f**k can that deadlock?

The only problem I've experienced with this patch (other than being
ugly) is that the multitude of locks it acquires makes the lockdep
code give up.  But I think we can live with that.
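
For reference, the pattern being discussed looks roughly like this
(a sketch only - sk_locks[] and n are made-up names, and this is not
the actual patch):

static void gc_with_all_locks(void)
{
        int i;

retry:
        spin_lock(&unix_table_lock);
        for (i = 0; i < n; i++) {
                if (!spin_trylock(&sk_locks[i])) {
                        /* Roll back everything taken so far; we never
                         * block while holding a partial set of locks,
                         * so this can stall, but not deadlock. */
                        while (i--)
                                spin_unlock(&sk_locks[i]);
                        spin_unlock(&unix_table_lock);
                        cpu_relax();
                        goto retry;
                }
        }

        /* ... both gc phases run here with every lock held ... */

        for (i = 0; i < n; i++)
                spin_unlock(&sk_locks[i]);
        spin_unlock(&unix_table_lock);
}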

Miklos

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH] fix race in AF_UNIX
  2007-06-18  9:55                                 ` Miklos Szeredi
@ 2007-06-18  9:59                                   ` David Miller
  0 siblings, 0 replies; 47+ messages in thread
From: David Miller @ 2007-06-18  9:59 UTC (permalink / raw)
  To: miklos; +Cc: akpm, viro, alan, netdev, linux-kernel

From: Miklos Szeredi <miklos@szeredi.hu>
Date: Mon, 18 Jun 2007 11:55:19 +0200

> It's doing trylocks and releasing all those locks within the same
> spinlocked region; how the f**k can that deadlock?

The case I'm very concerned about is the interaction of your
new code with the unix_state_double_lock() stuff I added recently
to fix another bug.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH] fix race in AF_UNIX
  2007-06-18  9:44                             ` Miklos Szeredi
  2007-06-18  9:48                               ` David Miller
@ 2007-06-18 10:32                               ` Thomas Graf
  2007-06-18 10:39                                 ` Miklos Szeredi
  2007-06-18 10:40                                 ` Thomas Graf
  1 sibling, 2 replies; 47+ messages in thread
From: Thomas Graf @ 2007-06-18 10:32 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: davem, akpm, viro, alan, netdev, linux-kernel

* Miklos Szeredi <miklos@szeredi.hu> 2007-06-18 11:44
> Garbage collection only ever happens if the app is sending AF_UNIX
> sockets over AF_UNIX sockets.  Which is a rather rare case.  And which
> is basically why this bug went unnoticed for so long.
> 
> So my second patch only affects the performance of _exactly_ those
> apps which might well be bitten by the bug itself.

That's not entirely true. It affects all applications using
AF_UNIX sockets while file descriptors are being transferred. I
agree that the performance impact is not severe on most systems,
but if file descriptors are being transferred continuously by just
a single application it can become rather severe.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH] fix race in AF_UNIX
  2007-06-18 10:32                               ` Thomas Graf
@ 2007-06-18 10:39                                 ` Miklos Szeredi
  2007-06-18 10:43                                   ` Thomas Graf
  2007-06-18 10:40                                 ` Thomas Graf
  1 sibling, 1 reply; 47+ messages in thread
From: Miklos Szeredi @ 2007-06-18 10:39 UTC (permalink / raw)
  To: tgraf; +Cc: davem, akpm, viro, alan, netdev, linux-kernel

> * Miklos Szeredi <miklos@szeredi.hu> 2007-06-18 11:44
> > Garbage collection only ever happens if the app is sending AF_UNIX
> > sockets over AF_UNIX sockets.  Which is a rather rare case.  And which
> > is basically why this bug went unnoticed for so long.
> > 
> > So my second patch only affects the performance of _exactly_ those
> > apps which might well be bitten by the bug itself.
> 
> That's not entirely true. It affects all applications using
> AF_UNIX sockets while file descriptors are being transferred. I
> agree that the performance impact is not severe on most systems,
> but if file descriptors are being transferred continuously by just
> a single application it can become rather severe.

You are wrong.  Look in unix_release_sock():

	if (atomic_read(&unix_tot_inflight))
		unix_gc();		/* Garbage collect fds */


unix_tot_inflight is the number of AF_UNIX sockets currently being
transferred over some AF_UNIX sockets.

That means that just sending (non-unix socket) fds over unix sockets
will never invoke the gc.
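
(For the record, "sending a unix socket over a unix socket" means
passing its fd in an SCM_RIGHTS message.  A minimal userspace sketch
of such a send, error handling omitted; only if 'fd' is itself an
AF_UNIX socket does it count towards unix_tot_inflight while the
message is sent but not yet received:)

#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

static void send_fd(int sock, int fd)
{
        char dummy = '*';
        struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
        union {         /* union ensures cmsg alignment */
                char buf[CMSG_SPACE(sizeof(int))];
                struct cmsghdr align;
        } u = { .buf = { 0 } };
        struct msghdr msg = {
                .msg_iov = &iov, .msg_iovlen = 1,
                .msg_control = u.buf, .msg_controllen = sizeof(u.buf),
        };
        struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);

        cmsg->cmsg_level = SOL_SOCKET;
        cmsg->cmsg_type = SCM_RIGHTS;
        cmsg->cmsg_len = CMSG_LEN(sizeof(int));
        memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

        sendmsg(sock, &msg, 0);  /* fd is now "in flight" */
}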

Miklos

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH] fix race in AF_UNIX
  2007-06-18 10:32                               ` Thomas Graf
  2007-06-18 10:39                                 ` Miklos Szeredi
@ 2007-06-18 10:40                                 ` Thomas Graf
  2007-06-18 10:47                                   ` Miklos Szeredi
  1 sibling, 1 reply; 47+ messages in thread
From: Thomas Graf @ 2007-06-18 10:40 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: davem, akpm, viro, alan, netdev, linux-kernel

* Thomas Graf <tgraf@suug.ch> 2007-06-18 12:32
> * Miklos Szeredi <miklos@szeredi.hu> 2007-06-18 11:44
> > Garbage collection only ever happens if the app is sending AF_UNIX
> > sockets over AF_UNIX sockets.  Which is a rather rare case.  And which
> > is basically why this bug went unnoticed for so long.
> > 
> > So my second patch only affects the performance of _exactly_ those
> > apps which might well be bitten by the bug itself.
> 
> That's not entirely true. It affects all applications using
> AF_UNIX sockets while file descriptors are being transferred. I
> agree that the performance impact is not severe on most systems,
> but if file descriptors are being transferred continuously by just
> a single application it can become rather severe.

Also think of the scenario where an application, deliberately or not,
begins a file descriptor transfer using sendmsg() and the receiving
end never invokes recvmsg() to decrement the inflight counters
again. Every unix socket that gets closed would result in a gc call
locking all sockets.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH] fix race in AF_UNIX
  2007-06-18 10:39                                 ` Miklos Szeredi
@ 2007-06-18 10:43                                   ` Thomas Graf
  2007-06-18 12:01                                     ` Alan Cox
  0 siblings, 1 reply; 47+ messages in thread
From: Thomas Graf @ 2007-06-18 10:43 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: davem, akpm, viro, alan, netdev, linux-kernel

* Miklos Szeredi <miklos@szeredi.hu> 2007-06-18 12:39
> You are wrong.  Look in unix_release_sock():
> 
> 	if (atomic_read(&unix_tot_inflight))
> 		unix_gc();		/* Garbage collect fds */
> 
> 
> unix_tot_inflight is the number of AF_UNIX sockets currently being
> transferred over some AF_UNIX sockets.
> 
> That means that just sending (non-unix socket) fds over unix sockets
> will never invoke the gc.

That's what I meant; I'm sorry, I should have written "unix socket
file descriptor" to not leave any room for misinterpretation.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH] fix race in AF_UNIX
  2007-06-18 10:40                                 ` Thomas Graf
@ 2007-06-18 10:47                                   ` Miklos Szeredi
  2007-06-18 10:51                                     ` David Miller
  0 siblings, 1 reply; 47+ messages in thread
From: Miklos Szeredi @ 2007-06-18 10:47 UTC (permalink / raw)
  To: tgraf; +Cc: davem, akpm, viro, alan, netdev, linux-kernel

> * Thomas Graf <tgraf@suug.ch> 2007-06-18 12:32
> > * Miklos Szeredi <miklos@szeredi.hu> 2007-06-18 11:44
> > > Garbage collection only ever happens if the app is sending AF_UNIX
> > > sockets over AF_UNIX sockets.  Which is a rather rare case.  And which
> > > is basically why this bug went unnoticed for so long.
> > > 
> > > So my second patch only affects the performance of _exactly_ those
> > > apps which might well be bitten by the bug itself.
> > 
> > That's not entirely true. It affects all applications using
> > AF_UNIX sockets while file descriptors are being transferred. I
> > agree that the performance impact is not severe on most systems,
> > but if file descriptors are being transferred continuously by just
> > a single application it can become rather severe.
> 
> Also think of the scenario where an application, deliberately or not,
> begins a file descriptor transfer using sendmsg() and the receiving
> end never invokes recvmsg() to decrement the inflight counters
> again. Every unix socket that gets closed would result in a gc call
> locking all sockets.

And if some of the sent files were unix sockets, rightly so, since the
sent sockets might need to be garbage collected.

And BTW the whole gc is done with the unix_table_lock held, so it will
stop some socket operations anyway.  That it needs to stop some more
operations is unavoidable.  True, we are talking about a _spinlocked_
region, which for zillions of sockets might run for a long time, but
it's not as if it's really going to affect performance in real cases.

Miklos

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH] fix race in AF_UNIX
  2007-06-18 10:47                                   ` Miklos Szeredi
@ 2007-06-18 10:51                                     ` David Miller
  2007-06-18 10:55                                       ` Miklos Szeredi
  0 siblings, 1 reply; 47+ messages in thread
From: David Miller @ 2007-06-18 10:51 UTC (permalink / raw)
  To: miklos; +Cc: tgraf, akpm, viro, alan, netdev, linux-kernel

From: Miklos Szeredi <miklos@szeredi.hu>
Date: Mon, 18 Jun 2007 12:47:17 +0200

> but it's not as if it's really going to affect performance
> in real cases.

Since these circumstances are creatable by any user, we have
to consider the cases caused by malicious entities.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH] fix race in AF_UNIX
  2007-06-18 10:51                                     ` David Miller
@ 2007-06-18 10:55                                       ` Miklos Szeredi
  2007-06-18 11:02                                         ` David Miller
  0 siblings, 1 reply; 47+ messages in thread
From: Miklos Szeredi @ 2007-06-18 10:55 UTC (permalink / raw)
  To: davem; +Cc: tgraf, akpm, viro, alan, netdev, linux-kernel

> > but it's not as if it's really going to affect performance
> > in real cases.
> 
> Since these circumstances are creatable by any user, we have
> to consider the cases caused by malicious entities.

OK.  But then the whole gc thing is already broken, since a user can
DoS socket creation/destruction.

I'm all for fixing this gc mess that we have now.  But please don't
expect me to be the one who's doing it.

Miklos

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH] fix race in AF_UNIX
  2007-06-18 10:55                                       ` Miklos Szeredi
@ 2007-06-18 11:02                                         ` David Miller
  2007-06-18 11:06                                           ` Miklos Szeredi
  2007-06-18 11:09                                           ` David Miller
  0 siblings, 2 replies; 47+ messages in thread
From: David Miller @ 2007-06-18 11:02 UTC (permalink / raw)
  To: miklos; +Cc: tgraf, akpm, viro, alan, netdev, linux-kernel

From: Miklos Szeredi <miklos@szeredi.hu>
Date: Mon, 18 Jun 2007 12:55:24 +0200

> I'm all for fixing this gc mess that we have now.  But please don't
> expect me to be the one who's doing it.

Don't worry, I only expect you to make the situation worse :-)

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH] fix race in AF_UNIX
  2007-06-18 11:02                                         ` David Miller
@ 2007-06-18 11:06                                           ` Miklos Szeredi
  2007-06-18 11:09                                           ` David Miller
  1 sibling, 0 replies; 47+ messages in thread
From: Miklos Szeredi @ 2007-06-18 11:06 UTC (permalink / raw)
  To: davem; +Cc: tgraf, akpm, viro, alan, netdev, linux-kernel

> > I'm all for fixing this gc mess that we have now.  But please don't
> > expect me to be the one who's doing it.
> 
> Don't worry, I only expect you to make the situation worse :-)

That's real nice.  Looks like finding and fixing bugs is not
appreciated in the networking subsystem :-/

Miklos

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH] fix race in AF_UNIX
  2007-06-18 11:02                                         ` David Miller
  2007-06-18 11:06                                           ` Miklos Szeredi
@ 2007-06-18 11:09                                           ` David Miller
  2007-06-18 11:46                                             ` Miklos Szeredi
  1 sibling, 1 reply; 47+ messages in thread
From: David Miller @ 2007-06-18 11:09 UTC (permalink / raw)
  To: miklos; +Cc: tgraf, akpm, viro, alan, netdev, linux-kernel

From: David Miller <davem@davemloft.net>
Date: Mon, 18 Jun 2007 04:02:53 -0700 (PDT)

> From: Miklos Szeredi <miklos@szeredi.hu>
> Date: Mon, 18 Jun 2007 12:55:24 +0200
> 
> > I'm all for fixing this gc mess that we have now.  But please don't
> > expect me to be the one who's doing it.
> 
> Don't worry, I only expect you to make the situation worse :-)

In any event, I'll try to find some time to look more at your patch.

But just like you don't want to be expected to rewrite the AF_UNIX GC
code, you can't expect me to go right to your patch right when you
want me to.

I am one human being, and I depend upon other developers to help me
out with patch review and so on.  If nobody else wants to review your
patch, that doesn't work out so well for me.

Your behavior also tends to make me (and others) put your patch near
the end of the todo list rather than near the top.  Fix your attitude
and people will enjoy reviewing your patches that much more.  Reviewing
your work will become something to look forward to rather than a chore.

Thanks.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH] fix race in AF_UNIX
  2007-06-18 11:47                           ` Alan Cox
@ 2007-06-18 11:45                             ` Jan Engelhardt
  2007-06-18 12:00                             ` Miklos Szeredi
  1 sibling, 0 replies; 47+ messages in thread
From: Jan Engelhardt @ 2007-06-18 11:45 UTC (permalink / raw)
  To: Alan Cox; +Cc: Miklos Szeredi, davem, akpm, viro, netdev, linux-kernel


On Jun 18 2007 12:47, Alan Cox wrote:
>
>> Do you want me to send the patch to Andrew instead?  His attitude
>> towards bugfixes is rather better ;)
>
>And it'll get NAKked and binned. DaveM is (as happens sometimes ;)) right
>to insist on the code being clean and efficient.

Or see RFC 1925 number 7a. :)


	Jan
-- 

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH] fix race in AF_UNIX
  2007-06-18 11:09                                           ` David Miller
@ 2007-06-18 11:46                                             ` Miklos Szeredi
  0 siblings, 0 replies; 47+ messages in thread
From: Miklos Szeredi @ 2007-06-18 11:46 UTC (permalink / raw)
  To: davem; +Cc: tgraf, akpm, viro, alan, netdev, linux-kernel

> > > I'm all for fixing this gc mess that we have now.  But please don't
> > > expect me to be the one who's doing it.
> > 
> > Don't worry, I only expect you to make the situation worse :-)
> 
> In any event, I'll try to find some time to look more at your patch.
> 
> But just like you don't want to be expected to rewrite the AF_UNIX GC
> code, you can't expect me to go right to your patch right when you
> want me to.
> 
> I am one human being, and I depend upon other developers to help me
> out with patch review and so on.  If nobody else wants to review your
> patch, that doesn't work out so well for me.

Sure, no problem.  I just pinged you after a week of inactivity, to
make sure it hasn't been forgotten.  Just say if you are busy and I'll
wait, but trying to forget the issue is not the way to deal with it.

> Your behavior also tends to make me (and others) put your patch near
> the end of the todo list rather than near the top.  Fix your attitude
> and people will enjoy reviewing your patches that much more.  Reviewing
> your work will become something to look forward to rather than a chore.

I'll happily admit I'm a fool when it comes to social interactions.

Miklos

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH] fix race in AF_UNIX
  2007-06-18  9:29                         ` Miklos Szeredi
  2007-06-18  9:35                           ` David Miller
@ 2007-06-18 11:47                           ` Alan Cox
  2007-06-18 11:45                             ` Jan Engelhardt
  2007-06-18 12:00                             ` Miklos Szeredi
  1 sibling, 2 replies; 47+ messages in thread
From: Alan Cox @ 2007-06-18 11:47 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: davem, akpm, viro, netdev, linux-kernel

> No, correctness always trumps performance.  Lost packets on an AF_UNIX
> socket are _unacceptable_, and this is definitely not a theoretical
> problem.

If it's so unacceptable why has nobody noticed until now - it's clearly
a bug, it clearly needs fixing, but unless you can produce some kind of
exploit from it (e.g. a DoS attack or kernel memory leak exploit) it
doesn't appear to be that serious.

> And BTW my second patch does _not_ have the performance problems you
> are arguing about, it's just plain ugly.  But hey, if you can't live
> with ugly code, go and fix it.

If you put ugly code into the kernel you pay for maintaining it for years
to come. If you get it right then you don't.

> Do you want me to send the patch to Andrew instead?  His attitude
> towards bugfixes is rather better ;)

And it'll get NAKked and binned. DaveM is (as happens sometimes ;)) right
to insist on the code being clean and efficient.

Alan

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH] fix race in AF_UNIX
  2007-06-18 11:47                           ` Alan Cox
  2007-06-18 11:45                             ` Jan Engelhardt
@ 2007-06-18 12:00                             ` Miklos Szeredi
  1 sibling, 0 replies; 47+ messages in thread
From: Miklos Szeredi @ 2007-06-18 12:00 UTC (permalink / raw)
  To: alan; +Cc: davem, akpm, viro, netdev, linux-kernel

> > No, correctness always trumps performance.  Lost packets on an AF_UNIX
> > socket are _unacceptable_, and this is definitely not a theoretical
> > problem.
> 
> If it's so unacceptable why has nobody noticed until now - it's clearly
> a bug, it clearly needs fixing, but unless you can produce some kind of
> exploit from it (e.g. a DoS attack or kernel memory leak exploit) it
> doesn't appear to be that serious.

It's serious in that it affects the operation of one of my
applications in a way that is rather hard to work around.  Sure, I
could resend the lost sockets, but it's something one doesn't do
unnecessarily for reliable transports.

And it's entirely possible that it could bite other apps in rare and
mysterious ways.  All you need is

  - an application that sends a unix socket over a unix socket

  - a parallel close() operation on an independent unix socket

The two might happen to be from totally unrelated apps.

It's all the more serious because it could happen rarely and
unreproducibly, and could well crash an app which is not
expecting this behavior.

> > And BTW my second patch does _not_ have the performance problems you
> > are arguing about, it's just plain ugly.  But hey, if you can't live
> > with ugly code, go and fix it.
> 
> If you put ugly code into the kernel you pay for maintaining it for years
> to come. If you get it right then you don't.
> 
> > Do you want me to send the patch to Andrew instead?  His attitude
> > towards bugfixes is rather better ;)
> 
> And it'll get NAKked and binned. DaveM is (as happens sometimes ;)) right
> to insist on the code being clean and efficient.

Right, but treating a bug in your subsystem as if it's entirely the
responsibility of the reporter is not the right attitude either, I
think.

Miklos

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH] fix race in AF_UNIX
  2007-06-18 10:43                                   ` Thomas Graf
@ 2007-06-18 12:01                                     ` Alan Cox
  0 siblings, 0 replies; 47+ messages in thread
From: Alan Cox @ 2007-06-18 12:01 UTC (permalink / raw)
  To: Thomas Graf; +Cc: Miklos Szeredi, davem, akpm, viro, netdev, linux-kernel

On Mon, 18 Jun 2007 12:43:40 +0200
Thomas Graf <tgraf@suug.ch> wrote:

> * Miklos Szeredi <miklos@szeredi.hu> 2007-06-18 12:39
> > You are wrong.  Look in unix_release_sock():
> > 
> > 	if (atomic_read(&unix_tot_inflight))
> > 		unix_gc();		/* Garbage collect fds */
> > 
> > 
> > unix_tot_inflight is the number of AF_UNIX sockets currently being
> > transferred over some AF_UNIX sockets.
> > 
> > That means that just sending (non-unix socket) fds over unix sockets
> > will never invoke the gc.
> 
> That's what I meant; I'm sorry, I should have written "unix socket
> file descriptor" to not leave any room for misinterpretation.

You can bound the worst case on this and I think stay within the specs
(as the specs don't say a lot about it). One way would be to make
unix_gc() kick off a thread/tasklet, and if the last unix_gc was within 5
seconds then use a timer to defer it - that prevents any user-driven
"let's cause a ton of gc" cases.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH] fix race in AF_UNIX
  2007-06-11  9:57               ` Miklos Szeredi
  2007-06-18  7:49                 ` Miklos Szeredi
@ 2007-06-21 15:18                 ` Eric W. Biederman
  2007-06-23  8:48                   ` Miklos Szeredi
  1 sibling, 1 reply; 47+ messages in thread
From: Eric W. Biederman @ 2007-06-21 15:18 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: davem, viro, alan, netdev, linux-kernel

Miklos Szeredi <miklos@szeredi.hu> writes:

> [CC'd Al Viro and Alan Cox, restored patch]
>
>> > There are races involving the garbage collector, that can throw away
>> > perfectly good packets with AF_UNIX sockets in them.
>> > 
>> > The problems arise when a socket goes from installed to in-flight or
>> > vice versa during garbage collection.  Since gc is done with a
>> > spinlock held, this only shows up on SMP.
>> > 
>> > Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
>> 
>> I'm going to hold off on this one for now.
>> 
>> Holding all of the read locks kind of defeats the purpose of using
>> the per-socket lock.
>> 
>> Can't you just lock purely around the receive queue operation?
>
> That's already protected by the receive queue spinlock.  The race
> however happens _between_ pushing the root set and marking of the
> in-flight but reachable sockets.
>
> If in that space any of the AF_UNIX sockets goes from in-flight to
> installed into a file descriptor, the garbage collector can miss it.
> If we want to protect against this using unix_sk(s)->readlock, then we
> have to hold all of them for the duration of the marking.
>
> Al, Alan, you have more experience with this piece of code.  Do you
> have better ideas about how to fix this?

I haven't looked at the code closely enough to be confident of
changing something in this area.  However the classic solution to this
kind of gc problem is to mark things that are manipulated during
garbage collection as dirty (not orphaned).

It should be possible to fix this problem by simply changing gc_tree
when we perform a problematic manipulation of a passed socket, such
as installing a passed socket into the file descriptors of a process.

Essentially the idea is moving the current code in the direction of
an incremental gc algorithm.


If I understand the race properly, what happens is that we dequeue
a socket (which has packets in it passing sockets) before the
garbage collector gets to it.  Therefore the garbage collector
never processes that socket.  So it sounds like we just
need to call maybe_unmark_and_push, or possibly just wait for
the garbage collector to complete, when we do that with the packet
we have pulled out.



So, just looking at this quickly, it looks like we need to hold
the u->readlock mutex in garbage.c while looking at the receive
queue.  Otherwise we may be processing a packet and have
the file descriptors in limbo.  Either that or we need
to slightly extend the scope of the receive queue lock on the
receive side.

Additionally we need to modify unix_notinflight to mark the sockets
as in use, or to at least wait for the garbage collection to complete.
While being very careful to not add a deadlock scenario.
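
As a sketch of that idea (the gc_in_progress flag is made up, and the
naive unix_table_lock here is also the crude "wait for the collection
to finish" variant, with exactly the deadlock caveats above):

void unix_notinflight(struct file *fp)
{
        struct sock *s = unix_get_socket(fp);

        if (s) {
                /* existing decrements, as in garbage.c */
                atomic_dec(&unix_sk(s)->inflight);
                atomic_dec(&unix_tot_inflight);

                /* New part: if a collection is running, re-mark the
                 * socket so the mark phase cannot miss it. */
                spin_lock(&unix_table_lock);
                if (gc_in_progress)
                        maybe_unmark_and_push(s);
                spin_unlock(&unix_table_lock);
        }
}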

The required changes to fix this without adding heavy-handed locking
look a bit nasty to make.

I half suspect that switching to one of the simpler incremental
garbage collection algorithms might be just as easy to implement and
give us more performance at the same time.  Having a garbage collector
that can block has advantages.

The practical problem is you can't modify the list of pushed objects,
aka gc_tree, without holding a lock.  And we never drop the
unix_table_lock.


Anyway, hopefully that is enough fodder for someone to come up with
a lightweight fix for this problem.

Eric

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH] fix race in AF_UNIX
  2007-06-21 15:18                 ` Eric W. Biederman
@ 2007-06-23  8:48                   ` Miklos Szeredi
  2007-06-23 16:42                     ` Eric W. Biederman
  0 siblings, 1 reply; 47+ messages in thread
From: Miklos Szeredi @ 2007-06-23  8:48 UTC (permalink / raw)
  To: ebiederm; +Cc: davem, viro, alan, netdev, linux-kernel

Eric, thanks for looking at this.

> >> > There are races involving the garbage collector, that can throw away
> >> > perfectly good packets with AF_UNIX sockets in them.
> >> > 
> >> > The problems arise when a socket goes from installed to in-flight or
> >> > vice versa during garbage collection.  Since gc is done with a
> >> > spinlock held, this only shows up on SMP.
> >> > 
> >> > Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
> >> 
> >> I'm going to hold off on this one for now.
> >> 
> >> Holding all of the read locks kind of defeats the purpose of using
> >> the per-socket lock.
> >> 
> >> Can't you just lock purely around the receive queue operation?
> >
> > That's already protected by the receive queue spinlock.  The race
> > however happens _between_ pushing the root set and marking of the
> > in-flight but reachable sockets.
> >
> > If in that space any of the AF_UNIX sockets goes from in-flight to
> > installed into a file descriptor, the garbage collector can miss it.
> > If we want to protect against this using unix_sk(s)->readlock, then we
> > have to hold all of them for the duration of the marking.
> >
> > Al, Alan, you have more experience with this piece of code.  Do you
> > have better ideas about how to fix this?
> 
> I haven't looked at the code closely enough to be confident of
> changing something in this area.  However the classic solution to this
> kind of gc problem is to mark things that are manipulated during
> garbage collection as dirty (not orphaned).
> 
> It should be possible to fix this problem by simply changing gc_tree
> when we perform a problematic manipulation of a passed socket, such
> as installing a passed socket into the file descriptors of a process.
> 
> Essentially the idea is moving the current code in the direction of
> an incremental gc algorithm.
> 
> 
> If I understand the race properly, what happens is that we dequeue
> a socket (which has packets in it passing sockets) before the
> garbage collector gets to it.  Therefore the garbage collector
> never processes that socket.  So it sounds like we just
> need to call maybe_unmark_and_push, or possibly just wait for
> the garbage collector to complete, when we do that with the packet
> we have pulled out.

Right.  But the devil is in the details, and (as you correctly point
out later) to implement this, the whole locking scheme needs to be
overhauled.  Problems:

 - Using the queue lock to make the dequeue and the fd detach atomic
   wrt the GC is difficult, if not impossible: they are far from
   each other with various magic in between.  It would need thorough
   understanding of these functions and _big_ changes to implement.

 - Sleeping on u->readlock in GC is currently not possible, since that
   could deadlock with unix_dgram_recvmsg().  That function could
   probably be modified to release u->readlock, while waiting for
   data, similarly to unix_stream_recvmsg() at the cost of some added
   complexity.

 - Sleeping on u->readlock is also impossible, because GC is holding
   unix_table_lock for the whole operation.  We could release
   unix_table_lock, but then would have to cope with sockets coming
   and going, making the current socket iterator unworkable.

So theoretically it's quite simple, but it needs big changes.  And
this wouldn't even solve all the problems with the GC, like being a
possible DoS vector.

Miklos

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH] fix race in AF_UNIX
  2007-06-23  8:48                   ` Miklos Szeredi
@ 2007-06-23 16:42                     ` Eric W. Biederman
  2007-06-26  8:54                       ` Miklos Szeredi
  0 siblings, 1 reply; 47+ messages in thread
From: Eric W. Biederman @ 2007-06-23 16:42 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: davem, viro, alan, netdev, linux-kernel

Miklos Szeredi <miklos@szeredi.hu> writes:

> Right.  But the devil is in the details, and (as you correctly point
> out later) to implement this, the whole locking scheme needs to be
> overhauled.  Problems:
>
>  - Using the queue lock to make the dequeue and the fd detach atomic
>    wrt the GC is difficult, if not impossible: they are far from
>    each other with various magic in between.  It would need thorough
>    understanding of these functions and _big_ changes to implement.
>
>  - Sleeping on u->readlock in GC is currently not possible, since that
>    could deadlock with unix_dgram_recvmsg().  That function could
>    probably be modified to release u->readlock, while waiting for
>    data, similarly to unix_stream_recvmsg() at the cost of some added
>    complexity.
>
>  - Sleeping on u->readlock is also impossible, because GC is holding
>    unix_table_lock for the whole operation.  We could release
>    unix_table_lock, but then would have to cope with sockets coming
>    and going, making the current socket iterator unworkable.
>
> So theoretically it's quite simple, but it needs big changes.  And
> this wouldn't even solve all the problems with the GC, like being a
> possible DoS vector.

Making the GC fully incremental will solve the DoS vector problem as
well.  Basically you do a fixed amount of reclaim in the new socket
allocation code.

It appears clear that since we can't stop the world and garbage
collect we need an incremental collector.

Eric

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH] fix race in AF_UNIX
  2007-06-23 16:42                     ` Eric W. Biederman
@ 2007-06-26  8:54                       ` Miklos Szeredi
  2007-06-26 15:24                         ` Eric W. Biederman
  0 siblings, 1 reply; 47+ messages in thread
From: Miklos Szeredi @ 2007-06-26  8:54 UTC (permalink / raw)
  To: ebiederm; +Cc: davem, viro, alan, netdev, linux-kernel

> > Right.  But the devil is in the details, and (as you correctly point
> > out later) to implement this, the whole locking scheme needs to be
> > overhauled.  Problems:
> >
> >  - Using the queue lock to make the dequeue and the fd detach atomic
> >    wrt the GC is difficult, if not impossible: they are far from
> >    each other with various magic in between.  It would need thorough
> >    understanding of these functions and _big_ changes to implement.
> >
> >  - Sleeping on u->readlock in GC is currently not possible, since that
> >    could deadlock with unix_dgram_recvmsg().  That function could
> >    probably be modified to release u->readlock, while waiting for
> >    data, similarly to unix_stream_recvmsg() at the cost of some added
> >    complexity.
> >
> >  - Sleeping on u->readlock is also impossible, because GC is holding
> >    unix_table_lock for the whole operation.  We could release
> >    unix_table_lock, but then would have to cope with sockets coming
> >    and going, making the current socket iterator unworkable.
> >
> > So theoretically it's quite simple, but it needs big changes.  And
> > this wouldn't even solve all the problems with the GC, like being a
> > possible DoS vector.
> 
> Making the GC fully incremental will solve the DoS vector problem as
> well.  Basically you do a fixed amount of reclaim in the new socket
> allocation code.

And I think incremental GC algorithms are much too complex for this
task.  What I've realized is that in fact we don't require a generic
garbage collection algorithm, just a much more specialized cycle
collection algorithm, since refcounting in struct file takes care of
the rest.

This would help with localizing the problem to the problematic sockets
(which have an in-flight unix socket), instead of having to blindly
traverse _all_ unix sockets in the system.

I'll look at reimplementing the GC with such an algorithm.

> It appears clear that since we can't stop the world and garbage
> collect we need an incremental collector.

Constraining ourselves to stopping unix sockets from going in flight
or coming out of flight during garbage collection should be OK I
think.  There's still a possibility of a DoS there, but it would only
be able to affect _very_ few applications.

Miklos

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH] fix race in AF_UNIX
  2007-06-26  8:54                       ` Miklos Szeredi
@ 2007-06-26 15:24                         ` Eric W. Biederman
  0 siblings, 0 replies; 47+ messages in thread
From: Eric W. Biederman @ 2007-06-26 15:24 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: davem, viro, alan, netdev, linux-kernel

Miklos Szeredi <miklos@szeredi.hu> writes:
>
> And I think incremental GC algorithms are much too complex for this
> task.  What I've realized is that in fact we don't require a generic
> garbage collection algorithm, just a much more specialized cycle
> collection algorithm, since refcounting in struct file takes care of
> the rest.
>
> This would help with localizing the problem to the problematic sockets
> (which have an in-flight unix socket), instead of having to blindly
> traverse _all_ unix sockets in the system.
>
> I'll look at reimplementing the GC with such an algorithm.

Ok.  If you can do it more simply, have at it.

There are incremental garbage collectors that are essentially just the
current algorithm with fine-grained locking.  So we don't have to live
in a spin-lock the whole time.

If your approach fails we can look at something more fine-grained.

>> It appears clear that since we can't stop the world and garbage
>> collect we need an incremental collector.
>
> Constraining ourselves to stopping unix sockets from going in flight
> or coming out of flight during garbage collection should be OK I
> think.  There's still a possibility of a DoS there, but it would only
> be able to affect _very_ few applications.

Yes.

Eric

^ permalink raw reply	[flat|nested] 47+ messages in thread

end of thread, other threads:[~2007-06-26 15:25 UTC | newest]

Thread overview: 47+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-06-02 21:50 [PATCH] fix race in AF_UNIX Miklos Szeredi
2007-06-02 22:11 ` Arnaldo Carvalho de Melo
2007-06-04  9:45 ` Miklos Szeredi
2007-06-05  7:02   ` David Miller
2007-06-05  7:42     ` Miklos Szeredi
2007-06-05  7:55       ` David Miller
2007-06-05  8:11         ` Miklos Szeredi
2007-06-05  8:19           ` David Miller
2007-06-05 20:11       ` David Miller
2007-06-06  0:31     ` David Miller
2007-06-06  5:26       ` Miklos Szeredi
2007-06-06  5:41         ` David Miller
2007-06-06  8:08           ` Miklos Szeredi
2007-06-06  8:12             ` David Miller
2007-06-08  1:47             ` David Miller
2007-06-11  9:57               ` Miklos Szeredi
2007-06-18  7:49                 ` Miklos Szeredi
2007-06-18  7:57                   ` David Miller
2007-06-18  8:20                     ` Miklos Szeredi
2007-06-18  9:18                       ` David Miller
2007-06-18  9:29                         ` Miklos Szeredi
2007-06-18  9:35                           ` David Miller
2007-06-18  9:44                             ` Miklos Szeredi
2007-06-18  9:48                               ` David Miller
2007-06-18  9:55                                 ` Miklos Szeredi
2007-06-18  9:59                                   ` David Miller
2007-06-18 10:32                               ` Thomas Graf
2007-06-18 10:39                                 ` Miklos Szeredi
2007-06-18 10:43                                   ` Thomas Graf
2007-06-18 12:01                                     ` Alan Cox
2007-06-18 10:40                                 ` Thomas Graf
2007-06-18 10:47                                   ` Miklos Szeredi
2007-06-18 10:51                                     ` David Miller
2007-06-18 10:55                                       ` Miklos Szeredi
2007-06-18 11:02                                         ` David Miller
2007-06-18 11:06                                           ` Miklos Szeredi
2007-06-18 11:09                                           ` David Miller
2007-06-18 11:46                                             ` Miklos Szeredi
2007-06-18 11:47                           ` Alan Cox
2007-06-18 11:45                             ` Jan Engelhardt
2007-06-18 12:00                             ` Miklos Szeredi
2007-06-21 15:18                 ` Eric W. Biederman
2007-06-23  8:48                   ` Miklos Szeredi
2007-06-23 16:42                     ` Eric W. Biederman
2007-06-26  8:54                       ` Miklos Szeredi
2007-06-26 15:24                         ` Eric W. Biederman
2007-06-04  9:53 ` Miklos Szeredi

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).