linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/10] af_unix: add multicast and filtering features to AF_UNIX
@ 2012-02-20 15:57 Javier Martinez Canillas
  2012-02-20 15:57 ` [PATCH 01/10] af_unix: Documentation on multicast unix sockets Javier Martinez Canillas
                   ` (4 more replies)
  0 siblings, 5 replies; 51+ messages in thread
From: Javier Martinez Canillas @ 2012-02-20 15:57 UTC (permalink / raw)
  To: David S. Miller
  Cc: Eric Dumazet, Lennart Poettering, Kay Sievers, Alban Crequy,
	Bart Cerneels, Rodrigo Moya, Sjoerd Simons, netdev, linux-kernel

This patch set adds multicast support to the Unix domain socket family for
datagram and seqpacket sockets. The work was done by Alban Crequy as a result
of research we have been doing to improve the performance of the D-bus IPC
system.

The first approach was to create a new AF_DBUS socket address family and
move the routing logic of the D-bus daemon to the kernel. The motivation behind
that approach and the thread where the patches were posted can be found in [1]
and [2].

The feedback was that having D-bus specific code in the kernel is a bad
idea so the second approach was to implement multicast Unix domain sockets so
clients can directly send messages to peers bypassing the D-bus daemon.
A previous version of the patches was already posted by Alban [3] who also has
a good explanation of the implementation on his blog [4].

[1]http://alban-apinc.blogspot.com/2011/12/d-bus-in-kernel-faster.html
[2]http://thread.gmane.org/gmane.linux.kernel/1040481
[3]http://thread.gmane.org/gmane.linux.network/178772
[4]http://alban-apinc.blogspot.com/2011/12/introducing-multicast-unix-sockets.html

The patch-set is composed of the following patches:

[PATCH 01/10] af_unix: Documentation on multicast unix sockets
[PATCH 02/10] af_unix: Add constant for unix socket options level
[PATCH 03/10] af_unix: add setsockopt on unix sockets
[PATCH 04/10] af_unix: create, join and leave multicast groups with setsockopt
[PATCH 05/10] af_unix: find the recipients of a multicast group
[PATCH 06/10] af_unix: Deliver message to several recipients in case of multicast
[PATCH 07/10] af_unix: implement poll(POLLOUT) for multicast sockets
[PATCH 08/10] af_unix: Unsubscribe sockets from their multicast groups on RCV_SHUTDOWN
[PATCH 09/10] Allow server side of SOCK_SEQPACKET sockets to accept a new member
[PATCH 10/10] af_unix: Add a peer BPF for multicast Unix sockets


* [PATCH 01/10] af_unix: Documentation on multicast unix sockets
  2012-02-20 15:57 [PATCH 0/10] af_unix: add multicast and filtering features to AF_UNIX Javier Martinez Canillas
@ 2012-02-20 15:57 ` Javier Martinez Canillas
  2012-02-20 15:57 ` [PATCH 02/10] af_unix: Add constant for unix socket options level Javier Martinez Canillas
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 51+ messages in thread
From: Javier Martinez Canillas @ 2012-02-20 15:57 UTC (permalink / raw)
  To: David S. Miller
  Cc: Eric Dumazet, Lennart Poettering, Kay Sievers, Alban Crequy,
	Bart Cerneels, Rodrigo Moya, Sjoerd Simons, netdev, linux-kernel

From: Alban Crequy <alban.crequy@collabora.co.uk>

Signed-off-by: Alban Crequy <alban.crequy@collabora.co.uk>
Reviewed-by: Ian Molton <ian.molton@collabora.co.uk>
---
 .../networking/multicast-unix-sockets.txt          |  180 ++++++++++++++++++++
 1 files changed, 180 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/networking/multicast-unix-sockets.txt

diff --git a/Documentation/networking/multicast-unix-sockets.txt b/Documentation/networking/multicast-unix-sockets.txt
new file mode 100644
index 0000000..ec9a19c
--- /dev/null
+++ b/Documentation/networking/multicast-unix-sockets.txt
@@ -0,0 +1,180 @@
+Multicast Unix sockets
+======================
+
+Multicast is implemented on SOCK_DGRAM and SOCK_SEQPACKET Unix sockets.
+
+A userspace application can create a multicast group with:
+
+  struct unix_mreq mreq = {0,};
+  mreq.address.sun_family = AF_UNIX;
+  mreq.address.sun_path[0] = '\0';
+  strcpy(mreq.address.sun_path + 1, "socket-address");
+
+  sockfd = socket(AF_UNIX, SOCK_DGRAM, 0);
+  ret = setsockopt(sockfd, SOL_UNIX, UNIX_CREATE_GROUP, &mreq, sizeof(mreq));
+
+This allocates a struct unix_mcast_group, which is reference counted and exists
+as long as the socket that created it exists or the group has at least one
+member.
+
+SOCK_DGRAM sockets can join a multicast group with:
+
+  ret = setsockopt(sockfd, SOL_UNIX, UNIX_JOIN_GROUP, &mreq, sizeof(mreq));
+
+This allocates a struct unix_mcast, which holds the settings of the membership,
+mainly whether loopback is enabled. A socket can be a member of several
+multicast groups.
+
+Since SOCK_SEQPACKET sockets are connection-oriented, the semantics are
+different. A client cannot join a group directly; it can only connect, and the
+server then uses the accepted socket to let the peer join the group with:
+
+  ret = setsockopt(groupfd, SOL_UNIX, UNIX_CREATE_GROUP, &val, vallen);
+  ret = listen(groupfd, 10);
+  connfd = accept(groupfd, NULL, NULL);
+  ret = setsockopt(connfd, SOL_UNIX, UNIX_ACCEPT_GROUP, &mreq, sizeof(mreq));
+
+The socket remains part of the multicast group until it is released, shut down
+with RCV_SHUTDOWN, or explicitly leaves the group:
+
+  ret = setsockopt(sockfd, SOL_UNIX, UNIX_LEAVE_GROUP, &mreq, sizeof(mreq));
+
+Struct unix_mcast nodes are linked in two RCU lists:
+- (struct unix_sock)->mcast_subscriptions
+- (struct unix_mcast_group)->mcast_members
+
+              unix_mcast_group  unix_mcast_group
+                      |                 |
+                      v                 v
+unix_sock  ---->  unix_mcast  ----> unix_mcast
+                      |
+                      v
+unix_sock  ---->  unix_mcast
+                      |
+                      v
+unix_sock  ---->  unix_mcast
+
+
+SOCK_DGRAM semantics
+====================
+
+          G          The socket which created the group
+       /  |  \
+     P1  P2  P3      The member sockets
+
+Messages sent to the group are received by all members except the sender itself
+unless the sending socket has UNIX_MREQ_LOOPBACK set.
+
+Non-members can also send to the group socket G, and the message will be
+broadcast to the group members. Socket G itself, however, does not receive
+messages sent to the group through it.
+
+
+SOCK_SEQPACKET semantics
+========================
+
+When a connection is performed on a SOCK_SEQPACKET multicast socket, a new
+socket is created and its file descriptor is received by accept().
+
+          L          The listening socket
+       /  |  \
+     A1  A2  A3      The accepted sockets
+      |   |   |
+     C1  C2  C3      The connected sockets
+
+Messages sent on the C1 socket are received by:
+- C1 itself if UNIX_MREQ_LOOPBACK is set.
+- The peer socket A1 if UNIX_MREQ_SEND_TO_PEER is set.
+- The other members of the multicast group C2 and C3.
+
+Only members can send to the group in this case.
+
+
+Atomic delivery and ordering
+============================
+
+Each message sent is delivered atomically to either none of the recipients or
+all the recipients, even with interruptions and errors.
+
+Locking is used in order to keep the ordering consistent on all recipients. We
+want to avoid the following scenario. Two emitters A and B, and 2 recipients, C
+and D:
+
+           C    D
+A -------->|    |    Step 1: A's message is delivered to C
+B -------->|    |    Step 2: B's message is delivered to C
+B ---------|--->|    Step 3: B's message is delivered to D
+A ---------|--->|    Step 4: A's message is delivered to D
+
+Result: - C received (A, B)
+        - D received (B, A)
+
+Although A and B had a list of recipients (C, D) in the same order, C and D
+received the messages in a different order. To avoid this scenario, we need a
+locking mechanism while the messages are being delivered with skb_queue_tail().
+
+Solution 1:
+The easiest implementation would be to use a global spinlock on the group, but
+it creates an avoidable contention, especially when there are two independent
+streams set up with socket filters; e.g. if A sends messages received only by
+C, and B sends messages received only by D.
+
+Solution 2:
+Fine-grained locking could be implemented with a spinlock on each recipient.
+Before delivering the message to the recipients, the sender takes a spinlock on
+each recipient at the same time.
+
+Taking several spinlocks on the same struct can be dangerous and lead to
+deadlocks. This is prevented by sorting the list of sockets by memory address
+and taking the spinlocks in that order. The ordered list of recipients is
+computed on demand when a message is sent and the list is cached for
+performance. When the group membership changes, the generation of the
+membership is incremented and the ordered recipient list is invalidated.
+
+With this solution, the number of spinlocks taken simultaneously can be
+arbitrarily large. While this works, it breaks the lockdep mechanism.
+
+Solution 3:
+The current implementation is similar to solution 2 but with a limit on the
+number of spinlocks taken simultaneously (8), so lockdep works fine. A hash
+function and bit array with n=8 specify which spinlocks to take.  Contention
+on independent streams can still happen but it is less likely.
+
+
+Flow control
+============
+
+When a socket's receiving queue is full, the default behavior is to block
+senders (or to return -EAGAIN on non-blocking sockets). The socket can also
+join a multicast group with the flag UNIX_MREQ_DROP_WHEN_FULL. In this case,
+messages sent to the group will not be delivered to that socket when its
+receiving queue is full.
+
+Messages are still delivered atomically to all members who don't have the flag
+UNIX_MREQ_DROP_WHEN_FULL. If send() returns -EAGAIN, nobody received the
+message. If send() blocks because of one member, the other members don't
+receive the message until all sockets (except those with
+UNIX_MREQ_DROP_WHEN_FULL set) can receive at the same time.
+
+poll/epoll/select on POLLOUT events have a consistent behavior; they block if
+at least one member of the multicast group without UNIX_MREQ_DROP_WHEN_FULL has
+a full receiving queue.
+
+
+Multicast socket reference counting
+===================================
+
+A poller for POLLOUT events can block for any member of the group. The poller
+can use the wait queue "peer_wait" of any member. So it is important that Unix
+sockets are not released before all pollers exit. This is achieved by:
+
+- Incrementing the reference counter of a socket when it joins a multicast
+  group.
+- Decrementing it when the group is destroyed, that is when all
+  sockets keeping a reference on the group released their reference on the
+  group.
+
+struct unix_mcast_group keeps track of both current members and previous
+members. When a socket leaves a group, it is removed from the members list and
+put in the dead members list. This is done in order to take advantage of RCU
+lists, which reduces lock contention.
-- 
1.7.7.6
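
For illustration only (not part of the posted patch): the lock selection
described under "Solution 3" in the documentation above could be sketched in
userspace C as follows. The hash in lock_slot() and all function names are
assumptions made for this sketch; the documentation does not show the actual
hash or data layout. The point is only that each sender derives a bit array
of at most 8 locks from its recipients and takes them in one fixed order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MCAST_NLOCKS 8	/* limit stated in the documentation (n=8) */

/* Hypothetical hash: map a recipient address to one of the 8 lock slots. */
static unsigned int lock_slot(const void *recipient)
{
	uintptr_t addr = (uintptr_t)recipient;

	/* Discard low alignment bits, then fold into [0, MCAST_NLOCKS). */
	return (unsigned int)((addr >> 6) % MCAST_NLOCKS);
}

/* Build the bit array: bit i is set if lock i guards at least one recipient. */
static uint8_t lock_mask(const void *const *recipients, size_t n)
{
	uint8_t mask = 0;
	size_t i;

	for (i = 0; i < n; i++)
		mask |= (uint8_t)(1u << lock_slot(recipients[i]));
	return mask;
}

/*
 * Record the order in which a sender would take its locks: always by
 * ascending slot index, so two senders never acquire the same pair of
 * locks in opposite orders. Returns the number of locks to take.
 */
static size_t lock_order(uint8_t mask, unsigned int order[MCAST_NLOCKS])
{
	size_t taken = 0;
	unsigned int slot;

	for (slot = 0; slot < MCAST_NLOCKS; slot++)
		if (mask & (1u << slot))
			order[taken++] = slot;
	assert(taken <= MCAST_NLOCKS);
	return taken;
}
```

Two senders whose recipient sets overlap always share at least one bit in
their masks, so their deliveries are serialized and every common recipient
sees the same order. Senders with disjoint recipients may still collide on a
slot, which is the residual contention the documentation mentions.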



* [PATCH 02/10] af_unix: Add constant for unix socket options level
  2012-02-20 15:57 [PATCH 0/10] af_unix: add multicast and filtering features to AF_UNIX Javier Martinez Canillas
  2012-02-20 15:57 ` [PATCH 01/10] af_unix: Documentation on multicast unix sockets Javier Martinez Canillas
@ 2012-02-20 15:57 ` Javier Martinez Canillas
  2012-02-20 15:57 ` [PATCH 03/10] af_unix: add setsockopt on unix sockets Javier Martinez Canillas
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 51+ messages in thread
From: Javier Martinez Canillas @ 2012-02-20 15:57 UTC (permalink / raw)
  To: David S. Miller
  Cc: Eric Dumazet, Lennart Poettering, Kay Sievers, Alban Crequy,
	Bart Cerneels, Rodrigo Moya, Sjoerd Simons, netdev, linux-kernel

From: Alban Crequy <alban.crequy@collabora.co.uk>

Assign the next free socket options level to be used by the unix
protocol and address family.

Signed-off-by: Alban Crequy <alban.crequy@collabora.co.uk>
Reviewed-by: Ian Molton <ian.molton@collabora.co.uk>
---
 include/linux/socket.h |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/include/linux/socket.h b/include/linux/socket.h
index d0e77f6..a6b8f35 100644
--- a/include/linux/socket.h
+++ b/include/linux/socket.h
@@ -312,6 +312,7 @@ struct ucred {
 #define SOL_IUCV	277
 #define SOL_CAIF	278
 #define SOL_ALG		279
+#define SOL_UNIX	280
 
 /* IPX options */
 #define IPX_TYPE	1
-- 
1.7.7.6



* [PATCH 03/10] af_unix: add setsockopt on unix sockets
  2012-02-20 15:57 [PATCH 0/10] af_unix: add multicast and filtering features to AF_UNIX Javier Martinez Canillas
  2012-02-20 15:57 ` [PATCH 01/10] af_unix: Documentation on multicast unix sockets Javier Martinez Canillas
  2012-02-20 15:57 ` [PATCH 02/10] af_unix: Add constant for unix socket options level Javier Martinez Canillas
@ 2012-02-20 15:57 ` Javier Martinez Canillas
  2012-02-20 16:20   ` David Miller
  2012-02-20 19:13 ` [PATCH 0/10] af_unix: add multicast and filtering features to AF_UNIX Colin Walters
  2012-02-24 20:36 ` David Miller
  4 siblings, 1 reply; 51+ messages in thread
From: Javier Martinez Canillas @ 2012-02-20 15:57 UTC (permalink / raw)
  To: David S. Miller
  Cc: Eric Dumazet, Lennart Poettering, Kay Sievers, Alban Crequy,
	Bart Cerneels, Rodrigo Moya, Sjoerd Simons, netdev, linux-kernel

From: Alban Crequy <alban.crequy@collabora.co.uk>

unix_setsockopt() is called only on SOCK_DGRAM and SOCK_SEQPACKET unix sockets

Signed-off-by: Alban Crequy <alban.crequy@collabora.co.uk>
Reviewed-by: Ian Molton <ian.molton@collabora.co.uk>
---
 net/unix/af_unix.c |   13 +++++++++++--
 1 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 85d3bb7..3537f20 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -515,6 +515,8 @@ static unsigned int unix_dgram_poll(struct file *, struct socket *,
 				    poll_table *);
 static int unix_ioctl(struct socket *, unsigned int, unsigned long);
 static int unix_shutdown(struct socket *, int);
+static int unix_setsockopt(struct socket *, int, int,
+			   char __user *, unsigned int);
 static int unix_stream_sendmsg(struct kiocb *, struct socket *,
 			       struct msghdr *, size_t);
 static int unix_stream_recvmsg(struct kiocb *, struct socket *,
@@ -564,7 +566,7 @@ static const struct proto_ops unix_dgram_ops = {
 	.ioctl =	unix_ioctl,
 	.listen =	sock_no_listen,
 	.shutdown =	unix_shutdown,
-	.setsockopt =	sock_no_setsockopt,
+	.setsockopt =	unix_setsockopt,
 	.getsockopt =	sock_no_getsockopt,
 	.sendmsg =	unix_dgram_sendmsg,
 	.recvmsg =	unix_dgram_recvmsg,
@@ -585,7 +587,7 @@ static const struct proto_ops unix_seqpacket_ops = {
 	.ioctl =	unix_ioctl,
 	.listen =	unix_listen,
 	.shutdown =	unix_shutdown,
-	.setsockopt =	sock_no_setsockopt,
+	.setsockopt =	unix_setsockopt,
 	.getsockopt =	sock_no_getsockopt,
 	.sendmsg =	unix_seqpacket_sendmsg,
 	.recvmsg =	unix_seqpacket_recvmsg,
@@ -1583,6 +1585,13 @@ out:
 }
 
 
+static int unix_setsockopt(struct socket *sock, int level, int optname,
+			   char __user *optval, unsigned int optlen)
+{
+	return -EOPNOTSUPP;
+}
+
+
 static int unix_stream_sendmsg(struct kiocb *kiocb, struct socket *sock,
 			       struct msghdr *msg, size_t len)
 {
-- 
1.7.7.6



* Re: [PATCH 03/10] af_unix: add setsockopt on unix sockets
  2012-02-20 15:57 ` [PATCH 03/10] af_unix: add setsockopt on unix sockets Javier Martinez Canillas
@ 2012-02-20 16:20   ` David Miller
  0 siblings, 0 replies; 51+ messages in thread
From: David Miller @ 2012-02-20 16:20 UTC (permalink / raw)
  To: javier
  Cc: eric.dumazet, lennart, kay.sievers, alban.crequy, bart.cerneels,
	rodrigo.moya, sjoerd.simons, netdev, linux-kernel


Well, where's the rest?


* Re: [PATCH 0/10] af_unix: add multicast and filtering features to AF_UNIX
  2012-02-20 15:57 [PATCH 0/10] af_unix: add multicast and filtering features to AF_UNIX Javier Martinez Canillas
                   ` (2 preceding siblings ...)
  2012-02-20 15:57 ` [PATCH 03/10] af_unix: add setsockopt on unix sockets Javier Martinez Canillas
@ 2012-02-20 19:13 ` Colin Walters
  2012-02-21  8:07   ` Rodrigo Moya
  2012-02-24 20:36 ` David Miller
  4 siblings, 1 reply; 51+ messages in thread
From: Colin Walters @ 2012-02-20 19:13 UTC (permalink / raw)
  To: Javier Martinez Canillas
  Cc: David S. Miller, Eric Dumazet, Lennart Poettering, Kay Sievers,
	Alban Crequy, Bart Cerneels, Rodrigo Moya, Sjoerd Simons, netdev,
	linux-kernel

On Mon, 2012-02-20 at 16:57 +0100, Javier Martinez Canillas wrote:
> This patch-set adds multicast support to Unix domain socket family for datagram
> and seqpacket sockets. This work was made by Alban Crequy as a result of a
> research we have been doing to improve the performance of the D-bus IPC system.

Do you have links to any modifications to userspace dbus to take
advantage of this?





* Re: [PATCH 0/10] af_unix: add multicast and filtering features to AF_UNIX
  2012-02-20 19:13 ` [PATCH 0/10] af_unix: add multicast and filtering features to AF_UNIX Colin Walters
@ 2012-02-21  8:07   ` Rodrigo Moya
  0 siblings, 0 replies; 51+ messages in thread
From: Rodrigo Moya @ 2012-02-21  8:07 UTC (permalink / raw)
  To: Colin Walters
  Cc: Javier Martinez Canillas, David S. Miller, Eric Dumazet,
	Lennart Poettering, Kay Sievers, Alban Crequy, Bart Cerneels,
	Sjoerd Simons, netdev, linux-kernel

On Mon, 2012-02-20 at 14:13 -0500, Colin Walters wrote:
> On Mon, 2012-02-20 at 16:57 +0100, Javier Martinez Canillas wrote:
> > This patch-set adds multicast support to Unix domain socket family for datagram
> > and seqpacket sockets. This work was made by Alban Crequy as a result of a
> > research we have been doing to improve the performance of the D-bus IPC system.
> 
> Do you have links to any modifications to userspace dbus to take
> advantage of this?

We have work in progress at
http://cgit.collabora.com/git/user/rodrigo/dbus.git/ in the
unix-sockets-multicast branch.



* Re: [PATCH 0/10] af_unix: add multicast and filtering features to AF_UNIX
  2012-02-20 15:57 [PATCH 0/10] af_unix: add multicast and filtering features to AF_UNIX Javier Martinez Canillas
                   ` (3 preceding siblings ...)
  2012-02-20 19:13 ` [PATCH 0/10] af_unix: add multicast and filtering features to AF_UNIX Colin Walters
@ 2012-02-24 20:36 ` David Miller
  2012-02-27 14:00   ` Javier Martinez Canillas
  4 siblings, 1 reply; 51+ messages in thread
From: David Miller @ 2012-02-24 20:36 UTC (permalink / raw)
  To: javier
  Cc: eric.dumazet, lennart, kay.sievers, alban.crequy, bart.cerneels,
	rodrigo.moya, sjoerd.simons, netdev, linux-kernel


My first impression is that I'm amazed at how much complicated new
code you have to add to support groups of receivers of AF_UNIX
messages.

I can't see how this is better than doing multicast over ipv4 using
UDP or something like that, code which we have already and has been
tested for decades.

I really don't want to apply this stuff, it looks bloated,
complicated, and there is another avenue for doing what you want to
do.

Applications have to change to support the new multicast facilities,
so they can equally be changed to use a real transport that already
supports multicasting.


* Re: [PATCH 0/10] af_unix: add multicast and filtering features to AF_UNIX
  2012-02-24 20:36 ` David Miller
@ 2012-02-27 14:00   ` Javier Martinez Canillas
  2012-02-27 19:05     ` David Miller
  0 siblings, 1 reply; 51+ messages in thread
From: Javier Martinez Canillas @ 2012-02-27 14:00 UTC (permalink / raw)
  To: David Miller
  Cc: javier, eric.dumazet, lennart, kay.sievers, alban.crequy,
	bart.cerneels, rodrigo.moya, sjoerd.simons, netdev, linux-kernel


On 02/24/2012 09:36 PM, David Miller wrote:
> 
> My first impression is that I'm amazed at how much complicated new
> code you have to add to support groups of receivers of AF_UNIX
> messages.
> 
> I can't see how this is better than doing multicast over ipv4 using
> UDP or something like that, code which we have already and has been
> tested for decades.
> 

Primarily for performance reasons. D-bus is an IPC system for processes on
the same machine, so traversing the whole TCP/IP stack seems like overkill
to me. We will try it, though, to get numbers on the actual overhead of
using UDP multicast over IP instead of multicast Unix domain sockets.

We also thought of using Netlink sockets, since they already support
multicast and should be more lightweight than IP multicast. But even
Netlink doesn't meet our needs, since our multicast implementation on Unix
sockets has different semantics that D-bus needs:

- total order is guaranteed: if sender A sends a message before sender B,
  then receivers C and D should both get A's message first and then B's.

- slow readers: dropping packets vs. blocking the sender. Although
  datagrams are not reliable on IP, datagrams on Unix sockets are never
  lost. So if one receiver has its buffer full, the sender is blocked
  instead of packets being dropped. That way we guarantee a reliable
  communication channel.

- multicast group access control: controlling who can join the multicast
  group.

- IP multicast on loopback is not supported, which means we would have to
  use a NIC (e.g. eth0).
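
The "slow readers" behaviour of ordinary (non-multicast) Unix datagram
sockets can be observed from userspace. The following is a minimal sketch,
assuming a Linux host; the function name and the MAX_DGRAMS safety cap are
invented for the illustration. It queues datagrams without reading until the
kernel answers EAGAIN, then checks that everything accepted is still there.

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

#define MAX_DGRAMS 1000000L	/* safety cap so the send loop always terminates */

/*
 * Queue one-byte datagrams on an AF_UNIX SOCK_DGRAM pair without reading,
 * until the kernel refuses with EAGAIN, then drain the receiver.
 * Stores both counts and returns 0, or returns -1 on an unexpected error.
 */
static int fill_then_drain(long *sent, long *received)
{
	int sv[2];
	char byte = 'x';
	int ret = -1;

	assert(sent && received);
	*sent = 0;
	*received = 0;
	if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
		return -1;

	/* Sender: a full receive queue yields EAGAIN, not a silent drop. */
	while (*sent < MAX_DGRAMS) {
		if (send(sv[0], &byte, 1, MSG_DONTWAIT) == 1) {
			(*sent)++;
			continue;
		}
		if (errno != EAGAIN && errno != EWOULDBLOCK)
			goto out;
		break;
	}

	/* Receiver: every datagram that was accepted is still queued. */
	for (;;) {
		ssize_t n = recv(sv[1], &byte, 1, MSG_DONTWAIT);

		if (n == 1) {
			(*received)++;
			continue;
		}
		if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
			ret = 0;
		break;
	}
out:
	close(sv[0]);
	close(sv[1]);
	return ret;
}
```

A blocking sender (without MSG_DONTWAIT) would sleep at the point where this
sketch sees EAGAIN, which is the back-pressure the list item describes.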

> I really don't want to apply this stuff, it looks bloated,
> complicated, and there is another avenue for doing what you want to
> do.
> 

We can work to reduce the implementation complexity and make it less
bloated.

Or you don't like the idea in general?

> Applications have to change to support the new multicast facilities,
> so they can equally be changed to use a real transport that already
> supports multicasting.

Yes, this is not about minimizing changes to user-space applications but
about improving the performance of D-bus, or of any other framework that
relies on multicast communication on a single machine.

Best regards,
Javier


* Re: [PATCH 0/10] af_unix: add multicast and filtering features to AF_UNIX
  2012-02-27 14:00   ` Javier Martinez Canillas
@ 2012-02-27 19:05     ` David Miller
  2012-02-28 10:47       ` Rodrigo Moya
  0 siblings, 1 reply; 51+ messages in thread
From: David Miller @ 2012-02-27 19:05 UTC (permalink / raw)
  To: javier.martinez
  Cc: javier, eric.dumazet, lennart, kay.sievers, alban.crequy,
	bart.cerneels, rodrigo.moya, sjoerd.simons, netdev, linux-kernel

From: Javier Martinez Canillas <javier.martinez@collabora.co.uk>
Date: Mon, 27 Feb 2012 15:00:06 +0100

> Primarily for performance reasons. D-bus is an IPC system for processes in
> the same machine so traversing the whole TCP/IP stack seems a little
> overkill to me.

You haven't actually tested what the cost of this actually is, so what
you're saying is mere speculation.  In many cases TCP/UDP over
loopback is actually faster than AF_UNIX.

Since this is the premise of your whole rebuttal, I'll simply stop
reading here.
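
Neither side of this exchange posts measurements. A minimal sketch of how the
two paths could be timed from userspace follows, assuming a Linux host with
the loopback interface up. All helper names are hypothetical, and the method
(send followed by recv in a single thread) only compares the per-datagram
cost of the two socket families; it says nothing about multicast fan-out.

```c
#define _POSIX_C_SOURCE 200809L
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <time.h>
#include <unistd.h>

/* Connected AF_UNIX datagram pair: fds[0] sends, fds[1] receives. */
static int make_unix_pair(int fds[2])
{
	return socketpair(AF_UNIX, SOCK_DGRAM, 0, fds);
}

/* UDP pair over 127.0.0.1: fds[1] binds an ephemeral port, fds[0] connects. */
static int make_udp_pair(int fds[2])
{
	struct sockaddr_in addr;
	socklen_t len = sizeof(addr);
	struct timeval tmo = { .tv_sec = 1, .tv_usec = 0 };

	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	addr.sin_port = 0;	/* let the kernel pick a free port */

	fds[0] = socket(AF_INET, SOCK_DGRAM, 0);
	fds[1] = socket(AF_INET, SOCK_DGRAM, 0);
	if (fds[0] < 0 || fds[1] < 0)
		goto fail;
	if (bind(fds[1], (struct sockaddr *)&addr, sizeof(addr)) < 0)
		goto fail;
	if (getsockname(fds[1], (struct sockaddr *)&addr, &len) < 0)
		goto fail;
	if (connect(fds[0], (struct sockaddr *)&addr, sizeof(addr)) < 0)
		goto fail;
	/* UDP may drop: bound the wait so a lost datagram cannot hang us. */
	if (setsockopt(fds[1], SOL_SOCKET, SO_RCVTIMEO, &tmo, sizeof(tmo)) < 0)
		goto fail;
	return 0;
fail:
	if (fds[0] >= 0)
		close(fds[0]);
	if (fds[1] >= 0)
		close(fds[1]);
	return -1;
}

/* Time n send()+recv() pairs of `size` bytes; returns nanoseconds or -1. */
static long long time_transfers(const int fds[2], int n, size_t size)
{
	char buf[1024];
	struct timespec t0, t1;
	int i;

	assert(size > 0 && size <= sizeof(buf));
	memset(buf, 0, sizeof(buf));
	if (clock_gettime(CLOCK_MONOTONIC, &t0) < 0)
		return -1;
	for (i = 0; i < n; i++) {
		if (send(fds[0], buf, size, 0) != (ssize_t)size)
			return -1;
		if (recv(fds[1], buf, sizeof(buf), 0) != (ssize_t)size)
			return -1;
	}
	if (clock_gettime(CLOCK_MONOTONIC, &t1) < 0)
		return -1;
	return (t1.tv_sec - t0.tv_sec) * 1000000000LL +
	       (t1.tv_nsec - t0.tv_nsec);
}
```

Calling time_transfers() on a pair from each helper with the same count and
payload size gives two figures that can be compared directly. Results will
vary with kernel version, payload size and CPU, so any conclusion should be
drawn from repeated runs on the target system.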


* Re: [PATCH 0/10] af_unix: add multicast and filtering features to AF_UNIX
  2012-02-27 19:05     ` David Miller
@ 2012-02-28 10:47       ` Rodrigo Moya
  2012-02-28 14:28         ` David Lamparter
  2012-02-28 19:05         ` David Miller
  0 siblings, 2 replies; 51+ messages in thread
From: Rodrigo Moya @ 2012-02-28 10:47 UTC (permalink / raw)
  To: David Miller
  Cc: javier.martinez, javier, eric.dumazet, lennart, kay.sievers,
	alban.crequy, bart.cerneels, sjoerd.simons, netdev, linux-kernel

Hi David

On Mon, 2012-02-27 at 14:05 -0500, David Miller wrote:
> From: Javier Martinez Canillas <javier.martinez@collabora.co.uk>
> Date: Mon, 27 Feb 2012 15:00:06 +0100
> 
> > Primarily for performance reasons. D-bus is an IPC system for processes in
> > the same machine so traversing the whole TCP/IP stack seems a little
> > overkill to me.
> 
> You haven't actually tested what the cost of this actually is, so what
> you're saying is mere speculation.  In many cases TCP/UDP over
> loopback is actually faster than AF_UNIX.
> 
You're right, we haven't tested this. We ruled it out because of the other
points in Javier's mail, which describe the special semantics we need for
this to fit the D-Bus usage:

> - total order is guaranteed: If sender A sends a message before B, then
> receiver C and D should both get message A first and then B.
> 
> - slow readers: dropping packets vs blocking the sender. Although
>   datagrams are not reliable on IP, datagrams on Unix sockets are never
>   lost. So if one receiver has its buffer full the sender is blocked
> instead of dropping packets. That way we guarantee a reliable
> communication channel.
> 
> - multicast group access control: controlling who can join the multicast
> group.
> 
> - multicast on loopback is not supported: which means we have to use a
> NIC (i.e: eth0).

Because of all of this, UDP/IP multicast wasn't even considered as an
option. We might be wrong in some/all of those, so could you please
comment on them to check if that's so?

thanks



* Re: [PATCH 0/10] af_unix: add multicast and filtering features to AF_UNIX
  2012-02-28 10:47       ` Rodrigo Moya
@ 2012-02-28 14:28         ` David Lamparter
  2012-02-28 15:24           ` Javier Martinez Canillas
  2012-02-28 19:05         ` David Miller
  1 sibling, 1 reply; 51+ messages in thread
From: David Lamparter @ 2012-02-28 14:28 UTC (permalink / raw)
  To: Rodrigo Moya
  Cc: David Miller, javier.martinez, javier, eric.dumazet, lennart,
	kay.sievers, alban.crequy, bart.cerneels, sjoerd.simons, netdev,
	linux-kernel

On Tue, Feb 28, 2012 at 11:47:39AM +0100, Rodrigo Moya wrote:
> > - slow readers: dropping packets vs blocking the sender. Although
> >   datagrams are not reliable on IP, datagrams on Unix sockets are never
> >   lost. So if one receiver has its buffer full the sender is blocked
> > instead of dropping packets. That way we guarantee a reliable
> > communication channel.

This sounds like a terribly nice way to f*ck the entire D-Bus system by
having one broken (or malicious) desktop application. What's the
intended way of coping with users that block the socket by not reading?


-David L.


* Re: [PATCH 0/10] af_unix: add multicast and filtering features to AF_UNIX
  2012-02-28 14:28         ` David Lamparter
@ 2012-02-28 15:24           ` Javier Martinez Canillas
  2012-02-28 16:33             ` Javier Martinez Canillas
  0 siblings, 1 reply; 51+ messages in thread
From: Javier Martinez Canillas @ 2012-02-28 15:24 UTC (permalink / raw)
  To: David Lamparter
  Cc: Rodrigo Moya, David Miller, javier, eric.dumazet, lennart,
	kay.sievers, alban.crequy, bart.cerneels, sjoerd.simons, netdev,
	linux-kernel

On 02/28/2012 03:28 PM, David Lamparter wrote:
> On Tue, Feb 28, 2012 at 11:47:39AM +0100, Rodrigo Moya wrote:
>> > - slow readers: dropping packets vs blocking the sender. Although
>> >   datagrams are not reliable on IP, datagrams on Unix sockets are never
>> >   lost. So if one receiver has its buffer full the sender is blocked
>> > instead of dropping packets. That way we guarantee a reliable
>> > communication channel.
> 
> This sounds like a terribly nice way to f*ck the entire D-Bus system by
> having one broken (or malicious) desktop application. What's the
> intended way of coping with users that block the socket by not reading?
> 
> 
> -David L.

The problem is that D-bus expects a reliable transport (TCP or
SOCK_STREAM Unix sockets), and multicast Unix sockets do not provide one,
since our implementation is for the SOCK_SEQPACKET and SOCK_DGRAM socket
types.

So, you have to either add another layer to the D-bus protocol to make
it reliable (acks, retransmissions, flow control, etc) or avoid losing
D-bus messages (by blocking the sender if one of the receivers has its
buffer full).

Regards,
Javier


* Re: [PATCH 0/10] af_unix: add multicast and filtering features to AF_UNIX
  2012-02-28 15:24           ` Javier Martinez Canillas
@ 2012-02-28 16:33             ` Javier Martinez Canillas
  0 siblings, 0 replies; 51+ messages in thread
From: Javier Martinez Canillas @ 2012-02-28 16:33 UTC (permalink / raw)
  To: David Lamparter
  Cc: Rodrigo Moya, David Miller, javier, eric.dumazet, lennart,
	kay.sievers, alban.crequy, bart.cerneels, sjoerd.simons, netdev,
	linux-kernel

On 02/28/2012 04:24 PM, Javier Martinez Canillas wrote:
> On 02/28/2012 03:28 PM, David Lamparter wrote:
>> On Tue, Feb 28, 2012 at 11:47:39AM +0100, Rodrigo Moya wrote:
>>> > - slow readers: dropping packets vs blocking the sender. Although
>>> >   datagrams are not reliable on IP, datagrams on Unix sockets are never
>>> >   lost. So if one receiver has its buffer full the sender is blocked
>>> > instead of dropping packets. That way we guarantee a reliable
>>> > communication channel.
>> 
>> This sounds like a terribly nice way to f*ck the entire D-Bus system by
>> having one broken (or malicious) desktop application. What's the
>> intended way of coping with users that block the socket by not reading?
>> 
>> 
>> -David L.
> 
> The problem is that D-bus expects a reliable transport method (TCP or
> SOCK_STREAM Unix socks) but this is not the case with multicast Unix
> sockets. Since our implementation is for SOCK_SEQPACKET and SOCK_DGRAM
> socket types.
> 
> So, you have to either add another layer to the D-bus protocol to make
> it reliable (acks, retransmissions, flow control, etc) or avoid losing
> D-bus messages (by blocking the sender if one of the receivers has its
> buffer full).
> 

Also, this problem already exists in the current D-bus implementation. If a
malicious desktop application doesn't read from its socket, the messages
sent to it are buffered in the daemon:
https://bugs.freedesktop.org/show_bug.cgi?id=33606

dbus-daemon memory usage will balloon until the
max_incoming_bytes/max_outgoing_bytes limit is reached (1 GB for the session
bus in the default configuration):

<limit name="max_incoming_bytes">1000000000</limit>
<limit name="max_outgoing_bytes">1000000000</limit>

It only works because not many applications are broken and user-space
memory is virtualized. But if you bypass the daemon and use a multicast
transport layer (as in our multicast Unix socket implementation) you
don't have that much memory to buffer the packets.

So you have to either block the senders or:

- drop the slow reader
- kill the spammer
- have an infinite amount of memory
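
The choice between blocking the sender and dropping for a slow reader is what
the UNIX_MREQ_DROP_WHEN_FULL flag in the documentation patch expresses. The
following is a minimal userspace model of that delivery rule, not the kernel
code: struct member, its fields and mcast_deliver() are names invented for
this sketch, and queue space is reduced to a simple counter.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct member {
	unsigned int queued;	/* messages waiting in the receive queue */
	unsigned int capacity;	/* queue limit */
	bool drop_when_full;	/* models UNIX_MREQ_DROP_WHEN_FULL */
};

static bool member_full(const struct member *m)
{
	return m->queued >= m->capacity;
}

/*
 * Deliver one message to the group, all-or-nothing with respect to the
 * members that did not opt into dropping. Returns 0 on delivery, or
 * -EAGAIN if a full member without drop_when_full makes the sender wait.
 */
static int mcast_deliver(struct member *members, size_t n)
{
	size_t i;

	assert(members || n == 0);

	/* Pass 1: any full "blocking" member vetoes the whole delivery. */
	for (i = 0; i < n; i++)
		if (member_full(&members[i]) && !members[i].drop_when_full)
			return -EAGAIN;

	/* Pass 2: enqueue wherever there is room; full droppers are skipped. */
	for (i = 0; i < n; i++)
		if (!member_full(&members[i]))
			members[i].queued++;
	return 0;
}
```

In this model a slow reader that joined with the drop flag only loses its own
messages, while a slow reader without it stalls every sender to the group,
which is the trade-off discussed in this thread.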

Regards,
Javier


* Re: [PATCH 0/10] af_unix: add multicast and filtering features to AF_UNIX
  2012-02-28 10:47       ` Rodrigo Moya
  2012-02-28 14:28         ` David Lamparter
@ 2012-02-28 19:05         ` David Miller
  2012-03-01 11:57           ` Javier Martinez Canillas
  1 sibling, 1 reply; 51+ messages in thread
From: David Miller @ 2012-02-28 19:05 UTC (permalink / raw)
  To: rodrigo.moya
  Cc: javier.martinez, javier, eric.dumazet, lennart, kay.sievers,
	alban.crequy, bart.cerneels, sjoerd.simons, netdev, linux-kernel

From: Rodrigo Moya <rodrigo.moya@collabora.co.uk>
Date: Tue, 28 Feb 2012 11:47:39 +0100

> Because of all of this, UDP/IP multicast wasn't even considered as an
> option. We might be wrong in some/all of those, so could you please
> comment on them to check if that's so?

You guys seem to want something that isn't AF_UNIX, ordering guarantees
and whatnot, it really has no place in these protocols.

You've designed a userlevel subsystem with requirements that no existing
socket layer can give, and you just figured you'd work that out later.

I think you should rather have reconsidered these premises and designed
something that could handle reality, which is that AF_UNIX can't do
multicast and nobody guarantees those strange ordering requirements you
seem to have.


* Re: [PATCH 0/10] af_unix: add multicast and filtering features to AF_UNIX
  2012-02-28 19:05         ` David Miller
@ 2012-03-01 11:57           ` Javier Martinez Canillas
  2012-03-01 12:26             ` Eric Dumazet
                               ` (2 more replies)
  0 siblings, 3 replies; 51+ messages in thread
From: Javier Martinez Canillas @ 2012-03-01 11:57 UTC (permalink / raw)
  To: David Miller
  Cc: rodrigo.moya, javier, eric.dumazet, lennart, kay.sievers,
	alban.crequy, bart.cerneels, sjoerd.simons, netdev, linux-kernel

On 02/28/2012 08:05 PM, David Miller wrote:
> From: Rodrigo Moya <rodrigo.moya@collabora.co.uk>
> Date: Tue, 28 Feb 2012 11:47:39 +0100
> 
>> Because of all of this, UDP/IP multicast wasn't even considered as an
>> option. We might be wrong in some/all of those, so could you please
>> comment on them to check if that's so?
> 
> You guys seem to want something that isn't AF_UNIX, ordering guarentees
> and whatnot, it really has no place in these protocols.
> 
> You've designed a userlevel subsystem with requirements that no existing
> socket layer can give, and you just figured you'd work that out later.
> 
> I think you rather should have reconsidered these premises and designed
> something that could handle reality which is AF_UNIX can't do multicast
> and nobody guarentees those strange ordering requirements you seem to
> have.

Yes, you are right: it doesn't follow AF_UNIX semantics, so Unix sockets
are not the best place to add our multicast implementation.

So now we are trying a different approach: creating a new address
family, AF_MCAST. That way we can have more control over the semantics of
the socket interface for that family.

We expect to have some patches in a few days and we will resend.

Does this make more sense to you?

Best regards,
Javier


* Re: [PATCH 0/10] af_unix: add multicast and filtering features to AF_UNIX
  2012-03-01 11:57           ` Javier Martinez Canillas
@ 2012-03-01 12:26             ` Eric Dumazet
  2012-03-01 12:33               ` David Laight
  2012-03-01 20:44               ` David Miller
  2012-03-01 12:57             ` Luiz Augusto von Dentz
  2012-03-01 20:42             ` David Miller
  2 siblings, 2 replies; 51+ messages in thread
From: Eric Dumazet @ 2012-03-01 12:26 UTC (permalink / raw)
  To: Javier Martinez Canillas
  Cc: David Miller, rodrigo.moya, javier, lennart, kay.sievers,
	alban.crequy, bart.cerneels, sjoerd.simons, netdev, linux-kernel

Le jeudi 01 mars 2012 à 12:57 +0100, Javier Martinez Canillas a écrit :

> Yes, you are right it doesn't follow AF_UNIX semantics so Unix sockets
> is not the best place to add our multicast implementation.
> 

Right, AF_UNIX is already a nightmare to maintain.

> So, now we are trying a different approach. To create a new address
> family AF_MCAST. That way we can have more control over the semantics of
> the socket interface for that family.
> 
> We expect to have some patches in a few days and we will resend.
> 
> Does this makes more sense to you?
> 

Why add an obscure set of IPC mechanisms to the network tree, rather than
using (maybe extending) traditional IPC (message queues, semaphores,
shared memory, pipes, futexes, ...)?





* RE: [PATCH 0/10] af_unix: add multicast and filtering features to AF_UNIX
  2012-03-01 12:26             ` Eric Dumazet
@ 2012-03-01 12:33               ` David Laight
  2012-03-01 12:50                 ` Rodrigo Moya
  2012-03-01 20:44               ` David Miller
  1 sibling, 1 reply; 51+ messages in thread
From: David Laight @ 2012-03-01 12:33 UTC (permalink / raw)
  To: Eric Dumazet, Javier Martinez Canillas
  Cc: David Miller, rodrigo.moya, javier, lennart, kay.sievers,
	alban.crequy, bart.cerneels, sjoerd.simons, netdev, linux-kernel

 
> > So, now we are trying a different approach. To create a new address
> > family AF_MCAST. That way we can have more control over the semantics of
> > the socket interface for that family.
> > 
> > We expect to have some patches in a few days and we will resend.
> > 
> > Does this makes more sense to you?
> > 
> 
> Why adding an obscure set of IPC mechanism in network tree, and not
> using (maybe extending) traditional IPC (Messages queues, semaphores,
> Shared memory, pipes, futexes, ...).

If it isn't a totally silly suggestion, why not write a simple
device driver that just does what you want?
Which (I think) is named pipes with multiple readers.

	David




* RE: [PATCH 0/10] af_unix: add multicast and filtering features to AF_UNIX
  2012-03-01 12:33               ` David Laight
@ 2012-03-01 12:50                 ` Rodrigo Moya
  2012-03-01 12:59                   ` Eric Dumazet
  0 siblings, 1 reply; 51+ messages in thread
From: Rodrigo Moya @ 2012-03-01 12:50 UTC (permalink / raw)
  To: David Laight
  Cc: Eric Dumazet, Javier Martinez Canillas, David Miller, javier,
	lennart, kay.sievers, alban.crequy, bart.cerneels, sjoerd.simons,
	netdev, linux-kernel

On Thu, 2012-03-01 at 12:33 +0000, David Laight wrote:
> > > So, now we are trying a different approach. To create a new address
> > > family AF_MCAST. That way we can have more control over the semantics of
> > > the socket interface for that family.
> > > 
> > > We expect to have some patches in a few days and we will resend.
> > > 
> > > Does this makes more sense to you?
> > > 
> > 
> > Why adding an obscure set of IPC mechanism in network tree, and not
> > using (maybe extending) traditional IPC (Messages queues, semaphores,
> > Shared memory, pipes, futexes, ...).
> 
> If it isn't a totally silly suggestion, why not write a simple
> device driver that just does what you want?
> Which (I think) is named pipes with multiple readers.
> 
The main problem in D-Bus we are trying to solve is the context
switches. Right now there is a daemon that listens on a UNIX socket, all
traffic on the bus goes through it, and the daemon then has to route the
messages it gets on that socket to the corresponding place(s). So every
time someone sends a message to D-Bus, dbus-daemon gets woken up, which is
one of the biggest bottlenecks we are trying to fix.

That's why we are thinking about using multicast with socket filters, so
that the daemon only gets the traffic it cares about and thus is not woken
up, and context switches don't happen when they are not needed.

Using message queues, AFAICS, we would have the same problem, as the
daemon would create the message queue and would get all traffic, right?

cheers



* Re: [PATCH 0/10] af_unix: add multicast and filtering features to AF_UNIX
  2012-03-01 11:57           ` Javier Martinez Canillas
  2012-03-01 12:26             ` Eric Dumazet
@ 2012-03-01 12:57             ` Luiz Augusto von Dentz
  2012-03-01 20:42             ` David Miller
  2 siblings, 0 replies; 51+ messages in thread
From: Luiz Augusto von Dentz @ 2012-03-01 12:57 UTC (permalink / raw)
  To: Javier Martinez Canillas
  Cc: David Miller, rodrigo.moya, javier, eric.dumazet, lennart,
	kay.sievers, alban.crequy, bart.cerneels, sjoerd.simons, netdev,
	linux-kernel

Hi Javier,

On Thu, Mar 1, 2012 at 1:57 PM, Javier Martinez Canillas
<javier.martinez@collabora.co.uk> wrote:
> On 02/28/2012 08:05 PM, David Miller wrote:
>> From: Rodrigo Moya <rodrigo.moya@collabora.co.uk>
>> Date: Tue, 28 Feb 2012 11:47:39 +0100
>>
>>> Because of all of this, UDP/IP multicast wasn't even considered as an
>>> option. We might be wrong in some/all of those, so could you please
>>> comment on them to check if that's so?
>>
>> You guys seem to want something that isn't AF_UNIX, ordering guarentees
>> and whatnot, it really has no place in these protocols.
>>
>> You've designed a userlevel subsystem with requirements that no existing
>> socket layer can give, and you just figured you'd work that out later.
>>
>> I think you rather should have reconsidered these premises and designed
>> something that could handle reality which is AF_UNIX can't do multicast
>> and nobody guarentees those strange ordering requirements you seem to
>> have.
>
> Yes, you are right it doesn't follow AF_UNIX semantics so Unix sockets
> is not the best place to add our multicast implementation.
>
> So, now we are trying a different approach. To create a new address
> family AF_MCAST. That way we can have more control over the semantics of
> the socket interface for that family.
>
> We expect to have some patches in a few days and we will resend.

Let's say AF_MCAST is acceptable, wouldn't it make AF_UNIX obsolete?
From what I can tell, a lot, if not most, of the users of AF_UNIX use it
to implement some kind of IPC, be it D-Bus, Chromium or Wayland, and
eventually all of them run into the same problems. Actually, the LWN
article puts it together nicely: http://lwn.net/Articles/466304/

What about SCM_RIGHTS and other Ancillary Messages, would that be
acceptable in other socket families?

-- 
Luiz Augusto von Dentz


* RE: [PATCH 0/10] af_unix: add multicast and filtering features to AF_UNIX
  2012-03-01 12:50                 ` Rodrigo Moya
@ 2012-03-01 12:59                   ` Eric Dumazet
  2012-03-01 13:56                     ` Javier Martinez Canillas
  0 siblings, 1 reply; 51+ messages in thread
From: Eric Dumazet @ 2012-03-01 12:59 UTC (permalink / raw)
  To: Rodrigo Moya
  Cc: David Laight, Javier Martinez Canillas, David Miller, javier,
	lennart, kay.sievers, alban.crequy, bart.cerneels, sjoerd.simons,
	netdev, linux-kernel

Le jeudi 01 mars 2012 à 13:50 +0100, Rodrigo Moya a écrit :
> the main problem in D-Bus we are trying to solve is the context
> switches, since right now, there is a daemon, which listens on a UNIX
> socket, and all traffic in the bus goes through it, and then the daemon
> has to route the messages it gets on that socket to the corresponding
> place(s). So, every time someone sends a message to D-Bus, since all
> traffic goes through the daemon, dbus-daemon gets waked-up, which is one
> of the biggest bottlenecks we are trying to fix.
> 
> That's why we are thinking about using multicast with socket filters, so
> that the daemon only gets traffic it cares about and thus is not waked
> up and context switches don't happen when not needed.
> 
> Using message queues, AFAICS, we would have the same problem, as the
> daemon would create the message queue and would get all traffic, right?
> 

This is why I mentioned extensions.

Anyway, if you think multicast sockets are the way to go, then you could
set up a virtual network just to be able to use AF_INET multicast.

That's probably doable without kernel patching.





* Re: [PATCH 0/10] af_unix: add multicast and filtering features to AF_UNIX
  2012-03-01 12:59                   ` Eric Dumazet
@ 2012-03-01 13:56                     ` Javier Martinez Canillas
  2012-03-01 16:00                       ` Eric Dumazet
                                         ` (2 more replies)
  0 siblings, 3 replies; 51+ messages in thread
From: Javier Martinez Canillas @ 2012-03-01 13:56 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Rodrigo Moya, David Laight, David Miller, javier, lennart,
	kay.sievers, alban.crequy, bart.cerneels, sjoerd.simons, netdev,
	linux-kernel

On 03/01/2012 01:59 PM, Eric Dumazet wrote:
> Le jeudi 01 mars 2012 à 13:50 +0100, Rodrigo Moya a écrit :
>> the main problem in D-Bus we are trying to solve is the context
>> switches, since right now, there is a daemon, which listens on a UNIX
>> socket, and all traffic in the bus goes through it, and then the daemon
>> has to route the messages it gets on that socket to the corresponding
>> place(s). So, every time someone sends a message to D-Bus, since all
>> traffic goes through the daemon, dbus-daemon gets waked-up, which is one
>> of the biggest bottlenecks we are trying to fix.
>> 
>> That's why we are thinking about using multicast with socket filters, so
>> that the daemon only gets traffic it cares about and thus is not waked
>> up and context switches don't happen when not needed.
>> 
>> Using message queues, AFAICS, we would have the same problem, as the
>> daemon would create the message queue and would get all traffic, right?
>> 
> 
> This is why I mentioned extensions.



> 
> Anyway, if you think multicast sockets is the way to go, then you could
> setup a virtual network just to be able to use AF_INET multicast.
> 
> Thats probably doable without kernel patching.
> 

We could use AF_INET multicast on a local machine, but we need some
ordering and flow control guarantees that UDP multicast over IP does not
provide. That's why we thought of adding a new address family, AF_MCAST.

To make it a general local multicast solution rather than something too
specific, we added some flags to control its behavior, like
MCAST_MREQ_DROP_WHEN_FULL, which decides whether to block the sender or
drop the packet when one receiver has its queue full.
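
As a rough userspace analogy of that block-or-drop choice, here is a
sketch in C using only existing AF_UNIX datagram sockets. It does not use
AF_MCAST or MCAST_MREQ_DROP_WHEN_FULL (those only exist in our patches);
the enum, the function names and the safety cap are invented for the
example. The "drop" policy is emulated per call with MSG_DONTWAIT.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical policy names, only for this illustration. */
enum full_policy { BLOCK_WHEN_FULL, DROP_WHEN_FULL };

/* Returns 1 if the datagram was queued, 0 if it was dropped because the
 * receiver's queue is full, -1 on any other error. With BLOCK_WHEN_FULL
 * the call sleeps until the receiver drains its queue. */
static int send_with_policy(int fd, const void *buf, size_t len,
			    enum full_policy policy)
{
	int flags = (policy == DROP_WHEN_FULL) ? MSG_DONTWAIT : 0;

	if (send(fd, buf, len, flags) >= 0)
		return 1;
	if (policy == DROP_WHEN_FULL &&
	    (errno == EAGAIN || errno == EWOULDBLOCK))
		return 0;
	return -1;
}

/* Demo: the receiver never reads, so the queue eventually fills up.
 * Returns how many datagrams were accepted before the first drop. */
static int count_until_drop(void)
{
	char msg[64] = { 0 };
	int sv[2];
	int queued = 0;
	int r = 1;

	if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
		return -1;
	/* The cap is only a guard against looping forever. */
	while (queued < 1000000) {
		r = send_with_policy(sv[0], msg, sizeof(msg), DROP_WHEN_FULL);
		if (r != 1)
			break;
		queued++;
	}
	close(sv[0]);
	close(sv[1]);
	return r == 0 ? queued : -1;
}
```

How many datagrams fit before the first drop depends on the system's
queue and buffer limits, so the sketch does not promise a specific number.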

Regards,
Javier


* Re: [PATCH 0/10] af_unix: add multicast and filtering features to AF_UNIX
  2012-03-01 13:56                     ` Javier Martinez Canillas
@ 2012-03-01 16:00                       ` Eric Dumazet
  2012-03-01 16:02                       ` Luiz Augusto von Dentz
  2012-03-01 20:55                       ` David Miller
  2 siblings, 0 replies; 51+ messages in thread
From: Eric Dumazet @ 2012-03-01 16:00 UTC (permalink / raw)
  To: Javier Martinez Canillas
  Cc: Rodrigo Moya, David Laight, David Miller, javier, lennart,
	kay.sievers, alban.crequy, bart.cerneels, sjoerd.simons, netdev,
	linux-kernel

Le jeudi 01 mars 2012 à 14:56 +0100, Javier Martinez Canillas a écrit :

> We could use AF_INET multicast on a local machine but we need some
> ordering and control flow requirements that are not guaranteed on UDP
> multicast over IP. That's why we thought to add a new address family
> AF_MCAST.
> 

This looks like application logic and complexity being pushed into the
kernel for a single user (even if it is used in a lot of products): D-Bus.

> To make it a general local multicast solution and not being too specific
> we added some flags to control its behavior like
> MCAST_MREQ_DROP_WHEN_FULL to decide to either block the sender or drop
> the packet when one receiver has its queue full.

I am only wondering how many lines this is going to add to the kernel for
a complete implementation, given your expectations for performance, flow
control and reliability, not counting all the security issues (ancillary
messages and so on).

In case of IP_MULTICAST_LOOP, we could allow the sender to sleep if the
receiver queue is full, with a bit of tweaking in the stack (the current
implementation uses loopback re-injection, so it requires softirq handling).

In fact, we could use a new IP_MULTICAST_LOCAL option, so that sender
processing doesn't trigger a softirq handler at all and is allowed to
sleep if needed. For example, skb allocations could use GFP_KERNEL
instead of the current GFP_ATOMIC ones in UDP mcast.

I don't know, maybe it would be a smaller patch.





* Re: [PATCH 0/10] af_unix: add multicast and filtering features to AF_UNIX
  2012-03-01 13:56                     ` Javier Martinez Canillas
  2012-03-01 16:00                       ` Eric Dumazet
@ 2012-03-01 16:02                       ` Luiz Augusto von Dentz
  2012-03-01 17:06                         ` Javier Martinez Canillas
                                           ` (2 more replies)
  2012-03-01 20:55                       ` David Miller
  2 siblings, 3 replies; 51+ messages in thread
From: Luiz Augusto von Dentz @ 2012-03-01 16:02 UTC (permalink / raw)
  To: Javier Martinez Canillas
  Cc: Eric Dumazet, Rodrigo Moya, David Laight, David Miller, javier,
	lennart, kay.sievers, alban.crequy, bart.cerneels, sjoerd.simons,
	netdev, linux-kernel

Hi Javier,

On Thu, Mar 1, 2012 at 3:56 PM, Javier Martinez Canillas
<javier.martinez@collabora.co.uk> wrote:
>>
>> Anyway, if you think multicast sockets is the way to go, then you could
>> setup a virtual network just to be able to use AF_INET multicast.
>>
>> Thats probably doable without kernel patching.
>>
>
> We could use AF_INET multicast on a local machine but we need some
> ordering and control flow requirements that are not guaranteed on UDP
> multicast over IP. That's why we thought to add a new address family
> AF_MCAST.

I don't want to sound like a broken record, but I'm afraid I have to:
what about ancillary messages, how are you going to support passing
fds? Actually, the whole virtual network sounds like a bad idea. Are we
going to give IPs to each and every application connected to the bus?
In fact, one virtual network would be needed for each bus.

Contrary to what some believe, I don't think AF_INET is that fast (e.g.
http://scottmoonen.com/2008/04/05/a-performance-comparison-of-af_unix-with-loopback-on-linux/)


-- 
Luiz Augusto von Dentz


* Re: [PATCH 0/10] af_unix: add multicast and filtering features to AF_UNIX
  2012-03-01 16:02                       ` Luiz Augusto von Dentz
@ 2012-03-01 17:06                         ` Javier Martinez Canillas
  2012-03-01 17:59                         ` Eric Dumazet
  2012-03-01 18:53                         ` David Dillow
  2 siblings, 0 replies; 51+ messages in thread
From: Javier Martinez Canillas @ 2012-03-01 17:06 UTC (permalink / raw)
  To: Luiz Augusto von Dentz
  Cc: Eric Dumazet, Rodrigo Moya, David Laight, David Miller, javier,
	lennart, kay.sievers, alban.crequy, bart.cerneels, sjoerd.simons,
	netdev, linux-kernel

On 03/01/2012 05:02 PM, Luiz Augusto von Dentz wrote:
> Hi Javier,
> 
> On Thu, Mar 1, 2012 at 3:56 PM, Javier Martinez Canillas
> <javier.martinez@collabora.co.uk> wrote:
>>>
>>> Anyway, if you think multicast sockets is the way to go, then you could
>>> setup a virtual network just to be able to use AF_INET multicast.
>>>
>>> Thats probably doable without kernel patching.
>>>
>>
>> We could use AF_INET multicast on a local machine but we need some
>> ordering and control flow requirements that are not guaranteed on UDP
>> multicast over IP. That's why we thought to add a new address family
>> AF_MCAST.
> 
> I don't want to sound like a broken record, but Im afraid I have to,
> what about Ancillary Messages, how you are going to support passing
> fd? Actually the whole virtual network sounds like a bad idea, are we
> going to give ips to each and every application connected to the bus,
> actually it is necessary to have one virtual network for each bus.
> 
> Contrary to someones believes I don't think AF_INET is that fast (e.g.
> http://scottmoonen.com/2008/04/05/a-performance-comparison-of-af_unix-with-loopback-on-linux/)
> 
> 

You are right. Ancillary messages such as SCM_RIGHTS are PF_UNIX specific,
and some D-bus applications use fd passing for out-of-band communication.
So using multicast on AF_INET would break these applications.
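
For reference, this is roughly what fd passing looks like on an AF_UNIX
socket: a minimal sketch in C of sending one descriptor with SCM_RIGHTS
and receiving it on the other side. The function names and the demo
(passing the read end of a pipe over a socketpair) are only for
illustration and are not taken from D-bus.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Send `fd` over the AF_UNIX socket `sock` as SCM_RIGHTS ancillary data.
 * One byte of ordinary payload is sent along with it. */
static int send_fd(int sock, int fd)
{
	char dummy = 'F';
	struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
	union {
		struct cmsghdr align;	/* forces correct alignment */
		char buf[CMSG_SPACE(sizeof(int))];
	} u;
	struct msghdr msg;
	struct cmsghdr *cmsg;

	memset(&msg, 0, sizeof(msg));
	memset(&u, 0, sizeof(u));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = u.buf;
	msg.msg_controllen = sizeof(u.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(fd));

	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one descriptor sent by send_fd(); returns it or -1. */
static int recv_fd(int sock)
{
	char dummy;
	struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
	union {
		struct cmsghdr align;
		char buf[CMSG_SPACE(sizeof(int))];
	} u;
	struct msghdr msg;
	struct cmsghdr *cmsg;
	int fd;

	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = u.buf;
	msg.msg_controllen = sizeof(u.buf);

	if (recvmsg(sock, &msg, 0) < 0)
		return -1;
	cmsg = CMSG_FIRSTHDR(&msg);
	if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return -1;
	memcpy(&fd, CMSG_DATA(cmsg), sizeof(fd));
	return fd;
}

/* Demo: pass the read end of a pipe across a socketpair and read the
 * pipe's contents through the received descriptor. */
static ssize_t fd_passing_demo(char *out, size_t len)
{
	int sv[2], p[2], got;
	ssize_t n = -1;

	if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
		return -1;
	if (pipe(p) < 0) {
		close(sv[0]);
		close(sv[1]);
		return -1;
	}
	if (write(p[1], "hello", 5) == 5 && send_fd(sv[0], p[0]) == 0) {
		got = recv_fd(sv[1]);
		if (got >= 0) {
			n = read(got, out, len);
			close(got);
		}
	}
	close(p[0]);
	close(p[1]);
	close(sv[0]);
	close(sv[1]);
	return n;
}
```

The received descriptor is a new fd number in the receiver that refers to
the same open file, which is what an AF_INET transport cannot offer.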

Regards,
Javier


* Re: [PATCH 0/10] af_unix: add multicast and filtering features to AF_UNIX
  2012-03-01 16:02                       ` Luiz Augusto von Dentz
  2012-03-01 17:06                         ` Javier Martinez Canillas
@ 2012-03-01 17:59                         ` Eric Dumazet
  2012-03-01 18:10                           ` Alan Cox
  2012-03-01 19:02                           ` Javier Martinez Canillas
  2012-03-01 18:53                         ` David Dillow
  2 siblings, 2 replies; 51+ messages in thread
From: Eric Dumazet @ 2012-03-01 17:59 UTC (permalink / raw)
  To: Luiz Augusto von Dentz
  Cc: Javier Martinez Canillas, Rodrigo Moya, David Laight,
	David Miller, javier, lennart, kay.sievers, alban.crequy,
	bart.cerneels, sjoerd.simons, netdev, linux-kernel

Le 1 mars 2012 08:02, Luiz Augusto von Dentz <luiz.dentz@gmail.com> a écrit :
>
> Contrary to someones believes I don't think AF_INET is that fast (e.g.
> http://scottmoonen.com/2008/04/05/a-performance-comparison-of-af_unix-with-loopback-on-linux/)
>

Oh you mention a recent zork it seems ;)

Are we speaking of performance problems apart from the scheduler problems
for D-Bus (each message waking all receivers, with all receivers but the
target reading and dropping the message)?

I am actually one of the few people working to improve performance on
both AF_INET and AF_UNIX parts. Just take a look at recent commits.

Right now you can send/receive millions of UDP messages per second on
your Linux machine, if you have figured out how to avoid process scheduler
costs. If D-Bus wants more, I highly suggest using shared memory
instead of passing messages.


* Re: [PATCH 0/10] af_unix: add multicast and filtering features to AF_UNIX
  2012-03-01 17:59                         ` Eric Dumazet
@ 2012-03-01 18:10                           ` Alan Cox
  2012-03-01 19:02                           ` Javier Martinez Canillas
  1 sibling, 0 replies; 51+ messages in thread
From: Alan Cox @ 2012-03-01 18:10 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Luiz Augusto von Dentz, Javier Martinez Canillas, Rodrigo Moya,
	David Laight, David Miller, javier, lennart, kay.sievers,
	alban.crequy, bart.cerneels, sjoerd.simons, netdev, linux-kernel

> Right now you can send/receive millions of udp messages per second on
> your linux machine, if you figured out how to avoid process scheduler
> costs. If D-Bus wants more, I highly suggest using shared memory
> instead of passing messages.

Or some rather artful use of BPF ?

Alan


* Re: [PATCH 0/10] af_unix: add multicast and filtering features to AF_UNIX
  2012-03-01 16:02                       ` Luiz Augusto von Dentz
  2012-03-01 17:06                         ` Javier Martinez Canillas
  2012-03-01 17:59                         ` Eric Dumazet
@ 2012-03-01 18:53                         ` David Dillow
  2 siblings, 0 replies; 51+ messages in thread
From: David Dillow @ 2012-03-01 18:53 UTC (permalink / raw)
  To: Luiz Augusto von Dentz
  Cc: Javier Martinez Canillas, Eric Dumazet, Rodrigo Moya,
	David Laight, David Miller, javier, lennart, kay.sievers,
	alban.crequy, bart.cerneels, sjoerd.simons, netdev, linux-kernel

On Thu, 2012-03-01 at 18:02 +0200, Luiz Augusto von Dentz wrote:
> Contrary to someones believes I don't think AF_INET is that fast (e.g.
> http://scottmoonen.com/2008/04/05/a-performance-comparison-of-af_unix-with-loopback-on-linux/)

There has been a huge amount of work on the stack in the four years
since that was written, and even longer since 2.6.18 was considered
current.

Have anything more recent?




* Re: [PATCH 0/10] af_unix: add multicast and filtering features to AF_UNIX
  2012-03-01 17:59                         ` Eric Dumazet
  2012-03-01 18:10                           ` Alan Cox
@ 2012-03-01 19:02                           ` Javier Martinez Canillas
  2012-03-01 19:29                             ` Javier Martinez Canillas
  1 sibling, 1 reply; 51+ messages in thread
From: Javier Martinez Canillas @ 2012-03-01 19:02 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Luiz Augusto von Dentz, Javier Martinez Canillas, Rodrigo Moya,
	David Laight, David Miller, javier, lennart, kay.sievers,
	alban.crequy, bart.cerneels, sjoerd.simons, netdev, linux-kernel

On Thu, Mar 1, 2012 at 6:59 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> Le 1 mars 2012 08:02, Luiz Augusto von Dentz <luiz.dentz@gmail.com> a écrit :
>>
>> Contrary to someones believes I don't think AF_INET is that fast (e.g.
>> http://scottmoonen.com/2008/04/05/a-performance-comparison-of-af_unix-with-loopback-on-linux/)
>>
>
> Oh you mention a recent zork it seems ;)
>
> Are we speaking of performance problems, apart from scheduler problems
> for D-Bus (each message wakeing all receivers, all receivers read and
> drop message but the target) ?
>

Hi Eric,

The only performance problem we are talking about is the scheduling
for D-bus (a context switch to the daemon for each message). With today's
implementation each receiver only gets the messages that were sent to it,
but the D-bus daemon has to be woken up for every message so that it can
do the routing. For multicast messages (i.e. D-bus signals) this is
even worse, since the daemon has to do a send() for each receiver.
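
To make the cost concrete, here is a simplified sketch in C of that
fan-out pattern. It is not dbus-daemon's code: it assumes one datagram
socketpair per client instead of the daemon's real connections, and the
names and the MAX_PEERS limit are made up. The point is only that one
signal turns into one send() system call per receiver.

```c
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

#define MAX_PEERS 16	/* arbitrary limit for the demo */

/* Deliver one message to every peer, one send() per peer.
 * Returns the number of send() calls made, or -1 on error. */
static int fan_out(const int *peer_fds, int npeers,
		   const void *msg, size_t len)
{
	int calls = 0;
	int i;

	for (i = 0; i < npeers; i++) {
		if (send(peer_fds[i], msg, len, 0) != (ssize_t)len)
			return -1;
		calls++;
	}
	return calls;
}

/* Demo: create npeers socketpairs (daemon side, client side) and fan a
 * single message out to all of them. */
static int fan_out_demo(int npeers)
{
	int daemon_side[MAX_PEERS], client_side[MAX_PEERS];
	int created = 0;
	int result = -1;
	int i;

	if (npeers < 0 || npeers > MAX_PEERS)
		return -1;
	for (; created < npeers; created++) {
		int sv[2];

		if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
			goto out;
		daemon_side[created] = sv[0];
		client_side[created] = sv[1];
	}
	result = fan_out(daemon_side, npeers, "signal", 6);
out:
	for (i = 0; i < created; i++) {
		close(daemon_side[i]);
		close(client_side[i]);
	}
	return result;
}
```

With a kernel-side multicast transport the sender would make a single
call and the kernel would do the copying, without waking up the daemon.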

> I am actually one of the few people working to improve performance on
> both AF_INET and AF_UNIX parts. Just take a look at recent commits.
>
> Right now you can send/receive millions of udp messages per second on
> your linux machine, if you figured out how to avoid process scheduler
> costs. If D-Bus wants more, I highly suggest using shared memory
> instead of passing messages.
> --

Yes, I also thought that AF_UNIX would be more efficient than AF_INET,
but I was wrong. Yesterday I wrote some tests using our multicast Unix
sockets, UDP multicast over IP on a single machine, and even multicast
using AF_NETLINK sockets, and got very similar performance results.

The only problem is the ordering and flow control requirements for D-bus.

Best regards,
Javier


* Re: [PATCH 0/10] af_unix: add multicast and filtering features to AF_UNIX
  2012-03-01 19:02                           ` Javier Martinez Canillas
@ 2012-03-01 19:29                             ` Javier Martinez Canillas
  0 siblings, 0 replies; 51+ messages in thread
From: Javier Martinez Canillas @ 2012-03-01 19:29 UTC (permalink / raw)
  To: Javier Martinez Canillas
  Cc: Eric Dumazet, Luiz Augusto von Dentz, Rodrigo Moya, David Laight,
	David Miller, javier, lennart, kay.sievers, alban.crequy,
	bart.cerneels, sjoerd.simons, netdev, linux-kernel

On 03/01/2012 08:02 PM, Javier Martinez Canillas wrote:
> On Thu, Mar 1, 2012 at 6:59 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>> Le 1 mars 2012 08:02, Luiz Augusto von Dentz <luiz.dentz@gmail.com> a écrit :
>>>
>>> Contrary to someones believes I don't think AF_INET is that fast (e.g.
>>> http://scottmoonen.com/2008/04/05/a-performance-comparison-of-af_unix-with-loopback-on-linux/)
>>>
>>
>> Oh you mention a recent zork it seems ;)
>>
>> Are we speaking of performance problems, apart from scheduler problems
>> for D-Bus (each message wakeing all receivers, all receivers read and
>> drop message but the target) ?
>>
> 
> Hi Eric,
> 
> The only performance problem we are talking about is the scheduling
> for D-bus (context switch to the daemon for each message). With today
> implementation the receivers only gets messages that were sent to it
> but the D-bus daemon has to be wake it up for every message to he can
> do the routing. For multicast messages (i.e: D-bus signals) this is
> even worse since the daemon has to do a send() for each receiver.
> 
>> I am actually one of the few people working to improve performance on
>> both AF_INET and AF_UNIX parts. Just take a look at recent commits.
>>
>> Right now you can send/receive millions of udp messages per second on
>> your linux machine, if you figured out how to avoid process scheduler
>> costs. If D-Bus wants more, I highly suggest using shared memory
>> instead of passing messages.
>> --
> 
> Yes, I also thought that AF_UNIX would be more efficient than AF_INET
> but I was wrong. Yesterday I wrote some tests using our multicast unix
> socket, UDP multicast over IP on a single machine and even multicast
> using AF_NETLINK sockets and got very similar performance results.
> 
> The only problem is the ordering and control flow requirements for D-bus.
> 

And the fd passing for out-of-band communication used by some D-bus
applications such as oFono and BlueZ.

Regards,
Javier


* Re: [PATCH 0/10] af_unix: add multicast and filtering features to AF_UNIX
  2012-03-01 11:57           ` Javier Martinez Canillas
  2012-03-01 12:26             ` Eric Dumazet
  2012-03-01 12:57             ` Luiz Augusto von Dentz
@ 2012-03-01 20:42             ` David Miller
  2 siblings, 0 replies; 51+ messages in thread
From: David Miller @ 2012-03-01 20:42 UTC (permalink / raw)
  To: javier.martinez
  Cc: rodrigo.moya, javier, eric.dumazet, lennart, kay.sievers,
	alban.crequy, bart.cerneels, sjoerd.simons, netdev, linux-kernel

From: Javier Martinez Canillas <javier.martinez@collabora.co.uk>
Date: Thu, 01 Mar 2012 12:57:18 +0100

> Does this makes more sense to you?

No, creating an entire new socket family for one user doesn't make
any sense.


* Re: [PATCH 0/10] af_unix: add multicast and filtering features to AF_UNIX
  2012-03-01 12:26             ` Eric Dumazet
  2012-03-01 12:33               ` David Laight
@ 2012-03-01 20:44               ` David Miller
  2012-03-01 22:01                 ` Luiz Augusto von Dentz
  1 sibling, 1 reply; 51+ messages in thread
From: David Miller @ 2012-03-01 20:44 UTC (permalink / raw)
  To: eric.dumazet
  Cc: javier.martinez, rodrigo.moya, javier, lennart, kay.sievers,
	alban.crequy, bart.cerneels, sjoerd.simons, netdev, linux-kernel

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Thu, 01 Mar 2012 04:26:42 -0800

> Why adding an obscure set of IPC mechanism in network tree, and not
> using (maybe extending) traditional IPC (Messages queues, semaphores,
> Shared memory, pipes, futexes, ...).

I actually don't understand why there is so much resistance to using a
real bona fide on-the-wire protocol. That way, if you ever wanted to
connect dbus instances on multiple machines or log dbus transactions
remotely for debugging, you could just do it.


* Re: [PATCH 0/10] af_unix: add multicast and filtering features to AF_UNIX
  2012-03-01 13:56                     ` Javier Martinez Canillas
  2012-03-01 16:00                       ` Eric Dumazet
  2012-03-01 16:02                       ` Luiz Augusto von Dentz
@ 2012-03-01 20:55                       ` David Miller
  2012-03-02  4:40                         ` Stephen Hemminger
  2 siblings, 1 reply; 51+ messages in thread
From: David Miller @ 2012-03-01 20:55 UTC (permalink / raw)
  To: javier.martinez
  Cc: eric.dumazet, rodrigo.moya, David.Laight, javier, lennart,
	kay.sievers, alban.crequy, bart.cerneels, sjoerd.simons, netdev,
	linux-kernel

From: Javier Martinez Canillas <javier.martinez@collabora.co.uk>
Date: Thu, 01 Mar 2012 14:56:11 +0100

> We could use AF_INET multicast on a local machine but we need some
> ordering and control flow requirements that are not guaranteed on UDP
> multicast over IP. That's why we thought to add a new address family
> AF_MCAST.

None of this makes any sense to me.

Unless you have infinite amounts of memory you have to handle packet
drops, and the same things that handle packet drops on a protocol
level can handle out-of-order delivery too.

Stop reinventing the wheel, use facilities that exist already.


* Re: [PATCH 0/10] af_unix: add multicast and filtering features to AF_UNIX
  2012-03-01 20:44               ` David Miller
@ 2012-03-01 22:01                 ` Luiz Augusto von Dentz
  2012-03-01 22:08                   ` David Miller
  0 siblings, 1 reply; 51+ messages in thread
From: Luiz Augusto von Dentz @ 2012-03-01 22:01 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet, javier.martinez, rodrigo.moya, javier, lennart,
	kay.sievers, alban.crequy, bart.cerneels, sjoerd.simons, netdev,
	linux-kernel

Hi David,

On Thu, Mar 1, 2012 at 10:44 PM, David Miller <davem@davemloft.net> wrote:
> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Thu, 01 Mar 2012 04:26:42 -0800
>
>> Why adding an obscure set of IPC mechanism in network tree, and not
>> using (maybe extending) traditional IPC (Messages queues, semaphores,
>> Shared memory, pipes, futexes, ...).
>
> I actually don't understand why there is so much resistance to using a
> real bona fide on-the-wire protocol, and that way if you ever wanted to
> connect dbus instances on multiple machines or log dbus transactions
> remotely for debugging, you could just do it.

I don't think you understood the problem: we want something that scales
to less powerful devices. Why do you think Android went to all the
trouble of creating binder?

Besides, what is really the point of having AF_UNIX if you can't use it
for what it is for?

"The AF_UNIX (also known as AF_LOCAL) socket family is used to
communicate between processes on the same machine efficiently."

-- 
Luiz Augusto von Dentz

* Re: [PATCH 0/10] af_unix: add multicast and filtering features to AF_UNIX
  2012-03-01 22:01                 ` Luiz Augusto von Dentz
@ 2012-03-01 22:08                   ` David Miller
  2012-03-02  8:39                     ` Luiz Augusto von Dentz
  0 siblings, 1 reply; 51+ messages in thread
From: David Miller @ 2012-03-01 22:08 UTC (permalink / raw)
  To: luiz.dentz
  Cc: eric.dumazet, javier.martinez, rodrigo.moya, javier, lennart,
	kay.sievers, alban.crequy, bart.cerneels, sjoerd.simons, netdev,
	linux-kernel

From: Luiz Augusto von Dentz <luiz.dentz@gmail.com>
Date: Fri, 2 Mar 2012 00:01:40 +0200

> I don't think you understood the problem, we want something that scale
> for less powerful devices, why do you think Android have all the
> trouble to create binder?

So our protocol stack is so cpu hungry compared to AF_UNIX that it's
unusable on low power devices?

I can't take you seriously if you say this after showing us the
thousands of lines of code you guys think we should add to the AF_UNIX
socket layer.

> Besides what is really the point in having AF_UNIX if you can't use
> for what it is for?

Because it doesn't have the handful of extra features you absolutely
require of it.

AF_UNIX is a complicated socket layer which is already extremely hard
to maintain.  We're still finding bugs in it even after all these
years, and that's without adding major new functionality.

* Re: [PATCH 0/10] af_unix: add multicast and filtering features to AF_UNIX
  2012-03-01 20:55                       ` David Miller
@ 2012-03-02  4:40                         ` Stephen Hemminger
  0 siblings, 0 replies; 51+ messages in thread
From: Stephen Hemminger @ 2012-03-02  4:40 UTC (permalink / raw)
  To: David Miller
  Cc: javier.martinez, eric.dumazet, rodrigo.moya, David.Laight,
	javier, lennart, kay.sievers, alban.crequy, bart.cerneels,
	sjoerd.simons, netdev, linux-kernel

On Thu, 01 Mar 2012 15:55:05 -0500 (EST)
David Miller <davem@davemloft.net> wrote:

> From: Javier Martinez Canillas <javier.martinez@collabora.co.uk>
> Date: Thu, 01 Mar 2012 14:56:11 +0100
> 
> > We could use AF_INET multicast on a local machine but we need some
> > ordering and control flow requirements that are not guaranteed on UDP
> > multicast over IP. That's why we thought to add a new address family
> > AF_MCAST.
> 
> None of this makes any sense to me.
> 
> Unless you have infinite amounts of memory you have to handle packet
> drops, and the same things that handle packet drops on a protocol
> level can handle out-of-order delivery too.
> 
> Stop reinventing the wheel, use facilities that exist already.

Look at the ZeroMQ library (http://www.zeromq.org/); it seems to be a good
fit for what D-bus wants. And it supports multiple protocols.

* Re: [PATCH 0/10] af_unix: add multicast and filtering features to AF_UNIX
  2012-03-01 22:08                   ` David Miller
@ 2012-03-02  8:39                     ` Luiz Augusto von Dentz
  2012-03-02  8:55                       ` David Miller
  0 siblings, 1 reply; 51+ messages in thread
From: Luiz Augusto von Dentz @ 2012-03-02  8:39 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet, javier.martinez, rodrigo.moya, javier, lennart,
	kay.sievers, alban.crequy, bart.cerneels, sjoerd.simons, netdev,
	linux-kernel

Hi David,

On Fri, Mar 2, 2012 at 12:08 AM, David Miller <davem@davemloft.net> wrote:
> From: Luiz Augusto von Dentz <luiz.dentz@gmail.com>
> Date: Fri, 2 Mar 2012 00:01:40 +0200
>
>> I don't think you understood the problem, we want something that scale
>> for less powerful devices, why do you think Android have all the
>> trouble to create binder?
>
> So our protocol stack is so cpu hungry compared to AF_UNIX that it's
> unusable on low power devices?

I never said unusable; it will drastically increase message latency,
which translates into less responsive applications.

> I can't take you seriously if you say this after showing us the
> thousands of lines of code you guys think we should add to the AF_UNIX
> socket layer.

But what you are suggesting turns dbus-daemon into an IP router just
to do multicast. How many lines of code do you think we are going to
need to implement that? Probably many more than what is being added to
the kernel, and it would not necessarily be useful for anybody else.

As I said before, there are many projects using AF_UNIX as an IPC
transport; the documentation actually encourages people to use it for
this purpose, and many would benefit from being able to do multicast.

Btw, I'm not involved with the implementation and perhaps it needs some
extra work, but IMO the idea is very useful.

>> Besides what is really the point in having AF_UNIX if you can't use
>> for what it is for?
>
> Because it doesn't have the handful of extra features you absolutely
> require of it.

You mean multicast; that is the one and only feature, though with many
implementation details, I agree.

> AF_UNIX is a complicated socket layer which is already extremely hard
> to maintain.  We're still finding bugs in it even after all these
> years, and that's without adding major new functionality.

I understand your concern that this could make things even more unstable,
but on the other hand hacking multicast support into loopback would
also mess with AF_INET, so one way or the other the kernel will
have to be involved.

Also note that AF_UNIX has key features of an efficient IPC, like
the ability to pass an fd to another process with SCM_RIGHTS.
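
A minimal sketch, added for illustration and not taken from the thread, of
fd passing with SCM_RIGHTS. It uses the sendmsg()/recvmsg() wrappers in
Python's socket module; the helper name and the use of a socket pair
inside a single process are assumptions made to keep the example
self-contained.

```python
import array
import socket


def pass_fd_over_unix_socket(fd: int) -> int:
    """Send `fd` over an AF_UNIX socket pair and return the received copy."""
    sender, receiver = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    try:
        # The descriptor travels as ancillary data next to a one-byte payload.
        sender.sendmsg(
            [b"x"],
            [(socket.SOL_SOCKET, socket.SCM_RIGHTS, array.array("i", [fd]))],
        )
        fds = array.array("i")
        _msg, ancdata, _flags, _addr = receiver.recvmsg(
            1, socket.CMSG_LEN(fds.itemsize)
        )
        for level, ctype, data in ancdata:
            if level == socket.SOL_SOCKET and ctype == socket.SCM_RIGHTS:
                # Ignore any trailing bytes that do not form a whole int.
                fds.frombytes(data[: len(data) - (len(data) % fds.itemsize)])
        if not fds:
            raise RuntimeError("no descriptor received")
        return fds[0]
    finally:
        sender.close()
        receiver.close()
```

The receiver gets a new descriptor number that refers to the same open
file, which is what lets two peers hand over a pipe or a socket for
out-of-band transfers.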

-- 
Luiz Augusto von Dentz

* Re: [PATCH 0/10] af_unix: add multicast and filtering features to AF_UNIX
  2012-03-02  8:39                     ` Luiz Augusto von Dentz
@ 2012-03-02  8:55                       ` David Miller
  2012-03-02  9:27                         ` Javier Martinez Canillas
                                           ` (2 more replies)
  0 siblings, 3 replies; 51+ messages in thread
From: David Miller @ 2012-03-02  8:55 UTC (permalink / raw)
  To: luiz.dentz
  Cc: eric.dumazet, javier.martinez, rodrigo.moya, javier, lennart,
	kay.sievers, alban.crequy, bart.cerneels, sjoerd.simons, netdev,
	linux-kernel

From: Luiz Augusto von Dentz <luiz.dentz@gmail.com>
Date: Fri, 2 Mar 2012 10:39:24 +0200

> Like I said before there is many projects using AF_UNIX as IPC
> transport, the documentation actually induces people to use for this
> purpose, and many would benefit from being able to do multicast.

You can't have it both ways.

If it's useful for many applications, then many applications would
benefit from a userland library that solved the problem using
existing facilities such as IP multicast.

If it's only useful for dbus, then that absolutely means we should
not add thousands of lines of code to the kernel specifically for
that application.

So either way, kernel changes are not justified.

* Re: [PATCH 0/10] af_unix: add multicast and filtering features to AF_UNIX
  2012-03-02  8:55                       ` David Miller
@ 2012-03-02  9:27                         ` Javier Martinez Canillas
  2012-03-02  9:39                           ` David Miller
                                             ` (2 more replies)
  2012-03-02 10:08                         ` Luiz Augusto von Dentz
  2012-03-02 22:19                         ` david
  2 siblings, 3 replies; 51+ messages in thread
From: Javier Martinez Canillas @ 2012-03-02  9:27 UTC (permalink / raw)
  To: David Miller, shemminger, ying.xue
  Cc: luiz.dentz, eric.dumazet, rodrigo.moya, javier, lennart,
	kay.sievers, alban.crequy, bart.cerneels, sjoerd.simons, netdev,
	linux-kernel

On 03/02/2012 09:55 AM, David Miller wrote:
> From: Luiz Augusto von Dentz <luiz.dentz@gmail.com>
> Date: Fri, 2 Mar 2012 10:39:24 +0200
> 
>> Like I said before there is many projects using AF_UNIX as IPC
>> transport, the documentation actually induces people to use for this
>> purpose, and many would benefit from being able to do multicast.
> 
> You can't have it both ways.
> 
> If it's useful for many applications, then many applications would
> benefit from a userland library that solved the problem using
> existing facilities such as IP multicast.
> 
> If it's only useful for dbus, then that absolutely means we should
> not add thousands of lines of code to the kernel specifically for
> that application.
> 

You are right that D-bus is the only one that will use it, but D-bus is
more than an application: it is an IPC system that is used by almost
every single application that runs on your Linux desktop.

> So either way, kernel changes are not justified.

Yes, you are right that packet drops, out-of-order delivery and flow
control could be handled in another layer (i.e., the D-bus library in
user-space).

Also, I won't argue about performance, since we did some stress tests and
found that AF_INET, AF_UNIX and AF_NETLINK perform very similarly for
multicast.

> Stop reinventing the wheel, use facilities that exist already.

We are the ones most interested in using a facility already found in the
kernel. We will try ZeroMQ, as Stephen suggested, and TIPC, but so far we
really haven't found an IPC mechanism that fits our needs. The most
important issue right now is fd passing for D-bus applications doing
out-of-band communication.

Another approach that we are trying is to use Netlink sockets using the
Generic Netlink kernel API and develop a kernel module that does the
routing. That way if you don't accept our code at least it will be
easier for us to maintain. Not sure if netlink supports fd passing though.

Do you think that a simpler AF_UNIX multicast implementation, without the
locking that guarantees ordered delivery and without the flow control that
blocks the sender, could be resent to you so that you reconsider merging it?

Regards,
Javier

* Re: [PATCH 0/10] af_unix: add multicast and filtering features to AF_UNIX
  2012-03-02  9:27                         ` Javier Martinez Canillas
@ 2012-03-02  9:39                           ` David Miller
  2012-03-02 13:13                           ` Eric Dumazet
  2012-03-05 18:55                           ` David Lamparter
  2 siblings, 0 replies; 51+ messages in thread
From: David Miller @ 2012-03-02  9:39 UTC (permalink / raw)
  To: javier.martinez
  Cc: shemminger, ying.xue, luiz.dentz, eric.dumazet, rodrigo.moya,
	javier, lennart, kay.sievers, alban.crequy, bart.cerneels,
	sjoerd.simons, netdev, linux-kernel

From: Javier Martinez Canillas <javier.martinez@collabora.co.uk>
Date: Fri, 02 Mar 2012 10:27:16 +0100

> Do you think that a simpler AF_UNIX multicast implementation without the
> locking to guarantee order delivery and the flow control that blocks the
> sender can be resend to you to reconsider merging it?

No.

* Re: [PATCH 0/10] af_unix: add multicast and filtering features to AF_UNIX
  2012-03-02  8:55                       ` David Miller
  2012-03-02  9:27                         ` Javier Martinez Canillas
@ 2012-03-02 10:08                         ` Luiz Augusto von Dentz
  2012-03-03 12:20                           ` Martin Mares
  2012-03-02 22:19                         ` david
  2 siblings, 1 reply; 51+ messages in thread
From: Luiz Augusto von Dentz @ 2012-03-02 10:08 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet, javier.martinez, rodrigo.moya, javier, lennart,
	kay.sievers, alban.crequy, bart.cerneels, sjoerd.simons, netdev,
	linux-kernel

Hi David,

On Fri, Mar 2, 2012 at 10:55 AM, David Miller <davem@davemloft.net> wrote:
> From: Luiz Augusto von Dentz <luiz.dentz@gmail.com>
> Date: Fri, 2 Mar 2012 10:39:24 +0200
>
>> Like I said before there is many projects using AF_UNIX as IPC
>> transport, the documentation actually induces people to use for this
>> purpose, and many would benefit from being able to do multicast.
>
> You can't have it both ways.
>
> If it's useful for many applications, then many applications would
> benefit from a userland library that solved the problem using
> existing facilities such as IP multicast.
>
> If it's only useful for dbus, then that absolutely means we should
> not add thousands of lines of code to the kernel specifically for
> that application.

Instead we should add many times that into dbus-daemon and do IP
multicast, am I missing something?

> So either way, kernel changes are not justified.

I respect your opinion, but I don't agree with it; you are pushing
userspace to a much more complex solution.

At this point it would probably be better to just use shared memory and
forget about any security, eavesdrop all the way.

-- 
Luiz Augusto von Dentz

* Re: [PATCH 0/10] af_unix: add multicast and filtering features to AF_UNIX
  2012-03-02  9:27                         ` Javier Martinez Canillas
  2012-03-02  9:39                           ` David Miller
@ 2012-03-02 13:13                           ` Eric Dumazet
  2012-03-02 16:34                             ` Javier Martinez Canillas
  2012-03-05 18:55                           ` David Lamparter
  2 siblings, 1 reply; 51+ messages in thread
From: Eric Dumazet @ 2012-03-02 13:13 UTC (permalink / raw)
  To: Javier Martinez Canillas
  Cc: David Miller, shemminger, ying.xue, luiz.dentz, rodrigo.moya,
	javier, lennart, kay.sievers, alban.crequy, bart.cerneels,
	sjoerd.simons, netdev, linux-kernel

On Friday 02 March 2012 at 10:27 +0100, Javier Martinez Canillas
wrote:

> We are the most interested in using a facility already found in the
> kernel, we will try ZeroMQ as Stephen suggested and TIPC but really
> didn't find an IPC mechanism that fits our needs. The most important
> issue right now is the fd passing for D-bus application doing
> out-of-band communication.

Why on earth should the D-Bus IPC need to use a single kernel mechanism?

I mean, of course AF_INET cannot pass fds around and never will.
Of course AF_UNIX cannot use multicast and never will.
Of course shared memory won't pass fds around and never will.
... Add other impossible combinations as you want.

There are reasons fd passing is hard to implement. I find stuffing this
functionality in AF_UNIX was a bad design choice from the very
beginning.

Instead of pushing extra complexity to a single kernel component, why
not try to use a combination of existing, well-designed and supported
ones?

* Re: [PATCH 0/10] af_unix: add multicast and filtering features to AF_UNIX
  2012-03-02 13:13                           ` Eric Dumazet
@ 2012-03-02 16:34                             ` Javier Martinez Canillas
  2012-03-02 17:08                               ` Alan Cox
  0 siblings, 1 reply; 51+ messages in thread
From: Javier Martinez Canillas @ 2012-03-02 16:34 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David Miller, shemminger, ying.xue, luiz.dentz, rodrigo.moya,
	javier, lennart, kay.sievers, alban.crequy, bart.cerneels,
	sjoerd.simons, netdev, linux-kernel, Marcel Holtmann

On 03/02/2012 02:13 PM, Eric Dumazet wrote:
> On Friday 02 March 2012 at 10:27 +0100, Javier Martinez Canillas
> wrote:
> 
>> We are the most interested in using a facility already found in the
>> kernel, we will try ZeroMQ as Stephen suggested and TIPC but really
>> didn't find an IPC mechanism that fits our needs. The most important
>> issue right now is the fd passing for D-bus application doing
>> out-of-band communication.
> 
> Why on earth the needed D-Bus IPC should use a single kernel mechanism ?
> 
> I mean, of course AF_INET cannot pass fd around and never will.
> Of course AF_UNIX cannot use multicast and never will.
> Of course shared memory won't pass fds around and never will.
> ... Add other impossible combinations as you want.
> 
> There are reasons fd passing is hard to implement. I find stuffing this
> functionality in AF_UNIX was a bad design choice from the very
> beginning.
> 

Yes, I can't say that everyone is happy with fd passing. IMHO it looks
like a workaround, since D-bus didn't scale to big chunks of data.

> Instead of pushing extra complexity to a single kernel component, why
> not trying to use a combination of existing, well designed and supported
> ones ?
> 

You are right, maybe a combination of IPC mechanisms could be used.

Basically we have this scenario:

1- Most applications today use D-bus as an IPC system, and it is a central
part of the Linux desktop.

2- The transport layer used by D-bus does not perform well, basically due
to:

a) the high number of context switches required to send messages between
peers.
b) the D-bus daemon doing the routing and being a bottleneck for the whole
system.
c) the number of messages copied between kernel space and user space.

3- We still haven't found a single kernel IPC mechanism or a combination
of IPC mechanisms that can address this issue.

This is a real concern in the embedded Linux world, since Linux-based
products want to use well-proven software components found in Linux
distros, such as oFono, BlueZ, Pulseaudio, Connman and Telepathy, to name
a few. All of them use D-bus to expose their APIs to other applications.

I'm not saying that extending AF_UNIX for supporting multicast is the
best approach but what I'm saying is that we should find a solution to
this problem.

PS: I'm cc'ing Marcel Holtmann so hopefully he can add his point of view
on the problem (and possible solutions).

I know that Marcel is also working on improving the D-bus system by
moving into the kernel some tasks done by the D-bus daemon today.

Regards,
Javier

* Re: [PATCH 0/10] af_unix: add multicast and filtering features to AF_UNIX
  2012-03-02 16:34                             ` Javier Martinez Canillas
@ 2012-03-02 17:08                               ` Alan Cox
  2012-03-05  8:38                                 ` Luiz Augusto von Dentz
  0 siblings, 1 reply; 51+ messages in thread
From: Alan Cox @ 2012-03-02 17:08 UTC (permalink / raw)
  To: Javier Martinez Canillas
  Cc: Eric Dumazet, David Miller, shemminger, ying.xue, luiz.dentz,
	rodrigo.moya, javier, lennart, kay.sievers, alban.crequy,
	bart.cerneels, sjoerd.simons, netdev, linux-kernel,
	Marcel Holtmann

> 2- The transport layer used by D-bus is not performance sensitive
> basically due:
> 
> a) high number of context switches required to send messages between peer.

This is a user space design issue. The fact dbus wakes up so much
stuff wants fixing at the dbus level.

> b) the D-bus daemon doing the routing and being a bottleneck of the whole.

This is a userspace design issue.

> c) amount of messages copied between kernel space and user space.

This is mostly a userspace design issue and fixing (a) would fix much of
(c) because you wouldn't keep sending people crap they didn't need.

You've already got multicast facilities in kernel (if dbus must work by
shouting not state change subscription like saner setups), and you've got
BPF filtering facilities to try and cure some of the wakeups even doing
multicast.

Beyond that I don't see what the kernel can do given it's mostly an
architectural problem.

Your model appears to be "since it's causing enormous amounts of work we
should do the work faster". The right model would appear to me to be "We
shouldn't cause enormous amounts of work".

Alan

* Re: [PATCH 0/10] af_unix: add multicast and filtering features to AF_UNIX
  2012-03-02  8:55                       ` David Miller
  2012-03-02  9:27                         ` Javier Martinez Canillas
  2012-03-02 10:08                         ` Luiz Augusto von Dentz
@ 2012-03-02 22:19                         ` david
  2 siblings, 0 replies; 51+ messages in thread
From: david @ 2012-03-02 22:19 UTC (permalink / raw)
  To: David Miller
  Cc: luiz.dentz, eric.dumazet, javier.martinez, rodrigo.moya, javier,
	lennart, kay.sievers, alban.crequy, bart.cerneels, sjoerd.simons,
	netdev, linux-kernel

On Fri, 2 Mar 2012, David Miller wrote:

> From: Luiz Augusto von Dentz <luiz.dentz@gmail.com>
> Date: Fri, 2 Mar 2012 10:39:24 +0200
>
>> Like I said before there is many projects using AF_UNIX as IPC
>> transport, the documentation actually induces people to use for this
>> purpose, and many would benefit from being able to do multicast.
>
> You can't have it both ways.
>
> If it's useful for many applications, then many applications would
> benefit from a userland library that solved the problem using
> existing facilities such as IP multicast.

I missed the start of this discussion (but did see the lwn.net article on
it).

As I understand it, they are looking for some features that are not in IP
multicast (or at least not as I understand it):

1. reliable delivery

2. in-order delivery

3. sender blocking on recipients rather than dropping messages when the 
channel is full.

IP multicast definitely does not do #3 and, as far as I understand it, is
essentially UDP to multiple recipients, and UDP does not provide either #1
or #2.
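
A minimal sketch, added for illustration and not taken from the thread, of
what #3 looks like on an ordinary (unicast) AF_UNIX datagram socket pair
today. It assumes Linux behaviour, where a sender whose peer has a full
receive queue gets EAGAIN in non-blocking mode and would sleep in send()
in blocking mode; the function name and the iteration limit are
assumptions for the example.

```python
import errno
import socket
from typing import Tuple


def fill_until_full(limit: int = 100_000) -> Tuple[int, int, bool]:
    """Queue datagrams until the kernel pushes back, then drain them.

    Returns (queued, drained, hit_full).
    """
    sender, receiver = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    sender.setblocking(False)
    receiver.setblocking(False)
    queued = drained = 0
    hit_full = False
    try:
        while queued < limit:
            try:
                sender.send(b"x")
            except OSError as exc:
                if exc.errno not in (errno.EAGAIN, errno.ENOBUFS):
                    raise
                hit_full = True  # a blocking sender would wait here instead
                break
            queued += 1
        while True:
            try:
                receiver.recv(16)
            except BlockingIOError:
                break  # receive queue is empty again
            drained += 1
        return queued, drained, hit_full
    finally:
        sender.close()
        receiver.close()
```

Every datagram that was accepted is still there when the receiver drains
the queue: the sender is held back rather than the message being dropped.
The patch set asks for the same back-pressure with more than one
recipient.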

Yes, this could be done entirely in userspace (with something like 0MQ as
I see others mentioning), and I don't understand the Android aversion to
any userspace daemons, but with all of that being said, I do think that a
kernel-based mechanism that supports having iptables type filters on it
would be a very nice thing to have (and should be able to re-use a lot of
existing code that would end up being duplicated if this is done in a
userspace daemon).

Now, it may be that some of the requirements result in error O_PONY or
O_SANITY (the sender blocking seems like a potential problem, but that
may possibly make sense as a configurable option).

David Lang

* Re: [PATCH 0/10] af_unix: add multicast and filtering features to AF_UNIX
  2012-03-02 10:08                         ` Luiz Augusto von Dentz
@ 2012-03-03 12:20                           ` Martin Mares
  0 siblings, 0 replies; 51+ messages in thread
From: Martin Mares @ 2012-03-03 12:20 UTC (permalink / raw)
  To: Luiz Augusto von Dentz
  Cc: David Miller, eric.dumazet, javier.martinez, rodrigo.moya,
	javier, lennart, kay.sievers, alban.crequy, bart.cerneels,
	sjoerd.simons, netdev, linux-kernel

Hello!

> Instead we should add many times that into dbus-daemon and do IP
> multicast, am I missing something?

I completely agree with Alan that if routing all messages through
DBUS daemon is a bottleneck, then something is seriously wrong with
the way the applications use the message bus.

Also, you mentioned the need of passing fd's between applications.
If I understand correctly, it is a rare case and if you handle such
messages in the same way as before, it won't hurt performance.

				Have a nice fortnight
-- 
Martin `MJ' Mares                          <mj@ucw.cz>   http://mj.ucw.cz/
Faculty of Math and Physics, Charles University, Prague, Czech Rep., Earth
Even nostalgia isn't what it used to be.

* Re: [PATCH 0/10] af_unix: add multicast and filtering features to AF_UNIX
  2012-03-02 17:08                               ` Alan Cox
@ 2012-03-05  8:38                                 ` Luiz Augusto von Dentz
  2012-03-05 14:05                                   ` Martin Mares
  0 siblings, 1 reply; 51+ messages in thread
From: Luiz Augusto von Dentz @ 2012-03-05  8:38 UTC (permalink / raw)
  To: Alan Cox
  Cc: Javier Martinez Canillas, Eric Dumazet, David Miller, shemminger,
	ying.xue, rodrigo.moya, javier, lennart, kay.sievers,
	alban.crequy, bart.cerneels, sjoerd.simons, netdev, linux-kernel,
	Marcel Holtmann

Hi Alan,

On Fri, Mar 2, 2012 at 7:08 PM, Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:
>> 2- The transport layer used by D-bus is not performance sensitive
>> basically due:
>>
>> a) high number of context switches required to send messages between peer.
>
> This is a user space design issue. The fact dbus wakes up so much
> stuff wants fixing at the dbus level.

Can you be more specific? Afaik, centralizing the message subscriptions
in the daemon minimizes the wakeups of the applications. On the other
hand, BPF might be a better solution for filtering the packets, but it is
more recent than D-Bus itself. If you have a suggestion for a better
design, could you please let us know?

>> b) the D-bus daemon doing the routing and being a bottleneck of the whole.
>
> This is a userspace design issue.

But do you think letting each client manage its own connections to every
other client it talks to would have been better? The number of fds per
client would have skyrocketed.

>> c) amount of messages copied between kernel space and user space.
>
> This is mostly a userspace design issue and fixing (a) would fix much of (c)
> because you wouldn't keep sending people crap they didn't need.

Afaik this is not a problem in D-Bus, except perhaps if you have
eavesdropping enabled, but that is your configuration. A client only gets
the signals it subscribes to and the messages addressed to its connection
(method calls). That doesn't mean there aren't badly implemented clients
that subscribe to everything, which translates into more data being
copied and more wakeups, but that is not D-Bus's fault, and even with BPF
a client can do that too.

> You've already got multicast facilities in kernel (if dbus must work by
> shouting not state change subscription like saner setups), and you've got
> BPF filtering facilities to try and cure some of the wakeups even doing
> multicast.

Isn't that what this is all about? The problem seems to be that with
BPF alone it would not be possible to implement multicast without
sacrificing security; at the least, method call and reply messages should
be private to the peers involved unless eavesdropping is enabled.

Btw this was posted in detail here:
http://blogs.gnome.org/rodrigo/2012/02/27/d-bus-optimizations/

> Beyond that I don't see what the kernel can do given it's mostly an
> architectural problem.
>
> Your model appears to be "since it's causing enormous amounts of work we
> should do the work faster". The right model would appear to me to be "We
> shouldn't cause enormous amounts of work"

Please check the link above and tell me whether that is different from
the model you suggested using BPF. Apparently we are talking about the
very same solution, but the implementation details are getting in the
way because a lot of code was added.

-- 
Luiz Augusto von Dentz

* Re: [PATCH 0/10] af_unix: add multicast and filtering features to AF_UNIX
  2012-03-05  8:38                                 ` Luiz Augusto von Dentz
@ 2012-03-05 14:05                                   ` Martin Mares
  2012-03-05 15:11                                     ` Javier Martinez Canillas
  0 siblings, 1 reply; 51+ messages in thread
From: Martin Mares @ 2012-03-05 14:05 UTC (permalink / raw)
  To: Luiz Augusto von Dentz
  Cc: Alan Cox, Javier Martinez Canillas, Eric Dumazet, David Miller,
	shemminger, ying.xue, rodrigo.moya, javier, lennart, kay.sievers,
	alban.crequy, bart.cerneels, sjoerd.simons, netdev, linux-kernel,
	Marcel Holtmann

Hi!

> Please check the link above and tell me if that different than the
> model you suggested using BPF, apparently we are talking about the
> very same solution but the implementation detail are getting in the
> way because a lot of code was added.

...

First of all, you should come up with some real data confirming that
the problem you are trying to solve really exists -- i.e., that in some
real (and sensible) setup, routing all messages through DBUS daemon
is a bottleneck.

				Have a nice fortnight
-- 
Martin `MJ' Mares                          <mj@ucw.cz>   http://mj.ucw.cz/
Faculty of Math and Physics, Charles University, Prague, Czech Rep., Earth
More memory available, but not for you!

* Re: [PATCH 0/10] af_unix: add multicast and filtering features to AF_UNIX
  2012-03-05 14:05                                   ` Martin Mares
@ 2012-03-05 15:11                                     ` Javier Martinez Canillas
  2012-03-05 15:49                                       ` Martin Mares
  0 siblings, 1 reply; 51+ messages in thread
From: Javier Martinez Canillas @ 2012-03-05 15:11 UTC (permalink / raw)
  To: Martin Mares
  Cc: Luiz Augusto von Dentz, Alan Cox, Eric Dumazet, David Miller,
	shemminger, ying.xue, rodrigo.moya, javier, lennart, kay.sievers,
	alban.crequy, bart.cerneels, sjoerd.simons, netdev, linux-kernel,
	Marcel Holtmann

On 03/05/2012 03:05 PM, Martin Mares wrote:
> Hi!
> 
>> Please check the link above and tell me if that different than the
>> model you suggested using BPF, apparently we are talking about the
>> very same solution but the implementation detail are getting in the
>> way because a lot of code was added.
> 
> ...
> 
> First of all, you should come up with some real data confirming that
> the problem you are trying to solve really exist -- i.e., that in some
> real (and sensible) setup, routing all messages through DBUS daemon
> is a bottleneck.
> 
> 				Have a nice fortnight

We still don't have performance numbers for D-bus using AF_UNIX
multicast since our D-bus daemon branch is still not stable. But Alban
did some tests for the first approach (creating a new socket address
family AF_DBUS) and the performance gain was x1.8 for KVM/i386 and x3
for N900/ARM.

Alban's blog entry can be found here:
http://alban-apinc.blogspot.com/2011/12/d-bus-in-kernel-faster.html

Yes, D-bus has many architectural flaws that have to be addressed. The
ordered delivery requirement may not even be important in the real
world, and flow control is something that we can probably fix in
user-space too. That every message has to pass through the D-bus daemon
is something that can also be fixed without requiring any kernel
modification.

But there is one problem that we can't solve without Linux kernel
support: the fact that multicast messages have to be sent directly to
the receivers.

The problem is that Linux lacks an easy IPC mechanism for sending
multicast messages to processes on the same machine. We can use UDP
multicast over IP, but even though the sending/receiving performance is
similar to our AF_UNIX multicast implementation, the connection setup is
much more complex.
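
A minimal sketch, added for illustration and not taken from the thread, of
the setup steps UDP multicast needs when both ends are on the same
machine. The group address, port number and function names are
hypothetical, and whether delivery over the loopback interface actually
works depends on the host's interface and routing configuration.

```python
import socket

GROUP = "239.255.0.1"   # hypothetical administratively scoped group
PORT = 50000            # hypothetical port shared by all peers
LOOPBACK = "127.0.0.1"


def membership_request(group: str, interface: str) -> bytes:
    """Pack a struct ip_mreq: group address, then local interface address."""
    return socket.inet_aton(group) + socket.inet_aton(interface)


def open_receiver(group: str = GROUP, port: int = PORT,
                  interface: str = LOOPBACK) -> socket.socket:
    """Create a UDP socket that has joined `group` on `interface`."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    # Several receivers on one host must be able to share the port.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(group, interface))
    return sock


def open_sender(interface: str = LOOPBACK) -> socket.socket:
    """Create a UDP socket whose multicast traffic stays on `interface`."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_IF,
                    socket.inet_aton(interface))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 1)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_LOOP, 1)
    return sock
```

A peer would call open_receiver() once, and a sender would then use
open_sender().sendto(payload, (GROUP, PORT)). Each side needs a group
address, a port and three or four socket options agreed on in advance,
which is the setup cost being compared with the proposed AF_UNIX groups.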

We will investigate whether we can use Netlink sockets as a multicast IPC
mechanism, even though it is designed for the kernel-space/user-space use
case and is not well suited to user-space/user-space communication.

Best regards,
Javier

* Re: [PATCH 0/10] af_unix: add multicast and filtering features to AF_UNIX
  2012-03-05 15:11                                     ` Javier Martinez Canillas
@ 2012-03-05 15:49                                       ` Martin Mares
  0 siblings, 0 replies; 51+ messages in thread
From: Martin Mares @ 2012-03-05 15:49 UTC (permalink / raw)
  To: Javier Martinez Canillas
  Cc: Luiz Augusto von Dentz, Alan Cox, Eric Dumazet, David Miller,
	shemminger, ying.xue, rodrigo.moya, javier, lennart, kay.sievers,
	alban.crequy, bart.cerneels, sjoerd.simons, netdev, linux-kernel,
	Marcel Holtmann

Hello!

> We still don't have performance numbers for D-bus using AF_UNIX
> multicast since our D-bus daemon branch is still not stable. But Alban
> did some tests for the first approach (creating a new socket address
> family AF_DBUS) and the performance gain was x1.8 for KVM/i386 and x3
> for N900/ARM.

I did not ask for the performance improvement in artificial benchmarks;
they will obviously show some :)

What I am interested in is a test showing that _in_real_world_, the system
spends a considerable amount of time passing messages. That is, a reason
for optimizing the thing at all.

				Have a nice fortnight
-- 
Martin `MJ' Mares                          <mj@ucw.cz>   http://mj.ucw.cz/
Faculty of Math and Physics, Charles University, Prague, Czech Rep., Earth
American patent law: two monkeys, fourteen days.

* Re: [PATCH 0/10] af_unix: add multicast and filtering features to AF_UNIX
  2012-03-02  9:27                         ` Javier Martinez Canillas
  2012-03-02  9:39                           ` David Miller
  2012-03-02 13:13                           ` Eric Dumazet
@ 2012-03-05 18:55                           ` David Lamparter
  2 siblings, 0 replies; 51+ messages in thread
From: David Lamparter @ 2012-03-05 18:55 UTC (permalink / raw)
  To: Javier Martinez Canillas
  Cc: David Miller, shemminger, ying.xue, luiz.dentz, eric.dumazet,
	rodrigo.moya, javier, lennart, kay.sievers, alban.crequy,
	bart.cerneels, sjoerd.simons, netdev, linux-kernel

On Fri, Mar 02, 2012 at 10:27:16AM +0100, Javier Martinez Canillas wrote:
> Do you think that a simpler AF_UNIX multicast implementation without the
> locking to guarantee order delivery and the flow control that blocks the
> sender can be resend to you to reconsider merging it?

I still don't get how blocking the sender when the receiver doesn't
empty his socket queue can possibly ever be a good idea. All I see is a
very nice way to choke the entire D-Bus from one malicious or broken
app.

Note that originally we were talking about blocking delivery for
_multicast_. In that case you can't even poll for writability at a
granularity finer than group level.

Yet, this still comes up here and there as a requirement for IPC
mechanisms to back D-Bus.

When the buffers at the receiver are fully filled, IMHO that's the point
to cut off the client. If this becomes an issue, the buffers can be
increased in size, but at some point it's a sign that you're using D-Bus
for too much?
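
As a rough illustration of the policy argued for above (cut off a
receiver whose queue is full instead of blocking the sender), here is a
minimal user-space sketch in Python. It uses ordinary AF_UNIX datagram
socket pairs, not the multicast patches, and the names deliver and
run_demo are hypothetical.

```python
import errno
import socket

# errno values that mean "the receiver's queue is full" on a
# non-blocking datagram send.
QUEUE_FULL = (errno.EAGAIN, errno.EWOULDBLOCK, errno.ENOBUFS)


def deliver(message: bytes, receivers: dict) -> list:
    """Send message to every receiver without ever blocking.

    receivers maps a name to the sender's end of a non-blocking datagram
    socket. A receiver whose queue is full is closed and removed.
    Returns the names of the receivers that were cut off.
    """
    dropped = []
    for name, sock in list(receivers.items()):
        try:
            sock.send(message)
        except OSError as exc:
            if exc.errno not in QUEUE_FULL:
                raise
            # The peer is not draining its queue: disconnect it rather
            # than stall delivery to everybody else.
            sock.close()
            del receivers[name]
            dropped.append(name)
    return dropped


def run_demo(max_rounds: int = 100000) -> list:
    """One peer reads its messages, the other never does."""
    receivers, peers = {}, {}
    for name in ("good", "stuck"):
        ours, theirs = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
        ours.setblocking(False)
        receivers[name] = ours
        peers[name] = theirs
    cut = []
    for _ in range(max_rounds):
        cut = deliver(b"x" * 512, receivers)
        if cut:
            break
        peers["good"].recv(4096)  # the well-behaved peer drains its queue
    for sock in list(receivers.values()) + list(peers.values()):
        sock.close()
    return cut
```

In run_demo the peer that keeps reading is never affected, while the
peer that stops reading is disconnected once its queue fills up. The
sender never waits on either of them.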


-David

Thread overview: 51+ messages
2012-02-20 15:57 [PATCH 0/10] af_unix: add multicast and filtering features to AF_UNIX Javier Martinez Canillas
2012-02-20 15:57 ` [PATCH 01/10] af_unix: Documentation on multicast unix sockets Javier Martinez Canillas
2012-02-20 15:57 ` [PATCH 02/10] af_unix: Add constant for unix socket options level Javier Martinez Canillas
2012-02-20 15:57 ` [PATCH 03/10] af_unix: add setsockopt on unix sockets Javier Martinez Canillas
2012-02-20 16:20   ` David Miller
2012-02-20 19:13 ` [PATCH 0/10] af_unix: add multicast and filtering features to AF_UNIX Colin Walters
2012-02-21  8:07   ` Rodrigo Moya
2012-02-24 20:36 ` David Miller
2012-02-27 14:00   ` Javier Martinez Canillas
2012-02-27 19:05     ` David Miller
2012-02-28 10:47       ` Rodrigo Moya
2012-02-28 14:28         ` David Lamparter
2012-02-28 15:24           ` Javier Martinez Canillas
2012-02-28 16:33             ` Javier Martinez Canillas
2012-02-28 19:05         ` David Miller
2012-03-01 11:57           ` Javier Martinez Canillas
2012-03-01 12:26             ` Eric Dumazet
2012-03-01 12:33               ` David Laight
2012-03-01 12:50                 ` Rodrigo Moya
2012-03-01 12:59                   ` Eric Dumazet
2012-03-01 13:56                     ` Javier Martinez Canillas
2012-03-01 16:00                       ` Eric Dumazet
2012-03-01 16:02                       ` Luiz Augusto von Dentz
2012-03-01 17:06                         ` Javier Martinez Canillas
2012-03-01 17:59                         ` Eric Dumazet
2012-03-01 18:10                           ` Alan Cox
2012-03-01 19:02                           ` Javier Martinez Canillas
2012-03-01 19:29                             ` Javier Martinez Canillas
2012-03-01 18:53                         ` David Dillow
2012-03-01 20:55                       ` David Miller
2012-03-02  4:40                         ` Stephen Hemminger
2012-03-01 20:44               ` David Miller
2012-03-01 22:01                 ` Luiz Augusto von Dentz
2012-03-01 22:08                   ` David Miller
2012-03-02  8:39                     ` Luiz Augusto von Dentz
2012-03-02  8:55                       ` David Miller
2012-03-02  9:27                         ` Javier Martinez Canillas
2012-03-02  9:39                           ` David Miller
2012-03-02 13:13                           ` Eric Dumazet
2012-03-02 16:34                             ` Javier Martinez Canillas
2012-03-02 17:08                               ` Alan Cox
2012-03-05  8:38                                 ` Luiz Augusto von Dentz
2012-03-05 14:05                                   ` Martin Mares
2012-03-05 15:11                                     ` Javier Martinez Canillas
2012-03-05 15:49                                       ` Martin Mares
2012-03-05 18:55                           ` David Lamparter
2012-03-02 10:08                         ` Luiz Augusto von Dentz
2012-03-03 12:20                           ` Martin Mares
2012-03-02 22:19                         ` david
2012-03-01 12:57             ` Luiz Augusto von Dentz
2012-03-01 20:42             ` David Miller
