* Why not make kdbus use CUSE?
@ 2014-11-29  6:34 ` Richard Yao
  0 siblings, 0 replies; 22+ messages in thread
From: Richard Yao @ 2014-11-29  6:34 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, linux-api

I had the opportunity at LinuxCon Europe to chat with Greg and some other kdbus
developers. A few things stood out from our conversation that I thought I would
bring to the list for discussion.

The first is that I asked them why we need to add yet another IPC mechanism (and
quite possibly another exploit target) to the kernel. Apparently, they want to
use dbus to do multicast and the existing dbus software is not performant enough.
There was not much discussion of why the existing network stack is not usable
for this, but I was not terribly concerned about it, so the remainder of our
discussion focused on compatibility.

They regard a userland compatibility shim in the systemd repository as providing
backward compatibility for applications. Unfortunately, this is insufficient to
ensure compatibility because dependency trees have multiple levels. If cross
platform package A depends on cross platform library B, which depends on dbus,
and cross platform library B decides to switch to kdbus, then it ceases to be
cross platform and cross platform package A is now dependent on Linux kernels
with kdbus. Not only does that affect other POSIX systems, but it also affects
LTS versions of Linux.
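
As a rough illustration of the transitive-dependency argument above, here is a
minimal Python sketch. The graph, the package names (package_a, library_b) and
the KERNEL_ONLY set are hypothetical stand-ins, not real packages; only the
reachability logic matters.

```python
# Hypothetical dependency graph; the names are illustrative only.
DEPENDS = {
    "package_a": {"library_b"},
    "library_b": {"dbus"},
    "dbus": set(),
    "kdbus": set(),
}

# Assumption: anything in this set only works on kernels that ship kdbus.
KERNEL_ONLY = {"kdbus"}


def transitive_deps(graph, root):
    """Return every node reachable from root, excluding root itself."""
    seen = set()
    stack = list(graph[root])
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, ()))
    return seen


def is_portable(graph, root):
    """A package is portable if nothing it pulls in is kernel-specific."""
    return not (transitive_deps(graph, root) & KERNEL_ONLY)


before = is_portable(DEPENDS, "package_a")

# Library B switches from dbus to kdbus; package A itself is unchanged.
switched = dict(DEPENDS, library_b={"kdbus"})
after = is_portable(switched, "package_a")

print(before, after)  # True False
```

Package A's own dependency list never changes in this sketch; it stops being
portable only because library B did.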

It is somewhat tempting to think that being in the kernel is necessary for
performance, but this does not appear to be true from my discussion with Greg and
others. Specifically, a key advantage of being in the kernel is a reduction in
context switches and consequently, one would expect programs using the old API
to benefit, but they were quite clear to me that programs using the old API do
not benefit. At the same time, we had a similar situation where people thought
that the httpd server had to be inside the kernel until Linux 2.6, when our
userland APIs improved to the point where we were able to get similar if not
better performance in userland compared to the implementation of khttpd in Linux
2.4.y.
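
The context-switch argument can be pictured with a toy model. The following
single-process Python sketch relays one message through a stand-in "daemon" to
several receivers over Unix datagram socket pairs. It is a sketch under heavy
assumptions: it is not how dbus-daemon or kdbus is implemented, the function
names are made up for illustration, and it counts message transfers rather than
real context switches, since everything runs in one process.

```python
import socket


def make_pair():
    # AF_UNIX datagram pair: message boundaries are preserved.
    return socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)


def broadcast_via_daemon(payload, n_receivers):
    """Relay payload through a stand-in daemon; return (received, transfers)."""
    sender, daemon_in = make_pair()
    links = [make_pair() for _ in range(n_receivers)]
    transfers = 0
    try:
        sender.send(payload)              # sender -> daemon
        transfers += 1
        msg = daemon_in.recv(4096)
        received = []
        for daemon_out, receiver in links:
            daemon_out.send(msg)          # daemon -> each receiver
            transfers += 1
            received.append(receiver.recv(4096))
        return received, transfers
    finally:
        # Close every socket created above.
        for s in [sender, daemon_in] + [s for pair in links for s in pair]:
            s.close()


received, transfers = broadcast_via_daemon(b"signal", 3)
print(len(received), transfers)
```

With three receivers the relay performs four transfers, one more than direct
delivery would need; in a real system each transfer through a separate daemon
process also implies scheduling that process.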

Putting daemons in the kernel is always more performant than putting daemons
into userland, but it has the drawback of violating the principle of least
privilege. When code is in userland, we can apply security mechanisms to it via
things like SELinux and seccomp to limit the damage caused by compromise.  With
an in-kernel component, there is no way of doing that. One might be tempted to
think that controlling the IPC mechanism is as good as controlling the system,
but this is not true when we consider things like lxc, where compromise of dbus
in a container does not give full control over the system.

I started to think that we probably ought to design a way to put kdbus into
userland and then I realized that we already have one in the form of CUSE. This
would not only make kdbus play nicely with SELinux and lxc, but also with other
POSIX systems that currently share dbus with Linux systems, which includes older
Linux kernels. Greg claimed that the kdbus code was fairly self-contained and
was just a character device, so I assume this is possible and I am curious why
it is not done.

I should probably mention one other thing that I recall from my discussion with
Greg and others, which is that the systemd project wants to depend on it. The
nature of controlling pid 1 means that systemd is more than capable of starting
dbus before anything that needs it and that includes its own components (aside
from its pid 1). The systemd project wanting the API is not a valid reason for
why it should be in the kernel, although it could be a reason to make a CUSE
version go into systemd's pid 1.

That said, why not make kdbus use CUSE?

P.S. I also mentioned my concern that having the shim in the systemd repository
would have a negative effect on distributions that use alternative libc libraries
because the systemd developers refuse to support alternative libc libraries. I
mentioned this to one of the people to whom Greg introduced me (and whose name
escapes me) as we were walking to Michael Kerrisk's talk on API design. I was
told quite plainly that such distributions are not worth consideration. If kdbus
is merged despite concerns about security and backward compatibility, could we
at least have the shim moved to a libc-neutral place, like Linus' tree?

* Re: Why not make kdbus use CUSE?
@ 2014-11-29 17:59   ` Greg Kroah-Hartman
  0 siblings, 0 replies; 22+ messages in thread
From: Greg Kroah-Hartman @ 2014-11-29 17:59 UTC (permalink / raw)
  To: Richard Yao; +Cc: linux-kernel, linux-api

On Sat, Nov 29, 2014 at 06:34:16AM +0000, Richard Yao wrote:
> I had the opportunity at LinuxCon Europe to chat with Greg and some other kdbus
> developers. A few things stood out from our conversation that I thought I would
> bring to the list for discussion.

Any reason why you didn't respond to the kdbus patches themselves?
Critiquing the specific code is much better than random discussions.

> They regard a userland compatibility shim in the systemd repository as providing
> backward compatibility for applications. Unfortunately, this is insufficient to
> ensure compatibility because dependency trees have multiple levels. If cross
> platform package A depends on cross platform library B, which depends on dbus,
> and cross platform library B decides to switch to kdbus, then it ceases to be
> cross platform and cross platform package A is now dependent on Linux kernels
> with kdbus. Not only does that affect other POSIX systems, but it also affects
> LTS versions of Linux.

What do LTS versions have to do with this?  And what specific
dependencies are you worried about?

> It is somewhat tempting to think that being in the kernel is necessary for
> performance, this does not appear to be true from my discussion with Greg and
> others. In specific, a key advantage of being in the kernel is a reduction in
> context switches and consequently, one would expect programs using the old API
> to benefit, but they were quite clear to me that programs using the old API do
> not benefit. At the same time, we had a similar situation where people thought
> that the httpd server had to be inside the kernel until Linux 2.6, when our
> userland APIs improved to the point where we were able to get similar if not
> better performance in userland compared to the implementation of khttpd in Linux
> 2.4.y.

Again, please see the kernel patches for lots of detail as to why this
should be in the kernel.  If you disagree with the specific statements I
have listed there, please respond with specifics.

> I started to think that we probably ought to design a way to put kdbus into
> userland and then I realized that we already have one in the form of CUSE. This
> would not only make kdbus play nicely with SELinux and lxc, but also with other
> POSIX systems that currently share dbus with Linux systems, which includes older
> Linux kernels. Greg claimed that the kdbus code was fairly self-contained and
> was just a character device, so I assume this is possible and I am curious why
> it is not done.

The latest version is a filesystem not a character device, your
information is out of date :)

> P.S. I also mentioned my concern that having the shim in the systemd repository
> would have a negative effect on distributions that use alternative libc libraries
> because the systemd developers refuse to support alternative libc libraries. I
> mentioned this to one of the people to whom Greg introduced me (and whose name
> escapes me) as we were walking to Michael Kerrisk's talk on API design. I was
> told quite plainly that such distributions are not worth consideration. If kdbus
> is merged despite concerns about security and backward compatibility, could we
> at least have the shim moved to a libc-neutral place, like Linus' tree?

Take that up on the systemd mailing list, it's not a kernel issue.

greg k-h

* Re: Why not make kdbus use CUSE?
@ 2014-12-01 14:23   ` One Thousand Gnomes
  0 siblings, 0 replies; 22+ messages in thread
From: One Thousand Gnomes @ 2014-12-01 14:23 UTC (permalink / raw)
  To: Richard Yao; +Cc: linux-kernel, Greg Kroah-Hartman, linux-api

> told quite plainly that such distributions are not worth consideration. If kdbus
> is merged despite concerns about security and backward compatibility, could we
> at least have the shim moved to a libc-neutral place, like Linus' tree?

I would expect any other libc would fork the shim anyway (or just not
bother with systemd in most cases).

Alan

* Re: Why not make kdbus use CUSE?
@ 2014-12-02  4:31     ` Richard Yao
  0 siblings, 0 replies; 22+ messages in thread
From: Richard Yao @ 2014-12-02  4:31 UTC (permalink / raw)
  To: One Thousand Gnomes; +Cc: linux-kernel, Greg Kroah-Hartman, linux-api

On 12/01/2014 09:23 AM, One Thousand Gnomes wrote:
>> told quite plainly that such distributions are not worth consideration. If kdbus
>> is merged despite concerns about security and backward compatibility, could we
>> at least have the shim moved to a libc-neutral place, like Linus' tree?
> 
> I would expect any other libc would fork the shim anyway (or just not
> bother with systemd in most cases).

If the shim were in glibc, then that would be reasonable, but the shim
is in systemd. That would make the systemd developers the gatekeepers
to a kernel interface. If we are going to enforce Linus' stable API
policy, then the shim should be in a repository that has a track record
for interface stability, which the systemd developers simply do not
have. It was not that long ago that firmware loading was moved into the
kernel because their stewardship of it in udev had caused problems. I do
not think that the systemd developers are the correct ones to assume
stewardship over such code. If kdbus is merged, I think the best
situation would be to move the shim into Linus' tree.


* Re: Why not make kdbus use CUSE?
@ 2014-12-02  5:40     ` Richard Yao
  0 siblings, 0 replies; 22+ messages in thread
From: Richard Yao @ 2014-12-02  5:40 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: linux-kernel, linux-api

On 11/29/2014 12:59 PM, Greg Kroah-Hartman wrote:
> On Sat, Nov 29, 2014 at 06:34:16AM +0000, Richard Yao wrote:
>> I had the opportunity at LinuxCon Europe to chat with Greg and some other kdbus
>> developers. A few things stood out from our conversation that I thought I would
>> bring to the list for discussion.
> 
> Any reason why you didn't respond to the kdbus patches themselves?
> Critiquing the specific code is much better than random discussions.

I am not subscribed to the list because of the enormous volume of email
that I would need to process when I am already at my limit from various
mailing lists. Consequently, I did not have the message-id to use
in-reply-to. In hindsight, I should have fetched them from an online
archive. I will make an effort to send additional emails with the proper
message ids under in-reply-to.

However, I might not have time to dedicate to that until the weekend. My
employer was good enough to allow me to work remotely from Shanghai so
that I could visit family. Unfortunately, the Internet connectivity here
leaves something to be desired. The only way to get Internet
connectivity for a short stay is via the mobile network and conventional
4G is not deployed. What I suspect is a bug in the network stack causes
the last mile to randomly die on me with no helpful messages printed to
dmesg or the system log.

Things like patch review for the Linux kernel and debugging the network
stack are things that I get to do on my own time. So far, I have not found
time to debug it beyond verifying that different 3G radios from
different manufacturers (Huawei E261 and Ericsson F5521gw) exhibit the
same behavior. Additionally, all traffic appears to be routed through
the national firewall in Beijing, where the peering links between China
and the US have degraded to the point where connections are worse than
US dial-up connections from the 1990s. I have managed to use VM hosts
to route traffic over less congested links, but the latencies and packet
loss have combined to make TCP congestion control extraordinarily painful.

>> They regard a userland compatibility shim in the systemd repository as providing
>> backward compatibility for applications. Unfortunately, this is insufficient to
>> ensure compatibility because dependency trees have multiple levels. If cross
>> platform package A depends on cross platform library B, which depends on dbus,
>> and cross platform library B decides to switch to kdbus, then it ceases to be
>> cross platform and cross platform package A is now dependent on Linux kernels
>> with kdbus. Not only does that affect other POSIX systems, but it also affects
>> LTS versions of Linux.
> 
> What do LTS versions have to do with this?  And what specific
> dependencies are you worried about?

Let's say that you have a Linux 3.10 system and you want some package
that indirectly depends on the new API due to library dependencies. You
will have a problem. You could probably install an older version of the
library, but if the older version has a CVE, most end users will end up
between a rock and a hard place. This situation should merit some
consideration because you are taking something that lived previously in
userland, modifying it so that anything depending on the modifications
is no longer backward compatible and then tying it to new kernels.

I think trying to use existing APIs to implement this in userspace is
worth consideration. I recall that you were very enthusiastic about CUSE
enabling people to move drivers out of the kernel. If statements about
kdbus' reduction in context-switch overhead not being a significant
benefit are to be believed, I would think that we could reuse CUSE.

>> It is somewhat tempting to think that being in the kernel is necessary for
>> performance, this does not appear to be true from my discussion with Greg and
>> others. In specific, a key advantage of being in the kernel is a reduction in
>> context switches and consequently, one would expect programs using the old API
>> to benefit, but they were quite clear to me that programs using the old API do
>> not benefit. At the same time, we had a similar situation where people thought
>> that the httpd server had to be inside the kernel until Linux 2.6, when our
>> userland APIs improved to the point where we were able to get similar if not
>> better performance in userland compared to the implementation of khttpd in Linux
>> 2.4.y.
> 
> Again, please see the kernel patches for lots of detail as to why this
> should be in the kernel.  If you disagree with the specific statements I
> have listed there, please respond with specifics.

I have some broader architectural concerns:

1. Debugging kernel code is a pain while debugging user code is
relatively easy.

2. Security vulnerabilities in kernel code give complete access to
everything while security vulnerabilities in userspace code can be
limited in scope by SELinux.

3. Integration with things like LXC should be easier from userspace,
where each container can have its own daemon.

We do not put everything into one address space so that we can limit the
potential for things to go wrong and enable us to debug them when they
do. If implementing this via FUSE/CUSE is an option, we should try it
first. Moving it into the kernel is always possible afterward. However,
moving it back out to userspace is not, because the kernel will need to
support the new API *indefinitely*. The statements made at LinuxCon Europe
strongly suggest to me that the API design is what enables higher
performance, not a reduction in context switch overhead. If that is the
case, context switch performance does not seem to be the reason for
being in the kernel and consequently, using CUSE/FUSE to keep it in
userspace should be doable.

>> I started to think that we probably ought to design a way to put kdbus into
>> userland and then I realized that we already have one in the form of CUSE. This
>> would not only make kdbus play nicely with SELinux and lxc, but also with other
>> POSIX systems that currently share dbus with Linux systems, which includes older
>> Linux kernels. Greg claimed that the kdbus code was fairly self-contained and
>> was just a character device, so I assume this is possible and I am curious why
>> it is not done.
> 
> The latest version is a filesystem not a character device, your
> information is out of date :)

CUSE is an extension of FUSE, so roughly the same APIs would be used in
either case.

>> P.S. I also mentioned my concern that having the shim in the systemd repository
>> would have a negative effect on distributions that use alternative libc libraries
>> because the systemd developers refuse to support alternative libc libraries. I
>> mentioned this to one of the people to whom Greg introduced me (and whose name
>> escapes me) as we were walking to Michael Kerrisk's talk on API design. I was
>> told quite plainly that such distributions are not worth consideration. If kdbus
>> is merged despite concerns about security and backward compatibility, could we
>> at least have the shim moved to a libc-neutral place, like Linus' tree?
> 
> Take that up on the systemd mailing list, it's not a kernel issue.

It became a kernel issue the moment that you proposed a kernel API with
corresponding library code in the systemd repository. Not that long ago,
the firmware loading code was moved into the kernel because there were
problems with systemd's stewardship over that mechanism in udev. Giving
the systemd developers the responsibility of maintaining the only
library for a proposed kernel API so soon afterward seems unwise to me.
If the library is small, there is no reason why it cannot be part of the
mainline tree, much like other small things that are bound to kernel
APIs, like perf.


* Re: Why not make kdbus use CUSE?
@ 2014-12-02  5:40     ` Richard Yao
  0 siblings, 0 replies; 22+ messages in thread
From: Richard Yao @ 2014-12-02  5:40 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA, linux-api-u79uwXL29TY76Z2rM5mHXA

[-- Attachment #1: Type: text/plain, Size: 7917 bytes --]

On 11/29/2014 12:59 PM, Greg Kroah-Hartman wrote:
> On Sat, Nov 29, 2014 at 06:34:16AM +0000, Richard Yao wrote:
>> I had the opportunity at LinuxCon Europe to chat with Greg and some other kdbus
>> developers. A few things stood out from our conversation that I thought I would
>> bring to the list for discussion.
> 
> Any reason why you didn't respond to the kdbus patches themselves?
> Critiquing the specific code is much better than random discussions.

I am not subscribed to the list because of the enormous volume of email
that I would need to process when I am already at my limit from various
mailing lists. Consequently, I did not have the message-id to use
in-reply-to. In hindsight, I should have fetched them from an online
archive. I will make an effort to send additional emails with the proper
message ids under in-reply-to.

However, I might not have time to dedicate to that until the weekend. My
employer was good enough to allow me to work remotely from Shanghai so
that I could visit family. Unfortunately, the Internet connectivity here
leaves something to be desired. The only way to get Internet
connectivity for a short stay is via the mobile network and conventional
4G is not deployed. What I suspect is a bug in the network stack causes
the last mile to randomly die on me with no helpful messages printed to
dmesg or the system log.

Things like patch review for the linux kernel and debugging the network
stack are things that I get to do on my time. So far, I have not found
time to debug it beyond verifying that different 3G radios from
different manufacturers (Huawei E261 and Ericsson F5521gw) exhibit the
same behavior. Additionally, all traffic appears to be routed through
the national firewall in Beijing, where the peering links between China
and the US have degraded to the point where connections are worse than
US  dial-up connections from the 1990s. I have managed to use VM hosts
to route traffic over less congested links, but the latencies and packet
loss ave combined to make TCP congestion control extraordinarily painful.

>> They regard a userland compatibility shim in the systemd repostory to provide
>> backward compatibility for applications. Unfortunately, this is insufficient to
>> ensure compatibility because dependency trees have multiple levels. If cross
>> platform package A depends on cross platform library B, which depends on dbus,
>> and cross platform library B decides to switch to kdbus, then it ceases to be
>> cross platform and cross platform package A is now dependent on Linux kernels
>> with kdbus. Not only does that affect other POSIX systems, but it also affects
>> LTS versions of Linux.
> 
> What does LTS versions have anything to do here?  And what specific
> dependancies are you worried about?

Let's say that you have a Linux 3.10 system and you want some package
that indirectly depends on the new API due to library dependencies. You
will have a problem. You could probably install an older version of the
library, but if the older version has a CVE, most end users will end up
between a rock and a hard place. This situation should merit some
consideration because you are taking something that lived previously in
userland, modifying it so that anything depending on the modifications
is no longer backward compatible and then tying it to new kernels.

I think trying to use existing APIs to implement this in userspace is
worth consideration. I recall that you were very enthusiastic about CUSE
enabling people to move drivers out of the kernel. If statements about
kdbus' reduction in context-switch overhead not being a significant
benefit are to be believed, I would think that we could reuse CUSE.

>> It is somewhat tempting to think that being in the kernel is necessary for
>> performance, this does not appear to be true from my discussion with Greg and
>> others. In specific, a key advantage of being in the kernel is a reduction in
>> context switches and consequently, one would expect programs using the old API
>> to benefit, but they were quite clear to me that programs using the old API do
>> not benefit. At the same time, we had a similar situation where people thought
>> that the httpd server had to be inside the kernel until Linux 2.6, when our
>> userland APIs improved to the point where we were able to get similar if not
>> better performance in userland compared to the implementation of khttpd in Linux
>> 2.4.y.
> 
> Again, please see the kernel patches for lots of detail as to why this
> should be in the kernel.  If you disagree with the specific statements I
> have listed there, please respond with specifics.

I have some broader architectural concerns:

1. Debugging kernel code is a pain while debugging user code is
relatively easy.

2. Security vulnerabilities in kernel code give complete access to
everything while security vulnerabilities in userspace code can be
limited in scope by SELinux.

3. Integration with things like LXC should be easier from userspace,
where each container can have its own daemon.

We do not put everything into one address space so that we can limit the
potential for things to go wrong and enable us to debug them when they
do. If implementing this via FUSE/CUSE is an option, we should try it
first. Moving it into the kernel is always possible afterward. However,
moving it back out to userspace is not, because the kernel will need to
support the new API *indefinitely*. The statements made at LinuxCon Europe
strongly suggest to me that the API design is what enables higher
performance, not a reduction in context switch overhead. If that is the
case, context switch performance does not seem to be the reason for
being in the kernel and consequently, using CUSE/FUSE to keep it in
userspace should be doable.

>> I started to think that we probably ought to design a way to put kdbus into
>> userland and then I realized that we already have one in the form of CUSE. This
>> would not only makes kdbus play nicely with SELinux and lxc, but also other
>> POSIX systems that currently share dbus with Linux systems, which includes older
>> Linux kernels. Greg claimed that the kdbus code was fairly self contained and
>> was just a character device, so I assume this is possible and I am curious why
>> it is not done.
> 
> The latest version is a filesystem not a character device, your
> information is out of date :)

CUSE is an extension of FUSE, so roughly the same APIs would be used in
either case.

>> P.S. I also mentioned my concern that having the shim in the systemd repository
>> would have a negative effect on distributons that use alterntaive libc libraries
>> because the systemd developers refuse to support alternative libc libraries. I
>> mentioned this to one of the people to whom Greg introduced me (and whose name
>> escapes me) as we were walking to Michael Kerrisk's talk on API design. I was
>> told quite plainly that such distributions are not worth consideration. If kdbus
>> is merged despite concerns about security and backward compatibility, could we
>> at least have the shim moved to libc netural place, like Linus' tree?
> 
> Take that up on the systemd mailing list, it's not a kernel issue.

It became a kernel issue the moment that you proposed a kernel API with
corresponding library code in the systemd repository. Not that long ago,
the firmware loading code was moved into the kernel because there were
problems with systemd's stewardship over that mechanism in udev. Giving
the systemd developers the responsibility of maintaining the only
library for a proposed kernel API so soon afterward seems unwise to me.
If the library is small, there is no reason why it cannot be part of the
mainline tree, much like other small things that are bound to kernel
APIs, like perf.


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 884 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Why not make kdbus use CUSE?
@ 2014-12-02  5:48       ` Greg Kroah-Hartman
  0 siblings, 0 replies; 22+ messages in thread
From: Greg Kroah-Hartman @ 2014-12-02  5:48 UTC (permalink / raw)
  To: Richard Yao; +Cc: linux-kernel, linux-api

On Tue, Dec 02, 2014 at 12:40:09AM -0500, Richard Yao wrote:
> >> They regard a userland compatibility shim in the systemd repostory to provide
> >> backward compatibility for applications. Unfortunately, this is insufficient to
> >> ensure compatibility because dependency trees have multiple levels. If cross
> >> platform package A depends on cross platform library B, which depends on dbus,
> >> and cross platform library B decides to switch to kdbus, then it ceases to be
> >> cross platform and cross platform package A is now dependent on Linux kernels
> >> with kdbus. Not only does that affect other POSIX systems, but it also affects
> >> LTS versions of Linux.
> > 
> > What does LTS versions have anything to do here?  And what specific
> > dependancies are you worried about?
> 
> Lets say that you have a Linux 3.10 system and you want some package
> that indirectly depends on the new API due to library dependencies. You
> will have a problem. You could probably install an older version of the
> library, but if the older version has a CVE, most end users will end up
> between a rock and a hard place. This situation should merit some
> consideration because you are taking something that lived previously in
> userland, modifying it so that anything depending on the modifications
> is no longer backward compatible and then tying it to new kernels.

Then you need to get a better distro, as any "well run" long-term
enterprise distro handles stuff like this for you.  Otherwise you need
to update systems properly.  There's nothing that I can do here to help
with that, nor do I ever want to, sorry.

> I think trying to use existing APIs to implement this in userspace is
> worth consideration. I recall that you were very enthusiastic about CUSE
> enabling people to move drivers out of the kernel. If statements about
> kdbus' reduction in context-switch overhead not being a significant
> benefit are to be believed, I would think that we could reuse CUSE.

I fail to understand how any of this relates to CUSE, please provide
specifics.

> 1. Debugging kernel code is a pain while debugging user code is
> relatively easy.

You have full access to a debugger, what more do you need?  :)

And why would you need to debug the kernel kdbus code?  Is something not
working properly in it?  Otherwise just use wireshark to read the kdbus
data stream and all should be fine.

> 2. Security vulnerabilities in kernel code give complete access to
> everything while security vulnerabilities in userspace code can be
> limited in scope by SELinux.

Kernel code is hard, security matters, yes I know this, we all have been
doing this for a very long time.  Of course bugs happen, but if you look
closely, your "attack surface" is now smaller using kdbus than it was
using old-style dbus.

> 3. Integration with things like LXC should be easier from userspace,
> where each container can have its own daemon.

How does the current implementation not work properly for this?  The
filesystem implementation makes this easier than ever, while sticking
with the character device made this quite difficult in different ways.

> >> I started to think that we probably ought to design a way to put kdbus into
> >> userland and then I realized that we already have one in the form of CUSE. This
> >> would not only makes kdbus play nicely with SELinux and lxc, but also other
> >> POSIX systems that currently share dbus with Linux systems, which includes older
> >> Linux kernels. Greg claimed that the kdbus code was fairly self contained and
> >> was just a character device, so I assume this is possible and I am curious why
> >> it is not done.
> > 
> > The latest version is a filesystem not a character device, your
> > information is out of date :)
> 
> CUSE is an extension of FUSE, so roughly the same APIs would be used in
> either case.

Not really, sorry, the specifics are quite different.

> >> P.S. I also mentioned my concern that having the shim in the systemd repository
> >> would have a negative effect on distributons that use alterntaive libc libraries
> >> because the systemd developers refuse to support alternative libc libraries. I
> >> mentioned this to one of the people to whom Greg introduced me (and whose name
> >> escapes me) as we were walking to Michael Kerrisk's talk on API design. I was
> >> told quite plainly that such distributions are not worth consideration. If kdbus
> >> is merged despite concerns about security and backward compatibility, could we
> >> at least have the shim moved to libc netural place, like Linus' tree?
> > 
> > Take that up on the systemd mailing list, it's not a kernel issue.
> 
> It became a kernel issue the moment that you proposed a kernel API with
> corresponding library code in the systemd repository.

One specific implementation of the library code is in the systemd repo.
There is nothing keeping anyone from forking it and putting it somewhere
else if you depend on it.  Odds are, you aren't going to need to do that
as your old-style dbus applications will work just fine, no changes
needed.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Why not make kdbus use CUSE?
  2014-12-02  5:48       ` Greg Kroah-Hartman
  (?)
@ 2014-12-02  7:59       ` Richard Yao
  2014-12-02 12:22         ` Richard Yao
  2014-12-02 17:12           ` Greg Kroah-Hartman
  -1 siblings, 2 replies; 22+ messages in thread
From: Richard Yao @ 2014-12-02  7:59 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: linux-kernel, linux-api

[-- Attachment #1: Type: text/plain, Size: 5252 bytes --]

On 12/02/2014 12:48 AM, Greg Kroah-Hartman wrote:
> On Tue, Dec 02, 2014 at 12:40:09AM -0500, Richard Yao wrote:
>>>> They regard a userland compatibility shim in the systemd repostory to provide
>>>> backward compatibility for applications. Unfortunately, this is insufficient to
>>>> ensure compatibility because dependency trees have multiple levels. If cross
>>>> platform package A depends on cross platform library B, which depends on dbus,
>>>> and cross platform library B decides to switch to kdbus, then it ceases to be
>>>> cross platform and cross platform package A is now dependent on Linux kernels
>>>> with kdbus. Not only does that affect other POSIX systems, but it also affects
>>>> LTS versions of Linux.
>>>
>>> What does LTS versions have anything to do here?  And what specific
>>> dependancies are you worried about?
>>
>> Lets say that you have a Linux 3.10 system and you want some package
>> that indirectly depends on the new API due to library dependencies. You
>> will have a problem. You could probably install an older version of the
>> library, but if the older version has a CVE, most end users will end up
>> between a rock and a hard place. This situation should merit some
>> consideration because you are taking something that lived previously in
>> userland, modifying it so that anything depending on the modifications
>> is no longer backward compatible and then tying it to new kernels.
> 
> Then you need to get a better distro, as any "well run" long-term
> enterprise distro handles stuff like this for you.  Otherwise you need
> to update systems properly.  There's nothing that I can do here to help
> with that, nor do I ever want to, sorry.

Another option is to include KVM-style kernel compatibility code to
allow the module to be built against older kernels. If you target
2.6.32.y, 3.2.y, 3.4.y and 3.10.y, the risk of people on older Linux
systems being left behind would be minimized.

>> 1. Debugging kernel code is a pain while debugging user code is
>> relatively easy.
> 
> You have full access to a debugger, what more do you need?  :)

I would prefer not to start bringing userland daemons into the kernel
unless there is no other choice. That way, a wider range of people can
tackle bugs and the code could be applied to a larger number of systems.

> And why would you need to debug the kernel kdbus code?  Is something not
> working properly in it?  Otherwise just use wireshark to read the kdbus
> data stream and all should be fine.

Putting daemons in the kernel means that we further complicate already
complex relationships with regard to things like memory utilization and
CPU time. It is easier to deal with this in userland where we could
better utilize cgroups.

>> 2. Security vulnerabilities in kernel code give complete access to
>> everything while security vulnerabilities in userspace code can be
>> limited in scope by SELinux.
> 
> Kernel code is hard, security matters, yes I know this, we all have been
> doing this for a very long time.  Of course bugs happen, but if you look
> closely, your "attack surface" is now smaller using kdbus than it was
> using old-style dbus.

Let's say that I have a system running LXC containers, someone does full
disclosure of proof of concept code for an arbitrary code execution zero
day and then someone else tries the exploit in an LXC container on
my system. With old-style dbus, only the container is affected and if
SELinux is used, then it is possible to restrict the daemon to things in
the container using dbus. A FUSE daemon using the new protocol is similar.
However, an in-kernel version not only means that the attacker breaks
out, but he has the ability to execute code with full kernel privileges.
Every Linux container on the system is therefore compromised.

I heard quite clearly at LinuxCon Europe that there are no expected
benefits from using the shim with kdbus such that we have the equivalent
of the original dbus daemon in the kernel, but there were plenty of
benefits from the protocol. If that is the case, it seems that being in
the kernel is not a necessity, but the new protocol is. FUSE might be
somewhat slower than an in-kernel filesystem, but it allows us to
enforce least privilege like we can do now with the current dbus daemon.
We cannot do that with kdbus/kdbusfs. If the reduction in context switch
overhead actually mattered, I would understand the desire to put this
into the kernel, but I have heard quite consistently that the context
switch overhead is not a significant motivation for pushing this code
into the kernel. If it were, the current userland code could have been
adapted into a kernel module.

>> 3. Integration with things like LXC should be easier from userspace,
>> where each container can have its own daemon.
> 
> How does the current implementation not work properly for this?  The
> filesystem implementation makes this easier than ever, while sticking
> with the character device made this quite difficult in different ways.

As you pointed out, my information was out of date. Making this into a
filesystem is an excellent idea that handles container integration quite
nicely.


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 884 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Why not make kdbus use CUSE?
  2014-12-02  7:59       ` Richard Yao
@ 2014-12-02 12:22         ` Richard Yao
  2014-12-02 17:26             ` Greg Kroah-Hartman
  2014-12-02 17:12           ` Greg Kroah-Hartman
  1 sibling, 1 reply; 22+ messages in thread
From: Richard Yao @ 2014-12-02 12:22 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: linux-kernel, linux-api

[-- Attachment #1: Type: text/plain, Size: 2591 bytes --]

Dear Greg,

I had hoped that I could avoid reading through the code for yet another
IPC mechanism when I asked why we needed kdbus at LinuxCon Europe. In
hindsight, I should have just checked out the code and read it instead
of asking. However, what I did instead was ask, do some thinking
based on that, mean to send an email and then send it long after I
should have.

Our conversation now has led me to realize my mistake and I have tried
to rectify it. I now see that kdbus is hooking into kernel interfaces to
implement KDBUS_ITEM_PAYLOAD_MEMFD in a way that we could only achieve
from userspace with UNIX domain sockets. I imagine that we could avoid
putting this code into the kernel through a combination of libev,
libfuse, memfds and UNIX domain sockets.
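
For illustration, the memfd part of that combination might look like the
following minimal sketch. It is not taken from the kdbus patches; the
helper name make_sealed_payload() is hypothetical, and it assumes Linux
3.17 or later with a libc that exposes memfd_create() and the F_SEAL_*
constants. The sender fills an anonymous memory file and then seals it so
that a receiver can trust that the contents will not change.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: copy `len` bytes of `data` into a new memfd and
 * seal it against any further modification. Returns the file descriptor
 * on success or -1 on error.
 */
static int make_sealed_payload(const void *data, size_t len)
{
	int fd;

	assert(data != NULL);

	/* MFD_ALLOW_SEALING is required, or F_ADD_SEALS fails later. */
	fd = memfd_create("payload", MFD_CLOEXEC | MFD_ALLOW_SEALING);
	if (fd < 0)
		return -1;

	if (ftruncate(fd, (off_t)len) < 0)
		goto fail;

	/* write() is used instead of a shared mapping, because a live
	 * writable mapping would make F_SEAL_WRITE fail with EBUSY. */
	if (write(fd, data, len) != (ssize_t)len)
		goto fail;

	/* Forbid resizing, writing and any change to the seal set. */
	if (fcntl(fd, F_ADD_SEALS, F_SEAL_SHRINK | F_SEAL_GROW |
				   F_SEAL_WRITE | F_SEAL_SEAL) < 0)
		goto fail;

	return fd;

fail:
	close(fd);
	return -1;
}
```

The returned descriptor can still be read or mapped read-only, so it is
suitable for handing to another process over a UNIX domain socket.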

A fallback path could be provided by using anonymous files on the FUSE
filesystem. This could probably be done by the sender doing something like
the following:

void *buf;
int cookie;
char name[] = "/sys/fs/kdbus/tmp/XXXXXX";
int fd = mkstemp(name); // Replaces XXXXXX in place
unlink(name);
ftruncate(fd, LENGTH); // Size the file before mapping it
buf = mmap(NULL, LENGTH, PROT_WRITE, MAP_SHARED, fd, 0);
// Do your writes here
cookie = ioctl(fd, SEAL); // SEAL is a made-up request for this sketch
close(fd);
munmap(buf, LENGTH);
// Send a message via a UNIX domain socket to the server with the
// cookie, plus whatever XXXXXX became and instructions on where to
// send the data. If the file had been closed before the message was
// received, the server should be able to say okay. Otherwise, it can
// send an error. The server would need to have a timer to handle the
// case where a process never actually sends a message with the cookie.

Assuming that this dance succeeds, the FUSE process could then make a
read-only file in itself, open it read-only, unlink it, put the data into
the file and send the file descriptor via UNIX domain socket while
refusing further writes. If it has its own user/group, the file should
be safe from prying eyes.
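
The descriptor hand-off described above relies on SCM_RIGHTS ancillary
data. A minimal sketch of that step follows, assuming an already
connected AF_UNIX socket; the names send_fd(), recv_fd() and union fd_ctl
are hypothetical and not part of any kdbus or FUSE API.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Control buffer sized and aligned for exactly one file descriptor. */
union fd_ctl {
	char buf[CMSG_SPACE(sizeof(int))];
	struct cmsghdr align;
};

/* Send `fd` over the connected UNIX socket `sock`; 0 on success. */
static int send_fd(int sock, int fd)
{
	char dummy = 'F';	/* at least one byte of real data is sent */
	struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
	union fd_ctl ctl;
	struct msghdr msg;
	struct cmsghdr *cmsg;

	assert(fd >= 0);
	memset(&ctl, 0, sizeof ctl);
	memset(&msg, 0, sizeof msg);
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctl.buf;
	msg.msg_controllen = sizeof ctl.buf;

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one descriptor from `sock`; returns it, or -1 on error. */
static int recv_fd(int sock)
{
	char dummy;
	struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
	union fd_ctl ctl;
	struct msghdr msg;
	struct cmsghdr *cmsg;
	int fd;

	memset(&ctl, 0, sizeof ctl);
	memset(&msg, 0, sizeof msg);
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctl.buf;
	msg.msg_controllen = sizeof ctl.buf;

	if (recvmsg(sock, &msg, 0) != 1)
		return -1;

	cmsg = CMSG_FIRSTHDR(&msg);
	if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return -1;

	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return fd;
}
```

The receiver gets a new descriptor number that refers to the same open
file, so the access mode chosen when the sender opened the file (read-only
here) carries over to the receiver.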

This is not as good as a memfd and also suffers from the race that
O_TMPFILE was meant to close, but it should be able to function as a
decent fallback. This would preserve portability across not only
different versions of Linux, but also other POSIX systems. Keeping the
code in userspace would allow us to apply SELinux policies to it, which
is something that we would lose if it were to go into the kernel.

That said, it is still not clear to me that dbus must be inside the
kernel to be able to perform multicast and zero copy using memfd. Is
there something that I have missed that makes this not the case?

Yours truly,
Richard Yao


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 884 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Why not make kdbus use CUSE?
@ 2014-12-02 17:12           ` Greg Kroah-Hartman
  0 siblings, 0 replies; 22+ messages in thread
From: Greg Kroah-Hartman @ 2014-12-02 17:12 UTC (permalink / raw)
  To: Richard Yao; +Cc: linux-kernel, linux-api

On Tue, Dec 02, 2014 at 02:59:21AM -0500, Richard Yao wrote:
> On 12/02/2014 12:48 AM, Greg Kroah-Hartman wrote:
> > On Tue, Dec 02, 2014 at 12:40:09AM -0500, Richard Yao wrote:
> >>>> They regard a userland compatibility shim in the systemd repostory to provide
> >>>> backward compatibility for applications. Unfortunately, this is insufficient to
> >>>> ensure compatibility because dependency trees have multiple levels. If cross
> >>>> platform package A depends on cross platform library B, which depends on dbus,
> >>>> and cross platform library B decides to switch to kdbus, then it ceases to be
> >>>> cross platform and cross platform package A is now dependent on Linux kernels
> >>>> with kdbus. Not only does that affect other POSIX systems, but it also affects
> >>>> LTS versions of Linux.
> >>>
> >>> What does LTS versions have anything to do here?  And what specific
> >>> dependancies are you worried about?
> >>
> >> Lets say that you have a Linux 3.10 system and you want some package
> >> that indirectly depends on the new API due to library dependencies. You
> >> will have a problem. You could probably install an older version of the
> >> library, but if the older version has a CVE, most end users will end up
> >> between a rock and a hard place. This situation should merit some
> >> consideration because you are taking something that lived previously in
> >> userland, modifying it so that anything depending on the modifications
> >> is no longer backward compatible and then tying it to new kernels.
> > 
> > Then you need to get a better distro, as any "well run" long-term
> > enterprise distro handles stuff like this for you.  Otherwise you need
> > to update systems properly.  There's nothing that I can do here to help
> > with that, nor do I ever want to, sorry.
> 
> Another option is to include KVM-style kernel compatibility code to
> allow the module to be built against older kernels. If you target
> 2.6.32.y, 3.2.y, 3.4.y and 3.10.y, the risk of people on older Linux
> systems being left behind would be minimized.

If you want to do that for an out-of-tree patch/module, feel free to do
so, but this has nothing to do with the in-kernel kdbusfs code, sorry.

> >> 1. Debugging kernel code is a pain while debugging user code is
> >> relatively easy.
> > 
> > You have full access to a debugger, what more do you need?  :)
> 
> I would prefer not to start bringing userland daemons into the kernel
> unless there is no other choice. That way, a wider range of people can
> tackle bugs and the code could be applied to a larger number of systems.

What exactly do you mean here?  There are thousands of people who know
how to properly debug Linux kernel code, this isn't an issue at all.

> > And why would you need to debug the kernel kdbus code?  Is something not
> > working properly in it?  Otherwise just use wireshark to read the kdbus
> > data stream and all should be fine.
> 
> Putting daemons in the kernel means that we further complicate already
> complex relationships with regard to things like memory utilization and
> CPU time. It is easier to deal with this in userland where we could
> better utilize cgroups.

What does cgroups have to do with dbus userspace libraries here?  In
fact, I don't think you looked at the code, as we properly tie into
namespaces and all the stuff you can only do in the kernel, you aren't
reading my introductory email at all that explains all of this.

> >> 2. Security vulnerabilities in kernel code give complete access to
> >> everything while security vulnerabilities in userspace code can be
> >> limited in scope by SELinux.
> > 
> > Kernel code is hard, security matters, yes I know this, we all have been
> > doing this for a very long time.  Of course bugs happen, but if you look
> > closely, your "attack surface" is now smaller using kdbus than it was
> > using old-style dbus.
> 
> Lets say that I have a system running LXC containers, someone does full
> disclosure of proof of concept code for an arbitrary code execution zero
> day and then someone else tries the exploit in a LXC container on
> mysystem. With old-style dbus, only the container is affected and if
> selinux is used, then it is possible to restrict daemon to things in the
> container using dbus.

And how exactly does this relate to the kdbusfs code?  Please, stop
making random statements that have nothing to do with the code being
proposed.

> I heard quite clearly at LinuxCon Europe that there are no expected
> benefits from using the shim with kdbus such that we have the equivalent
> of the original dbus daemon in the kernel, but there were plenty of
> benefits from the protocol. If that is the case, it seems that being in
> the kernel is not a necessity, but the new protocol is. FUSE might be
> somewhat slower than an in-kernel filesystem, but it allows us to
> enforce least privilege like we can do now with the current dbus daemon.
> We cannot do that with kdbus/kdbusfs. If the reduction in context switch
> overhead actually mattered, I would understand the desire to put this
> into the kernel, but I have heard quite consistently that the context
> switch overhead is not a significant motivation for pushing this code
> into the kernel.

You heard wrong, the context switch removal is a big thing, and a major
issue for a lot of users.  But that's not the only reason this is being
proposed, again, go read and respond to the 00 patch introduction
please, or even better yet, read the code and documentation and respond
to issues you find there.

Again, FUSE makes no sense here, sorry.

> >> 3. Integration with things like LXC should be easier from userspace,
> >> where each container can have its own daemon.
> > 
> > How does the current implementation not work properly for this?  The
> > filesystem implementation makes this easier than ever, while sticking
> > with the character device made this quite difficult in different ways.
> 
> As you pointed out, my information was out of date. Making this into a
> filesystem is an excellent idea that handles container integration quite
> nicely.

I'm glad you agree with the current implementation, thanks for your
approval.

greg k-h

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Why not make kdbus use CUSE?
@ 2014-12-02 17:26             ` Greg Kroah-Hartman
  0 siblings, 0 replies; 22+ messages in thread
From: Greg Kroah-Hartman @ 2014-12-02 17:26 UTC (permalink / raw)
  To: Richard Yao; +Cc: linux-kernel, linux-api

On Tue, Dec 02, 2014 at 07:22:11AM -0500, Richard Yao wrote:
> Assuming that this dance succeeds, the FUSE process could then make a
> readonly file in itself, open it read only, unlink it, put the data into
> the file and send the file descriptor via UNIX domain socket while
> refusing further writes. If it has its own user/group, the file should
> be safe from prying eyes.
> 
> This is not as good as a memfd and also suffers from the race that
> O_TMPFILE was meant to close, but it should be able to function as a
> decent fallback.

We can't knowingly create and advocate for broken code, sorry.
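
For reference, the fallback quoted above (create a file, open it read-only, unlink it, fill it, pass the descriptor over a UNIX domain socket) can be sketched roughly as follows. This is a minimal Python sketch, not code from the thread. The helper names `make_readonly_payload_fd` and `demo_handoff` are hypothetical, an ordinary temporary directory stands in for the FUSE mount, and `socket.send_fds`/`socket.recv_fds` need Python 3.9 or later. The file is visible by name between creation and unlink, which is the window the quoted text says O_TMPFILE was meant to close.

```python
import os
import socket
import tempfile


def make_readonly_payload_fd(data, directory=None):
    """Return a read-only descriptor for an already-unlinked file holding data."""
    # mkstemp creates the file with mode 0600 and hands back a read/write fd.
    write_fd, path = tempfile.mkstemp(dir=directory)
    with os.fdopen(write_fd, "wb") as writer:
        try:
            # Second, read-only open of the same file; this is the fd we share.
            read_fd = os.open(path, os.O_RDONLY)
        finally:
            # Remove the name; until this point the file can be found by path.
            os.unlink(path)
        writer.write(data)
    # Leaving the with-block flushed and closed the writable descriptor.
    return read_fd


def demo_handoff(data=b"hello bus"):
    """Pass the read-only fd across a socketpair and try to write through it."""
    sender, receiver = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    with sender, receiver:
        fd = make_readonly_payload_fd(data)
        try:
            # The fd travels as SCM_RIGHTS ancillary data next to one dummy byte.
            socket.send_fds(sender, [b"x"], [fd])
        finally:
            os.close(fd)
        _, fds, _, _ = socket.recv_fds(receiver, 1, 1)
        received = fds[0]
        try:
            payload = os.read(received, len(data))
            try:
                os.write(received, b"tamper")
                writable = True
            except OSError:
                # The descriptor was opened O_RDONLY, so the write is refused.
                writable = False
        finally:
            os.close(received)
    return payload, writable
```

The receiver's write is refused only because the descriptor was opened O_RDONLY. Unlike a sealed memfd, nothing in this sketch stops a process that has permission on the file from obtaining a writable descriptor some other way, which is presumably why the quoted text wants the daemon to have its own user/group.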

> This would preserve portability across not only
> different versions of Linux, but also other POSIX systems.

I honestly do not care about any other system than Linux, so I don't see
why this would ever be an issue.

> Keeping the code in userspace would allow us to apply SELinux policies
> to it, which is something that we would lose if it were to go into the
> kernel.

On the contrary, the kdbusfs implementation gives you better security
model support than before, it ties directly into the LSM hooks, see the
add-on patches from some other developers that bring full support of LSM
to the codebase.

> That said, it is still not clear to me that dbus must be inside the
> kernel to be able to perform multicast and zero copy using memfd.

It seems you have yet to read my introductory email for the patch
series.

greg k-h

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Why not make kdbus use CUSE?
@ 2014-12-03  9:22               ` Richard Yao
  0 siblings, 0 replies; 22+ messages in thread
From: Richard Yao @ 2014-12-03  9:22 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: linux-kernel, linux-api

[-- Attachment #1: Type: text/plain, Size: 4958 bytes --]

On 12/02/2014 12:26 PM, Greg Kroah-Hartman wrote:
> On Tue, Dec 02, 2014 at 07:22:11AM -0500, Richard Yao wrote:
>> Assuming that this dance succeeds, the FUSE process could then make a
>> readonly file in itself, open it read only, unlink it, put the data into
>> the file and send the file descriptor via UNIX domain socket while
>> refusing further writes. If it has its own user/group, the file should
>> be safe from prying eyes.
>>
>> This is not as good as a memfd and also suffers from the race that
>> O_TMPFILE was meant to close, but it should be able to function as a
>> decent fallback.
> 
> We can't knowingly create and advocate for broken code, sorry.
>
>> This would preserve portability across not only
>> different versions of Linux, but also other POSIX systems.
> 
> I honestly do not care about any other system than Linux, so I don't see
> why this would ever be an issue.

If you tie your userland to the most recent kernel and then want to
bisect an old bug, you will have a problem. You could try to find
another userland that supports the older kernels, but it would be *much*
easier if you could just use your current userland with it because then
the bare minimum must change. Writing portable software is the way to do
that.

Why burn the bridges that allow us to look backward when we have such a
need?

>> Keeping the code in userspace would allow us to apply SELinux policies
>> to it, which is something that we would lose if it were to go into the
>> kernel.
> 
> On the contrary, the kdbusfs implementation gives you better security
> model support than before, it ties directly into the LSM hooks, see the
> add-on patches from some other developers that bring full support of LSM
> to the codebase.

If a bug in kdbusfs that allows arbitrary code execution is exploited in
the wild, would kdbus be more secure than a userland version?

>> That said, it is still not clear to me that dbus must be inside the
>> kernel to be able to perform multicast and zero copy using memfd.
> 
> It seems you have yet to read my introductory email for the patch
> series.

Allow me to be more specific:

> - performance: fewer process context switches, fewer copies, fewer
>   syscalls, larger memory chunks via memfd.  This is really important
>   for a whole class of userspace programs that are ported from other
>   operating systems that are run on tiny ARM systems that rely on
>   hundreds of thousands of messages passed at boot time, and at
>   "critical" times in their user interaction loops.

What are some examples of these programs? Are any of them examples of
good software design?

> - security: the peers which communicate do not have to trust each other,
>   as the only trustworthy component in the game is the kernel which
>   adds metadata and ensures that all data passed as payload is either
>   copied or sealed, so that the receiver can parse the data without
>   having to protect against changing memory while parsing buffers.

What keeps userspace from passing around memfds?

> - more metadata can be attached to messages than in userspace

How much metadata can be attached in either case? Is there some inherent
aspect of the existing syscall API that prevents userspace from
attaching more? Why do we want to attach more?

> - being in the kernel closes a lot of races which can't be fixed with
>   the current userspace solutions.  For example, with kdbus, there is a
>   way a client can disconnect from a bus, but do so only if no further
>   messages present in its queue, which is crucial for implementing
>   race-free "exit-on-idle" services

Is the current dbus daemon not supporting this the only thing
preventing us from doing it in userspace?

> Of course, some of the bits above could be implemented in userspace
> alone, for example with more sophisticated memory management APIs, but
> this is usually done by losing out on the other details.  For example,
> for many of the memory management APIs, it's hard to not require the
> communicating peers to fully trust each other.  And we _really_ don't
> want peers to have to trust each other.

Does being in the kernel solve this in a way that using memfds in
userspace does not?

> Another benefit of having this in the kernel, rather than as a userspace
> daemon, is that you can now easily use the bus from the initrd, or up to
> the very end when the system shuts down.  On current userspace D-Bus,
> this is not really possible, as this requires passing the bus instance
> around between initrd and the "real" system.  Such a transition of all
> fds also requires keeping full state of what has already been read from
> the connection fds.  kdbus makes this much simpler, as we can change the
> ownership of the bus, just by passing one fd over from one part to the
> other.

Why do we want to start D-Bus inside the initramfs?


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 884 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Why not make kdbus use CUSE?
@ 2014-12-03 21:15                 ` Greg Kroah-Hartman
  0 siblings, 0 replies; 22+ messages in thread
From: Greg Kroah-Hartman @ 2014-12-03 21:15 UTC (permalink / raw)
  To: Richard Yao; +Cc: linux-kernel, linux-api

On Wed, Dec 03, 2014 at 04:22:33AM -0500, Richard Yao wrote:
> On 12/02/2014 12:26 PM, Greg Kroah-Hartman wrote:
> > On Tue, Dec 02, 2014 at 07:22:11AM -0500, Richard Yao wrote:
> >> Assuming that this dance succeeds, the FUSE process could then make a
> >> readonly file in itself, open it read only, unlink it, put the data into
> >> the file and send the file descriptor via UNIX domain socket while
> >> refusing further writes. If it has its own user/group, the file should
> >> be safe from prying eyes.
> >>
> >> This is not as good as a memfd and also suffers from the race that
> >> O_TMPFILE was meant to close, but it should be able to function as a
> >> decent fallback.
> > 
> > We can't knowingly create and advocate for broken code, sorry.
> >
> >> This would preserve portability across not only
> >> different versions of Linux, but also other POSIX systems.
> > 
> > I honestly do not care about any other system than Linux, so I don't see
> > why this would ever be an issue.
> 
> If you tie your userland to the most recent kernel and then want to
> bisect an old bug, you will have a problem.

Like you do today with any kernel feature that any userspace code relies
on, there's nothing new here that we haven't dealt with for the past 20
years.

> You could try to find another userland that supports the older
> kernels, but it would be *much* easier if you could just use your
> current userland with it because then the bare minimum must change.
> Writing portable software is the way to do that.

I don't think I understand what you mean by "portable", please define it
better.

> >> Keeping the code in userspace would allow us to apply SELinux policies
> >> to it, which is something that we would lose if it were to go into the
> >> kernel.
> > 
> > On the contrary, the kdbusfs implementation gives you better security
> > model support than before, it ties directly into the LSM hooks, see the
> > add-on patches from some other developers that bring full support of LSM
> > to the codebase.
> 
> If a bug in kdbusfs that allows arbitrary code execution is exploited in
> the wild, would kdbus be more secure than a userland version?

s/kdbusfs/fuse/ if you want to make the same argument.  Sure, any kernel
feature comes with additional code and the normal worries about security
issues.  Being afraid of ever adding new features or code just because
_maybe_ there could be a problem there is a sure way to kill an entire
project.  A non-changing operating system is a dead operating system.

> > - performance: fewer process context switches, fewer copies, fewer
> >   syscalls, larger memory chunks via memfd.  This is really important
> >   for a whole class of userspace programs that are ported from other
> >   operating systems that are run on tiny ARM systems that rely on
> >   hundreds of thousands of messages passed at boot time, and at
> >   "critical" times in their user interaction loops.
> 
> What are some examples of these programs?

Lots of programs that previously ran on QNX have been ported to Linux
for a huge range of products (automotive has a bunch of these if you are
curious to look at them.)

> Are any of them examples of good software design?

"good software design" is in the eye of the beholder.

> > - security: the peers which communicate do not have to trust each other,
> >   as the only trustworthy component in the game is the kernel which
> >   adds metadata and ensures that all data passed as payload is either
> >   copied or sealed, so that the receiver can parse the data without
> >   having to protect against changing memory while parsing buffers.
> 
> What keeps userspace from passing around memfds?

Nothing, userspace programs are already doing that, but memfds are not
the be-all-end-all, and actually are much slower for a large majority of
"normal" message sizes.

> > - more metadata can be attached to messages than in userspace
> 
> How much metadata can be attached in either case? Is there some inherent
> aspect of the existing syscall API that prevents userspace from
> attaching more? Why do we want to attach more?

See the documentation for what can be attached, and yes, userspace can
not get access to lots of these things in a way that can not be
"spoofed", making the metadata pointless.

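To make "metadata that cannot be spoofed" concrete, the closest existing userspace analogue is probably SO_PEERCRED on a UNIX domain socket, where the kernel, not the peer, fills in the peer's pid, uid and gid. The sketch below is only that analogue, under the assumption of a Linux system; it is not the kdbus API and it covers just these three fields. The function name `peer_credentials` is hypothetical.

```python
import os
import socket
import struct


def peer_credentials(sock):
    """Return (pid, uid, gid) of the peer as recorded by the kernel."""
    # Layout of struct ucred on Linux: pid_t pid; uid_t uid; gid_t gid.
    fmt = "iII"
    raw = sock.getsockopt(socket.SOL_SOCKET, socket.SO_PEERCRED,
                          struct.calcsize(fmt))
    return struct.unpack(fmt, raw)
```

On a `socket.socketpair()` created within one process, both ends report that process's own credentials. Anything beyond these fields has to be looked up separately by the daemon, for example from /proc by pid, and such a lookup is not atomic with the message it is meant to describe.
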
> > - being in the kernel closes a lot of races which can't be fixed with
> >   the current userspace solutions.  For example, with kdbus, there is a
> >   way a client can disconnect from a bus, but do so only if no further
> >   messages present in its queue, which is crucial for implementing
> >   race-free "exit-on-idle" services
> 
> Is the current dbus daemon not supporting this the only thing
> preventing us from doing it in userspace?

It's one of many things, as this list shows.

> > Of course, some of the bits above could be implemented in userspace
> > alone, for example with more sophisticated memory management APIs, but
> > this is usually done by losing out on the other details.  For example,
> > for many of the memory management APIs, it's hard to not require the
> > communicating peers to fully trust each other.  And we _really_ don't
> > want peers to have to trust each other.
> 
> Does being in the kernel solve this in a way that using memfds in
> userspace does not?

See above for why you can't use a memfd for "everything", you will slow
things down from what you have today if you were to attempt it.

> > Another benefit of having this in the kernel, rather than as a userspace
> > daemon, is that you can now easily use the bus from the initrd, or up to
> > the very end when the system shuts down.  On current userspace D-Bus,
> > this is not really possible, as this requires passing the bus instance
> > around between initrd and the "real" system.  Such a transition of all
> > fds also requires keeping full state of what has already been read from
> > the connection fds.  kdbus makes this much simpler, as we can change the
> > ownership of the bus, just by passing one fd over from one part to the
> > other.
> 
> Why do we want to start D-Bus inside the initramfs?

To keep the existing applications in the initramfs from having to
roll their own form of IPC, as they have had to do until now.  It also
gives us a much easier way to transition out of the initramfs than we
have today.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Why not make kdbus use CUSE?
@ 2014-12-03 21:15                 ` Greg Kroah-Hartman
  0 siblings, 0 replies; 22+ messages in thread
From: Greg Kroah-Hartman @ 2014-12-03 21:15 UTC (permalink / raw)
  To: Richard Yao
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA, linux-api-u79uwXL29TY76Z2rM5mHXA

On Wed, Dec 03, 2014 at 04:22:33AM -0500, Richard Yao wrote:
> On 12/02/2014 12:26 PM, Greg Kroah-Hartman wrote:
> > On Tue, Dec 02, 2014 at 07:22:11AM -0500, Richard Yao wrote:
> >> Assuming that this dance succeeds, the FUSE process could then make a
> >> readonly file in itself, open it read only, unlink it, put the data into
> >> the file and send the file descriptor via UNIX domain socket while
> >> refusing further writes. If it has its own user/group, the file should
> >> be safe from prying eyes.
> >>
> >> This is not as good as a memfd and also suffers from the race that
> >> O_TMPFILE was meant to close, but it should be able to function as a
> >> decent fallback.
> > 
> > We can't knowingly create and advocate for broken code, sorry.
> >
> >> This would preserve portability across not only
> >> different versions of Linux, but also other POSIX systems.
> > 
> > I honestly do not care about any other system than Linux, so I don't see
> > why this would ever be an issue.
> 
> If you tie your userland to the most recent kernel and then want to
> bisect an old bug, you will have a problem.

Like you do today with any kernel feature that any userspace code relies
on, there's nothing new here that we haven't delt with for the past 20
years.

> You could try to find another userland that supports the older
> kernels, but it would be *much* easier if you could just use your
> current userland with it because then the bare minimum must change.
> Writing portable software is the way to do that.

I don't think I understand what you mean by "portable", please define it
better.

> >> Keeping the code in userspace would allow us to apply SELinux policies
> >> to it, which is something that we would lose if it were go to into the
> >> kernel.
> > 
> > On the contrary, the kdbusfs implementation gives you better security
> > model support than before, it ties directly into the LSM hooks, see the
> > add-on patches from some other developers that bring full support of LSM
> > to the codebase.
> 
> If a bug in kdbusfs that allows arbitrary code execution is exploited in
> the wild, would kdbus be more secure than a userland version?

s/kdbusfs/fuse/ if you want to make the same argument.  Sure, any kernel
feature comes with additional code and the normal worries about security
issues.  Being afraid of ever adding new features or code just because
_maybe_ there could be a problem there is a sure way to kill an entire
project.  A non-changing operating system is a dead operating system.

> > - performance: fewer process context switches, fewer copies, fewer
> >   syscalls, larger memory chunks via memfd.  This is really important
> >   for a whole class of userspace programs that are ported from other
> >   operating systems that are run on tiny ARM systems that rely on
> >   hundreds of thousands of messages passed at boot time, and at
> >   "critical" times in their user interaction loops.
> 
> What are some examples of these programs?

Lots of programs that previously ran on QNX have been ported to Linux
for a huge range of products (automotive has a bunch of these if you are
curious to look at them.)

> Are any of them examples of good software design?

"good software design" is in the eye of the beholder.

> > - security: the peers which communicate do not have to trust each other,
> >   as the only trustworthy compoenent in the game is the kernel which
> >   adds metadata and ensures that all data passed as payload is either
> >   copied or sealed, so that the receiver can parse the data without
> >   having to protect against changing memory while parsing buffers.
> 
> What keeps userspace from passing around memfds?

Nothing, userspace programs are already doing that, but memfds are not
the be-all-end-all, and actually are much slower for a large majority of
"normal" message sizes.

> > - more metadata can be attached to messages than in userspace
> 
> How much metadata can be attached in either case? is there some inherit
> aspect of the existing syscall API that prevents userspace from
> attaching more? Why do we want to attach more?

See the documentation for what can be attached, and yes, userspace can
not get access to lots of these things in a way that can not be
"spoofed", making the metadata pointless.

> > - being in the kernle closes a lot of races which can't be fixed with
> >   the current userspace solutions.  For example, with kdbus, there is a
> >   way a client can disconnect from a bus, but do so only if no further
> >   messages present in its queue, which is crucial for implementing
> >   race-free "exit-on-idle" services
> 
> Is the current dbus daemon not supporting this the only thing
> preventing us from doing it in userspace?

It's one of many things, as this list shows.
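
To make the "disconnect only if the queue is empty" point concrete, here
is a toy in-process model of the semantics.  It is only a sketch: the
Connection class and its method names are hypothetical and are not the
kdbus API.  The point it shows is that the emptiness check and the
disconnect happen under one lock, so a message cannot slip in between
them.

```python
import threading
from collections import deque


class Connection:
    """Toy stand-in for a bus connection (hypothetical, not the kdbus API)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._queue = deque()
        self._connected = True

    def deliver(self, msg) -> bool:
        """Queue a message; refuse it if the peer has already disconnected."""
        with self._lock:
            if not self._connected:
                return False
            self._queue.append(msg)
            return True

    def receive(self):
        """Pop the oldest queued message, or return None if the queue is empty."""
        with self._lock:
            return self._queue.popleft() if self._queue else None

    def disconnect_if_idle(self) -> bool:
        """Disconnect only if nothing is queued; check and disconnect are atomic."""
        with self._lock:
            if self._queue:
                return False
            self._connected = False
            return True
```

A service that wants to exit on idle would call disconnect_if_idle() and
exit only when it returns True; otherwise it goes back to processing its
queue.  A sender whose deliver() returns False knows the message was not
accepted and can trigger activation again.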

> > Of course, some of the bits above could be implemented in userspace
> > alone, for example with more sophisticated memory management APIs, but
> > this is usually done by losing out on the other details.  For example,
> > for many of the memory management APIs, it's hard to not require the
> > communicating peers to fully trust each other.  And we _really_ don't
> > want peers to have to trust each other.
> 
> Does being in the kernel solve this in a way that using memfds in
> userspace does not?

See above for why you can't use a memfd for "everything", you will slow
things down from what you have today if you were to attempt it.

> > Another benefit of having this in the kernel, rather than as a userspace
> > daemon, is that you can now easily use the bus from the initrd, or up to
> > the very end when the system shuts down.  On current userspace D-Bus,
> > this is not really possible, as this requires passing the bus instance
> > around between initrd and the "real" system.  Such a transition of all
> > fds also requires keeping full state of what has already been read from
> > the connection fds.  kdbus makes this much simpler, as we can change the
> > ownership of the bus, just by passing one fd over from one part to the
> > other.
> 
> Why do we want to start D-Bus inside the initramfs?

To save the existing applications in the initramfs from having to roll
their own form of IPC, as they have to do today.  It also allows a much
easier way to transition out of the initramfs than we have today.
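
As a rough illustration of what "passing one fd over" means in general,
here is a sketch of handing a file descriptor to another endpoint over a
Unix socket with SCM_RIGHTS.  This is the generic mechanism, not
kdbus-specific code; it assumes Python 3.9 or later for
socket.send_fds()/socket.recv_fds(), and hand_over_fd is a hypothetical
helper that keeps both ends in one process for brevity.

```python
import os
import socket


def hand_over_fd(fd: int) -> int:
    """Send `fd` across a Unix socket pair and return the receiver's copy."""
    sender, receiver = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    with sender, receiver:
        # The fd travels as SCM_RIGHTS ancillary data next to a small payload.
        socket.send_fds(sender, [b"bus"], [fd])
        _data, fds, _flags, _addr = socket.recv_fds(receiver, 16, 1)
    return fds[0]
```

The receiver gets a new descriptor that refers to the same open file, so
the sender can close its copy afterwards and the object stays alive.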

thanks,

greg k-h

Thread overview: 22+ messages
2014-11-29  6:34 Why not make kdbus use CUSE? Richard Yao
2014-11-29  6:34 ` Richard Yao
2014-11-29 17:59 ` Greg Kroah-Hartman
2014-11-29 17:59   ` Greg Kroah-Hartman
2014-12-02  5:40   ` Richard Yao
2014-12-02  5:40     ` Richard Yao
2014-12-02  5:48     ` Greg Kroah-Hartman
2014-12-02  5:48       ` Greg Kroah-Hartman
2014-12-02  7:59       ` Richard Yao
2014-12-02 12:22         ` Richard Yao
2014-12-02 17:26           ` Greg Kroah-Hartman
2014-12-02 17:26             ` Greg Kroah-Hartman
2014-12-03  9:22             ` Richard Yao
2014-12-03  9:22               ` Richard Yao
2014-12-03 21:15               ` Greg Kroah-Hartman
2014-12-03 21:15                 ` Greg Kroah-Hartman
2014-12-02 17:12         ` Greg Kroah-Hartman
2014-12-02 17:12           ` Greg Kroah-Hartman
2014-12-01 14:23 ` One Thousand Gnomes
2014-12-01 14:23   ` One Thousand Gnomes
2014-12-02  4:31   ` Richard Yao
2014-12-02  4:31     ` Richard Yao
