* [PATCH v2 00/13] Add kdbus implementation
@ 2014-11-21  5:02 ` Greg Kroah-Hartman
  0 siblings, 0 replies; 73+ messages in thread
From: Greg Kroah-Hartman @ 2014-11-21  5:02 UTC (permalink / raw)
  To: arnd, ebiederm, gnomes, teg, jkosina, luto, linux-api, linux-kernel
  Cc: daniel, dh.herrmann, tixxdz

kdbus is a kernel-level IPC implementation that aims to resemble the
protocol layer of the existing userspace D-Bus daemon while enabling
some features that couldn't be implemented before in userspace.

The documentation in the first patch in this series explains the
protocol and the API details.

This version has changed a lot since the first submission, based on the
review comments received.  Many thanks to everyone who took the time to
review the code and make suggestions.  Full details are below.

Reasons why this should be done in the kernel, instead of in userspace
as it is done today, include the following:

- performance: fewer process context switches, fewer copies, fewer
  syscalls, larger memory chunks via memfd.  This is really important
  for a whole class of userspace programs, ported from other operating
  systems and run on tiny ARM systems, that rely on hundreds of
  thousands of messages passed at boot time, and at "critical" times
  in their user interaction loops.
- security: the peers which communicate do not have to trust each other,
  as the only trustworthy component in the game is the kernel which
  adds metadata and ensures that all data passed as payload is either
  copied or sealed, so that the receiver can parse the data without
  having to protect against changing memory while parsing buffers.  Also,
  all the data transfer is controlled by the kernel, so that LSMs can
  track and control what is going on, without involving userspace.
  Because of the LSM issue, security people are much happier with this
  model than the current scheme of having to hook into dbus to mediate
  things.
- more metadata can be attached to messages than in userspace
- semantics for apps with heavy data payloads (media apps, for instance)
  with optional priority message dequeuing, and global message ordering.
  Some "crazy" people are playing with using kdbus for audio data in the
  system.  I'm not saying that this is the best model for this, but
  until now, there wasn't any other way to do this without having to
  create custom "busses", one for each application library.
- being in the kernel closes a lot of races which can't be fixed with
  the current userspace solutions.  For example, with kdbus, there is a
  way a client can disconnect from a bus, but do so only if no further
  messages are present in its queue, which is crucial for implementing
  race-free "exit-on-idle" services
- eavesdropping on the kernel level, so privileged users can hook into
  the message stream without hacking support for that into their
  userspace processes
- a number of smaller benefits: for example kdbus learned a way to peek
  full messages without dequeuing them, which is really useful for
  logging metadata when handling bus-activation requests.

Of course, some of the bits above could be implemented in userspace
alone, for example with more sophisticated memory management APIs, but
this is usually done by losing out on the other details.  For example,
for many of the memory management APIs, it's hard to not require the
communicating peers to fully trust each other.  And we _really_ don't
want peers to have to trust each other.

Another benefit of having this in the kernel, rather than as a userspace
daemon, is that you can now easily use the bus from the initrd, or up to
the very end when the system shuts down.  On current userspace D-Bus,
this is not really possible, as this requires passing the bus instance
around between initrd and the "real" system.  Such a transition of all
fds also requires keeping full state of what has already been read from
the connection fds.  kdbus makes this much simpler, as we can change the
ownership of the bus, just by passing one fd over from one part to the
other.

Regarding binder: binder and kdbus follow very different design
concepts.  Binder implies the use of thread-pools to dispatch incoming
method calls.  This is a very efficient scheme, and completely natural
in programming languages like Java.  In most Linux programs, however,
there's a much stronger focus on central poll() loops that dispatch all
sources a program cares about.  kdbus is much more usable in such
environments, as it doesn't enforce a threading model, and it is happy
with serialized dispatching.  In fact, this major difference had an
effect on many of the design decisions: binder does not guarantee global
message ordering due to the parallel dispatching in the thread-pools,
but kdbus does.  Moreover, there's also a difference in the way messages
are handled.  In kdbus, every message is basically taken and dispatched
as one blob, while in binder, continuous connections to other peers are
created, which are then used to send messages on.  Hence, the models are
quite different, and they serve different needs.  I believe that the
D-Bus/kdbus model is more compatible and friendly with how Linux
programs are usually implemented.
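
For readers less familiar with the pattern, here is a minimal sketch of
such a central loop.  It is an illustration only: any readable fd (a
plain pipe, for instance) stands in for a bus connection, no kdbus calls
are shown, and ex_source, ex_count_bytes and ex_dispatch_once are
made-up names.

```c
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical per-source callback: returns bytes handled, or -1 on error. */
typedef int (*ex_handler_t)(int fd, void *userdata);

struct ex_source {
	int fd;
	ex_handler_t handler;
	void *userdata;
};

/* Example handler: read what is available and add the byte count to
 * the int that userdata points to. */
static int ex_count_bytes(int fd, void *userdata)
{
	char buf[256];
	ssize_t r = read(fd, buf, sizeof(buf));

	if (r < 0)
		return -1;
	*(int *)userdata += (int)r;
	return (int)r;
}

/*
 * One iteration of a central event loop: wait until any source becomes
 * readable, then call the handlers one after another on this thread.
 * Returns the number of handlers invoked, or -1 on failure.
 */
static int ex_dispatch_once(struct ex_source *sources, size_t n,
			    int timeout_ms)
{
	struct pollfd pfds[16];
	int dispatched = 0;
	size_t i;

	if (n > sizeof(pfds) / sizeof(pfds[0]))
		return -1;

	for (i = 0; i < n; i++) {
		pfds[i].fd = sources[i].fd;
		pfds[i].events = POLLIN;
		pfds[i].revents = 0;
	}

	if (poll(pfds, (nfds_t)n, timeout_ms) < 0)
		return -1;

	/* Serialized dispatching: no thread pool, one handler at a time. */
	for (i = 0; i < n; i++) {
		if (pfds[i].revents & POLLIN) {
			sources[i].handler(sources[i].fd, sources[i].userdata);
			dispatched++;
		}
	}
	return dispatched;
}
```

A program would call ex_dispatch_once() in a loop with all of its
sources registered, so every event is handled in order on one thread.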

This can also be found in a git tree, the kdbus branch of char-misc.git at:
        https://git.kernel.org/cgit/linux/kernel/git/gregkh/char-misc.git/

Changes since RFC v1:

  * Most notably, kdbus now exposes its control files, buses and
    endpoints via its own file system, called kdbusfs.

     * Each time a file system of this type is mounted, a new kdbus
       domain is created.

     * By default, kdbus is expected to be mounted in /sys/fs/kdbus

     * The layout inside each mount point is the same as before, except
       that domains are not hierarchically nested anymore.

     * Domains are therefore also unnamed now.

     * Unmounting a kdbusfs will automatically also destroy the
       associated domain.

     * Hence, the action of creating a kdbus domain is now as
       privileged as mounting a file system.

     * This way, we can get around creating dev nodes for everything,
       and, last but not least, we are no longer limited by 20-bit
       minor numbers.

  * Rework the metadata attachment logic to address concerns raised by
    Andy Lutomirski and Alan Cox:

     * Split the attach_flags in kdbus_cmd_hello into two parts,
       attach_flags_send and attach_flags_recv. Also, split the
       existing KDBUS_ITEM_ATTACH_FLAGS into
       KDBUS_ITEM_ATTACH_FLAGS_SEND and KDBUS_ITEM_ATTACH_FLAGS_RECV,
       and allow updating both connection details through
       KDBUS_CMD_CONN_UPDATE.

     * Only attach a metadata item to the final message in the
       receiver's pool if its bit is set in both the sender's
       attach_flags_send and the receiver's attach_flags_recv.

     * Add an optional metadata mask to the bus during its creation, so
       bus owners can denote their minimal requirements of metadata to
       be attached by connections of the bus.

  * Namespaces are now pinned by a domain at its creation time, and
    metadata items are automatically translated into these namespaces,
    unless that cannot be done (currently only capabilities), in which
    case the items are dropped. For hide_pid enabled domains, drop all
    items except those that reveal nothing about the task.

  * Capabilities are now only checked at open() time, and the
    information is cached for the lifetime of a file descriptor.
    Reported by Eric W. Biederman, Andy Lutomirski and Thomas Gleixner.

  * Make functions that create new objects return the newly allocated
    memory directly, rather than through a pointer argument.
    That implies using ERR_PTR/PTR_ERR logic in many areas. Requested by
    Al Viro.

  * Rename two details in kdbus.h to not overload the term 'name' too
    much:

     KDBUS_ITEM_CONN_NAME	→ KDBUS_ITEM_CONN_DESCRIPTION
     KDBUS_ATTACH_CONN_NAME	→ KDBUS_ATTACH_CONN_DESCRIPTION

  * Documentation fixes, by Peter Meerwald and others.

  * Some memory leaks plugged, and another match test added, by
    Rui Miguel Silva

  * Per-user message count quota logic fixed, and new test added.
    By John de la Garza.

  * More test code for CONN_INFO ioctl

  * Added a kdbus_node object embedded by domains, endpoints and buses
    to track children in a generic way. A kdbus_node is always exposed
    as an inode in kdbusfs.

  * Add a new attach flags constant called _KDBUS_ATTACH_ANY (~0)
    which automatically degrades to _KDBUS_ATTACH_ALL in the kernel.
    That way, old clients can opt in to whatever newer kernels might
    offer to send.

  * Use #defines rather than an enum for the ioctl signatures, so when
    new ones are added, userspace can use #ifdeffery to determine the
    function set at compile time. Suggested by Arnd Bergmann.

  * Moved the driver to ipc/kdbus, as suggested by Arnd Bergmann.
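
The attach flag rules from the list above can be sketched roughly as
follows.  The EX_* names and bit values are hypothetical placeholders,
not the definitions in kdbus.h, and the validation the kernel actually
performs is not shown.  The sketch only assumes the two rules stated
above: ~0 degrades to the set known to the kernel, and an item is
attached only if both sides asked for it.

```c
#include <stdint.h>

/* Hypothetical flag bits for illustration; the real ones are in kdbus.h. */
#define EX_ATTACH_TIMESTAMP	(1ULL << 0)
#define EX_ATTACH_CREDS		(1ULL << 1)
#define EX_ATTACH_COMM		(1ULL << 2)
#define EX_ATTACH_ALL		(EX_ATTACH_TIMESTAMP | EX_ATTACH_CREDS | \
				 EX_ATTACH_COMM)
#define EX_ATTACH_ANY		(~0ULL)

/* Degrade "anything the kernel offers" to the set this kernel knows. */
static uint64_t ex_sanitize_flags(uint64_t flags)
{
	return flags == EX_ATTACH_ANY ? EX_ATTACH_ALL : flags;
}

/* A metadata item is attached only if both sides have its bit set. */
static uint64_t ex_effective_attach(uint64_t attach_flags_send,
				    uint64_t attach_flags_recv)
{
	return ex_sanitize_flags(attach_flags_send) &
	       ex_sanitize_flags(attach_flags_recv);
}
```

With these placeholder values, a sender that passes EX_ATTACH_ANY and a
receiver that only asks for EX_ATTACH_CREDS end up with just the creds
item attached.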
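
The ERR_PTR/PTR_ERR change mentioned above follows a common kernel
convention.  As a rough userspace sketch (the ex_* helpers only imitate
what linux/err.h provides inside the kernel, and struct ex_bus and
ex_bus_new are hypothetical, not kdbus code), a constructor in this
style could look like:

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Userspace imitation of the kernel helpers; errno values are encoded
 * in the topmost 4095 pointer values, which are never valid objects. */
#define EX_MAX_ERRNO 4095

static inline void *ex_err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

static inline long ex_ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int ex_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-EX_MAX_ERRNO;
}

/* Hypothetical object, for illustration only. */
struct ex_bus {
	char name[32];
};

/*
 * Return the new object directly.  On failure, return an encoded
 * negative errno instead of filling in an out parameter.
 */
static struct ex_bus *ex_bus_new(const char *name)
{
	struct ex_bus *bus;
	size_t len;

	if (!name)
		return ex_err_ptr(-EINVAL);

	len = strlen(name);
	if (len == 0 || len >= sizeof(bus->name))
		return ex_err_ptr(-EINVAL);

	bus = calloc(1, sizeof(*bus));
	if (!bus)
		return ex_err_ptr(-ENOMEM);

	memcpy(bus->name, name, len + 1);
	return bus;
}
```

A caller checks the result with ex_is_err() and extracts the error code
with ex_ptr_err(), so no separate out parameter is needed.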

Daniel Mack (13):
  kdbus: add documentation
  kdbus: add header file
  kdbus: add driver skeleton, ioctl entry points and utility functions
  kdbus: add connection pool implementation
  kdbus: add connection, queue handling and message validation code
  kdbus: add node and filesystem implementation
  kdbus: add code to gather metadata
  kdbus: add code for notifications and matches
  kdbus: add code for buses, domains and endpoints
  kdbus: add name registry implementation
  kdbus: add policy database implementation
  kdbus: add Makefile, Kconfig and MAINTAINERS entry
  kdbus: add selftests

 Documentation/ioctl/ioctl-number.txt             |    1 +
 Documentation/kdbus.txt                          | 1837 +++++++++++++++++++++
 MAINTAINERS                                      |   12 +
 include/uapi/linux/Kbuild                        |    1 +
 include/uapi/linux/kdbus.h                       |  933 +++++++++++
 include/uapi/linux/magic.h                       |    1 +
 init/Kconfig                                     |   12 +
 ipc/Makefile                                     |    2 +-
 ipc/kdbus/Makefile                               |   21 +
 ipc/kdbus/bus.c                                  |  459 ++++++
 ipc/kdbus/bus.h                                  |   98 ++
 ipc/kdbus/connection.c                           | 1838 ++++++++++++++++++++++
 ipc/kdbus/connection.h                           |  188 +++
 ipc/kdbus/domain.c                               |  349 ++++
 ipc/kdbus/domain.h                               |   84 +
 ipc/kdbus/endpoint.c                             |  497 ++++++
 ipc/kdbus/endpoint.h                             |   91 ++
 ipc/kdbus/fs.c                                   |  417 +++++
 ipc/kdbus/fs.h                                   |   22 +
 ipc/kdbus/handle.c                               |  993 ++++++++++++
 ipc/kdbus/handle.h                               |   20 +
 ipc/kdbus/item.c                                 |  258 +++
 ipc/kdbus/item.h                                 |   41 +
 ipc/kdbus/limits.h                               |   77 +
 ipc/kdbus/main.c                                 |   59 +
 ipc/kdbus/match.c                                |  524 ++++++
 ipc/kdbus/match.h                                |   31 +
 ipc/kdbus/message.c                              |  444 ++++++
 ipc/kdbus/message.h                              |   75 +
 ipc/kdbus/metadata.c                             |  698 ++++++++
 ipc/kdbus/metadata.h                             |   38 +
 ipc/kdbus/names.c                                |  921 +++++++++++
 ipc/kdbus/names.h                                |   81 +
 ipc/kdbus/node.c                                 |  872 ++++++++++
 ipc/kdbus/node.h                                 |   86 +
 ipc/kdbus/notify.c                               |  235 +++
 ipc/kdbus/notify.h                               |   29 +
 ipc/kdbus/policy.c                               |  629 ++++++++
 ipc/kdbus/policy.h                               |   61 +
 ipc/kdbus/pool.c                                 |  722 +++++++++
 ipc/kdbus/pool.h                                 |   44 +
 ipc/kdbus/queue.c                                |  608 +++++++
 ipc/kdbus/queue.h                                |   93 ++
 ipc/kdbus/util.c                                 |  166 ++
 ipc/kdbus/util.h                                 |  103 ++
 tools/testing/selftests/Makefile                 |    1 +
 tools/testing/selftests/kdbus/.gitignore         |   11 +
 tools/testing/selftests/kdbus/Makefile           |   45 +
 tools/testing/selftests/kdbus/kdbus-enum.c       |   94 ++
 tools/testing/selftests/kdbus/kdbus-enum.h       |   14 +
 tools/testing/selftests/kdbus/kdbus-test.c       |  546 +++++++
 tools/testing/selftests/kdbus/kdbus-test.h       |   81 +
 tools/testing/selftests/kdbus/kdbus-util.c       | 1240 +++++++++++++++
 tools/testing/selftests/kdbus/kdbus-util.h       |  143 ++
 tools/testing/selftests/kdbus/test-activator.c   |  317 ++++
 tools/testing/selftests/kdbus/test-benchmark.c   |  409 +++++
 tools/testing/selftests/kdbus/test-bus.c         |  130 ++
 tools/testing/selftests/kdbus/test-chat.c        |  123 ++
 tools/testing/selftests/kdbus/test-connection.c  |  501 ++++++
 tools/testing/selftests/kdbus/test-daemon.c      |   66 +
 tools/testing/selftests/kdbus/test-endpoint.c    |  221 +++
 tools/testing/selftests/kdbus/test-fd.c          |  664 ++++++++
 tools/testing/selftests/kdbus/test-free.c        |   34 +
 tools/testing/selftests/kdbus/test-match.c       |  437 +++++
 tools/testing/selftests/kdbus/test-message.c     |  371 +++++
 tools/testing/selftests/kdbus/test-metadata-ns.c |  258 +++
 tools/testing/selftests/kdbus/test-monitor.c     |  156 ++
 tools/testing/selftests/kdbus/test-names.c       |  184 +++
 tools/testing/selftests/kdbus/test-policy-ns.c   |  622 ++++++++
 tools/testing/selftests/kdbus/test-policy-priv.c | 1168 ++++++++++++++
 tools/testing/selftests/kdbus/test-policy.c      |   81 +
 tools/testing/selftests/kdbus/test-race.c        |  313 ++++
 tools/testing/selftests/kdbus/test-sync.c        |  241 +++
 tools/testing/selftests/kdbus/test-timeout.c     |   97 ++
 74 files changed, 23338 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/kdbus.txt
 create mode 100644 include/uapi/linux/kdbus.h
 create mode 100644 ipc/kdbus/Makefile
 create mode 100644 ipc/kdbus/bus.c
 create mode 100644 ipc/kdbus/bus.h
 create mode 100644 ipc/kdbus/connection.c
 create mode 100644 ipc/kdbus/connection.h
 create mode 100644 ipc/kdbus/domain.c
 create mode 100644 ipc/kdbus/domain.h
 create mode 100644 ipc/kdbus/endpoint.c
 create mode 100644 ipc/kdbus/endpoint.h
 create mode 100644 ipc/kdbus/fs.c
 create mode 100644 ipc/kdbus/fs.h
 create mode 100644 ipc/kdbus/handle.c
 create mode 100644 ipc/kdbus/handle.h
 create mode 100644 ipc/kdbus/item.c
 create mode 100644 ipc/kdbus/item.h
 create mode 100644 ipc/kdbus/limits.h
 create mode 100644 ipc/kdbus/main.c
 create mode 100644 ipc/kdbus/match.c
 create mode 100644 ipc/kdbus/match.h
 create mode 100644 ipc/kdbus/message.c
 create mode 100644 ipc/kdbus/message.h
 create mode 100644 ipc/kdbus/metadata.c
 create mode 100644 ipc/kdbus/metadata.h
 create mode 100644 ipc/kdbus/names.c
 create mode 100644 ipc/kdbus/names.h
 create mode 100644 ipc/kdbus/node.c
 create mode 100644 ipc/kdbus/node.h
 create mode 100644 ipc/kdbus/notify.c
 create mode 100644 ipc/kdbus/notify.h
 create mode 100644 ipc/kdbus/policy.c
 create mode 100644 ipc/kdbus/policy.h
 create mode 100644 ipc/kdbus/pool.c
 create mode 100644 ipc/kdbus/pool.h
 create mode 100644 ipc/kdbus/queue.c
 create mode 100644 ipc/kdbus/queue.h
 create mode 100644 ipc/kdbus/util.c
 create mode 100644 ipc/kdbus/util.h
 create mode 100644 tools/testing/selftests/kdbus/.gitignore
 create mode 100644 tools/testing/selftests/kdbus/Makefile
 create mode 100644 tools/testing/selftests/kdbus/kdbus-enum.c
 create mode 100644 tools/testing/selftests/kdbus/kdbus-enum.h
 create mode 100644 tools/testing/selftests/kdbus/kdbus-test.c
 create mode 100644 tools/testing/selftests/kdbus/kdbus-test.h
 create mode 100644 tools/testing/selftests/kdbus/kdbus-util.c
 create mode 100644 tools/testing/selftests/kdbus/kdbus-util.h
 create mode 100644 tools/testing/selftests/kdbus/test-activator.c
 create mode 100644 tools/testing/selftests/kdbus/test-benchmark.c
 create mode 100644 tools/testing/selftests/kdbus/test-bus.c
 create mode 100644 tools/testing/selftests/kdbus/test-chat.c
 create mode 100644 tools/testing/selftests/kdbus/test-connection.c
 create mode 100644 tools/testing/selftests/kdbus/test-daemon.c
 create mode 100644 tools/testing/selftests/kdbus/test-endpoint.c
 create mode 100644 tools/testing/selftests/kdbus/test-fd.c
 create mode 100644 tools/testing/selftests/kdbus/test-free.c
 create mode 100644 tools/testing/selftests/kdbus/test-match.c
 create mode 100644 tools/testing/selftests/kdbus/test-message.c
 create mode 100644 tools/testing/selftests/kdbus/test-metadata-ns.c
 create mode 100644 tools/testing/selftests/kdbus/test-monitor.c
 create mode 100644 tools/testing/selftests/kdbus/test-names.c
 create mode 100644 tools/testing/selftests/kdbus/test-policy-ns.c
 create mode 100644 tools/testing/selftests/kdbus/test-policy-priv.c
 create mode 100644 tools/testing/selftests/kdbus/test-policy.c
 create mode 100644 tools/testing/selftests/kdbus/test-race.c
 create mode 100644 tools/testing/selftests/kdbus/test-sync.c
 create mode 100644 tools/testing/selftests/kdbus/test-timeout.c


 create mode 100644 tools/testing/selftests/kdbus/kdbus-util.h
 create mode 100644 tools/testing/selftests/kdbus/test-activator.c
 create mode 100644 tools/testing/selftests/kdbus/test-benchmark.c
 create mode 100644 tools/testing/selftests/kdbus/test-bus.c
 create mode 100644 tools/testing/selftests/kdbus/test-chat.c
 create mode 100644 tools/testing/selftests/kdbus/test-connection.c
 create mode 100644 tools/testing/selftests/kdbus/test-daemon.c
 create mode 100644 tools/testing/selftests/kdbus/test-endpoint.c
 create mode 100644 tools/testing/selftests/kdbus/test-fd.c
 create mode 100644 tools/testing/selftests/kdbus/test-free.c
 create mode 100644 tools/testing/selftests/kdbus/test-match.c
 create mode 100644 tools/testing/selftests/kdbus/test-message.c
 create mode 100644 tools/testing/selftests/kdbus/test-metadata-ns.c
 create mode 100644 tools/testing/selftests/kdbus/test-monitor.c
 create mode 100644 tools/testing/selftests/kdbus/test-names.c
 create mode 100644 tools/testing/selftests/kdbus/test-policy-ns.c
 create mode 100644 tools/testing/selftests/kdbus/test-policy-priv.c
 create mode 100644 tools/testing/selftests/kdbus/test-policy.c
 create mode 100644 tools/testing/selftests/kdbus/test-race.c
 create mode 100644 tools/testing/selftests/kdbus/test-sync.c
 create mode 100644 tools/testing/selftests/kdbus/test-timeout.c


* kdbus: add documentation
  2014-11-21  5:02 ` Greg Kroah-Hartman
  (?)
@ 2014-11-21  5:02 ` Greg Kroah-Hartman
  2014-11-21  8:29   ` Harald Hoyer
                     ` (3 more replies)
  -1 siblings, 4 replies; 73+ messages in thread
From: Greg Kroah-Hartman @ 2014-11-21  5:02 UTC (permalink / raw)
  To: arnd, ebiederm, gnomes, teg, jkosina, luto, linux-api, linux-kernel
  Cc: daniel, dh.herrmann, tixxdz, Greg Kroah-Hartman

From: Daniel Mack <daniel@zonque.org>

kdbus is a system for low-latency, low-overhead, easy to use
interprocess communication (IPC).

The interface to all functions in this driver is implemented through
ioctls on files exposed through the mount point of a kdbusfs.  This
patch adds detailed documentation about the kernel level API design.

Signed-off-by: Daniel Mack <daniel@zonque.org>
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Signed-off-by: Djalal Harouni <tixxdz@opendz.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 Documentation/kdbus.txt | 1837 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 1837 insertions(+)
 create mode 100644 Documentation/kdbus.txt

diff --git a/Documentation/kdbus.txt b/Documentation/kdbus.txt
new file mode 100644
index 000000000000..2bd7277ef179
--- /dev/null
+++ b/Documentation/kdbus.txt
@@ -0,0 +1,1837 @@
+D-Bus is a system for powerful, easy to use interprocess communication (IPC).
+
+The focus of this document is an overview of the low-level, native kernel D-Bus
+transport called kdbus. Kdbus exposes its functionality via files in a
+filesystem called 'kdbusfs'. All communication between processes takes place
+via ioctls on files exposed through the mount point of a kdbusfs. The default
+mount point of kdbusfs is /sys/fs/kdbus.
+
+For the general D-Bus protocol specification, the payload format, the
+marshaling, and the communication semantics, please refer to:
+  http://dbus.freedesktop.org/doc/dbus-specification.html
+
+For a kdbus specific userspace library implementation please refer to:
+  http://cgit.freedesktop.org/systemd/systemd/tree/src/systemd/sd-bus.h
+
+Articles about D-Bus and kdbus:
+  http://lwn.net/Articles/580194/
+
+
+1. Terminology
+===============================================================================
+
+  Domain:
+    A domain is created each time a kdbusfs is mounted. Each process that is
+    capable of mounting a new instance of a kdbusfs will have its own kdbus
+    hierarchy. Each domain (i.e., each mount point) offers its own "control"
+    file to create new buses. Domains have no connection to each other and
+    can neither see nor talk to each other. See section 5 for more details.
+
+  Bus:
+    A bus is a named object inside a domain. Clients exchange messages
+    over a bus. Multiple buses themselves have no connection to each other;
+    messages can only be exchanged on the same bus. The default entry point to
+    a bus, to which clients connect, is the "bus" file
+    /sys/fs/kdbus/<bus name>/bus.
+    Common operating system setups create one "system bus" per system, and one
+    "user bus" for every logged-in user. Applications or services may create
+    their own private named buses. See section 5 for more details.
+
+  Endpoint:
+    An endpoint provides the file to talk to a bus. Opening an endpoint
+    creates a new connection to the bus to which the endpoint belongs.
+    Inside the directory of the bus, every bus has a default endpoint
+    called "bus". A bus can optionally offer additional endpoints with
+    custom names to provide restricted access to the bus. Custom endpoints
+    carry additional policy which can be used to create sandboxes with a
+    locked-down, limited, filtered access to a bus.  See section 5 for
+    more details.
+
+  Connection:
+    A connection to a bus is created by opening an endpoint file inside a
+    bus' directory and becoming an active client with the HELLO exchange. Every
+    connected client connection has a unique identifier on the bus and can
+    address messages to every other connection on the same bus by using
+    the peer's connection id as the destination.
+    See section 6 for more details.
+
+  Pool:
+    Each connection allocates a piece of shmem-backed memory that is used
+    to receive messages and answers to ioctl commands from the kernel. It is
+    never used to send anything to the kernel. In order to access that memory,
+    userspace must mmap() it into its task.
+    See section 12 for more details.
+
+  Well-known Name:
+    A connection can, in addition to its implicit unique connection id, request
+    the ownership of a textual well-known name. Well-known names are noted in
+    reverse-domain notation, such as com.example.service1. Connections offering
+    a service on a bus are usually reached by their well-known name. The analogy
+    of connection id and well-known name is an IP address and a DNS name
+    associated with that address.
+
+  Message:
+    Connections can exchange messages with other connections by addressing
+    the peers with their connection id or well-known name. A message consists
+    of a message header with kernel-specific information on how to route the
+    message, and the message payload, which is a logical byte stream of
+    arbitrary size. Messages can carry additional file descriptors to be passed
+    from one connection to another. Every connection can specify which set of
+    metadata the kernel should attach to the message when it is delivered
+    to the receiving connection. Metadata contains information like: system
+    timestamps, uid, gid, tid, proc-starttime, well-known-names, process comm,
+    process exe, process argv, cgroup, capabilities, seclabel, audit session,
+    loginuid and the connection's human-readable name.
+    See sections 7 and 13 for more details.
+
+  Item:
+    The API of kdbus implements a notion of items, submitted through and
+    returned by most ioctls, and stored inside data structures in the
+    connection's pool. See section 4 for more details.
+
+  Broadcast and Match:
+    Broadcast messages are potentially sent to all connections of a bus. By
+    default, the connections will not actually receive any of the sent
+    broadcast messages; only after a connection installs a match for specific
+    message properties do matching broadcast messages pass this filter.
+    See section 10 for more details.
+
+  Policy:
+    A policy is a set of rules that define which connections can see, talk to,
+    or register a well-known name on the bus. A policy is attached to buses and
+    custom endpoints, and modified by policy holder connections or owners of
+    custom endpoints. See section 11 for more details.
+
+    Access rules to allow who can see a name on the bus are only checked on
+    custom endpoints. Policies may be defined with names that end with '.*'.
+    When matching a well-known name against such a wildcard entry, the last
+    part of the name is ignored and checked against the wildcard name without
+    the trailing '.*'. See section 11 for more details.
+
+  Privileged bus users:
+    A user connecting to the bus is considered privileged if it is either the
+    creator of the bus, or if it has the CAP_IPC_OWNER capability flag set.
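The wildcard rule described under "Policy" above can be sketched in userspace
C. This is only one reading of that paragraph, not the kernel's code: the
helper name example_policy_matches() is hypothetical, and the sketch assumes
that a '.*' entry matches exactly one trailing name component.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: returns true if the well-known name 'name' is
 * covered by the policy entry 'entry'. Entries ending in ".*" are
 * treated as wildcards; all other entries must match exactly.
 */
static bool example_policy_matches(const char *entry, const char *name)
{
	size_t elen = strlen(entry);
	const char *dot;

	/* No trailing ".*": plain string comparison. */
	if (elen < 2 || strcmp(entry + elen - 2, ".*") != 0)
		return strcmp(entry, name) == 0;

	/* Drop the last part of the name (everything after the last dot). */
	dot = strrchr(name, '.');
	if (!dot)
		return false;

	/* Compare the shortened name with the entry minus its ".*". */
	return (size_t)(dot - name) == elen - 2 &&
	       strncmp(name, entry, elen - 2) == 0;
}
```

Under this reading, "com.example.*" covers "com.example.service1" but neither
"com.example" itself nor "com.example.a.b".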
+
+
+2. Control Files Layout
+===============================================================================
+
+The kdbus interface is exposed through files in its kdbusfs mount point
+(defaults to /sys/fs/kdbus):
+
+  /sys/fs/kdbus
+  |-- control
+  |-- 0-system
+  |   |-- bus
+  |   `-- ep.apache
+  |-- 1000-user
+  |   `-- bus
+  `-- 2702-user
+      |-- bus
+      `-- ep.app
+
+
+3. Data Structures and flags
+===============================================================================
+
+3.1 Data structures and interconnections
+----------------------------------------
+
+  +--------------------------------------------------------------------------+
+  | Domain (Mount Point)                                                     |
+  | /sys/fs/kdbus/control                                                    |
+  | +----------------------------------------------------------------------+ |
+  | | Bus (System Bus)                                                     | |
+  | | /sys/fs/kdbus/0-system/                                              | |
+  | | +-------------------------------+ +--------------------------------+ | |
+  | | | Endpoint                      | | Endpoint                       | | |
+  | | | /sys/fs/kdbus/0-system/bus    | | /sys/fs/kdbus/0-system/ep.app  | | |
+  | | +-------------------------------+ +--------------------------------+ | |
+  | | +--------------+ +--------------+ +--------------+ +---------------+ | |
+  | | | Connection   | | Connection   | | Connection   | | Connection    | | |
+  | | | :1.22        | | :1.25        | | :1.55        | | :1.81         | | |
+  | | +--------------+ +--------------+ +--------------+ +---------------+ | |
+  | +----------------------------------------------------------------------+ |
+  |                                                                          |
+  | +----------------------------------------------------------------------+ |
+  | | Bus (User Bus for UID 2702)                                          | |
+  | | /sys/fs/kdbus/2702-user/                                             | |
+  | | +-------------------------------+ +--------------------------------+ | |
+  | | | Endpoint                      | | Endpoint                       | | |
+  | | | /sys/fs/kdbus/2702-user/bus   | | /sys/fs/kdbus/2702-user/ep.app | | |
+  | | +-------------------------------+ +--------------------------------+ | |
+  | | +--------------+ +--------------+ +--------------+ +---------------+ | |
+  | | | Connection   | | Connection   | | Connection   | | Connection    | | |
+  | | | :1.22        | | :1.25        | | :1.55        | | :1.81         | | |
+  | | +--------------+ +--------------+ +--------------+ +---------------+ | |
+  | +----------------------------------------------------------------------+ |
+  +--------------------------------------------------------------------------+
+
+The above description uses the D-Bus notation of unique connection names that
+adds a ":1." prefix to the connection's unique ID. kdbus itself doesn't
+use that notation, neither internally nor externally. However, libraries and
+other userspace code that aims for compatibility with D-Bus might.
+
+3.2 Flags
+---------
+
+All ioctls used in the communication with the driver contain two 64-bit fields,
+'flags' and 'kernel_flags'. In 'flags', the behavior of the command can be
+tweaked, whereas in 'kernel_flags', the kernel driver writes back the mask of
+supported bits upon each call, and sets the KDBUS_FLAGS_KERNEL bit. This is a
+way to probe possible kernel features and make code forward and backward
+compatible.
+
+All bits that are not recognized by the kernel in 'flags' are rejected, and the
+ioctl fails with -EINVAL.
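
A caller could use the written-back 'kernel_flags' mask roughly as follows.
This is a minimal sketch: the helper name is hypothetical, and the value of
the marker bit (bit 63 here) is an assumption that should be replaced by
KDBUS_FLAGS_KERNEL from include/uapi/linux/kdbus.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: stand-in for KDBUS_FLAGS_KERNEL; use the uapi header's value. */
#define EXAMPLE_FLAGS_KERNEL (1ULL << 63)

/*
 * Hypothetical helper: returns true if the kernel filled in
 * 'kernel_flags' and every bit in 'wanted' is part of the supported mask.
 */
static bool example_flags_supported(uint64_t kernel_flags, uint64_t wanted)
{
	/* Without the marker bit, the mask was never written back. */
	if (!(kernel_flags & EXAMPLE_FLAGS_KERNEL))
		return false;

	return (wanted & ~kernel_flags) == 0;
}
```

Checking the mask after a first call lets userspace avoid setting a bit in
'flags' that would make a later ioctl fail with -EINVAL.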
+
+
+4. Items
+===============================================================================
+
+To flexibly augment transport structures used by kdbus, data blobs of type
+struct kdbus_item are used. An item has a fixed-sized header that only stores
+the type of the item and the overall size. The total size is variable and is
+in some cases defined by the item type, in other cases, they can be of
+arbitrary length (for instance, a string).
+
+In the external kernel API, items are used for many ioctls to transport
+optional information from userspace to kernelspace. They are also used for
+information stored in a connection's pool, such as messages, name lists or
+requested connection information.
+
+In all such occasions where items are used as part of the kdbus kernel API,
+they are embedded in structs that have an overall size of their own, so there
+can be many of them.
+
+The kernel expects all items to be aligned to 8-byte boundaries.
+
+A simple iterator in userspace would iterate over the items until the items
+have reached the embedding structure's overall size. An example implementation
+of such an iterator can be found in tools/testing/selftests/kdbus/kdbus-util.h.
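
For orientation, such an iterator could look like the sketch below. It is not
the selftest code: struct example_item is a hypothetical stand-in for the
fixed-size header of struct kdbus_item (size and type, as described above),
and the helper name is made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the fixed-size item header. */
struct example_item {
	uint64_t size;	/* overall size of the item, header included */
	uint64_t type;	/* item type */
	/* variable-sized payload follows */
};

/* Round up to the 8-byte boundary the kernel expects between items. */
#define EXAMPLE_ALIGN8(v) (((v) + 7ULL) & ~7ULL)

/* Count the items of 'type' in a buffer of 'total' bytes. */
static size_t example_count_items(const void *buf, uint64_t total,
				  uint64_t type)
{
	const uint8_t *p = buf;
	uint64_t off = 0;
	size_t n = 0;

	while (off + sizeof(struct example_item) <= total) {
		const struct example_item *it = (const void *)(p + off);

		/* Stop on an item that is too small or overruns the buffer. */
		if (it->size < sizeof(*it) || it->size > total - off)
			break;

		if (it->type == type)
			n++;

		/* The next item starts at the next 8-byte boundary. */
		off += EXAMPLE_ALIGN8(it->size);
	}

	return n;
}
```

The walk ends when the remaining space can no longer hold an item header,
which corresponds to reaching the embedding structure's overall size.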
+
+
+5. Creation of new domains, buses and endpoints
+===============================================================================
+
+The initial kdbus domain is unconditionally created by the kernel module. A
+domain contains a "control" file which allows creating a new bus. New domains
+(mount points) do not have any buses created by default.
+
+
+5.1 Buses
+---------
+
+Opening the control file returns a file descriptor which accepts the
+KDBUS_CMD_BUS_MAKE ioctl to create a new bus. The control file descriptor needs
+to be kept open for the entire lifetime of the created bus; closing it will
+immediately clean up the entire bus and all its associated resources and
+endpoints. Every control file descriptor can only be used once
+to create a new bus; from that point, it is not used for any further
+communication until the final close().
+
+Each bus will generate a random, 128-bit UUID upon creation. It will be
+returned to the creators of connections through kdbus_cmd_hello.id128 and can
+be used by userspace to uniquely identify buses, even across different machines
+or containers. The UUID will have its variant bits set to 'DCE', and denote
+version 4 (random).
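
Userspace can sanity-check these bits in the 16 bytes it gets back. The sketch
below assumes the usual RFC 4122 byte layout (version in the high nibble of
byte 6, variant in the top two bits of byte 8); the helper name is
hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns true if 'id128' has the version 4
 * (random) nibble and the DCE variant bits set.
 */
static bool example_is_random_dce_uuid(const uint8_t id128[16])
{
	bool version4 = (id128[6] & 0xF0) == 0x40;
	bool variant_dce = (id128[8] & 0xC0) == 0x80;

	return version4 && variant_dce;
}
```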
+
+Optionally, an item of type KDBUS_ITEM_ATTACH_FLAGS_RECV can be attached to
+KDBUS_CMD_BUS_MAKE. In that, a set of required attach flags can be passed,
+which is used as a negotiation measure during connection creation.
+
+
+5.2 Endpoints
+-------------
+
+Endpoints are entry points to a bus. By default, each bus has a default
+endpoint called 'bus'. The bus owner has the ability to create custom
+endpoints with specific names, permissions, and policy databases (see below).
+
+To create a custom endpoint, use the KDBUS_CMD_ENDPOINT_MAKE ioctl with struct
+kdbus_cmd_make. Custom endpoints always have a policy database that, by
+default, does not allow anything. Everything that users of this new endpoint
+should be able to do has to be explicitly specified through KDBUS_ITEM_NAME and
+KDBUS_ITEM_POLICY_ACCESS items.
+
+
+5.3 Domains
+-----------
+
+Each time a kdbusfs is mounted, a new kdbus domain is created, with its own
+'control' file only. The lifetime of the domain ends once the user has
+unmounted the kdbusfs.
+
+
+5.4 Creating buses and endpoints
+--------------------------------
+
+KDBUS_CMD_BUS_MAKE, and KDBUS_CMD_ENDPOINT_MAKE take a
+struct kdbus_cmd_make argument.
+
+struct kdbus_cmd_make {
+  __u64 size;
+    The overall size of the struct, including its items.
+
+  __u64 flags;
+    The flags for creation.
+
+    KDBUS_MAKE_ACCESS_GROUP
+      Make the device file group-accessible
+
+    KDBUS_MAKE_ACCESS_WORLD
+      Make the device file world-accessible
+
+  __u64 kernel_flags;
+    Valid flags for this command, returned by the kernel upon each call.
+
+  struct kdbus_item items[0];
+    A list of items, only used for creating custom endpoints. Has specific
+    meanings for KDBUS_CMD_BUS_MAKE and KDBUS_CMD_ENDPOINT_MAKE (see above).
+};
+
+
+6. Connections
+===============================================================================
+
+
+6.1 Connection IDs and well-known connection names
+--------------------------------------------------
+
+Connections are identified by their connection id, internally implemented as a
+uint64_t counter. The IDs of every newly created bus start at 1, and every new
+connection will increment the counter by 1. The ids are not reused.
+
+In higher level tools, the user visible representation of a connection is
+defined by the D-Bus protocol specification as ":1.<id>".
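
A higher-level tool could derive that representation from the numeric id as
sketched here; the helper name is hypothetical and kdbus itself never uses
this string form.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: writes ":1.<id>" into 'buf' and returns the
 * snprintf() result, i.e. the length the full string needs.
 */
static int example_format_unique_name(uint64_t id, char *buf, size_t len)
{
	return snprintf(buf, len, ":1.%" PRIu64, id);
}
```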
+
+Messages with a specific uint64_t destination id are directly delivered to
+the connection with the corresponding id. Messages with the special destination
+id KDBUS_DST_ID_BROADCAST are broadcast messages and are potentially delivered
+to all known connections on the bus; clients interested in broadcast messages
+need to subscribe to the specific messages they are interested in before
+any broadcast message reaches them.
+
+Messages synthesized and sent directly by the kernel will carry the special
+source id KDBUS_SRC_ID_KERNEL (0).
+
+In addition to the unique uint64_t connection id, established connections can
+request the ownership of well-known names, under which they can be found and
+addressed by other bus clients. A well-known name is associated with one and
+only one connection at a time. See section 8 on name acquisition and the
+name registry, and the validity of names.
+
+Messages can specify the special destination id 0 and carry a well-known name
+in the message data. Such a message is delivered to the destination connection
+which owns that well-known name.
+
+  +-------------------------------------------------------------------------+
+  | +---------------+     +---------------------------+                     |
+  | | Connection    |     | Message                   | -----------------+  |
+  | | :1.22         | --> | src: 22                   |                  |  |
+  | |               |     | dst: 25                   |                  |  |
+  | |               |     |                           |                  |  |
+  | |               |     |                           |                  |  |
+  | |               |     +---------------------------+                  |  |
+  | |               |                                                    |  |
+  | |               | <--------------------------------------+           |  |
+  | +---------------+                                        |           |  |
+  |                                                          |           |  |
+  | +---------------+     +---------------------------+      |           |  |
+  | | Connection    |     | Message                   | -----+           |  |
+  | | :1.25         | --> | src: 25                   |                  |  |
+  | |               |     | dst: 0xffffffffffffffff   | -------------+   |  |
+  | |               |     |  (KDBUS_DST_ID_BROADCAST) |              |   |  |
+  | |               |     |                           | ---------+   |   |  |
+  | |               |     +---------------------------+          |   |   |  |
+  | |               |                                            |   |   |  |
+  | |               | <--------------------------------------------------+  |
+  | +---------------+                                            |   |      |
+  |                                                              |   |      |
+  | +---------------+     +---------------------------+          |   |      |
+  | | Connection    |     | Message                   | --+      |   |      |
+  | | :1.55         | --> | src: 55                   |   |      |   |      |
+  | |               |     | dst: 0 / org.foo.bar      |   |      |   |      |
+  | |               |     |                           |   |      |   |      |
+  | |               |     |                           |   |      |   |      |
+  | |               |     +---------------------------+   |      |   |      |
+  | |               |                                     |      |   |      |
+  | |               | <------------------------------------------+   |      |
+  | +---------------+                                     |          |      |
+  |                                                       |          |      |
+  | +---------------+                                     |          |      |
+  | | Connection    |                                     |          |      |
+  | | :1.81         |                                     |          |      |
+  | | org.foo.bar   |                                     |          |      |
+  | |               |                                     |          |      |
+  | |               |                                     |          |      |
+  | |               | <-----------------------------------+          |      |
+  | |               |                                                |      |
+  | |               | <----------------------------------------------+      |
+  | +---------------+                                                       |
+  +-------------------------------------------------------------------------+
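
The three addressing modes shown in the diagram can be told apart by the
destination id alone. The sketch below uses its own constants, taken from the
values given in this section (0 for name-based delivery, all bits set for
broadcast); the enum and function names are hypothetical.

```c
#include <stdint.h>

/* Values as described above; the real names live in the uapi header. */
#define EXAMPLE_DST_ID_NAME		0ULL
#define EXAMPLE_DST_ID_BROADCAST	(~0ULL)

enum example_route {
	EXAMPLE_ROUTE_NAME,	/* look up the owner of a well-known name */
	EXAMPLE_ROUTE_BROADCAST,/* deliver to all connections with a match */
	EXAMPLE_ROUTE_UNICAST,	/* deliver to the connection with this id */
};

/* Classify a message by its destination id. */
static enum example_route example_classify_dst(uint64_t dst_id)
{
	if (dst_id == EXAMPLE_DST_ID_NAME)
		return EXAMPLE_ROUTE_NAME;
	if (dst_id == EXAMPLE_DST_ID_BROADCAST)
		return EXAMPLE_ROUTE_BROADCAST;
	return EXAMPLE_ROUTE_UNICAST;
}
```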
+
+
+6.2 Creating connections
+------------------------
+
+A connection to a bus is created by opening an endpoint file of a bus and
+becoming an active client with the KDBUS_CMD_HELLO ioctl. Every connected client
+connection has a unique identifier on the bus and can address messages to every
+other connection on the same bus by using the peer's connection id as the
+destination.
+
+The KDBUS_CMD_HELLO ioctl takes the following struct as argument.
+
+struct kdbus_cmd_hello {
+  __u64 size;
+    The overall size of the struct, including all attached items.
+
+  __u64 conn_flags;
+    Flags to apply to this connection:
+
+    KDBUS_HELLO_ACCEPT_FD
+      When this flag is set, the connection can be sent file descriptors
+      as message payload. If it's not set, any attempt to do so will
+      result in -ECOMM on the sender's side.
+
+    KDBUS_HELLO_ACTIVATOR
+      Make this connection an activator (see below). With this bit set,
+      an item of type KDBUS_ITEM_NAME has to be attached which describes
+      the well-known name this connection should be an activator for.
+
+    KDBUS_HELLO_POLICY_HOLDER
+      Make this connection a policy holder (see below). With this bit set,
+      an item of type KDBUS_ITEM_NAME has to be attached which describes
+      the well-known name this connection should hold a policy for.
+
+    KDBUS_HELLO_MONITOR
+      Make this connection an eaves-dropping connection that receives all
+      unicast messages sent on the bus. To also receive broadcast messages,
+      the connection has to upload appropriate matches as well.
+      This flag is only valid for privileged bus connections.
+
+  __u64 attach_flags_send;
+      Set the bits for metadata this connection permits to be sent to the
+      receiving peer. Only metadata items that are both allowed to be sent by
+      the sender and that are requested by the receiver will effectively be
+      attached to the message eventually. Note, however, that the bus may
+      optionally enforce some of those bits to be set. If the match fails,
+      -ECONNREFUSED will be returned. In either case, this field will be set
+      to the mask of metadata items that are enforced by the bus. The
+      KDBUS_FLAGS_KERNEL bit will be set as well.
+
+  __u64 attach_flags_recv;
+      Request the attachment of metadata for each message received by this
+      connection. The metadata actually attached may augment the list
+      of requested items. See section 13 for more details.
+
+  __u64 bus_flags;
+      Upon successful completion of the ioctl, this member will contain the
+      flags of the bus it connected to.
+
+  __u64 id;
+      Upon successful completion of the ioctl, this member will contain the
+      id of the new connection.
+
+  __u64 pool_size;
+      The size of the communication pool, in bytes. The pool can be accessed
+      by calling mmap() on the file descriptor that was used to issue the
+      KDBUS_CMD_HELLO ioctl.
+
+  struct kdbus_bloom_parameter bloom;
+      Bloom filter parameter (see below).
+
+  __u8 id128[16];
+      Upon successful completion of the ioctl, this member will contain the
+      128 bit wide UUID of the connected bus.
+
+  struct kdbus_item items[0];
+      Variable list of items to add optional additional information. The
+      following items are currently expected/valid:
+
+      KDBUS_ITEM_CONN_DESCRIPTION
+        Contains a string that describes this connection's name, so it can be
+        identified later.
+
+      KDBUS_ITEM_NAME
+      KDBUS_ITEM_POLICY_ACCESS
+        For activators and policy holders only, combinations of these two
+        items describe policy access entries (see section about policy).
+
+      KDBUS_ITEM_CREDS
+      KDBUS_ITEM_SECLABEL
+        Privileged bus users may submit these types in order to create
+        connections with faked credentials. The only real use case for this
+        is a proxy service which acts on behalf of some other tasks. For a
+        connection that runs in that mode, the message's metadata items will
+        be limited to what's specified here. See section 13 for more
+        information.
+
+      Items of other types are silently ignored.
+};
+
+
+6.3 Activator and policy holder connection
+------------------------------------------
+
+An activator connection is a placeholder for a well-known name. Messages sent
+to such a connection can be used by userspace to start an implementor
+connection, which will then get all the messages from the activator copied
+over. An activator connection cannot be used to send any message.
+
+A policy holder connection only installs a policy for one or more names.
+These policy entries are kept active as long as the connection is alive, and
+are removed once it terminates. Such a policy connection type can be used to
+deploy restrictions for names that are not yet active on the bus. A policy
+holder connection cannot be used to send any message.
+
+The creation of activator, policy holder or monitor connections is an operation
+restricted to privileged users on the bus (see section "Terminology").
+
+
+6.4 Retrieving information on a connection
+------------------------------------------
+
+The KDBUS_CMD_CONN_INFO ioctl can be used to retrieve credentials and
+properties of the initial creator of a connection. This ioctl uses the
+following struct:
+
+struct kdbus_cmd_info {
+  __u64 size;
+    The overall size of the struct, including the name with its 0-byte string
+    terminator.
+
+  __u64 flags;
+    Specify which metadata items should be attached to the answer.
+    See section 13 for more details.
+
+    After the ioctl returns, this field will contain the current metadata
+    attach flags of the connection.
+
+  __u64 kernel_flags;
+    Valid flags for this command, returned by the kernel upon each call.
+
+  __u64 id;
+    The connection's numerical ID to retrieve information for. If set to
+    a non-zero value, the 'name' field is ignored.
+
+  __u64 offset;
+    When the ioctl returns, this value will yield the offset of the connection
+    information inside the caller's pool.
+
+  struct kdbus_item items[0];
+    The optional item list, containing the well-known name to look up as
+    a KDBUS_ITEM_OWNED_NAME. Only required if the 'id' field is set to 0.
+    All other items are currently ignored.
+};
+
+After the ioctl returns, the following struct will be stored in the caller's
+pool at 'offset'.
+
+struct kdbus_info {
+  __u64 size;
+    The overall size of the struct, including all its items.
+
+  __u64 id;
+    The connection's unique ID.
+
+  __u64 flags;
+    The connection's flags as specified when it was created.
+
+  __u64 kernel_flags;
+    Valid flags for this command, returned by the kernel upon each call.
+
+  struct kdbus_item items[0];
+    Depending on the 'flags' field in struct kdbus_cmd_info, items of
+    types KDBUS_ITEM_OWNED_NAME and KDBUS_ITEM_CONN_DESCRIPTION follow
+    here.
+};
+
+Once the caller is finished with parsing the return buffer, it needs to call
+KDBUS_CMD_FREE for the offset.
+
+
+6.5 Getting information about a connection's bus creator
+--------------------------------------------------------
+
+The KDBUS_CMD_BUS_CREATOR_INFO ioctl takes the same struct as
+KDBUS_CMD_CONN_INFO but is used to retrieve information about the creator of
+the bus the connection is attached to. The metadata returned by this call is
+collected during the creation of the bus and is never altered afterwards, so
+it provides pristine information on the task that created the bus, at the
+moment when it did so.
+
+In response to this call, a slice in the connection's pool is allocated and
+filled with an object of type struct kdbus_info, pointed to by the ioctl's
+'offset' field.
+
+struct kdbus_info {
+  __u64 size;
+    The overall size of the struct, including all its items.
+
+  __u64 id;
+    The bus' ID
+
+  __u64 flags;
+    The bus' flags as specified when it was created.
+
+  __u64 kernel_flags;
+    Valid flags for this command, returned by the kernel upon each call.
+
+  struct kdbus_item items[0];
+    Metadata information is stored in items here.
+};
+
+Once the caller is finished with parsing the return buffer, it needs to call
+KDBUS_CMD_FREE for the offset.
+
+
+6.6 Updating connection details
+-------------------------------
+
+Some of a connection's details can be updated with the KDBUS_CMD_CONN_UPDATE
+ioctl, using the file descriptor that was used to create the connection.
+The update command uses the following struct.
+
+struct kdbus_cmd_update {
+  __u64 size;
+    The overall size of the struct, including all its items.
+
+  struct kdbus_item items[0];
+    Items to describe the connection details to be updated. The following item
+    types are supported:
+
+    KDBUS_ITEM_ATTACH_FLAGS_SEND
+      Supply a new set of items that this connection permits to be sent along
+      with messages.
+
+    KDBUS_ITEM_ATTACH_FLAGS_RECV
+      Supply a new set of items to be attached to each message.
+
+    KDBUS_ITEM_NAME
+    KDBUS_ITEM_POLICY_ACCESS
+      Policy holder connections may supply a new set of policy information
+      with these items. For other connection types, -EOPNOTSUPP is returned.
+};
+
+
+6.7 Termination
+---------------
+
+A connection can be terminated by simply closing the file descriptor that was
+used to start the connection. All pending incoming messages will be discarded,
+and the memory in the pool will be freed.
+
+An alternative way of closing down a connection is calling the KDBUS_CMD_BYEBYE
+ioctl on it, which will only succeed if the message queue of the connection is
+empty at the time of closing; otherwise, -EBUSY is returned.
+
+When this ioctl returns successfully, the connection has been terminated and
+won't accept any new messages from remote peers. This way, a connection can
+be terminated race-free, without losing any messages.
+
+
+7. Messages
+===============================================================================
+
+Messages consist of a fixed-size header followed directly by a list of
+variable-sized data 'items'. The overall message size is specified in the
+header of the message. The chain of data items can contain well-defined
+message metadata fields, raw data, references to data, or file descriptors.
+
+
+7.1 Sending messages
+--------------------
+
+Messages are passed to the kernel with the KDBUS_CMD_MSG_SEND ioctl. Depending
+on the destination address of the message, the kernel delivers the message to
+the specific destination connection or to all connections on the same bus.
+Sending messages across buses is not possible. Messages are always queued in
+the memory pool of the destination connection (see below).
+
+The KDBUS_CMD_MSG_SEND ioctl uses struct kdbus_msg to describe the message to
+be sent.
+
+struct kdbus_msg {
+  __u64 size;
+    The overall size of the struct, including the attached items.
+
+  __u64 flags;
+    Flags for message delivery:
+
+    KDBUS_MSG_FLAGS_EXPECT_REPLY
+      Expect a reply from the remote peer to this message. With this bit set,
+      the timeout_ns field must be set to a non-zero number of nanoseconds in
+      which the receiving peer is expected to reply. If such a reply is not
+      received in time, the sender will be notified with a timeout message
+      (see below). The value must be an absolute value, in nanoseconds and
+      based on CLOCK_MONOTONIC.
+
+      For a message to be accepted as reply, it must be a direct message to
+      the original sender (not a broadcast), and its kdbus_msg.cookie_reply
+      must match the previous message's kdbus_msg.cookie.
+
+      Expected replies also temporarily open the policy of the sending
+      connection, so the other peer is allowed to respond within the given
+      time window.
+
+    KDBUS_MSG_FLAGS_SYNC_REPLY
+      By default, all calls to kdbus are considered asynchronous,
+      non-blocking. However, as there are many use cases that need to wait
+      for a remote peer to answer a method call, there's a way to send a
+      message and wait for a reply in a synchronous fashion. This is what
+      the KDBUS_MSG_FLAGS_SYNC_REPLY flag controls. The KDBUS_CMD_MSG_SEND
+      ioctl will block until the reply has arrived, the timeout limit is
+      reached, the remote connection is shut down, or the call is
+      interrupted by a signal before any reply arrives; see signal(7).
+
+      The offset of the reply message in the sender's pool is stored in
+      'offset_reply' when the ioctl has returned without error. Hence,
+      there is no need for another KDBUS_CMD_MSG_RECV ioctl or anything else
+      to receive the reply.
+
+    KDBUS_MSG_FLAGS_NO_AUTO_START
+      By default, when a message is sent to an activator connection, the
+      activator is notified and will start an implementor. This flag inhibits
+      that behavior. With this bit set, and the remote being an activator,
+      -EADDRNOTAVAIL is returned from the ioctl.
+
+  __u64 kernel_flags;
+    Valid flags for this command, returned by the kernel upon each call of
+    KDBUS_CMD_MSG_SEND.
+
+  __s64 priority;
+    The priority of this message. Receiving messages (see below) may
+    optionally be constrained to messages of a minimal priority. This
+    allows for use cases where timing critical data is interleaved with
+    control data on the same connection. If unused, the priority should be
+    set to zero.
+
+  __u64 dst_id;
+    The numeric ID of the destination connection, or KDBUS_DST_ID_BROADCAST
+    (~0ULL) to address every peer on the bus, or KDBUS_DST_ID_NAME (0) to look
+    it up dynamically from the bus' name registry. In the latter case, an item
+    of type KDBUS_ITEM_DST_NAME is mandatory.
+
+  __u64 src_id;
+    Upon return of the ioctl, this member will contain the sending
+    connection's numerical ID. Should be 0 at send time.
+
+  __u64 payload_type;
+    Type of the payload in the actual data records. Currently, only
+    KDBUS_PAYLOAD_DBUS is accepted as input value of this field. When
+    receiving messages that are generated by the kernel (notifications),
+    this field will yield KDBUS_PAYLOAD_KERNEL.
+
+  __u64 cookie;
+    Cookie of this message, for later recognition. Also, when replying
+    to a message (see above), the cookie_reply field must match this value.
+
+  __u64 timeout_ns;
+    If the message sent requires a reply from the remote peer (see above),
+    this field contains the timeout in absolute nanoseconds based on
+    CLOCK_MONOTONIC.
+
+  __u64 cookie_reply;
+    If the message sent is a reply to another message, this field must
+    match the cookie of the formerly received message.
+
+  __u64 offset_reply;
+    If the message successfully got a synchronous reply (see above), this
+    field will yield the offset of the reply message in the sender's pool.
+    This is what KDBUS_CMD_MSG_RECV usually does for asynchronous messages.
+
+  struct kdbus_item items[0];
+    A dynamically sized list of items to contain additional information.
+    The following items are expected/valid:
+
+    KDBUS_ITEM_PAYLOAD_VEC
+    KDBUS_ITEM_PAYLOAD_MEMFD
+    KDBUS_ITEM_FDS
+      Actual data records containing the payload. See section "Passing of
+      Payload Data".
+
+    KDBUS_ITEM_BLOOM_FILTER
+      Bloom filter for matches (see below).
+
+    KDBUS_ITEM_DST_NAME
+      Well-known name to send this message to. Required if dst_id is set
+      to KDBUS_DST_ID_NAME. If a connection holding the given name can't
+      be found, -ESRCH is returned.
+      For messages to a unique name (ID), this item is optional. If present,
+      the kernel will make sure the name owner matches the given unique name.
+      This allows userspace to tie the message sending to the condition that a
+      name is currently owned by a certain unique name.
+};
+
+The message will be augmented by the requested metadata items when queued into
+the receiver's pool. See also section 13.1 ("Metadata and namespaces").
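+
+As a minimal sketch of how a sender could compute the absolute timeout_ns
+value described above, the helper below adds a relative timeout to the
+current CLOCK_MONOTONIC time. The name example_deadline_ns is hypothetical
+and not part of the kdbus API; only clock_gettime(2) is assumed.
+
```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/*
 * Hypothetical helper (not part of kdbus): return an absolute deadline in
 * nanoseconds on CLOCK_MONOTONIC that lies rel_ns nanoseconds in the future.
 */
static uint64_t example_deadline_ns(uint64_t rel_ns)
{
	struct timespec now;
	int r = clock_gettime(CLOCK_MONOTONIC, &now);

	/* CLOCK_MONOTONIC is expected to be available on Linux. */
	assert(r == 0);
	(void)r;

	return (uint64_t)now.tv_sec * UINT64_C(1000000000) +
	       (uint64_t)now.tv_nsec + rel_ns;
}
```
+
+A caller would store the result in kdbus_msg.timeout_ns before issuing
+KDBUS_CMD_MSG_SEND with KDBUS_MSG_FLAGS_EXPECT_REPLY set.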
+
+
+7.2 Message layout
+------------------
+
+The layout of a message is shown below.
+
+  +-------------------------------------------------------------------------+
+  | Message                                                                 |
+  | +---------------------------------------------------------------------+ |
+  | | Header                                                              | |
+  | | size: overall message size, including the data records              | |
+  | | destination: connection id of the receiver                          | |
+  | | source: connection id of the sender (set by kernel)                 | |
+  | | payload_type: "DBusDBus" textual identifier stored as uint64_t      | |
+  | +---------------------------------------------------------------------+ |
+  | +---------------------------------------------------------------------+ |
+  | | Data Record                                                         | |
+  | | size: overall record size (without padding)                         | |
+  | | type: type of data                                                  | |
+  | | data: reference to data (address or file descriptor)                | |
+  | +---------------------------------------------------------------------+ |
+  | +---------------------------------------------------------------------+ |
+  | | padding bytes to the next 8 byte alignment                          | |
+  | +---------------------------------------------------------------------+ |
+  | +---------------------------------------------------------------------+ |
+  | | Data Record                                                         | |
+  | | size: overall record size (without padding)                         | |
+  | | ...                                                                 | |
+  | +---------------------------------------------------------------------+ |
+  | +---------------------------------------------------------------------+ |
+  | | padding bytes to the next 8 byte alignment                          | |
+  | +---------------------------------------------------------------------+ |
+  | +---------------------------------------------------------------------+ |
+  | | Data Record                                                         | |
+  | | size: overall record size                                           | |
+  | | ...                                                                 | |
+  | +---------------------------------------------------------------------+ |
+  | +---------------------------------------------------------------------+ |
+  | | padding bytes to the next 8 byte alignment                          | |
+  | +---------------------------------------------------------------------+ |
+  +-------------------------------------------------------------------------+
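+
+The layout above can be walked with a simple loop. The following is only a
+sketch under stated assumptions: struct example_item_hdr is a local stand-in
+with two 64-bit fields (size and type) and is not the real struct kdbus_item,
+and example_count_items is a hypothetical helper. It steps from one data
+record to the next by rounding each record size up to 8 bytes.
+
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed record header: two 64-bit fields, as sketched in the figure. */
struct example_item_hdr {
	uint64_t size;	/* header plus data, without trailing padding */
	uint64_t type;	/* type of data */
};

/* Round v up to the next multiple of 8. */
#define EXAMPLE_ALIGN8(v) (((v) + 7ULL) & ~7ULL)

/*
 * Count the data records stored in buf[0..len). Returns the number of
 * records, or -1 if a record header is truncated or its size is bogus.
 */
static int example_count_items(const void *buf, size_t len)
{
	const unsigned char *p = buf;
	size_t off = 0;
	int n = 0;

	assert(buf != NULL || len == 0);

	while (off < len) {
		struct example_item_hdr hdr;

		if (len - off < sizeof(hdr))
			return -1;
		/* memcpy avoids any alignment assumption about buf. */
		memcpy(&hdr, p + off, sizeof(hdr));
		if (hdr.size < sizeof(hdr) || hdr.size > len - off)
			return -1;
		n++;
		/* Skip the record and its padding bytes. */
		off += (size_t)EXAMPLE_ALIGN8(hdr.size);
	}
	return n;
}
```
+
+The size check before each step keeps the loop from reading past the end of
+the buffer when a record claims to be larger than the remaining data.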
+
+
+7.3 Passing of Payload Data
+---------------------------
+
+When connecting to the bus, receivers request a memory pool of a given size,
+large enough to carry all backlog of data enqueued for the connection. The
+pool is internally backed by a shared memory file which can be mmap()ed by
+the receiver.
+
+KDBUS_ITEM_PAYLOAD_VEC:
+  Messages are directly copied by the sending process into the receiver's
+  pool. That way, two peers can exchange data with effectively a single copy
+  from one process to another; the kernel will not buffer the data anywhere
+  else.
+
+KDBUS_ITEM_PAYLOAD_MEMFD:
+  Messages can reference memfd files which contain the data.
+  memfd files are tmpfs-backed files that allow sealing of the content of the
+  file, which prevents all writable access to the file content.
+  Only sealed memfd files are accepted as payload data, which enforces
+  reliable passing of data; the receiver can assume that neither the sender nor
+  anyone else can alter the content after the message is sent.
+
+Apart from the sender filling-in the content into memfd files, the data will
+be passed as zero-copy from one process to another, read-only, shared between
+the peers.
+
+
+7.4 Receiving messages
+----------------------
+
+Messages are received by the client with the KDBUS_CMD_MSG_RECV ioctl. The
+endpoint file of the bus supports poll() to wake up the receiving process when
+new messages are queued up to be received.
+
+With the KDBUS_CMD_MSG_RECV ioctl, a struct kdbus_cmd_recv is used.
+
+struct kdbus_cmd_recv {
+  __u64 flags;
+    Flags to control the receive command.
+
+    KDBUS_RECV_PEEK
+      Just return the location of the next message. Do not install file
+      descriptors or anything else. This is usually used to determine the
+      sender of the next queued message.
+
+    KDBUS_RECV_DROP
+      Drop the next message without doing anything else with it, and free the
+      pool slice. This is a short-cut for KDBUS_RECV_PEEK and KDBUS_CMD_FREE.
+
+    KDBUS_RECV_USE_PRIORITY
+      Use the priority field (see below).
+
+  __u64 kernel_flags;
+    Valid flags for this command, returned by the kernel upon each call.
+
+  __s64 priority;
+    With KDBUS_RECV_USE_PRIORITY set in flags, receive the next message in
+    the queue with at least the given priority. If no such message is waiting
+    in the queue, -ENOMSG is returned.
+
+  __u64 offset;
+    Upon return of the ioctl, this field contains the offset in the
+    receiver's memory pool.
+};
+
+Unless KDBUS_RECV_DROP was passed, and given that the ioctl succeeded, the
+offset field contains the location of the new message inside the receiver's
+pool. The message is stored as struct kdbus_msg at this offset, and can be
+interpreted with the semantics described above.
+
+Also, if the connection allowed for file descriptors to be passed
+(KDBUS_HELLO_ACCEPT_FD), and if the message contained any, they will be
+installed into the receiving process after the KDBUS_CMD_MSG_RECV ioctl
+returns. The receiving task is obliged to close all of them appropriately.
+
+The caller is obliged to call KDBUS_CMD_FREE with the returned offset when
+the memory is no longer needed.
+
+
+7.5 Canceling messages synchronously waiting for replies
+--------------------------------------------------------
+
+When a connection sends a message with KDBUS_MSG_FLAGS_SYNC_REPLY and
+blocks while waiting for the reply, the KDBUS_CMD_MSG_CANCEL ioctl can be
+used on the same file descriptor to cancel the message, based on its cookie.
+If there are multiple messages with the same cookie that are all synchronously
+waiting for a reply, all of them will be canceled. Obviously, this is only
+possible in multi-threaded applications.
+
+
+8. Name registry
+===============================================================================
+
+Each bus instantiates a name registry to resolve well-known names into unique
+connection IDs for message delivery. The registry will be queried when a
+message is sent with kdbus_msg.dst_id set to KDBUS_DST_ID_NAME, or when a
+registry dump is requested.
+
+All of the below is subject to policy rules for SEE and OWN permissions.
+
+
+8.1 Name validity
+-----------------
+
+A name has to comply with the following rules to be considered valid:
+
+ - The name has two or more elements separated by a period ('.') character
+ - All elements must contain at least one character
+ - Each element must only contain the ASCII characters "[A-Z][a-z][0-9]_"
+   and must not begin with a digit
+ - The name must contain at least one '.' (period) character
+   (and thus at least two elements)
+ - The name must not begin with a '.' (period) character
+ - The name must not exceed KDBUS_NAME_MAX_LEN (255)
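+
+The rules above can be sketched as a small checker. This is an illustration
+only, not the kernel's validation code: example_name_is_valid is a
+hypothetical helper, and it assumes that the 255 limit applies to the string
+length without its 0-byte terminator.
+
```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Mirrors KDBUS_NAME_MAX_LEN (255) from the list above. */
#define EXAMPLE_NAME_MAX_LEN 255

/* Return true if name satisfies the rules listed in this section. */
static bool example_name_is_valid(const char *name)
{
	size_t len, i, elem_len = 0;
	bool has_dot = false;

	assert(name != NULL);

	len = strlen(name);
	if (len == 0 || len > EXAMPLE_NAME_MAX_LEN)
		return false;

	for (i = 0; i < len; i++) {
		char c = name[i];
		bool alpha, digit;

		if (c == '.') {
			/* Rejects a leading period and empty elements. */
			if (elem_len == 0)
				return false;
			has_dot = true;
			elem_len = 0;
			continue;
		}

		alpha = (c >= 'A' && c <= 'Z') ||
			(c >= 'a' && c <= 'z') || c == '_';
		digit = c >= '0' && c <= '9';

		/* Digits are allowed, but not as first character. */
		if (!alpha && !(digit && elem_len > 0))
			return false;
		elem_len++;
	}

	/* At least two elements, and the last one must not be empty. */
	return has_dot && elem_len > 0;
}
```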
+
+
+8.2 Acquiring a name
+--------------------
+
+To acquire a name, a client uses the KDBUS_CMD_NAME_ACQUIRE ioctl with the
+following data structure.
+
+struct kdbus_cmd_name {
+  __u64 size;
+    The overall size of this struct, including the name with its 0-byte string
+    terminator.
+
+  __u64 flags;
+    Flags to control details in the name acquisition.
+
+    KDBUS_NAME_REPLACE_EXISTING
+      Acquiring a name that is already present usually fails, unless this flag
+      is set in the call, and KDBUS_NAME_ALLOW_REPLACEMENT (see below) was
+      set when the current owner of the name acquired it, or if the current
+      owner is an activator connection (see below).
+
+    KDBUS_NAME_ALLOW_REPLACEMENT
+      Allow other connections to take over this name. When this happens, the
+      former owner of the name will be notified of the name loss.
+
+    KDBUS_NAME_QUEUE (acquire)
+      A name that is already acquired by a connection, and which wasn't
+      requested with the KDBUS_NAME_ALLOW_REPLACEMENT flag set can not be
+      acquired again. However, a connection can put itself in a queue of
+      connections waiting for the name to be released. Once that happens, the
+      first connection in that queue becomes the new owner and is notified
+      accordingly.
+
+  __u64 kernel_flags;
+    Valid flags for this command, returned by the kernel upon each call.
+
+  struct kdbus_item items[0];
+    Items to submit the name. Currently, one item of type KDBUS_ITEM_NAME is
+    expected and allowed, and the contained string must be a valid bus name.
+};
+
+
+8.3 Releasing a name
+--------------------
+
+A connection may release a name explicitly with the KDBUS_CMD_NAME_RELEASE
+ioctl. If the connection was an implementor of an activatable name, its
+pending messages are moved back to the activator. If there are any connections
+queued up as waiters for the name, the oldest one of them will become the new
+owner. The same happens implicitly for all names once a connection terminates.
+
+The KDBUS_CMD_NAME_RELEASE ioctl uses the same data structure as the
+acquisition call, but with slightly different field usage.
+
+struct kdbus_cmd_name {
+  __u64 size;
+    The overall size of this struct, including the name with its 0-byte string
+    terminator.
+
+  __u64 flags;
+
+  struct kdbus_item items[0];
+    Items to submit the name. Currently, one item of type KDBUS_ITEM_NAME is
+    expected and allowed, and the contained string must be a valid bus name.
+};
+
+
+8.4 Dumping the name registry
+-----------------------------
+
+A connection may request a complete or filtered dump of currently active bus
+names with the KDBUS_CMD_NAME_LIST ioctl, which takes a struct
+kdbus_cmd_name_list as argument.
+
+struct kdbus_cmd_name_list {
+  __u64 flags;
+    Any combination of flags to specify which names should be dumped.
+
+    KDBUS_NAME_LIST_UNIQUE
+      List the unique (numeric) IDs of the connection, whether it owns a name
+      or not.
+
+    KDBUS_NAME_LIST_NAMES
+      List well-known names stored in the database which are actively owned by
+      a real connection (not an activator).
+
+    KDBUS_NAME_LIST_ACTIVATORS
+      List names that are owned by an activator.
+
+    KDBUS_NAME_LIST_QUEUED
+      List connections that are not yet owning a name but are waiting for it
+      to become available.
+
+  __u64 offset;
+    When the ioctl returns successfully, the offset to the name registry dump
+    inside the connection's pool will be stored in this field.
+};
+
+The returned list of names is stored in a struct kdbus_name_list that in turn
+contains a dynamic number of struct kdbus_name_info that carry the actual
+information. The fields inside struct kdbus_name_info are described next.
+
+struct kdbus_name_info {
+  __u64 size;
+    The overall size of this struct, including the name with its 0-byte string
+    terminator.
+
+  __u64 owner_id;
+    The owning connection's unique ID.
+
+  __u64 conn_flags;
+    The flags of the owning connection.
+
+  struct kdbus_item items[0];
+    Items containing the actual name. Currently, one item of type
+    KDBUS_ITEM_OWNED_NAME will be attached, including the name's flags. In that
+    item, the flags field of the name may carry the following bits:
+
+    KDBUS_NAME_ALLOW_REPLACEMENT
+      Other connections are allowed to take over this name from the
+      connection that owns it.
+
+    KDBUS_NAME_IN_QUEUE (list)
+      When retrieving a list of currently acquired names in the registry, this
+      flag indicates whether the connection actually owns the name or is
+      currently waiting for it to become available.
+
+    KDBUS_NAME_ACTIVATOR (list)
+      An activator connection owns a name as a placeholder for an implementor,
+      which is started on demand as soon as the first message arrives. There's
+      some more information on this topic below. In contrast to
+      KDBUS_NAME_REPLACE_EXISTING, when a name is taken over from an activator
+      connection, all the messages that have been queued in the activator
+      connection will be moved over to the new owner. The activator connection
+      will still be tracked for the name and will take control again if the
+      implementor connection terminates.
+      This flag can not be used when acquiring a name, but is implicitly set
+      through KDBUS_CMD_HELLO with KDBUS_HELLO_ACTIVATOR set in
+      kdbus_cmd_hello.conn_flags.
+};
+
+The returned buffer must be freed with the KDBUS_CMD_FREE ioctl when the user
+is finished with it.
+
+
+9. Notifications
+===============================================================================
+
+The kernel will notify its users of the following events.
+
+  * When connection A is terminated while connection B is waiting for a reply
+    from it, connection B is notified with a message with an item of type
+    KDBUS_ITEM_REPLY_DEAD.
+
+  * When connection A does not receive a reply from connection B within the
+    specified timeout window, connection A will receive a message with an item
+    of type KDBUS_ITEM_REPLY_TIMEOUT.
+
+  * When a connection is created on or removed from a bus, messages with an
+    item of type KDBUS_ITEM_ID_ADD or KDBUS_ITEM_ID_REMOVE, respectively, are
+    sent to all bus members that match these messages through their match
+    database.
+
+  * When a connection owns or loses a name, or a name is moved from one
+    connection to another, messages with an item of type KDBUS_ITEM_NAME_ADD,
+    KDBUS_ITEM_NAME_REMOVE or KDBUS_ITEM_NAME_CHANGE are sent to all bus
+    members that match these messages through their match database.
+
+A kernel notification is a regular kdbus message with the following details.
+
+  * kdbus_msg.src_id == KDBUS_SRC_ID_KERNEL
+  * kdbus_msg.dst_id == KDBUS_DST_ID_BROADCAST
+  * kdbus_msg.payload_type == KDBUS_PAYLOAD_KERNEL
+  * Has exactly one of the aforementioned items attached
+
+
+10. Message Matching, Bloom filters
+===============================================================================
+
+10.1 Matches for broadcast messages from other connections
+----------------------------------------------------------
+
+A message addressed at the connection ID KDBUS_DST_ID_BROADCAST (~0ULL) is a
+broadcast message, delivered to all connected peers which installed a rule to
+match certain properties of the message. Without any rules installed in the
+connection, no broadcast message or kernel-side notifications will be delivered
+to the connection. Broadcast messages are subject to policy rules and TALK
+access checks.
+
+See section 11 for details on policies, and section 11.5 for more
+details on implicit policies.
+
+Matches for messages from other connections (not kernel notifications) are
+implemented as bloom filters. The sender adds certain properties of the message
+as elements to a bloom filter bit field, and sends that along with the
+broadcast message.
+
+The connection adds the message properties it is interested in as elements to a
+bloom mask bit field, and uploads the mask to the match rules of the
+connection.
+
+The kernel will match the broadcast message's bloom filter against the
+connection's bloom mask (simply by &-ing it), and decide whether the message
+should be delivered to the connection.
+
+The kernel has no notion of any specific properties of the message; all it
+sees are the bit fields of the bloom filter and mask to match against. The
+use of bloom filters allows simple and efficient matching, without exposing
+any message properties or internals to the kernel side. Clients need to deal
+with the fact that they might receive broadcasts which they did not subscribe
+to, as the bloom filter might allow false-positives to pass the filter.
+
+To allow the future extension of the set of elements in the bloom filter, the
+filter specifies a "generation" number. A later generation must always contain
+all elements of the set of the previous generation, but can add new elements
+to the set. The match rules mask can carry an array with all previous
+generations of masks individually stored. When the filter and mask are matched
+by the kernel, the mask with the closest matching "generation" is selected
+as the index into the mask array.
+
+
+10.2 Matches for kernel notifications
+-------------------------------------
+
+To receive kernel generated notifications (see section 9), a connection must
+install special match rules that are different from the bloom filter matches
+described in the section above. They can be filtered by a sender connection's
+ID, by one of the names the sender connection owns at the time of sending the
+message, or by type of the notification (id/name add/remove/change).
+
+
+10.3 Adding a match
+-------------------
+
+To add a match, the KDBUS_CMD_MATCH_ADD ioctl is used, which takes the
+struct kdbus_cmd_match described below.
+
+Note that each of the items attached to this command will internally create
+one match 'rule', and the collection of them, which is submitted as one block
+via the ioctl is called a 'match'. To allow a message to pass, all rules of a
+match have to be satisfied. Hence, adding more items to the command will only
+narrow the set of messages that the match lets pass, and will make it less
+likely that the connection's user space process is woken up.
+
+Multiple matches can be installed per connection. As long as one of them has a
+set of rules which allows the message to pass, this one will be decisive.
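+
+The evaluation order described above (all rules within one match, any one of
+the installed matches) can be sketched as follows. All types and names here
+are hypothetical stand-ins and do not reflect the kernel's data structures.
+Only two rule kinds are modelled, a sender ID rule and a single 64-bit bloom
+mask, and the bloom comparison follows the rule given in section 10.4 of
+this document.
+
```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum example_rule_type {
	EXAMPLE_RULE_SENDER_ID,		/* value: required sender ID */
	EXAMPLE_RULE_BLOOM_MASK,	/* value: 64-bit bloom mask */
};

struct example_rule {
	enum example_rule_type type;
	uint64_t value;
};

/* One 'match' is the block of rules submitted with one ioctl call. */
struct example_match {
	const struct example_rule *rules;
	size_t n_rules;
};

struct example_msg {
	uint64_t src_id;
	uint64_t bloom_filter;
};

static bool example_rule_ok(const struct example_rule *r,
			    const struct example_msg *m)
{
	switch (r->type) {
	case EXAMPLE_RULE_SENDER_ID:
		return m->src_id == r->value;
	case EXAMPLE_RULE_BLOOM_MASK:
		/* Every bit set in the filter must be set in the mask. */
		return (m->bloom_filter & r->value) == m->bloom_filter;
	}
	return false;
}

/* A message passes if at least one match has all its rules satisfied. */
static bool example_msg_passes(const struct example_match *matches,
			       size_t n_matches,
			       const struct example_msg *msg)
{
	size_t i, j;

	assert(matches != NULL || n_matches == 0);

	for (i = 0; i < n_matches; i++) {
		for (j = 0; j < matches[i].n_rules; j++)
			if (!example_rule_ok(&matches[i].rules[j], msg))
				break;
		if (j == matches[i].n_rules)
			return true;
	}
	return false;
}
```
+
+With no matches installed, the outer loop never runs and nothing passes,
+which agrees with the statement in section 10.1 that a connection without
+rules receives no broadcasts.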
+
+struct kdbus_cmd_match {
+  __u64 size;
+    The overall size of the struct, including its items.
+
+  __u64 cookie;
+    A cookie which identifies the match, so it can be referred to at removal
+    time.
+
+  __u64 flags;
+    Flags to control the behavior of the ioctl.
+
+    KDBUS_MATCH_REPLACE:
+      Remove all entries with the given cookie before installing the new one.
+      This allows for race-free replacement of matches.
+
+  struct kdbus_item items[0];
+    Items to define the actual rules of the matches. The following item types
+    are expected. Each item will cause one new match rule to be created.
+
+    KDBUS_ITEM_BLOOM_MASK
+      An item that carries the bloom filter mask to match against in its
+      data field. The payload size must match the bloom filter size that
+      was specified when the bus was created.
+      See section 10.4 for more information.
+
+    KDBUS_ITEM_NAME
+      Specify a name that a sending connection must own at the time of sending
+      a broadcast message in order to match this rule.
+
+    KDBUS_ITEM_ID
+      Specify a sender connection's ID that will match this rule.
+
+    KDBUS_ITEM_NAME_ADD
+    KDBUS_ITEM_NAME_REMOVE
+    KDBUS_ITEM_NAME_CHANGE
+      These items request delivery of broadcast messages that describe a name
+      acquisition, loss, or change. The details are stored in the item's
+      kdbus_notify_name_change member. All information specified must be
+      matched in order to make the message pass. Use KDBUS_MATCH_ID_ANY to
+      match against any unique connection ID.
+
+    KDBUS_ITEM_ID_ADD
+    KDBUS_ITEM_ID_REMOVE
+      These items request delivery of broadcast messages that are generated
+      when a connection is created or terminated. struct kdbus_notify_id_change
+      is used to store the actual match information. This item can be used to
+      monitor one particular connection ID, or, when the id field is set to
+      KDBUS_MATCH_ID_ANY, all of them.
+
+    Other item types are ignored.
+};
+
+
+10.4 Bloom filters
+------------------
+
+Bloom filters allow checking whether a given word is present in a dictionary.
+This allows a connection to set up a mask for the information it is interested
+in, so that it is delivered the broadcast messages that have a matching filter.
+
+For general information on bloom filters, see
+
+  https://en.wikipedia.org/wiki/Bloom_filter
+
+The size of the bloom filter is defined per bus when it is created, in
+kdbus_bloom_parameter.size. All bloom filters attached to broadcast messages
+on the bus must match this size, and all bloom filter matches uploaded by
+connections must also match the size, or a multiple thereof (see below).
+
+The calculation of the mask has to be done on the userspace side. The kernel
+just checks the bitmasks to decide whether or not to let the message pass. All
+bits set in the filter must also be set in the mask, checked with bit-wise AND
+logic, but the mask may have more bits set than the filter. Consequently,
+false positive
+matches are expected to happen, and userspace must deal with that fact.
+
+Masks are entities that are always passed to the kernel as part of a match
+(with an item of type KDBUS_ITEM_BLOOM_MASK), and filters can be attached to
+broadcast messages (with an item of type KDBUS_ITEM_BLOOM_FILTER).
+
+For a broadcast to match, all set bits in the filter have to be set in the
+installed match mask as well. For example, consider a bus has a bloom size
+of 8 bytes, and the following mask/filter combinations:
+
+    filter  0x0101010101010101
+    mask    0x0101010101010101
+            -> matches
+
+    filter  0x0303030303030303
+    mask    0x0101010101010101
+            -> doesn't match
+
+    filter  0x0101010101010101
+    mask    0x0303030303030303
+            -> matches
+
+Hence, in order to catch all messages, a mask filled with 0xff bytes can be
+installed as a wildcard match rule.
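+
+The comparison in the examples above can be written as a short loop over
+64-bit words. This is a sketch of the rule as this section states it, not a
+copy of the kernel's matching code; example_bloom_match is a hypothetical
+name, and the bloom size is assumed to be a multiple of 8 bytes.
+
```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return true if every bit set in the filter is also set in the mask.
 * n_words is the bloom size of the bus divided by sizeof(uint64_t).
 */
static bool example_bloom_match(const uint64_t *filter,
				const uint64_t *mask,
				size_t n_words)
{
	size_t i;

	assert(filter != NULL && mask != NULL);

	for (i = 0; i < n_words; i++)
		if ((filter[i] & mask[i]) != filter[i])
			return false;

	return true;
}
```
+
+Because a mask may have more bits set than the filter, a mask of all-ones
+words passes every filter, which is the wildcard case mentioned above.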
+
+Uploaded matches may contain multiple masks, each of which has the bloom size
+defined by the bus. Each block of a mask is called a 'generation',
+starting at index 0.
+
+At match time, when a broadcast message is about to be delivered, a bloom
+mask generation is passed, which denotes which of the bloom masks the filter
+should be matched against. This allows userspace to provide backward compatible
+masks at upload time, while older clients can still match against older
+versions of filters.
+
+
+10.5 Removing a match
+---------------------
+
+Matches can be removed through the KDBUS_CMD_MATCH_REMOVE ioctl, which again
+takes struct kdbus_cmd_match as argument, but its fields are used slightly
+differently.
+
+struct kdbus_cmd_match {
+  __u64 size;
+    The overall size of the struct. As it has no items in this use case, the
+    value should yield 16.
+
+  __u64 cookie;
+    The cookie of the match, as it was passed when the match was added.
+    All matches that have this cookie will be removed.
+
+  __u64 flags;
+    Unused for this use case.
+
+  __u64 kernel_flags;
+    Valid flags for this command, returned by the kernel upon each call.
+
+  struct kdbus_item items[0];
+    Unused for this use case.
+};
+
+
+11. Policy
+===============================================================================
+
+A policy database restricts the possibilities of connections to own, see and
+talk to well-known names. It can be associated with a bus (through a policy
+holder connection) or a custom endpoint.
+
+See section 8.1 for more details on the validity of well-known names.
+
+Default endpoints of buses always have a policy database. The default
+policy is to deny all operations except for operations that are covered by
+implicit policies. Custom endpoints always have a policy, and by default,
+a policy database is empty. Therefore, unless policy rules are added, all
+operations will also be denied by default.
+
+See section 11.5 for more details on implicit policies.
+
+A set of policy rules is described by a name and multiple access rules, defined
+by the following struct.
+
+struct kdbus_policy_access {
+  __u64 type;	/* USER, GROUP, WORLD */
+    One of the following.
+
+    KDBUS_POLICY_ACCESS_USER
+      Grant access to a user with the uid stored in the 'id' field.
+
+    KDBUS_POLICY_ACCESS_GROUP
+      Grant access to users in the group with the gid stored in the 'id' field.
+
+    KDBUS_POLICY_ACCESS_WORLD
+      Grant access to everyone. The 'id' field is ignored.
+
+  __u64 access;	/* OWN, TALK, SEE */
+    The access to grant.
+
+    KDBUS_POLICY_SEE
+      Allow the name to be seen.
+
+    KDBUS_POLICY_TALK
+      Allow the name to be talked to.
+
+    KDBUS_POLICY_OWN
+      Allow the name to be owned.
+
+  __u64 id;
+    For KDBUS_POLICY_ACCESS_USER, stores the uid.
+    For KDBUS_POLICY_ACCESS_GROUP, stores the gid.
+};
+
+Policies are set through KDBUS_CMD_HELLO (when creating a policy holder
+connection), KDBUS_CMD_CONN_UPDATE (when updating a policy holder connection),
+KDBUS_CMD_ENDPOINT_MAKE (creating a custom endpoint) or
+KDBUS_CMD_ENDPOINT_UPDATE (updating a custom endpoint). In all cases, the name
+and policy access information is stored in items of type KDBUS_ITEM_NAME and
+KDBUS_ITEM_POLICY_ACCESS. For this transport, the following rules apply.
+
+  * An item of type KDBUS_ITEM_NAME must be followed by at least one
+    KDBUS_ITEM_POLICY_ACCESS item
+  * An item of type KDBUS_ITEM_NAME can be followed by an arbitrary number of
+    KDBUS_ITEM_POLICY_ACCESS items
+  * An arbitrary number of groups of names and access levels can be passed
+
+uids and gids are internally always stored in the kernel's view of global ids,
+and are translated back and forth on the ioctl level accordingly.
+
+
+11.2 Wildcard names
+-------------------
+
+Policy holder connections may upload names that contain the wildcard suffix
+(".*"). That way, a policy can be uploaded that is effective for every
+well-known name that extends the provided name by exactly one more level.
+
+For example, if an item of a set of uploaded policy rules contains the name
+"foo.bar.*", both "foo.bar.baz" and "foo.bar.bazbaz" are valid, but
+"foo.bar.baz.baz" is not.
+
+This allows connections to take control over multiple names that the policy
+holder doesn't need to know about when uploading the policy.
+
+Such wildcard entries are not allowed for custom endpoints.
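The "exactly one more level" rule of this section can be expressed as a short string comparison. The function below is a hypothetical userspace sketch, not the kernel's lookup code; it assumes that a rule without the ".*" suffix has to match the name exactly and that the added level must not be empty.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: does 'name' fall under the policy entry 'rule'? */
bool policy_name_matches(const char *rule, const char *name)
{
	size_t rlen = strlen(rule);
	size_t plen;

	/* No wildcard suffix: the name has to match exactly. */
	if (rlen < 2 || strcmp(rule + rlen - 2, ".*") != 0)
		return strcmp(rule, name) == 0;

	/* Compare everything up to and including the final dot. */
	plen = rlen - 1;
	if (strncmp(rule, name, plen) != 0)
		return false;

	/* Exactly one more non-empty level: no further dots allowed. */
	return name[plen] != '\0' && strchr(name + plen, '.') == NULL;
}
```

With the rule "foo.bar.*", the names "foo.bar.baz" and "foo.bar.bazbaz" are accepted, while "foo.bar.baz.baz" and "foo.bar" are rejected.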
+
+
+11.3 Policy example
+-------------------
+
+For example, a set of policy rules may look like this:
+
+  KDBUS_ITEM_NAME: str='org.foo.bar'
+  KDBUS_ITEM_POLICY_ACCESS: type=USER, access=OWN, id=1000
+  KDBUS_ITEM_POLICY_ACCESS: type=USER, access=TALK, id=1001
+  KDBUS_ITEM_POLICY_ACCESS: type=WORLD, access=SEE
+  KDBUS_ITEM_NAME: str='org.blah.baz'
+  KDBUS_ITEM_POLICY_ACCESS: type=USER, access=OWN, id=0
+  KDBUS_ITEM_POLICY_ACCESS: type=WORLD, access=TALK
+
+That means that 'org.foo.bar' may only be owned by uid 1000, but every user on
+the bus is allowed to see the name. However, only uid 1001 may actually send
+a message to the connection and receive a reply from it.
+
+The second rule allows 'org.blah.baz' to be owned by uid 0 only, but every user
+may talk to it.
+
+
+11.4 TALK access and multiple well-known names per connection
+-------------------------------------------------------------
+
+Note that TALK access is checked against all names of a connection.
+For example, if a connection owns both 'org.foo.bar' and 'org.blah.baz', and
+the policy database allows 'org.blah.baz' to be talked to by WORLD, then this
+permission is also granted to 'org.foo.bar'. That might sound illogical, but
+after all, we allow messages to be directed to either the ID or a well-known
+name, and policy is applied to the connection, not the name. In other words,
+the effective TALK policy for a connection is the most permissive of all names
+the connection owns.
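The "most permissive of all names" behaviour can be illustrated with a simplified lookup. Everything in this sketch is hypothetical (the struct layout, the enum and both function names); it ignores implicit policies and auxiliary groups and only shows that a TALK grant on any owned name is sufficient.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-ins for the KDBUS_POLICY_ACCESS_* types. */
enum access_type { ACCESS_USER, ACCESS_GROUP, ACCESS_WORLD };

/* One TALK rule attached to a well-known name (hypothetical layout). */
struct talk_rule {
	const char *name;
	enum access_type type;
	uint64_t id;		/* uid or gid, ignored for ACCESS_WORLD */
};

/* Does a single rule grant TALK to the given sender? */
bool rule_grants(const struct talk_rule *r, uint64_t uid, uint64_t gid)
{
	switch (r->type) {
	case ACCESS_USER:
		return r->id == uid;
	case ACCESS_GROUP:
		return r->id == gid;
	case ACCESS_WORLD:
		return true;
	}
	return false;
}

/* TALK is allowed if any name owned by the destination grants it. */
bool may_talk(const struct talk_rule *rules, size_t n_rules,
	      const char *const *owned, size_t n_owned,
	      uint64_t uid, uint64_t gid)
{
	size_t i, j;

	for (i = 0; i < n_owned; i++)
		for (j = 0; j < n_rules; j++)
			if (strcmp(owned[i], rules[j].name) == 0 &&
			    rule_grants(&rules[j], uid, gid))
				return true;
	return false;
}
```

With the TALK rules from the example in section 11.3, an arbitrary uid may talk to a connection that owns both 'org.foo.bar' and 'org.blah.baz', but not to one that owns only 'org.foo.bar'.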
+
+If a policy database exists for a bus (because a policy holder created one on
+demand) or for a custom endpoint (which always has one), each one is consulted
+during name registry listing, name owning or message delivery. If either one
+fails, the operation fails with -EPERM.
+
+For best practices, connections that own names with a restricted TALK
+access should not install matches. This avoids cases where the sent
+message may pass the bloom filter due to false-positives and may also
+satisfy the policy rules.
+
+
+11.5 Implicit policies
+----------------------
+
+Depending on the type of the endpoint, a set of implicit rules might be
+enforced. On default endpoints, the following set is enforced:
+
+  * Privileged connections always override any installed policy. Those
+    connections could easily install their own policies, so there is no
+    reason to enforce installed policies.
+  * Connections can always talk to connections of the same user. This
+    includes broadcast messages.
+  * Connections that own names might send broadcast messages to other
+    connections that belong to a different user, but only if that
+    destination connection does not own any name.
+
+Custom endpoints have stricter policies. The following rules apply:
+
+  * Policy rules are always enforced, even if the connection is a privileged
+    connection.
+  * Policy rules are always enforced for TALK access, even if both ends are
+    running under the same user. This includes broadcast messages.
+  * To restrict the set of names that can be seen, endpoint policies can
+    install "SEE" policies.
+
+
+12. Pool
+===============================================================================
+
+A pool for data received from the kernel is installed for every connection of
+the bus, and is sized according to kdbus_cmd_hello.pool_size. It is accessed
+when one of the following ioctls is issued:
+
+  * KDBUS_CMD_MSG_RECV, to receive a message
+  * KDBUS_CMD_NAME_LIST, to dump the name registry
+  * KDBUS_CMD_CONN_INFO, to retrieve information on a connection
+
+Internally, the pool is organized in slices, stored in an rb-tree. The offsets
+returned by either one of the aforementioned ioctls describe offsets inside the
+pool. In order to make the slice available for subsequent calls, KDBUS_CMD_FREE
+has to be called on the offset.
+
+To access the memory, the caller is expected to mmap() it to its task, like
+this:
+
+  /*
+   * POOL_SIZE has to be a multiple of PAGE_SIZE, and it must match the
+   * value that was previously passed in the .pool_size field of struct
+   * kdbus_cmd_hello.
+   */
+
+  buf = mmap(NULL, POOL_SIZE, PROT_READ, MAP_PRIVATE, conn_fd, 0);
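Because the pool size has to be a non-zero multiple of the page size, a caller may want to round its desired size up before filling in kdbus_cmd_hello.pool_size and calling mmap(). The helper below is a hypothetical convenience function, not part of kdbus; the 4096-byte fallback is an assumption for the unlikely case that sysconf() fails.

```c
#include <stddef.h>
#include <unistd.h>

/* Round 'wanted' up to the next non-zero multiple of the page size. */
size_t pool_size_round_up(size_t wanted)
{
	long page = sysconf(_SC_PAGESIZE);

	/* Assumed fallback if the page size cannot be queried. */
	if (page <= 0)
		page = 4096;

	/* A pool size of 0 is rejected, so ask for at least one page. */
	if (wanted == 0)
		wanted = 1;

	return (wanted + (size_t)page - 1) / (size_t)page * (size_t)page;
}
```

The same value would then be used both for the .pool_size field and as the length argument of mmap().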
+
+
+13. Metadata
+===============================================================================
+
+When a message is delivered to a receiver connection, it is augmented by
+metadata items in accordance with the destination's current attach flags. The
+information stored in those metadata items refers to the sender task at the
+time of sending the message, so even if any detail of the sender task has
+already changed upon message reception (or if the sender task does not exist
+anymore), the information is still preserved and won't be modified until the
+message is freed.
+
+Note that there are two exceptions to the above rules:
+
+  a) Kernel generated messages don't have a source connection, so they won't be
+     augmented.
+
+  b) If a connection was created with faked credentials (see section 6.2),
+     the only attached metadata items are the ones provided by the connection
+     itself. Other bits in the destination's attach_flags_recv won't have any
+     effect in such cases.
+
+Also, there are two things to be considered by userspace programs regarding
+those metadata items:
+
+  a) Userspace must cope with the fact that it might get more metadata than
+     it requested. That happens, for example, when a broadcast message is
+     sent and receivers have different attach flags. Items that haven't been
+     requested should hence be silently ignored.
+
+  b) Userspace might not always get all the metadata items that it
+     requested. That is because some of those items are only added if a
+     corresponding kernel feature has been enabled. Also, the two exceptions
+     described above will lead to fewer items being attached than
+     requested.
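A receiver that follows both recommendations walks the item list, picks out the types it asked for and skips everything else. The sketch below is hypothetical: it uses a local stand-in for the fixed part of struct kdbus_item, and it assumes that each item starts on an 8-byte boundary, so the next item is found at the current size rounded up to a multiple of 8.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local stand-in for the fixed header of struct kdbus_item. */
struct item_hdr {
	uint64_t size;	/* overall record size, header included */
	uint64_t type;
};

#define ALIGN8(v) (((v) + 7) & ~(uint64_t)7)

/*
 * Count the items in 'buf' whose type is listed in 'wanted'. Items of
 * any other type are skipped silently; a malformed size ends the walk.
 */
size_t count_wanted_items(const void *buf, size_t len,
			  const uint64_t *wanted, size_t n_wanted)
{
	const uint8_t *p = buf;
	size_t off = 0, hits = 0, i;

	while (off < len && len - off >= sizeof(struct item_hdr)) {
		struct item_hdr hdr;
		uint64_t step;

		memcpy(&hdr, p + off, sizeof(hdr));
		if (hdr.size < sizeof(hdr) || hdr.size > len - off)
			break;

		for (i = 0; i < n_wanted; i++) {
			if (wanted[i] == hdr.type) {
				hits++;
				break;
			}
		}

		/* Assumed layout: the next item starts 8-byte aligned. */
		step = ALIGN8(hdr.size);
		if (step > len - off)
			break;
		off += (size_t)step;
	}

	return hits;
}
```

A caller that requested only timestamps and credentials would pass those two item types in 'wanted' and would not fail if, say, a KDBUS_ITEM_PID_COMM item shows up in between.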
+
+
+13.1 Known item types
+---------------------
+
+The following attach flags are currently supported.
+
+  KDBUS_ATTACH_TIMESTAMP
+    Attaches an item of type KDBUS_ITEM_TIMESTAMP which contains both the
+    monotonic and the realtime timestamp, taken when the message was
+    processed on the kernel side.
+
+  KDBUS_ATTACH_CREDS
+    Attaches an item of type KDBUS_ITEM_CREDS, containing credentials as
+    described in kdbus_creds: the uid, gid, pid, tid and starttime of the task.
+
+  KDBUS_ATTACH_AUXGROUPS
+    Attaches an item of type KDBUS_ITEM_AUXGROUPS, containing a dynamic
+    number of auxiliary groups the sending task was a member of.
+
+  KDBUS_ATTACH_NAMES
+    Attaches items of type KDBUS_ITEM_OWNED_NAME, one for each name the sending
+    connection currently owns. The name and flags are stored in kdbus_item.name
+    for each of them.
+
+  KDBUS_ATTACH_TID_COMM
+    Attaches an item of type KDBUS_ITEM_TID_COMM, transporting the sending
+    task's 'comm', for the tid.  The string is stored in kdbus_item.str.
+
+  KDBUS_ATTACH_PID_COMM
+    Attaches an item of type KDBUS_ITEM_PID_COMM, transporting the sending
+    task's 'comm', for the pid.  The string is stored in kdbus_item.str.
+
+  KDBUS_ATTACH_EXE
+    Attaches an item of type KDBUS_ITEM_EXE, containing the path to the
+    executable of the sending task, stored in kdbus_item.str.
+
+  KDBUS_ATTACH_CMDLINE
+    Attaches an item of type KDBUS_ITEM_CMDLINE, containing the command line
+    arguments of the sending task, as an array of strings, stored in
+    kdbus_item.str.
+
+  KDBUS_ATTACH_CGROUP
+    Attaches an item of type KDBUS_ITEM_CGROUP with the task's cgroup path.
+
+  KDBUS_ATTACH_CAPS
+    Attaches an item of type KDBUS_ITEM_CAPS, carrying sets of capabilities
+    that should be accessed via kdbus_item.caps.caps. Also, userspace should
+    be written in a way that it takes kdbus_item.caps.last_cap into account,
+    and derive the number of sets and rows from the item size and the reported
+    number of valid capability bits.
+
+  KDBUS_ATTACH_SECLABEL
+    Attaches an item of type KDBUS_ITEM_SECLABEL, which contains the SELinux
+    security label of the sending task. Access via kdbus_item->str.
+
+  KDBUS_ATTACH_AUDIT
+    Attaches an item of type KDBUS_ITEM_AUDIT, which contains the audit
+    session ID and login uid of the sending task. Access via kdbus_item.audit.
+
+  KDBUS_ATTACH_CONN_DESCRIPTION
+    Attaches an item of type KDBUS_ITEM_CONN_DESCRIPTION that contains the
+    sender connection's current name in kdbus_item.str.
+
+
+13.2 Metadata and namespaces
+----------------------------
+
+Metadata such as PIDs, UIDs or GIDs are automatically translated to the
+namespaces of the domain that is used to send a message over. The namespaces
+of a domain are pinned at creation time, which is when the filesystem has been
+mounted.
+
+Metadata items that cannot be translated are dropped.
+
+
+14. Error codes
+===============================================================================
+
+Below is a list of error codes that might be returned by the individual
+ioctl commands. The list focuses on the return values from kdbus code itself,
+and might not cover those of all kernel internal functions.
+
+For all ioctls:
+
+  -ENOMEM	The kernel memory is exhausted
+  -ENOTTY	Illegal ioctl command issued for the file descriptor
+  -ENOSYS	The requested functionality is not available
+
+For all ioctls that carry a struct as payload:
+
+  -EFAULT	The supplied data pointer was not 64-bit aligned, or was
+		inaccessible from the kernel side.
+  -EINVAL	The size inside the supplied struct was smaller than expected
+  -EMSGSIZE	The size inside the supplied struct was bigger than expected
+  -ENAMETOOLONG	A supplied name is larger than the allowed maximum size
+
+For KDBUS_CMD_BUS_MAKE:
+
+  -EINVAL	The flags supplied in the kdbus_cmd_make struct are invalid or
+		the supplied name does not start with the current uid and a '-'
+  -EEXIST	A bus of that name already exists
+  -ESHUTDOWN	The domain for the bus is already shut down
+  -EMFILE	The maximum number of buses for the current user is exhausted
+
+For KDBUS_CMD_ENDPOINT_MAKE:
+
+  -EPERM	The calling user is not privileged (see Terminology)
+  -EINVAL	The flags supplied in the kdbus_cmd_make struct are invalid
+  -EEXIST	An endpoint of that name already exists
+
+For KDBUS_CMD_HELLO:
+
+  -EFAULT	The supplied pool size was 0 or not a multiple of the page size
+  -EINVAL	The flags supplied in the kdbus_cmd_hello struct are invalid, or
+		an illegal combination of KDBUS_HELLO_MONITOR,
+		KDBUS_HELLO_ACTIVATOR and KDBUS_HELLO_POLICY_HOLDER was passed
+		in the flags, or an invalid set of items was supplied
+  -ECONNREFUSED	The attach_flags_send field did not satisfy the requirements of
+		the bus
+  -EPERM	A KDBUS_ITEM_CREDS item was supplied, but the current user is
+		not privileged
+  -ESHUTDOWN	The bus has already been shut down
+  -EMFILE	The maximum number of connections on the bus has been reached
+
+For KDBUS_CMD_BYEBYE:
+
+  -EALREADY	The connection has already been shut down
+  -EBUSY	There are still messages queued up in the connection's pool
+
+For KDBUS_CMD_MSG_SEND:
+
+  -EOPNOTSUPP	The connection is not an ordinary connection, or the passed
+		file descriptors are either kdbus handles or unix domain
+		sockets. Both are currently unsupported
+  -EINVAL	The submitted payload type is KDBUS_PAYLOAD_KERNEL,
+		KDBUS_MSG_FLAGS_EXPECT_REPLY was set without a timeout value,
+		KDBUS_MSG_FLAGS_SYNC_REPLY was set without
+		KDBUS_MSG_FLAGS_EXPECT_REPLY, an invalid item was supplied,
+		src_id was != 0 and different from the current connection's ID,
+		a supplied memfd had a size of 0, a string was not properly
+		nul-terminated
+  -ENOTUNIQ	The supplied destination is KDBUS_DST_ID_BROADCAST, a file
+		descriptor was passed, KDBUS_MSG_FLAGS_EXPECT_REPLY was set,
+		or a timeout was given for a broadcast message
+  -E2BIG	Too many items
+  -EMSGSIZE	The size of the message header and items or the payload vector
+		is too big.
+  -EEXIST	Multiple KDBUS_ITEM_FDS, KDBUS_ITEM_BLOOM_FILTER or
+		KDBUS_ITEM_DST_NAME items were supplied
+  -EBADF	The supplied KDBUS_ITEM_FDS or KDBUS_MSG_PAYLOAD_MEMFD items
+		contained an illegal file descriptor
+  -EMEDIUMTYPE	The supplied memfd is not a sealed kdbus memfd
+  -EMFILE	Too many file descriptors inside a KDBUS_ITEM_FDS
+  -EBADMSG	An item had illegal size, both a dst_id and a
+		KDBUS_ITEM_DST_NAME were given, or both a name and a bloom
+		filter were given
+  -ETXTBSY	The supplied kdbus memfd file cannot be sealed or the seal
+		was removed, because it is shared with other processes or
+		still mmap()ed
+  -ECOMM	A peer does not accept the file descriptors addressed to it
+  -EFAULT	The supplied bloom filter size was not 64-bit aligned
+  -EDOM		The supplied bloom filter size did not match the bloom filter
+		size of the bus
+  -EDESTADDRREQ	dst_id was set to KDBUS_DST_ID_NAME, but no KDBUS_ITEM_DST_NAME
+		was attached
+  -ESRCH	The name to look up was not found in the name registry
+  -EADDRNOTAVAIL KDBUS_MSG_FLAGS_NO_AUTO_START was given but the destination
+		 connection is an activator.
+  -ENXIO	The passed numeric destination connection ID couldn't be found,
+		or is not connected
+  -ECONNRESET	The destination connection is no longer active
+  -ETIMEDOUT	Timeout while synchronously waiting for a reply
+  -EINTR	System call interrupted while synchronously waiting for a reply
+  -EPIPE	When sending a message, a synchronous reply from the receiving
+		connection was expected but the connection died before
+		answering
+  -ECANCELED	A synchronous message sending was cancelled
+  -ENOBUFS	Too many pending messages on the receiver side
+  -EREMCHG	Both a well-known name and a unique name (ID) was given, but
+		the name is not currently owned by that connection.
+  -EXFULL	The memory pool of the receiver is full
+  -EREMOTEIO	While synchronously waiting for a reply, the remote peer
+		failed with an I/O error.
+
+For KDBUS_CMD_MSG_RECV:
+
+  -EINVAL	Invalid flags or offset
+  -EAGAIN	No message found in the queue
+  -ENOMSG	No message of the requested priority found
+  -EOVERFLOW	Broadcast messages have been lost
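A receive loop typically has to tell these cases apart. The mapping below is a hypothetical application-side policy, not something kdbus prescribes; the enum and function names are made up for illustration. It assumes the usual ioctl() convention that the call returns -1 and errno holds the positive error value.

```c
#include <errno.h>

/* Hypothetical reactions a receive loop might choose from. */
enum recv_action {
	RECV_WAIT,		/* queue empty: poll and try again */
	RECV_LOWER_PRIORITY,	/* nothing at the requested priority */
	RECV_RESYNC,		/* broadcasts were lost: refresh state */
	RECV_FAIL,		/* programming error or unknown failure */
};

/* Map the errno value of a failed KDBUS_CMD_MSG_RECV to a reaction. */
enum recv_action classify_recv_error(int err)
{
	switch (err) {
	case EAGAIN:
		return RECV_WAIT;
	case ENOMSG:
		return RECV_LOWER_PRIORITY;
	case EOVERFLOW:
		return RECV_RESYNC;
	default:
		return RECV_FAIL;
	}
}
```

The caller would invoke classify_recv_error(errno) directly after the failing ioctl(), before any other call can overwrite errno.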
+
+For KDBUS_CMD_MSG_CANCEL:
+
+  -EINVAL	Invalid flags
+  -ENOENT	Pending message with the supplied cookie not found
+
+For KDBUS_CMD_FREE:
+
+  -ENXIO	No pool slice found at given offset
+  -EINVAL	Invalid flags provided, or the offset is valid but the user is
+		not allowed to free the slice. This happens, for example, if
+		the offset was retrieved with KDBUS_RECV_PEEK.
+
+For KDBUS_CMD_NAME_ACQUIRE:
+
+  -EINVAL	Illegal command flags, illegal name provided, or an activator
+		tried to acquire a second name
+  -EPERM	Policy prohibited name ownership
+  -EALREADY	Connection already owns that name
+  -EEXIST	The name already exists and can not be taken over
+  -E2BIG	The maximum number of well-known names per connection
+		is exhausted
+  -ECONNRESET	The connection was reset during the call
+
+For KDBUS_CMD_NAME_RELEASE:
+
+  -EINVAL	Invalid command flags, or invalid name provided
+  -ESRCH	Name is not found in the registry
+  -EADDRINUSE	Name is owned by a different connection and can't be released
+
+For KDBUS_CMD_NAME_LIST:
+
+  -EINVAL	Invalid flags
+  -ENOBUFS	No available memory in the connection's pool.
+
+For KDBUS_CMD_CONN_INFO:
+
+  -EINVAL	Invalid flags, or neither an ID nor a name was provided,
+		or the name is invalid.
+  -ESRCH	Connection lookup by name failed
+  -ENXIO	No connection with the provided numeric connection ID found
+
+For KDBUS_CMD_CONN_UPDATE:
+
+  -EINVAL	Illegal flags or items
+  -EOPNOTSUPP	Operation not supported by connection.
+  -E2BIG	Too many policy items attached
+  -EINVAL	Wildcards submitted in policy entries, or illegal sequence
+		of policy items
+
+For KDBUS_CMD_ENDPOINT_UPDATE:
+
+  -E2BIG	Too many policy items attached
+  -EINVAL	Invalid flags, or wildcards submitted in policy entries,
+		or illegal sequence of policy items
+
+For KDBUS_CMD_MATCH_ADD:
+
+  -EINVAL	Illegal flags or items
+  -EDOM		Illegal bloom filter size
+  -EMFILE	Too many matches for this connection
+
+For KDBUS_CMD_MATCH_REMOVE:
+
+  -EINVAL	Illegal flags
+  -ENOENT	A match entry with the given cookie could not be found.
+
+
+15. Internal object relations
+===============================================================================
+
+This is a simplified outline of the internal kdbus object relations, for
+those interested in the inner life of the driver implementation.
+
+From a mount point's (domain's) perspective:
+
+struct kdbus_domain
+  |» struct kdbus_domain_user *user (many, owned)
+  '» struct kdbus_node node (embedded)
+      |» struct kdbus_node children (many, referenced)
+      |» struct kdbus_node *parent (pinned)
+      '» struct kdbus_bus (many, pinned)
+          |» struct kdbus_node node (embedded)
+          '» struct kdbus_ep (many, pinned)
+              |» struct kdbus_node node (embedded)
+              |» struct kdbus_bus *bus (pinned)
+              |» struct kdbus_conn conn_list (many, pinned)
+              |   |» struct kdbus_ep *ep (pinned)
+              |   |» struct kdbus_name_entry *activator_of (owned)
+              |   |» struct kdbus_match_db *match_db (owned)
+              |   |» struct kdbus_meta *meta (owned)
+              |   |» struct kdbus_match_db *match_db (owned)
+              |   |    '» struct kdbus_match_entry (many, owned)
+              |   |
+              |   |» struct kdbus_pool *pool (owned)
+              |   |    '» struct kdbus_pool_slice *slices (many, owned)
+              |   |       '» struct kdbus_pool *pool (pinned)
+              |   |
+              |   |» struct kdbus_domain_user *user (pinned)
+              |   '» struct kdbus_queue_entry entries (many, embedded)
+              |        |» struct kdbus_pool_slice *slice (pinned)
+              |        |» struct kdbus_conn_reply *reply (owned)
+              |        '» struct kdbus_domain_user *user (pinned)
+              |
+              '» struct kdbus_domain_user *user (pinned)
+                  '» struct kdbus_policy_db policy_db (embedded)
+                       |» struct kdbus_policy_db_entry (many, owned)
+                       |   |» struct kdbus_conn (pinned)
+                       |   '» struct kdbus_ep (pinned)
+                       |
+                       '» struct kdbus_policy_db_cache_entry (many, owned)
+                           '» struct kdbus_conn (pinned)
+
+
+For the life-time of a file descriptor derived from calling open() on a file
+inside the mount point:
+
+struct kdbus_handle
+  |» struct kdbus_meta *meta (owned)
+  |» struct kdbus_ep *ep (pinned)
+  |» struct kdbus_conn *conn (owned)
+  '» struct kdbus_ep *ep (owned)
-- 
2.1.3



* kdbus: add header file
  2014-11-21  5:02 ` Greg Kroah-Hartman
  (?)
  (?)
@ 2014-11-21  5:02 ` Greg Kroah-Hartman
  2014-11-21  8:34   ` Harald Hoyer
  -1 siblings, 1 reply; 73+ messages in thread
From: Greg Kroah-Hartman @ 2014-11-21  5:02 UTC (permalink / raw)
  To: arnd, ebiederm, gnomes, teg, jkosina, luto, linux-api, linux-kernel
  Cc: daniel, dh.herrmann, tixxdz, Greg Kroah-Hartman

From: Daniel Mack <daniel@zonque.org>

This patch adds the header file which describes the low-level transport
protocol used by various ioctls. The header file is located in
include/uapi/linux/ as it is shared between kernel and userspace, and it
only contains data structure definitions, enums and #defines for
constants.

Signed-off-by: Daniel Mack <daniel@zonque.org>
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Signed-off-by: Djalal Harouni <tixxdz@opendz.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 include/uapi/linux/Kbuild  |   1 +
 include/uapi/linux/kdbus.h | 933 +++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 934 insertions(+)
 create mode 100644 include/uapi/linux/kdbus.h

diff --git a/include/uapi/linux/Kbuild b/include/uapi/linux/Kbuild
index 4c94f31a8c99..84239f6a3210 100644
--- a/include/uapi/linux/Kbuild
+++ b/include/uapi/linux/Kbuild
@@ -212,6 +212,7 @@ header-y += ixjuser.h
 header-y += jffs2.h
 header-y += joystick.h
 header-y += kd.h
+header-y += kdbus.h
 header-y += kdev_t.h
 header-y += kernel-page-flags.h
 header-y += kernel.h
diff --git a/include/uapi/linux/kdbus.h b/include/uapi/linux/kdbus.h
new file mode 100644
index 000000000000..91de6211fe2a
--- /dev/null
+++ b/include/uapi/linux/kdbus.h
@@ -0,0 +1,933 @@
+/*
+ * kdbus is free software; you can redistribute it and/or modify it under
+ * the terms of the GNU Lesser General Public License as published by the
+ * Free Software Foundation; either version 2.1 of the License, or (at
+ * your option) any later version.
+ */
+
+#ifndef _KDBUS_UAPI_H_
+#define _KDBUS_UAPI_H_
+
+#include <linux/ioctl.h>
+#include <linux/types.h>
+
+#define KDBUS_IOCTL_MAGIC		0x95
+#define KDBUS_SRC_ID_KERNEL		(0)
+#define KDBUS_DST_ID_NAME		(0)
+#define KDBUS_MATCH_ID_ANY		(~0ULL)
+#define KDBUS_DST_ID_BROADCAST		(~0ULL)
+#define KDBUS_FLAG_KERNEL		(1ULL << 63)
+
+/**
+ * struct kdbus_notify_id_change - name registry change message
+ * @id:			New or former owner of the name
+ * @flags:		flags field from KDBUS_HELLO_*
+ *
+ * Sent from kernel to userspace when the owner or activator of
+ * a well-known name changes.
+ *
+ * Attached to:
+ *   KDBUS_ITEM_ID_ADD
+ *   KDBUS_ITEM_ID_REMOVE
+ */
+struct kdbus_notify_id_change {
+	__u64 id;
+	__u64 flags;
+};
+
+/**
+ * struct kdbus_notify_name_change - name registry change message
+ * @old_id:		ID and flags of former owner of a name
+ * @new_id:		ID and flags of new owner of a name
+ * @name:		Well-known name
+ *
+ * Sent from kernel to userspace when the owner or activator of
+ * a well-known name changes.
+ *
+ * Attached to:
+ *   KDBUS_ITEM_NAME_ADD
+ *   KDBUS_ITEM_NAME_REMOVE
+ *   KDBUS_ITEM_NAME_CHANGE
+ */
+struct kdbus_notify_name_change {
+	struct kdbus_notify_id_change old_id;
+	struct kdbus_notify_id_change new_id;
+	char name[0];
+};
+
+/**
+ * struct kdbus_creds - process credentials
+ * @uid:		User ID
+ * @gid:		Group ID
+ * @pid:		Process ID
+ * @tid:		Thread ID
+ * @starttime:		Starttime of the process
+ *
+ * The starttime of the process PID. This is useful to detect PID overruns
+ * from the client side. i.e. if you use the PID to look something up in
+ * /proc/$PID/ you can afterwards check the starttime field of it, to ensure
+ * you didn't run into a PID overrun.
+ *
+ * Attached to:
+ *   KDBUS_ITEM_CREDS
+ */
+struct kdbus_creds {
+	__u64 uid;
+	__u64 gid;
+	__u64 pid;
+	__u64 tid;
+	__u64 starttime;
+};
+
+/**
+ * struct kdbus_caps - process capabilities
+ * @last_cap:	Highest currently known capability bit
+ * @caps:	Variable number of 32-bit capabilities flags
+ *
+ * Contains a variable number of 32-bit capabilities flags.
+ *
+ * Attached to:
+ *   KDBUS_ITEM_CAPS
+ */
+struct kdbus_caps {
+	__u32 last_cap;
+	__u32 caps[0];
+};
+
+/**
+ * struct kdbus_audit - audit information
+ * @sessionid:		The audit session ID
+ * @loginuid:		The audit login uid
+ *
+ * Attached to:
+ *   KDBUS_ITEM_AUDIT
+ */
+struct kdbus_audit {
+	__u64 sessionid;
+	__u64 loginuid;
+};
+
+/**
+ * struct kdbus_timestamp
+ * @seqnum:		Global per-domain message sequence number
+ * @monotonic_ns:	Monotonic timestamp, in nanoseconds
+ * @realtime_ns:	Realtime timestamp, in nanoseconds
+ *
+ * Attached to:
+ *   KDBUS_ITEM_TIMESTAMP
+ */
+struct kdbus_timestamp {
+	__u64 seqnum;
+	__u64 monotonic_ns;
+	__u64 realtime_ns;
+};
+
+/**
+ * struct kdbus_vec - I/O vector for kdbus payload items
+ * @size:		The size of the vector
+ * @address:		Memory address of data buffer
+ * @offset:		Offset in the in-message payload memory,
+ *			relative to the message head
+ *
+ * Attached to:
+ *   KDBUS_ITEM_PAYLOAD_VEC, KDBUS_ITEM_PAYLOAD_OFF
+ */
+struct kdbus_vec {
+	__u64 size;
+	union {
+		__u64 address;
+		__u64 offset;
+	};
+};
+
+/**
+ * struct kdbus_bloom_parameter - bus-wide bloom parameters
+ * @size:		Size of the bit field in bytes (m / 8)
+ * @n_hash:		Number of hash functions used (k)
+ */
+struct kdbus_bloom_parameter {
+	__u64 size;
+	__u64 n_hash;
+};
+
+/**
+ * struct kdbus_bloom_filter - bloom filter containing n elements
+ * @generation:		Generation of the element set in the filter
+ * @data:		Bit field, multiple of 8 bytes
+ */
+struct kdbus_bloom_filter {
+	__u64 generation;
+	__u64 data[0];
+};
+
+/**
+ * struct kdbus_memfd - a kdbus memfd
+ * @size:		The memfd's size
+ * @fd:			The file descriptor number
+ * @__pad:		Padding to ensure proper alignment and size
+ *
+ * Attached to:
+ *   KDBUS_ITEM_PAYLOAD_MEMFD
+ */
+struct kdbus_memfd {
+	__u64 size;
+	int fd;
+	__u32 __pad;
+};
+
+/**
+ * struct kdbus_name - a registered well-known name with its flags
+ * @flags:		Flags from KDBUS_NAME_*
+ * @name:		Well-known name
+ *
+ * Attached to:
+ *   KDBUS_ITEM_OWNED_NAME
+ */
+struct kdbus_name {
+	__u64 flags;
+	char name[0];
+};
+
+/**
+ * struct kdbus_policy_access - policy access item
+ * @type:		One of KDBUS_POLICY_ACCESS_* types
+ * @access:		Access to grant
+ * @id:			For KDBUS_POLICY_ACCESS_USER, the uid
+ *			For KDBUS_POLICY_ACCESS_GROUP, the gid
+ */
+struct kdbus_policy_access {
+	__u64 type;	/* USER, GROUP, WORLD */
+	__u64 access;	/* OWN, TALK, SEE */
+	__u64 id;	/* uid, gid, 0 */
+};
+
+/**
+ * enum kdbus_item_type - item types to chain data in a list
+ * @_KDBUS_ITEM_NULL:			Uninitialized/invalid
+ * @_KDBUS_ITEM_USER_BASE:		Start of user items
+ * @KDBUS_ITEM_PAYLOAD_VEC:		Vector to data
+ * @KDBUS_ITEM_PAYLOAD_OFF:		Data at returned offset to message head
+ * @KDBUS_ITEM_PAYLOAD_MEMFD:		Data as sealed memfd
+ * @KDBUS_ITEM_FDS:			Attached file descriptors
+ * @KDBUS_ITEM_BLOOM_PARAMETER:		Bus-wide bloom parameters, used with
+ *					KDBUS_CMD_BUS_MAKE, carries a
+ *					struct kdbus_bloom_parameter
+ * @KDBUS_ITEM_BLOOM_FILTER:		Bloom filter carried with a message,
+ *					used to match against a bloom mask of a
+ *					connection, carries a struct
+ *					kdbus_bloom_filter
+ * @KDBUS_ITEM_BLOOM_MASK:		Bloom mask used to match against a
+ *					message's bloom filter
+ * @KDBUS_ITEM_DST_NAME:		Destination's well-known name
+ * @KDBUS_ITEM_MAKE_NAME:		Name of domain, bus, endpoint
+ * @KDBUS_ITEM_ATTACH_FLAGS_SEND:	Attach-flags, used for updating which
+ *					metadata a connection opts in to send
+ * @KDBUS_ITEM_ATTACH_FLAGS_RECV:	Attach-flags, used for updating which
+ *					metadata a connection requests to
+ *					receive for each received message
+ * @KDBUS_ITEM_ID:			Connection ID
+ * @KDBUS_ITEM_NAME:			Well-known name with flags
+ * @_KDBUS_ITEM_ATTACH_BASE:		Start of metadata attach items
+ * @KDBUS_ITEM_TIMESTAMP:		Timestamp
+ * @KDBUS_ITEM_CREDS:			Process credential
+ * @KDBUS_ITEM_AUXGROUPS:		Auxiliary process groups
+ * @KDBUS_ITEM_OWNED_NAME:		A name owned by the associated
+ *					connection
+ * @KDBUS_ITEM_TID_COMM:		Thread ID "comm" identifier
+ * @KDBUS_ITEM_PID_COMM:		Process ID "comm" identifier
+ * @KDBUS_ITEM_EXE:			The path of the executable
+ * @KDBUS_ITEM_CMDLINE:			The process command line
+ * @KDBUS_ITEM_CGROUP:			The cgroup membership
+ * @KDBUS_ITEM_CAPS:			The process capabilities
+ * @KDBUS_ITEM_SECLABEL:		The security label
+ * @KDBUS_ITEM_AUDIT:			The audit IDs
+ * @KDBUS_ITEM_CONN_DESCRIPTION:	The connection's human-readable name
+ *					(debugging)
+ * @_KDBUS_ITEM_POLICY_BASE:		Start of policy items
+ * @KDBUS_ITEM_POLICY_ACCESS:		Policy access block
+ * @_KDBUS_ITEM_KERNEL_BASE:		Start of kernel-generated message items
+ * @KDBUS_ITEM_NAME_ADD:		Notification in kdbus_notify_name_change
+ * @KDBUS_ITEM_NAME_REMOVE:		Notification in kdbus_notify_name_change
+ * @KDBUS_ITEM_NAME_CHANGE:		Notification in kdbus_notify_name_change
+ * @KDBUS_ITEM_ID_ADD:			Notification in kdbus_notify_id_change
+ * @KDBUS_ITEM_ID_REMOVE:		Notification in kdbus_notify_id_change
+ * @KDBUS_ITEM_REPLY_TIMEOUT:		Timeout has been reached
+ * @KDBUS_ITEM_REPLY_DEAD:		Destination died
+ */
+enum kdbus_item_type {
+	_KDBUS_ITEM_NULL,
+	_KDBUS_ITEM_USER_BASE,
+	KDBUS_ITEM_PAYLOAD_VEC	= _KDBUS_ITEM_USER_BASE,
+	KDBUS_ITEM_PAYLOAD_OFF,
+	KDBUS_ITEM_PAYLOAD_MEMFD,
+	KDBUS_ITEM_FDS,
+	KDBUS_ITEM_BLOOM_PARAMETER,
+	KDBUS_ITEM_BLOOM_FILTER,
+	KDBUS_ITEM_BLOOM_MASK,
+	KDBUS_ITEM_DST_NAME,
+	KDBUS_ITEM_MAKE_NAME,
+	KDBUS_ITEM_ATTACH_FLAGS_SEND,
+	KDBUS_ITEM_ATTACH_FLAGS_RECV,
+	KDBUS_ITEM_ID,
+	KDBUS_ITEM_NAME,
+
+	/* keep these item types in sync with KDBUS_ATTACH_* flags */
+	_KDBUS_ITEM_ATTACH_BASE	= 0x1000,
+	KDBUS_ITEM_TIMESTAMP	= _KDBUS_ITEM_ATTACH_BASE,
+	KDBUS_ITEM_CREDS,
+	KDBUS_ITEM_AUXGROUPS,
+	KDBUS_ITEM_OWNED_NAME,
+	KDBUS_ITEM_TID_COMM,
+	KDBUS_ITEM_PID_COMM,
+	KDBUS_ITEM_EXE,
+	KDBUS_ITEM_CMDLINE,
+	KDBUS_ITEM_CGROUP,
+	KDBUS_ITEM_CAPS,
+	KDBUS_ITEM_SECLABEL,
+	KDBUS_ITEM_AUDIT,
+	KDBUS_ITEM_CONN_DESCRIPTION,
+
+	_KDBUS_ITEM_POLICY_BASE	= 0x2000,
+	KDBUS_ITEM_POLICY_ACCESS = _KDBUS_ITEM_POLICY_BASE,
+
+	_KDBUS_ITEM_KERNEL_BASE	= 0x8000,
+	KDBUS_ITEM_NAME_ADD	= _KDBUS_ITEM_KERNEL_BASE,
+	KDBUS_ITEM_NAME_REMOVE,
+	KDBUS_ITEM_NAME_CHANGE,
+	KDBUS_ITEM_ID_ADD,
+	KDBUS_ITEM_ID_REMOVE,
+	KDBUS_ITEM_REPLY_TIMEOUT,
+	KDBUS_ITEM_REPLY_DEAD,
+};
+
+/**
+ * struct kdbus_item - chain of data blocks
+ * @size:		Overall data record size
+ * @type:		Kdbus_item type of data
+ * @data:		Generic bytes
+ * @data32:		Generic 32 bit array
+ * @data64:		Generic 64 bit array
+ * @str:		Generic string
+ * @id:			Connection ID
+ * @vec:		KDBUS_ITEM_PAYLOAD_VEC
+ * @creds:		KDBUS_ITEM_CREDS
+ * @audit:		KDBUS_ITEM_AUDIT
+ * @timestamp:		KDBUS_ITEM_TIMESTAMP
+ * @name:		KDBUS_ITEM_NAME
+ * @bloom_parameter:	KDBUS_ITEM_BLOOM_PARAMETER
+ * @bloom_filter:	KDBUS_ITEM_BLOOM_FILTER
+ * @memfd:		KDBUS_ITEM_PAYLOAD_MEMFD
+ * @name_change:	KDBUS_ITEM_NAME_ADD
+ *			KDBUS_ITEM_NAME_REMOVE
+ *			KDBUS_ITEM_NAME_CHANGE
+ * @id_change:		KDBUS_ITEM_ID_ADD
+ *			KDBUS_ITEM_ID_REMOVE
+ * @policy_access:	KDBUS_ITEM_POLICY_ACCESS
+ */
+struct kdbus_item {
+	__u64 size;
+	__u64 type;
+	union {
+		__u8 data[0];
+		__u32 data32[0];
+		__u64 data64[0];
+		char str[0];
+
+		__u64 id;
+		struct kdbus_vec vec;
+		struct kdbus_creds creds;
+		struct kdbus_audit audit;
+		struct kdbus_caps caps;
+		struct kdbus_timestamp timestamp;
+		struct kdbus_name name;
+		struct kdbus_bloom_parameter bloom_parameter;
+		struct kdbus_bloom_filter bloom_filter;
+		struct kdbus_memfd memfd;
+		int fds[0];
+		struct kdbus_notify_name_change name_change;
+		struct kdbus_notify_id_change id_change;
+		struct kdbus_policy_access policy_access;
+	};
+};
+
+/**
+ * enum kdbus_msg_flags - type of message
+ * @KDBUS_MSG_FLAGS_EXPECT_REPLY:	Expect a reply message, used for
+ *					method calls. The userspace-supplied
+ *					cookie identifies the message and the
+ *					respective reply carries the cookie
+ *					in cookie_reply
+ * @KDBUS_MSG_FLAGS_SYNC_REPLY:		Wait for destination connection to
+ *					reply to this message. The
+ *					KDBUS_CMD_MSG_SEND ioctl() will block
+ *					until the reply is received, and
+ *					offset_reply in struct kdbus_msg will
+ *					yield the offset in the sender's pool
+ *					where the reply can be found.
+ *					This flag is only valid if
+ *					@KDBUS_MSG_FLAGS_EXPECT_REPLY is set as
+ *					well.
+ * @KDBUS_MSG_FLAGS_NO_AUTO_START:	Do not start a service, if the addressed
+ *					name is not currently active
+ */
+enum kdbus_msg_flags {
+	KDBUS_MSG_FLAGS_EXPECT_REPLY	= 1ULL << 0,
+	KDBUS_MSG_FLAGS_SYNC_REPLY	= 1ULL << 1,
+	KDBUS_MSG_FLAGS_NO_AUTO_START	= 1ULL << 2,
+};
+
+/**
+ * enum kdbus_payload_type - type of payload carried by message
+ * @KDBUS_PAYLOAD_KERNEL:	Kernel-generated simple message
+ * @KDBUS_PAYLOAD_DBUS:		D-Bus marshalling "DBusDBus"
+ */
+enum kdbus_payload_type {
+	KDBUS_PAYLOAD_KERNEL,
+	KDBUS_PAYLOAD_DBUS	= 0x4442757344427573ULL,
+};
+
+/**
+ * struct kdbus_msg - the representation of a kdbus message
+ * @size:		Total size of the message
+ * @flags:		Message flags (KDBUS_MSG_FLAGS_*), userspace → kernel
+ * @kernel_flags:	Supported message flags, kernel → userspace
+ * @priority:		Message queue priority value
+ * @dst_id:		64-bit ID of the destination connection
+ * @src_id:		64-bit ID of the source connection
+ * @payload_type:	Payload type (KDBUS_PAYLOAD_*)
+ * @cookie:		Userspace-supplied cookie, for the connection
+ *			to identify its messages
+ * @timeout_ns:		The time to wait for a message reply from the peer.
+ *			If there is no reply, a kernel-generated message
+ *			with an attached KDBUS_ITEM_REPLY_TIMEOUT item
+ *			is sent to @src_id. The timeout is given in
+ *			nanoseconds as an absolute CLOCK_MONOTONIC value.
+ * @cookie_reply:	A reply to the requesting message with the same
+ *			cookie. The requesting connection can match its
+ *			request and the reply with this value
+ * @offset_reply:	If KDBUS_MSG_FLAGS_SYNC_REPLY is set, this field will
+ *			contain the offset in the sender's pool where the
+ *			reply is stored.
+ * @items:		A list of kdbus_items containing the message payload
+ */
+struct kdbus_msg {
+	__u64 size;
+	__u64 flags;
+	__u64 kernel_flags;
+	__s64 priority;
+	__u64 dst_id;
+	__u64 src_id;
+	__u64 payload_type;
+	__u64 cookie;
+	union {
+		__u64 timeout_ns;
+		__u64 cookie_reply;
+		__u64 offset_reply;
+	};
+	struct kdbus_item items[0];
+} __attribute__((aligned(8)));
+
+/**
+ * enum kdbus_recv_flags - flags for de-queuing messages
+ * @KDBUS_RECV_PEEK:		Return the next queued message without
+ *				actually de-queuing it, and without installing
+ *				any file descriptors or other resources. It is
+ *				usually used to determine the activating
+ *				connection of a bus name.
+ * @KDBUS_RECV_DROP:		Drop and free the next queued message and all
+ *				its resources without actually receiving it.
+ * @KDBUS_RECV_USE_PRIORITY:	Only de-queue messages with the specified or
+ *				higher priority (lowest values); if not set,
+ *				the priority value is ignored.
+ */
+enum kdbus_recv_flags {
+	KDBUS_RECV_PEEK		= 1ULL <<  0,
+	KDBUS_RECV_DROP		= 1ULL <<  1,
+	KDBUS_RECV_USE_PRIORITY	= 1ULL <<  2,
+};
+
+/**
+ * struct kdbus_cmd_recv - struct to de-queue a buffered message
+ * @flags:		KDBUS_RECV_* flags, userspace → kernel
+ * @kernel_flags:	Supported KDBUS_RECV_* flags, kernel → userspace
+ * @priority:		Minimum priority of the messages to de-queue. Lowest
+ *			values have the highest priority.
+ * @offset:		Returned offset in the pool where the message is
+ *			stored. The user must use KDBUS_CMD_FREE to free
+ *			the allocated memory.
+ * @dropped_msgs:	In case the KDBUS_CMD_MSG_RECV ioctl returns
+ *			-EOVERFLOW, this field will contain the number of
+ *			broadcast messages that have been lost since the
+ *			last call.
+ *
+ * This struct is used with the KDBUS_CMD_MSG_RECV ioctl.
+ */
+struct kdbus_cmd_recv {
+	__u64 flags;
+	__u64 kernel_flags;
+	__s64 priority;
+	union {
+		__u64 offset;
+		__u64 dropped_msgs;
+	};
+} __attribute__((aligned(8)));
+
+/**
+ * struct kdbus_cmd_cancel - struct to cancel a synchronously pending message
+ * @cookie:		The cookie of the pending message
+ * @flags:		Flags for the cancel command. Currently unused.
+ *
+ * This struct is used with the KDBUS_CMD_MSG_CANCEL ioctl.
+ */
+struct kdbus_cmd_cancel {
+	__u64 cookie;
+	__u64 flags;
+} __attribute__((aligned(8)));
+
+/**
+ * struct kdbus_cmd_free - struct to free a slice of memory in the pool
+ * @offset:		The offset of the memory slice, as returned by other
+ *			ioctls
+ * @flags:		Flags for the free command, userspace → kernel
+ * @kernel_flags:	Supported flags of the free command, kernel → userspace
+ *
+ * This struct is used with the KDBUS_CMD_FREE ioctl.
+ */
+struct kdbus_cmd_free {
+	__u64 offset;
+	__u64 flags;
+	__u64 kernel_flags;
+} __attribute__((aligned(8)));
+
+/**
+ * enum kdbus_policy_access_type - permissions of a policy record
+ * @_KDBUS_POLICY_ACCESS_NULL:	Uninitialized/invalid
+ * @KDBUS_POLICY_ACCESS_USER:	Grant access to a uid
+ * @KDBUS_POLICY_ACCESS_GROUP:	Grant access to gid
+ * @KDBUS_POLICY_ACCESS_WORLD:	World-accessible
+ */
+enum kdbus_policy_access_type {
+	_KDBUS_POLICY_ACCESS_NULL,
+	KDBUS_POLICY_ACCESS_USER,
+	KDBUS_POLICY_ACCESS_GROUP,
+	KDBUS_POLICY_ACCESS_WORLD,
+};
+
+/**
+ * enum kdbus_policy_type - mode flags
+ * @KDBUS_POLICY_OWN:		Allow to own a well-known name
+ *				Implies KDBUS_POLICY_TALK and KDBUS_POLICY_SEE
+ * @KDBUS_POLICY_TALK:		Allow communication to a well-known name
+ *				Implies KDBUS_POLICY_SEE
+ * @KDBUS_POLICY_SEE:		Allow to see a well-known name
+ */
+enum kdbus_policy_type {
+	KDBUS_POLICY_SEE	= 0,
+	KDBUS_POLICY_TALK,
+	KDBUS_POLICY_OWN,
+};
+
+/**
+ * enum kdbus_hello_flags - flags for struct kdbus_cmd_hello
+ * @KDBUS_HELLO_ACCEPT_FD:	The connection allows the reception of
+ *				any passed file descriptors
+ * @KDBUS_HELLO_ACTIVATOR:	Special-purpose connection which registers
+ *				a well-known name for a process to be started
+ *				when traffic arrives
+ * @KDBUS_HELLO_POLICY_HOLDER:	Special-purpose connection which registers
+ *				policy entries for a name. The provided name
+ *				is not activated and not registered with the
+ *				name database, it only allows unprivileged
+ *				connections to acquire a name, talk or discover
+ *				a service
+ * @KDBUS_HELLO_MONITOR:	Special-purpose connection to monitor
+ *				bus traffic
+ */
+enum kdbus_hello_flags {
+	KDBUS_HELLO_ACCEPT_FD		=  1ULL <<  0,
+	KDBUS_HELLO_ACTIVATOR		=  1ULL <<  1,
+	KDBUS_HELLO_POLICY_HOLDER	=  1ULL <<  2,
+	KDBUS_HELLO_MONITOR		=  1ULL <<  3,
+};
+
+/**
+ * enum kdbus_attach_flags - flags for metadata attachments
+ * @KDBUS_ATTACH_TIMESTAMP:		Timestamp
+ * @KDBUS_ATTACH_CREDS:			Credentials
+ * @KDBUS_ATTACH_AUXGROUPS:		Auxiliary groups
+ * @KDBUS_ATTACH_NAMES:			Well-known names
+ * @KDBUS_ATTACH_TID_COMM:		The "comm" process identifier of the TID
+ * @KDBUS_ATTACH_PID_COMM:		The "comm" process identifier of the PID
+ * @KDBUS_ATTACH_EXE:			The path of the executable
+ * @KDBUS_ATTACH_CMDLINE:		The process command line
+ * @KDBUS_ATTACH_CGROUP:		The cgroup membership
+ * @KDBUS_ATTACH_CAPS:			The process capabilities
+ * @KDBUS_ATTACH_SECLABEL:		The security label
+ * @KDBUS_ATTACH_AUDIT:			The audit IDs
+ * @KDBUS_ATTACH_CONN_DESCRIPTION:	The human-readable connection name
+ * @_KDBUS_ATTACH_ALL:			All of the above
+ * @_KDBUS_ATTACH_ANY:			Wildcard match to enable any kind of
+ *					metadata.
+ */
+enum kdbus_attach_flags {
+	KDBUS_ATTACH_TIMESTAMP		=  1ULL <<  0,
+	KDBUS_ATTACH_CREDS		=  1ULL <<  1,
+	KDBUS_ATTACH_AUXGROUPS		=  1ULL <<  2,
+	KDBUS_ATTACH_NAMES		=  1ULL <<  3,
+	KDBUS_ATTACH_TID_COMM		=  1ULL <<  4,
+	KDBUS_ATTACH_PID_COMM		=  1ULL <<  5,
+	KDBUS_ATTACH_EXE		=  1ULL <<  6,
+	KDBUS_ATTACH_CMDLINE		=  1ULL <<  7,
+	KDBUS_ATTACH_CGROUP		=  1ULL <<  8,
+	KDBUS_ATTACH_CAPS		=  1ULL <<  9,
+	KDBUS_ATTACH_SECLABEL		=  1ULL << 10,
+	KDBUS_ATTACH_AUDIT		=  1ULL << 11,
+	KDBUS_ATTACH_CONN_DESCRIPTION	=  1ULL << 12,
+	_KDBUS_ATTACH_ALL		=  (1ULL << 13) - 1,
+	_KDBUS_ATTACH_ANY		=  ~0ULL
+};
+
+/**
+ * struct kdbus_cmd_hello - struct to say hello to kdbus
+ * @size:		The total size of the structure
+ * @flags:		Connection flags (KDBUS_HELLO_*), userspace → kernel
+ * @kernel_flags:	Supported connection flags, kernel → userspace
+ * @attach_flags_send:	Mask of metadata to attach to each message sent
+ *			off by this connection (KDBUS_ATTACH_*)
+ * @attach_flags_recv:	Mask of metadata to attach to each message received
+ *			by the new connection (KDBUS_ATTACH_*)
+ * @bus_flags:		The flags field copied verbatim from the original
+ *			KDBUS_CMD_BUS_MAKE ioctl. It's intended to be useful
+ *			to do negotiation of features of the payload that is
+ *			transferred (kernel → userspace)
+ * @id:			The ID of this connection (kernel → userspace)
+ * @pool_size:		Size of the connection's buffer where the received
+ *			messages are placed
+ * @bloom:		The bloom properties of the bus, specified
+ *			by the bus creator (kernel → userspace)
+ * @id128:		Unique 128-bit ID of the bus (kernel → userspace)
+ * @items:		A list of items
+ *
+ * This struct is used with the KDBUS_CMD_HELLO ioctl.
+ */
+struct kdbus_cmd_hello {
+	__u64 size;
+	__u64 flags;
+	__u64 kernel_flags;
+	__u64 attach_flags_send;
+	__u64 attach_flags_recv;
+	__u64 bus_flags;
+	__u64 id;
+	__u64 pool_size;
+	struct kdbus_bloom_parameter bloom;
+	__u8 id128[16];
+	struct kdbus_item items[0];
+} __attribute__((aligned(8)));
+
+/**
+ * enum kdbus_make_flags - Flags for KDBUS_CMD_{BUS,EP,NS}_MAKE
+ * @KDBUS_MAKE_ACCESS_GROUP:	Make the device node group-accessible
+ * @KDBUS_MAKE_ACCESS_WORLD:	Make the device node world-accessible
+ */
+enum kdbus_make_flags {
+	KDBUS_MAKE_ACCESS_GROUP		= 1ULL <<  0,
+	KDBUS_MAKE_ACCESS_WORLD		= 1ULL <<  1,
+};
+
+/**
+ * struct kdbus_cmd_make - struct to make a bus, an endpoint or a domain
+ * @size:		The total size of the struct
+ * @flags:		Properties for the bus/ep/domain to create,
+ *			userspace → kernel
+ * @kernel_flags:	Supported flags for the used command, kernel → userspace
+ * @items:		Items describing details
+ *
+ * This structure is used with the KDBUS_CMD_BUS_MAKE and
+ * KDBUS_CMD_ENDPOINT_MAKE ioctls.
+ */
+struct kdbus_cmd_make {
+	__u64 size;
+	__u64 flags;
+	__u64 kernel_flags;
+	struct kdbus_item items[0];
+} __attribute__((aligned(8)));
+
+/**
+ * enum kdbus_name_flags - properties of a well-known name
+ * @KDBUS_NAME_REPLACE_EXISTING:	Try to replace name of other connections
+ * @KDBUS_NAME_ALLOW_REPLACEMENT:	Allow the replacement of the name
+ * @KDBUS_NAME_QUEUE:			Name should be queued if busy
+ * @KDBUS_NAME_IN_QUEUE:		Name is queued
+ * @KDBUS_NAME_ACTIVATOR:		Name is owned by an activator connection
+ */
+enum kdbus_name_flags {
+	KDBUS_NAME_REPLACE_EXISTING	= 1ULL <<  0,
+	KDBUS_NAME_ALLOW_REPLACEMENT	= 1ULL <<  1,
+	KDBUS_NAME_QUEUE		= 1ULL <<  2,
+	KDBUS_NAME_IN_QUEUE		= 1ULL <<  3,
+	KDBUS_NAME_ACTIVATOR		= 1ULL <<  4,
+};
+
+/**
+ * struct kdbus_cmd_name - struct to describe a well-known name
+ * @size:		The total size of the struct
+ * @flags:		Flags for a name entry (KDBUS_NAME_*),
+ *			userspace → kernel, kernel → userspace
+ * @kernel_flags:	Supported flags for a name entry, kernel → userspace
+ * @items:		Item list, containing the well-known name as
+ *			KDBUS_ITEM_NAME
+ *
+ * Used with the KDBUS_CMD_NAME_ACQUIRE and KDBUS_CMD_NAME_RELEASE ioctls.
+ */
+struct kdbus_cmd_name {
+	__u64 size;
+	__u64 flags;
+	__u64 kernel_flags;
+	struct kdbus_item items[0];
+} __attribute__((aligned(8)));
+
+/**
+ * struct kdbus_name_info - struct to describe a well-known name
+ * @size:		The total size of the struct
+ * @conn_flags:		The flags of the owning connection (KDBUS_HELLO_*)
+ * @owner_id:		The current owner of the name
+ * @items:		Item list, containing the well-known name as
+ *			KDBUS_ITEM_OWNED_NAME
+ *
+ * This structure is used as return struct for the KDBUS_CMD_NAME_LIST ioctl.
+ */
+struct kdbus_name_info {
+	__u64 size;
+	__u64 conn_flags;
+	__u64 owner_id;
+	struct kdbus_item items[0];
+} __attribute__((aligned(8)));
+
+/**
+ * enum kdbus_name_list_flags - what to include into the returned list
+ * @KDBUS_NAME_LIST_UNIQUE:	All active connections
+ * @KDBUS_NAME_LIST_NAMES:	All known well-known names
+ * @KDBUS_NAME_LIST_ACTIVATORS:	All activator connections
+ * @KDBUS_NAME_LIST_QUEUED:	All queued-up names
+ */
+enum kdbus_name_list_flags {
+	KDBUS_NAME_LIST_UNIQUE		= 1ULL <<  0,
+	KDBUS_NAME_LIST_NAMES		= 1ULL <<  1,
+	KDBUS_NAME_LIST_ACTIVATORS	= 1ULL <<  2,
+	KDBUS_NAME_LIST_QUEUED		= 1ULL <<  3,
+};
+
+/**
+ * struct kdbus_cmd_name_list - request a list of name entries
+ * @flags:		Flags for the query (KDBUS_NAME_LIST_*),
+ *			userspace → kernel
+ * @kernel_flags:	Supported flags for queries, kernel → userspace
+ * @offset:		The returned offset in the caller's pool buffer.
+ *			The user must use KDBUS_CMD_FREE to free the
+ *			allocated memory.
+ *
+ * This structure is used with the KDBUS_CMD_NAME_LIST ioctl.
+ */
+struct kdbus_cmd_name_list {
+	__u64 flags;
+	__u64 kernel_flags;
+	__u64 offset;
+} __attribute__((aligned(8)));
+
+/**
+ * struct kdbus_name_list - information returned by KDBUS_CMD_NAME_LIST
+ * @size:		The total size of the structure
+ * @names:		A list of names
+ *
+ * Note that the user is responsible for freeing the allocated memory with
+ * the KDBUS_CMD_FREE ioctl.
+ */
+struct kdbus_name_list {
+	__u64 size;
+	struct kdbus_name_info names[0];
+};
+
+/**
+ * struct kdbus_cmd_info - struct used for KDBUS_CMD_CONN_INFO ioctl
+ * @size:		The total size of the struct
+ * @flags:		KDBUS_ATTACH_* flags, userspace → kernel
+ * @kernel_flags:	Supported KDBUS_ATTACH_* flags, kernel → userspace
+ * @id:			The 64-bit ID of the connection. If set to zero, passing
+ *			@name is required. kdbus will look up the name to
+ *			determine the ID in this case.
+ * @offset:		Returned offset in the caller's pool buffer where the
+ *			kdbus_info struct result is stored. The user must
+ *			use KDBUS_CMD_FREE to free the allocated memory.
+ * @items:		The optional item list, containing the
+ *			well-known name to look up as a KDBUS_ITEM_NAME.
+ *			Only needed in case @id is zero.
+ *
+ * On success, the KDBUS_CMD_CONN_INFO ioctl will return 0 and @offset will
+ * tell the user the offset in the connection pool buffer at which to find the
+ * result in a struct kdbus_info.
+ */
+struct kdbus_cmd_info {
+	__u64 size;
+	__u64 flags;
+	__u64 kernel_flags;
+	__u64 id;
+	__u64 offset;
+	struct kdbus_item items[0];
+} __attribute__((aligned(8)));
+
+/**
+ * struct kdbus_info - information returned by KDBUS_CMD_*_INFO
+ * @size:		The total size of the struct
+ * @id:			The connection's or bus' 64-bit ID
+ * @flags:		The connection's or bus' flags
+ * @items:		A list of struct kdbus_item
+ *
+ * Note that the user is responsible for freeing the allocated memory with
+ * the KDBUS_CMD_FREE ioctl.
+ */
+struct kdbus_info {
+	__u64 size;
+	__u64 id;
+	__u64 flags;
+	struct kdbus_item items[0];
+};
+
+/**
+ * struct kdbus_cmd_update - update flags of a connection
+ * @size:		The total size of the struct
+ * @flags:		Flags for the update command, userspace → kernel
+ * @kernel_flags:	Supported flags for this command, kernel → userspace
+ * @items:		A list of struct kdbus_item
+ *
+ * This struct is used with the KDBUS_CMD_CONN_UPDATE ioctl.
+ */
+struct kdbus_cmd_update {
+	__u64 size;
+	__u64 flags;
+	__u64 kernel_flags;
+	struct kdbus_item items[0];
+} __attribute__((aligned(8)));
+
+/**
+ * enum kdbus_cmd_match_flags - flags to control the KDBUS_CMD_MATCH_ADD ioctl
+ * @KDBUS_MATCH_REPLACE:	If entries with the supplied cookie already
+ *				exist, remove them before installing the new
+ *				matches.
+ */
+enum kdbus_cmd_match_flags {
+	KDBUS_MATCH_REPLACE	= 1ULL <<  0,
+};
+
+/**
+ * struct kdbus_cmd_match - struct to add or remove matches
+ * @size:		The total size of the struct
+ * @cookie:		Userspace supplied cookie. When removing, the cookie
+ *			identifies the match to remove
+ * @flags:		Flags for match command (KDBUS_MATCH_*),
+ *			userspace → kernel
+ * @kernel_flags:	Supported flags of the used command, kernel → userspace
+ * @items:		A list of items for additional information
+ *
+ * This structure is used with the KDBUS_CMD_MATCH_ADD and
+ * KDBUS_CMD_MATCH_REMOVE ioctl.
+ */
+struct kdbus_cmd_match {
+	__u64 size;
+	__u64 cookie;
+	__u64 flags;
+	__u64 kernel_flags;
+	struct kdbus_item items[0];
+} __attribute__((aligned(8)));
+
+/**
+ * Ioctl API
+ * KDBUS_CMD_BUS_MAKE:		After opening the "control" device node, this
+ *				command creates a new bus with the specified
+ *				name. The bus is immediately shut down and
+ *				cleaned up when the opened "control" device node
+ *				is closed.
+ * KDBUS_CMD_ENDPOINT_MAKE:	Creates a new named special endpoint to talk to
+ *				the bus. Such endpoints usually carry a more
+ *				restrictive policy and grant restricted access
+ *				to specific applications.
+ * KDBUS_CMD_HELLO:		By opening the bus device node a connection is
+ *				created. After a HELLO the opened connection
+ *				becomes an active peer on the bus.
+ * KDBUS_CMD_BYEBYE:		Disconnect a connection. If there are no
+ *				messages queued up in the connection's pool,
+ *				the call succeeds, and the handle is rendered
+ *				unusable. Otherwise, -EBUSY is returned without
+ *				any further side-effects.
+ * KDBUS_CMD_MSG_SEND:		Send a message and pass data from userspace to
+ *				the kernel.
+ * KDBUS_CMD_MSG_RECV:		Receive a message from the kernel which is
+ *				placed in the receiver's pool.
+ * KDBUS_CMD_MSG_CANCEL:	Cancel a pending request of a message that
+ *				blocks while waiting for a reply. The parameter
+ *				denotes the cookie of the message in flight.
+ * KDBUS_CMD_FREE:		Release the allocated memory in the receiver's
+ *				pool.
+ * KDBUS_CMD_NAME_ACQUIRE:	Request a well-known bus name to associate with
+ *				the connection. Well-known names are used to
+ *				address a peer on the bus.
+ * KDBUS_CMD_NAME_RELEASE:	Release a well-known name the connection
+ *				currently owns.
+ * KDBUS_CMD_NAME_LIST:		Retrieve the list of all currently registered
+ *				well-known and unique names.
+ * KDBUS_CMD_CONN_INFO:		Retrieve credentials and properties of the
+ *				initial creator of the connection. The data was
+ *				stored at registration time and does not
+ *				necessarily represent the connected process or
+ *				the actual state of the process.
+ * KDBUS_CMD_CONN_UPDATE:	Update the properties of a connection. Used to
+ *				update the metadata subscription mask and
+ *				policy.
+ * KDBUS_CMD_BUS_CREATOR_INFO:	Retrieve information of the creator of the bus
+ *				a connection is attached to.
+ * KDBUS_CMD_ENDPOINT_UPDATE:	Update the properties of a custom endpoint. Used
+ *				to update the policy.
+ * KDBUS_CMD_MATCH_ADD:	Install a match that selects which broadcast
+ *				messages are delivered to the connection.
+ * KDBUS_CMD_MATCH_REMOVE:	Remove a current match for broadcast messages.
+ */
+#define KDBUS_CMD_BUS_MAKE		_IOW(KDBUS_IOCTL_MAGIC, 0x00,	\
+					     struct kdbus_cmd_make)
+#define KDBUS_CMD_ENDPOINT_MAKE		_IOW(KDBUS_IOCTL_MAGIC, 0x10,	\
+					     struct kdbus_cmd_make)
+
+#define KDBUS_CMD_HELLO			_IOWR(KDBUS_IOCTL_MAGIC, 0x20,	\
+					      struct kdbus_cmd_hello)
+#define KDBUS_CMD_BYEBYE		_IO(KDBUS_IOCTL_MAGIC, 0x21)
+
+#define KDBUS_CMD_MSG_SEND		_IOWR(KDBUS_IOCTL_MAGIC, 0x30,	\
+					      struct kdbus_msg)
+#define KDBUS_CMD_MSG_RECV		_IOWR(KDBUS_IOCTL_MAGIC, 0x31,	\
+					      struct kdbus_cmd_recv)
+#define KDBUS_CMD_MSG_CANCEL		_IOW(KDBUS_IOCTL_MAGIC, 0x32,	\
+					     struct kdbus_cmd_cancel)
+#define KDBUS_CMD_FREE			_IOW(KDBUS_IOCTL_MAGIC, 0x33,	\
+					     struct kdbus_cmd_free)
+
+#define KDBUS_CMD_NAME_ACQUIRE		_IOWR(KDBUS_IOCTL_MAGIC, 0x40,	\
+					      struct kdbus_cmd_name)
+#define KDBUS_CMD_NAME_RELEASE		_IOW(KDBUS_IOCTL_MAGIC, 0x41,	\
+					     struct kdbus_cmd_name)
+#define KDBUS_CMD_NAME_LIST		_IOWR(KDBUS_IOCTL_MAGIC, 0x42,	\
+					      struct kdbus_cmd_name_list)
+
+#define KDBUS_CMD_CONN_INFO		_IOWR(KDBUS_IOCTL_MAGIC, 0x50,	\
+					      struct kdbus_cmd_info)
+#define KDBUS_CMD_CONN_UPDATE		_IOW(KDBUS_IOCTL_MAGIC, 0x51,	\
+					     struct kdbus_cmd_update)
+#define KDBUS_CMD_BUS_CREATOR_INFO	_IOWR(KDBUS_IOCTL_MAGIC, 0x52,	\
+					      struct kdbus_cmd_info)
+
+#define KDBUS_CMD_ENDPOINT_UPDATE	_IOW(KDBUS_IOCTL_MAGIC, 0x61,	\
+					     struct kdbus_cmd_update)
+
+#define KDBUS_CMD_MATCH_ADD		_IOW(KDBUS_IOCTL_MAGIC, 0x70,	\
+					     struct kdbus_cmd_match)
+#define KDBUS_CMD_MATCH_REMOVE		_IOW(KDBUS_IOCTL_MAGIC, 0x71,	\
+					     struct kdbus_cmd_match)
+
+#endif /* _KDBUS_UAPI_H_ */
-- 
2.1.3
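
A note on KDBUS_PAYLOAD_DBUS in the header above: its value,
0x4442757344427573, is the ASCII string "DBusDBus" packed into a
64-bit integer, most significant byte first. The following is only an
illustrative userspace sketch, not part of the patch; demo_magic_to_str
and DEMO_PAYLOAD_DBUS are hypothetical names introduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same value as KDBUS_PAYLOAD_DBUS in the header; the name is made up. */
#define DEMO_PAYLOAD_DBUS 0x4442757344427573ULL

/*
 * Unpack the 8 bytes of @magic, most significant byte first, into
 * @out and terminate the result with a NUL byte.
 */
static void demo_magic_to_str(uint64_t magic, char out[9])
{
	int i;

	assert(out != NULL);

	for (i = 0; i < 8; i++)
		out[i] = (char)((magic >> (56 - 8 * i)) & 0xff);
	out[8] = '\0';
}
```

Calling demo_magic_to_str(DEMO_PAYLOAD_DBUS, buf) with a 9-byte buffer
fills it with the marker string, which makes the constant easy to spot
in a hex dump of a message.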


* kdbus: add driver skeleton, ioctl entry points and utility functions
From: Greg Kroah-Hartman @ 2014-11-21  5:02 UTC (permalink / raw)
  To: arnd, ebiederm, gnomes, teg, jkosina, luto, linux-api, linux-kernel
  Cc: daniel, dh.herrmann, tixxdz, Greg Kroah-Hartman

From: Daniel Mack <daniel@zonque.org>

Add the basic driver structure.

handle.c is the main ioctl command dispatcher that calls into other
parts of the driver.
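
As a rough illustration of the dispatch idea (not the code in this
patch), a handle starts without a role, takes on exactly one role via
HELLO or ENDPOINT_MAKE, and only then accepts role-specific commands.
All demo_* names below are hypothetical, locking is left out, and the
command set accepted per role is an assumption made for the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#ifndef EBADFD
#define EBADFD 77	/* Linux value; fallback for other libcs */
#endif

enum demo_handle_type {
	DEMO_HANDLE_NONE,	/* freshly opened, no role yet */
	DEMO_HANDLE_CONNECTED,	/* bus connection after HELLO */
	DEMO_HANDLE_OWNER,	/* holds a custom endpoint */
};

enum demo_cmd {
	DEMO_CMD_ENDPOINT_MAKE,
	DEMO_CMD_HELLO,
	DEMO_CMD_MSG_SEND,
};

struct demo_handle {
	enum demo_handle_type type;
};

/* A handle may take on a role only once. */
static int demo_handle_set_type(struct demo_handle *h,
				enum demo_handle_type type)
{
	assert(h != NULL);

	if (h->type != DEMO_HANDLE_NONE)
		return -EBADFD;

	h->type = type;
	return 0;
}

/* Route a command according to the current role of the handle. */
static int demo_handle_ioctl(struct demo_handle *h, enum demo_cmd cmd)
{
	switch (h->type) {
	case DEMO_HANDLE_NONE:
		if (cmd == DEMO_CMD_ENDPOINT_MAKE)
			return demo_handle_set_type(h, DEMO_HANDLE_OWNER);
		if (cmd == DEMO_CMD_HELLO)
			return demo_handle_set_type(h,
						    DEMO_HANDLE_CONNECTED);
		return -ENOTTY;

	case DEMO_HANDLE_CONNECTED:
		/* assumed: only connected peers may send messages */
		return cmd == DEMO_CMD_MSG_SEND ? 0 : -ENOTTY;

	case DEMO_HANDLE_OWNER:
		/* owner commands are not modelled in this sketch */
		return -ENOTTY;
	}

	return -ENOTTY;
}
```

In the real driver the role change happens under the handle lock so
that two parallel ioctls on the same file descriptor cannot both
succeed.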

main.c contains the code that creates the initial domain at startup, and
util.c has utility functions such as item iterators that are shared with
other files.
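
To give an idea of what an item iterator does, here is a minimal
userspace sketch. It assumes items are stored back to back, each
starting with a 64-bit size and a 64-bit type, and that every item is
padded to the next 8-byte boundary; the 8-byte padding is an
assumption based on the aligned(8) attributes in the UAPI header, and
the demo_* names are hypothetical, not helpers from util.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mirror of the item header: size first, then type. */
struct demo_item {
	uint64_t size;	/* header plus payload, without padding */
	uint64_t type;
	uint8_t data[];
};

#define DEMO_ALIGN8(v)		(((v) + 7ULL) & ~7ULL)
#define DEMO_ITEM_HEADER_SIZE	offsetof(struct demo_item, data)

/*
 * Append one item at @off. The caller must make sure the buffer is
 * large enough. Returns the offset where the next item would start.
 */
static size_t demo_item_append(uint8_t *buf, size_t off, uint64_t type,
			       const void *payload, size_t payload_len)
{
	uint64_t size = DEMO_ITEM_HEADER_SIZE + payload_len;

	memcpy(buf + off, &size, sizeof(size));
	memcpy(buf + off + sizeof(size), &type, sizeof(type));
	memcpy(buf + off + DEMO_ITEM_HEADER_SIZE, payload, payload_len);

	return off + DEMO_ALIGN8(size);
}

/*
 * Count the items of @type in @buf. Returns -1 if an item header is
 * truncated or an item claims more bytes than the buffer holds.
 */
static int demo_items_count(const void *buf, size_t len, uint64_t type)
{
	const uint8_t *p = buf;
	size_t off = 0;
	int n = 0;

	assert(buf != NULL || len == 0);

	while (off < len) {
		uint64_t size, item_type;

		if (len - off < DEMO_ITEM_HEADER_SIZE)
			return -1;

		memcpy(&size, p + off, sizeof(size));
		memcpy(&item_type, p + off + sizeof(size),
		       sizeof(item_type));

		if (size < DEMO_ITEM_HEADER_SIZE || size > len - off)
			return -1;

		if (item_type == type)
			n++;

		/* the next item starts at the next 8-byte boundary */
		off += DEMO_ALIGN8(size);
	}

	return n;
}
```

Checking every size against the remaining length before advancing is
what keeps such a walk inside the buffer when the input comes from an
untrusted peer.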

limits.h describes limits on things like maximum data structure sizes,
number of messages per user and suchlike. Some of the numbers currently
picked are rough ideas of what might be sufficient and are probably
rather conservative.

Signed-off-by: Daniel Mack <daniel@zonque.org>
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Signed-off-by: Djalal Harouni <tixxdz@opendz.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 Documentation/ioctl/ioctl-number.txt |   1 +
 ipc/kdbus/handle.c                   | 993 +++++++++++++++++++++++++++++++++++
 ipc/kdbus/handle.h                   |  20 +
 ipc/kdbus/limits.h                   |  77 +++
 ipc/kdbus/main.c                     |  59 +++
 ipc/kdbus/util.c                     | 166 ++++++
 ipc/kdbus/util.h                     | 103 ++++
 7 files changed, 1419 insertions(+)
 create mode 100644 ipc/kdbus/handle.c
 create mode 100644 ipc/kdbus/handle.h
 create mode 100644 ipc/kdbus/limits.h
 create mode 100644 ipc/kdbus/main.c
 create mode 100644 ipc/kdbus/util.c
 create mode 100644 ipc/kdbus/util.h

diff --git a/Documentation/ioctl/ioctl-number.txt b/Documentation/ioctl/ioctl-number.txt
index 8136e1fd30fd..54e091ebb862 100644
--- a/Documentation/ioctl/ioctl-number.txt
+++ b/Documentation/ioctl/ioctl-number.txt
@@ -292,6 +292,7 @@ Code  Seq#(hex)	Include File		Comments
 0x92	00-0F	drivers/usb/mon/mon_bin.c
 0x93	60-7F	linux/auto_fs.h
 0x94	all	fs/btrfs/ioctl.h
+0x95	all	uapi/linux/kdbus.h	kdbus IPC driver
 0x97	00-7F	fs/ceph/ioctl.h		Ceph file system
 0x99	00-0F				537-Addinboard driver
 					<mailto:buk@buks.ipn.de>
diff --git a/ipc/kdbus/handle.c b/ipc/kdbus/handle.c
new file mode 100644
index 000000000000..b3b943b864d8
--- /dev/null
+++ b/ipc/kdbus/handle.c
@@ -0,0 +1,993 @@
+/*
+ * Copyright (C) 2013-2014 Kay Sievers
+ * Copyright (C) 2013-2014 Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+ * Copyright (C) 2013-2014 Daniel Mack <daniel@zonque.org>
+ * Copyright (C) 2013-2014 David Herrmann <dh.herrmann@gmail.com>
+ * Copyright (C) 2013-2014 Linux Foundation
+ *
+ * kdbus is free software; you can redistribute it and/or modify it under
+ * the terms of the GNU Lesser General Public License as published by the
+ * Free Software Foundation; either version 2.1 of the License, or (at
+ * your option) any later version.
+ */
+
+#include <linux/file.h>
+#include <linux/fs.h>
+#include <linux/idr.h>
+#include <linux/init.h>
+#include <linux/kdev_t.h>
+#include <linux/module.h>
+#include <linux/mutex.h>
+#include <linux/poll.h>
+#include <linux/sched.h>
+#include <linux/sizes.h>
+#include <linux/slab.h>
+#include <linux/uaccess.h>
+#include <linux/syscalls.h>
+
+#include "bus.h"
+#include "connection.h"
+#include "endpoint.h"
+#include "handle.h"
+#include "item.h"
+#include "match.h"
+#include "message.h"
+#include "metadata.h"
+#include "names.h"
+#include "domain.h"
+#include "policy.h"
+
+/**
+ * enum kdbus_handle_ep_type - type an endpoint handle can be of
+ * @KDBUS_HANDLE_EP_NONE:	New file descriptor on an endpoint
+ * @KDBUS_HANDLE_EP_CONNECTED:	A bus connection after HELLO
+ * @KDBUS_HANDLE_EP_OWNER:	File descriptor to hold an endpoint
+ */
+enum kdbus_handle_ep_type {
+	KDBUS_HANDLE_EP_NONE,
+	KDBUS_HANDLE_EP_CONNECTED,
+	KDBUS_HANDLE_EP_OWNER,
+};
+
+/**
+ * struct kdbus_handle_ep - an endpoint handle to the kdbus system
+ * @lock:		Handle lock
+ * @meta:		Cached connection creator's metadata/credentials
+ * @ep:			The endpoint for this handle
+ * @type:		Type of this handle (KDBUS_HANDLE_EP_*)
+ * @conn:		The connection this handle owns, in case @type
+ *			is KDBUS_HANDLE_EP_CONNECTED
+ * @ep_owner:		The endpoint this handle owns, in case @type
+ *			is KDBUS_HANDLE_EP_OWNER
+ * @privileged:		Flag to mark a handle as privileged
+ */
+struct kdbus_handle_ep {
+	struct mutex lock;
+	struct kdbus_meta *meta;
+	struct kdbus_ep *ep;
+
+	enum kdbus_handle_ep_type type;
+	union {
+		struct kdbus_conn *conn;
+		struct kdbus_ep *ep_owner;
+	};
+
+	bool privileged:1;
+};
+
+static int handle_ep_open(struct inode *inode, struct file *file)
+{
+	struct kdbus_handle_ep *handle;
+	struct kdbus_node *node;
+	int ret;
+
+	/* kdbusfs stores the kdbus_node in i_private */
+	node = inode->i_private;
+	if (!kdbus_node_acquire(node))
+		return -ESHUTDOWN;
+
+	handle = kzalloc(sizeof(*handle), GFP_KERNEL);
+	if (!handle) {
+		ret = -ENOMEM;
+		goto exit_node;
+	}
+
+	mutex_init(&handle->lock);
+	handle->ep = kdbus_ep_ref(kdbus_ep_from_node(node));
+	handle->type = KDBUS_HANDLE_EP_NONE;
+
+	if (ns_capable(&init_user_ns, CAP_IPC_OWNER) ||
+	    uid_eq(handle->ep->bus->node.uid, file->f_cred->fsuid))
+		handle->privileged = true;
+
+	/* cache the metadata/credentials of the creator */
+	handle->meta = kdbus_meta_new();
+	if (IS_ERR(handle->meta)) {
+		ret = PTR_ERR(handle->meta);
+		goto exit_free;
+	}
+
+	ret = kdbus_meta_append(handle->meta, handle->ep->bus->domain, NULL, 0,
+				KDBUS_ATTACH_CREDS	|
+				KDBUS_ATTACH_AUXGROUPS	|
+				KDBUS_ATTACH_TID_COMM	|
+				KDBUS_ATTACH_PID_COMM	|
+				KDBUS_ATTACH_EXE	|
+				KDBUS_ATTACH_CMDLINE	|
+				KDBUS_ATTACH_CGROUP	|
+				KDBUS_ATTACH_CAPS	|
+				KDBUS_ATTACH_SECLABEL	|
+				KDBUS_ATTACH_AUDIT);
+	if (ret < 0)
+		goto exit_free;
+
+	file->private_data = handle;
+	kdbus_node_release(node);
+
+	return 0;
+
+exit_free:
+	kdbus_meta_free(handle->meta);
+	kdbus_ep_unref(handle->ep);
+	kfree(handle);
+exit_node:
+	kdbus_node_release(node);
+	return ret;
+}
+
+static int handle_ep_release(struct inode *inode, struct file *file)
+{
+	struct kdbus_handle_ep *handle = file->private_data;
+
+	switch (handle->type) {
+	case KDBUS_HANDLE_EP_OWNER:
+		kdbus_ep_deactivate(handle->ep_owner);
+		kdbus_ep_unref(handle->ep_owner);
+		break;
+
+	case KDBUS_HANDLE_EP_CONNECTED:
+		kdbus_conn_disconnect(handle->conn, false);
+		kdbus_conn_unref(handle->conn);
+		break;
+
+	case KDBUS_HANDLE_EP_NONE:
+		/* nothing to clean up */
+		break;
+	}
+
+	kdbus_meta_free(handle->meta);
+	kdbus_ep_unref(handle->ep);
+	kfree(handle);
+
+	return 0;
+}
+
+static int handle_ep_ioctl_endpoint_make(struct kdbus_handle_ep *handle,
+					 void __user *buf)
+{
+	struct kdbus_domain_user *user;
+	struct kdbus_cmd_make *make;
+	struct kdbus_ep *ep;
+	unsigned int access;
+	const char *name;
+	int ret;
+
+	/* creating custom endpoints is a privileged operation */
+	if (!handle->privileged)
+		return -EPERM;
+
+	make = kdbus_memdup_user(buf, sizeof(*make), KDBUS_MAKE_MAX_SIZE);
+	if (IS_ERR(make))
+		return PTR_ERR(make);
+
+	ret = kdbus_negotiate_flags(make, buf, struct kdbus_cmd_make,
+				    KDBUS_MAKE_ACCESS_GROUP |
+				    KDBUS_MAKE_ACCESS_WORLD);
+	if (ret < 0)
+		goto exit;
+
+	ret = kdbus_items_validate(make->items, KDBUS_ITEMS_SIZE(make, items));
+	if (ret < 0)
+		goto exit;
+
+	name = kdbus_items_get_str(make->items, KDBUS_ITEMS_SIZE(make, items),
+				   KDBUS_ITEM_MAKE_NAME);
+	if (IS_ERR(name)) {
+		ret = PTR_ERR(name);
+		goto exit;
+	}
+
+	access = make->flags & (KDBUS_MAKE_ACCESS_WORLD |
+				KDBUS_MAKE_ACCESS_GROUP);
+
+	ep = kdbus_ep_new(handle->ep->bus, name, access, current_fsuid(),
+			  current_fsgid(), true);
+	if (IS_ERR(ep)) {
+		ret = PTR_ERR(ep);
+		goto exit;
+	}
+
+	/*
+	 * Get an anonymous user to account messages against; custom
+	 * endpoint users do not share the budget with the ordinary
+	 * users created for a UID.
+	 */
+	user = kdbus_domain_get_user(handle->ep->bus->domain, INVALID_UID);
+	if (IS_ERR(user)) {
+		ret = PTR_ERR(user);
+		goto exit_ep_unref;
+	}
+	ep->user = user;
+
+	ret = kdbus_ep_activate(ep);
+	if (ret < 0)
+		goto exit_ep_unref;
+
+	ret = kdbus_ep_policy_set(ep, make->items,
+				  KDBUS_ITEMS_SIZE(make, items));
+	if (ret < 0)
+		goto exit_ep_unref;
+
+	/* protect against parallel ioctls */
+	mutex_lock(&handle->lock);
+	if (handle->type != KDBUS_HANDLE_EP_NONE) {
+		ret = -EBADFD;
+	} else {
+		handle->type = KDBUS_HANDLE_EP_OWNER;
+		handle->ep_owner = ep;
+	}
+	mutex_unlock(&handle->lock);
+
+	if (ret < 0)
+		goto exit_ep_unref;
+
+	goto exit;
+
+exit_ep_unref:
+	kdbus_ep_deactivate(ep);
+	kdbus_ep_unref(ep);
+exit:
+	kfree(make);
+	return ret;
+}
+
+static int handle_ep_ioctl_hello(struct kdbus_handle_ep *handle,
+				 void __user *buf)
+{
+	struct kdbus_conn *conn;
+	struct kdbus_cmd_hello *hello;
+	int ret;
+
+	hello = kdbus_memdup_user(buf, sizeof(*hello), KDBUS_HELLO_MAX_SIZE);
+	if (IS_ERR(hello))
+		return PTR_ERR(hello);
+
+	ret = kdbus_negotiate_flags(hello, buf, typeof(*hello),
+				    KDBUS_HELLO_ACCEPT_FD |
+				    KDBUS_HELLO_ACTIVATOR |
+				    KDBUS_HELLO_POLICY_HOLDER |
+				    KDBUS_HELLO_MONITOR);
+	if (ret < 0)
+		goto exit;
+
+	ret = kdbus_items_validate(hello->items,
+				   KDBUS_ITEMS_SIZE(hello, items));
+	if (ret < 0)
+		goto exit;
+
+	if (!hello->pool_size || !IS_ALIGNED(hello->pool_size, PAGE_SIZE)) {
+		ret = -EFAULT;
+		goto exit;
+	}
+
+	conn = kdbus_conn_new(handle->ep, hello, handle->meta,
+			      handle->privileged);
+	if (IS_ERR(conn)) {
+		ret = PTR_ERR(conn);
+		goto exit;
+	}
+
+	if (copy_to_user(buf, hello, sizeof(*hello))) {
+		ret = -EFAULT;
+		goto exit_conn_live;
+	}
+
+	/* protect against parallel ioctls */
+	mutex_lock(&handle->lock);
+	if (handle->type != KDBUS_HANDLE_EP_NONE) {
+		ret = -EBADFD;
+	} else {
+		handle->type = KDBUS_HANDLE_EP_CONNECTED;
+		handle->conn = conn;
+	}
+	mutex_unlock(&handle->lock);
+
+	if (ret < 0)
+		goto exit_conn_live;
+
+	goto exit;
+
+exit_conn_live:
+	kdbus_conn_disconnect(conn, false);
+	kdbus_conn_unref(conn);
+exit:
+	kfree(hello);
+	return ret;
+}
+
+/* kdbus endpoint commands for handles that are still in the initial state */
+static long handle_ep_ioctl_none(struct file *file, unsigned int cmd,
+				 void __user *buf)
+{
+	struct kdbus_handle_ep *handle = file->private_data;
+	long ret;
+
+	switch (cmd) {
+	case KDBUS_CMD_ENDPOINT_MAKE:
+		ret = handle_ep_ioctl_endpoint_make(handle, buf);
+		break;
+
+	case KDBUS_CMD_HELLO: {
+		ret = handle_ep_ioctl_hello(handle, buf);
+		break;
+	}
+
+	default:
+		ret = -ENOTTY;
+		break;
+	}
+
+	return ret;
+}
+
+/* kdbus endpoint commands for connected peers */
+static long handle_ep_ioctl_connected(struct file *file, unsigned int cmd,
+				      void __user *buf)
+{
+	struct kdbus_handle_ep *handle = file->private_data;
+	struct kdbus_conn *conn = handle->conn;
+	void *free_ptr = NULL;
+	long ret = 0;
+
+	/*
+	 * BYEBYE is special; we must not acquire a connection when
+	 * calling into kdbus_conn_disconnect() or we will deadlock,
+	 * because kdbus_conn_disconnect() will wait for all acquired
+	 * references to be dropped.
+	 */
+	if (cmd == KDBUS_CMD_BYEBYE) {
+		if (!kdbus_conn_is_ordinary(conn))
+			return -EOPNOTSUPP;
+
+		return kdbus_conn_disconnect(conn, true);
+	}
+
+	ret = kdbus_conn_acquire(conn);
+	if (ret < 0)
+		return ret;
+
+	switch (cmd) {
+	case KDBUS_CMD_NAME_ACQUIRE: {
+		/* acquire a well-known name */
+		struct kdbus_cmd_name *cmd_name;
+
+		if (!kdbus_conn_is_ordinary(conn)) {
+			ret = -EOPNOTSUPP;
+			break;
+		}
+
+		cmd_name = kdbus_memdup_user(buf, sizeof(*cmd_name),
+					     sizeof(*cmd_name) +
+						KDBUS_ITEM_HEADER_SIZE +
+						KDBUS_NAME_MAX_LEN + 1);
+		if (IS_ERR(cmd_name)) {
+			ret = PTR_ERR(cmd_name);
+			break;
+		}
+
+		free_ptr = cmd_name;
+
+		ret = kdbus_negotiate_flags(cmd_name, buf, typeof(*cmd_name),
+					    KDBUS_NAME_REPLACE_EXISTING |
+					    KDBUS_NAME_ALLOW_REPLACEMENT |
+					    KDBUS_NAME_QUEUE);
+		if (ret < 0)
+			break;
+
+		ret = kdbus_items_validate(cmd_name->items,
+					   KDBUS_ITEMS_SIZE(cmd_name, items));
+		if (ret < 0)
+			break;
+
+		ret = kdbus_cmd_name_acquire(conn->ep->bus->name_registry,
+					     conn, cmd_name);
+		if (ret < 0)
+			break;
+
+		/* return flags to the caller */
+		if (copy_to_user(buf, cmd_name, cmd_name->size))
+			ret = -EFAULT;
+
+		break;
+	}
+
+	case KDBUS_CMD_NAME_RELEASE: {
+		/* release a well-known name */
+		struct kdbus_cmd_name *cmd_name;
+
+		if (!kdbus_conn_is_ordinary(conn)) {
+			ret = -EOPNOTSUPP;
+			break;
+		}
+
+		cmd_name = kdbus_memdup_user(buf, sizeof(*cmd_name),
+					     sizeof(*cmd_name) +
+						KDBUS_ITEM_HEADER_SIZE +
+						KDBUS_NAME_MAX_LEN + 1);
+		if (IS_ERR(cmd_name)) {
+			ret = PTR_ERR(cmd_name);
+			break;
+		}
+
+		free_ptr = cmd_name;
+
+		ret = kdbus_negotiate_flags(cmd_name, buf, typeof(*cmd_name),
+					    0);
+		if (ret < 0)
+			break;
+
+		ret = kdbus_items_validate(cmd_name->items,
+					   KDBUS_ITEMS_SIZE(cmd_name, items));
+		if (ret < 0)
+			break;
+
+		ret = kdbus_cmd_name_release(conn->ep->bus->name_registry,
+					     conn, cmd_name);
+		break;
+	}
+
+	case KDBUS_CMD_NAME_LIST: {
+		struct kdbus_cmd_name_list cmd_list;
+
+		/* query current IDs and names */
+		if (kdbus_copy_from_user(&cmd_list, buf, sizeof(cmd_list))) {
+			ret = -EFAULT;
+			break;
+		}
+
+		ret = kdbus_negotiate_flags(&cmd_list, buf, typeof(cmd_list),
+					    KDBUS_NAME_LIST_UNIQUE |
+					    KDBUS_NAME_LIST_NAMES |
+					    KDBUS_NAME_LIST_ACTIVATORS |
+					    KDBUS_NAME_LIST_QUEUED);
+		if (ret < 0)
+			break;
+
+		ret = kdbus_cmd_name_list(conn->ep->bus->name_registry,
+					  conn, &cmd_list);
+		if (ret < 0)
+			break;
+
+		/* return allocated data */
+		if (kdbus_offset_set_user(&cmd_list.offset, buf,
+					  struct kdbus_cmd_name_list))
+			ret = -EFAULT;
+
+		break;
+	}
+
+	case KDBUS_CMD_CONN_INFO:
+	case KDBUS_CMD_BUS_CREATOR_INFO: {
+		struct kdbus_cmd_info *cmd_info;
+
+		/* return the properties of a connection */
+		cmd_info = kdbus_memdup_user(buf, sizeof(*cmd_info),
+					     sizeof(*cmd_info) +
+						KDBUS_NAME_MAX_LEN + 1);
+		if (IS_ERR(cmd_info)) {
+			ret = PTR_ERR(cmd_info);
+			break;
+		}
+
+		free_ptr = cmd_info;
+
+		ret = kdbus_negotiate_flags(cmd_info, buf, typeof(*cmd_info),
+					    _KDBUS_ATTACH_ALL);
+		if (ret < 0)
+			break;
+
+		ret = kdbus_items_validate(cmd_info->items,
+					   KDBUS_ITEMS_SIZE(cmd_info, items));
+		if (ret < 0)
+			break;
+
+		if (cmd == KDBUS_CMD_CONN_INFO)
+			ret = kdbus_cmd_info(conn, cmd_info);
+		else
+			ret = kdbus_cmd_bus_creator_info(conn, cmd_info);
+
+		if (ret < 0)
+			break;
+
+		if (kdbus_offset_set_user(&cmd_info->offset, buf,
+					  struct kdbus_cmd_info))
+			ret = -EFAULT;
+
+		break;
+	}
+
+	case KDBUS_CMD_CONN_UPDATE: {
+		/* update the properties of a connection */
+		struct kdbus_cmd_update *cmd_update;
+
+		if (!kdbus_conn_is_ordinary(conn) &&
+		    !kdbus_conn_is_policy_holder(conn) &&
+		    !kdbus_conn_is_monitor(conn)) {
+			ret = -EOPNOTSUPP;
+			break;
+		}
+
+		cmd_update = kdbus_memdup_user(buf, sizeof(*cmd_update),
+					       KDBUS_UPDATE_MAX_SIZE);
+		if (IS_ERR(cmd_update)) {
+			ret = PTR_ERR(cmd_update);
+			break;
+		}
+
+		free_ptr = cmd_update;
+
+		ret = kdbus_negotiate_flags(cmd_update, buf,
+					    typeof(*cmd_update), 0);
+		if (ret < 0)
+			break;
+
+		ret = kdbus_items_validate(cmd_update->items,
+					   KDBUS_ITEMS_SIZE(cmd_update, items));
+		if (ret < 0)
+			break;
+
+		ret = kdbus_cmd_conn_update(conn, cmd_update);
+		break;
+	}
+
+	case KDBUS_CMD_MATCH_ADD: {
+		/* subscribe to/filter for broadcast messages */
+		struct kdbus_cmd_match *cmd_match;
+
+		if (!kdbus_conn_is_ordinary(conn) &&
+		    !kdbus_conn_is_monitor(conn)) {
+			ret = -EOPNOTSUPP;
+			break;
+		}
+
+		cmd_match = kdbus_memdup_user(buf, sizeof(*cmd_match),
+					      KDBUS_MATCH_MAX_SIZE);
+		if (IS_ERR(cmd_match)) {
+			ret = PTR_ERR(cmd_match);
+			break;
+		}
+
+		free_ptr = cmd_match;
+
+		ret = kdbus_negotiate_flags(cmd_match, buf, typeof(*cmd_match),
+					    KDBUS_MATCH_REPLACE);
+		if (ret < 0)
+			break;
+
+		ret = kdbus_items_validate(cmd_match->items,
+					   KDBUS_ITEMS_SIZE(cmd_match, items));
+		if (ret < 0)
+			break;
+
+		ret = kdbus_match_db_add(conn, cmd_match);
+		break;
+	}
+
+	case KDBUS_CMD_MATCH_REMOVE: {
+		/* unsubscribe from broadcast messages */
+		struct kdbus_cmd_match *cmd_match;
+
+		if (!kdbus_conn_is_ordinary(conn) &&
+		    !kdbus_conn_is_monitor(conn)) {
+			ret = -EOPNOTSUPP;
+			break;
+		}
+
+		cmd_match = kdbus_memdup_user(buf, sizeof(*cmd_match),
+					      sizeof(*cmd_match));
+		if (IS_ERR(cmd_match)) {
+			ret = PTR_ERR(cmd_match);
+			break;
+		}
+
+		free_ptr = cmd_match;
+
+		ret = kdbus_negotiate_flags(cmd_match, buf, typeof(*cmd_match),
+					    0);
+		if (ret < 0)
+			break;
+
+		ret = kdbus_items_validate(cmd_match->items,
+					   KDBUS_ITEMS_SIZE(cmd_match, items));
+		if (ret < 0)
+			break;
+
+		ret = kdbus_match_db_remove(conn, cmd_match);
+		break;
+	}
+
+	case KDBUS_CMD_MSG_SEND: {
+		/* submit a message which will be queued in the receiver */
+		struct kdbus_kmsg *kmsg = NULL;
+
+		if (!kdbus_conn_is_ordinary(conn)) {
+			ret = -EOPNOTSUPP;
+			break;
+		}
+
+		kmsg = kdbus_kmsg_new_from_user(conn, buf);
+		if (IS_ERR(kmsg)) {
+			ret = PTR_ERR(kmsg);
+			break;
+		}
+
+		ret = kdbus_conn_kmsg_send(conn->ep, conn, kmsg);
+		if (ret < 0) {
+			kdbus_kmsg_free(kmsg);
+			break;
+		}
+
+		/* store the offset of the reply back to userspace */
+		if (kmsg->msg.flags & KDBUS_MSG_FLAGS_SYNC_REPLY) {
+			struct kdbus_msg __user *msg = buf;
+
+			if (copy_to_user(&msg->offset_reply,
+					 &kmsg->msg.offset_reply,
+					 sizeof(msg->offset_reply)))
+				ret = -EFAULT;
+		}
+
+		kdbus_kmsg_free(kmsg);
+		break;
+	}
+
+	case KDBUS_CMD_MSG_RECV: {
+		struct kdbus_cmd_recv cmd_recv;
+
+		if (!kdbus_conn_is_ordinary(conn) &&
+		    !kdbus_conn_is_monitor(conn)) {
+			ret = -EOPNOTSUPP;
+			break;
+		}
+
+		ret = kdbus_copy_from_user(&cmd_recv, buf, sizeof(cmd_recv));
+		if (ret < 0)
+			break;
+
+		ret = kdbus_negotiate_flags(&cmd_recv, buf, typeof(cmd_recv),
+					    KDBUS_RECV_PEEK | KDBUS_RECV_DROP |
+					    KDBUS_RECV_USE_PRIORITY);
+		if (ret < 0)
+			break;
+
+		ret = kdbus_cmd_msg_recv(conn, &cmd_recv);
+		if (ret < 0)
+			break;
+
+		/* return the address of the next message in the pool */
+		if (kdbus_offset_set_user(&cmd_recv.offset, buf,
+					  struct kdbus_cmd_recv))
+			ret = -EFAULT;
+
+		break;
+	}
+
+	case KDBUS_CMD_MSG_CANCEL: {
+		struct kdbus_cmd_cancel cmd_cancel;
+
+		if (!kdbus_conn_is_ordinary(conn)) {
+			ret = -EOPNOTSUPP;
+			break;
+		}
+
+		/* cancel sync message send requests by cookie */
+		ret = kdbus_copy_from_user(&cmd_cancel, buf,
+					   sizeof(cmd_cancel));
+		if (ret < 0)
+			break;
+
+		if (cmd_cancel.flags != 0)
+			ret = -EOPNOTSUPP;
+		else
+			ret = kdbus_cmd_msg_cancel(conn, cmd_cancel.cookie);
+		break;
+	}
+
+	case KDBUS_CMD_FREE: {
+		struct kdbus_cmd_free cmd_free;
+
+		if (!kdbus_conn_is_ordinary(conn) &&
+		    !kdbus_conn_is_monitor(conn)) {
+			ret = -EOPNOTSUPP;
+			break;
+		}
+
+		/* free the memory used in the receiver's pool */
+		ret = kdbus_copy_from_user(&cmd_free, buf, sizeof(cmd_free));
+		if (ret < 0)
+			break;
+
+		ret = kdbus_negotiate_flags(&cmd_free, buf, typeof(cmd_free),
+					    0);
+		if (ret < 0)
+			break;
+
+		ret = kdbus_pool_release_offset(conn->pool, cmd_free.offset);
+		break;
+	}
+
+	default:
+		ret = -ENOTTY;
+		break;
+	}
+
+	kdbus_conn_release(conn);
+	kfree(free_ptr);
+	return ret;
+}
+
+/* kdbus endpoint commands for endpoint owners */
+static long handle_ep_ioctl_owner(struct file *file, unsigned int cmd,
+				  void __user *buf)
+{
+	struct kdbus_handle_ep *handle = file->private_data;
+	struct kdbus_ep *ep = handle->ep_owner;
+	void *free_ptr = NULL;
+	long ret = 0;
+
+	switch (cmd) {
+	case KDBUS_CMD_ENDPOINT_UPDATE: {
+		struct kdbus_cmd_update *cmd_update;
+
+		/* update the properties of a custom endpoint */
+		cmd_update = kdbus_memdup_user(buf, sizeof(*cmd_update),
+					       KDBUS_UPDATE_MAX_SIZE);
+		if (IS_ERR(cmd_update)) {
+			ret = PTR_ERR(cmd_update);
+			break;
+		}
+
+		free_ptr = cmd_update;
+
+		ret = kdbus_negotiate_flags(cmd_update, buf,
+					    typeof(*cmd_update), 0);
+		if (ret < 0)
+			break;
+
+		ret = kdbus_items_validate(cmd_update->items,
+					   KDBUS_ITEMS_SIZE(cmd_update, items));
+		if (ret < 0)
+			break;
+
+		ret = kdbus_ep_policy_set(ep, cmd_update->items,
+					  KDBUS_ITEMS_SIZE(cmd_update, items));
+		break;
+	}
+
+	default:
+		ret = -ENOTTY;
+		break;
+	}
+
+	kfree(free_ptr);
+	return ret;
+}
+
+static long handle_ep_ioctl(struct file *file, unsigned int cmd,
+			    unsigned long arg)
+{
+	struct kdbus_handle_ep *handle = file->private_data;
+	void __user *argp = (void __user *)arg;
+	enum kdbus_handle_ep_type type;
+
+	/* lock while accessing handle->type to enforce barriers */
+	mutex_lock(&handle->lock);
+	type = handle->type;
+	mutex_unlock(&handle->lock);
+
+	switch (type) {
+	case KDBUS_HANDLE_EP_NONE:
+		return handle_ep_ioctl_none(file, cmd, argp);
+
+	case KDBUS_HANDLE_EP_CONNECTED:
+		return handle_ep_ioctl_connected(file, cmd, argp);
+
+	case KDBUS_HANDLE_EP_OWNER:
+		return handle_ep_ioctl_owner(file, cmd, argp);
+
+	default:
+		return -EBADFD;
+	}
+}
+
+static unsigned int handle_ep_poll(struct file *file,
+				   struct poll_table_struct *wait)
+{
+	struct kdbus_handle_ep *handle = file->private_data;
+	unsigned int mask = POLLOUT | POLLWRNORM;
+	int ret;
+
+	/* Only a connected endpoint can read/write data */
+	mutex_lock(&handle->lock);
+	if (handle->type != KDBUS_HANDLE_EP_CONNECTED) {
+		mutex_unlock(&handle->lock);
+		return POLLERR | POLLHUP;
+	}
+	mutex_unlock(&handle->lock);
+
+	ret = kdbus_conn_acquire(handle->conn);
+	if (ret < 0)
+		return POLLERR | POLLHUP;
+
+	poll_wait(file, &handle->conn->wait, wait);
+
+	if (!list_empty(&handle->conn->queue.msg_list))
+		mask |= POLLIN | POLLRDNORM;
+
+	kdbus_conn_release(handle->conn);
+
+	return mask;
+}
+
+static int handle_ep_mmap(struct file *file, struct vm_area_struct *vma)
+{
+	struct kdbus_handle_ep *handle = file->private_data;
+
+	mutex_lock(&handle->lock);
+	if (handle->type != KDBUS_HANDLE_EP_CONNECTED) {
+		mutex_unlock(&handle->lock);
+		return -EPERM;
+	}
+	mutex_unlock(&handle->lock);
+
+	return kdbus_pool_mmap(handle->conn->pool, vma);
+}
+
+const struct file_operations kdbus_handle_ep_ops = {
+	.owner =		THIS_MODULE,
+	.open =			handle_ep_open,
+	.release =		handle_ep_release,
+	.poll =			handle_ep_poll,
+	.llseek =		noop_llseek,
+	.unlocked_ioctl =	handle_ep_ioctl,
+	.mmap =			handle_ep_mmap,
+#ifdef CONFIG_COMPAT
+	.compat_ioctl =		handle_ep_ioctl,
+#endif
+};
+
+static int handle_control_open(struct inode *inode, struct file *file)
+{
+	if (!kdbus_node_is_active(inode->i_private))
+		return -ESHUTDOWN;
+
+	/* private_data is used by BUS_MAKE to store the new bus */
+	file->private_data = NULL;
+
+	return 0;
+}
+
+static int handle_control_release(struct inode *inode, struct file *file)
+{
+	struct kdbus_bus *bus = file->private_data;
+
+	if (bus) {
+		kdbus_bus_deactivate(bus);
+		kdbus_bus_unref(bus);
+	}
+
+	return 0;
+}
+
+static int handle_control_ioctl_bus_make(struct file *file,
+					 struct kdbus_domain *domain,
+					 void __user *buf)
+{
+	struct kdbus_cmd_make *make;
+	struct kdbus_bus *bus;
+	int ret;
+
+	/* catch double BUS_MAKE early, locked test is below */
+	if (file->private_data)
+		return -EBADFD;
+
+	make = kdbus_memdup_user(buf, sizeof(*make), KDBUS_MAKE_MAX_SIZE);
+	if (IS_ERR(make))
+		return PTR_ERR(make);
+
+	ret = kdbus_negotiate_flags(make, buf, struct kdbus_cmd_make,
+				    KDBUS_MAKE_ACCESS_GROUP |
+				    KDBUS_MAKE_ACCESS_WORLD);
+	if (ret < 0)
+		goto exit;
+
+	ret = kdbus_items_validate(make->items, KDBUS_ITEMS_SIZE(make, items));
+	if (ret < 0)
+		goto exit;
+
+	bus = kdbus_bus_new(domain, make, current_fsuid(), current_fsgid());
+	if (IS_ERR(bus)) {
+		ret = PTR_ERR(bus);
+		goto exit;
+	}
+
+	ret = kdbus_bus_activate(bus);
+	if (ret < 0)
+		goto exit_bus_unref;
+
+	/* protect against parallel ioctls */
+	mutex_lock(&domain->lock);
+	if (file->private_data)
+		ret = -EBADFD;
+	else
+		file->private_data = bus;
+	mutex_unlock(&domain->lock);
+
+	if (ret < 0)
+		goto exit_bus_unref;
+
+	goto exit;
+
+exit_bus_unref:
+	kdbus_bus_deactivate(bus);
+	kdbus_bus_unref(bus);
+exit:
+	kfree(make);
+	return ret;
+}
+
+static long handle_control_ioctl(struct file *file, unsigned int cmd,
+				 unsigned long arg)
+{
+	struct kdbus_node *node = file_inode(file)->i_private;
+	struct kdbus_domain *domain;
+	int ret = 0;
+
+	/*
+	 * The parent of control-nodes is always a domain, make sure to pin it
+	 * so the parent is actually valid.
+	 */
+	if (!kdbus_node_acquire(node))
+		return -ESHUTDOWN;
+
+	domain = kdbus_domain_from_node(node->parent);
+	if (!kdbus_node_acquire(&domain->node)) {
+		kdbus_node_release(node);
+		return -ESHUTDOWN;
+	}
+
+	switch (cmd) {
+	case KDBUS_CMD_BUS_MAKE:
+		ret = handle_control_ioctl_bus_make(file, domain,
+						    (void __user *)arg);
+		break;
+
+	default:
+		ret = -ENOTTY;
+		break;
+	}
+
+	kdbus_node_release(&domain->node);
+	kdbus_node_release(node);
+	return ret;
+}
+
+const struct file_operations kdbus_handle_control_ops = {
+	.open =			handle_control_open,
+	.release =		handle_control_release,
+	.llseek =		noop_llseek,
+	.unlocked_ioctl =	handle_control_ioctl,
+#ifdef CONFIG_COMPAT
+	.compat_ioctl =		handle_control_ioctl,
+#endif
+};
diff --git a/ipc/kdbus/handle.h b/ipc/kdbus/handle.h
new file mode 100644
index 000000000000..32809dad3720
--- /dev/null
+++ b/ipc/kdbus/handle.h
@@ -0,0 +1,20 @@
+/*
+ * Copyright (C) 2013-2014 Kay Sievers
+ * Copyright (C) 2013-2014 Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+ * Copyright (C) 2013-2014 Daniel Mack <daniel@zonque.org>
+ * Copyright (C) 2013-2014 David Herrmann <dh.herrmann@gmail.com>
+ * Copyright (C) 2013-2014 Linux Foundation
+ *
+ * kdbus is free software; you can redistribute it and/or modify it under
+ * the terms of the GNU Lesser General Public License as published by the
+ * Free Software Foundation; either version 2.1 of the License, or (at
+ * your option) any later version.
+ */
+
+#ifndef __KDBUS_HANDLE_H
+#define __KDBUS_HANDLE_H
+
+extern const struct file_operations kdbus_handle_ep_ops;
+extern const struct file_operations kdbus_handle_control_ops;
+
+#endif
diff --git a/ipc/kdbus/limits.h b/ipc/kdbus/limits.h
new file mode 100644
index 000000000000..29cf30fcce07
--- /dev/null
+++ b/ipc/kdbus/limits.h
@@ -0,0 +1,77 @@
+/*
+ * Copyright (C) 2013-2014 Kay Sievers
+ * Copyright (C) 2013-2014 Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+ * Copyright (C) 2013-2014 Daniel Mack <daniel@zonque.org>
+ * Copyright (C) 2013-2014 David Herrmann <dh.herrmann@gmail.com>
+ * Copyright (C) 2013-2014 Linux Foundation
+ *
+ * kdbus is free software; you can redistribute it and/or modify it under
+ * the terms of the GNU Lesser General Public License as published by the
+ * Free Software Foundation; either version 2.1 of the License, or (at
+ * your option) any later version.
+ */
+
+#ifndef __KDBUS_DEFAULTS_H
+#define __KDBUS_DEFAULTS_H
+
+/* maximum size of message header and items */
+#define KDBUS_MSG_MAX_SIZE		SZ_8K
+
+/* maximum number of message items */
+#define KDBUS_MSG_MAX_ITEMS		128
+
+/*
+ * Maximum number of passed file descriptors
+ * Number taken from AF_UNIX upper limits
+ */
+#define KDBUS_MSG_MAX_FDS		253
+
+/* maximum message payload size */
+#define KDBUS_MSG_MAX_PAYLOAD_VEC_SIZE		SZ_2M
+
+/* maximum size of bloom bit field in bytes */
+#define KDBUS_BUS_BLOOM_MAX_SIZE		SZ_4K
+
+/* maximum length of well-known bus name */
+#define KDBUS_NAME_MAX_LEN			255
+
+/* maximum length of bus, domain, ep name */
+#define KDBUS_SYSNAME_MAX_LEN			63
+
+/* maximum size of make data */
+#define KDBUS_MAKE_MAX_SIZE			SZ_32K
+
+/* maximum size of hello data */
+#define KDBUS_HELLO_MAX_SIZE			SZ_32K
+
+/* maximum size for update commands */
+#define KDBUS_UPDATE_MAX_SIZE			SZ_32K
+
+/* maximum number of matches per connection */
+#define KDBUS_MATCH_MAX				256
+
+/* maximum size of match data */
+#define KDBUS_MATCH_MAX_SIZE			SZ_32K
+
+/* maximum size of policy data */
+#define KDBUS_POLICY_MAX_SIZE			SZ_32K
+
+/* maximum number of queued messages in a connection */
+#define KDBUS_CONN_MAX_MSGS			256
+
+/* maximum number of queued messages from the same individual user */
+#define KDBUS_CONN_MAX_MSGS_PER_USER		16
+
+/* maximum number of well-known names per connection */
+#define KDBUS_CONN_MAX_NAMES			64
+
+/* maximum number of queued requests waiting for a reply */
+#define KDBUS_CONN_MAX_REQUESTS_PENDING		128
+
+/* maximum number of connections per user in one domain */
+#define KDBUS_USER_MAX_CONN			256
+
+/* maximum number of buses per user in one domain */
+#define KDBUS_USER_MAX_BUSES			16
+
+#endif
diff --git a/ipc/kdbus/main.c b/ipc/kdbus/main.c
new file mode 100644
index 000000000000..11ddabd30311
--- /dev/null
+++ b/ipc/kdbus/main.c
@@ -0,0 +1,59 @@
+/*
+ * Copyright (C) 2013-2014 Kay Sievers
+ * Copyright (C) 2013-2014 Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+ * Copyright (C) 2013-2014 Daniel Mack <daniel@zonque.org>
+ * Copyright (C) 2013-2014 David Herrmann <dh.herrmann@gmail.com>
+ * Copyright (C) 2013-2014 Linux Foundation
+ *
+ * kdbus is free software; you can redistribute it and/or modify it under
+ * the terms of the GNU Lesser General Public License as published by the
+ * Free Software Foundation; either version 2.1 of the License, or (at
+ * your option) any later version.
+ */
+
+#define pr_fmt(fmt)    KBUILD_MODNAME ": " fmt
+#include <linux/fs.h>
+#include <linux/init.h>
+#include <linux/module.h>
+
+#include "util.h"
+#include "fs.h"
+#include "handle.h"
+#include "node.h"
+
+/* kdbus mount-point /sys/fs/kdbus */
+static struct kobject *kdbus_dir;
+
+static int __init kdbus_init(void)
+{
+	int ret;
+
+	kdbus_dir = kobject_create_and_add(KBUILD_MODNAME, fs_kobj);
+	if (!kdbus_dir)
+		return -ENOMEM;
+
+	ret = kdbus_fs_init();
+	if (ret < 0) {
+		pr_err("cannot register filesystem: %d\n", ret);
+		goto exit_dir;
+	}
+
+	pr_info("initialized\n");
+	return 0;
+
+exit_dir:
+	kobject_put(kdbus_dir);
+	return ret;
+}
+
+static void __exit kdbus_exit(void)
+{
+	kdbus_fs_exit();
+	kobject_put(kdbus_dir);
+}
+
+module_init(kdbus_init);
+module_exit(kdbus_exit);
+MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("D-Bus, powerful, easy to use interprocess communication");
+MODULE_ALIAS_FS(KBUILD_MODNAME "fs");
diff --git a/ipc/kdbus/util.c b/ipc/kdbus/util.c
new file mode 100644
index 000000000000..7d138be2ea2f
--- /dev/null
+++ b/ipc/kdbus/util.c
@@ -0,0 +1,166 @@
+/*
+ * Copyright (C) 2013-2014 Kay Sievers
+ * Copyright (C) 2013-2014 Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+ * Copyright (C) 2013-2014 Daniel Mack <daniel@zonque.org>
+ * Copyright (C) 2013-2014 David Herrmann <dh.herrmann@gmail.com>
+ * Copyright (C) 2013-2014 Linux Foundation
+ *
+ * kdbus is free software; you can redistribute it and/or modify it under
+ * the terms of the GNU Lesser General Public License as published by the
+ * Free Software Foundation; either version 2.1 of the License, or (at
+ * your option) any later version.
+ */
+
+#include <linux/ctype.h>
+#include <linux/file.h>
+#include <linux/string.h>
+#include <linux/uaccess.h>
+
+#include "limits.h"
+#include "util.h"
+
+/**
+ * kdbus_sysname_is_valid() - validate names showing up in /proc, /sys and /dev
+ * @name:		Name of domain, bus, endpoint
+ *
+ * Return: 0 if the given name is valid, otherwise negative errno
+ */
+int kdbus_sysname_is_valid(const char *name)
+{
+	unsigned int i;
+	size_t len;
+
+	len = strlen(name);
+	if (len == 0)
+		return -EINVAL;
+
+	for (i = 0; i < len; i++) {
+		if (isalpha(name[i]))
+			continue;
+		if (isdigit(name[i]))
+			continue;
+		if (name[i] == '_')
+			continue;
+		if (i > 0 && i + 1 < len && strchr("-.", name[i]))
+			continue;
+
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+/**
+ * kdbus_check_and_write_flags() - check flags provided by user, and write the
+ *				   valid mask back
+ * @flags:	The flags mask provided by userspace
+ * @buf:	The buffer provided by userspace
+ * @offset_out:	Offset of the kernel_flags field inside the user-provided struct
+ * @valid:	Mask of valid bits
+ *
+ * This function checks whether the flags provided by userspace are within
+ * the set of bits allowed by the kernel. The mask of valid bits, with the
+ * KDBUS_FLAG_KERNEL bit set, is written back to the user-provided buffer.
+ *
+ * Return: 0 on success, -EFAULT if copy_to_user() failed, or -EINVAL if
+ * userspace submitted invalid bits in its mask.
+ */
+int kdbus_check_and_write_flags(u64 flags, void __user *buf,
+			  off_t offset_out, u64 valid)
+{
+	u64 val = valid | KDBUS_FLAG_KERNEL;
+
+	/*
+	 * KDBUS_FLAG_KERNEL is reserved and will never be considered
+	 * valid by any user of this function.
+	 */
+	WARN_ON_ONCE(valid & KDBUS_FLAG_KERNEL);
+
+	if (copy_to_user(((u8 __user *) buf) + offset_out, &val, sizeof(val)))
+		return -EFAULT;
+
+	if (flags & ~valid)
+		return -EINVAL;
+
+	return 0;
+}
+
+/**
+ * kdbus_fput_files() - fput() an array of struct files
+ * @files:	The array of files to put, may be NULL
+ * @count:	The number of elements in @files
+ *
+ * Call fput() on all non-NULL elements in @files, and set the entries to
+ * NULL afterwards.
+ */
+void kdbus_fput_files(struct file **files, unsigned int count)
+{
+	int i;
+
+	if (!files)
+		return;
+
+	for (i = count - 1; i >= 0; i--)
+		if (files[i]) {
+			fput(files[i]);
+			files[i] = NULL;
+		}
+}
+
+/**
+ * kdbus_copy_from_user() - copy aligned data from user-space
+ * @dest:	target buffer in kernel memory
+ * @user_ptr:	user-provided source buffer
+ * @size:	memory size to copy from user
+ *
+ * This copies @size bytes from @user_ptr into the kernel, just like
+ * copy_from_user() does. But we enforce an 8-byte alignment and reject any
+ * unaligned user-space pointers.
+ *
+ * Return: 0 on success, negative error code on failure.
+ */
+int kdbus_copy_from_user(void *dest, void __user *user_ptr, size_t size)
+{
+	if (!KDBUS_IS_ALIGNED8((uintptr_t)user_ptr))
+		return -EFAULT;
+
+	if (copy_from_user(dest, user_ptr, size))
+		return -EFAULT;
+
+	return 0;
+}
+
+/**
+ * kdbus_memdup_user() - copy dynamically sized object from user-space
+ * @user_ptr:	user-provided source buffer
+ * @sz_min:	minimum object size
+ * @sz_max:	maximum object size
+ *
+ * This copies a dynamically sized object from user-space into kernel-space. We
+ * require the object to have a 64bit size field at offset 0. We read it out
+ * first, allocate a suitably sized buffer and then copy all data.
+ *
+ * The @sz_min and @sz_max parameters define possible min and max object sizes
+ * so user-space cannot trigger un-bound kernel-space allocations.
+ *
+ * The same alignment-restrictions as described in kdbus_copy_from_user() apply.
+ *
+ * Return: pointer to dynamically allocated copy, or ERR_PTR() on failure.
+ */
+void *kdbus_memdup_user(void __user *user_ptr, size_t sz_min, size_t sz_max)
+{
+	u64 size;
+	int ret;
+
+	ret = kdbus_copy_from_user(&size, user_ptr, sizeof(size));
+	if (ret < 0)
+		return ERR_PTR(ret);
+
+	if (size < sz_min)
+		return ERR_PTR(-EINVAL);
+
+	if (size > sz_max)
+		return ERR_PTR(-EMSGSIZE);
+
+	return memdup_user(user_ptr, size);
+}
diff --git a/ipc/kdbus/util.h b/ipc/kdbus/util.h
new file mode 100644
index 000000000000..e727a2134d0c
--- /dev/null
+++ b/ipc/kdbus/util.h
@@ -0,0 +1,103 @@
+/*
+ * Copyright (C) 2013-2014 Kay Sievers
+ * Copyright (C) 2013-2014 Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+ * Copyright (C) 2013-2014 Daniel Mack <daniel@zonque.org>
+ * Copyright (C) 2013-2014 David Herrmann <dh.herrmann@gmail.com>
+ * Copyright (C) 2013-2014 Linux Foundation
+ *
+ * kdbus is free software; you can redistribute it and/or modify it under
+ * the terms of the GNU Lesser General Public License as published by the
+ * Free Software Foundation; either version 2.1 of the License, or (at
+ * your option) any later version.
+ */
+
+#ifndef __KDBUS_UTIL_H
+#define __KDBUS_UTIL_H
+
+#include <linux/dcache.h>
+#include <linux/ioctl.h>
+
+#include "kdbus.h"
+
+/* all exported addresses are 64 bit */
+#define KDBUS_PTR(addr) ((void __user *)(uintptr_t)(addr))
+
+/* all exported sizes are 64 bit and data aligned to 64 bit */
+#define KDBUS_ALIGN8(s) ALIGN((s), 8)
+#define KDBUS_IS_ALIGNED8(s) (IS_ALIGNED(s, 8))
+
+/**
+ * kdbus_size_get_user - read the size variable from user memory
+ * @_s:			Size variable
+ * @_b:			Buffer to read from
+ * @_t:			Structure, "size" is a member of
+ *
+ * Return: the result of copy_from_user()
+ */
+#define kdbus_size_get_user(_s, _b, _t)					\
+({									\
+	u64 __user *_sz =						\
+		(void __user *)((u8 __user *)(_b) + offsetof(_t, size));\
+	copy_from_user(_s, _sz, sizeof(__u64));				\
+})
+
+/**
+ * kdbus_offset_set_user - write the offset variable to user memory
+ * @_s:			Offset variable
+ * @_b:			Buffer to write to
+ * @_t:			Structure, "offset" is a member of
+ *
+ * Return: the result of copy_to_user()
+ */
+#define kdbus_offset_set_user(_s, _b, _t)				\
+({									\
+	u64 __user *_sz =						\
+		(void __user *)((u8 __user *)(_b) + offsetof(_t, offset)); \
+	copy_to_user(_sz, _s, sizeof(__u64));				\
+})
+
+/**
+ * kdbus_str_hash - calculate a hash
+ * @str:		String
+ *
+ * Return: hash value
+ */
+static inline unsigned int kdbus_str_hash(const char *str)
+{
+	unsigned long hash = init_name_hash();
+
+	while (*str)
+		hash = partial_name_hash(*str++, hash);
+
+	return end_name_hash(hash);
+}
+
+/**
+ * kdbus_str_valid - verify a string
+ * @str:		String to verify
+ * @size:		Size of buffer of string (including 0-byte)
+ *
+ * This verifies the string at position @str with size @size is properly
+ * zero-terminated and contains no 0-byte except at the end.
+ *
+ * Return: true if string is valid, false if not.
+ */
+static inline bool kdbus_str_valid(const char *str, size_t size)
+{
+	return size > 0 && memchr(str, '\0', size) == str + size - 1;
+}
+
+int kdbus_sysname_is_valid(const char *name);
+void kdbus_fput_files(struct file **files, unsigned int count);
+
+int kdbus_copy_from_user(void *dest, void __user *user_ptr, size_t size);
+void *kdbus_memdup_user(void __user *user_ptr, size_t sz_min, size_t sz_max);
+
+int kdbus_check_and_write_flags(u64 flags, void __user *buf,
+				off_t offset_out, u64 valid);
+
+#define kdbus_negotiate_flags(_s, _b, _t, _v)				\
+	kdbus_check_and_write_flags((_s)->flags, _b,			\
+				    offsetof(_t, kernel_flags), _v)
+
+#endif
-- 
2.1.3
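
The flag negotiation done by kdbus_check_and_write_flags() in util.c above can be modeled outside the kernel. What follows is a minimal userspace sketch, not the kernel code: struct demo_cmd, demo_negotiate_flags() and the DEMO_* constants are hypothetical names, memcpy() stands in for copy_to_user(), and the position of the reserved kernel bit is an assumption made for illustration only.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins; the real values live in kdbus.h. */
#define DEMO_FLAG_KERNEL	(1ULL << 63)	/* assumed reserved bit */
#define DEMO_FLAG_A		(1ULL << 0)
#define DEMO_FLAG_B		(1ULL << 1)

/* Hypothetical command layout: size first, then flags and kernel_flags. */
struct demo_cmd {
	uint64_t size;
	uint64_t flags;
	uint64_t kernel_flags;
};

/*
 * Write the mask of supported bits (plus the reserved kernel bit) into the
 * caller's buffer at @offset_out, then reject unsupported request bits.
 * The mask is written even when the call fails, so the caller can find out
 * which bits are supported.
 */
static int demo_negotiate_flags(uint64_t flags, void *buf,
				size_t offset_out, uint64_t valid)
{
	uint64_t val = valid | DEMO_FLAG_KERNEL;

	memcpy((uint8_t *)buf + offset_out, &val, sizeof(val));

	if (flags & ~valid)
		return -EINVAL;

	return 0;
}
```

A caller that gets -EINVAL back can inspect kernel_flags in its own buffer, drop the bits that are not set there, and retry.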


* kdbus: add connection pool implementation
@ 2014-11-21  5:02   ` Greg Kroah-Hartman
From: Greg Kroah-Hartman @ 2014-11-21  5:02 UTC (permalink / raw)
  To: arnd, ebiederm, gnomes, teg, jkosina, luto, linux-api, linux-kernel
  Cc: daniel, dh.herrmann, tixxdz, Greg Kroah-Hartman

From: Daniel Mack <daniel@zonque.org>

A pool for data received from the kernel is installed for every
connection of the bus, and it is used to copy data from the kernel to
userspace clients, for messages and other information.

It is accessed when one of the following ioctls is issued:

  * KDBUS_CMD_MSG_RECV, to receive a message
  * KDBUS_CMD_NAME_LIST, to dump the name registry
  * KDBUS_CMD_CONN_INFO, to retrieve information on a connection

The offsets returned by any of these ioctls are relative to the start of
the pool. Internally, the pool is organized in slices that are allocated
dynamically on demand. The overall size of the pool is chosen by the
connection when it connects to the bus with KDBUS_CMD_HELLO.

Once the receiver is done with a slice, it has to call KDBUS_CMD_FREE on
the slice's offset to make the memory available for subsequent calls.
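
The slice life cycle described here (split a free slice on allocation, give it back by offset, merge it with free neighbors) can be illustrated with a toy model. This is a sketch under stated assumptions, not the code in pool.c: all demo_* names are hypothetical, and a plain offset-sorted list with a linear best-fit scan replaces the red-black trees the patch uses.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical toy model; names and layout are not taken from pool.c. */
struct demo_slice {
	size_t off;
	size_t size;
	bool free;
	struct demo_slice *prev, *next;	/* neighbors, sorted by offset */
};

struct demo_pool {
	size_t size;
	struct demo_slice *first;
};

static struct demo_slice *demo_slice_new(size_t off, size_t size)
{
	struct demo_slice *s = calloc(1, sizeof(*s));

	if (s) {
		s->off = off;
		s->size = size;
		s->free = true;
	}
	return s;
}

/* A new pool starts as one free slice covering the whole buffer. */
static int demo_pool_init(struct demo_pool *pool, size_t size)
{
	pool->size = size;
	pool->first = demo_slice_new(0, size);
	return pool->first ? 0 : -1;
}

/* Best fit: take the smallest free slice that fits, split off the rest. */
static struct demo_slice *demo_pool_alloc(struct demo_pool *pool, size_t size)
{
	struct demo_slice *s, *best = NULL, *rest;

	if (size == 0)
		return NULL;

	for (s = pool->first; s; s = s->next)
		if (s->free && s->size >= size && (!best || s->size < best->size))
			best = s;
	if (!best)
		return NULL;

	if (best->size > size) {
		rest = demo_slice_new(best->off + size, best->size - size);
		if (!rest)
			return NULL;
		rest->prev = best;
		rest->next = best->next;
		if (best->next)
			best->next->prev = rest;
		best->next = rest;
		best->size = size;
	}
	best->free = false;
	return best;
}

/* Merge @s with its right-hand neighbor and drop the neighbor. */
static void demo_merge_next(struct demo_slice *s)
{
	struct demo_slice *n = s->next;

	s->size += n->size;
	s->next = n->next;
	if (n->next)
		n->next->prev = s;
	free(n);
}

/* Free the busy slice at @off and coalesce it with free neighbors. */
static int demo_pool_free(struct demo_pool *pool, size_t off)
{
	struct demo_slice *s;

	for (s = pool->first; s; s = s->next)
		if (!s->free && s->off == off)
			break;
	if (!s)
		return -1;

	s->free = true;
	if (s->next && s->next->free)
		demo_merge_next(s);
	if (s->prev && s->prev->free)
		demo_merge_next(s->prev);
	return 0;
}
```

In this model the slices always cover the whole pool, so freeing every busy slice collapses the list back into a single free slice.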

To access the memory, the caller is expected to mmap() it to its task.
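
On the receiving side, an offset has to be turned into a pointer inside the mapping before the data can be read. The sketch below shows one way to do that with a bounds check. It is an illustration only: the demo_* names are hypothetical, and anonymous memory stands in for the pool, where a real client would mmap() its connection file descriptor instead.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Hypothetical view of a mapped pool; not a kdbus structure. */
struct demo_pool_map {
	uint8_t *base;	/* address returned by mmap() */
	size_t size;	/* pool size chosen at connect time */
};

/*
 * Map @size bytes to stand in for a connection pool. Anonymous memory is
 * an assumption of this sketch; it keeps the example runnable without kdbus.
 */
static int demo_pool_map(struct demo_pool_map *map, size_t size)
{
	void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (p == MAP_FAILED)
		return -1;

	map->base = p;
	map->size = size;
	return 0;
}

/* Turn a pool offset into a pointer, rejecting out-of-range accesses. */
static const void *demo_pool_at(const struct demo_pool_map *map,
				uint64_t offset, size_t len)
{
	if (offset > map->size || len > map->size - offset)
		return NULL;

	return map->base + offset;
}

static void demo_pool_unmap(struct demo_pool_map *map)
{
	munmap(map->base, map->size);
	map->base = NULL;
	map->size = 0;
}
```

Checking the offset first and then comparing the length against the remaining space avoids an overflow in offset + len.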

Signed-off-by: Daniel Mack <daniel@zonque.org>
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Signed-off-by: Djalal Harouni <tixxdz@opendz.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 ipc/kdbus/pool.c | 722 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 ipc/kdbus/pool.h |  44 ++++
 2 files changed, 766 insertions(+)
 create mode 100644 ipc/kdbus/pool.c
 create mode 100644 ipc/kdbus/pool.h

diff --git a/ipc/kdbus/pool.c b/ipc/kdbus/pool.c
new file mode 100644
index 000000000000..3bf4ab426d3b
--- /dev/null
+++ b/ipc/kdbus/pool.c
@@ -0,0 +1,722 @@
+/*
+ * Copyright (C) 2013-2014 Kay Sievers
+ * Copyright (C) 2013-2014 Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+ * Copyright (C) 2013-2014 Daniel Mack <daniel@zonque.org>
+ * Copyright (C) 2013-2014 David Herrmann <dh.herrmann@gmail.com>
+ * Copyright (C) 2013-2014 Linux Foundation
+ *
+ * kdbus is free software; you can redistribute it and/or modify it under
+ * the terms of the GNU Lesser General Public License as published by the
+ * Free Software Foundation; either version 2.1 of the License, or (at
+ * your option) any later version.
+ */
+
+#include <linux/aio.h>
+#include <linux/file.h>
+#include <linux/fs.h>
+#include <linux/highmem.h>
+#include <linux/init.h>
+#include <linux/mm.h>
+#include <linux/module.h>
+#include <linux/pagemap.h>
+#include <linux/rbtree.h>
+#include <linux/sched.h>
+#include <linux/shmem_fs.h>
+#include <linux/sizes.h>
+#include <linux/slab.h>
+#include <linux/uaccess.h>
+
+#include "pool.h"
+#include "util.h"
+
+/**
+ * struct kdbus_pool - the receiver's buffer
+ * @f:			The backing shmem file
+ * @size:		The size of the file
+ * @busy:		The currently used size
+ * @lock:		Pool data lock
+ * @slices:		All slices sorted by address
+ * @slices_busy:	Tree of allocated slices
+ * @slices_free:	Tree of free slices
+ *
+ * The receiver's buffer, managed as a pool of allocated and free
+ * slices containing the queued messages.
+ *
+ * Messages sent with KDBUS_CMD_MSG_SEND are copied directly by the
+ * sending process into the receiver's pool.
+ *
+ * Messages received with KDBUS_CMD_MSG_RECV just return the offset
+ * to the data placed in the pool.
+ *
+ * The internally allocated memory needs to be returned by the receiver
+ * with KDBUS_CMD_FREE.
+ */
+struct kdbus_pool {
+	struct file *f;
+	size_t size;
+	size_t busy;
+	struct mutex lock;
+
+	struct list_head slices;
+	struct rb_root slices_busy;
+	struct rb_root slices_free;
+};
+
+/**
+ * struct kdbus_pool_slice - allocated element in kdbus_pool
+ * @pool:		Pool this slice belongs to
+ * @off:		Offset of slice in the shmem file
+ * @size:		Size of slice
+ * @entry:		Entry in "all slices" list
+ * @rb_node:		Entry in free or busy tree
+ * @free:		Unused slice
+ * @public:		Slice was exposed to userspace and may be freed
+ *			with KDBUS_CMD_FREE.
+ *
+ * The pool has one or more slices, always spanning the entire size of the
+ * pool.
+ *
+ * Every slice is an element in a list sorted by the buffer address, to
+ * provide access to the next neighbor slice.
+ *
+ * Every slice is a member of either the busy or the free tree. The free
+ * tree is organized by slice size, the busy tree by buffer offset.
+ */
+struct kdbus_pool_slice {
+	struct kdbus_pool *pool;
+	size_t off;
+	size_t size;
+
+	struct list_head entry;
+	struct rb_node rb_node;
+	bool free;
+	bool public;
+};
+
+static struct kdbus_pool_slice *kdbus_pool_slice_new(struct kdbus_pool *pool,
+						     size_t off, size_t size)
+{
+	struct kdbus_pool_slice *slice;
+
+	slice = kzalloc(sizeof(*slice), GFP_KERNEL);
+	if (!slice)
+		return NULL;
+
+	slice->pool = pool;
+	slice->off = off;
+	slice->size = size;
+	slice->free = true;
+	slice->public = false;
+	return slice;
+}
+
+/* insert a slice into the free tree */
+static void kdbus_pool_add_free_slice(struct kdbus_pool *pool,
+				      struct kdbus_pool_slice *slice)
+{
+	struct rb_node **n;
+	struct rb_node *pn = NULL;
+
+	n = &pool->slices_free.rb_node;
+	while (*n) {
+		struct kdbus_pool_slice *pslice;
+
+		pn = *n;
+		pslice = rb_entry(pn, struct kdbus_pool_slice, rb_node);
+		if (slice->size < pslice->size)
+			n = &pn->rb_left;
+		else
+			n = &pn->rb_right;
+	}
+
+	rb_link_node(&slice->rb_node, pn, n);
+	rb_insert_color(&slice->rb_node, &pool->slices_free);
+}
+
+/* insert a slice into the busy tree */
+static void kdbus_pool_add_busy_slice(struct kdbus_pool *pool,
+				      struct kdbus_pool_slice *slice)
+{
+	struct rb_node **n;
+	struct rb_node *pn = NULL;
+
+	n = &pool->slices_busy.rb_node;
+	while (*n) {
+		struct kdbus_pool_slice *pslice;
+
+		pn = *n;
+		pslice = rb_entry(pn, struct kdbus_pool_slice, rb_node);
+		if (slice->off < pslice->off)
+			n = &pn->rb_left;
+		else if (slice->off > pslice->off)
+			n = &pn->rb_right;
+	}
+
+	rb_link_node(&slice->rb_node, pn, n);
+	rb_insert_color(&slice->rb_node, &pool->slices_busy);
+}
+
+static struct kdbus_pool_slice *kdbus_pool_find_slice(struct kdbus_pool *pool,
+						      size_t off)
+{
+	struct rb_node *n;
+
+	n = pool->slices_busy.rb_node;
+	while (n) {
+		struct kdbus_pool_slice *s;
+
+		s = rb_entry(n, struct kdbus_pool_slice, rb_node);
+		if (off < s->off)
+			n = n->rb_left;
+		else if (off > s->off)
+			n = n->rb_right;
+		else
+			return s;
+	}
+
+	return NULL;
+}
+
+/**
+ * kdbus_pool_slice_alloc() - allocate memory from a pool
+ * @pool:		The receiver's pool
+ * @size:		The number of bytes to allocate
+ *
+ * The returned slice is used for kdbus_pool_slice_free() to
+ * free the allocated memory.
+ *
+ * Return: the allocated slice on success, ERR_PTR on failure.
+ */
+struct kdbus_pool_slice *kdbus_pool_slice_alloc(struct kdbus_pool *pool,
+						size_t size)
+{
+	size_t slice_size = KDBUS_ALIGN8(size);
+	struct rb_node *n, *found = NULL;
+	struct kdbus_pool_slice *s;
+	int ret = 0;
+
+	/* search for a free slice with the closest matching size */
+	mutex_lock(&pool->lock);
+	n = pool->slices_free.rb_node;
+	while (n) {
+		s = rb_entry(n, struct kdbus_pool_slice, rb_node);
+		if (slice_size < s->size) {
+			found = n;
+			n = n->rb_left;
+		} else if (slice_size > s->size) {
+			n = n->rb_right;
+		} else {
+			found = n;
+			break;
+		}
+	}
+
+	/* no slice with the minimum size found in the pool */
+	if (!found) {
+		ret = -ENOBUFS;
+		goto exit_unlock;
+	}
+
+	/* no exact match, use the closest one */
+	if (!n)
+		s = rb_entry(found, struct kdbus_pool_slice, rb_node);
+
+	/* move slice from free to the busy tree */
+	rb_erase(found, &pool->slices_free);
+	kdbus_pool_add_busy_slice(pool, s);
+
+	/* we got a slice larger than what we asked for? */
+	if (s->size > slice_size) {
+		struct kdbus_pool_slice *s_new;
+
+		/* split-off the remainder of the size to its own slice */
+		s_new = kdbus_pool_slice_new(pool, s->off + slice_size,
+					     s->size - slice_size);
+		if (!s_new) {
+			ret = -ENOMEM;
+			goto exit_unlock;
+		}
+
+		list_add(&s_new->entry, &s->entry);
+		kdbus_pool_add_free_slice(pool, s_new);
+
+		/* adjust our size now that we split-off another slice */
+		s->size = slice_size;
+	}
+
+	s->free = false;
+	s->public = false;
+	pool->busy += s->size;
+	mutex_unlock(&pool->lock);
+
+	return s;
+
+exit_unlock:
+	mutex_unlock(&pool->lock);
+	return ERR_PTR(ret);
+}
+
+static void __kdbus_pool_slice_free(struct kdbus_pool_slice *slice)
+{
+	struct kdbus_pool *pool = slice->pool;
+
+	BUG_ON(slice->free);
+
+	rb_erase(&slice->rb_node, &pool->slices_busy);
+	pool->busy -= slice->size;
+
+	/* merge with the next free slice */
+	if (!list_is_last(&slice->entry, &pool->slices)) {
+		struct kdbus_pool_slice *s;
+
+		s = list_entry(slice->entry.next,
+			       struct kdbus_pool_slice, entry);
+		if (s->free) {
+			rb_erase(&s->rb_node, &pool->slices_free);
+			list_del(&s->entry);
+			slice->size += s->size;
+			kfree(s);
+		}
+	}
+
+	/* merge with previous free slice */
+	if (pool->slices.next != &slice->entry) {
+		struct kdbus_pool_slice *s;
+
+		s = list_entry(slice->entry.prev, struct kdbus_pool_slice,
+			       entry);
+		if (s->free) {
+			rb_erase(&s->rb_node, &pool->slices_free);
+			list_del(&slice->entry);
+			s->size += slice->size;
+			kfree(slice);
+			slice = s;
+		}
+	}
+
+	slice->free = true;
+	kdbus_pool_add_free_slice(pool, slice);
+}
+
+/**
+ * kdbus_pool_slice_free() - give allocated memory back to the pool
+ * @slice:		Slice allocated from the pool
+ *
+ * The slice was returned by the call to kdbus_pool_slice_alloc(); the
+ * memory is returned to the pool.
+ */
+void kdbus_pool_slice_free(struct kdbus_pool_slice *slice)
+{
+	struct kdbus_pool *pool = slice->pool;
+
+	mutex_lock(&pool->lock);
+	__kdbus_pool_slice_free(slice);
+	mutex_unlock(&pool->lock);
+}
+
+/**
+ * kdbus_pool_release_offset() - release a public offset
+ * @pool:		pool to operate on
+ * @off:		offset to release
+ *
+ * This should be called whenever user-space frees a slice given to them. It
+ * verifies the slice is available and public, and then drops it. It ensures
+ * correct locking and barriers against queues.
+ *
+ * Return: 0 on success, -ENXIO if the offset is invalid, -EINVAL if the
+ * offset is valid but not public.
+ */
+int kdbus_pool_release_offset(struct kdbus_pool *pool, size_t off)
+{
+	struct kdbus_pool_slice *slice;
+	int ret = 0;
+
+	mutex_lock(&pool->lock);
+	slice = kdbus_pool_find_slice(pool, off);
+	if (slice) {
+		if (slice->public)
+			__kdbus_pool_slice_free(slice);
+		else
+			ret = -EINVAL;
+	} else {
+		ret = -ENXIO;
+	}
+	mutex_unlock(&pool->lock);
+
+	return ret;
+}
+
+/**
+ * kdbus_pool_slice_offset() - return the slice's offset inside the pool
+ * @slice:		The slice
+ *
+ * Return: the offset in bytes.
+ */
+size_t kdbus_pool_slice_offset(const struct kdbus_pool_slice *slice)
+{
+	return slice->off;
+}
+
+/**
+ * kdbus_pool_slice_make_public() - set a slice's public flag to true
+ * @slice:		The slice
+ */
+void kdbus_pool_slice_make_public(struct kdbus_pool_slice *slice)
+{
+	slice->public = true;
+}
+
+/**
+ * kdbus_pool_new() - create a new pool
+ * @name:		Name of the (deleted) file which shows up in
+ *			/proc, used for debugging
+ * @size:		Maximum size of the pool
+ *
+ * Return: a new kdbus_pool on success, ERR_PTR on failure.
+ */
+struct kdbus_pool *kdbus_pool_new(const char *name, size_t size)
+{
+	struct kdbus_pool_slice *s;
+	struct kdbus_pool *p;
+	struct file *f;
+	char *n = NULL;
+	int ret;
+
+	p = kzalloc(sizeof(*p), GFP_KERNEL);
+	if (!p)
+		return ERR_PTR(-ENOMEM);
+
+	if (name) {
+		n = kasprintf(GFP_KERNEL, KBUILD_MODNAME "-conn:%s", name);
+		if (!n) {
+			ret = -ENOMEM;
+			goto exit_free;
+		}
+	}
+
+	f = shmem_file_setup(n ?: KBUILD_MODNAME "-conn", size, VM_NORESERVE);
+	kfree(n);
+
+	if (IS_ERR(f)) {
+		ret = PTR_ERR(f);
+		goto exit_free;
+	}
+
+	ret = get_write_access(file_inode(f));
+	if (ret < 0)
+		goto exit_put_shmem;
+
+	/* allocate first slice spanning the entire pool */
+	s = kdbus_pool_slice_new(p, 0, size);
+	if (!s) {
+		ret = -ENOMEM;
+		goto exit_put_write;
+	}
+
+	p->f = f;
+	p->size = size;
+	p->busy = 0;
+	p->slices_free = RB_ROOT;
+	p->slices_busy = RB_ROOT;
+	mutex_init(&p->lock);
+
+	INIT_LIST_HEAD(&p->slices);
+	list_add(&s->entry, &p->slices);
+
+	kdbus_pool_add_free_slice(p, s);
+	return p;
+
+exit_put_write:
+	put_write_access(file_inode(f));
+exit_put_shmem:
+	fput(f);
+exit_free:
+	kfree(p);
+	return ERR_PTR(ret);
+}
+
+/**
+ * kdbus_pool_free() - destroy pool
+ * @pool:		The receiver's pool
+ */
+void kdbus_pool_free(struct kdbus_pool *pool)
+{
+	struct kdbus_pool_slice *s, *tmp;
+
+	if (!pool)
+		return;
+
+	list_for_each_entry_safe(s, tmp, &pool->slices, entry) {
+		list_del(&s->entry);
+		kfree(s);
+	}
+
+	put_write_access(file_inode(pool->f));
+	fput(pool->f);
+	kfree(pool);
+}
+
+/**
+ * kdbus_pool_remain() - the number of free bytes in the pool
+ * @pool:		The receiver's pool
+ *
+ * Return: the number of unallocated bytes in the pool
+ */
+size_t kdbus_pool_remain(struct kdbus_pool *pool)
+{
+	size_t size;
+
+	mutex_lock(&pool->lock);
+	size = pool->size - pool->busy;
+	mutex_unlock(&pool->lock);
+
+	return size;
+}
+
+/* copy data from a file to a page in the receiver's pool */
+static int kdbus_pool_copy_file(struct page *p, size_t start,
+				struct file *f, size_t off, size_t count)
+{
+	loff_t o = off;
+	char *kaddr;
+	ssize_t n;
+
+	kaddr = kmap(p);
+	n = f->f_op->read(f, (char __force __user *)kaddr + start, count, &o);
+	kunmap(p);
+	if (n < 0)
+		return n;
+	if (n != count)
+		return -EFAULT;
+
+	return 0;
+}
+
+/* copy data to a page in the receiver's pool */
+static int kdbus_pool_copy_data(struct page *p, size_t start,
+				const void __user *from, size_t count)
+{
+	unsigned long remain;
+	char *kaddr;
+
+	if (fault_in_pages_readable(from, count) < 0)
+		return -EFAULT;
+
+	kaddr = kmap_atomic(p);
+	pagefault_disable();
+	remain = __copy_from_user_inatomic(kaddr + start, from, count);
+	pagefault_enable();
+	kunmap_atomic(kaddr);
+	if (remain > 0)
+		return -EFAULT;
+
+	cond_resched();
+	return 0;
+}
+
+/* copy data to the receiver's pool */
+static ssize_t kdbus_pool_copy(const struct kdbus_pool_slice *slice, size_t off,
+			       const void __user *data, struct file *f_src,
+			       size_t off_src, size_t len)
+{
+	struct file *f_dst = slice->pool->f;
+	struct address_space *mapping = f_dst->f_mapping;
+	const struct address_space_operations *aops = mapping->a_ops;
+	unsigned long fpos = slice->off + off;
+	unsigned long rem = len;
+	size_t pos = 0;
+	int ret = 0;
+
+	BUG_ON(off + len > slice->size);
+	BUG_ON(slice->free);
+
+	while (rem > 0) {
+		struct page *p;
+		unsigned long o;
+		unsigned long n;
+		void *fsdata;
+		int status;
+
+		o = fpos & (PAGE_CACHE_SIZE - 1);
+		n = min_t(unsigned long, PAGE_CACHE_SIZE - o, rem);
+
+		status = aops->write_begin(f_dst, mapping, fpos, n, 0, &p,
+					   &fsdata);
+		if (status) {
+			ret = -EFAULT;
+			break;
+		}
+
+		if (data)
+			ret = kdbus_pool_copy_data(p, o, data + pos, n);
+		else
+			ret = kdbus_pool_copy_file(p, o, f_src,
+						   off_src + pos, n);
+		mark_page_accessed(p);
+
+		status = aops->write_end(f_dst, mapping, fpos, n, n, p, fsdata);
+
+		if (ret < 0)
+			break;
+		if (status != n) {
+			ret = -EFAULT;
+			break;
+		}
+
+		pos += n;
+		fpos += n;
+		rem -= n;
+	}
+
+	return ret;
+}
+
+/**
+ * kdbus_pool_slice_copy_user() - copy user memory to a slice
+ * @slice:		The slice to write to
+ * @off:		Offset in the slice to write to
+ * @data:		User memory to copy from
+ * @len:		Number of bytes to copy
+ *
+ * The slice was returned by the call to kdbus_pool_slice_alloc().
+ * The user memory at @data will be copied to offset @off in the allocated
+ * slice in the pool.
+ *
+ * Return: 0 on success, negative errno on failure.
+ */
+ssize_t
+kdbus_pool_slice_copy_user(const struct kdbus_pool_slice *slice, size_t off,
+			   const void __user *data, size_t len)
+{
+	return kdbus_pool_copy(slice, off, data, NULL, 0, len);
+}
+
+/**
+ * kdbus_pool_slice_copy() - copy kernel memory to a slice
+ * @slice:		The slice to write to
+ * @off:		Offset in the slice to write to
+ * @data:		Kernel memory to copy from
+ * @len:		Number of bytes to copy
+ *
+ * The slice was returned by the call to kdbus_pool_slice_alloc().
+ * The kernel memory at @data will be copied to offset @off in the
+ * allocated slice in the pool.
+ *
+ * Return: 0 on success, negative errno on failure.
+ */
+ssize_t kdbus_pool_slice_copy(const struct kdbus_pool_slice *slice, size_t off,
+			      const void *data, size_t len)
+{
+	mm_segment_t old_fs;
+	ssize_t ret;
+
+	old_fs = get_fs();
+	set_fs(get_ds());
+	ret = kdbus_pool_copy(slice, off,
+			      (const void __user *)data, NULL, 0, len);
+	set_fs(old_fs);
+
+	return ret;
+}
+
+/**
+ * kdbus_pool_slice_move() - move memory from one pool into another one
+ * @src_pool:		The receiver's pool to copy from
+ * @dst_pool:		The receiver's pool to copy to
+ * @slice:		Reference to the slice to copy from the source;
+ *			updated with the newly allocated slice in the
+ *			destination
+ *
+ * Move memory from one pool to another. Memory will be allocated in the
+ * destination pool, the data copied over, and then freed in the source
+ * pool.
+ *
+ * Return: 0 on success, negative errno on failure.
+ */
+int kdbus_pool_slice_move(struct kdbus_pool *src_pool,
+			  struct kdbus_pool *dst_pool,
+			  struct kdbus_pool_slice **slice)
+{
+	mm_segment_t old_fs;
+	struct kdbus_pool_slice *slice_new;
+	int ret;
+
+	slice_new = kdbus_pool_slice_alloc(dst_pool, (*slice)->size);
+	if (IS_ERR(slice_new))
+		return PTR_ERR(slice_new);
+
+	old_fs = get_fs();
+	set_fs(get_ds());
+	ret = kdbus_pool_copy(slice_new, 0, NULL,
+			      src_pool->f, (*slice)->off, (*slice)->size);
+	set_fs(old_fs);
+	if (ret < 0)
+		goto exit_free;
+
+	kdbus_pool_slice_free(*slice);
+
+	*slice = slice_new;
+	return 0;
+
+exit_free:
+	kdbus_pool_slice_free(slice_new);
+	return ret;
+}
+
+/**
+ * kdbus_pool_slice_flush() - flush dcache memory area of a slice
+ * @slice:		The allocated slice to flush
+ *
+ * Dcache flushes are delayed to happen only right before the receiver
+ * gets the new buffer area announced. The mapped buffer is always
+ * read-only for the receiver, and only the area of the announced message
+ * needs to be flushed.
+ */
+void kdbus_pool_slice_flush(const struct kdbus_pool_slice *slice)
+{
+#if ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE == 1
+	struct address_space *mapping = slice->pool->f->f_mapping;
+	pgoff_t first = slice->off >> PAGE_CACHE_SHIFT;
+	pgoff_t last = (slice->off + slice->size +
+			PAGE_CACHE_SIZE-1) >> PAGE_CACHE_SHIFT;
+	pgoff_t i;
+
+	for (i = first; i < last; i++) {
+		struct page *page;
+
+		page = find_get_page(mapping, i);
+		if (!page)
+			continue;
+
+		flush_dcache_page(page);
+		put_page(page);
+	}
+#endif
+}
+
+/**
+ * kdbus_pool_mmap() - map the pool into the process
+ * @pool:		The receiver's pool
+ * @vma:		passed by mmap() syscall
+ *
+ * Return: the result of the mmap() call, negative errno on failure.
+ */
+int kdbus_pool_mmap(const struct kdbus_pool *pool, struct vm_area_struct *vma)
+{
+	/* deny write access to the pool */
+	if (vma->vm_flags & VM_WRITE)
+		return -EPERM;
+	vma->vm_flags &= ~VM_MAYWRITE;
+
+	/* do not allow mapping more than the size of the file */
+	if ((vma->vm_end - vma->vm_start) > pool->size)
+		return -EFAULT;
+
+	/* replace the connection file with our shmem file */
+	if (vma->vm_file)
+		fput(vma->vm_file);
+	vma->vm_file = get_file(pool->f);
+
+	return pool->f->f_op->mmap(pool->f, vma);
+}
diff --git a/ipc/kdbus/pool.h b/ipc/kdbus/pool.h
new file mode 100644
index 000000000000..b8f6930d5bb9
--- /dev/null
+++ b/ipc/kdbus/pool.h
@@ -0,0 +1,44 @@
+/*
+ * Copyright (C) 2013-2014 Kay Sievers
+ * Copyright (C) 2013-2014 Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+ * Copyright (C) 2013-2014 Daniel Mack <daniel@zonque.org>
+ * Copyright (C) 2013-2014 David Herrmann <dh.herrmann@gmail.com>
+ * Copyright (C) 2013-2014 Linux Foundation
+ *
+ * kdbus is free software; you can redistribute it and/or modify it under
+ * the terms of the GNU Lesser General Public License as published by the
+ * Free Software Foundation; either version 2.1 of the License, or (at
+ * your option) any later version.
+ */
+
+#ifndef __KDBUS_POOL_H
+#define __KDBUS_POOL_H
+
+struct kdbus_pool;
+struct kdbus_pool_slice;
+
+struct kdbus_pool *kdbus_pool_new(const char *name, size_t size);
+void kdbus_pool_free(struct kdbus_pool *pool);
+size_t kdbus_pool_remain(struct kdbus_pool *pool);
+int kdbus_pool_mmap(const struct kdbus_pool *pool, struct vm_area_struct *vma);
+int kdbus_pool_release_offset(struct kdbus_pool *pool, size_t off);
+
+struct kdbus_pool_slice *kdbus_pool_slice_alloc(struct kdbus_pool *pool,
+						size_t size);
+void kdbus_pool_slice_free(struct kdbus_pool_slice *slice);
+struct kdbus_pool_slice *kdbus_pool_slice_find(struct kdbus_pool *pool,
+					       size_t off);
+int kdbus_pool_slice_move(struct kdbus_pool *src_pool,
+			  struct kdbus_pool *dst_pool,
+			  struct kdbus_pool_slice **slice);
+size_t kdbus_pool_slice_offset(const struct kdbus_pool_slice *slice);
+ssize_t kdbus_pool_slice_copy(const struct kdbus_pool_slice *slice, size_t off,
+			      const void *data, size_t len);
+ssize_t kdbus_pool_slice_copy_user(const struct kdbus_pool_slice *slice,
+				   size_t off, const void __user *data,
+				   size_t len);
+void kdbus_pool_slice_flush(const struct kdbus_pool_slice *slice);
+
+void kdbus_pool_slice_make_public(struct kdbus_pool_slice *slice);
+
+#endif
-- 
2.1.3



* kdbus: add connection pool implementation
@ 2014-11-21  5:02   ` Greg Kroah-Hartman
  0 siblings, 0 replies; 73+ messages in thread
From: Greg Kroah-Hartman @ 2014-11-21  5:02 UTC (permalink / raw)
  To: arnd-r2nGTMty4D4, ebiederm-aS9lmoZGLiVWk0Htik3J/w,
	gnomes-qBU/x9rampVanCEyBjwyrvXRex20P6io, teg-B22kvLQNl6c,
	jkosina-AlSwsSmVLrQ, luto-kltTT9wpgjJwATOyAt5JVQ,
	linux-api-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA
  Cc: daniel-cYrQPVfZoowdnm+yROfE0A,
	dh.herrmann-Re5JQEeQqe8AvxtiuMwx3w,
	tixxdz-Umm1ozX2/EEdnm+yROfE0A, Greg Kroah-Hartman

From: Daniel Mack <daniel-cYrQPVfZoowdnm+yROfE0A@public.gmane.org>

A pool for data received from the kernel is installed for every
connection of the bus, and it is used to copy data from the kernel to
userspace clients, for messages and other information.

It is accessed when one of the following ioctls is issued:

  * KDBUS_CMD_MSG_RECV, to receive a message
  * KDBUS_CMD_NAME_LIST, to dump the name registry
  * KDBUS_CMD_CONN_INFO, to retrieve information on a connection

The offsets returned by any of the aforementioned ioctls describe
offsets inside the pool. Internally, the pool is organized in slices
that are dynamically allocated on demand. The overall size of the pool
is chosen by the connection when it connects to the bus with
KDBUS_CMD_HELLO.

In order to make the slice available for subsequent calls,
KDBUS_CMD_FREE has to be called on the offset.

To access the memory, the caller is expected to mmap() it to its task.

Signed-off-by: Daniel Mack <daniel-cYrQPVfZoowdnm+yROfE0A@public.gmane.org>
Signed-off-by: David Herrmann <dh.herrmann-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Signed-off-by: Djalal Harouni <tixxdz-Umm1ozX2/EEdnm+yROfE0A@public.gmane.org>
Signed-off-by: Greg Kroah-Hartman <gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r@public.gmane.org>
---
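
Note for readers: a simplified user-space model of the slice bookkeeping
described above, offered as a sketch and not as the code in this patch.
It keeps only the offset-ordered list and finds the best fit with a
linear scan, where the patch uses two rb-trees and a mutex. All names
(struct slice, pool_init(), pool_alloc(), pool_free(), merge_next()) are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

struct slice {
	size_t off, size;
	bool free;
	struct slice *prev, *next;	/* neighbours in offset order */
};

struct pool {
	struct slice *head;
	size_t size, busy;
};

static struct slice *slice_new(size_t off, size_t size)
{
	struct slice *s = calloc(1, sizeof(*s));

	if (s) {
		s->off = off;
		s->size = size;
		s->free = true;
	}
	return s;
}

/* Start with one free slice spanning the whole pool. */
static int pool_init(struct pool *p, size_t size)
{
	p->head = slice_new(0, size);
	p->size = size;
	p->busy = 0;
	return p->head ? 0 : -1;
}

/* Best fit: smallest free slice that still holds the 8-byte-aligned size. */
static struct slice *pool_alloc(struct pool *p, size_t size)
{
	size_t want = (size + 7) & ~(size_t)7;
	struct slice *s, *best = NULL;

	for (s = p->head; s; s = s->next)
		if (s->free && s->size >= want &&
		    (!best || s->size < best->size))
			best = s;
	if (!best)
		return NULL;

	if (best->size > want) {
		/* split the remainder off before committing anything */
		struct slice *rest = slice_new(best->off + want,
					       best->size - want);
		if (!rest)
			return NULL;
		rest->prev = best;
		rest->next = best->next;
		if (best->next)
			best->next->prev = rest;
		best->next = rest;
		best->size = want;
	}
	best->free = false;
	p->busy += best->size;
	return best;
}

/* Absorb the free slice following @s, if there is one. */
static void merge_next(struct slice *s)
{
	struct slice *n = s->next;

	if (!n || !n->free)
		return;
	s->size += n->size;
	s->next = n->next;
	if (n->next)
		n->next->prev = s;
	free(n);
}

/* Return a slice to the pool and coalesce it with free neighbours. */
static void pool_free(struct pool *p, struct slice *s)
{
	assert(!s->free);
	p->busy -= s->size;
	s->free = true;
	merge_next(s);
	if (s->prev && s->prev->free)
		merge_next(s->prev);
}
```

Because freed slices are always merged into their lower neighbour, the
list head never changes, and a pool whose slices have all been freed
collapses back into a single slice covering the whole pool.
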
 ipc/kdbus/pool.c | 722 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 ipc/kdbus/pool.h |  44 ++++
 2 files changed, 766 insertions(+)
 create mode 100644 ipc/kdbus/pool.c
 create mode 100644 ipc/kdbus/pool.h

diff --git a/ipc/kdbus/pool.c b/ipc/kdbus/pool.c
new file mode 100644
index 000000000000..3bf4ab426d3b
--- /dev/null
+++ b/ipc/kdbus/pool.c
@@ -0,0 +1,722 @@
+/*
+ * Copyright (C) 2013-2014 Kay Sievers
+ * Copyright (C) 2013-2014 Greg Kroah-Hartman <gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r@public.gmane.org>
+ * Copyright (C) 2013-2014 Daniel Mack <daniel-cYrQPVfZoowdnm+yROfE0A@public.gmane.org>
+ * Copyright (C) 2013-2014 David Herrmann <dh.herrmann-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
+ * Copyright (C) 2013-2014 Linux Foundation
+ *
+ * kdbus is free software; you can redistribute it and/or modify it under
+ * the terms of the GNU Lesser General Public License as published by the
+ * Free Software Foundation; either version 2.1 of the License, or (at
+ * your option) any later version.
+ */
+
+#include <linux/aio.h>
+#include <linux/file.h>
+#include <linux/fs.h>
+#include <linux/highmem.h>
+#include <linux/init.h>
+#include <linux/mm.h>
+#include <linux/module.h>
+#include <linux/pagemap.h>
+#include <linux/rbtree.h>
+#include <linux/sched.h>
+#include <linux/shmem_fs.h>
+#include <linux/sizes.h>
+#include <linux/slab.h>
+#include <linux/uaccess.h>
+
+#include "pool.h"
+#include "util.h"
+
+/**
+ * struct kdbus_pool - the receiver's buffer
+ * @f:			The backing shmem file
+ * @size:		The size of the file
+ * @busy:		The currently used size
+ * @lock:		Pool data lock
+ * @slices:		All slices sorted by address
+ * @slices_busy:	Tree of allocated slices
+ * @slices_free:	Tree of free slices
+ *
+ * The receiver's buffer, managed as a pool of allocated and free
+ * slices containing the queued messages.
+ *
+ * Messages sent with KDBUS_CMD_MSG_SEND are copied directly by the
+ * sending process into the receiver's pool.
+ *
+ * Messages received with KDBUS_CMD_MSG_RECV just return the offset
+ * to the data placed in the pool.
+ *
+ * The internally allocated memory needs to be returned by the receiver
+ * with KDBUS_CMD_FREE.
+ */
+struct kdbus_pool {
+	struct file *f;
+	size_t size;
+	size_t busy;
+	struct mutex lock;
+
+	struct list_head slices;
+	struct rb_root slices_busy;
+	struct rb_root slices_free;
+};
+
+/**
+ * struct kdbus_pool_slice - allocated element in kdbus_pool
+ * @pool:		Pool this slice belongs to
+ * @off:		Offset of slice in the shmem file
+ * @size:		Size of slice
+ * @entry:		Entry in "all slices" list
+ * @rb_node:		Entry in free or busy tree
+ * @free:		Unused slice
+ * @public:		Slice was exposed to userspace and may be freed
+ *			with KDBUS_CMD_FREE.
+ *
+ * The pool has one or more slices, always spanning the entire size of the
+ * pool.
+ *
+ * Every slice is an element in a list sorted by the buffer address, to
+ * provide access to the next neighbor slice.
+ *
+ * Every slice is a member of either the busy or the free tree. The free
+ * tree is organized by slice size, the busy tree by buffer offset.
+ */
+struct kdbus_pool_slice {
+	struct kdbus_pool *pool;
+	size_t off;
+	size_t size;
+
+	struct list_head entry;
+	struct rb_node rb_node;
+	bool free;
+	bool public;
+};
+
+static struct kdbus_pool_slice *kdbus_pool_slice_new(struct kdbus_pool *pool,
+						     size_t off, size_t size)
+{
+	struct kdbus_pool_slice *slice;
+
+	slice = kzalloc(sizeof(*slice), GFP_KERNEL);
+	if (!slice)
+		return NULL;
+
+	slice->pool = pool;
+	slice->off = off;
+	slice->size = size;
+	slice->free = true;
+	slice->public = false;
+	return slice;
+}
+
+/* insert a slice into the free tree */
+static void kdbus_pool_add_free_slice(struct kdbus_pool *pool,
+				      struct kdbus_pool_slice *slice)
+{
+	struct rb_node **n;
+	struct rb_node *pn = NULL;
+
+	n = &pool->slices_free.rb_node;
+	while (*n) {
+		struct kdbus_pool_slice *pslice;
+
+		pn = *n;
+		pslice = rb_entry(pn, struct kdbus_pool_slice, rb_node);
+		if (slice->size < pslice->size)
+			n = &pn->rb_left;
+		else
+			n = &pn->rb_right;
+	}
+
+	rb_link_node(&slice->rb_node, pn, n);
+	rb_insert_color(&slice->rb_node, &pool->slices_free);
+}
+
+/* insert a slice into the busy tree */
+static void kdbus_pool_add_busy_slice(struct kdbus_pool *pool,
+				      struct kdbus_pool_slice *slice)
+{
+	struct rb_node **n;
+	struct rb_node *pn = NULL;
+
+	n = &pool->slices_busy.rb_node;
+	while (*n) {
+		struct kdbus_pool_slice *pslice;
+
+		pn = *n;
+		pslice = rb_entry(pn, struct kdbus_pool_slice, rb_node);
+		if (slice->off < pslice->off)
+			n = &pn->rb_left;
+		else if (slice->off > pslice->off)
+			n = &pn->rb_right;
+	}
+
+	rb_link_node(&slice->rb_node, pn, n);
+	rb_insert_color(&slice->rb_node, &pool->slices_busy);
+}
+
+static struct kdbus_pool_slice *kdbus_pool_find_slice(struct kdbus_pool *pool,
+						      size_t off)
+{
+	struct rb_node *n;
+
+	n = pool->slices_busy.rb_node;
+	while (n) {
+		struct kdbus_pool_slice *s;
+
+		s = rb_entry(n, struct kdbus_pool_slice, rb_node);
+		if (off < s->off)
+			n = n->rb_left;
+		else if (off > s->off)
+			n = n->rb_right;
+		else
+			return s;
+	}
+
+	return NULL;
+}
+
+/**
+ * kdbus_pool_slice_alloc() - allocate memory from a pool
+ * @pool:		The receiver's pool
+ * @size:		The number of bytes to allocate
+ *
+ * The returned slice is used for kdbus_pool_slice_free() to
+ * free the allocated memory.
+ *
+ * Return: the allocated slice on success, ERR_PTR on failure.
+ */
+struct kdbus_pool_slice *kdbus_pool_slice_alloc(struct kdbus_pool *pool,
+						size_t size)
+{
+	size_t slice_size = KDBUS_ALIGN8(size);
+	struct rb_node *n, *found = NULL;
+	struct kdbus_pool_slice *s;
+	int ret = 0;
+
+	/* search for a free slice with the closest matching size */
+	mutex_lock(&pool->lock);
+	n = pool->slices_free.rb_node;
+	while (n) {
+		s = rb_entry(n, struct kdbus_pool_slice, rb_node);
+		if (slice_size < s->size) {
+			found = n;
+			n = n->rb_left;
+		} else if (slice_size > s->size) {
+			n = n->rb_right;
+		} else {
+			found = n;
+			break;
+		}
+	}
+
+	/* no slice with the minimum size found in the pool */
+	if (!found) {
+		ret = -ENOBUFS;
+		goto exit_unlock;
+	}
+
+	/* no exact match, use the closest one */
+	if (!n)
+		s = rb_entry(found, struct kdbus_pool_slice, rb_node);
+
+	/* move slice from free to the busy tree */
+	rb_erase(found, &pool->slices_free);
+	kdbus_pool_add_busy_slice(pool, s);
+
+	/* we got a slice larger than what we asked for? */
+	if (s->size > slice_size) {
+		struct kdbus_pool_slice *s_new;
+
+		/* split-off the remainder of the size to its own slice */
+		s_new = kdbus_pool_slice_new(pool, s->off + slice_size,
+					     s->size - slice_size);
+		if (!s_new) {
+			ret = -ENOMEM;
+			goto exit_unlock;
+		}
+
+		list_add(&s_new->entry, &s->entry);
+		kdbus_pool_add_free_slice(pool, s_new);
+
+		/* adjust our size now that we split-off another slice */
+		s->size = slice_size;
+	}
+
+	s->free = false;
+	s->public = false;
+	pool->busy += s->size;
+	mutex_unlock(&pool->lock);
+
+	return s;
+
+exit_unlock:
+	mutex_unlock(&pool->lock);
+	return ERR_PTR(ret);
+}
+
+static void __kdbus_pool_slice_free(struct kdbus_pool_slice *slice)
+{
+	struct kdbus_pool *pool = slice->pool;
+
+	BUG_ON(slice->free);
+
+	rb_erase(&slice->rb_node, &pool->slices_busy);
+	pool->busy -= slice->size;
+
+	/* merge with the next free slice */
+	if (!list_is_last(&slice->entry, &pool->slices)) {
+		struct kdbus_pool_slice *s;
+
+		s = list_entry(slice->entry.next,
+			       struct kdbus_pool_slice, entry);
+		if (s->free) {
+			rb_erase(&s->rb_node, &pool->slices_free);
+			list_del(&s->entry);
+			slice->size += s->size;
+			kfree(s);
+		}
+	}
+
+	/* merge with previous free slice */
+	if (pool->slices.next != &slice->entry) {
+		struct kdbus_pool_slice *s;
+
+		s = list_entry(slice->entry.prev, struct kdbus_pool_slice,
+			       entry);
+		if (s->free) {
+			rb_erase(&s->rb_node, &pool->slices_free);
+			list_del(&slice->entry);
+			s->size += slice->size;
+			kfree(slice);
+			slice = s;
+		}
+	}
+
+	slice->free = true;
+	kdbus_pool_add_free_slice(pool, slice);
+}
+
+/**
+ * kdbus_pool_slice_free() - give allocated memory back to the pool
+ * @slice:		Slice allocated from the pool
+ *
+ * The slice was returned by the call to kdbus_pool_slice_alloc(); the
+ * memory is returned to the pool.
+ */
+void kdbus_pool_slice_free(struct kdbus_pool_slice *slice)
+{
+	struct kdbus_pool *pool = slice->pool;
+
+	mutex_lock(&pool->lock);
+	__kdbus_pool_slice_free(slice);
+	mutex_unlock(&pool->lock);
+}
+
+/**
+ * kdbus_pool_release_offset() - release a public offset
+ * @pool:		pool to operate on
+ * @off:		offset to release
+ *
+ * This should be called whenever user-space frees a slice given to them. It
+ * verifies the slice is available and public, and then drops it. It ensures
+ * correct locking and barriers against queues.
+ *
+ * Return: 0 on success, -ENXIO if the offset is invalid, -EINVAL if the
+ * offset is valid but not public.
+ */
+int kdbus_pool_release_offset(struct kdbus_pool *pool, size_t off)
+{
+	struct kdbus_pool_slice *slice;
+	int ret = 0;
+
+	mutex_lock(&pool->lock);
+	slice = kdbus_pool_find_slice(pool, off);
+	if (slice) {
+		if (slice->public)
+			__kdbus_pool_slice_free(slice);
+		else
+			ret = -EINVAL;
+	} else {
+		ret = -ENXIO;
+	}
+	mutex_unlock(&pool->lock);
+
+	return ret;
+}
+
+/**
+ * kdbus_pool_slice_offset() - return the slice's offset inside the pool
+ * @slice:		The slice
+ *
+ * Return: the offset in bytes.
+ */
+size_t kdbus_pool_slice_offset(const struct kdbus_pool_slice *slice)
+{
+	return slice->off;
+}
+
+/**
+ * kdbus_pool_slice_make_public() - set a slice's public flag to true
+ * @slice:		The slice
+ */
+void kdbus_pool_slice_make_public(struct kdbus_pool_slice *slice)
+{
+	slice->public = true;
+}
+
+/**
+ * kdbus_pool_new() - create a new pool
+ * @name:		Name of the (deleted) file which shows up in
+ *			/proc, used for debugging
+ * @size:		Maximum size of the pool
+ *
+ * Return: a new kdbus_pool on success, ERR_PTR on failure.
+ */
+struct kdbus_pool *kdbus_pool_new(const char *name, size_t size)
+{
+	struct kdbus_pool_slice *s;
+	struct kdbus_pool *p;
+	struct file *f;
+	char *n = NULL;
+	int ret;
+
+	p = kzalloc(sizeof(*p), GFP_KERNEL);
+	if (!p)
+		return ERR_PTR(-ENOMEM);
+
+	if (name) {
+		n = kasprintf(GFP_KERNEL, KBUILD_MODNAME "-conn:%s", name);
+		if (!n) {
+			ret = -ENOMEM;
+			goto exit_free;
+		}
+	}
+
+	f = shmem_file_setup(n ?: KBUILD_MODNAME "-conn", size, VM_NORESERVE);
+	kfree(n);
+
+	if (IS_ERR(f)) {
+		ret = PTR_ERR(f);
+		goto exit_free;
+	}
+
+	ret = get_write_access(file_inode(f));
+	if (ret < 0)
+		goto exit_put_shmem;
+
+	/* allocate first slice spanning the entire pool */
+	s = kdbus_pool_slice_new(p, 0, size);
+	if (!s) {
+		ret = -ENOMEM;
+		goto exit_put_write;
+	}
+
+	p->f = f;
+	p->size = size;
+	p->busy = 0;
+	p->slices_free = RB_ROOT;
+	p->slices_busy = RB_ROOT;
+	mutex_init(&p->lock);
+
+	INIT_LIST_HEAD(&p->slices);
+	list_add(&s->entry, &p->slices);
+
+	kdbus_pool_add_free_slice(p, s);
+	return p;
+
+exit_put_write:
+	put_write_access(file_inode(f));
+exit_put_shmem:
+	fput(f);
+exit_free:
+	kfree(p);
+	return ERR_PTR(ret);
+}
+
+/**
+ * kdbus_pool_free() - destroy pool
+ * @pool:		The receiver's pool
+ */
+void kdbus_pool_free(struct kdbus_pool *pool)
+{
+	struct kdbus_pool_slice *s, *tmp;
+
+	if (!pool)
+		return;
+
+	list_for_each_entry_safe(s, tmp, &pool->slices, entry) {
+		list_del(&s->entry);
+		kfree(s);
+	}
+
+	put_write_access(file_inode(pool->f));
+	fput(pool->f);
+	kfree(pool);
+}
+
+/**
+ * kdbus_pool_remain() - the number of free bytes in the pool
+ * @pool:		The receiver's pool
+ *
+ * Return: the number of unallocated bytes in the pool
+ */
+size_t kdbus_pool_remain(struct kdbus_pool *pool)
+{
+	size_t size;
+
+	mutex_lock(&pool->lock);
+	size = pool->size - pool->busy;
+	mutex_unlock(&pool->lock);
+
+	return size;
+}
+
+/* copy data from a file to a page in the receiver's pool */
+static int kdbus_pool_copy_file(struct page *p, size_t start,
+				struct file *f, size_t off, size_t count)
+{
+	loff_t o = off;
+	char *kaddr;
+	ssize_t n;
+
+	kaddr = kmap(p);
+	n = f->f_op->read(f, (char __force __user *)kaddr + start, count, &o);
+	kunmap(p);
+	if (n < 0)
+		return n;
+	if (n != count)
+		return -EFAULT;
+
+	return 0;
+}
+
+/* copy data to a page in the receiver's pool */
+static int kdbus_pool_copy_data(struct page *p, size_t start,
+				const void __user *from, size_t count)
+{
+	unsigned long remain;
+	char *kaddr;
+
+	if (fault_in_pages_readable(from, count) < 0)
+		return -EFAULT;
+
+	kaddr = kmap_atomic(p);
+	pagefault_disable();
+	remain = __copy_from_user_inatomic(kaddr + start, from, count);
+	pagefault_enable();
+	kunmap_atomic(kaddr);
+	if (remain > 0)
+		return -EFAULT;
+
+	cond_resched();
+	return 0;
+}
+
+/* copy data to the receiver's pool */
+static ssize_t kdbus_pool_copy(const struct kdbus_pool_slice *slice, size_t off,
+			       const void __user *data, struct file *f_src,
+			       size_t off_src, size_t len)
+{
+	struct file *f_dst = slice->pool->f;
+	struct address_space *mapping = f_dst->f_mapping;
+	const struct address_space_operations *aops = mapping->a_ops;
+	unsigned long fpos = slice->off + off;
+	unsigned long rem = len;
+	size_t pos = 0;
+	int ret = 0;
+
+	BUG_ON(off + len > slice->size);
+	BUG_ON(slice->free);
+
+	while (rem > 0) {
+		struct page *p;
+		unsigned long o;
+		unsigned long n;
+		void *fsdata;
+		int status;
+
+		o = fpos & (PAGE_CACHE_SIZE - 1);
+		n = min_t(unsigned long, PAGE_CACHE_SIZE - o, rem);
+
+		status = aops->write_begin(f_dst, mapping, fpos, n, 0, &p,
+					   &fsdata);
+		if (status) {
+			ret = -EFAULT;
+			break;
+		}
+
+		if (data)
+			ret = kdbus_pool_copy_data(p, o, data + pos, n);
+		else
+			ret = kdbus_pool_copy_file(p, o, f_src,
+						   off_src + pos, n);
+		mark_page_accessed(p);
+
+		status = aops->write_end(f_dst, mapping, fpos, n, n, p, fsdata);
+
+		if (ret < 0)
+			break;
+		if (status != n) {
+			ret = -EFAULT;
+			break;
+		}
+
+		pos += n;
+		fpos += n;
+		rem -= n;
+	}
+
+	return ret;
+}
+
+/**
+ * kdbus_pool_slice_copy_user() - copy user memory to a slice
+ * @slice:		The slice to write to
+ * @off:		Offset in the slice to write to
+ * @data:		User memory to copy from
+ * @len:		Number of bytes to copy
+ *
+ * The slice was returned by the call to kdbus_pool_slice_alloc().
+ * The user memory at @data will be copied to offset @off in the allocated
+ * slice in the pool.
+ *
+ * Return: 0 on success, negative errno on failure.
+ */
+ssize_t
+kdbus_pool_slice_copy_user(const struct kdbus_pool_slice *slice, size_t off,
+			   const void __user *data, size_t len)
+{
+	return kdbus_pool_copy(slice, off, data, NULL, 0, len);
+}
+
+/**
+ * kdbus_pool_slice_copy() - copy kernel memory to a slice
+ * @slice:		The slice to write to
+ * @off:		Offset in the slice to write to
+ * @data:		Kernel memory to copy from
+ * @len:		Number of bytes to copy
+ *
+ * The slice was returned by the call to kdbus_pool_slice_alloc().
+ * The kernel memory at @data will be copied to offset @off in the allocated
+ * slice in the pool.
+ *
+ * Return: 0 on success, negative errno on failure.
+ */
+ssize_t kdbus_pool_slice_copy(const struct kdbus_pool_slice *slice, size_t off,
+			      const void *data, size_t len)
+{
+	mm_segment_t old_fs;
+	ssize_t ret;
+
+	old_fs = get_fs();
+	set_fs(get_ds());
+	ret = kdbus_pool_copy(slice, off,
+			      (const void __user *)data, NULL, 0, len);
+	set_fs(old_fs);
+
+	return ret;
+}
+
+/**
+ * kdbus_pool_slice_move() - move memory from one pool into another one
+ * @src_pool:		The receiver's pool to copy from
+ * @dst_pool:		The receiver's pool to copy to
+ * @slice:		Reference to the slice to copy from the source;
+ *			updated with the newly allocated slice in the
+ *			destination
+ *
+ * Move memory from one pool to another. Memory will be allocated in the
+ * destination pool, the memory copied over, and then free()d in the source
+ * pool.
+ *
+ * Return: 0 on success, negative errno on failure.
+ */
+int kdbus_pool_slice_move(struct kdbus_pool *src_pool,
+			  struct kdbus_pool *dst_pool,
+			  struct kdbus_pool_slice **slice)
+{
+	mm_segment_t old_fs;
+	struct kdbus_pool_slice *slice_new;
+	int ret;
+
+	slice_new = kdbus_pool_slice_alloc(dst_pool, (*slice)->size);
+	if (IS_ERR(slice_new))
+		return PTR_ERR(slice_new);
+
+	old_fs = get_fs();
+	set_fs(get_ds());
+	ret = kdbus_pool_copy(slice_new, 0, NULL,
+			      src_pool->f, (*slice)->off, (*slice)->size);
+	set_fs(old_fs);
+	if (ret < 0)
+		goto exit_free;
+
+	kdbus_pool_slice_free(*slice);
+
+	*slice = slice_new;
+	return 0;
+
+exit_free:
+	kdbus_pool_slice_free(slice_new);
+	return ret;
+}
+
+/**
+ * kdbus_pool_slice_flush() - flush dcache memory area of a slice
+ * @slice:		The allocated slice to flush
+ *
+ * Dcache flushes are delayed to happen only right before the receiver
+ * gets the new buffer area announced. The mapped buffer is always
+ * read-only for the receiver, and only the area of the announced message
+ * needs to be flushed.
+ */
+void kdbus_pool_slice_flush(const struct kdbus_pool_slice *slice)
+{
+#if ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE == 1
+	struct address_space *mapping = slice->pool->f->f_mapping;
+	pgoff_t first = slice->off >> PAGE_CACHE_SHIFT;
+	pgoff_t last = (slice->off + slice->size +
+			PAGE_CACHE_SIZE-1) >> PAGE_CACHE_SHIFT;
+	pgoff_t i;
+
+	for (i = first; i < last; i++) {
+		struct page *page;
+
+		page = find_get_page(mapping, i);
+		if (!page)
+			continue;
+
+		flush_dcache_page(page);
+		put_page(page);
+	}
+#endif
+}
+
+/**
+ * kdbus_pool_mmap() - map the pool into the process
+ * @pool:		The receiver's pool
+ * @vma:		passed by mmap() syscall
+ *
+ * Return: the result of the mmap() call, negative errno on failure.
+ */
+int kdbus_pool_mmap(const struct kdbus_pool *pool, struct vm_area_struct *vma)
+{
+	/* deny write access to the pool */
+	if (vma->vm_flags & VM_WRITE)
+		return -EPERM;
+	vma->vm_flags &= ~VM_MAYWRITE;
+
+	/* do not allow mapping more than the size of the file */
+	if ((vma->vm_end - vma->vm_start) > pool->size)
+		return -EFAULT;
+
+	/* replace the connection file with our shmem file */
+	if (vma->vm_file)
+		fput(vma->vm_file);
+	vma->vm_file = get_file(pool->f);
+
+	return pool->f->f_op->mmap(pool->f, vma);
+}
diff --git a/ipc/kdbus/pool.h b/ipc/kdbus/pool.h
new file mode 100644
index 000000000000..b8f6930d5bb9
--- /dev/null
+++ b/ipc/kdbus/pool.h
@@ -0,0 +1,44 @@
+/*
+ * Copyright (C) 2013-2014 Kay Sievers
+ * Copyright (C) 2013-2014 Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+ * Copyright (C) 2013-2014 Daniel Mack <daniel@zonque.org>
+ * Copyright (C) 2013-2014 David Herrmann <dh.herrmann@gmail.com>
+ * Copyright (C) 2013-2014 Linux Foundation
+ *
+ * kdbus is free software; you can redistribute it and/or modify it under
+ * the terms of the GNU Lesser General Public License as published by the
+ * Free Software Foundation; either version 2.1 of the License, or (at
+ * your option) any later version.
+ */
+
+#ifndef __KDBUS_POOL_H
+#define __KDBUS_POOL_H
+
+struct kdbus_pool;
+struct kdbus_pool_slice;
+
+struct kdbus_pool *kdbus_pool_new(const char *name, size_t size);
+void kdbus_pool_free(struct kdbus_pool *pool);
+size_t kdbus_pool_remain(struct kdbus_pool *pool);
+int kdbus_pool_mmap(const struct kdbus_pool *pool, struct vm_area_struct *vma);
+int kdbus_pool_release_offset(struct kdbus_pool *pool, size_t off);
+
+struct kdbus_pool_slice *kdbus_pool_slice_alloc(struct kdbus_pool *pool,
+						size_t size);
+void kdbus_pool_slice_free(struct kdbus_pool_slice *slice);
+struct kdbus_pool_slice *kdbus_pool_slice_find(struct kdbus_pool *pool,
+					       size_t off);
+int kdbus_pool_slice_move(struct kdbus_pool *src_pool,
+			  struct kdbus_pool *dst_pool,
+			  struct kdbus_pool_slice **slice);
+size_t kdbus_pool_slice_offset(const struct kdbus_pool_slice *slice);
+ssize_t kdbus_pool_slice_copy(const struct kdbus_pool_slice *slice, size_t off,
+			      const void *data, size_t len);
+ssize_t kdbus_pool_slice_copy_user(const struct kdbus_pool_slice *slice,
+				   size_t off, const void __user *data,
+				   size_t len);
+void kdbus_pool_slice_flush(const struct kdbus_pool_slice *slice);
+
+void kdbus_pool_slice_make_public(struct kdbus_pool_slice *slice);
+
+#endif
-- 
2.1.3


* kdbus: add connection, queue handling and message validation code
  2014-11-21  5:02 ` Greg Kroah-Hartman
                   ` (4 preceding siblings ...)
  (?)
@ 2014-11-21  5:02 ` Greg Kroah-Hartman
  -1 siblings, 0 replies; 73+ messages in thread
From: Greg Kroah-Hartman @ 2014-11-21  5:02 UTC (permalink / raw)
  To: arnd, ebiederm, gnomes, teg, jkosina, luto, linux-api, linux-kernel
  Cc: daniel, dh.herrmann, tixxdz, Greg Kroah-Hartman

From: Daniel Mack <daniel@zonque.org>

This patch adds code to create and destroy connections, to validate
incoming messages and to maintain the queue of messages that are
associated with a connection.

Note that connection and queue have a 1:1 relation; the code is split
into two parts only for cleaner separation and better readability.

Signed-off-by: Daniel Mack <daniel@zonque.org>
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Signed-off-by: Djalal Harouni <tixxdz@opendz.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
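Reviewer note (not part of the commit message): the lazy per-user quota
in kdbus_conn_queue_user_quota() may be easier to follow as a userspace
model. The sketch below is only an approximation under stated
assumptions: struct conn_model, queue_user_quota() and the value of
MAX_MSGS_PER_USER are hypothetical stand-ins, and the rounding mimics
what KDBUS_ALIGN8() is assumed to do (round up to a multiple of 8).

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical value; the real KDBUS_CONN_MAX_MSGS_PER_USER is defined elsewhere. */
#define MAX_MSGS_PER_USER 16

/* Hypothetical model of the receiver-side fields used by the quota check. */
struct conn_model {
	unsigned int msg_count;		/* messages currently queued */
	unsigned int *msg_users;	/* per-user counters, indexed by user id */
	unsigned int msg_users_max;	/* number of allocated counters */
};

/* Returns 0 if the message may be queued, negative errno otherwise. */
static int queue_user_quota(struct conn_model *dst, unsigned int user_id)
{
	assert(dst != NULL);

	/* Below the threshold, messages pass through un-accounted. */
	if (dst->msg_count < MAX_MSGS_PER_USER)
		return 0;

	/* Grow the counter array so that user_id becomes a valid index. */
	if (user_id >= dst->msg_users_max) {
		unsigned int n = 8 + ((user_id + 7u) & ~7u);
		unsigned int *users;

		users = realloc(dst->msg_users, n * sizeof(*users));
		if (!users)
			return -ENOMEM;

		/* realloc() does not zero the new tail, so clear it by hand. */
		memset(users + dst->msg_users_max, 0,
		       (n - dst->msg_users_max) * sizeof(*users));
		dst->msg_users = users;
		dst->msg_users_max = n;
	}

	if (dst->msg_users[user_id] >= MAX_MSGS_PER_USER)
		return -ENOBUFS;

	dst->msg_users[user_id]++;
	return 0;
}
```

The point of the model is the ordering: no counter is touched until the
queue already holds one full per-user allotment, so a lightly loaded
bus should never pay for the counter array at all.
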
 ipc/kdbus/connection.c | 1838 ++++++++++++++++++++++++++++++++++++++++++++++++
 ipc/kdbus/connection.h |  188 +++++
 ipc/kdbus/item.c       |  258 +++++++
 ipc/kdbus/item.h       |   41 ++
 ipc/kdbus/message.c    |  444 ++++++++++++
 ipc/kdbus/message.h    |   75 ++
 ipc/kdbus/queue.c      |  608 ++++++++++++++++
 ipc/kdbus/queue.h      |   93 +++
 ipc/kdbus/util.h       |    2 +-
 9 files changed, 3546 insertions(+), 1 deletion(-)
 create mode 100644 ipc/kdbus/connection.c
 create mode 100644 ipc/kdbus/connection.h
 create mode 100644 ipc/kdbus/item.c
 create mode 100644 ipc/kdbus/item.h
 create mode 100644 ipc/kdbus/message.c
 create mode 100644 ipc/kdbus/message.h
 create mode 100644 ipc/kdbus/queue.c
 create mode 100644 ipc/kdbus/queue.h
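
Reviewer note: the timeout scan in kdbus_conn_work() can be read as
"expire what is overdue, remember the earliest deadline still ahead".
The following userspace model is a rough sketch of that loop only;
struct reply_model, scan_replies() and the 'expired' flag (which stands
in for list_del_init() plus the unref) are hypothetical names, and the
notification and locking done by the real code are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the fields the scan looks at. */
struct reply_model {
	uint64_t deadline_ns;	/* absolute deadline */
	bool sync;		/* a synchronous waiter handles its own timeout */
	bool interrupted;	/* sync wait was interrupted; scan takes over */
	bool expired;		/* set here instead of unlinking and unref'ing */
};

/*
 * Expire every tracked reply whose deadline has passed and return the
 * earliest deadline still in the future, or UINT64_MAX if no timer
 * needs to be re-armed.
 */
static uint64_t scan_replies(struct reply_model *r, size_t n, uint64_t now)
{
	uint64_t next = UINT64_MAX;
	size_t i;

	assert(r != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (r[i].expired)
			continue;

		/* Sync replies are timed out by the waiter itself. */
		if (r[i].sync && !r[i].interrupted)
			continue;

		if (r[i].deadline_ns > now) {
			/* Remember the next timeout. */
			if (next > r[i].deadline_ns)
				next = r[i].deadline_ns;
			continue;
		}

		r[i].expired = true;
	}

	return next;
}
```

A caller in this model would re-arm its timer for (next - now) whenever
the return value differs from UINT64_MAX, which mirrors the
schedule_delayed_work() call at the end of the real function.
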

diff --git a/ipc/kdbus/connection.c b/ipc/kdbus/connection.c
new file mode 100644
index 000000000000..73d149eecc25
--- /dev/null
+++ b/ipc/kdbus/connection.c
@@ -0,0 +1,1838 @@
+/*
+ * Copyright (C) 2013-2014 Kay Sievers
+ * Copyright (C) 2013-2014 Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+ * Copyright (C) 2013-2014 Daniel Mack <daniel@zonque.org>
+ * Copyright (C) 2013-2014 David Herrmann <dh.herrmann@gmail.com>
+ * Copyright (C) 2013-2014 Linux Foundation
+ * Copyright (C) 2014 Djalal Harouni
+ *
+ * kdbus is free software; you can redistribute it and/or modify it under
+ * the terms of the GNU Lesser General Public License as published by the
+ * Free Software Foundation; either version 2.1 of the License, or (at
+ * your option) any later version.
+ */
+
+#include <linux/audit.h>
+#include <linux/file.h>
+#include <linux/fs.h>
+#include <linux/hashtable.h>
+#include <linux/idr.h>
+#include <linux/init.h>
+#include <linux/math64.h>
+#include <linux/mm.h>
+#include <linux/module.h>
+#include <linux/mutex.h>
+#include <linux/poll.h>
+#include <linux/sched.h>
+#include <linux/shmem_fs.h>
+#include <linux/sizes.h>
+#include <linux/slab.h>
+#include <linux/syscalls.h>
+
+#include "bus.h"
+#include "connection.h"
+#include "endpoint.h"
+#include "match.h"
+#include "message.h"
+#include "metadata.h"
+#include "names.h"
+#include "domain.h"
+#include "item.h"
+#include "notify.h"
+#include "policy.h"
+#include "util.h"
+#include "queue.h"
+
+#define KDBUS_CONN_ACTIVE_BIAS (INT_MIN + 1)
+
+/**
+ * struct kdbus_conn_reply - an entry of kdbus_conn's list of replies
+ * @kref:		Ref-count of this object
+ * @entry:		The entry of the connection's reply_list
+ * @reply_dst:		The connection the reply will be sent to (method origin)
+ * @queue_entry:	The queue entry item that is prepared by the replying
+ *			connection
+ * @deadline_ns:	The deadline of the reply, in nanoseconds
+ * @cookie:		The cookie of the requesting message
+ * @name_id:		ID of the well-known name the original msg was sent to
+ * @sync:		The reply block is waiting for synchronous I/O
+ * @waiting:		The condition to synchronously wait for
+ * @interrupted:	The sync reply was left in an interrupted state
+ * @err:		The error code for the synchronous reply
+ */
+struct kdbus_conn_reply {
+	struct kref kref;
+	struct list_head entry;
+	struct kdbus_conn *reply_dst;
+	struct kdbus_queue_entry *queue_entry;
+	u64 deadline_ns;
+	u64 cookie;
+	u64 name_id;
+	bool sync:1;
+	bool waiting:1;
+	bool interrupted:1;
+	int err;
+};
+
+static struct kdbus_conn_reply *
+kdbus_conn_reply_new(struct kdbus_conn *reply_dst,
+		     const struct kdbus_msg *msg,
+		     struct kdbus_name_entry *name_entry)
+{
+	bool sync = msg->flags & KDBUS_MSG_FLAGS_SYNC_REPLY;
+	struct kdbus_conn_reply *r;
+	int ret = 0;
+
+	if (atomic_inc_return(&reply_dst->reply_count) >
+	    KDBUS_CONN_MAX_REQUESTS_PENDING) {
+		ret = -EMLINK;
+		goto exit_dec_reply_count;
+	}
+
+	r = kzalloc(sizeof(*r), GFP_KERNEL);
+	if (!r) {
+		ret = -ENOMEM;
+		goto exit_dec_reply_count;
+	}
+
+	kref_init(&r->kref);
+	r->reply_dst = kdbus_conn_ref(reply_dst);
+	r->cookie = msg->cookie;
+	r->name_id = name_entry ? name_entry->name_id : 0;
+	r->deadline_ns = msg->timeout_ns;
+
+	if (sync) {
+		r->sync = true;
+		r->waiting = true;
+	}
+
+exit_dec_reply_count:
+	if (ret < 0) {
+		atomic_dec(&reply_dst->reply_count);
+		return ERR_PTR(ret);
+	}
+
+	return r;
+}
+
+static void __kdbus_conn_reply_free(struct kref *kref)
+{
+	struct kdbus_conn_reply *reply =
+		container_of(kref, struct kdbus_conn_reply, kref);
+
+	atomic_dec(&reply->reply_dst->reply_count);
+	kdbus_conn_unref(reply->reply_dst);
+	kfree(reply);
+}
+
+static struct kdbus_conn_reply*
+kdbus_conn_reply_ref(struct kdbus_conn_reply *r)
+{
+	if (r)
+		kref_get(&r->kref);
+	return r;
+}
+
+static struct kdbus_conn_reply*
+kdbus_conn_reply_unref(struct kdbus_conn_reply *r)
+{
+	if (r)
+		kref_put(&r->kref, __kdbus_conn_reply_free);
+	return NULL;
+}
+
+static void kdbus_conn_reply_sync(struct kdbus_conn_reply *reply, int err)
+{
+	BUG_ON(!reply->sync);
+
+	list_del_init(&reply->entry);
+	reply->waiting = false;
+	reply->err = err;
+	wake_up_interruptible(&reply->reply_dst->wait);
+}
+
+/*
+ * Check for maximum number of messages per individual user. This
+ * should prevent a single user from being able to fill the receiver's
+ * queue.
+ */
+static int kdbus_conn_queue_user_quota(const struct kdbus_conn *conn_src,
+				       struct kdbus_conn *conn_dst,
+				       struct kdbus_queue_entry *entry)
+{
+	struct kdbus_domain_user *user;
+
+	if (!conn_src)
+		return 0;
+
+	/*
+	 * Per-user accounting can be expensive if we have many different
+	 * users on the bus. Allow one set of messages to pass through
+	 * un-accounted. Only once we hit that limit, we start accounting.
+	 */
+	if (conn_dst->queue.msg_count < KDBUS_CONN_MAX_MSGS_PER_USER)
+		return 0;
+
+	user = conn_src->user;
+
+	/* extend array to store the user message counters */
+	if (user->idr >= conn_dst->msg_users_max) {
+		unsigned int *users;
+		unsigned int i;
+
+		i = 8 + KDBUS_ALIGN8(user->idr);
+		users = krealloc(conn_dst->msg_users, i * sizeof(unsigned int),
+				 GFP_KERNEL | __GFP_ZERO);
+		if (!users)
+			return -ENOMEM;
+
+		conn_dst->msg_users = users;
+		conn_dst->msg_users_max = i;
+	}
+
+	if (conn_dst->msg_users[user->idr] >= KDBUS_CONN_MAX_MSGS_PER_USER)
+		return -ENOBUFS;
+
+	conn_dst->msg_users[user->idr]++;
+	entry->user = kdbus_domain_user_ref(user);
+	return 0;
+}
+
+static void kdbus_conn_work(struct work_struct *work)
+{
+	struct kdbus_conn *conn;
+	struct kdbus_conn_reply *reply, *reply_tmp;
+	u64 deadline = ~0ULL;
+	struct timespec64 ts;
+	u64 now;
+
+	conn = container_of(work, struct kdbus_conn, work.work);
+	ktime_get_ts64(&ts);
+	now = timespec64_to_ns(&ts);
+
+	mutex_lock(&conn->lock);
+	if (!kdbus_conn_active(conn)) {
+		mutex_unlock(&conn->lock);
+		return;
+	}
+
+	list_for_each_entry_safe(reply, reply_tmp, &conn->reply_list, entry) {
+		/*
+		 * If the reply block is waiting for synchronous I/O,
+		 * the timeout is handled by wait_event_*_timeout(),
+		 * so we don't have to care for it here.
+		 */
+		if (reply->sync && !reply->interrupted)
+			continue;
+
+		if (reply->deadline_ns > now) {
+			/* remember next timeout */
+			if (deadline > reply->deadline_ns)
+				deadline = reply->deadline_ns;
+
+			continue;
+		}
+
+		/*
+		 * A zero deadline means the connection died, was
+		 * cleaned up already and the notification was sent.
+		 * Don't send notifications for reply trackers that were
+		 * left in an interrupted syscall state.
+		 */
+		if (reply->deadline_ns != 0 && !reply->interrupted)
+			kdbus_notify_reply_timeout(conn->ep->bus,
+						   reply->reply_dst->id,
+						   reply->cookie);
+
+		list_del_init(&reply->entry);
+		kdbus_conn_reply_unref(reply);
+	}
+
+	/* rearm delayed work with next timeout */
+	if (deadline != ~0ULL)
+		schedule_delayed_work(&conn->work,
+				      nsecs_to_jiffies(deadline - now));
+
+	mutex_unlock(&conn->lock);
+
+	kdbus_notify_flush(conn->ep->bus);
+}
+
+/**
+ * kdbus_cmd_msg_recv() - receive a message from the queue
+ * @conn:		Connection to work on
+ * @recv:		The command as passed in by the ioctl
+ *
+ * Return: 0 on success, negative errno on failure
+ */
+int kdbus_cmd_msg_recv(struct kdbus_conn *conn,
+		       struct kdbus_cmd_recv *recv)
+{
+	struct kdbus_queue_entry *entry = NULL;
+	unsigned int lost_count;
+	int ret = 0;
+
+	if (recv->offset > 0)
+		return -EINVAL;
+
+	mutex_lock(&conn->lock);
+	entry = kdbus_queue_entry_peek(&conn->queue, recv->priority,
+				       recv->flags & KDBUS_RECV_USE_PRIORITY);
+	if (IS_ERR(entry)) {
+		ret = PTR_ERR(entry);
+		goto exit_unlock;
+	}
+
+	/*
+	 * Make sure to never install fds into a connection that has
+	 * refused to receive any.
+	 */
+	if (WARN_ON(!(conn->flags & KDBUS_HELLO_ACCEPT_FD) &&
+		    entry->fds_count > 0)) {
+		ret = -EINVAL;
+		goto exit_unlock;
+	}
+
+	/* just drop the message */
+	if (recv->flags & KDBUS_RECV_DROP) {
+		bool reply_found = false;
+
+		if (entry->reply) {
+			struct kdbus_conn_reply *r;
+
+			/*
+			 * Walk the list of pending replies and see if the
+			 * one attached to this entry item is still there.
+			 * It might have been removed by an incoming reply,
+			 * and we currently don't track reply entries in that
+			 * direction in order to prevent potentially dangling
+			 * pointers.
+			 */
+			list_for_each_entry(r, &conn->reply_list, entry) {
+				if (r == entry->reply) {
+					reply_found = true;
+					break;
+				}
+			}
+		}
+
+		if (reply_found) {
+			if (entry->reply->sync) {
+				kdbus_conn_reply_sync(entry->reply, -EPIPE);
+			} else {
+				list_del_init(&entry->reply->entry);
+				kdbus_conn_reply_unref(entry->reply);
+				kdbus_notify_reply_dead(conn->ep->bus,
+							entry->src_id,
+							entry->cookie);
+			}
+		}
+
+		kdbus_queue_entry_remove(conn, entry);
+		kdbus_pool_slice_free(entry->slice);
+
+		/* Free the resources of this entry */
+		kdbus_queue_entry_free(entry);
+
+		goto exit_unlock;
+	}
+
+	/*
+	 * If there have been lost broadcast messages, report the number
+	 * in the overloaded recv->dropped_msgs field and return -EOVERFLOW.
+	 */
+	lost_count = atomic_read(&conn->lost_count);
+	if (lost_count) {
+		recv->dropped_msgs = lost_count;
+		atomic_sub(lost_count, &conn->lost_count);
+		ret = -EOVERFLOW;
+		goto exit_unlock;
+	}
+
+	/* Give the offset back to the caller. */
+	recv->offset = kdbus_pool_slice_offset(entry->slice);
+
+	/*
+	 * Just return the location of the next message. Do not install
+	 * file descriptors or anything else. This is usually used to
+	 * determine the sender of the next queued message.
+	 *
+	 * File descriptor numbers referenced in the message items
+	 * are undefined, they are only valid with the full receive
+	 * not with peek.
+	 */
+	if (recv->flags & KDBUS_RECV_PEEK) {
+		kdbus_pool_slice_flush(entry->slice);
+		goto exit_unlock;
+	}
+
+	ret = kdbus_queue_entry_install(entry);
+	kdbus_pool_slice_make_public(entry->slice);
+	kdbus_queue_entry_remove(conn, entry);
+	kdbus_queue_entry_free(entry);
+
+exit_unlock:
+	mutex_unlock(&conn->lock);
+	kdbus_notify_flush(conn->ep->bus);
+	return ret;
+}
+
+/**
+ * kdbus_conn_reply_find() - Find the corresponding reply object
+ * @conn_replying:	The replying connection
+ * @conn_reply_dst:	The connection the reply will be sent to
+ *			(method origin)
+ * @cookie:		The cookie of the requesting message
+ *
+ * Lookup a reply object that should be sent as a reply by
+ * @conn_replying to @conn_reply_dst with the given cookie.
+ *
+ * For optimizations, callers should first check 'reply_count' of
+ * @conn_reply_dst to see if the connection has issued any requests
+ * that are waiting for replies, before calling this function.
+ *
+ * Return: the corresponding reply object or NULL if not found
+ */
+static struct kdbus_conn_reply *
+kdbus_conn_reply_find(struct kdbus_conn *conn_replying,
+		      struct kdbus_conn *conn_reply_dst,
+		      uint64_t cookie)
+{
+	struct kdbus_conn_reply *r;
+	struct kdbus_conn_reply *reply = NULL;
+
+	list_for_each_entry(r, &conn_replying->reply_list, entry) {
+		if (r->reply_dst == conn_reply_dst &&
+		    r->cookie == cookie) {
+			reply = r;
+			break;
+		}
+	}
+
+	return reply;
+}
+
+/**
+ * kdbus_cmd_msg_cancel() - cancel all pending sync requests
+ *			    with the given cookie
+ * @conn:		The connection
+ * @cookie:		The cookie
+ *
+ * Return: 0 on success, or -ENOENT if no pending request with that
+ * cookie was found.
+ */
+int kdbus_cmd_msg_cancel(struct kdbus_conn *conn,
+			 u64 cookie)
+{
+	struct kdbus_conn_reply *reply;
+	struct kdbus_conn *c;
+	int ret = -ENOENT;
+	int i;
+
+	if (atomic_read(&conn->reply_count) == 0)
+		return -ENOENT;
+
+	/* lock order: domain -> bus -> ep -> names -> conn */
+	down_read(&conn->ep->bus->conn_rwlock);
+	hash_for_each(conn->ep->bus->conn_hash, i, c, hentry) {
+		if (c == conn)
+			continue;
+
+		mutex_lock(&c->lock);
+		reply = kdbus_conn_reply_find(c, conn, cookie);
+		if (reply && reply->sync) {
+			kdbus_conn_reply_sync(reply, -ECANCELED);
+			ret = 0;
+		}
+		mutex_unlock(&c->lock);
+	}
+	up_read(&conn->ep->bus->conn_rwlock);
+
+	return ret;
+}
+
+static int kdbus_conn_check_access(struct kdbus_ep *ep,
+				   const struct kdbus_msg *msg,
+				   struct kdbus_conn *conn_src,
+				   struct kdbus_conn *conn_dst,
+				   struct kdbus_conn_reply **reply_wake)
+{
+	bool allowed = false;
+
+	/*
+	 * Walk the conn_src's list of expected replies. If there's any
+	 * matching entry, allow the message to be sent, and remove it.
+	 *
+	 * If conn_dst did not issue any previous request or if the
+	 * request was canceled, there is nothing to do; fall back to
+	 * a normal permission check.
+	 */
+	if (reply_wake && msg->cookie_reply > 0 &&
+	    atomic_read(&conn_dst->reply_count) > 0) {
+		struct kdbus_conn_reply *r;
+
+		mutex_lock(&conn_src->lock);
+		r = kdbus_conn_reply_find(conn_src, conn_dst,
+					  msg->cookie_reply);
+		if (r) {
+			list_del_init(&r->entry);
+			if (r->sync)
+				*reply_wake = kdbus_conn_reply_ref(r);
+			else
+				kdbus_conn_reply_unref(r);
+
+			allowed = true;
+		}
+		mutex_unlock(&conn_src->lock);
+	}
+
+	if (allowed)
+		return 0;
+
+	/* ... otherwise, ask the policy DBs for permission */
+	return kdbus_ep_policy_check_talk_access(ep, conn_src, conn_dst);
+}
+
+/* Callers should take the conn_dst lock */
+static struct kdbus_queue_entry *
+kdbus_conn_entry_make(struct kdbus_conn *conn_src,
+		      struct kdbus_conn *conn_dst,
+		      const struct kdbus_kmsg *kmsg)
+{
+	struct kdbus_queue_entry *entry;
+
+	/* The remote connection was disconnected */
+	if (!kdbus_conn_active(conn_dst))
+		return ERR_PTR(-ECONNRESET);
+
+	/* The connection does not accept file descriptors */
+	if (!(conn_dst->flags & KDBUS_HELLO_ACCEPT_FD) && kmsg->fds_count > 0)
+		return ERR_PTR(-ECOMM);
+
+	entry = kdbus_queue_entry_alloc(conn_src, conn_dst, kmsg);
+	if (IS_ERR(entry))
+		return entry;
+
+	return entry;
+}
+
+/*
+ * Synchronously responding to a message, allocate a queue entry
+ * and attach it to the reply tracking object.
+ * The connection's queue will never get to see it.
+ */
+static int kdbus_conn_entry_sync_attach(struct kdbus_conn *conn_src,
+					struct kdbus_conn *conn_dst,
+					const struct kdbus_kmsg *kmsg,
+					struct kdbus_conn_reply *reply_wake)
+{
+	struct kdbus_queue_entry *entry;
+	int remote_ret;
+	int ret = 0;
+
+	mutex_lock(&conn_dst->lock);
+
+	/*
+	 * If we are still waiting then proceed, allocate a queue
+	 * entry and attach it to the reply object
+	 */
+	if (reply_wake->waiting) {
+		entry = kdbus_conn_entry_make(conn_src, conn_dst, kmsg);
+		if (IS_ERR(entry))
+			ret = PTR_ERR(entry);
+		else
+			/* Attach the entry to the reply object */
+			reply_wake->queue_entry = entry;
+	} else {
+		ret = -ECONNRESET;
+	}
+
+	/*
+	 * Update the reply object and wake up remote peer only
+	 * on appropriate return codes
+	 *
+	 * * -ECOMM: if the replying connection failed with -ECOMM
+	 *           then wakeup remote peer with -EREMOTEIO
+	 *
+	 *           We do this to differentiate between -ECOMM errors
+	 *           from the original sender perspective:
+	 *           -ECOMM error during the sync send and
+	 *           -ECOMM error during the sync reply, this last
+	 *           one is rewritten to -EREMOTEIO
+	 *
+	 * * Wake up on all other return codes.
+	 */
+	remote_ret = ret;
+
+	if (ret == -ECOMM)
+		remote_ret = -EREMOTEIO;
+
+	kdbus_conn_reply_sync(reply_wake, remote_ret);
+	kdbus_conn_reply_unref(reply_wake);
+
+	mutex_unlock(&conn_dst->lock);
+
+	return ret;
+}
+
+/**
+ * kdbus_conn_entry_insert() - enqueue a message into the receiver's pool
+ * @conn_src:		The sending connection
+ * @conn_dst:		The connection to queue into
+ * @kmsg:		The kmsg to queue
+ * @reply:		The reply tracker to attach to the queue entry
+ *
+ * Return: 0 on success, negative errno on failure.
+ */
+int kdbus_conn_entry_insert(struct kdbus_conn *conn_src,
+			    struct kdbus_conn *conn_dst,
+			    const struct kdbus_kmsg *kmsg,
+			    struct kdbus_conn_reply *reply)
+{
+	struct kdbus_queue_entry *entry;
+	int ret;
+
+	mutex_lock(&conn_dst->lock);
+
+	/* limit the maximum number of queued messages */
+	if (conn_dst->queue.msg_count > KDBUS_CONN_MAX_MSGS) {
+		ret = -ENOBUFS;
+		goto exit_unlock;
+	}
+
+	/* Get a queue entry for src and dst pairs */
+	entry = kdbus_conn_entry_make(conn_src, conn_dst, kmsg);
+	if (IS_ERR(entry)) {
+		ret = PTR_ERR(entry);
+		goto exit_unlock;
+	}
+
+	/* limit the number of queued messages from the same individual user */
+	ret = kdbus_conn_queue_user_quota(conn_src, conn_dst, entry);
+	if (ret < 0)
+		goto exit_queue_free;
+
+	/*
+	 * Remember the reply associated with this queue entry, so we can
+	 * move the reply entry's connection when a connection moves from an
+	 * activator to an implementor.
+	 */
+	entry->reply = reply;
+
+	if (reply) {
+		list_add(&reply->entry, &conn_dst->reply_list);
+		if (!reply->sync)
+			schedule_delayed_work(&conn_dst->work, 0);
+	}
+
+	/* link the message into the receiver's queue */
+	kdbus_queue_entry_add(&conn_dst->queue, entry);
+	mutex_unlock(&conn_dst->lock);
+
+	/* wake up poll() */
+	wake_up_interruptible(&conn_dst->wait);
+	return 0;
+
+exit_queue_free:
+	kdbus_queue_entry_free(entry);
+exit_unlock:
+	mutex_unlock(&conn_dst->lock);
+	return ret;
+}
+
+static void kdbus_conn_eavesdrop(struct kdbus_bus *bus,
+				 struct kdbus_conn *conn,
+				 struct kdbus_kmsg *kmsg)
+{
+	struct kdbus_conn *c;
+	int ret;
+
+	/*
+	 * Monitor connections get all messages; ignore possible errors
+	 * when sending messages to monitor connections.
+	 */
+
+	down_read(&bus->conn_rwlock);
+	list_for_each_entry(c, &bus->monitors_list, monitor_entry) {
+		/*
+		 * The first monitor which requests additional
+		 * metadata causes the message to carry it; all
+		 * monitors after that will see all of the added
+		 * data, even when they did not ask for it.
+		 */
+		if (conn) {
+			ret = kdbus_kmsg_attach_metadata(kmsg, conn, c);
+			if (ret < 0)
+				break;
+		}
+
+		kdbus_conn_entry_insert(NULL, c, kmsg, NULL);
+	}
+	up_read(&bus->conn_rwlock);
+}
+
+static int kdbus_conn_wait_reply(struct kdbus_conn *conn_src,
+				 struct kdbus_conn *conn_dst,
+				 struct kdbus_msg *msg,
+				 struct kdbus_conn_reply *reply_wait,
+				 u64 timeout_ns)
+{
+	struct kdbus_queue_entry *entry;
+	int r, ret;
+
+	/*
+	 * Block until the reply arrives. reply_wait is left untouched
+	 * by the timeout scans that might be conducted for other,
+	 * asynchronous replies of conn_src.
+	 */
+	r = wait_event_interruptible_timeout(reply_wait->reply_dst->wait,
+		!reply_wait->waiting || !kdbus_conn_active(conn_src),
+		nsecs_to_jiffies(timeout_ns));
+	if (r < 0) {
+		/*
+		 * Interrupted system call. Unref the reply object, and
+		 * pass the return value down the chain. Mark the reply as
+		 * interrupted, so the cleanup work can remove it, but do
+		 * not unlink it from the list. Once the syscall restarts,
+		 * we'll pick it up and wait on it again.
+		 */
+		mutex_lock(&conn_dst->lock);
+		reply_wait->interrupted = true;
+		schedule_delayed_work(&conn_dst->work, 0);
+		mutex_unlock(&conn_dst->lock);
+
+		return r;
+	}
+
+	if (r == 0)
+		ret = -ETIMEDOUT;
+	else if (!kdbus_conn_active(conn_src))
+		ret = -ECONNRESET;
+	else
+		ret = reply_wait->err;
+
+	mutex_lock(&conn_dst->lock);
+	list_del_init(&reply_wait->entry);
+	mutex_unlock(&conn_dst->lock);
+
+	mutex_lock(&conn_src->lock);
+	reply_wait->waiting = false;
+	entry = reply_wait->queue_entry;
+	if (entry) {
+		if (ret == 0)
+			ret = kdbus_queue_entry_install(entry);
+
+		msg->offset_reply = kdbus_pool_slice_offset(entry->slice);
+		kdbus_pool_slice_make_public(entry->slice);
+		kdbus_queue_entry_free(entry);
+	}
+	mutex_unlock(&conn_src->lock);
+
+	kdbus_conn_reply_unref(reply_wait);
+
+	return ret;
+}
+
+/**
+ * kdbus_conn_kmsg_send() - send a message
+ * @ep:			Endpoint to send from
+ * @conn_src:		Connection, kernel-generated messages do not have one
+ * @kmsg:		Message to send
+ *
+ * Return: 0 on success, negative errno on failure
+ */
+int kdbus_conn_kmsg_send(struct kdbus_ep *ep,
+			 struct kdbus_conn *conn_src,
+			 struct kdbus_kmsg *kmsg)
+{
+	struct kdbus_conn_reply *reply_wait = NULL;
+	struct kdbus_conn_reply *reply_wake = NULL;
+	struct kdbus_name_entry *name_entry = NULL;
+	struct kdbus_msg *msg = &kmsg->msg;
+	struct kdbus_conn *conn_dst = NULL;
+	struct kdbus_bus *bus = ep->bus;
+	bool sync = msg->flags & KDBUS_MSG_FLAGS_SYNC_REPLY;
+	int ret = 0;
+
+	/* assign domain-global message sequence number */
+	BUG_ON(kmsg->seq > 0);
+	kmsg->seq = atomic64_inc_return(&bus->domain->msg_seq_last);
+
+	/* non-kernel senders append credentials/metadata */
+	if (conn_src) {
+		/*
+		 * If a connection has installed faked credentials when it was
+		 * created, make sure only those are sent out as attachments
+		 * of messages, and nothing that is gathered or retrieved from
+		 * 'current' at the time of sending.
+		 *
+		 * Hence, in such cases, duplicate the connection's owner_meta,
+		 * and take care not to augment it by attaching any new items.
+		 */
+		if (conn_src->owner_meta)
+			kmsg->meta = kdbus_meta_dup(conn_src->owner_meta);
+		else
+			kmsg->meta = kdbus_meta_new();
+
+		if (IS_ERR(kmsg->meta)) {
+			ret = PTR_ERR(kmsg->meta);
+			kmsg->meta = NULL;
+			return ret;
+		}
+	}
+
+	if (msg->dst_id == KDBUS_DST_ID_BROADCAST) {
+		kdbus_bus_broadcast(bus, conn_src, kmsg);
+		return 0;
+	}
+
+	if (kmsg->dst_name) {
+		name_entry = kdbus_name_lock(bus->name_registry,
+					     kmsg->dst_name);
+		if (!name_entry)
+			return -ESRCH;
+
+		/*
+		 * If both a name and a connection ID are given as destination
+		 * of a message, check that the currently owning connection of
+		 * the name matches the specified ID.
+		 * This way, we allow userspace to send the message to a
+		 * specific connection by ID only if the connection currently
+		 * owns the given name.
+		 */
+		if (msg->dst_id != KDBUS_DST_ID_NAME && (!name_entry->conn ||
+		    msg->dst_id != name_entry->conn->id)) {
+			ret = -EREMCHG;
+			goto exit_name_unlock;
+		}
+
+		if (!name_entry->conn && name_entry->activator)
+			conn_dst = kdbus_conn_ref(name_entry->activator);
+		else
+			conn_dst = kdbus_conn_ref(name_entry->conn);
+
+		if ((msg->flags & KDBUS_MSG_FLAGS_NO_AUTO_START) &&
+		     kdbus_conn_is_activator(conn_dst)) {
+			ret = -EADDRNOTAVAIL;
+			goto exit_unref;
+		}
+	} else {
+		/* unicast message to unique name */
+		conn_dst = kdbus_bus_find_conn_by_id(bus, msg->dst_id);
+		if (!conn_dst)
+			return -ENXIO;
+
+		/*
+		 * Special-purpose connections are not allowed to be addressed
+		 * via their unique IDs.
+		 */
+		if (!kdbus_conn_is_ordinary(conn_dst)) {
+			ret = -ENXIO;
+			goto exit_unref;
+		}
+	}
+
+	/*
+	 * Record the sequence number of the registered name;
+	 * it will be passed on to the queue, in case messages
+	 * addressed to a name need to be moved from or to
+	 * activator connections of the same name.
+	 */
+	if (name_entry)
+		kmsg->dst_name_id = name_entry->name_id;
+
+	if (conn_src) {
+		/*
+		 * If we got here due to an interrupted system call, our reply
+		 * wait object is still queued on conn_dst, with the former
+		 * cookie. Look it up, and in case it exists, go dormant right
+		 * away again, and don't queue the message again.
+		 *
+		 * Before looking up any reply object, we also need to
+		 * make sure that conn_src really did issue a request
+		 * and that the request was not canceled on the way;
+		 * reply_count tracks the requests still outstanding.
+		 */
+		if (sync && atomic_read(&conn_src->reply_count) > 0) {
+			mutex_lock(&conn_dst->lock);
+			reply_wait = kdbus_conn_reply_find(conn_dst,
+							   conn_src,
+							   kmsg->msg.cookie);
+			if (reply_wait) {
+				/* It was interrupted */
+				if (reply_wait->interrupted)
+					reply_wait->interrupted = false;
+				else
+					reply_wait = NULL;
+			}
+			mutex_unlock(&conn_dst->lock);
+
+			if (reply_wait)
+				goto wait_sync;
+		}
+
+		ret = kdbus_kmsg_attach_metadata(kmsg, conn_src, conn_dst);
+		if (ret < 0)
+			goto exit_unref;
+
+		if (msg->flags & KDBUS_MSG_FLAGS_EXPECT_REPLY) {
+			ret = kdbus_conn_check_access(ep, msg, conn_src,
+						      conn_dst, NULL);
+			if (ret < 0)
+				goto exit_unref;
+
+			reply_wait = kdbus_conn_reply_new(conn_src, msg,
+							  name_entry);
+			if (IS_ERR(reply_wait)) {
+				ret = PTR_ERR(reply_wait);
+				goto exit_unref;
+			}
+		} else {
+			ret = kdbus_conn_check_access(ep, msg, conn_src,
+						      conn_dst, &reply_wake);
+			if (ret < 0)
+				goto exit_unref;
+		}
+	}
+
+	if (reply_wake) {
+		/*
+		 * If we're synchronously responding to a message, allocate a
+		 * queue item and attach it to the reply tracking object.
+		 * The connection's queue will never get to see it.
+		 */
+		ret = kdbus_conn_entry_sync_attach(conn_src, conn_dst,
+						   kmsg, reply_wake);
+		if (ret < 0)
+			goto exit_unref;
+	} else {
+		/*
+		 * Otherwise, put it in the queue and wait for the connection
+		 * to dequeue and receive the message.
+		 */
+		ret = kdbus_conn_entry_insert(conn_src, conn_dst,
+					      kmsg, reply_wait);
+		if (ret < 0) {
+			if (reply_wait)
+				kdbus_conn_reply_unref(reply_wait);
+			goto exit_unref;
+		}
+	}
+
+	/* forward to monitors */
+	kdbus_conn_eavesdrop(bus, conn_src, kmsg);
+
+wait_sync:
+	/* no reason to keep names locked for replies */
+	name_entry = kdbus_name_unlock(bus->name_registry, name_entry);
+
+	if (sync) {
+		struct timespec64 ts;
+		u64 now, timeout;
+
+		BUG_ON(!reply_wait);
+
+		ktime_get_ts64(&ts);
+		now = timespec64_to_ns(&ts);
+
+		if (unlikely(msg->timeout_ns <= now))
+			timeout = 0;
+		else
+			timeout = msg->timeout_ns - now;
+
+		ret = kdbus_conn_wait_reply(conn_src, conn_dst, msg,
+					    reply_wait, timeout);
+	}
+
+exit_unref:
+	kdbus_conn_unref(conn_dst);
+exit_name_unlock:
+	kdbus_name_unlock(bus->name_registry, name_entry);
+
+	return ret;
+}
+
+/**
+ * kdbus_conn_disconnect() - disconnect a connection
+ * @conn:		The connection to disconnect
+ * @ensure_queue_empty:	Flag to indicate if the call should fail in
+ *			case the connection's message list is not
+ *			empty
+ *
+ * If @ensure_queue_empty is true, and the connection has pending messages,
+ * -EBUSY is returned.
+ *
+ * Return: 0 on success, negative errno on failure
+ */
+int kdbus_conn_disconnect(struct kdbus_conn *conn, bool ensure_queue_empty)
+{
+	struct kdbus_conn_reply *reply, *reply_tmp;
+	struct kdbus_queue_entry *entry, *tmp;
+	LIST_HEAD(reply_list);
+
+	mutex_lock(&conn->lock);
+	if (!kdbus_conn_active(conn)) {
+		mutex_unlock(&conn->lock);
+		return -EALREADY;
+	}
+
+	if (ensure_queue_empty && !list_empty(&conn->queue.msg_list)) {
+		mutex_unlock(&conn->lock);
+		return -EBUSY;
+	}
+
+	atomic_add(KDBUS_CONN_ACTIVE_BIAS, &conn->active);
+	mutex_unlock(&conn->lock);
+
+	wake_up_interruptible(&conn->wait);
+
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+	rwsem_acquire(&conn->dep_map, 0, 0, _RET_IP_);
+	if (atomic_read(&conn->active) != KDBUS_CONN_ACTIVE_BIAS)
+		lock_contended(&conn->dep_map, _RET_IP_);
+#endif
+
+	wait_event(conn->wait,
+		   atomic_read(&conn->active) == KDBUS_CONN_ACTIVE_BIAS);
+
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+	lock_acquired(&conn->dep_map, _RET_IP_);
+	rwsem_release(&conn->dep_map, 1, _RET_IP_);
+#endif
+
+	cancel_delayed_work_sync(&conn->work);
+
+	/* lock order: domain -> bus -> ep -> names -> conn */
+	mutex_lock(&conn->ep->lock);
+	down_write(&conn->ep->bus->conn_rwlock);
+
+	/* remove from bus and endpoint */
+	hash_del(&conn->hentry);
+	list_del(&conn->monitor_entry);
+	list_del(&conn->ep_entry);
+
+	up_write(&conn->ep->bus->conn_rwlock);
+	mutex_unlock(&conn->ep->lock);
+
+	/*
+	 * Remove all names associated with this connection; this possibly
+	 * moves queued messages back to the activator connection.
+	 */
+	kdbus_name_remove_by_conn(conn->ep->bus->name_registry, conn);
+
+	/* if we die while other connections wait for our reply, notify them */
+	mutex_lock(&conn->lock);
+	list_for_each_entry_safe(entry, tmp, &conn->queue.msg_list, entry) {
+		if (entry->reply)
+			kdbus_notify_reply_dead(conn->ep->bus, entry->src_id,
+						entry->cookie);
+
+		kdbus_queue_entry_remove(conn, entry);
+		kdbus_pool_slice_free(entry->slice);
+		kdbus_queue_entry_free(entry);
+	}
+	list_splice_init(&conn->reply_list, &reply_list);
+	mutex_unlock(&conn->lock);
+
+	list_for_each_entry_safe(reply, reply_tmp, &reply_list, entry) {
+		if (reply->sync) {
+			kdbus_conn_reply_sync(reply, -EPIPE);
+			continue;
+		}
+
+		/* send a 'connection dead' notification */
+		kdbus_notify_reply_dead(conn->ep->bus, reply->reply_dst->id,
+					reply->cookie);
+
+		list_del(&reply->entry);
+		kdbus_conn_reply_unref(reply);
+	}
+
+	kdbus_notify_id_change(conn->ep->bus, KDBUS_ITEM_ID_REMOVE,
+			       conn->id, conn->flags);
+
+	kdbus_notify_flush(conn->ep->bus);
+
+	return 0;
+}
+
+/**
+ * kdbus_conn_active() - connection is not disconnected
+ * @conn:		Connection to check
+ *
+ * Return true if the connection has not been disconnected yet. Note that a
+ * connection might be disconnected asynchronously, unless you hold the
+ * connection lock. If that's not suitable for you, see kdbus_conn_acquire() to
+ * suppress connection shutdown for a short period.
+ *
+ * Return: true if the connection is still active
+ */
+bool kdbus_conn_active(const struct kdbus_conn *conn)
+{
+	return atomic_read(&conn->active) >= 0;
+}
+
+/**
+ * kdbus_conn_purge_policy_cache() - flush all cached policy entries that
+ *				     refer to a connection
+ * @conn:	Connection to check
+ */
+void kdbus_conn_purge_policy_cache(struct kdbus_conn *conn)
+{
+	kdbus_policy_purge_cache(&conn->ep->policy_db, conn);
+	kdbus_policy_purge_cache(&conn->ep->bus->policy_db, conn);
+}
+
+static void __kdbus_conn_free(struct kref *kref)
+{
+	struct kdbus_conn *conn = container_of(kref, struct kdbus_conn, kref);
+
+	BUG_ON(kdbus_conn_active(conn));
+	BUG_ON(delayed_work_pending(&conn->work));
+	BUG_ON(!list_empty(&conn->queue.msg_list));
+	BUG_ON(!list_empty(&conn->names_list));
+	BUG_ON(!list_empty(&conn->names_queue_list));
+	BUG_ON(!list_empty(&conn->reply_list));
+
+	atomic_dec(&conn->user->connections);
+	kdbus_domain_user_unref(conn->user);
+
+	kdbus_conn_purge_policy_cache(conn);
+	kdbus_policy_remove_owner(&conn->ep->bus->policy_db, conn);
+
+	kdbus_meta_free(conn->owner_meta);
+	kdbus_match_db_free(conn->match_db);
+	kdbus_pool_free(conn->pool);
+	kdbus_ep_unref(conn->ep);
+	put_cred(conn->cred);
+	kfree(conn->name);
+	kfree(conn);
+}
+
+/**
+ * kdbus_conn_ref() - take a connection reference
+ * @conn:		Connection
+ *
+ * Return: the connection itself
+ */
+struct kdbus_conn *kdbus_conn_ref(struct kdbus_conn *conn)
+{
+	kref_get(&conn->kref);
+	return conn;
+}
+
+/**
+ * kdbus_conn_unref() - drop a connection reference
+ * @conn:		Connection (may be NULL)
+ *
+ * When the last reference is dropped, the connection's internal structure
+ * is freed.
+ *
+ * Return: NULL
+ */
+struct kdbus_conn *kdbus_conn_unref(struct kdbus_conn *conn)
+{
+	if (conn)
+		kref_put(&conn->kref, __kdbus_conn_free);
+	return NULL;
+}
+
+/**
+ * kdbus_conn_acquire() - acquire an active connection reference
+ * @conn:		Connection
+ *
+ * Users can close a connection via KDBUS_BYEBYE (or by destroying the
+ * endpoint/bus/...) at any time. Whenever this happens, we should deny any
+ * user-visible action on this connection and signal ECONNRESET instead.
+ * To avoid testing for connection availability every time you take the
+ * connection-lock, you can acquire a connection for short periods.
+ *
+ * By calling kdbus_conn_acquire(), you gain an "active reference" to the
+ * connection. You must also hold a regular reference at any time! As long as
+ * you hold the active-ref, the connection will not be shut down. However, if
+ * the connection was shut down, you can never acquire an active-ref again.
+ *
+ * kdbus_conn_disconnect() disables the connection and then waits for all active
+ * references to be dropped. It will also wake up any pending operation.
+ * However, you must not sleep for an indefinite period while holding an
+ * active-reference. Otherwise, kdbus_conn_disconnect() might stall. If you need
+ * to sleep for an indefinite period, either release the reference and try to
+ * acquire it again after waking up, or make kdbus_conn_disconnect() wake up
+ * your wait-queue.
+ *
+ * Return: 0 on success, negative error code on failure.
+ */
+int kdbus_conn_acquire(struct kdbus_conn *conn)
+{
+	if (!atomic_inc_unless_negative(&conn->active))
+		return -ECONNRESET;
+
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+	rwsem_acquire_read(&conn->dep_map, 0, 1, _RET_IP_);
+#endif
+
+	return 0;
+}
+
+/**
+ * kdbus_conn_release() - release an active connection reference
+ * @conn:		Connection
+ *
+ * This releases an active reference that has been acquired via
+ * kdbus_conn_acquire(). If the connection was already disabled and this is the
+ * last active-ref that is dropped, the disconnect-waiter will be woken up and
+ * properly close the connection.
+ */
+void kdbus_conn_release(struct kdbus_conn *conn)
+{
+	int v;
+
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+	rwsem_release(&conn->dep_map, 1, _RET_IP_);
+#endif
+
+	v = atomic_dec_return(&conn->active);
+	if (v != KDBUS_CONN_ACTIVE_BIAS)
+		return;
+
+	wake_up_all(&conn->wait);
+}
+
+/**
+ * kdbus_conn_move_messages() - move messages from one connection to another
+ * @conn_dst:		Connection to copy to
+ * @conn_src:		Connection to copy from
+ * @name_id:		Filter for the sequence number of the registered
+ *			name, 0 means no filtering.
+ *
+ * Move all messages from one connection to another. This is used when
+ * an implementor connection is taking over/giving back a well-known name
+ * from/to an activator connection.
+ *
+ * Return: 0 on success, negative errno on failure.
+ */
+int kdbus_conn_move_messages(struct kdbus_conn *conn_dst,
+			     struct kdbus_conn *conn_src,
+			     u64 name_id)
+{
+	struct kdbus_queue_entry *q, *q_tmp;
+	struct kdbus_conn_reply *r, *r_tmp;
+	LIST_HEAD(reply_list);
+	LIST_HEAD(msg_list);
+	int ret = 0;
+
+	BUG_ON(!mutex_is_locked(&conn_dst->ep->bus->lock));
+	BUG_ON(conn_src == conn_dst);
+
+	/* remove all messages from the source */
+	mutex_lock(&conn_src->lock);
+	list_for_each_entry_safe(r, r_tmp, &conn_src->reply_list, entry) {
+		/* filter messages for a specific name */
+		if (name_id > 0 && r->name_id != name_id)
+			continue;
+
+		list_move_tail(&r->entry, &reply_list);
+	}
+	list_for_each_entry_safe(q, q_tmp, &conn_src->queue.msg_list, entry) {
+		/* filter messages for a specific name */
+		if (name_id > 0 && q->dst_name_id != name_id)
+			continue;
+
+		kdbus_queue_entry_remove(conn_src, q);
+
+		if (!(conn_dst->flags & KDBUS_HELLO_ACCEPT_FD) &&
+		    q->fds_count > 0) {
+			atomic_inc(&conn_dst->lost_count);
+			continue;
+		}
+
+		list_add_tail(&q->entry, &msg_list);
+	}
+	mutex_unlock(&conn_src->lock);
+
+	/* insert messages into destination */
+	mutex_lock(&conn_dst->lock);
+	if (!kdbus_conn_active(conn_dst)) {
+		struct kdbus_conn_reply *r, *r_tmp;
+
+		/* our destination connection died, just drop all messages */
+		mutex_unlock(&conn_dst->lock);
+		list_for_each_entry_safe(q, q_tmp, &msg_list, entry)
+			kdbus_queue_entry_free(q);
+		list_for_each_entry_safe(r, r_tmp, &reply_list, entry)
+			kdbus_conn_reply_unref(r);
+		return -ECONNRESET;
+	}
+
+	list_for_each_entry_safe(q, q_tmp, &msg_list, entry) {
+		ret = kdbus_pool_slice_move(conn_src->pool, conn_dst->pool,
+					    &q->slice);
+		if (ret < 0)
+			kdbus_queue_entry_free(q);
+		else
+			kdbus_queue_entry_add(&conn_dst->queue, q);
+	}
+	list_splice(&reply_list, &conn_dst->reply_list);
+	mutex_unlock(&conn_dst->lock);
+
+	/* wake up poll() */
+	wake_up_interruptible(&conn_dst->wait);
+
+	return ret;
+}
+
+/**
+ * kdbus_cmd_info() - retrieve info about a connection
+ * @conn:		Connection
+ * @cmd_info:		The command as passed in by the ioctl
+ *
+ * Return: 0 on success, negative errno on failure.
+ */
+int kdbus_cmd_info(struct kdbus_conn *conn,
+		   struct kdbus_cmd_info *cmd_info)
+{
+	struct kdbus_name_entry *entry = NULL;
+	struct kdbus_conn *owner_conn = NULL;
+	struct kdbus_info info = {};
+	struct kdbus_meta *meta = NULL;
+	struct kdbus_pool_slice *slice;
+	u64 extra_flags, attach_flags;
+	size_t pos, meta_size;
+	int ret = 0;
+
+	if (cmd_info->id == 0) {
+		const char *name;
+
+		name = kdbus_items_get_str(cmd_info->items,
+					   KDBUS_ITEMS_SIZE(cmd_info, items),
+					   KDBUS_ITEM_NAME);
+		if (IS_ERR(name))
+			return -EINVAL;
+
+		if (!kdbus_name_is_valid(name, false))
+			return -EINVAL;
+
+		/* check if 'conn' is allowed to see 'name' */
+		ret = kdbus_ep_policy_check_see_access(conn->ep, conn, name);
+		if (ret < 0)
+			return ret;
+
+		entry = kdbus_name_lock(conn->ep->bus->name_registry, name);
+		if (!entry)
+			return -ESRCH;
+		else if (entry->conn)
+			owner_conn = kdbus_conn_ref(entry->conn);
+	} else {
+		owner_conn = kdbus_bus_find_conn_by_id(conn->ep->bus,
+						       cmd_info->id);
+		if (!owner_conn) {
+			ret = -ENXIO;
+			goto exit;
+		}
+
+		/* check if 'conn' may see any of owner_conn's names */
+		ret = kdbus_ep_policy_check_src_names(conn->ep, owner_conn,
+						      conn);
+		if (ret < 0)
+			goto exit;
+	}
+
+	info.size = sizeof(info);
+	info.id = owner_conn->id;
+	info.flags = owner_conn->flags;
+
+	/* mask out what information the connection wants to pass us */
+	attach_flags = cmd_info->flags &
+		       atomic64_read(&owner_conn->attach_flags_send);
+
+	meta_size = kdbus_meta_size(owner_conn->meta, conn, &attach_flags);
+	info.size += meta_size;
+
+	/*
+	 * Unlike the rest of the values which are cached at connection
+	 * creation time, some values need to be appended here because
+	 * at creation time a connection does not have names and other
+	 * properties.
+	 */
+	extra_flags = attach_flags & (KDBUS_ATTACH_NAMES |
+				      KDBUS_ATTACH_CONN_DESCRIPTION);
+	if (extra_flags) {
+		meta = kdbus_meta_new();
+		if (IS_ERR(meta)) {
+			ret = PTR_ERR(meta);
+			meta = NULL;
+			goto exit;
+		}
+
+		ret = kdbus_meta_append(meta, conn->ep->bus->domain,
+					owner_conn, 0, extra_flags);
+		if (ret < 0)
+			goto exit;
+
+		info.size += kdbus_meta_size(meta, conn, &extra_flags);
+	}
+
+	slice = kdbus_pool_slice_alloc(conn->pool, info.size);
+	if (IS_ERR(slice)) {
+		ret = PTR_ERR(slice);
+		slice = NULL;
+		goto exit;
+	}
+
+	ret = kdbus_pool_slice_copy(slice, 0, &info, sizeof(info));
+	if (ret < 0)
+		goto exit_free;
+
+	pos = sizeof(info);
+
+	if (meta_size) {
+		ret = kdbus_meta_write(owner_conn->meta, conn,
+				       attach_flags, slice, pos);
+		if (ret < 0)
+			goto exit_free;
+
+		pos += meta_size;
+	}
+
+	if (extra_flags) {
+		ret = kdbus_meta_write(meta, conn, extra_flags, slice, pos);
+		if (ret < 0)
+			goto exit_free;
+	}
+
+	/* write back the offset */
+	cmd_info->offset = kdbus_pool_slice_offset(slice);
+	kdbus_pool_slice_flush(slice);
+	kdbus_pool_slice_make_public(slice);
+
+exit_free:
+	if (ret < 0)
+		kdbus_pool_slice_free(slice);
+
+exit:
+	kdbus_meta_free(meta);
+	kdbus_conn_unref(owner_conn);
+	kdbus_name_unlock(conn->ep->bus->name_registry, entry);
+
+	return ret;
+}
+
+/**
+ * kdbus_cmd_conn_update() - update the attach-flags of a connection or
+ *			     the policy entries of a policy holding one
+ * @conn:		Connection
+ * @cmd:		The command as passed in by the ioctl
+ *
+ * Return: 0 on success, negative errno on failure.
+ */
+int kdbus_cmd_conn_update(struct kdbus_conn *conn,
+			  const struct kdbus_cmd_update *cmd)
+{
+	const struct kdbus_item *item;
+	bool policy_provided = false;
+	bool send_flags_provided = false;
+	bool recv_flags_provided = false;
+	u64 attach_flags_send;
+	u64 attach_flags_recv;
+	int ret;
+
+	KDBUS_ITEMS_FOREACH(item, cmd->items, KDBUS_ITEMS_SIZE(cmd, items)) {
+		switch (item->type) {
+		case KDBUS_ITEM_ATTACH_FLAGS_SEND:
+		case KDBUS_ITEM_ATTACH_FLAGS_RECV:
+			/*
+			 * Only ordinary or monitor connections
+			 * may update their attach-flags.
+			 */
+			if (!kdbus_conn_is_ordinary(conn) &&
+			    !kdbus_conn_is_monitor(conn))
+				return -EOPNOTSUPP;
+
+			if (item->type == KDBUS_ITEM_ATTACH_FLAGS_SEND) {
+				send_flags_provided = true;
+				attach_flags_send = item->data64[0];
+			} else {
+				recv_flags_provided = true;
+				attach_flags_recv = item->data64[0];
+			}
+			break;
+
+		case KDBUS_ITEM_NAME:
+		case KDBUS_ITEM_POLICY_ACCESS:
+			/*
+			 * Only policy holders may update their policy entries.
+			 */
+			if (!kdbus_conn_is_policy_holder(conn))
+				return -EOPNOTSUPP;
+
+			policy_provided = true;
+			break;
+		}
+	}
+
+	if (policy_provided) {
+		ret = kdbus_policy_set(&conn->ep->bus->policy_db, cmd->items,
+				       KDBUS_ITEMS_SIZE(cmd, items),
+				       1, true, conn);
+		if (ret < 0)
+			return ret;
+	}
+
+	if (send_flags_provided)
+		atomic64_set(&conn->attach_flags_send, attach_flags_send);
+
+	if (recv_flags_provided)
+		atomic64_set(&conn->attach_flags_recv, attach_flags_recv);
+
+	return 0;
+}
+
+/**
+ * kdbus_conn_new() - create a new connection
+ * @ep:			The endpoint the connection is connected to
+ * @hello:		The kdbus_cmd_hello as passed in by the user
+ * @meta:		The metadata gathered at open() time of the handle
+ * @privileged:		Whether to create a privileged connection
+ *
+ * Return: a new kdbus_conn on success, ERR_PTR on failure
+ */
+struct kdbus_conn *kdbus_conn_new(struct kdbus_ep *ep,
+				  struct kdbus_cmd_hello *hello,
+				  struct kdbus_meta *meta,
+				  bool privileged)
+{
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+	static struct lock_class_key __key;
+#endif
+	const struct kdbus_creds *creds = NULL;
+	struct kdbus_bus *bus = ep->bus;
+	const struct kdbus_item *item;
+	const char *conn_name = NULL;
+	const char *seclabel = NULL;
+	const char *name = NULL;
+	struct kdbus_conn *conn;
+	size_t seclabel_len = 0;
+	u64 attach_flags_send;
+	u64 attach_flags_recv;
+	bool is_policy_holder;
+	bool is_activator;
+	bool is_monitor;
+	int ret;
+
+	is_monitor = hello->flags & KDBUS_HELLO_MONITOR;
+	is_activator = hello->flags & KDBUS_HELLO_ACTIVATOR;
+	is_policy_holder = hello->flags & KDBUS_HELLO_POLICY_HOLDER;
+
+	/* can't be activator or policy holder and monitor at the same time */
+	if (is_monitor && (is_activator || is_policy_holder))
+		return ERR_PTR(-EINVAL);
+
+	/* can't be policy holder and activator at the same time */
+	if (is_activator && is_policy_holder)
+		return ERR_PTR(-EINVAL);
+
+	/* activators, policy holders and monitors must be privileged */
+	if (!privileged && (is_activator || is_policy_holder || is_monitor))
+		return ERR_PTR(-EPERM);
+
+	KDBUS_ITEMS_FOREACH(item, hello->items,
+			    KDBUS_ITEMS_SIZE(hello, items)) {
+		switch (item->type) {
+		case KDBUS_ITEM_NAME:
+			if (!is_activator && !is_policy_holder)
+				return ERR_PTR(-EINVAL);
+
+			if (name)
+				return ERR_PTR(-EINVAL);
+
+			if (!kdbus_name_is_valid(item->str, true))
+				return ERR_PTR(-EINVAL);
+
+			name = item->str;
+			break;
+
+		case KDBUS_ITEM_CREDS:
+			/* privileged processes can impersonate somebody else */
+			if (!privileged)
+				return ERR_PTR(-EPERM);
+
+			if (item->size != KDBUS_ITEM_SIZE(sizeof(*creds)))
+				return ERR_PTR(-EINVAL);
+
+			creds = &item->creds;
+			break;
+
+		case KDBUS_ITEM_SECLABEL:
+			/* privileged processes can impersonate somebody else */
+			if (!privileged)
+				return ERR_PTR(-EPERM);
+
+			seclabel = item->str;
+			seclabel_len = item->size - KDBUS_ITEM_HEADER_SIZE;
+			break;
+
+		case KDBUS_ITEM_CONN_DESCRIPTION:
+			/* human-readable connection name (debugging) */
+			if (conn_name)
+				return ERR_PTR(-EINVAL);
+
+			conn_name = item->str;
+			break;
+		}
+	}
+
+	if ((is_activator || is_policy_holder) && !name)
+		return ERR_PTR(-EINVAL);
+
+	attach_flags_send = hello->attach_flags_send;
+	attach_flags_recv = hello->attach_flags_recv;
+
+	/* 'any' degrades to 'all' for compatibility */
+	if (attach_flags_send == _KDBUS_ATTACH_ANY)
+		attach_flags_send = _KDBUS_ATTACH_ALL;
+
+	if (attach_flags_recv == _KDBUS_ATTACH_ANY)
+		attach_flags_recv = _KDBUS_ATTACH_ALL;
+
+	/* reject unknown attach flags */
+	if (attach_flags_send & ~_KDBUS_ATTACH_ALL)
+		return ERR_PTR(-EINVAL);
+
+	if (attach_flags_recv & ~_KDBUS_ATTACH_ALL)
+		return ERR_PTR(-EINVAL);
+
+	/* Let userspace know which flags are enforced by the bus */
+	hello->attach_flags_send = bus->attach_flags_req | KDBUS_FLAG_KERNEL;
+
+	if (bus->attach_flags_req & ~attach_flags_send)
+		return ERR_PTR(-ECONNREFUSED);
+
+	conn = kzalloc(sizeof(*conn), GFP_KERNEL);
+	if (!conn)
+		return ERR_PTR(-ENOMEM);
+
+	if (is_activator || is_policy_holder) {
+		/*
+		 * Policy holders may install one name, and are
+		 * allowed to use wildcards.
+		 */
+		ret = kdbus_policy_set(&bus->policy_db, hello->items,
+				       KDBUS_ITEMS_SIZE(hello, items),
+				       1, is_policy_holder, conn);
+		if (ret < 0)
+			goto exit_free_conn;
+	}
+
+	if (conn_name) {
+		conn->name = kstrdup(conn_name, GFP_KERNEL);
+		if (!conn->name) {
+			ret = -ENOMEM;
+			goto exit_free_conn;
+		}
+	}
+
+	kref_init(&conn->kref);
+	atomic_set(&conn->active, 0);
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+	lockdep_init_map(&conn->dep_map, "s_active", &__key, 0);
+#endif
+	mutex_init(&conn->lock);
+	INIT_LIST_HEAD(&conn->names_list);
+	INIT_LIST_HEAD(&conn->names_queue_list);
+	INIT_LIST_HEAD(&conn->reply_list);
+	atomic_set(&conn->name_count, 0);
+	atomic_set(&conn->reply_count, 0);
+	atomic_set(&conn->lost_count, 0);
+	INIT_DELAYED_WORK(&conn->work, kdbus_conn_work);
+	conn->cred = get_current_cred();
+	init_waitqueue_head(&conn->wait);
+	kdbus_queue_init(&conn->queue);
+	conn->privileged = privileged;
+
+	/* init entry, so we can unconditionally remove it */
+	INIT_LIST_HEAD(&conn->monitor_entry);
+
+	conn->pool = kdbus_pool_new(conn->name, hello->pool_size);
+	if (IS_ERR(conn->pool)) {
+		ret = PTR_ERR(conn->pool);
+		conn->pool = NULL;
+		goto exit_unref_cred;
+	}
+
+	conn->match_db = kdbus_match_db_new();
+	if (IS_ERR(conn->match_db)) {
+		ret = PTR_ERR(conn->match_db);
+		conn->match_db = NULL;
+		goto exit_free_pool;
+	}
+
+	conn->ep = kdbus_ep_ref(ep);
+
+	/* get new id for this connection */
+	conn->id = atomic64_inc_return(&bus->conn_seq_last);
+
+	/* return properties of this connection to the caller */
+	hello->bus_flags = bus->bus_flags;
+	hello->bloom = bus->bloom;
+	hello->id = conn->id;
+
+	BUILD_BUG_ON(sizeof(bus->id128) != sizeof(hello->id128));
+	memcpy(hello->id128, bus->id128, sizeof(hello->id128));
+
+	conn->flags = hello->flags;
+	atomic64_set(&conn->attach_flags_send, attach_flags_send);
+	atomic64_set(&conn->attach_flags_recv, attach_flags_recv);
+
+	if (is_activator) {
+		u64 flags = KDBUS_NAME_ACTIVATOR;
+
+		ret = kdbus_name_acquire(bus->name_registry, conn,
+					 name, &flags);
+		if (ret < 0)
+			goto exit_unref_ep;
+	}
+
+	if (is_monitor) {
+		down_write(&bus->conn_rwlock);
+		list_add_tail(&conn->monitor_entry, &bus->monitors_list);
+		up_write(&bus->conn_rwlock);
+	}
+
+	/* privileged processes can impersonate somebody else */
+	if (creds || seclabel) {
+		conn->owner_meta = kdbus_meta_new();
+		if (IS_ERR(conn->owner_meta)) {
+			ret = PTR_ERR(conn->owner_meta);
+			conn->owner_meta = NULL;
+			goto exit_release_names;
+		}
+
+		if (creds) {
+			ret = kdbus_meta_append_data(conn->owner_meta,
+						     KDBUS_ITEM_CREDS,
+						     creds, sizeof(*creds));
+			if (ret < 0)
+				goto exit_free_meta;
+		}
+
+		if (seclabel) {
+			ret = kdbus_meta_append_data(conn->owner_meta,
+						     KDBUS_ITEM_SECLABEL,
+						     seclabel, seclabel_len);
+			if (ret < 0)
+				goto exit_free_meta;
+		}
+
+		/* use the information provided with the HELLO call */
+		conn->meta = conn->owner_meta;
+	} else {
+		/* use the connection's metadata gathered at open() */
+		conn->meta = meta;
+	}
+
+	/*
+	 * Account the connection against the current user (UID), or for
+	 * custom endpoints use the anonymous user assigned to the endpoint.
+	 */
+	if (ep->user) {
+		conn->user = kdbus_domain_user_ref(ep->user);
+	} else {
+		conn->user = kdbus_domain_get_user(ep->bus->domain,
+						   current_fsuid());
+		if (IS_ERR(conn->user)) {
+			ret = PTR_ERR(conn->user);
+			conn->user = NULL;
+			goto exit_free_meta;
+		}
+	}
+
+	/* lock order: domain -> bus -> ep -> names -> conn */
+	mutex_lock(&bus->lock);
+	mutex_lock(&ep->lock);
+	down_write(&bus->conn_rwlock);
+
+	if (atomic_inc_return(&conn->user->connections) > KDBUS_USER_MAX_CONN) {
+		atomic_dec(&conn->user->connections);
+		ret = -EMFILE;
+		goto exit_unref_user_unlock;
+	}
+
+	/* make sure the ep-node is active while we add our connection */
+	if (!kdbus_node_acquire(&ep->node)) {
+		atomic_dec(&conn->user->connections);
+		ret = -ESHUTDOWN;
+		goto exit_unref_user_unlock;
+	}
+
+	/* link into bus and endpoint */
+	list_add_tail(&conn->ep_entry, &ep->conn_list);
+	hash_add(bus->conn_hash, &conn->hentry, conn->id);
+
+	kdbus_node_release(&ep->node);
+	up_write(&bus->conn_rwlock);
+	mutex_unlock(&ep->lock);
+	mutex_unlock(&bus->lock);
+
+	/* notify subscribers about the new active connection */
+	ret = kdbus_notify_id_change(conn->ep->bus, KDBUS_ITEM_ID_ADD,
+				     conn->id, conn->flags);
+	if (ret < 0) {
+		atomic_dec(&conn->user->connections);
+		goto exit_domain_user_unref;
+	}
+
+	kdbus_notify_flush(conn->ep->bus);
+
+	return conn;
+
+exit_unref_user_unlock:
+	up_write(&bus->conn_rwlock);
+	mutex_unlock(&ep->lock);
+	mutex_unlock(&bus->lock);
+exit_domain_user_unref:
+	kdbus_domain_user_unref(conn->user);
+exit_free_meta:
+	kdbus_meta_free(conn->owner_meta);
+exit_release_names:
+	kdbus_name_remove_by_conn(bus->name_registry, conn);
+exit_unref_ep:
+	kdbus_ep_unref(conn->ep);
+	kdbus_match_db_free(conn->match_db);
+exit_free_pool:
+	kdbus_pool_free(conn->pool);
+exit_unref_cred:
+	put_cred(conn->cred);
+exit_free_conn:
+	kfree(conn->name);
+	kfree(conn);
+
+	return ERR_PTR(ret);
+}
+
+/**
+ * kdbus_conn_has_name() - check if a connection owns a name
+ * @conn:		Connection
+ * @name:		Well-known name to check for
+ *
+ * Return: true if the name is currently owned by the connection
+ */
+bool kdbus_conn_has_name(struct kdbus_conn *conn, const char *name)
+{
+	struct kdbus_name_entry *e;
+	bool match = false;
+
+	/* No need to go further if we do not own names */
+	if (atomic_read(&conn->name_count) == 0)
+		return false;
+
+	mutex_lock(&conn->lock);
+	list_for_each_entry(e, &conn->names_list, conn_entry) {
+		if (strcmp(e->name, name) == 0) {
+			match = true;
+			break;
+		}
+	}
+	mutex_unlock(&conn->lock);
+
+	return match;
+}
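
The active-reference scheme described in the kdbus_conn_acquire() comment
above can be modelled outside the kernel. What follows is only a userspace
sketch using C11 atomics, not the code from this patch: the sketch_* names
are hypothetical, the value chosen for SKETCH_ACTIVE_BIAS is an assumption
(KDBUS_CONN_ACTIVE_BIAS is defined elsewhere in the series), and the wait
queue is reduced to a boolean "wake the waiter" return value.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Assumed bias: any large negative value works for this sketch. */
#define SKETCH_ACTIVE_BIAS (INT_MIN + 1)

/* Hypothetical stand-in for the 'active' counter of struct kdbus_conn. */
struct sketch_conn {
	atomic_int active;	/* >= 0: live, value counts active refs */
};

static void sketch_init(struct sketch_conn *c)
{
	atomic_init(&c->active, 0);
}

/* Same test as kdbus_conn_active(): a negative counter means disabled. */
static bool sketch_active(struct sketch_conn *c)
{
	return atomic_load(&c->active) >= 0;
}

/* Models atomic_inc_unless_negative(): take a ref only while live. */
static bool sketch_acquire(struct sketch_conn *c)
{
	int v = atomic_load(&c->active);

	while (v >= 0) {
		/* on failure, v is reloaded with the current value */
		if (atomic_compare_exchange_weak(&c->active, &v, v + 1))
			return true;
	}

	return false;
}

/* Drop a ref; returns true if the disconnect waiter should be woken. */
static bool sketch_release(struct sketch_conn *c)
{
	int v = atomic_fetch_sub(&c->active, 1) - 1;

	return v == SKETCH_ACTIVE_BIAS;
}

/* Disable the connection; returns true if no active refs are left. */
static bool sketch_disable(struct sketch_conn *c)
{
	int v = atomic_fetch_add(&c->active, SKETCH_ACTIVE_BIAS);

	return v + SKETCH_ACTIVE_BIAS == SKETCH_ACTIVE_BIAS;
}
```

Adding the bias makes the counter negative in a single step, so new
acquisitions fail at once while existing holders can still be counted down
to exactly the bias value, which is the condition the disconnect path
waits for.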
diff --git a/ipc/kdbus/connection.h b/ipc/kdbus/connection.h
new file mode 100644
index 000000000000..cd4cb241cae3
--- /dev/null
+++ b/ipc/kdbus/connection.h
@@ -0,0 +1,188 @@
+/*
+ * Copyright (C) 2013-2014 Kay Sievers
+ * Copyright (C) 2013-2014 Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+ * Copyright (C) 2013-2014 Daniel Mack <daniel@zonque.org>
+ * Copyright (C) 2013-2014 David Herrmann <dh.herrmann@gmail.com>
+ * Copyright (C) 2013-2014 Linux Foundation
+ * Copyright (C) 2014 Djalal Harouni
+ *
+ * kdbus is free software; you can redistribute it and/or modify it under
+ * the terms of the GNU Lesser General Public License as published by the
+ * Free Software Foundation; either version 2.1 of the License, or (at
+ * your option) any later version.
+ */
+
+#ifndef __KDBUS_CONNECTION_H
+#define __KDBUS_CONNECTION_H
+
+#include <linux/atomic.h>
+#include <linux/kref.h>
+#include <linux/lockdep.h>
+#include "limits.h"
+#include "metadata.h"
+#include "pool.h"
+#include "queue.h"
+#include "util.h"
+
+#define KDBUS_HELLO_SPECIAL_CONN	(KDBUS_HELLO_ACTIVATOR | \
+					 KDBUS_HELLO_POLICY_HOLDER | \
+					 KDBUS_HELLO_MONITOR)
+
+/**
+ * struct kdbus_conn - connection to a bus
+ * @kref:		Reference count
+ * @active:		Active references to the connection
+ * @id:			Connection ID
+ * @flags:		KDBUS_HELLO_* flags
+ * @attach_flags_send:	KDBUS_ATTACH_* flags for sending
+ * @attach_flags_recv:	KDBUS_ATTACH_* flags for receiving
+ * @name:		Human-readable connection name, used for debugging
+ * @ep:			The endpoint this connection belongs to
+ * @lock:		Connection data lock
+ * @msg_users:		Array to account the number of queued messages per
+ *			individual user
+ * @msg_users_max:	Size of the users array
+ * @hentry:		Entry in ID <-> connection map
+ * @ep_entry:		Entry in endpoint
+ * @monitor_entry:	Entry in monitor, if the connection is a monitor
+ * @names_list:		List of well-known names
+ * @names_queue_list:	Well-known names this connection waits for
+ * @reply_list:		List of connections this connection should
+ *			reply to
+ * @work:		Delayed work to handle timeouts
+ * @activator_of:	Well-known name entry this connection acts as an
+ *			activator for
+ * @match_db:		Subscription filter to broadcast messages
+ * @meta:		Active connection creator's metadata/credentials,
+ *			either from the handle or from HELLO
+ * @owner_meta:		The connection's metadata/credentials supplied by
+ *			HELLO
+ * @pool:		The user's buffer to receive messages
+ * @user:		Owner of the connection
+ * @cred:		The credentials of the connection at creation time
+ * @name_count:		Number of owned well-known names
+ * @reply_count:	Number of requests this connection has issued, and
+ *			waits for replies from other peers
+ * @lost_count:		Number of lost broadcast messages
+ * @wait:		Wake up this endpoint
+ * @queue:		The message queue associated with this connection
+ * @privileged:		Whether this connection is privileged on the bus
+ */
+struct kdbus_conn {
+	struct kref kref;
+	atomic_t active;
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+	struct lockdep_map dep_map;
+#endif
+	u64 id;
+	u64 flags;
+	atomic64_t attach_flags_send;
+	atomic64_t attach_flags_recv;
+	const char *name;
+	struct kdbus_ep *ep;
+	struct mutex lock;
+	unsigned int *msg_users;
+	unsigned int msg_users_max;
+	struct hlist_node hentry;
+	struct list_head ep_entry;
+	struct list_head monitor_entry;
+	struct list_head names_list;
+	struct list_head names_queue_list;
+	struct list_head reply_list;
+	struct delayed_work work;
+	struct kdbus_name_entry *activator_of;
+	struct kdbus_match_db *match_db;
+	struct kdbus_meta *meta;
+	struct kdbus_meta *owner_meta;
+	struct kdbus_pool *pool;
+	struct kdbus_domain_user *user;
+	const struct cred *cred;
+	atomic_t name_count;
+	atomic_t reply_count;
+	atomic_t lost_count;
+	wait_queue_head_t wait;
+	struct kdbus_queue queue;
+	bool privileged : 1;
+};
+
+struct kdbus_kmsg;
+struct kdbus_name_registry;
+
+struct kdbus_conn *kdbus_conn_new(struct kdbus_ep *ep,
+				  struct kdbus_cmd_hello *hello,
+				  struct kdbus_meta *meta,
+				  bool privileged);
+struct kdbus_conn *kdbus_conn_ref(struct kdbus_conn *conn);
+struct kdbus_conn *kdbus_conn_unref(struct kdbus_conn *conn);
+int kdbus_conn_acquire(struct kdbus_conn *conn);
+void kdbus_conn_release(struct kdbus_conn *conn);
+int kdbus_conn_disconnect(struct kdbus_conn *conn, bool ensure_queue_empty);
+bool kdbus_conn_active(const struct kdbus_conn *conn);
+int kdbus_conn_entry_insert(struct kdbus_conn *conn_src,
+			    struct kdbus_conn *conn_dst,
+			    const struct kdbus_kmsg *kmsg,
+			    struct kdbus_conn_reply *reply);
+void kdbus_conn_purge_policy_cache(struct kdbus_conn *conn);
+int kdbus_conn_move_messages(struct kdbus_conn *conn_dst,
+			     struct kdbus_conn *conn_src,
+			     u64 name_id);
+bool kdbus_conn_has_name(struct kdbus_conn *conn, const char *name);
+
+/* command dispatcher */
+int kdbus_cmd_msg_recv(struct kdbus_conn *conn,
+		       struct kdbus_cmd_recv *recv);
+int kdbus_cmd_msg_cancel(struct kdbus_conn *conn,
+			 u64 cookie);
+int kdbus_cmd_info(struct kdbus_conn *conn,
+		   struct kdbus_cmd_info *cmd_info);
+int kdbus_cmd_conn_update(struct kdbus_conn *conn,
+			  const struct kdbus_cmd_update *cmd_update);
+int kdbus_conn_kmsg_send(struct kdbus_ep *ep,
+			 struct kdbus_conn *conn_src,
+			 struct kdbus_kmsg *kmsg);
+
+/**
+ * kdbus_conn_is_ordinary() - Check if connection is ordinary
+ * @conn:		The connection to check
+ *
+ * Return: Non-zero if the connection is an ordinary connection
+ */
+static inline int kdbus_conn_is_ordinary(const struct kdbus_conn *conn)
+{
+	return !(conn->flags & KDBUS_HELLO_SPECIAL_CONN);
+}
+
+/**
+ * kdbus_conn_is_activator() - Check if connection is an activator
+ * @conn:		The connection to check
+ *
+ * Return: Non-zero if the connection is an activator
+ */
+static inline int kdbus_conn_is_activator(const struct kdbus_conn *conn)
+{
+	return conn->flags & KDBUS_HELLO_ACTIVATOR;
+}
+
+/**
+ * kdbus_conn_is_policy_holder() - Check if connection is a policy holder
+ * @conn:		The connection to check
+ *
+ * Return: Non-zero if the connection is a policy holder
+ */
+static inline int kdbus_conn_is_policy_holder(const struct kdbus_conn *conn)
+{
+	return conn->flags & KDBUS_HELLO_POLICY_HOLDER;
+}
+
+/**
+ * kdbus_conn_is_monitor() - Check if connection is a monitor
+ * @conn:		The connection to check
+ *
+ * Return: Non-zero if the connection is a monitor
+ */
+static inline int kdbus_conn_is_monitor(const struct kdbus_conn *conn)
+{
+	return conn->flags & KDBUS_HELLO_MONITOR;
+}
+
+#endif
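
Note on the kdbus_conn_is_*() helpers above: they return the masked u64 as an int, so the "non-zero" result relies on the tested bit surviving the conversion to int (on common compilers the upper 32 bits are dropped). A minimal user-space sketch of the same classification, normalized to bool, might look like this. The flag values, struct conn and the conn_*() names are placeholders made up for illustration; they are not the uapi definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit values, for illustration only. */
#define HELLO_ACTIVATOR		(1ULL << 1)
#define HELLO_POLICY_HOLDER	(1ULL << 2)
#define HELLO_MONITOR		(1ULL << 3)
#define HELLO_FAR_FLAG		(1ULL << 40)	/* made-up flag above bit 31 */

#define HELLO_SPECIAL_CONN	(HELLO_ACTIVATOR | \
				 HELLO_POLICY_HOLDER | \
				 HELLO_MONITOR)

/* Stand-in for struct kdbus_conn; only the flags matter here. */
struct conn {
	uint64_t flags;
};

/* Compare against zero in 64 bits, so no flag bit is lost on the way out. */
static bool conn_has(const struct conn *c, uint64_t mask)
{
	assert(c != NULL);
	return (c->flags & mask) != 0;
}

/* Ordinary means: none of the special connection types is set. */
static bool conn_is_ordinary(const struct conn *c)
{
	return !conn_has(c, HELLO_SPECIAL_CONN);
}

static bool conn_is_monitor(const struct conn *c)
{
	return conn_has(c, HELLO_MONITOR);
}
```

Returning bool keeps the callers' "if (kdbus_conn_is_monitor(conn))" style unchanged while removing the dependency on where the flag bits happen to sit.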
diff --git a/ipc/kdbus/item.c b/ipc/kdbus/item.c
new file mode 100644
index 000000000000..06369fefeb69
--- /dev/null
+++ b/ipc/kdbus/item.c
@@ -0,0 +1,258 @@
+/*
+ * Copyright (C) 2013-2014 Kay Sievers
+ * Copyright (C) 2013-2014 Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+ * Copyright (C) 2013-2014 Daniel Mack <daniel@zonque.org>
+ * Copyright (C) 2013-2014 David Herrmann <dh.herrmann@gmail.com>
+ * Copyright (C) 2013-2014 Linux Foundation
+ *
+ * kdbus is free software; you can redistribute it and/or modify it under
+ * the terms of the GNU Lesser General Public License as published by the
+ * Free Software Foundation; either version 2.1 of the License, or (at
+ * your option) any later version.
+ */
+
+#include <linux/ctype.h>
+#include <linux/fs.h>
+#include <linux/string.h>
+
+#include "item.h"
+#include "limits.h"
+#include "util.h"
+
+#define KDBUS_ITEM_VALID(_i, _is, _s)					\
+	((_i)->size > KDBUS_ITEM_HEADER_SIZE &&				\
+	 (u8 *)(_i) + (_i)->size <= (u8 *)(_is) + (_s) &&		\
+	 (u8 *)(_i) >= (u8 *)(_is))
+
+#define KDBUS_ITEMS_END(_i, _is, _s)					\
+	((u8 *)_i == ((u8 *)(_is) + KDBUS_ALIGN8(_s)))
+
+/**
+ * kdbus_item_validate_name() - validate an item containing a name
+ * @item:		Item to validate
+ *
+ * Return: zero on success or a negative error code on failure
+ */
+int kdbus_item_validate_name(const struct kdbus_item *item)
+{
+	if (item->size < KDBUS_ITEM_HEADER_SIZE + 2)
+		return -EINVAL;
+
+	if (item->size > KDBUS_ITEM_HEADER_SIZE +
+			 KDBUS_SYSNAME_MAX_LEN + 1)
+		return -ENAMETOOLONG;
+
+	if (!kdbus_str_valid(item->str, KDBUS_ITEM_PAYLOAD_SIZE(item)))
+		return -EINVAL;
+
+	return kdbus_sysname_is_valid(item->str);
+}
+
+static int kdbus_item_validate(const struct kdbus_item *item)
+{
+	size_t payload_size = KDBUS_ITEM_PAYLOAD_SIZE(item);
+	size_t l;
+	int ret;
+
+	if (item->size < KDBUS_ITEM_HEADER_SIZE)
+		return -EINVAL;
+
+	switch (item->type) {
+	case KDBUS_ITEM_PAYLOAD_VEC:
+		if (payload_size != sizeof(struct kdbus_vec))
+			return -EINVAL;
+		if (item->vec.size == 0 || item->vec.size > SIZE_MAX)
+			return -EINVAL;
+		break;
+
+	case KDBUS_ITEM_PAYLOAD_OFF:
+		if (payload_size != sizeof(struct kdbus_vec))
+			return -EINVAL;
+		if (item->vec.size == 0 || item->vec.size > SIZE_MAX)
+			return -EINVAL;
+		break;
+
+	case KDBUS_ITEM_PAYLOAD_MEMFD:
+		if (payload_size != sizeof(struct kdbus_memfd))
+			return -EINVAL;
+		if (item->memfd.size == 0 || item->memfd.size > SIZE_MAX)
+			return -EINVAL;
+		if (item->memfd.fd < 0)
+			return -EBADF;
+		break;
+
+	case KDBUS_ITEM_FDS:
+		if (payload_size % sizeof(int) != 0)
+			return -EINVAL;
+		break;
+
+	case KDBUS_ITEM_BLOOM_PARAMETER:
+		if (payload_size != sizeof(struct kdbus_bloom_parameter))
+			return -EINVAL;
+		break;
+
+	case KDBUS_ITEM_BLOOM_FILTER:
+		/* followed by the bloom-mask, depends on the bloom-size */
+		if (payload_size < sizeof(struct kdbus_bloom_filter))
+			return -EINVAL;
+		break;
+
+	case KDBUS_ITEM_BLOOM_MASK:
+		/* size depends on bloom-size of bus */
+		break;
+
+	case KDBUS_ITEM_CONN_DESCRIPTION:
+	case KDBUS_ITEM_MAKE_NAME:
+		ret = kdbus_item_validate_name(item);
+		if (ret < 0)
+			return ret;
+		break;
+
+	case KDBUS_ITEM_ATTACH_FLAGS_SEND:
+	case KDBUS_ITEM_ATTACH_FLAGS_RECV:
+	case KDBUS_ITEM_ID:
+		if (payload_size != sizeof(u64))
+			return -EINVAL;
+		break;
+
+	case KDBUS_ITEM_TIMESTAMP:
+		if (payload_size != sizeof(struct kdbus_timestamp))
+			return -EINVAL;
+		break;
+
+	case KDBUS_ITEM_CREDS:
+		if (payload_size != sizeof(struct kdbus_creds))
+			return -EINVAL;
+		break;
+
+	case KDBUS_ITEM_AUXGROUPS:
+		if (payload_size % sizeof(u64) != 0)
+			return -EINVAL;
+		break;
+
+	case KDBUS_ITEM_NAME:
+	case KDBUS_ITEM_DST_NAME:
+	case KDBUS_ITEM_PID_COMM:
+	case KDBUS_ITEM_TID_COMM:
+	case KDBUS_ITEM_EXE:
+	case KDBUS_ITEM_CMDLINE:
+	case KDBUS_ITEM_CGROUP:
+	case KDBUS_ITEM_SECLABEL:
+		if (!kdbus_str_valid(item->str, payload_size))
+			return -EINVAL;
+		break;
+
+	case KDBUS_ITEM_CAPS:
+		/* TODO */
+		break;
+
+	case KDBUS_ITEM_AUDIT:
+		if (payload_size != sizeof(struct kdbus_audit))
+			return -EINVAL;
+		break;
+
+	case KDBUS_ITEM_POLICY_ACCESS:
+		if (payload_size != sizeof(struct kdbus_policy_access))
+			return -EINVAL;
+		break;
+
+	case KDBUS_ITEM_NAME_ADD:
+	case KDBUS_ITEM_NAME_REMOVE:
+	case KDBUS_ITEM_NAME_CHANGE:
+		if (payload_size < sizeof(struct kdbus_notify_name_change))
+			return -EINVAL;
+		l = payload_size - offsetof(struct kdbus_notify_name_change,
+					    name);
+		if (l > 0 && !kdbus_str_valid(item->name_change.name, l))
+			return -EINVAL;
+		break;
+
+	case KDBUS_ITEM_ID_ADD:
+	case KDBUS_ITEM_ID_REMOVE:
+		if (payload_size != sizeof(struct kdbus_notify_id_change))
+			return -EINVAL;
+		break;
+
+	case KDBUS_ITEM_REPLY_TIMEOUT:
+	case KDBUS_ITEM_REPLY_DEAD:
+		if (payload_size != 0)
+			return -EINVAL;
+		break;
+
+	default:
+		break;
+	}
+
+	return 0;
+}
+
+/**
+ * kdbus_items_validate() - validate items passed by user-space
+ * @items:		items to validate
+ * @items_size:		size of all items, in bytes
+ *
+ * This verifies that the passed items pointer is consistent and valid.
+ * Furthermore, each item is checked for:
+ *  - valid "size" value
+ *  - payload is of expected type
+ *  - payload is fully included in the item
+ *  - string payloads are zero-terminated
+ *
+ * Return: 0 on success, negative error code on failure.
+ */
+int kdbus_items_validate(const struct kdbus_item *items, size_t items_size)
+{
+	const struct kdbus_item *item;
+	int ret;
+
+	KDBUS_ITEMS_FOREACH(item, items, items_size) {
+		if (!KDBUS_ITEM_VALID(item, items, items_size))
+			return -EINVAL;
+
+		ret = kdbus_item_validate(item);
+		if (ret < 0)
+			return ret;
+	}
+
+	if (!KDBUS_ITEMS_END(item, items, items_size))
+		return -EINVAL;
+
+	return 0;
+}
+
+/**
+ * kdbus_items_get_str() - get string from a list of items
+ * @items:		The items to walk
+ * @items_size:		The size of all items
+ * @item_type:		The item type to look for
+ *
+ * This function walks a list of items and searches for items of type
+ * @item_type. If it finds exactly one such item, the .str member of that
+ * item is returned.
+ *
+ * Return: the string, if the item was found exactly once, ERR_PTR(-EEXIST)
+ * if the item was found more than once, and ERR_PTR(-EBADMSG) if there was
+ * no item of the given type.
+ */
+const char *kdbus_items_get_str(const struct kdbus_item *items,
+				size_t items_size,
+				unsigned int item_type)
+{
+	const struct kdbus_item *item;
+	const char *n = NULL;
+
+	KDBUS_ITEMS_FOREACH(item, items, items_size) {
+		if (item->type == item_type) {
+			if (n)
+				return ERR_PTR(-EEXIST);
+
+			n = item->str;
+			continue;
+		}
+	}
+
+	if (!n)
+		return ERR_PTR(-EBADMSG);
+
+	return n;
+}
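
The exactly-once lookup in kdbus_items_get_str() can be restated outside the kernel as below. This is only a sketch: struct entry and find_unique_str() are made-up names, a plain array stands in for the item stream, and an int return plus out-parameter replaces ERR_PTR(). The error codes follow the ones documented above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for an item carrying a string payload. */
struct entry {
	unsigned int type;
	const char *str;
};

/*
 * Look up the single entry of the given type.
 * Returns 0 and sets *out if exactly one match exists, -EEXIST if
 * there is more than one, and -EBADMSG if there is none.
 */
static int find_unique_str(const struct entry *e, size_t n,
			   unsigned int type, const char **out)
{
	const char *found = NULL;
	size_t i;

	assert(out != NULL);

	for (i = 0; i < n; i++) {
		if (e[i].type != type)
			continue;
		if (found)
			return -EEXIST;	/* second match: ambiguous */
		found = e[i].str;
	}

	if (!found)
		return -EBADMSG;

	*out = found;
	return 0;
}
```

Distinguishing "missing" from "duplicated" lets a caller report a malformed command precisely instead of silently picking the first or last match.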
diff --git a/ipc/kdbus/item.h b/ipc/kdbus/item.h
new file mode 100644
index 000000000000..30d399ab0b65
--- /dev/null
+++ b/ipc/kdbus/item.h
@@ -0,0 +1,41 @@
+/*
+ * Copyright (C) 2013-2014 Kay Sievers
+ * Copyright (C) 2013-2014 Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+ * Copyright (C) 2013-2014 Daniel Mack <daniel@zonque.org>
+ * Copyright (C) 2013-2014 David Herrmann <dh.herrmann@gmail.com>
+ * Copyright (C) 2013-2014 Linux Foundation
+ *
+ * kdbus is free software; you can redistribute it and/or modify it under
+ * the terms of the GNU Lesser General Public License as published by the
+ * Free Software Foundation; either version 2.1 of the License, or (at
+ * your option) any later version.
+ */
+
+#ifndef __KDBUS_ITEM_H
+#define __KDBUS_ITEM_H
+
+#include <linux/kernel.h>
+#include <uapi/linux/kdbus.h>
+
+#include "util.h"
+
+/* generic access and iterators over a stream of items */
+#define KDBUS_ITEM_HEADER_SIZE offsetof(struct kdbus_item, data)
+#define KDBUS_ITEM_PAYLOAD_SIZE(_i) ((_i)->size - KDBUS_ITEM_HEADER_SIZE)
+#define KDBUS_ITEM_SIZE(_s) KDBUS_ALIGN8(KDBUS_ITEM_HEADER_SIZE + (_s))
+#define KDBUS_ITEM_NEXT(_i) (typeof(_i))(((u8 *)_i) + KDBUS_ALIGN8((_i)->size))
+#define KDBUS_ITEMS_SIZE(_h, _is) ((_h)->size - offsetof(typeof(*_h), _is))
+
+#define KDBUS_ITEMS_FOREACH(_i, _is, _s)				\
+	for (_i = _is;							\
+	     ((u8 *)(_i) < (u8 *)(_is) + (_s)) &&			\
+	       ((u8 *)(_i) >= (u8 *)(_is));				\
+	     _i = KDBUS_ITEM_NEXT(_i))
+
+int kdbus_item_validate_name(const struct kdbus_item *item);
+int kdbus_items_validate(const struct kdbus_item *items, size_t items_size);
+const char *kdbus_items_get_str(const struct kdbus_item *items,
+				size_t items_size,
+				unsigned int item_type);
+
+#endif
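
For readers unfamiliar with the item stream layout, here is a user-space sketch of what KDBUS_ITEMS_FOREACH, KDBUS_ITEM_VALID and KDBUS_ITEMS_END check together: every item carries its own unpadded size, the next item starts at the following 8-byte boundary, and the walk must end exactly at the aligned end of the buffer. struct item, item_put() and items_validate() are illustrative names, and the sketch uses offsets rather than pointers; it is not the kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ALIGN8(v) (((uint64_t)(v) + 7) & ~UINT64_C(7))

/* Simplified item: size covers header plus payload, without padding. */
struct item {
	uint64_t size;
	uint64_t type;
	uint8_t data[];
};

#define ITEM_HEADER_SIZE offsetof(struct item, data)

/* Append one item at offset off; returns the offset of the next item. */
static size_t item_put(uint8_t *buf, size_t off, uint64_t type,
		       const void *payload, size_t n)
{
	uint64_t size = ITEM_HEADER_SIZE + n;

	memcpy(buf + off, &size, sizeof(size));
	memcpy(buf + off + offsetof(struct item, type), &type, sizeof(type));
	memcpy(buf + off + ITEM_HEADER_SIZE, payload, n);
	return off + (size_t)ALIGN8(size);
}

/* Returns 0 if the stream is well-formed, -1 otherwise. */
static int items_validate(const uint8_t *buf, size_t len)
{
	size_t off = 0;

	assert(buf != NULL);

	while (off < len) {
		uint64_t size;

		/* the header itself must fit */
		if (len - off < ITEM_HEADER_SIZE)
			return -1;
		memcpy(&size, buf + off, sizeof(size));

		/* non-empty payload, fully contained in the buffer */
		if (size <= ITEM_HEADER_SIZE || size > len - off)
			return -1;

		off += (size_t)ALIGN8(size);
	}

	/* the last item must end exactly at the aligned end */
	return off == ALIGN8(len) ? 0 : -1;
}
```

The final comparison against the aligned length is what catches a trailing partial item that the loop condition alone would let through.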
diff --git a/ipc/kdbus/message.c b/ipc/kdbus/message.c
new file mode 100644
index 000000000000..5b9c3fe3cab1
--- /dev/null
+++ b/ipc/kdbus/message.c
@@ -0,0 +1,444 @@
+/*
+ * Copyright (C) 2013-2014 Kay Sievers
+ * Copyright (C) 2013-2014 Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+ * Copyright (C) 2013-2014 Daniel Mack <daniel@zonque.org>
+ * Copyright (C) 2013-2014 David Herrmann <dh.herrmann@gmail.com>
+ * Copyright (C) 2013-2014 Linux Foundation
+ *
+ * kdbus is free software; you can redistribute it and/or modify it under
+ * the terms of the GNU Lesser General Public License as published by the
+ * Free Software Foundation; either version 2.1 of the License, or (at
+ * your option) any later version.
+ */
+
+#include <linux/capability.h>
+#include <linux/cgroup.h>
+#include <linux/cred.h>
+#include <linux/file.h>
+#include <linux/init.h>
+#include <linux/module.h>
+#include <linux/mutex.h>
+#include <linux/sched.h>
+#include <linux/shmem_fs.h>
+#include <linux/sizes.h>
+#include <linux/slab.h>
+#include <linux/uaccess.h>
+#include <net/sock.h>
+
+#include "bus.h"
+#include "connection.h"
+#include "domain.h"
+#include "endpoint.h"
+#include "handle.h"
+#include "item.h"
+#include "match.h"
+#include "message.h"
+#include "names.h"
+#include "policy.h"
+
+#define KDBUS_KMSG_HEADER_SIZE offsetof(struct kdbus_kmsg, msg)
+
+/**
+ * kdbus_kmsg_free() - free allocated message
+ * @kmsg:		Message
+ */
+void kdbus_kmsg_free(struct kdbus_kmsg *kmsg)
+{
+	kdbus_fput_files(kmsg->memfds, kmsg->memfds_count);
+	kdbus_fput_files(kmsg->fds, kmsg->fds_count);
+	kdbus_meta_free(kmsg->meta);
+	kfree(kmsg->memfds);
+	kfree(kmsg->fds);
+	kfree(kmsg);
+}
+
+/**
+ * kdbus_kmsg_new() - allocate message
+ * @extra_size:		additional size to reserve for data
+ *
+ * Return: new kdbus_kmsg on success, ERR_PTR on failure.
+ */
+struct kdbus_kmsg *kdbus_kmsg_new(size_t extra_size)
+{
+	struct kdbus_kmsg *m;
+	size_t size;
+
+	size = sizeof(struct kdbus_kmsg) + KDBUS_ITEM_SIZE(extra_size);
+	m = kzalloc(size, GFP_KERNEL);
+	if (!m)
+		return ERR_PTR(-ENOMEM);
+
+	m->msg.size = size - KDBUS_KMSG_HEADER_SIZE;
+	m->msg.items[0].size = KDBUS_ITEM_SIZE(extra_size);
+
+	return m;
+}
+
+static int kdbus_handle_check_file(struct file *file)
+{
+	struct inode *inode = file_inode(file);
+	struct socket *sock;
+
+	/*
+	 * Don't allow file descriptors in the transport that themselves allow
+	 * file descriptor queueing. This will eventually be allowed once both
+	 * unix domain sockets and kdbus share a generic garbage collector.
+	 */
+
+	if (file->f_op == &kdbus_handle_ep_ops)
+		return -EOPNOTSUPP;
+
+	if (!S_ISSOCK(inode->i_mode))
+		return 0;
+
+	/* Almost nothing can be done with O_PATHed files */
+	if (file->f_mode & FMODE_PATH)
+		return 0;
+
+	sock = SOCKET_I(inode);
+	if (sock->sk && sock->ops && sock->ops->family == PF_UNIX)
+		return -EOPNOTSUPP;
+
+	return 0;
+}
+
+/*
+ * kdbus_msg_scan_items() - validate incoming data and prepare parsing
+ * @conn:		Connection
+ * @kmsg:		Message
+ *
+ * Return: 0 on success, negative errno on failure.
+ *
+ * On errors, the caller should drop any taken reference with
+ * kdbus_kmsg_free()
+ */
+static int kdbus_msg_scan_items(struct kdbus_conn *conn,
+				struct kdbus_kmsg *kmsg)
+{
+	const struct kdbus_msg *msg = &kmsg->msg;
+	const struct kdbus_item *item;
+	unsigned int items_count = 0;
+	size_t vecs_size = 0;
+	bool has_bloom = false;
+	bool has_name = false;
+	bool has_fds = false;
+	struct file *f;
+
+	KDBUS_ITEMS_FOREACH(item, msg->items, KDBUS_ITEMS_SIZE(msg, items))
+		if (item->type == KDBUS_ITEM_PAYLOAD_MEMFD)
+			kmsg->memfds_count++;
+
+	if (kmsg->memfds_count > 0) {
+		kmsg->memfds = kcalloc(kmsg->memfds_count,
+				       sizeof(struct file *), GFP_KERNEL);
+		if (!kmsg->memfds)
+			return -ENOMEM;
+
+		/* reset counter so we can reuse it */
+		kmsg->memfds_count = 0;
+	}
+
+	KDBUS_ITEMS_FOREACH(item, msg->items, KDBUS_ITEMS_SIZE(msg, items)) {
+		size_t payload_size;
+
+		if (++items_count > KDBUS_MSG_MAX_ITEMS)
+			return -E2BIG;
+
+		payload_size = KDBUS_ITEM_PAYLOAD_SIZE(item);
+
+		switch (item->type) {
+		case KDBUS_ITEM_PAYLOAD_VEC:
+			if (vecs_size + item->vec.size <= vecs_size)
+				return -EMSGSIZE;
+
+			vecs_size += item->vec.size;
+			if (vecs_size > KDBUS_MSG_MAX_PAYLOAD_VEC_SIZE)
+				return -EMSGSIZE;
+
+			/* \0-bytes records store only the alignment bytes */
+			if (KDBUS_PTR(item->vec.address))
+				kmsg->vecs_size += item->vec.size;
+			else
+				kmsg->vecs_size += item->vec.size % 8;
+			kmsg->vecs_count++;
+			break;
+
+		case KDBUS_ITEM_PAYLOAD_MEMFD: {
+			int seals, mask;
+			int fd = item->memfd.fd;
+
+			/* Verify the fd and increment the usage count */
+			if (fd < 0)
+				return -EBADF;
+
+			f = fget(fd);
+			if (!f)
+				return -EBADF;
+
+			kmsg->memfds[kmsg->memfds_count] = f;
+			kmsg->memfds_count++;
+
+			/*
+			 * We only accept a sealed memfd file whose content
+			 * cannot be altered by the sender or anybody else
+			 * while it is shared or in-flight. Other files need
+			 * to be passed with KDBUS_ITEM_FDS.
+			 */
+			seals = shmem_get_seals(f);
+			if (seals < 0)
+				return -EMEDIUMTYPE;
+
+			mask = F_SEAL_SHRINK |
+			       F_SEAL_GROW |
+			       F_SEAL_WRITE |
+			       F_SEAL_SEAL;
+			if ((seals & mask) != mask)
+				return -ETXTBSY;
+
+			/*
+			 * The specified size in the item cannot be larger
+			 * than the backing file.
+			 */
+			if (item->memfd.size > i_size_read(file_inode(f)))
+				return -EBADF;
+
+			break;
+		}
+
+		case KDBUS_ITEM_FDS: {
+			unsigned int n, i;
+
+			/* do not allow multiple fd arrays */
+			if (has_fds)
+				return -EEXIST;
+			has_fds = true;
+
+			/* do not allow broadcasting file descriptors */
+			if (msg->dst_id == KDBUS_DST_ID_BROADCAST)
+				return -ENOTUNIQ;
+
+			n = KDBUS_ITEM_PAYLOAD_SIZE(item) / sizeof(int);
+			if (n > KDBUS_MSG_MAX_FDS)
+				return -EMFILE;
+
+			kmsg->fds = kcalloc(n, sizeof(*kmsg->fds), GFP_KERNEL);
+			if (!kmsg->fds)
+				return -ENOMEM;
+
+			for (i = 0; i < n; i++) {
+				int ret;
+				int fd = item->fds[i];
+
+				/*
+				 * Verify the fd and increment the usage count.
+				 * Use fget_raw() to allow passing O_PATH fds.
+				 */
+				if (fd < 0)
+					return -EBADF;
+
+				f = fget_raw(fd);
+				if (!f)
+					return -EBADF;
+
+				kmsg->fds[i] = f;
+				kmsg->fds_count++;
+
+				ret = kdbus_handle_check_file(f);
+				if (ret < 0)
+					return ret;
+			}
+
+			break;
+		}
+
+		case KDBUS_ITEM_BLOOM_FILTER: {
+			u64 bloom_size;
+
+			/* do not allow multiple bloom filters */
+			if (has_bloom)
+				return -EEXIST;
+			has_bloom = true;
+
+			/* bloom filters are only for broadcast messages */
+			if (msg->dst_id != KDBUS_DST_ID_BROADCAST)
+				return -EBADMSG;
+
+			bloom_size = payload_size -
+				     offsetof(struct kdbus_bloom_filter, data);
+
+			/*
+			 * Bloom filter sizes must be a multiple of 64 bits.
+			 */
+			if (!KDBUS_IS_ALIGNED8(bloom_size))
+				return -EFAULT;
+
+			/* do not allow mismatching bloom filter sizes */
+			if (bloom_size != conn->ep->bus->bloom.size)
+				return -EDOM;
+
+			kmsg->bloom_filter = &item->bloom_filter;
+			break;
+		}
+
+		case KDBUS_ITEM_DST_NAME:
+			/* do not allow multiple names */
+			if (has_name)
+				return -EEXIST;
+			has_name = true;
+
+			if (!kdbus_name_is_valid(item->str, false))
+				return -EINVAL;
+
+			kmsg->dst_name = item->str;
+			break;
+		}
+	}
+
+	/* name is needed if no ID is given */
+	if (msg->dst_id == KDBUS_DST_ID_NAME && !has_name)
+		return -EDESTADDRREQ;
+
+	if (msg->dst_id == KDBUS_DST_ID_BROADCAST) {
+		/* broadcasts can't take names */
+		if (has_name)
+			return -EBADMSG;
+
+		/* broadcast messages require a bloom filter */
+		if (!has_bloom)
+			return -EBADMSG;
+
+		/* timeouts are not allowed for broadcasts */
+		if (msg->timeout_ns > 0)
+			return -ENOTUNIQ;
+	}
+
+	/* bloom filters are for undirected messages only */
+	if (has_name && has_bloom)
+		return -EBADMSG;
+
+	return 0;
+}
+
+/**
+ * kdbus_kmsg_new_from_user() - copy message from user memory
+ * @conn:		Connection
+ * @msg:		User-provided message
+ *
+ * Return: a new kdbus_kmsg on success, ERR_PTR on failure.
+ */
+struct kdbus_kmsg *kdbus_kmsg_new_from_user(struct kdbus_conn *conn,
+					    struct kdbus_msg __user *msg)
+{
+	struct kdbus_kmsg *m;
+	u64 size, alloc_size;
+	int ret;
+
+	if (!KDBUS_IS_ALIGNED8((unsigned long)msg))
+		return ERR_PTR(-EFAULT);
+
+	if (kdbus_size_get_user(&size, msg, struct kdbus_msg))
+		return ERR_PTR(-EFAULT);
+
+	if (size < sizeof(struct kdbus_msg) || size > KDBUS_MSG_MAX_SIZE)
+		return ERR_PTR(-EMSGSIZE);
+
+	alloc_size = size + KDBUS_KMSG_HEADER_SIZE;
+
+	m = kmalloc(alloc_size, GFP_KERNEL);
+	if (!m)
+		return ERR_PTR(-ENOMEM);
+	memset(m, 0, KDBUS_KMSG_HEADER_SIZE);
+
+	if (copy_from_user(&m->msg, msg, size)) {
+		ret = -EFAULT;
+		goto exit_free;
+	}
+
+	ret = kdbus_items_validate(m->msg.items,
+				   KDBUS_ITEMS_SIZE(&m->msg, items));
+	if (ret < 0)
+		goto exit_free;
+
+	/* do not accept kernel-generated messages */
+	if (m->msg.payload_type == KDBUS_PAYLOAD_KERNEL) {
+		ret = -EINVAL;
+		goto exit_free;
+	}
+
+	ret = kdbus_negotiate_flags(&m->msg, msg, struct kdbus_msg,
+				    KDBUS_MSG_FLAGS_EXPECT_REPLY |
+				    KDBUS_MSG_FLAGS_SYNC_REPLY |
+				    KDBUS_MSG_FLAGS_NO_AUTO_START);
+	if (ret < 0)
+		goto exit_free;
+
+	if (m->msg.flags & KDBUS_MSG_FLAGS_EXPECT_REPLY) {
+		/* requests for replies need a timeout */
+		if (m->msg.timeout_ns == 0) {
+			ret = -EINVAL;
+			goto exit_free;
+		}
+
+		/* replies may not be expected for broadcasts */
+		if (m->msg.dst_id == KDBUS_DST_ID_BROADCAST) {
+			ret = -ENOTUNIQ;
+			goto exit_free;
+		}
+	} else {
+		/*
+		 * KDBUS_MSG_FLAGS_SYNC_REPLY is only valid together with
+		 * KDBUS_MSG_FLAGS_EXPECT_REPLY
+		 */
+		if (m->msg.flags & KDBUS_MSG_FLAGS_SYNC_REPLY) {
+			ret = -EINVAL;
+			goto exit_free;
+		}
+	}
+
+	ret = kdbus_msg_scan_items(conn, m);
+	if (ret < 0)
+		goto exit_free;
+
+	/* patch-in the source of this message */
+	if (m->msg.src_id > 0 && m->msg.src_id != conn->id) {
+		ret = -EINVAL;
+		goto exit_free;
+	}
+	m->msg.src_id = conn->id;
+
+	return m;
+
+exit_free:
+	kdbus_kmsg_free(m);
+	return ERR_PTR(ret);
+}
+
+/**
+ * kdbus_kmsg_attach_metadata() - Attach metadata to a kmsg object
+ * @kmsg:	The message to attach the metadata to
+ * @conn_src:	The source connection that sends the message
+ * @conn_dst:	The destination connection that is about to receive the message
+ *
+ * Append metadata items according to the destination connection's
+ * attach flags. If the source connection has faked credentials, the
+ * metadata object associated with the kmsg has been pre-filled with
+ * conn_src->owner_meta, and we only attach the connection's name and
+ * currently owned names on top of that.
+ *
+ * Return: 0 on success, negative error otherwise.
+ */
+int kdbus_kmsg_attach_metadata(struct kdbus_kmsg *kmsg,
+			       struct kdbus_conn *conn_src,
+			       struct kdbus_conn *conn_dst)
+{
+	u64 attach_flags;
+
+	attach_flags = atomic64_read(&conn_dst->attach_flags_recv);
+
+	if (conn_src->owner_meta)
+		attach_flags &= KDBUS_ATTACH_NAMES |
+				KDBUS_ATTACH_CONN_DESCRIPTION;
+
+	return kdbus_meta_append(kmsg->meta, conn_dst->ep->bus->domain,
+				 conn_src, kmsg->seq, attach_flags);
+}
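
The destination and reply-flag rules enforced by kdbus_kmsg_new_from_user() and kdbus_msg_scan_items() are spread over two functions; collected into one pure function they read roughly as follows. This is a sketch under assumptions: struct msg_summary and msg_check() are invented for illustration, and the DST_ID_* and FLAG_* values are placeholders rather than the uapi constants. The error codes mirror those in the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#ifndef ENOTUNIQ
#define ENOTUNIQ 76	/* Linux value, fallback for other platforms */
#endif

/* Placeholder values, assumed for this sketch only. */
#define DST_ID_NAME		0ULL
#define DST_ID_BROADCAST	(~0ULL)
#define FLAG_EXPECT_REPLY	(1ULL << 0)
#define FLAG_SYNC_REPLY		(1ULL << 1)

/* What the item scan learned about a message. */
struct msg_summary {
	uint64_t dst_id;
	uint64_t flags;
	uint64_t timeout_ns;
	bool has_name;		/* a destination name item was present */
	bool has_bloom;		/* a bloom filter item was present */
};

static int msg_check(const struct msg_summary *m)
{
	assert(m != NULL);

	if (m->flags & FLAG_EXPECT_REPLY) {
		/* requests for replies need a timeout ... */
		if (m->timeout_ns == 0)
			return -EINVAL;
		/* ... and cannot be broadcast */
		if (m->dst_id == DST_ID_BROADCAST)
			return -ENOTUNIQ;
	} else if (m->flags & FLAG_SYNC_REPLY) {
		/* a synchronous reply implies an expected reply */
		return -EINVAL;
	}

	/* addressing by name needs a name */
	if (m->dst_id == DST_ID_NAME && !m->has_name)
		return -EDESTADDRREQ;

	if (m->dst_id == DST_ID_BROADCAST) {
		/* broadcasts take a bloom filter, no name, no timeout */
		if (m->has_name || !m->has_bloom)
			return -EBADMSG;
		if (m->timeout_ns > 0)
			return -ENOTUNIQ;
	}

	/* bloom filters are for undirected messages only */
	if (m->has_name && m->has_bloom)
		return -EBADMSG;

	return 0;
}
```

Keeping the rules in one side-effect-free function makes them easy to table-test; the patch itself interleaves them with fd and memfd reference handling.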
diff --git a/ipc/kdbus/message.h b/ipc/kdbus/message.h
new file mode 100644
index 000000000000..b610e2c4f67b
--- /dev/null
+++ b/ipc/kdbus/message.h
@@ -0,0 +1,75 @@
+/*
+ * Copyright (C) 2013-2014 Kay Sievers
+ * Copyright (C) 2013-2014 Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+ * Copyright (C) 2013-2014 Daniel Mack <daniel@zonque.org>
+ * Copyright (C) 2013-2014 David Herrmann <dh.herrmann@gmail.com>
+ * Copyright (C) 2013-2014 Linux Foundation
+ *
+ * kdbus is free software; you can redistribute it and/or modify it under
+ * the terms of the GNU Lesser General Public License as published by the
+ * Free Software Foundation; either version 2.1 of the License, or (at
+ * your option) any later version.
+ */
+
+#ifndef __KDBUS_MESSAGE_H
+#define __KDBUS_MESSAGE_H
+
+#include "util.h"
+#include "metadata.h"
+
+/**
+ * struct kdbus_kmsg - internal message handling data
+ * @seq:		Domain-global message sequence number
+ * @notify_type:	Short-cut for faster lookup
+ * @notify_old_id:	Short-cut for faster lookup
+ * @notify_new_id:	Short-cut for faster lookup
+ * @notify_name:	Short-cut for faster lookup
+ * @dst_name:		Short-cut to msg for faster lookup
+ * @dst_name_id:	Short-cut to msg for faster lookup
+ * @bloom_filter:	Bloom filter to match message properties
+ * @bloom_generation:	Generation of bloom element set
+ * @fds:		Array of file descriptors to pass
+ * @fds_count:		Number of file descriptors to pass
+ * @meta:		Appended SCM-like metadata of the sending process
+ * @vecs_size:		Size of PAYLOAD data
+ * @vecs_count:		Number of PAYLOAD vectors
+ * @memfds_count:	Number of memfds to pass
+ * @notify_entry:	List of kernel-generated notifications
+ * @msg:		Message from or to userspace
+ */
+struct kdbus_kmsg {
+	u64 seq;
+	u64 notify_type;
+	u64 notify_old_id;
+	u64 notify_new_id;
+	const char *notify_name;
+
+	const char *dst_name;
+	u64 dst_name_id;
+	const struct kdbus_bloom_filter *bloom_filter;
+	u64 bloom_generation;
+	struct file **fds;
+	unsigned int fds_count;
+	struct kdbus_meta *meta;
+	size_t vecs_size;
+	unsigned int vecs_count;
+	struct file **memfds;
+	unsigned int memfds_count;
+	struct list_head notify_entry;
+
+	/* variable size, must be the last member */
+	struct kdbus_msg msg;
+};
+
+struct kdbus_ep;
+struct kdbus_conn;
+
+struct kdbus_kmsg *kdbus_kmsg_new(size_t extra_size);
+struct kdbus_kmsg *kdbus_kmsg_new_from_user(struct kdbus_conn *conn,
+					    struct kdbus_msg __user *msg);
+void kdbus_kmsg_free(struct kdbus_kmsg *kmsg);
+
+int kdbus_kmsg_attach_metadata(struct kdbus_kmsg *kmsg,
+			       struct kdbus_conn *conn_src,
+			       struct kdbus_conn *conn_dst);
+#endif
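
struct kdbus_kmsg keeps its kernel-private fields in front of the variable-size user message, and kdbus_kmsg_new_from_user() sizes the allocation as header plus the user-supplied size, zeroing only the header before the copy. A user-space sketch of that layout follows. struct wire_msg, struct wrapped_msg, wrap_msg() and MSG_MAX_SIZE are illustrative assumptions; malloc() and memcpy() stand in for kmalloc() and copy_from_user().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MSG_MAX_SIZE 8192	/* hypothetical cap, not the kdbus limit */

/* Fixed part of the message as seen by user space; items follow it. */
struct wire_msg {
	uint64_t size;		/* total size, including this header */
	uint64_t flags;
};

/* Private bookkeeping in front, variable-size message last. */
struct wrapped_msg {
	uint64_t seq;
	unsigned int fds_count;
	struct wire_msg msg;	/* variable size, must be the last member */
};

#define WRAP_HEADER_SIZE offsetof(struct wrapped_msg, msg)

/* Returns a newly allocated wrapper, or NULL on any error. */
static struct wrapped_msg *wrap_msg(const void *src, size_t src_len)
{
	struct wrapped_msg *w;
	uint64_t size;

	assert(src != NULL);

	if (src_len < sizeof(struct wire_msg))
		return NULL;

	/* read the self-declared size first, then bound it */
	memcpy(&size, src, sizeof(size));
	if (size < sizeof(struct wire_msg) || size > MSG_MAX_SIZE ||
	    size > src_len)
		return NULL;

	w = malloc(WRAP_HEADER_SIZE + (size_t)size);
	if (!w)
		return NULL;

	/* only the private header needs zeroing; the rest is overwritten */
	memset(w, 0, WRAP_HEADER_SIZE);
	memcpy((uint8_t *)w + WRAP_HEADER_SIZE, src, (size_t)size);
	return w;
}
```

Bounding the self-declared size before allocating is the important step: the size comes from the sender and must not be trusted until it has been range-checked.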
diff --git a/ipc/kdbus/queue.c b/ipc/kdbus/queue.c
new file mode 100644
index 000000000000..668831d66d0f
--- /dev/null
+++ b/ipc/kdbus/queue.c
@@ -0,0 +1,608 @@
+/*
+ * Copyright (C) 2013-2014 Kay Sievers
+ * Copyright (C) 2013-2014 Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+ * Copyright (C) 2013-2014 Daniel Mack <daniel@zonque.org>
+ * Copyright (C) 2013-2014 David Herrmann <dh.herrmann@gmail.com>
+ * Copyright (C) 2013-2014 Linux Foundation
+ *
+ * kdbus is free software; you can redistribute it and/or modify it under
+ * the terms of the GNU Lesser General Public License as published by the
+ * Free Software Foundation; either version 2.1 of the License, or (at
+ * your option) any later version.
+ */
+
+#include <linux/audit.h>
+#include <linux/file.h>
+#include <linux/fs.h>
+#include <linux/hashtable.h>
+#include <linux/idr.h>
+#include <linux/init.h>
+#include <linux/math64.h>
+#include <linux/mm.h>
+#include <linux/module.h>
+#include <linux/mutex.h>
+#include <linux/poll.h>
+#include <linux/sched.h>
+#include <linux/sizes.h>
+#include <linux/slab.h>
+#include <linux/syscalls.h>
+
+#include "domain.h"
+#include "connection.h"
+#include "item.h"
+#include "message.h"
+#include "metadata.h"
+#include "util.h"
+#include "queue.h"
+
+static int kdbus_queue_entry_fds_install(struct kdbus_queue_entry *entry)
+{
+	unsigned int i;
+	int ret, *fds;
+	size_t count;
+
+	/* get array of file descriptors */
+	count = entry->fds_count + entry->memfds_count;
+	if (!count)
+		return 0;
+
+	fds = kcalloc(count, sizeof(int), GFP_KERNEL);
+	if (!fds)
+		return -ENOMEM;
+
+	/* allocate new file descriptors in the receiver's process */
+	for (i = 0; i < count; i++) {
+		fds[i] = get_unused_fd_flags(O_CLOEXEC);
+		if (fds[i] < 0) {
+			ret = fds[i];
+			goto exit_remove_unused;
+		}
+	}
+
+	if (entry->fds_count) {
+		/* copy the array into the message item */
+		ret = kdbus_pool_slice_copy(entry->slice, entry->fds, fds,
+					    entry->fds_count * sizeof(int));
+		if (ret < 0)
+			goto exit_remove_unused;
+
+		/* install files in the receiver's process */
+		for (i = 0; i < entry->fds_count; i++)
+			fd_install(fds[i], get_file(entry->fds_fp[i]));
+	}
+
+	if (entry->memfds_count) {
+		off_t o = entry->fds_count;
+
+		/*
+		 * Update the file descriptor number in the items.
+		 * We remembered the locations of the values in the buffer.
+		 */
+		for (i = 0; i < entry->memfds_count; i++) {
+			ret = kdbus_pool_slice_copy(entry->slice,
+						    entry->memfds[i],
+						    &fds[o + i], sizeof(int));
+			if (ret < 0)
+				goto exit_rewind_fds;
+		}
+
+		/* install files in the receiver's process */
+		for (i = 0; i < entry->memfds_count; i++)
+			fd_install(fds[o + i], get_file(entry->memfds_fp[i]));
+	}
+
+	kfree(fds);
+	return 0;
+
+exit_rewind_fds:
+	for (i = 0; i < entry->fds_count; i++)
+		sys_close(fds[i]);
+
+exit_remove_unused:
+	for (i = 0; i < count; i++) {
+		if (fds[i] < 0)
+			break;
+
+		put_unused_fd(fds[i]);
+	}
+
+	kfree(fds);
+	return ret;
+}
+
+/**
+ * kdbus_queue_entry_install() - install message components into the
+ *				 receiver's process
+ * @entry:	The queue entry to install
+ *
+ * This function will install file descriptors transported in a queue entry
+ * into 'current'.
+ *
+ * Return: 0 on success.
+ */
+int kdbus_queue_entry_install(struct kdbus_queue_entry *entry)
+{
+	int ret;
+
+	ret = kdbus_queue_entry_fds_install(entry);
+	if (ret < 0)
+		return ret;
+
+	kdbus_pool_slice_flush(entry->slice);
+	return 0;
+}
+
+static int kdbus_queue_entry_payload_add(struct kdbus_queue_entry *entry,
+					 const struct kdbus_kmsg *kmsg,
+					 size_t items, size_t vec_data)
+{
+	const struct kdbus_item *item;
+	int ret;
+
+	if (kmsg->memfds_count > 0) {
+		entry->memfds = kcalloc(kmsg->memfds_count,
+					sizeof(off_t), GFP_KERNEL);
+		if (!entry->memfds)
+			return -ENOMEM;
+
+		entry->memfds_fp = kcalloc(kmsg->memfds_count,
+					   sizeof(struct file *), GFP_KERNEL);
+		if (!entry->memfds_fp)
+			return -ENOMEM;
+	}
+
+	KDBUS_ITEMS_FOREACH(item, kmsg->msg.items,
+			    KDBUS_ITEMS_SIZE(&kmsg->msg, items)) {
+		switch (item->type) {
+		case KDBUS_ITEM_PAYLOAD_VEC: {
+			char tmp[KDBUS_ITEM_HEADER_SIZE +
+				 sizeof(struct kdbus_vec)];
+			struct kdbus_item *it = (struct kdbus_item *)tmp;
+
+			/* add item */
+			it->type = KDBUS_ITEM_PAYLOAD_OFF;
+			it->size = sizeof(tmp);
+
+			/* a NULL address specifies a \0-bytes record */
+			if (KDBUS_PTR(item->vec.address))
+				it->vec.offset = vec_data;
+			else
+				it->vec.offset = ~0ULL;
+			it->vec.size = item->vec.size;
+			ret = kdbus_pool_slice_copy(entry->slice, items,
+						    it, it->size);
+			if (ret < 0)
+				return ret;
+			items += KDBUS_ALIGN8(it->size);
+
+			/* \0-bytes record */
+			if (!KDBUS_PTR(item->vec.address)) {
+				size_t l = item->vec.size % 8;
+				const char *n = "\0\0\0\0\0\0\0";
+
+				if (l == 0)
+					break;
+
+				/*
+				 * Preserve the alignment for the next payload
+				 * record in the output buffer; write as many
+				 * null-bytes to the buffer as the \0-bytes
+				 * record would have shifted the alignment by.
+				 */
+				ret = kdbus_pool_slice_copy(entry->slice,
+							    vec_data, n, l);
+				if (ret < 0)
+					return ret;
+
+				vec_data += l;
+				break;
+			}
+
+			/* copy kdbus_vec data from sender to receiver */
+			ret = kdbus_pool_slice_copy_user(entry->slice, vec_data,
+				KDBUS_PTR(item->vec.address), item->vec.size);
+			if (ret < 0)
+				return ret;
+
+			vec_data += item->vec.size;
+			break;
+		}
+
+		case KDBUS_ITEM_PAYLOAD_MEMFD: {
+			char tmp[KDBUS_ITEM_HEADER_SIZE +
+				 sizeof(struct kdbus_memfd)];
+			struct kdbus_item *it = (struct kdbus_item *)tmp;
+
+			/* add item */
+			it->type = KDBUS_ITEM_PAYLOAD_MEMFD;
+			it->size = sizeof(tmp);
+			it->memfd.size = item->memfd.size;
+			it->memfd.fd = -1;
+			ret = kdbus_pool_slice_copy(entry->slice, items,
+						    it, it->size);
+			if (ret < 0)
+				return ret;
+
+			/*
+			 * Remember the file and the location of the fd number
+			 * which will be updated at RECV time.
+			 */
+			entry->memfds[entry->memfds_count] =
+				items + offsetof(struct kdbus_item, memfd.fd);
+			entry->memfds_fp[entry->memfds_count] =
+				get_file(kmsg->memfds[entry->memfds_count]);
+			entry->memfds_count++;
+
+			items += KDBUS_ALIGN8(it->size);
+			break;
+		}
+
+		default:
+			break;
+		}
+	}
+
+	return 0;
+}
+
+/**
+ * kdbus_queue_entry_add() - Add a queue entry to a queue
+ * @queue:	The queue to attach the item to
+ * @entry:	The entry to attach
+ *
+ * Adds a previously allocated queue item to a queue, and maintains the
+ * priority r/b tree.
+ */
+/* add queue entry to connection, maintain priority queue */
+void kdbus_queue_entry_add(struct kdbus_queue *queue,
+			   struct kdbus_queue_entry *entry)
+{
+	struct rb_node **n, *pn = NULL;
+	bool highest = true;
+
+	/* sort into priority entry tree */
+	n = &queue->msg_prio_queue.rb_node;
+	while (*n) {
+		struct kdbus_queue_entry *e;
+
+		pn = *n;
+		e = rb_entry(pn, struct kdbus_queue_entry, prio_node);
+
+		/* existing node for this priority, add to its list */
+		if (likely(entry->priority == e->priority)) {
+			list_add_tail(&entry->prio_entry, &e->prio_entry);
+			goto prio_done;
+		}
+
+		if (entry->priority < e->priority) {
+			n = &pn->rb_left;
+		} else {
+			n = &pn->rb_right;
+			highest = false;
+		}
+	}
+
+	/* cache highest-priority entry */
+	if (highest)
+		queue->msg_prio_highest = &entry->prio_node;
+
+	/* new node for this priority */
+	rb_link_node(&entry->prio_node, pn, n);
+	rb_insert_color(&entry->prio_node, &queue->msg_prio_queue);
+	INIT_LIST_HEAD(&entry->prio_entry);
+
+prio_done:
+	/* add to unsorted fifo list */
+	list_add_tail(&entry->entry, &queue->msg_list);
+	queue->msg_count++;
+}
+
+/**
+ * kdbus_queue_entry_peek() - Retrieves an entry from a queue
+ *
+ * @queue:		The queue
+ * @priority:		The minimum priority of the entry to peek
+ * @use_priority:	Boolean flag whether or not to peek by priority
+ *
+ * Look for an entry in a queue, either by priority, or the oldest one (FIFO).
+ * The entry is neither freed nor removed from any of the queue's lists.
+ *
+ * Return: the peeked queue entry on success, ERR_PTR(-ENOMSG) if there is no
+ * entry with the requested priority, or ERR_PTR(-EAGAIN) if there are no
+ * entries at all.
+ */
+struct kdbus_queue_entry *kdbus_queue_entry_peek(struct kdbus_queue *queue,
+						 s64 priority,
+						 bool use_priority)
+{
+	struct kdbus_queue_entry *e;
+
+	if (queue->msg_count == 0)
+		return ERR_PTR(-EAGAIN);
+
+	if (use_priority) {
+		/* get next entry with highest priority */
+		e = rb_entry(queue->msg_prio_highest,
+			     struct kdbus_queue_entry, prio_node);
+
+		/* no entry with the requested priority */
+		if (e->priority > priority)
+			return ERR_PTR(-ENOMSG);
+	} else {
+		/* ignore the priority, return the next entry in the list */
+		e = list_first_entry(&queue->msg_list,
+				     struct kdbus_queue_entry, entry);
+	}
+
+	return e;
+}
+
+/**
+ * kdbus_queue_entry_remove() - Remove an entry from a queue
+ * @conn:	The connection containing the queue
+ * @entry:	The entry to remove
+ *
+ * Remove an entry from both the queue's list and the priority r/b tree.
+ */
+void kdbus_queue_entry_remove(struct kdbus_conn *conn,
+			      struct kdbus_queue_entry *entry)
+{
+	struct kdbus_queue *queue = &conn->queue;
+
+	list_del(&entry->entry);
+	queue->msg_count--;
+
+	/* user quota */
+	if (entry->user) {
+		BUG_ON(conn->msg_users[entry->user->idr] == 0);
+		conn->msg_users[entry->user->idr]--;
+		entry->user = kdbus_domain_user_unref(entry->user);
+	}
+
+	/* the queue is empty, remove the user quota accounting */
+	if (queue->msg_count == 0 && conn->msg_users_max > 0) {
+		kfree(conn->msg_users);
+		conn->msg_users = NULL;
+		conn->msg_users_max = 0;
+	}
+
+	if (list_empty(&entry->prio_entry)) {
+		/*
+		 * Single entry for this priority, update cached
+		 * highest-priority entry, remove the tree node.
+		 */
+		if (queue->msg_prio_highest == &entry->prio_node)
+			queue->msg_prio_highest = rb_next(&entry->prio_node);
+
+		rb_erase(&entry->prio_node, &queue->msg_prio_queue);
+	} else {
+		struct kdbus_queue_entry *q;
+
+		/*
+		 * Multiple entries for this priority entry, get next one in
+		 * the list. Update cached highest-priority entry, store the
+		 * new one as the tree node.
+		 */
+		q = list_first_entry(&entry->prio_entry,
+				     struct kdbus_queue_entry, prio_entry);
+		list_del(&entry->prio_entry);
+
+		if (queue->msg_prio_highest == &entry->prio_node)
+			queue->msg_prio_highest = &q->prio_node;
+
+		rb_replace_node(&entry->prio_node, &q->prio_node,
+				&queue->msg_prio_queue);
+	}
+}
+
+/**
+ * kdbus_queue_entry_alloc() - allocate a queue entry
+ * @conn_src:	The connection used to create the message
+ * @conn_dst:	The connection that holds the queue
+ * @kmsg:	The kmsg object the queue entry should track
+ *
+ * Allocates a queue entry based on a given kmsg and allocates space for
+ * the message payload and the requested metadata in the connection's pool.
+ * The entry is not actually added to the queue's lists at this point.
+ *
+ * Return: the allocated entry on success, or an ERR_PTR on failure.
+ */
+struct kdbus_queue_entry *kdbus_queue_entry_alloc(struct kdbus_conn *conn_src,
+						  struct kdbus_conn *conn_dst,
+						  const struct kdbus_kmsg *kmsg)
+{
+	struct kdbus_queue_entry *entry;
+	struct kdbus_item *it;
+	u64 attach_flags = 0;
+	size_t msg_size;
+	size_t size;
+	size_t dst_name_len = 0;
+	size_t payloads = 0;
+	size_t fds = 0;
+	size_t meta_off = 0;
+	size_t vec_data;
+	size_t want, have;
+	int ret = 0;
+
+	entry = kzalloc(sizeof(*entry), GFP_KERNEL);
+	if (!entry)
+		return ERR_PTR(-ENOMEM);
+
+	/* copy message properties we need for the entry management */
+	entry->src_id = kmsg->msg.src_id;
+	entry->cookie = kmsg->msg.cookie;
+
+	/* space for the header */
+	if (kmsg->msg.src_id == KDBUS_SRC_ID_KERNEL)
+		size = kmsg->msg.size;
+	else
+		size = offsetof(struct kdbus_msg, items);
+	msg_size = size;
+
+	/* let the receiver know where the message was addressed to */
+	if (kmsg->dst_name) {
+		dst_name_len = strlen(kmsg->dst_name) + 1;
+		msg_size += KDBUS_ITEM_SIZE(dst_name_len);
+		entry->dst_name_id = kmsg->dst_name_id;
+	}
+
+	/* space for PAYLOAD items */
+	if ((kmsg->vecs_count + kmsg->memfds_count) > 0) {
+		payloads = msg_size;
+		msg_size += KDBUS_ITEM_SIZE(sizeof(struct kdbus_vec)) *
+			    kmsg->vecs_count;
+		msg_size += KDBUS_ITEM_SIZE(sizeof(struct kdbus_memfd)) *
+			    kmsg->memfds_count;
+	}
+
+	/* space for FDS item */
+	if (kmsg->fds_count > 0) {
+		entry->fds_fp = kcalloc(kmsg->fds_count, sizeof(struct file *),
+					GFP_KERNEL);
+		if (!entry->fds_fp) {
+			ret = -ENOMEM;
+			goto exit_free_entry;
+		}
+
+		fds = msg_size;
+		msg_size += KDBUS_ITEM_SIZE(kmsg->fds_count * sizeof(int));
+	}
+
+	if (conn_src)
+		attach_flags = atomic64_read(&conn_src->attach_flags_send) &
+			       atomic64_read(&conn_dst->attach_flags_recv);
+
+	/* space for metadata/credential items */
+	if (kmsg->meta && attach_flags) {
+		size_t meta_size;
+
+		meta_size = kdbus_meta_size(kmsg->meta, conn_dst,
+					    &attach_flags);
+		if (meta_size > 0) {
+			meta_off = msg_size;
+			msg_size += meta_size;
+		}
+	}
+
+	/* data starts after the message */
+	vec_data = KDBUS_ALIGN8(msg_size);
+
+	/* do not give out more than half of the remaining space */
+	want = vec_data + kmsg->vecs_size;
+	have = kdbus_pool_remain(conn_dst->pool);
+	if (want < have && want > have / 2) {
+		ret = -EXFULL;
+		goto exit_free_entry;
+	}
+
+	/* allocate the needed space in the pool of the receiver */
+	entry->slice = kdbus_pool_slice_alloc(conn_dst->pool, want);
+	if (IS_ERR(entry->slice)) {
+		ret = PTR_ERR(entry->slice);
+		entry->slice = NULL;
+		goto exit_free_entry;
+	}
+
+	/* copy the message header */
+	ret = kdbus_pool_slice_copy(entry->slice, 0, &kmsg->msg, size);
+	if (ret < 0)
+		goto exit_free_slice;
+
+	/* update the size */
+	ret = kdbus_pool_slice_copy(entry->slice, 0, &msg_size,
+				    sizeof(kmsg->msg.size));
+	if (ret < 0)
+		goto exit_free_slice;
+
+	if (dst_name_len  > 0) {
+		char tmp[KDBUS_ITEM_HEADER_SIZE + dst_name_len];
+
+		it = (struct kdbus_item *)tmp;
+		it->size = KDBUS_ITEM_HEADER_SIZE + dst_name_len;
+		it->type = KDBUS_ITEM_DST_NAME;
+		memcpy(it->str, kmsg->dst_name, dst_name_len);
+
+		ret = kdbus_pool_slice_copy(entry->slice, size, it, it->size);
+		if (ret < 0)
+			goto exit_free_slice;
+	}
+
+	/* add PAYLOAD items */
+	if (payloads > 0) {
+		ret = kdbus_queue_entry_payload_add(entry, kmsg,
+						    payloads, vec_data);
+		if (ret < 0)
+			goto exit_free_slice;
+	}
+
+	/* add an FDS item; the array content will be updated at RECV time */
+	if (kmsg->fds_count > 0) {
+		char tmp[KDBUS_ITEM_HEADER_SIZE];
+		unsigned int i;
+
+		it = (struct kdbus_item *)tmp;
+		it->type = KDBUS_ITEM_FDS;
+		it->size = KDBUS_ITEM_HEADER_SIZE +
+			   (kmsg->fds_count * sizeof(int));
+		ret = kdbus_pool_slice_copy(entry->slice, fds,
+					    it, KDBUS_ITEM_HEADER_SIZE);
+		if (ret < 0)
+			goto exit_free_slice;
+
+		for (i = 0; i < kmsg->fds_count; i++) {
+			entry->fds_fp[i] = get_file(kmsg->fds[i]);
+			if (!entry->fds_fp[i]) {
+				ret = -EBADF;
+				goto exit_free_slice;
+			}
+		}
+
+		/* remember the array to update at RECV */
+		entry->fds = fds + offsetof(struct kdbus_item, fds);
+		entry->fds_count = kmsg->fds_count;
+	}
+
+	/* append message metadata/credential items */
+	if (meta_off > 0) {
+		ret = kdbus_meta_write(kmsg->meta, conn_dst, attach_flags,
+				       entry->slice, meta_off);
+		if (ret < 0)
+			goto exit_free_slice;
+	}
+
+	entry->priority = kmsg->msg.priority;
+	return entry;
+
+exit_free_slice:
+	kdbus_pool_slice_free(entry->slice);
+exit_free_entry:
+	kdbus_queue_entry_free(entry);
+	return ERR_PTR(ret);
+}
+
+/**
+ * kdbus_queue_entry_free() - free resources of an entry
+ * @entry:	The entry to free
+ *
+ * Removes resources allocated by a queue entry, along with the entry itself.
+ * Note that the entry's slice is not freed at this point.
+ */
+void kdbus_queue_entry_free(struct kdbus_queue_entry *entry)
+{
+	kdbus_fput_files(entry->memfds_fp, entry->memfds_count);
+	kdbus_fput_files(entry->fds_fp, entry->fds_count);
+	kfree(entry->memfds_fp);
+	kfree(entry->fds_fp);
+	kfree(entry->memfds);
+	kfree(entry);
+}
+
+/**
+ * kdbus_queue_init() - initialize data structure related to a queue
+ * @queue:	The queue to initialize
+ */
+void kdbus_queue_init(struct kdbus_queue *queue)
+{
+	INIT_LIST_HEAD(&queue->msg_list);
+	queue->msg_prio_queue = RB_ROOT;
+}
diff --git a/ipc/kdbus/queue.h b/ipc/kdbus/queue.h
new file mode 100644
index 000000000000..2ddde8b81cfe
--- /dev/null
+++ b/ipc/kdbus/queue.h
@@ -0,0 +1,93 @@
+/*
+ * Copyright (C) 2013-2014 Kay Sievers
+ * Copyright (C) 2013-2014 Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+ * Copyright (C) 2013-2014 Daniel Mack <daniel@zonque.org>
+ * Copyright (C) 2013-2014 David Herrmann <dh.herrmann@gmail.com>
+ * Copyright (C) 2013-2014 Linux Foundation
+ *
+ * kdbus is free software; you can redistribute it and/or modify it under
+ * the terms of the GNU Lesser General Public License as published by the
+ * Free Software Foundation; either version 2.1 of the License, or (at
+ * your option) any later version.
+ */
+
+#ifndef __KDBUS_QUEUE_H
+#define __KDBUS_QUEUE_H
+
+struct kdbus_domain_user;
+
+/**
+ * struct kdbus_queue - a connection's message queue
+ * @msg_count:		Number of messages in the queue
+ * @msg_list:		List head for kdbus_queue_entry objects
+ * @msg_prio_queue:	RB tree root for messages, sorted by priority
+ * @msg_prio_highest:	Link to the RB node referencing the message with the
+ *			highest priority in the tree.
+ */
+struct kdbus_queue {
+	size_t msg_count;
+	struct list_head msg_list;
+	struct rb_root msg_prio_queue;
+	struct rb_node *msg_prio_highest;
+};
+
+/**
+ * struct kdbus_queue_entry - messages waiting to be read
+ * @entry:		Entry in the connection's list
+ * @prio_node:		Entry in the priority queue tree
+ * @prio_entry:		Queue tree node entry in the list of one priority
+ * @priority:		Queueing priority of the message
+ * @slice:		Allocated slice in the receiver's pool
+ * @memfds:		Array of offsets where to update the installed
+ *			fd number
+ * @memfds_fp:		Array of memfd files queued up for this message
+ * @memfds_count:	Number of memfds
+ * @fds:		Offset to array where to update the installed fd number
+ * @fds_fp:		Array of passed files queued up for this message
+ * @fds_count:		Number of files
+ * @src_id:		The ID of the sender
+ * @cookie:		Message cookie, used for replies
+ * @dst_name_id:	The sequence number of the name this message is
+ *			addressed to, 0 for messages sent to an ID
+ * @reply:		The reply block if a reply to this message is expected.
+ * @user:		Domain user accounted for this entry, NULL if unused
+ */
+struct kdbus_queue_entry {
+	struct list_head entry;
+	struct rb_node prio_node;
+	struct list_head prio_entry;
+	s64 priority;
+	struct kdbus_pool_slice *slice;
+	size_t *memfds;
+	struct file **memfds_fp;
+	unsigned int memfds_count;
+	size_t fds;
+	struct file **fds_fp;
+	unsigned int fds_count;
+	u64 src_id;
+	u64 cookie;
+	u64 dst_name_id;
+	struct kdbus_conn_reply *reply;
+	struct kdbus_domain_user *user;
+};
+
+struct kdbus_kmsg;
+
+void kdbus_queue_init(struct kdbus_queue *queue);
+
+struct kdbus_queue_entry *
+kdbus_queue_entry_alloc(struct kdbus_conn *conn_src,
+			struct kdbus_conn *conn_dst,
+			const struct kdbus_kmsg *kmsg);
+void kdbus_queue_entry_free(struct kdbus_queue_entry *entry);
+
+void kdbus_queue_entry_add(struct kdbus_queue *queue,
+			   struct kdbus_queue_entry *entry);
+void kdbus_queue_entry_remove(struct kdbus_conn *conn,
+			      struct kdbus_queue_entry *entry);
+struct kdbus_queue_entry *kdbus_queue_entry_peek(struct kdbus_queue *queue,
+						 s64 priority,
+						 bool use_priority);
+int kdbus_queue_entry_install(struct kdbus_queue_entry *entry);
+
+#endif /* __KDBUS_QUEUE_H */
diff --git a/ipc/kdbus/util.h b/ipc/kdbus/util.h
index e727a2134d0c..39347a394bc8 100644
--- a/ipc/kdbus/util.h
+++ b/ipc/kdbus/util.h
@@ -17,7 +17,7 @@
 #include <linux/dcache.h>
 #include <linux/ioctl.h>
 
-#include "kdbus.h"
+#include <uapi/linux/kdbus.h>
 
 /* all exported addresses are 64 bit */
 #define KDBUS_PTR(addr) ((void __user *)(uintptr_t)(addr))
-- 
2.1.3
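
As a reading aid for kdbus_queue_entry_add() and kdbus_queue_entry_peek()
in the patch above, here is a minimal user-space sketch of the same peek
semantics. It is not the kernel code: a sorted singly-linked list stands in
for the r/b tree and its cached msg_prio_highest pointer, and all names
(sketch_entry, sketch_queue, sketch_add, sketch_peek) are hypothetical. A
numerically lower priority value is treated as more urgent, matching the
tree ordering in the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct kdbus_queue_entry. */
struct sketch_entry {
	int64_t priority;
	struct sketch_entry *fifo_next;	/* arrival order */
	struct sketch_entry *prio_next;	/* sorted by priority value */
};

/* Hypothetical stand-in for struct kdbus_queue. */
struct sketch_queue {
	size_t count;
	struct sketch_entry *fifo_head, *fifo_tail;
	struct sketch_entry *prio_head;	/* role of msg_prio_highest */
};

static void sketch_add(struct sketch_queue *q, struct sketch_entry *e)
{
	struct sketch_entry **p = &q->prio_head;

	assert(q && e);

	/* Skip lower-or-equal values so equal priorities stay in FIFO order. */
	while (*p && (*p)->priority <= e->priority)
		p = &(*p)->prio_next;
	e->prio_next = *p;
	*p = e;

	/* Append to the unsorted FIFO list as well. */
	e->fifo_next = NULL;
	if (q->fifo_tail)
		q->fifo_tail->fifo_next = e;
	else
		q->fifo_head = e;
	q->fifo_tail = e;
	q->count++;
}

/*
 * Returns 0 and stores the entry in *out, -EAGAIN if the queue is empty,
 * or -ENOMSG if the most urgent entry is still above the requested value.
 */
static int sketch_peek(const struct sketch_queue *q, int64_t priority,
		       int use_priority, struct sketch_entry **out)
{
	assert(q && out);

	if (q->count == 0)
		return -EAGAIN;

	if (use_priority) {
		if (q->prio_head->priority > priority)
			return -ENOMSG;
		*out = q->prio_head;
	} else {
		*out = q->fifo_head;
	}

	return 0;
}
```

With entries of priority 5, -2 and 5 added in that order, a FIFO peek
yields the first entry, a priority peek with value 0 yields the -2 entry,
and a priority peek with value -10 reports -ENOMSG.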


* kdbus: add node and filesystem implementation
From: Greg Kroah-Hartman @ 2014-11-21  5:02 UTC (permalink / raw)
  To: arnd, ebiederm, gnomes, teg, jkosina, luto, linux-api, linux-kernel
  Cc: daniel, dh.herrmann, tixxdz, Greg Kroah-Hartman

From: Daniel Mack <daniel@zonque.org>

kdbusfs is a filesystem that will expose a fresh kdbus domain context
each time it is mounted. Per mount point, there will be a 'control'
node, which can be used to create buses. fs.c contains the
implementation of that pseudo-fs. Exported inodes of 'file' type have
their i_fop set to either kdbus_handle_control_ops or
kdbus_handle_ep_ops, depending on their type. The actual dispatching
of file operations is done from handle.c.

node.c is an implementation of a kdbus object that has an id and
children, organized in an R/B tree. The tree is used by the filesystem
code for lookup and iterator functions, and to deactivate children
once the parent is deactivated. Every inode exported by kdbusfs is
backed by a kdbus_node, hence it is embedded in struct kdbus_ep,
struct kdbus_bus and struct kdbus_domain.

Signed-off-by: Daniel Mack <daniel@zonque.org>
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Signed-off-by: Djalal Harouni <tixxdz@opendz.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 include/uapi/linux/magic.h |   1 +
 ipc/kdbus/fs.c             | 417 ++++++++++++++++++++++
 ipc/kdbus/fs.h             |  22 ++
 ipc/kdbus/node.c           | 872 +++++++++++++++++++++++++++++++++++++++++++++
 ipc/kdbus/node.h           |  86 +++++
 5 files changed, 1398 insertions(+)
 create mode 100644 ipc/kdbus/fs.c
 create mode 100644 ipc/kdbus/fs.h
 create mode 100644 ipc/kdbus/node.c
 create mode 100644 ipc/kdbus/node.h
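
For readers skimming the description above, the mapping from node type to
inode kind and file operations can be summarized in a small user-space
sketch. This is only an illustration, not code from the patch: the
sketch_* names are hypothetical, the file operations are represented by
their names as strings, and sketch_magic_ascii() merely shows that the
KDBUS_SUPER_MAGIC value added to magic.h spells four ASCII characters.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>	/* strcmp() for callers comparing the names */

#define KDBUS_SUPER_MAGIC 0x44427573	/* value added to magic.h by this patch */

/* Hypothetical mirror of the node types handled by fs_inode_get(). */
enum sketch_node_type {
	SKETCH_NODE_DOMAIN,
	SKETCH_NODE_BUS,
	SKETCH_NODE_CONTROL,
	SKETCH_NODE_ENDPOINT,
};

/* Domains and buses are exposed as directories, the rest as regular files. */
static int sketch_is_dir(enum sketch_node_type type)
{
	return type == SKETCH_NODE_DOMAIN || type == SKETCH_NODE_BUS;
}

/* Name of the file_operations table the patch assigns to i_fop. */
static const char *sketch_fops_name(enum sketch_node_type type)
{
	switch (type) {
	case SKETCH_NODE_DOMAIN:
	case SKETCH_NODE_BUS:
		return "fs_dir_fops";
	case SKETCH_NODE_CONTROL:
		return "kdbus_handle_control_ops";
	case SKETCH_NODE_ENDPOINT:
		return "kdbus_handle_ep_ops";
	}

	return NULL;
}

/* Decode a 32-bit magic into its four bytes, most significant first. */
static void sketch_magic_ascii(uint32_t magic, char out[5])
{
	assert(out != NULL);

	out[0] = (char)((magic >> 24) & 0xff);
	out[1] = (char)((magic >> 16) & 0xff);
	out[2] = (char)((magic >> 8) & 0xff);
	out[3] = (char)(magic & 0xff);
	out[4] = '\0';
}
```

A tool that inspects mounts could compare the f_type field returned by
statfs(2) against KDBUS_SUPER_MAGIC to recognize a kdbusfs mount; that use
is an assumption about how the magic would be consumed, not something this
patch adds.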

diff --git a/include/uapi/linux/magic.h b/include/uapi/linux/magic.h
index 77c60311a6c6..eecf6b9cd185 100644
--- a/include/uapi/linux/magic.h
+++ b/include/uapi/linux/magic.h
@@ -72,5 +72,6 @@
 #define MTD_INODE_FS_MAGIC      0x11307854
 #define ANON_INODE_FS_MAGIC	0x09041934
 #define BTRFS_TEST_MAGIC	0x73727279
+#define KDBUS_SUPER_MAGIC	0x44427573
 
 #endif /* __LINUX_MAGIC_H__ */
diff --git a/ipc/kdbus/fs.c b/ipc/kdbus/fs.c
new file mode 100644
index 000000000000..0e8727f2f9ce
--- /dev/null
+++ b/ipc/kdbus/fs.c
@@ -0,0 +1,417 @@
+/*
+ * Copyright (C) 2013-2014 Kay Sievers
+ * Copyright (C) 2013-2014 Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+ * Copyright (C) 2013-2014 Daniel Mack <daniel@zonque.org>
+ * Copyright (C) 2013-2014 David Herrmann <dh.herrmann@gmail.com>
+ * Copyright (C) 2013-2014 Linux Foundation
+ *
+ * kdbus is free software; you can redistribute it and/or modify it under
+ * the terms of the GNU Lesser General Public License as published by the
+ * Free Software Foundation; either version 2.1 of the License, or (at
+ * your option) any later version.
+ */
+
+#include <linux/backing-dev.h>
+#include <linux/fs.h>
+#include <linux/fsnotify.h>
+#include <linux/init.h>
+#include <linux/ipc_namespace.h>
+#include <linux/magic.h>
+#include <linux/module.h>
+#include <linux/mount.h>
+#include <linux/mutex.h>
+#include <linux/namei.h>
+#include <linux/pagemap.h>
+#include <linux/sched.h>
+#include <linux/slab.h>
+
+#include "bus.h"
+#include "domain.h"
+#include "endpoint.h"
+#include "fs.h"
+#include "handle.h"
+#include "node.h"
+
+#define kdbus_node_from_dentry(_dentry) \
+	((struct kdbus_node *)(_dentry)->d_fsdata)
+#define kdbus_node_from_inode(_inode) \
+	((struct kdbus_node *)(_inode)->i_private)
+
+static struct inode *fs_inode_get(struct super_block *sb,
+				  struct kdbus_node *node);
+
+/*
+ * Directory Management
+ */
+
+static inline unsigned char kdbus_dt_type(struct kdbus_node *node)
+{
+	switch (node->type) {
+	case KDBUS_NODE_DOMAIN:
+	case KDBUS_NODE_BUS:
+		return DT_DIR;
+	case KDBUS_NODE_CONTROL:
+	case KDBUS_NODE_ENDPOINT:
+		return DT_REG;
+	}
+
+	return DT_UNKNOWN;
+}
+
+static int fs_dir_fop_iterate(struct file *file, struct dir_context *ctx)
+{
+	struct dentry *dentry = file->f_path.dentry;
+	struct kdbus_node *parent = kdbus_node_from_dentry(dentry);
+	struct kdbus_node *old, *next = file->private_data;
+
+	/*
+	 * kdbusfs directory iterator (modelled after sysfs/kernfs)
+	 * When iterating kdbusfs directories, we iterate all children of the
+	 * parent kdbus_node object. We use ctx->pos to store the hash of the
+	 * child and file->private_data to store a reference to the next node
+	 * object. If ctx->pos is not modified via llseek while you iterate a
+	 * directory, then we use the file->private_data node pointer to
+	 * directly access the next node in the tree.
+	 * However, if you directly seek on the directory, we have to find the
+	 * closest node to that position and cannot use our node pointer. This
+	 * means iterating the rb-tree to find the closest match and start over
+	 * from there.
+	 * Note that hash values are not necessarily unique. Therefore, llseek
+	 * is not guaranteed to seek to the same node that you got when you
+	 * retrieved the position. Seeking to 0, 1, 2 and >=INT_MAX is safe,
+	 * though. We could use the inode-number as position, but this would
+	 * require another rb-tree for fast access. Kernfs and others already
+	 * ignore those conflicts, so we should be fine, too.
+	 */
+
+	if (!dir_emit_dots(file, ctx))
+		return 0;
+
+	/* acquire @next; if deactivated, or seek detected, find next node */
+	old = next;
+	if (next && ctx->pos == next->hash) {
+		if (kdbus_node_acquire(next))
+			kdbus_node_ref(next);
+		else
+			next = kdbus_node_next_child(parent, next);
+	} else {
+		next = kdbus_node_find_closest(parent, ctx->pos);
+	}
+	kdbus_node_unref(old);
+
+	while (next) {
+		/* emit @next */
+		file->private_data = next;
+		ctx->pos = next->hash;
+
+		kdbus_node_release(next);
+
+		if (!dir_emit(ctx, next->name, strlen(next->name), next->id,
+			      kdbus_dt_type(next)))
+			return 0;
+
+		/* find next node after @next */
+		old = next;
+		next = kdbus_node_next_child(parent, next);
+		kdbus_node_unref(old);
+	}
+
+	file->private_data = NULL;
+	ctx->pos = INT_MAX;
+
+	return 0;
+}
+
+static loff_t fs_dir_fop_llseek(struct file *file, loff_t offset, int whence)
+{
+	struct inode *inode = file_inode(file);
+	loff_t ret;
+
+	/* protect f_pos against fop_iterate */
+	mutex_lock(&inode->i_mutex);
+	ret = generic_file_llseek(file, offset, whence);
+	mutex_unlock(&inode->i_mutex);
+
+	return ret;
+}
+
+static int fs_dir_fop_release(struct inode *inode, struct file *file)
+{
+	kdbus_node_unref(file->private_data);
+	return 0;
+}
+
+static const struct file_operations fs_dir_fops = {
+	.read		= generic_read_dir,
+	.iterate	= fs_dir_fop_iterate,
+	.llseek		= fs_dir_fop_llseek,
+	.release	= fs_dir_fop_release,
+};
+
+static struct dentry *fs_dir_iop_lookup(struct inode *dir,
+					struct dentry *dentry,
+					unsigned int flags)
+{
+	struct dentry *dnew = NULL;
+	struct kdbus_node *parent;
+	struct kdbus_node *node;
+	struct inode *inode;
+
+	parent = kdbus_node_from_dentry(dentry->d_parent);
+	if (!kdbus_node_acquire(parent))
+		return NULL;
+
+	/* returns reference to _acquired_ child node */
+	node = kdbus_node_find_child(parent, dentry->d_name.name);
+	if (node) {
+		dentry->d_fsdata = node;
+		inode = fs_inode_get(dir->i_sb, node);
+		if (IS_ERR(inode))
+			dnew = ERR_CAST(inode);
+		else
+			dnew = d_materialise_unique(dentry, inode);
+
+		kdbus_node_release(node);
+	}
+
+	kdbus_node_release(parent);
+	return dnew;
+}
+
+static const struct inode_operations fs_dir_iops = {
+	.permission	= generic_permission,
+	.lookup		= fs_dir_iop_lookup,
+};
+
+/*
+ * Inode Management
+ */
+
+static const struct inode_operations fs_inode_iops = {
+	.permission	= generic_permission,
+};
+
+static struct inode *fs_inode_get(struct super_block *sb,
+				  struct kdbus_node *node)
+{
+	struct inode *inode;
+
+	inode = iget_locked(sb, node->id);
+	if (!inode)
+		return ERR_PTR(-ENOMEM);
+	if (!(inode->i_state & I_NEW))
+		return inode;
+
+	inode->i_private = kdbus_node_ref(node);
+	inode->i_mapping->a_ops = &empty_aops;
+	inode->i_mapping->backing_dev_info = &noop_backing_dev_info;
+	inode->i_mode = node->mode & S_IALLUGO;
+	inode->i_atime = inode->i_ctime = inode->i_mtime = CURRENT_TIME;
+	inode->i_uid = node->uid;
+	inode->i_gid = node->gid;
+
+	switch (node->type) {
+	case KDBUS_NODE_DOMAIN:
+	case KDBUS_NODE_BUS:
+		inode->i_mode |= S_IFDIR;
+		inode->i_op = &fs_dir_iops;
+		inode->i_fop = &fs_dir_fops;
+		set_nlink(inode, 2);
+		break;
+	case KDBUS_NODE_CONTROL:
+		inode->i_mode |= S_IFREG;
+		inode->i_op = &fs_inode_iops;
+		inode->i_fop = &kdbus_handle_control_ops;
+		break;
+	case KDBUS_NODE_ENDPOINT:
+		inode->i_mode |= S_IFREG;
+		inode->i_op = &fs_inode_iops;
+		inode->i_fop = &kdbus_handle_ep_ops;
+		break;
+	}
+
+	unlock_new_inode(inode);
+
+	return inode;
+}
+
+/*
+ * Superblock Management
+ */
+
+static int fs_super_dop_revalidate(struct dentry *dentry, unsigned int flags)
+{
+	struct kdbus_node *node;
+
+	/* Force lookup on negatives */
+	if (!dentry->d_inode)
+		return 0;
+
+	node = kdbus_node_from_dentry(dentry);
+
+	/* see whether the node has been removed */
+	if (!kdbus_node_is_active(node))
+		return 0;
+
+	return 1;
+}
+
+static void fs_super_dop_release(struct dentry *dentry)
+{
+	kdbus_node_unref(dentry->d_fsdata);
+}
+
+static const struct dentry_operations fs_super_dops = {
+	.d_revalidate	= fs_super_dop_revalidate,
+	.d_release	= fs_super_dop_release,
+};
+
+static void fs_super_sop_evict_inode(struct inode *inode)
+{
+	struct kdbus_node *node = kdbus_node_from_inode(inode);
+
+	truncate_inode_pages_final(&inode->i_data);
+	clear_inode(inode);
+	kdbus_node_unref(node);
+}
+
+static const struct super_operations fs_super_sops = {
+	.statfs		= simple_statfs,
+	.drop_inode	= generic_delete_inode,
+	.evict_inode	= fs_super_sop_evict_inode,
+};
+
+static int fs_super_fill(struct super_block *sb)
+{
+	struct kdbus_domain *domain = sb->s_fs_info;
+	struct inode *inode;
+
+	sb->s_blocksize = PAGE_CACHE_SIZE;
+	sb->s_blocksize_bits = PAGE_CACHE_SHIFT;
+	sb->s_magic = KDBUS_SUPER_MAGIC;
+	sb->s_maxbytes = MAX_LFS_FILESIZE;
+	sb->s_op = &fs_super_sops;
+	sb->s_time_gran = 1;
+
+	inode = fs_inode_get(sb, &domain->node);
+	if (IS_ERR(inode))
+		return PTR_ERR(inode);
+
+	sb->s_root = d_make_root(inode);
+	if (!sb->s_root) {
+		/* d_make_root iput()s the inode on failure */
+		return -ENOMEM;
+	}
+
+	/* sb holds domain reference */
+	sb->s_root->d_fsdata = &domain->node;
+	sb->s_d_op = &fs_super_dops;
+	sb->s_flags |= MS_ACTIVE;
+
+	return 0;
+}
+
+static void fs_super_kill(struct super_block *sb)
+{
+	struct kdbus_domain *domain = sb->s_fs_info;
+
+	kill_anon_super(sb);
+
+	if (domain) {
+		kdbus_domain_deactivate(domain);
+		kdbus_domain_unref(domain);
+	}
+}
+
+static int fs_super_set(struct super_block *sb, void *data)
+{
+	int ret;
+
+	ret = set_anon_super(sb, data);
+	if (!ret)
+		sb->s_fs_info = data;
+
+	return ret;
+}
+
+static struct dentry *fs_super_mount(struct file_system_type *fs_type,
+				     int flags, const char *dev_name,
+				     void *data)
+{
+	struct kdbus_domain *domain;
+	struct super_block *sb;
+	int ret;
+
+	domain = kdbus_domain_new(KDBUS_MAKE_ACCESS_WORLD);
+	if (IS_ERR(domain))
+		return ERR_CAST(domain);
+
+	ret = kdbus_domain_activate(domain);
+	if (ret < 0)
+		goto exit_domain;
+
+	sb = sget(fs_type, NULL, fs_super_set, flags, domain);
+	if (IS_ERR(sb)) {
+		ret = PTR_ERR(sb);
+		goto exit_domain;
+	}
+
+	WARN_ON(sb->s_fs_info != domain);
+	WARN_ON(sb->s_root);
+
+	ret = fs_super_fill(sb);
+	if (ret < 0) {
+		/* calls into ->kill_sb() when done */
+		deactivate_locked_super(sb);
+		return ERR_PTR(ret);
+	}
+
+	return dget(sb->s_root);
+
+exit_domain:
+	kdbus_domain_deactivate(domain);
+	kdbus_domain_unref(domain);
+	return ERR_PTR(ret);
+}
+
+static struct file_system_type fs_type = {
+	.name		= KBUILD_MODNAME "fs",
+	.owner		= THIS_MODULE,
+	.mount		= fs_super_mount,
+	.kill_sb	= fs_super_kill,
+};
+
+/**
+ * kdbus_fs_init() - register kdbus filesystem
+ *
+ * This registers a filesystem with the VFS layer. The filesystem is called
+ * `KBUILD_MODNAME "fs"', which usually resolves to `kdbusfs'. The naming
+ * scheme allows setting KBUILD_MODNAME to "kdbus2" to get an independent
+ * filesystem for developers.
+ *
+ * Each mount of the kdbusfs filesystem has a kdbus_domain attached.
+ * Operations on this mount will only affect the attached domain. On each mount
+ * a new domain is automatically created and used for this mount exclusively.
+ * If you want to share a domain across multiple mounts, you need to bind-mount
+ * it.
+ *
+ * Mounts of kdbusfs (with a different domain each) are unrelated to each other
+ * and will never have any effect on any domain but their own.
+ *
+ * Return: 0 on success, negative error otherwise.
+ */
+int kdbus_fs_init(void)
+{
+	return register_filesystem(&fs_type);
+}
+
+/**
+ * kdbus_fs_exit() - unregister kdbus filesystem
+ *
+ * This does the reverse to kdbus_fs_init(). It unregisters the kdbusfs
+ * filesystem from VFS and cleans up any allocated resources.
+ */
+void kdbus_fs_exit(void)
+{
+	unregister_filesystem(&fs_type);
+}
diff --git a/ipc/kdbus/fs.h b/ipc/kdbus/fs.h
new file mode 100644
index 000000000000..29bc9ea07329
--- /dev/null
+++ b/ipc/kdbus/fs.h
@@ -0,0 +1,22 @@
+/*
+ * Copyright (C) 2013-2014 Kay Sievers
+ * Copyright (C) 2013-2014 Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+ * Copyright (C) 2013-2014 Daniel Mack <daniel@zonque.org>
+ * Copyright (C) 2013-2014 David Herrmann <dh.herrmann@gmail.com>
+ * Copyright (C) 2013-2014 Linux Foundation
+ *
+ * kdbus is free software; you can redistribute it and/or modify it under
+ * the terms of the GNU Lesser General Public License as published by the
+ * Free Software Foundation; either version 2.1 of the License, or (at
+ * your option) any later version.
+ */
+
+#ifndef __KDBUSFS_H
+#define __KDBUSFS_H
+
+#include <linux/kernel.h>
+
+int kdbus_fs_init(void);
+void kdbus_fs_exit(void);
+
+#endif
diff --git a/ipc/kdbus/node.c b/ipc/kdbus/node.c
new file mode 100644
index 000000000000..1b586361cf4f
--- /dev/null
+++ b/ipc/kdbus/node.c
@@ -0,0 +1,872 @@
+/*
+ * Copyright (C) 2013-2014 Kay Sievers
+ * Copyright (C) 2013-2014 Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+ * Copyright (C) 2013-2014 Daniel Mack <daniel@zonque.org>
+ * Copyright (C) 2013-2014 David Herrmann <dh.herrmann@gmail.com>
+ * Copyright (C) 2013-2014 Linux Foundation
+ *
+ * kdbus is free software; you can redistribute it and/or modify it under
+ * the terms of the GNU Lesser General Public License as published by the
+ * Free Software Foundation; either version 2.1 of the License, or (at
+ * your option) any later version.
+ */
+
+#include <linux/atomic.h>
+#include <linux/fs.h>
+#include <linux/idr.h>
+#include <linux/kdev_t.h>
+#include <linux/rbtree.h>
+#include <linux/rwsem.h>
+#include <linux/sched.h>
+#include <linux/slab.h>
+#include <linux/wait.h>
+
+#include "bus.h"
+#include "domain.h"
+#include "endpoint.h"
+#include "handle.h"
+#include "node.h"
+#include "util.h"
+
+/**
+ * DOC: kdbus nodes
+ *
+ * Nodes unify lifetime management across exposed kdbus objects and provide a
+ * hierarchy. Each kdbus object that might be exposed to user-space has a
+ * kdbus_node object embedded and is linked into the hierarchy. Each node can
+ * have any number (0-n) of child nodes linked. Each child retains a reference
+ * to its parent node. For root-nodes, the parent is NULL.
+ *
+ * Each node object goes through a bunch of states during its lifetime:
+ *     * NEW
+ *       * LINKED    (can be skipped by NEW->FREED transition)
+ *         * ACTIVE  (can be skipped by LINKED->INACTIVE transition)
+ *       * INACTIVE
+ *       * DRAINED
+ *     * FREED
+ *
+ * Each node is allocated by the caller and initialized via kdbus_node_init().
+ * This never fails and sets the object into state NEW. From now on, ref-counts
+ * on the node manage its lifetime. During init, the ref-count is set to 1. Once
+ * it drops to 0, the node goes to state FREED and the node->free_cb() callback
+ * is called to deallocate any memory.
+ *
+ * After initializing a node, you usually link it into the hierarchy. You need
+ * to provide a parent node and a name. The node will be linked as child to the
+ * parent and a globally unique ID is assigned to the child. The name of the
+ * child must be unique for all children of this parent. Otherwise, linking the
+ * child will fail with -EEXIST.
+ * Note that the child is not marked active, yet. Admittedly, it prevents any
+ * other node from being linked with the same name (thus, it reserves that
+ * name), but any child-lookup (via name or unique ID) will never return this
+ * child unless it has been marked active.
+ *
+ * Once successfully linked, you can use kdbus_node_activate() to activate a
+ * child. This will mark the child active. This state can be skipped by directly
+ * deactivating the child via kdbus_node_deactivate() (see below).
+ * By activating a child, you enable any lookups on this child to succeed from
+ * now on. Furthermore, any code that got its hands on a reference to the node,
+ * can from now on "acquire" the node.
+ *
+ *     Active References (or: 'acquiring' and 'releasing' a node)
+ *     In addition to normal object references, nodes support something we call
+ *     "active references". An active reference can be acquired via
+ *     kdbus_node_acquire() and released via kdbus_node_release(). A caller
+ *     _must_ own a normal object reference whenever calling those functions.
+ *     Unlike object references, acquiring an active reference can fail (by
+ *     returning 'false' from kdbus_node_acquire()). An active reference can
+ *     only be acquired if the node is marked active. If it is not marked
+ *     active, yet, or if it was already deactivated, no more active references
+ *     can be acquired, ever!
+ *     Active references are used to track tasks working on a node. Whenever a
+ *     task enters kernel-space to perform an action on a node, it acquires an
+ *     active reference, performs the action and releases the reference again.
+ *     While holding an active reference, the node is guaranteed to stay active.
+ *     If the node is deactivated in parallel, the node is marked as
+ *     deactivated, then we wait for all active references to be dropped, before
+ *     we finally proceed with any cleanups. That is, if you hold an active
+ *     reference to a node, any resources that are bound to the "active" state
+ *     are guaranteed to stay accessible until you release your reference.
+ *
+ *     Active-references are very similar to rw-locks, where acquiring a node is
+ *     equal to try-read-lock and releasing to read-unlock. Deactivating a node
+ *     means write-lock and never releasing it again.
+ *     Unlike rw-locks, the 'active reference' concept is more versatile and
+ *     avoids unusual rw-lock usage (never releasing a write-lock..).
+ *
+ *     It is safe to acquire multiple active-references recursively. But you
+ *     need to check the return value of kdbus_node_acquire() on _each_ call. It
+ *     may stop granting references at _any_ time.
+ *
+ *     You're free to perform any operations you want while holding an active
+ *     reference, except sleeping for an indefinite period. Sleeping for a fixed
+ *     amount of time is fine, but you usually should not wait on wait-queues
+ *     without a timeout.
+ *     For example, if you wait for I/O to happen, you should gather all data
+ *     and schedule the I/O operation, then release your active reference and
+ *     wait for it to complete. Then try to acquire a new reference. If it
+ *     fails, perform any cleanup (the node is now dead). Otherwise, you can
+ *     finish your operation.
+ *
+ * All nodes can be deactivated via kdbus_node_deactivate() at any time. You can
+ * call this multiple times, even in parallel or on nodes that were never
+ * linked, and it will just work. The only restriction is, you must not hold an
+ * active reference when calling kdbus_node_deactivate().
+ * By deactivating a node, it is immediately marked inactive. Then, we wait for
+ * all active references to be released (called 'draining' the node). This
+ * shouldn't take very long as we don't perform long-lasting operations while
+ * holding an active reference. Note that once the node is marked inactive, no
+ * new active references can be acquired.
+ * Once all active references are dropped, the node is considered 'drained'. Now
+ * kdbus_node_deactivate() is called on each child of the node before we
+ * continue deactivating our node. That is, once all children are entirely
+ * deactivated, we call ->release_cb() of our node. ->release_cb() can release
+ * any resources on that node which are bound to the "active" state of a node.
+ * When done, we unlink the node from its parent rb-tree, mark it as
+ * 'released' and return.
+ * If kdbus_node_deactivate() is called multiple times (even in parallel), all
+ * but one caller will just wait until the node is fully deactivated. That is,
+ * one random caller of kdbus_node_deactivate() is selected to call
+ * ->release_cb() and clean up the node. Only once all this is done, all other
+ * callers will return from kdbus_node_deactivate(). That is, it doesn't matter
+ * whether you're the selected caller or not, it will only return after
+ * everything is fully done.
+ *
+ * When a node is activated, we acquire a normal object reference to the node.
+ * This reference is dropped after deactivation is fully done (and only if the
+ * node really was activated). This allows callers to link+activate a child node
+ * and then drop all refs. The node will be deactivated together with the
+ * parent, and then be freed when this reference is dropped.
+ *
+ * Currently, nodes provide a bunch of resources that external code can use
+ * directly. This includes:
+ *
+ *     * node->waitq: Each node has its own wait-queue that is used to manage
+ *                    the 'active' state. When a node is deactivated, we wait on
+ *                    this queue until all active refs are dropped. Analogously,
+ *                    when you release an active reference on a deactivated
+ *                    node, and the active ref-count drops to 0, we wake up a
+ *                    single thread on this queue. Furthermore, once the
+ *                    ->release_cb() callback finished, we wake up all waiters.
+ *                    The node-owner is free to re-use this wait-queue for other
+ *                    purposes. As node-management uses this queue only during
+ *                    deactivation, it is usually totally fine to re-use the
+ *                    queue for other, preferably low-overhead, use-cases.
+ *
+ *     * node->type: This field defines the type of the owner of this node. It
+ *                   must be set during node initialization and must remain
+ *                   constant. The node management never looks at this value,
+ *                   but external users might use it to gain access to the owner
+ *                   object of a node.
+ *                   It is totally up to the owner of the node to define what
+ *                   their type means. Usually it means you can access the
+ *                   parent structure via container_of(), as long as you hold an
+ *                   active reference to the node.
+ *
+ *     * node->free_cb:    callback after all references are dropped
+ *       node->release_cb: callback during node deactivation
+ *                         These fields must be set by the node owner during
+ *                         node initialization. They must remain constant. If
+ *                         NULL, they're skipped.
+ *
+ *     * node->mode: filesystem access modes
+ *       node->uid:  filesystem owner uid
+ *       node->gid:  filesystem owner gid
+ *                   These fields must be set by the node owner during node
+ *                   initialization. They must remain constant and may be
+ *                   accessed by other callers to properly initialize
+ *                   filesystem nodes.
+ *
+ *     * node->id: This is an unsigned 32bit integer allocated by an IDR. It is
+ *                 always kept as small as possible during allocation and is
+ *                 globally unique across all nodes allocated by this module. 0
+ *                 is reserved as "not assigned" and is the default.
+ *                 The ID is assigned during kdbus_node_link() and is kept until
+ *                 the object is freed. Thus, the ID surpasses the active
+ *                 lifetime of a node. As long as you hold an object reference
+ *                 to a node (and the node was linked once), the ID is valid and
+ *                 unique.
+ *
+ *     * node->name: name of this node
+ *       node->hash: 31bit hash-value of @name (range [2..INT_MAX-1])
+ *                   These values follow the same lifetime rules as node->id.
+ *                   They're initialized when the node is linked and then remain
+ *                   constant until the last object reference is dropped.
+ *                   Unlike the id, the name is only unique across all siblings
+ *                   and only until the node is deactivated. Currently, the name
+ *                   is even unique if linked but not activated, yet. This might
+ *                   change in the future, though. Code should not rely on this.
+ *
+ *     * node->lock:     lock to protect node->children, node->rb, node->parent
+ *     * node->parent: Reference to parent node. This is set during LINK time
+ *                     and is dropped during deactivation. You must not access
+ *                     it unless you hold an active reference to the node.
+ *     * node->children: rb-tree of all linked children of this node. You must
+ *                       not access this directly, but use one of the iterator
+ *                       or lookup helpers.
+ */
+
+/*
+ * Bias values track states of "active references". They're all negative. If a
+ * node is active, its active-ref-counter is >=0 and tracks all active
+ * references. Once a node is deactivated, we add NODE_BIAS. This means, the
+ * counter is now negative but still counts the active references. Once it drops
+ * to exactly NODE_BIAS, we know all active references were dropped. Exactly one
+ * thread will change it to NODE_RELEASE now, perform cleanup and then put it
+ * into NODE_DRAINED. Once drained, all other threads that tried deactivating
+ * the node will now be woken up (thus, they wait until the node is fully done).
+ * The initial state during node-setup is NODE_NEW. If a node is directly
+ * deactivated without having ever been active, it is put into
+ * NODE_RELEASE_DIRECT instead of NODE_BIAS. This tracks this one-bit state
+ * across node-deactivation. The task putting it into NODE_RELEASE now knows
+ * whether the node was active before or not.
+ */
+#define KDBUS_NODE_BIAS			(INT_MIN + 4)
+#define KDBUS_NODE_RELEASE_DIRECT	(KDBUS_NODE_BIAS - 1)
+#define KDBUS_NODE_RELEASE		(KDBUS_NODE_BIAS - 2)
+#define KDBUS_NODE_DRAINED		(KDBUS_NODE_BIAS - 3)
+#define KDBUS_NODE_NEW			(KDBUS_NODE_BIAS - 4)
+
+/* global unique ID mapping for kdbus nodes */
+static DEFINE_IDR(kdbus_node_idr);
+static DECLARE_RWSEM(kdbus_node_idr_lock);
+
+/**
+ * kdbus_node_name_hash() - hash a name
+ * @name:	The string to hash
+ *
+ * This computes the hash of @name. It is guaranteed to be in the range
+ * [2..INT_MAX-1]. The values 0, 1 and INT_MAX are unused as they are reserved
+ * for the filesystem code.
+ *
+ * Return: hash value of the passed string
+ */
+static unsigned int kdbus_node_name_hash(const char *name)
+{
+	unsigned int hash;
+
+	/* reserve hash numbers 0, 1 and >=INT_MAX for magic directories */
+	hash = kdbus_str_hash(name) & INT_MAX;
+	if (hash < 2)
+		hash += 2;
+	if (hash >= INT_MAX)
+		hash = INT_MAX - 1;
+
+	return hash;
+}
+
+/**
+ * kdbus_node_name_compare() - compare a name with a node's name
+ * @hash:	hash of the string to compare the node with
+ * @name:	name to compare the node with
+ * @node:	node to compare the name with
+ *
+ * Return: 0 if @name and @hash exactly match the information in @node, or
+ * an integer less than or greater than zero if @name is found, respectively,
+ * to be less than or be greater than the string stored in @node.
+ */
+static int kdbus_node_name_compare(unsigned int hash, const char *name,
+				   const struct kdbus_node *node)
+{
+	if (hash != node->hash)
+		return hash - node->hash;
+
+	return strcmp(name, node->name);
+}
+
+/**
+ * kdbus_node_init() - initialize a kdbus_node
+ * @node:	Pointer to the node to initialize
+ * @type:	The type the node will have (KDBUS_NODE_*)
+ *
+ * The caller is responsible for allocating @node and initializing it to zero.
+ * Once this call returns, you must use kdbus_node_ref() and kdbus_node_unref()
+ * to manage this node.
+ */
+void kdbus_node_init(struct kdbus_node *node, unsigned int type)
+{
+	atomic_set(&node->refcnt, 1);
+	mutex_init(&node->lock);
+	node->id = 0;
+	node->type = type;
+	RB_CLEAR_NODE(&node->rb);
+	node->children = RB_ROOT;
+	init_waitqueue_head(&node->waitq);
+	atomic_set(&node->active, KDBUS_NODE_NEW);
+}
+
+/**
+ * kdbus_node_link() - link a node into the nodes system
+ * @node:	Pointer to the node to link
+ * @parent:	Pointer to a parent node, may be %NULL
+ * @name:	The name of the node (or NULL if root node)
+ *
+ * This links a node into the hierarchy. This must not be called multiple times.
+ * If @parent is NULL, the node becomes a new root node.
+ *
+ * This call will fail if @name is not unique across all its siblings or if no
+ * ID could be allocated. You must not activate a node if linking failed! It is
+ * safe to deactivate it, though.
+ *
+ * Once you linked a node, you must call kdbus_node_deactivate() before you drop
+ * the last reference (even if you never activate the node).
+ *
+ * Return: 0 on success, negative error code otherwise.
+ */
+int kdbus_node_link(struct kdbus_node *node, struct kdbus_node *parent,
+		    const char *name)
+{
+	int ret;
+
+	BUG_ON(node->type != KDBUS_NODE_DOMAIN && !parent);
+	BUG_ON(parent && !name);
+
+	if (name) {
+		node->name = kstrdup(name, GFP_KERNEL);
+		if (!node->name)
+			return -ENOMEM;
+
+		node->hash = kdbus_node_name_hash(name);
+	}
+
+	down_write(&kdbus_node_idr_lock);
+	ret = idr_alloc(&kdbus_node_idr, node, 1, 0, GFP_KERNEL);
+	if (ret >= 0)
+		node->id = ret;
+	up_write(&kdbus_node_idr_lock);
+
+	if (ret < 0)
+		return ret;
+
+	ret = 0;
+
+	if (parent) {
+		struct rb_node **n, *prev;
+
+		if (!kdbus_node_acquire(parent))
+			return -ESHUTDOWN;
+
+		mutex_lock(&parent->lock);
+
+		n = &parent->children.rb_node;
+		prev = NULL;
+
+		while (*n) {
+			struct kdbus_node *pos;
+			int result;
+
+			pos = kdbus_node_from_rb(*n);
+			prev = *n;
+			result = kdbus_node_name_compare(node->hash,
+							 node->name,
+							 pos);
+			if (result == 0) {
+				ret = -EEXIST;
+				goto exit_unlock;
+			}
+
+			if (result < 0)
+				n = &pos->rb.rb_left;
+			else
+				n = &pos->rb.rb_right;
+		}
+
+		/* add new node and rebalance the tree */
+		rb_link_node(&node->rb, prev, n);
+		rb_insert_color(&node->rb, &parent->children);
+		node->parent = kdbus_node_ref(parent);
+
+exit_unlock:
+		mutex_unlock(&parent->lock);
+		kdbus_node_release(parent);
+	}
+
+	return ret;
+}
+
+/**
+ * kdbus_node_ref() - Acquire object reference
+ * @node:	node to acquire reference to (or NULL)
+ *
+ * This acquires a new reference to @node. You must already own a reference when
+ * calling this!
+ * If @node is NULL, this is a no-op.
+ *
+ * Return: @node is returned
+ */
+struct kdbus_node *kdbus_node_ref(struct kdbus_node *node)
+{
+	if (node)
+		atomic_inc(&node->refcnt);
+	return node;
+}
+
+/**
+ * kdbus_node_unref() - Drop object reference
+ * @node:	node to drop reference to (or NULL)
+ *
+ * This drops an object reference to @node. You must not access the node if you
+ * no longer own a reference.
+ * If the ref-count drops to 0, the object will be destroyed (->free_cb will be
+ * called).
+ *
+ * If you linked or activated the node, you must deactivate the node before you
+ * drop your last reference! If you didn't link or activate the node, you can
+ * drop any reference you want.
+ *
+ * Note that this calls into ->free_cb() and thus _might_ sleep. The ->free_cb()
+ * callbacks must not acquire any outer locks, though. So you can safely drop
+ * references while holding locks.
+ *
+ * If @node is NULL, this is a no-op.
+ *
+ * Return: This always returns NULL
+ */
+struct kdbus_node *kdbus_node_unref(struct kdbus_node *node)
+{
+	if (node && atomic_dec_and_test(&node->refcnt)) {
+		struct kdbus_node safe = *node;
+
+		WARN_ON(atomic_read(&node->active) != KDBUS_NODE_DRAINED);
+		WARN_ON(!RB_EMPTY_NODE(&node->rb));
+
+		if (node->free_cb)
+			node->free_cb(node);
+
+		down_write(&kdbus_node_idr_lock);
+		if (safe.id > 0)
+			idr_remove(&kdbus_node_idr, safe.id);
+		/* drop caches after last node to not leak memory on unload */
+		if (idr_is_empty(&kdbus_node_idr)) {
+			idr_destroy(&kdbus_node_idr);
+			idr_init(&kdbus_node_idr);
+		}
+		up_write(&kdbus_node_idr_lock);
+
+		kfree(safe.name);
+		kdbus_node_unref(safe.parent);
+	}
+
+	return NULL;
+}
+
+/**
+ * kdbus_node_is_active() - test whether a node is active
+ * @node:	node to test
+ *
+ * This checks whether @node is active. That means, @node was linked and
+ * activated by the node owner and hasn't been deactivated, yet. If, and only
+ * if, a node is active, kdbus_node_acquire() will be able to acquire active
+ * references.
+ *
+ * Note that this function does not give any lifetime guarantees. After this
+ * call returns, the node might be deactivated immediately. Normally, what you
+ * want is to acquire a real active reference via kdbus_node_acquire().
+ *
+ * Return: true if @node is active, false otherwise
+ */
+bool kdbus_node_is_active(struct kdbus_node *node)
+{
+	return atomic_read(&node->active) >= 0;
+}
+
+/**
+ * kdbus_node_activate() - activate a node
+ * @node:	node to activate
+ *
+ * This marks @node as active if, and only if, the node wasn't activated nor
+ * deactivated, yet, and the parent is still active. Any but the first call to
+ * kdbus_node_activate() is a no-op.
+ * If you called kdbus_node_deactivate() before, then even the first call to
+ * kdbus_node_activate() will be a no-op.
+ *
+ * This call doesn't give any lifetime guarantees. The node might get
+ * deactivated immediately after this call returns. Or the parent might already
+ * be deactivated, which will make this call a no-op.
+ *
+ * If this call successfully activated a node, it will take an object reference
+ * to it. This reference is dropped after the node is deactivated. Therefore,
+ * the object owner can safely drop their reference to @node iff they know that
+ * its parent node will get deactivated at some point. Once the parent node is
+ * deactivated, it will deactivate all its children and thus drop this reference
+ * again.
+ *
+ * Return: True if this call successfully activated the node, otherwise false.
+ *         Note that this might return false, even if the node is still active
+ *         (e.g., if you called this a second time).
+ */
+bool kdbus_node_activate(struct kdbus_node *node)
+{
+	bool res = false;
+
+	mutex_lock(&node->lock);
+	if (atomic_read(&node->active) == KDBUS_NODE_NEW) {
+		atomic_sub(KDBUS_NODE_NEW, &node->active);
+		/* activated nodes have ref +1 */
+		kdbus_node_ref(node);
+		res = true;
+	}
+	mutex_unlock(&node->lock);
+
+	return res;
+}
+
+/**
+ * kdbus_node_deactivate() - deactivate a node
+ * @node:	The node to deactivate.
+ *
+ * This function recursively deactivates this node and all its children. It
+ * returns only once all children and the node itself were recursively disabled
+ * (even if you call this function multiple times in parallel).
+ *
+ * It is safe to call this function on _any_ node that was initialized _any_
+ * number of times.
+ *
+ * This call may sleep, as it waits for all active references to be dropped.
+ */
+void kdbus_node_deactivate(struct kdbus_node *node)
+{
+	struct kdbus_node *pos, *child;
+	struct rb_node *rb;
+	int v_pre, v_post;
+
+	pos = node;
+
+	/*
+	 * To avoid recursion, we perform back-tracking while deactivating
+	 * nodes. For each node we enter, we first mark the active-counter as
+	 * deactivated by adding BIAS. If the node has children, we set the first
+	 * child as current position and start over. If the node has no
+	 * children, we drain the node by waiting for all active refs to be
+	 * dropped and then releasing the node.
+	 *
+	 * After the node is released, we set its parent as current position
+	 * and start over. If the current position was the initial node, we're
+	 * done.
+	 *
+	 * Note that this function can be called in parallel by multiple
+	 * callers. We make sure that each node is only released once, and any
+	 * racing caller will wait until the other thread fully released that
+	 * node.
+	 */
+
+	for (;;) {
+		/*
+		 * Add BIAS to node->active to mark it as inactive. If it was
+		 * never active before, immediately mark it as RELEASE_DIRECT
+		 * so we remember this state.
+		 * We cannot remember v_pre as we might iterate into the
+		 * children, overwriting v_pre, before we can release our node.
+		 */
+		mutex_lock(&pos->lock);
+		v_pre = atomic_read(&pos->active);
+		if (v_pre >= 0)
+			atomic_add_return(KDBUS_NODE_BIAS, &pos->active);
+		else if (v_pre == KDBUS_NODE_NEW)
+			atomic_set(&pos->active, KDBUS_NODE_RELEASE_DIRECT);
+		mutex_unlock(&pos->lock);
+
+		/* wait until all active references were dropped */
+		wait_event(pos->waitq,
+			   atomic_read(&pos->active) <= KDBUS_NODE_BIAS);
+
+		mutex_lock(&pos->lock);
+		/* recurse into first child if any */
+		rb = rb_first(&pos->children);
+		if (rb) {
+			child = kdbus_node_ref(kdbus_node_from_rb(rb));
+			mutex_unlock(&pos->lock);
+			pos = child;
+			continue;
+		}
+
+		/* mark object as RELEASE */
+		v_post = atomic_read(&pos->active);
+		if (v_post == KDBUS_NODE_BIAS ||
+		    v_post == KDBUS_NODE_RELEASE_DIRECT)
+			atomic_set(&pos->active, KDBUS_NODE_RELEASE);
+		mutex_unlock(&pos->lock);
+
+		/*
+		 * If this is the thread that marked the object as RELEASE, we
+		 * perform the actual release. Otherwise, we wait until the
+		 * release is done and the node is marked as DRAINED.
+		 */
+		if (v_post == KDBUS_NODE_BIAS ||
+		    v_post == KDBUS_NODE_RELEASE_DIRECT) {
+			if (pos->release_cb)
+				pos->release_cb(pos, v_post == KDBUS_NODE_BIAS);
+
+			if (pos->parent) {
+				mutex_lock(&pos->parent->lock);
+				if (!RB_EMPTY_NODE(&pos->rb)) {
+					rb_erase(&pos->rb,
+						 &pos->parent->children);
+					RB_CLEAR_NODE(&pos->rb);
+				}
+				mutex_unlock(&pos->parent->lock);
+			}
+
+			/* mark as DRAINED */
+			atomic_set(&pos->active, KDBUS_NODE_DRAINED);
+			wake_up_all(&pos->waitq);
+
+			/*
+			 * If the node was activated and someone added BIAS to
+			 * it to deactivate it, we, and only we, are
+			 * responsible to release the extra ref-count that was
+			 * taken once in kdbus_node_activate().
+			 * If the node was never activated, no-one ever
+			 * added BIAS, but instead skipped that state and
+			 * immediately went to NODE_RELEASE_DIRECT. In that case
+			 * we must not drop the reference.
+			 */
+			if (v_post == KDBUS_NODE_BIAS)
+				kdbus_node_unref(pos);
+		} else {
+			/* wait until object is DRAINED */
+			wait_event(pos->waitq,
+			    atomic_read(&pos->active) == KDBUS_NODE_DRAINED);
+		}
+
+		/*
+		 * We're done with the current node. Continue on its parent
+		 * again, which will try deactivating its next child, or itself
+		 * if no child is left.
+		 * If we've reached our initial node again, we are done and
+		 * can safely return.
+		 */
+		if (pos == node)
+			break;
+
+		child = pos;
+		pos = pos->parent;
+		kdbus_node_unref(child);
+	}
+}
+
+/**
+ * kdbus_node_acquire() - Acquire an active ref on a node
+ * @node:	The node
+ *
+ * This acquires an active-reference to @node. This will only succeed if the
+ * node is active. You must release this active reference via
+ * kdbus_node_release() again.
+ *
+ * See the introduction to "active references" for more details.
+ *
+ * Return: %true if @node was non-NULL and active
+ */
+bool kdbus_node_acquire(struct kdbus_node *node)
+{
+	return node && atomic_inc_unless_negative(&node->active);
+}
+
+/**
+ * kdbus_node_release() - Release an active ref on a node
+ * @node:	The node
+ *
+ * This releases an active reference that was previously acquired via
+ * kdbus_node_acquire(). See kdbus_node_acquire() for details.
+ */
+void kdbus_node_release(struct kdbus_node *node)
+{
+	if (node && atomic_dec_return(&node->active) == KDBUS_NODE_BIAS)
+		wake_up(&node->waitq);
+}
+
+/**
+ * kdbus_node_find_child() - Find child by name
+ * @node:	parent node to search through
+ * @name:	name of child node
+ *
+ * This searches through all children of @node for a child-node with name @name.
+ * If not found, or if the child is deactivated, NULL is returned. Otherwise,
+ * the child is acquired and a new reference is returned.
+ *
+ * If you're done with the child, you need to release it and drop your
+ * reference.
+ *
+ * This function does not acquire the parent node. However, if the parent was
+ * already deactivated, then kdbus_node_deactivate() will, at some point, also
+ * deactivate the child. Therefore, we can rely on the explicit ordering during
+ * deactivation.
+ *
+ * Return: Reference to acquired child node, or NULL if not found / not active.
+ */
+struct kdbus_node *kdbus_node_find_child(struct kdbus_node *node,
+					 const char *name)
+{
+	struct kdbus_node *child;
+	struct rb_node *rb;
+	unsigned int hash;
+	int ret;
+
+	hash = kdbus_node_name_hash(name);
+
+	mutex_lock(&node->lock);
+	rb = node->children.rb_node;
+	while (rb) {
+		child = kdbus_node_from_rb(rb);
+		ret = kdbus_node_name_compare(hash, name, child);
+		if (ret < 0)
+			rb = rb->rb_left;
+		else if (ret > 0)
+			rb = rb->rb_right;
+		else
+			break;
+	}
+	if (rb && kdbus_node_acquire(child))
+		kdbus_node_ref(child);
+	else
+		child = NULL;
+	mutex_unlock(&node->lock);
+
+	return child;
+}
+
+static struct kdbus_node *node_find_closest_unlocked(struct kdbus_node *node,
+						     unsigned int hash,
+						     const char *name)
+{
+	struct kdbus_node *n, *pos = NULL;
+	struct rb_node *rb;
+	int res;
+
+	/*
+	 * Find the closest child with ``node->hash >= hash'', or, if @name is
+	 * valid, ``node->name >= name'' (where '>=' is the lex. order).
+	 */
+
+	rb = node->children.rb_node;
+	while (rb) {
+		n = kdbus_node_from_rb(rb);
+
+		if (name)
+			res = kdbus_node_name_compare(hash, name, n);
+		else
+			res = hash - n->hash;
+
+		if (res <= 0) {
+			rb = rb->rb_left;
+			pos = n;
+		} else { /* ``hash > n->hash'', ``name > n->name'' */
+			rb = rb->rb_right;
+		}
+	}
+
+	return pos;
+}
+
+/**
+ * kdbus_node_find_closest() - Find closest child-match
+ * @node:	parent node to search through
+ * @hash:	hash value to find closest match for
+ *
+ * Find the closest child of @node with a hash greater than or equal to @hash.
+ * The closest match is the left-most child of @node with this property. Which
+ * means, it is the first child with that hash returned by
+ * kdbus_node_next_child(), if you'd iterate the whole parent node.
+ *
+ * Return: Reference to acquired child, or NULL if none found.
+ */
+struct kdbus_node *kdbus_node_find_closest(struct kdbus_node *node,
+					   unsigned int hash)
+{
+	struct kdbus_node *child;
+	struct rb_node *rb;
+
+	mutex_lock(&node->lock);
+
+	child = node_find_closest_unlocked(node, hash, NULL);
+	while (child && !kdbus_node_acquire(child)) {
+		rb = rb_next(&child->rb);
+		if (rb)
+			child = kdbus_node_from_rb(rb);
+		else
+			child = NULL;
+	}
+	kdbus_node_ref(child);
+
+	mutex_unlock(&node->lock);
+
+	return child;
+}
+
+/**
+ * kdbus_node_next_child() - Acquire next child
+ * @node:	parent node
+ * @prev:	previous child-node position or NULL
+ *
+ * This function returns a reference to the next active child of @node, after
+ * the passed position @prev. If @prev is NULL, a reference to the first active
+ * child is returned. If no more active children are found, NULL is returned.
+ *
+ * This function acquires the next child it returns. If you're done with the
+ * returned pointer, you need to release _and_ unref it.
+ *
+ * The passed in pointer @prev is not modified by this function, and it does
+ * *not* have to be active. If @prev was acquired via different means, or if it
+ * was unlinked from its parent before you pass it in, then this iterator will
+ * still return the next active child (it will have to search through the
+ * rb-tree based on the node-name, though).
+ * However, @prev must not be linked to a different parent than @node!
+ *
+ * Return: Reference to next acquired child, or NULL if at the end.
+ */
+struct kdbus_node *kdbus_node_next_child(struct kdbus_node *node,
+					 struct kdbus_node *prev)
+{
+	struct kdbus_node *pos = NULL;
+	struct rb_node *rb;
+
+	mutex_lock(&node->lock);
+
+	if (!prev) {
+		/*
+		 * New iteration; find first node in rb-tree and try to acquire
+		 * it. If we got it, directly return it as first element.
+		 * Otherwise, the loop below will find the next active node.
+		 */
+		rb = rb_first(&node->children);
+		if (!rb)
+			goto exit;
+		pos = kdbus_node_from_rb(rb);
+		if (kdbus_node_acquire(pos))
+			goto exit;
+	} else if (RB_EMPTY_NODE(&prev->rb)) {
+		/*
+		 * The current iterator is no longer linked to the rb-tree. Use
+		 * its hash value and name to find the next _higher_ node and
+		 * acquire it. If we got it, return it as next element.
+		 * Otherwise, the loop below will find the next active node.
+		 */
+		pos = node_find_closest_unlocked(node, prev->hash, prev->name);
+		if (!pos)
+			goto exit;
+		if (kdbus_node_acquire(pos))
+			goto exit;
+	} else {
+		/*
+		 * The current iterator is still linked to the parent. Set it
+		 * as current position and use the loop below to find the next
+		 * active element.
+		 */
+		pos = prev;
+	}
+
+	/* @pos was already returned or is inactive; find next active node */
+	do {
+		rb = rb_next(&pos->rb);
+		if (rb)
+			pos = kdbus_node_from_rb(rb);
+		else
+			pos = NULL;
+	} while (pos && !kdbus_node_acquire(pos));
+
+exit:
+	/* @pos is NULL or acquired. Take ref if non-NULL and return it */
+	kdbus_node_ref(pos);
+	mutex_unlock(&node->lock);
+	return pos;
+}
diff --git a/ipc/kdbus/node.h b/ipc/kdbus/node.h
new file mode 100644
index 000000000000..d907dcc9a36c
--- /dev/null
+++ b/ipc/kdbus/node.h
@@ -0,0 +1,86 @@
+/*
+ * Copyright (C) 2013-2014 Kay Sievers
+ * Copyright (C) 2013-2014 Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+ * Copyright (C) 2013-2014 Daniel Mack <daniel@zonque.org>
+ * Copyright (C) 2013-2014 David Herrmann <dh.herrmann@gmail.com>
+ * Copyright (C) 2013-2014 Linux Foundation
+ *
+ * kdbus is free software; you can redistribute it and/or modify it under
+ * the terms of the GNU Lesser General Public License as published by the
+ * Free Software Foundation; either version 2.1 of the License, or (at
+ * your option) any later version.
+ */
+
+#ifndef __KDBUS_NODE_H
+#define __KDBUS_NODE_H
+
+#include <linux/atomic.h>
+#include <linux/kernel.h>
+#include <linux/mutex.h>
+#include <linux/wait.h>
+
+struct kdbus_bus;
+struct kdbus_domain;
+struct kdbus_ep;
+struct kdbus_node;
+
+enum kdbus_node_type {
+	KDBUS_NODE_DOMAIN,
+	KDBUS_NODE_CONTROL,
+	KDBUS_NODE_BUS,
+	KDBUS_NODE_ENDPOINT,
+};
+
+typedef void (*kdbus_node_free_t) (struct kdbus_node *node);
+typedef void (*kdbus_node_release_t) (struct kdbus_node *node, bool was_active);
+
+struct kdbus_node {
+	atomic_t refcnt;
+	atomic_t active;
+	wait_queue_head_t waitq;
+
+	/* static members */
+	unsigned int type;
+	kdbus_node_free_t free_cb;
+	kdbus_node_release_t release_cb;
+	umode_t mode;
+	kuid_t uid;
+	kgid_t gid;
+
+	/* valid once linked */
+	char *name;
+	unsigned int hash;
+	unsigned int id;
+
+	/* valid iff active */
+	struct mutex lock;
+	struct rb_node rb;
+	struct rb_root children;
+	struct kdbus_node *parent;
+};
+
+#define kdbus_node_from_rb(_node) rb_entry((_node), struct kdbus_node, rb)
+
+void kdbus_node_init(struct kdbus_node *node, unsigned int type);
+
+int kdbus_node_link(struct kdbus_node *node, struct kdbus_node *parent,
+		    const char *name);
+
+struct kdbus_node *kdbus_node_ref(struct kdbus_node *node);
+struct kdbus_node *kdbus_node_unref(struct kdbus_node *node);
+
+bool kdbus_node_is_active(struct kdbus_node *node);
+bool kdbus_node_activate(struct kdbus_node *node);
+void kdbus_node_deactivate(struct kdbus_node *node);
+
+bool kdbus_node_acquire(struct kdbus_node *node);
+void kdbus_node_release(struct kdbus_node *node);
+
+struct kdbus_node *kdbus_node_find_child(struct kdbus_node *node,
+					 const char *name);
+struct kdbus_node *kdbus_node_find_closest(struct kdbus_node *node,
+					   unsigned int hash);
+struct kdbus_node *kdbus_node_next_child(struct kdbus_node *node,
+					 struct kdbus_node *prev);
+
+#endif
-- 
2.1.3



* kdbus: add code to gather metadata
  2014-11-21  5:02 ` Greg Kroah-Hartman
                   ` (6 preceding siblings ...)
  (?)
@ 2014-11-21  5:02 ` Greg Kroah-Hartman
  2014-11-21 19:50     ` Andy Lutomirski
  -1 siblings, 1 reply; 73+ messages in thread
From: Greg Kroah-Hartman @ 2014-11-21  5:02 UTC (permalink / raw)
  To: arnd, ebiederm, gnomes, teg, jkosina, luto, linux-api, linux-kernel
  Cc: daniel, dh.herrmann, tixxdz, Greg Kroah-Hartman

From: Daniel Mack <daniel@zonque.org>

A connection chooses which metadata it wants to have attached to each
message it receives with kdbus_cmd_hello.attach_flags. The metadata
will be attached as items to the messages. All metadata refers to
information about the sending task at sending time, unless otherwise
stated. Also, the metadata is copied, not referenced, so even if the
sending task doesn't exist anymore at the time the message is received,
the information is still preserved.
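
As a rough illustration of the collection scheme described above, here is a
minimal userspace sketch. It is not the code from this patch. The EX_*
names and values are hypothetical, and the item layout (a 64-bit size, a
64-bit type, then the payload padded to 8 bytes) is an assumption made for
the example. It shows two points: only flags that are requested and not
yet attached trigger a new item, and the payload is copied into the
buffer rather than referenced.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical values for illustration; not the real KDBUS_* constants. */
#define EX_ATTACH_CREDS		(1ULL << 0)
#define EX_ATTACH_PID_COMM	(1ULL << 1)
#define EX_ITEM_CREDS		0x1001ULL
#define EX_ITEM_PID_COMM	0x1002ULL

#define EX_ALIGN8(v)		(((v) + 7ULL) & ~7ULL)
#define EX_ITEM_HEADER_SIZE	(2 * sizeof(uint64_t))

struct ex_meta {
	uint64_t attached;	/* flags already collected */
	uint8_t *data;		/* packed items */
	size_t size;		/* bytes used */
	size_t allocated;	/* bytes allocated */
};

/* Append one item: header (size, type), copied payload, zero padding. */
static int ex_append(struct ex_meta *m, uint64_t type,
		     const void *payload, size_t len)
{
	size_t need = EX_ALIGN8(EX_ITEM_HEADER_SIZE + len);
	uint64_t hdr[2] = { EX_ITEM_HEADER_SIZE + len, type };

	assert(m && payload);

	if (m->size + need > m->allocated) {
		size_t n = m->allocated ? m->allocated : 512;
		uint8_t *p;

		while (n < m->size + need)
			n *= 2;
		p = realloc(m->data, n);
		if (!p)
			return -1;
		/* keep the unused tail zeroed so padding bytes are zero */
		memset(p + m->allocated, 0, n - m->allocated);
		m->data = p;
		m->allocated = n;
	}

	memcpy(m->data + m->size, hdr, sizeof(hdr));
	memcpy(m->data + m->size + sizeof(hdr), payload, len);
	m->size += need;
	return 0;
}

/* Collect only what is wanted and not yet attached. */
static int ex_collect(struct ex_meta *m, uint64_t which)
{
	uint64_t mask = which & ~m->attached;

	if (mask & EX_ATTACH_CREDS) {
		const uint64_t creds[2] = { 1000, 1000 }; /* made-up uid, gid */

		if (ex_append(m, EX_ITEM_CREDS, creds, sizeof(creds)) < 0)
			return -1;
		m->attached |= EX_ATTACH_CREDS;
	}

	if (mask & EX_ATTACH_PID_COMM) {
		static const char comm[] = "example";

		if (ex_append(m, EX_ITEM_PID_COMM, comm, sizeof(comm)) < 0)
			return -1;
		m->attached |= EX_ATTACH_PID_COMM;
	}

	return 0;
}
```

Calling ex_collect() twice with the same flags leaves the buffer unchanged
on the second call, which is what lets one metadata object serve several
receivers with different attach_flags.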

See kdbus.txt for more details on which metadata can currently be
attached to messages.

Signed-off-by: Daniel Mack <daniel@zonque.org>
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Signed-off-by: Djalal Harouni <tixxdz@opendz.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 ipc/kdbus/metadata.c | 698 +++++++++++++++++++++++++++++++++++++++++++++++++++
 ipc/kdbus/metadata.h |  38 +++
 2 files changed, 736 insertions(+)
 create mode 100644 ipc/kdbus/metadata.c
 create mode 100644 ipc/kdbus/metadata.h

diff --git a/ipc/kdbus/metadata.c b/ipc/kdbus/metadata.c
new file mode 100644
index 000000000000..be70e6c8dcb6
--- /dev/null
+++ b/ipc/kdbus/metadata.c
@@ -0,0 +1,698 @@
+/*
+ * Copyright (C) 2013-2014 Kay Sievers
+ * Copyright (C) 2013-2014 Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+ * Copyright (C) 2013-2014 Daniel Mack <daniel@zonque.org>
+ * Copyright (C) 2013-2014 David Herrmann <dh.herrmann@gmail.com>
+ * Copyright (C) 2013-2014 Linux Foundation
+ *
+ * kdbus is free software; you can redistribute it and/or modify it under
+ * the terms of the GNU Lesser General Public License as published by the
+ * Free Software Foundation; either version 2.1 of the License, or (at
+ * your option) any later version.
+ */
+
+#include <linux/audit.h>
+#include <linux/capability.h>
+#include <linux/cgroup.h>
+#include <linux/cred.h>
+#include <linux/file.h>
+#include <linux/init.h>
+#include <linux/mutex.h>
+#include <linux/sched.h>
+#include <linux/security.h>
+#include <linux/sizes.h>
+#include <linux/slab.h>
+#include <linux/uaccess.h>
+#include <linux/version.h>
+
+#include "bus.h"
+#include "connection.h"
+#include "domain.h"
+#include "endpoint.h"
+#include "item.h"
+#include "message.h"
+#include "metadata.h"
+#include "names.h"
+#include "pool.h"
+
+/**
+ * struct kdbus_meta - metadata buffer
+ * @attached:		Flags for already attached data
+ * @data:		Allocated buffer
+ * @size:		Number of bytes used
+ * @allocated_size:	Size of buffer
+ *
+ * Used to collect and store connection metadata in a pre-compiled
+ * buffer containing struct kdbus_item.
+ */
+struct kdbus_meta {
+	u64 attached;
+	struct kdbus_item *data;
+	size_t size;
+	size_t allocated_size;
+};
+
+/**
+ * kdbus_meta_new() - create new metadata object
+ *
+ * Return: a new kdbus_meta object on success, ERR_PTR on failure.
+ */
+struct kdbus_meta *kdbus_meta_new(void)
+{
+	struct kdbus_meta *m;
+
+	m = kzalloc(sizeof(*m), GFP_KERNEL);
+	if (!m)
+		return ERR_PTR(-ENOMEM);
+
+	return m;
+}
+
+/**
+ * kdbus_meta_dup() - Duplicate a meta object
+ *
+ * @orig:	The meta object to duplicate
+ *
+ * Return: a new kdbus_meta object on success, ERR_PTR on failure.
+ */
+struct kdbus_meta *kdbus_meta_dup(const struct kdbus_meta *orig)
+{
+	struct kdbus_meta *m;
+
+	BUG_ON(!orig);
+
+	m = kmalloc(sizeof(*m), GFP_KERNEL);
+	if (!m)
+		return ERR_PTR(-ENOMEM);
+
+	m->data = kmemdup(orig->data, orig->allocated_size, GFP_KERNEL);
+	if (!m->data) {
+		kfree(m);
+		return ERR_PTR(-ENOMEM);
+	}
+
+	m->attached = orig->attached;
+	m->allocated_size = orig->allocated_size;
+	m->size = orig->size;
+
+	return m;
+}
+
+/**
+ * kdbus_meta_free() - release metadata
+ * @meta:		Metadata object
+ */
+void kdbus_meta_free(struct kdbus_meta *meta)
+{
+	if (!meta)
+		return;
+
+	kfree(meta->data);
+	kfree(meta);
+}
+
+static struct kdbus_item *
+kdbus_meta_append_item(struct kdbus_meta *meta, u64 type, size_t payload_size)
+{
+	size_t extra_size = KDBUS_ITEM_SIZE(payload_size);
+	struct kdbus_item *item;
+	size_t size;
+
+	/* get new metadata buffer, pre-allocate at least 512 bytes */
+	if (!meta->data) {
+		size = roundup_pow_of_two(256 + extra_size);
+		meta->data = kzalloc(size, GFP_KERNEL);
+		if (!meta->data)
+			return ERR_PTR(-ENOMEM);
+
+		meta->allocated_size = size;
+	}
+
+	/* double the pre-allocated buffer size if needed */
+	size = meta->size + extra_size;
+	if (size > meta->allocated_size) {
+		size_t size_diff;
+		struct kdbus_item *data;
+
+		size = roundup_pow_of_two(size);
+		size_diff = size - meta->allocated_size;
+		data = kmalloc(size, GFP_KERNEL);
+		if (!data)
+			return ERR_PTR(-ENOMEM);
+
+		memcpy(data, meta->data, meta->size);
+		memset((u8 *)data + meta->allocated_size, 0, size_diff);
+
+		kfree(meta->data);
+		meta->data = data;
+		meta->allocated_size = size;
+	}
+
+	/* insert new record */
+	item = (struct kdbus_item *)((u8 *)meta->data + meta->size);
+	item->type = type;
+	item->size = KDBUS_ITEM_HEADER_SIZE + payload_size;
+
+	meta->size += extra_size;
+
+	return item;
+}
+
+/**
+ * kdbus_meta_append_data() - append given raw data to metadata object
+ * @meta:		Metadata object
+ * @type:		KDBUS_ITEM_* type
+ * @data:		pointer to data to copy from. If it is NULL
+ *			then just make space in the metadata buffer.
+ * @len:		number of bytes to copy
+ *
+ * Return: 0 on success, negative errno on failure.
+ */
+int kdbus_meta_append_data(struct kdbus_meta *meta, u64 type,
+			   const void *data, size_t len)
+{
+	struct kdbus_item *item;
+
+	if (len == 0)
+		return 0;
+
+	item = kdbus_meta_append_item(meta, type, len);
+	if (IS_ERR(item))
+		return PTR_ERR(item);
+
+	if (data)
+		memcpy(item->data, data, len);
+
+	return 0;
+}
+
+static int kdbus_meta_append_str(struct kdbus_meta *meta, u64 type,
+				 const char *str)
+{
+	return kdbus_meta_append_data(meta, type, str, strlen(str) + 1);
+}
+
+static int kdbus_meta_append_timestamp(struct kdbus_meta *meta,
+				       u64 seq)
+{
+	struct kdbus_item *item;
+	struct timespec ts;
+
+	item = kdbus_meta_append_item(meta, KDBUS_ITEM_TIMESTAMP,
+				      sizeof(struct kdbus_timestamp));
+	if (IS_ERR(item))
+		return PTR_ERR(item);
+
+	if (seq > 0)
+		item->timestamp.seqnum = seq;
+
+	ktime_get_ts(&ts);
+	item->timestamp.monotonic_ns = timespec_to_ns(&ts);
+
+	ktime_get_real_ts(&ts);
+	item->timestamp.realtime_ns = timespec_to_ns(&ts);
+
+	return 0;
+}
+
+static int kdbus_meta_append_cred(struct kdbus_meta *meta,
+				  const struct kdbus_domain *domain)
+{
+	struct kdbus_creds creds = {
+		.uid = from_kuid_munged(domain->user_namespace, current_uid()),
+		.gid = from_kgid_munged(domain->user_namespace, current_gid()),
+		.pid = task_tgid_nr_ns(current, domain->pid_namespace),
+		.tid = task_pid_nr_ns(current, domain->pid_namespace),
+		.starttime = current->start_time,
+	};
+
+	return kdbus_meta_append_data(meta, KDBUS_ITEM_CREDS,
+				      &creds, sizeof(creds));
+}
+
+static int kdbus_meta_append_auxgroups(struct kdbus_meta *meta,
+				       const struct kdbus_domain *domain)
+{
+	struct group_info *info;
+	struct kdbus_item *item;
+	int i, ret = 0;
+	u64 *gid;
+
+	info = get_current_groups();
+	item = kdbus_meta_append_item(meta, KDBUS_ITEM_AUXGROUPS,
+				      info->ngroups * sizeof(*gid));
+	if (IS_ERR(item)) {
+		ret = PTR_ERR(item);
+		goto exit_put_groups;
+	}
+
+	gid = (u64 *) item->data;
+
+	for (i = 0; i < info->ngroups; i++)
+		gid[i] = from_kgid_munged(domain->user_namespace,
+					  GROUP_AT(info, i));
+
+exit_put_groups:
+	put_group_info(info);
+
+	return ret;
+}
+
+static int kdbus_meta_append_src_names(struct kdbus_meta *meta,
+				       struct kdbus_conn *conn)
+{
+	struct kdbus_name_entry *e;
+	int ret = 0;
+
+	if (!conn)
+		return 0;
+
+	mutex_lock(&conn->lock);
+	list_for_each_entry(e, &conn->names_list, conn_entry) {
+		struct kdbus_item *item;
+		size_t len;
+
+		len = strlen(e->name) + 1;
+		item = kdbus_meta_append_item(meta, KDBUS_ITEM_OWNED_NAME,
+					      sizeof(struct kdbus_name) + len);
+		if (IS_ERR(item)) {
+			ret = PTR_ERR(item);
+			break;
+		}
+
+		item->name.flags = e->flags;
+		memcpy(item->name.name, e->name, len);
+	}
+	mutex_unlock(&conn->lock);
+
+	return ret;
+}
+
+static int kdbus_meta_append_exe(struct kdbus_meta *meta)
+{
+	struct mm_struct *mm = get_task_mm(current);
+	struct path *exe_path = NULL;
+	char *pathname;
+	int ret = 0;
+	size_t len;
+	char *tmp;
+
+	if (!mm)
+		return -EFAULT;
+
+	down_read(&mm->mmap_sem);
+	if (mm->exe_file) {
+		path_get(&mm->exe_file->f_path);
+		exe_path = &mm->exe_file->f_path;
+	}
+	up_read(&mm->mmap_sem);
+
+	if (!exe_path)
+		goto exit_mmput;
+
+	tmp = (char *)__get_free_page(GFP_TEMPORARY | __GFP_ZERO);
+	if (!tmp) {
+		ret = -ENOMEM;
+		goto exit_path_put;
+	}
+
+	pathname = d_path(exe_path, tmp, PAGE_SIZE);
+	if (IS_ERR(pathname)) {
+		ret = PTR_ERR(pathname);
+		goto exit_free_page;
+	}
+
+	len = tmp + PAGE_SIZE - pathname;
+	ret = kdbus_meta_append_data(meta, KDBUS_ITEM_EXE, pathname, len);
+
+exit_free_page:
+	free_page((unsigned long) tmp);
+
+exit_path_put:
+	path_put(exe_path);
+
+exit_mmput:
+	mmput(mm);
+
+	return ret;
+}
+
+static int kdbus_meta_append_cmdline(struct kdbus_meta *meta)
+{
+	struct mm_struct *mm;
+	int ret = 0;
+	size_t len;
+	char *tmp;
+
+	tmp = (char *)__get_free_page(GFP_TEMPORARY | __GFP_ZERO);
+	if (!tmp)
+		return -ENOMEM;
+
+	mm = get_task_mm(current);
+	if (!mm) {
+		ret = -EFAULT;
+		goto exit_free_page;
+	}
+
+	if (!mm->arg_end)
+		goto exit_mmput;
+
+	len = mm->arg_end - mm->arg_start;
+	if (len > PAGE_SIZE)
+		len = PAGE_SIZE;
+
+	ret = -EFAULT;
+	if (copy_from_user(tmp, (const char __user *)mm->arg_start, len))
+		goto exit_mmput;
+
+	ret = kdbus_meta_append_data(meta, KDBUS_ITEM_CMDLINE, tmp, len);
+
+exit_mmput:
+	mmput(mm);
+
+exit_free_page:
+	free_page((unsigned long) tmp);
+
+	return ret;
+}
+
+static int kdbus_meta_append_caps(struct kdbus_meta *meta)
+{
+	struct caps {
+		u32 last_cap;
+		struct {
+			u32 caps[_KERNEL_CAPABILITY_U32S];
+		} set[4];
+	} caps;
+	unsigned int i;
+	const struct cred *cred = current_cred();
+
+	caps.last_cap = CAP_LAST_CAP;
+
+	for (i = 0; i < _KERNEL_CAPABILITY_U32S; i++) {
+		caps.set[0].caps[i] = cred->cap_inheritable.cap[i];
+		caps.set[1].caps[i] = cred->cap_permitted.cap[i];
+		caps.set[2].caps[i] = cred->cap_effective.cap[i];
+		caps.set[3].caps[i] = cred->cap_bset.cap[i];
+	}
+
+	/* clear unused bits */
+	for (i = 0; i < 4; i++)
+		caps.set[i].caps[CAP_TO_INDEX(CAP_LAST_CAP)] &=
+			CAP_TO_MASK(CAP_LAST_CAP + 1) - 1;
+
+	return kdbus_meta_append_data(meta, KDBUS_ITEM_CAPS,
+				      &caps, sizeof(caps));
+}
+
+#ifdef CONFIG_CGROUPS
+static int kdbus_meta_append_cgroup(struct kdbus_meta *meta)
+{
+	char *buf, *path;
+	int ret;
+
+	buf = (char *)__get_free_page(GFP_TEMPORARY | __GFP_ZERO);
+	if (!buf)
+		return -ENOMEM;
+
+	path = task_cgroup_path(current, buf, PAGE_SIZE);
+
+	if (path)
+		ret = kdbus_meta_append_str(meta, KDBUS_ITEM_CGROUP, path);
+	else
+		ret = -ENAMETOOLONG;
+
+	free_page((unsigned long) buf);
+
+	return ret;
+}
+#endif
+
+#ifdef CONFIG_AUDITSYSCALL
+static int kdbus_meta_append_audit(struct kdbus_meta *meta,
+				   const struct kdbus_domain *domain)
+{
+	struct kdbus_audit audit;
+
+	audit.loginuid = from_kuid_munged(domain->user_namespace,
+					  audit_get_loginuid(current));
+	audit.sessionid = audit_get_sessionid(current);
+
+	return kdbus_meta_append_data(meta, KDBUS_ITEM_AUDIT,
+				      &audit, sizeof(audit));
+}
+#endif
+
+#ifdef CONFIG_SECURITY
+static int kdbus_meta_append_seclabel(struct kdbus_meta *meta)
+{
+	u32 len, sid;
+	char *label;
+	int ret;
+
+	security_task_getsecid(current, &sid);
+	ret = security_secid_to_secctx(sid, &label, &len);
+	if (ret == -EOPNOTSUPP)
+		return 0;
+	if (ret < 0)
+		return ret;
+
+	if (label && len > 0)
+		ret = kdbus_meta_append_data(meta, KDBUS_ITEM_SECLABEL,
+					     label, len);
+	security_release_secctx(label, len);
+
+	return ret;
+}
+#endif
+
+/**
+ * kdbus_meta_append() - collect metadata from current process
+ * @meta:		Metadata object
+ * @domain:		The domain to use for namespace translations
+ * @conn:		Current connection to read names from
+ * @seq:		Message sequence number
+ * @which:		KDBUS_ATTACH_* flags, which type of data to attach
+ *
+ * Collect the data specified in flags and allocate or extend
+ * the buffer in the metadata object.
+ *
+ * Return: 0 on success, negative errno on failure.
+ */
+int kdbus_meta_append(struct kdbus_meta *meta,
+		      struct kdbus_domain *domain,
+		      struct kdbus_conn *conn,
+		      u64 seq, u64 which)
+{
+	int ret;
+	u64 mask;
+
+	/* which metadata is wanted but not yet attached? */
+	mask = which & ~meta->attached;
+	if (mask == 0)
+		return 0;
+
+	if (mask & KDBUS_ATTACH_TIMESTAMP) {
+		ret = kdbus_meta_append_timestamp(meta, seq);
+		if (ret < 0)
+			return ret;
+
+		meta->attached |= KDBUS_ATTACH_TIMESTAMP;
+	}
+
+	if (mask & KDBUS_ATTACH_CREDS) {
+		ret = kdbus_meta_append_cred(meta, domain);
+		if (ret < 0)
+			return ret;
+
+		meta->attached |= KDBUS_ATTACH_CREDS;
+	}
+
+	if (mask & KDBUS_ATTACH_AUXGROUPS) {
+		ret = kdbus_meta_append_auxgroups(meta, domain);
+		if (ret < 0)
+			return ret;
+
+		meta->attached |= KDBUS_ATTACH_AUXGROUPS;
+	}
+
+	if (mask & KDBUS_ATTACH_NAMES && conn) {
+		ret = kdbus_meta_append_src_names(meta, conn);
+		if (ret < 0)
+			return ret;
+
+		meta->attached |= KDBUS_ATTACH_NAMES;
+	}
+
+	if (mask & KDBUS_ATTACH_TID_COMM) {
+		char comm[TASK_COMM_LEN];
+
+		get_task_comm(comm, current->group_leader);
+		ret = kdbus_meta_append_str(meta, KDBUS_ITEM_TID_COMM, comm);
+		if (ret < 0)
+			return ret;
+
+		meta->attached |= KDBUS_ATTACH_TID_COMM;
+	}
+
+	if (mask & KDBUS_ATTACH_PID_COMM) {
+		char comm[TASK_COMM_LEN];
+
+		get_task_comm(comm, current);
+		ret = kdbus_meta_append_str(meta, KDBUS_ITEM_PID_COMM, comm);
+		if (ret < 0)
+			return ret;
+
+		meta->attached |= KDBUS_ATTACH_PID_COMM;
+	}
+
+	if (mask & KDBUS_ATTACH_EXE) {
+		ret = kdbus_meta_append_exe(meta);
+		if (ret < 0)
+			return ret;
+
+		meta->attached |= KDBUS_ATTACH_EXE;
+	}
+
+	if (mask & KDBUS_ATTACH_CMDLINE) {
+		ret = kdbus_meta_append_cmdline(meta);
+		if (ret < 0)
+			return ret;
+
+		meta->attached |= KDBUS_ATTACH_CMDLINE;
+	}
+
+	/* we always return 4 elements, the element size is 1/4 */
+	if (mask & KDBUS_ATTACH_CAPS) {
+		ret = kdbus_meta_append_caps(meta);
+		if (ret < 0)
+			return ret;
+
+		meta->attached |= KDBUS_ATTACH_CAPS;
+	}
+
+#ifdef CONFIG_CGROUPS
+	/* attach the path of the one group hierarchy specified for the bus */
+	if (mask & KDBUS_ATTACH_CGROUP) {
+		ret = kdbus_meta_append_cgroup(meta);
+		if (ret < 0)
+			return ret;
+
+		meta->attached |= KDBUS_ATTACH_CGROUP;
+	}
+#endif
+
+#ifdef CONFIG_AUDITSYSCALL
+	if (mask & KDBUS_ATTACH_AUDIT) {
+		ret = kdbus_meta_append_audit(meta, domain);
+		if (ret < 0)
+			return ret;
+
+		meta->attached |= KDBUS_ATTACH_AUDIT;
+	}
+#endif
+
+#ifdef CONFIG_SECURITY
+	if (mask & KDBUS_ATTACH_SECLABEL) {
+		ret = kdbus_meta_append_seclabel(meta);
+		if (ret < 0)
+			return ret;
+
+		meta->attached |= KDBUS_ATTACH_SECLABEL;
+	}
+#endif
+
+	if ((mask & KDBUS_ATTACH_CONN_DESCRIPTION) && conn && conn->name) {
+		ret = kdbus_meta_append_str(meta, KDBUS_ITEM_CONN_DESCRIPTION,
+					    conn->name);
+		if (ret < 0)
+			return ret;
+
+		meta->attached |= KDBUS_ATTACH_CONN_DESCRIPTION;
+	}
+
+	return 0;
+}
+
+static inline u64 kdbus_item_attach_flag(u64 type)
+{
+	BUG_ON(type < _KDBUS_ITEM_ATTACH_BASE);
+	BUG_ON(type >= _KDBUS_ITEM_ATTACH_BASE + 64);
+	return 1ULL << (type - _KDBUS_ITEM_ATTACH_BASE);
+}
+
+/**
+ * kdbus_meta_size() - calculate the size of an excerpt of a metadata db
+ * @meta:	The database object containing the metadata
+ * @conn_dst:	The connection that is about to receive the data
+ * @mask:	Pointer to KDBUS_ATTACH_* bitmask to calculate the size for.
+ *		Callers *must* use the same mask for calls to
+ *		kdbus_meta_write().
+ *
+ * Return: the size in bytes the masked data will consume. Data that should
+ * not be received by @conn_dst will be filtered out.
+ */
+size_t kdbus_meta_size(const struct kdbus_meta *meta,
+		       const struct kdbus_conn *conn_dst,
+		       u64 *mask)
+{
+	struct kdbus_domain *domain = conn_dst->ep->bus->domain;
+	const struct kdbus_item *item;
+	size_t size = 0;
+
+	/*
+	 * We currently don't have a way to translate capability flags between
+	 * user namespaces, so let's drop these items in such cases.
+	 */
+	if (domain->user_namespace != current_user_ns())
+		*mask &= ~KDBUS_ATTACH_CAPS;
+
+	/*
+	 * If the domain was created with hide_pid enabled, drop all items
+	 * except those that reveal nothing about the task.
+	 */
+	if (domain->pid_namespace->hide_pid)
+		*mask &= KDBUS_ATTACH_TIMESTAMP | KDBUS_ATTACH_NAMES |
+			 KDBUS_ATTACH_CONN_DESCRIPTION;
+
+	KDBUS_ITEMS_FOREACH(item, meta->data, meta->size)
+		if (*mask & kdbus_item_attach_flag(item->type))
+			size += KDBUS_ALIGN8(item->size);
+
+	return size;
+}
+
+/**
+ * kdbus_meta_write() - Write an excerpt of a metadata db to a slice
+ * @meta:	The database object containing the metadata
+ * @conn_dst:	The connection that is about to receive the data
+ * @mask:	KDBUS_ATTACH_* bitmask selecting the items to write
+ * @slice:	The slice to copy the data to
+ * @off:	The initial offset in the slice to write to
+ *
+ * Copies the metadata items selected by @mask to @slice if they are
+ * suitable for @conn_dst to receive; other items are omitted.
+ * Given the same input parameters, this function will write exactly as
+ * many bytes as reported by kdbus_meta_size().
+ *
+ * Return: 0 on success, negative error number otherwise.
+ */
+int kdbus_meta_write(const struct kdbus_meta *meta,
+		     const struct kdbus_conn *conn_dst, u64 mask,
+		     const struct kdbus_pool_slice *slice, size_t off)
+{
+	const struct kdbus_item *item;
+	int ret;
+
+	KDBUS_ITEMS_FOREACH(item, meta->data, meta->size)
+		if (mask & kdbus_item_attach_flag(item->type)) {
+			ret = kdbus_pool_slice_copy(slice, off, item,
+						    KDBUS_ALIGN8(item->size));
+			if (ret < 0)
+				return ret;
+
+			off += KDBUS_ALIGN8(item->size);
+		}
+
+	return 0;
+}
diff --git a/ipc/kdbus/metadata.h b/ipc/kdbus/metadata.h
new file mode 100644
index 000000000000..db186bb23d31
--- /dev/null
+++ b/ipc/kdbus/metadata.h
@@ -0,0 +1,38 @@
+/*
+ * Copyright (C) 2013-2014 Kay Sievers
+ * Copyright (C) 2013-2014 Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+ * Copyright (C) 2013-2014 Daniel Mack <daniel@zonque.org>
+ * Copyright (C) 2013-2014 David Herrmann <dh.herrmann@gmail.com>
+ * Copyright (C) 2013-2014 Linux Foundation
+ *
+ * kdbus is free software; you can redistribute it and/or modify it under
+ * the terms of the GNU Lesser General Public License as published by the
+ * Free Software Foundation; either version 2.1 of the License, or (at
+ * your option) any later version.
+ */
+
+#ifndef __KDBUS_METADATA_H
+#define __KDBUS_METADATA_H
+
+struct kdbus_meta;
+struct kdbus_conn;
+struct kdbus_domain;
+struct kdbus_pool_slice;
+
+struct kdbus_meta *kdbus_meta_new(void);
+struct kdbus_meta *kdbus_meta_dup(const struct kdbus_meta *orig);
+int kdbus_meta_append_data(struct kdbus_meta *meta, u64 type,
+			   const void *buf, size_t len);
+int kdbus_meta_append(struct kdbus_meta *meta,
+		      struct kdbus_domain *domain,
+		      struct kdbus_conn *conn,
+		      u64 seq, u64 which);
+void kdbus_meta_free(struct kdbus_meta *meta);
+size_t kdbus_meta_size(const struct kdbus_meta *meta,
+		       const struct kdbus_conn *conn_dst,
+		       u64 *mask);
+int kdbus_meta_write(const struct kdbus_meta *meta,
+		     const struct kdbus_conn *conn_dst, u64 mask,
+		     const struct kdbus_pool_slice *slice, size_t off);
+
+#endif
-- 
2.1.3



* kdbus: add code for notifications and matches
  2014-11-21  5:02 ` Greg Kroah-Hartman
                   ` (7 preceding siblings ...)
  (?)
@ 2014-11-21  5:02 ` Greg Kroah-Hartman
  -1 siblings, 0 replies; 73+ messages in thread
From: Greg Kroah-Hartman @ 2014-11-21  5:02 UTC (permalink / raw)
  To: arnd, ebiederm, gnomes, teg, jkosina, luto, linux-api, linux-kernel
  Cc: daniel, dh.herrmann, tixxdz, Greg Kroah-Hartman

From: Daniel Mack <daniel@zonque.org>

This patch adds code for matches and notifications.

Notifications are broadcast messages generated by the kernel, which
notify subscribers when connections are created or destroyed, when
well-known-names have been claimed, released or changed ownership,
or when reply messages have timed out.

Matches are used to tell the kernel driver which broadcast messages
a connection is interested in. Matches can either be specific on one
of the kernel-generated notification types, or carry a bloom filter
mask to match against a message from userspace. The latter is a way
to pre-filter messages from other connections in order to mitigate
unnecessary wakeups.
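
To make the bloom pre-filter concrete, here is a small standalone sketch of
the check. It is an illustration only, not the code from this patch:
ex_bloom_match() and its parameters are hypothetical names. The mask may
carry several generations stored back to back, each n_words long. The
generation closest to the one named by the message's filter is selected,
and the message passes only if every bit set in that mask is also set in
the filter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper for illustration.
 * @filter:            bloom filter bits carried by the message
 * @filter_generation: generation the sender used to build the filter
 * @masks:             mask_generations * n_words words, generation 0 first
 * @mask_generations:  number of generations in @masks (must be > 0)
 * @n_words:           number of 64-bit words per filter/mask
 */
static bool ex_bloom_match(const uint64_t *filter, uint64_t filter_generation,
			   const uint64_t *masks, uint64_t mask_generations,
			   size_t n_words)
{
	const uint64_t *m;
	uint64_t gen;
	size_t i;

	assert(mask_generations > 0);

	/* clamp to the newest generation the mask provides */
	gen = filter_generation < mask_generations - 1 ?
	      filter_generation : mask_generations - 1;
	m = masks + gen * n_words;

	/* every bit requested by the mask must be present in the filter */
	for (i = 0; i < n_words; i++)
		if ((filter[i] & m[i]) != m[i])
			return false;

	return true;
}
```

As with any bloom filter, a true result only means the message possibly
carries the subscribed properties, so the receiver still has to inspect
the message. A false result is definite, and that is what avoids the
wakeup.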

Signed-off-by: Daniel Mack <daniel@zonque.org>
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Signed-off-by: Djalal Harouni <tixxdz@opendz.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 ipc/kdbus/match.c  | 524 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 ipc/kdbus/match.h  |  31 ++++
 ipc/kdbus/notify.c | 235 ++++++++++++++++++++++++
 ipc/kdbus/notify.h |  29 +++
 4 files changed, 819 insertions(+)
 create mode 100644 ipc/kdbus/match.c
 create mode 100644 ipc/kdbus/match.h
 create mode 100644 ipc/kdbus/notify.c
 create mode 100644 ipc/kdbus/notify.h

diff --git a/ipc/kdbus/match.c b/ipc/kdbus/match.c
new file mode 100644
index 000000000000..e70ed28eff3f
--- /dev/null
+++ b/ipc/kdbus/match.c
@@ -0,0 +1,524 @@
+/*
+ * Copyright (C) 2013-2014 Kay Sievers
+ * Copyright (C) 2013-2014 Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+ * Copyright (C) 2013-2014 Daniel Mack <daniel@zonque.org>
+ * Copyright (C) 2013-2014 David Herrmann <dh.herrmann@gmail.com>
+ * Copyright (C) 2013-2014 Linux Foundation
+ * Copyright (C) 2014 Djalal Harouni <tixxdz@opendz.org>
+ *
+ * kdbus is free software; you can redistribute it and/or modify it under
+ * the terms of the GNU Lesser General Public License as published by the
+ * Free Software Foundation; either version 2.1 of the License, or (at
+ * your option) any later version.
+ */
+
+#include <linux/fs.h>
+#include <linux/hash.h>
+#include <linux/init.h>
+#include <linux/mutex.h>
+#include <linux/sched.h>
+#include <linux/sizes.h>
+#include <linux/slab.h>
+#include <linux/uaccess.h>
+
+#include "bus.h"
+#include "connection.h"
+#include "endpoint.h"
+#include "item.h"
+#include "match.h"
+#include "message.h"
+
+/**
+ * struct kdbus_match_db - message filters
+ * @entries_list:	List of matches
+ * @mdb_rwlock:		Match data lock
+ * @entries_count:	Number of entries in database
+ */
+struct kdbus_match_db {
+	struct list_head entries_list;
+	struct rw_semaphore mdb_rwlock;
+	unsigned int entries_count;
+};
+
+/**
+ * struct kdbus_match_entry - a match database entry
+ * @cookie:		User-supplied cookie to lookup the entry
+ * @list_entry:		The list entry element for the db list
+ * @rules_list:		The list head for tracking rules of this entry
+ */
+struct kdbus_match_entry {
+	u64 cookie;
+	struct list_head list_entry;
+	struct list_head rules_list;
+};
+
+/**
+ * struct kdbus_bloom_mask - mask to match against filter
+ * @generations:	Number of generations carried
+ * @data:		Array of bloom bit fields
+ */
+struct kdbus_bloom_mask {
+	u64 generations;
+	u64 *data;
+};
+
+/**
+ * struct kdbus_match_rule - a rule appended to a match entry
+ * @type:		An item type to match against
+ * @bloom_mask:		Bloom mask to match a message's filter against, used
+ *			with KDBUS_ITEM_BLOOM_MASK
+ * @name:		Name to match against, used with KDBUS_ITEM_NAME,
+ *			KDBUS_ITEM_NAME_{ADD,REMOVE,CHANGE}
+ * @old_id:		ID to match against, used with
+ *			KDBUS_ITEM_NAME_{ADD,REMOVE,CHANGE},
+ *			KDBUS_ITEM_ID_REMOVE
+ * @new_id:		ID to match against, used with
+ *			KDBUS_ITEM_NAME_{ADD,REMOVE,CHANGE},
+ *			KDBUS_ITEM_ID_ADD
+ * @src_id:		ID to match against, used with KDBUS_ITEM_ID
+ * @rules_entry:	Entry in the entry's rules list
+ */
+struct kdbus_match_rule {
+	u64 type;
+	union {
+		struct kdbus_bloom_mask bloom_mask;
+		struct {
+			char *name;
+			u64 old_id;
+			u64 new_id;
+		};
+		u64 src_id;
+	};
+	struct list_head rules_entry;
+};
+
+static void kdbus_match_rule_free(struct kdbus_match_rule *rule)
+{
+	switch (rule->type) {
+	case KDBUS_ITEM_BLOOM_MASK:
+		kfree(rule->bloom_mask.data);
+		break;
+
+	case KDBUS_ITEM_NAME:
+	case KDBUS_ITEM_NAME_ADD:
+	case KDBUS_ITEM_NAME_REMOVE:
+	case KDBUS_ITEM_NAME_CHANGE:
+		kfree(rule->name);
+		break;
+
+	case KDBUS_ITEM_ID:
+	case KDBUS_ITEM_ID_ADD:
+	case KDBUS_ITEM_ID_REMOVE:
+		break;
+
+	default:
+		BUG();
+	}
+
+	list_del(&rule->rules_entry);
+	kfree(rule);
+}
+
+static void kdbus_match_entry_free(struct kdbus_match_entry *entry)
+{
+	struct kdbus_match_rule *r, *tmp;
+
+	list_for_each_entry_safe(r, tmp, &entry->rules_list, rules_entry)
+		kdbus_match_rule_free(r);
+
+	list_del(&entry->list_entry);
+	kfree(entry);
+}
+
+/**
+ * kdbus_match_db_free() - free match db resources
+ * @mdb:		The match database
+ */
+void kdbus_match_db_free(struct kdbus_match_db *mdb)
+{
+	struct kdbus_match_entry *entry, *tmp;
+
+	down_write(&mdb->mdb_rwlock);
+	list_for_each_entry_safe(entry, tmp, &mdb->entries_list, list_entry)
+		kdbus_match_entry_free(entry);
+	up_write(&mdb->mdb_rwlock);
+
+	kfree(mdb);
+}
+
+/**
+ * kdbus_match_db_new() - create a new match database
+ *
+ * Return: a new kdbus_match_db on success, ERR_PTR on failure.
+ */
+struct kdbus_match_db *kdbus_match_db_new(void)
+{
+	struct kdbus_match_db *d;
+
+	d = kzalloc(sizeof(*d), GFP_KERNEL);
+	if (!d)
+		return ERR_PTR(-ENOMEM);
+
+	init_rwsem(&d->mdb_rwlock);
+	INIT_LIST_HEAD(&d->entries_list);
+
+	return d;
+}
+
+static bool kdbus_match_bloom(const struct kdbus_bloom_filter *filter,
+			      const struct kdbus_bloom_mask *mask,
+			      const struct kdbus_conn *conn)
+{
+	size_t n = conn->ep->bus->bloom.size / sizeof(u64);
+	const u64 *m;
+	size_t i;
+
+	/*
+	 * The message's filter carries a generation identifier, the
+	 * match's mask possibly carries an array of multiple generations
+	 * of the mask. Select the mask with the closest match of the
+	 * filter's generation.
+	 */
+	m = mask->data + (min(filter->generation, mask->generations - 1) * n);
+
+	/*
+	 * The message's filter contains the message's properties,
+	 * the match's mask contains the properties to look for in the
+	 * message. Check the mask bit field against the filter bit field,
+	 * if the message possibly carries the properties the connection
+	 * has subscribed to.
+	 */
+	for (i = 0; i < n; i++)
+		if ((filter->data[i] & m[i]) != m[i])
+			return false;
+
+	return true;
+}
+
+static bool kdbus_match_rules(const struct kdbus_match_entry *entry,
+			      struct kdbus_conn *conn_src,
+			      struct kdbus_kmsg *kmsg)
+{
+	struct kdbus_match_rule *r;
+
+	/*
+	 * Walk all the rules and bail out immediately
+	 * if any of them is unsatisfied.
+	 */
+
+	list_for_each_entry(r, &entry->rules_list, rules_entry) {
+		if (conn_src == NULL) {
+			/* kernel notifications */
+
+			if (kmsg->notify_type != r->type)
+				return false;
+
+			switch (r->type) {
+			case KDBUS_ITEM_ID_ADD:
+				if (r->new_id != KDBUS_MATCH_ID_ANY &&
+				    r->new_id != kmsg->notify_new_id)
+					return false;
+
+				break;
+
+			case KDBUS_ITEM_ID_REMOVE:
+				if (r->old_id != KDBUS_MATCH_ID_ANY &&
+				    r->old_id != kmsg->notify_old_id)
+					return false;
+
+				break;
+
+			case KDBUS_ITEM_NAME_ADD:
+			case KDBUS_ITEM_NAME_CHANGE:
+			case KDBUS_ITEM_NAME_REMOVE:
+				if ((r->old_id != KDBUS_MATCH_ID_ANY &&
+				     r->old_id != kmsg->notify_old_id) ||
+				    (r->new_id != KDBUS_MATCH_ID_ANY &&
+				     r->new_id != kmsg->notify_new_id) ||
+				    (r->name && kmsg->notify_name &&
+				     strcmp(r->name, kmsg->notify_name) != 0))
+					return false;
+
+				break;
+
+			default:
+				return false;
+			}
+		} else {
+			/* messages from userspace */
+
+			switch (r->type) {
+			case KDBUS_ITEM_BLOOM_MASK:
+				if (!kdbus_match_bloom(kmsg->bloom_filter,
+						       &r->bloom_mask,
+						       conn_src))
+					return false;
+				break;
+
+			case KDBUS_ITEM_ID:
+				if (r->src_id != conn_src->id &&
+				    r->src_id != KDBUS_MATCH_ID_ANY)
+					return false;
+
+				break;
+
+			case KDBUS_ITEM_NAME:
+				if (!kdbus_conn_has_name(conn_src, r->name))
+					return false;
+
+				break;
+
+			default:
+				return false;
+			}
+		}
+	}
+
+	return true;
+}
+
+/**
+ * kdbus_match_db_match_kmsg() - match a kmsg object against the database entries
+ * @mdb:		The match database
+ * @conn_src:		The connection object originating the message
+ * @kmsg:		The kmsg to perform the match on
+ *
+ * This function will walk through all the database entries previously uploaded
+ * with kdbus_match_db_add(). As soon as any of them has an all-satisfied rule
+ * set, this function will return true.
+ *
+ * Return: true if there was a matching database entry, false otherwise.
+ */
+bool kdbus_match_db_match_kmsg(struct kdbus_match_db *mdb,
+			       struct kdbus_conn *conn_src,
+			       struct kdbus_kmsg *kmsg)
+{
+	struct kdbus_match_entry *entry;
+	bool matched = false;
+
+	down_read(&mdb->mdb_rwlock);
+	list_for_each_entry(entry, &mdb->entries_list, list_entry) {
+		matched = kdbus_match_rules(entry, conn_src, kmsg);
+		if (matched)
+			break;
+	}
+	up_read(&mdb->mdb_rwlock);
+
+	return matched;
+}
+
+static int kdbus_match_db_remove_unlocked(struct kdbus_match_db *mdb,
+					  uint64_t cookie)
+{
+	struct kdbus_match_entry *entry, *tmp;
+	bool found = false;
+
+	list_for_each_entry_safe(entry, tmp, &mdb->entries_list, list_entry)
+		if (entry->cookie == cookie) {
+			kdbus_match_entry_free(entry);
+			--mdb->entries_count;
+			found = true;
+		}
+
+	return found ? 0 : -ENOENT;
+}
+
+/**
+ * kdbus_match_db_add() - add an entry to the match database
+ * @conn:		The connection that was used in the ioctl call
+ * @cmd:		The command as provided by the ioctl call
+ *
+ * This function is used in the context of the KDBUS_CMD_MATCH_ADD ioctl
+ * interface.
+ *
+ * One call to this function (or one ioctl(KDBUS_CMD_MATCH_ADD), respectively)
+ * adds one new database entry with n rules attached to it. Each rule is
+ * described with a kdbus_item, and an entry is considered matching if all
+ * its rules are satisfied.
+ *
+ * The items attached to a kdbus_cmd_match struct have the following mapping:
+ *
+ * KDBUS_ITEM_BLOOM_MASK:	A bloom mask
+ * KDBUS_ITEM_NAME:		A connection's source name
+ * KDBUS_ITEM_ID:		A connection ID
+ * KDBUS_ITEM_NAME_ADD:
+ * KDBUS_ITEM_NAME_REMOVE:
+ * KDBUS_ITEM_NAME_CHANGE:	Well-known name changes, carry
+ *				kdbus_notify_name_change
+ * KDBUS_ITEM_ID_ADD:
+ * KDBUS_ITEM_ID_REMOVE:	Connection ID changes, carry
+ *				kdbus_notify_id_change
+ *
+ * For kdbus_notify_{id,name}_change structs, only the ID and name fields
+ * are looked at when adding an entry. The flags are unused.
+ *
+ * Also note that KDBUS_ITEM_BLOOM_MASK, KDBUS_ITEM_NAME and KDBUS_ITEM_ID
+ * are used to match messages from userspace, while the others apply to
+ * kernel-generated notifications.
+ *
+ * Return: 0 on success, negative errno on failure
+ */
+int kdbus_match_db_add(struct kdbus_conn *conn,
+		       struct kdbus_cmd_match *cmd)
+{
+	struct kdbus_match_entry *entry = NULL;
+	struct kdbus_match_db *mdb = conn->match_db;
+	struct kdbus_item *item;
+	int ret = 0;
+
+	lockdep_assert_held(conn);
+
+	entry = kzalloc(sizeof(*entry), GFP_KERNEL);
+	if (!entry)
+		return -ENOMEM;
+
+	entry->cookie = cmd->cookie;
+	INIT_LIST_HEAD(&entry->list_entry);
+	INIT_LIST_HEAD(&entry->rules_list);
+
+	KDBUS_ITEMS_FOREACH(item, cmd->items, KDBUS_ITEMS_SIZE(cmd, items)) {
+		struct kdbus_match_rule *rule;
+		size_t size = item->size - offsetof(struct kdbus_item, data);
+
+		rule = kzalloc(sizeof(*rule), GFP_KERNEL);
+		if (!rule) {
+			ret = -ENOMEM;
+			break;
+		}
+
+		switch (item->type) {
+		/* First matches for userspace messages */
+		case KDBUS_ITEM_BLOOM_MASK: {
+			u64 bsize = conn->ep->bus->bloom.size;
+			u64 generations;
+			u64 remainder;
+
+			generations = div64_u64_rem(size, bsize, &remainder);
+			if (size < bsize || remainder > 0) {
+				ret = -EDOM;
+				break;
+			}
+
+			rule->bloom_mask.data = kmemdup(item->data,
+							size, GFP_KERNEL);
+			if (!rule->bloom_mask.data) {
+				ret = -ENOMEM;
+				break;
+			}
+
+			/* we get an array of n generations of bloom masks */
+			rule->bloom_mask.generations = generations;
+
+			break;
+		}
+
+		case KDBUS_ITEM_NAME:
+			/*
+			 * Do not allow wildcard for now, since we
+			 * must validate the wildcard first
+			 */
+			if (!kdbus_name_is_valid(item->str, false)) {
+				ret = -EINVAL;
+				break;
+			}
+
+			rule->name = kstrdup(item->str, GFP_KERNEL);
+			if (!rule->name)
+				ret = -ENOMEM;
+
+			break;
+
+		case KDBUS_ITEM_ID:
+			rule->src_id = item->id;
+			break;
+
+		/* Now matches for kernel messages */
+		case KDBUS_ITEM_NAME_ADD:
+		case KDBUS_ITEM_NAME_REMOVE:
+		case KDBUS_ITEM_NAME_CHANGE: {
+			rule->old_id = item->name_change.old_id.id;
+			rule->new_id = item->name_change.new_id.id;
+
+			if (size > sizeof(struct kdbus_notify_name_change)) {
+				rule->name = kstrdup(item->name_change.name,
+						     GFP_KERNEL);
+				if (!rule->name)
+					ret = -ENOMEM;
+			}
+
+			break;
+		}
+
+		case KDBUS_ITEM_ID_ADD:
+		case KDBUS_ITEM_ID_REMOVE:
+			if (item->type == KDBUS_ITEM_ID_ADD)
+				rule->new_id = item->id_change.id;
+			else
+				rule->old_id = item->id_change.id;
+
+			break;
+
+		default:
+			ret = -EINVAL;
+			break;
+		}
+
+		if (ret < 0) {
+			kfree(rule);
+			break;
+		}
+
+		rule->type = item->type;
+
+		list_add_tail(&rule->rules_entry, &entry->rules_list);
+	}
+
+	down_write(&mdb->mdb_rwlock);
+
+	/* Remove any entry that has the same cookie as the current one. */
+	if (cmd->flags & KDBUS_MATCH_REPLACE)
+		kdbus_match_db_remove_unlocked(mdb, entry->cookie);
+
+	/*
+	 * If the above removal caught any entry, there will be room for the
+	 * new one.
+	 */
+	if (ret == 0 && ++mdb->entries_count > KDBUS_MATCH_MAX) {
+		--mdb->entries_count;
+		ret = -EMFILE;
+	}
+
+	if (ret == 0)
+		list_add_tail(&entry->list_entry, &mdb->entries_list);
+	else
+		kdbus_match_entry_free(entry);
+
+	up_write(&mdb->mdb_rwlock);
+
+	return ret;
+}
+
+/**
+ * kdbus_match_db_remove() - remove an entry from the match database
+ * @conn:		The connection that was used in the ioctl call
+ * @cmd:		Pointer to the match data structure
+ *
+ * This function is used in the context of the KDBUS_CMD_MATCH_REMOVE
+ * ioctl interface.
+ *
+ * Return: 0 on success, negative errno on failure.
+ */
+int kdbus_match_db_remove(struct kdbus_conn *conn,
+			  struct kdbus_cmd_match *cmd)
+{
+	struct kdbus_match_db *mdb = conn->match_db;
+	int ret;
+
+	lockdep_assert_held(conn);
+
+	down_write(&mdb->mdb_rwlock);
+	ret = kdbus_match_db_remove_unlocked(mdb, cmd->cookie);
+	up_write(&mdb->mdb_rwlock);
+
+	return ret;
+}
diff --git a/ipc/kdbus/match.h b/ipc/kdbus/match.h
new file mode 100644
index 000000000000..16b982ce607d
--- /dev/null
+++ b/ipc/kdbus/match.h
@@ -0,0 +1,31 @@
+/*
+ * Copyright (C) 2013-2014 Kay Sievers
+ * Copyright (C) 2013-2014 Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+ * Copyright (C) 2013-2014 Daniel Mack <daniel@zonque.org>
+ * Copyright (C) 2013-2014 David Herrmann <dh.herrmann@gmail.com>
+ * Copyright (C) 2013-2014 Linux Foundation
+ *
+ * kdbus is free software; you can redistribute it and/or modify it under
+ * the terms of the GNU Lesser General Public License as published by the
+ * Free Software Foundation; either version 2.1 of the License, or (at
+ * your option) any later version.
+ */
+
+#ifndef __KDBUS_MATCH_H
+#define __KDBUS_MATCH_H
+
+struct kdbus_conn;
+struct kdbus_kmsg;
+struct kdbus_match_db;
+
+struct kdbus_match_db *kdbus_match_db_new(void);
+void kdbus_match_db_free(struct kdbus_match_db *db);
+int kdbus_match_db_add(struct kdbus_conn *conn,
+		       struct kdbus_cmd_match *cmd);
+int kdbus_match_db_remove(struct kdbus_conn *conn,
+			  struct kdbus_cmd_match *cmd);
+bool kdbus_match_db_match_kmsg(struct kdbus_match_db *db,
+			       struct kdbus_conn *conn_src,
+			       struct kdbus_kmsg *kmsg);
+
+#endif
diff --git a/ipc/kdbus/notify.c b/ipc/kdbus/notify.c
new file mode 100644
index 000000000000..e212ec11927c
--- /dev/null
+++ b/ipc/kdbus/notify.c
@@ -0,0 +1,235 @@
+/*
+ * Copyright (C) 2013-2014 Kay Sievers
+ * Copyright (C) 2013-2014 Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+ * Copyright (C) 2013-2014 Daniel Mack <daniel@zonque.org>
+ * Copyright (C) 2013-2014 David Herrmann <dh.herrmann@gmail.com>
+ * Copyright (C) 2013-2014 Linux Foundation
+ *
+ * kdbus is free software; you can redistribute it and/or modify it under
+ * the terms of the GNU Lesser General Public License as published by the
+ * Free Software Foundation; either version 2.1 of the License, or (at
+ * your option) any later version.
+ */
+
+#include <linux/fs.h>
+#include <linux/init.h>
+#include <linux/module.h>
+#include <linux/mutex.h>
+#include <linux/spinlock.h>
+#include <linux/sched.h>
+#include <linux/slab.h>
+
+#include "bus.h"
+#include "connection.h"
+#include "domain.h"
+#include "endpoint.h"
+#include "item.h"
+#include "message.h"
+#include "notify.h"
+
+static int kdbus_notify_reply(struct kdbus_bus *bus, u64 id,
+			      u64 cookie, u64 msg_type)
+{
+	struct kdbus_kmsg *kmsg = NULL;
+
+	BUG_ON(id == 0);
+
+	kmsg = kdbus_kmsg_new(0);
+	if (IS_ERR(kmsg))
+		return PTR_ERR(kmsg);
+
+	/*
+	 * a kernel-generated notification can only contain one
+	 * struct kdbus_item, so make a shortcut here for
+	 * faster lookup in the match db.
+	 */
+	kmsg->notify_type = msg_type;
+	kmsg->msg.dst_id = id;
+	kmsg->msg.src_id = KDBUS_SRC_ID_KERNEL;
+	kmsg->msg.payload_type = KDBUS_PAYLOAD_KERNEL;
+	kmsg->msg.cookie_reply = cookie;
+	kmsg->msg.items[0].type = msg_type;
+
+	spin_lock(&bus->notify_lock);
+	list_add_tail(&kmsg->notify_entry, &bus->notify_list);
+	spin_unlock(&bus->notify_lock);
+	return 0;
+}
+
+/**
+ * kdbus_notify_reply_timeout() - queue a timeout reply
+ * @bus:		Bus which queues the messages
+ * @id:			The destination's connection ID
+ * @cookie:		The cookie to set in the reply.
+ *
+ * Queues a message that has a KDBUS_ITEM_REPLY_TIMEOUT item attached.
+ *
+ * Return: 0 on success, negative errno on failure.
+ */
+int kdbus_notify_reply_timeout(struct kdbus_bus *bus, u64 id, u64 cookie)
+{
+	return kdbus_notify_reply(bus, id, cookie, KDBUS_ITEM_REPLY_TIMEOUT);
+}
+
+/**
+ * kdbus_notify_reply_dead() - queue a 'dead' reply
+ * @bus:		Bus which queues the messages
+ * @id:			The destination's connection ID
+ * @cookie:		The cookie to set in the reply.
+ *
+ * Queues a message that has a KDBUS_ITEM_REPLY_DEAD item attached.
+ *
+ * Return: 0 on success, negative errno on failure.
+ */
+int kdbus_notify_reply_dead(struct kdbus_bus *bus, u64 id, u64 cookie)
+{
+	return kdbus_notify_reply(bus, id, cookie, KDBUS_ITEM_REPLY_DEAD);
+}
+
+/**
+ * kdbus_notify_name_change() - queue a notification about a name owner change
+ * @bus:		Bus which queues the messages
+ * @type:		The type of the notification; KDBUS_ITEM_NAME_ADD,
+ *			KDBUS_ITEM_NAME_CHANGE or KDBUS_ITEM_NAME_REMOVE
+ * @old_id:		The id of the connection that used to own the name
+ * @new_id:		The id of the new owner connection
+ * @old_flags:		The flags to pass in the KDBUS_ITEM flags field for
+ *			the old owner
+ * @new_flags:		The flags to pass in the KDBUS_ITEM flags field for
+ *			the new owner
+ * @name:		The name that was removed or assigned to a new owner
+ *
+ * Return: 0 on success, negative errno on failure.
+ */
+int kdbus_notify_name_change(struct kdbus_bus *bus, u64 type,
+			     u64 old_id, u64 new_id,
+			     u64 old_flags, u64 new_flags,
+			     const char *name)
+{
+	struct kdbus_kmsg *kmsg = NULL;
+	size_t name_len, extra_size;
+
+	name_len = strlen(name) + 1;
+	extra_size = sizeof(struct kdbus_notify_name_change) + name_len;
+	kmsg = kdbus_kmsg_new(extra_size);
+	if (IS_ERR(kmsg))
+		return PTR_ERR(kmsg);
+
+	kmsg->msg.dst_id = KDBUS_DST_ID_BROADCAST;
+	kmsg->msg.src_id = KDBUS_SRC_ID_KERNEL;
+	kmsg->msg.payload_type = KDBUS_PAYLOAD_KERNEL;
+	kmsg->notify_type = type;
+	kmsg->notify_old_id = old_id;
+	kmsg->notify_new_id = new_id;
+	kmsg->msg.items[0].type = type;
+	kmsg->msg.items[0].name_change.old_id.id = old_id;
+	kmsg->msg.items[0].name_change.old_id.flags = old_flags;
+	kmsg->msg.items[0].name_change.new_id.id = new_id;
+	kmsg->msg.items[0].name_change.new_id.flags = new_flags;
+	memcpy(kmsg->msg.items[0].name_change.name, name, name_len);
+	kmsg->notify_name = kmsg->msg.items[0].name_change.name;
+
+	spin_lock(&bus->notify_lock);
+	list_add_tail(&kmsg->notify_entry, &bus->notify_list);
+	spin_unlock(&bus->notify_lock);
+	return 0;
+}
+
+/**
+ * kdbus_notify_id_change() - queue a notification about a unique ID change
+ * @bus:		Bus which queues the messages
+ * @type:		The type of the notification; KDBUS_ITEM_ID_ADD or
+ *			KDBUS_ITEM_ID_REMOVE
+ * @id:			The id of the connection that was added or removed
+ * @flags:		The flags to pass in the KDBUS_ITEM flags field
+ *
+ * Return: 0 on success, negative errno on failure.
+ */
+int kdbus_notify_id_change(struct kdbus_bus *bus, u64 type, u64 id, u64 flags)
+{
+	struct kdbus_kmsg *kmsg = NULL;
+
+	kmsg = kdbus_kmsg_new(sizeof(struct kdbus_notify_id_change));
+	if (IS_ERR(kmsg))
+		return PTR_ERR(kmsg);
+
+	kmsg->msg.dst_id = KDBUS_DST_ID_BROADCAST;
+	kmsg->msg.src_id = KDBUS_SRC_ID_KERNEL;
+	kmsg->msg.payload_type = KDBUS_PAYLOAD_KERNEL;
+	kmsg->notify_type = type;
+
+	switch (type) {
+	case KDBUS_ITEM_ID_ADD:
+		kmsg->notify_new_id = id;
+		break;
+
+	case KDBUS_ITEM_ID_REMOVE:
+		kmsg->notify_old_id = id;
+		break;
+
+	default:
+		BUG();
+	}
+
+	kmsg->msg.items[0].type = type;
+	kmsg->msg.items[0].id_change.id = id;
+	kmsg->msg.items[0].id_change.flags = flags;
+
+	spin_lock(&bus->notify_lock);
+	list_add_tail(&kmsg->notify_entry, &bus->notify_list);
+	spin_unlock(&bus->notify_lock);
+	return 0;
+}
+
+/**
+ * kdbus_notify_flush() - send a list of collected messages
+ * @bus:		Bus which queues the messages
+ *
+ * The list is empty after sending the messages.
+ */
+void kdbus_notify_flush(struct kdbus_bus *bus)
+{
+	LIST_HEAD(notify_list);
+	struct kdbus_kmsg *kmsg, *tmp;
+
+	mutex_lock(&bus->notify_flush_lock);
+
+	spin_lock(&bus->notify_lock);
+	list_splice_init(&bus->notify_list, &notify_list);
+	spin_unlock(&bus->notify_lock);
+
+	list_for_each_entry_safe(kmsg, tmp, &notify_list, notify_entry) {
+		kmsg->seq = atomic64_inc_return(&bus->domain->msg_seq_last);
+
+		if (kmsg->msg.dst_id != KDBUS_DST_ID_BROADCAST) {
+			struct kdbus_conn *conn;
+
+			conn = kdbus_bus_find_conn_by_id(bus, kmsg->msg.dst_id);
+			if (conn) {
+				kdbus_conn_entry_insert(NULL, conn, kmsg, NULL);
+				kdbus_conn_unref(conn);
+			}
+		} else {
+			kdbus_bus_broadcast(bus, NULL, kmsg);
+		}
+
+		list_del(&kmsg->notify_entry);
+		kdbus_kmsg_free(kmsg);
+	}
+
+	mutex_unlock(&bus->notify_flush_lock);
+}
+
+/**
+ * kdbus_notify_free() - free a list of collected messages
+ * @bus:		Bus which queues the messages
+ */
+void kdbus_notify_free(struct kdbus_bus *bus)
+{
+	struct kdbus_kmsg *kmsg, *tmp;
+
+	list_for_each_entry_safe(kmsg, tmp, &bus->notify_list, notify_entry) {
+		list_del(&kmsg->notify_entry);
+		kdbus_kmsg_free(kmsg);
+	}
+}
diff --git a/ipc/kdbus/notify.h b/ipc/kdbus/notify.h
new file mode 100644
index 000000000000..95d3711e49f3
--- /dev/null
+++ b/ipc/kdbus/notify.h
@@ -0,0 +1,29 @@
+/*
+ * Copyright (C) 2013-2014 Kay Sievers
+ * Copyright (C) 2013-2014 Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+ * Copyright (C) 2013-2014 Daniel Mack <daniel@zonque.org>
+ * Copyright (C) 2013-2014 David Herrmann <dh.herrmann@gmail.com>
+ * Copyright (C) 2013-2014 Linux Foundation
+ *
+ * kdbus is free software; you can redistribute it and/or modify it under
+ * the terms of the GNU Lesser General Public License as published by the
+ * Free Software Foundation; either version 2.1 of the License, or (at
+ * your option) any later version.
+ */
+
+#ifndef __KDBUS_NOTIFY_H
+#define __KDBUS_NOTIFY_H
+
+struct kdbus_bus;
+
+int kdbus_notify_id_change(struct kdbus_bus *bus, u64 type, u64 id, u64 flags);
+int kdbus_notify_reply_timeout(struct kdbus_bus *bus, u64 id, u64 cookie);
+int kdbus_notify_reply_dead(struct kdbus_bus *bus, u64 id, u64 cookie);
+int kdbus_notify_name_change(struct kdbus_bus *bus, u64 type,
+			     u64 old_id, u64 new_id,
+			     u64 old_flags, u64 new_flags,
+			     const char *name);
+void kdbus_notify_flush(struct kdbus_bus *bus);
+void kdbus_notify_free(struct kdbus_bus *bus);
+
+#endif
-- 
2.1.3



* kdbus: add code for buses, domains and endpoints
From: Greg Kroah-Hartman @ 2014-11-21  5:02 UTC
  To: arnd, ebiederm, gnomes, teg, jkosina, luto, linux-api, linux-kernel
  Cc: daniel, dh.herrmann, tixxdz, Greg Kroah-Hartman

From: Daniel Mack <daniel@zonque.org>

Add the logic to handle the following entities:

Domain:
  A domain is an unnamed object containing a number of buses. A
  domain is automatically created when an instance of kdbusfs
  is mounted, and destroyed when it is unmounted.
  Every domain offers its own "control" device node to create
  buses.  Domains have no connection to each other and cannot
  see nor talk to each other.

Bus:
  A bus is a named object inside a domain. Clients exchange messages
  over a bus. Multiple buses themselves have no connection to each
  other; messages can only be exchanged on the same bus. The default
  entry point to a bus, through which clients establish a connection,
  is the "bus" device node /dev/kdbus/<bus name>/bus.  Common operating
  system setups create one "system bus" per system, and one "user
  bus" for every logged-in user. Applications or services may create
  their own private named buses.

Endpoint:
  An endpoint provides the device node to talk to a bus. Opening an
  endpoint creates a new connection to the bus to which the endpoint
  belongs. Every bus has a default endpoint called "bus". A bus can
  optionally offer additional endpoints with custom names to provide
  restricted access to the same bus. Custom endpoints carry
  additional policy which can be used to give sandboxed processes
  only locked-down, limited, filtered access to the same bus.
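
As a rough illustration of the naming scheme described above, the
following userspace sketch checks that a bus name carries the "$UID-"
prefix (the rule kdbus_bus_new() in this patch is meant to enforce) and
builds the path of the default "bus" endpoint node. The helpers
bus_name_has_uid_prefix() and bus_default_ep_path() are hypothetical
names used only for this example and are not part of this series; the
/dev/kdbus/<bus name>/bus layout is taken from the description above.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of kdbus): returns true if @name starts
 * with "<uid>-", mirroring the "$UID-" prefix rule for bus names.
 */
static bool bus_name_has_uid_prefix(const char *name, unsigned int uid)
{
	char prefix[16];

	snprintf(prefix, sizeof(prefix), "%u-", uid);

	/* compare the whole prefix, not only its first byte */
	return strncmp(name, prefix, strlen(prefix)) == 0;
}

/*
 * Hypothetical helper (not part of kdbus): builds the path of the
 * default "bus" endpoint of @bus_name. Returns 0 on success, -1 if
 * @buf is too small to hold the path.
 */
static int bus_default_ep_path(char *buf, size_t len, const char *bus_name)
{
	int n = snprintf(buf, len, "/dev/kdbus/%s/bus", bus_name);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

With these assumptions, a process running as uid 1000 would name its
bus something like "1000-user" and clients would connect by opening
/dev/kdbus/1000-user/bus. A name such as "1-system" must be rejected
for that uid, which only happens if the full prefix is compared.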

See Documentation/kdbus.txt for more details.

Signed-off-by: Daniel Mack <daniel@zonque.org>
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Signed-off-by: Djalal Harouni <tixxdz@opendz.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 ipc/kdbus/bus.c      | 459 +++++++++++++++++++++++++++++++++++++++++++++++
 ipc/kdbus/bus.h      |  98 ++++++++++
 ipc/kdbus/domain.c   | 349 ++++++++++++++++++++++++++++++++++++
 ipc/kdbus/domain.h   |  84 +++++++++
 ipc/kdbus/endpoint.c | 497 +++++++++++++++++++++++++++++++++++++++++++++++++++
 ipc/kdbus/endpoint.h |  91 ++++++++++
 6 files changed, 1578 insertions(+)
 create mode 100644 ipc/kdbus/bus.c
 create mode 100644 ipc/kdbus/bus.h
 create mode 100644 ipc/kdbus/domain.c
 create mode 100644 ipc/kdbus/domain.h
 create mode 100644 ipc/kdbus/endpoint.c
 create mode 100644 ipc/kdbus/endpoint.h

diff --git a/ipc/kdbus/bus.c b/ipc/kdbus/bus.c
new file mode 100644
index 000000000000..2808a2e87707
--- /dev/null
+++ b/ipc/kdbus/bus.c
@@ -0,0 +1,459 @@
+/*
+ * Copyright (C) 2013-2014 Kay Sievers
+ * Copyright (C) 2013-2014 Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+ * Copyright (C) 2013-2014 Daniel Mack <daniel@zonque.org>
+ * Copyright (C) 2013-2014 David Herrmann <dh.herrmann@gmail.com>
+ * Copyright (C) 2013-2014 Linux Foundation
+ *
+ * kdbus is free software; you can redistribute it and/or modify it under
+ * the terms of the GNU Lesser General Public License as published by the
+ * Free Software Foundation; either version 2.1 of the License, or (at
+ * your option) any later version.
+ */
+
+#include <linux/fs.h>
+#include <linux/hashtable.h>
+#include <linux/init.h>
+#include <linux/module.h>
+#include <linux/random.h>
+#include <linux/sched.h>
+#include <linux/sizes.h>
+#include <linux/slab.h>
+#include <linux/uaccess.h>
+
+#include "bus.h"
+#include "notify.h"
+#include "connection.h"
+#include "domain.h"
+#include "endpoint.h"
+#include "item.h"
+#include "match.h"
+#include "message.h"
+#include "metadata.h"
+#include "names.h"
+#include "policy.h"
+
+static void kdbus_bus_free(struct kdbus_node *node)
+{
+	struct kdbus_bus *bus = container_of(node, struct kdbus_bus, node);
+
+	BUG_ON(!list_empty(&bus->monitors_list));
+	BUG_ON(!hash_empty(bus->conn_hash));
+
+	kdbus_notify_free(bus);
+
+	kdbus_domain_user_unref(bus->user);
+	kdbus_name_registry_free(bus->name_registry);
+	kdbus_domain_unref(bus->domain);
+	kdbus_policy_db_clear(&bus->policy_db);
+	kdbus_meta_free(bus->meta);
+	kfree(bus);
+}
+
+static void kdbus_bus_release(struct kdbus_node *node, bool was_active)
+{
+	struct kdbus_bus *bus = container_of(node, struct kdbus_bus, node);
+
+	if (was_active)
+		atomic_dec(&bus->user->buses);
+}
+
+/**
+ * kdbus_bus_new() - create a new kdbus_bus from user-supplied data
+ * @domain:		The domain to work on
+ * @make:		Information as passed in by userspace
+ * @uid:		The uid of the device node
+ * @gid:		The gid of the device node
+ *
+ * This function is part of the connection ioctl() interface and will parse
+ * the user-supplied data in order to create a new kdbus_bus.
+ *
+ * Return: the new bus on success, ERR_PTR on failure.
+ */
+struct kdbus_bus *kdbus_bus_new(struct kdbus_domain *domain,
+				const struct kdbus_cmd_make *make,
+				kuid_t uid, kgid_t gid)
+{
+	const struct kdbus_bloom_parameter *bloom = NULL;
+	const struct kdbus_item *item;
+	struct kdbus_bus *b;
+	const char *name = NULL;
+	char prefix[16];
+	int ret;
+
+	u64 attach_flags = 0;
+
+	KDBUS_ITEMS_FOREACH(item, make->items, KDBUS_ITEMS_SIZE(make, items)) {
+		switch (item->type) {
+		case KDBUS_ITEM_MAKE_NAME:
+			if (name)
+				return ERR_PTR(-EEXIST);
+
+			name = item->str;
+			break;
+
+		case KDBUS_ITEM_BLOOM_PARAMETER:
+			if (bloom)
+				return ERR_PTR(-EEXIST);
+
+			bloom = &item->bloom_parameter;
+			break;
+
+		case KDBUS_ITEM_ATTACH_FLAGS_RECV:
+			if (attach_flags)
+				return ERR_PTR(-EEXIST);
+
+			attach_flags = item->data64[0];
+			break;
+		}
+	}
+
+	if (!name || !bloom)
+		return ERR_PTR(-EBADMSG);
+
+	/* 'any' degrades to 'all' for compatibility */
+	if (attach_flags == _KDBUS_ATTACH_ANY)
+		attach_flags = _KDBUS_ATTACH_ALL;
+
+	/* reject unknown attach flags */
+	if (attach_flags & ~_KDBUS_ATTACH_ALL)
+		return ERR_PTR(-EINVAL);
+	if (bloom->size < 8 || bloom->size > KDBUS_BUS_BLOOM_MAX_SIZE)
+		return ERR_PTR(-EINVAL);
+	if (!KDBUS_IS_ALIGNED8(bloom->size))
+		return ERR_PTR(-EINVAL);
+	if (bloom->n_hash < 1)
+		return ERR_PTR(-EINVAL);
+
+	/* enforce "$UID-" prefix */
+	snprintf(prefix, sizeof(prefix), "%u-",
+		 from_kuid(domain->user_namespace, uid));
+	if (strncmp(name, prefix, strlen(prefix)) != 0)
+		return ERR_PTR(-EINVAL);
+
+	b = kzalloc(sizeof(*b), GFP_KERNEL);
+	if (!b)
+		return ERR_PTR(-ENOMEM);
+
+	kdbus_node_init(&b->node, KDBUS_NODE_BUS);
+
+	b->node.free_cb = kdbus_bus_free;
+	b->node.release_cb = kdbus_bus_release;
+	b->node.uid = uid;
+	b->node.gid = gid;
+	b->node.mode = S_IRUSR | S_IXUSR;
+
+	b->access = make->flags & (KDBUS_MAKE_ACCESS_WORLD |
+				   KDBUS_MAKE_ACCESS_GROUP);
+	if (b->access & (KDBUS_MAKE_ACCESS_GROUP | KDBUS_MAKE_ACCESS_WORLD))
+		b->node.mode |= S_IRGRP | S_IXGRP;
+	if (b->access & KDBUS_MAKE_ACCESS_WORLD)
+		b->node.mode |= S_IROTH | S_IXOTH;
+
+	b->bus_flags = make->flags;
+	b->bloom = *bloom;
+	b->attach_flags_req = attach_flags;
+	mutex_init(&b->lock);
+	init_rwsem(&b->conn_rwlock);
+	hash_init(b->conn_hash);
+	INIT_LIST_HEAD(&b->monitors_list);
+	INIT_LIST_HEAD(&b->notify_list);
+	spin_lock_init(&b->notify_lock);
+	mutex_init(&b->notify_flush_lock);
+	atomic64_set(&b->conn_seq_last, 0);
+	b->domain = kdbus_domain_ref(domain);
+	kdbus_policy_db_init(&b->policy_db);
+	b->id = atomic64_inc_return(&domain->bus_seq_last);
+
+	/* generate unique bus id */
+	generate_random_uuid(b->id128);
+
+	ret = kdbus_node_link(&b->node, &domain->node, name);
+	if (ret < 0)
+		goto exit_unref;
+
+	/* cache the metadata/credentials of the creator */
+	b->meta = kdbus_meta_new();
+	if (IS_ERR(b->meta)) {
+		ret = PTR_ERR(b->meta);
+		b->meta = NULL;
+		goto exit_unref;
+	}
+
+	ret = kdbus_meta_append(b->meta, domain, NULL, 0,
+				KDBUS_ATTACH_CREDS	|
+				KDBUS_ATTACH_TID_COMM	|
+				KDBUS_ATTACH_PID_COMM	|
+				KDBUS_ATTACH_EXE	|
+				KDBUS_ATTACH_CMDLINE	|
+				KDBUS_ATTACH_CGROUP	|
+				KDBUS_ATTACH_CAPS	|
+				KDBUS_ATTACH_SECLABEL	|
+				KDBUS_ATTACH_AUDIT);
+	if (ret < 0)
+		goto exit_unref;
+
+	b->name_registry = kdbus_name_registry_new();
+	if (IS_ERR(b->name_registry)) {
+		ret = PTR_ERR(b->name_registry);
+		b->name_registry = NULL;
+		goto exit_unref;
+	}
+
+	b->user = kdbus_domain_get_user(domain, uid);
+	if (IS_ERR(b->user)) {
+		ret = PTR_ERR(b->user);
+		b->user = NULL;
+		goto exit_unref;
+	}
+
+	return b;
+
+exit_unref:
+	kdbus_node_deactivate(&b->node);
+	kdbus_node_unref(&b->node);
+	return ERR_PTR(ret);
+}
+
+/**
+ * kdbus_bus_ref() - increase the reference counter of a kdbus_bus
+ * @bus:		The bus to reference
+ *
+ * Every user of a bus, except for its creator, must add a reference to the
+ * kdbus_bus using this function.
+ *
+ * Return: the bus itself
+ */
+struct kdbus_bus *kdbus_bus_ref(struct kdbus_bus *bus)
+{
+	if (bus)
+		kdbus_node_ref(&bus->node);
+	return bus;
+}
+
+/**
+ * kdbus_bus_unref() - decrease the reference counter of a kdbus_bus
+ * @bus:		The bus to unref
+ *
+ * Release a reference. If the reference count drops to 0, the bus will be
+ * freed.
+ *
+ * Return: NULL
+ */
+struct kdbus_bus *kdbus_bus_unref(struct kdbus_bus *bus)
+{
+	if (bus)
+		kdbus_node_unref(&bus->node);
+	return NULL;
+}
+
+/**
+ * kdbus_bus_activate() - activate a bus
+ * @bus:		Bus
+ *
+ * Activate a bus and make it available to user-space.
+ *
+ * Return: 0 on success, negative error code on failure
+ */
+int kdbus_bus_activate(struct kdbus_bus *bus)
+{
+	struct kdbus_ep *ep;
+	int ret;
+
+	if (atomic_inc_return(&bus->user->buses) > KDBUS_USER_MAX_BUSES) {
+		atomic_dec(&bus->user->buses);
+		return -EMFILE;
+	}
+
+	/*
+	 * kdbus_bus_activate() must not be called multiple times, so if
+	 * kdbus_node_activate() didn't activate the node, it must already be
+	 * dead.
+	 */
+	if (!kdbus_node_activate(&bus->node)) {
+		atomic_dec(&bus->user->buses);
+		return -ESHUTDOWN;
+	}
+
+	/*
+	 * Create a new default endpoint for this bus. If activation succeeds,
+	 * we drop our own reference, effectively causing the endpoint to be
+	 * deactivated and released when the parent domain is.
+	 */
+	ep = kdbus_ep_new(bus, "bus", bus->access,
+			  bus->node.uid, bus->node.gid, false);
+	if (IS_ERR(ep))
+		return PTR_ERR(ep);
+
+	ret = kdbus_ep_activate(ep);
+	if (ret < 0)
+		kdbus_ep_deactivate(ep);
+	kdbus_ep_unref(ep);
+
+	return ret;
+}
+
+/**
+ * kdbus_bus_deactivate() - deactivate a bus
+ * @bus:               The kdbus reference
+ *
+ * The passed bus will be disconnected and the associated endpoint will be
+ * unref'ed.
+ */
+void kdbus_bus_deactivate(struct kdbus_bus *bus)
+{
+	kdbus_node_deactivate(&bus->node);
+}
+
+/**
+ * kdbus_bus_find_conn_by_id() - find a connection with a given id
+ * @bus:		The bus to look for the connection
+ * @id:			The 64-bit connection id
+ *
+ * Looks up a connection with a given id. The returned connection
+ * is ref'ed, and needs to be unref'ed by the user. Returns NULL if
+ * the connection can't be found.
+ */
+struct kdbus_conn *kdbus_bus_find_conn_by_id(struct kdbus_bus *bus, u64 id)
+{
+	struct kdbus_conn *conn, *found = NULL;
+
+	down_read(&bus->conn_rwlock);
+	hash_for_each_possible(bus->conn_hash, conn, hentry, id)
+		if (conn->id == id) {
+			found = kdbus_conn_ref(conn);
+			break;
+		}
+	up_read(&bus->conn_rwlock);
+
+	return found;
+}
+
+/**
+ * kdbus_bus_broadcast() - send a message to all subscribed connections
+ * @bus:	The bus the connections are connected to
+ * @conn_src:	The source connection, may be %NULL for kernel notifications
+ * @kmsg:	The message to send.
+ *
+ * Send @kmsg to all connections that are currently active on the bus.
+ * Connections must still have a matching entry installed in their match
+ * database in order to receive the message.
+ */
+void kdbus_bus_broadcast(struct kdbus_bus *bus,
+			 struct kdbus_conn *conn_src,
+			 struct kdbus_kmsg *kmsg)
+{
+	const struct kdbus_msg *msg = &kmsg->msg;
+	struct kdbus_conn *conn_dst;
+	unsigned int i;
+	int ret = 0;
+
+	down_read(&bus->conn_rwlock);
+
+	hash_for_each(bus->conn_hash, i, conn_dst, hentry) {
+		if (conn_dst->id == msg->src_id)
+			continue;
+
+		/*
+		 * Activator or policy holder connections will
+		 * not receive any broadcast messages, only
+		 * ordinary and monitor ones.
+		 */
+		if (!kdbus_conn_is_ordinary(conn_dst) &&
+		    !kdbus_conn_is_monitor(conn_dst))
+			continue;
+
+		if (!kdbus_match_db_match_kmsg(conn_dst->match_db, conn_src,
+					       kmsg))
+			continue;
+
+		ret = kdbus_ep_policy_check_notification(conn_dst->ep,
+							 conn_dst, kmsg);
+		if (ret < 0)
+			continue;
+
+		/*
+		 * The first receiver which requests additional metadata
+		 * causes the message to carry it; data that is in fact added
+		 * to the message is still subject to what the receiver
+		 * requested, and will be filtered by kdbus_meta_write().
+		 */
+		if (conn_src) {
+			/* Check if conn_src is allowed to signal */
+			ret = kdbus_ep_policy_check_broadcast(conn_dst->ep,
+							      conn_src,
+							      conn_dst);
+			if (ret < 0)
+				continue;
+
+			ret = kdbus_ep_policy_check_src_names(conn_dst->ep,
+							      conn_src,
+							      conn_dst);
+			if (ret < 0)
+				continue;
+
+			ret = kdbus_kmsg_attach_metadata(kmsg, conn_src,
+							 conn_dst);
+			if (ret < 0)
+				goto exit_unlock;
+		}
+
+		ret = kdbus_conn_entry_insert(conn_src, conn_dst, kmsg, NULL);
+		if (ret < 0)
+			atomic_inc(&conn_dst->lost_count);
+	}
+
+exit_unlock:
+	up_read(&bus->conn_rwlock);
+}
+
+
+/**
+ * kdbus_cmd_bus_creator_info() - get information on a bus creator
+ * @conn:	The querying connection
+ * @cmd_info:	The command buffer, as passed in from the ioctl
+ *
+ * Gather information on the creator of the bus @conn is connected to.
+ *
+ * Return: 0 on success, error otherwise.
+ */
+int kdbus_cmd_bus_creator_info(struct kdbus_conn *conn,
+			       struct kdbus_cmd_info *cmd_info)
+{
+	struct kdbus_bus *bus = conn->ep->bus;
+	struct kdbus_pool_slice *slice;
+	struct kdbus_info info = {};
+	u64 flags = cmd_info->flags;
+	int ret;
+
+	info.id = bus->id;
+	info.flags = bus->bus_flags;
+	info.size = sizeof(info) +
+		    kdbus_meta_size(bus->meta, conn, &flags);
+
+	if (info.size == 0)
+		return -EPERM;
+
+	slice = kdbus_pool_slice_alloc(conn->pool, info.size);
+	if (IS_ERR(slice))
+		return PTR_ERR(slice);
+
+	ret = kdbus_pool_slice_copy(slice, 0, &info, sizeof(info));
+	if (ret < 0)
+		goto exit_free_slice;
+
+	ret = kdbus_meta_write(bus->meta, conn, flags, slice, sizeof(info));
+	if (ret < 0)
+		goto exit_free_slice;
+
+	/* write back the offset */
+	cmd_info->offset = kdbus_pool_slice_offset(slice);
+	kdbus_pool_slice_flush(slice);
+	kdbus_pool_slice_make_public(slice);
+
+	return 0;
+
+exit_free_slice:
+	kdbus_pool_slice_free(slice);
+	return ret;
+}
diff --git a/ipc/kdbus/bus.h b/ipc/kdbus/bus.h
new file mode 100644
index 000000000000..cd45b128acad
--- /dev/null
+++ b/ipc/kdbus/bus.h
@@ -0,0 +1,98 @@
+/*
+ * Copyright (C) 2013-2014 Kay Sievers
+ * Copyright (C) 2013-2014 Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+ * Copyright (C) 2013-2014 Daniel Mack <daniel@zonque.org>
+ * Copyright (C) 2013-2014 David Herrmann <dh.herrmann@gmail.com>
+ * Copyright (C) 2013-2014 Linux Foundation
+ *
+ * kdbus is free software; you can redistribute it and/or modify it under
+ * the terms of the GNU Lesser General Public License as published by the
+ * Free Software Foundation; either version 2.1 of the License, or (at
+ * your option) any later version.
+ */
+
+#ifndef __KDBUS_BUS_H
+#define __KDBUS_BUS_H
+
+#include <linux/hashtable.h>
+#include <linux/spinlock.h>
+#include <linux/rwsem.h>
+
+#include "node.h"
+#include "policy.h"
+#include "util.h"
+
+/**
+ * struct kdbus_bus - bus in a domain
+ * @node:		kdbus_node
+ * @disconnected:	Invalidated data
+ * @domain:		Domain of this bus
+ * @id:			ID of this bus in the domain
+ * @lock:		Bus data lock
+ * @access:		The access flags for the bus directory
+ * @ep_seq_last:	Last used endpoint id sequence number
+ * @conn_seq_last:	Last used connection id sequence number
+ * @bus_flags:		Simple pass-through flags from userspace to userspace
+ * @attach_flags_req:	Attach flags required by connecting peers
+ * @name_registry:	Name registry of this bus
+ * @bloom:		Bloom parameters
+ * @id128:		Unique random 128 bit ID of this bus
+ * @user:		Owner of the bus
+ * @policy_db:		Policy database for this bus
+ * @notify_list:	List of pending kernel-generated messages
+ * @notify_lock:	Notification list lock
+ * @notify_flush_lock:	Notification flushing lock
+ * @conn_rwlock:	Read/Write lock for all lists of child connections
+ * @conn_hash:		Map of connection IDs
+ * @monitors_list:	Connections that monitor this bus
+ * @meta:		Meta information about the bus creator
+ *
+ * A bus provides a "bus" endpoint / device node.
+ *
+ * A bus is created by opening the control node and issuing the
+ * KDBUS_CMD_BUS_MAKE ioctl. Closing this file immediately destroys
+ * the bus.
+ */
+struct kdbus_bus {
+	struct kdbus_node node;
+	struct kdbus_domain *domain;
+	u64 id;
+	struct mutex lock;
+	unsigned int access;
+	atomic64_t ep_seq_last;
+	atomic64_t conn_seq_last;
+	u64 bus_flags;
+	u64 attach_flags_req;
+	struct kdbus_name_registry *name_registry;
+	struct kdbus_bloom_parameter bloom;
+	u8 id128[16];
+	struct kdbus_domain_user *user;
+	struct kdbus_policy_db policy_db;
+	struct list_head notify_list;
+	spinlock_t notify_lock;
+	struct mutex notify_flush_lock;
+
+	struct rw_semaphore conn_rwlock;
+	DECLARE_HASHTABLE(conn_hash, 8);
+	struct list_head monitors_list;
+
+	struct kdbus_meta *meta;
+};
+
+struct kdbus_kmsg;
+
+struct kdbus_bus *kdbus_bus_new(struct kdbus_domain *domain,
+				const struct kdbus_cmd_make *make,
+				kuid_t uid, kgid_t gid);
+struct kdbus_bus *kdbus_bus_ref(struct kdbus_bus *bus);
+struct kdbus_bus *kdbus_bus_unref(struct kdbus_bus *bus);
+int kdbus_bus_activate(struct kdbus_bus *bus);
+void kdbus_bus_deactivate(struct kdbus_bus *bus);
+
+int kdbus_cmd_bus_creator_info(struct kdbus_conn *conn,
+			       struct kdbus_cmd_info *cmd_info);
+struct kdbus_conn *kdbus_bus_find_conn_by_id(struct kdbus_bus *bus, u64 id);
+void kdbus_bus_broadcast(struct kdbus_bus *bus, struct kdbus_conn *conn_src,
+			 struct kdbus_kmsg *kmsg);
+
+#endif
diff --git a/ipc/kdbus/domain.c b/ipc/kdbus/domain.c
new file mode 100644
index 000000000000..0396439e4286
--- /dev/null
+++ b/ipc/kdbus/domain.c
@@ -0,0 +1,349 @@
+/*
+ * Copyright (C) 2013-2014 Kay Sievers
+ * Copyright (C) 2013-2014 Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+ * Copyright (C) 2013-2014 Daniel Mack <daniel@zonque.org>
+ * Copyright (C) 2013-2014 David Herrmann <dh.herrmann@gmail.com>
+ * Copyright (C) 2013-2014 Linux Foundation
+ *
+ * kdbus is free software; you can redistribute it and/or modify it under
+ * the terms of the GNU Lesser General Public License as published by the
+ * Free Software Foundation; either version 2.1 of the License, or (at
+ * your option) any later version.
+ */
+
+#include <linux/fs.h>
+#include <linux/idr.h>
+#include <linux/init.h>
+#include <linux/module.h>
+#include <linux/sched.h>
+#include <linux/sizes.h>
+#include <linux/slab.h>
+#include <linux/uaccess.h>
+
+#include "bus.h"
+#include "domain.h"
+#include "handle.h"
+#include "item.h"
+#include "limits.h"
+#include "util.h"
+
+static void kdbus_domain_control_free(struct kdbus_node *node)
+{
+	kfree(node);
+}
+
+static struct kdbus_node *kdbus_domain_control_new(struct kdbus_domain *domain,
+						   unsigned int access)
+{
+	struct kdbus_node *node;
+	int ret;
+
+	node = kzalloc(sizeof(*node), GFP_KERNEL);
+	if (!node)
+		return ERR_PTR(-ENOMEM);
+
+	kdbus_node_init(node, KDBUS_NODE_CONTROL);
+
+	node->free_cb = kdbus_domain_control_free;
+	node->mode = domain->node.mode;
+	node->mode = S_IRUSR | S_IWUSR;
+	if (access & (KDBUS_MAKE_ACCESS_GROUP | KDBUS_MAKE_ACCESS_WORLD))
+		node->mode |= S_IRGRP | S_IWGRP;
+	if (access & KDBUS_MAKE_ACCESS_WORLD)
+		node->mode |= S_IROTH | S_IWOTH;
+
+	ret = kdbus_node_link(node, &domain->node, "control");
+	if (ret < 0)
+		goto exit_free;
+
+	return node;
+
+exit_free:
+	kdbus_node_deactivate(node);
+	kdbus_node_unref(node);
+	return ERR_PTR(ret);
+}
+
+static void kdbus_domain_free(struct kdbus_node *node)
+{
+	struct kdbus_domain *domain = container_of(node, struct kdbus_domain,
+						   node);
+
+	BUG_ON(!hash_empty(domain->user_hash));
+
+	put_pid_ns(domain->pid_namespace);
+	put_user_ns(domain->user_namespace);
+	idr_destroy(&domain->user_idr);
+	kfree(domain);
+}
+
+/**
+ * kdbus_domain_new() - create a new domain
+ * @access:		The access mode for this node (KDBUS_MAKE_ACCESS_*)
+ *
+ * Return: a new kdbus_domain on success, ERR_PTR on failure
+ */
+struct kdbus_domain *kdbus_domain_new(unsigned int access)
+{
+	struct kdbus_domain *d;
+	int ret;
+
+	d = kzalloc(sizeof(*d), GFP_KERNEL);
+	if (!d)
+		return ERR_PTR(-ENOMEM);
+
+	kdbus_node_init(&d->node, KDBUS_NODE_DOMAIN);
+
+	d->node.free_cb = kdbus_domain_free;
+	d->node.mode = S_IRUSR | S_IXUSR;
+	if (access & (KDBUS_MAKE_ACCESS_GROUP | KDBUS_MAKE_ACCESS_WORLD))
+		d->node.mode |= S_IRGRP | S_IXGRP;
+	if (access & KDBUS_MAKE_ACCESS_WORLD)
+		d->node.mode |= S_IROTH | S_IXOTH;
+
+	d->access = access;
+	mutex_init(&d->lock);
+	atomic64_set(&d->msg_seq_last, 0);
+	idr_init(&d->user_idr);
+	d->pid_namespace = get_pid_ns(task_active_pid_ns(current));
+	d->user_namespace = get_user_ns(current_user_ns());
+
+	ret = kdbus_node_link(&d->node, NULL, NULL);
+	if (ret < 0)
+		goto exit_unref;
+
+	return d;
+
+exit_unref:
+	kdbus_node_deactivate(&d->node);
+	kdbus_node_unref(&d->node);
+	return ERR_PTR(ret);
+}
+
+/**
+ * kdbus_domain_ref() - take a domain reference
+ * @domain:		Domain
+ *
+ * Return: the domain itself
+ */
+struct kdbus_domain *kdbus_domain_ref(struct kdbus_domain *domain)
+{
+	if (domain)
+		kdbus_node_ref(&domain->node);
+	return domain;
+}
+
+/**
+ * kdbus_domain_unref() - drop a domain reference
+ * @domain:		Domain
+ *
+ * When the last reference is dropped, the domain internal structure
+ * is freed.
+ *
+ * Return: NULL
+ */
+struct kdbus_domain *kdbus_domain_unref(struct kdbus_domain *domain)
+{
+	if (domain)
+		kdbus_node_unref(&domain->node);
+	return NULL;
+}
+
+/**
+ * kdbus_domain_activate() - activate a domain
+ * @domain:		Domain
+ *
+ * Activate a domain so it will be visible to user-space and can be accessed
+ * by external entities.
+ *
+ * Return: 0 on success, negative error-code on failure
+ */
+int kdbus_domain_activate(struct kdbus_domain *domain)
+{
+	struct kdbus_node *control;
+
+	/*
+	 * kdbus_domain_activate() must not be called multiple times, so if
+	 * kdbus_node_activate() didn't activate the node, it must already be
+	 * dead.
+	 */
+	if (!kdbus_node_activate(&domain->node))
+		return -ESHUTDOWN;
+
+	/*
+	 * Create a control-node for this domain. We drop our own reference
+	 * immediately, effectively causing the node to be deactivated and
+	 * released when the parent domain is.
+	 */
+	control = kdbus_domain_control_new(domain, domain->access);
+	if (IS_ERR(control))
+		return PTR_ERR(control);
+
+	kdbus_node_activate(control);
+	kdbus_node_unref(control);
+
+	return 0;
+}
+
+/**
+ * kdbus_domain_deactivate() - invalidate a domain
+ * @domain:		Domain
+ */
+void kdbus_domain_deactivate(struct kdbus_domain *domain)
+{
+	kdbus_node_deactivate(&domain->node);
+}
+
+/**
+ * kdbus_domain_user_assign_id() - allocate ID and assign it to the
+ *				   domain user
+ * @domain:		The domain of the user
+ * @user:		The kdbus_domain_user object of the user
+ *
+ * Returns 0 if ID in [0, INT_MAX] is successfully assigned to the
+ * domain user. Negative errno on failure.
+ *
+ * The user index is used in arrays for accounting user quota in
+ * receiver queues.
+ *
+ * Caller must have the domain lock held and must ensure that the
+ * domain was not disconnected.
+ */
+static int kdbus_domain_user_assign_id(struct kdbus_domain *domain,
+				       struct kdbus_domain_user *user)
+{
+	int ret;
+
+	/*
+	 * Allocate the smallest possible index for this user; used
+	 * in arrays for accounting user quota in receiver queues.
+	 */
+	ret = idr_alloc(&domain->user_idr, user, 0, 0, GFP_KERNEL);
+	if (ret < 0)
+		return ret;
+
+	user->idr = ret;
+
+	return 0;
+}
+
+/**
+ * kdbus_domain_get_user() - get a kdbus_domain_user object
+ * @domain:		The domain of the user
+ * @uid:		The uid of the user; INVALID_UID for an
+ *			anonymous user like a custom endpoint
+ *
+ * If there is a uid matching, then use the already accounted
+ * kdbus_domain_user, increment its reference counter and return it.
+ * Otherwise allocate a new one, link it into the domain and return it.
+ *
+ * Return: the accounted domain user on success, ERR_PTR on failure.
+ */
+struct kdbus_domain_user *kdbus_domain_get_user(struct kdbus_domain *domain,
+						kuid_t uid)
+{
+	int ret;
+	struct kdbus_domain_user *tmp_user;
+	struct kdbus_domain_user *u = NULL;
+
+	mutex_lock(&domain->lock);
+
+	/* find uid and reference it */
+	if (uid_valid(uid)) {
+		hash_for_each_possible(domain->user_hash, tmp_user,
+				       hentry, __kuid_val(uid)) {
+			if (!uid_eq(tmp_user->uid, uid))
+				continue;
+
+			/*
+			 * If the ref-count is already 0, the destructor is
+			 * about to unlink and destroy the object. Continue
+			 * looking for a next one or create one, if none found.
+			 */
+			if (kref_get_unless_zero(&tmp_user->kref)) {
+				mutex_unlock(&domain->lock);
+				return tmp_user;
+			}
+		}
+	}
+
+	u = kzalloc(sizeof(*u), GFP_KERNEL);
+	if (!u) {
+		ret = -ENOMEM;
+		goto exit_unlock;
+	}
+
+	kref_init(&u->kref);
+	u->domain = kdbus_domain_ref(domain);
+	u->uid = uid;
+	atomic_set(&u->buses, 0);
+	atomic_set(&u->connections, 0);
+
+	/* Assign user ID and link into domain */
+	ret = kdbus_domain_user_assign_id(domain, u);
+	if (ret < 0)
+		goto exit_free;
+
+	/* UID hash map */
+	hash_add(domain->user_hash, &u->hentry, __kuid_val(u->uid));
+
+	mutex_unlock(&domain->lock);
+	return u;
+
+exit_free:
+	kdbus_domain_unref(u->domain);
+	kfree(u);
+exit_unlock:
+	mutex_unlock(&domain->lock);
+	return ERR_PTR(ret);
+}
+
+static void __kdbus_domain_user_free(struct kref *kref)
+{
+	struct kdbus_domain_user *user =
+		container_of(kref, struct kdbus_domain_user, kref);
+
+	BUG_ON(atomic_read(&user->buses) > 0);
+	BUG_ON(atomic_read(&user->connections) > 0);
+
+	/*
+	 * Lookups ignore objects with a ref-count of 0. Therefore, we can
+	 * safely remove it from the table after dropping the last reference.
+	 * No-one will acquire a ref in parallel.
+	 */
+	mutex_lock(&user->domain->lock);
+	idr_remove(&user->domain->user_idr, user->idr);
+	hash_del(&user->hentry);
+	mutex_unlock(&user->domain->lock);
+
+	kdbus_domain_unref(user->domain);
+	kfree(user);
+}
+
+/**
+ * kdbus_domain_user_ref() - take a domain user reference
+ * @u:		User
+ *
+ * Return: the domain user itself
+ */
+struct kdbus_domain_user *kdbus_domain_user_ref(struct kdbus_domain_user *u)
+{
+	kref_get(&u->kref);
+	return u;
+}
+
+/**
+ * kdbus_domain_user_unref() - drop a domain user reference
+ * @u:		User
+ *
+ * When the last reference is dropped, the domain user's internal
+ * structure is freed.
+ *
+ * Return: NULL
+ */
+struct kdbus_domain_user *kdbus_domain_user_unref(struct kdbus_domain_user *u)
+{
+	if (u)
+		kref_put(&u->kref, __kdbus_domain_user_free);
+	return NULL;
+}
diff --git a/ipc/kdbus/domain.h b/ipc/kdbus/domain.h
new file mode 100644
index 000000000000..ad5447626ed1
--- /dev/null
+++ b/ipc/kdbus/domain.h
@@ -0,0 +1,84 @@
+/*
+ * Copyright (C) 2013-2014 Kay Sievers
+ * Copyright (C) 2013-2014 Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+ * Copyright (C) 2013-2014 Daniel Mack <daniel@zonque.org>
+ * Copyright (C) 2013-2014 David Herrmann <dh.herrmann@gmail.com>
+ * Copyright (C) 2013-2014 Linux Foundation
+ *
+ * kdbus is free software; you can redistribute it and/or modify it under
+ * the terms of the GNU Lesser General Public License as published by the
+ * Free Software Foundation; either version 2.1 of the License, or (at
+ * your option) any later version.
+ */
+
+#ifndef __KDBUS_DOMAIN_H
+#define __KDBUS_DOMAIN_H
+
+#include <linux/hashtable.h>
+#include <linux/idr.h>
+#include <linux/kref.h>
+#include <linux/user_namespace.h>
+#include <linux/pid_namespace.h>
+
+#include "node.h"
+
+/**
+ * struct kdbus_domain - domain for buses
+ * @node:		Underlying API node
+ * @access:		Access mode for this domain
+ * @lock:		Domain data lock
+ * @bus_seq_last:	Last used bus id sequence number
+ * @msg_seq_last:	Last used message id sequence number
+ * @user_hash:		Accounting of user resources
+ * @user_idr:		Map of all users; smallest possible index
+ * @pid_namespace:	PID namespace, pinned at creation time
+ * @user_namespace:	User namespace, pinned at creation time
+ */
+struct kdbus_domain {
+	struct kdbus_node node;
+	unsigned int access;
+	struct mutex lock;
+	atomic64_t bus_seq_last;
+	atomic64_t msg_seq_last;
+	DECLARE_HASHTABLE(user_hash, 6);
+	struct idr user_idr;
+	struct pid_namespace *pid_namespace;
+	struct user_namespace *user_namespace;
+};
+
+/**
+ * struct kdbus_domain_user - resource accounting for users
+ * @kref:		Reference counter
+ * @domain:		Domain of the user
+ * @hentry:		Entry in domain user map
+ * @idr:		Smallest possible index number of all users
+ * @uid:		UID of the user
+ * @buses:		Number of buses the user has created
+ * @connections:	Number of connections the user has created
+ */
+struct kdbus_domain_user {
+	struct kref kref;
+	struct kdbus_domain *domain;
+	struct hlist_node hentry;
+	unsigned int idr;
+	kuid_t uid;
+	atomic_t buses;
+	atomic_t connections;
+};
+
+#define kdbus_domain_from_node(_node) container_of((_node), \
+						   struct kdbus_domain, \
+						   node)
+
+struct kdbus_domain *kdbus_domain_new(unsigned int access);
+struct kdbus_domain *kdbus_domain_ref(struct kdbus_domain *domain);
+struct kdbus_domain *kdbus_domain_unref(struct kdbus_domain *domain);
+int kdbus_domain_activate(struct kdbus_domain *domain);
+void kdbus_domain_deactivate(struct kdbus_domain *domain);
+
+struct kdbus_domain_user *kdbus_domain_get_user(struct kdbus_domain *domain,
+						kuid_t uid);
+struct kdbus_domain_user *kdbus_domain_user_ref(struct kdbus_domain_user *u);
+struct kdbus_domain_user *kdbus_domain_user_unref(struct kdbus_domain_user *u);
+
+#endif
diff --git a/ipc/kdbus/endpoint.c b/ipc/kdbus/endpoint.c
new file mode 100644
index 000000000000..dd8189009975
--- /dev/null
+++ b/ipc/kdbus/endpoint.c
@@ -0,0 +1,497 @@
+/*
+ * Copyright (C) 2013-2014 Kay Sievers
+ * Copyright (C) 2013-2014 Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+ * Copyright (C) 2013-2014 Daniel Mack <daniel@zonque.org>
+ * Copyright (C) 2013-2014 David Herrmann <dh.herrmann@gmail.com>
+ * Copyright (C) 2013-2014 Linux Foundation
+ *
+ * kdbus is free software; you can redistribute it and/or modify it under
+ * the terms of the GNU Lesser General Public License as published by the
+ * Free Software Foundation; either version 2.1 of the License, or (at
+ * your option) any later version.
+ */
+
+#include <linux/fs.h>
+#include <linux/idr.h>
+#include <linux/init.h>
+#include <linux/module.h>
+#include <linux/sched.h>
+#include <linux/sizes.h>
+#include <linux/slab.h>
+#include <linux/uaccess.h>
+
+#include "bus.h"
+#include "connection.h"
+#include "domain.h"
+#include "endpoint.h"
+#include "handle.h"
+#include "item.h"
+#include "message.h"
+#include "policy.h"
+
+static void kdbus_ep_free(struct kdbus_node *node)
+{
+	struct kdbus_ep *ep = container_of(node, struct kdbus_ep, node);
+
+	BUG_ON(!list_empty(&ep->conn_list));
+
+	kdbus_policy_db_clear(&ep->policy_db);
+	kdbus_bus_unref(ep->bus);
+	kdbus_domain_user_unref(ep->user);
+	kfree(ep);
+}
+
+static void kdbus_ep_release(struct kdbus_node *node, bool was_active)
+{
+	struct kdbus_ep *ep = container_of(node, struct kdbus_ep, node);
+
+	/* disconnect all connections to this endpoint */
+	for (;;) {
+		struct kdbus_conn *conn;
+
+		mutex_lock(&ep->lock);
+		conn = list_first_entry_or_null(&ep->conn_list,
+						struct kdbus_conn,
+						ep_entry);
+		if (!conn) {
+			mutex_unlock(&ep->lock);
+			break;
+		}
+
+		/* take reference, release lock, disconnect without lock */
+		kdbus_conn_ref(conn);
+		mutex_unlock(&ep->lock);
+
+		kdbus_conn_disconnect(conn, false);
+		kdbus_conn_unref(conn);
+	}
+}
+
+/**
+ * kdbus_ep_new() - create a new endpoint
+ * @bus:		The bus this endpoint will be created for
+ * @name:		The name of the endpoint
+ * @access:		The access flags for this node (KDBUS_MAKE_ACCESS_*)
+ * @uid:		The uid of the device node
+ * @gid:		The gid of the device node
+ * @policy:		Whether or not the endpoint should have a policy db
+ *
+ * This function will create a new endpoint with the given
+ * name and properties for a given bus.
+ *
+ * Return: a new kdbus_ep on success, ERR_PTR on failure.
+ */
+struct kdbus_ep *kdbus_ep_new(struct kdbus_bus *bus, const char *name,
+			      unsigned int access, kuid_t uid, kgid_t gid,
+			      bool policy)
+{
+	struct kdbus_ep *e;
+	int ret;
+
+	e = kzalloc(sizeof(*e), GFP_KERNEL);
+	if (!e)
+		return ERR_PTR(-ENOMEM);
+
+	kdbus_node_init(&e->node, KDBUS_NODE_ENDPOINT);
+
+	e->node.free_cb = kdbus_ep_free;
+	e->node.release_cb = kdbus_ep_release;
+	e->node.uid = uid;
+	e->node.gid = gid;
+	e->node.mode = S_IRUSR | S_IWUSR;
+	if (access & (KDBUS_MAKE_ACCESS_GROUP | KDBUS_MAKE_ACCESS_WORLD))
+		e->node.mode |= S_IRGRP | S_IWGRP;
+	if (access & KDBUS_MAKE_ACCESS_WORLD)
+		e->node.mode |= S_IROTH | S_IWOTH;
+
+	mutex_init(&e->lock);
+	INIT_LIST_HEAD(&e->conn_list);
+	kdbus_policy_db_init(&e->policy_db);
+	e->has_policy = policy;
+	e->bus = kdbus_bus_ref(bus);
+	e->id = atomic64_inc_return(&bus->ep_seq_last);
+
+	ret = kdbus_node_link(&e->node, &bus->node, name);
+	if (ret < 0)
+		goto exit_unref;
+
+	return e;
+
+exit_unref:
+	kdbus_node_deactivate(&e->node);
+	kdbus_node_unref(&e->node);
+	return ERR_PTR(ret);
+}
+
+/**
+ * kdbus_ep_ref() - increase the reference counter of a kdbus_ep
+ * @ep:			The endpoint to reference
+ *
+ * Every user of an endpoint, except for its creator, must add a reference to
+ * the kdbus_ep instance using this function.
+ *
+ * Return: the ep itself
+ */
+struct kdbus_ep *kdbus_ep_ref(struct kdbus_ep *ep)
+{
+	if (ep)
+		kdbus_node_ref(&ep->node);
+	return ep;
+}
+
+/**
+ * kdbus_ep_unref() - decrease the reference counter of a kdbus_ep
+ * @ep:		The ep to unref
+ *
+ * Release a reference. If the reference count drops to 0, the ep will be
+ * freed.
+ *
+ * Return: NULL
+ */
+struct kdbus_ep *kdbus_ep_unref(struct kdbus_ep *ep)
+{
+	if (ep)
+		kdbus_node_unref(&ep->node);
+	return NULL;
+}
+
+/**
+ * kdbus_ep_activate() - activate an endpoint
+ * @ep:			Endpoint
+ *
+ * Return: 0 on success, negative error otherwise.
+ */
+int kdbus_ep_activate(struct kdbus_ep *ep)
+{
+	/*
+	 * kdbus_ep_activate() must not be called multiple times, so if
+	 * kdbus_node_activate() didn't activate the node, it must already be
+	 * dead.
+	 */
+	if (!kdbus_node_activate(&ep->node))
+		return -ESHUTDOWN;
+
+	return 0;
+}
+
+/**
+ * kdbus_ep_deactivate() - invalidate an endpoint
+ * @ep:			Endpoint
+ */
+void kdbus_ep_deactivate(struct kdbus_ep *ep)
+{
+	kdbus_node_deactivate(&ep->node);
+}
+
+/**
+ * kdbus_ep_policy_set() - set policy for an endpoint
+ * @ep:			The endpoint
+ * @items:		The kdbus items containing policy information
+ * @items_size:		The total length of the items
+ *
+ * Return: 0 on success, negative errno on failure.
+ */
+int kdbus_ep_policy_set(struct kdbus_ep *ep,
+			const struct kdbus_item *items,
+			size_t items_size)
+{
+	return kdbus_policy_set(&ep->policy_db, items, items_size, 0, true, ep);
+}
+
+/**
+ * kdbus_ep_policy_check_see_access_unlocked() - verify a connection can see
+ *						 the passed name
+ * @ep:			Endpoint to operate on
+ * @conn:		Connection that lists names
+ * @name:		Name that @conn tries to list
+ *
+ * This verifies that @conn is allowed to see the well-known name @name via the
+ * endpoint @ep.
+ *
+ * Return: 0 if allowed, negative error code if not.
+ */
+int kdbus_ep_policy_check_see_access_unlocked(struct kdbus_ep *ep,
+					      struct kdbus_conn *conn,
+					      const char *name)
+{
+	int ret;
+
+	/*
+	 * Check policy, if the endpoint of the connection has a db.
+	 * Note that policy DBs instantiated along with connections
+	 * don't have SEE rules, so it's sufficient to check the
+	 * endpoint's database.
+	 *
+	 * The lock for the policy db is held across all calls of
+	 * kdbus_name_list_all(), so the entries in both writing
+	 * and non-writing runs of kdbus_name_list_write() are the
+	 * same.
+	 */
+
+	if (!ep->has_policy)
+		return 0;
+
+	ret = kdbus_policy_check_see_access_unlocked(&ep->policy_db,
+						     conn->cred, name);
+
+	/* don't leak hints whether a name exists on a custom endpoint. */
+	if (ret == -EPERM)
+		return -ENOENT;
+
+	return ret;
+}
+
+/**
+ * kdbus_ep_policy_check_see_access() - verify a connection can see
+ *					the passed name
+ * @ep:			Endpoint to operate on
+ * @conn:		Connection that lists names
+ * @name:		Name that @conn tries to list
+ *
+ * This verifies that @conn is allowed to see the well-known name @name via the
+ * endpoint @ep.
+ *
+ * Return: 0 if allowed, negative error code if not.
+ */
+int kdbus_ep_policy_check_see_access(struct kdbus_ep *ep,
+				     struct kdbus_conn *conn,
+				     const char *name)
+{
+	int ret;
+
+	down_read(&ep->policy_db.entries_rwlock);
+	mutex_lock(&conn->lock);
+
+	ret = kdbus_ep_policy_check_see_access_unlocked(ep, conn, name);
+
+	mutex_unlock(&conn->lock);
+	up_read(&ep->policy_db.entries_rwlock);
+
+	return ret;
+}
+
+/**
+ * kdbus_ep_policy_check_notification() - verify a connection is allowed to see
+ *					  the name in a notification
+ * @ep:			Endpoint to operate on
+ * @conn:		Connection connected to the endpoint
+ * @kmsg:		The message carrying the notification
+ *
+ * This function verifies that @conn is allowed to see the well-known name
+ * inside a name-change notification contained in @kmsg via the endpoint @ep.
+ * If @kmsg is not a notification for name changes, this function does nothing
+ * but return 0.
+ *
+ * Return: 0 if allowed, negative error code if not.
+ */
+int kdbus_ep_policy_check_notification(struct kdbus_ep *ep,
+				       struct kdbus_conn *conn,
+				       const struct kdbus_kmsg *kmsg)
+{
+	int ret = 0;
+
+	if (kmsg->msg.src_id != KDBUS_SRC_ID_KERNEL || !ep->has_policy)
+		return 0;
+
+	switch (kmsg->notify_type) {
+	case KDBUS_ITEM_NAME_ADD:
+	case KDBUS_ITEM_NAME_REMOVE:
+	case KDBUS_ITEM_NAME_CHANGE:
+		ret = kdbus_ep_policy_check_see_access(ep, conn,
+						       kmsg->notify_name);
+		break;
+	default:
+		break;
+	}
+
+	return ret;
+}
+
+/**
+ * kdbus_ep_policy_check_src_names() - check whether a connection's endpoint
+ *				       is allowed to see any of another
+ *				       connection's currently owned names
+ * @ep:			Endpoint to operate on
+ * @conn_src:		Connection that owns the names
+ * @conn_dst:		Destination connection to check credentials against
+ *
+ * This function checks whether @ep is allowed to see any of the names
+ * currently owned by @conn_src.
+ *
+ * Return: 0 if allowed, negative error code if not.
+ */
+int kdbus_ep_policy_check_src_names(struct kdbus_ep *ep,
+				    struct kdbus_conn *conn_src,
+				    struct kdbus_conn *conn_dst)
+{
+	struct kdbus_name_entry *e;
+	int ret = -ENOENT;
+
+	if (!ep->has_policy)
+		return 0;
+
+	down_read(&ep->policy_db.entries_rwlock);
+	mutex_lock(&conn_src->lock);
+
+	list_for_each_entry(e, &conn_src->names_list, conn_entry) {
+		ret = kdbus_ep_policy_check_see_access_unlocked(ep, conn_dst,
+								e->name);
+		if (ret == 0)
+			break;
+	}
+
+	mutex_unlock(&conn_src->lock);
+	up_read(&ep->policy_db.entries_rwlock);
+
+	return ret;
+}
+
+static int
+kdbus_custom_ep_check_talk_access(struct kdbus_ep *ep,
+				  struct kdbus_conn *conn_src,
+				  struct kdbus_conn *conn_dst)
+{
+	int ret;
+
+	if (!ep->has_policy)
+		return 0;
+
+	/* Custom endpoints have stricter policies */
+	ret = kdbus_policy_check_talk_access(&ep->policy_db,
+					     conn_src, conn_dst);
+
+	/*
+	 * Don't leak hints whether a name exists on a custom
+	 * endpoint.
+	 */
+	if (ret == -EPERM)
+		ret = -ENOENT;
+
+	return ret;
+}
+
+static bool
+kdbus_ep_has_default_talk_access(struct kdbus_conn *conn_src,
+				 struct kdbus_conn *conn_dst)
+{
+	if (conn_src->privileged)
+		return true;
+
+	if (uid_eq(conn_src->cred->fsuid, conn_dst->cred->uid))
+		return true;
+
+	return false;
+}
+
+/**
+ * kdbus_ep_policy_check_talk_access() - verify a connection can talk to
+ *					 the passed connection
+ * @ep:			Endpoint to operate on
+ * @conn_src:		Connection that tries to talk
+ * @conn_dst:		Connection that is talked to
+ *
+ * This verifies that @conn_src is allowed to talk to @conn_dst via the
+ * endpoint @ep.
+ *
+ * Return: 0 if allowed, negative error code if not.
+ */
+int kdbus_ep_policy_check_talk_access(struct kdbus_ep *ep,
+				      struct kdbus_conn *conn_src,
+				      struct kdbus_conn *conn_dst)
+{
+	int ret;
+
+	/* First check the custom endpoint with its policies */
+	ret = kdbus_custom_ep_check_talk_access(ep, conn_src, conn_dst);
+	if (ret < 0)
+		return ret;
+
+	/* Then check if it satisfies the implicit policies */
+	if (kdbus_ep_has_default_talk_access(conn_src, conn_dst))
+		return 0;
+
+	/* Fallback to the default endpoint policy */
+	return kdbus_policy_check_talk_access(&ep->bus->policy_db,
+					      conn_src, conn_dst);
+}
+
+/**
+ * kdbus_ep_policy_check_broadcast() - verify a connection can send
+ *				       broadcast messages to the
+ *				       passed connection
+ * @ep:			Endpoint to operate on
+ * @conn_src:		Connection that tries to talk
+ * @conn_dst:		Connection that is talked to
+ *
+ * This verifies that @conn_src is allowed to send broadcast messages
+ * to @conn_dst via the endpoint @ep.
+ *
+ * Return: 0 if allowed, negative error code if not.
+ */
+int kdbus_ep_policy_check_broadcast(struct kdbus_ep *ep,
+				    struct kdbus_conn *conn_src,
+				    struct kdbus_conn *conn_dst)
+{
+	int ret;
+
+	/* First check the custom endpoint with its policies */
+	ret = kdbus_custom_ep_check_talk_access(ep, conn_src, conn_dst);
+	if (ret < 0)
+		return ret;
+
+	/* Then check if it satisfies the implicit policies */
+	if (kdbus_ep_has_default_talk_access(conn_src, conn_dst))
+		return 0;
+
+	/*
+	 * If conn_src owns names on the bus, and the conn_dst does
+	 * not own any name, then allow conn_src to signal to
+	 * conn_dst. Otherwise fallback and perform the bus policy
+	 * check on conn_dst.
+	 *
+	 * This way we allow services to signal on the bus, and we
+	 * block broadcasts directed to services that own names and
+	 * do not want to receive these messages unless there is a
+	 * policy entry to permit it. By this we try to follow the
+	 * same logic used for unicast messages.
+	 */
+	if (atomic_read(&conn_src->name_count) > 0 &&
+	    atomic_read(&conn_dst->name_count) == 0)
+		return 0;
+
+	/* Fallback to the default endpoint policy */
+	return kdbus_policy_check_talk_access(&ep->bus->policy_db,
+					      conn_src, conn_dst);
+}
+
+/**
+ * kdbus_ep_policy_check_own_access() - verify a connection can own the passed
+ *					name
+ * @ep:			Endpoint to operate on
+ * @conn:		Connection that acquires a name
+ * @name:		Name that is about to be acquired
+ *
+ * This verifies that @conn is allowed to acquire the well-known name @name via
+ * the endpoint @ep.
+ *
+ * Return: 0 if allowed, negative error code if not.
+ */
+int kdbus_ep_policy_check_own_access(struct kdbus_ep *ep,
+				     const struct kdbus_conn *conn,
+				     const char *name)
+{
+	int ret;
+
+	if (ep->has_policy) {
+		ret = kdbus_policy_check_own_access(&ep->policy_db,
+						    conn->cred, name);
+		if (ret < 0)
+			return ret;
+	}
+
+	if (conn->privileged)
+		return 0;
+
+	return kdbus_policy_check_own_access(&ep->bus->policy_db,
+					     conn->cred, name);
+}
diff --git a/ipc/kdbus/endpoint.h b/ipc/kdbus/endpoint.h
new file mode 100644
index 000000000000..e7c69f705399
--- /dev/null
+++ b/ipc/kdbus/endpoint.h
@@ -0,0 +1,91 @@
+/*
+ * Copyright (C) 2013-2014 Kay Sievers
+ * Copyright (C) 2013-2014 Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+ * Copyright (C) 2013-2014 Daniel Mack <daniel@zonque.org>
+ * Copyright (C) 2013-2014 David Herrmann <dh.herrmann@gmail.com>
+ * Copyright (C) 2013-2014 Linux Foundation
+ *
+ * kdbus is free software; you can redistribute it and/or modify it under
+ * the terms of the GNU Lesser General Public License as published by the
+ * Free Software Foundation; either version 2.1 of the License, or (at
+ * your option) any later version.
+ */
+
+#ifndef __KDBUS_ENDPOINT_H
+#define __KDBUS_ENDPOINT_H
+
+#include "limits.h"
+#include "names.h"
+#include "node.h"
+#include "policy.h"
+#include "util.h"
+
+struct kdbus_kmsg;
+
+/*
+ * struct kdbus_ep - endpoint to access a bus
+ * @node:		The kdbus node
+ * @bus:		Bus behind this endpoint
+ * @id:			ID of this endpoint on the bus
+ * @conn_list:		Connections of this endpoint
+ * @lock:		Endpoint data lock
+ * @user:		Custom endpoints account against an anonymous user
+ * @policy_db:		Uploaded policy
+ * @disconnected:	Invalidated data
+ * @has_policy:		The policy-db is valid and should be used
+ *
+ * An endpoint offers access to a bus; the default device node name is "bus".
+ * Additional custom endpoints to the same bus can be created and they can
+ * carry their own policies/filters.
+ */
+struct kdbus_ep {
+	struct kdbus_node node;
+	struct kdbus_bus *bus;
+	u64 id;
+	struct list_head conn_list;
+	struct mutex lock;
+	struct kdbus_domain_user *user;
+	struct kdbus_policy_db policy_db;
+
+	bool has_policy : 1;
+};
+
+#define kdbus_ep_from_node(_node) container_of((_node), \
+					       struct kdbus_ep, \
+					       node)
+
+struct kdbus_ep *kdbus_ep_new(struct kdbus_bus *bus, const char *name,
+			      unsigned int access, kuid_t uid, kgid_t gid,
+			      bool policy);
+struct kdbus_ep *kdbus_ep_ref(struct kdbus_ep *ep);
+struct kdbus_ep *kdbus_ep_unref(struct kdbus_ep *ep);
+int kdbus_ep_activate(struct kdbus_ep *ep);
+void kdbus_ep_deactivate(struct kdbus_ep *ep);
+
+int kdbus_ep_policy_set(struct kdbus_ep *ep,
+			const struct kdbus_item *items,
+			size_t items_size);
+
+int kdbus_ep_policy_check_see_access_unlocked(struct kdbus_ep *ep,
+					      struct kdbus_conn *conn,
+					      const char *name);
+int kdbus_ep_policy_check_see_access(struct kdbus_ep *ep,
+				     struct kdbus_conn *conn,
+				     const char *name);
+int kdbus_ep_policy_check_notification(struct kdbus_ep *ep,
+				       struct kdbus_conn *conn,
+				       const struct kdbus_kmsg *kmsg);
+int kdbus_ep_policy_check_src_names(struct kdbus_ep *ep,
+				    struct kdbus_conn *conn_src,
+				    struct kdbus_conn *conn_dst);
+int kdbus_ep_policy_check_talk_access(struct kdbus_ep *ep,
+				      struct kdbus_conn *conn_src,
+				      struct kdbus_conn *conn_dst);
+int kdbus_ep_policy_check_broadcast(struct kdbus_ep *ep,
+				    struct kdbus_conn *conn_src,
+				    struct kdbus_conn *conn_dst);
+int kdbus_ep_policy_check_own_access(struct kdbus_ep *ep,
+				     const struct kdbus_conn *conn,
+				     const char *name);
+
+#endif
-- 
2.1.3
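
The talk-access check in kdbus_ep_policy_check_talk_access() above layers
three decisions: the custom endpoint's policy, the implicit policy, and
the bus policy. A minimal userspace sketch of that ordering follows. It
is not the kernel code: struct conn, struct ep and the two *_verdict
fields are hypothetical stand-ins for the real connection, endpoint and
policy-db lookups, and a single uid replaces the fsuid/uid comparison.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for kdbus_conn and kdbus_ep. */
struct conn {
	unsigned int uid;	/* one uid; the patch compares src fsuid to dst uid */
	bool privileged;
};

struct ep {
	bool has_policy;	/* true for custom endpoints with a policy db */
	int custom_verdict;	/* 0 or -EPERM; models the endpoint policy db */
	int bus_verdict;	/* 0 or -EPERM; models the bus policy db */
};

static int check_talk_access(const struct ep *ep, const struct conn *src,
			     const struct conn *dst)
{
	assert(ep && src && dst);

	/* 1. A custom endpoint's own policy is checked first and can only deny. */
	if (ep->has_policy) {
		int ret = ep->custom_verdict;

		/* Map -EPERM to -ENOENT so a denial leaks no hint about names. */
		if (ret == -EPERM)
			ret = -ENOENT;
		if (ret < 0)
			return ret;
	}

	/* 2. Implicit policy: privileged senders and same-user peers may talk. */
	if (src->privileged || src->uid == dst->uid)
		return 0;

	/* 3. Otherwise the bus-wide policy decides. */
	return ep->bus_verdict;
}
```

Note that a denial from the custom endpoint wins even for a privileged
sender, because step 1 runs before the implicit policy is consulted.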

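kdbus_domain_get_user() in domain.c above skips entries whose ref-count
has already reached zero, since their destructor is about to unlink
them. The pattern behind kref_get_unless_zero() can be sketched in
userspace with C11 atomics. This is an illustration under assumptions:
struct user, user_get_unless_zero() and user_lookup() are hypothetical
names, a flat array replaces the hash table, and the domain lock that
the patch holds during lookup is left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct kdbus_domain_user. */
struct user {
	unsigned int uid;
	atomic_int refs;	/* 0 means the destructor is already running */
};

/* Take a reference unless the count has already dropped to zero. */
static bool user_get_unless_zero(struct user *u)
{
	int old = atomic_load(&u->refs);

	while (old != 0) {
		/* On failure, 'old' is updated to the current value. */
		if (atomic_compare_exchange_weak(&u->refs, &old, old + 1))
			return true;
	}
	return false;
}

/*
 * Return a referenced user with a matching uid, or NULL so that the
 * caller can allocate a fresh object instead.
 */
static struct user *user_lookup(struct user *table, size_t n, unsigned int uid)
{
	assert(table || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (table[i].uid != uid)
			continue;
		if (user_get_unless_zero(&table[i]))
			return &table[i];
		/* Dying entry: keep scanning for a live one. */
	}
	return NULL;
}
```

Because a dying entry is never revived, the destructor can remove it
from the table after the last reference is gone without racing against
a concurrent lookup.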


* kdbus: add name registry implementation
@ 2014-11-21  5:02 ` Greg Kroah-Hartman
From: Greg Kroah-Hartman @ 2014-11-21  5:02 UTC (permalink / raw)
  To: arnd, ebiederm, gnomes, teg, jkosina, luto, linux-api, linux-kernel
  Cc: daniel, dh.herrmann, tixxdz, Greg Kroah-Hartman

From: Daniel Mack <daniel@zonque.org>

This patch adds the name registry implementation.

Each bus instantiates a name registry to resolve well-known names
into unique connection IDs for message delivery. The registry will
be queried when a message is sent with kdbus_msg.dst_id set to
KDBUS_DST_ID_NAME, or when a registry dump is requested.

This registry is implemented in the kernel so that lookups and
take-overs can be performed in a race-free way.
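
As a rough illustration of what such a registry does, the following
userspace sketch maps well-known names to unique connection IDs and
performs lookup and take-over as one step. It is not the code from this
patch: the fixed-size table, reg_find(), reg_resolve(), reg_acquire()
and the allow_takeover flag are hypothetical simplifications, and ID 0
is assumed to mean "no owner". The sketch is single-threaded; the patch
itself guards the registry with an rw_semaphore.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define REG_SLOTS	16
#define REG_NAME_LEN	64

struct reg_entry {
	char name[REG_NAME_LEN];	/* well-known name, "" if the slot is free */
	uint64_t owner_id;		/* unique connection ID of the owner */
};

struct registry {
	struct reg_entry slots[REG_SLOTS];
};

/* Find the entry for @name, or NULL if nobody owns it. */
static struct reg_entry *reg_find(struct registry *r, const char *name)
{
	for (int i = 0; i < REG_SLOTS; i++) {
		struct reg_entry *e = &r->slots[i];

		if (e->name[0] != '\0' && strcmp(e->name, name) == 0)
			return e;
	}
	return NULL;
}

/* Resolve a well-known name to its owner's ID; 0 if the name is unowned. */
static uint64_t reg_resolve(struct registry *r, const char *name)
{
	struct reg_entry *e = reg_find(r, name);

	return e ? e->owner_id : 0;
}

/* Acquire @name for @conn_id; lookup and update happen in one step. */
static int reg_acquire(struct registry *r, const char *name,
		       uint64_t conn_id, bool allow_takeover)
{
	struct reg_entry *e;

	assert(conn_id != 0);
	assert(name[0] != '\0' && strlen(name) < REG_NAME_LEN);

	/* A real implementation would hold the write lock from here on. */
	e = reg_find(r, name);
	if (e) {
		if (e->owner_id != conn_id && !allow_takeover)
			return -EEXIST;
		e->owner_id = conn_id;	/* take over (or keep) the name */
		return 0;
	}

	for (int i = 0; i < REG_SLOTS; i++) {
		e = &r->slots[i];
		if (e->name[0] == '\0') {
			strcpy(e->name, name);	/* length checked above */
			e->owner_id = conn_id;
			return 0;
		}
	}
	return -ENOSPC;
}
```

A sender addressing a message by name would call reg_resolve() and
deliver to the returned ID. Since ownership only changes inside
reg_acquire(), a lookup cannot observe a half-finished take-over once
both paths run under the same lock.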

Signed-off-by: Daniel Mack <daniel@zonque.org>
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Signed-off-by: Djalal Harouni <tixxdz@opendz.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 ipc/kdbus/names.c | 921 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 ipc/kdbus/names.h |  81 +++++
 2 files changed, 1002 insertions(+)
 create mode 100644 ipc/kdbus/names.c
 create mode 100644 ipc/kdbus/names.h

diff --git a/ipc/kdbus/names.c b/ipc/kdbus/names.c
new file mode 100644
index 000000000000..552309bd6716
--- /dev/null
+++ b/ipc/kdbus/names.c
@@ -0,0 +1,921 @@
+/*
+ * Copyright (C) 2013-2014 Kay Sievers
+ * Copyright (C) 2013-2014 Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+ * Copyright (C) 2013-2014 Daniel Mack <daniel@zonque.org>
+ * Copyright (C) 2013-2014 David Herrmann <dh.herrmann@gmail.com>
+ * Copyright (C) 2013-2014 Linux Foundation
+ * Copyright (C) 2014 Djalal Harouni
+ *
+ * kdbus is free software; you can redistribute it and/or modify it under
+ * the terms of the GNU Lesser General Public License as published by the
+ * Free Software Foundation; either version 2.1 of the License, or (at
+ * your option) any later version.
+ */
+
+#include <linux/ctype.h>
+#include <linux/fs.h>
+#include <linux/hash.h>
+#include <linux/idr.h>
+#include <linux/init.h>
+#include <linux/module.h>
+#include <linux/mutex.h>
+#include <linux/rwsem.h>
+#include <linux/sched.h>
+#include <linux/slab.h>
+#include <linux/uaccess.h>
+
+#include "bus.h"
+#include "connection.h"
+#include "endpoint.h"
+#include "item.h"
+#include "names.h"
+#include "notify.h"
+#include "policy.h"
+
+/**
+ * struct kdbus_name_queue_item - a queue item for a name
+ * @conn:		The associated connection
+ * @entry:		Name entry the connection is queued for
+ * @entry_entry:	List element for the list in @entry
+ * @conn_entry:		List element for the list in @conn
+ * @flags:		The queuing flags
+ */
+struct kdbus_name_queue_item {
+	struct kdbus_conn *conn;
+	struct kdbus_name_entry *entry;
+	struct list_head entry_entry;
+	struct list_head conn_entry;
+	u64 flags;
+};
+
+static void kdbus_name_entry_free(struct kdbus_name_entry *e)
+{
+	hash_del(&e->hentry);
+	kfree(e->name);
+	kfree(e);
+}
+
+/**
+ * kdbus_name_registry_free() - free a name registry
+ * @reg:		The name registry, may be %NULL
+ *
+ * Clean up the name registry's internal structures.
+ */
+void kdbus_name_registry_free(struct kdbus_name_registry *reg)
+{
+	struct kdbus_name_entry *e;
+	struct hlist_node *tmp;
+	unsigned int i;
+
+	if (!reg)
+		return;
+
+	hash_for_each_safe(reg->entries_hash, i, tmp, e, hentry)
+		kdbus_name_entry_free(e);
+
+	kfree(reg);
+}
+
+/**
+ * kdbus_name_registry_new() - create a new name registry
+ *
+ * Return: a new kdbus_name_registry on success, ERR_PTR on failure.
+ */
+struct kdbus_name_registry *kdbus_name_registry_new(void)
+{
+	struct kdbus_name_registry *r;
+
+	r = kzalloc(sizeof(*r), GFP_KERNEL);
+	if (!r)
+		return ERR_PTR(-ENOMEM);
+
+	hash_init(r->entries_hash);
+	init_rwsem(&r->rwlock);
+
+	return r;
+}
+
+static struct kdbus_name_entry *
+kdbus_name_lookup(struct kdbus_name_registry *reg, u32 hash, const char *name)
+{
+	struct kdbus_name_entry *e;
+
+	hash_for_each_possible(reg->entries_hash, e, hentry, hash)
+		if (strcmp(e->name, name) == 0)
+			return e;
+
+	return NULL;
+}
+
+static void kdbus_name_queue_item_free(struct kdbus_name_queue_item *q)
+{
+	list_del(&q->entry_entry);
+	list_del(&q->conn_entry);
+	kfree(q);
+}
+
+/*
+ * The caller must hold the connection's lock while the name counter is
+ * decremented and the entry is removed from the connection's list.
+ *
+ * The caller needs to hold its own reference, so the connection does not go
+ * away while the entry's reference is dropped under lock.
+ */
+static void kdbus_name_entry_remove_owner(struct kdbus_name_entry *e)
+{
+	BUG_ON(!e->conn);
+	BUG_ON(!mutex_is_locked(&e->conn->lock));
+
+	atomic_dec(&e->conn->name_count);
+	list_del(&e->conn_entry);
+	e->conn = kdbus_conn_unref(e->conn);
+}
+
+static void kdbus_name_entry_set_owner(struct kdbus_name_entry *e,
+				       struct kdbus_conn *conn)
+{
+	BUG_ON(e->conn);
+	BUG_ON(!mutex_is_locked(&conn->lock));
+
+	e->conn = kdbus_conn_ref(conn);
+	atomic_inc(&conn->name_count);
+	list_add_tail(&e->conn_entry, &e->conn->names_list);
+}
+
+static int kdbus_name_replace_owner(struct kdbus_name_entry *e,
+				    struct kdbus_conn *conn, u64 flags)
+{
+	struct kdbus_conn *conn_old = kdbus_conn_ref(e->conn);
+	int ret;
+
+	BUG_ON(conn == conn_old);
+	BUG_ON(!conn_old);
+
+	/* take lock of both connections in a defined order */
+	if (conn < conn_old) {
+		mutex_lock(&conn->lock);
+		mutex_lock_nested(&conn_old->lock, 1);
+	} else {
+		mutex_lock(&conn_old->lock);
+		mutex_lock_nested(&conn->lock, 1);
+	}
+
+	if (!kdbus_conn_active(conn)) {
+		ret = -ECONNRESET;
+		goto exit_unlock;
+	}
+
+	ret = kdbus_notify_name_change(conn->ep->bus, KDBUS_ITEM_NAME_CHANGE,
+				       e->conn->id, conn->id,
+				       e->flags, flags, e->name);
+	if (ret < 0)
+		goto exit_unlock;
+
+	/* hand over name ownership */
+	kdbus_name_entry_remove_owner(e);
+	kdbus_name_entry_set_owner(e, conn);
+	e->flags = flags;
+
+exit_unlock:
+	mutex_unlock(&conn_old->lock);
+	mutex_unlock(&conn->lock);
+
+	kdbus_conn_unref(conn_old);
+	return ret;
+}
+
+static int kdbus_name_entry_release(struct kdbus_name_entry *e,
+				    struct kdbus_bus *bus)
+{
+	struct kdbus_conn *conn;
+
+	/* give it to first active waiter in the queue */
+	while (!list_empty(&e->queue_list)) {
+		struct kdbus_name_queue_item *q;
+		int ret;
+
+		q = list_first_entry(&e->queue_list,
+				     struct kdbus_name_queue_item,
+				     entry_entry);
+
+		ret = kdbus_name_replace_owner(e, q->conn, q->flags);
+		if (ret < 0)
+			continue;
+
+		kdbus_name_queue_item_free(q);
+		return 0;
+	}
+
+	/* hand it back to an active activator connection */
+	if (e->activator && e->activator != e->conn) {
+		u64 flags = KDBUS_NAME_ACTIVATOR;
+		int ret;
+
+		/*
+		 * Move messages still queued in the old connection
+		 * and addressed to that name to the new connection.
+		 * This allows a race and loss-free name and message
+		 * takeover and exit-on-idle services.
+		 */
+		ret = kdbus_conn_move_messages(e->activator, e->conn,
+					       e->name_id);
+		if (ret < 0)
+			goto exit_release;
+
+		return kdbus_name_replace_owner(e, e->activator, flags);
+	}
+
+exit_release:
+	/* release the name */
+	kdbus_notify_name_change(e->conn->ep->bus, KDBUS_ITEM_NAME_REMOVE,
+				 e->conn->id, 0,
+				 e->flags, 0, e->name);
+
+	conn = kdbus_conn_ref(e->conn);
+	mutex_lock(&conn->lock);
+	kdbus_name_entry_remove_owner(e);
+	mutex_unlock(&conn->lock);
+	kdbus_conn_unref(conn);
+
+	kdbus_conn_unref(e->activator);
+	kdbus_name_entry_free(e);
+
+	return 0;
+}
+
+static int kdbus_name_release(struct kdbus_name_registry *reg,
+			      struct kdbus_conn *conn,
+			      const char *name)
+{
+	struct kdbus_name_queue_item *q_tmp, *q;
+	struct kdbus_name_entry *e = NULL;
+	u32 hash;
+	int ret = 0;
+
+	hash = kdbus_str_hash(name);
+
+	/* lock order: domain -> bus -> ep -> names -> connection */
+	mutex_lock(&conn->ep->bus->lock);
+	down_write(&reg->rwlock);
+
+	e = kdbus_name_lookup(reg, hash, name);
+	if (!e) {
+		ret = -ESRCH;
+		goto exit_unlock;
+	}
+
+	/* Is the connection already the real owner of the name? */
+	if (e->conn == conn) {
+		ret = kdbus_name_entry_release(e, conn->ep->bus);
+	} else {
+		/*
+		 * Otherwise, walk the list of queued entries and search
+		 * for an item that belongs to this connection.
+		 */
+
+		/* In case the name belongs to somebody else */
+		ret = -EADDRINUSE;
+
+		list_for_each_entry_safe(q, q_tmp,
+					 &e->queue_list,
+					 entry_entry) {
+			if (q->conn != conn)
+				continue;
+
+			kdbus_name_queue_item_free(q);
+			ret = 0;
+			break;
+		}
+	}
+
+	/*
+	 * Now that the connection has lost a name, purge all cached policy
+	 * entries, so upon the next message, TALK access will be checked
+	 * against the names the connection actually owns.
+	 */
+	if (ret == 0)
+		kdbus_conn_purge_policy_cache(conn);
+
+exit_unlock:
+	up_write(&reg->rwlock);
+	mutex_unlock(&conn->ep->bus->lock);
+
+	return ret;
+}
+
+/**
+ * kdbus_name_remove_by_conn() - remove all name entries of a given connection
+ * @reg:		The name registry
+ * @conn:		The connection whose entries to remove
+ *
+ * This function removes all name entries held by a given connection.
+ */
+void kdbus_name_remove_by_conn(struct kdbus_name_registry *reg,
+			       struct kdbus_conn *conn)
+{
+	struct kdbus_name_queue_item *q_tmp, *q;
+	struct kdbus_conn *activator = NULL;
+	struct kdbus_name_entry *e_tmp, *e;
+	LIST_HEAD(names_queue_list);
+	LIST_HEAD(names_list);
+
+	/* lock order: domain -> bus -> ep -> names -> conn */
+	mutex_lock(&conn->ep->bus->lock);
+	down_write(&reg->rwlock);
+
+	mutex_lock(&conn->lock);
+	list_splice_init(&conn->names_list, &names_list);
+	list_splice_init(&conn->names_queue_list, &names_queue_list);
+	mutex_unlock(&conn->lock);
+
+	if (kdbus_conn_is_activator(conn)) {
+		activator = conn->activator_of->activator;
+		conn->activator_of->activator = NULL;
+	}
+	list_for_each_entry_safe(q, q_tmp, &names_queue_list, conn_entry)
+		kdbus_name_queue_item_free(q);
+	list_for_each_entry_safe(e, e_tmp, &names_list, conn_entry)
+		kdbus_name_entry_release(e, conn->ep->bus);
+
+	up_write(&reg->rwlock);
+	mutex_unlock(&conn->ep->bus->lock);
+
+	kdbus_conn_unref(activator);
+	kdbus_notify_flush(conn->ep->bus);
+}
+
+/**
+ * kdbus_name_lock() - look up a name in a name registry and lock it
+ * @reg:		The name registry
+ * @name:		The name to look up
+ *
+ * Search for a name in a given name registry and return it with the
+ * registry-lock held. If the object is not found, the lock is not acquired and
+ * NULL is returned. The caller is responsible for unlocking the name via
+ * kdbus_name_unlock() again. Note that kdbus_name_unlock() can be safely called
+ * with NULL as name. In this case, it's a no-op as nothing was locked.
+ *
+ * The *_lock() + *_unlock() logic is only required for callers that need to
+ * protect their code against concurrent activator/implementor name changes.
+ * Multiple readers can lock names concurrently. However, you may not change
+ * name-ownership while holding a name-lock.
+ *
+ * Return: NULL if name is unknown, otherwise return a pointer to the name
+ *         entry with the name-lock held (reader lock only).
+ */
+struct kdbus_name_entry *kdbus_name_lock(struct kdbus_name_registry *reg,
+					 const char *name)
+{
+	struct kdbus_name_entry *e = NULL;
+	u32 hash = kdbus_str_hash(name);
+
+	down_read(&reg->rwlock);
+	e = kdbus_name_lookup(reg, hash, name);
+	if (e)
+		return e;
+	up_read(&reg->rwlock);
+
+	return NULL;
+}
+
+/**
+ * kdbus_name_unlock() - unlock one name in a name registry
+ * @reg:		The name registry
+ * @entry:		The locked name entry or NULL
+ *
+ * This is the unlock-counterpart of kdbus_name_lock(). It unlocks a name that
+ * was previously successfully locked. You can safely pass NULL as entry and
+ * this will become a no-op. Therefore, it's safe to always call this on the
+ * return-value of kdbus_name_lock().
+ *
+ * Return: This always returns NULL.
+ */
+struct kdbus_name_entry *kdbus_name_unlock(struct kdbus_name_registry *reg,
+					   struct kdbus_name_entry *entry)
+{
+	if (entry) {
+		BUG_ON(!rwsem_is_locked(&reg->rwlock));
+		up_read(&reg->rwlock);
+	}
+
+	return NULL;
+}
+
+static int kdbus_name_queue_conn(struct kdbus_conn *conn, u64 flags,
+				 struct kdbus_name_entry *e)
+{
+	struct kdbus_name_queue_item *q;
+
+	q = kzalloc(sizeof(*q), GFP_KERNEL);
+	if (!q)
+		return -ENOMEM;
+
+	q->conn = conn;
+	q->flags = flags;
+	q->entry = e;
+
+	list_add_tail(&q->entry_entry, &e->queue_list);
+	list_add_tail(&q->conn_entry, &conn->names_queue_list);
+
+	return 0;
+}
+
+/**
+ * kdbus_name_is_valid() - check if a name is valid
+ * @p:			The name to check
+ * @allow_wildcard:	Whether or not to allow a wildcard name
+ *
+ * A name is valid if all of the following criteria are met:
+ *
+ *  - The name has two or more elements separated by a period ('.') character.
+ *  - All elements must contain at least one character.
+ *  - Each element must only contain the ASCII characters "[A-Z][a-z][0-9]_-"
+ *    and must not begin with a digit.
+ *  - The name must not exceed KDBUS_NAME_MAX_LEN.
+ *  - If @allow_wildcard is true, the name may end in '.*'
+ */
+bool kdbus_name_is_valid(const char *p, bool allow_wildcard)
+{
+	bool dot, found_dot = false;
+	const char *q;
+
+	for (dot = true, q = p; *q; q++) {
+		if (*q == '.') {
+			if (dot)
+				return false;
+
+			found_dot = true;
+			dot = true;
+		} else {
+			bool good;
+
+			good = isalpha(*q) || (!dot && isdigit(*q)) ||
+				*q == '_' || *q == '-' ||
+				(allow_wildcard && dot &&
+					*q == '*' && *(q + 1) == '\0');
+
+			if (!good)
+				return false;
+
+			dot = false;
+		}
+	}
+
+	if (q - p > KDBUS_NAME_MAX_LEN)
+		return false;
+
+	if (dot)
+		return false;
+
+	if (!found_dot)
+		return false;
+
+	return true;
+}
+
+/**
+ * kdbus_name_acquire() - acquire a name
+ * @reg:		The name registry
+ * @conn:		The connection to pin this entry to
+ * @name:		The name to acquire
+ * @flags:		Acquisition flags (KDBUS_NAME_*)
+ *
+ * Callers must ensure that @conn is either a privileged bus user or has
+ * sufficient privileges in the policy-db to own the well-known name @name.
+ *
+ * Return: 0 on success, negative error number on failure.
+ */
+int kdbus_name_acquire(struct kdbus_name_registry *reg,
+		       struct kdbus_conn *conn,
+		       const char *name, u64 *flags)
+{
+	struct kdbus_name_entry *e = NULL;
+	u32 hash;
+	int ret = 0;
+
+	/* lock order: domain -> bus -> ep -> names -> conn */
+	mutex_lock(&conn->ep->bus->lock);
+	down_write(&reg->rwlock);
+
+	hash = kdbus_str_hash(name);
+	e = kdbus_name_lookup(reg, hash, name);
+	if (e) {
+		/* connection already owns that name */
+		if (e->conn == conn) {
+			ret = -EALREADY;
+			goto exit_unlock;
+		}
+
+		if (kdbus_conn_is_activator(conn)) {
+			/* An activator can only own a single name */
+			if (conn->activator_of) {
+				if (conn->activator_of == e)
+					ret = -EALREADY;
+				else
+					ret = -EINVAL;
+			} else if (!e->activator && !conn->activator_of) {
+				/*
+				 * Activator registers for name that is
+				 * already owned
+				 */
+				e->activator = kdbus_conn_ref(conn);
+				conn->activator_of = e;
+			}
+
+			goto exit_unlock;
+		}
+
+		/* take over the name of an activator connection */
+		if (e->flags & KDBUS_NAME_ACTIVATOR) {
+			/*
+			 * Take over the messages queued in the activator
+			 * connection, the activator itself never reads them.
+			 */
+			ret = kdbus_conn_move_messages(conn, e->activator, 0);
+			if (ret < 0)
+				goto exit_unlock;
+
+			ret = kdbus_name_replace_owner(e, conn, *flags);
+			goto exit_unlock;
+		}
+
+		/* take over the name if both parties agree */
+		if ((*flags & KDBUS_NAME_REPLACE_EXISTING) &&
+		    (e->flags & KDBUS_NAME_ALLOW_REPLACEMENT)) {
+			/*
+			 * Move name back to the queue, in case we take it away
+			 * from a connection which asked for queuing.
+			 */
+			if (e->flags & KDBUS_NAME_QUEUE) {
+				ret = kdbus_name_queue_conn(e->conn,
+							    e->flags, e);
+				if (ret < 0)
+					goto exit_unlock;
+			}
+
+			ret = kdbus_name_replace_owner(e, conn, *flags);
+			goto exit_unlock;
+		}
+
+		/* add it to the queue waiting for the name */
+		if (*flags & KDBUS_NAME_QUEUE) {
+			ret = kdbus_name_queue_conn(conn, *flags, e);
+
+			/* tell the caller that we queued it */
+			*flags |= KDBUS_NAME_IN_QUEUE;
+
+			goto exit_unlock;
+		}
+
+		/* the name is busy, return a failure */
+		ret = -EEXIST;
+		goto exit_unlock;
+	} else {
+		/* An activator can only own a single name */
+		if (kdbus_conn_is_activator(conn) &&
+		    conn->activator_of) {
+			ret = -EINVAL;
+			goto exit_unlock;
+		}
+	}
+
+	/* new name entry */
+	e = kzalloc(sizeof(*e), GFP_KERNEL);
+	if (!e) {
+		ret = -ENOMEM;
+		goto exit_unlock;
+	}
+
+	e->name = kstrdup(name, GFP_KERNEL);
+	if (!e->name) {
+		kfree(e);
+		ret = -ENOMEM;
+		goto exit_unlock;
+	}
+
+	if (kdbus_conn_is_activator(conn)) {
+		e->activator = kdbus_conn_ref(conn);
+		conn->activator_of = e;
+	}
+
+	e->flags = *flags;
+	INIT_LIST_HEAD(&e->queue_list);
+	e->name_id = ++reg->name_seq_last;
+
+	mutex_lock(&conn->lock);
+	if (!kdbus_conn_active(conn)) {
+		mutex_unlock(&conn->lock);
+		kfree(e);
+		ret = -ECONNRESET;
+		goto exit_unlock;
+	}
+	hash_add(reg->entries_hash, &e->hentry, hash);
+	kdbus_name_entry_set_owner(e, conn);
+	mutex_unlock(&conn->lock);
+
+	kdbus_notify_name_change(e->conn->ep->bus, KDBUS_ITEM_NAME_ADD,
+				 0, e->conn->id,
+				 0, e->flags, e->name);
+
+exit_unlock:
+	up_write(&reg->rwlock);
+	mutex_unlock(&conn->ep->bus->lock);
+	kdbus_notify_flush(conn->ep->bus);
+	return ret;
+}
+
+/**
+ * kdbus_cmd_name_acquire() - acquire a name from an ioctl command buffer
+ * @reg:		The name registry
+ * @conn:		The connection to pin this entry to
+ * @cmd:		The command as passed in by the ioctl
+ *
+ * Return: 0 on success, negative errno on failure.
+ */
+int kdbus_cmd_name_acquire(struct kdbus_name_registry *reg,
+			   struct kdbus_conn *conn,
+			   struct kdbus_cmd_name *cmd)
+{
+	const char *name;
+	int ret;
+
+	name = kdbus_items_get_str(cmd->items, KDBUS_ITEMS_SIZE(cmd, items),
+				   KDBUS_ITEM_NAME);
+	if (IS_ERR(name))
+		return -EINVAL;
+
+	if (!kdbus_name_is_valid(name, false))
+		return -EINVAL;
+
+	/*
+	 * Do atomic_inc_return here to reserve our slot, then decrement
+	 * it before returning.
+	 */
+	if (atomic_inc_return(&conn->name_count) > KDBUS_CONN_MAX_NAMES) {
+		ret = -E2BIG;
+		goto out_dec;
+	}
+
+	ret = kdbus_ep_policy_check_own_access(conn->ep, conn, name);
+	if (ret < 0)
+		goto out_dec;
+
+	ret = kdbus_name_acquire(reg, conn, name, &cmd->flags);
+	kdbus_notify_flush(conn->ep->bus);
+
+out_dec:
+	/* Decrement the previously allocated slot */
+	atomic_dec(&conn->name_count);
+	return ret;
+}
+
+/**
+ * kdbus_cmd_name_release() - release a name entry from an ioctl command buffer
+ * @reg:		The name registry
+ * @conn:		The connection that holds the name
+ * @cmd:		The command as passed in by the ioctl
+ *
+ * Return: 0 on success, negative errno on failure.
+ */
+int kdbus_cmd_name_release(struct kdbus_name_registry *reg,
+			   struct kdbus_conn *conn,
+			   const struct kdbus_cmd_name *cmd)
+{
+	int ret;
+	const char *name;
+
+	name = kdbus_items_get_str(cmd->items, KDBUS_ITEMS_SIZE(cmd, items),
+				   KDBUS_ITEM_NAME);
+	if (IS_ERR(name))
+		return -EINVAL;
+
+	if (!kdbus_name_is_valid(name, false))
+		return -EINVAL;
+
+	ret = kdbus_ep_policy_check_see_access(conn->ep, conn, name);
+	if (ret < 0)
+		return ret;
+
+	ret = kdbus_name_release(reg, conn, name);
+
+	kdbus_notify_flush(conn->ep->bus);
+	return ret;
+}
+
+static int kdbus_name_list_write(struct kdbus_conn *conn,
+				 struct kdbus_conn *c,
+				 struct kdbus_pool_slice *slice,
+				 size_t *pos,
+				 struct kdbus_name_entry *e,
+				 bool write)
+{
+	const size_t len = sizeof(struct kdbus_name_info);
+	size_t p = *pos;
+	size_t name_item_size = 0;
+
+	if (e) {
+		name_item_size = offsetof(struct kdbus_item, name.name) +
+				 KDBUS_ALIGN8(strlen(e->name) + 1);
+
+		if (kdbus_ep_policy_check_see_access_unlocked(conn->ep, conn,
+							      e->name) < 0)
+			return 0;
+	}
+
+	if (write) {
+		int ret;
+		struct kdbus_name_info info = {
+			.size = len,
+			.owner_id = c->id,
+			.conn_flags = c->flags,
+		};
+
+		info.size += name_item_size;
+
+		/* write record */
+		ret = kdbus_pool_slice_copy(slice, p, &info, len);
+		if (ret < 0)
+			return ret;
+		p += len;
+
+		/* append name */
+		if (e) {
+			/* fake the header of a kdbus_name item */
+			struct {
+				__u64 size;
+				__u64 type;
+				__u64 flags;
+			} h;
+			size_t nlen;
+
+			h.size = name_item_size;
+			h.type = KDBUS_ITEM_OWNED_NAME;
+			h.flags = e->flags;
+
+			ret = kdbus_pool_slice_copy(slice, p, &h, sizeof(h));
+			if (ret < 0)
+				return ret;
+
+			p += sizeof(h);
+
+			nlen = name_item_size - sizeof(h);
+			ret = kdbus_pool_slice_copy(slice, p, e->name, nlen);
+			if (ret < 0)
+				return ret;
+
+			p += nlen;
+		}
+	} else {
+		p += len + name_item_size;
+	}
+
+	*pos = p;
+	return 0;
+}
+
+static int kdbus_name_list_all(struct kdbus_conn *conn, u64 flags,
+			       struct kdbus_pool_slice *slice,
+			       size_t *pos, bool write)
+{
+	struct kdbus_conn *c;
+	size_t p = *pos;
+	int ret, i;
+
+	hash_for_each(conn->ep->bus->conn_hash, i, c, hentry) {
+		bool added = false;
+
+		/* skip activators */
+		if (!(flags & KDBUS_NAME_LIST_ACTIVATORS) &&
+		    kdbus_conn_is_activator(c))
+			continue;
+
+		/* all names the connection owns */
+		if (flags & (KDBUS_NAME_LIST_NAMES |
+			     KDBUS_NAME_LIST_ACTIVATORS)) {
+			struct kdbus_name_entry *e;
+
+			mutex_lock(&c->lock);
+			list_for_each_entry(e, &c->names_list, conn_entry) {
+				struct kdbus_conn *a = e->activator;
+
+				if ((flags & KDBUS_NAME_LIST_ACTIVATORS) &&
+				    a && a != c) {
+					ret = kdbus_name_list_write(conn, a,
+							slice, &p, e, write);
+					if (ret < 0) {
+						mutex_unlock(&c->lock);
+						return ret;
+					}
+
+					added = true;
+				}
+
+				if (flags & KDBUS_NAME_LIST_NAMES ||
+				    kdbus_conn_is_activator(c)) {
+					ret = kdbus_name_list_write(conn, c,
+							slice, &p, e, write);
+					if (ret < 0) {
+						mutex_unlock(&c->lock);
+						return ret;
+					}
+
+					added = true;
+				}
+			}
+			mutex_unlock(&c->lock);
+		}
+
+		/* queue of names the connection is currently waiting for */
+		if (flags & KDBUS_NAME_LIST_QUEUED) {
+			struct kdbus_name_queue_item *q;
+
+			mutex_lock(&c->lock);
+			list_for_each_entry(q, &c->names_queue_list,
+					    conn_entry) {
+				ret = kdbus_name_list_write(conn, c,
+						slice, &p, q->entry, write);
+				if (ret < 0) {
+					mutex_unlock(&c->lock);
+					return ret;
+				}
+
+				added = true;
+			}
+			mutex_unlock(&c->lock);
+		}
+
+		/* nothing added so far, just add the unique ID */
+		if (!added && flags & KDBUS_NAME_LIST_UNIQUE) {
+			ret = kdbus_name_list_write(conn, c,
+					slice, &p, NULL, write);
+			if (ret < 0)
+				return ret;
+		}
+	}
+
+	*pos = p;
+	return 0;
+}
+
+/**
+ * kdbus_cmd_name_list() - list names of a connection
+ * @reg:		The name registry
+ * @conn:		The connection holding the name entries
+ * @cmd:		The command as passed in by the ioctl
+ *
+ * Return: 0 on success, negative errno on failure.
+ */
+int kdbus_cmd_name_list(struct kdbus_name_registry *reg,
+			struct kdbus_conn *conn,
+			struct kdbus_cmd_name_list *cmd)
+{
+	struct kdbus_policy_db *policy_db;
+	struct kdbus_name_list list = {};
+	struct kdbus_pool_slice *slice;
+	size_t pos;
+	int ret;
+
+	policy_db = &conn->ep->policy_db;
+
+	/* lock order: domain -> bus -> ep -> names -> conn */
+	down_read(&reg->rwlock);
+	down_read(&conn->ep->bus->conn_rwlock);
+	down_read(&policy_db->entries_rwlock);
+
+	/* size of header + records */
+	pos = sizeof(struct kdbus_name_list);
+	ret = kdbus_name_list_all(conn, cmd->flags, NULL, &pos, false);
+	if (ret < 0)
+		goto exit_unlock;
+
+	slice = kdbus_pool_slice_alloc(conn->pool, pos);
+	if (IS_ERR(slice)) {
+		ret = PTR_ERR(slice);
+		goto exit_unlock;
+	}
+
+	/* copy the header, specifying the overall size */
+	list.size = pos;
+	ret = kdbus_pool_slice_copy(slice, 0, &list, sizeof(list));
+	if (ret < 0)
+		goto exit_pool_free;
+
+	/* copy the records */
+	pos = sizeof(struct kdbus_name_list);
+	ret = kdbus_name_list_all(conn, cmd->flags, slice, &pos, true);
+	if (ret < 0)
+		goto exit_pool_free;
+
+	cmd->offset = kdbus_pool_slice_offset(slice);
+	kdbus_pool_slice_flush(slice);
+	kdbus_pool_slice_make_public(slice);
+
+exit_pool_free:
+	if (ret < 0)
+		kdbus_pool_slice_free(slice);
+exit_unlock:
+	up_read(&policy_db->entries_rwlock);
+	up_read(&conn->ep->bus->conn_rwlock);
+	up_read(&reg->rwlock);
+	return ret;
+}
diff --git a/ipc/kdbus/names.h b/ipc/kdbus/names.h
new file mode 100644
index 000000000000..a4d732a38782
--- /dev/null
+++ b/ipc/kdbus/names.h
@@ -0,0 +1,81 @@
+/*
+ * Copyright (C) 2013-2014 Kay Sievers
+ * Copyright (C) 2013-2014 Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+ * Copyright (C) 2013-2014 Daniel Mack <daniel@zonque.org>
+ * Copyright (C) 2013-2014 David Herrmann <dh.herrmann@gmail.com>
+ * Copyright (C) 2013-2014 Linux Foundation
+ *
+ * kdbus is free software; you can redistribute it and/or modify it under
+ * the terms of the GNU Lesser General Public License as published by the
+ * Free Software Foundation; either version 2.1 of the License, or (at
+ * your option) any later version.
+ */
+
+#ifndef __KDBUS_NAMES_H
+#define __KDBUS_NAMES_H
+
+#include <linux/hashtable.h>
+#include <linux/rwsem.h>
+
+/**
+ * struct kdbus_name_registry - names registered for a bus
+ * @entries_hash:	Map of entries
+ * @rwlock:		Registry data lock
+ * @name_seq_last:	Last used sequence number to assign to a name entry
+ */
+struct kdbus_name_registry {
+	DECLARE_HASHTABLE(entries_hash, 8);
+	struct rw_semaphore rwlock;
+	u64 name_seq_last;
+};
+
+/**
+ * struct kdbus_name_entry - well-known name entry
+ * @name:		The well-known name
+ * @name_id:		Sequence number of name entry to be able to uniquely
+ *			identify a name over its registration lifetime
+ * @flags:		KDBUS_NAME_* flags
+ * @queue_list:		List of queued waiters for the well-known name
+ * @conn_entry:		Entry in connection
+ * @hentry:		Entry in registry map
+ * @conn:		Connection owning the name
+ * @activator:		Connection of the activator queuing incoming messages
+ */
+struct kdbus_name_entry {
+	char *name;
+	u64 name_id;
+	u64 flags;
+	struct list_head queue_list;
+	struct list_head conn_entry;
+	struct hlist_node hentry;
+	struct kdbus_conn *conn;
+	struct kdbus_conn *activator;
+};
+
+struct kdbus_name_registry *kdbus_name_registry_new(void);
+void kdbus_name_registry_free(struct kdbus_name_registry *reg);
+
+int kdbus_name_acquire(struct kdbus_name_registry *reg,
+		       struct kdbus_conn *conn,
+		       const char *name, u64 *flags);
+int kdbus_cmd_name_acquire(struct kdbus_name_registry *reg,
+			   struct kdbus_conn *conn,
+			   struct kdbus_cmd_name *cmd);
+int kdbus_cmd_name_release(struct kdbus_name_registry *reg,
+			   struct kdbus_conn *conn,
+			   const struct kdbus_cmd_name *cmd);
+int kdbus_cmd_name_list(struct kdbus_name_registry *reg,
+			struct kdbus_conn *conn,
+			struct kdbus_cmd_name_list *cmd);
+
+struct kdbus_name_entry *kdbus_name_lock(struct kdbus_name_registry *reg,
+					 const char *name);
+struct kdbus_name_entry *kdbus_name_unlock(struct kdbus_name_registry *reg,
+					   struct kdbus_name_entry *entry);
+
+void kdbus_name_remove_by_conn(struct kdbus_name_registry *reg,
+			       struct kdbus_conn *conn);
+
+bool kdbus_name_is_valid(const char *p, bool allow_wildcard);
+
+#endif
-- 
2.1.3



* kdbus: add policy database implementation
  2014-11-21  5:02 ` Greg Kroah-Hartman
                   ` (10 preceding siblings ...)
  (?)
@ 2014-11-21  5:02 ` Greg Kroah-Hartman
  -1 siblings, 0 replies; 73+ messages in thread
From: Greg Kroah-Hartman @ 2014-11-21  5:02 UTC (permalink / raw)
  To: arnd, ebiederm, gnomes, teg, jkosina, luto, linux-api, linux-kernel
  Cc: daniel, dh.herrmann, tixxdz, Greg Kroah-Hartman

From: Daniel Mack <daniel@zonque.org>

This patch adds the policy database implementation.

A policy database restricts the ability of connections to own, see
and talk to well-known names. It can be associated with a bus
(through a policy holder connection) or a custom endpoint.

By default, buses have an empty policy database that is augmented on
demand when a policy holder connection is instantiated.

Policies are set through KDBUS_CMD_HELLO (when creating a policy
holder connection), KDBUS_CMD_CONN_UPDATE (when updating a policy
holder connection), KDBUS_CMD_EP_MAKE (creating a custom endpoint)
or KDBUS_CMD_EP_UPDATE (updating a custom endpoint). In all cases,
the name and policy access information is stored in items of type
KDBUS_ITEM_NAME and KDBUS_ITEM_POLICY_ACCESS.

See Documentation/kdbus.txt for more details.

Signed-off-by: Daniel Mack <daniel@zonque.org>
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Signed-off-by: Djalal Harouni <tixxdz@opendz.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
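The lookup order in kdbus_policy_lookup() and the level comparison in
kdbus_policy_check_access() from the diff below can be illustrated in
userspace. This is a rough sketch under stated assumptions, not the
kernel implementation: all sketch_* names are hypothetical, the numeric
access levels are assumed (only their ordering SEE < TALK < OWN is taken
from the "a->access >= access" comparison), a linear table replaces the
hash table, and only a world-wide grant is modelled, without the uid and
gid matching.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumed levels; only the ordering SEE < TALK < OWN matters here. */
enum sketch_access { SKETCH_SEE = 1, SKETCH_TALK, SKETCH_OWN };

struct sketch_entry {
	const char *name;	/* wildcard entries are stored without ".*" */
	bool wildcard;
	enum sketch_access world_access;	/* highest level granted to all */
};

/* Hypothetical example database. */
static const struct sketch_entry sketch_db[] = {
	{ "org.example.Service", false, SKETCH_OWN },
	{ "org.example",         true,  SKETCH_TALK },
};

static const struct sketch_entry *
sketch_lookup(const struct sketch_entry *tbl, size_t n,
	      const char *name, bool wildcard)
{
	const char *dot;
	size_t i, prefix_len;

	/* an exact, non-wildcard entry always wins */
	for (i = 0; i < n; i++)
		if (!tbl[i].wildcard && strcmp(tbl[i].name, name) == 0)
			return &tbl[i];

	if (!wildcard)
		return NULL;

	/* strip the last element once and retry against wildcard entries */
	dot = strrchr(name, '.');
	if (!dot)
		return NULL;
	prefix_len = (size_t)(dot - name);

	for (i = 0; i < n; i++)
		if (tbl[i].wildcard &&
		    strlen(tbl[i].name) == prefix_len &&
		    strncmp(tbl[i].name, name, prefix_len) == 0)
			return &tbl[i];

	return NULL;
}

/* A grant at a higher level implies all lower levels. */
static bool sketch_check(const struct sketch_entry *e, enum sketch_access want)
{
	return e && e->world_access >= want;
}
```

In this sketch, "org.example.Service" resolves to the exact entry and
may be owned, "org.example.Other" falls back to the wildcard entry and
may only be talked to, and "org.example.a.b" matches nothing because
only one trailing element is stripped.
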
 ipc/kdbus/policy.c | 629 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 ipc/kdbus/policy.h |  61 ++++++
 2 files changed, 690 insertions(+)
 create mode 100644 ipc/kdbus/policy.c
 create mode 100644 ipc/kdbus/policy.h

diff --git a/ipc/kdbus/policy.c b/ipc/kdbus/policy.c
new file mode 100644
index 000000000000..53652dfa2973
--- /dev/null
+++ b/ipc/kdbus/policy.c
@@ -0,0 +1,629 @@
+/*
+ * Copyright (C) 2013-2014 Kay Sievers
+ * Copyright (C) 2013-2014 Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+ * Copyright (C) 2013-2014 Daniel Mack <daniel@zonque.org>
+ * Copyright (C) 2013-2014 David Herrmann <dh.herrmann@gmail.com>
+ * Copyright (C) 2013-2014 Linux Foundation
+ * Copyright (C) 2014 Djalal Harouni
+ *
+ * kdbus is free software; you can redistribute it and/or modify it under
+ * the terms of the GNU Lesser General Public License as published by the
+ * Free Software Foundation; either version 2.1 of the License, or (at
+ * your option) any later version.
+ */
+
+#include <linux/fs.h>
+#include <linux/init.h>
+#include <linux/mutex.h>
+#include <linux/sched.h>
+#include <linux/sizes.h>
+#include <linux/slab.h>
+#include <linux/uaccess.h>
+
+#include "bus.h"
+#include "connection.h"
+#include "domain.h"
+#include "item.h"
+#include "names.h"
+#include "policy.h"
+
+#define KDBUS_POLICY_HASH_SIZE	64
+
+/**
+ * struct kdbus_policy_db_cache_entry - a cached entry
+ * @conn_a:		Connection A
+ * @conn_b:		Connection B
+ * @owner:		Owner of policy-entry that produced this cache-entry
+ * @hentry:		The hash table entry for the database's entries_hash
+ */
+struct kdbus_policy_db_cache_entry {
+	struct kdbus_conn *conn_a;
+	struct kdbus_conn *conn_b;
+	const void *owner;
+	struct hlist_node hentry;
+};
+
+/**
+ * struct kdbus_policy_db_entry_access - a database entry access item
+ * @type:		One of KDBUS_POLICY_ACCESS_* types
+ * @access:		Access to grant. One of KDBUS_POLICY_*
+ * @uid:		For KDBUS_POLICY_ACCESS_USER, the global uid
+ * @gid:		For KDBUS_POLICY_ACCESS_GROUP, the global gid
+ * @list:		List entry item for the entry's list
+ *
+ * This is the internal version of struct kdbus_policy_db_access.
+ */
+struct kdbus_policy_db_entry_access {
+	u8 type;		/* USER, GROUP, WORLD */
+	u8 access;		/* OWN, TALK, SEE */
+	union {
+		kuid_t uid;	/* global uid */
+		kgid_t gid;	/* global gid */
+	};
+	struct list_head list;
+};
+
+/**
+ * struct kdbus_policy_db_entry - a policy database entry
+ * @name:		The name to match the policy entry against
+ * @hentry:		The hash entry for the database's entries_hash
+ * @access_list:	List head for keeping track of the entry's
+ *			access items.
+ * @owner:		The owner of this entry. Can be a kdbus_conn or
+ *			a kdbus_ep object.
+ * @wildcard:		The name is a wildcard, i.e. it ends in '.*'
+ */
+struct kdbus_policy_db_entry {
+	char *name;
+	struct hlist_node hentry;
+	struct list_head access_list;
+	const void *owner;
+	bool wildcard:1;
+};
+
+static void kdbus_policy_entry_free(struct kdbus_policy_db_entry *e)
+{
+	struct kdbus_policy_db_entry_access *a, *tmp;
+
+	list_for_each_entry_safe(a, tmp, &e->access_list, list) {
+		list_del(&a->list);
+		kfree(a);
+	}
+
+	kfree(e->name);
+	kfree(e);
+}
+
+static const struct kdbus_policy_db_entry *
+kdbus_policy_lookup(struct kdbus_policy_db *db,
+		    const char *name, u32 hash, bool wildcard)
+{
+	struct kdbus_policy_db_entry *e, *found = NULL;
+
+	hash_for_each_possible(db->entries_hash, e, hentry, hash)
+		if (strcmp(e->name, name) == 0 && !e->wildcard)
+			return e;
+
+	if (wildcard) {
+		const char *tmp;
+		char *dot;
+
+		tmp = kstrdup(name, GFP_KERNEL);
+		if (!tmp)
+			return NULL;
+
+		dot = strrchr(tmp, '.');
+		if (!dot)
+			goto exit_free;
+
+		*dot = '\0';
+		hash = kdbus_str_hash(tmp);
+
+		hash_for_each_possible(db->entries_hash, e, hentry, hash)
+			if (strcmp(e->name, tmp) == 0 && e->wildcard) {
+				found = e;
+				/* never "break;" in hash_for_each() */
+				goto exit_free;
+			}
+
+exit_free:
+		kfree(tmp);
+	}
+
+	return found;
+}
+
+/**
+ * kdbus_policy_db_clear() - release all memory from a policy db
+ * @db:		The policy database
+ */
+void kdbus_policy_db_clear(struct kdbus_policy_db *db)
+{
+	struct kdbus_policy_db_cache_entry *ce;
+	struct kdbus_policy_db_entry *e;
+	struct hlist_node *tmp;
+	unsigned int i;
+
+	BUG_ON(!db);
+
+	/* purge entries */
+	down_write(&db->entries_rwlock);
+	hash_for_each_safe(db->entries_hash, i, tmp, e, hentry) {
+		hash_del(&e->hentry);
+		kdbus_policy_entry_free(e);
+	}
+	up_write(&db->entries_rwlock);
+
+	/* purge cache */
+	mutex_lock(&db->cache_lock);
+	hash_for_each_safe(db->talk_access_hash, i, tmp, ce, hentry) {
+		hash_del(&ce->hentry);
+		kfree(ce);
+	}
+	mutex_unlock(&db->cache_lock);
+}
+
+/**
+ * kdbus_policy_db_init() - initialize a new policy database
+ * @db:		The location of the database
+ *
+ * This initializes a new policy-db. The underlying memory must have been
+ * cleared to zero by the caller.
+ */
+void kdbus_policy_db_init(struct kdbus_policy_db *db)
+{
+	hash_init(db->entries_hash);
+	hash_init(db->talk_access_hash);
+	init_rwsem(&db->entries_rwlock);
+	mutex_init(&db->cache_lock);
+}
+
+static int kdbus_policy_check_access(const struct kdbus_policy_db_entry *e,
+				     const struct cred *cred,
+				     unsigned int access)
+{
+	struct kdbus_policy_db_entry_access *a;
+	struct group_info *group_info;
+	int i;
+
+	if (!e)
+		return -EPERM;
+
+	group_info = cred->group_info;
+
+	list_for_each_entry(a, &e->access_list, list) {
+		if (a->access >= access) {
+			switch (a->type) {
+			case KDBUS_POLICY_ACCESS_USER:
+				if (uid_eq(cred->uid, a->uid))
+					return 0;
+				break;
+			case KDBUS_POLICY_ACCESS_GROUP:
+				if (gid_eq(cred->gid, a->gid))
+					return 0;
+
+				for (i = 0; i < group_info->ngroups; i++) {
+					kgid_t gid = GROUP_AT(group_info, i);
+
+					if (gid_eq(gid, a->gid))
+						return 0;
+				}
+
+				break;
+			case KDBUS_POLICY_ACCESS_WORLD:
+				return 0;
+			}
+		}
+	}
+
+	return -EPERM;
+}
+
+/**
+ * kdbus_policy_check_own_access() - check whether a connection is allowed
+ *				     to own a name
+ * @db:		The policy database
+ * @cred:	The creds to check against
+ * @name:	The name to check
+ *
+ * Return: 0 if the connection is allowed to own the name, -EPERM otherwise
+ */
+int kdbus_policy_check_own_access(struct kdbus_policy_db *db,
+				  const struct cred *cred,
+				  const char *name)
+{
+	const struct kdbus_policy_db_entry *e;
+	int ret;
+
+	down_read(&db->entries_rwlock);
+	e = kdbus_policy_lookup(db, name, kdbus_str_hash(name), true);
+	ret = kdbus_policy_check_access(e, cred, KDBUS_POLICY_OWN);
+	up_read(&db->entries_rwlock);
+
+	return ret;
+}
+
+/**
+ * kdbus_policy_check_talk_access() - check if one connection is allowed
+ *				       to send a message to another connection
+ * @db:			The policy database
+ * @conn_src:		The source connection
+ * @conn_dst:		The destination connection
+ *
+ * Return: 0 if access is granted, -EPERM if not, negative errno on failure
+ */
+int kdbus_policy_check_talk_access(struct kdbus_policy_db *db,
+				   struct kdbus_conn *conn_src,
+				   struct kdbus_conn *conn_dst)
+{
+	struct kdbus_policy_db_cache_entry *ce;
+	struct kdbus_name_entry *name_entry;
+	unsigned int hash = 0;
+	const void *owner;
+	int ret;
+
+	/*
+	 * If there was a positive match for these two connections before,
+	 * there's an entry in the hash table for them.
+	 */
+	hash ^= hash_ptr(conn_src, KDBUS_POLICY_HASH_SIZE);
+	hash ^= hash_ptr(conn_dst, KDBUS_POLICY_HASH_SIZE);
+
+	mutex_lock(&db->cache_lock);
+	hash_for_each_possible(db->talk_access_hash, ce, hentry, hash)
+		if (ce->conn_a == conn_src && ce->conn_b == conn_dst) {
+			mutex_unlock(&db->cache_lock);
+			return 0;
+		}
+	mutex_unlock(&db->cache_lock);
+
+	/*
+	 * Otherwise, walk the names owned by the destination connection and
+	 * store a hash-table entry if send access is granted.
+	 */
+
+	down_read(&db->entries_rwlock);
+
+	ret = -EPERM;
+	mutex_lock(&conn_dst->lock);
+	list_for_each_entry(name_entry, &conn_dst->names_list, conn_entry) {
+		u32 hash = kdbus_str_hash(name_entry->name);
+		const struct kdbus_policy_db_entry *e;
+
+		e = kdbus_policy_lookup(db, name_entry->name, hash, true);
+		if (kdbus_policy_check_access(e, conn_src->cred,
+					      KDBUS_POLICY_TALK) == 0) {
+			owner = e->owner;
+			ret = 0;
+			break;
+		}
+	}
+	mutex_unlock(&conn_dst->lock);
+
+	if (ret >= 0) {
+		ret = -ENOMEM;
+		ce = kmalloc(sizeof(*ce), GFP_KERNEL);
+		if (ce) {
+			ce->conn_a = conn_src;
+			ce->conn_b = conn_dst;
+			ce->owner = owner;
+			INIT_HLIST_NODE(&ce->hentry);
+
+			mutex_lock(&db->cache_lock);
+			hash_add(db->talk_access_hash, &ce->hentry, hash);
+			mutex_unlock(&db->cache_lock);
+
+			ret = 0;
+		}
+	}
+
+	up_read(&db->entries_rwlock);
+
+	return ret;
+}
+
+/**
+ * kdbus_policy_check_see_access_unlocked() - Check whether a connection is
+ *					      allowed to see a given name
+ * @db:		The policy database
+ * @cred:	The cred to check against
+ * @name:	The name
+ *
+ * Return: 0 if permission to see the name is granted, -EPERM otherwise
+ */
+int kdbus_policy_check_see_access_unlocked(struct kdbus_policy_db *db,
+					   const struct cred *cred,
+					   const char *name)
+{
+	const struct kdbus_policy_db_entry *e;
+
+	e = kdbus_policy_lookup(db, name, kdbus_str_hash(name), true);
+	return kdbus_policy_check_access(e, cred, KDBUS_POLICY_SEE);
+}
+
+static void __kdbus_policy_remove_owner_cache(struct kdbus_policy_db *db,
+					      const void *owner)
+{
+	struct kdbus_policy_db_cache_entry *ce;
+	struct hlist_node *tmp;
+	int i;
+
+	mutex_lock(&db->cache_lock);
+	hash_for_each_safe(db->talk_access_hash, i, tmp, ce, hentry)
+		if (ce->owner == owner) {
+			hash_del(&ce->hentry);
+			kfree(ce);
+		}
+	mutex_unlock(&db->cache_lock);
+}
+
+static void __kdbus_policy_remove_owner(struct kdbus_policy_db *db,
+					const void *owner)
+{
+	struct kdbus_policy_db_entry *e;
+	struct hlist_node *tmp;
+	int i;
+
+	hash_for_each_safe(db->entries_hash, i, tmp, e, hentry)
+		if (e->owner == owner) {
+			hash_del(&e->hentry);
+			kdbus_policy_entry_free(e);
+		}
+}
+
+/**
+ * kdbus_policy_remove_owner() - remove all entries related to a connection
+ * @db:		The policy database
+ * @owner:	The owner whose entries should be removed
+ */
+void kdbus_policy_remove_owner(struct kdbus_policy_db *db,
+			       const void *owner)
+{
+	down_write(&db->entries_rwlock);
+	__kdbus_policy_remove_owner(db, owner);
+	__kdbus_policy_remove_owner_cache(db, owner);
+	up_write(&db->entries_rwlock);
+}
+
+/**
+ * kdbus_policy_purge_cache() - remove all cached entries related to
+ *				a connection
+ * @db:		The policy database
+ * @conn:	The connection whose cached entries should be removed
+ */
+void kdbus_policy_purge_cache(struct kdbus_policy_db *db,
+			      const struct kdbus_conn *conn)
+{
+	struct kdbus_policy_db_cache_entry *ce;
+	struct hlist_node *tmp;
+	int i;
+
+	mutex_lock(&db->cache_lock);
+	hash_for_each_safe(db->talk_access_hash, i, tmp, ce, hentry)
+		if (ce->conn_a == conn || ce->conn_b == conn) {
+			hash_del(&ce->hentry);
+			kfree(ce);
+		}
+	mutex_unlock(&db->cache_lock);
+}
+
+/*
+ * Convert user provided policy access to internal kdbus policy
+ * access
+ */
+static struct kdbus_policy_db_entry_access *
+kdbus_policy_make_access(const struct kdbus_policy_access *uaccess)
+{
+	int ret;
+	struct kdbus_policy_db_entry_access *a;
+
+	a = kzalloc(sizeof(*a), GFP_KERNEL);
+	if (!a)
+		return ERR_PTR(-ENOMEM);
+
+	ret = -EINVAL;
+	switch (uaccess->access) {
+	case KDBUS_POLICY_SEE:
+	case KDBUS_POLICY_TALK:
+	case KDBUS_POLICY_OWN:
+		a->access = uaccess->access;
+		break;
+	default:
+		goto err;
+	}
+
+	switch (uaccess->type) {
+	case KDBUS_POLICY_ACCESS_USER:
+		a->uid = make_kuid(current_user_ns(), uaccess->id);
+		if (!uid_valid(a->uid))
+			goto err;
+
+		break;
+	case KDBUS_POLICY_ACCESS_GROUP:
+		a->gid = make_kgid(current_user_ns(), uaccess->id);
+		if (!gid_valid(a->gid))
+			goto err;
+
+		break;
+	case KDBUS_POLICY_ACCESS_WORLD:
+		a->uid = current_uid();
+		break;
+	default:
+		goto err;
+	}
+
+	a->type = uaccess->type;
+
+	return a;
+
+err:
+	kfree(a);
+	return ERR_PTR(ret);
+}
+
+/**
+ * kdbus_policy_set() - set a connection's policy rules
+ * @db:				The policy database
+ * @items:			A list of kdbus_item elements that contain both
+ *				names and access rules to set.
+ * @items_size:			The total size of the items.
+ * @max_policies:		The maximum number of policy entries to allow.
+ *				Pass 0 for no limit.
+ * @allow_wildcards:		Boolean value whether wildcard entries (such
+ *				as those ending in '.*') should be allowed.
+ * @owner:			The owner of the new policy items.
+ *
+ * This function sets a new set of policies for a given owner. The names and
+ * access rules are gathered by walking the list of items passed in as
+ * argument. An item of type KDBUS_ITEM_NAME is expected before any number of
+ * KDBUS_ITEM_POLICY_ACCESS items. If there are more repetitions of this
+ * pattern than denoted in @max_policies, -E2BIG is returned.
+ *
+ * In order to allow atomic replacement of rules, the function first removes
+ * all entries that have been created for the given owner previously.
+ *
+ * Return: 0 on success, negative errno on failure.
+ */
+int kdbus_policy_set(struct kdbus_policy_db *db,
+		     const struct kdbus_item *items,
+		     size_t items_size,
+		     size_t max_policies,
+		     bool allow_wildcards,
+		     const void *owner)
+{
+	struct kdbus_policy_db_entry_access *a;
+	struct kdbus_policy_db_entry *e, *p;
+	const struct kdbus_item *item;
+	struct hlist_node *tmp;
+	HLIST_HEAD(entries);
+	HLIST_HEAD(restore);
+	size_t count = 0;
+	int i, ret = 0;
+	u32 hash;
+
+	if (items_size > KDBUS_POLICY_MAX_SIZE)
+		return -E2BIG;
+
+	/* Walk the list of items and look for new policies */
+	e = NULL;
+	KDBUS_ITEMS_FOREACH(item, items, items_size) {
+		switch (item->type) {
+		case KDBUS_ITEM_NAME: {
+			size_t len;
+
+			if (max_policies && ++count > max_policies) {
+				ret = -E2BIG;
+				goto exit;
+			}
+
+			if (!kdbus_name_is_valid(item->str, true)) {
+				ret = -EINVAL;
+				goto exit;
+			}
+
+			e = kzalloc(sizeof(*e), GFP_KERNEL);
+			if (!e) {
+				ret = -ENOMEM;
+				goto exit;
+			}
+
+			INIT_LIST_HEAD(&e->access_list);
+			e->owner = owner;
+			hlist_add_head(&e->hentry, &entries);
+
+			e->name = kstrdup(item->str, GFP_KERNEL);
+			if (!e->name) {
+				ret = -ENOMEM;
+				goto exit;
+			}
+
+			/*
+			 * If a supplied name ends with an '.*', cut off that
+			 * part, only store anything before it, and mark the
+			 * entry as wildcard.
+			 */
+			len = strlen(e->name);
+			if (len > 2 &&
+			    e->name[len - 2] == '.' &&
+			    e->name[len - 1] == '*') {
+				if (!allow_wildcards) {
+					ret = -EINVAL;
+					goto exit;
+				}
+
+				e->name[len - 2] = '\0';
+				e->wildcard = true;
+			}
+
+			break;
+		}
+
+		case KDBUS_ITEM_POLICY_ACCESS:
+			if (!e) {
+				ret = -EINVAL;
+				goto exit;
+			}
+
+			a = kdbus_policy_make_access(&item->policy_access);
+			if (IS_ERR(a)) {
+				ret = PTR_ERR(a);
+				goto exit;
+			}
+
+			list_add_tail(&a->list, &e->access_list);
+			break;
+		}
+	}
+
+	down_write(&db->entries_rwlock);
+
+	/* remember previous entries to restore in case of failure */
+	hash_for_each_safe(db->entries_hash, i, tmp, e, hentry)
+		if (e->owner == owner) {
+			hash_del(&e->hentry);
+			hlist_add_head(&e->hentry, &restore);
+		}
+
+	hlist_for_each_entry_safe(e, tmp, &entries, hentry) {
+		/* prevent duplicates */
+		hash = kdbus_str_hash(e->name);
+		hash_for_each_possible(db->entries_hash, p, hentry, hash)
+			if (strcmp(e->name, p->name) == 0 &&
+			    e->wildcard == p->wildcard) {
+				ret = -EEXIST;
+				goto restore;
+			}
+
+		hlist_del(&e->hentry);
+		hash_add(db->entries_hash, &e->hentry, hash);
+	}
+
+	/* purge all cache-entries produced by previous rules */
+	__kdbus_policy_remove_owner_cache(db, owner);
+
+restore:
+	/* if we failed, flush all entries we added so far, but keep cache */
+	if (ret < 0)
+		__kdbus_policy_remove_owner(db, owner);
+
+	/* if we failed, restore entries, otherwise release them */
+	hlist_for_each_entry_safe(e, tmp, &restore, hentry) {
+		hlist_del(&e->hentry);
+		if (ret < 0) {
+			hash = kdbus_str_hash(e->name);
+			hash_add(db->entries_hash, &e->hentry, hash);
+		} else {
+			kdbus_policy_entry_free(e);
+		}
+	}
+
+	up_write(&db->entries_rwlock);
+
+exit:
+	hlist_for_each_entry_safe(e, tmp, &entries, hentry) {
+		hlist_del(&e->hentry);
+		kdbus_policy_entry_free(e);
+	}
+
+	return ret;
+}
diff --git a/ipc/kdbus/policy.h b/ipc/kdbus/policy.h
new file mode 100644
index 000000000000..6eb7d77b54b0
--- /dev/null
+++ b/ipc/kdbus/policy.h
@@ -0,0 +1,61 @@
+/*
+ * Copyright (C) 2013-2014 Kay Sievers
+ * Copyright (C) 2013-2014 Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+ * Copyright (C) 2013-2014 Daniel Mack <daniel@zonque.org>
+ * Copyright (C) 2013-2014 David Herrmann <dh.herrmann@gmail.com>
+ * Copyright (C) 2013-2014 Linux Foundation
+ *
+ * kdbus is free software; you can redistribute it and/or modify it under
+ * the terms of the GNU Lesser General Public License as published by the
+ * Free Software Foundation; either version 2.1 of the License, or (at
+ * your option) any later version.
+ */
+
+#ifndef __KDBUS_POLICY_H
+#define __KDBUS_POLICY_H
+
+#include <linux/hashtable.h>
+#include <linux/mutex.h>
+#include <linux/rwsem.h>
+
+struct kdbus_conn;
+struct kdbus_item;
+
+/**
+ * struct kdbus_policy_db - policy database
+ * @entries_hash:	Hashtable of entries
+ * @talk_access_hash:	Hashtable of send access elements
+ * @entries_rwlock:	Read-write semaphore to protect the database's entries
+ * @cache_lock:		Mutex to protect the database's cache
+ */
+struct kdbus_policy_db {
+	DECLARE_HASHTABLE(entries_hash, 6);
+	DECLARE_HASHTABLE(talk_access_hash, 6);
+	struct rw_semaphore entries_rwlock;
+	struct mutex cache_lock;
+};
+
+void kdbus_policy_db_init(struct kdbus_policy_db *db);
+void kdbus_policy_db_clear(struct kdbus_policy_db *db);
+
+int kdbus_policy_check_see_access_unlocked(struct kdbus_policy_db *db,
+					   const struct cred *cred,
+					   const char *name);
+int kdbus_policy_check_talk_access(struct kdbus_policy_db *db,
+				   struct kdbus_conn *conn_src,
+				   struct kdbus_conn *conn_dst);
+int kdbus_policy_check_own_access(struct kdbus_policy_db *db,
+				  const struct cred *cred,
+				  const char *name);
+void kdbus_policy_purge_cache(struct kdbus_policy_db *db,
+			      const struct kdbus_conn *conn);
+void kdbus_policy_remove_owner(struct kdbus_policy_db *db,
+			       const void *owner);
+int kdbus_policy_set(struct kdbus_policy_db *db,
+		     const struct kdbus_item *items,
+		     size_t items_size,
+		     size_t max_policies,
+		     bool allow_wildcards,
+		     const void *owner);
+
+#endif
-- 
2.1.3
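
For readers following the wildcard handling in kdbus_policy_lookup(),
here is a minimal userspace sketch of the same idea. It is not the
kernel code: it assumes a flat array in place of the hash table, and
struct entry and lookup() are hypothetical names made up for this
illustration. Exact entries are tried first; only if none matches is
the last dot-separated component stripped and a wildcard entry for the
remaining parent name considered.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a policy entry; not the kernel struct. */
struct entry {
	const char *name;	/* stored without the trailing ".*" */
	bool wildcard;
};

/* Exact entries win; otherwise try a wildcard entry for the parent name. */
static const struct entry *lookup(const struct entry *tbl, size_t n,
				  const char *name)
{
	const char *dot;
	size_t i, plen;

	/* Pass 1: exact, non-wildcard match on the full name. */
	for (i = 0; i < n; i++)
		if (!tbl[i].wildcard && strcmp(tbl[i].name, name) == 0)
			return &tbl[i];

	/* Pass 2: strip the last component and look for a wildcard entry. */
	dot = strrchr(name, '.');
	if (!dot)
		return NULL;

	plen = (size_t)(dot - name);
	for (i = 0; i < n; i++)
		if (tbl[i].wildcard && strlen(tbl[i].name) == plen &&
		    strncmp(tbl[i].name, name, plen) == 0)
			return &tbl[i];

	return NULL;
}
```

With entries for "com.example.*" and "com.example.exact", a lookup of
"com.example.foo" falls back to the wildcard entry, while
"com.example.foo.bar" matches nothing, because only the last component
is stripped before the wildcard comparison.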



* kdbus: add Makefile, Kconfig and MAINTAINERS entry
  2014-11-21  5:02 ` Greg Kroah-Hartman
                   ` (11 preceding siblings ...)
  (?)
@ 2014-11-21  5:02 ` Greg Kroah-Hartman
  -1 siblings, 0 replies; 73+ messages in thread
From: Greg Kroah-Hartman @ 2014-11-21  5:02 UTC (permalink / raw)
  To: arnd, ebiederm, gnomes, teg, jkosina, luto, linux-api, linux-kernel
  Cc: daniel, dh.herrmann, tixxdz, Greg Kroah-Hartman

From: Daniel Mack <daniel@zonque.org>

This patch hooks up the build system to actually compile the files
added by previous patches. It also adds an entry to MAINTAINERS to
direct people to Greg KH, David Herrmann, Djalal Harouni and me for
questions and patches.

Signed-off-by: Daniel Mack <daniel@zonque.org>
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Signed-off-by: Djalal Harouni <tixxdz@opendz.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 MAINTAINERS        | 12 ++++++++++++
 init/Kconfig       | 12 ++++++++++++
 ipc/Makefile       |  2 +-
 ipc/kdbus/Makefile | 21 +++++++++++++++++++++
 4 files changed, 46 insertions(+), 1 deletion(-)
 create mode 100644 ipc/kdbus/Makefile

diff --git a/MAINTAINERS b/MAINTAINERS
index c444907ccd69..cfb1819667bc 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -5276,6 +5276,18 @@ S:	Maintained
 F:	Documentation/kbuild/kconfig-language.txt
 F:	scripts/kconfig/
 
+KDBUS
+M:	Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+M:	Daniel Mack <daniel@zonque.org>
+M:	David Herrmann <dh.herrmann@googlemail.com>
+M:	Djalal Harouni <tixxdz@opendz.org>
+L:	linux-kernel@vger.kernel.org
+S:	Maintained
+F:	ipc/kdbus/*
+F:	Documentation/kdbus.txt
+F:	include/uapi/linux/kdbus.h
+F:	tools/testing/selftests/kdbus/
+
 KDUMP
 M:	Vivek Goyal <vgoyal@redhat.com>
 M:	Haren Myneni <hbabu@us.ibm.com>
diff --git a/init/Kconfig b/init/Kconfig
index 2081a4d3d917..44bf491c971f 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -261,6 +261,18 @@ config POSIX_MQUEUE_SYSCTL
 	depends on SYSCTL
 	default y
 
+config KDBUS
+	tristate "kdbus interprocess communication"
+	depends on TMPFS
+	help
+	  D-Bus is a system for low-latency, low-overhead, easy to use
+	  interprocess communication (IPC).
+
+	  See Documentation/kdbus.txt
+
+	  To compile this driver as a module, choose M here: the
+	  module will be called kdbus.
+
 config CROSS_MEMORY_ATTACH
 	bool "Enable process_vm_readv/writev syscalls"
 	depends on MMU
diff --git a/ipc/Makefile b/ipc/Makefile
index 9075e172e52c..df8df7c4e33c 100644
--- a/ipc/Makefile
+++ b/ipc/Makefile
@@ -9,4 +9,4 @@ obj_mq-$(CONFIG_COMPAT) += compat_mq.o
 obj-$(CONFIG_POSIX_MQUEUE) += mqueue.o msgutil.o $(obj_mq-y)
 obj-$(CONFIG_IPC_NS) += namespace.o
 obj-$(CONFIG_POSIX_MQUEUE_SYSCTL) += mq_sysctl.o
-
+obj-$(CONFIG_KDBUS) += kdbus/
diff --git a/ipc/kdbus/Makefile b/ipc/kdbus/Makefile
new file mode 100644
index 000000000000..37e5dd4f218a
--- /dev/null
+++ b/ipc/kdbus/Makefile
@@ -0,0 +1,21 @@
+kdbus-y := \
+	bus.o \
+	connection.o \
+	endpoint.o \
+	fs.o \
+	handle.o \
+	item.o \
+	main.o \
+	match.o \
+	message.o \
+	metadata.o \
+	names.o \
+	node.o \
+	notify.o \
+	domain.o \
+	policy.o \
+	pool.o \
+	queue.o \
+	util.o
+
+obj-$(CONFIG_KDBUS) += kdbus.o
-- 
2.1.3



* kdbus: add selftests
  2014-11-21  5:02 ` Greg Kroah-Hartman
                   ` (12 preceding siblings ...)
  (?)
@ 2014-11-21  5:02 ` Greg Kroah-Hartman
  -1 siblings, 0 replies; 73+ messages in thread
From: Greg Kroah-Hartman @ 2014-11-21  5:02 UTC (permalink / raw)
  To: arnd, ebiederm, gnomes, teg, jkosina, luto, linux-api, linux-kernel
  Cc: daniel, dh.herrmann, tixxdz, Greg Kroah-Hartman

From: Daniel Mack <daniel@zonque.org>

This patch adds a quite extensive test suite for kdbus that checks
the most important code paths in the driver. The idea is to extend
the test suite over time.

Also, this code can serve as an example implementation to show how to
use the kernel API from userspace.
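
The test runner keeps its cases in a table terminated by a sentinel
entry. A rough standalone sketch of that pattern follows; the demo_*
names are hypothetical and exist only for this illustration, and the
flags are merely recorded here rather than used to set up a bus or a
connection as the real runner would.

```c
#include <stddef.h>
#include <string.h>

enum {
	DEMO_CREATE_BUS		= 1 << 0,
	DEMO_CREATE_CONN	= 1 << 1,
};

/* Hypothetical test descriptor, loosely modelled on a name/func/flags table. */
struct demo_test {
	const char *name;
	int (*func)(void);
	unsigned int flags;	/* what the environment would need to provide */
};

static int demo_pass(void) { return 0; }
static int demo_fail(void) { return -1; }

static const struct demo_test demo_tests[] = {
	{ .name = "pass", .func = demo_pass, .flags = DEMO_CREATE_BUS },
	{ .name = "fail", .func = demo_fail,
	  .flags = DEMO_CREATE_BUS | DEMO_CREATE_CONN },
	{ NULL } /* sentinel */
};

/*
 * Run the test called @name, or every test if @name is NULL.
 * Returns the number of failed tests, or -1 if nothing matched.
 */
static int demo_run(const char *name)
{
	const struct demo_test *t;
	int failed = 0, ran = 0;

	for (t = demo_tests; t->name; t++) {
		if (name && strcmp(t->name, name) != 0)
			continue;
		ran++;
		if (t->func() != 0)
			failed++;
	}

	return ran ? failed : -1;
}
```

The sentinel lets the loop stop without a separate element count, so
appending a test only means adding one more table entry.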

Signed-off-by: Daniel Mack <daniel@zonque.org>
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Signed-off-by: Djalal Harouni <tixxdz@opendz.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 tools/testing/selftests/Makefile                 |    1 +
 tools/testing/selftests/kdbus/.gitignore         |   11 +
 tools/testing/selftests/kdbus/Makefile           |   45 +
 tools/testing/selftests/kdbus/kdbus-enum.c       |   94 ++
 tools/testing/selftests/kdbus/kdbus-enum.h       |   14 +
 tools/testing/selftests/kdbus/kdbus-test.c       |  546 ++++++++++
 tools/testing/selftests/kdbus/kdbus-test.h       |   81 ++
 tools/testing/selftests/kdbus/kdbus-util.c       | 1240 ++++++++++++++++++++++
 tools/testing/selftests/kdbus/kdbus-util.h       |  143 +++
 tools/testing/selftests/kdbus/test-activator.c   |  317 ++++++
 tools/testing/selftests/kdbus/test-benchmark.c   |  409 +++++++
 tools/testing/selftests/kdbus/test-bus.c         |  130 +++
 tools/testing/selftests/kdbus/test-chat.c        |  123 +++
 tools/testing/selftests/kdbus/test-connection.c  |  501 +++++++++
 tools/testing/selftests/kdbus/test-daemon.c      |   66 ++
 tools/testing/selftests/kdbus/test-endpoint.c    |  221 ++++
 tools/testing/selftests/kdbus/test-fd.c          |  664 ++++++++++++
 tools/testing/selftests/kdbus/test-free.c        |   34 +
 tools/testing/selftests/kdbus/test-match.c       |  437 ++++++++
 tools/testing/selftests/kdbus/test-message.c     |  371 +++++++
 tools/testing/selftests/kdbus/test-metadata-ns.c |  258 +++++
 tools/testing/selftests/kdbus/test-monitor.c     |  156 +++
 tools/testing/selftests/kdbus/test-names.c       |  184 ++++
 tools/testing/selftests/kdbus/test-policy-ns.c   |  622 +++++++++++
 tools/testing/selftests/kdbus/test-policy-priv.c | 1168 ++++++++++++++++++++
 tools/testing/selftests/kdbus/test-policy.c      |   81 ++
 tools/testing/selftests/kdbus/test-race.c        |  313 ++++++
 tools/testing/selftests/kdbus/test-sync.c        |  241 +++++
 tools/testing/selftests/kdbus/test-timeout.c     |   97 ++
 29 files changed, 8568 insertions(+)
 create mode 100644 tools/testing/selftests/kdbus/.gitignore
 create mode 100644 tools/testing/selftests/kdbus/Makefile
 create mode 100644 tools/testing/selftests/kdbus/kdbus-enum.c
 create mode 100644 tools/testing/selftests/kdbus/kdbus-enum.h
 create mode 100644 tools/testing/selftests/kdbus/kdbus-test.c
 create mode 100644 tools/testing/selftests/kdbus/kdbus-test.h
 create mode 100644 tools/testing/selftests/kdbus/kdbus-util.c
 create mode 100644 tools/testing/selftests/kdbus/kdbus-util.h
 create mode 100644 tools/testing/selftests/kdbus/test-activator.c
 create mode 100644 tools/testing/selftests/kdbus/test-benchmark.c
 create mode 100644 tools/testing/selftests/kdbus/test-bus.c
 create mode 100644 tools/testing/selftests/kdbus/test-chat.c
 create mode 100644 tools/testing/selftests/kdbus/test-connection.c
 create mode 100644 tools/testing/selftests/kdbus/test-daemon.c
 create mode 100644 tools/testing/selftests/kdbus/test-endpoint.c
 create mode 100644 tools/testing/selftests/kdbus/test-fd.c
 create mode 100644 tools/testing/selftests/kdbus/test-free.c
 create mode 100644 tools/testing/selftests/kdbus/test-match.c
 create mode 100644 tools/testing/selftests/kdbus/test-message.c
 create mode 100644 tools/testing/selftests/kdbus/test-metadata-ns.c
 create mode 100644 tools/testing/selftests/kdbus/test-monitor.c
 create mode 100644 tools/testing/selftests/kdbus/test-names.c
 create mode 100644 tools/testing/selftests/kdbus/test-policy-ns.c
 create mode 100644 tools/testing/selftests/kdbus/test-policy-priv.c
 create mode 100644 tools/testing/selftests/kdbus/test-policy.c
 create mode 100644 tools/testing/selftests/kdbus/test-race.c
 create mode 100644 tools/testing/selftests/kdbus/test-sync.c
 create mode 100644 tools/testing/selftests/kdbus/test-timeout.c

diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Makefile
index 45f145c6f843..10cd20e07244 100644
--- a/tools/testing/selftests/Makefile
+++ b/tools/testing/selftests/Makefile
@@ -2,6 +2,7 @@ TARGETS = breakpoints
 TARGETS += cpu-hotplug
 TARGETS += efivarfs
 TARGETS += kcmp
+TARGETS += kdbus
 TARGETS += memfd
 TARGETS += memory-hotplug
 TARGETS += mqueue
diff --git a/tools/testing/selftests/kdbus/.gitignore b/tools/testing/selftests/kdbus/.gitignore
new file mode 100644
index 000000000000..4b97beee5f80
--- /dev/null
+++ b/tools/testing/selftests/kdbus/.gitignore
@@ -0,0 +1,11 @@
+*.cmd
+*.ko
+*.mod.c
+modules.order
+Module.symvers
+*.o
+*.swp
+.tmp_versions
+tags
+tools/kdbus-monitor
+test/kdbus-test
diff --git a/tools/testing/selftests/kdbus/Makefile b/tools/testing/selftests/kdbus/Makefile
new file mode 100644
index 000000000000..b3c25409b73e
--- /dev/null
+++ b/tools/testing/selftests/kdbus/Makefile
@@ -0,0 +1,45 @@
+CFLAGS += -I../../../../usr/include/
+CFLAGS += -I../../../../include/uapi/
+CFLAGS += -std=gnu99
+CFLAGS += -DKBUILD_MODNAME=\"kdbus\" -D_GNU_SOURCE
+LDLIBS = -pthread -lcap
+
+OBJS= \
+	kdbus-enum.o		\
+	kdbus-util.o		\
+	kdbus-test.o		\
+	kdbus-test.o		\
+	test-activator.o	\
+	test-benchmark.o	\
+	test-bus.o		\
+	test-chat.o		\
+	test-connection.o	\
+	test-daemon.o		\
+	test-endpoint.o		\
+	test-fd.o		\
+	test-free.o		\
+	test-match.o		\
+	test-message.o		\
+	test-metadata-ns.o	\
+	test-monitor.o		\
+	test-names.o		\
+	test-policy.o		\
+	test-policy-ns.o	\
+	test-policy-priv.o	\
+	test-race.o		\
+	test-sync.o		\
+	test-timeout.o
+
+all: kdbus-test
+
+%.o: %.c
+	gcc $(CFLAGS) -c $< -o $@
+
+kdbus-test: $(OBJS)
+	gcc $(CFLAGS) $^ $(LDLIBS) -o $@
+
+run_tests:
+	./kdbus-test
+
+clean:
+	rm -f *.o kdbus-test
diff --git a/tools/testing/selftests/kdbus/kdbus-enum.c b/tools/testing/selftests/kdbus/kdbus-enum.c
new file mode 100644
index 000000000000..4a1615d3cf00
--- /dev/null
+++ b/tools/testing/selftests/kdbus/kdbus-enum.c
@@ -0,0 +1,94 @@
+/*
+ * Copyright (C) 2013-2014 Kay Sievers
+ *
+ * kdbus is free software; you can redistribute it and/or modify it under
+ * the terms of the GNU Lesser General Public License as published by the
+ * Free Software Foundation; either version 2.1 of the License, or (at
+ * your option) any later version.
+ */
+
+#include <stdio.h>
+#include <string.h>
+#include <fcntl.h>
+#include <stdlib.h>
+#include <stddef.h>
+#include <unistd.h>
+#include <stdint.h>
+#include <errno.h>
+#include <sys/ioctl.h>
+
+#include "kdbus-util.h"
+#include "kdbus-enum.h"
+
+struct kdbus_enum_table {
+	long long id;
+	const char *name;
+};
+
+#define TABLE(what) static struct kdbus_enum_table kdbus_table_##what[]
+#define ENUM(_id) { .id = _id, .name = STRINGIFY(_id) }
+#define LOOKUP(what)							\
+	const char *enum_##what(long long id)				\
+	{								\
+		for (size_t i = 0; i < ELEMENTSOF(kdbus_table_##what); i++) \
+			if (id == kdbus_table_##what[i].id)		\
+				return kdbus_table_##what[i].name;	\
+		return "UNKNOWN";					\
+	}
+
+TABLE(CMD) = {
+	ENUM(KDBUS_CMD_BUS_MAKE),
+	ENUM(KDBUS_CMD_ENDPOINT_MAKE),
+	ENUM(KDBUS_CMD_HELLO),
+	ENUM(KDBUS_CMD_MSG_SEND),
+	ENUM(KDBUS_CMD_MSG_RECV),
+	ENUM(KDBUS_CMD_NAME_LIST),
+	ENUM(KDBUS_CMD_NAME_RELEASE),
+	ENUM(KDBUS_CMD_CONN_INFO),
+	ENUM(KDBUS_CMD_MATCH_ADD),
+	ENUM(KDBUS_CMD_MATCH_REMOVE),
+};
+LOOKUP(CMD);
+
+TABLE(MSG) = {
+	ENUM(_KDBUS_ITEM_NULL),
+	ENUM(KDBUS_ITEM_PAYLOAD_VEC),
+	ENUM(KDBUS_ITEM_PAYLOAD_OFF),
+	ENUM(KDBUS_ITEM_PAYLOAD_MEMFD),
+	ENUM(KDBUS_ITEM_FDS),
+	ENUM(KDBUS_ITEM_BLOOM_PARAMETER),
+	ENUM(KDBUS_ITEM_BLOOM_FILTER),
+	ENUM(KDBUS_ITEM_DST_NAME),
+	ENUM(KDBUS_ITEM_MAKE_NAME),
+	ENUM(KDBUS_ITEM_ATTACH_FLAGS_SEND),
+	ENUM(KDBUS_ITEM_ATTACH_FLAGS_RECV),
+	ENUM(KDBUS_ITEM_ID),
+	ENUM(KDBUS_ITEM_NAME),
+	ENUM(KDBUS_ITEM_TIMESTAMP),
+	ENUM(KDBUS_ITEM_CREDS),
+	ENUM(KDBUS_ITEM_AUXGROUPS),
+	ENUM(KDBUS_ITEM_OWNED_NAME),
+	ENUM(KDBUS_ITEM_TID_COMM),
+	ENUM(KDBUS_ITEM_PID_COMM),
+	ENUM(KDBUS_ITEM_EXE),
+	ENUM(KDBUS_ITEM_CMDLINE),
+	ENUM(KDBUS_ITEM_CGROUP),
+	ENUM(KDBUS_ITEM_CAPS),
+	ENUM(KDBUS_ITEM_SECLABEL),
+	ENUM(KDBUS_ITEM_AUDIT),
+	ENUM(KDBUS_ITEM_CONN_DESCRIPTION),
+	ENUM(KDBUS_ITEM_NAME_ADD),
+	ENUM(KDBUS_ITEM_NAME_REMOVE),
+	ENUM(KDBUS_ITEM_NAME_CHANGE),
+	ENUM(KDBUS_ITEM_ID_ADD),
+	ENUM(KDBUS_ITEM_ID_REMOVE),
+	ENUM(KDBUS_ITEM_REPLY_TIMEOUT),
+	ENUM(KDBUS_ITEM_REPLY_DEAD),
+};
+LOOKUP(MSG);
+
+TABLE(PAYLOAD) = {
+	ENUM(KDBUS_PAYLOAD_KERNEL),
+	ENUM(KDBUS_PAYLOAD_DBUS),
+};
+LOOKUP(PAYLOAD);
diff --git a/tools/testing/selftests/kdbus/kdbus-enum.h b/tools/testing/selftests/kdbus/kdbus-enum.h
new file mode 100644
index 000000000000..110bfd332859
--- /dev/null
+++ b/tools/testing/selftests/kdbus/kdbus-enum.h
@@ -0,0 +1,14 @@
+/*
+ * Copyright (C) 2013-2014 Kay Sievers
+ *
+ * kdbus is free software; you can redistribute it and/or modify it under
+ * the terms of the GNU Lesser General Public License as published by the
+ * Free Software Foundation; either version 2.1 of the License, or (at
+ * your option) any later version.
+ */
+#pragma once
+
+const char *enum_CMD(long long id);
+const char *enum_MSG(long long id);
+const char *enum_MATCH(long long id);
+const char *enum_PAYLOAD(long long id);
diff --git a/tools/testing/selftests/kdbus/kdbus-test.c b/tools/testing/selftests/kdbus/kdbus-test.c
new file mode 100644
index 000000000000..02f295a03ebc
--- /dev/null
+++ b/tools/testing/selftests/kdbus/kdbus-test.c
@@ -0,0 +1,546 @@
+#include <errno.h>
+#include <stdio.h>
+#include <string.h>
+#include <fcntl.h>
+#include <stdlib.h>
+#include <stddef.h>
+#include <time.h>
+#include <unistd.h>
+#include <stdint.h>
+#include <assert.h>
+#include <getopt.h>
+#include <stdbool.h>
+#include <sys/wait.h>
+
+#include "kdbus-util.h"
+#include "kdbus-enum.h"
+#include "kdbus-test.h"
+
+enum {
+	TEST_CREATE_BUS		= 1 << 0,
+	TEST_CREATE_CONN	= 1 << 1,
+};
+
+struct kdbus_test {
+	const char *name;
+	const char *desc;
+	int (*func)(struct kdbus_test_env *env);
+	unsigned int flags;
+};
+
+static const struct kdbus_test tests[] = {
+	{
+		.name	= "bus-make",
+		.desc	= "bus make functions",
+		.func	= kdbus_test_bus_make,
+		.flags	= 0,
+	},
+	{
+		.name	= "hello",
+		.desc	= "the HELLO command",
+		.func	= kdbus_test_hello,
+		.flags	= TEST_CREATE_BUS,
+	},
+	{
+		.name	= "byebye",
+		.desc	= "the BYEBYE command",
+		.func	= kdbus_test_byebye,
+		.flags	= TEST_CREATE_BUS | TEST_CREATE_CONN,
+	},
+	{
+		.name	= "chat",
+		.desc	= "a chat pattern",
+		.func	= kdbus_test_chat,
+		.flags	= TEST_CREATE_BUS,
+	},
+	{
+		.name	= "daemon",
+		.desc	= "a simple daemon",
+		.func	= kdbus_test_daemon,
+		.flags	= TEST_CREATE_BUS | TEST_CREATE_CONN,
+	},
+	{
+		.name	= "fd-passing",
+		.desc	= "file descriptor passing",
+		.func	= kdbus_test_fd_passing,
+		.flags	= TEST_CREATE_BUS,
+	},
+	{
+		.name	= "endpoint",
+		.desc	= "custom endpoint",
+		.func	= kdbus_test_custom_endpoint,
+		.flags	= TEST_CREATE_BUS | TEST_CREATE_CONN,
+	},
+	{
+		.name	= "monitor",
+		.desc	= "monitor functionality",
+		.func	= kdbus_test_monitor,
+		.flags	= TEST_CREATE_BUS | TEST_CREATE_CONN,
+	},
+	{
+		.name	= "name-basics",
+		.desc	= "basic name registry functions",
+		.func	= kdbus_test_name_basic,
+		.flags	= TEST_CREATE_BUS | TEST_CREATE_CONN,
+	},
+	{
+		.name	= "name-conflict",
+		.desc	= "name registry conflict details",
+		.func	= kdbus_test_name_conflict,
+		.flags	= TEST_CREATE_BUS | TEST_CREATE_CONN,
+	},
+	{
+		.name	= "name-queue",
+		.desc	= "queuing of names",
+		.func	= kdbus_test_name_queue,
+		.flags	= TEST_CREATE_BUS | TEST_CREATE_CONN,
+	},
+	{
+		.name	= "message-basic",
+		.desc	= "basic message handling",
+		.func	= kdbus_test_message_basic,
+		.flags	= TEST_CREATE_BUS | TEST_CREATE_CONN,
+	},
+	{
+		.name	= "message-prio",
+		.desc	= "handling of messages with priority",
+		.func	= kdbus_test_message_prio,
+		.flags	= TEST_CREATE_BUS,
+	},
+	{
+		.name	= "message-quota",
+		.desc	= "message quotas are enforced",
+		.func	= kdbus_test_message_quota,
+		.flags	= TEST_CREATE_BUS,
+	},
+	{
+		.name	= "timeout",
+		.desc	= "timeout",
+		.func	= kdbus_test_timeout,
+		.flags	= TEST_CREATE_BUS,
+	},
+	{
+		.name	= "sync-byebye",
+		.desc	= "synchronous replies vs. BYEBYE",
+		.func	= kdbus_test_sync_byebye,
+		.flags	= TEST_CREATE_BUS,
+	},
+	{
+		.name	= "sync-reply",
+		.desc	= "synchronous replies",
+		.func	= kdbus_test_sync_reply,
+		.flags	= TEST_CREATE_BUS,
+	},
+	{
+		.name	= "message-free",
+		.desc	= "freeing of memory",
+		.func	= kdbus_test_free,
+		.flags	= TEST_CREATE_BUS | TEST_CREATE_CONN,
+	},
+	{
+		.name	= "connection-info",
+		.desc	= "retrieving connection information",
+		.func	= kdbus_test_conn_info,
+		.flags	= TEST_CREATE_BUS | TEST_CREATE_CONN,
+	},
+	{
+		.name	= "connection-update",
+		.desc	= "updating connection information",
+		.func	= kdbus_test_conn_update,
+		.flags	= TEST_CREATE_BUS | TEST_CREATE_CONN,
+	},
+	{
+		.name	= "writable-pool",
+		.desc	= "verifying pools are never writable",
+		.func	= kdbus_test_writable_pool,
+		.flags	= TEST_CREATE_BUS,
+	},
+	{
+		.name	= "policy",
+		.desc	= "policy",
+		.func	= kdbus_test_policy,
+		.flags	= TEST_CREATE_BUS,
+	},
+	{
+		.name	= "policy-priv",
+		.desc	= "unprivileged bus access",
+		.func	= kdbus_test_policy_priv,
+		.flags	= TEST_CREATE_BUS,
+	},
+	{
+		.name	= "policy-ns",
+		.desc	= "policy in user namespaces",
+		.func	= kdbus_test_policy_ns,
+		.flags	= TEST_CREATE_BUS,
+	},
+	{
+		.name	= "metadata-ns",
+		.desc	= "metadata in user namespaces",
+		.func	= kdbus_test_metadata_ns,
+		.flags	= TEST_CREATE_BUS | TEST_CREATE_CONN,
+	},
+	{
+		.name	= "match-id-add",
+		.desc	= "adding of matches by id",
+		.func	= kdbus_test_match_id_add,
+		.flags	= TEST_CREATE_BUS | TEST_CREATE_CONN,
+	},
+	{
+		.name	= "match-id-remove",
+		.desc	= "removing of matches by id",
+		.func	= kdbus_test_match_id_remove,
+		.flags	= TEST_CREATE_BUS | TEST_CREATE_CONN,
+	},
+	{
+		.name	= "match-replace",
+		.desc	= "replace of matches with the same cookie",
+		.func	= kdbus_test_match_replace,
+		.flags	= TEST_CREATE_BUS | TEST_CREATE_CONN,
+	},
+	{
+		.name	= "match-name-add",
+		.desc	= "adding of matches by name",
+		.func	= kdbus_test_match_name_add,
+		.flags	= TEST_CREATE_BUS | TEST_CREATE_CONN,
+	},
+	{
+		.name	= "match-name-remove",
+		.desc	= "removing of matches by name",
+		.func	= kdbus_test_match_name_remove,
+		.flags	= TEST_CREATE_BUS | TEST_CREATE_CONN,
+	},
+	{
+		.name	= "match-name-change",
+		.desc	= "matching for name changes",
+		.func	= kdbus_test_match_name_change,
+		.flags	= TEST_CREATE_BUS | TEST_CREATE_CONN,
+	},
+	{
+		.name	= "match-bloom",
+		.desc	= "matching with bloom filters",
+		.func	= kdbus_test_match_bloom,
+		.flags	= TEST_CREATE_BUS | TEST_CREATE_CONN,
+	},
+	{
+		.name	= "activator",
+		.desc	= "activator connections",
+		.func	= kdbus_test_activator,
+		.flags	= TEST_CREATE_BUS | TEST_CREATE_CONN,
+	},
+	{
+		.name	= "benchmark",
+		.desc	= "benchmark",
+		.func	= kdbus_test_benchmark,
+		.flags	= TEST_CREATE_BUS,
+	},
+	{
+		.name	= "race-byebye",
+		.desc	= "race multiple byebyes",
+		.func	= kdbus_test_race_byebye,
+		.flags	= TEST_CREATE_BUS,
+	},
+	{
+		.name	= "race-byebye-match",
+		.desc	= "race byebye vs match removal",
+		.func	= kdbus_test_race_byebye_match,
+		.flags	= TEST_CREATE_BUS,
+	},
+	{ NULL } /* sentinel */
+};
+
+static int test_prepare_env(const struct kdbus_test *t,
+			    struct kdbus_test_env *env,
+			    const char *root,
+			    const char *busname)
+{
+	if (t->flags & TEST_CREATE_BUS) {
+		char *s, *n;
+		int ret;
+
+		asprintf(&s, "%s/control", root);
+
+		env->control_fd = open(s, O_RDWR);
+		free(s);
+		ASSERT_RETURN(env->control_fd >= 0);
+
+		if (!busname) {
+			n = unique_name("test-bus");
+			ASSERT_RETURN(n);
+		}
+
+		ret = kdbus_create_bus(env->control_fd, busname ?: n,
+				       _KDBUS_ATTACH_ALL, &s);
+		ASSERT_RETURN(ret == 0);
+
+		asprintf(&env->buspath, "%s/%s/bus", root, s);
+		free(s);
+	}
+
+	if (t->flags & TEST_CREATE_CONN) {
+		env->conn = kdbus_hello(env->buspath, 0, NULL, 0);
+		ASSERT_RETURN(env->conn);
+	}
+
+	env->root = root;
+
+	return 0;
+}
+
+void test_unprepare_env(const struct kdbus_test *t, struct kdbus_test_env *env)
+{
+	if (env->conn) {
+		kdbus_conn_free(env->conn);
+		env->conn = NULL;
+	}
+
+	if (env->control_fd >= 0) {
+		close(env->control_fd);
+		env->control_fd = -1;
+	}
+
+	if (env->buspath) {
+		free(env->buspath);
+		env->buspath = NULL;
+	}
+}
+
+static int test_run(const struct kdbus_test *t, const char *root,
+		    const char *busname, int wait)
+{
+	int ret;
+	struct kdbus_test_env env = { .control_fd = -1 };
+
+	ret = test_prepare_env(t, &env, root, busname);
+	if (ret != TEST_OK)
+		return ret;
+
+	if (wait > 0) {
+		printf("Sleeping %d seconds before running test ...\n", wait);
+		sleep(wait);
+	}
+
+	ret = t->func(&env);
+	test_unprepare_env(t, &env);
+	return ret;
+}
+
+static int test_run_forked(const struct kdbus_test *t, const char *root,
+			   const char *busname, int wait)
+{
+	int ret;
+	pid_t pid;
+
+	pid = fork();
+	if (pid < 0) {
+		return TEST_ERR;
+	} else if (pid == 0) {
+		ret = test_run(t, root, busname, wait);
+		_exit(ret);
+	}
+
+	pid = waitpid(pid, &ret, 0);
+	if (pid <= 0)
+		return TEST_ERR;
+	else if (!WIFEXITED(ret))
+		return TEST_ERR;
+	else
+		return WEXITSTATUS(ret);
+}
+
+static void print_test_result(int ret)
+{
+	switch (ret) {
+	case TEST_OK:
+		printf("OK");
+		break;
+	case TEST_SKIP:
+		printf("SKIPPED");
+		break;
+	case TEST_ERR:
+		printf("ERROR");
+		break;
+	}
+}
+
+static int run_all_tests(const char *root, const char *busname)
+{
+	int ret;
+	unsigned int fail_cnt = 0;
+	unsigned int skip_cnt = 0;
+	unsigned int ok_cnt = 0;
+	unsigned int i;
+	const struct kdbus_test *t;
+
+	kdbus_util_verbose = false;
+
+	for (t = tests; t->name; t++) {
+		printf("Testing %s (%s) ", t->desc, t->name);
+		for (i = 0; i < 60 - strlen(t->desc) - strlen(t->name); i++)
+			printf(".");
+		printf(" ");
+
+		ret = test_run_forked(t, root, busname, 0);
+		switch (ret) {
+		case TEST_OK:
+			ok_cnt++;
+			break;
+		case TEST_SKIP:
+			skip_cnt++;
+			break;
+		case TEST_ERR:
+			fail_cnt++;
+			break;
+		}
+
+		print_test_result(ret);
+		printf("\n");
+	}
+
+	printf("\nSUMMARY: %u tests passed, %u skipped, %u failed\n",
+	       ok_cnt, skip_cnt, fail_cnt);
+
+	return fail_cnt > 0 ? EXIT_FAILURE : EXIT_SUCCESS;
+}
+
+static void usage(const char *argv0)
+{
+	const struct kdbus_test *t;
+	unsigned int i;
+
+	printf("Usage: %s [options]\n"
+	       "Options:\n"
+	       "\t-x, --loop		Run in a loop\n"
+	       "\t-f, --fork		Fork before running a test\n"
+	       "\t-h, --help		Print this help\n"
+	       "\t-r, --root <root>	Toplevel of the kdbus hierarchy\n"
+	       "\t-t, --test <test-id>	Run one specific test only, in verbose mode\n"
+	       "\t-b, --bus <busname>	Instead of generating a random bus name, take <busname>.\n"
+	       "\t-w, --wait <secs>	Wait <secs> before actually starting test\n"
+	       "\n", argv0);
+
+	printf("By default, all tests are run once, and a summary is printed.\n"
+	       "Available tests for --test:\n\n");
+
+	for (t = tests; t->name; t++) {
+		printf("\t%s", t->name);
+
+		for (i = 0; i < 24 - strlen(t->name); i++)
+			printf(" ");
+
+		printf("Test %s\n", t->desc);
+	}
+
+	printf("\n");
+	printf("Note that some tests may, if run specifically by --test, "
+	       "behave differently, and not terminate by themselves.\n");
+
+	exit(EXIT_FAILURE);
+}
+
+int main(int argc, char *argv[])
+{
+	int t, ret = 0;
+	int arg_loop = 0;
+	char *arg_root = NULL;
+	char *arg_test = NULL;
+	char *arg_busname = NULL;
+	int arg_wait = 0;
+	int arg_fork = 0;
+	char *control;
+
+	static const struct option options[] = {
+		{ "loop",	no_argument,		NULL, 'x' },
+		{ "help",	no_argument,		NULL, 'h' },
+		{ "root",	required_argument,	NULL, 'r' },
+		{ "test",	required_argument,	NULL, 't' },
+		{ "bus",	required_argument,	NULL, 'b' },
+		{ "wait",	required_argument,	NULL, 'w' },
+		{ "fork",	no_argument,		NULL, 'f' },
+		{}
+	};
+
+	srand(time(NULL));
+
+	while ((t = getopt_long(argc, argv, "hxfr:t:b:w:", options, NULL)) >= 0) {
+		switch (t) {
+		case 'x':
+			arg_loop = 1;
+			break;
+
+		case 'r':
+			arg_root = optarg;
+			break;
+
+		case 't':
+			arg_test = optarg;
+			break;
+
+		case 'b':
+			arg_busname = optarg;
+			break;
+
+		case 'w':
+			arg_wait = strtol(optarg, NULL, 10);
+			break;
+
+		case 'f':
+			arg_fork = 1;
+			break;
+
+		default:
+		case 'h':
+			usage(argv[0]);
+		}
+	}
+
+	if (!arg_root)
+		arg_root = "/sys/fs/kdbus";
+
+	asprintf(&control, "%s/control", arg_root);
+
+	if (access(control, W_OK) < 0) {
+		printf("Unable to locate control node at '%s'.\n", control);
+		return EXIT_FAILURE;
+	}
+
+	free(control);
+
+	if (arg_test) {
+		const struct kdbus_test *t;
+
+		for (t = tests; t->name; t++) {
+			if (!strcmp(t->name, arg_test)) {
+				do {
+					if (arg_fork)
+						ret = test_run_forked(t,
+								arg_root,
+								arg_busname,
+								arg_wait);
+					else
+						ret = test_run(t, arg_root,
+							       arg_busname,
+							       arg_wait);
+					printf("Testing %s: ", t->desc);
+					print_test_result(ret);
+					printf("\n");
+
+					if (ret != TEST_OK)
+						break;
+				} while (arg_loop);
+
+				return ret == TEST_OK ? 0 : EXIT_FAILURE;
+			}
+		}
+
+		printf("Unknown test-id '%s'\n", arg_test);
+		return EXIT_FAILURE;
+	}
+
+	do {
+		ret = run_all_tests(arg_root, arg_busname);
+		if (ret != EXIT_SUCCESS)
+			break;
+	} while (arg_loop);
+
+	return ret;
+}
diff --git a/tools/testing/selftests/kdbus/kdbus-test.h b/tools/testing/selftests/kdbus/kdbus-test.h
new file mode 100644
index 000000000000..9728a49d089c
--- /dev/null
+++ b/tools/testing/selftests/kdbus/kdbus-test.h
@@ -0,0 +1,81 @@
+#ifndef _TEST_KDBUS_H_
+#define _TEST_KDBUS_H_
+
+struct kdbus_test_env {
+	char *buspath;
+	const char *root;
+	int control_fd;
+	struct kdbus_conn *conn;
+};
+
+enum {
+	TEST_OK,
+	TEST_SKIP,
+	TEST_ERR,
+};
+
+#define ASSERT_RETURN_VAL(cond, val)		\
+	if (!(cond)) {			\
+		fprintf(stderr,	"Assertion '%s' failed in %s(), %s:%d\n", \
+			#cond, __func__, __FILE__, __LINE__);	\
+		return val;	\
+	}
+
+#define ASSERT_EXIT_VAL(cond, val)		\
+	if (!(cond)) {			\
+		fprintf(stderr, "Assertion '%s' failed in %s(), %s:%d\n", \
+			#cond, __func__, __FILE__, __LINE__);	\
+		_exit(val);	\
+	}
+
+#define ASSERT_BREAK(cond)		\
+	if (!(cond)) {			\
+		fprintf(stderr, "Assertion '%s' failed in %s(), %s:%d\n", \
+			#cond, __func__, __FILE__, __LINE__);	\
+		break; \
+	}
+
+#define ASSERT_RETURN(cond)		\
+	ASSERT_RETURN_VAL(cond, TEST_ERR)
+
+#define ASSERT_EXIT(cond)		\
+	ASSERT_EXIT_VAL(cond, EXIT_FAILURE)
+
+int kdbus_test_activator(struct kdbus_test_env *env);
+int kdbus_test_benchmark(struct kdbus_test_env *env);
+int kdbus_test_bus_make(struct kdbus_test_env *env);
+int kdbus_test_byebye(struct kdbus_test_env *env);
+int kdbus_test_chat(struct kdbus_test_env *env);
+int kdbus_test_conn_info(struct kdbus_test_env *env);
+int kdbus_test_conn_update(struct kdbus_test_env *env);
+int kdbus_test_daemon(struct kdbus_test_env *env);
+int kdbus_test_custom_endpoint(struct kdbus_test_env *env);
+int kdbus_test_fd_passing(struct kdbus_test_env *env);
+int kdbus_test_free(struct kdbus_test_env *env);
+int kdbus_test_hello(struct kdbus_test_env *env);
+int kdbus_test_match_bloom(struct kdbus_test_env *env);
+int kdbus_test_match_id_add(struct kdbus_test_env *env);
+int kdbus_test_match_id_remove(struct kdbus_test_env *env);
+int kdbus_test_match_replace(struct kdbus_test_env *env);
+int kdbus_test_match_name_add(struct kdbus_test_env *env);
+int kdbus_test_match_name_change(struct kdbus_test_env *env);
+int kdbus_test_match_name_remove(struct kdbus_test_env *env);
+int kdbus_test_message_basic(struct kdbus_test_env *env);
+int kdbus_test_message_prio(struct kdbus_test_env *env);
+int kdbus_test_message_quota(struct kdbus_test_env *env);
+int kdbus_test_metadata_ns(struct kdbus_test_env *env);
+int kdbus_test_monitor(struct kdbus_test_env *env);
+int kdbus_test_name_basic(struct kdbus_test_env *env);
+int kdbus_test_name_conflict(struct kdbus_test_env *env);
+int kdbus_test_name_queue(struct kdbus_test_env *env);
+int kdbus_test_policy(struct kdbus_test_env *env);
+int kdbus_test_policy_ns(struct kdbus_test_env *env);
+int kdbus_test_policy_priv(struct kdbus_test_env *env);
+int kdbus_test_race_byebye(struct kdbus_test_env *env);
+int kdbus_test_race_byebye_match(struct kdbus_test_env *env);
+int kdbus_test_sync_byebye(struct kdbus_test_env *env);
+int kdbus_test_sync_reply(struct kdbus_test_env *env);
+int kdbus_test_timeout(struct kdbus_test_env *env);
+int kdbus_test_writable_pool(struct kdbus_test_env *env);
+
+#endif /* _TEST_KDBUS_H_ */
diff --git a/tools/testing/selftests/kdbus/kdbus-util.c b/tools/testing/selftests/kdbus/kdbus-util.c
new file mode 100644
index 000000000000..9aa609866304
--- /dev/null
+++ b/tools/testing/selftests/kdbus/kdbus-util.c
@@ -0,0 +1,1240 @@
+/*
+ * Copyright (C) 2013-2014 Daniel Mack
+ * Copyright (C) 2013-2014 Kay Sievers
+ * Copyright (C) 2014 Djalal Harouni
+ *
+ * kdbus is free software; you can redistribute it and/or modify it under
+ * the terms of the GNU Lesser General Public License as published by the
+ * Free Software Foundation; either version 2.1 of the License, or (at
+ * your option) any later version.
+ */
+
+#include <stdio.h>
+#include <stdarg.h>
+#include <string.h>
+#include <time.h>
+#include <inttypes.h>
+#include <fcntl.h>
+#include <stdlib.h>
+#include <stddef.h>
+#include <unistd.h>
+#include <stdint.h>
+#include <stdbool.h>
+#include <errno.h>
+#include <assert.h>
+#include <poll.h>
+#include <grp.h>
+#include <sys/capability.h>
+#include <sys/ioctl.h>
+#include <sys/mman.h>
+#include <sys/stat.h>
+#include <sys/time.h>
+#include <linux/unistd.h>
+#include <linux/memfd.h>
+
+#ifndef __NR_memfd_create
+  #ifdef __x86_64__
+    #define __NR_memfd_create 319
+  #elif defined __arm__
+    #define __NR_memfd_create 385
+  #else
+    #define __NR_memfd_create 356
+  #endif
+#endif
+
+#include "kdbus-util.h"
+#include "kdbus-enum.h"
+
+#ifndef F_ADD_SEALS
+#define F_LINUX_SPECIFIC_BASE	1024
+#define F_ADD_SEALS     (F_LINUX_SPECIFIC_BASE + 9)
+#define F_GET_SEALS     (F_LINUX_SPECIFIC_BASE + 10)
+
+#define F_SEAL_SEAL     0x0001  /* prevent further seals from being set */
+#define F_SEAL_SHRINK   0x0002  /* prevent file from shrinking */
+#define F_SEAL_GROW     0x0004  /* prevent file from growing */
+#define F_SEAL_WRITE    0x0008  /* prevent writes */
+#endif
+
+int kdbus_util_verbose = true;
+
+int kdbus_create_bus(int control_fd, const char *name,
+		     uint64_t req_meta, char **path)
+{
+	struct {
+		struct kdbus_cmd_make head;
+
+		/* bloom size item */
+		struct {
+			uint64_t size;
+			uint64_t type;
+			struct kdbus_bloom_parameter bloom;
+		} bp;
+
+		/* required metadata item */
+		struct {
+			uint64_t size;
+			uint64_t type;
+			uint64_t flags;
+		} attach;
+
+		/* name item */
+		struct {
+			uint64_t size;
+			uint64_t type;
+			char str[64];
+		} name;
+	} bus_make;
+	int ret;
+
+	memset(&bus_make, 0, sizeof(bus_make));
+	bus_make.bp.size = sizeof(bus_make.bp);
+	bus_make.bp.type = KDBUS_ITEM_BLOOM_PARAMETER;
+	bus_make.bp.bloom.size = 64;
+	bus_make.bp.bloom.n_hash = 1;
+
+	snprintf(bus_make.name.str, sizeof(bus_make.name.str),
+		 "%u-%s", getuid(), name);
+
+	bus_make.attach.type = KDBUS_ITEM_ATTACH_FLAGS_RECV;
+	bus_make.attach.size = sizeof(bus_make.attach);
+	bus_make.attach.flags = req_meta;
+
+	bus_make.name.type = KDBUS_ITEM_MAKE_NAME;
+	bus_make.name.size = KDBUS_ITEM_HEADER_SIZE +
+			     strlen(bus_make.name.str) + 1;
+
+	bus_make.head.flags = KDBUS_MAKE_ACCESS_WORLD;
+	bus_make.head.size = sizeof(bus_make.head) +
+			     bus_make.bp.size +
+			     bus_make.attach.size +
+			     bus_make.name.size;
+
+	kdbus_printf("Creating bus with name >%s< on control fd %d ...\n",
+		     name, control_fd);
+
+	ret = ioctl(control_fd, KDBUS_CMD_BUS_MAKE, &bus_make);
+
+	if (ret == 0 && path)
+		*path = strdup(bus_make.name.str);
+
+	return ret;
+}
+
+struct kdbus_conn *
+kdbus_hello(const char *path, uint64_t flags,
+	    const struct kdbus_item *item, size_t item_size)
+{
+	int fd, ret;
+	struct {
+		struct kdbus_cmd_hello hello;
+
+		struct {
+			uint64_t size;
+			uint64_t type;
+			char str[16];
+		} conn_name;
+
+		uint8_t extra_items[item_size];
+	} h;
+	struct kdbus_conn *conn;
+
+	memset(&h, 0, sizeof(h));
+
+	if (item_size > 0)
+		memcpy(h.extra_items, item, item_size);
+
+	kdbus_printf("-- opening bus connection %s\n", path);
+	fd = open(path, O_RDWR|O_CLOEXEC);
+	if (fd < 0) {
+		kdbus_printf("--- error %d (%m)\n", fd);
+		return NULL;
+	}
+
+	h.hello.flags = flags | KDBUS_HELLO_ACCEPT_FD;
+	h.hello.attach_flags_send = _KDBUS_ATTACH_ALL;
+	h.hello.attach_flags_recv = _KDBUS_ATTACH_ALL;
+	h.conn_name.type = KDBUS_ITEM_CONN_DESCRIPTION;
+	strcpy(h.conn_name.str, "this-is-my-name");
+	h.conn_name.size = KDBUS_ITEM_HEADER_SIZE + strlen(h.conn_name.str) + 1;
+
+	h.hello.size = sizeof(h);
+	h.hello.pool_size = POOL_SIZE;
+
+	ret = ioctl(fd, KDBUS_CMD_HELLO, &h.hello);
+	if (ret < 0) {
+		kdbus_printf("--- error when saying hello: %d (%m)\n", ret);
+		return NULL;
+	}
+	kdbus_printf("-- Our peer ID for %s: %llu -- bus uuid: '%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x'\n",
+		     path, (unsigned long long)h.hello.id,
+		     h.hello.id128[0],  h.hello.id128[1],  h.hello.id128[2],
+		     h.hello.id128[3],  h.hello.id128[4],  h.hello.id128[5],
+		     h.hello.id128[6],  h.hello.id128[7],  h.hello.id128[8],
+		     h.hello.id128[9],  h.hello.id128[10], h.hello.id128[11],
+		     h.hello.id128[12], h.hello.id128[13], h.hello.id128[14],
+		     h.hello.id128[15]);
+
+	conn = malloc(sizeof(*conn));
+	if (!conn) {
+		kdbus_printf("unable to malloc()!?\n");
+		return NULL;
+	}
+
+	conn->buf = mmap(NULL, POOL_SIZE, PROT_READ, MAP_PRIVATE, fd, 0);
+	if (conn->buf == MAP_FAILED) {
+		free(conn);
+		close(fd);
+		kdbus_printf("--- error mmap (%m)\n");
+		return NULL;
+	}
+
+	conn->fd = fd;
+	conn->id = h.hello.id;
+	return conn;
+}
+
+struct kdbus_conn *
+kdbus_hello_registrar(const char *path, const char *name,
+		      const struct kdbus_policy_access *access,
+		      size_t num_access, uint64_t flags)
+{
+	struct kdbus_item *item, *items;
+	size_t i, size;
+
+	size = KDBUS_ITEM_SIZE(strlen(name) + 1) +
+		num_access * KDBUS_ITEM_SIZE(sizeof(*access));
+
+	items = alloca(size);
+
+	item = items;
+	item->size = KDBUS_ITEM_HEADER_SIZE + strlen(name) + 1;
+	item->type = KDBUS_ITEM_NAME;
+	strcpy(item->str, name);
+	item = KDBUS_ITEM_NEXT(item);
+
+	for (i = 0; i < num_access; i++) {
+		item->size = KDBUS_ITEM_HEADER_SIZE +
+			     sizeof(struct kdbus_policy_access);
+		item->type = KDBUS_ITEM_POLICY_ACCESS;
+
+		item->policy_access.type = access[i].type;
+		item->policy_access.access = access[i].access;
+		item->policy_access.id = access[i].id;
+
+		item = KDBUS_ITEM_NEXT(item);
+	}
+
+	return kdbus_hello(path, flags, items, size);
+}
+
+struct kdbus_conn *kdbus_hello_activator(const char *path, const char *name,
+				   const struct kdbus_policy_access *access,
+				   size_t num_access)
+{
+	return kdbus_hello_registrar(path, name, access, num_access,
+				     KDBUS_HELLO_ACTIVATOR);
+}
+
+int kdbus_info(struct kdbus_conn *conn, uint64_t id,
+	       const char *name, uint64_t flags,
+	       uint64_t *offset)
+{
+	struct kdbus_cmd_info *cmd;
+	size_t size = sizeof(*cmd);
+	int ret;
+
+	if (name)
+		size += KDBUS_ITEM_HEADER_SIZE + strlen(name) + 1;
+
+	cmd = alloca(size);
+	memset(cmd, 0, size);
+	cmd->size = size;
+	cmd->flags = flags;
+
+	if (name) {
+		cmd->items[0].size = KDBUS_ITEM_HEADER_SIZE + strlen(name) + 1;
+		cmd->items[0].type = KDBUS_ITEM_NAME;
+		strcpy(cmd->items[0].str, name);
+	} else {
+		cmd->id = id;
+	}
+
+	ret = ioctl(conn->fd, KDBUS_CMD_CONN_INFO, cmd);
+	if (ret < 0) {
+		ret = -errno;
+		kdbus_printf("--- error when requesting info: %d (%m)\n", ret);
+		return ret;
+	}
+
+	if (offset)
+		*offset = cmd->offset;
+	else
+		kdbus_free(conn, cmd->offset);
+
+	return 0;
+}
+
+void kdbus_conn_free(struct kdbus_conn *conn)
+{
+	if (!conn)
+		return;
+
+	if (conn->buf)
+		munmap(conn->buf, POOL_SIZE);
+
+	if (conn->fd >= 0)
+		close(conn->fd);
+
+	free(conn);
+}
+
+int sys_memfd_create(const char *name, __u64 size)
+{
+	int ret, fd;
+
+	ret = syscall(__NR_memfd_create, name, MFD_ALLOW_SEALING);
+	if (ret < 0)
+		return ret;
+
+	fd = ret;
+
+	ret = ftruncate(fd, size);
+	if (ret < 0) {
+		close(fd);
+		return ret;
+	}
+
+	return fd;
+}
+
+int sys_memfd_seal_set(int fd)
+{
+	return fcntl(fd, F_ADD_SEALS, F_SEAL_SHRINK |
+			 F_SEAL_GROW | F_SEAL_WRITE | F_SEAL_SEAL);
+}
+
+off_t sys_memfd_get_size(int fd, off_t *size)
+{
+	struct stat stat;
+	int ret;
+
+	ret = fstat(fd, &stat);
+	if (ret < 0) {
+		kdbus_printf("fstat() failed: %m\n");
+		return ret;
+	}
+
+	*size = stat.st_size;
+	return 0;
+}
+
+int kdbus_msg_send(const struct kdbus_conn *conn,
+		   const char *name,
+		   uint64_t cookie,
+		   uint64_t flags,
+		   uint64_t timeout,
+		   int64_t priority,
+		   uint64_t dst_id)
+{
+	struct kdbus_msg *msg;
+	const char ref1[1024 * 128 + 3] = "0123456789_0";
+	const char ref2[] = "0123456789_1";
+	struct kdbus_item *item;
+	struct timespec now;
+	uint64_t size;
+	int memfd = -1;
+	int ret;
+
+	size = sizeof(struct kdbus_msg);
+	size += KDBUS_ITEM_SIZE(sizeof(struct kdbus_vec));
+	size += KDBUS_ITEM_SIZE(sizeof(struct kdbus_vec));
+	size += KDBUS_ITEM_SIZE(sizeof(struct kdbus_vec));
+
+	if (dst_id == KDBUS_DST_ID_BROADCAST)
+		size += KDBUS_ITEM_SIZE(sizeof(struct kdbus_bloom_filter)) + 64;
+	else {
+		memfd = sys_memfd_create("my-name-is-nice", 1024 * 1024);
+		if (memfd < 0) {
+			kdbus_printf("failed to create memfd: %m\n");
+			return memfd;
+		}
+
+		if (write(memfd, "kdbus memfd 1234567", 19) != 19) {
+			ret = -errno;
+			kdbus_printf("writing to memfd failed: %m\n");
+			return ret;
+		}
+
+		ret = sys_memfd_seal_set(memfd);
+		if (ret < 0) {
+			ret = -errno;
+			kdbus_printf("memfd sealing failed: %m\n");
+			return ret;
+		}
+
+		size += KDBUS_ITEM_SIZE(sizeof(struct kdbus_memfd));
+	}
+
+	if (name)
+		size += KDBUS_ITEM_SIZE(strlen(name) + 1);
+
+	msg = malloc(size);
+	if (!msg) {
+		ret = -errno;
+		kdbus_printf("unable to malloc()!?\n");
+		return ret;
+	}
+
+	memset(msg, 0, size);
+	msg->flags = flags;
+	msg->priority = priority;
+	msg->size = size;
+	msg->src_id = conn->id;
+	msg->dst_id = name ? 0 : dst_id;
+	msg->cookie = cookie;
+	msg->payload_type = KDBUS_PAYLOAD_DBUS;
+
+	if (timeout) {
+		ret = clock_gettime(CLOCK_MONOTONIC_COARSE, &now);
+		if (ret < 0)
+			return ret;
+
+		msg->timeout_ns = now.tv_sec * 1000000000ULL +
+				  now.tv_nsec + timeout;
+	}
+
+	item = msg->items;
+
+	if (name) {
+		item->type = KDBUS_ITEM_DST_NAME;
+		item->size = KDBUS_ITEM_HEADER_SIZE + strlen(name) + 1;
+		strcpy(item->str, name);
+		item = KDBUS_ITEM_NEXT(item);
+	}
+
+	item->type = KDBUS_ITEM_PAYLOAD_VEC;
+	item->size = KDBUS_ITEM_HEADER_SIZE + sizeof(struct kdbus_vec);
+	item->vec.address = (uintptr_t)&ref1;
+	item->vec.size = sizeof(ref1);
+	item = KDBUS_ITEM_NEXT(item);
+
+	/* data padding for ref1 */
+	item->type = KDBUS_ITEM_PAYLOAD_VEC;
+	item->size = KDBUS_ITEM_HEADER_SIZE + sizeof(struct kdbus_vec);
+	item->vec.address = (uintptr_t)NULL;
+	item->vec.size =  KDBUS_ALIGN8(sizeof(ref1)) - sizeof(ref1);
+	item = KDBUS_ITEM_NEXT(item);
+
+	item->type = KDBUS_ITEM_PAYLOAD_VEC;
+	item->size = KDBUS_ITEM_HEADER_SIZE + sizeof(struct kdbus_vec);
+	item->vec.address = (uintptr_t)&ref2;
+	item->vec.size = sizeof(ref2);
+	item = KDBUS_ITEM_NEXT(item);
+
+	if (dst_id == KDBUS_DST_ID_BROADCAST) {
+		item->type = KDBUS_ITEM_BLOOM_FILTER;
+		item->size = KDBUS_ITEM_SIZE(sizeof(struct kdbus_bloom_filter)) + 64;
+		item->bloom_filter.generation = 0;
+	} else {
+		item->type = KDBUS_ITEM_PAYLOAD_MEMFD;
+		item->size = KDBUS_ITEM_HEADER_SIZE + sizeof(struct kdbus_memfd);
+		item->memfd.size = 16;
+		item->memfd.fd = memfd;
+	}
+	item = KDBUS_ITEM_NEXT(item);
+
+	ret = ioctl(conn->fd, KDBUS_CMD_MSG_SEND, msg);
+	if (memfd >= 0)
+		close(memfd);
+
+	if (ret < 0) {
+		ret = -errno;
+		kdbus_printf("error sending message: %d (%m)\n", ret);
+		return ret;
+	}
+
+	if (flags & KDBUS_MSG_FLAGS_SYNC_REPLY) {
+		struct kdbus_msg *reply;
+
+		kdbus_printf("SYNC REPLY @offset %llu:\n", msg->offset_reply);
+		reply = (struct kdbus_msg *)(conn->buf + msg->offset_reply);
+		kdbus_msg_dump(conn, reply);
+
+		kdbus_msg_free(reply);
+
+		ret = kdbus_free(conn, msg->offset_reply);
+		if (ret < 0)
+			return ret;
+	}
+
+	free(msg);
+
+	return 0;
+}
+
+static char *msg_id(uint64_t id, char *buf)
+{
+	if (id == 0)
+		return "KERNEL";
+	if (id == ~0ULL)
+		return "BROADCAST";
+	sprintf(buf, "%llu", (unsigned long long)id);
+	return buf;
+}
+
+int kdbus_msg_dump(const struct kdbus_conn *conn, const struct kdbus_msg *msg)
+{
+	const struct kdbus_item *item = msg->items;
+	char buf_src[32];
+	char buf_dst[32];
+	uint64_t timeout = 0;
+	uint64_t cookie_reply = 0;
+	int ret = 0;
+
+	if (msg->flags & KDBUS_MSG_FLAGS_EXPECT_REPLY)
+		timeout = msg->timeout_ns;
+	else
+		cookie_reply = msg->cookie_reply;
+
+	kdbus_printf("MESSAGE: %s (%llu bytes) flags=0x%08llx, %s → %s, "
+		     "cookie=%llu, timeout=%llu cookie_reply=%llu priority=%lli\n",
+		enum_PAYLOAD(msg->payload_type), (unsigned long long)msg->size,
+		(unsigned long long)msg->flags,
+		msg_id(msg->src_id, buf_src), msg_id(msg->dst_id, buf_dst),
+		(unsigned long long)msg->cookie, (unsigned long long)timeout,
+		(unsigned long long)cookie_reply, (long long)msg->priority);
+
+	KDBUS_ITEM_FOREACH(item, msg, items) {
+		if (item->size < KDBUS_ITEM_HEADER_SIZE) {
+			kdbus_printf("  +%s (%llu bytes) invalid data record\n",
+				     enum_MSG(item->type), item->size);
+			ret = -EINVAL;
+			break;
+		}
+
+		switch (item->type) {
+		case KDBUS_ITEM_PAYLOAD_OFF: {
+			char *s;
+
+			if (item->vec.offset == ~0ULL)
+				s = "[\\0-bytes]";
+			else
+				s = (char *)msg + item->vec.offset;
+
+			kdbus_printf("  +%s (%llu bytes) off=%llu size=%llu '%s'\n",
+			       enum_MSG(item->type), item->size,
+			       (unsigned long long)item->vec.offset,
+			       (unsigned long long)item->vec.size, s);
+			break;
+		}
+
+		case KDBUS_ITEM_PAYLOAD_MEMFD: {
+			char *buf;
+			off_t size;
+
+			buf = mmap(NULL, item->memfd.size, PROT_READ,
+				   MAP_PRIVATE, item->memfd.fd, 0);
+			if (buf == MAP_FAILED) {
+				kdbus_printf("mmap() fd=%i size=%llu failed: %m\n",
+					     item->memfd.fd, item->memfd.size);
+				break;
+			}
+
+			if (sys_memfd_get_size(item->memfd.fd, &size) < 0) {
+				kdbus_printf("sys_memfd_get_size() failed: %m\n");
+				break;
+			}
+
+			kdbus_printf("  +%s (%llu bytes) fd=%i size=%llu filesize=%llu '%s'\n",
+			       enum_MSG(item->type), item->size, item->memfd.fd,
+			       (unsigned long long)item->memfd.size,
+			       (unsigned long long)size, buf);
+			munmap(buf, item->memfd.size);
+			break;
+		}
+
+		case KDBUS_ITEM_CREDS:
+			kdbus_printf("  +%s (%llu bytes) uid=%lld, gid=%lld, pid=%lld, tid=%lld, starttime=%lld\n",
+				enum_MSG(item->type), item->size,
+				item->creds.uid, item->creds.gid,
+				item->creds.pid, item->creds.tid,
+				item->creds.starttime);
+			break;
+
+		case KDBUS_ITEM_AUXGROUPS: {
+			int i, n;
+
+			kdbus_printf("  +%s (%llu bytes)\n",
+				     enum_MSG(item->type), item->size);
+			n = (item->size - KDBUS_ITEM_HEADER_SIZE) /
+				sizeof(uint64_t);
+
+			for (i = 0; i < n; i++)
+				kdbus_printf("    gid[%d] = %lld\n",
+					     i, item->data64[i]);
+			break;
+		}
+
+		case KDBUS_ITEM_NAME:
+		case KDBUS_ITEM_PID_COMM:
+		case KDBUS_ITEM_TID_COMM:
+		case KDBUS_ITEM_EXE:
+		case KDBUS_ITEM_CGROUP:
+		case KDBUS_ITEM_SECLABEL:
+		case KDBUS_ITEM_DST_NAME:
+		case KDBUS_ITEM_CONN_DESCRIPTION:
+			kdbus_printf("  +%s (%llu bytes) '%s' (%zu)\n",
+				     enum_MSG(item->type), item->size,
+				     item->str, strlen(item->str));
+			break;
+
+		case KDBUS_ITEM_OWNED_NAME: {
+			kdbus_printf("  +%s (%llu bytes) '%s' (%zu) flags=0x%08llx\n",
+				     enum_MSG(item->type), item->size,
+				     item->name.name, strlen(item->name.name),
+				     item->name.flags);
+			break;
+		}
+
+		case KDBUS_ITEM_CMDLINE: {
+			size_t size = item->size - KDBUS_ITEM_HEADER_SIZE;
+			const char *str = item->str;
+			int count = 0;
+
+			kdbus_printf("  +%s (%llu bytes) ",
+				     enum_MSG(item->type), item->size);
+			while (size) {
+				kdbus_printf("'%s' ", str);
+				size -= strlen(str) + 1;
+				str += strlen(str) + 1;
+				count++;
+			}
+
+			kdbus_printf("(%d string%s)\n",
+				     count, (count == 1) ? "" : "s");
+			break;
+		}
+
+		case KDBUS_ITEM_AUDIT:
+			kdbus_printf("  +%s (%llu bytes) loginuid=%llu sessionid=%llu\n",
+			       enum_MSG(item->type), item->size,
+			       (unsigned long long)item->audit.loginuid,
+			       (unsigned long long)item->audit.sessionid);
+			break;
+
+		case KDBUS_ITEM_CAPS: {
+			const uint32_t *cap;
+			int n, i;
+
+			kdbus_printf("  +%s (%llu bytes) len=%llu bytes, last_cap %d\n",
+				     enum_MSG(item->type), item->size,
+				     (unsigned long long)item->size -
+					KDBUS_ITEM_HEADER_SIZE,
+				     (int) item->caps.last_cap);
+
+			cap = item->caps.caps;
+			n = (item->size - offsetof(struct kdbus_item, caps.caps))
+				/ 4 / sizeof(uint32_t);
+
+			kdbus_printf("    CapInh=");
+			for (i = 0; i < n; i++)
+				kdbus_printf("%08x", cap[(0 * n) + (n - i - 1)]);
+
+			kdbus_printf(" CapPrm=");
+			for (i = 0; i < n; i++)
+				kdbus_printf("%08x", cap[(1 * n) + (n - i - 1)]);
+
+			kdbus_printf(" CapEff=");
+			for (i = 0; i < n; i++)
+				kdbus_printf("%08x", cap[(2 * n) + (n - i - 1)]);
+
+			kdbus_printf(" CapBnd=");
+			for (i = 0; i < n; i++)
+				kdbus_printf("%08x", cap[(3 * n) + (n - i - 1)]);
+			kdbus_printf("\n");
+			break;
+		}
+
+		case KDBUS_ITEM_TIMESTAMP:
+			kdbus_printf("  +%s (%llu bytes) seq=%llu realtime=%lluns monotonic=%lluns\n",
+			       enum_MSG(item->type), item->size,
+			       (unsigned long long)item->timestamp.seqnum,
+			       (unsigned long long)item->timestamp.realtime_ns,
+			       (unsigned long long)item->timestamp.monotonic_ns);
+			break;
+
+		case KDBUS_ITEM_REPLY_TIMEOUT:
+			kdbus_printf("  +%s (%llu bytes) cookie=%llu\n",
+			       enum_MSG(item->type), item->size,
+			       msg->cookie_reply);
+			break;
+
+		case KDBUS_ITEM_NAME_ADD:
+		case KDBUS_ITEM_NAME_REMOVE:
+		case KDBUS_ITEM_NAME_CHANGE:
+			kdbus_printf("  +%s (%llu bytes) '%s', old id=%lld, now id=%lld, old_flags=0x%llx new_flags=0x%llx\n",
+				enum_MSG(item->type),
+				(unsigned long long) item->size,
+				item->name_change.name,
+				item->name_change.old_id.id,
+				item->name_change.new_id.id,
+				item->name_change.old_id.flags,
+				item->name_change.new_id.flags);
+			break;
+
+		case KDBUS_ITEM_ID_ADD:
+		case KDBUS_ITEM_ID_REMOVE:
+			kdbus_printf("  +%s (%llu bytes) id=%llu flags=%llu\n",
+			       enum_MSG(item->type),
+			       (unsigned long long) item->size,
+			       (unsigned long long) item->id_change.id,
+			       (unsigned long long) item->id_change.flags);
+			break;
+
+		default:
+			kdbus_printf("  +%s (%llu bytes)\n",
+				     enum_MSG(item->type), item->size);
+			break;
+		}
+	}
+
+	if ((char *)item - ((char *)msg + msg->size) >= 8) {
+		kdbus_printf("invalid padding at end of message\n");
+		ret = -EINVAL;
+	}
+
+	kdbus_printf("\n");
+
+	return ret;
+}
+
+void kdbus_msg_free(struct kdbus_msg *msg)
+{
+	const struct kdbus_item *item;
+	int nfds, i;
+
+	if (!msg)
+		return;
+
+	KDBUS_ITEM_FOREACH(item, msg, items) {
+		switch (item->type) {
+		/* close all memfds */
+		case KDBUS_ITEM_PAYLOAD_MEMFD:
+			close(item->memfd.fd);
+			break;
+		case KDBUS_ITEM_FDS:
+			nfds = (item->size - KDBUS_ITEM_HEADER_SIZE) /
+				sizeof(int);
+
+			for (i = 0; i < nfds; i++)
+				close(item->fds[i]);
+
+			break;
+		}
+	}
+}
+
+int kdbus_msg_recv(struct kdbus_conn *conn,
+		   struct kdbus_msg **msg_out,
+		   uint64_t *offset)
+{
+	struct kdbus_cmd_recv recv = {};
+	struct kdbus_msg *msg;
+	int ret;
+
+	ret = ioctl(conn->fd, KDBUS_CMD_MSG_RECV, &recv);
+	if (ret < 0) {
+		ret = -errno;
+		return ret;
+	}
+
+	msg = (struct kdbus_msg *)(conn->buf + recv.offset);
+	ret = kdbus_msg_dump(conn, msg);
+	if (ret < 0) {
+		kdbus_msg_free(msg);
+		return ret;
+	}
+
+	if (msg_out) {
+		*msg_out = msg;
+
+		if (offset)
+			*offset = recv.offset;
+	} else {
+		kdbus_msg_free(msg);
+
+		ret = kdbus_free(conn, recv.offset);
+		if (ret < 0)
+			return ret;
+	}
+
+	return 0;
+}
+
+/*
+ * Returns: 0 on success, negative errno on failure.
+ *
+ * Callers rely on -ETIMEDOUT, -ECONNRESET, -EAGAIN and other errors being
+ * passed through, including the result of kdbus_msg_recv().
+ */
+int kdbus_msg_recv_poll(struct kdbus_conn *conn,
+			int timeout_ms,
+			struct kdbus_msg **msg_out,
+			uint64_t *offset)
+{
+	int ret;
+
+	do {
+		struct timeval before, after, diff;
+		struct pollfd fd;
+
+		fd.fd = conn->fd;
+		fd.events = POLLIN | POLLPRI | POLLHUP;
+		fd.revents = 0;
+
+		gettimeofday(&before, NULL);
+		ret = poll(&fd, 1, timeout_ms);
+		gettimeofday(&after, NULL);
+
+		if (ret == 0) {
+			ret = -ETIMEDOUT;
+			break;
+		}
+
+		if (ret > 0) {
+			if (fd.revents & POLLIN)
+				ret = kdbus_msg_recv(conn, msg_out, offset);
+
+			if (fd.revents & (POLLHUP | POLLERR))
+				ret = -ECONNRESET;
+		}
+
+		if (ret != -EAGAIN)
+			break;
+
+		timersub(&after, &before, &diff);
+		timeout_ms -= diff.tv_sec * 1000UL +
+			      diff.tv_usec / 1000UL;
+	} while (timeout_ms > 0);
+
+	return ret;
+}
+
+int kdbus_free(const struct kdbus_conn *conn, uint64_t offset)
+{
+	struct kdbus_cmd_free cmd_free;
+	int ret;
+
+	cmd_free.offset = offset;
+	cmd_free.flags = 0;
+
+	ret = ioctl(conn->fd, KDBUS_CMD_FREE, &cmd_free);
+	if (ret < 0) {
+		kdbus_printf("KDBUS_CMD_FREE failed: %d (%m)\n", ret);
+		return -errno;
+	}
+
+	return 0;
+}
+
+int kdbus_name_acquire(struct kdbus_conn *conn,
+		       const char *name, uint64_t *flags)
+{
+	struct kdbus_cmd_name *cmd_name;
+	size_t name_len = strlen(name) + 1;
+	uint64_t size = sizeof(*cmd_name) + KDBUS_ITEM_SIZE(name_len);
+	struct kdbus_item *item;
+	int ret;
+
+	cmd_name = alloca(size);
+
+	memset(cmd_name, 0, size);
+
+	item = cmd_name->items;
+	item->size = KDBUS_ITEM_HEADER_SIZE + name_len;
+	item->type = KDBUS_ITEM_NAME;
+	strcpy(item->str, name);
+
+	cmd_name->size = size;
+	if (flags)
+		cmd_name->flags = *flags;
+
+	ret = ioctl(conn->fd, KDBUS_CMD_NAME_ACQUIRE, cmd_name);
+	if (ret < 0) {
+		ret = -errno;
+		kdbus_printf("error acquiring name: %s\n", strerror(-ret));
+		return ret;
+	}
+
+	kdbus_printf("%s(): flags after call: 0x%llx\n", __func__,
+		     cmd_name->flags);
+
+	if (flags)
+		*flags = cmd_name->flags;
+
+	return 0;
+}
+
+int kdbus_name_release(struct kdbus_conn *conn, const char *name)
+{
+	struct kdbus_cmd_name *cmd_name;
+	size_t name_len = strlen(name) + 1;
+	uint64_t size = sizeof(*cmd_name) + KDBUS_ITEM_SIZE(name_len);
+	struct kdbus_item *item;
+	int ret;
+
+	cmd_name = alloca(size);
+
+	memset(cmd_name, 0, size);
+
+	item = cmd_name->items;
+	item->size = KDBUS_ITEM_HEADER_SIZE + name_len;
+	item->type = KDBUS_ITEM_NAME;
+	strcpy(item->str, name);
+
+	cmd_name->size = size;
+
+	kdbus_printf("conn %llu giving up name '%s'\n",
+		     (unsigned long long) conn->id, name);
+
+	ret = ioctl(conn->fd, KDBUS_CMD_NAME_RELEASE, cmd_name);
+	if (ret < 0) {
+		ret = -errno;
+		kdbus_printf("error releasing name: %s\n", strerror(-ret));
+		return ret;
+	}
+
+	return 0;
+}
+
+int kdbus_name_list(struct kdbus_conn *conn, uint64_t flags)
+{
+	struct kdbus_cmd_name_list cmd_list;
+	struct kdbus_name_list *list;
+	struct kdbus_name_info *name;
+	int ret;
+
+	cmd_list.flags = flags;
+
+	ret = ioctl(conn->fd, KDBUS_CMD_NAME_LIST, &cmd_list);
+	if (ret < 0) {
+		kdbus_printf("error listing names: %d (%m)\n", ret);
+		return EXIT_FAILURE;
+	}
+
+	kdbus_printf("REGISTRY:\n");
+	list = (struct kdbus_name_list *)(conn->buf + cmd_list.offset);
+	KDBUS_ITEM_FOREACH(name, list, names) {
+		uint64_t flags = 0;
+		struct kdbus_item *item;
+		const char *n = "MISSING-NAME";
+
+		if (name->size == sizeof(struct kdbus_cmd_name))
+			continue;
+
+		KDBUS_ITEM_FOREACH(item, name, items)
+			if (item->type == KDBUS_ITEM_OWNED_NAME) {
+				n = item->name.name;
+				flags = item->name.flags;
+			}
+
+		kdbus_printf("%8llu flags=0x%08llx conn=0x%08llx '%s'\n",
+			     name->owner_id, (unsigned long long)flags,
+			     name->conn_flags, n);
+	}
+	kdbus_printf("\n");
+
+	ret = kdbus_free(conn, cmd_list.offset);
+
+	return ret;
+}
+
+int kdbus_conn_update_attach_flags(struct kdbus_conn *conn, uint64_t flags)
+{
+	int ret;
+	size_t size;
+	struct kdbus_cmd_update *update;
+	struct kdbus_item *item;
+
+	size = sizeof(struct kdbus_cmd_update);
+	size += KDBUS_ITEM_SIZE(sizeof(uint64_t));
+
+	update = malloc(size);
+	if (!update) {
+		ret = -errno;
+		kdbus_printf("error malloc: %d (%m)\n", ret);
+		return ret;
+	}
+
+	memset(update, 0, size);
+	update->size = size;
+
+	item = update->items;
+
+	item->type = KDBUS_ITEM_ATTACH_FLAGS_RECV;
+	item->size = KDBUS_ITEM_HEADER_SIZE + sizeof(uint64_t);
+	item->data64[0] = flags;
+	item = KDBUS_ITEM_NEXT(item);
+
+	ret = ioctl(conn->fd, KDBUS_CMD_CONN_UPDATE, update);
+	if (ret < 0) {
+		ret = -errno;
+		kdbus_printf("error conn update: %d (%m)\n", ret);
+	}
+
+	free(update);
+
+	return ret;
+}
+
+int kdbus_conn_update_policy(struct kdbus_conn *conn, const char *name,
+			     const struct kdbus_policy_access *access,
+			     size_t num_access)
+{
+	struct kdbus_cmd_update *update;
+	struct kdbus_item *item;
+	size_t i, size;
+	int ret;
+
+	size = sizeof(struct kdbus_cmd_update);
+	size += KDBUS_ITEM_SIZE(strlen(name) + 1);
+	size += num_access * KDBUS_ITEM_SIZE(sizeof(struct kdbus_policy_access));
+
+	update = malloc(size);
+	if (!update) {
+		ret = -errno;
+		kdbus_printf("error malloc: %d (%m)\n", ret);
+		return ret;
+	}
+
+	memset(update, 0, size);
+	update->size = size;
+
+	item = update->items;
+
+	item->type = KDBUS_ITEM_NAME;
+	item->size = KDBUS_ITEM_HEADER_SIZE + strlen(name) + 1;
+	strcpy(item->str, name);
+	item = KDBUS_ITEM_NEXT(item);
+
+	for (i = 0; i < num_access; i++) {
+		item->size = KDBUS_ITEM_HEADER_SIZE +
+			     sizeof(struct kdbus_policy_access);
+		item->type = KDBUS_ITEM_POLICY_ACCESS;
+
+		item->policy_access.type = access[i].type;
+		item->policy_access.access = access[i].access;
+		item->policy_access.id = access[i].id;
+
+		item = KDBUS_ITEM_NEXT(item);
+	}
+
+	ret = ioctl(conn->fd, KDBUS_CMD_CONN_UPDATE, update);
+	if (ret < 0) {
+		ret = -errno;
+		kdbus_printf("error conn update: %d (%m)\n", ret);
+	}
+
+	free(update);
+
+	return ret;
+}
+
+int kdbus_add_match_empty(struct kdbus_conn *conn)
+{
+	struct {
+		struct kdbus_cmd_match cmd;
+		struct kdbus_item item;
+	} buf;
+	int ret;
+
+	memset(&buf, 0, sizeof(buf));
+
+	buf.item.size = sizeof(uint64_t) * 3;
+	buf.item.type = KDBUS_ITEM_ID;
+	buf.item.id = KDBUS_MATCH_ID_ANY;
+
+	buf.cmd.size = sizeof(buf.cmd) + buf.item.size;
+
+	ret = ioctl(conn->fd, KDBUS_CMD_MATCH_ADD, &buf);
+	if (ret < 0)
+		kdbus_printf("--- error adding conn match: %d (%m)\n", ret);
+
+	return ret;
+}
+
+int drop_privileges(uid_t uid, gid_t gid)
+{
+	int ret;
+
+	ret = setgroups(0, NULL);
+	if (ret < 0) {
+		ret = -errno;
+		kdbus_printf("error setgroups: %d (%m)\n", ret);
+		return ret;
+	}
+
+	ret = setresgid(gid, gid, gid);
+	if (ret < 0) {
+		ret = -errno;
+		kdbus_printf("error setresgid: %d (%m)\n", ret);
+		return ret;
+	}
+
+	ret = setresuid(uid, uid, uid);
+	if (ret < 0) {
+		ret = -errno;
+		kdbus_printf("error setresuid: %d (%m)\n", ret);
+		return ret;
+	}
+
+	return ret;
+}
+
+uint64_t now(clockid_t clock)
+{
+	struct timespec spec;
+
+	clock_gettime(clock, &spec);
+	return spec.tv_sec * 1000ULL * 1000ULL * 1000ULL + spec.tv_nsec;
+}
+
+char *unique_name(const char *prefix)
+{
+	unsigned int i;
+	uint64_t u_now;
+	char n[17];
+	char *str;
+	int r;
+
+	/*
+	 * This returns a random string which is guaranteed to be
+	 * globally unique across all calls to unique_name(). We
+	 * compose the string as:
+	 *   <prefix>-<random>-<time>
+	 * With:
+	 *   <prefix>: string provided by the caller
+	 *   <random>: a random alpha string of 16 characters
+	 *   <time>: the current time in nanoseconds since last boot
+	 *
+	 * The <random> part makes the string always look vastly different,
+	 * the <time> part makes sure no two calls return the same string.
+	 */
+
+	u_now = now(CLOCK_MONOTONIC);
+
+	for (i = 0; i < sizeof(n) - 1; ++i)
+		n[i] = 'a' + (rand() % ('z' - 'a' + 1));
+	n[sizeof(n) - 1] = 0;
+
+	r = asprintf(&str, "%s-%s-%" PRIu64, prefix, n, u_now);
+	if (r < 0)
+		return NULL;
+
+	return str;
+}
+
+static int do_userns_map_id(pid_t pid,
+			    const char *map_file,
+			    const char *map_id)
+{
+	int ret;
+	int fd;
+
+	fd = open(map_file, O_RDWR);
+	if (fd < 0) {
+		ret = -errno;
+		kdbus_printf("error open %s: %d (%m)\n",
+			     map_file, ret);
+		return ret;
+	}
+
+	ret = write(fd, map_id, strlen(map_id));
+	if (ret < 0) {
+		ret = -errno;
+		kdbus_printf("error write to %s: %d (%m)\n",
+			     map_file, ret);
+		goto out;
+	}
+
+	ret = 0;
+
+out:
+	close(fd);
+	return ret;
+}
+
+int userns_map_uid_gid(pid_t pid,
+		       const char *map_uid,
+		       const char *map_gid)
+{
+	int ret;
+	char file_id[128] = {'\0'};
+
+	snprintf(file_id, sizeof(file_id), "/proc/%ld/uid_map",
+		 (long) pid);
+
+	ret = do_userns_map_id(pid, file_id, map_uid);
+	if (ret < 0)
+		return ret;
+
+	snprintf(file_id, sizeof(file_id), "/proc/%ld/gid_map",
+		 (long) pid);
+
+	return do_userns_map_id(pid, file_id, map_gid);
+}
+
+static int do_cap_get_flag(cap_t caps, cap_value_t cap)
+{
+	int ret;
+	cap_flag_value_t flag_set;
+
+	ret = cap_get_flag(caps, cap, CAP_EFFECTIVE, &flag_set);
+	if (ret < 0) {
+		ret = -errno;
+		kdbus_printf("error cap_get_flag(): %d (%m)\n", ret);
+		return ret;
+	}
+
+	return (flag_set == CAP_SET);
+}
+
+/*
+ * Returns:
+ *  1 in case all the requested effective capabilities are set.
+ *  0 in case we do not have the requested capabilities. This value
+ *    will be used to abort tests with TEST_SKIP
+ *  Negative errno on failure.
+ *
+ *  Terminate args with a negative value.
+ */
+int test_is_capable(int cap, ...)
+{
+	int ret;
+	va_list ap;
+	cap_t caps;
+
+	caps = cap_get_proc();
+	if (!caps) {
+		ret = -errno;
+		kdbus_printf("error cap_get_proc(): %d (%m)\n", ret);
+		return ret;
+	}
+
+	ret = do_cap_get_flag(caps, (cap_value_t)cap);
+	if (ret <= 0)
+		goto out;
+
+	va_start(ap, cap);
+	while ((cap = va_arg(ap, int)) >= 0) {
+		ret = do_cap_get_flag(caps, (cap_value_t)cap);
+		if (ret <= 0)
+			break;
+	}
+	va_end(ap);
+
+out:
+	cap_free(caps);
+	return ret;
+}
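The retry loop in kdbus_msg_recv_poll() above subtracts the time spent in
each poll() call from the remaining timeout. A minimal sketch of that
bookkeeping in isolation follows. sketch_remaining_ms() is a hypothetical
helper, not part of this patch, and it does the subtraction by hand
rather than with timersub():

```c
#include <sys/time.h>

/*
 * Hypothetical helper (not part of the patch): return how many
 * milliseconds of timeout_ms are left once the interval from
 * 'before' to 'after' has elapsed. The result may be zero or
 * negative, which a caller would treat as "timed out".
 */
int sketch_remaining_ms(int timeout_ms,
			const struct timeval *before,
			const struct timeval *after)
{
	long long elapsed_us;

	elapsed_us = (long long)(after->tv_sec - before->tv_sec) * 1000000LL +
		     (long long)(after->tv_usec - before->tv_usec);

	return timeout_ms - (int)(elapsed_us / 1000);
}
```

With before = 1.900000 s and after = 2.150000 s, 250 ms have elapsed, so
a 1000 ms budget leaves 750 ms for the next poll() call.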
diff --git a/tools/testing/selftests/kdbus/kdbus-util.h b/tools/testing/selftests/kdbus/kdbus-util.h
new file mode 100644
index 000000000000..75f68dd32147
--- /dev/null
+++ b/tools/testing/selftests/kdbus/kdbus-util.h
@@ -0,0 +1,143 @@
+/*
+ * Copyright (C) 2013-2014 Kay Sievers
+ * Copyright (C) 2013-2014 Daniel Mack
+ *
+ * kdbus is free software; you can redistribute it and/or modify it under
+ * the terms of the GNU Lesser General Public License as published by the
+ * Free Software Foundation; either version 2.1 of the License, or (at
+ * your option) any later version.
+ */
+#pragma once
+
+#define BIT(X) (1 << (X))
+
+#include <time.h>
+#include <linux/kdbus.h>
+
+#define _STRINGIFY(x) #x
+#define STRINGIFY(x) _STRINGIFY(x)
+#define ELEMENTSOF(x) (sizeof(x)/sizeof((x)[0]))
+
+#define KDBUS_PTR(addr) ((void *)(uintptr_t)(addr))
+
+#define KDBUS_ALIGN8(l) (((l) + 7) & ~7)
+#define KDBUS_ITEM_HEADER_SIZE offsetof(struct kdbus_item, data)
+#define KDBUS_ITEM_SIZE(s) KDBUS_ALIGN8((s) + KDBUS_ITEM_HEADER_SIZE)
+
+#define KDBUS_ITEM_NEXT(item) \
+	(typeof(item))(((uint8_t *)item) + KDBUS_ALIGN8((item)->size))
+#define KDBUS_ITEM_FOREACH(item, head, first)				\
+	for (item = (head)->first;					\
+	     (uint8_t *)(item) < (uint8_t *)(head) + (head)->size;	\
+	     item = KDBUS_ITEM_NEXT(item))
+
+#define POOL_SIZE (16 * 1024LU * 1024LU)
+
+#define UNPRIV_UID 65534
+#define UNPRIV_GID 65534
+
+/* Dump as user of process, useful for user namespace testing */
+#define SUID_DUMP_USER	1
+
+extern int kdbus_util_verbose;
+
+#define kdbus_printf(X...) \
+	if (kdbus_util_verbose) \
+		printf(X)
+
+#define RUN_UNPRIVILEGED(child_uid, child_gid, _child_, _parent_) ({	\
+		pid_t pid, rpid;					\
+		int ret;						\
+									\
+		pid = fork();						\
+		if (pid == 0) {						\
+			ret = drop_privileges(child_uid, child_gid);	\
+			if (ret < 0)					\
+				_exit(ret);				\
+									\
+			_child_;					\
+			_exit(0);					\
+		} else if (pid > 0) {					\
+			_parent_;					\
+			rpid = waitpid(pid, &ret, 0);			\
+			ASSERT_RETURN(rpid == pid);			\
+			ASSERT_RETURN(WIFEXITED(ret));			\
+			ASSERT_RETURN(WEXITSTATUS(ret) == 0);		\
+			ret = TEST_OK;					\
+		} else {						\
+			ret = pid;					\
+		}							\
+									\
+		ret;							\
+	})
+
+#define RUN_UNPRIVILEGED_CONN(_var_, _bus_, _code_)			\
+	RUN_UNPRIVILEGED(UNPRIV_UID, UNPRIV_GID, ({			\
+		struct kdbus_conn *_var_;				\
+		_var_ = kdbus_hello(_bus_, 0, NULL, 0);			\
+		ASSERT_EXIT(_var_);					\
+		_code_;							\
+		kdbus_conn_free(_var_);					\
+	}), ({ 0; }))
+
+/* Enums for parent if it should drop privs or not */
+enum kdbus_drop_parent {
+	DO_NOT_DROP,
+	DROP_SAME_UNPRIV,
+	DROP_OTHER_UNPRIV,
+};
+
+struct kdbus_conn {
+	int fd;
+	uint64_t id;
+	void *buf;
+};
+
+int sys_memfd_create(const char *name, __u64 size);
+int sys_memfd_seal_set(int fd);
+off_t sys_memfd_get_size(int fd, off_t *size);
+
+int kdbus_name_list(struct kdbus_conn *conn, uint64_t flags);
+int kdbus_name_release(struct kdbus_conn *conn, const char *name);
+int kdbus_name_acquire(struct kdbus_conn *conn, const char *name,
+		       uint64_t *flags);
+void kdbus_msg_free(struct kdbus_msg *msg);
+int kdbus_msg_recv(struct kdbus_conn *conn,
+		   struct kdbus_msg **msg, uint64_t *offset);
+int kdbus_msg_recv_poll(struct kdbus_conn *conn, int timeout_ms,
+			struct kdbus_msg **msg_out, uint64_t *offset);
+int kdbus_free(const struct kdbus_conn *conn, uint64_t offset);
+int kdbus_msg_dump(const struct kdbus_conn *conn,
+		   const struct kdbus_msg *msg);
+int kdbus_create_bus(int control_fd, const char *name, uint64_t req_meta,
+		     char **path);
+int kdbus_msg_send(const struct kdbus_conn *conn, const char *name,
+		   uint64_t cookie, uint64_t flags, uint64_t timeout,
+		   int64_t priority, uint64_t dst_id);
+struct kdbus_conn *kdbus_hello(const char *path, uint64_t hello_flags,
+			       const struct kdbus_item *item,
+			       size_t item_size);
+struct kdbus_conn *kdbus_hello_registrar(const char *path, const char *name,
+					 const struct kdbus_policy_access *access,
+					 size_t num_access, uint64_t flags);
+struct kdbus_conn *kdbus_hello_activator(const char *path, const char *name,
+					 const struct kdbus_policy_access *access,
+					 size_t num_access);
+int kdbus_info(struct kdbus_conn *conn, uint64_t id,
+	       const char *name, uint64_t flags, uint64_t *offset);
+void kdbus_conn_free(struct kdbus_conn *conn);
+int kdbus_conn_update_attach_flags(struct kdbus_conn *conn, uint64_t flags);
+int kdbus_conn_update_policy(struct kdbus_conn *conn, const char *name,
+			     const struct kdbus_policy_access *access,
+			     size_t num_access);
+
+int kdbus_add_match_empty(struct kdbus_conn *conn);
+
+int drop_privileges(uid_t uid, gid_t gid);
+uint64_t now(clockid_t clock);
+char *unique_name(const char *prefix);
+
+int userns_map_uid_gid(pid_t pid,
+		       const char *map_uid,
+		       const char *map_gid);
+int test_is_capable(int cap, ...);
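The helpers in kdbus-util.c store the unpadded length in item->size
(KDBUS_ITEM_HEADER_SIZE plus payload) but reserve the 8-byte-aligned
length (KDBUS_ITEM_SIZE) when sizing the command buffer. A minimal sketch
of that arithmetic follows, assuming a 16-byte item header (two 64-bit
fields, size and type). The sketch_ names are hypothetical and not part
of this patch:

```c
#include <stdint.h>

/* Assumption: an item starts with two 64-bit fields (size, type). */
#define SKETCH_ITEM_HEADER_SIZE 16ULL

/* Round l up to the next multiple of 8, as KDBUS_ALIGN8() does. */
uint64_t sketch_align8(uint64_t l)
{
	return (l + 7) & ~7ULL;
}

/* Value stored in item->size: header plus payload, not padded. */
uint64_t sketch_item_size(uint64_t payload)
{
	return SKETCH_ITEM_HEADER_SIZE + payload;
}

/* Bytes the item occupies in the buffer, as KDBUS_ITEM_SIZE() computes. */
uint64_t sketch_item_space(uint64_t payload)
{
	return sketch_align8(sketch_item_size(payload));
}
```

For the name "foo" (4 bytes including the terminating NUL), item->size
would be 20 while the item occupies 24 bytes, which is why
KDBUS_ITEM_NEXT() aligns the size before advancing.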
diff --git a/tools/testing/selftests/kdbus/test-activator.c b/tools/testing/selftests/kdbus/test-activator.c
new file mode 100644
index 000000000000..8c7346591d4b
--- /dev/null
+++ b/tools/testing/selftests/kdbus/test-activator.c
@@ -0,0 +1,317 @@
+#include <stdio.h>
+#include <string.h>
+#include <time.h>
+#include <fcntl.h>
+#include <stdlib.h>
+#include <stdbool.h>
+#include <stddef.h>
+#include <unistd.h>
+#include <stdint.h>
+#include <errno.h>
+#include <assert.h>
+#include <poll.h>
+#include <sys/capability.h>
+#include <sys/types.h>
+#include <sys/ioctl.h>
+#include <sys/wait.h>
+
+#include "kdbus-test.h"
+#include "kdbus-util.h"
+#include "kdbus-enum.h"
+
+static int kdbus_starter_poll(struct kdbus_conn *conn)
+{
+	int ret;
+	struct pollfd fd;
+
+	fd.fd = conn->fd;
+	fd.events = POLLIN | POLLPRI | POLLHUP;
+	fd.revents = 0;
+
+	ret = poll(&fd, 1, 100);
+	if (ret == 0)
+		return -ETIMEDOUT;
+	else if (ret > 0) {
+		if (fd.revents & POLLIN)
+			return 0;
+
+		if (fd.revents & (POLLHUP | POLLERR))
+			ret = -ECONNRESET;
+	}
+
+	return ret;
+}
+
+/* Ensure that kdbus activator logic is safe */
+static int kdbus_priv_activator(struct kdbus_test_env *env)
+{
+	int ret;
+	struct kdbus_msg *msg = NULL;
+	uint64_t cookie = 0xdeadbeef;
+	uint64_t flags = KDBUS_NAME_REPLACE_EXISTING;
+	struct kdbus_conn *activator;
+	struct kdbus_conn *service;
+	struct kdbus_conn *client;
+	struct kdbus_conn *holder;
+	struct kdbus_policy_access *access;
+
+	access = (struct kdbus_policy_access[]){
+		{
+			.type = KDBUS_POLICY_ACCESS_USER,
+			.id = getuid(),
+			.access = KDBUS_POLICY_OWN,
+		},
+		{
+			.type = KDBUS_POLICY_ACCESS_USER,
+			.id = getuid(),
+			.access = KDBUS_POLICY_TALK,
+		},
+	};
+
+	activator = kdbus_hello_activator(env->buspath, "foo.priv.activator",
+					  access, 2);
+	ASSERT_RETURN(activator);
+
+	service = kdbus_hello(env->buspath, 0, NULL, 0);
+	ASSERT_RETURN(service);
+
+	client = kdbus_hello(env->buspath, 0, NULL, 0);
+	ASSERT_RETURN(client);
+
+	/*
+	 * Make sure that other users can't TALK to the activator
+	 */
+
+	ret = RUN_UNPRIVILEGED_CONN(unpriv, env->buspath, ({
+		/* Try to talk using the ID */
+		ret = kdbus_msg_send(unpriv, NULL, 0xdeadbeef, 0, 0,
+				     0, activator->id);
+		ASSERT_EXIT(ret == -ENXIO);
+
+		/* Try to talk to the name */
+		ret = kdbus_msg_send(unpriv, "foo.priv.activator",
+				     0xdeadbeef, 0, 0, 0,
+				     KDBUS_DST_ID_NAME);
+		ASSERT_EXIT(ret == -EPERM);
+	}));
+	ASSERT_RETURN(ret >= 0);
+
+	/*
+	 * Make sure that we did not receive anything, so the
+	 * service will not be started automatically
+	 */
+
+	ret = kdbus_starter_poll(activator);
+	ASSERT_RETURN(ret == -ETIMEDOUT);
+
+	/*
+	 * Now try to emulate the starter/service logic and
+	 * acquire the name.
+	 */
+
+	cookie++;
+	ret = kdbus_msg_send(service, "foo.priv.activator", cookie,
+			     0, 0, 0, KDBUS_DST_ID_NAME);
+	ASSERT_RETURN(ret == 0);
+
+	ret = kdbus_starter_poll(activator);
+	ASSERT_RETURN(ret == 0);
+
+	/* Policies are still checked, access denied */
+
+	ret = RUN_UNPRIVILEGED_CONN(unpriv, env->buspath, ({
+		ret = kdbus_name_acquire(unpriv, "foo.priv.activator",
+					 &flags);
+		ASSERT_EXIT(ret == -EPERM);
+	}));
+	ASSERT_RETURN(ret >= 0);
+
+	ret = kdbus_name_acquire(service, "foo.priv.activator",
+				 &flags);
+	ASSERT_RETURN(ret == 0);
+
+	/* We read our previous starter message */
+
+	ret = kdbus_msg_recv_poll(service, 100, NULL, NULL);
+	ASSERT_RETURN(ret == 0);
+
+	/* Try to talk, we still fail */
+
+	cookie++;
+	ret = RUN_UNPRIVILEGED_CONN(unpriv, env->buspath, ({
+		/* Try to talk to the name */
+		ret = kdbus_msg_send(unpriv, "foo.priv.activator",
+				     cookie, 0, 0, 0,
+				     KDBUS_DST_ID_NAME);
+		ASSERT_EXIT(ret == -EPERM);
+	}));
+	ASSERT_RETURN(ret >= 0);
+
+	/* Still nothing to read */
+
+	ret = kdbus_msg_recv_poll(service, 100, NULL, NULL);
+	ASSERT_RETURN(ret == -ETIMEDOUT);
+
+	/* We receive everything now */
+
+	cookie++;
+	ret = kdbus_msg_send(client, "foo.priv.activator", cookie,
+			     0, 0, 0, KDBUS_DST_ID_NAME);
+	ASSERT_RETURN(ret == 0);
+	ret = kdbus_msg_recv_poll(service, 100, &msg, NULL);
+	ASSERT_RETURN(ret == 0 && msg->cookie == cookie);
+
+	kdbus_msg_free(msg);
+	kdbus_free(service, msg->offset_reply);
+
+	/* Policies default to deny TALK now */
+	kdbus_conn_free(activator);
+
+	cookie++;
+	ret = RUN_UNPRIVILEGED_CONN(unpriv, env->buspath, ({
+		/* Try to talk to the name */
+		ret = kdbus_msg_send(unpriv, "foo.priv.activator",
+				     cookie, 0, 0, 0,
+				     KDBUS_DST_ID_NAME);
+		ASSERT_EXIT(ret == -EPERM);
+	}));
+	ASSERT_RETURN(ret >= 0);
+
+	ret = kdbus_msg_recv_poll(service, 100, NULL, NULL);
+	ASSERT_RETURN(ret == -ETIMEDOUT);
+
+	/* Same user is able to TALK */
+	cookie++;
+	ret = kdbus_msg_send(client, "foo.priv.activator", cookie,
+			     0, 0, 0, KDBUS_DST_ID_NAME);
+	ASSERT_RETURN(ret == 0);
+	ret = kdbus_msg_recv_poll(service, 100, &msg, NULL);
+	ASSERT_RETURN(ret == 0 && msg->cookie == cookie);
+
+	kdbus_msg_free(msg);
+	kdbus_free(service, msg->offset_reply);
+
+	access = (struct kdbus_policy_access []){
+		{
+			.type = KDBUS_POLICY_ACCESS_WORLD,
+			.id = getuid(),
+			.access = KDBUS_POLICY_TALK,
+		},
+	};
+
+	holder = kdbus_hello_registrar(env->buspath, "foo.priv.activator",
+				       access, 1, KDBUS_HELLO_POLICY_HOLDER);
+	ASSERT_RETURN(holder);
+
+	/* Now we are able to TALK to the name */
+
+	cookie++;
+	ret = RUN_UNPRIVILEGED_CONN(unpriv, env->buspath, ({
+		/* Try to talk to the name */
+		ret = kdbus_msg_send(unpriv, "foo.priv.activator",
+				     cookie, 0, 0, 0,
+				     KDBUS_DST_ID_NAME);
+		ASSERT_EXIT(ret == 0);
+	}));
+	ASSERT_RETURN(ret >= 0);
+
+	ret = kdbus_msg_recv_poll(service, 100, NULL, NULL);
+	ASSERT_RETURN(ret == 0);
+
+	ret = RUN_UNPRIVILEGED_CONN(unpriv, env->buspath, ({
+		ret = kdbus_name_acquire(unpriv, "foo.priv.activator",
+					 &flags);
+		ASSERT_EXIT(ret == -EPERM);
+	}));
+	ASSERT_RETURN(ret >= 0);
+
+	kdbus_conn_free(service);
+	kdbus_conn_free(client);
+	kdbus_conn_free(holder);
+
+	return 0;
+}
+
+int kdbus_test_activator(struct kdbus_test_env *env)
+{
+	int ret;
+	struct kdbus_conn *activator;
+	struct pollfd fds[2];
+	bool activator_done = false;
+	struct kdbus_policy_access access[2];
+
+	access[0].type = KDBUS_POLICY_ACCESS_USER;
+	access[0].id = 1001;
+	access[0].access = KDBUS_POLICY_OWN;
+
+	access[1].type = KDBUS_POLICY_ACCESS_WORLD;
+	access[1].access = KDBUS_POLICY_TALK;
+
+	activator = kdbus_hello_activator(env->buspath, "foo.test.activator",
+					  access, 2);
+	ASSERT_RETURN(activator);
+
+	ret = kdbus_add_match_empty(env->conn);
+	ASSERT_RETURN(ret == 0);
+
+	ret = kdbus_name_list(env->conn, KDBUS_NAME_LIST_NAMES |
+					 KDBUS_NAME_LIST_UNIQUE |
+					 KDBUS_NAME_LIST_ACTIVATORS |
+					 KDBUS_NAME_LIST_QUEUED);
+	ASSERT_RETURN(ret == 0);
+
+	ret = kdbus_msg_send(env->conn, "foo.test.activator", 0xdeafbeef,
+			     0, 0, 0, KDBUS_DST_ID_NAME);
+	ASSERT_RETURN(ret == 0);
+
+	fds[0].fd = activator->fd;
+	fds[1].fd = env->conn->fd;
+
+	kdbus_printf("-- entering poll loop ...\n");
+
+	for (;;) {
+		int i, nfds = sizeof(fds) / sizeof(fds[0]);
+
+		for (i = 0; i < nfds; i++) {
+			fds[i].events = POLLIN | POLLPRI;
+			fds[i].revents = 0;
+		}
+
+		ret = poll(fds, nfds, 3000);
+		ASSERT_RETURN(ret >= 0);
+
+		ret = kdbus_name_list(env->conn, KDBUS_NAME_LIST_NAMES);
+		ASSERT_RETURN(ret == 0);
+
+		if ((fds[0].revents & POLLIN) && !activator_done) {
+			uint64_t flags = KDBUS_NAME_REPLACE_EXISTING;
+
+			kdbus_printf("Starter was called back!\n");
+
+			ret = kdbus_name_acquire(env->conn,
+						 "foo.test.activator", &flags);
+			ASSERT_RETURN(ret == 0);
+
+			activator_done = true;
+		}
+
+		if (fds[1].revents & POLLIN) {
+			kdbus_msg_recv(env->conn, NULL, NULL);
+			break;
+		}
+	}
+
+	/* Check capabilities only now, so the tests above always run */
+	ret = test_is_capable(CAP_SETUID, CAP_SETGID, -1);
+	ASSERT_RETURN(ret >= 0);
+
+	if (!ret)
+		return TEST_SKIP;
+
+	ret = kdbus_priv_activator(env);
+	ASSERT_RETURN(ret == 0);
+
+	kdbus_conn_free(activator);
+
+	return TEST_OK;
+}
diff --git a/tools/testing/selftests/kdbus/test-benchmark.c b/tools/testing/selftests/kdbus/test-benchmark.c
new file mode 100644
index 000000000000..849e43dac57b
--- /dev/null
+++ b/tools/testing/selftests/kdbus/test-benchmark.c
@@ -0,0 +1,409 @@
+#include <stdio.h>
+#include <string.h>
+#include <time.h>
+#include <fcntl.h>
+#include <locale.h>
+#include <stdlib.h>
+#include <stddef.h>
+#include <unistd.h>
+#include <stdint.h>
+#include <stdbool.h>
+#include <errno.h>
+#include <assert.h>
+#include <poll.h>
+#include <sys/time.h>
+#include <sys/ioctl.h>
+#include <sys/mman.h>
+#include <sys/socket.h>
+
+#include "kdbus-test.h"
+#include "kdbus-util.h"
+#include "kdbus-enum.h"
+
+#define SERVICE_NAME "foo.bar.echo"
+
+/*
+ * To get a benchmark comparison against unix sockets, set:
+ * use_memfd	= false;
+ * compare_uds	= true;
+ * attach_none	= true;		do not attach metadata
+ */
+
+static const bool use_memfd = true;		/* transmit memfd? */
+static const bool compare_uds = false;		/* unix-socket comparison? */
+static const bool attach_none = false;		/* clear attach-flags? */
+static char stress_payload[8192];
+
+struct stats {
+	uint64_t count;
+	uint64_t latency_acc;
+	uint64_t latency_low;
+	uint64_t latency_high;
+};
+
+static struct stats stats;
+
+static void reset_stats(void)
+{
+	stats.count = 0;
+	stats.latency_acc = 0;
+	stats.latency_low = UINT64_MAX;
+	stats.latency_high = 0;
+}
+
+static void dump_stats(bool is_uds)
+{
+	if (stats.count > 0) {
+		kdbus_printf("stats %s: %'llu packets processed, latency (nsecs) min/max/avg %'7llu // %'7llu // %'7llu\n",
+			     is_uds ? " (UNIX)" : "(KDBUS)",
+			     (unsigned long long) stats.count,
+			     (unsigned long long) stats.latency_low,
+			     (unsigned long long) stats.latency_high,
+			     (unsigned long long) (stats.latency_acc / stats.count));
+	} else {
+		kdbus_printf("*** no packets received. bus stuck?\n");
+	}
+}
+
+static void add_stats(uint64_t prev)
+{
+	uint64_t diff;
+
+	diff = now(CLOCK_THREAD_CPUTIME_ID) - prev;
+
+	stats.count++;
+	stats.latency_acc += diff;
+	if (stats.latency_low > diff)
+		stats.latency_low = diff;
+
+	if (stats.latency_high < diff)
+		stats.latency_high = diff;
+}
+
+static int setup_simple_kdbus_msg(struct kdbus_conn *conn,
+				  uint64_t dst_id,
+				  struct kdbus_msg **msg_out)
+{
+	struct kdbus_msg *msg;
+	struct kdbus_item *item;
+	uint64_t size;
+
+	size = sizeof(struct kdbus_msg);
+	size += KDBUS_ITEM_SIZE(sizeof(struct kdbus_vec));
+
+	msg = malloc(size);
+	ASSERT_RETURN_VAL(msg, -ENOMEM);
+
+	memset(msg, 0, size);
+	msg->size = size;
+	msg->src_id = conn->id;
+	msg->dst_id = dst_id;
+	msg->payload_type = KDBUS_PAYLOAD_DBUS;
+
+	item = msg->items;
+
+	item->type = KDBUS_ITEM_PAYLOAD_VEC;
+	item->size = KDBUS_ITEM_HEADER_SIZE + sizeof(struct kdbus_vec);
+	item->vec.address = (uintptr_t) stress_payload;
+	item->vec.size = sizeof(stress_payload);
+	item = KDBUS_ITEM_NEXT(item);
+
+	*msg_out = msg;
+
+	return 0;
+}
+
+static int setup_memfd_kdbus_msg(struct kdbus_conn *conn,
+				 uint64_t dst_id,
+				 off_t *memfd_item_offset,
+				 struct kdbus_msg **msg_out)
+{
+	struct kdbus_msg *msg;
+	struct kdbus_item *item;
+	uint64_t size;
+
+	size = sizeof(struct kdbus_msg);
+	size += KDBUS_ITEM_SIZE(sizeof(struct kdbus_vec));
+	size += KDBUS_ITEM_SIZE(sizeof(struct kdbus_memfd));
+
+	msg = malloc(size);
+	ASSERT_RETURN_VAL(msg, -ENOMEM);
+
+	memset(msg, 0, size);
+	msg->size = size;
+	msg->src_id = conn->id;
+	msg->dst_id = dst_id;
+	msg->payload_type = KDBUS_PAYLOAD_DBUS;
+
+	item = msg->items;
+
+	item->type = KDBUS_ITEM_PAYLOAD_VEC;
+	item->size = KDBUS_ITEM_HEADER_SIZE + sizeof(struct kdbus_vec);
+	item->vec.address = (uintptr_t) stress_payload;
+	item->vec.size = sizeof(stress_payload);
+	item = KDBUS_ITEM_NEXT(item);
+
+	item->type = KDBUS_ITEM_PAYLOAD_MEMFD;
+	item->size = KDBUS_ITEM_HEADER_SIZE + sizeof(struct kdbus_memfd);
+	item->memfd.size = sizeof(uint64_t);
+
+	*memfd_item_offset = (unsigned char *)item - (unsigned char *)msg;
+	*msg_out = msg;
+
+	return 0;
+}
+
+static int
+send_echo_request(struct kdbus_conn *conn, uint64_t dst_id,
+		  void *kdbus_msg, off_t memfd_item_offset)
+{
+	int memfd = -1;
+	int ret;
+
+	if (use_memfd) {
+		uint64_t now_ns = now(CLOCK_THREAD_CPUTIME_ID);
+		struct kdbus_item *item = memfd_item_offset + kdbus_msg;
+		memfd = sys_memfd_create("memfd-name", 0);
+		ASSERT_RETURN_VAL(memfd >= 0, memfd);
+
+		ret = write(memfd, &now_ns, sizeof(now_ns));
+		ASSERT_RETURN_VAL(ret == sizeof(now_ns), -EAGAIN);
+
+		ret = sys_memfd_seal_set(memfd);
+		ASSERT_RETURN_VAL(ret == 0, -errno);
+
+		item->memfd.fd = memfd;
+	}
+
+	ret = ioctl(conn->fd, KDBUS_CMD_MSG_SEND, kdbus_msg);
+	ASSERT_RETURN_VAL(ret == 0, -errno);
+
+	close(memfd);
+
+	return 0;
+}
+
+static int
+handle_echo_reply(struct kdbus_conn *conn, uint64_t send_ns)
+{
+	int ret;
+	struct kdbus_cmd_recv recv = {};
+	struct kdbus_msg *msg;
+	const struct kdbus_item *item;
+	bool has_memfd = false;
+
+	ret = ioctl(conn->fd, KDBUS_CMD_MSG_RECV, &recv);
+	if (ret < 0 && errno == EAGAIN)
+		return -EAGAIN;
+
+	ASSERT_RETURN_VAL(ret == 0, -errno);
+
+	if (!use_memfd)
+		goto out;
+
+	msg = (struct kdbus_msg *)(conn->buf + recv.offset);
+
+	KDBUS_ITEM_FOREACH(item, msg, items) {
+		switch (item->type) {
+		case KDBUS_ITEM_PAYLOAD_MEMFD: {
+			char *buf;
+
+			buf = mmap(NULL, item->memfd.size, PROT_READ,
+				   MAP_PRIVATE, item->memfd.fd, 0);
+			ASSERT_RETURN_VAL(buf != MAP_FAILED, -EINVAL);
+			ASSERT_RETURN_VAL(item->memfd.size == sizeof(uint64_t),
+					  -EINVAL);
+
+			add_stats(*(uint64_t*)buf);
+			munmap(buf, item->memfd.size);
+			close(item->memfd.fd);
+			has_memfd = true;
+			break;
+		}
+
+		case KDBUS_ITEM_PAYLOAD_OFF:
+			/* ignore */
+			break;
+		}
+	}
+
+out:
+	if (!has_memfd)
+		add_stats(send_ns);
+
+	ret = kdbus_free(conn, recv.offset);
+	ASSERT_RETURN_VAL(ret == 0, -errno);
+
+	return 0;
+}
+
+int kdbus_test_benchmark(struct kdbus_test_env *env)
+{
+	static char buf[sizeof(stress_payload)];
+	struct kdbus_msg *kdbus_msg = NULL;
+	off_t memfd_cached_offset = 0;
+	int ret;
+	struct kdbus_conn *conn_a, *conn_b;
+	struct pollfd fds[2];
+	uint64_t start, send_ns, now_ns, diff;
+	unsigned int i;
+	int uds[2];
+
+	setlocale(LC_ALL, "");
+
+	for (i = 0; i < sizeof(stress_payload); i++)
+		stress_payload[i] = i;
+
+	/* setup kdbus pair */
+
+	conn_a = kdbus_hello(env->buspath, 0, NULL, 0);
+	conn_b = kdbus_hello(env->buspath, 0, NULL, 0);
+	ASSERT_RETURN(conn_a && conn_b);
+
+	ret = kdbus_add_match_empty(conn_a);
+	ASSERT_RETURN(ret == 0);
+
+	ret = kdbus_add_match_empty(conn_b);
+	ASSERT_RETURN(ret == 0);
+
+	ret = kdbus_name_acquire(conn_a, SERVICE_NAME, NULL);
+	ASSERT_RETURN(ret == 0);
+
+	if (attach_none) {
+		ret = kdbus_conn_update_attach_flags(conn_a, 0);
+		ASSERT_RETURN(ret == 0);
+	}
+
+	/* setup UDS pair */
+
+	ret = socketpair(AF_UNIX, SOCK_SEQPACKET | SOCK_NONBLOCK, 0, uds);
+	ASSERT_RETURN(ret == 0);
+
+	/* setup a kdbus msg now */
+	if (use_memfd) {
+		ret = setup_memfd_kdbus_msg(conn_b, conn_a->id,
+					    &memfd_cached_offset,
+					    &kdbus_msg);
+		ASSERT_RETURN(ret == 0);
+	} else {
+		ret = setup_simple_kdbus_msg(conn_b, conn_a->id, &kdbus_msg);
+		ASSERT_RETURN(ret == 0);
+	}
+
+	/* start benchmark */
+
+	kdbus_printf("-- entering poll loop ...\n");
+
+	do {
+		/* run kdbus benchmark */
+		fds[0].fd = conn_a->fd;
+		fds[1].fd = conn_b->fd;
+
+		/* cancel any pending message */
+		handle_echo_reply(conn_a, 0);
+
+		start = now(CLOCK_THREAD_CPUTIME_ID);
+		reset_stats();
+
+		send_ns = now(CLOCK_THREAD_CPUTIME_ID);
+		ret = send_echo_request(conn_b, conn_a->id,
+					kdbus_msg, memfd_cached_offset);
+		ASSERT_RETURN(ret == 0);
+
+		while (1) {
+			unsigned int nfds = sizeof(fds) / sizeof(fds[0]);
+			unsigned int i;
+
+			for (i = 0; i < nfds; i++) {
+				fds[i].events = POLLIN | POLLPRI | POLLHUP;
+				fds[i].revents = 0;
+			}
+
+			ret = poll(fds, nfds, 10);
+			if (ret < 0)
+				break;
+
+			if (fds[0].revents & POLLIN) {
+				ret = handle_echo_reply(conn_a, send_ns);
+				ASSERT_RETURN(ret == 0);
+
+				send_ns = now(CLOCK_THREAD_CPUTIME_ID);
+				ret = send_echo_request(conn_b, conn_a->id,
+							kdbus_msg,
+							memfd_cached_offset);
+				ASSERT_RETURN(ret == 0);
+			}
+
+			now_ns = now(CLOCK_THREAD_CPUTIME_ID);
+			diff = now_ns - start;
+			if (diff > 1000000000ULL) {
+				start = now_ns;
+
+				dump_stats(false);
+				break;
+			}
+		}
+
+		if (!compare_uds)
+			continue;
+
+		/* run unix-socket benchmark as comparison */
+
+		fds[0].fd = uds[0];
+		fds[1].fd = uds[1];
+
+		/* cancel any pending message */
+		read(uds[1], buf, sizeof(buf));
+
+		start = now(CLOCK_THREAD_CPUTIME_ID);
+		reset_stats();
+
+		send_ns = now(CLOCK_THREAD_CPUTIME_ID);
+		ret = write(uds[0], stress_payload, sizeof(stress_payload));
+		ASSERT_RETURN(ret == sizeof(stress_payload));
+
+		while (1) {
+			unsigned int nfds = sizeof(fds) / sizeof(fds[0]);
+			unsigned int i;
+
+			for (i = 0; i < nfds; i++) {
+				fds[i].events = POLLIN | POLLPRI | POLLHUP;
+				fds[i].revents = 0;
+			}
+
+			ret = poll(fds, nfds, 10);
+			if (ret < 0)
+				break;
+
+			if (fds[1].revents & POLLIN) {
+				ret = read(uds[1], buf, sizeof(buf));
+				ASSERT_RETURN(ret == sizeof(buf));
+
+				add_stats(send_ns);
+
+				send_ns = now(CLOCK_THREAD_CPUTIME_ID);
+				ret = write(uds[0], buf, sizeof(buf));
+				ASSERT_RETURN(ret == sizeof(buf));
+			}
+
+			now_ns = now(CLOCK_THREAD_CPUTIME_ID);
+			diff = now_ns - start;
+			if (diff > 1000000000ULL) {
+				start = now_ns;
+
+				dump_stats(true);
+				break;
+			}
+		}
+
+	} while (kdbus_util_verbose);
+
+	kdbus_printf("-- closing bus connections\n");
+
+	free(kdbus_msg);
+
+	kdbus_conn_free(conn_a);
+	kdbus_conn_free(conn_b);
+
+	return (stats.count > 1) ? TEST_OK : TEST_ERR;
+}
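The benchmark above keeps a running count, sum, minimum and maximum of
the measured latencies and derives the average only when printing. A
minimal sketch of that accumulator in isolation follows. The sketch_
names are hypothetical, and unlike the static stats object in the test,
the state is passed explicitly:

```c
#include <stdint.h>

struct sketch_stats {
	uint64_t count;	/* number of samples seen */
	uint64_t acc;	/* sum of all samples */
	uint64_t low;	/* smallest sample */
	uint64_t high;	/* largest sample */
};

/* Start a new measurement window. */
void sketch_stats_reset(struct sketch_stats *s)
{
	s->count = 0;
	s->acc = 0;
	s->low = UINT64_MAX;
	s->high = 0;
}

/* Account for one latency sample. */
void sketch_stats_add(struct sketch_stats *s, uint64_t sample)
{
	s->count++;
	s->acc += sample;
	if (sample < s->low)
		s->low = sample;
	if (sample > s->high)
		s->high = sample;
}

/* Average latency, or 0 if no samples were recorded. */
uint64_t sketch_stats_avg(const struct sketch_stats *s)
{
	return s->count ? s->acc / s->count : 0;
}
```

Starting low at UINT64_MAX and high at 0 means the first sample always
replaces both, so no special case is needed for an empty window other
than guarding the division.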
diff --git a/tools/testing/selftests/kdbus/test-bus.c b/tools/testing/selftests/kdbus/test-bus.c
new file mode 100644
index 000000000000..86e9fefe6d4a
--- /dev/null
+++ b/tools/testing/selftests/kdbus/test-bus.c
@@ -0,0 +1,130 @@
+#include <stdio.h>
+#include <string.h>
+#include <fcntl.h>
+#include <stdlib.h>
+#include <stddef.h>
+#include <unistd.h>
+#include <stdint.h>
+#include <errno.h>
+#include <assert.h>
+#include <limits.h>
+#include <sys/ioctl.h>
+#include <sys/mman.h>
+#include <stdbool.h>
+
+#include "kdbus-util.h"
+#include "kdbus-enum.h"
+#include "kdbus-test.h"
+
+static int test_bus_creator_info(const char *bus_path)
+{
+	int ret;
+	struct kdbus_conn *conn;
+	struct kdbus_cmd_info cmd = {};
+
+	cmd.size = sizeof(cmd);
+
+	conn = kdbus_hello(bus_path, 0, NULL, 0);
+	ASSERT_RETURN(conn);
+
+	ret = ioctl(conn->fd, KDBUS_CMD_BUS_CREATOR_INFO, &cmd);
+	ASSERT_RETURN_VAL(ret == 0, ret);
+
+	ret = kdbus_free(conn, cmd.offset);
+	ASSERT_RETURN_VAL(ret == 0, ret);
+
+	return 0;
+}
+
+int kdbus_test_bus_make(struct kdbus_test_env *env)
+{
+	struct {
+		struct kdbus_cmd_make head;
+
+		/* bloom size item */
+		struct {
+			uint64_t size;
+			uint64_t type;
+			struct kdbus_bloom_parameter bloom;
+		} bs;
+
+		/* name item */
+		uint64_t n_size;
+		uint64_t n_type;
+		char name[64];
+	} bus_make;
+	char s[PATH_MAX], *name;
+	int ret, control_fd2;
+	uid_t uid;
+
+	name = unique_name("");
+	ASSERT_RETURN(name);
+
+	snprintf(s, sizeof(s), "%s/control", env->root);
+	env->control_fd = open(s, O_RDWR|O_CLOEXEC);
+	ASSERT_RETURN(env->control_fd >= 0);
+
+	control_fd2 = open(s, O_RDWR|O_CLOEXEC);
+	ASSERT_RETURN(control_fd2 >= 0);
+
+	memset(&bus_make, 0, sizeof(bus_make));
+
+	bus_make.bs.size = sizeof(bus_make.bs);
+	bus_make.bs.type = KDBUS_ITEM_BLOOM_PARAMETER;
+	bus_make.bs.bloom.size = 64;
+	bus_make.bs.bloom.n_hash = 1;
+
+	bus_make.n_type = KDBUS_ITEM_MAKE_NAME;
+
+	uid = getuid();
+
+	/* missing uid prefix */
+	snprintf(bus_make.name, sizeof(bus_make.name), "foo");
+	bus_make.n_size = KDBUS_ITEM_HEADER_SIZE + strlen(bus_make.name) + 1;
+	bus_make.head.size = sizeof(struct kdbus_cmd_make) +
+			     sizeof(bus_make.bs) + bus_make.n_size;
+	ret = ioctl(env->control_fd, KDBUS_CMD_BUS_MAKE, &bus_make);
+	ASSERT_RETURN(ret == -1 && errno == EINVAL);
+
+	/* non-alphanumeric character */
+	snprintf(bus_make.name, sizeof(bus_make.name), "%u-blah@123", uid);
+	bus_make.n_size = KDBUS_ITEM_HEADER_SIZE + strlen(bus_make.name) + 1;
+	bus_make.head.size = sizeof(struct kdbus_cmd_make) +
+			     sizeof(bus_make.bs) + bus_make.n_size;
+	ret = ioctl(env->control_fd, KDBUS_CMD_BUS_MAKE, &bus_make);
+	ASSERT_RETURN(ret == -1 && errno == EINVAL);
+
+	/* '-' at the end */
+	snprintf(bus_make.name, sizeof(bus_make.name), "%u-blah-", uid);
+	bus_make.n_size = KDBUS_ITEM_HEADER_SIZE + strlen(bus_make.name) + 1;
+	bus_make.head.size = sizeof(struct kdbus_cmd_make) +
+			     sizeof(bus_make.bs) + bus_make.n_size;
+	ret = ioctl(env->control_fd, KDBUS_CMD_BUS_MAKE, &bus_make);
+	ASSERT_RETURN(ret == -1 && errno == EINVAL);
+
+	/* create a new bus */
+	snprintf(bus_make.name, sizeof(bus_make.name), "%u-%s-1", uid, name);
+	bus_make.n_size = KDBUS_ITEM_HEADER_SIZE + strlen(bus_make.name) + 1;
+	bus_make.head.size = sizeof(struct kdbus_cmd_make) +
+			     sizeof(bus_make.bs) + bus_make.n_size;
+	ret = ioctl(env->control_fd, KDBUS_CMD_BUS_MAKE, &bus_make);
+	ASSERT_RETURN(ret == 0);
+
+	ret = ioctl(control_fd2, KDBUS_CMD_BUS_MAKE, &bus_make);
+	ASSERT_RETURN(ret == -1 && errno == EEXIST);
+
+	snprintf(s, sizeof(s), "%s/%u-%s-1/bus", env->root, uid, name);
+	ASSERT_RETURN(access(s, F_OK) == 0);
+
+	ret = test_bus_creator_info(s);
+	ASSERT_RETURN(ret == 0);
+
+	/* can't use the same fd for bus make twice */
+	ret = ioctl(env->control_fd, KDBUS_CMD_BUS_MAKE, &bus_make);
+	ASSERT_RETURN(ret == -1 && errno == EBADFD);
+
+	close(control_fd2);
+	free(name);
+
+	return TEST_OK;
+}
diff --git a/tools/testing/selftests/kdbus/test-chat.c b/tools/testing/selftests/kdbus/test-chat.c
new file mode 100644
index 000000000000..6a0efbcc3846
--- /dev/null
+++ b/tools/testing/selftests/kdbus/test-chat.c
@@ -0,0 +1,123 @@
+#include <stdio.h>
+#include <string.h>
+#include <time.h>
+#include <fcntl.h>
+#include <stdlib.h>
+#include <stddef.h>
+#include <unistd.h>
+#include <stdint.h>
+#include <errno.h>
+#include <assert.h>
+#include <poll.h>
+#include <sys/ioctl.h>
+#include <stdbool.h>
+
+#include "kdbus-test.h"
+#include "kdbus-util.h"
+#include "kdbus-enum.h"
+
+int kdbus_test_chat(struct kdbus_test_env *env)
+{
+	int ret, cookie;
+	struct kdbus_conn *conn_a, *conn_b;
+	struct pollfd fds[2];
+	uint64_t flags;
+	int count;
+
+	conn_a = kdbus_hello(env->buspath, 0, NULL, 0);
+	conn_b = kdbus_hello(env->buspath, 0, NULL, 0);
+	ASSERT_RETURN(conn_a && conn_b);
+
+	flags = KDBUS_NAME_ALLOW_REPLACEMENT;
+	ret = kdbus_name_acquire(conn_a, "foo.bar.test", &flags);
+	ASSERT_RETURN(ret == 0);
+
+	ret = kdbus_name_acquire(conn_a, "foo.bar.baz", NULL);
+	ASSERT_RETURN(ret == 0);
+
+	flags = KDBUS_NAME_QUEUE;
+	ret = kdbus_name_acquire(conn_b, "foo.bar.baz", &flags);
+	ASSERT_RETURN(ret == 0);
+
+	ret = kdbus_name_acquire(conn_a, "foo.bar.double", NULL);
+	ASSERT_RETURN(ret == 0);
+
+	ret = kdbus_name_acquire(conn_a, "foo.bar.double", NULL);
+	ASSERT_RETURN(ret == -EALREADY);
+
+	ret = kdbus_name_release(conn_a, "foo.bar.double");
+	ASSERT_RETURN(ret == 0);
+
+	ret = kdbus_name_release(conn_a, "foo.bar.double");
+	ASSERT_RETURN(ret == -ESRCH);
+
+	ret = kdbus_name_list(conn_b, KDBUS_NAME_LIST_UNIQUE |
+				      KDBUS_NAME_LIST_NAMES  |
+				      KDBUS_NAME_LIST_QUEUED |
+				      KDBUS_NAME_LIST_ACTIVATORS);
+	ASSERT_RETURN(ret == 0);
+
+	ret = kdbus_add_match_empty(conn_a);
+	ASSERT_RETURN(ret == 0);
+
+	ret = kdbus_add_match_empty(conn_b);
+	ASSERT_RETURN(ret == 0);
+
+	cookie = 0;
+	ret = kdbus_msg_send(conn_b, NULL, 0xc0000000 | cookie, 0, 0, 0,
+			     KDBUS_DST_ID_BROADCAST);
+	ASSERT_RETURN(ret == 0);
+
+	fds[0].fd = conn_a->fd;
+	fds[1].fd = conn_b->fd;
+
+	kdbus_printf("-- entering poll loop ...\n");
+
+	for (count = 0;; count++) {
+		int i, nfds = sizeof(fds) / sizeof(fds[0]);
+
+		for (i = 0; i < nfds; i++) {
+			fds[i].events = POLLIN | POLLPRI | POLLHUP;
+			fds[i].revents = 0;
+		}
+
+		ret = poll(fds, nfds, 3000);
+		ASSERT_RETURN(ret >= 0);
+
+		if (fds[0].revents & POLLIN) {
+			if (count > 2)
+				kdbus_name_release(conn_a, "foo.bar.baz");
+
+			ret = kdbus_msg_recv(conn_a, NULL, NULL);
+			ASSERT_RETURN(ret == 0);
+			ret = kdbus_msg_send(conn_a, NULL,
+					     0xc0000000 | cookie++,
+					     0, 0, 0, conn_b->id);
+			ASSERT_RETURN(ret == 0);
+		}
+
+		if (fds[1].revents & POLLIN) {
+			ret = kdbus_msg_recv(conn_b, NULL, NULL);
+			ASSERT_RETURN(ret == 0);
+			ret = kdbus_msg_send(conn_b, NULL,
+					     0xc0000000 | cookie++,
+					     0, 0, 0, conn_a->id);
+			ASSERT_RETURN(ret == 0);
+		}
+
+		ret = kdbus_name_list(conn_b, KDBUS_NAME_LIST_UNIQUE |
+					      KDBUS_NAME_LIST_NAMES  |
+					      KDBUS_NAME_LIST_QUEUED |
+					      KDBUS_NAME_LIST_ACTIVATORS);
+		ASSERT_RETURN(ret == 0);
+
+		if (count > 10)
+			break;
+	}
+
+	kdbus_printf("-- closing bus connections\n");
+	kdbus_conn_free(conn_a);
+	kdbus_conn_free(conn_b);
+
+	return TEST_OK;
+}
diff --git a/tools/testing/selftests/kdbus/test-connection.c b/tools/testing/selftests/kdbus/test-connection.c
new file mode 100644
index 000000000000..a21b6a581408
--- /dev/null
+++ b/tools/testing/selftests/kdbus/test-connection.c
@@ -0,0 +1,501 @@
+#include <stdio.h>
+#include <string.h>
+#include <fcntl.h>
+#include <stdlib.h>
+#include <stddef.h>
+#include <unistd.h>
+#include <stdint.h>
+#include <errno.h>
+#include <assert.h>
+#include <limits.h>
+#include <sys/types.h>
+#include <sys/capability.h>
+#include <sys/ioctl.h>
+#include <sys/mman.h>
+#include <sys/syscall.h>
+#include <sys/wait.h>
+#include <stdbool.h>
+
+#include "kdbus-util.h"
+#include "kdbus-enum.h"
+#include "kdbus-test.h"
+
+int kdbus_test_hello(struct kdbus_test_env *env)
+{
+	struct kdbus_cmd_hello hello;
+	int fd, ret;
+
+	memset(&hello, 0, sizeof(hello));
+
+	fd = open(env->buspath, O_RDWR|O_CLOEXEC);
+	ASSERT_RETURN(fd >= 0);
+
+	hello.flags = KDBUS_HELLO_ACCEPT_FD;
+	hello.attach_flags_send = _KDBUS_ATTACH_ALL;
+	hello.attach_flags_recv = _KDBUS_ATTACH_ALL;
+	hello.size = sizeof(struct kdbus_cmd_hello);
+	hello.pool_size = POOL_SIZE;
+
+	/* an unaligned hello must result in -EFAULT */
+	ret = ioctl(fd, KDBUS_CMD_HELLO, (char *) &hello + 1);
+	ASSERT_RETURN(ret == -1 && errno == EFAULT);
+
+	/* a size smaller than the struct must return EINVAL */
+	hello.size = 1;
+	hello.flags = KDBUS_HELLO_ACCEPT_FD;
+	hello.attach_flags_send = _KDBUS_ATTACH_ALL;
+	ret = ioctl(fd, KDBUS_CMD_HELLO, &hello);
+	ASSERT_RETURN(ret == -1 && errno == EINVAL);
+
+	hello.size = sizeof(struct kdbus_cmd_hello);
+
+	/* check faulty flags */
+	hello.flags = 1ULL << 32;
+	hello.attach_flags_send = _KDBUS_ATTACH_ALL;
+	ret = ioctl(fd, KDBUS_CMD_HELLO, &hello);
+	ASSERT_RETURN(ret == -1 && errno == EINVAL);
+
+	/* kernel must have set its bit in the ioctl buffer */
+	ASSERT_RETURN(hello.kernel_flags & KDBUS_FLAG_KERNEL);
+
+	/* check for faulty pool sizes */
+	hello.pool_size = 0;
+	hello.flags = KDBUS_HELLO_ACCEPT_FD;
+	hello.attach_flags_send = _KDBUS_ATTACH_ALL;
+	ret = ioctl(fd, KDBUS_CMD_HELLO, &hello);
+	ASSERT_RETURN(ret == -1 && errno == EFAULT);
+
+	hello.pool_size = 4097;
+	hello.attach_flags_send = _KDBUS_ATTACH_ALL;
+	ret = ioctl(fd, KDBUS_CMD_HELLO, &hello);
+	ASSERT_RETURN(ret == -1 && errno == EFAULT);
+
+	hello.pool_size = POOL_SIZE;
+
+	/*
+	 * The connection created by the core requires ALL meta flags
+	 * to be sent. An attempt to send less than that should result
+	 * in -ECONNREFUSED.
+	 */
+	hello.attach_flags_send = _KDBUS_ATTACH_ALL & ~KDBUS_ATTACH_TIMESTAMP;
+	ret = ioctl(fd, KDBUS_CMD_HELLO, &hello);
+	ASSERT_RETURN(ret == -1 && errno == ECONNREFUSED);
+
+	hello.attach_flags_send = _KDBUS_ATTACH_ALL;
+
+	/* success test */
+	ret = ioctl(fd, KDBUS_CMD_HELLO, &hello);
+	ASSERT_RETURN(ret == 0);
+
+	/* The kernel should have set KDBUS_FLAG_KERNEL */
+	ASSERT_RETURN(hello.kernel_flags & KDBUS_FLAG_KERNEL);
+
+	close(fd);
+
+	fd = open(env->buspath, O_RDWR|O_CLOEXEC);
+	ASSERT_RETURN(fd >= 0);
+
+	/* no ACTIVATOR flag without a name */
+	hello.flags = KDBUS_HELLO_ACTIVATOR;
+	ret = ioctl(fd, KDBUS_CMD_HELLO, &hello);
+	ASSERT_RETURN(ret == -1 && errno == EINVAL);
+
+	close(fd);
+
+	return TEST_OK;
+}
+
+int kdbus_test_byebye(struct kdbus_test_env *env)
+{
+	struct kdbus_conn *conn;
+	struct kdbus_cmd_recv recv = {};
+	int ret;
+
+	/* create a 2nd connection */
+	conn = kdbus_hello(env->buspath, 0, NULL, 0);
+	ASSERT_RETURN(conn != NULL);
+
+	ret = kdbus_add_match_empty(conn);
+	ASSERT_RETURN(ret == 0);
+
+	ret = kdbus_add_match_empty(env->conn);
+	ASSERT_RETURN(ret == 0);
+
+	/* send over 1st connection */
+	ret = kdbus_msg_send(env->conn, NULL, 0, 0, 0, 0,
+			     KDBUS_DST_ID_BROADCAST);
+	ASSERT_RETURN(ret == 0);
+
+	/* say byebye on the 2nd, which must fail */
+	ret = ioctl(conn->fd, KDBUS_CMD_BYEBYE, 0);
+	ASSERT_RETURN(ret == -1 && errno == EBUSY);
+
+	/* receive the message */
+	ret = ioctl(conn->fd, KDBUS_CMD_MSG_RECV, &recv);
+	ASSERT_RETURN(ret == 0);
+
+	ret = kdbus_free(conn, recv.offset);
+	ASSERT_RETURN(ret == 0);
+
+	/* and try again */
+	ret = ioctl(conn->fd, KDBUS_CMD_BYEBYE, 0);
+	ASSERT_RETURN(ret == 0);
+
+	/* a 2nd try should result in -EALREADY */
+	ret = ioctl(conn->fd, KDBUS_CMD_BYEBYE, 0);
+	ASSERT_RETURN(ret == -1 && errno == EALREADY);
+
+	kdbus_conn_free(conn);
+
+	return TEST_OK;
+}
+
+/* Get only the first item */
+static struct kdbus_item *kdbus_get_item(struct kdbus_info *info,
+					 uint64_t type)
+{
+	struct kdbus_item *item;
+
+	KDBUS_ITEM_FOREACH(item, info, items)
+		if (item->type == type)
+			return item;
+
+	return NULL;
+}
+
+static unsigned int kdbus_count_item(struct kdbus_info *info,
+				     uint64_t type)
+{
+	unsigned int i = 0;
+	const struct kdbus_item *item;
+
+	KDBUS_ITEM_FOREACH(item, info, items)
+		if (item->type == type)
+			i++;
+
+	return i;
+}
+
+static int kdbus_fuzz_conn_info(struct kdbus_test_env *env)
+{
+	int ret;
+	unsigned int cnt = 0;
+	uint64_t offset = 0;
+	struct kdbus_info *info;
+	struct kdbus_conn *conn;
+	struct kdbus_conn *privileged;
+	const struct kdbus_item *item;
+	uint64_t valid_flags = KDBUS_ATTACH_NAMES |
+			       KDBUS_ATTACH_CREDS |
+			       KDBUS_ATTACH_CONN_DESCRIPTION;
+
+	uint64_t invalid_flags = KDBUS_ATTACH_NAMES	|
+				 KDBUS_ATTACH_CREDS	|
+				 KDBUS_ATTACH_CAPS	|
+				 KDBUS_ATTACH_CGROUP	|
+				 KDBUS_ATTACH_CONN_DESCRIPTION;
+
+	struct kdbus_creds cached_creds = {
+		.uid	= getuid(),
+		.gid	= getgid(),
+		.pid	= getpid(),
+		.tid	= syscall(SYS_gettid),
+	};
+
+	ret = kdbus_info(env->conn, env->conn->id, NULL,
+			 valid_flags, &offset);
+	ASSERT_RETURN(ret == 0);
+
+	info = (struct kdbus_info *)(env->conn->buf + offset);
+	ASSERT_RETURN(info->id == env->conn->id);
+
+	/* We do not have any well-known name */
+	item = kdbus_get_item(info, KDBUS_ITEM_NAME);
+	ASSERT_RETURN(item == NULL);
+
+	item = kdbus_get_item(info, KDBUS_ITEM_CONN_DESCRIPTION);
+	ASSERT_RETURN(item);
+
+	kdbus_free(env->conn, offset);
+
+	conn = kdbus_hello(env->buspath, 0, NULL, 0);
+	ASSERT_RETURN(conn);
+
+	privileged = kdbus_hello(env->buspath, 0, NULL, 0);
+	ASSERT_RETURN(privileged);
+
+	ret = kdbus_info(conn, conn->id, NULL, valid_flags, &offset);
+	ASSERT_RETURN(ret == 0);
+
+	info = (struct kdbus_info *)(conn->buf + offset);
+	ASSERT_RETURN(info->id == conn->id);
+
+	/* We do not have any well-known name */
+	item = kdbus_get_item(info, KDBUS_ITEM_NAME);
+	ASSERT_RETURN(item == NULL);
+
+	cnt = kdbus_count_item(info, KDBUS_ITEM_CREDS);
+	ASSERT_RETURN(cnt == 1);
+
+	item = kdbus_get_item(info, KDBUS_ITEM_CREDS);
+	ASSERT_RETURN(item);
+
+	/* Compare item->creds with cached creds */
+	ASSERT_RETURN(item->creds.uid == cached_creds.uid &&
+		      item->creds.gid == cached_creds.gid &&
+		      item->creds.pid == cached_creds.pid &&
+		      item->creds.tid == cached_creds.tid);
+
+	/* We did not request KDBUS_ITEM_CAPS */
+	item = kdbus_get_item(info, KDBUS_ITEM_CAPS);
+	ASSERT_RETURN(item == NULL);
+
+	kdbus_free(conn, offset);
+
+	ret = kdbus_name_acquire(conn, "com.example.a", NULL);
+	ASSERT_RETURN(ret >= 0);
+
+	ret = kdbus_info(conn, conn->id, NULL, valid_flags, &offset);
+	ASSERT_RETURN(ret == 0);
+
+	info = (struct kdbus_info *)(conn->buf + offset);
+	ASSERT_RETURN(info->id == conn->id);
+
+	item = kdbus_get_item(info, KDBUS_ITEM_OWNED_NAME);
+	ASSERT_RETURN(item && !strcmp(item->name.name, "com.example.a"));
+
+	kdbus_free(conn, offset);
+
+	ret = kdbus_info(conn, 0, "com.example.a", valid_flags, &offset);
+	ASSERT_RETURN(ret == 0);
+
+	info = (struct kdbus_info *)(conn->buf + offset);
+	ASSERT_RETURN(info->id == conn->id);
+
+	kdbus_free(conn, offset);
+
+	ret = RUN_UNPRIVILEGED(UNPRIV_UID, UNPRIV_GID, ({
+		ret = kdbus_info(conn, conn->id, NULL,
+				 valid_flags, &offset);
+		ASSERT_EXIT(ret == 0);
+
+		info = (struct kdbus_info *)(conn->buf + offset);
+		ASSERT_EXIT(info->id == conn->id);
+
+		item = kdbus_get_item(info, KDBUS_ITEM_OWNED_NAME);
+		ASSERT_EXIT(item &&
+			!strcmp(item->name.name, "com.example.a"));
+
+		item = kdbus_get_item(info, KDBUS_ITEM_CREDS);
+		ASSERT_EXIT(item);
+
+		/*
+		 * Compare item->creds with cached creds of
+		 * privileged one.
+		 *
+		 * cmd_info will always return cached creds.
+		 */
+		ASSERT_EXIT(item->creds.uid == cached_creds.uid &&
+			    item->creds.gid == cached_creds.gid &&
+			    item->creds.pid == cached_creds.pid &&
+			    item->creds.tid == cached_creds.tid);
+
+		kdbus_free(conn, offset);
+
+		/*
+		 * Use invalid_flags and make sure that userspace
+		 * does not play with us.
+		 */
+		ret = kdbus_info(conn, conn->id, NULL,
+				 invalid_flags, &offset);
+		ASSERT_EXIT(ret == 0);
+
+		/*
+		 * Make sure that we return only one creds item and
+		 * it points to the cached creds.
+		 */
+		cnt = kdbus_count_item(info, KDBUS_ITEM_CREDS);
+		ASSERT_EXIT(cnt == 1);
+
+		item = kdbus_get_item(info, KDBUS_ITEM_CREDS);
+		ASSERT_EXIT(item);
+
+		/* Compare item->creds with cached creds */
+		ASSERT_EXIT(item->creds.uid == cached_creds.uid &&
+			    item->creds.gid == cached_creds.gid &&
+			    item->creds.pid == cached_creds.pid &&
+			    item->creds.tid == cached_creds.tid);
+
+		cnt = kdbus_count_item(info, KDBUS_ITEM_CGROUP);
+		ASSERT_EXIT(cnt == 1);
+
+		cnt = kdbus_count_item(info, KDBUS_ITEM_CAPS);
+		ASSERT_EXIT(cnt == 1);
+
+		kdbus_free(conn, offset);
+	}),
+	({ 0; }));
+
+	/* A second name */
+	ret = kdbus_name_acquire(conn, "com.example.b", NULL);
+	ASSERT_RETURN(ret >= 0);
+
+	ret = kdbus_info(conn, conn->id, NULL, valid_flags, &offset);
+	ASSERT_RETURN(ret == 0);
+
+	info = (struct kdbus_info *)(conn->buf + offset);
+	ASSERT_RETURN(info->id == conn->id);
+
+	cnt = kdbus_count_item(info, KDBUS_ITEM_OWNED_NAME);
+	ASSERT_RETURN(cnt == 2);
+
+	kdbus_free(conn, offset);
+
+	ASSERT_RETURN(ret == 0);
+
+	return 0;
+}
+
+int kdbus_test_conn_info(struct kdbus_test_env *env)
+{
+	int ret;
+	struct {
+		struct kdbus_cmd_info cmd_info;
+
+		struct {
+			uint64_t size;
+			uint64_t type;
+			char str[64];
+		} name;
+	} buf = {};
+
+	buf.cmd_info.size = sizeof(struct kdbus_cmd_info);
+	buf.cmd_info.flags = 0;
+	buf.cmd_info.id = env->conn->id;
+
+	ret = kdbus_info(env->conn, env->conn->id, NULL, 0, NULL);
+	ASSERT_RETURN(ret == 0);
+
+	/* try to pass a name that is longer than the buffer's size */
+	buf.name.size = KDBUS_ITEM_HEADER_SIZE + 1;
+	buf.name.type = KDBUS_ITEM_NAME;
+	strcpy(buf.name.str, "foo.bar.bla");
+
+	buf.cmd_info.id = 0;
+	buf.cmd_info.size = sizeof(buf.cmd_info) + buf.name.size;
+	ret = ioctl(env->conn->fd, KDBUS_CMD_CONN_INFO, &buf);
+	ASSERT_RETURN(ret == -1 && errno == EINVAL);
+
+	/* Pass a non-existent name */
+	ret = kdbus_info(env->conn, 0, "non.existent.name", 0, NULL);
+	ASSERT_RETURN(ret == -ESRCH);
+
+	/* Test for caps only here, so the tests above run without them */
+	ret = test_is_capable(CAP_SETUID, CAP_SETGID, -1);
+	ASSERT_RETURN(ret >= 0);
+
+	if (!ret)
+		return TEST_SKIP;
+
+	ret = kdbus_fuzz_conn_info(env);
+	ASSERT_RETURN(ret == 0);
+
+	return TEST_OK;
+}
+
+int kdbus_test_conn_update(struct kdbus_test_env *env)
+{
+	const struct kdbus_item *item;
+	struct kdbus_conn *conn;
+	struct kdbus_msg *msg;
+	int found = 0;
+	int ret;
+
+	/*
+	 * kdbus_hello() sets all attach flags. Receive a message on this
+	 * connection, and make sure a timestamp item (just to pick one) is
+	 * present.
+	 */
+	conn = kdbus_hello(env->buspath, 0, NULL, 0);
+	ASSERT_RETURN(conn);
+
+	ret = kdbus_msg_send(env->conn, NULL, 0x12345678, 0, 0, 0, conn->id);
+	ASSERT_RETURN(ret == 0);
+
+	ret = kdbus_msg_recv(conn, &msg, NULL);
+	ASSERT_RETURN(ret == 0);
+
+	KDBUS_ITEM_FOREACH(item, msg, items)
+		if (item->type == KDBUS_ITEM_TIMESTAMP)
+			found = 1;
+
+	kdbus_msg_free(msg);
+
+	ASSERT_RETURN(found == 1);
+
+	/*
+	 * Now, modify the attach flags and repeat the action. The item must
+	 * now be missing.
+	 */
+	found = 0;
+
+	ret = kdbus_conn_update_attach_flags(conn, _KDBUS_ATTACH_ALL &
+						   ~KDBUS_ATTACH_TIMESTAMP);
+	ASSERT_RETURN(ret == 0);
+
+	ret = kdbus_msg_send(env->conn, NULL, 0x12345678, 0, 0, 0, conn->id);
+	ASSERT_RETURN(ret == 0);
+
+	ret = kdbus_msg_recv(conn, &msg, NULL);
+	ASSERT_RETURN(ret == 0);
+
+	KDBUS_ITEM_FOREACH(item, msg, items)
+		if (item->type == KDBUS_ITEM_TIMESTAMP)
+			found = 1;
+
+	ASSERT_RETURN(found == 0);
+
+	kdbus_msg_free(msg);
+
+	kdbus_conn_free(conn);
+
+	return TEST_OK;
+}
+
+int kdbus_test_writable_pool(struct kdbus_test_env *env)
+{
+	struct kdbus_cmd_hello hello;
+	int fd, ret;
+	void *map;
+
+	fd = open(env->buspath, O_RDWR | O_CLOEXEC);
+	ASSERT_RETURN(fd >= 0);
+
+	memset(&hello, 0, sizeof(hello));
+	hello.flags = KDBUS_HELLO_ACCEPT_FD;
+	hello.attach_flags_send = _KDBUS_ATTACH_ALL;
+	hello.attach_flags_recv = _KDBUS_ATTACH_ALL;
+	hello.size = sizeof(struct kdbus_cmd_hello);
+	hello.pool_size = POOL_SIZE;
+
+	/* success test */
+	ret = ioctl(fd, KDBUS_CMD_HELLO, &hello);
+	ASSERT_RETURN(ret == 0);
+
+	/* pools cannot be mapped writable */
+	map = mmap(NULL, POOL_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
+	ASSERT_RETURN(map == MAP_FAILED);
+
+	/* pools can always be mapped readable */
+	map = mmap(NULL, POOL_SIZE, PROT_READ, MAP_SHARED, fd, 0);
+	ASSERT_RETURN(map != MAP_FAILED);
+
+	/* make sure we cannot change protection masks to writable */
+	ret = mprotect(map, POOL_SIZE, PROT_READ | PROT_WRITE);
+	ASSERT_RETURN(ret < 0);
+
+	munmap(map, POOL_SIZE);
+	close(fd);
+
+	return TEST_OK;
+}
diff --git a/tools/testing/selftests/kdbus/test-daemon.c b/tools/testing/selftests/kdbus/test-daemon.c
new file mode 100644
index 000000000000..9007e38d6a7a
--- /dev/null
+++ b/tools/testing/selftests/kdbus/test-daemon.c
@@ -0,0 +1,66 @@
+#include <stdio.h>
+#include <string.h>
+#include <time.h>
+#include <fcntl.h>
+#include <stdlib.h>
+#include <stddef.h>
+#include <unistd.h>
+#include <stdint.h>
+#include <errno.h>
+#include <assert.h>
+#include <poll.h>
+#include <sys/ioctl.h>
+#include <stdbool.h>
+
+#include "kdbus-test.h"
+#include "kdbus-util.h"
+#include "kdbus-enum.h"
+
+int kdbus_test_daemon(struct kdbus_test_env *env)
+{
+	struct pollfd fds[2];
+	int count;
+	int ret;
+
+	/* This test doesn't make any sense in non-interactive mode */
+	if (!kdbus_util_verbose)
+		return TEST_OK;
+
+	printf("Created connection %llu on bus '%s'\n",
+		(unsigned long long) env->conn->id, env->buspath);
+
+	ret = kdbus_name_acquire(env->conn, "com.example.kdbus-test", NULL);
+	ASSERT_RETURN(ret == 0);
+	printf("  Acquired name: com.example.kdbus-test\n");
+
+	fds[0].fd = env->conn->fd;
+	fds[1].fd = STDIN_FILENO;
+
+	printf("Monitoring connections:\n");
+
+	for (count = 0;; count++) {
+		int i, nfds = sizeof(fds) / sizeof(fds[0]);
+
+		for (i = 0; i < nfds; i++) {
+			fds[i].events = POLLIN | POLLPRI | POLLHUP;
+			fds[i].revents = 0;
+		}
+
+		ret = poll(fds, nfds, -1);
+		if (ret <= 0)
+			break;
+
+		if (fds[0].revents & POLLIN) {
+			ret = kdbus_msg_recv(env->conn, NULL, NULL);
+			ASSERT_RETURN(ret == 0);
+		}
+
+		/* stdin */
+		if (fds[1].revents & POLLIN)
+			break;
+	}
+
+	printf("Closing bus connection\n");
+
+	return TEST_OK;
+}
diff --git a/tools/testing/selftests/kdbus/test-endpoint.c b/tools/testing/selftests/kdbus/test-endpoint.c
new file mode 100644
index 000000000000..1a2263f5b584
--- /dev/null
+++ b/tools/testing/selftests/kdbus/test-endpoint.c
@@ -0,0 +1,221 @@
+#include <stdio.h>
+#include <string.h>
+#include <fcntl.h>
+#include <stdlib.h>
+#include <stddef.h>
+#include <unistd.h>
+#include <stdint.h>
+#include <errno.h>
+#include <assert.h>
+#include <libgen.h>
+#include <sys/ioctl.h>
+#include <stdbool.h>
+
+#include "kdbus-util.h"
+#include "kdbus-enum.h"
+#include "kdbus-test.h"
+
+#define KDBUS_SYSNAME_MAX_LEN			63
+
+static int install_name_add_match(struct kdbus_conn *conn, const char *name)
+{
+	struct {
+		struct kdbus_cmd_match cmd;
+		struct {
+			uint64_t size;
+			uint64_t type;
+			struct kdbus_notify_name_change chg;
+		} item;
+		char name[64];
+	} buf;
+	int ret;
+
+	/* install the match rule */
+	memset(&buf, 0, sizeof(buf));
+	buf.item.type = KDBUS_ITEM_NAME_ADD;
+	buf.item.chg.old_id.id = KDBUS_MATCH_ID_ANY;
+	buf.item.chg.new_id.id = KDBUS_MATCH_ID_ANY;
+	strncpy(buf.name, name, sizeof(buf.name) - 1);
+	buf.item.size = sizeof(buf.item) + strlen(buf.name) + 1;
+	buf.cmd.size = sizeof(buf.cmd) + buf.item.size;
+
+	ret = ioctl(conn->fd, KDBUS_CMD_MATCH_ADD, &buf);
+	if (ret < 0)
+		return ret;
+
+	return 0;
+}
+
+static int create_endpoint(const char *buspath, const char *name)
+{
+	struct {
+		struct kdbus_cmd_make head;
+
+		/* name item */
+		struct {
+			uint64_t size;
+			uint64_t type;
+			/* max should be KDBUS_SYSNAME_MAX_LEN */
+			char str[128];
+		} name;
+	} ep_make;
+	int fd, ret;
+
+	fd = open(buspath, O_RDWR);
+	if (fd < 0)
+		return fd;
+
+	memset(&ep_make, 0, sizeof(ep_make));
+
+	snprintf(ep_make.name.str,
+		 /* Use the KDBUS_SYSNAME_MAX_LEN or sizeof(str) */
+		 KDBUS_SYSNAME_MAX_LEN > strlen(name) ?
+		 KDBUS_SYSNAME_MAX_LEN : sizeof(ep_make.name.str),
+		 "%u-%s", getuid(), name);
+
+	ep_make.name.type = KDBUS_ITEM_MAKE_NAME;
+	ep_make.name.size = KDBUS_ITEM_HEADER_SIZE +
+			    strlen(ep_make.name.str) + 1;
+
+	ep_make.head.size = sizeof(ep_make.head) +
+			    ep_make.name.size;
+
+	ret = ioctl(fd, KDBUS_CMD_ENDPOINT_MAKE, &ep_make);
+	if (ret < 0) {
+		ret = -errno;
+		kdbus_printf("error creating endpoint: %d (%m)\n", ret);
+		return ret;
+	}
+
+	return fd;
+}
+
+static int update_endpoint(int fd, const char *name)
+{
+	int len = strlen(name) + 1;
+	struct {
+		struct kdbus_cmd_update head;
+
+		/* name item */
+		struct {
+			uint64_t size;
+			uint64_t type;
+			char str[KDBUS_ALIGN8(len)];
+		} name;
+
+		struct {
+			uint64_t size;
+			uint64_t type;
+			struct kdbus_policy_access access;
+		} access;
+	} ep_update;
+	int ret;
+
+	memset(&ep_update, 0, sizeof(ep_update));
+
+	ep_update.name.size = KDBUS_ITEM_HEADER_SIZE + len;
+	ep_update.name.type = KDBUS_ITEM_NAME;
+	strncpy(ep_update.name.str, name, sizeof(ep_update.name.str) - 1);
+
+	ep_update.access.size = sizeof(ep_update.access);
+	ep_update.access.type = KDBUS_ITEM_POLICY_ACCESS;
+	ep_update.access.access.type = KDBUS_POLICY_ACCESS_WORLD;
+	ep_update.access.access.access = KDBUS_POLICY_SEE;
+
+	ep_update.head.size = sizeof(ep_update);
+
+	ret = ioctl(fd, KDBUS_CMD_ENDPOINT_UPDATE, &ep_update);
+	if (ret < 0) {
+		ret = -errno;
+		kdbus_printf("error updating endpoint: %d (%m)\n", ret);
+		return ret;
+	}
+
+	return 0;
+}
+
+int kdbus_test_custom_endpoint(struct kdbus_test_env *env)
+{
+	char *ep, *tmp;
+	int ret, ep_fd;
+	struct kdbus_msg *msg;
+	struct kdbus_conn *ep_conn;
+	const char *name = "foo.bar.baz";
+	const char *epname = "foo";
+	char fake_ep[KDBUS_SYSNAME_MAX_LEN + 1] = {'\0'};
+
+	memset(fake_ep, 'X', sizeof(fake_ep) - 1);
+
+	/* Try to create a custom endpoint with a long name */
+	ret = create_endpoint(env->buspath, fake_ep);
+	ASSERT_RETURN(ret == -ENAMETOOLONG);
+
+	/* create a custom endpoint, and open a connection on it */
+	ep_fd = create_endpoint(env->buspath, "foo");
+	ASSERT_RETURN(ep_fd >= 0);
+
+	tmp = strdup(env->buspath);
+	ASSERT_RETURN(tmp);
+
+	ret = asprintf(&ep, "%s/%u-%s", dirname(tmp), getuid(), epname);
+	free(tmp);
+	ASSERT_RETURN(ret >= 0);
+
+	ep_conn = kdbus_hello(ep, 0, NULL, 0);
+	ASSERT_RETURN(ep_conn);
+
+	/*
+	 * Add a name add match on the endpoint connection, acquire name from
+	 * the unfiltered connection, and make sure the filtered connection
+	 * did not get the notification on the name owner change. Also, the
+	 * endpoint connection must not be able to call conn_info, either on
+	 * the name or on the ID.
+	 */
+	ret = install_name_add_match(ep_conn, name);
+	ASSERT_RETURN(ret == 0);
+
+	ret = kdbus_name_acquire(env->conn, name, NULL);
+	ASSERT_RETURN(ret == 0);
+
+	ret = kdbus_msg_recv(ep_conn, NULL, NULL);
+	ASSERT_RETURN(ret == -EAGAIN);
+
+	ret = kdbus_info(ep_conn, 0, name, 0, NULL);
+	ASSERT_RETURN(ret == -ENOENT);
+
+	ret = kdbus_info(ep_conn, env->conn->id, NULL, 0, NULL);
+	ASSERT_RETURN(ret == -ENOENT);
+
+	/*
+	 * Release the name again, update the custom endpoint policy,
+	 * and try again. This time, the connection on the custom endpoint
+	 * should have gotten it.
+	 */
+	ret = kdbus_name_release(env->conn, name);
+	ASSERT_RETURN(ret == 0);
+
+	ret = update_endpoint(ep_fd, name);
+	ASSERT_RETURN(ret == 0);
+
+	ret = kdbus_name_acquire(env->conn, name, NULL);
+	ASSERT_RETURN(ret == 0);
+
+	ret = kdbus_msg_recv(ep_conn, &msg, NULL);
+	ASSERT_RETURN(ret == 0);
+	ASSERT_RETURN(msg->items[0].type == KDBUS_ITEM_NAME_ADD);
+	ASSERT_RETURN(msg->items[0].name_change.old_id.id == 0);
+	ASSERT_RETURN(msg->items[0].name_change.new_id.id == env->conn->id);
+	ASSERT_RETURN(strcmp(msg->items[0].name_change.name, name) == 0);
+	kdbus_msg_free(msg);
+
+	ret = kdbus_info(ep_conn, 0, name, 0, NULL);
+	ASSERT_RETURN(ret == 0);
+
+	ret = kdbus_info(ep_conn, env->conn->id, NULL, 0, NULL);
+	ASSERT_RETURN(ret == 0);
+
+	kdbus_conn_free(ep_conn);
+	close(ep_fd);
+
+	return TEST_OK;
+}
diff --git a/tools/testing/selftests/kdbus/test-fd.c b/tools/testing/selftests/kdbus/test-fd.c
new file mode 100644
index 000000000000..3eda09f61089
--- /dev/null
+++ b/tools/testing/selftests/kdbus/test-fd.c
@@ -0,0 +1,664 @@
+#include <stdio.h>
+#include <string.h>
+#include <time.h>
+#include <fcntl.h>
+#include <stdlib.h>
+#include <stdbool.h>
+#include <stddef.h>
+#include <unistd.h>
+#include <stdint.h>
+#include <errno.h>
+#include <assert.h>
+#include <sys/types.h>
+#include <sys/ioctl.h>
+#include <sys/mman.h>
+#include <sys/socket.h>
+#include <sys/wait.h>
+
+#include "kdbus-test.h"
+#include "kdbus-util.h"
+#include "kdbus-enum.h"
+
+#define KDBUS_MSG_MAX_ITEMS     128
+#define KDBUS_MSG_MAX_FDS       253
+#define KDBUS_USER_MAX_CONN	256
+
+static int make_msg_payload_dbus(uint64_t src_id, uint64_t dst_id,
+				 uint64_t msg_size,
+				 struct kdbus_msg **msg_dbus)
+{
+	struct kdbus_msg *msg;
+
+	msg = malloc(msg_size);
+	ASSERT_RETURN_VAL(msg, -ENOMEM);
+
+	memset(msg, 0, msg_size);
+	msg->size = msg_size;
+	msg->src_id = src_id;
+	msg->dst_id = dst_id;
+	msg->payload_type = KDBUS_PAYLOAD_DBUS;
+
+	*msg_dbus = msg;
+
+	return 0;
+}
+
+static void make_item_memfds(struct kdbus_item *item,
+			     int *memfds, size_t memfd_size)
+{
+	size_t i;
+
+	for (i = 0; i < memfd_size; i++) {
+		item->type = KDBUS_ITEM_PAYLOAD_MEMFD;
+		item->size = KDBUS_ITEM_HEADER_SIZE +
+			     sizeof(struct kdbus_memfd);
+		item->memfd.fd = memfds[i];
+		item->memfd.size = sizeof(uint64_t); /* const size */
+		item = KDBUS_ITEM_NEXT(item);
+	}
+}
+
+static void make_item_fds(struct kdbus_item *item,
+			  int *fd_array, size_t fd_size)
+{
+	size_t i;
+	item->type = KDBUS_ITEM_FDS;
+	item->size = KDBUS_ITEM_HEADER_SIZE + (sizeof(int) * fd_size);
+
+	for (i = 0; i < fd_size; i++)
+		item->fds[i] = fd_array[i];
+}
+
+static int memfd_write(const char *name, void *buf, size_t bufsize)
+{
+	ssize_t ret;
+	int memfd;
+
+	memfd = sys_memfd_create(name, 0);
+	ASSERT_RETURN_VAL(memfd >= 0, memfd);
+
+	ret = write(memfd, buf, bufsize);
+	ASSERT_RETURN_VAL(ret == (ssize_t)bufsize, -EAGAIN);
+
+	ret = sys_memfd_seal_set(memfd);
+	ASSERT_RETURN_VAL(ret == 0, -errno);
+
+	return memfd;
+}
+
+static int send_memfds(struct kdbus_conn *conn, uint64_t dst_id,
+		       int *memfds_array, size_t memfd_count)
+{
+	struct kdbus_item *item;
+	struct kdbus_msg *msg;
+	uint64_t size;
+	int ret;
+
+	size = sizeof(struct kdbus_msg);
+	size += memfd_count * KDBUS_ITEM_SIZE(sizeof(struct kdbus_memfd));
+
+	if (dst_id == KDBUS_DST_ID_BROADCAST)
+		size += KDBUS_ITEM_SIZE(sizeof(struct kdbus_bloom_filter)) + 64;
+
+	ret = make_msg_payload_dbus(conn->id, dst_id, size, &msg);
+	ASSERT_RETURN_VAL(ret == 0, ret);
+
+	item = msg->items;
+
+	if (dst_id == KDBUS_DST_ID_BROADCAST) {
+		item->type = KDBUS_ITEM_BLOOM_FILTER;
+		item->size = KDBUS_ITEM_SIZE(sizeof(struct kdbus_bloom_filter)) + 64;
+		item = KDBUS_ITEM_NEXT(item);
+	}
+
+	make_item_memfds(item, memfds_array, memfd_count);
+
+	ret = ioctl(conn->fd, KDBUS_CMD_MSG_SEND, msg);
+	if (ret < 0) {
+		ret = -errno;
+		kdbus_printf("error sending message: %d (%m)\n", ret);
+		return ret;
+	}
+
+	free(msg);
+	return 0;
+}
+
+static int send_fds(struct kdbus_conn *conn, uint64_t dst_id,
+		    int *fd_array, size_t fd_count)
+{
+	struct kdbus_item *item;
+	struct kdbus_msg *msg;
+	uint64_t size;
+	int ret;
+
+	size = sizeof(struct kdbus_msg);
+	size += KDBUS_ITEM_SIZE(sizeof(int) * fd_count);
+
+	ret = make_msg_payload_dbus(conn->id, dst_id, size, &msg);
+	ASSERT_RETURN_VAL(ret == 0, ret);
+
+	item = msg->items;
+
+	make_item_fds(item, fd_array, fd_count);
+
+	ret = ioctl(conn->fd, KDBUS_CMD_MSG_SEND, msg);
+	if (ret < 0) {
+		ret = -errno;
+		kdbus_printf("error sending message: %d (%m)\n", ret);
+		return ret;
+	}
+
+	free(msg);
+	return ret;
+}
+
+static int send_fds_memfds(struct kdbus_conn *conn, uint64_t dst_id,
+			   int *fds_array, size_t fd_count,
+			   int *memfds_array, size_t memfd_count)
+{
+	struct kdbus_item *item;
+	struct kdbus_msg *msg;
+	uint64_t size;
+	int ret;
+
+	size = sizeof(struct kdbus_msg);
+	size += memfd_count * KDBUS_ITEM_SIZE(sizeof(struct kdbus_memfd));
+	size += KDBUS_ITEM_SIZE(sizeof(int) * fd_count);
+
+	ret = make_msg_payload_dbus(conn->id, dst_id, size, &msg);
+	ASSERT_RETURN_VAL(ret == 0, ret);
+
+	item = msg->items;
+
+	make_item_fds(item, fds_array, fd_count);
+	item = KDBUS_ITEM_NEXT(item);
+	make_item_memfds(item, memfds_array, memfd_count);
+
+	ret = ioctl(conn->fd, KDBUS_CMD_MSG_SEND, msg);
+	if (ret < 0) {
+		ret = -errno;
+		kdbus_printf("error sending message: %d (%m)\n", ret);
+		return ret;
+	}
+
+	free(msg);
+	return ret;
+}
+
+/* Return the number of received fds */
+static unsigned int kdbus_item_get_nfds(struct kdbus_msg *msg)
+{
+	unsigned int fds = 0;
+	const struct kdbus_item *item;
+
+	KDBUS_ITEM_FOREACH(item, msg, items) {
+		switch (item->type) {
+		case KDBUS_ITEM_FDS: {
+			fds += (item->size - KDBUS_ITEM_HEADER_SIZE) /
+				sizeof(int);
+			break;
+		}
+
+		case KDBUS_ITEM_PAYLOAD_MEMFD:
+			fds++;
+			break;
+
+		default:
+			break;
+		}
+	}
+
+	return fds;
+}
+
+static struct kdbus_msg *
+get_kdbus_msg_with_fd(struct kdbus_conn *conn_src,
+		      uint64_t dst_id, uint64_t cookie, int fd)
+{
+	int ret;
+	uint64_t size;
+	struct kdbus_item *item;
+	struct kdbus_msg *msg;
+
+	size = sizeof(struct kdbus_msg);
+	if (fd >= 0)
+		size += KDBUS_ITEM_SIZE(sizeof(int));
+
+	ret = make_msg_payload_dbus(conn_src->id, dst_id, size, &msg);
+	ASSERT_RETURN_VAL(ret == 0, NULL);
+
+	msg->cookie = cookie;
+
+	if (fd >= 0) {
+		item = msg->items;
+
+		make_item_fds(item, (int *)&fd, 1);
+	}
+
+	return msg;
+}
+
+static int kdbus_test_no_fds(struct kdbus_test_env *env,
+			     int *fds, int *memfd)
+{
+	pid_t pid;
+	int ret, status;
+	uint64_t cookie;
+	int connfd1, connfd2;
+	struct kdbus_msg *msg, *msg_sync_reply;
+	struct kdbus_cmd_hello hello;
+	struct kdbus_conn *conn_src, *conn_dst, *conn_dummy;
+
+	conn_src = kdbus_hello(env->buspath, 0, NULL, 0);
+	ASSERT_RETURN(conn_src);
+
+	connfd1 = open(env->buspath, O_RDWR|O_CLOEXEC);
+	ASSERT_RETURN(connfd1 >= 0);
+
+	connfd2 = open(env->buspath, O_RDWR|O_CLOEXEC);
+	ASSERT_RETURN(connfd2 >= 0);
+
+	/*
+	 * Create connections without KDBUS_HELLO_ACCEPT_FD
+	 * to test if send fd operations are blocked
+	 */
+	conn_dst = malloc(sizeof(*conn_dst));
+	ASSERT_RETURN(conn_dst);
+
+	conn_dummy = malloc(sizeof(*conn_dummy));
+	ASSERT_RETURN(conn_dummy);
+
+	memset(&hello, 0, sizeof(hello));
+	hello.size = sizeof(struct kdbus_cmd_hello);
+	hello.pool_size = POOL_SIZE;
+	hello.attach_flags_send = _KDBUS_ATTACH_ALL;
+
+	ret = ioctl(connfd1, KDBUS_CMD_HELLO, &hello);
+	ASSERT_RETURN(ret == 0);
+
+	conn_dst->fd = connfd1;
+	conn_dst->id = hello.id;
+
+	memset(&hello, 0, sizeof(hello));
+	hello.size = sizeof(struct kdbus_cmd_hello);
+	hello.pool_size = POOL_SIZE;
+	hello.attach_flags_send = _KDBUS_ATTACH_ALL;
+
+	ret = ioctl(connfd2, KDBUS_CMD_HELLO, &hello);
+	ASSERT_RETURN(ret == 0);
+
+	conn_dummy->fd = connfd2;
+	conn_dummy->id = hello.id;
+
+	conn_dst->buf = mmap(NULL, POOL_SIZE, PROT_READ,
+			     MAP_PRIVATE, connfd1, 0);
+	ASSERT_RETURN(conn_dst->buf != MAP_FAILED);
+
+	conn_dummy->buf = mmap(NULL, POOL_SIZE, PROT_READ,
+			       MAP_PRIVATE, connfd2, 0);
+	ASSERT_RETURN(conn_dummy->buf != MAP_FAILED);
+
+	/*
+	 * Send fds to a connection that does not accept fd passing
+	 */
+	ret = send_fds(conn_src, conn_dst->id, fds, 1);
+	ASSERT_RETURN(ret == -ECOMM);
+
+	/*
+	 * memfds are kdbus payload items, so this send must succeed
+	 */
+	ret = send_memfds(conn_src, conn_dst->id, memfd, 1);
+	ASSERT_RETURN(ret == 0);
+
+	ret = kdbus_msg_recv_poll(conn_dst, 100, NULL, NULL);
+	ASSERT_RETURN(ret == 0);
+
+	cookie = time(NULL);
+
+	pid = fork();
+	ASSERT_RETURN_VAL(pid >= 0, pid);
+
+	if (pid == 0) {
+		struct timespec now;
+
+		/*
+		 * A sync send/reply to a connection that does not
+		 * accept fds should fail if it contains an fd
+		 */
+		msg_sync_reply = get_kdbus_msg_with_fd(conn_dst,
+						       conn_dummy->id,
+						       cookie, fds[0]);
+		ASSERT_EXIT(msg_sync_reply);
+
+		ret = clock_gettime(CLOCK_MONOTONIC_COARSE, &now);
+		ASSERT_EXIT(ret == 0);
+
+		msg_sync_reply->timeout_ns = now.tv_sec * 1000000000ULL +
+					     now.tv_nsec + 100000000ULL;
+		msg_sync_reply->flags = KDBUS_MSG_FLAGS_EXPECT_REPLY |
+					KDBUS_MSG_FLAGS_SYNC_REPLY;
+
+
+		ret = ioctl(conn_dst->fd, KDBUS_CMD_MSG_SEND,
+			    msg_sync_reply);
+		ASSERT_EXIT(ret < 0 && -errno == -ECOMM);
+
+		/*
+		 * Now send a normal message, but the sync reply
+		 * will fail since it contains an fd that the
+		 * original sender does not want.
+		 *
+		 * The original sender will fail with -EREMOTEIO
+		 */
+		cookie++;
+		ret = kdbus_msg_send(conn_dst, NULL, cookie,
+				     KDBUS_MSG_FLAGS_EXPECT_REPLY |
+				     KDBUS_MSG_FLAGS_SYNC_REPLY,
+				     5000000000ULL, 0, conn_src->id);
+		ASSERT_EXIT(ret == -EREMOTEIO);
+
+		cookie++;
+		ret = kdbus_msg_recv_poll(conn_dst, 100, &msg, NULL);
+		ASSERT_EXIT(ret == 0);
+		ASSERT_EXIT(msg->cookie == cookie);
+
+		free(msg_sync_reply);
+		kdbus_msg_free(msg);
+
+		_exit(EXIT_SUCCESS);
+	}
+
+	ret = kdbus_msg_recv_poll(conn_dummy, 100, NULL, NULL);
+	ASSERT_RETURN(ret == -ETIMEDOUT);
+
+	cookie++;
+	ret = kdbus_msg_recv_poll(conn_src, 100, &msg, NULL);
+	ASSERT_RETURN(ret == 0 && msg->cookie == cookie);
+
+	kdbus_msg_free(msg);
+
+	/*
+	 * Try to reply with a kdbus connection handle, this should
+	 * fail with -EOPNOTSUPP
+	 */
+	msg_sync_reply = get_kdbus_msg_with_fd(conn_src,
+					       conn_dst->id,
+					       cookie, conn_dst->fd);
+	ASSERT_RETURN(msg_sync_reply);
+
+	msg_sync_reply->cookie_reply = cookie;
+
+	ret = ioctl(conn_src->fd, KDBUS_CMD_MSG_SEND, msg_sync_reply);
+	ASSERT_RETURN(ret < 0 && -errno == -EOPNOTSUPP);
+
+	free(msg_sync_reply);
+
+	/*
+	 * Try to reply with a normal fd, this should fail even
+	 * if the response is a sync reply
+	 *
+	 * From the sender view we fail with -ECOMM
+	 */
+	msg_sync_reply = get_kdbus_msg_with_fd(conn_src,
+					       conn_dst->id,
+					       cookie, fds[0]);
+	ASSERT_RETURN(msg_sync_reply);
+
+	msg_sync_reply->cookie_reply = cookie;
+
+	ret = ioctl(conn_src->fd, KDBUS_CMD_MSG_SEND, msg_sync_reply);
+	ASSERT_RETURN(ret < 0 && -errno == -ECOMM);
+
+	free(msg_sync_reply);
+
+	/*
+	 * Resend another normal message and check if the queue
+	 * is clear
+	 */
+	cookie++;
+	ret = kdbus_msg_send(conn_src, NULL, cookie, 0, 0, 0,
+			     conn_dst->id);
+	ASSERT_RETURN(ret == 0);
+
+	ret = waitpid(pid, &status, 0);
+	ASSERT_RETURN_VAL(ret >= 0, ret);
+
+	kdbus_conn_free(conn_dummy);
+	kdbus_conn_free(conn_dst);
+	kdbus_conn_free(conn_src);
+
+	return (status == EXIT_SUCCESS) ? TEST_OK : TEST_ERR;
+}
+
+static int kdbus_send_multiple_fds(struct kdbus_conn *conn_src,
+				   struct kdbus_conn *conn_dst)
+{
+	int ret, i;
+	unsigned int nfds;
+	int fds[KDBUS_MSG_MAX_FDS + 1];
+	int memfds[KDBUS_MSG_MAX_ITEMS + 1];
+	struct kdbus_msg *msg;
+	uint64_t dummy_value;
+
+	dummy_value = time(NULL);
+
+	for (i = 0; i < KDBUS_MSG_MAX_FDS + 1; i++) {
+		fds[i] = open("/dev/null", O_RDWR|O_CLOEXEC);
+		ASSERT_RETURN_VAL(fds[i] >= 0, -errno);
+	}
+
+	/* Send KDBUS_MSG_MAX_FDS with one more fd */
+	ret = send_fds(conn_src, conn_dst->id, fds, KDBUS_MSG_MAX_FDS + 1);
+	ASSERT_RETURN(ret == -EMFILE);
+
+	/* Retry with the correct KDBUS_MSG_MAX_FDS */
+	ret = send_fds(conn_src, conn_dst->id, fds, KDBUS_MSG_MAX_FDS);
+	ASSERT_RETURN(ret == 0);
+
+	ret = kdbus_msg_recv(conn_dst, &msg, NULL);
+	ASSERT_RETURN(ret == 0);
+
+	/* Check we got the right number of fds */
+	nfds = kdbus_item_get_nfds(msg);
+	ASSERT_RETURN(nfds == KDBUS_MSG_MAX_FDS);
+
+	kdbus_msg_free(msg);
+
+	for (i = 0; i < KDBUS_MSG_MAX_ITEMS + 1; i++, dummy_value++) {
+		memfds[i] = memfd_write("memfd-name",
+					&dummy_value,
+					sizeof(dummy_value));
+		ASSERT_RETURN_VAL(memfds[i] >= 0, memfds[i]);
+	}
+
+	/* Send KDBUS_MSG_MAX_ITEMS with one more memfd */
+	ret = send_memfds(conn_src, conn_dst->id,
+			  memfds, KDBUS_MSG_MAX_ITEMS + 1);
+	ASSERT_RETURN(ret == -E2BIG);
+
+	/* Retry with the correct KDBUS_MSG_MAX_ITEMS */
+	ret = send_memfds(conn_src, conn_dst->id,
+			  memfds, KDBUS_MSG_MAX_ITEMS);
+	ASSERT_RETURN(ret == 0);
+
+	ret = kdbus_msg_recv(conn_dst, &msg, NULL);
+	ASSERT_RETURN(ret == 0);
+
+	/* Check we got the right number of fds */
+	nfds = kdbus_item_get_nfds(msg);
+	ASSERT_RETURN(nfds == KDBUS_MSG_MAX_ITEMS);
+
+	kdbus_msg_free(msg);
+
+
+	/* Combine multiple 254 fds and 100 memfds */
+	ret = send_fds_memfds(conn_src, conn_dst->id,
+			      fds, KDBUS_MSG_MAX_FDS + 1,
+			      memfds, 100);
+	ASSERT_RETURN(ret == -EMFILE);
+
+	/* Combine multiple 253 fds and 128 + 1 memfds */
+	ret = send_fds_memfds(conn_src, conn_dst->id,
+			      fds, KDBUS_MSG_MAX_FDS,
+			      memfds, KDBUS_MSG_MAX_ITEMS + 1);
+	ASSERT_RETURN(ret == -E2BIG);
+
+	ret = send_fds_memfds(conn_src, conn_dst->id,
+			      fds, 153, memfds, 100);
+	ASSERT_RETURN(ret == 0);
+
+	ret = kdbus_msg_recv(conn_dst, &msg, NULL);
+	ASSERT_RETURN(ret == 0);
+
+	/* Check we got the right number of fds */
+	nfds = kdbus_item_get_nfds(msg);
+	ASSERT_RETURN(nfds == 253);
+
+	kdbus_msg_free(msg);
+
+	for (i = 0; i < KDBUS_MSG_MAX_FDS + 1; i++)
+		close(fds[i]);
+
+	for (i = 0; i < KDBUS_MSG_MAX_ITEMS + 1; i++)
+		close(memfds[i]);
+
+	return 0;
+}
+
+int kdbus_test_fd_passing(struct kdbus_test_env *env)
+{
+	struct kdbus_conn *conn_src, *conn_dst;
+	const char *str = "stackenblocken";
+	const struct kdbus_item *item;
+	struct kdbus_msg *msg;
+	unsigned int i;
+	time_t now;
+	int fds_conn[2];
+	int sock_pair[2];
+	int fds[2];
+	int memfd;
+	int ret;
+
+	now = time(NULL);
+
+	/* create two connections */
+	conn_src = kdbus_hello(env->buspath, 0, NULL, 0);
+	conn_dst = kdbus_hello(env->buspath, 0, NULL, 0);
+	ASSERT_RETURN(conn_src && conn_dst);
+
+	fds_conn[0] = conn_src->fd;
+	fds_conn[1] = conn_dst->fd;
+
+	ret = socketpair(AF_UNIX, SOCK_STREAM, 0, sock_pair);
+	ASSERT_RETURN(ret == 0);
+
+	/* Setup memfd */
+	memfd = memfd_write("memfd-name", &now, sizeof(now));
+	ASSERT_RETURN(memfd >= 0);
+
+	/* Setup pipes */
+	ret = pipe(fds);
+	ASSERT_RETURN(ret == 0);
+
+	i = write(fds[1], str, strlen(str));
+	ASSERT_RETURN(i == strlen(str));
+
+	/*
+	 * Try to pass the handle of a connection as message payload.
+	 * This must fail.
+	 */
+	ret = send_fds(conn_src, conn_dst->id, fds_conn, 2);
+	ASSERT_RETURN(ret == -ENOTSUP);
+
+	ret = send_fds(conn_dst, conn_src->id, fds_conn, 2);
+	ASSERT_RETURN(ret == -ENOTSUP);
+
+	ret = send_fds(conn_src, conn_dst->id, sock_pair, 2);
+	ASSERT_RETURN(ret == -ENOTSUP);
+
+	/*
+	 * Send fds and memfds to a connection that does not accept fds
+	 */
+	ret = kdbus_test_no_fds(env, fds, (int *)&memfd);
+	ASSERT_RETURN(ret == 0);
+
+	/* Try to broadcast file descriptors. This must fail. */
+	ret = send_fds(conn_src, KDBUS_DST_ID_BROADCAST, fds, 1);
+	ASSERT_RETURN(ret == -ENOTUNIQ);
+
+	/* Try to broadcast memfd. This must succeed. */
+	ret = send_memfds(conn_src, KDBUS_DST_ID_BROADCAST, (int *)&memfd, 1);
+	ASSERT_RETURN(ret == 0);
+
+	/* Open code this loop */
+loop_send_fds:
+
+	/*
+	 * Send the read end of the pipe and close it.
+	 */
+	ret = send_fds(conn_src, conn_dst->id, fds, 1);
+	ASSERT_RETURN(ret == 0);
+	close(fds[0]);
+
+	ret = kdbus_msg_recv(conn_dst, &msg, NULL);
+	ASSERT_RETURN(ret == 0);
+
+	KDBUS_ITEM_FOREACH(item, msg, items) {
+		if (item->type == KDBUS_ITEM_FDS) {
+			char tmp[14];
+			int nfds = (item->size - KDBUS_ITEM_HEADER_SIZE) /
+					sizeof(int);
+			ASSERT_RETURN(nfds == 1);
+
+			i = read(item->fds[0], tmp, sizeof(tmp));
+			if (i != 0) {
+				ASSERT_RETURN(i == sizeof(tmp));
+				ASSERT_RETURN(memcmp(tmp, str, sizeof(tmp)) == 0);
+
+				/* Write EOF */
+				close(fds[1]);
+
+				/*
+				 * Resend the read end of the pipe,
+				 * the receiver still holds a reference
+				 * to it...
+				 */
+				goto loop_send_fds;
+			}
+
+			/* Got EOF */
+
+			/*
+			 * Close the last reference to the read end
+			 * of the pipe, other references are
+			 * automatically closed just after send.
+			 */
+			close(item->fds[0]);
+		}
+	}
+
+	/*
+	 * Try to resend the read end of the pipe. Must fail with
+	 * -EBADF since both the sender and receiver closed their
+	 * references to it. We assume the above since sender and
+	 * receiver are in the same process.
+	 */
+	ret = send_fds(conn_src, conn_dst->id, fds, 1);
+	ASSERT_RETURN(ret == -EBADF);
+
+	/* Then we clear out any received data... */
+	kdbus_msg_free(msg);
+
+	ret = kdbus_send_multiple_fds(conn_src, conn_dst);
+	ASSERT_RETURN(ret == 0);
+
+	close(sock_pair[0]);
+	close(sock_pair[1]);
+	close(memfd);
+
+	kdbus_conn_free(conn_src);
+	kdbus_conn_free(conn_dst);
+
+	return TEST_OK;
+}
diff --git a/tools/testing/selftests/kdbus/test-free.c b/tools/testing/selftests/kdbus/test-free.c
new file mode 100644
index 000000000000..f43e3f616738
--- /dev/null
+++ b/tools/testing/selftests/kdbus/test-free.c
@@ -0,0 +1,34 @@
+#include <stdio.h>
+#include <string.h>
+#include <fcntl.h>
+#include <stdlib.h>
+#include <stddef.h>
+#include <unistd.h>
+#include <stdint.h>
+#include <errno.h>
+#include <assert.h>
+#include <sys/ioctl.h>
+#include <stdbool.h>
+
+#include "kdbus-util.h"
+#include "kdbus-enum.h"
+#include "kdbus-test.h"
+
+int kdbus_test_free(struct kdbus_test_env *env)
+{
+	int ret;
+	struct kdbus_cmd_free cmd_free;
+
+	/* free an unallocated buffer */
+	cmd_free.flags = 0;
+	cmd_free.offset = 0;
+	ret = ioctl(env->conn->fd, KDBUS_CMD_FREE, &cmd_free);
+	ASSERT_RETURN(ret == -1 && errno == ENXIO);
+
+	/* free a buffer out of the pool's bounds */
+	cmd_free.offset = POOL_SIZE + 1;
+	ret = ioctl(env->conn->fd, KDBUS_CMD_FREE, &cmd_free);
+	ASSERT_RETURN(ret == -1 && errno == ENXIO);
+
+	return TEST_OK;
+}
diff --git a/tools/testing/selftests/kdbus/test-match.c b/tools/testing/selftests/kdbus/test-match.c
new file mode 100644
index 000000000000..821ee89e0b02
--- /dev/null
+++ b/tools/testing/selftests/kdbus/test-match.c
@@ -0,0 +1,437 @@
+#include <stdio.h>
+#include <string.h>
+#include <fcntl.h>
+#include <stdlib.h>
+#include <stddef.h>
+#include <unistd.h>
+#include <stdint.h>
+#include <errno.h>
+#include <assert.h>
+#include <sys/ioctl.h>
+#include <stdbool.h>
+
+#include "kdbus-util.h"
+#include "kdbus-enum.h"
+#include "kdbus-test.h"
+
+int kdbus_test_match_id_add(struct kdbus_test_env *env)
+{
+	struct {
+		struct kdbus_cmd_match cmd;
+		struct {
+			uint64_t size;
+			uint64_t type;
+			struct kdbus_notify_id_change chg;
+		} item;
+	} buf;
+	struct kdbus_conn *conn;
+	struct kdbus_msg *msg;
+	int ret;
+
+	memset(&buf, 0, sizeof(buf));
+
+	buf.cmd.size = sizeof(buf);
+	buf.cmd.cookie = 0xdeafbeefdeaddead;
+	buf.item.size = sizeof(buf.item);
+	buf.item.type = KDBUS_ITEM_ID_ADD;
+	buf.item.chg.id = KDBUS_MATCH_ID_ANY;
+
+	/* match on id add */
+	ret = ioctl(env->conn->fd, KDBUS_CMD_MATCH_ADD, &buf);
+	ASSERT_RETURN(ret == 0);
+
+	/* create 2nd connection */
+	conn = kdbus_hello(env->buspath, 0, NULL, 0);
+	ASSERT_RETURN(conn != NULL);
+
+	/* 1st connection should have received a notification */
+	ret = kdbus_msg_recv(env->conn, &msg, NULL);
+	ASSERT_RETURN(ret == 0);
+
+	ASSERT_RETURN(msg->items[0].type == KDBUS_ITEM_ID_ADD);
+	ASSERT_RETURN(msg->items[0].id_change.id == conn->id);
+
+	kdbus_conn_free(conn);
+
+	return TEST_OK;
+}
+
+int kdbus_test_match_id_remove(struct kdbus_test_env *env)
+{
+	struct {
+		struct kdbus_cmd_match cmd;
+		struct {
+			uint64_t size;
+			uint64_t type;
+			struct kdbus_notify_id_change chg;
+		} item;
+	} buf;
+	struct kdbus_conn *conn;
+	struct kdbus_msg *msg;
+	size_t id;
+	int ret;
+
+	/* create 2nd connection */
+	conn = kdbus_hello(env->buspath, 0, NULL, 0);
+	ASSERT_RETURN(conn != NULL);
+	id = conn->id;
+
+	memset(&buf, 0, sizeof(buf));
+	buf.cmd.size = sizeof(buf);
+	buf.cmd.cookie = 0xdeafbeefdeaddead;
+	buf.item.size = sizeof(buf.item);
+	buf.item.type = KDBUS_ITEM_ID_REMOVE;
+	buf.item.chg.id = id;
+
+	/* register match on 2nd connection */
+	ret = ioctl(env->conn->fd, KDBUS_CMD_MATCH_ADD, &buf);
+	ASSERT_RETURN(ret == 0);
+
+	/* remove 2nd connection again */
+	kdbus_conn_free(conn);
+
+	/* 1st connection should have received a notification */
+	ret = kdbus_msg_recv(env->conn, &msg, NULL);
+	ASSERT_RETURN(ret == 0);
+
+	ASSERT_RETURN(msg->items[0].type == KDBUS_ITEM_ID_REMOVE);
+	ASSERT_RETURN(msg->items[0].id_change.id == id);
+
+	return TEST_OK;
+}
+
+int kdbus_test_match_replace(struct kdbus_test_env *env)
+{
+	struct {
+		struct kdbus_cmd_match cmd;
+		struct {
+			uint64_t size;
+			uint64_t type;
+			struct kdbus_notify_id_change chg;
+		} item;
+	} buf;
+	struct kdbus_conn *conn;
+	struct kdbus_msg *msg;
+	size_t id;
+	int ret;
+
+	/* add a match to id_add */
+	ASSERT_RETURN(kdbus_test_match_id_add(env) == TEST_OK);
+
+	/* do a replace of the match from id_add to id_remove */
+	memset(&buf, 0, sizeof(buf));
+
+	buf.cmd.size = sizeof(buf);
+	buf.cmd.cookie = 0xdeafbeefdeaddead;
+	buf.cmd.flags = KDBUS_MATCH_REPLACE;
+	buf.item.size = sizeof(buf.item);
+	buf.item.type = KDBUS_ITEM_ID_REMOVE;
+	buf.item.chg.id = KDBUS_MATCH_ID_ANY;
+
+	ret = ioctl(env->conn->fd, KDBUS_CMD_MATCH_ADD, &buf);
+	ASSERT_RETURN(ret == 0);
+	/* create 2nd connection */
+	conn = kdbus_hello(env->buspath, 0, NULL, 0);
+	ASSERT_RETURN(conn != NULL);
+	id = conn->id;
+
+	/* 1st connection should _not_ have received a notification */
+	ret = kdbus_msg_recv(env->conn, &msg, NULL);
+	ASSERT_RETURN(ret != 0);
+
+	/* remove 2nd connection */
+	kdbus_conn_free(conn);
+
+	/* 1st connection should _now_ have received a notification */
+	ret = kdbus_msg_recv(env->conn, &msg, NULL);
+	ASSERT_RETURN(ret == 0);
+
+	ASSERT_RETURN(msg->items[0].type == KDBUS_ITEM_ID_REMOVE);
+	ASSERT_RETURN(msg->items[0].id_change.id == id);
+
+	return TEST_OK;
+}
+
+int kdbus_test_match_name_add(struct kdbus_test_env *env)
+{
+	struct {
+		struct kdbus_cmd_match cmd;
+		struct {
+			uint64_t size;
+			uint64_t type;
+			struct kdbus_notify_name_change chg;
+		} item;
+		char name[64];
+	} buf;
+	struct kdbus_msg *msg;
+	char *name;
+	int ret;
+
+	name = "foo.bla.blaz";
+
+	/* install the match rule */
+	memset(&buf, 0, sizeof(buf));
+	buf.item.type = KDBUS_ITEM_NAME_ADD;
+	buf.item.chg.old_id.id = KDBUS_MATCH_ID_ANY;
+	buf.item.chg.new_id.id = KDBUS_MATCH_ID_ANY;
+	strncpy(buf.name, name, sizeof(buf.name) - 1);
+	buf.item.size = sizeof(buf.item) + strlen(buf.name) + 1;
+	buf.cmd.size = sizeof(buf.cmd) + buf.item.size;
+
+	ret = ioctl(env->conn->fd, KDBUS_CMD_MATCH_ADD, &buf);
+	ASSERT_RETURN(ret == 0);
+
+	/* acquire the name */
+	ret = kdbus_name_acquire(env->conn, name, NULL);
+	ASSERT_RETURN(ret == 0);
+
+	/* we should have received a notification */
+	ret = kdbus_msg_recv(env->conn, &msg, NULL);
+	ASSERT_RETURN(ret == 0);
+
+	ASSERT_RETURN(msg->items[0].type == KDBUS_ITEM_NAME_ADD);
+	ASSERT_RETURN(msg->items[0].name_change.old_id.id == 0);
+	ASSERT_RETURN(msg->items[0].name_change.new_id.id == env->conn->id);
+	ASSERT_RETURN(strcmp(msg->items[0].name_change.name, name) == 0);
+
+	return TEST_OK;
+}
+
+int kdbus_test_match_name_remove(struct kdbus_test_env *env)
+{
+	struct {
+		struct kdbus_cmd_match cmd;
+		struct {
+			uint64_t size;
+			uint64_t type;
+			struct kdbus_notify_name_change chg;
+		} item;
+		char name[64];
+	} buf;
+	struct kdbus_msg *msg;
+	char *name;
+	int ret;
+
+	name = "foo.bla.blaz";
+
+	/* acquire the name */
+	ret = kdbus_name_acquire(env->conn, name, NULL);
+	ASSERT_RETURN(ret == 0);
+
+	/* install the match rule */
+	memset(&buf, 0, sizeof(buf));
+	buf.item.type = KDBUS_ITEM_NAME_REMOVE;
+	buf.item.chg.old_id.id = KDBUS_MATCH_ID_ANY;
+	buf.item.chg.new_id.id = KDBUS_MATCH_ID_ANY;
+	strncpy(buf.name, name, sizeof(buf.name) - 1);
+	buf.item.size = sizeof(buf.item) + strlen(buf.name) + 1;
+	buf.cmd.size = sizeof(buf.cmd) + buf.item.size;
+
+	ret = ioctl(env->conn->fd, KDBUS_CMD_MATCH_ADD, &buf);
+	ASSERT_RETURN(ret == 0);
+
+	/* release the name again */
+	ret = kdbus_name_release(env->conn, name);
+	ASSERT_RETURN(ret == 0);
+
+	/* we should have received a notification */
+	ret = kdbus_msg_recv(env->conn, &msg, NULL);
+	ASSERT_RETURN(ret == 0);
+
+	ASSERT_RETURN(msg->items[0].type == KDBUS_ITEM_NAME_REMOVE);
+	ASSERT_RETURN(msg->items[0].name_change.old_id.id == env->conn->id);
+	ASSERT_RETURN(msg->items[0].name_change.new_id.id == 0);
+	ASSERT_RETURN(strcmp(msg->items[0].name_change.name, name) == 0);
+
+	return TEST_OK;
+}
+
+int kdbus_test_match_name_change(struct kdbus_test_env *env)
+{
+	struct {
+		struct kdbus_cmd_match cmd;
+		struct {
+			uint64_t size;
+			uint64_t type;
+			struct kdbus_notify_name_change chg;
+		} item;
+		char name[64];
+	} buf;
+	struct kdbus_conn *conn;
+	struct kdbus_msg *msg;
+	uint64_t flags;
+	char *name = "foo.bla.baz";
+	int ret;
+
+	/* acquire the name */
+	ret = kdbus_name_acquire(env->conn, name, NULL);
+	ASSERT_RETURN(ret == 0);
+
+	/* install the match rule */
+	memset(&buf, 0, sizeof(buf));
+	buf.item.type = KDBUS_ITEM_NAME_CHANGE;
+	buf.item.chg.old_id.id = KDBUS_MATCH_ID_ANY;
+	buf.item.chg.new_id.id = KDBUS_MATCH_ID_ANY;
+	strncpy(buf.name, name, sizeof(buf.name) - 1);
+	buf.item.size = sizeof(buf.item) + strlen(buf.name) + 1;
+	buf.cmd.size = sizeof(buf.cmd) + buf.item.size;
+
+	ret = ioctl(env->conn->fd, KDBUS_CMD_MATCH_ADD, &buf);
+	ASSERT_RETURN(ret == 0);
+
+	/* create a 2nd connection */
+	conn = kdbus_hello(env->buspath, 0, NULL, 0);
+	ASSERT_RETURN(conn != NULL);
+
+	/* allow the new connection to own the same name */
+	/* queue the 2nd connection as waiting owner */
+	flags = KDBUS_NAME_QUEUE;
+	ret = kdbus_name_acquire(conn, name, &flags);
+	ASSERT_RETURN(ret == 0);
+	ASSERT_RETURN(flags & KDBUS_NAME_IN_QUEUE);
+
+	/* release name from 1st connection */
+	ret = kdbus_name_release(env->conn, name);
+	ASSERT_RETURN(ret == 0);
+
+	/* we should have received a notification */
+	ret = kdbus_msg_recv(env->conn, &msg, NULL);
+	ASSERT_RETURN(ret == 0);
+
+	ASSERT_RETURN(msg->items[0].type == KDBUS_ITEM_NAME_CHANGE);
+	ASSERT_RETURN(msg->items[0].name_change.old_id.id == env->conn->id);
+	ASSERT_RETURN(msg->items[0].name_change.new_id.id == conn->id);
+	ASSERT_RETURN(strcmp(msg->items[0].name_change.name, name) == 0);
+
+	kdbus_conn_free(conn);
+
+	return TEST_OK;
+}
+
+static int send_bloom_filter(const struct kdbus_conn *conn,
+			     uint64_t cookie,
+			     const uint8_t *filter,
+			     size_t filter_size,
+			     uint64_t filter_generation)
+{
+	struct kdbus_msg *msg;
+	struct kdbus_item *item;
+	uint64_t size;
+	int ret;
+
+	size = sizeof(struct kdbus_msg);
+	size += KDBUS_ITEM_SIZE(sizeof(struct kdbus_bloom_filter)) + filter_size;
+
+	msg = alloca(size);
+
+	memset(msg, 0, size);
+	msg->size = size;
+	msg->src_id = conn->id;
+	msg->dst_id = KDBUS_DST_ID_BROADCAST;
+	msg->payload_type = KDBUS_PAYLOAD_DBUS;
+	msg->cookie = cookie;
+
+	item = msg->items;
+	item->type = KDBUS_ITEM_BLOOM_FILTER;
+	item->size = KDBUS_ITEM_SIZE(sizeof(struct kdbus_bloom_filter)) +
+				filter_size;
+
+	item->bloom_filter.generation = filter_generation;
+	memcpy(item->bloom_filter.data, filter, filter_size);
+
+	ret = ioctl(conn->fd, KDBUS_CMD_MSG_SEND, msg);
+	if (ret < 0) {
+		ret = -errno;
+		kdbus_printf("error sending message: %d (%m)\n", ret);
+		return ret;
+	}
+
+	return 0;
+}
+
+int kdbus_test_match_bloom(struct kdbus_test_env *env)
+{
+	struct {
+		struct kdbus_cmd_match cmd;
+		struct {
+			uint64_t size;
+			uint64_t type;
+			uint8_t data_gen0[64];
+			uint8_t data_gen1[64];
+		} item;
+	} buf;
+	struct kdbus_conn *conn;
+	struct kdbus_msg *msg;
+	uint64_t cookie = 0xf000f00f;
+	uint8_t filter[64];
+	int ret;
+
+	/* install the match rule */
+	memset(&buf, 0, sizeof(buf));
+	buf.cmd.size = sizeof(buf);
+
+	buf.item.size = sizeof(buf.item);
+	buf.item.type = KDBUS_ITEM_BLOOM_MASK;
+	buf.item.data_gen0[0] = 0x55;
+	buf.item.data_gen0[63] = 0x80;
+
+	buf.item.data_gen1[1] = 0xaa;
+	buf.item.data_gen1[9] = 0x02;
+
+	ret = ioctl(env->conn->fd, KDBUS_CMD_MATCH_ADD, &buf);
+	ASSERT_RETURN(ret == 0);
+
+	/* create a 2nd connection */
+	conn = kdbus_hello(env->buspath, 0, NULL, 0);
+	ASSERT_RETURN(conn != NULL);
+
+	/* a message with a 0'ed out filter must not reach the other peer */
+	memset(filter, 0, sizeof(filter));
+	ret = send_bloom_filter(conn, ++cookie, filter, sizeof(filter), 0);
+	ASSERT_RETURN(ret == 0);
+
+	ret = kdbus_msg_recv(env->conn, &msg, NULL);
+	ASSERT_RETURN(ret == -EAGAIN);
+
+	/* now set the filter to the connection's mask and expect success */
+	filter[0] = 0x55;
+	filter[63] = 0x80;
+	ret = send_bloom_filter(conn, ++cookie, filter, sizeof(filter), 0);
+	ASSERT_RETURN(ret == 0);
+
+	ret = kdbus_msg_recv(env->conn, &msg, NULL);
+	ASSERT_RETURN(ret == 0);
+	ASSERT_RETURN(msg->cookie == cookie);
+
+	/* broaden the filter and try again. this should also succeed. */
+	filter[0] = 0xff;
+	filter[8] = 0xff;
+	filter[63] = 0xff;
+	ret = send_bloom_filter(conn, ++cookie, filter, sizeof(filter), 0);
+	ASSERT_RETURN(ret == 0);
+
+	ret = kdbus_msg_recv(env->conn, &msg, NULL);
+	ASSERT_RETURN(ret == 0);
+	ASSERT_RETURN(msg->cookie == cookie);
+
+	/* the same filter must not match against bloom generation 1 */
+	ret = send_bloom_filter(conn, ++cookie, filter, sizeof(filter), 1);
+	ASSERT_RETURN(ret == 0);
+
+	ret = kdbus_msg_recv(env->conn, &msg, NULL);
+	ASSERT_RETURN(ret == -EAGAIN);
+
+	/* set a different filter and try again */
+	filter[1] = 0xaa;
+	filter[9] = 0x02;
+	ret = send_bloom_filter(conn, ++cookie, filter, sizeof(filter), 1);
+	ASSERT_RETURN(ret == 0);
+
+	ret = kdbus_msg_recv(env->conn, &msg, NULL);
+	ASSERT_RETURN(ret == 0);
+	ASSERT_RETURN(msg->cookie == cookie);
+
+	kdbus_conn_free(conn);
+
+	return TEST_OK;
+}
diff --git a/tools/testing/selftests/kdbus/test-message.c b/tools/testing/selftests/kdbus/test-message.c
new file mode 100644
index 000000000000..8422fbeadc2c
--- /dev/null
+++ b/tools/testing/selftests/kdbus/test-message.c
@@ -0,0 +1,371 @@
+#include <stdio.h>
+#include <string.h>
+#include <fcntl.h>
+#include <stdlib.h>
+#include <stddef.h>
+#include <unistd.h>
+#include <stdint.h>
+#include <errno.h>
+#include <assert.h>
+#include <sys/ioctl.h>
+#include <stdbool.h>
+#include <sys/eventfd.h>
+#include <sys/types.h>
+#include <sys/wait.h>
+
+#include "kdbus-util.h"
+#include "kdbus-enum.h"
+#include "kdbus-test.h"
+
+/* maximum number of queued messages from the same individual user */
+#define KDBUS_CONN_MAX_MSGS_PER_USER            16
+
+/* maximum number of queued messages in a connection */
+#define KDBUS_CONN_MAX_MSGS			256
+
+int kdbus_test_message_basic(struct kdbus_test_env *env)
+{
+	struct kdbus_conn *conn;
+	struct kdbus_msg *msg;
+	uint64_t cookie = 0x1234abcd5678eeff;
+	uint64_t offset;
+	int ret;
+
+	/* create a 2nd connection */
+	conn = kdbus_hello(env->buspath, 0, NULL, 0);
+	ASSERT_RETURN(conn != NULL);
+
+	ret = kdbus_add_match_empty(conn);
+	ASSERT_RETURN(ret == 0);
+
+	ret = kdbus_add_match_empty(env->conn);
+	ASSERT_RETURN(ret == 0);
+
+	/* send over 1st connection */
+	ret = kdbus_msg_send(env->conn, NULL, cookie, 0, 0, 0,
+			     KDBUS_DST_ID_BROADCAST);
+	ASSERT_RETURN(ret == 0);
+
+	/* ... and receive on the 2nd */
+	ret = kdbus_msg_recv_poll(conn, 100, &msg, &offset);
+	ASSERT_RETURN(ret == 0);
+	ASSERT_RETURN(msg->cookie == cookie);
+
+	kdbus_msg_free(msg);
+
+	ret = kdbus_free(conn, offset);
+	ASSERT_RETURN(ret == 0);
+
+	kdbus_conn_free(conn);
+
+	return TEST_OK;
+}
+
+static int msg_recv_prio(struct kdbus_conn *conn,
+			 int64_t requested_prio,
+			 int64_t expected_prio)
+{
+	struct kdbus_cmd_recv recv = {
+		.flags = KDBUS_RECV_USE_PRIORITY,
+		.priority = requested_prio,
+	};
+	struct kdbus_msg *msg;
+	int ret;
+
+	ret = ioctl(conn->fd, KDBUS_CMD_MSG_RECV, &recv);
+	if (ret < 0) {
+		kdbus_printf("error receiving message: %d (%m)\n", -errno);
+		return -errno;
+	}
+
+	msg = (struct kdbus_msg *)(conn->buf + recv.offset);
+	kdbus_msg_dump(conn, msg);
+
+	if (msg->priority != expected_prio) {
+		kdbus_printf("expected message prio %lld, got %lld\n",
+			     (long long) expected_prio,
+			     (long long) msg->priority);
+		return -EINVAL;
+	}
+
+	kdbus_msg_free(msg);
+	ret = kdbus_free(conn, recv.offset);
+	if (ret < 0)
+		return ret;
+
+	return 0;
+}
+
+int kdbus_test_message_prio(struct kdbus_test_env *env)
+{
+	struct kdbus_conn *a, *b;
+	uint64_t cookie = 0;
+
+	a = kdbus_hello(env->buspath, 0, NULL, 0);
+	b = kdbus_hello(env->buspath, 0, NULL, 0);
+	ASSERT_RETURN(a && b);
+
+	ASSERT_RETURN(kdbus_msg_send(b, NULL, ++cookie, 0, 0,   25, a->id) == 0);
+	ASSERT_RETURN(kdbus_msg_send(b, NULL, ++cookie, 0, 0, -600, a->id) == 0);
+	ASSERT_RETURN(kdbus_msg_send(b, NULL, ++cookie, 0, 0,   10, a->id) == 0);
+	ASSERT_RETURN(kdbus_msg_send(b, NULL, ++cookie, 0, 0,  -35, a->id) == 0);
+	ASSERT_RETURN(kdbus_msg_send(b, NULL, ++cookie, 0, 0, -100, a->id) == 0);
+	ASSERT_RETURN(kdbus_msg_send(b, NULL, ++cookie, 0, 0,   20, a->id) == 0);
+	ASSERT_RETURN(kdbus_msg_send(b, NULL, ++cookie, 0, 0,  -15, a->id) == 0);
+	ASSERT_RETURN(kdbus_msg_send(b, NULL, ++cookie, 0, 0, -800, a->id) == 0);
+	ASSERT_RETURN(kdbus_msg_send(b, NULL, ++cookie, 0, 0, -150, a->id) == 0);
+	ASSERT_RETURN(kdbus_msg_send(b, NULL, ++cookie, 0, 0,   10, a->id) == 0);
+	ASSERT_RETURN(kdbus_msg_send(b, NULL, ++cookie, 0, 0, -800, a->id) == 0);
+	ASSERT_RETURN(kdbus_msg_send(b, NULL, ++cookie, 0, 0,  -10, a->id) == 0);
+
+	ASSERT_RETURN(msg_recv_prio(a, -200, -800) == 0);
+	ASSERT_RETURN(msg_recv_prio(a, -100, -800) == 0);
+	ASSERT_RETURN(msg_recv_prio(a, -400, -600) == 0);
+	ASSERT_RETURN(msg_recv_prio(a, -400, -600) == -ENOMSG);
+	ASSERT_RETURN(msg_recv_prio(a, 10, -150) == 0);
+	ASSERT_RETURN(msg_recv_prio(a, 10, -100) == 0);
+
+	kdbus_printf("--- get priority (all)\n");
+	ASSERT_RETURN(kdbus_msg_recv(a, NULL, NULL) == 0);
+
+	kdbus_conn_free(a);
+	kdbus_conn_free(b);
+
+	return TEST_OK;
+}
+
+/* Return the number of messages successfully sent */
+static int kdbus_fill_conn_queue(struct kdbus_conn *conn_src,
+				 struct kdbus_conn *conn_dst,
+				 unsigned int max_msgs)
+{
+	unsigned int i;
+	uint64_t cookie = 0;
+	int ret;
+
+	for (i = 0; i < max_msgs; i++) {
+		ret = kdbus_msg_send(conn_src, NULL, ++cookie, 0,
+				     0, 0, conn_dst->id);
+		if (ret < 0)
+			break;
+	}
+
+	return i;
+}
+
+
+static int kdbus_test_multi_users_quota(struct kdbus_test_env *env)
+{
+	int ret, efd1, efd2;
+	unsigned int cnt, recved_count;
+	unsigned int max_user_msgs = KDBUS_CONN_MAX_MSGS_PER_USER;
+	struct kdbus_conn *conn;
+	struct kdbus_conn *privileged;
+	struct kdbus_conn *holder;
+	eventfd_t child1_count = 0, child2_count = 0;
+	struct kdbus_policy_access access = {
+		.type = KDBUS_POLICY_ACCESS_WORLD,
+		.id = geteuid(),
+		.access = KDBUS_POLICY_TALK,
+	};
+
+	holder = kdbus_hello_registrar(env->buspath, "com.example.a",
+				       &access, 1,
+				       KDBUS_HELLO_POLICY_HOLDER);
+	ASSERT_RETURN(holder);
+
+	privileged = kdbus_hello(env->buspath, 0, NULL, 0);
+	ASSERT_RETURN(privileged);
+
+	conn = kdbus_hello(env->buspath, 0, NULL, 0);
+	ASSERT_RETURN(conn);
+
+	/* Acquire name with access world so they can talk to us */
+	ret = kdbus_name_acquire(conn, "com.example.a", NULL);
+	ASSERT_EXIT(ret >= 0);
+
+	/* Use this to tell parent how many messages have been sent */
+	efd1 = eventfd(0, EFD_CLOEXEC);
+	ASSERT_RETURN_VAL(efd1 >= 0, efd1);
+
+	efd2 = eventfd(0, EFD_CLOEXEC);
+	ASSERT_RETURN_VAL(efd2 >= 0, efd2);
+
+	/*
+	 * Queue multiple messages as different users at the
+	 * same time.
+	 *
+	 * When the receiver queue count is below
+	 * KDBUS_CONN_MAX_MSGS_PER_USER, messages are not accounted.
+	 *
+	 * So we start two threads running under different uids; they
+	 * race and each one will try to send:
+	 * (KDBUS_CONN_MAX_MSGS_PER_USER * 2) + 1  msg
+	 *
+	 * Both threads will report how many messages were successfully
+	 * queued; later we compute and try to validate the user quota
+	 * checks.
+	 */
+	ret = RUN_UNPRIVILEGED(UNPRIV_UID, UNPRIV_GID, ({
+		struct kdbus_conn *unpriv;
+
+		unpriv = kdbus_hello(env->buspath, 0, NULL, 0);
+		ASSERT_EXIT(unpriv);
+
+		cnt = kdbus_fill_conn_queue(unpriv, conn,
+					    (max_user_msgs * 2) + 1);
+		/* Explicitly check for 0; we can't send it to eventfd */
+		ASSERT_EXIT(cnt > 0);
+
+		ret = eventfd_write(efd1, cnt);
+		ASSERT_EXIT(ret == 0);
+	}),
+	({;
+		/* Queue other messages as a different user */
+		ret = RUN_UNPRIVILEGED(UNPRIV_UID - 1, UNPRIV_GID - 1, ({
+			struct kdbus_conn *unpriv;
+
+			unpriv = kdbus_hello(env->buspath, 0, NULL, 0);
+			ASSERT_EXIT(unpriv);
+
+			cnt = kdbus_fill_conn_queue(unpriv, conn,
+						    (max_user_msgs * 2) + 1);
+			/* Explicitly check for 0 */
+			ASSERT_EXIT(cnt > 0);
+
+			ret = eventfd_write(efd2, cnt);
+			ASSERT_EXIT(ret == 0);
+		}),
+		({ 0; }));
+		ASSERT_RETURN(ret == 0);
+
+	}));
+	ASSERT_RETURN(ret == 0);
+
+	/* Delay reading, so if children die we are not blocked */
+	ret = eventfd_read(efd1, &child1_count);
+	ASSERT_RETURN(ret >= 0);
+
+	ret = eventfd_read(efd2, &child2_count);
+	ASSERT_RETURN(ret >= 0);
+
+	recved_count = child1_count + child2_count;
+
+	/* Validate how many messages have been sent */
+	ASSERT_RETURN(recved_count > 0);
+
+	/*
+	 * We start accounting after KDBUS_CONN_MAX_MSGS_PER_USER
+	 * so now we have KDBUS_CONN_MAX_MSGS_PER_USER messages not
+	 * accounted, and given we have sent at least
+	 * (KDBUS_CONN_MAX_MSGS_PER_USER * 2) + 1 for the two threads:
+	 * recved_count for both threads will for sure exceed that
+	 * value.
+	 *
+	 * 1) Both thread1 msgs + thread2 msgs exceed
+	 *    KDBUS_CONN_MAX_MSGS_PER_USER. Accounting is started.
+	 * 2) Now each of them will be able to send only its quota
+	 *    which is KDBUS_CONN_MAX_MSGS_PER_USER
+	 *    (previously sent messages of 1) were not accounted)
+	 */
+	ASSERT_RETURN(recved_count > (KDBUS_CONN_MAX_MSGS_PER_USER * 2) + 1);
+
+	/*
+	 * A process should never send more than
+	 * (KDBUS_CONN_MAX_MSGS_PER_USER * 2) + 1
+	 */
+	ASSERT_RETURN(child1_count < (KDBUS_CONN_MAX_MSGS_PER_USER * 2) + 1);
+
+	/*
+	 * Now the non-accounted messages of both children should
+	 * add up to KDBUS_CONN_MAX_MSGS_PER_USER at the point the
+	 * accounting started.
+	 *
+	 * child1 non accounted + child2 non accounted =
+	 * KDBUS_CONN_MAX_MSGS_PER_USER
+	 */
+	ASSERT_RETURN(KDBUS_CONN_MAX_MSGS_PER_USER ==
+		((child1_count - KDBUS_CONN_MAX_MSGS_PER_USER) +
+		 ((recved_count - child1_count) -
+		  KDBUS_CONN_MAX_MSGS_PER_USER)));
+
+	/*
+	 * A process should never send more than
+	 * (KDBUS_CONN_MAX_MSGS_PER_USER * 2) + 1
+	 */
+	ASSERT_RETURN(child2_count < (KDBUS_CONN_MAX_MSGS_PER_USER * 2) + 1);
+
+	/*
+	 * Now the non-accounted messages of both children should
+	 * add up to KDBUS_CONN_MAX_MSGS_PER_USER at the point the
+	 * accounting started.
+	 *
+	 * child1 non accounted + child2 non accounted =
+	 * KDBUS_CONN_MAX_MSGS_PER_USER
+	 */
+	ASSERT_RETURN(KDBUS_CONN_MAX_MSGS_PER_USER ==
+		((child2_count - KDBUS_CONN_MAX_MSGS_PER_USER) +
+		 ((recved_count - child2_count) -
+		  KDBUS_CONN_MAX_MSGS_PER_USER)));
+
+	/* Try to queue up more, but we fail: no space in the pool */
+	cnt = kdbus_fill_conn_queue(privileged, conn, KDBUS_CONN_MAX_MSGS);
+	ASSERT_RETURN(cnt > 0 && cnt < KDBUS_CONN_MAX_MSGS);
+
+	ret = kdbus_msg_send(privileged, NULL, 0xdeadbeef, 0, 0,
+			     0, conn->id);
+	ASSERT_RETURN(ret == -ENOBUFS);
+
+	close(efd1);
+	close(efd2);
+
+	kdbus_conn_free(privileged);
+	kdbus_conn_free(holder);
+	kdbus_conn_free(conn);
+
+	return 0;
+}
+
+int kdbus_test_message_quota(struct kdbus_test_env *env)
+{
+	struct kdbus_conn *a, *b;
+	uint64_t cookie = 0;
+	int ret;
+	int i;
+
+	if (geteuid() == 0) {
+		ret = kdbus_test_multi_users_quota(env);
+		ASSERT_RETURN(ret == 0);
+
+		/* Drop to 'nobody' and continue test */
+		ret = setresuid(UNPRIV_UID, UNPRIV_UID, UNPRIV_UID);
+		ASSERT_RETURN(ret == 0);
+	}
+
+	a = kdbus_hello(env->buspath, 0, NULL, 0);
+	b = kdbus_hello(env->buspath, 0, NULL, 0);
+
+	ret = kdbus_fill_conn_queue(b, a,
+				    KDBUS_CONN_MAX_MSGS_PER_USER * 2);
+	ASSERT_RETURN(ret == (KDBUS_CONN_MAX_MSGS_PER_USER * 2));
+
+	ret = kdbus_msg_send(b, NULL, ++cookie, 0, 0, 0, a->id);
+	ASSERT_RETURN(ret == -ENOBUFS);
+
+	for (i = 0; i < KDBUS_CONN_MAX_MSGS_PER_USER * 2; ++i) {
+		ret = kdbus_msg_recv(a, NULL, NULL);
+		ASSERT_RETURN(ret == 0);
+	}
+
+	ret = kdbus_fill_conn_queue(b, a,
+				    KDBUS_CONN_MAX_MSGS_PER_USER * 2);
+	ASSERT_RETURN(ret == (KDBUS_CONN_MAX_MSGS_PER_USER * 2));
+
+	ret = kdbus_msg_send(b, NULL, ++cookie, 0, 0, 0, a->id);
+	ASSERT_RETURN(ret == -ENOBUFS);
+
+	kdbus_conn_free(a);
+	kdbus_conn_free(b);
+
+	return TEST_OK;
+}
diff --git a/tools/testing/selftests/kdbus/test-metadata-ns.c b/tools/testing/selftests/kdbus/test-metadata-ns.c
new file mode 100644
index 000000000000..3af67d98084a
--- /dev/null
+++ b/tools/testing/selftests/kdbus/test-metadata-ns.c
@@ -0,0 +1,258 @@
+/* Test metadata in new namespaces */
+
+#include <stdio.h>
+#include <string.h>
+#include <sched.h>
+#include <time.h>
+#include <fcntl.h>
+#include <stdlib.h>
+#include <stddef.h>
+#include <unistd.h>
+#include <stdint.h>
+#include <errno.h>
+#include <assert.h>
+#include <signal.h>
+#include <sys/wait.h>
+#include <sys/ioctl.h>
+#include <sys/prctl.h>
+#include <sys/eventfd.h>
+#include <sys/syscall.h>
+#include <sys/capability.h>
+#include <linux/sched.h>
+
+#include "kdbus-test.h"
+#include "kdbus-util.h"
+#include "kdbus-enum.h"
+
+static int __kdbus_clone_userns_test(const char *bus, struct kdbus_conn *conn)
+{
+	int efd = -1;
+	pid_t pid;
+	int ret;
+	int status;
+	unsigned int uid = 65534;
+	int test_status = TEST_ERR;
+
+	ret = drop_privileges(UNPRIV_UID, UNPRIV_GID);
+	if (ret < 0)
+		goto out;
+
+	/**
+	 * Since we just dropped privileges, the dumpable flag was just
+	 * cleared, which makes /proc/$clone_child/uid_map
+	 * owned by root, hence any userns uid mapping will fail with
+	 * -EPERM since the mapping will be done by uid 65534.
+	 *
+	 * To avoid this, set the dumpable flag again, which makes procfs
+	 * update the /proc/$clone_child/ inodes owner to 65534.
+	 *
+	 * Using this we will be able to write to /proc/$clone_child/uid_map
+	 * as uid 65534 and map the uid 65534 to 0 inside the user
+	 * namespace.
+	 */
+	ret = prctl(PR_SET_DUMPABLE, SUID_DUMP_USER);
+	if (ret < 0) {
+		ret = -errno;
+		kdbus_printf("error prctl: %d (%m)\n", ret);
+		goto out;
+	}
+
+	/* sync with parent */
+	efd = eventfd(0, EFD_CLOEXEC);
+	if (efd < 0) {
+		ret = -errno;
+		kdbus_printf("error eventfd: %d (%m)\n", ret);
+		goto out;
+	}
+
+	pid = syscall(__NR_clone, SIGCHLD | CLONE_NEWUSER, NULL);
+	if (pid < 0) {
+		ret = -errno;
+		kdbus_printf("error clone: %d (%m)\n", ret);
+
+		/* Unprivileged can't create user namespace ? */
+		if (ret == -EPERM) {
+			kdbus_printf("-- CLONE_NEWUSER TEST Failed for "
+				     "uid: %u\n -- Make sure that your kernel "
+				     "does not allow CLONE_NEWUSER for "
+				     "unprivileged users\n",
+				uid);
+			test_status = TEST_SKIP;
+		}
+
+		goto out;
+	}
+
+	if (pid == 0) {
+		struct kdbus_conn *conn_src;
+		eventfd_t event_status = 0;
+
+		ret = prctl(PR_SET_PDEATHSIG, SIGKILL);
+		if (ret < 0) {
+			ret = -errno;
+			kdbus_printf("error prctl: %d (%m)\n", ret);
+			_exit(TEST_ERR);
+		}
+
+		ret = eventfd_read(efd, &event_status);
+		if (ret < 0 || event_status != 1)
+			_exit(TEST_ERR);
+
+		/* ping connection from the new user namespace */
+		conn_src = kdbus_hello(bus, 0, NULL, 0);
+		ASSERT_EXIT(conn_src);
+
+		ret = kdbus_add_match_empty(conn_src);
+		ASSERT_EXIT(ret == 0);
+
+		ret = kdbus_msg_send(conn_src, NULL, 0xabcd1234,
+				     0, 0, 0, conn->id);
+		ASSERT_EXIT(ret == 0);
+
+		kdbus_conn_free(conn_src);
+		_exit(TEST_OK);
+	}
+
+	ret = userns_map_uid_gid(pid, "0 65534 1", "0 65534 1");
+	if (ret < 0) {
+		/* send error to child */
+		eventfd_write(efd, 2);
+		kdbus_printf("error mapping uid/gid in new user namespace\n");
+		goto out;
+	}
+
+	ret = eventfd_write(efd, 1);
+	if (ret < 0) {
+		ret = -errno;
+		kdbus_printf("error eventfd_write: %d (%m)\n", ret);
+		goto out;
+	}
+
+	ret = waitpid(pid, &status, 0);
+	if (ret < 0) {
+		ret = -errno;
+		kdbus_printf("error waitpid: %d (%m)\n", ret);
+		goto out;
+	}
+
+	if (WIFEXITED(status))
+		test_status = WEXITSTATUS(status);
+
+out:
+	if (efd != -1)
+		close(efd);
+
+	return test_status;
+}
+
+/* Get only the first item */
+static struct kdbus_item *kdbus_get_item(struct kdbus_msg *msg,
+					 uint64_t type)
+{
+	struct kdbus_item *item;
+
+	KDBUS_ITEM_FOREACH(item, msg, items)
+		if (item->type == type)
+			return item;
+
+	return NULL;
+}
+
+static int kdbus_clone_userns_test(const char *bus, struct kdbus_conn *conn)
+{
+	int ret;
+	pid_t pid;
+	int status;
+	struct kdbus_msg *msg;
+	const struct kdbus_item *item;
+	/* unpriv user will create its user_ns and change its uid/gid */
+	const struct kdbus_creds unpriv_cached_creds = {
+		.uid	= UNPRIV_UID,
+		.gid	= UNPRIV_GID,
+	};
+
+	kdbus_printf("STARTING TEST 'metadata-ns' in a new user namespace.\n");
+
+	pid = fork();
+	ASSERT_RETURN_VAL(pid >= 0, -errno);
+
+	if (pid == 0) {
+		ret = prctl(PR_SET_PDEATHSIG, SIGKILL);
+		ASSERT_EXIT_VAL(ret == 0, -errno);
+
+		ret = __kdbus_clone_userns_test(bus, conn);
+		_exit(ret);
+	}
+
+	/* Receive in the original (root privileged) user namespace */
+	ret = kdbus_msg_recv_poll(conn, 100, &msg, NULL);
+	ASSERT_RETURN(ret == 0);
+
+	/* We do not get KDBUS_ITEM_CAPS */
+	item = kdbus_get_item(msg, KDBUS_ITEM_CAPS);
+	ASSERT_RETURN(item == NULL);
+
+	item = kdbus_get_item(msg, KDBUS_ITEM_CREDS);
+	ASSERT_RETURN(item);
+
+	/*
+	 * Compare received items: creds must be translated into
+	 * the domain user namespace, so the user is unprivileged
+	 */
+	ASSERT_RETURN(item->creds.uid == unpriv_cached_creds.uid &&
+		      item->creds.gid == unpriv_cached_creds.gid);
+
+	kdbus_msg_free(msg);
+	ret = waitpid(pid, &status, 0);
+	ASSERT_RETURN(ret >= 0);
+
+	if (WIFEXITED(status))
+		return WEXITSTATUS(status);
+
+	return TEST_OK;
+}
+
+int kdbus_test_metadata_ns(struct kdbus_test_env *env)
+{
+	int ret;
+	struct kdbus_conn *holder, *conn;
+	struct kdbus_policy_access policy_access = {
+		/* Allow world so we can inspect metadata in namespace */
+		.type = KDBUS_POLICY_ACCESS_WORLD,
+		.id = geteuid(),
+		.access = KDBUS_POLICY_TALK,
+	};
+
+	/* we require user-namespaces */
+	if (access("/proc/self/uid_map", F_OK) != 0)
+		return TEST_SKIP;
+
+	ret = test_is_capable(CAP_SETUID, CAP_SETGID, CAP_SYS_ADMIN, -1);
+	ASSERT_RETURN(ret >= 0);
+
+	/* not enough privileges, skip test */
+	if (!ret)
+		return TEST_SKIP;
+
+	holder = kdbus_hello_registrar(env->buspath, "com.example.metadata",
+				       &policy_access, 1,
+				       KDBUS_HELLO_POLICY_HOLDER);
+	ASSERT_RETURN(holder);
+
+	conn = kdbus_hello(env->buspath, 0, NULL, 0);
+	ASSERT_RETURN(conn);
+
+	ret = kdbus_add_match_empty(conn);
+	ASSERT_RETURN(ret == 0);
+
+	ret = kdbus_name_acquire(conn, "com.example.metadata", NULL);
+	ASSERT_EXIT(ret >= 0);
+
+	ret = kdbus_clone_userns_test(env->buspath, conn);
+	ASSERT_RETURN(ret == 0);
+
+	kdbus_conn_free(holder);
+	kdbus_conn_free(conn);
+
+	return TEST_OK;
+}
diff --git a/tools/testing/selftests/kdbus/test-monitor.c b/tools/testing/selftests/kdbus/test-monitor.c
new file mode 100644
index 000000000000..dfda5dccb7af
--- /dev/null
+++ b/tools/testing/selftests/kdbus/test-monitor.c
@@ -0,0 +1,156 @@
+#include <stdio.h>
+#include <string.h>
+#include <time.h>
+#include <fcntl.h>
+#include <stdlib.h>
+#include <stddef.h>
+#include <unistd.h>
+#include <stdint.h>
+#include <stdbool.h>
+#include <errno.h>
+#include <assert.h>
+#include <signal.h>
+#include <sys/time.h>
+#include <sys/ioctl.h>
+#include <sys/mman.h>
+
+#include "kdbus-util.h"
+#include "kdbus-enum.h"
+
+#include "kdbus-util.h"
+#include "kdbus-enum.h"
+#include "kdbus-test.h"
+
+static bool kdbus_item_in_message(struct kdbus_msg *msg,
+				  uint64_t type)
+{
+	const struct kdbus_item *item;
+
+	KDBUS_ITEM_FOREACH(item, msg, items)
+		if (item->type == type)
+			return true;
+
+	return false;
+}
+
+int kdbus_test_monitor(struct kdbus_test_env *env)
+{
+	struct kdbus_conn *monitor, *conn;
+	unsigned int cookie = 0xdeadbeef;
+	struct kdbus_msg *msg;
+	uint64_t offset = 0;
+	int ret;
+
+	monitor = kdbus_hello(env->buspath, KDBUS_HELLO_MONITOR, NULL, 0);
+	ASSERT_RETURN(monitor);
+
+	/* check that a monitor cannot acquire a name */
+	ret = kdbus_name_acquire(monitor, "foo.bar.baz", NULL);
+	ASSERT_RETURN(ret == -EOPNOTSUPP);
+
+	conn = kdbus_hello(env->buspath, 0, NULL, 0);
+	ASSERT_RETURN(conn);
+
+	ret = kdbus_msg_send(env->conn, NULL, cookie, 0, 0, 0, conn->id);
+	ASSERT_RETURN(ret == 0);
+
+	/* the recipient should have got the message */
+	ret = kdbus_msg_recv(conn, &msg, &offset);
+	ASSERT_RETURN(ret == 0);
+	ASSERT_RETURN(msg->cookie == cookie);
+	kdbus_msg_free(msg);
+	kdbus_free(conn, offset);
+
+	/* and so should the monitor */
+	ret = kdbus_msg_recv(monitor, &msg, &offset);
+	ASSERT_RETURN(ret == 0);
+	ASSERT_RETURN(msg->cookie == cookie);
+
+	kdbus_msg_free(msg);
+	kdbus_free(monitor, offset);
+
+	cookie++;
+	ret = kdbus_msg_send(env->conn, NULL, cookie, 0, 0, 0,
+			     KDBUS_DST_ID_BROADCAST);
+	ASSERT_RETURN(ret == 0);
+
+	/* The monitor did not install matches, this will time out */
+	ret = kdbus_msg_recv_poll(monitor, 100, NULL, NULL);
+	ASSERT_RETURN(ret == -ETIMEDOUT);
+
+	/* Install empty match for monitor */
+	ret = kdbus_add_match_empty(monitor);
+	ASSERT_RETURN(ret == 0);
+
+	cookie++;
+	ret = kdbus_msg_send(env->conn, NULL, cookie, 0, 0, 0,
+			     KDBUS_DST_ID_BROADCAST);
+	ASSERT_RETURN(ret == 0);
+
+	/* The monitor should get the message now. */
+	ret = kdbus_msg_recv_poll(monitor, 100, &msg, &offset);
+	ASSERT_RETURN(ret == 0);
+	ASSERT_RETURN(msg->cookie == cookie);
+
+	kdbus_msg_free(msg);
+	kdbus_free(monitor, offset);
+
+	/*
+	 * Since we are the only monitor, update the attach flags
+	 * and tell the kernel we are not interested in any attach flags
+	 */
+
+	ret = kdbus_conn_update_attach_flags(monitor, 0);
+	ASSERT_RETURN(ret == 0);
+
+	cookie++;
+	ret = kdbus_msg_send(env->conn, NULL, cookie, 0, 0, 0,
+			     KDBUS_DST_ID_BROADCAST);
+	ASSERT_RETURN(ret == 0);
+
+	ret = kdbus_msg_recv_poll(monitor, 100, &msg, &offset);
+	ASSERT_RETURN(ret == 0);
+	ASSERT_RETURN(msg->cookie == cookie);
+
+	ret = kdbus_item_in_message(msg, KDBUS_ITEM_TIMESTAMP);
+	ASSERT_RETURN(ret == 0);
+
+	kdbus_msg_free(msg);
+	kdbus_free(monitor, offset);
+
+	/*
+	 * Now we are interested in KDBUS_ITEM_TIMESTAMP and
+	 * KDBUS_ITEM_CREDS
+	 */
+	ret = kdbus_conn_update_attach_flags(monitor,
+					     KDBUS_ATTACH_TIMESTAMP |
+					     KDBUS_ATTACH_CREDS);
+	ASSERT_RETURN(ret == 0);
+
+	cookie++;
+	ret = kdbus_msg_send(env->conn, NULL, cookie, 0, 0, 0,
+			     KDBUS_DST_ID_BROADCAST);
+	ASSERT_RETURN(ret == 0);
+
+	ret = kdbus_msg_recv_poll(monitor, 100, &msg, &offset);
+	ASSERT_RETURN(ret == 0);
+	ASSERT_RETURN(msg->cookie == cookie);
+
+	ret = kdbus_item_in_message(msg, KDBUS_ITEM_TIMESTAMP);
+	ASSERT_RETURN(ret == 1);
+
+	ret = kdbus_item_in_message(msg, KDBUS_ITEM_CREDS);
+	ASSERT_RETURN(ret == 1);
+
+	/* the KDBUS_ITEM_PID_COMM was not requested */
+	ret = kdbus_item_in_message(msg, KDBUS_ITEM_PID_COMM);
+	ASSERT_RETURN(ret == 0);
+
+	kdbus_msg_free(msg);
+	kdbus_free(monitor, offset);
+
+	kdbus_conn_free(monitor);
+	kdbus_conn_free(conn);
+
+	return TEST_OK;
+}
diff --git a/tools/testing/selftests/kdbus/test-names.c b/tools/testing/selftests/kdbus/test-names.c
new file mode 100644
index 000000000000..83d479646315
--- /dev/null
+++ b/tools/testing/selftests/kdbus/test-names.c
@@ -0,0 +1,184 @@
+#include <stdio.h>
+#include <string.h>
+#include <time.h>
+#include <fcntl.h>
+#include <stdlib.h>
+#include <stddef.h>
+#include <unistd.h>
+#include <stdint.h>
+#include <errno.h>
+#include <assert.h>
+#include <limits.h>
+#include <sys/ioctl.h>
+#include <getopt.h>
+#include <stdbool.h>
+
+#include "kdbus-util.h"
+#include "kdbus-enum.h"
+#include "kdbus-test.h"
+
+static int conn_is_name_owner(const struct kdbus_conn *conn,
+			      const char *needle)
+{
+	struct kdbus_cmd_name_list cmd_list;
+	struct kdbus_name_list *list;
+	struct kdbus_name_info *name;
+	bool found = false;
+	int ret;
+
+	cmd_list.flags = KDBUS_NAME_LIST_NAMES;
+
+	ret = ioctl(conn->fd, KDBUS_CMD_NAME_LIST, &cmd_list);
+	ASSERT_RETURN(ret == 0);
+
+	list = (struct kdbus_name_list *)(conn->buf + cmd_list.offset);
+	KDBUS_ITEM_FOREACH(name, list, names) {
+		struct kdbus_item *item;
+		const char *n = NULL;
+
+		KDBUS_ITEM_FOREACH(item, name, items)
+			if (item->type == KDBUS_ITEM_OWNED_NAME)
+				n = item->name.name;
+
+		if (name->owner_id == conn->id &&
+		    n && strcmp(needle, n) == 0) {
+			found = true;
+			break;
+		}
+	}
+
+	ret = kdbus_free(conn, cmd_list.offset);
+	ASSERT_RETURN(ret == 0);
+
+	return found ? 0 : -1;
+}
+
+int kdbus_test_name_basic(struct kdbus_test_env *env)
+{
+	char *name, *dot_name, *invalid_name, *wildcard_name;
+	int ret;
+
+	name = "foo.bla.blaz";
+	dot_name = ".bla.blaz";
+	invalid_name = "foo";
+	wildcard_name = "foo.bla.bl.*";
+
+	/* Name is not valid, must fail */
+	ret = kdbus_name_acquire(env->conn, dot_name, NULL);
+	ASSERT_RETURN(ret == -EINVAL);
+
+	ret = kdbus_name_acquire(env->conn, invalid_name, NULL);
+	ASSERT_RETURN(ret == -EINVAL);
+
+	ret = kdbus_name_acquire(env->conn, wildcard_name, NULL);
+	ASSERT_RETURN(ret == -EINVAL);
+
+	/* check that we can acquire a name */
+	ret = kdbus_name_acquire(env->conn, name, NULL);
+	ASSERT_RETURN(ret == 0);
+
+	ret = conn_is_name_owner(env->conn, name);
+	ASSERT_RETURN(ret == 0);
+
+	/* ... and release it again */
+	ret = kdbus_name_release(env->conn, name);
+	ASSERT_RETURN(ret == 0);
+
+	ret = conn_is_name_owner(env->conn, name);
+	ASSERT_RETURN(ret != 0);
+
+	/* check that we can't release it again */
+	ret = kdbus_name_release(env->conn, name);
+	ASSERT_RETURN(ret == -ESRCH);
+
+	/* check that we can't release a name that we don't own */
+	ret = kdbus_name_release(env->conn, "foo.bar.xxx");
+	ASSERT_RETURN(ret == -ESRCH);
+
+	/* Name is not valid, must fail */
+	ret = kdbus_name_release(env->conn, dot_name);
+	ASSERT_RETURN(ret == -EINVAL);
+
+	ret = kdbus_name_release(env->conn, invalid_name);
+	ASSERT_RETURN(ret == -EINVAL);
+
+	ret = kdbus_name_release(env->conn, wildcard_name);
+	ASSERT_RETURN(ret == -EINVAL);
+
+	return TEST_OK;
+}
+
+int kdbus_test_name_conflict(struct kdbus_test_env *env)
+{
+	struct kdbus_conn *conn;
+	char *name;
+	int ret;
+
+	name = "foo.bla.blaz";
+
+	/* create a 2nd connection */
+	conn = kdbus_hello(env->buspath, 0, NULL, 0);
+	ASSERT_RETURN(conn != NULL);
+
+	/* allow the new connection to own the same name */
+	/* acquire name from the 1st connection */
+	ret = kdbus_name_acquire(env->conn, name, NULL);
+	ASSERT_RETURN(ret == 0);
+
+	ret = conn_is_name_owner(env->conn, name);
+	ASSERT_RETURN(ret == 0);
+
+	/* check that we can't acquire it again from the 1st connection */
+	ret = kdbus_name_acquire(env->conn, name, NULL);
+	ASSERT_RETURN(ret == -EALREADY);
+
+	/* check that we also can't acquire it again from the 2nd connection */
+	ret = kdbus_name_acquire(conn, name, NULL);
+	ASSERT_RETURN(ret == -EEXIST);
+
+	kdbus_conn_free(conn);
+
+	return TEST_OK;
+}
+
+int kdbus_test_name_queue(struct kdbus_test_env *env)
+{
+	struct kdbus_conn *conn;
+	const char *name;
+	uint64_t flags;
+	int ret;
+
+	name = "foo.bla.blaz";
+
+	flags = KDBUS_NAME_ALLOW_REPLACEMENT;
+
+	/* create a 2nd connection */
+	conn = kdbus_hello(env->buspath, 0, NULL, 0);
+	ASSERT_RETURN(conn != NULL);
+
+	/* allow the new connection to own the same name */
+	/* acquire name from the 1st connection */
+	ret = kdbus_name_acquire(env->conn, name, &flags);
+	ASSERT_RETURN(ret == 0);
+
+	ret = conn_is_name_owner(env->conn, name);
+	ASSERT_RETURN(ret == 0);
+
+	/* queue the 2nd connection as waiting owner */
+	flags = KDBUS_NAME_QUEUE;
+	ret = kdbus_name_acquire(conn, name, &flags);
+	ASSERT_RETURN(ret == 0);
+	ASSERT_RETURN(flags & KDBUS_NAME_IN_QUEUE);
+
+	/* release name from 1st connection */
+	ret = kdbus_name_release(env->conn, name);
+	ASSERT_RETURN(ret == 0);
+
+	/* now the name should be owned by the 2nd connection */
+	ret = conn_is_name_owner(conn, name);
+	ASSERT_RETURN(ret == 0);
+
+	kdbus_conn_free(conn);
+
+	return TEST_OK;
+}
diff --git a/tools/testing/selftests/kdbus/test-policy-ns.c b/tools/testing/selftests/kdbus/test-policy-ns.c
new file mode 100644
index 000000000000..eea9e16e3418
--- /dev/null
+++ b/tools/testing/selftests/kdbus/test-policy-ns.c
@@ -0,0 +1,622 @@
+/*
+ * Copyright (C) 2014 Djalal Harouni
+ *
+ * kdbus is free software; you can redistribute it and/or modify it under
+ * the terms of the GNU Lesser General Public License as published by the
+ * Free Software Foundation; either version 2.1 of the License, or (at
+ * your option) any later version.
+ */
+
+#include <stdio.h>
+#include <string.h>
+#include <fcntl.h>
+#include <pthread.h>
+#include <sched.h>
+#include <stdlib.h>
+#include <stddef.h>
+#include <stdint.h>
+#include <stdbool.h>
+#include <unistd.h>
+#include <errno.h>
+#include <signal.h>
+#include <sys/wait.h>
+#include <sys/prctl.h>
+#include <sys/ioctl.h>
+#include <sys/eventfd.h>
+#include <sys/syscall.h>
+#include <linux/sched.h>
+
+#include "kdbus-test.h"
+#include "kdbus-util.h"
+#include "kdbus-enum.h"
+
+#define MAX_CONN	64
+#define POLICY_NAME	"foo.test.policy-test"
+
+#define KDBUS_CONN_MAX_MSGS_PER_USER            16
+
+/**
+ * Note: this test can be used to inspect policy_db->talk_access_hash
+ *
+ * The purpose of these tests:
+ * 1) Check KDBUS_POLICY_TALK
+ * 2) Check the cache state: kdbus_policy_db->talk_access_hash
+ * Should be extended
+ */
+
+/**
+ * Check a list of connections against conn_db[0]
+ * conn_db[0] will own the name "foo.test.policy-test" and the
+ * policy holder connection for this name will update the policy
+ * entries, so different use cases can be tested.
+ */
+static struct kdbus_conn **conn_db;
+
+static void *kdbus_recv_echo(void *ptr)
+{
+	int ret;
+	struct kdbus_conn *conn = ptr;
+
+	ret = kdbus_msg_recv_poll(conn, 200, NULL, NULL);
+
+	return (void *)(long)ret;
+}
+
+/* Trigger kdbus_policy_set() */
+static int kdbus_set_policy_talk(struct kdbus_conn *conn,
+				 const char *name,
+				 uid_t id, unsigned int type)
+{
+	int ret;
+	struct kdbus_policy_access access = {
+		.type = type,
+		.id = id,
+		.access = KDBUS_POLICY_TALK,
+	};
+
+	ret = kdbus_conn_update_policy(conn, name, &access, 1);
+	ASSERT_RETURN(ret == 0);
+
+	return TEST_OK;
+}
+
+/* return TEST_OK or TEST_ERR on failure */
+static int kdbus_register_same_activator(char *bus, const char *name,
+					 struct kdbus_conn **c)
+{
+	int ret;
+	struct kdbus_conn *activator;
+
+	activator = kdbus_hello_activator(bus, name, NULL, 0);
+	if (activator) {
+		*c = activator;
+		fprintf(stderr, "--- error was able to register name twice '%s'.\n",
+			name);
+		return TEST_ERR;
+	}
+
+	ret = -errno;
+	/* -EEXIST means test succeeded */
+	if (ret == -EEXIST)
+		return TEST_OK;
+
+	return TEST_ERR;
+}
+
+/* return TEST_OK or TEST_ERR on failure */
+static int kdbus_register_policy_holder(char *bus, const char *name,
+					struct kdbus_conn **conn)
+{
+	struct kdbus_conn *c;
+	struct kdbus_policy_access access[2];
+
+	access[0].type = KDBUS_POLICY_ACCESS_USER;
+	access[0].access = KDBUS_POLICY_OWN;
+	access[0].id = geteuid();
+
+	access[1].type = KDBUS_POLICY_ACCESS_WORLD;
+	access[1].access = KDBUS_POLICY_TALK;
+	access[1].id = geteuid();
+
+	c = kdbus_hello_registrar(bus, name, access, 2,
+				  KDBUS_HELLO_POLICY_HOLDER);
+	ASSERT_RETURN(c);
+
+	*conn = c;
+
+	return TEST_OK;
+}
+
+/**
+ * Create new threads for receiving from multiple senders.
+ * The 'conn_db' will be populated by newly created connections.
+ * Caller should free all allocated connections.
+ *
+ * return 0 on success, negative errno on failure.
+ */
+static int kdbus_recv_in_threads(const char *bus, const char *name,
+				 struct kdbus_conn **conn_db)
+{
+	int ret;
+	bool pool_full = false;
+	unsigned int sent_packets = 0;
+	unsigned int lost_packets = 0;
+	unsigned int i, tid;
+	unsigned long dst_id;
+	unsigned long cookie = 1;
+	unsigned int thread_nr = MAX_CONN - 1;
+	pthread_t thread_id[MAX_CONN - 1] = {'\0'};
+
+	dst_id = name ? KDBUS_DST_ID_NAME : conn_db[0]->id;
+
+	for (tid = 0, i = 1; tid < thread_nr; tid++, i++) {
+		ret = pthread_create(&thread_id[tid], NULL,
+				     kdbus_recv_echo, (void *)conn_db[0]);
+		if (ret != 0) {
+			ret = -ret;
+			kdbus_printf("error pthread_create: %d (%s)\n",
+				      ret, strerror(-ret));
+			break;
+		}
+
+		/* just free before re-using */
+		kdbus_conn_free(conn_db[i]);
+		conn_db[i] = NULL;
+
+		/* We need to create connections here */
+		conn_db[i] = kdbus_hello(bus, 0, NULL, 0);
+		if (!conn_db[i]) {
+			ret = -errno;
+			break;
+		}
+
+		ret = kdbus_add_match_empty(conn_db[i]);
+		if (ret < 0)
+			break;
+
+		ret = kdbus_msg_send(conn_db[i], name, cookie++,
+				     0, 0, 0, dst_id);
+		if (ret < 0) {
+			/*
+			 * Receivers are not reading their messages,
+			 * not scheduled ?!
+			 *
+			 * So set the pool full here, perhaps the
+			 * connection pool or queue was full, later
+			 * recheck receivers errors
+			 */
+			if (ret == -ENOBUFS || ret == -EXFULL)
+				pool_full = true;
+			break;
+		}
+
+		sent_packets++;
+	}
+
+	for (tid = 0; tid < thread_nr; tid++) {
+		void *thread_ret = NULL;
+
+		if (thread_id[tid]) {
+			pthread_join(thread_id[tid], &thread_ret);
+			if ((long)thread_ret < 0) {
+				/* Update only if send did not fail */
+				if (ret == 0)
+					ret = (long)thread_ret;
+
+				lost_packets++;
+			}
+		}
+	}
+
+	/*
+	 * If sending failed with -ENOBUFS or -EXFULL, then we
+	 * should have set lost_packets and we should at least
+	 * have sent_packets set to KDBUS_CONN_MAX_MSGS_PER_USER
+	 */
+	if (pool_full) {
+		ASSERT_RETURN(lost_packets > 0);
+
+		/*
+		 * We should at least send KDBUS_CONN_MAX_MSGS_PER_USER
+		 *
+		 * For every send operation we create a thread to
+		 * recv the packet, so we keep the queue clean
+		 */
+		ASSERT_RETURN(sent_packets >= KDBUS_CONN_MAX_MSGS_PER_USER);
+
+		/*
+		 * Set ret to zero since we only failed due to
+		 * the receiving threads that have not been
+		 * scheduled
+		 */
+		ret = 0;
+	}
+
+	return ret;
+}
+
+/* Return: TEST_OK or TEST_ERR on failure */
+static int kdbus_normal_test(const char *bus, const char *name,
+			     struct kdbus_conn **conn_db)
+{
+	int ret;
+
+	ret = kdbus_recv_in_threads(bus, name, conn_db);
+	ASSERT_RETURN(ret >= 0);
+
+	return TEST_OK;
+}
+
+static int kdbus_fork_test_by_id(const char *bus,
+				 struct kdbus_conn **conn_db,
+				 int parent_status, int child_status)
+{
+	int ret;
+	pid_t pid;
+	uint64_t cookie = 0x9876ecba;
+	struct kdbus_msg *msg = NULL;
+	uint64_t offset = 0;
+	int status = 0;
+
+	/*
+	 * If the child_status is not EXIT_SUCCESS, then we expect
+	 * that sending from the child will fail, thus receiving
+	 * from parent must error with -ETIMEDOUT, and vice versa.
+	 */
+	bool parent_timedout = !!child_status;
+	bool child_timedout = !!parent_status;
+
+	pid = fork();
+	ASSERT_RETURN_VAL(pid >= 0, pid);
+
+	if (pid == 0) {
+		struct kdbus_conn *conn_src;
+
+		ret = prctl(PR_SET_PDEATHSIG, SIGKILL);
+		ASSERT_EXIT(ret == 0);
+
+		ret = drop_privileges(65534, 65534);
+		ASSERT_EXIT(ret == 0);
+
+		conn_src = kdbus_hello(bus, 0, NULL, 0);
+		ASSERT_EXIT(conn_src);
+
+		ret = kdbus_add_match_empty(conn_src);
+		ASSERT_EXIT(ret == 0);
+
+		/*
+		 * child_status is always checked against send
+		 * operations, in case it fails always return
+		 * EXIT_FAILURE.
+		 */
+		ret = kdbus_msg_send(conn_src, NULL, cookie,
+				     0, 0, 0, conn_db[0]->id);
+		ASSERT_EXIT(ret == child_status);
+
+		ret = kdbus_msg_recv_poll(conn_src, 100, NULL, NULL);
+
+		kdbus_conn_free(conn_src);
+
+		/*
+		 * Child kdbus_msg_recv_poll() should time out since
+		 * the parent_status was set to a non-EXIT_SUCCESS
+		 * value.
+		 */
+		if (child_timedout)
+			_exit(ret == -ETIMEDOUT ? EXIT_SUCCESS : EXIT_FAILURE);
+
+		_exit(ret == 0 ? EXIT_SUCCESS : EXIT_FAILURE);
+	}
+
+	ret = kdbus_msg_recv_poll(conn_db[0], 100, &msg, &offset);
+	/*
+	 * If parent_timedout is set then this should fail with
+	 * -ETIMEDOUT since the child_status was set to a non
+	 * EXIT_SUCCESS value. Otherwise, assume
+	 * that kdbus_msg_recv_poll() has succeeded.
+	 */
+	if (parent_timedout) {
+		ASSERT_RETURN_VAL(ret == -ETIMEDOUT, TEST_ERR);
+
+		/* timed out, no need to continue: we don't have the
+		 * child connection ID, so just terminate. */
+		goto out;
+	} else {
+		ASSERT_RETURN_VAL(ret == 0, ret);
+	}
+
+	ret = kdbus_msg_send(conn_db[0], NULL, ++cookie,
+			     0, 0, 0, msg->src_id);
+	/*
+	 * parent_status is checked against send operations,
+	 * on failures always return TEST_ERR.
+	 */
+	ASSERT_RETURN_VAL(ret == parent_status, TEST_ERR);
+
+	kdbus_msg_free(msg);
+	kdbus_free(conn_db[0], offset);
+
+out:
+	ret = waitpid(pid, &status, 0);
+	ASSERT_RETURN_VAL(ret >= 0, ret);
+
+	return (status == EXIT_SUCCESS) ? TEST_OK : TEST_ERR;
+}
+
+/*
+ * Return: TEST_OK, TEST_ERR or TEST_SKIP
+ * we return TEST_OK only if the children return with the expected
+ * 'expected_status' that is specified as an argument.
+ */
+static int kdbus_fork_test(const char *bus, const char *name,
+			   struct kdbus_conn **conn_db, int expected_status)
+{
+	pid_t pid;
+	int ret = 0;
+	int status = 0;
+
+	pid = fork();
+	ASSERT_RETURN_VAL(pid >= 0, pid);
+
+	if (pid == 0) {
+		ret = prctl(PR_SET_PDEATHSIG, SIGKILL);
+		ASSERT_EXIT(ret == 0);
+
+		ret = drop_privileges(65534, 65534);
+		ASSERT_EXIT(ret == 0);
+
+		ret = kdbus_recv_in_threads(bus, name, conn_db);
+		_exit(ret == expected_status ? EXIT_SUCCESS : EXIT_FAILURE);
+	}
+
+	ret = waitpid(pid, &status, 0);
+	ASSERT_RETURN(ret >= 0);
+
+	return (status == EXIT_SUCCESS) ? TEST_OK : TEST_ERR;
+}
+
+/* Return EXIT_SUCCESS, EXIT_FAILURE or negative errno */
+static int __kdbus_clone_userns_test(const char *bus,
+				     const char *name,
+				     struct kdbus_conn **conn_db,
+				     int expected_status)
+{
+	int efd;
+	pid_t pid;
+	int ret = 0;
+	unsigned int uid = 65534;
+	int status;
+
+	ret = drop_privileges(uid, uid);
+	ASSERT_RETURN_VAL(ret == 0, ret);
+
+	/*
+	 * Since we just dropped privileges, the dumpable flag was just
+	 * cleared, which makes /proc/$clone_child/uid_map
+	 * owned by root, hence any userns uid mapping will fail with
+	 * -EPERM since the mapping will be done by uid 65534.
+	 *
+	 * To avoid this, set the dumpable flag again, which makes procfs
+	 * update the /proc/$clone_child/ inodes owner to 65534.
+	 *
+	 * Using this we will be able to write to /proc/$clone_child/uid_map
+	 * as uid 65534 and map the uid 65534 to 0 inside the user
+	 * namespace.
+	 */
+	ret = prctl(PR_SET_DUMPABLE, SUID_DUMP_USER);
+	ASSERT_RETURN_VAL(ret == 0, ret);
+
+	/* sync parent/child */
+	efd = eventfd(0, EFD_CLOEXEC);
+	ASSERT_RETURN_VAL(efd >= 0, efd);
+
+	pid = syscall(__NR_clone, SIGCHLD|CLONE_NEWUSER, NULL);
+	if (pid < 0) {
+		ret = -errno;
+		kdbus_printf("error clone: %d (%m)\n", ret);
+		/*
+		 * Normal user not allowed to create userns,
+		 * so nothing to worry about ?
+		 */
+		if (ret == -EPERM) {
+			kdbus_printf("-- CLONE_NEWUSER TEST Failed for uid: %u\n"
+				"-- Make sure that your kernel does not allow "
+				"CLONE_NEWUSER for unprivileged users\n"
+				"-- Upstream Commit: "
+				"https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=5eaf563e\n",
+				uid);
+			ret = 0;
+		}
+
+		return ret;
+	}
+
+	if (pid == 0) {
+		struct kdbus_conn *conn_src;
+		eventfd_t event_status = 0;
+
+		ret = prctl(PR_SET_PDEATHSIG, SIGKILL);
+		ASSERT_EXIT(ret == 0);
+
+		ret = eventfd_read(efd, &event_status);
+		ASSERT_EXIT(ret >= 0 && event_status == 1);
+
+		/* ping connection from the new user namespace */
+		conn_src = kdbus_hello(bus, 0, NULL, 0);
+		ASSERT_EXIT(conn_src);
+
+		ret = kdbus_add_match_empty(conn_src);
+		ASSERT_EXIT(ret == 0);
+
+		ret = kdbus_msg_send(conn_src, name, 0xabcd1234,
+				     0, 0, 0, KDBUS_DST_ID_NAME);
+		kdbus_conn_free(conn_src);
+
+		_exit(ret == expected_status ? EXIT_SUCCESS : EXIT_FAILURE);
+	}
+
+	ret = userns_map_uid_gid(pid, "0 65534 1", "0 65534 1");
+	ASSERT_RETURN_VAL(ret == 0, ret);
+
+	/* Tell child we are ready */
+	ret = eventfd_write(efd, 1);
+	ASSERT_RETURN_VAL(ret == 0, ret);
+
+	ret = waitpid(pid, &status, 0);
+	ASSERT_RETURN_VAL(ret >= 0, ret);
+
+	close(efd);
+
+	return status == EXIT_SUCCESS ? TEST_OK : TEST_ERR;
+}
+
+static int kdbus_clone_userns_test(const char *bus,
+				   const char *name,
+				   struct kdbus_conn **conn_db,
+				   int expected_status)
+{
+	pid_t pid;
+	int ret = 0;
+	int status;
+
+	pid = fork();
+	ASSERT_RETURN_VAL(pid >= 0, -errno);
+
+	if (pid == 0) {
+		ret = prctl(PR_SET_PDEATHSIG, SIGKILL);
+		if (ret < 0)
+			_exit(EXIT_FAILURE);
+
+		ret = __kdbus_clone_userns_test(bus, name, conn_db,
+						expected_status);
+		_exit(ret);
+	}
+
+	/*
+	 * Receive in the original (root privileged) user namespace,
+	 * must fail with -ETIMEDOUT.
+	 */
+	ret = kdbus_msg_recv_poll(conn_db[0], 100, NULL, NULL);
+	ASSERT_RETURN_VAL(ret == -ETIMEDOUT, ret);
+
+	ret = waitpid(pid, &status, 0);
+	ASSERT_RETURN_VAL(ret >= 0, ret);
+
+	return (status == EXIT_SUCCESS) ? TEST_OK : TEST_ERR;
+}
+
+int kdbus_test_policy_ns(struct kdbus_test_env *env)
+{
+	int i;
+	int ret;
+	struct kdbus_conn *activator = NULL;
+	struct kdbus_conn *policy_holder = NULL;
+	char *bus = env->buspath;
+
+	if (geteuid() > 0) {
+		kdbus_printf("error geteuid() != 0, %s() needs root\n",
+			     __func__);
+		return TEST_SKIP;
+	}
+
+	/* we require user-namespaces */
+	if (access("/proc/self/uid_map", F_OK) != 0)
+		return TEST_SKIP;
+
+	conn_db = calloc(MAX_CONN, sizeof(struct kdbus_conn *));
+	ASSERT_RETURN(conn_db);
+
+	memset(conn_db, 0, MAX_CONN * sizeof(struct kdbus_conn *));
+
+	conn_db[0] = kdbus_hello(bus, 0, NULL, 0);
+	ASSERT_RETURN(conn_db[0]);
+
+	ret = kdbus_add_match_empty(conn_db[0]);
+	ASSERT_RETURN(ret == 0);
+
+	ret = kdbus_fork_test_by_id(bus, conn_db, -EPERM, -EPERM);
+	ASSERT_EXIT(ret == 0);
+
+	ret = kdbus_register_policy_holder(bus, POLICY_NAME,
+					   &policy_holder);
+	ASSERT_RETURN(ret == 0);
+
+	/* Try to register the same name with an activator */
+	ret = kdbus_register_same_activator(bus, POLICY_NAME,
+					    &activator);
+	ASSERT_RETURN(ret == 0);
+
+	/* Acquire POLICY_NAME */
+	ret = kdbus_name_acquire(conn_db[0], POLICY_NAME, NULL);
+	ASSERT_RETURN(ret == 0);
+
+	ret = kdbus_normal_test(bus, POLICY_NAME, conn_db);
+	ASSERT_RETURN(ret == 0);
+
+	ret = kdbus_name_list(conn_db[0], KDBUS_NAME_LIST_NAMES |
+					  KDBUS_NAME_LIST_UNIQUE |
+					  KDBUS_NAME_LIST_ACTIVATORS |
+					  KDBUS_NAME_LIST_QUEUED);
+	ASSERT_RETURN(ret == 0);
+
+	ret = kdbus_fork_test(bus, POLICY_NAME, conn_db, EXIT_SUCCESS);
+	ASSERT_RETURN(ret == 0);
+
+	/*
+	 * Child connections are able to talk to conn_db[0] since the
+	 * current POLICY_NAME TALK type is KDBUS_POLICY_ACCESS_WORLD,
+	 * so expect EXIT_SUCCESS when sending from the child. However,
+	 * since the child's connection does not own any well-known
+	 * name, the parent connection conn_db[0] would normally fail
+	 * with -EPERM; because it is a privileged bus user, the TALK
+	 * is allowed.
+	 */
+	ret = kdbus_fork_test_by_id(bus, conn_db,
+				    EXIT_SUCCESS, EXIT_SUCCESS);
+	ASSERT_EXIT(ret == 0);
+
+	/*
+	 * Connections that can talk are perhaps being destroyed now.
+	 * Restrict the policy and purge cache entries where the
+	 * conn_db[0] is the destination.
+	 *
+	 * Now only connections with uid == 0 are allowed to talk.
+	 */
+	ret = kdbus_set_policy_talk(policy_holder, POLICY_NAME,
+				    geteuid(), KDBUS_POLICY_ACCESS_USER);
+	ASSERT_RETURN(ret == 0);
+
+	/*
+	 * Testing connections (FORK+DROP) again:
+	 * After setting the policy, re-check the connections;
+	 * we expect the children to fail with -EPERM.
+	 */
+	ret = kdbus_fork_test(bus, POLICY_NAME, conn_db, -EPERM);
+	ASSERT_RETURN(ret == 0);
+
+	/*
+	 * Now expect both parent and child to fail.
+	 *
+	 * Child should fail with -EPERM since we just restricted
+	 * the POLICY_NAME TALK to uid 0 and its uid is 65534.
+	 *
+	 * Since the parent's connection will time out when receiving
+	 * from the child, we never continue. FWIW just put -EPERM.
+	 */
+	ret = kdbus_fork_test_by_id(bus, conn_db, -EPERM, -EPERM);
+	ASSERT_EXIT(ret == 0);
+
+	/* Check if the name can be reached in a new userns */
+	ret = kdbus_clone_userns_test(bus, POLICY_NAME, conn_db, -EPERM);
+	ASSERT_RETURN(ret == 0);
+
+	for (i = 0; i < MAX_CONN; i++)
+		kdbus_conn_free(conn_db[i]);
+
+	kdbus_conn_free(activator);
+	kdbus_conn_free(policy_holder);
+
+	free(conn_db);
+
+	return ret;
+}
diff --git a/tools/testing/selftests/kdbus/test-policy-priv.c b/tools/testing/selftests/kdbus/test-policy-priv.c
new file mode 100644
index 000000000000..3463792c0f58
--- /dev/null
+++ b/tools/testing/selftests/kdbus/test-policy-priv.c
@@ -0,0 +1,1168 @@
+#include <errno.h>
+#include <stdio.h>
+#include <string.h>
+#include <fcntl.h>
+#include <stdlib.h>
+#include <stdint.h>
+#include <stdbool.h>
+#include <unistd.h>
+#include <time.h>
+#include <sys/capability.h>
+#include <sys/eventfd.h>
+#include <sys/ioctl.h>
+#include <sys/wait.h>
+
+#include "kdbus-test.h"
+#include "kdbus-util.h"
+#include "kdbus-enum.h"
+
+static int test_policy_priv_by_id(const char *bus,
+				  struct kdbus_conn *conn_dst,
+				  bool drop_second_user,
+				  int parent_status,
+				  int child_status)
+{
+	int ret;
+	uint64_t expected_cookie = time(NULL) ^ 0xdeadbeef;
+
+	ASSERT_RETURN(conn_dst);
+
+	ret = RUN_UNPRIVILEGED_CONN(unpriv, bus, ({
+		ret = kdbus_msg_send(unpriv, NULL,
+				     expected_cookie, 0, 0, 0,
+				     conn_dst->id);
+		ASSERT_EXIT(ret == child_status);
+	}));
+	ASSERT_RETURN(ret >= 0);
+
+	ret = kdbus_msg_recv_poll(conn_dst, 100, NULL, NULL);
+	ASSERT_RETURN(ret == parent_status);
+
+	return 0;
+}
+
+static int test_policy_priv_by_broadcast(const char *bus,
+					 struct kdbus_conn *conn_dst,
+					 int drop_second_user,
+					 int parent_status,
+					 int child_status)
+{
+	int ret;
+	int efd;
+	eventfd_t event_status = 0;
+	struct kdbus_msg *msg = NULL;
+	uid_t second_uid = UNPRIV_UID;
+	gid_t second_gid = UNPRIV_GID;
+	struct kdbus_conn *child_2 = conn_dst;
+	uint64_t expected_cookie = time(NULL) ^ 0xdeadbeef;
+
+	/* Drop to another unprivileged user other than UNPRIV_UID */
+	if (drop_second_user == DROP_OTHER_UNPRIV) {
+		second_uid = UNPRIV_UID - 1;
+		second_gid = UNPRIV_GID - 1;
+	}
+
+	/* child will signal parent to send broadcast */
+	efd = eventfd(0, EFD_CLOEXEC);
+	ASSERT_RETURN_VAL(efd >= 0, efd);
+
+	ret = RUN_UNPRIVILEGED(UNPRIV_UID, UNPRIV_GID, ({
+		struct kdbus_conn *child;
+
+		child = kdbus_hello(bus, 0, NULL, 0);
+		ASSERT_EXIT(child);
+
+		ret = kdbus_add_match_empty(child);
+		ASSERT_EXIT(ret == 0);
+
+		/* signal parent */
+		ret = eventfd_write(efd, 1);
+		ASSERT_EXIT(ret == 0);
+
+		ret = kdbus_msg_recv_poll(child, 300, &msg, NULL);
+		ASSERT_EXIT(ret == child_status);
+
+		/*
+		 * If we expect the child to get the broadcast
+		 * message, then check the received cookie.
+		 */
+		if (ret == 0) {
+			ASSERT_EXIT(expected_cookie == msg->cookie);
+		}
+
+		/* Use expected_cookie since 'msg' might be NULL */
+		ret = kdbus_msg_send(child, NULL, expected_cookie + 1,
+				     0, 0, 0, KDBUS_DST_ID_BROADCAST);
+		ASSERT_EXIT(ret == 0);
+
+		kdbus_msg_free(msg);
+		kdbus_conn_free(child);
+	}),
+	({
+		if (drop_second_user == DO_NOT_DROP) {
+			ASSERT_RETURN(child_2);
+
+			ret = eventfd_read(efd, &event_status);
+			ASSERT_RETURN(ret >= 0 && event_status == 1);
+
+			ret = kdbus_msg_send(child_2, NULL,
+					     expected_cookie, 0, 0, 0,
+					     KDBUS_DST_ID_BROADCAST);
+			ASSERT_RETURN(ret == 0);
+
+			ret = kdbus_msg_recv_poll(child_2, 300,
+						  &msg, NULL);
+			ASSERT_RETURN(ret == parent_status);
+
+			/*
+			 * Check returned cookie in case we expect
+			 * success.
+			 */
+			if (ret == 0) {
+				ASSERT_RETURN(msg->cookie ==
+					      expected_cookie + 1);
+			}
+
+			kdbus_msg_free(msg);
+		} else {
+			/*
+			 * Two unprivileged users will try to
+			 * communicate using broadcast.
+			 */
+			ret = RUN_UNPRIVILEGED(second_uid, second_gid, ({
+				child_2 = kdbus_hello(bus, 0, NULL, 0);
+				ASSERT_EXIT(child_2);
+
+				ret = kdbus_add_match_empty(child_2);
+				ASSERT_EXIT(ret == 0);
+
+				ret = eventfd_read(efd, &event_status);
+				ASSERT_EXIT(ret >= 0 && event_status == 1);
+
+				ret = kdbus_msg_send(child_2, NULL,
+						expected_cookie, 0, 0, 0,
+						KDBUS_DST_ID_BROADCAST);
+				ASSERT_EXIT(ret == 0);
+
+				ret = kdbus_msg_recv_poll(child_2, 100,
+							  &msg, NULL);
+				ASSERT_EXIT(ret == parent_status);
+
+				/*
+				 * Check returned cookie in case we expect
+				 * success.
+				 */
+				if (ret == 0) {
+					ASSERT_EXIT(msg->cookie ==
+						    expected_cookie + 1);
+				}
+
+				kdbus_msg_free(msg);
+				kdbus_conn_free(child_2);
+			}),
+			({ 0; }));
+		}
+	}));
+
+	close(efd);
+
+	return ret;
+}
+
+static void nosig(int sig)
+{
+}
+
+static int test_priv_before_policy_upload(struct kdbus_test_env *env)
+{
+	int ret;
+	struct kdbus_conn *conn;
+
+	conn = kdbus_hello(env->buspath, 0, NULL, 0);
+	ASSERT_RETURN(conn);
+
+	/*
+	 * Make sure unprivileged bus user cannot acquire names
+	 * before registering any policy holder.
+	 */
+
+	ret = RUN_UNPRIVILEGED_CONN(unpriv, env->buspath, ({
+		ret = kdbus_name_acquire(unpriv, "com.example.a", NULL);
+		ASSERT_EXIT(ret < 0);
+	}));
+	ASSERT_RETURN(ret == 0);
+
+	/*
+	 * Make sure unprivileged bus users cannot talk by default
+	 * to privileged ones, unless a policy holder that allows
+	 * this was uploaded.
+	 */
+
+	ret = test_policy_priv_by_id(env->buspath, conn, false,
+				     -ETIMEDOUT, -EPERM);
+	ASSERT_RETURN(ret == 0);
+
+	/* Activate matching for a privileged connection */
+	ret = kdbus_add_match_empty(conn);
+	ASSERT_RETURN(ret == 0);
+
+	/*
+	 * First make sure that BROADCAST with msg flag
+	 * KDBUS_MSG_FLAGS_EXPECT_REPLY will fail with -ENOTUNIQ
+	 */
+	ret = RUN_UNPRIVILEGED_CONN(unpriv, env->buspath, ({
+		ret = kdbus_msg_send(unpriv, NULL, 0xdeadbeef,
+				     KDBUS_MSG_FLAGS_EXPECT_REPLY,
+				     5000000000ULL, 0,
+				     KDBUS_DST_ID_BROADCAST);
+		ASSERT_EXIT(ret == -ENOTUNIQ);
+	}));
+	ASSERT_RETURN(ret == 0);
+
+	/*
+	 * Test broadcast with a privileged connection.
+	 *
+	 * The first receiver should get the broadcast message since
+	 * the sender is a privileged connection.
+	 *
+	 * The privileged connection should not get the broadcast
+	 * message since the sender is an unprivileged connection.
+	 * It will fail with -ETIMEDOUT.
+	 *
+	 */
+
+	ret = test_policy_priv_by_broadcast(env->buspath, conn,
+					    DO_NOT_DROP,
+					    -ETIMEDOUT, EXIT_SUCCESS);
+	ASSERT_RETURN(ret == 0);
+
+
+	/*
+	 * Test broadcast with two unprivileged connections running
+	 * under the same user.
+	 *
+	 * Both connections should succeed.
+	 */
+
+	ret = test_policy_priv_by_broadcast(env->buspath, NULL,
+					    DROP_SAME_UNPRIV,
+					    EXIT_SUCCESS, EXIT_SUCCESS);
+	ASSERT_RETURN(ret == 0);
+
+	/*
+	 * Test broadcast with two unprivileged connections running
+	 * under different users.
+	 *
+	 * Both connections will fail with -ETIMEDOUT.
+	 */
+
+	ret = test_policy_priv_by_broadcast(env->buspath, NULL,
+					    DROP_OTHER_UNPRIV,
+					    -ETIMEDOUT, -ETIMEDOUT);
+	ASSERT_RETURN(ret == 0);
+
+	kdbus_conn_free(conn);
+
+	return ret;
+}
+
+static int test_broadcast_after_policy_upload(struct kdbus_test_env *env)
+{
+	int ret;
+	int efd;
+	eventfd_t event_status = 0;
+	struct kdbus_msg *msg = NULL;
+	struct kdbus_conn *owner_a, *owner_b;
+	struct kdbus_conn *holder_a, *holder_b;
+	struct kdbus_policy_access access = {};
+	uint64_t expected_cookie = time(NULL) ^ 0xdeadbeef;
+
+	owner_a = kdbus_hello(env->buspath, 0, NULL, 0);
+	ASSERT_RETURN(owner_a);
+
+	ret = kdbus_name_acquire(owner_a, "com.example.broadcastA", NULL);
+	ASSERT_EXIT(ret >= 0);
+
+	/*
+	 * Make sure unprivileged bus users cannot talk by default
+	 * to privileged ones, unless a policy holder that allows
+	 * this was uploaded.
+	 */
+
+	ret = test_policy_priv_by_id(env->buspath, owner_a, false,
+				     -ETIMEDOUT, -EPERM);
+	ASSERT_RETURN(ret == 0);
+
+	/*
+	 * Make sure that conn won't receive broadcasts unless it
+	 * installs a match.
+	 *
+	 * At the same time, check that the unprivileged connection will
+	 * receive the broadcast message from the privileged one.
+	 */
+
+	ret = test_policy_priv_by_broadcast(env->buspath, owner_a,
+					    DO_NOT_DROP,
+					    -ETIMEDOUT, EXIT_SUCCESS);
+	ASSERT_RETURN(ret == 0);
+
+	/* Activate matching for a privileged connection */
+	ret = kdbus_add_match_empty(owner_a);
+	ASSERT_RETURN(ret == 0);
+
+	/*
+	 * Redo the previous test. The privileged conn won't receive
+	 * broadcast messages from the unprivileged one.
+	 */
+
+	ret = test_policy_priv_by_broadcast(env->buspath, owner_a,
+					    DO_NOT_DROP,
+					    -ETIMEDOUT, EXIT_SUCCESS);
+	ASSERT_RETURN(ret == 0);
+
+	/*
+	 * Test that broadcasts between two unprivileged connections
+	 * running under the same user still succeed.
+	 */
+
+	ret = test_policy_priv_by_broadcast(env->buspath, NULL,
+					    DROP_SAME_UNPRIV,
+					    EXIT_SUCCESS, EXIT_SUCCESS);
+	ASSERT_RETURN(ret == 0);
+
+	access = (struct kdbus_policy_access){
+		.type = KDBUS_POLICY_ACCESS_USER,
+		.id = geteuid(),
+		.access = KDBUS_POLICY_OWN,
+	};
+
+	holder_a = kdbus_hello_registrar(env->buspath,
+					 "com.example.broadcastA",
+					 &access, 1,
+					 KDBUS_HELLO_POLICY_HOLDER);
+	ASSERT_RETURN(holder_a);
+
+	holder_b = kdbus_hello_registrar(env->buspath,
+					 "com.example.broadcastB",
+					 &access, 1,
+					 KDBUS_HELLO_POLICY_HOLDER);
+	ASSERT_RETURN(holder_b);
+
+	owner_b = kdbus_hello(env->buspath, 0, NULL, 0);
+	ASSERT_RETURN(owner_b);
+
+	ret = kdbus_name_acquire(owner_b, "com.example.broadcastB", NULL);
+	ASSERT_EXIT(ret >= 0);
+
+	/* Activate matching for a privileged connection */
+	ret = kdbus_add_match_empty(owner_b);
+	ASSERT_RETURN(ret == 0);
+
+	/*
+	 * Test that even if "com.example.broadcastA" and
+	 * "com.example.broadcastB" do restrict TALK access by default
+	 * they are able to signal each other using broadcast due to
+	 * the fact they are privileged connections.
+	 */
+
+	ret = kdbus_msg_send(owner_a, NULL, 0xdeadbeef, 0, 0, 0,
+			     KDBUS_DST_ID_BROADCAST);
+	ASSERT_RETURN(ret == 0);
+
+	ret = kdbus_msg_recv_poll(owner_b, 100, &msg, NULL);
+	ASSERT_RETURN(ret == 0);
+
+	/* Check src ID */
+	ASSERT_RETURN(msg->src_id == owner_a->id);
+
+	kdbus_msg_free(msg);
+	kdbus_free(owner_b, msg->offset_reply);
+
+
+	/* Release name "com.example.broadcastB" */
+
+	ret = kdbus_name_release(owner_b, "com.example.broadcastB");
+	ASSERT_EXIT(ret >= 0);
+
+	/* KDBUS_POLICY_OWN for unprivileged connections */
+	access = (struct kdbus_policy_access){
+		.type = KDBUS_POLICY_ACCESS_WORLD,
+		.id = geteuid(),
+		.access = KDBUS_POLICY_OWN,
+	};
+
+	/* Update the policy so unprivileged will own the name */
+
+	ret = kdbus_conn_update_policy(holder_b,
+				       "com.example.broadcastB",
+				       &access, 1);
+	ASSERT_RETURN(ret == 0);
+
+	/*
+	 * Test broadcasts from an unprivileged connection that
+	 * owns a name.
+	 *
+	 * We'll have four destinations here:
+	 *
+	 * owner_a: privileged connection that owns
+	 * "com.example.broadcastA". TALK access is subject to policy
+	 * rules, which are restrictive, so it should not receive
+	 * the signal. Should fail with -ETIMEDOUT.
+	 *
+	 * owner_b: privileged connection (running under a different
+	 * uid) that does not own names, but with an empty broadcast
+	 * match, so it will receive broadcasts. Should get the
+	 * message.
+	 *
+	 * unpriv_a: unpriv connection that does not own any name.
+	 * It will receive the broadcast since it is running under
+	 * the same user as the one broadcasting and did install
+	 * matches. It should get the message.
+	 *
+	 * unpriv_b: unpriv connection is not interested in broadcast
+	 * messages, so it did not install broadcast matches. Should
+	 * fail with -ETIMEDOUT
+	 */
+
+	++expected_cookie;
+	efd = eventfd(0, EFD_CLOEXEC);
+	ASSERT_RETURN_VAL(efd >= 0, efd);
+
+	ret = RUN_UNPRIVILEGED(UNPRIV_UID, UNPRIV_GID, ({
+		struct kdbus_conn *unpriv_owner;
+		struct kdbus_conn *unpriv_a, *unpriv_b;
+
+		unpriv_owner = kdbus_hello(env->buspath, 0, NULL, 0);
+		ASSERT_EXIT(unpriv_owner);
+
+		unpriv_a = kdbus_hello(env->buspath, 0, NULL, 0);
+		ASSERT_EXIT(unpriv_a);
+
+		unpriv_b = kdbus_hello(env->buspath, 0, NULL, 0);
+		ASSERT_EXIT(unpriv_b);
+
+		ret = kdbus_name_acquire(unpriv_owner,
+					 "com.example.broadcastB",
+					 NULL);
+		ASSERT_EXIT(ret >= 0);
+
+		ret = kdbus_add_match_empty(unpriv_a);
+		ASSERT_EXIT(ret == 0);
+
+		/* Signal that we are doing broadcasts */
+		ret = eventfd_write(efd, 1);
+		ASSERT_EXIT(ret == 0);
+
+		/*
+		 * Do broadcast from a connection that owns the
+		 * name "com.example.broadcastB".
+		 */
+		ret = kdbus_msg_send(unpriv_owner, NULL,
+				     expected_cookie,
+				     0, 0, 0,
+				     KDBUS_DST_ID_BROADCAST);
+		ASSERT_EXIT(ret == 0);
+
+		/*
+		 * Unprivileged connection running under the same
+		 * user. It should succeed.
+		 */
+		ret = kdbus_msg_recv_poll(unpriv_a, 100, &msg, NULL);
+		ASSERT_EXIT(ret == 0 && msg->cookie == expected_cookie);
+
+		/* Not interested in broadcast */
+		ret = kdbus_msg_recv_poll(unpriv_b, 100, NULL, NULL);
+		ASSERT_EXIT(ret == -ETIMEDOUT);
+	}),
+	({
+		ret = eventfd_read(efd, &event_status);
+		ASSERT_RETURN(ret >= 0 && event_status == 1);
+
+		/*
+		 * owner_a must fail with -ETIMEDOUT, since it owns
+		 * name "com.example.broadcastA" and its TALK
+		 * access is restricted.
+		 */
+		ret = kdbus_msg_recv_poll(owner_a, 100, NULL, NULL);
+		ASSERT_RETURN(ret == -ETIMEDOUT);
+
+		/*
+		 * owner_b got the broadcast from an unprivileged
+		 * connection.
+		 */
+		ret = kdbus_msg_recv_poll(owner_b, 100, &msg, NULL);
+		ASSERT_RETURN(ret == 0);
+
+		/* confirm the received cookie */
+		ASSERT_RETURN(msg->cookie == expected_cookie);
+
+		kdbus_msg_free(msg);
+		kdbus_free(owner_b, msg->offset_reply);
+
+
+	}));
+	ASSERT_RETURN(ret == 0);
+
+	close(efd);
+
+	/*
+	 * Test broadcast with two unprivileged connections running
+	 * under different users.
+	 *
+	 * Both connections will fail with -ETIMEDOUT.
+	 */
+
+	ret = test_policy_priv_by_broadcast(env->buspath, NULL,
+					    DROP_OTHER_UNPRIV,
+					    -ETIMEDOUT, -ETIMEDOUT);
+	ASSERT_RETURN(ret == 0);
+
+	/*
+	 * Perform last tests, allow others to talk to name
+	 * "com.example.broadcastA". So now broadcasting to that
+	 * connection should succeed since the policy allows it.
+	 */
+
+	/* KDBUS_POLICY_TALK for unprivileged connections */
+	access = (struct kdbus_policy_access){
+		.type = KDBUS_POLICY_ACCESS_WORLD,
+		.id = geteuid(),
+		.access = KDBUS_POLICY_TALK,
+	};
+
+	ret = kdbus_conn_update_policy(holder_a,
+				       "com.example.broadcastA",
+				       &access, 1);
+	ASSERT_RETURN(ret == 0);
+
+	++expected_cookie;
+	ret = RUN_UNPRIVILEGED_CONN(unpriv, env->buspath, ({
+		ret = kdbus_name_acquire(unpriv, "com.example.broadcastB",
+					 NULL);
+		ASSERT_EXIT(ret >= 0);
+		ret = kdbus_msg_send(unpriv, NULL, expected_cookie,
+				     0, 0, 0, KDBUS_DST_ID_BROADCAST);
+		ASSERT_EXIT(ret == 0);
+	}));
+	ASSERT_RETURN(ret == 0);
+
+	/* owner_a will get the broadcast now. */
+	ret = kdbus_msg_recv_poll(owner_a, 100, &msg, NULL);
+	ASSERT_RETURN(ret == 0);
+
+	/* confirm the received cookie */
+	ASSERT_RETURN(msg->cookie == expected_cookie);
+
+	kdbus_msg_free(msg);
+	kdbus_free(owner_a, msg->offset_reply);
+
+	/*
+	 * owner_a releases the name "com.example.broadcastA". It should
+	 * still receive broadcasts: no policy applies and it has a match.
+	 *
+	 * An unprivileged connection will own a name and will try to
+	 * signal the privileged connection. It should succeed.
+	 */
+
+	ret = kdbus_name_release(owner_a, "com.example.broadcastA");
+	ASSERT_EXIT(ret >= 0);
+
+	++expected_cookie;
+	ret = RUN_UNPRIVILEGED_CONN(unpriv, env->buspath, ({
+		ret = kdbus_name_acquire(unpriv, "com.example.broadcastB",
+					 NULL);
+		ASSERT_EXIT(ret >= 0);
+		ret = kdbus_msg_send(unpriv, NULL, expected_cookie,
+				     0, 0, 0, KDBUS_DST_ID_BROADCAST);
+		ASSERT_EXIT(ret == 0);
+	}));
+	ASSERT_RETURN(ret == 0);
+
+	/* owner_a will get the broadcast now. */
+	ret = kdbus_msg_recv_poll(owner_a, 100, &msg, NULL);
+	ASSERT_RETURN(ret == 0);
+
+	/* confirm the received cookie */
+	ASSERT_RETURN(msg->cookie == expected_cookie);
+
+	kdbus_msg_free(msg);
+	kdbus_free(owner_a, msg->offset_reply);
+
+	kdbus_conn_free(owner_a);
+	kdbus_conn_free(owner_b);
+	kdbus_conn_free(holder_a);
+	kdbus_conn_free(holder_b);
+
+	return 0;
+}
+
+static int test_policy_priv(struct kdbus_test_env *env)
+{
+	struct kdbus_conn *conn_a, *conn_b, *conn, *owner;
+	struct kdbus_policy_access access, *acc;
+	sigset_t sset;
+	size_t num;
+	int ret;
+
+	/*
+	 * Make sure we have CAP_SETUID/SETGID so we can drop privileges
+	 */
+
+	ret = test_is_capable(CAP_SETUID, CAP_SETGID, -1);
+	ASSERT_RETURN(ret >= 0);
+
+	if (!ret)
+		return TEST_SKIP;
+
+	/*
+	 * Setup:
+	 *  conn_a: policy holder for com.example.a
+	 *  conn_b: name holder of com.example.b
+	 */
+
+	signal(SIGUSR1, nosig);
+	sigemptyset(&sset);
+	sigaddset(&sset, SIGUSR1);
+	sigprocmask(SIG_BLOCK, &sset, NULL);
+
+	conn = kdbus_hello(env->buspath, 0, NULL, 0);
+	ASSERT_RETURN(conn);
+
+	/*
+	 * Before registering any policy holder, make sure that the
+	 * bus is secure by default. This test is necessary: it catches
+	 * several cases where old D-Bus was vulnerable.
+	 */
+
+	ret = test_priv_before_policy_upload(env);
+	ASSERT_RETURN(ret == 0);
+
+	/* Register policy holder */
+
+	conn_a = kdbus_hello_registrar(env->buspath, "com.example.a",
+				       NULL, 0, KDBUS_HELLO_POLICY_HOLDER);
+	ASSERT_RETURN(conn_a);
+
+	conn_b = kdbus_hello(env->buspath, 0, NULL, 0);
+	ASSERT_RETURN(conn_b);
+
+	ret = kdbus_name_acquire(conn_b, "com.example.b", NULL);
+	ASSERT_EXIT(ret >= 0);
+
+	/*
+	 * Make sure bus-owners can always acquire names.
+	 */
+	ret = kdbus_name_acquire(conn, "com.example.a", NULL);
+	ASSERT_EXIT(ret >= 0);
+
+	kdbus_conn_free(conn);
+
+	/*
+	 * Make sure unprivileged users cannot acquire names with default
+	 * policy assigned.
+	 */
+
+	ret = RUN_UNPRIVILEGED_CONN(unpriv, env->buspath, ({
+		ret = kdbus_name_acquire(unpriv, "com.example.a", NULL);
+		ASSERT_EXIT(ret < 0);
+	}));
+	ASSERT_RETURN(ret >= 0);
+
+	/*
+	 * Make sure unprivileged users can acquire names if we make them
+	 * world-accessible.
+	 */
+
+	access = (struct kdbus_policy_access){
+		.type = KDBUS_POLICY_ACCESS_WORLD,
+		.id = 0,
+		.access = KDBUS_POLICY_OWN,
+	};
+
+	ret = kdbus_conn_update_policy(conn_a, "com.example.a", &access, 1);
+	ASSERT_RETURN(ret == 0);
+
+	ret = RUN_UNPRIVILEGED_CONN(unpriv, env->buspath, ({
+		ret = kdbus_name_acquire(unpriv, "com.example.a", NULL);
+		ASSERT_EXIT(ret >= 0);
+	}));
+	ASSERT_RETURN(ret >= 0);
+
+	/*
+	 * Make sure unprivileged users can acquire names if we make them
+	 * gid-accessible. But only if the gid matches.
+	 */
+
+	access = (struct kdbus_policy_access){
+		.type = KDBUS_POLICY_ACCESS_GROUP,
+		.id = UNPRIV_GID,
+		.access = KDBUS_POLICY_OWN,
+	};
+
+	ret = kdbus_conn_update_policy(conn_a, "com.example.a", &access, 1);
+	ASSERT_RETURN(ret == 0);
+
+	ret = RUN_UNPRIVILEGED_CONN(unpriv, env->buspath, ({
+		ret = kdbus_name_acquire(unpriv, "com.example.a", NULL);
+		ASSERT_EXIT(ret >= 0);
+	}));
+	ASSERT_RETURN(ret >= 0);
+
+	access = (struct kdbus_policy_access){
+		.type = KDBUS_POLICY_ACCESS_GROUP,
+		.id = 1,
+		.access = KDBUS_POLICY_OWN,
+	};
+
+	ret = kdbus_conn_update_policy(conn_a, "com.example.a", &access, 1);
+	ASSERT_RETURN(ret == 0);
+
+	ret = RUN_UNPRIVILEGED_CONN(unpriv, env->buspath, ({
+		ret = kdbus_name_acquire(unpriv, "com.example.a", NULL);
+		ASSERT_EXIT(ret < 0);
+	}));
+	ASSERT_RETURN(ret >= 0);
+
+	/*
+	 * Make sure unprivileged users can acquire names if we make them
+	 * uid-accessible. But only if the uid matches.
+	 */
+
+	access = (struct kdbus_policy_access){
+		.type = KDBUS_POLICY_ACCESS_USER,
+		.id = UNPRIV_UID,
+		.access = KDBUS_POLICY_OWN,
+	};
+
+	ret = kdbus_conn_update_policy(conn_a, "com.example.a", &access, 1);
+	ASSERT_RETURN(ret == 0);
+
+	ret = RUN_UNPRIVILEGED_CONN(unpriv, env->buspath, ({
+		ret = kdbus_name_acquire(unpriv, "com.example.a", NULL);
+		ASSERT_EXIT(ret >= 0);
+	}));
+	ASSERT_RETURN(ret >= 0);
+
+	access = (struct kdbus_policy_access){
+		.type = KDBUS_POLICY_ACCESS_USER,
+		.id = 1,
+		.access = KDBUS_POLICY_OWN,
+	};
+
+	ret = kdbus_conn_update_policy(conn_a, "com.example.a", &access, 1);
+	ASSERT_RETURN(ret == 0);
+
+	ret = RUN_UNPRIVILEGED_CONN(unpriv, env->buspath, ({
+		ret = kdbus_name_acquire(unpriv, "com.example.a", NULL);
+		ASSERT_EXIT(ret < 0);
+	}));
+	ASSERT_RETURN(ret >= 0);
+
+	/*
+	 * Make sure unprivileged users cannot acquire names if no owner-policy
+	 * matches, even if SEE/TALK policies match.
+	 */
+
+	num = 4;
+	acc = (struct kdbus_policy_access[]){
+		{
+			.type = KDBUS_POLICY_ACCESS_GROUP,
+			.id = UNPRIV_GID,
+			.access = KDBUS_POLICY_SEE,
+		},
+		{
+			.type = KDBUS_POLICY_ACCESS_USER,
+			.id = UNPRIV_UID,
+			.access = KDBUS_POLICY_TALK,
+		},
+		{
+			.type = KDBUS_POLICY_ACCESS_WORLD,
+			.id = 0,
+			.access = KDBUS_POLICY_TALK,
+		},
+		{
+			.type = KDBUS_POLICY_ACCESS_WORLD,
+			.id = 0,
+			.access = KDBUS_POLICY_SEE,
+		},
+	};
+
+	ret = kdbus_conn_update_policy(conn_a, "com.example.a", acc, num);
+	ASSERT_RETURN(ret == 0);
+
+	ret = RUN_UNPRIVILEGED_CONN(unpriv, env->buspath, ({
+		ret = kdbus_name_acquire(unpriv, "com.example.a", NULL);
+		ASSERT_EXIT(ret < 0);
+	}));
+	ASSERT_RETURN(ret >= 0);
+
+	/*
+	 * Make sure unprivileged users can acquire names if the only matching
+	 * policy is somewhere in the middle.
+	 */
+
+	num = 5;
+	acc = (struct kdbus_policy_access[]){
+		{
+			.type = KDBUS_POLICY_ACCESS_USER,
+			.id = 1,
+			.access = KDBUS_POLICY_OWN,
+		},
+		{
+			.type = KDBUS_POLICY_ACCESS_USER,
+			.id = 2,
+			.access = KDBUS_POLICY_OWN,
+		},
+		{
+			.type = KDBUS_POLICY_ACCESS_USER,
+			.id = UNPRIV_UID,
+			.access = KDBUS_POLICY_OWN,
+		},
+		{
+			.type = KDBUS_POLICY_ACCESS_USER,
+			.id = 3,
+			.access = KDBUS_POLICY_OWN,
+		},
+		{
+			.type = KDBUS_POLICY_ACCESS_USER,
+			.id = 4,
+			.access = KDBUS_POLICY_OWN,
+		},
+	};
+
+	ret = kdbus_conn_update_policy(conn_a, "com.example.a", acc, num);
+	ASSERT_RETURN(ret == 0);
+
+	ret = RUN_UNPRIVILEGED_CONN(unpriv, env->buspath, ({
+		ret = kdbus_name_acquire(unpriv, "com.example.a", NULL);
+		ASSERT_EXIT(ret >= 0);
+	}));
+	ASSERT_RETURN(ret >= 0);
+
+	/*
+	 * Clear policies
+	 */
+
+	ret = kdbus_conn_update_policy(conn_a, "com.example.a", NULL, 0);
+	ASSERT_RETURN(ret == 0);
+
+	/*
+	 * Make sure privileged bus users can _always_ talk to others.
+	 */
+
+	conn = kdbus_hello(env->buspath, 0, NULL, 0);
+	ASSERT_RETURN(conn);
+
+	ret = kdbus_msg_send(conn, "com.example.b", 0xdeadbeef, 0, 0, 0, 0);
+	ASSERT_EXIT(ret >= 0);
+	ret = kdbus_msg_recv_poll(conn_b, 100, NULL, NULL);
+	ASSERT_EXIT(ret >= 0);
+
+	kdbus_conn_free(conn);
+
+	/*
+	 * Make sure unprivileged bus users cannot talk by default.
+	 */
+
+	ret = RUN_UNPRIVILEGED_CONN(unpriv, env->buspath, ({
+		ret = kdbus_msg_send(unpriv, "com.example.b", 0xdeadbeef, 0, 0,
+				     0, 0);
+		ASSERT_EXIT(ret == -EPERM);
+	}));
+	ASSERT_RETURN(ret >= 0);
+
+	/*
+	 * Make sure unprivileged bus users can talk to equals, even without
+	 * policy.
+	 */
+
+	access = (struct kdbus_policy_access){
+		.type = KDBUS_POLICY_ACCESS_USER,
+		.id = UNPRIV_UID,
+		.access = KDBUS_POLICY_OWN,
+	};
+
+	ret = kdbus_conn_update_policy(conn_a, "com.example.c", &access, 1);
+	ASSERT_RETURN(ret == 0);
+
+	ret = RUN_UNPRIVILEGED_CONN(unpriv, env->buspath, ({
+		struct kdbus_conn *owner;
+
+		owner = kdbus_hello(env->buspath, 0, NULL, 0);
+		ASSERT_EXIT(owner);
+
+		ret = kdbus_name_acquire(owner, "com.example.c", NULL);
+		ASSERT_EXIT(ret >= 0);
+
+		ret = kdbus_msg_send(unpriv, "com.example.c", 0xdeadbeef, 0, 0,
+				     0, 0);
+		ASSERT_EXIT(ret >= 0);
+		ret = kdbus_msg_recv_poll(owner, 100, NULL, NULL);
+		ASSERT_EXIT(ret >= 0);
+
+		kdbus_conn_free(owner);
+	}));
+	ASSERT_RETURN(ret >= 0);
+
+	/*
+	 * Make sure unprivileged bus users can talk to privileged users if a
+	 * suitable UID policy is set.
+	 */
+
+	access = (struct kdbus_policy_access){
+		.type = KDBUS_POLICY_ACCESS_USER,
+		.id = UNPRIV_UID,
+		.access = KDBUS_POLICY_TALK,
+	};
+
+	ret = kdbus_conn_update_policy(conn_a, "com.example.b", &access, 1);
+	ASSERT_RETURN(ret == 0);
+
+	ret = RUN_UNPRIVILEGED_CONN(unpriv, env->buspath, ({
+		ret = kdbus_msg_send(unpriv, "com.example.b", 0xdeadbeef, 0, 0,
+				     0, 0);
+		ASSERT_EXIT(ret >= 0);
+	}));
+	ASSERT_RETURN(ret >= 0);
+
+	ret = kdbus_msg_recv_poll(conn_b, 100, NULL, NULL);
+	ASSERT_EXIT(ret >= 0);
+
+	/*
+	 * Make sure unprivileged bus users can talk to privileged users if a
+	 * suitable GID policy is set.
+	 */
+
+	access = (struct kdbus_policy_access){
+		.type = KDBUS_POLICY_ACCESS_GROUP,
+		.id = UNPRIV_GID,
+		.access = KDBUS_POLICY_TALK,
+	};
+
+	ret = kdbus_conn_update_policy(conn_a, "com.example.b", &access, 1);
+	ASSERT_RETURN(ret == 0);
+
+	ret = RUN_UNPRIVILEGED_CONN(unpriv, env->buspath, ({
+		ret = kdbus_msg_send(unpriv, "com.example.b", 0xdeadbeef, 0, 0,
+				     0, 0);
+		ASSERT_EXIT(ret >= 0);
+	}));
+	ASSERT_RETURN(ret >= 0);
+
+	ret = kdbus_msg_recv_poll(conn_b, 100, NULL, NULL);
+	ASSERT_EXIT(ret >= 0);
+
+	/*
+	 * Make sure unprivileged bus users can talk to privileged users if a
+	 * suitable WORLD policy is set.
+	 */
+
+	access = (struct kdbus_policy_access){
+		.type = KDBUS_POLICY_ACCESS_WORLD,
+		.id = 0,
+		.access = KDBUS_POLICY_TALK,
+	};
+
+	ret = kdbus_conn_update_policy(conn_a, "com.example.b", &access, 1);
+	ASSERT_RETURN(ret == 0);
+
+	ret = RUN_UNPRIVILEGED_CONN(unpriv, env->buspath, ({
+		ret = kdbus_msg_send(unpriv, "com.example.b", 0xdeadbeef, 0, 0,
+				     0, 0);
+		ASSERT_EXIT(ret >= 0);
+	}));
+	ASSERT_RETURN(ret >= 0);
+
+	ret = kdbus_msg_recv_poll(conn_b, 100, NULL, NULL);
+	ASSERT_EXIT(ret >= 0);
+
+	/*
+	 * Make sure unprivileged bus users cannot talk to privileged users if
+	 * no suitable policy is set.
+	 */
+
+	num = 5;
+	acc = (struct kdbus_policy_access[]){
+		{
+			.type = KDBUS_POLICY_ACCESS_USER,
+			.id = 0,
+			.access = KDBUS_POLICY_OWN,
+		},
+		{
+			.type = KDBUS_POLICY_ACCESS_USER,
+			.id = 1,
+			.access = KDBUS_POLICY_TALK,
+		},
+		{
+			.type = KDBUS_POLICY_ACCESS_USER,
+			.id = UNPRIV_UID,
+			.access = KDBUS_POLICY_SEE,
+		},
+		{
+			.type = KDBUS_POLICY_ACCESS_USER,
+			.id = 3,
+			.access = KDBUS_POLICY_TALK,
+		},
+		{
+			.type = KDBUS_POLICY_ACCESS_USER,
+			.id = 4,
+			.access = KDBUS_POLICY_TALK,
+		},
+	};
+
+	ret = kdbus_conn_update_policy(conn_a, "com.example.b", acc, num);
+	ASSERT_RETURN(ret == 0);
+
+	ret = RUN_UNPRIVILEGED_CONN(unpriv, env->buspath, ({
+		ret = kdbus_msg_send(unpriv, "com.example.b", 0xdeadbeef, 0, 0,
+				     0, 0);
+		ASSERT_EXIT(ret == -EPERM);
+	}));
+	ASSERT_RETURN(ret >= 0);
+
+	/*
+	 * Make sure unprivileged bus users can talk to privileged users if a
+	 * suitable OWN privilege overwrites TALK.
+	 */
+
+	access = (struct kdbus_policy_access){
+		.type = KDBUS_POLICY_ACCESS_WORLD,
+		.id = 0,
+		.access = KDBUS_POLICY_OWN,
+	};
+
+	ret = kdbus_conn_update_policy(conn_a, "com.example.b", &access, 1);
+	ASSERT_RETURN(ret == 0);
+
+	ret = RUN_UNPRIVILEGED_CONN(unpriv, env->buspath, ({
+		ret = kdbus_msg_send(unpriv, "com.example.b", 0xdeadbeef, 0, 0,
+				     0, 0);
+		ASSERT_EXIT(ret >= 0);
+	}));
+	ASSERT_RETURN(ret >= 0);
+
+	ret = kdbus_msg_recv_poll(conn_b, 100, NULL, NULL);
+	ASSERT_EXIT(ret >= 0);
+
+	/*
+	 * Make sure the TALK cache is reset correctly when policies are
+	 * updated.
+	 */
+
+	access = (struct kdbus_policy_access){
+		.type = KDBUS_POLICY_ACCESS_WORLD,
+		.id = 0,
+		.access = KDBUS_POLICY_TALK,
+	};
+
+	ret = kdbus_conn_update_policy(conn_a, "com.example.b", &access, 1);
+	ASSERT_RETURN(ret == 0);
+
+	ret = RUN_UNPRIVILEGED_CONN(unpriv, env->buspath, ({
+		ret = kdbus_msg_send(unpriv, "com.example.b", 0xdeadbeef, 0, 0,
+				     0, 0);
+		ASSERT_EXIT(ret >= 0);
+
+		ret = kdbus_msg_recv_poll(conn_b, 100, NULL, NULL);
+		ASSERT_EXIT(ret >= 0);
+
+		ret = kdbus_conn_update_policy(conn_a, "com.example.b",
+					       NULL, 0);
+		ASSERT_EXIT(ret == 0);
+
+		ret = kdbus_msg_send(unpriv, "com.example.b", 0xdeadbeef, 0, 0,
+				     0, 0);
+		ASSERT_EXIT(ret == -EPERM);
+	}));
+	ASSERT_RETURN(ret >= 0);
+
+	/*
+	 * Make sure the TALK cache is reset correctly when policy holders
+	 * disconnect.
+	 */
+
+	access = (struct kdbus_policy_access){
+		.type = KDBUS_POLICY_ACCESS_WORLD,
+		.id = 0,
+		.access = KDBUS_POLICY_OWN,
+	};
+
+	conn = kdbus_hello_registrar(env->buspath, "com.example.c",
+				     NULL, 0, KDBUS_HELLO_POLICY_HOLDER);
+	ASSERT_RETURN(conn);
+
+	ret = kdbus_conn_update_policy(conn, "com.example.c", &access, 1);
+	ASSERT_RETURN(ret == 0);
+
+	owner = kdbus_hello(env->buspath, 0, NULL, 0);
+	ASSERT_RETURN(owner);
+
+	ret = kdbus_name_acquire(owner, "com.example.c", NULL);
+	ASSERT_RETURN(ret >= 0);
+
+	ret = RUN_UNPRIVILEGED(UNPRIV_UID, UNPRIV_GID, ({
+		struct kdbus_conn *unpriv;
+
+		/* wait for parent to be finished */
+		sigemptyset(&sset);
+		ret = sigsuspend(&sset);
+		ASSERT_RETURN(ret == -1 && errno == EINTR);
+
+		unpriv = kdbus_hello(env->buspath, 0, NULL, 0);
+		ASSERT_RETURN(unpriv);
+
+		ret = kdbus_msg_send(unpriv, "com.example.c", 0xdeadbeef, 0, 0,
+				     0, 0);
+		ASSERT_EXIT(ret >= 0);
+
+		ret = kdbus_msg_recv_poll(owner, 100, NULL, NULL);
+		ASSERT_EXIT(ret >= 0);
+
+		/* free policy holder */
+		kdbus_conn_free(conn);
+
+		ret = kdbus_msg_send(unpriv, "com.example.c", 0xdeadbeef, 0, 0,
+				     0, 0);
+		ASSERT_EXIT(ret == -EPERM);
+
+		kdbus_conn_free(unpriv);
+	}), ({
+		/* make sure policy holder is only valid in child */
+		kdbus_conn_free(conn);
+		kill(pid, SIGUSR1);
+	}));
+	ASSERT_RETURN(ret >= 0);
+
+
+	/*
+	 * The following tests are necessary.
+	 */
+
+	ret = test_broadcast_after_policy_upload(env);
+	ASSERT_RETURN(ret == 0);
+
+	kdbus_conn_free(owner);
+
+	/*
+	 * cleanup resources
+	 */
+
+	kdbus_conn_free(conn_b);
+	kdbus_conn_free(conn_a);
+
+	return TEST_OK;
+}
+
+int kdbus_test_policy_priv(struct kdbus_test_env *env)
+{
+	pid_t pid;
+	int ret;
+
+	/* make sure to exit() if a child returns from fork() */
+	pid = getpid();
+	ret = test_policy_priv(env);
+	if (pid != getpid())
+		exit(1);
+
+	return ret;
+}
diff --git a/tools/testing/selftests/kdbus/test-policy.c b/tools/testing/selftests/kdbus/test-policy.c
new file mode 100644
index 000000000000..4eb6e65f96d1
--- /dev/null
+++ b/tools/testing/selftests/kdbus/test-policy.c
@@ -0,0 +1,81 @@
+#include <errno.h>
+#include <stdio.h>
+#include <string.h>
+#include <fcntl.h>
+#include <stdlib.h>
+#include <stdint.h>
+#include <stdbool.h>
+#include <unistd.h>
+#include <sys/ioctl.h>
+
+#include "kdbus-test.h"
+#include "kdbus-util.h"
+#include "kdbus-enum.h"
+
+int kdbus_test_policy(struct kdbus_test_env *env)
+{
+	struct kdbus_conn *conn_a, *conn_b;
+	struct kdbus_policy_access access;
+	int ret;
+
+	/* Invalid name */
+	conn_a = kdbus_hello_registrar(env->buspath, ".example.a",
+				       NULL, 0, KDBUS_HELLO_POLICY_HOLDER);
+	ASSERT_RETURN(conn_a == NULL);
+
+	conn_a = kdbus_hello_registrar(env->buspath, "example",
+				       NULL, 0, KDBUS_HELLO_POLICY_HOLDER);
+	ASSERT_RETURN(conn_a == NULL);
+
+	conn_a = kdbus_hello_registrar(env->buspath, "com.example.a",
+				       NULL, 0, KDBUS_HELLO_POLICY_HOLDER);
+	ASSERT_RETURN(conn_a);
+
+	conn_b = kdbus_hello_registrar(env->buspath, "com.example.b",
+				       NULL, 0, KDBUS_HELLO_POLICY_HOLDER);
+	ASSERT_RETURN(conn_b);
+
+	/*
+	 * Verify there cannot be any duplicate entries, except for specific vs.
+	 * wildcard entries.
+	 */
+
+	access = (struct kdbus_policy_access){
+		.type = KDBUS_POLICY_ACCESS_USER,
+		.id = geteuid(),
+		.access = KDBUS_POLICY_SEE,
+	};
+
+	ret = kdbus_conn_update_policy(conn_a, "com.example.a", &access, 1);
+	ASSERT_RETURN(ret == 0);
+
+	ret = kdbus_conn_update_policy(conn_b, "com.example.a", &access, 1);
+	ASSERT_RETURN(ret == -EEXIST);
+
+	ret = kdbus_conn_update_policy(conn_b, "com.example.a.*", &access, 1);
+	ASSERT_RETURN(ret == 0);
+
+	ret = kdbus_conn_update_policy(conn_a, "com.example.a.*", &access, 1);
+	ASSERT_RETURN(ret == -EEXIST);
+
+	ret = kdbus_conn_update_policy(conn_a, "com.example.*", &access, 1);
+	ASSERT_RETURN(ret == 0);
+
+	ret = kdbus_conn_update_policy(conn_b, "com.example.a", &access, 1);
+	ASSERT_RETURN(ret == 0);
+
+	ret = kdbus_conn_update_policy(conn_b, "com.example.*", &access, 1);
+	ASSERT_RETURN(ret == -EEXIST);
+
+	/* Invalid name */
+	ret = kdbus_conn_update_policy(conn_b, ".example.*", &access, 1);
+	ASSERT_RETURN(ret == -EINVAL);
+
+	ret = kdbus_conn_update_policy(conn_b, "example", &access, 1);
+	ASSERT_RETURN(ret == -EINVAL);
+
+	kdbus_conn_free(conn_b);
+	kdbus_conn_free(conn_a);
+
+	return TEST_OK;
+}
diff --git a/tools/testing/selftests/kdbus/test-race.c b/tools/testing/selftests/kdbus/test-race.c
new file mode 100644
index 000000000000..75cec3fe28a1
--- /dev/null
+++ b/tools/testing/selftests/kdbus/test-race.c
@@ -0,0 +1,313 @@
+#include <stdio.h>
+#include <string.h>
+#include <time.h>
+#include <fcntl.h>
+#include <stdlib.h>
+#include <stddef.h>
+#include <unistd.h>
+#include <stdint.h>
+#include <errno.h>
+#include <assert.h>
+#include <sys/ioctl.h>
+#include <pthread.h>
+#include <stdbool.h>
+
+#include "kdbus-test.h"
+#include "kdbus-util.h"
+#include "kdbus-enum.h"
+
+struct race_thread {
+	pthread_spinlock_t lock;
+	pthread_t thread;
+	int (*fn) (struct kdbus_test_env *env, void *ctx);
+	struct kdbus_test_env *env;
+	void *ctx;
+	int ret;
+};
+
+static void *race_thread_fn(void *data)
+{
+	struct race_thread *thread = data;
+	int ret;
+
+	ret = pthread_spin_lock(&thread->lock);
+	if (ret != 0)
+		goto error;
+
+	ret = thread->fn(thread->env, thread->ctx);
+	pthread_spin_unlock(&thread->lock);
+
+error:
+	return (void*)(long)ret;
+}
+
+static int race_thread_init(struct race_thread *thread)
+{
+	int ret;
+
+	ret = pthread_spin_init(&thread->lock, PTHREAD_PROCESS_PRIVATE);
+	ASSERT_RETURN(ret == 0);
+
+	ret = pthread_spin_lock(&thread->lock);
+	ASSERT_RETURN(ret == 0);
+
+	ret = pthread_create(&thread->thread, NULL, race_thread_fn, thread);
+	ASSERT_RETURN(ret == 0);
+
+	return TEST_OK;
+}
+
+static void race_thread_run(struct race_thread *thread,
+			    int (*fn)(struct kdbus_test_env *env, void *ctx),
+			    struct kdbus_test_env *env, void *ctx)
+{
+	int ret;
+
+	thread->fn = fn;
+	thread->env = env;
+	thread->ctx = ctx;
+
+	ret = pthread_spin_unlock(&thread->lock);
+	if (ret != 0)
+		abort();
+}
+
+static int race_thread_join(struct race_thread *thread)
+{
+	void *val = (void*)(long)-EFAULT;
+	int ret;
+
+	ret = pthread_join(thread->thread, &val);
+	ASSERT_RETURN(ret == 0);
+
+	thread->ret = (long)val;
+
+	return TEST_OK;
+}
+
+static void shuffle(size_t *array, size_t n)
+{
+	size_t i, j, t;
+
+	if (n <= 1)
+		return;
+
+	for (i = 0; i < n - 1; i++) {
+		j = i + rand() / (RAND_MAX / (n - i) + 1);
+		t = array[j];
+		array[j] = array[i];
+		array[i] = t;
+	}
+}
+
+static int race_thread(int (*init_fn) (struct kdbus_test_env *env, void *ctx),
+		       int (*exit_fn) (struct kdbus_test_env *env, void *ctx,
+		                       int *ret, size_t n_ret),
+		       int (*verify_fn) (struct kdbus_test_env *env, void *ctx),
+		       int (**fns) (struct kdbus_test_env *env, void *ctx),
+		       size_t n_fns, struct kdbus_test_env *env, void *ctx,
+		       size_t runs)
+{
+	struct race_thread *t;
+	size_t i, num, *order;
+	int *ret, r;
+
+	t = calloc(n_fns, sizeof(*t));
+	ASSERT_RETURN(t != NULL);
+
+	ret = calloc(n_fns, sizeof(*ret));
+	ASSERT_RETURN(ret != NULL);
+
+	order = calloc(n_fns, sizeof(*order));
+	ASSERT_RETURN(order != NULL);
+
+	for (num = 0; num < runs; ++num) {
+		ASSERT_RETURN(init_fn(env, ctx) == TEST_OK);
+
+		for (i = 0; i < n_fns; ++i) {
+			ASSERT_RETURN(race_thread_init(&t[i]) == TEST_OK);
+			order[i] = i;
+		}
+
+		/* random order */
+		shuffle(order, n_fns);
+		for (i = 0; i < n_fns; ++i)
+			race_thread_run(&t[order[i]], fns[order[i]], env, ctx);
+
+		for (i = 0; i < n_fns; ++i) {
+			ASSERT_RETURN(race_thread_join(&t[i]) == TEST_OK);
+			ret[i] = t[i].ret;
+		}
+
+		ASSERT_RETURN(exit_fn(env, ctx, ret, n_fns) == TEST_OK);
+	}
+
+	r = verify_fn(env, ctx);
+	free(order);
+	free(ret);
+	free(t);
+	return r;
+}
+
+#define ASSERT_RACE(env, ctx, runs, init_fn, exit_fn, verify_fn, ...) ({\
+		int (*fns[])(struct kdbus_test_env*, void*) = {		\
+			__VA_ARGS__					\
+		};							\
+		size_t cnt = sizeof(fns) / sizeof(*fns);		\
+		race_thread(init_fn, exit_fn, verify_fn,		\
+				fns, cnt, env, ctx, runs);		\
+	})
+
+#define TEST_RACE2(_name_, _runs_, _ctx_, _a_, _b_, _init_, _exit_, _verify_)\
+	static int _name_ ## ___a(struct kdbus_test_env *env, void *_ctx)\
+	{								\
+		__attribute__((__unused__)) _ctx_ *ctx = _ctx;		\
+		_a_;							\
+		return TEST_OK;						\
+	}								\
+	static int _name_ ## ___b(struct kdbus_test_env *env, void *_ctx)\
+	{								\
+		__attribute__((__unused__)) _ctx_ *ctx = _ctx;		\
+		_b_;							\
+		return TEST_OK;						\
+	}								\
+	static int _name_ ## ___init(struct kdbus_test_env *env,	\
+				void *_ctx)				\
+	{								\
+		__attribute__((__unused__)) _ctx_ *ctx = _ctx;		\
+		_init_;							\
+		return TEST_OK;						\
+	}								\
+	static int _name_ ## ___exit(struct kdbus_test_env *env,	\
+				void *_ctx, int *ret, size_t n_ret)	\
+	{								\
+		__attribute__((__unused__)) _ctx_ *ctx = _ctx;		\
+		_exit_;							\
+		return TEST_OK;						\
+	}								\
+	static int _name_ ## ___verify(struct kdbus_test_env *env,	\
+				void *_ctx)				\
+	{								\
+		__attribute__((__unused__)) _ctx_ *ctx = _ctx;		\
+		_verify_;						\
+		return TEST_OK;						\
+	}								\
+	int _name_ (struct kdbus_test_env *env) {			\
+		_ctx_ ctx;						\
+		memset(&ctx, 0, sizeof(ctx));				\
+		return ASSERT_RACE(env, &ctx, _runs_,			\
+				_name_ ## ___init,			\
+				_name_ ## ___exit,			\
+				_name_ ## ___verify,			\
+				_name_ ## ___a,				\
+				_name_ ## ___b);			\
+	}
+
+/*
+ * Race Testing
+ * This file provides some rather trivial helpers to run multiple threads in
+ * parallel and test for races. You can define races with TEST_RACEX(), where
+ * 'X' is the number of threads you want. The arguments to this macro should
+ * be code-blocks that are executed in the threads. Each code-block, if it
+ * does not contain a "return" statement, will implicitly return TEST_OK.
+ *
+ * The arguments are:
+ * @arg1: The name of the test to define
+ * @arg2: The number of runs
+ * @arg3: The datatype used as context across all test runs
+ * @arg4-@argN: The code-blocks for the threads to run.
+ * @argN+1: The code-block that is run before each test-run. Use it to
+ *          initialize your contexts.
+ * @argN+2: The code-block that is run after each test-run. Use it to verify
+ *          everything went as expected.
+ * @argN+3: The code-block that is executed after all runs are finished. Use it
+ *          to verify the sum of results.
+ *
+ * Each function has "env" and "ctx" as variables implicitly defined.
+ * Furthermore, the function executed after the tests were run can access "ret",
+ * which is an array of return values of all threads. "n_ret" is the number of
+ * threads.
+ *
+ * Race testing is kinda nasty if you cannot place breakpoints yourself.
+ * Therefore, we run each thread multiple times and allow you to verify the
+ * results of all test-runs after we're finished. Usually, we try to verify all
+ * possible outcomes happened. However, no-one can predict how the scheduler
+ * ran each thread, even if we run 10k times. Furthermore, the execution of all
+ * threads is randomized by us, so we cannot predict how they're run. Therefore,
+ * we only return TEST_SKIP in those cases. This is not a hard failure, but
+ * signals to test-runners that something unexpected happened.
+ */
+
+/*
+ * We run BYEBYE in parallel in two threads. Only one of them is allowed to
+ * succeed, the other one *MUST* return -EALREADY.
+ */
+TEST_RACE2(kdbus_test_race_byebye, 100, int,
+	({
+		return ioctl(env->conn->fd, KDBUS_CMD_BYEBYE, 0) ? -errno : 0;
+	}),
+	({
+		return ioctl(env->conn->fd, KDBUS_CMD_BYEBYE, 0) ? -errno : 0;
+	}),
+	({
+		env->conn = kdbus_hello(env->buspath, 0, NULL, 0);
+		ASSERT_RETURN(env->conn);
+	}),
+	({
+		ASSERT_RETURN((ret[0] == 0 && ret[1] == -EALREADY) ||
+			      (ret[1] == 0 && ret[0] == -EALREADY));
+		kdbus_conn_free(env->conn);
+		env->conn = NULL;
+	}),
+	({
+	}))
+
+/*
+ * Run BYEBYE against MATCH_REMOVE. If BYEBYE is first, it returns 0 and
+ * MATCH_REMOVE must fail with ECONNRESET. If BYEBYE is last, it still succeeds
+ * and MATCH_REMOVE succeeds, too.
+ * Run 10k times; at least on my machine it usually takes about 100 runs to
+ * trigger ECONNRESET races.
+ */
+TEST_RACE2(kdbus_test_race_byebye_match, 10000,
+	struct {
+		bool res1 : 1;
+		bool res2 : 1;
+	},
+	({
+		return ioctl(env->conn->fd, KDBUS_CMD_BYEBYE, 0) ? -errno : 0;
+	}),
+	({
+		struct kdbus_cmd_match cmd = { };
+		int ret;
+
+		cmd.size = sizeof(cmd);
+		cmd.cookie = 0xdeadbeef;
+		ret = ioctl(env->conn->fd, KDBUS_CMD_MATCH_REMOVE, &cmd);
+		if (ret == 0 || errno == ENOENT)
+			return 0;
+
+		return -errno;
+	}),
+	({
+		env->conn = kdbus_hello(env->buspath, 0, NULL, 0);
+		ASSERT_RETURN(env->conn);
+	}),
+	({
+		if (ret[0] == 0 && ret[1] == 0) {
+			/* MATCH_REMOVE ran first, then BYEBYE */
+			ctx->res1 = true;
+		} else if (ret[0] == 0 && ret[1] == -ECONNRESET) {
+			/* BYEBYE ran first, then MATCH_REMOVE failed */
+			ctx->res2 = true;
+		} else {
+			ASSERT_RETURN(0);
+		}
+
+		kdbus_conn_free(env->conn);
+		env->conn = NULL;
+	}),
+	({
+		if (!ctx->res1 || !ctx->res2)
+			return TEST_SKIP;
+	}))
diff --git a/tools/testing/selftests/kdbus/test-sync.c b/tools/testing/selftests/kdbus/test-sync.c
new file mode 100644
index 000000000000..7aff75bc611d
--- /dev/null
+++ b/tools/testing/selftests/kdbus/test-sync.c
@@ -0,0 +1,241 @@
+#include <stdio.h>
+#include <string.h>
+#include <time.h>
+#include <fcntl.h>
+#include <stdlib.h>
+#include <stddef.h>
+#include <unistd.h>
+#include <stdint.h>
+#include <errno.h>
+#include <assert.h>
+#include <sys/ioctl.h>
+#include <pthread.h>
+#include <stdbool.h>
+#include <signal.h>
+#include <sys/wait.h>
+
+#include "kdbus-test.h"
+#include "kdbus-util.h"
+#include "kdbus-enum.h"
+
+static struct kdbus_conn *conn_a, *conn_b;
+static unsigned int cookie = 0xdeadbeef;
+
+static void nop_handler(int sig) {}
+
+static int send_reply(const struct kdbus_conn *conn,
+		      uint64_t reply_cookie,
+		      uint64_t dst_id)
+{
+	struct kdbus_msg *msg;
+	const char ref1[1024 * 128 + 3] = "0123456789_0";
+	struct kdbus_item *item;
+	uint64_t size;
+	int ret;
+
+	size = sizeof(struct kdbus_msg);
+	size += KDBUS_ITEM_SIZE(sizeof(struct kdbus_vec));
+
+	msg = malloc(size);
+	if (!msg) {
+		ret = -errno;
+		kdbus_printf("unable to malloc()!?\n");
+		return ret;
+	}
+
+	memset(msg, 0, size);
+	msg->size = size;
+	msg->src_id = conn->id;
+	msg->dst_id = dst_id;
+	msg->cookie_reply = reply_cookie;
+	msg->payload_type = KDBUS_PAYLOAD_DBUS;
+
+	item = msg->items;
+
+	item->type = KDBUS_ITEM_PAYLOAD_VEC;
+	item->size = KDBUS_ITEM_HEADER_SIZE + sizeof(struct kdbus_vec);
+	item->vec.address = (uintptr_t)&ref1;
+	item->vec.size = sizeof(ref1);
+	item = KDBUS_ITEM_NEXT(item);
+
+	ret = ioctl(conn->fd, KDBUS_CMD_MSG_SEND, msg);
+	if (ret < 0) {
+		ret = -errno;
+		kdbus_printf("error sending message: %d (%m)\n", ret);
+	}
+
+	/* release msg on both the success and the error path */
+	free(msg);
+
+	return ret < 0 ? ret : 0;
+}
+
+static int interrupt_sync(struct kdbus_conn *conn_src,
+			  struct kdbus_conn *conn_dst)
+{
+	pid_t pid;
+	int ret, status;
+	struct kdbus_msg *msg = NULL;
+	struct sigaction sa = {
+		.sa_handler = nop_handler,
+		.sa_flags = SA_NOCLDSTOP|SA_RESTART,
+	};
+
+	cookie++;
+	pid = fork();
+	ASSERT_RETURN_VAL(pid >= 0, pid);
+
+	if (pid == 0) {
+		ret = sigaction(SIGINT, &sa, NULL);
+		ASSERT_EXIT(ret == 0);
+
+		ret = kdbus_msg_send(conn_dst, NULL, cookie,
+				     KDBUS_MSG_FLAGS_EXPECT_REPLY |
+				     KDBUS_MSG_FLAGS_SYNC_REPLY,
+				     100000000ULL, 0, conn_src->id);
+		ASSERT_EXIT(ret == -ETIMEDOUT);
+
+		_exit(EXIT_SUCCESS);
+	}
+
+	ret = kdbus_msg_recv_poll(conn_src, 100, &msg, NULL);
+	ASSERT_RETURN(ret == 0 && msg->cookie == cookie);
+
+	kdbus_msg_free(msg);
+
+	ret = kill(pid, SIGINT);
+	ASSERT_RETURN_VAL(ret == 0, ret);
+
+	ret = waitpid(pid, &status, 0);
+	ASSERT_RETURN_VAL(ret >= 0, ret);
+
+	if (WIFSIGNALED(status))
+		return TEST_ERR;
+
+	ret = kdbus_msg_recv_poll(conn_src, 100, NULL, NULL);
+	ASSERT_RETURN(ret == -ETIMEDOUT);
+
+	return (status == EXIT_SUCCESS) ? TEST_OK : TEST_ERR;
+}
+
+static void *run_thread_reply(void *data)
+{
+	int ret;
+
+	ret = kdbus_msg_recv_poll(conn_a, 3000, NULL, NULL);
+	if (ret == 0) {
+		kdbus_printf("Thread received message, sending reply ...\n");
+		send_reply(conn_a, cookie, conn_b->id);
+	}
+
+	pthread_exit(NULL);
+	return NULL;
+}
+
+int kdbus_test_sync_reply(struct kdbus_test_env *env)
+{
+	pthread_t thread;
+	int ret;
+
+	conn_a = kdbus_hello(env->buspath, 0, NULL, 0);
+	conn_b = kdbus_hello(env->buspath, 0, NULL, 0);
+	ASSERT_RETURN(conn_a && conn_b);
+
+	pthread_create(&thread, NULL, run_thread_reply, NULL);
+
+	ret = kdbus_msg_send(conn_b, NULL, cookie,
+			     KDBUS_MSG_FLAGS_EXPECT_REPLY |
+			     KDBUS_MSG_FLAGS_SYNC_REPLY,
+			     5000000000ULL, 0, conn_a->id);
+
+	pthread_join(thread, NULL);
+	ASSERT_RETURN(ret == 0);
+
+	ret = interrupt_sync(conn_a, conn_b);
+	ASSERT_RETURN(ret == 0);
+
+	kdbus_printf("-- closing bus connections\n");
+
+	kdbus_conn_free(conn_a);
+	kdbus_conn_free(conn_b);
+
+	return TEST_OK;
+}
+
+#define BYEBYE_ME ((void*)0L)
+#define BYEBYE_THEM ((void*)1L)
+
+static void *run_thread_byebye(void *data)
+{
+	int ret;
+
+	ret = kdbus_msg_recv_poll(conn_a, 3000, NULL, NULL);
+	if (ret == 0) {
+		kdbus_printf("Thread received message, invoking BYEBYE ...\n");
+		kdbus_msg_recv(conn_a, NULL, NULL);
+		if (data == BYEBYE_ME)
+			ioctl(conn_b->fd, KDBUS_CMD_BYEBYE, 0);
+		else if (data == BYEBYE_THEM)
+			ioctl(conn_a->fd, KDBUS_CMD_BYEBYE, 0);
+	}
+
+	pthread_exit(NULL);
+	return NULL;
+}
+
+int kdbus_test_sync_byebye(struct kdbus_test_env *env)
+{
+	pthread_t thread;
+	int ret;
+
+	/*
+	 * This sends a synchronous message to a thread, which waits until it
+	 * has received the message and then invokes BYEBYE on the *ORIGINAL*
+	 * connection. That is, on the same connection that synchronously waits
+	 * for a reply.
+	 * This should properly wake the connection up and cause ECONNRESET as
+	 * the connection is disconnected now.
+	 *
+	 * The second time, we do the same but invoke BYEBYE on the *TARGET*
+	 * connection. This should also wake up the synchronous sender as the
+	 * reply cannot be sent by a disconnected target.
+	 */
+
+	conn_a = kdbus_hello(env->buspath, 0, NULL, 0);
+	conn_b = kdbus_hello(env->buspath, 0, NULL, 0);
+	ASSERT_RETURN(conn_a && conn_b);
+
+	pthread_create(&thread, NULL, run_thread_byebye, BYEBYE_ME);
+
+	ret = kdbus_msg_send(conn_b, NULL, cookie,
+			     KDBUS_MSG_FLAGS_EXPECT_REPLY |
+			     KDBUS_MSG_FLAGS_SYNC_REPLY,
+			     5000000000ULL, 0, conn_a->id);
+
+	ASSERT_RETURN(ret == -ECONNRESET);
+
+	pthread_join(thread, NULL);
+
+	kdbus_conn_free(conn_a);
+	kdbus_conn_free(conn_b);
+
+	conn_a = kdbus_hello(env->buspath, 0, NULL, 0);
+	conn_b = kdbus_hello(env->buspath, 0, NULL, 0);
+	ASSERT_RETURN(conn_a && conn_b);
+
+	pthread_create(&thread, NULL, run_thread_byebye, BYEBYE_THEM);
+
+	ret = kdbus_msg_send(conn_b, NULL, cookie,
+			     KDBUS_MSG_FLAGS_EXPECT_REPLY |
+			     KDBUS_MSG_FLAGS_SYNC_REPLY,
+			     5000000000ULL, 0, conn_a->id);
+
+	ASSERT_RETURN(ret == -EPIPE);
+
+	pthread_join(thread, NULL);
+
+	kdbus_conn_free(conn_a);
+	kdbus_conn_free(conn_b);
+
+	return TEST_OK;
+}
diff --git a/tools/testing/selftests/kdbus/test-timeout.c b/tools/testing/selftests/kdbus/test-timeout.c
new file mode 100644
index 000000000000..f0b1a568e27c
--- /dev/null
+++ b/tools/testing/selftests/kdbus/test-timeout.c
@@ -0,0 +1,97 @@
+#include <stdio.h>
+#include <string.h>
+#include <time.h>
+#include <fcntl.h>
+#include <stdlib.h>
+#include <stddef.h>
+#include <unistd.h>
+#include <stdint.h>
+#include <errno.h>
+#include <assert.h>
+#include <poll.h>
+#include <sys/ioctl.h>
+#include <stdbool.h>
+
+#include "kdbus-test.h"
+#include "kdbus-util.h"
+#include "kdbus-enum.h"
+
+int timeout_msg_recv(struct kdbus_conn *conn, uint64_t *expected)
+{
+	struct kdbus_cmd_recv recv = {};
+	struct kdbus_msg *msg;
+	int ret;
+
+	ret = ioctl(conn->fd, KDBUS_CMD_MSG_RECV, &recv) < 0 ? -errno : 0;
+	if (ret < 0) {
+		kdbus_printf("error receiving message: %d (%m)\n", ret);
+		return ret;
+	}
+
+	msg = (struct kdbus_msg *)(conn->buf + recv.offset);
+
+	ASSERT_RETURN_VAL(msg->payload_type == KDBUS_PAYLOAD_KERNEL, -EINVAL);
+	ASSERT_RETURN_VAL(msg->src_id == KDBUS_SRC_ID_KERNEL, -EINVAL);
+	ASSERT_RETURN_VAL(msg->dst_id == conn->id, -EINVAL);
+
+	*expected &= ~(1ULL << msg->cookie_reply);
+	kdbus_printf("Got message timeout for cookie %llu\n",
+		     msg->cookie_reply);
+
+	ret = kdbus_free(conn, recv.offset);
+	if (ret < 0)
+		return ret;
+
+	return 0;
+}
+
+int kdbus_test_timeout(struct kdbus_test_env *env)
+{
+	struct kdbus_conn *conn_a, *conn_b;
+	struct pollfd fd;
+	int ret, i, n_msgs = 4;
+	uint64_t expected = 0;
+
+	conn_a = kdbus_hello(env->buspath, 0, NULL, 0);
+	conn_b = kdbus_hello(env->buspath, 0, NULL, 0);
+	ASSERT_RETURN(conn_a && conn_b);
+
+	fd.fd = conn_b->fd;
+
+	/*
+	 * send messages that expect a reply within (i + 1) * 100 msec,
+	 * but never answer them.
+	 */
+	for (i = 0; i < n_msgs; i++) {
+		kdbus_printf("Sending message with cookie %u ...\n", i);
+		ASSERT_RETURN(kdbus_msg_send(conn_b, NULL, i,
+			      KDBUS_MSG_FLAGS_EXPECT_REPLY,
+			      (i + 1) * 100ULL * 1000000ULL, 0,
+			      conn_a->id) == 0);
+		expected |= 1ULL << i;
+	}
+
+	for (;;) {
+		fd.events = POLLIN | POLLPRI | POLLHUP;
+		fd.revents = 0;
+
+		ret = poll(&fd, 1, (n_msgs + 1) * 100);
+		if (ret == 0)
+			kdbus_printf("--- timeout\n");
+		if (ret <= 0)
+			break;
+
+		if (fd.revents & POLLIN)
+			ASSERT_RETURN(!timeout_msg_recv(conn_b, &expected));
+
+		if (expected == 0)
+			break;
+	}
+
+	ASSERT_RETURN(expected == 0);
+
+	kdbus_conn_free(conn_a);
+	kdbus_conn_free(conn_b);
+
+	return TEST_OK;
+}
-- 
2.1.3
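
For readers following kdbus_test_timeout() in the patch above: the test
appears to record each outstanding cookie as one bit of a 64-bit mask and
to clear that bit when the timeout notification for the cookie arrives. A
minimal standalone sketch of that bookkeeping follows. The helper names
expect_cookies() and cookie_timed_out() are made up for illustration and
are not part of the patch; the bounds check on the cookie is also an
addition of this sketch, not something the test does.

```c
#include <stdint.h>
#include <stddef.h>

/*
 * Hypothetical helpers (not part of the patch) mirroring how
 * kdbus_test_timeout() tracks outstanding replies in a 64-bit mask.
 */

/* Mark cookies 0..n-1 as outstanding; at most 64 cookies fit. */
static uint64_t expect_cookies(unsigned int n)
{
	uint64_t expected = 0;
	unsigned int i;

	for (i = 0; i < n && i < 64; i++)
		expected |= 1ULL << i;

	return expected;
}

/*
 * Clear the bit for a cookie whose timeout notification arrived.
 * Cookies >= 64 are ignored, since shifting a 64-bit value by 64 or
 * more bits is undefined behaviour in C.
 */
static uint64_t cookie_timed_out(uint64_t expected, uint64_t cookie)
{
	if (cookie >= 64)
		return expected;

	return expected & ~(1ULL << cookie);
}
```

With four messages the mask starts out as 0xf, and the test can only pass
once every bit has been cleared again.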


^ permalink raw reply related	[flat|nested] 73+ messages in thread

* Re: [PATCH v2 00/13] Add kdbus implementation
@ 2014-11-21  6:02   ` Greg Kroah-Hartman
  0 siblings, 0 replies; 73+ messages in thread
From: Greg Kroah-Hartman @ 2014-11-21  6:02 UTC (permalink / raw)
  To: arnd, ebiederm, gnomes, teg, jkosina, luto, linux-api, linux-kernel
  Cc: daniel, dh.herrmann, tixxdz

On Thu, Nov 20, 2014 at 09:02:16PM -0800, Greg Kroah-Hartman wrote:
> kdbus is a kernel-level IPC implementation that aims for resemblance to
> the protocol layer with the existing userspace D-Bus daemon while
> enabling some features that couldn't be implemented before in userspace.
> 
> The documentation in the first patch in this series explains the
> protocol and the API details.

Ugh, and I can't seem to remember to put the patch numbers on the
series, sorry about that, my git format-patch default options are set up
for applying patches and responding to them, not generating numbered
series :(

If someone wants the numbers, I can resend them, otherwise, just stick
with the git tree...

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: kdbus: add code for buses, domains and endpoints
@ 2014-11-21  8:14     ` Harald Hoyer
  0 siblings, 0 replies; 73+ messages in thread
From: Harald Hoyer @ 2014-11-21  8:14 UTC (permalink / raw)
  To: Greg Kroah-Hartman, arnd, ebiederm, gnomes, teg, jkosina, luto,
	linux-api, linux-kernel
  Cc: daniel, dh.herrmann, tixxdz

On 21.11.2014 06:02, Greg Kroah-Hartman wrote:
> From: Daniel Mack <daniel@zonque.org>
> 
> Add the logic to handle the following entities:
> 
> Domain:
>   A domain is an unnamed object containing a number of buses. A
>   domain is automatically created when an instance of kdbusfs
>   is mounted, and destroyed when it is unmounted.
>   Every domain offers its own "control" device node to create
>   buses.  Domains have no connection to each other and cannot
>   see nor talk to each other.
> 
> Bus:
>   A bus is a named object inside a domain. Clients exchange messages
>   over a bus. Multiple buses themselves have no connection to each
>   other; messages can only be exchanged on the same bus. The default
>   entry point to a bus, where clients establish the connection to, is
>   the "bus" device node /dev/kdbus/<bus name>/bus.  Common operating
>   system setups create one "system bus" per system, and one "user
>   bus" for every logged-in user. Applications or services may create
>   their own private named buses.

might need a resync with the documentation.

Bus:
  A bus is a named object inside a domain. Clients exchange messages
  over a bus. Multiple buses themselves have no connection to each other;
  messages can only be exchanged on the same bus. The default entry point to
  a bus, where clients establish the connection to, is the "bus" file
  /sys/fs/kdbus/<bus name>/bus.
  Common operating system setups create one "system bus" per system, and one
  "user bus" for every logged-in user. Applications or services may create
  their own private named buses. See section 5 for more details.
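
To make the path layout in the documentation text concrete, here is a
small sketch that builds the default entry point for a bus. It assumes
the "/sys/fs/kdbus/<bus name>/bus" layout exactly as quoted; the helper
name example_bus_path() and any bus name passed to it are hypothetical,
and nothing here talks to the kernel.

```c
#include <stdio.h>
#include <stddef.h>

/* Mount point as given in the documentation text quoted above. */
#define EXAMPLE_KDBUSFS_ROOT "/sys/fs/kdbus"

/*
 * Build "<root>/<bus name>/bus" into buf.
 * Returns 0 on success, -1 if an argument is invalid or buf is too small.
 */
static int example_bus_path(char *buf, size_t len, const char *bus_name)
{
	int n;

	if (!buf || !bus_name || !*bus_name)
		return -1;

	n = snprintf(buf, len, "%s/%s/bus", EXAMPLE_KDBUSFS_ROOT, bus_name);
	if (n < 0 || (size_t)n >= len)
		return -1;

	return 0;
}
```

A caller would then open() the resulting path; the truncation check keeps
an over-long bus name from silently producing a different path.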



^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: kdbus: add documentation
  2014-11-21  5:02 ` kdbus: add documentation Greg Kroah-Hartman
@ 2014-11-21  8:29   ` Harald Hoyer
  2014-11-21 17:12     ` Andy Lutomirski
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 73+ messages in thread
From: Harald Hoyer @ 2014-11-21  8:29 UTC (permalink / raw)
  To: linux-kernel

On 21.11.2014 06:02, Greg Kroah-Hartman wrote:
> From: Daniel Mack <daniel@zonque.org>
> …

> +5.4 Creating buses and endpoints
> +--------------------------------
> +
> +KDBUS_CMD_BUS_MAKE, and KDBUS_CMD_ENDPOINT_MAKE take a
> +struct kdbus_cmd_make argument.
> +
> +struct kdbus_cmd_make {
> +  __u64 size;
> +    The overall size of the struct, including its items.
> +
> +  __u64 flags;
> +    The flags for creation.
> +
> +    KDBUS_MAKE_ACCESS_GROUP
> +      Make the device file group-accessible

device?

> +
> +    KDBUS_MAKE_ACCESS_WORLD
> +      Make the device file world-accessible

device?

> +
> +  __u64 kernel_flags;
> +    Valid flags for this command, returned by the kernel upon each call.
> +
> +  struct kdbus_item items[0];
> +    A list of items, only used for creating custom endpoints. Has specific
> +    meanings for KDBUS_CMD_BUS_MAKE and KDBUS_CMD_ENDPOINT_MAKE (see above).
> +};
> …
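
As a rough illustration of the layout described in the quoted section, the
sketch below fills in the fixed part of a make command. It does not use
the real header: struct example_cmd_make and example_cmd_make_init() are
local stand-ins invented for this sketch, the trailing items[] array is
left out, and only the two flag values are taken from the header patch
quoted elsewhere in this thread.

```c
#include <stdint.h>
#include <string.h>

/* Flag values as quoted from the header patch in this thread. */
#define EXAMPLE_MAKE_ACCESS_GROUP (1ULL << 0)
#define EXAMPLE_MAKE_ACCESS_WORLD (1ULL << 1)

/*
 * Local stand-in for struct kdbus_cmd_make (assumption: fixed part only,
 * the trailing items[] array is omitted).
 */
struct example_cmd_make {
	uint64_t size;		/* overall size, including any items */
	uint64_t flags;		/* creation flags */
	uint64_t kernel_flags;	/* filled in by the kernel on return */
};

/* Zero the command, then set the size (header plus items) and flags. */
static void example_cmd_make_init(struct example_cmd_make *cmd,
				  uint64_t flags, uint64_t items_size)
{
	memset(cmd, 0, sizeof(*cmd));
	cmd->size = sizeof(*cmd) + items_size;
	cmd->flags = flags;
}
```

The point is only that "size" covers the header and every appended item,
so it has to be recomputed whenever items are added.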

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: kdbus: add header file
  2014-11-21  5:02 ` kdbus: add header file Greg Kroah-Hartman
@ 2014-11-21  8:34   ` Harald Hoyer
  2014-11-21  8:55     ` Daniel Mack
  0 siblings, 1 reply; 73+ messages in thread
From: Harald Hoyer @ 2014-11-21  8:34 UTC (permalink / raw)
  To: linux-kernel

On 21.11.2014 06:02, Greg Kroah-Hartman wrote:
> From: Daniel Mack <daniel@zonque.org>
> …

> +
> +/**
> + * enum kdbus_make_flags - Flags for KDBUS_CMD_{BUS,EP,NS}_MAKE
> + * @KDBUS_MAKE_ACCESS_GROUP:	Make the device node group-accessible
> + * @KDBUS_MAKE_ACCESS_WORLD:	Make the device node world-accessible
> + */
> +enum kdbus_make_flags {
> +	KDBUS_MAKE_ACCESS_GROUP		= 1ULL <<  0,
> +	KDBUS_MAKE_ACCESS_WORLD		= 1ULL <<  1,
> +};
> +

"device node"

…

> +/**
> + * Ioctl API
> + * KDBUS_CMD_BUS_MAKE:		After opening the "control" device node, this
> + *				command creates a new bus with the specified
> + *				name. The bus is immediately shut down and
> + *				cleaned up when the opened "control" device node
> + *				is closed.

"device node"

> + * KDBUS_CMD_ENDPOINT_MAKE:	Creates a new named special endpoint to talk to
> + *				the bus. Such endpoints usually carry a more
> + *				restrictive policy and grant restricted access
> + *				to specific applications.
> + * KDBUS_CMD_HELLO:		By opening the bus device node a connection is
> + *				created. After a HELLO the opened connection
> + *				becomes an active peer on the bus.

"device node"

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: kdbus: add code for buses, domains and endpoints
  2014-11-21  5:02 ` kdbus: add code for buses, domains and endpoints Greg Kroah-Hartman
  2014-11-21  8:14     ` Harald Hoyer
@ 2014-11-21  8:39   ` Harald Hoyer
  1 sibling, 0 replies; 73+ messages in thread
From: Harald Hoyer @ 2014-11-21  8:39 UTC (permalink / raw)
  To: linux-kernel

On 21.11.2014 06:02, Greg Kroah-Hartman wrote:
> From: Daniel Mack <daniel@zonque.org>
> 
> +/**
> + * kdbus_bus_new() - create a kdbus_cmd_make from user-supplied data
> + * @domain:		The domain to work on
> + * @make:		Information as passed in by userspace
> + * @uid:		The uid of the device node
> + * @gid:		The gid of the device node

"device node" in several function comments.


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: kdbus: add header file
  2014-11-21  8:34   ` Harald Hoyer
@ 2014-11-21  8:55     ` Daniel Mack
  0 siblings, 0 replies; 73+ messages in thread
From: Daniel Mack @ 2014-11-21  8:55 UTC (permalink / raw)
  To: Harald Hoyer, linux-kernel

Hi Harald,

On 11/21/2014 09:34 AM, Harald Hoyer wrote:
> On 21.11.2014 06:02, Greg Kroah-Hartman wrote:
>> From: Daniel Mack <daniel@zonque.org>
>> …
> 
>> +
>> +/**
>> + * enum kdbus_make_flags - Flags for KDBUS_CMD_{BUS,EP,NS}_MAKE
>> + * @KDBUS_MAKE_ACCESS_GROUP:	Make the device node group-accessible
>> + * @KDBUS_MAKE_ACCESS_WORLD:	Make the device node world-accessible
>> + */
>> +enum kdbus_make_flags {
>> +	KDBUS_MAKE_ACCESS_GROUP		= 1ULL <<  0,
>> +	KDBUS_MAKE_ACCESS_WORLD		= 1ULL <<  1,
>> +};
>> +
> 
> "device node"

Yes, these are indeed oversights from the old implementation. Fixed now
upstream, along with the other details in kdbus.txt and some more
elsewhere in the tree.


Thanks,
Daniel


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: kdbus: add node and filesystem implementation
@ 2014-11-21 15:55     ` Sasha Levin
  0 siblings, 0 replies; 73+ messages in thread
From: Sasha Levin @ 2014-11-21 15:55 UTC (permalink / raw)
  To: Greg Kroah-Hartman, arnd, ebiederm, gnomes, teg, jkosina, luto,
	linux-api, linux-kernel
  Cc: daniel, dh.herrmann, tixxdz

On 11/21/2014 12:02 AM, Greg Kroah-Hartman wrote:
> +static struct dentry *fs_dir_iop_lookup(struct inode *dir,
> +					struct dentry *dentry,
> +					unsigned int flags)
> +{
> +	struct dentry *dnew = NULL;
> +	struct kdbus_node *parent;
> +	struct kdbus_node *node;
> +	struct inode *inode;
> +
> +	parent = kdbus_node_from_dentry(dentry->d_parent);
> +	if (!kdbus_node_acquire(parent))
> +		return NULL;
> +
> +	/* returns reference to _acquired_ child node */
> +	node = kdbus_node_find_child(parent, dentry->d_name.name);
> +	if (node) {
> +		dentry->d_fsdata = node;
> +		inode = fs_inode_get(dir->i_sb, node);
> +		if (IS_ERR(inode))
> +			dnew = ERR_CAST(inode);
> +		else
> +			dnew = d_materialise_unique(dentry, inode);

d_materialise_unique() is gone in Al's fs tree:

[mandatory]
        d_materialise_unique() is gone; d_splice_alias() does everything you
        need now.  Remember that they have opposite orders of arguments ;-/

Maybe it's worth basing your git tree on top of Al's rather than a random
-rc, since it's now a filesystem?


Thanks,
Sasha

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: kdbus: add node and filesystem implementation
@ 2014-11-21 16:13       ` David Herrmann
  0 siblings, 0 replies; 73+ messages in thread
From: David Herrmann @ 2014-11-21 16:13 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Greg Kroah-Hartman, Arnd Bergmann, ebiederm, One Thousand Gnomes,
	Tom Gundersen, Jiri Kosina, Andy Lutomirski, Linux API,
	linux-kernel, Daniel Mack, Djalal Harouni

Hi

On Fri, Nov 21, 2014 at 4:55 PM, Sasha Levin <sasha.levin@oracle.com> wrote:
> On 11/21/2014 12:02 AM, Greg Kroah-Hartman wrote:
>> +static struct dentry *fs_dir_iop_lookup(struct inode *dir,
>> +                                     struct dentry *dentry,
>> +                                     unsigned int flags)
>> +{
>> +     struct dentry *dnew = NULL;
>> +     struct kdbus_node *parent;
>> +     struct kdbus_node *node;
>> +     struct inode *inode;
>> +
>> +     parent = kdbus_node_from_dentry(dentry->d_parent);
>> +     if (!kdbus_node_acquire(parent))
>> +             return NULL;
>> +
>> +     /* returns reference to _acquired_ child node */
>> +     node = kdbus_node_find_child(parent, dentry->d_name.name);
>> +     if (node) {
>> +             dentry->d_fsdata = node;
>> +             inode = fs_inode_get(dir->i_sb, node);
>> +             if (IS_ERR(inode))
>> +                     dnew = ERR_CAST(inode);
>> +             else
>> +                     dnew = d_materialise_unique(dentry, inode);
>
> d_materialise_unique() is gone in Al's fs tree:
>
> [mandatory]
>         d_materialise_unique() is gone; d_splice_alias() does everything you
>         need now.  Remember that they have opposite orders of arguments ;-/

That was actually pushed after we prepared v2, so I haven't seen it
yet. I now rebased on top of vfs.git#for-next, with
d_materialise_unique() -> d_splice_alias(). Thanks for the hint!
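
For illustration only, the conversion comes down to swapping the two
arguments at the call site. The sketch below is a userspace stand-in, not
kernel code: the struct definitions, the stub body of d_splice_alias() and
the helper name lookup_finish() are placeholders so that the fragment
compiles on its own. Only the parameter order is meant to mirror the kernel
helper.

```c
#include <stddef.h>

/* Placeholder types; the real ones come from the kernel headers. */
struct inode { int ino; };
struct dentry { struct inode *d_inode; };

/*
 * Stub with the same parameter order as the kernel helper:
 * inode first, dentry second. The body is a placeholder that only
 * attaches the inode and returns NULL ("no existing alias").
 */
static struct dentry *d_splice_alias(struct inode *inode, struct dentry *dentry)
{
	dentry->d_inode = inode;
	return NULL;
}

/* Old call site:  dnew = d_materialise_unique(dentry, inode); */
static struct dentry *lookup_finish(struct dentry *dentry, struct inode *inode)
{
	/* New call site: same two objects, opposite order. */
	return d_splice_alias(inode, dentry);
}
```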

> Maybe it's worth basing your git tree on top of Al's rather than a random
> -rc, since it's now a filesystem?

Sure, sounds good.

Thanks
David

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: kdbus: add node and filesystem implementation
  2014-11-21  5:02 ` kdbus: add node and filesystem implementation Greg Kroah-Hartman
  2014-11-21 15:55     ` Sasha Levin
@ 2014-11-21 16:35   ` Andy Lutomirski
  2014-11-21 16:41     ` Andy Lutomirski
  2014-11-21 16:53       ` David Herrmann
  1 sibling, 2 replies; 73+ messages in thread
From: Andy Lutomirski @ 2014-11-21 16:35 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Arnd Bergmann, Eric W. Biederman, One Thousand Gnomes,
	Tom Gundersen, Jiri Kosina, Linux API, linux-kernel, Daniel Mack,
	David Herrmann, Djalal Harouni

On Thu, Nov 20, 2014 at 9:02 PM, Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
> From: Daniel Mack <daniel@zonque.org>
>
> kdbusfs is a filesystem that will expose a fresh kdbus domain context
> each time it is mounted. Per mount point, there will be a 'control'
> node, which can be used to create buses. fs.c contains the
> implementation of that pseudo-fs. Exported inodes of 'file' type have
> their i_fop set to either kdbus_handle_control_ops or
> kdbus_handle_ep_ops, depending on their type. The actual dispatching
> of file operations is done from handle.c
>
> node.c is an implementation of a kdbus object that has an id and
> children, organized in an R/B tree. The tree is used by the filesystem
> code for lookup and iterator functions, and to deactivate children
> once the parent is deactivated. Every inode exported by kdbusfs is
> backed by a kdbus_node, hence it is embedded in struct kdbus_ep,
> struct kdbus_bus and struct kdbus_domain.
>
> Signed-off-by: Daniel Mack <daniel@zonque.org>
> Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
> Signed-off-by: Djalal Harouni <tixxdz@opendz.org>
> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> ---


> +
> +static struct file_system_type fs_type = {
> +       .name           = KBUILD_MODNAME "fs",
> +       .owner          = THIS_MODULE,
> +       .mount          = fs_super_mount,
> +       .kill_sb        = fs_super_kill,
> +};

Does this want something like:

.fs_flags = FS_USERNS_MOUNT

This design may have the annoying property that, if a namespace-based
sandbox wants to use kdbus itself, it will need to proxy anything from
the parent that it wants to use.

Is there a good reason why individual *busses* don't show up in the
filesystem?  If they did, maybe they could be bind-mounted or
otherwise arranged to cross namespace boundaries.

--Andy

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: kdbus: add node and filesystem implementation
  2014-11-21 16:35   ` Andy Lutomirski
@ 2014-11-21 16:41     ` Andy Lutomirski
  2014-11-21 16:53       ` David Herrmann
  1 sibling, 0 replies; 73+ messages in thread
From: Andy Lutomirski @ 2014-11-21 16:41 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Arnd Bergmann, Eric W. Biederman, One Thousand Gnomes,
	Tom Gundersen, Jiri Kosina, Linux API, linux-kernel, Daniel Mack,
	David Herrmann, Djalal Harouni

On Fri, Nov 21, 2014 at 8:35 AM, Andy Lutomirski <luto@amacapital.net> wrote:
> On Thu, Nov 20, 2014 at 9:02 PM, Greg Kroah-Hartman
> <gregkh@linuxfoundation.org> wrote:
>> From: Daniel Mack <daniel@zonque.org>
>>
>> kdbusfs is a filesystem that will expose a fresh kdbus domain context
>> each time it is mounted. Per mount point, there will be a 'control'
>> node, which can be used to create buses. fs.c contains the
>> implementation of that pseudo-fs. Exported inodes of 'file' type have
>> their i_fop set to either kdbus_handle_control_ops or
>> kdbus_handle_ep_ops, depending on their type. The actual dispatching
>> of file operations is done from handle.c
>>
>> node.c is an implementation of a kdbus object that has an id and
>> children, organized in an R/B tree. The tree is used by the filesystem
>> code for lookup and iterator functions, and to deactivate children
>> once the parent is deactivated. Every inode exported by kdbusfs is
>> backed by a kdbus_node, hence it is embedded in struct kdbus_ep,
>> struct kdbus_bus and struct kdbus_domain.
>>
>> Signed-off-by: Daniel Mack <daniel@zonque.org>
>> Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
>> Signed-off-by: Djalal Harouni <tixxdz@opendz.org>
>> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>> ---
>
>
>> +
>> +static struct file_system_type fs_type = {
>> +       .name           = KBUILD_MODNAME "fs",
>> +       .owner          = THIS_MODULE,
>> +       .mount          = fs_super_mount,
>> +       .kill_sb        = fs_super_kill,
>> +};
>
> Does this want something like:
>
> .fs_flags = FS_USERNS_MOUNT
>
> This design may have the annoying property that, if a namespace-based
> sandbox wants to use kdbus itself, it will need to proxy anything from
> the parent that it wants to use.
>
> Is there a good reason why individual *busses* don't show up in the
> filesystem?  If they did, maybe they could be bind-mounted or
> otherwise arranged to cross namespace boundaries.
>

Whoops.  Brainfart there -- busses do show up in the filesystem.  Does
bind-mounting them work?

--Andy

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: kdbus: add node and filesystem implementation
@ 2014-11-21 16:53       ` David Herrmann
  0 siblings, 0 replies; 73+ messages in thread
From: David Herrmann @ 2014-11-21 16:53 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Greg Kroah-Hartman, Arnd Bergmann, Eric W. Biederman,
	One Thousand Gnomes, Tom Gundersen, Jiri Kosina, Linux API,
	linux-kernel, Daniel Mack, Djalal Harouni

Hi

On Fri, Nov 21, 2014 at 5:35 PM, Andy Lutomirski <luto@amacapital.net> wrote:
> On Thu, Nov 20, 2014 at 9:02 PM, Greg Kroah-Hartman
> <gregkh@linuxfoundation.org> wrote:
>> From: Daniel Mack <daniel@zonque.org>
>>
>> kdbusfs is a filesystem that will expose a fresh kdbus domain context
>> each time it is mounted. Per mount point, there will be a 'control'
>> node, which can be used to create buses. fs.c contains the
>> implementation of that pseudo-fs. Exported inodes of 'file' type have
>> their i_fop set to either kdbus_handle_control_ops or
>> kdbus_handle_ep_ops, depending on their type. The actual dispatching
>> of file operations is done from handle.c
>>
>> node.c is an implementation of a kdbus object that has an id and
>> children, organized in an R/B tree. The tree is used by the filesystem
>> code for lookup and iterator functions, and to deactivate children
>> once the parent is deactivated. Every inode exported by kdbusfs is
>> backed by a kdbus_node, hence it is embedded in struct kdbus_ep,
>> struct kdbus_bus and struct kdbus_domain.
>>
>> Signed-off-by: Daniel Mack <daniel@zonque.org>
>> Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
>> Signed-off-by: Djalal Harouni <tixxdz@opendz.org>
>> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>> ---
>
>
>> +
>> +static struct file_system_type fs_type = {
>> +       .name           = KBUILD_MODNAME "fs",
>> +       .owner          = THIS_MODULE,
>> +       .mount          = fs_super_mount,
>> +       .kill_sb        = fs_super_kill,
>> +};
>
> Does this want something like:
>
> .fs_flags = FS_USERNS_MOUNT

Yes, we should add that. Nice catch!

> This design may have the annoying property that, if a namespace-based
> sandbox wants to use kdbus itself, it will need to proxy anything from
> the parent that it wants to use.
>
> Is there a good reason why individual *busses* don't show up in the
> filesystem?  If they did, maybe they could be bind-mounted or
> otherwise arranged to cross namespace boundaries.

Buses show up as directories in kdbusfs. You can bind-mount them
anywhere you want and you will get access to the endpoints in there.
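
As a rough sketch of what that looks like from userspace, a bus directory
can be bind-mounted with plain mount(2) and MS_BIND. The helper name
bind_mount_bus() and any paths passed to it are assumptions for
illustration; the caller needs sufficient privileges in its mount
namespace.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/mount.h>

/*
 * Bind-mount one bus directory onto another location.
 * Returns 0 on success or a negative errno value on failure.
 */
static int bind_mount_bus(const char *bus_dir, const char *target)
{
	/* For MS_BIND the filesystem type and data arguments are ignored. */
	if (mount(bus_dir, target, NULL, MS_BIND, NULL) < 0)
		return -errno;
	return 0;
}
```

A sandbox setup step might call something like
bind_mount_bus("<kdbusfs mount>/<bus name>", "<path inside the sandbox>"),
with both paths chosen by the container manager.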

Thanks
David

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: kdbus: add node and filesystem implementation
@ 2014-11-21 16:56         ` Greg Kroah-Hartman
  0 siblings, 0 replies; 73+ messages in thread
From: Greg Kroah-Hartman @ 2014-11-21 16:56 UTC (permalink / raw)
  To: David Herrmann
  Cc: Sasha Levin, Arnd Bergmann, ebiederm, One Thousand Gnomes,
	Tom Gundersen, Jiri Kosina, Andy Lutomirski, Linux API,
	linux-kernel, Daniel Mack, Djalal Harouni

On Fri, Nov 21, 2014 at 05:13:26PM +0100, David Herrmann wrote:
> Hi
> 
> On Fri, Nov 21, 2014 at 4:55 PM, Sasha Levin <sasha.levin@oracle.com> wrote:
> > On 11/21/2014 12:02 AM, Greg Kroah-Hartman wrote:
> >> +static struct dentry *fs_dir_iop_lookup(struct inode *dir,
> >> +                                     struct dentry *dentry,
> >> +                                     unsigned int flags)
> >> +{
> >> +     struct dentry *dnew = NULL;
> >> +     struct kdbus_node *parent;
> >> +     struct kdbus_node *node;
> >> +     struct inode *inode;
> >> +
> >> +     parent = kdbus_node_from_dentry(dentry->d_parent);
> >> +     if (!kdbus_node_acquire(parent))
> >> +             return NULL;
> >> +
> >> +     /* returns reference to _acquired_ child node */
> >> +     node = kdbus_node_find_child(parent, dentry->d_name.name);
> >> +     if (node) {
> >> +             dentry->d_fsdata = node;
> >> +             inode = fs_inode_get(dir->i_sb, node);
> >> +             if (IS_ERR(inode))
> >> +                     dnew = ERR_CAST(inode);
> >> +             else
> >> +                     dnew = d_materialise_unique(dentry, inode);
> >
> > d_materialise_unique() is gone in Al's fs tree:
> >
> > [mandatory]
> >         d_materialise_unique() is gone; d_splice_alias() does everything you
> >         need now.  Remember that they have opposite orders of arguments ;-/
> 
> That was actually pushed after we prepared v2, so I haven't seen it
> yet. I now rebased on top of vfs.git#for-next, with
> d_materialise_unique() -> d_splice_alias(). Thanks for the hint!
> 
> > Maybe it's worth basing your git tree on top of Al's rather than a random
> > -rc, since it's now a filesystem?
> 
> Sure, sounds good.

No, I'll keep it as is, we can handle the merge issues later when it
hits Linus's tree, this makes it easier for me and others to test it
out properly.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: kdbus: add node and filesystem implementation
@ 2014-11-21 17:03           ` Sasha Levin
  0 siblings, 0 replies; 73+ messages in thread
From: Sasha Levin @ 2014-11-21 17:03 UTC (permalink / raw)
  To: Greg Kroah-Hartman, David Herrmann
  Cc: Arnd Bergmann, ebiederm, One Thousand Gnomes, Tom Gundersen,
	Jiri Kosina, Andy Lutomirski, Linux API, linux-kernel,
	Daniel Mack, Djalal Harouni

On 11/21/2014 11:56 AM, Greg Kroah-Hartman wrote:
>>> Maybe it's worth basing your git tree on top of Al's rather than a random
>>> -rc, since it's now a filesystem?
>>
>> Sure, sounds good.
>
> No, I'll keep it as is, we can handle the merge issues later when it
> hits Linus's tree, this makes it easier for me and others to test it
> out properly.

It should be hitting -next, not Linus's tree. This is why we have an
integration tree, no?


Thanks,
Sasha

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: kdbus: add documentation
@ 2014-11-21 17:12     ` Andy Lutomirski
  0 siblings, 0 replies; 73+ messages in thread
From: Andy Lutomirski @ 2014-11-21 17:12 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Arnd Bergmann, Eric W. Biederman, One Thousand Gnomes,
	Tom Gundersen, Jiri Kosina, Linux API, linux-kernel, Daniel Mack,
	David Herrmann, Djalal Harouni

On Thu, Nov 20, 2014 at 9:02 PM, Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
> From: Daniel Mack <daniel@zonque.org>
>
> kdbus is a system for low-latency, low-overhead, easy to use
> interprocess communication (IPC).
>
> The interface to all functions in this driver is implemented through
> ioctls on files exposed through the mount point of a kdbusfs.  This
> patch adds detailed documentation about the kernel level API design.
>
> Signed-off-by: Daniel Mack <daniel@zonque.org>
> Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
> Signed-off-by: Djalal Harouni <tixxdz@opendz.org>
> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> ---

> +  Pool:
> +    Each connection allocates a piece of shmem-backed memory that is used
> +    to receive messages and answers to ioctl command from the kernel. It is
> +    never used to send anything to the kernel. In order to access that memory,
> +    userspace must mmap() it into its task.
> +    See section 12 for more details.

At the risk of opening a can of worms, wouldn't this be much more
useful if you could share a pool between multiple connections?


> +
> +
> +4. Items
> +===============================================================================
> +
> +To flexibly augment transport structures used by kdbus, data blobs of type
> +struct kdbus_item are used. An item has a fixed-sized header that only stores
> +the type of the item and the overall size. The total size is variable and is
> +in some cases defined by the item type, in other cases, they can be of
> +arbitrary length (for instance, a string).
> +
> +In the external kernel API, items are used for many ioctls to transport
> +optional information from userspace to kernelspace. They are also used for
> +information stored in a connection's pool, such as messages, name lists or
> +requested connection information.
> +
> +In all such occasions where items are used as part of the kdbus kernel API,
> +they are embedded in structs that have an overall size of their own, so there
> +can be many of them.
> +
> +The kernel expects all items to be aligned to 8-byte boundaries.
> +
> +A simple iterator in userspace would iterate over the items until the items
> +have reached the embedding structure's overall size. An example implementation
> +of such an iterator can be found in tools/testing/selftests/kdbus/kdbus-util.h.

It looks like many (all?) item consumers ignore unknown items.  This
seems like a compatibility problem.

Would it be better to have a bit in each item that toggles between
"ignore me if you don't recognize me" and "error out if you don't
recognize me"?
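
To make the suggestion concrete, here is a minimal sketch of an item walk
that honours such a bit. Everything in it is an assumption for
illustration: the struct item_hdr layout (size first, then type), the
ITEM_F_CRITICAL bit, and the set of "known" types are made up here and are
not taken from the patch. The walk advances by the 8-byte-aligned item size
until the embedding structure's size is reached, as the quoted section
describes.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fixed-size item header: overall size (header + payload), then type. */
struct item_hdr {
	uint64_t size;
	uint64_t type;
};

#define ITEM_ALIGN8(v)   (((v) + 7) & ~(uint64_t)7)
#define ITEM_F_CRITICAL  (1ULL << 63)  /* hypothetical "must understand" bit */

/* Hypothetical: pretend this consumer only understands types 1 and 2. */
static int item_type_known(uint64_t type)
{
	return type == 1 || type == 2;
}

/*
 * Walk all items in buf (len bytes in total).
 * Returns the number of recognized items, -EINVAL for a malformed
 * item, or -EOPNOTSUPP for an unknown item marked as critical.
 */
static int walk_items(const void *buf, size_t len)
{
	const uint8_t *p = buf;
	size_t off = 0;
	int handled = 0;

	while (off < len) {
		struct item_hdr hdr;
		uint64_t type;

		if (len - off < sizeof(hdr))
			return -EINVAL;
		memcpy(&hdr, p + off, sizeof(hdr));
		if (hdr.size < sizeof(hdr) || hdr.size > len - off)
			return -EINVAL;

		type = hdr.type & ~ITEM_F_CRITICAL;
		if (item_type_known(type))
			handled++;
		else if (hdr.type & ITEM_F_CRITICAL)
			return -EOPNOTSUPP;
		/* Unknown but optional: skip it silently. */

		/* The next item starts at the next 8-byte boundary. */
		off += ITEM_ALIGN8(hdr.size);
	}
	return handled;
}
```

With this shape, old consumers keep skipping optional items they do not
know, while a sender can force an error for items that must not be ignored.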

> +KDBUS_CMD_BUS_MAKE, and KDBUS_CMD_ENDPOINT_MAKE take a
> +struct kdbus_cmd_make argument.
> +
> +struct kdbus_cmd_make {
> +  __u64 size;
> +    The overall size of the struct, including its items.
> +
> +  __u64 flags;
> +    The flags for creation.
> +
> +    KDBUS_MAKE_ACCESS_GROUP
> +      Make the device file group-accessible
> +
> +    KDBUS_MAKE_ACCESS_WORLD
> +      Make the device file world-accessible

This thing is a file.  What's wrong with using a normal POSIX mode?
(And what do the read, write, and exec modes do?)
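
For comparison, a sketch of how the two flags could be expressed as mode
bits. The flag values are the ones from the quoted enum kdbus_make_flags.
The mapping itself (read/write bits only, world access implying group
access) and the helper name make_flags_to_mode() are assumptions, not
something the patch states.

```c
#include <stdint.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Values as in the quoted enum kdbus_make_flags. */
#define KDBUS_MAKE_ACCESS_GROUP (1ULL << 0)
#define KDBUS_MAKE_ACCESS_WORLD (1ULL << 1)

/* Assumed mapping: owner always gets rw; the flags widen that. */
static mode_t make_flags_to_mode(uint64_t flags)
{
	mode_t mode = S_IRUSR | S_IWUSR;

	if (flags & KDBUS_MAKE_ACCESS_WORLD)
		mode |= S_IRGRP | S_IWGRP | S_IROTH | S_IWOTH;
	else if (flags & KDBUS_MAKE_ACCESS_GROUP)
		mode |= S_IRGRP | S_IWGRP;

	return mode;
}
```

Under these assumptions the flags can only produce 0600, 0660 or 0666,
whereas a mode_t argument could express any combination directly.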

> +
> +
> +6.2 Creating connections
> +------------------------
> +
> +A connection to a bus is created by opening an endpoint file of a bus and
> +becoming an active client with the KDBUS_CMD_HELLO ioctl. Every connected client
> +connection has a unique identifier on the bus and can address messages to every
> +other connection on the same bus by using the peer's connection id as the
> +destination.
> +
> +The KDBUS_CMD_HELLO ioctl takes the following struct as argument.
> +
> +struct kdbus_cmd_hello {
> +  __u64 size;
> +    The overall size of the struct, including all attached items.
> +
> +  __u64 conn_flags;
> +    Flags to apply to this connection:
> +
> +    KDBUS_HELLO_ACCEPT_FD
> +      When this flag is set, the connection can be sent file descriptors
> +      as message payload. If it's not set, any attempt of doing so will
> +      result in -ECOMM on the sender's side.
> +
> +    KDBUS_HELLO_ACTIVATOR
> +      Make this connection an activator (see below). With this bit set,
> +      an item of type KDBUS_ITEM_NAME has to be attached which describes
> +      the well-known name this connection should be an activator for.
> +
> +    KDBUS_HELLO_POLICY_HOLDER
> +      Make this connection a policy holder (see below). With this bit set,
> +      an item of type KDBUS_ITEM_NAME has to be attached which describes
> +      the well-known name this connection should hold a policy for.
> +
> +    KDBUS_HELLO_MONITOR
> +      Make this connection an eaves-dropping connection that receives all
> +      unicast messages sent on the bus. To also receive broadcast messages,
> +      the connection has to upload appropriate matches as well.
> +      This flag is only valid for privileged bus connections.
> +
> +  __u64 attach_flags_send;
> +      Set the bits for metadata this connection permits to be sent to the
> +      receiving peer. Only metadata items that are both allowed to be sent by
> +      the sender and that are requested by the receiver will effectively be
> +      attached to the message eventually. Note, however, that the bus may
> +      optionally enforce some of those bits to be set. If the match fails,
> +      -ECONNREFUSED will be returned. In either case, this field will be set
> +      to the mask of metadata items that are enforced by the bus. The
> +      KDBUS_FLAGS_KERNEL bit will as well be set.
> +
> +  __u64 attach_flags_recv;
> +      Request the attachment of metadata for each message received by this
> +      connection. The metadata actually attached may augment the list
> +      of requested items. See section 13 for more details.
> +
> +  __u64 bus_flags;
> +      Upon successful completion of the ioctl, this member will contain the
> +      flags of the bus it connected to.
> +
> +  __u64 id;
> +      Upon successful completion of the ioctl, this member will contain the
> +      id of the new connection.
> +
> +  __u64 pool_size;
> +      The size of the communication pool, in bytes. The pool can be accessed
> +      by calling mmap() on the file descriptor that was used to issue the
> +      KDBUS_CMD_HELLO ioctl.
> +
> +  struct kdbus_bloom_parameter bloom;
> +      Bloom filter parameter (see below).
> +
> +  __u8 id128[16];
> +      Upon successful completion of the ioctl, this member will contain the
> +      128 bit wide UUID of the connected bus.
> +
> +  struct kdbus_item items[0];
> +      Variable list of items to add optional additional information. The
> +      following items are currently expected/valid:
> +
> +      KDBUS_ITEM_CONN_DESCRIPTION
> +        Contains a string that describes this connection's name, so it can be
> +        identified later.
> +
> +      KDBUS_ITEM_NAME
> +      KDBUS_ITEM_POLICY_ACCESS
> +        For activators and policy holders only, combinations of these two
> +        items describe policy access entries (see section about policy).
> +
> +      KDBUS_ITEM_CREDS
> +      KDBUS_ITEM_SECLABEL
> +        Privileged bus users may submit these types in order to create
> +        connections with faked credentials. The only real use case for this
> +        is a proxy service which acts on behalf of some other tasks. For a
> +        connection that runs in that mode, the message's metadata items will
> +        be limited to what's specified here. See section 13 for more
> +        information.

This is still confusing.  There are multiple places in which metadata
is attached.  Which does this apply to?  And why are only creds and
seclabel listed?


> +
> +6.4 Retrieving information on a connection
> +------------------------------------------
> +
> +The KDBUS_CMD_CONN_INFO ioctl can be used to retrieve credentials and
> +properties of the initial creator of a connection. This ioctl uses the
> +following struct:
> +
> +struct kdbus_cmd_info {
> +  __u64 size;
> +    The overall size of the struct, including the name with its 0-byte string
> +    terminator.
> +
> +  __u64 flags;
> +    Specify which metadata items should be attached to the answer.
> +    See section 13 for more details.
> +
> +    After the ioctl returns, this field will contain the current metadata
> +    attach flags of the connection.
> +
> +  __u64 kernel_flags;
> +    Valid flags for this command, returned by the kernel upon each call.
> +
> +  __u64 id;
> +    The connection's numerical ID to retrieve information for. If set to
> +    non-zero value, the 'name' field is ignored.
> +
> +  __u64 offset;
> +    When the ioctl returns, this value will yield the offset of the connection
> +    information inside the caller's pool.
> +
> +  struct kdbus_item items[0];
> +    The optional item list, containing the well-known name to look up as
> +    a KDBUS_ITEM_OWNED_NAME. Only required if the 'id' field is set to 0.
> +    All other items are currently ignored.
> +};
> +
> +After the ioctl returns, the following struct will be stored in the caller's
> +pool at 'offset'.
> +
> +struct kdbus_info {
> +  __u64 size;
> +    The overall size of the struct, including all its items.
> +
> +  __u64 id;
> +    The connection's unique ID.
> +
> +  __u64 flags;
> +    The connection's flags as specified when it was created.
> +
> +  __u64 kernel_flags;
> +    Valid flags for this command, returned by the kernel upon each call.
> +
> +  struct kdbus_item items[0];
> +    Depending on the 'flags' field in struct kdbus_cmd_info, items of
> +    types KDBUS_ITEM_OWNED_NAME and KDBUS_ITEM_CONN_DESCRIPTION follow
> +    here.
> +};
> +
> +Once the caller is finished with parsing the return buffer, it needs to call
> +KDBUS_CMD_FREE for the offset.
> +
> +
> +6.5 Getting information about a connection's bus creator
> +--------------------------------------------------------
> +
> +The KDBUS_CMD_BUS_CREATOR_INFO ioctl takes the same struct as
> +KDBUS_CMD_CONN_INFO but is used to retrieve information about the creator of
> +the bus the connection is attached to. The metadata returned by this call is
> +collected during the creation of the bus and is never altered afterwards, so
> +it provides pristine information on the task that created the bus, at the
> +moment when it did so.

What's this for?  I understand the need for the creator of busses to
be authenticated, but doing it like this means that anyone who will
*fail* authentication can DoS the authentic creator.

> +
> +7.3 Passing of Payload Data
> +---------------------------
> +
> +When connecting to the bus, receivers request a memory pool of a given size,
> +large enough to carry all backlog of data enqueued for the connection. The
> +pool is internally backed by a shared memory file which can be mmap()ed by
> +the receiver.
> +
> +KDBUS_MSG_PAYLOAD_VEC:
> +  Messages are directly copied by the sending process into the receiver's pool,
> +  that way two peers can exchange data by effectively doing a single-copy from
> +  one process to another, the kernel will not buffer the data anywhere else.
> +
> +KDBUS_MSG_PAYLOAD_MEMFD:
> +  Messages can reference memfd files which contain the data.
> +  memfd files are tmpfs-backed files that allow sealing of the content of the
> +  file, which prevents all writable access to the file content.
> +  Only sealed memfd files are accepted as payload data, which enforces
> +  reliable passing of data; the receiver can assume that neither the sender nor
> +  anyone else can alter the content after the message is sent.

This should specify *which* seals are checked.
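
To illustrate what such a statement could pin down, here is a minimal
sketch of a seal check using the existing memfd interfaces
(memfd_create(), F_ADD_SEALS, F_GET_SEALS). The REQUIRED_SEALS mask and
both helper names are assumptions made up for this example; the quoted
text does not say which seals kdbus actually requires.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <sys/mman.h>
#include <unistd.h>

/* Assumed policy for illustration only: require all four seals. */
#define REQUIRED_SEALS \
	(F_SEAL_SHRINK | F_SEAL_GROW | F_SEAL_WRITE | F_SEAL_SEAL)

/* Returns true if fd carries at least every seal in REQUIRED_SEALS. */
static bool memfd_is_fully_sealed(int fd)
{
	int seals = fcntl(fd, F_GET_SEALS);

	if (seals < 0)
		return false;	/* not a sealable file, or fcntl failed */
	return (seals & REQUIRED_SEALS) == REQUIRED_SEALS;
}

/* Creates a memfd holding len bytes of data and seals it.
 * Returns the new fd, or -1 on error. */
static int make_sealed_memfd(const void *data, size_t len)
{
	int fd;

	assert(data != NULL || len == 0);

	fd = memfd_create("payload", MFD_CLOEXEC | MFD_ALLOW_SEALING);
	if (fd < 0)
		return -1;
	if (write(fd, data, len) != (ssize_t)len ||
	    fcntl(fd, F_ADD_SEALS, REQUIRED_SEALS) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}
```

With a mask like this documented, a receiver would know that neither
the size nor the content of the file can change, and that the seal set
itself is final. A weaker mask would give weaker guarantees, which is
why the exact set matters.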

> +
> +Apart from the sender filling-in the content into memfd files, the data will
> +be passed as zero-copy from one process to another, read-only, shared between
> +the peers.
> +
> +
> +7.4 Receiving messages
> +----------------------
> +
> +Messages are received by the client with the KDBUS_CMD_MSG_RECV ioctl. The
> +endpoint file of the bus supports poll() to wake up the receiving process when
> +new messages are queued up to be received.
> +
> +With the KDBUS_CMD_MSG_RECV ioctl, a struct kdbus_cmd_recv is used.
> +
> +struct kdbus_cmd_recv {
> +  __u64 flags;
> +    Flags to control the receive command.
> +
> +    KDBUS_RECV_PEEK
> +      Just return the location of the next message. Do not install file
> +      descriptors or anything else. This is usually used to determine the
> +      sender of the next queued message.
> +
> +    KDBUS_RECV_DROP
> +      Drop the next message without doing anything else with it, and free the
> +      pool slice. This is a short-cut for KDBUS_RECV_PEEK and KDBUS_CMD_FREE.
> +
> +    KDBUS_RECV_USE_PRIORITY
> +      Use the priority field (see below).
> +
> +  __u64 kernel_flags;
> +    Valid flags for this command, returned by the kernel upon each call.
> +
> +  __s64 priority;
> +      With KDBUS_RECV_USE_PRIORITY set in flags, receive the next message in
> +      the queue with at least the given priority. If no such message is waiting
> +      in the queue, -ENOMSG is returned.
> +
> +  __u64 offset;
> +      Upon return of the ioctl, this field contains the offset in the
> +      receiver's memory pool.
> +};
> +
> +Unless KDBUS_RECV_DROP was passed, and given that the ioctl succeeded, the
> +offset field contains the location of the new message inside the receiver's
> +pool. The message is stored as struct kdbus_msg at this offset, and can be
> +interpreted with the semantics described above.

I'm confused here.  Is sent data written to the pool when send is
called or when recv is called?

If the former, what prevents DoS, especially DoS due to sending too many fds?

If the latter, where is the data buffered in the mean time?

> +
> +Also, if the connection allowed for file descriptor to be passed
> +(KDBUS_HELLO_ACCEPT_FD), and if the message contained any, they will be
> +installed into the receiving process after the KDBUS_CMD_MSG_RECV ioctl
> +returns. The receiving task is obliged to close all of them appropriately.

This makes it sound like fds are installed at receive time.  What
prevents resource exhaustion due to having excessive numbers of fds in
transit (that are presumably not accounted to anyone)?

> +
> +7.5 Canceling messages synchronously waiting for replies
> +--------------------------------------------------------
> +
> +When a connection sends a message with KDBUS_MSG_FLAGS_SYNC_REPLY and
> +blocks while waiting for the reply, the KDBUS_CMD_MSG_CANCEL ioctl can be
> +used on the same file descriptor to cancel the message, based on its cookie.
> +If there are multiple messages with the same cookie that are all synchronously
> +waiting for a reply, all of them will be canceled. Obviously, this is only
> +possible in multi-threaded applications.

What does "cancel the message" mean?  Does it just mean that the wait
for the reply is cancelled?

> +11. Policy
> +===============================================================================
> +
> +A policy database restricts the possibilities of connections to own, see and
> +talk to well-known names. It can be associated with a bus (through a policy
> +holder connection) or a custom endpoint.

ISTM metadata items on bus names should be replaced with policy that
applies to the domain as a whole and governs bus creation.

> +A set of policy rules is described by a name and multiple access rules, defined
> +by the following struct.
> +
> +struct kdbus_policy_access {
> +  __u64 type;  /* USER, GROUP, WORLD */
> +    One of the following.
> +
> +    KDBUS_POLICY_ACCESS_USER
> +      Grant access to a user with the uid stored in the 'id' field.
> +
> +    KDBUS_POLICY_ACCESS_GROUP
> +      Grant access to a user with the gid stored in the 'id' field.
> +
> +    KDBUS_POLICY_ACCESS_WORLD
> +      Grant access to everyone. The 'id' field is ignored.
> +
> +  __u64 access;        /* OWN, TALK, SEE */
> +    The access to grant.
> +
> +    KDBUS_POLICY_SEE
> +      Allow the name to be seen.
> +
> +    KDBUS_POLICY_TALK
> +      Allow the name to be talked to.
> +
> +    KDBUS_POLICY_OWN
> +      Allow the name to be owned.
> +
> +  __u64 id;
> +    For KDBUS_POLICY_ACCESS_USER, stores the uid.
> +    For KDBUS_POLICY_ACCESS_GROUP, stores the gid.
> +};


What happens if there are multiple matches?


> +
> +11.4 TALK access and multiple well-known names per connection
> +-------------------------------------------------------------
> +
> +Note that TALK access is checked against all names of a connection.
> +For example, if a connection owns both 'org.foo.bar' and 'org.blah.baz', and
> +the policy database allows 'org.blah.baz' to be talked to by WORLD, then this
> +permission is also granted to 'org.foo.bar'. That might sound illogical, but
> +after all, we allow messages to be directed to either the ID or a well-known
> +name, and policy is applied to the connection, not the name. In other words,
> +the effective TALK policy for a connection is the most permissive of all names
> +the connection owns.

This does seem illogical.  Does the recipient at least know which
well-known name was addressed?
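
The difference between the two models can be sketched as follows.
struct talk_rule and both functions are hypothetical simplifications
for this example and are not taken from the patch: conn_allows_talk()
follows the "most permissive of all owned names" rule quoted above,
while name_allows_talk() is what a check restricted to the addressed
name would look like.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified policy entry: one well-known name and
 * whether the sender in question was granted TALK access to it. */
struct talk_rule {
	const char *name;
	bool talk_allowed;
};

/* Per-name check: only the rule for the given name counts.
 * Names without a rule grant nothing. */
static bool name_allows_talk(const struct talk_rule *rules, size_t n_rules,
			     const char *name)
{
	assert(rules != NULL || n_rules == 0);

	for (size_t i = 0; i < n_rules; i++)
		if (strcmp(rules[i].name, name) == 0)
			return rules[i].talk_allowed;
	return false;
}

/* Per-connection check as the quoted text describes it: TALK is
 * granted if any name owned by the destination allows it. */
static bool conn_allows_talk(const struct talk_rule *rules, size_t n_rules,
			     const char *const *owned, size_t n_owned)
{
	for (size_t i = 0; i < n_owned; i++)
		if (name_allows_talk(rules, n_rules, owned[i]))
			return true;
	return false;
}
```

With rules that allow 'org.blah.baz' but not 'org.foo.bar', a
connection owning both passes conn_allows_talk() even when the sender
addressed 'org.foo.bar', whereas name_allows_talk() on the addressed
name would refuse.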

> +11.5 Implicit policies
> +----------------------
> +
> +Depending on the type of the endpoint, a set of implicit rules might be
> +enforced. On default endpoints, the following set is enforced:
> +

How do these rules interact with installed policy?

> +  * Privileged connections always override any installed policy. Those
> +    connections could easily install their own policies, so there is no
> +    reason to enforce installed policies.
> +  * Connections can always talk to connections of the same user. This
> +    includes broadcast messages.

Why?  If anyone ever strengthens the concept of identity to include
things other than users (hmm -- there are already groups), this could
be very limiting.

> +  * Connections that own names might send broadcast messages to other
> +    connections that belong to a different user, but only if that
> +    destination connection does not own any name.
> +

This is weird.  It is also differently illogical than the "illogical"
thing above.

How about restricting access per name and making sure that the
receivers check what name was addressed before taking any action?

> +12. Pool
> +===============================================================================
> +
> +A pool for data received from the kernel is installed for every connection of
> +the bus, and is sized according to kdbus_cmd_hello.pool_size. It is accessed
> +when one of the following ioctls is issued:
> +
> +  * KDBUS_CMD_MSG_RECV, to receive a message
> +  * KDBUS_CMD_NAME_LIST, to dump the name registry
> +  * KDBUS_CMD_CONN_INFO, to retrieve information on a connection
> +
> +Internally, the pool is organized in slices, stored in an rb-tree. The offsets
> +returned by either one of the aforementioned ioctls describe offsets inside the
> +pool. In order to make the slice available for subsequent calls, KDBUS_CMD_FREE
> +has to be called on the offset.

Why are you documenting that the slices are stored in an rb-tree?
That's just an implementation detail, right?

> +
> +To access the memory, the caller is expected to mmap() it to its task, like
> +this:
> +
> +  /*
> +   * POOL_SIZE has to be a multiple of PAGE_SIZE, and it must match the
> +   * value that was previously passed in the .pool_size field of struct
> +   * kdbus_cmd_hello.
> +   */
> +
> +  buf = mmap(NULL, POOL_SIZE, PROT_READ, MAP_PRIVATE, conn_fd, 0);
> +

Will mapping with PROT_WRITE fail?  What about MAP_SHARED?

And why are you suggesting MAP_PRIVATE?  That's just strange.

> +
> +13. Metadata
> +===============================================================================
> +
> +When a message is delivered to a receiver connection, it is augmented by
> +metadata items in accordance with the destination's current attach flags. The
> +information stored in those metadata items refers to the sender task at the
> +time of sending the message, so even if any detail of the sender task has
> +already changed upon message reception (or if the sender task does not exist
> +anymore), the information is still preserved and won't be modified until the
> +message is freed.
> +
> +Note that there are two exceptions to the above rules:
> +
> +  a) Kernel generated messages don't have a source connection, so they won't be
> +     augmented.
> +
> +  b) If a connection was created with faked credentials (see section 6.2),
> +     the only attached metadata items are the ones provided by the connection
> +     itself. Other bits in the destination's attach_flags_recv won't have any
> +     effect in such cases.
> +
> +Also, there are two things to be considered by userspace programs regarding
> +those metadata items:
> +
> +  a) Userspace must cope with the fact that it might get more metadata than
> +     it requested. That happens, for example, when a broadcast message is
> +     sent and receivers have different attach flags. Items that haven't been
> +     requested should hence be silently ignored.
> +
> +  b) Userspace might not always get all the metadata items that it
> +     requested. That is because some of those items are only added if a
> +     corresponding kernel feature has been enabled. Also, the two exceptions
> +     described above will lead to fewer items being attached than
> +     requested.
> +
> +
> +13.1 Known item types
> +---------------------
> +
> +The following attach flags are currently supported.
> +
> +  KDBUS_ATTACH_TIMESTAMP
> +    Attaches an item of type KDBUS_ITEM_TIMESTAMP which contains both the
> +    monotonic and the realtime timestamp, taken when the message was
> +    processed on the kernel side.
> +
> +  KDBUS_ATTACH_CREDS
> +    Attaches an item of type KDBUS_ITEM_CREDS, containing credentials as
> +    described in kdbus_creds: the uid, gid, pid, tid and starttime of the task.
> +

As mentioned last time, please remove or justify starttime.

> +  KDBUS_ATTACH_AUXGROUPS
> +    Attaches an item of type KDBUS_ITEM_AUXGROUPS, containing a dynamic
> +    number of auxiliary groups the sending task was a member of.
> +
> +  KDBUS_ATTACH_NAMES
> +    Attaches items of type KDBUS_ITEM_OWNED_NAME, one for each name the sending
> +    connection currently owns. The name and flags are stored in kdbus_item.name
> +    for each of them.
> +

That's interesting.  What's it for?

> +  KDBUS_ATTACH_TID_COMM
> +    Attaches an item of type KDBUS_ITEM_TID_COMM, transporting the sending
> +    task's 'comm', for the tid.  The string is stored in kdbus_item.str.
> +
> +  KDBUS_ATTACH_PID_COMM
> +    Attaches an item of type KDBUS_ITEM_PID_COMM, transporting the sending
> +    task's 'comm', for the pid.  The string is stored in kdbus_item.str.
> +
> +  KDBUS_ATTACH_EXE
> +    Attaches an item of type KDBUS_ITEM_EXE, containing the path to the
> +    executable of the sending task, stored in kdbus_item.str.
> +
> +  KDBUS_ATTACH_CMDLINE
> +    Attaches an item of type KDBUS_ITEM_CMDLINE, containing the command line
> +    arguments of the sending task, as an array of strings, stored in
> +    kdbus_item.str.

Please remove these four items.  They are genuinely useless.  Anything
that uses them for anything is either buggy or should have asked the
sender to put the value in the payload (and immediately wondered why
it was doing that).

> +
> +  KDBUS_ATTACH_CGROUP
> +    Attaches an item of type KDBUS_ITEM_CGROUP with the task's cgroup path.
> +
> +  KDBUS_ATTACH_CAPS
> +    Attaches an item of type KDBUS_ITEM_CAPS, carrying sets of capabilities
> +    that should be accessed via kdbus_item.caps.caps. Also, userspace should
> +    be written in a way that it takes kdbus_item.caps.last_cap into account,
> +    and derive the number of sets and rows from the item size and the reported
> +    number of valid capability bits.
> +

Please remove this, too, or justify its use.

> +  KDBUS_ATTACH_SECLABEL
> +    Attaches an item of type KDBUS_ITEM_SECLABEL, which contains the SELinux
> +    security label of the sending task. Access via kdbus_item->str.
> +

This one, too, and please justify why code that uses it will work in
containers and on non-SELinux systems.

> +  KDBUS_ATTACH_AUDIT
> +    Attaches an item of type KDBUS_ITEM_AUDIT, which contains the audit label
> +    of the sending task. Access via kdbus_item->str.
> +

I will NAK the hell out of this until, at the very least, someone
documents what this means, how to parse it, what its stability rules
are, who is allowed to see the value it contains, why that value will
never evolve to become *more* security sensitive than it is now, etc.

Audit gets to do crazy sh*t because it's restricted to privileged
receivers.  This isn't restricted like that, so it doesn't deserve the
same dispensation.  (And, honestly, I'm not sure that audit really
deserves its free pass on making sense.)

> +  KDBUS_ATTACH_CONN_DESCRIPTION
> +    Attaches an item of type KDBUS_ITEM_CONN_DESCRIPTION that contains the
> +    sender connection's current name in kdbus_item.str.
> +

Which name?  Can't there be several?

> +
> +13.2 Metadata and namespaces
> +----------------------------
> +
> +Metadata such as PIDs, UIDs or GIDs are automatically translated to the
> +namespaces of the domain that is used to send a message over. The namespaces
> +of a domain are pinned at creation time, which is when the filesystem has been
> +mounted.
> +
> +Metadata items that cannot be translated are dropped.

What if the receiver said that the item was mandatory?


Thanks,
Andy

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: kdbus: add documentation
@ 2014-11-21 17:12     ` Andy Lutomirski
  0 siblings, 0 replies; 73+ messages in thread
From: Andy Lutomirski @ 2014-11-21 17:12 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Arnd Bergmann, Eric W. Biederman, One Thousand Gnomes,
	Tom Gundersen, Jiri Kosina, Linux API,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, Daniel Mack, David Herrmann,
	Djalal Harouni

On Thu, Nov 20, 2014 at 9:02 PM, Greg Kroah-Hartman
<gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r@public.gmane.org> wrote:
> From: Daniel Mack <daniel-cYrQPVfZoowdnm+yROfE0A@public.gmane.org>
>
> kdbus is a system for low-latency, low-overhead, easy to use
> interprocess communication (IPC).
>
> The interface to all functions in this driver is implemented through
> ioctls on files exposed through the mount point of a kdbusfs.  This
> patch adds detailed documentation about the kernel level API design.
>
> Signed-off-by: Daniel Mack <daniel-cYrQPVfZoowdnm+yROfE0A@public.gmane.org>
> Signed-off-by: David Herrmann <dh.herrmann-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> Signed-off-by: Djalal Harouni <tixxdz-Umm1ozX2/EEdnm+yROfE0A@public.gmane.org>
> Signed-off-by: Greg Kroah-Hartman <gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r@public.gmane.org>
> ---

> +  Pool:
> +    Each connection allocates a piece of shmem-backed memory that is used
> +    to receive messages and answers to ioctl commands from the kernel. It is
> +    never used to send anything to the kernel. In order to access that memory,
> +    userspace must mmap() it into its task.
> +    See section 12 for more details.

At the risk of opening a can of worms, wouldn't this be much more
useful if you could share a pool between multiple connections?


> +
> +
> +4. Items
> +===============================================================================
> +
> +To flexibly augment transport structures used by kdbus, data blobs of type
> +struct kdbus_item are used. An item has a fixed-sized header that only stores
> +the type of the item and the overall size. The total size is variable and is
> +in some cases defined by the item type, in other cases, they can be of
> +arbitrary length (for instance, a string).
> +
> +In the external kernel API, items are used for many ioctls to transport
> +optional information from userspace to kernelspace. They are also used for
> +information stored in a connection's pool, such as messages, name lists or
> +requested connection information.
> +
> +In all such occasions where items are used as part of the kdbus kernel API,
> +they are embedded in structs that have an overall size of their own, so there
> +can be many of them.
> +
> +The kernel expects all items to be aligned to 8-byte boundaries.
> +
> +A simple iterator in userspace would iterate over the items until the items
> +have reached the embedding structure's overall size. An example implementation
> +of such an iterator can be found in tools/testing/selftests/kdbus/kdbus-util.h.

It looks like many (all?) item consumers ignore unknown items.  This
seems like a compatibility problem.

Would it be better to have a bit in each item that toggles between
"ignore me if you don't recognize me" and "error out if you don't
recognize me"?
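
A rough sketch of what a consumer with such a bit could look like
follows. Everything in it is an assumption made for illustration:
struct item_hdr only mirrors the "size plus type" header described
above and is not the real struct kdbus_item, and ITEM_TYPE_CRITICAL,
ITEM_TYPE_KNOWN_MAX and walk_items() are invented names.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical item header: total size and type, payload follows. */
struct item_hdr {
	uint64_t size;	/* header plus payload, without padding */
	uint64_t type;
};

#define ITEM_ALIGN8(v)		(((v) + 7) & ~(uint64_t)7)
/* Assumed flag bit: consumer must fail if it does not know the type. */
#define ITEM_TYPE_CRITICAL	(UINT64_C(1) << 63)
/* Illustrative only: pretend types 0..2 are the ones we understand. */
#define ITEM_TYPE_KNOWN_MAX	UINT64_C(2)

/* Walks the items in buf[0..len). Returns the number of recognized
 * items, -EINVAL on malformed input, or -EOPNOTSUPP when an unknown
 * item has the critical bit set. */
static int walk_items(const void *buf, size_t len)
{
	const unsigned char *p = buf;
	size_t off = 0;
	int known = 0;

	assert(buf != NULL || len == 0);

	while (off < len) {
		struct item_hdr hdr;
		uint64_t step;

		if (len - off < sizeof(hdr))
			return -EINVAL;
		memcpy(&hdr, p + off, sizeof(hdr));
		if (hdr.size < sizeof(hdr) || hdr.size > len - off)
			return -EINVAL;

		if ((hdr.type & ~ITEM_TYPE_CRITICAL) <= ITEM_TYPE_KNOWN_MAX)
			known++;
		else if (hdr.type & ITEM_TYPE_CRITICAL)
			return -EOPNOTSUPP;
		/* else: unknown but optional, skip silently */

		/* Items start on 8-byte boundaries; the last item may
		 * end without trailing padding. */
		step = ITEM_ALIGN8(hdr.size);
		if (step > len - off)
			step = len - off;
		off += step;
	}
	return known;
}
```

The point of the extra bit is that the sender, not the consumer,
decides whether silently dropping an item is safe.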

> +KDBUS_CMD_BUS_MAKE, and KDBUS_CMD_ENDPOINT_MAKE take a
> +struct kdbus_cmd_make argument.
> +
> +struct kdbus_cmd_make {
> +  __u64 size;
> +    The overall size of the struct, including its items.
> +
> +  __u64 flags;
> +    The flags for creation.
> +
> +    KDBUS_MAKE_ACCESS_GROUP
> +      Make the device file group-accessible
> +
> +    KDBUS_MAKE_ACCESS_WORLD
> +      Make the device file world-accessible

This thing is a file.  What's wrong with using a normal POSIX mode?
(And what do the read, write, and exec modes do?)

> +
> +
> +6.2 Creating connections
> +------------------------
> +
> +A connection to a bus is created by opening an endpoint file of a bus and
> +becoming an active client with the KDBUS_CMD_HELLO ioctl. Every connected client
> +connection has a unique identifier on the bus and can address messages to every
> +other connection on the same bus by using the peer's connection id as the
> +destination.
> +
> +The KDBUS_CMD_HELLO ioctl takes the following struct as argument.
> +
> +struct kdbus_cmd_hello {
> +  __u64 size;
> +    The overall size of the struct, including all attached items.
> +
> +  __u64 conn_flags;
> +    Flags to apply to this connection:
> +
> +    KDBUS_HELLO_ACCEPT_FD
> +      When this flag is set, the connection can be sent file descriptors
> +      as message payload. If it's not set, any attempt to do so will
> +      result in -ECOMM on the sender's side.
> +
> +    KDBUS_HELLO_ACTIVATOR
> +      Make this connection an activator (see below). With this bit set,
> +      an item of type KDBUS_ITEM_NAME has to be attached which describes
> +      the well-known name this connection should be an activator for.
> +
> +    KDBUS_HELLO_POLICY_HOLDER
> +      Make this connection a policy holder (see below). With this bit set,
> +      an item of type KDBUS_ITEM_NAME has to be attached which describes
> +      the well-known name this connection should hold a policy for.
> +
> +    KDBUS_HELLO_MONITOR
> +      Make this connection an eaves-dropping connection that receives all
> +      unicast messages sent on the bus. To also receive broadcast messages,
> +      the connection has to upload appropriate matches as well.
> +      This flag is only valid for privileged bus connections.
> +
> +  __u64 attach_flags_send;
> +      Set the bits for metadata this connection permits to be sent to the
> +      receiving peer. Only metadata items that are both allowed to be sent by
> +      the sender and that are requested by the receiver will effectively be
> +      attached to the message eventually. Note, however, that the bus may
> +      optionally enforce some of those bits to be set. If the match fails,
> +      -ECONNREFUSED will be returned. In either case, this field will be set
> +      to the mask of metadata items that are enforced by the bus. The
> +      KDBUS_FLAGS_KERNEL bit will as well be set.
> +
> +  __u64 attach_flags_recv;
> +      Request the attachment of metadata for each message received by this
> +      connection. The metadata actually attached may augment the list
> +      of requested items. See section 13 for more details.
> +
> +  __u64 bus_flags;
> +      Upon successful completion of the ioctl, this member will contain the
> +      flags of the bus it connected to.
> +
> +  __u64 id;
> +      Upon successful completion of the ioctl, this member will contain the
> +      id of the new connection.
> +
> +  __u64 pool_size;
> +      The size of the communication pool, in bytes. The pool can be accessed
> +      by calling mmap() on the file descriptor that was used to issue the
> +      KDBUS_CMD_HELLO ioctl.
> +
> +  struct kdbus_bloom_parameter bloom;
> +      Bloom filter parameter (see below).
> +
> +  __u8 id128[16];
> +      Upon successful completion of the ioctl, this member will contain the
> +      128 bit wide UUID of the connected bus.
> +
> +  struct kdbus_item items[0];
> +      Variable list of items to add optional additional information. The
> +      following items are currently expected/valid:
> +
> +      KDBUS_ITEM_CONN_DESCRIPTION
> +        Contains a string that describes this connection's name, so it can be
> +        identified later.
> +
> +      KDBUS_ITEM_NAME
> +      KDBUS_ITEM_POLICY_ACCESS
> +        For activators and policy holders only, combinations of these two
> +        items describe policy access entries (see section about policy).
> +
> +      KDBUS_ITEM_CREDS
> +      KDBUS_ITEM_SECLABEL
> +        Privileged bus users may submit these types in order to create
> +        connections with faked credentials. The only real use case for this
> +        is a proxy service which acts on behalf of some other tasks. For a
> +        connection that runs in that mode, the message's metadata items will
> +        be limited to what's specified here. See section 13 for more
> +        information.

This is still confusing.  There are multiple places in which metadata
is attached.  Which does this apply to?  And why are only creds and
seclabel listed?


> +
> +6.4 Retrieving information on a connection
> +------------------------------------------
> +
> +The KDBUS_CMD_CONN_INFO ioctl can be used to retrieve credentials and
> +properties of the initial creator of a connection. This ioctl uses the
> +following struct:
> +
> +struct kdbus_cmd_info {
> +  __u64 size;
> +    The overall size of the struct, including the name with its 0-byte string
> +    terminator.
> +
> +  __u64 flags;
> +    Specify which metadata items should be attached to the answer.
> +    See section 13 for more details.
> +
> +    After the ioctl returns, this field will contain the current metadata
> +    attach flags of the connection.
> +
> +  __u64 kernel_flags;
> +    Valid flags for this command, returned by the kernel upon each call.
> +
> +  __u64 id;
> +    The connection's numerical ID to retrieve information for. If set to
> +    non-zero value, the 'name' field is ignored.
> +
> +  __u64 offset;
> +    When the ioctl returns, this value will yield the offset of the connection
> +    information inside the caller's pool.
> +
> +  struct kdbus_item items[0];
> +    The optional item list, containing the well-known name to look up as
> +    a KDBUS_ITEM_OWNED_NAME. Only required if the 'id' field is set to 0.
> +    All other items are currently ignored.
> +};
> +
> +After the ioctl returns, the following struct will be stored in the caller's
> +pool at 'offset'.
> +
> +struct kdbus_info {
> +  __u64 size;
> +    The overall size of the struct, including all its items.
> +
> +  __u64 id;
> +    The connection's unique ID.
> +
> +  __u64 flags;
> +    The connection's flags as specified when it was created.
> +
> +  __u64 kernel_flags;
> +    Valid flags for this command, returned by the kernel upon each call.
> +
> +  struct kdbus_item items[0];
> +    Depending on the 'flags' field in struct kdbus_cmd_info, items of
> +    types KDBUS_ITEM_OWNED_NAME and KDBUS_ITEM_CONN_DESCRIPTION follow
> +    here.
> +};
> +
> +Once the caller is finished with parsing the return buffer, it needs to call
> +KDBUS_CMD_FREE for the offset.
> +
> +
> +6.5 Getting information about a connection's bus creator
> +--------------------------------------------------------
> +
> +The KDBUS_CMD_BUS_CREATOR_INFO ioctl takes the same struct as
> +KDBUS_CMD_CONN_INFO but is used to retrieve information about the creator of
> +the bus the connection is attached to. The metadata returned by this call is
> +collected during the creation of the bus and is never altered afterwards, so
> +it provides pristine information on the task that created the bus, at the
> +moment when it did so.

What's this for?  I understand the need for the creator of busses to
be authenticated, but doing it like this means that anyone who will
*fail* authentication can DoS the authentic creator.

> +
> +7.3 Passing of Payload Data
> +---------------------------
> +
> +When connecting to the bus, receivers request a memory pool of a given size,
> +large enough to carry all backlog of data enqueued for the connection. The
> +pool is internally backed by a shared memory file which can be mmap()ed by
> +the receiver.
> +
> +KDBUS_MSG_PAYLOAD_VEC:
> +  Messages are directly copied by the sending process into the receiver's pool,
> +  that way two peers can exchange data by effectively doing a single-copy from
> +  one process to another, the kernel will not buffer the data anywhere else.
> +
> +KDBUS_MSG_PAYLOAD_MEMFD:
> +  Messages can reference memfd files which contain the data.
> +  memfd files are tmpfs-backed files that allow sealing of the content of the
> +  file, which prevents all writable access to the file content.
> +  Only sealed memfd files are accepted as payload data, which enforces
> +  reliable passing of data; the receiver can assume that neither the sender nor
> +  anyone else can alter the content after the message is sent.

This should specify *which* seals are checked.
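
For illustration only, here is roughly what a receiver would have to do
if it wanted to verify the seals itself. The set in REQUIRED_SEALS is my
assumption of what "cannot be altered" needs, not a statement of what
kdbus checks, and the helper names are made up for this sketch:

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumed set of seals a receiver cares about; not necessarily kdbus's list. */
#define REQUIRED_SEALS (F_SEAL_WRITE | F_SEAL_SHRINK | F_SEAL_GROW)

/* Return true if fd carries every seal in REQUIRED_SEALS. */
static bool memfd_is_immutable(int fd)
{
	int seals = fcntl(fd, F_GET_SEALS);

	if (seals < 0)
		return false;	/* not a memfd, or sealing is unsupported */
	return (seals & REQUIRED_SEALS) == REQUIRED_SEALS;
}

/* Sender side: create a memfd holding data, seal it, return the fd or -1. */
static int make_sealed_payload(const void *data, size_t len)
{
	int fd = memfd_create("payload", MFD_CLOEXEC | MFD_ALLOW_SEALING);

	if (fd < 0)
		return -1;
	/* F_SEAL_SEAL additionally prevents any later change to the seal set. */
	if (write(fd, data, len) != (ssize_t)len ||
	    fcntl(fd, F_ADD_SEALS, REQUIRED_SEALS | F_SEAL_SEAL) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}
```

If the kernel only checks a subset of these, receivers need to know
which, because they would have to check the rest on their own.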

> +
> +Apart from the sender filling-in the content into memfd files, the data will
> +be passed as zero-copy from one process to another, read-only, shared between
> +the peers.
> +
> +
> +7.4 Receiving messages
> +----------------------
> +
> +Messages are received by the client with the KDBUS_CMD_MSG_RECV ioctl. The
> +endpoint file of the bus supports poll() to wake up the receiving process when
> +new messages are queued up to be received.
> +
> +With the KDBUS_CMD_MSG_RECV ioctl, a struct kdbus_cmd_recv is used.
> +
> +struct kdbus_cmd_recv {
> +  __u64 flags;
> +    Flags to control the receive command.
> +
> +    KDBUS_RECV_PEEK
> +      Just return the location of the next message. Do not install file
> +      descriptors or anything else. This is usually used to determine the
> +      sender of the next queued message.
> +
> +    KDBUS_RECV_DROP
> +      Drop the next message without doing anything else with it, and free the
> +      pool slice. This is a short-cut for KDBUS_RECV_PEEK and KDBUS_CMD_FREE.
> +
> +    KDBUS_RECV_USE_PRIORITY
> +      Use the priority field (see below).
> +
> +  __u64 kernel_flags;
> +    Valid flags for this command, returned by the kernel upon each call.
> +
> +  __s64 priority;
> +      With KDBUS_RECV_USE_PRIORITY set in flags, receive the next message in
> +      the queue with at least the given priority. If no such message is waiting
> +      in the queue, -ENOMSG is returned.
> +
> +  __u64 offset;
> +      Upon return of the ioctl, this field contains the offset in the
> +      receiver's memory pool.
> +};
> +
> +Unless KDBUS_RECV_DROP was passed, and given that the ioctl succeeded, the
> +offset field contains the location of the new message inside the receiver's
> +pool. The message is stored as struct kdbus_msg at this offset, and can be
> +interpreted with the semantics described above.

I'm confused here.  Is sent data written to the pool when send is
called or when recv is called?

If the former, what prevents DoS, especially DoS due to sending too many fds?

If the latter, where is the data buffered in the meantime?

> +
> +Also, if the connection allowed for file descriptors to be passed
> +(KDBUS_HELLO_ACCEPT_FD), and if the message contained any, they will be
> +installed into the receiving process after the KDBUS_CMD_MSG_RECV ioctl
> +returns. The receiving task is obliged to close all of them appropriately.

This makes it sound like fds are installed at receive time.  What
prevents resource exhaustion due to having excessive numbers of fds in
transit (that are presumably not accounted to anyone)?

> +
> +7.5 Canceling messages synchronously waiting for replies
> +--------------------------------------------------------
> +
> +When a connection sends a message with KDBUS_MSG_FLAGS_SYNC_REPLY and
> +blocks while waiting for the reply, the KDBUS_CMD_MSG_CANCEL ioctl can be
> +used on the same file descriptor to cancel the message, based on its cookie.
> +If there are multiple messages with the same cookie that are all synchronously
> +waiting for a reply, all of them will be canceled. Obviously, this is only
> +possible in multi-threaded applications.

What does "cancel the message" mean?  Does it just mean that the wait
for the reply is cancelled?

> +11. Policy
> +===============================================================================
> +
> +A policy database restricts the possibilities of connections to own, see and
> +talk to well-known names. It can be associated with a bus (through a policy
> +holder connection) or a custom endpoint.

ISTM metadata items on bus names should be replaced with policy that
applies to the domain as a whole and governs bus creation.

> +A set of policy rules is described by a name and multiple access rules, defined
> +by the following struct.
> +
> +struct kdbus_policy_access {
> +  __u64 type;  /* USER, GROUP, WORLD */
> +    One of the following.
> +
> +    KDBUS_POLICY_ACCESS_USER
> +      Grant access to a user with the uid stored in the 'id' field.
> +
> +    KDBUS_POLICY_ACCESS_GROUP
> +      Grant access to a user with the gid stored in the 'id' field.
> +
> +    KDBUS_POLICY_ACCESS_WORLD
> +      Grant access to everyone. The 'id' field is ignored.
> +
> +  __u64 access;        /* OWN, TALK, SEE */
> +    The access to grant.
> +
> +    KDBUS_POLICY_SEE
> +      Allow the name to be seen.
> +
> +    KDBUS_POLICY_TALK
> +      Allow the name to be talked to.
> +
> +    KDBUS_POLICY_OWN
> +      Allow the name to be owned.
> +
> +  __u64 id;
> +    For KDBUS_POLICY_ACCESS_USER, stores the uid.
> +    For KDBUS_POLICY_ACCESS_GROUP, stores the gid.
> +};


What happens if there are multiple matches?


> +
> +11.4 TALK access and multiple well-known names per connection
> +-------------------------------------------------------------
> +
> +Note that TALK access is checked against all names of a connection.
> +For example, if a connection owns both 'org.foo.bar' and 'org.blah.baz', and
> +the policy database allows 'org.blah.baz' to be talked to by WORLD, then this
> +permission is also granted to 'org.foo.bar'. That might sound illogical, but
> +after all, we allow messages to be directed to either the name or a well-known
> +name, and policy is applied to the connection, not the name. In other words,
> +the effective TALK policy for a connection is the most permissive of all names
> +the connection owns.

This does seem illogical.  Does the recipient at least know which
well-known name was addressed?
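
To make the difference concrete, here is a toy model of the rule as I
read it, next to a per-name check. The struct and function names are
invented for this sketch and the policy database is reduced to a single
WORLD/TALK bit per name:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified policy entry: one well-known name, WORLD access. */
struct policy_entry {
	const char *name;
	bool world_may_talk;
};

/* Per-name check: does the database grant WORLD TALK to this exact name? */
static bool name_allows_talk(const struct policy_entry *db, size_t n_db,
			     const char *name)
{
	for (size_t i = 0; i < n_db; i++)
		if (strcmp(db[i].name, name) == 0 && db[i].world_may_talk)
			return true;
	return false;
}

/*
 * Rule as documented in 11.4: TALK is granted if ANY name owned by the
 * destination connection allows it, no matter which name was addressed.
 */
static bool conn_allows_talk(const struct policy_entry *db, size_t n_db,
			     const char *const *owned, size_t n_owned)
{
	for (size_t i = 0; i < n_owned; i++)
		if (name_allows_talk(db, n_db, owned[i]))
			return true;
	return false;
}
```

With the example from the text, conn_allows_talk() is true for the
connection as a whole, while name_allows_talk() is false for
'org.foo.bar'. The recipient can only tell these cases apart if it
learns which name the sender used.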

> +11.5 Implicit policies
> +----------------------
> +
> +Depending on the type of the endpoint, a set of implicit rules might be
> +enforced. On default endpoints, the following set is enforced:
> +

How do these rules interact with installed policy?

> +  * Privileged connections always override any installed policy. Those
> +    connections could easily install their own policies, so there is no
> +    reason to enforce installed policies.
> +  * Connections can always talk to connections of the same user. This
> +    includes broadcast messages.

Why?  If anyone ever strengthens the concept of identity to include
things other than users (hmm -- there are already groups), this could
be very limiting.

> +  * Connections that own names might send broadcast messages to other
> +    connections that belong to a different user, but only if that
> +    destination connection does not own any name.
> +

This is weird.  It is also differently illogical than the "illogical"
thing above.

How about restricting access per name and making sure that the
receivers check what name was addressed before taking any action?

> +12. Pool
> +===============================================================================
> +
> +A pool for data received from the kernel is installed for every connection of
> +the bus, and is sized according to kdbus_cmd_hello.pool_size. It is accessed
> +when one of the following ioctls is issued:
> +
> +  * KDBUS_CMD_MSG_RECV, to receive a message
> +  * KDBUS_CMD_NAME_LIST, to dump the name registry
> +  * KDBUS_CMD_CONN_INFO, to retrieve information on a connection
> +
> +Internally, the pool is organized in slices, stored in an rb-tree. The offsets
> +returned by either one of the aforementioned ioctls describe offsets inside the
> +pool. In order to make the slice available for subsequent calls, KDBUS_CMD_FREE
> +has to be called on the offset.

Why are you documenting that the slices are stored in an rb-tree?
That's just an implementation detail, right?

> +
> +To access the memory, the caller is expected to mmap() it to its task, like
> +this:
> +
> +  /*
> +   * POOL_SIZE has to be a multiple of PAGE_SIZE, and it must match the
> +   * value that was previously passed in the .pool_size field of struct
> +   * kdbus_cmd_hello.
> +   */
> +
> +  buf = mmap(NULL, POOL_SIZE, PROT_READ, MAP_PRIVATE, conn_fd, 0);
> +

Will mapping with PROT_WRITE fail?  What about MAP_SHARED?

And why are you suggesting MAP_PRIVATE?  That's just strange.

> +
> +13. Metadata
> +===============================================================================
> +
> +When a message is delivered to a receiver connection, it is augmented by
> +metadata items in accordance with the destination's current attach flags. The
> +information stored in those metadata items refers to the sender task at the
> +time of sending the message, so even if any detail of the sender task has
> +already changed upon message reception (or if the sender task does not exist
> +anymore), the information is still preserved and won't be modified until the
> +message is freed.
> +
> +Note that there are two exceptions to the above rules:
> +
> +  a) Kernel generated messages don't have a source connection, so they won't be
> +     augmented.
> +
> +  b) If a connection was created with faked credentials (see section 6.2),
> +     the only attached metadata items are the ones provided by the connection
> +     itself. Other bits in the destination's attach_flags_recv won't have any
> +     effect in such cases.
> +
> +Also, there are two things to be considered by userspace programs regarding
> +those metadata items:
> +
> +  a) Userspace must cope with the fact that it might get more metadata than
> +     it requested. That happens, for example, when a broadcast message is
> +     sent and receivers have different attach flags. Items that haven't been
> +     requested should hence be silently ignored.
> +
> +  b) Userspace might not always get all the metadata items that it
> +     requested. That is because some of those items are only added if a
> +     corresponding kernel feature has been enabled. Also, the two exceptions
> +     described above will lead to fewer items being attached than
> +     requested.
> +
> +
> +13.1 Known item types
> +---------------------
> +
> +The following attach flags are currently supported.
> +
> +  KDBUS_ATTACH_TIMESTAMP
> +    Attaches an item of type KDBUS_ITEM_TIMESTAMP which contains both the
> +    monotonic and the realtime timestamp, taken when the message was
> +    processed on the kernel side.
> +
> +  KDBUS_ATTACH_CREDS
> +    Attaches an item of type KDBUS_ITEM_CREDS, containing credentials as
> +    described in kdbus_creds: the uid, gid, pid, tid and starttime of the task.
> +

As mentioned last time, please remove or justify starttime.

> +  KDBUS_ATTACH_AUXGROUPS
> +    Attaches an item of type KDBUS_ITEM_AUXGROUPS, containing a dynamic
> +    number of auxiliary groups the sending task was a member of.
> +
> +  KDBUS_ATTACH_NAMES
> +    Attaches items of type KDBUS_ITEM_OWNED_NAME, one for each name the sending
> +    connection currently owns. The name and flags are stored in kdbus_item.name
> +    for each of them.
> +

That's interesting.  What's it for?

> +  KDBUS_ATTACH_TID_COMM
> +    Attaches an item of type KDBUS_ITEM_TID_COMM, transporting the sending
> +    task's 'comm', for the tid.  The string is stored in kdbus_item.str.
> +
> +  KDBUS_ATTACH_PID_COMM
> +    Attaches an item of type KDBUS_ITEM_PID_COMM, transporting the sending
> +    task's 'comm', for the pid.  The string is stored in kdbus_item.str.
> +
> +  KDBUS_ATTACH_EXE
> +    Attaches an item of type KDBUS_ITEM_EXE, containing the path to the
> +    executable of the sending task, stored in kdbus_item.str.
> +
> +  KDBUS_ATTACH_CMDLINE
> +    Attaches an item of type KDBUS_ITEM_CMDLINE, containing the command line
> +    arguments of the sending task, as an array of strings, stored in
> +    kdbus_item.str.

Please remove these four items.  They are genuinely useless.  Anything
that uses them for anything is either buggy or should have asked the
sender to put the value in the payload (and immediately wondered why
it was doing that).

> +
> +  KDBUS_ATTACH_CGROUP
> +    Attaches an item of type KDBUS_ITEM_CGROUP with the task's cgroup path.
> +
> +  KDBUS_ATTACH_CAPS
> +    Attaches an item of type KDBUS_ITEM_CAPS, carrying sets of capabilities
> +    that should be accessed via kdbus_item.caps.caps. Also, userspace should
> +    be written in a way that it takes kdbus_item.caps.last_cap into account,
> +    and derive the number of sets and rows from the item size and the reported
> +    number of valid capability bits.
> +

Please remove this, too, or justify its use.

> +  KDBUS_ATTACH_SECLABEL
> +    Attaches an item of type KDBUS_ITEM_SECLABEL, which contains the SELinux
> +    security label of the sending task. Access via kdbus_item->str.
> +

This one, too, and please justify why code that uses it will work in
containers and on non-selinux systems.

> +  KDBUS_ATTACH_AUDIT
> +    Attaches an item of type KDBUS_ITEM_AUDIT, which contains the audit label
> +    of the sending task. Access via kdbus_item->str.
> +

I will NAK the hell out of this until, at the very least, someone
documents what this means, how to parse it, what its stability rules
are, who is allowed to see the value it contains, why that value will
never evolve to become *more* security sensitive than it is now, etc.

Audit gets to do crazy sh*t because it's restricted to privileged
receivers.  This isn't restricted like that, so it doesn't deserve the
same dispensation.  (And, honestly, I'm not sure that audit really
deserves its free pass on making sense.)

> +  KDBUS_ATTACH_CONN_DESCRIPTION
> +    Attaches an item of type KDBUS_ITEM_CONN_DESCRIPTION that contains the
> +    sender connection's current name in kdbus_item.str.
> +

Which name?  Can't there be several?

> +
> +13.2 Metadata and namespaces
> +----------------------------
> +
> +Metadata such as PIDs, UIDs or GIDs are automatically translated to the
> +namespaces of the domain that is used to send a message over. The namespaces
> +of a domain are pinned at creation time, which is when the filesystem has been
> +mounted.
> +
> +Metadata items that cannot be translated are dropped.

What if the receiver said that the item was mandatory?
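
Whatever the answer is, receivers end up having to check which items
actually arrived. A rough sketch of that check, assuming each item
starts with a 64-bit size (header included) followed by a 64-bit type
and that items are padded to 8 bytes; the struct and function names are
invented here:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed item header: total size in bytes (header included), then type. */
struct item_hdr {
	uint64_t size;
	uint64_t type;
};

#define ITEM_ALIGN8(x) (((x) + 7ULL) & ~7ULL)

/*
 * Walk a buffer of items and report whether one of type 'wanted' exists.
 * Items of any other type are skipped; a malformed size ends the walk.
 */
static bool items_contain(const uint8_t *buf, size_t len, uint64_t wanted)
{
	size_t off = 0;

	while (len - off >= sizeof(struct item_hdr)) {
		struct item_hdr hdr;
		uint64_t step;

		memcpy(&hdr, buf + off, sizeof(hdr));	/* no unaligned loads */
		if (hdr.size < sizeof(hdr) || hdr.size > len - off)
			return false;	/* corrupt or truncated item */
		if (hdr.type == wanted)
			return true;

		step = ITEM_ALIGN8(hdr.size);
		if (step > len - off)
			return false;	/* last item, padding not present */
		off += step;
	}
	return false;
}
```

A receiver that treats an item as mandatory would call something like
this for each such type and reject the message when it returns false,
which is why silently dropping untranslatable items needs to be
documented precisely.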


Thanks,
Andy

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: kdbus: add node and filesystem implementation
@ 2014-11-21 17:55             ` Greg Kroah-Hartman
  0 siblings, 0 replies; 73+ messages in thread
From: Greg Kroah-Hartman @ 2014-11-21 17:55 UTC (permalink / raw)
  To: Sasha Levin
  Cc: David Herrmann, Arnd Bergmann, ebiederm, One Thousand Gnomes,
	Tom Gundersen, Jiri Kosina, Andy Lutomirski, Linux API,
	linux-kernel, Daniel Mack, Djalal Harouni

On Fri, Nov 21, 2014 at 12:03:51PM -0500, Sasha Levin wrote:
> On 11/21/2014 11:56 AM, Greg Kroah-Hartman wrote:
> >>> Maybe it's worth basing your git tree on top of Al's rather than a random
> >>> > > -rc, since it's now a filesystem?
> >> > 
> >> > Sure, sounds good.
> > No, I'll keep it as is, we can handle the merge issues later when it
> > hits Linus's tree, this makes it easier for me and others to test it
> > out properly.
> 
> It should be hitting -next, not Linus's tree. This is why we have an
> integration tree, no?

Yes, it will hit -next, and the merge issue can be resolved there.

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: kdbus: add code to gather metadata
@ 2014-11-21 19:50     ` Andy Lutomirski
  0 siblings, 0 replies; 73+ messages in thread
From: Andy Lutomirski @ 2014-11-21 19:50 UTC (permalink / raw)
  To: Greg Kroah-Hartman, arnd, ebiederm, gnomes, teg, jkosina, luto,
	linux-api, linux-kernel
  Cc: daniel, dh.herrmann, tixxdz

On 11/20/2014 09:02 PM, Greg Kroah-Hartman wrote:
> From: Daniel Mack <daniel@zonque.org>
> 
> A connection chooses which metadata it wants to have attached to each
> message it receives with kdbus_cmd_hello.attach_flags. The metadata
> will be attached as items to the messages. All metadata refers to
> information about the sending task at sending time, unless otherwise
> stated. Also, the metadata is copied, not referenced, so even if the
> sending task doesn't exist anymore at the time the message is received,
> the information is still preserved.

Namespace comments below.

> +
> +static int kdbus_meta_append_cred(struct kdbus_meta *meta,
> +				  const struct kdbus_domain *domain)
> +{
> +	struct kdbus_creds creds = {
> +		.uid = from_kuid_munged(domain->user_namespace, current_uid()),
> +		.gid = from_kgid_munged(domain->user_namespace, current_gid()),
> +		.pid = task_pid_nr_ns(current, domain->pid_namespace),
> +		.tid = task_tgid_nr_ns(current, domain->pid_namespace),

This is better than before -- at least it gets translation right part of
the way.  But it's still wrong if the receiver's namespace doesn't match
the domain.

Also, please move pid and tgid into their own item.  They suck for
reasons that have been beaten to death.  Let's make it possible to
deprecate them separately in the future.

> +		.starttime = current->start_time,

I'm repeating myself here, but... why?

> +static int kdbus_meta_append_auxgroups(struct kdbus_meta *meta,
> +				       const struct kdbus_domain *domain)
> +{
> +	struct group_info *info;
> +	struct kdbus_item *item;
> +	int i, ret = 0;
> +	u64 *gid;
> +
> +	info = get_current_groups();
> +	item = kdbus_meta_append_item(meta, KDBUS_ITEM_AUXGROUPS,
> +				      info->ngroups * sizeof(*gid));
> +	if (IS_ERR(item)) {
> +		ret = PTR_ERR(item);
> +		goto exit_put_groups;
> +	}
> +
> +	gid = (u64 *) item->data;
> +
> +	for (i = 0; i < info->ngroups; i++)
> +		gid[i] = from_kgid_munged(domain->user_namespace,
> +					  GROUP_AT(info, i));

Ditto.

> +static int kdbus_meta_append_exe(struct kdbus_meta *meta)

NAK.

> +{
> +	struct mm_struct *mm = get_task_mm(current);
> +	struct path *exe_path = NULL;
> +	char *pathname;
> +	int ret = 0;
> +	size_t len;
> +	char *tmp;
> +
> +	if (!mm)
> +		return -EFAULT;
> +
> +	down_read(&mm->mmap_sem);
> +	if (mm->exe_file) {
> +		path_get(&mm->exe_file->f_path);
> +		exe_path = &mm->exe_file->f_path;
> +	}
> +	up_read(&mm->mmap_sem);
> +
> +	if (!exe_path)
> +		goto exit_mmput;
> +
> +	tmp = (char *)__get_free_page(GFP_TEMPORARY | __GFP_ZERO);
> +	if (!tmp) {
> +		ret = -ENOMEM;
> +		goto exit_path_put;
> +	}
> +
> +	pathname = d_path(exe_path, tmp, PAGE_SIZE);

My NAK notwithstanding, the namespacing here is completely bogus.

> +static int kdbus_meta_append_cmdline(struct kdbus_meta *meta)

NAK

> +static int kdbus_meta_append_caps(struct kdbus_meta *meta)
> +{
> +	struct caps {
> +		u32 last_cap;
> +		struct {
> +			u32 caps[_KERNEL_CAPABILITY_U32S];
> +		} set[4];
> +	} caps;
> +	unsigned int i;
> +	const struct cred *cred = current_cred();
> +
> +	caps.last_cap = CAP_LAST_CAP;
> +
> +	for (i = 0; i < _KERNEL_CAPABILITY_U32S; i++) {
> +		caps.set[0].caps[i] = cred->cap_inheritable.cap[i];
> +		caps.set[1].caps[i] = cred->cap_permitted.cap[i];
> +		caps.set[2].caps[i] = cred->cap_effective.cap[i];
> +		caps.set[3].caps[i] = cred->cap_bset.cap[i];
> +	}

Please leave this in so that I can root every single kdbus-using system.
 It'll be lots of fun.

Snark aside, the correct fix is IMO to delete this function entirely.
Even if you could find a way to implement it safely (which will be
distinctly nontrivial), it seems like a bad idea to begin with.

> +#ifdef CONFIG_CGROUPS
> +static int kdbus_meta_append_cgroup(struct kdbus_meta *meta)
> +{
> +	char *buf, *path;
> +	int ret;
> +
> +	buf = (char *)__get_free_page(GFP_TEMPORARY | __GFP_ZERO);
> +	if (!buf)
> +		return -ENOMEM;
> +
> +	path = task_cgroup_path(current, buf, PAGE_SIZE);

This may have strange interactions with cgroupns.  It's fixable, though,
but only once you implement translation at receive time, and I think
you'll have to do that to get any of this to work right.

> +
> +	if (path)
> +		ret = kdbus_meta_append_str(meta, KDBUS_ITEM_CGROUP, path);
> +	else
> +		ret = -ENAMETOOLONG;
> +
> +	free_page((unsigned long) buf);
> +
> +	return ret;
> +}
> +#endif
> +
> +#ifdef CONFIG_AUDITSYSCALL
> +static int kdbus_meta_append_audit(struct kdbus_meta *meta,
> +				   const struct kdbus_domain *domain)
> +{
> +	struct kdbus_audit audit;
> +
> +	audit.loginuid = from_kuid_munged(domain->user_namespace,
> +					  audit_get_loginuid(current));
> +	audit.sessionid = audit_get_sessionid(current);
> +
> +	return kdbus_meta_append_data(meta, KDBUS_ITEM_AUDIT,
> +				      &audit, sizeof(audit));

So *that's* what audit means.  Please document this and consider
renaming it to something like AUDIT_LOGINUID_AND_SESSIONID.

> +#ifdef CONFIG_SECURITY
> +static int kdbus_meta_append_seclabel(struct kdbus_meta *meta)
> +{
> +	u32 len, sid;
> +	char *label;
> +	int ret;
> +
> +	security_task_getsecid(current, &sid);
> +	ret = security_secid_to_secctx(sid, &label, &len);
> +	if (ret == -EOPNOTSUPP)
> +		return 0;
> +	if (ret < 0)
> +		return ret;
> +
> +	if (label && len > 0)
> +		ret = kdbus_meta_append_data(meta, KDBUS_ITEM_SECLABEL,
> +					     label, len);

This thing needs a clear, valid use case.  I think that the use case
should document how non-enforcing mode is supposed to work, too.

Also, there should be a justification for why the LSM hooks by
themselves aren't good enough to remove the need for this.

> +
> +/**
> + * kdbus_meta_size() - calculate the size of an excerpt of a metadata db
> + * @meta:	The database object containing the metadata

What is a "database object containing the metadata"?

Anyway, this is unreviewable because not only is the context in which
this function called not specified in this patch, but the function has
no callers in this patch.  However...

> + * @conn_dst:	The connection that is about to receive the data

...this suggests that this is called in the recipient's context, so...

> + * @mask:	Pointer to KDBUS_ATTACH_* bitmask to calculate the size for.
> + *		Callers *must* use the same mask for calls to
> + *		kdbus_meta_write().
> + *
> + * Return: the size in bytes the masked data will consume. Data that should
> + * not be received by @conn_dst will be filtered out.
> + */
> +size_t kdbus_meta_size(const struct kdbus_meta *meta,
> +		       const struct kdbus_conn *conn_dst,
> +		       u64 *mask)
> +{
> +	struct kdbus_domain *domain = conn_dst->ep->bus->domain;
> +	const struct kdbus_item *item;
> +	size_t size = 0;
> +
> +	/*
> +	 * We currently don't have a way to translate capability flags between
> +	 * user namespaces, so let's drop these items in such cases.
> +	 */
> +	if (domain->user_namespace != current_user_ns())
> +		*mask &= ~KDBUS_ATTACH_CAPS;

...this is still completely wrong, and I'll be able to root kdbus systems :)

> +
> +	/*
> +	 * If the domain was created with hide_pid enabled, drop all items
> +	 * except for such not revealing anything about the task.
> +	 */
> +	if (domain->pid_namespace->hide_pid)
> +		*mask &= KDBUS_ATTACH_TIMESTAMP | KDBUS_ATTACH_NAMES |
> +			 KDBUS_ATTACH_CONN_DESCRIPTION;

Huh?  This looks wrong.

I realize that some of the systemd people seem to think that hide_pid is
unusable right now, but it really does seem to be usable.  Doing this,
however, will indeed make it unusable.

What are you trying to do here?  I think that the correct fix is to
remove support for all of the questionable metadata items and get rid of
this check.
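
To spell out what the two checks do to a receiver's request, here is the
mask arithmetic in isolation. The bit values and the EX_ names are made
up for this sketch; only the flag names they mirror come from the patch:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit assignments, standing in for KDBUS_ATTACH_*. */
#define EX_ATTACH_TIMESTAMP		(1ULL << 0)
#define EX_ATTACH_CREDS			(1ULL << 1)
#define EX_ATTACH_NAMES			(1ULL << 2)
#define EX_ATTACH_CAPS			(1ULL << 3)
#define EX_ATTACH_CONN_DESCRIPTION	(1ULL << 4)

/* Apply the same two filters as kdbus_meta_size() to a requested mask. */
static uint64_t filter_mask(uint64_t requested, bool foreign_userns,
			    bool hide_pid)
{
	uint64_t mask = requested;

	/* First check: clear exactly one bit. */
	if (foreign_userns)
		mask &= ~EX_ATTACH_CAPS;

	/* Second check: keep only three bits, drop everything else. */
	if (hide_pid)
		mask &= EX_ATTACH_TIMESTAMP | EX_ATTACH_NAMES |
			EX_ATTACH_CONN_DESCRIPTION;

	return mask;
}
```

So with hide_pid set, a receiver that asked for creds gets no creds at
all, not even the uid, which is far more than hide_pid hides in /proc.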

--Andy

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: kdbus: add code to gather metadata
@ 2014-11-21 19:50     ` Andy Lutomirski
  0 siblings, 0 replies; 73+ messages in thread
From: Andy Lutomirski @ 2014-11-21 19:50 UTC (permalink / raw)
  To: Greg Kroah-Hartman, arnd-r2nGTMty4D4,
	ebiederm-aS9lmoZGLiVWk0Htik3J/w,
	gnomes-qBU/x9rampVanCEyBjwyrvXRex20P6io, teg-B22kvLQNl6c,
	jkosina-AlSwsSmVLrQ, luto-kltTT9wpgjJwATOyAt5JVQ,
	linux-api-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA
  Cc: daniel-cYrQPVfZoowdnm+yROfE0A,
	dh.herrmann-Re5JQEeQqe8AvxtiuMwx3w,
	tixxdz-Umm1ozX2/EEdnm+yROfE0A

On 11/20/2014 09:02 PM, Greg Kroah-Hartman wrote:
> From: Daniel Mack <daniel-cYrQPVfZoowdnm+yROfE0A@public.gmane.org>
> 
> A connection chooses which metadata it wants to have attached to each
> message it receives with kdbus_cmd_hello.attach_flags. The metadata
> will be attached as items to the messages. All metadata refers to
> information about the sending task at sending time, unless otherwise
> stated. Also, the metadata is copied, not referenced, so even if the
> sending task doesn't exist anymore at the time the message is received,
> the information is still preserved.

Namespace comments below.

> +
> +static int kdbus_meta_append_cred(struct kdbus_meta *meta,
> +				  const struct kdbus_domain *domain)
> +{
> +	struct kdbus_creds creds = {
> +		.uid = from_kuid_munged(domain->user_namespace, current_uid()),
> +		.gid = from_kgid_munged(domain->user_namespace, current_gid()),
> +		.pid = task_pid_nr_ns(current, domain->pid_namespace),
> +		.tid = task_tgid_nr_ns(current, domain->pid_namespace),

This is better than before -- at least it gets translation right part of
the way.  But it's still wrong if the receiver's namespace doesn't match
the domain.

Also, please move pid and tgid into their own item.  They suck for
reasons that have been beaten to death.  Let's make it possible to
deprecate them separately in the future.

> +		.starttime = current->start_time,

I'm repeating myself here, but... why?

> +static int kdbus_meta_append_auxgroups(struct kdbus_meta *meta,
> +				       const struct kdbus_domain *domain)
> +{
> +	struct group_info *info;
> +	struct kdbus_item *item;
> +	int i, ret = 0;
> +	u64 *gid;
> +
> +	info = get_current_groups();
> +	item = kdbus_meta_append_item(meta, KDBUS_ITEM_AUXGROUPS,
> +				      info->ngroups * sizeof(*gid));
> +	if (IS_ERR(item)) {
> +		ret = PTR_ERR(item);
> +		goto exit_put_groups;
> +	}
> +
> +	gid = (u64 *) item->data;
> +
> +	for (i = 0; i < info->ngroups; i++)
> +		gid[i] = from_kgid_munged(domain->user_namespace,
> +					  GROUP_AT(info, i));

Ditto.

> +static int kdbus_meta_append_exe(struct kdbus_meta *meta)

NAK.

> +{
> +	struct mm_struct *mm = get_task_mm(current);
> +	struct path *exe_path = NULL;
> +	char *pathname;
> +	int ret = 0;
> +	size_t len;
> +	char *tmp;
> +
> +	if (!mm)
> +		return -EFAULT;
> +
> +	down_read(&mm->mmap_sem);
> +	if (mm->exe_file) {
> +		path_get(&mm->exe_file->f_path);
> +		exe_path = &mm->exe_file->f_path;
> +	}
> +	up_read(&mm->mmap_sem);
> +
> +	if (!exe_path)
> +		goto exit_mmput;
> +
> +	tmp = (char *)__get_free_page(GFP_TEMPORARY | __GFP_ZERO);
> +	if (!tmp) {
> +		ret = -ENOMEM;
> +		goto exit_path_put;
> +	}
> +
> +	pathname = d_path(exe_path, tmp, PAGE_SIZE);

My NAK notwithstanding, the namespacing here is completely bogus.

> +static int kdbus_meta_append_cmdline(struct kdbus_meta *meta)

NAK

> +static int kdbus_meta_append_caps(struct kdbus_meta *meta)
> +{
> +	struct caps {
> +		u32 last_cap;
> +		struct {
> +			u32 caps[_KERNEL_CAPABILITY_U32S];
> +		} set[4];
> +	} caps;
> +	unsigned int i;
> +	const struct cred *cred = current_cred();
> +
> +	caps.last_cap = CAP_LAST_CAP;
> +
> +	for (i = 0; i < _KERNEL_CAPABILITY_U32S; i++) {
> +		caps.set[0].caps[i] = cred->cap_inheritable.cap[i];
> +		caps.set[1].caps[i] = cred->cap_permitted.cap[i];
> +		caps.set[2].caps[i] = cred->cap_effective.cap[i];
> +		caps.set[3].caps[i] = cred->cap_bset.cap[i];
> +	}

Please leave this in so that I can root every single kdbus-using system.
 It'll be lots of fun.

Snark aside, the correct fix is IMO to delete this function entirely.
Even if you could find a way to implement it safely (which will be
distinctly nontrivial), it seems like a bad idea to begin with.

> +#ifdef CONFIG_CGROUPS
> +static int kdbus_meta_append_cgroup(struct kdbus_meta *meta)
> +{
> +	char *buf, *path;
> +	int ret;
> +
> +	buf = (char *)__get_free_page(GFP_TEMPORARY | __GFP_ZERO);
> +	if (!buf)
> +		return -ENOMEM;
> +
> +	path = task_cgroup_path(current, buf, PAGE_SIZE);

This may have strange interactions with cgroupns.  It's fixable, though,
but only once you implement translation at receive time, and I think
you'll have to do that to get any of this to work right.

> +
> +	if (path)
> +		ret = kdbus_meta_append_str(meta, KDBUS_ITEM_CGROUP, path);
> +	else
> +		ret = -ENAMETOOLONG;
> +
> +	free_page((unsigned long) buf);
> +
> +	return ret;
> +}
> +#endif
> +
> +#ifdef CONFIG_AUDITSYSCALL
> +static int kdbus_meta_append_audit(struct kdbus_meta *meta,
> +				   const struct kdbus_domain *domain)
> +{
> +	struct kdbus_audit audit;
> +
> +	audit.loginuid = from_kuid_munged(domain->user_namespace,
> +					  audit_get_loginuid(current));
> +	audit.sessionid = audit_get_sessionid(current);
> +
> +	return kdbus_meta_append_data(meta, KDBUS_ITEM_AUDIT,
> +				      &audit, sizeof(audit));

So *that's* what audit means.  Please document this and consider
renaming it to something like AUDIT_LOGINUID_AND_SESSIONID.

> +#ifdef CONFIG_SECURITY
> +static int kdbus_meta_append_seclabel(struct kdbus_meta *meta)
> +{
> +	u32 len, sid;
> +	char *label;
> +	int ret;
> +
> +	security_task_getsecid(current, &sid);
> +	ret = security_secid_to_secctx(sid, &label, &len);
> +	if (ret == -EOPNOTSUPP)
> +		return 0;
> +	if (ret < 0)
> +		return ret;
> +
> +	if (label && len > 0)
> +		ret = kdbus_meta_append_data(meta, KDBUS_ITEM_SECLABEL,
> +					     label, len);

This thing needs a clear, valid use case.  I think that the use case
should document how non-enforcing mode is supposed to work, too.

Also, there should be a justification for why the LSM hooks by
themselves aren't good enough to remove the need for this.

> +
> +/**
> + * kdbus_meta_size() - calculate the size of an excerpt of a metadata db
> + * @meta:	The database object containing the metadata

What is a "database object containing the metadata"?

Anyway, this is unreviewable because not only is the context in which
this function called not specified in this patch, but the function has
no callers in this patch.  However...

> + * @conn_dst:	The connection that is about to receive the data

...this suggests that this is called in the recipient's context, so...

> + * @mask:	Pointer to KDBUS_ATTACH_* bitmask to calculate the size for.
> + *		Callers *must* use the same mask for calls to
> + *		kdbus_meta_write().
> + *
> + * Return: the size in bytes the masked data will consume. Data that should
> + * not received by @conn_dst will be filtered out.
> + */
> +size_t kdbus_meta_size(const struct kdbus_meta *meta,
> +		       const struct kdbus_conn *conn_dst,
> +		       u64 *mask)
> +{
> +	struct kdbus_domain *domain = conn_dst->ep->bus->domain;
> +	const struct kdbus_item *item;
> +	size_t size = 0;
> +
> +	/*
> +	 * We currently don't have a way to translate capability flags between
> +	 * user namespaces, so let's drop these items in such cases.
> +	 */
> +	if (domain->user_namespace != current_user_ns())
> +		*mask &= ~KDBUS_ATTACH_CAPS;

...this is still completely wrong, and I'll be able to root kdbus systems :)

> +
> +	/*
> +	 * If the domain was created with hide_pid enabled, drop all items
> +	 * except for such not revealing anything about the task.
> +	 */
> +	if (domain->pid_namespace->hide_pid)
> +		*mask &= KDBUS_ATTACH_TIMESTAMP | KDBUS_ATTACH_NAMES |
> +			 KDBUS_ATTACH_CONN_DESCRIPTION;

Huh?  This looks wrong.

I realize that some of the systemd people seem to think that hide_pid is
unusable right now, but it really does seem to be usable.  Doing this,
however, will indeed make it unusable.

What are you trying to do here?  I think that the correct fix is to
remove support for all of the questionable metadata items and get rid of
this check.

--Andy

* Re: kdbus: add documentation
  2014-11-21 17:12     ` Andy Lutomirski
  (?)
@ 2014-11-24 20:16     ` David Herrmann
  2014-11-24 20:57       ` Andy Lutomirski
  -1 siblings, 1 reply; 73+ messages in thread
From: David Herrmann @ 2014-11-24 20:16 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Greg Kroah-Hartman, Arnd Bergmann, Eric W. Biederman,
	One Thousand Gnomes, Tom Gundersen, Jiri Kosina, Linux API,
	linux-kernel, Daniel Mack, Djalal Harouni

Hi Andy!

On Fri, Nov 21, 2014 at 6:12 PM, Andy Lutomirski <luto@amacapital.net> wrote:
> On Thu, Nov 20, 2014 at 9:02 PM, Greg Kroah-Hartman
> <gregkh@linuxfoundation.org> wrote:
>> From: Daniel Mack <daniel@zonque.org>
>>
>> kdbus is a system for low-latency, low-overhead, easy to use
>> interprocess communication (IPC).
>>
>> The interface to all functions in this driver is implemented through
>> ioctls on files exposed through the mount point of a kdbusfs.  This
>> patch adds detailed documentation about the kernel level API design.
>>
>> Signed-off-by: Daniel Mack <daniel@zonque.org>
>> Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
>> Signed-off-by: Djalal Harouni <tixxdz@opendz.org>
>> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>> ---
>
>> +  Pool:
>> +    Each connection allocates a piece of shmem-backed memory that is used
>> +    to receive messages and answers to ioctl command from the kernel. It is
>> +    never used to send anything to the kernel. In order to access that memory,
>> +    userspace must mmap() it into its task.
>> +    See section 12 for more details.
>
> At the risk of opening a can of worms, wouldn't this be much more
> useful if you could share a pool between multiple connections?

Within a process it could theoretically be possible to share the same
memory pool between multiple connections made by the process. However,
note that normally a process only has a single connection to the bus
open (possibly two, if it opens a connection to both the system and
the user bus). Now, sharing the receiver buffer could certainly be
considered an optimization, but it would have no effect on
"usefulness", though, as just allocating space from a single shared
per-process receiver won't give you any new possibilities...

We have thought about this, but decided to delay it for now. Shared
pools can easily be added as an extension later on.

[snip]
>> +KDBUS_CMD_BUS_MAKE, and KDBUS_CMD_ENDPOINT_MAKE take a
>> +struct kdbus_cmd_make argument.
>> +
>> +struct kdbus_cmd_make {
>> +  __u64 size;
>> +    The overall size of the struct, including its items.
>> +
>> +  __u64 flags;
>> +    The flags for creation.
>> +
>> +    KDBUS_MAKE_ACCESS_GROUP
>> +      Make the device file group-accessible
>> +
>> +    KDBUS_MAKE_ACCESS_WORLD
>> +      Make the device file world-accessible
>
> This thing is a file.  What's wrong with using a normal POSIX mode?
> (And what do the read, write, and exec modes do?)

Domains and buses are directories, endpoints are files. Domains also
create control-files implicitly.

For kdbus clients there is just access or no access, but no
distinction between read, write and execute access. Due to that we
just break this down to per-group and world access bits, since doing
more is pointless, and we shouldn't allow shoehorning more stuff into
the access mode.

[snip]
>> +      KDBUS_ITEM_CREDS
>> +      KDBUS_ITEM_SECLABEL
>> +        Privileged bus users may submit these types in order to create
>> +        connections with faked credentials. The only real use case for this
>> +        is a proxy service which acts on behalf of some other tasks. For a
>> +        connection that runs in that mode, the message's metadata items will
>> +        be limited to what's specified here. See section 13 for more
>> +        information.
>
> This is still confusing.  There are multiple places in which metadata
> is attached.  Which does this apply to?  And why are only creds and
> seclabel listed?

Yes, and there are multiple places where metadata is *gathered*. This
ioctl creates connections, so only the items that are actually
*gathered* by that ioctl are documented here. These items are not part
of any messages, but are used as identification of the connection
owner (and in this particular case, to allow privileged proxies to
overwrite the items so they can properly proxy a legacy-dbus peer
connection).

[snip]
>> +6.5 Getting information about a connection's bus creator
>> +--------------------------------------------------------
>> +
>> +The KDBUS_CMD_BUS_CREATOR_INFO ioctl takes the same struct as
>> +KDBUS_CMD_CONN_INFO but is used to retrieve information about the creator of
>> +the bus the connection is attached to. The metadata returned by this call is
>> +collected during the creation of the bus and is never altered afterwards, so
>> +it provides pristine information on the task that created the bus, at the
>> +moment when it did so.
>
> What's this for?  I understand the need for the creator of busses to
> be authenticated, but doing it like this means that anyone who will
> *fail* authentication can DoS the authentic creator.

This returns information on a bus owner, to determine whether a
connection is connected to a system, user or session bus. Note that
the bus-creator itself is not a valid peer on the bus, so you cannot
send messages to them. Which kind of DoS do you have in mind?

>> +
>> +7.3 Passing of Payload Data
>> +---------------------------
>> +
>> +When connecting to the bus, receivers request a memory pool of a given size,
>> +large enough to carry all backlog of data enqueued for the connection. The
>> +pool is internally backed by a shared memory file which can be mmap()ed by
>> +the receiver.
>> +
>> +KDBUS_MSG_PAYLOAD_VEC:
>> +  Messages are directly copied by the sending process into the receiver's pool,
>> +  that way two peers can exchange data by effectively doing a single-copy from
>> +  one process to another, the kernel will not buffer the data anywhere else.
>> +
>> +KDBUS_MSG_PAYLOAD_MEMFD:
>> +  Messages can reference memfd files which contain the data.
>> +  memfd files are tmpfs-backed files that allow sealing of the content of the
>> +  file, which prevents all writable access to the file content.
>> +  Only sealed memfd files are accepted as payload data, which enforces
>> +  reliable passing of data; the receiver can assume that neither the sender nor
>> +  anyone else can alter the content after the message is sent.
>
> This should specify *which* seals are checked.

True. Will be added to the documentation. For the record, it's the
full set of seals right now, so
F_SEAL_SHRINK|F_SEAL_GROW|F_SEAL_WRITE|F_SEAL_SEAL.
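
For illustration only, here is a minimal userspace sketch of what "the
full set of seals" looks like from the sender's side. It uses the
regular memfd_create() and fcntl() sealing calls; the helper names and
the memfd name "payload" are made up for this example and are not part
of kdbus.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

/* The four seals named above. */
#define FULL_SEALS (F_SEAL_SHRINK | F_SEAL_GROW | F_SEAL_WRITE | F_SEAL_SEAL)

/*
 * Hypothetical helper: create a memfd holding len bytes copied from data,
 * then apply all four seals. Returns the fd, or -1 on error.
 */
static int make_sealed_memfd(const void *data, size_t len)
{
	int fd;

	assert(data != NULL);
	fd = memfd_create("payload", MFD_CLOEXEC | MFD_ALLOW_SEALING);
	if (fd < 0)
		return -1;
	if (write(fd, data, len) != (ssize_t)len ||
	    fcntl(fd, F_ADD_SEALS, FULL_SEALS) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/* Hypothetical helper: returns 1 if fd carries every seal in FULL_SEALS. */
static int is_fully_sealed(int fd)
{
	int seals = fcntl(fd, F_GET_SEALS);

	return seals >= 0 && (seals & FULL_SEALS) == FULL_SEALS;
}
```

Once sealed like this, a later write() on the fd fails, which is the
property the receiver relies on.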

>> +
>> +Apart from the sender filling-in the content into memfd files, the data will
>> +be passed as zero-copy from one process to another, read-only, shared between
>> +the peers.
>> +
>> +
>> +7.4 Receiving messages
>> +----------------------
>> +
>> +Messages are received by the client with the KDBUS_CMD_MSG_RECV ioctl. The
>> +endpoint file of the bus supports poll() to wake up the receiving process when
>> +new messages are queued up to be received.
>> +
>> +With the KDBUS_CMD_MSG_RECV ioctl, a struct kdbus_cmd_recv is used.
>> +
>> +struct kdbus_cmd_recv {
>> +  __u64 flags;
>> +    Flags to control the receive command.
>> +
>> +    KDBUS_RECV_PEEK
>> +      Just return the location of the next message. Do not install file
>> +      descriptors or anything else. This is usually used to determine the
>> +      sender of the next queued message.
>> +
>> +    KDBUS_RECV_DROP
>> +      Drop the next message without doing anything else with it, and free the
>> +      pool slice. This is a short-cut for KDBUS_RECV_PEEK and KDBUS_CMD_FREE.
>> +
>> +    KDBUS_RECV_USE_PRIORITY
>> +      Use the priority field (see below).
>> +
>> +  __u64 kernel_flags;
>> +    Valid flags for this command, returned by the kernel upon each call.
>> +
>> +  __s64 priority;
>> +      With KDBUS_RECV_USE_PRIORITY set in flags, receive the next message in
>> +      the queue with at least the given priority. If no such message is waiting
>> +      in the queue, -ENOMSG is returned.
>> +
>> +  __u64 offset;
>> +      Upon return of the ioctl, this field contains the offset in the
>> +      receiver's memory pool.
>> +};
>> +
>> +Unless KDBUS_RECV_DROP was passed, and given that the ioctl succeeded, the
>> +offset field contains the location of the new message inside the receiver's
>> +pool. The message is stored as struct kdbus_msg at this offset, and can be
>> +interpreted with the semantics described above.
>
> I'm confused here.  Is sent data written to the pool when send is
> called or when recv is called?

When send is called.

> If the former, what prevents DoS, especially DoS due to sending too many fds?

FDs are installed into the task only when the receiver issues
CMD_RECV, so a task won't explode unless it issues that ioctl. Plus,
receiving FDs is already a per-connection opt-in. Furthermore, you can
CMD_PEEK messages which allows you to look at the full message
_without_ installing FDs. If you don't want the FDs, you can CMD_DROP
the message.

> If the latter, where is the data buffered in the mean time?
>
>> +
>> +Also, if the connection allowed for file descriptor to be passed
>> +(KDBUS_HELLO_ACCEPT_FD), and if the message contained any, they will be
>> +installed into the receiving process after the KDBUS_CMD_MSG_RECV ioctl
>> +returns. The receiving task is obliged to close all of them appropriately.
>
> This makes it sound like fds are installed at receive time.  What
> prevents resource exhaustion due to having excessive numbers of fds in
> transit (that are presumably not accounted to anyone)?

We have a per-user message accounting for undelivered messages, as
well as a maximum number of pending messages per connection on the
receiving end. These limits are accounted on a "user<->user" basis, so
the limit of a user A will not affect two other users (B and C)
talking.

>> +
>> +7.5 Canceling messages synchronously waiting for replies
>> +--------------------------------------------------------
>> +
>> +When a connection sends a message with KDBUS_MSG_FLAGS_SYNC_REPLY and
>> +blocks while waiting for the reply, the KDBUS_CMD_MSG_CANCEL ioctl can be
>> +used on the same file descriptor to cancel the message, based on its cookie.
>> +If there are multiple messages with the same cookie that are all synchronously
>> +waiting for a reply, all of them will be canceled. Obviously, this is only
>> +possible in multi-threaded applications.
>
> What does "cancel the message" mean?  Does it just mean that the wait
> for the reply is cancelled?

Yes, exactly. That will be made clearer in the docs. Thanks.

>> +11. Policy
>> +===============================================================================
>> +
>> +A policy database restricts the possibilities of connections to own, see and
>> +talk to well-known names. It can be associated with a bus (through a policy
>> +holder connection) or a custom endpoint.
>
> ISTM metadata items on bus names should be replaced with policy that
> applies to the domain as a whole and governs bus creation.

No, well-known names are bound to buses, so a bus is really the right
place to hold policy about which process is allowed to claim them.
Every user is allowed to create a bus of its own, there's no policy
for that, and there shouldn't be.

It has nothing to do with metadata items.

>> +A set of policy rules is described by a name and multiple access rules, defined
>> +by the following struct.
>> +
>> +struct kdbus_policy_access {
>> +  __u64 type;  /* USER, GROUP, WORLD */
>> +    One of the following.
>> +
>> +    KDBUS_POLICY_ACCESS_USER
>> +      Grant access to a user with the uid stored in the 'id' field.
>> +
>> +    KDBUS_POLICY_ACCESS_GROUP
>> +      Grant access to a user with the gid stored in the 'id' field.
>> +
>> +    KDBUS_POLICY_ACCESS_WORLD
>> +      Grant access to everyone. The 'id' field is ignored.
>> +
>> +  __u64 access;        /* OWN, TALK, SEE */
>> +    The access to grant.
>> +
>> +    KDBUS_POLICY_SEE
>> +      Allow the name to be seen.
>> +
>> +    KDBUS_POLICY_TALK
>> +      Allow the name to be talked to.
>> +
>> +    KDBUS_POLICY_OWN
>> +      Allow the name to be owned.
>> +
>> +  __u64 id;
>> +    For KDBUS_POLICY_ACCESS_USER, stores the uid.
>> +    For KDBUS_POLICY_ACCESS_GROUP, stores the gid.
>> +};
>
>
> What happens if there are multiple matches?

We only have _granting_ policy entries. We search through the
policy-db until we find an entry that grants access. We do _not_ stop
on the first item that matches.
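
As a rough sketch of that lookup (not the actual kdbus code; the types
and names below are invented, and access levels are compared for
equality only, which is a simplification):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for KDBUS_POLICY_ACCESS_* and KDBUS_POLICY_*. */
enum access_type { ACCESS_USER, ACCESS_GROUP, ACCESS_WORLD };
enum access_level { POLICY_SEE, POLICY_TALK, POLICY_OWN };

struct policy_entry {
	enum access_type type;
	enum access_level access;
	uint64_t id;	/* uid or gid; ignored for ACCESS_WORLD */
};

/* Does this single entry grant the wanted access to uid/gid? */
static bool entry_grants(const struct policy_entry *e, uint64_t uid,
			 uint64_t gid, enum access_level wanted)
{
	if (e->access != wanted)
		return false;
	switch (e->type) {
	case ACCESS_USER:
		return e->id == uid;
	case ACCESS_GROUP:
		return e->id == gid;
	case ACCESS_WORLD:
		return true;
	}
	return false;
}

/*
 * Entries only ever grant, so keep scanning until one grants access.
 * An entry that does not apply never ends the search early.
 */
static bool policy_allows(const struct policy_entry *db, size_t n,
			  uint64_t uid, uint64_t gid,
			  enum access_level wanted)
{
	assert(db != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if (entry_grants(&db[i], uid, gid, wanted))
			return true;
	return false;
}
```

With only granting entries, the order of the entries cannot change the
result, so "multiple matches" are harmless.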

>
>> +
>> +11.4 TALK access and multiple well-known names per connection
>> +-------------------------------------------------------------
>> +
>> +Note that TALK access is checked against all names of a connection.
>> +For example, if a connection owns both 'org.foo.bar' and 'org.blah.baz', and
>> +the policy database allows 'org.blah.baz' to be talked to by WORLD, then this
>> +permission is also granted to 'org.foo.bar'. That might sound illogical, but
>> +after all, we allow messages to be directed to either the name or a well-known
>> +name, and policy is applied to the connection, not the name. In other words,
>> +the effective TALK policy for a connection is the most permissive of all names
>> +the connection owns.
>
> This does seem illogical.  Does the recipient at least know which
> well-known name was addressed?

If the sender addressed it to a well-known name, yes. If the sender
addressed the message to a unique-ID, there will be no such name, of
course. Still, the policy applies to such transactions either way
(standard D-Bus behavior).

Note however that dbus1 will not pass along the destination well-known
name, hence most userspace libraries will ignore this information too,
even if they run on kdbus which might pass this information around.
The right way for services that carry multiple service names to
discern which actual service is being talked to is having separate
object paths for the different functionality to hide between the
services.

>> +11.5 Implicit policies
>> +----------------------
>> +
>> +Depending on the type of the endpoint, a set of implicit rules might be
>> +enforced. On default endpoints, the following set is enforced:
>> +
>
> How do these rules interact with installed policy?

As said before, all policy entries _grant_ access. We look through all
entries until we find one that grants access.

>> +  * Privileged connections always override any installed policy. Those
>> +    connections could easily install their own policies, so there is no
>> +    reason to enforce installed policies.
>> +  * Connections can always talk to connections of the same user. This
>> +    includes broadcast messages.
>
> Why?

All limits on buses are enforced on a user<->user basis. We don't want
to provide policies that are more fine-grained than our accounting.

> If anyone ever strengthens the concept of identity to include
> things other than users (hmm -- there are already groups), this could
> be very limiting.

If user-based accounting is not suitable, you can create custom
endpoints. Future extensions to that are always welcome. So far, the
default user-based accounting was enough. And I think it's suitable as
default.

>> +  * Connections that own names might send broadcast messages to other
>> +    connections that belong to a different user, but only if that
>> +    destination connection does not own any name.
>> +
>
> This is weird.  It is also differently illogical than the "illogical"
> thing above.

Actually it follows the same model described above. If two connections
are running under the same user then broadcasts are allowed, but if
they are running under different users *and* if the destination owns a
well-known name, then broadcasts are subject to TALK policy checks
since that destination may own a restricted well-known name that is
not interested in broadcasts. So this implicit policy is just a
fast-path for the common case where the target is subscribed to a
broadcast and does not own any name.
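
Put as a decision function, the rule reads roughly like this. This is
a sketch of the description above, not the kdbus implementation; the
enum and parameter names are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical outcome of the implicit broadcast rules. */
enum bcast_verdict {
	BCAST_ALLOW,		/* implicit rule grants the broadcast */
	BCAST_CHECK_TALK	/* fall through to the installed TALK policy */
};

static enum bcast_verdict broadcast_verdict(uid_t src_uid,
					    size_t src_owned_names,
					    uid_t dst_uid,
					    size_t dst_owned_names)
{
	/* Same user on both ends: always allowed. */
	if (src_uid == dst_uid)
		return BCAST_ALLOW;

	/*
	 * Different users: fast-path only when the sender owns a name
	 * and the destination owns none.
	 */
	if (src_owned_names > 0 && dst_owned_names == 0)
		return BCAST_ALLOW;

	/* The destination may own a restricted name: ask the policy. */
	return BCAST_CHECK_TALK;
}
```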

>> +12. Pool
>> +===============================================================================
>> +
>> +A pool for data received from the kernel is installed for every connection of
>> +the bus, and is sized according to kdbus_cmd_hello.pool_size. It is accessed
>> +when one of the following ioctls is issued:
>> +
>> +  * KDBUS_CMD_MSG_RECV, to receive a message
>> +  * KDBUS_CMD_NAME_LIST, to dump the name registry
>> +  * KDBUS_CMD_CONN_INFO, to retrieve information on a connection
>> +
>> +Internally, the pool is organized in slices, stored in an rb-tree. The offsets
>> +returned by either one of the aforementioned ioctls describe offsets inside the
>> +pool. In order to make the slice available for subsequent calls, KDBUS_CMD_FREE
>> +has to be called on the offset.
>
> Why are you documenting that the slices are stored in an rb-tree?
> That's just an implementation details, right?

Dropped, thanks.

>> +
>> +To access the memory, the caller is expected to mmap() it to its task, like
>> +this:
>> +
>> +  /*
>> +   * POOL_SIZE has to be a multiple of PAGE_SIZE, and it must match the
>> +   * value that was previously passed in the .pool_size field of struct
>> +   * kdbus_cmd_hello.
>> +   */
>> +
>> +  buf = mmap(NULL, POOL_SIZE, PROT_READ, MAP_PRIVATE, conn_fd, 0);
>> +
>
> Will mapping with PROT_WRITE fail?  What about MAP_SHARED?

PROT_WRITE will fail and VM_MAYWRITE is cleared.

> And why are you suggesting MAP_PRIVATE?  That's just strange.

This was a leftover from pre-3.17. memfds require you to use
MAP_PRIVATE if SEAL_WRITE is set (which is a linux-specific behavior,
where MAP_SHARED is always accounted as a writable mapping, as
mprotect(PROT_WRITE) can be called any time). I fixed this up.

>> +
>> +13. Metadata
>> +===============================================================================
[snip]
>> +13.1 Known item types
>> +---------------------
>> +
>> +The following attach flags are currently supported.
>> +
>> +  KDBUS_ATTACH_TIMESTAMP
>> +    Attaches an item of type KDBUS_ITEM_TIMESTAMP which contains both the
>> +    monotonic and the realtime timestamp, taken when the message was
>> +    processed on the kernel side.
>> +
>> +  KDBUS_ATTACH_CREDS
>> +    Attaches an item of type KDBUS_ITEM_CREDS, containing credentials as
>> +    described in kdbus_creds: the uid, gid, pid, tid and starttime of the task.
>> +
>
> As mentioned last time, please remove or justify starttime.

starttime allows detecting PID overflows. Exposing the process
starttime is useful to detect when a PID is getting reused.
Unfortunately, we don't have 64bit pids, so we need the pid+time
combination to avoid ambiguity.

>> +  KDBUS_ATTACH_AUXGROUPS
>> +    Attaches an item of type KDBUS_ITEM_AUXGROUPS, containing a dynamic
>> +    number of auxiliary groups the sending task was a member of.
>> +
>> +  KDBUS_ATTACH_NAMES
>> +    Attaches items of type KDBUS_ITEM_OWNED_NAME, one for each name the sending
>> +    connection currently owns. The name and flags are stored in kdbus_item.name
>> +    for each of them.
>> +
>
> That's interesting.  What's it for?

It is a valuable piece of information for receivers to know which bus
names a sender has claimed. For instance, we need this information for
the D-Bus proxy service, because we have to apply D-Bus1 policy in
that case, and we need to get a list of owned names in a race-free
manner to check the policy against.

>> +  KDBUS_ATTACH_TID_COMM
>> +    Attaches an item of type KDBUS_ITEM_TID_COMM, transporting the sending
>> +    task's 'comm', for the tid.  The string is stored in kdbus_item.str.
>> +
>> +  KDBUS_ATTACH_PID_COMM
>> +    Attaches an item of type KDBUS_ITEM_PID_COMM, transporting the sending
>> +    task's 'comm', for the pid.  The string is stored in kdbus_item.str.
>> +
>> +  KDBUS_ATTACH_EXE
>> +    Attaches an item of type KDBUS_ITEM_EXE, containing the path to the
>> +    executable of the sending task, stored in kdbus_item.str.
>> +
>> +  KDBUS_ATTACH_CMDLINE
>> +    Attaches an item of type KDBUS_ITEM_CMDLINE, containing the command line
>> +    arguments of the sending task, as an array of strings, stored in
>> +    kdbus_item.str.
>
> Please remove these four items.  They are genuinely useless.  Anything
> that uses them for anything is either buggy or should have asked the
> sender to put the value in the payload (and immediately wondered why
> it was doing that).

We use them for logging, debugging and monitoring. With our wireshark
extension it's pretty useful to know the comm-name of a process when
monitoring a bus. As we explained last time, this is not about
security. We're aware that a process can modify them. We use them only
as additional meta-data for logging and debugging.

If we put those items into the payload, we have to transmit this data
even if the destination process is not interested in this.
Furthermore, each caller has to run multiple syscalls on each message
to retrieve those values.

We use these items heavily for filtering and debugging, regardless of
the payload protocol that is transmitted on the bus.

To give another specific use-case here: dbus supports bus activation,
where a message sent to a non-running service causes it to be spawned
implicitly without losing the message. Now, with such a scheme it is
incredibly useful to be able to log which client caused a service to
be triggered, hence we want to know the cmdline/exe/comm of the
client. Not knowing this is a major pita when trying to trace the boot
process and figuring out why a specific service got activated.

Also note that since v2 of the patch there's actually a per-sender
mask for meta-data like this, hence a peer which doesn't want to pass
its exec/cmdline/comm along can do that. Of course, this will
seriously hamper debuggability and transparency...

>> +
>> +  KDBUS_ATTACH_CGROUP
>> +    Attaches an item of type KDBUS_ITEM_CGROUP with the task's cgroup path.
>> +
>> +  KDBUS_ATTACH_CAPS
>> +    Attaches an item of type KDBUS_ITEM_CAPS, carrying sets of capabilities
>> +    that should be accessed via kdbus_item.caps.caps. Also, userspace should
>> +    be written in a way that it takes kdbus_item.caps.last_cap into account,
>> +    and derive the number of sets and rows from the item size and the reported
>> +    number of valid capability bits.
>> +
>
> Please remove this, too, or justify its use.

cgroup information tells us which service is doing a bus request. This
is useful for a variety of things. For example, the bus activation
logging item above benefits from it. In general, if some message shall
be logged about any client it is useful to know its service name.

Capabilities are useful to authenticate specific method calls. For
example, when a client asks systemd to reboot, without this concept we
can only check for UID==0 to decide whether to allow this. Breaking
this down to capabilities in a race-free way has the benefit of
allowing systemd to bind this to CAP_SYS_BOOT instead. There is no
reason to deny a process with CAP_SYS_BOOT to reboot via bus-APIs, as
they could just enforce it via syscalls, anyway.
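
On the service side, such a check could look roughly like the sketch
below. The layout of the capability set as an array of 32-bit words is
an assumption made for this example, not the documented layout of
KDBUS_ITEM_CAPS; only CAP_SYS_BOOT and CAP_SYS_ADMIN come from the
kernel headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <linux/capability.h>	/* CAP_SYS_BOOT */

/*
 * Hypothetical helper: test one capability bit in a set stored as an
 * array of 32-bit words. Bits beyond the array count as not set.
 */
static bool caps_has(const uint32_t *words, size_t nwords, unsigned int cap)
{
	size_t idx = cap / 32;

	assert(words != NULL);
	if (idx >= nwords)
		return false;
	return (words[idx] >> (cap % 32)) & 1u;
}

/* Allow a reboot request only if the sender's effective set has CAP_SYS_BOOT. */
static bool may_reboot(const uint32_t *effective, size_t nwords)
{
	return caps_has(effective, nwords, CAP_SYS_BOOT);
}
```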

We think it's a useful and reliable authentication method. Why should
we remove it?

Anyway, these items are just optional. The sender can refuse to
reveal them, and the item is only transmitted if the receiver opted in
for it, too. So there's no need to drop any item type from the
protocol.

>> +  KDBUS_ATTACH_SECLABEL
>> +    Attaches an item of type KDBUS_ITEM_SECLABEL, which contains the SELinux
>> +    security label of the sending task. Access via kdbus_item->str.
>> +
>
> This one, too, and please justify why code that uses it will work in
> containers and on non-selinux systems.

This maps to SCM_PEERSEC on AF_UNIX sockets. This is actually heavily
used already on dbus1. For example, systemd actually uses this
information about a bus peer to check whether certain operations are
allowed, on top of the normal policy. To give an explicit example:
when starting a service systemd gets the client's peer label from
AF_UNIX via SCM_PEERSEC, reads the security label of the service file
in question, and then checks the selinux database (via one of
libselinux APIs) whether to allow this change.

Note that this is really nothing we came up with, it's code from the
SELinux folks, it's simple enough, and has been exposed and used that
way for ages in dbus1. libselinux offers all the right APIs to make
use of this, and kdbus really needs to provide the same functionality
as dbus1 in this regard here.

Let me stress that checking the selinux database here is always *on
top* of the normal UID/caps based security checks services do. This is
exactly how selinux enforces checks on files on top of UID/caps
checks, or on process or anything else that is selinux-managed.

>> +  KDBUS_ATTACH_AUDIT
>> +    Attaches an item of type KDBUS_ITEM_AUDIT, which contains the audit label
>> +    of the sending task. Access via kdbus_item->str.
>> +
>
> I will NAK the hell out of this until, at the very least, someone
> documents what this means, how to parse it, what its stability rules
> are, who is allowed to see the value it contains, why that value will
> never evolve to become *more* security sensitive than it is now, etc.
>
> Audit gets to do crazy sh*t because it's restricted to privileged
> receivers.  This isn't restricted like that, so it doesn't deserve the
> same dispensation.  (And, honestly, I'm not sure that audit really
> deserves its free pass on making sense.)

This was based on a misunderstanding, so I will ignore it here. Let's
discuss this on the metadata-patch, in case it's still unclear.

>> +  KDBUS_ATTACH_CONN_DESCRIPTION
>> +    Attaches an item of type KDBUS_ITEM_CONN_DESCRIPTION that contains the
>> +    sender connection's current name in kdbus_item.str.
>> +
>
> Which name?  Can't there be several?

No. There is only one connection description string, copied verbatim
from what the connection supplied during HELLO.

Note that this is really just about debugging, since in some cases
processes might end up with multiple kdbus fds, and it is useful in
tools like "busctl" to know which one is which.

>> +
>> +13.1 Metadata and namespaces
>> +----------------------------
>> +
>> +Metadata such as PIDs, UIDs or GIDs are automatically translated to the
>> +namespaces of the domain that is used to send a message over. The namespaces
>> +of a domain are pinned at creation time, which is when the filesystem has been
>> +mounted.
>> +
>> +Metadata items that cannot be translated are dropped.
>
> What if the receiver said that the item was mandatory?

It is still dropped. It's the responsibility of the receiver to reject
messages that lack required metadata.
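
A receiver-side check could be sketched as below. The item header used
here (a 64-bit size followed by a 64-bit type, with items padded to 8
bytes) is a simplified assumption for the example, and the function
and struct names are invented; see the item description in the
documentation for the real layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified, hypothetical item header: total item size, then type. */
struct item_hdr {
	uint64_t size;
	uint64_t type;
};

#define ALIGN8(x) (((x) + 7) & ~(uint64_t)7)

/*
 * Walk a buffer of items and report whether one of the given type is
 * present. A malformed or truncated item ends the walk with "false".
 */
static bool metadata_has_item(const uint8_t *buf, size_t len, uint64_t type)
{
	size_t off = 0;

	assert(buf != NULL || len == 0);
	while (off <= len && len - off >= sizeof(struct item_hdr)) {
		struct item_hdr h;

		memcpy(&h, buf + off, sizeof(h));
		if (h.size < sizeof(h) || h.size > len - off)
			return false;
		if (h.type == type)
			return true;
		off += ALIGN8(h.size);
	}
	return false;
}
```

A receiver that treats some item as mandatory would call this for each
required type and reject the message if any call returns false.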

Thanks for the review, Andy. Documentation fixes coming up!
David

* Re: kdbus: add documentation
  2014-11-24 20:16     ` David Herrmann
@ 2014-11-24 20:57       ` Andy Lutomirski
  2014-11-26 11:55           ` David Herrmann
  0 siblings, 1 reply; 73+ messages in thread
From: Andy Lutomirski @ 2014-11-24 20:57 UTC (permalink / raw)
  To: David Herrmann
  Cc: Greg Kroah-Hartman, Arnd Bergmann, Eric W. Biederman,
	One Thousand Gnomes, Tom Gundersen, Jiri Kosina, Linux API,
	linux-kernel, Daniel Mack, Djalal Harouni

On Mon, Nov 24, 2014 at 12:16 PM, David Herrmann <dh.herrmann@gmail.com> wrote:
> Hi Andy!
>
> On Fri, Nov 21, 2014 at 6:12 PM, Andy Lutomirski <luto@amacapital.net> wrote:
>> On Thu, Nov 20, 2014 at 9:02 PM, Greg Kroah-Hartman
>> <gregkh@linuxfoundation.org> wrote:
>>> From: Daniel Mack <daniel@zonque.org>
>>>
>>> kdbus is a system for low-latency, low-overhead, easy to use
>>> interprocess communication (IPC).
>>>
>>> The interface to all functions in this driver is implemented through
>>> ioctls on files exposed through the mount point of a kdbusfs.  This
>>> patch adds detailed documentation about the kernel level API design.
>>>
>>> Signed-off-by: Daniel Mack <daniel@zonque.org>
>>> Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
>>> Signed-off-by: Djalal Harouni <tixxdz@opendz.org>
>>> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>>> ---
>>
>>> +  Pool:
>>> +    Each connection allocates a piece of shmem-backed memory that is used
>>> +    to receive messages and answers to ioctl command from the kernel. It is
>>> +    never used to send anything to the kernel. In order to access that memory,
>>> +    userspace must mmap() it into its task.
>>> +    See section 12 for more details.
>>
>> At the risk of opening a can of worms, wouldn't this be much more
>> useful if you could share a pool between multiple connections?
>
> Within a process it could theoretically be possible to share the same
> memory pool between multiple connections made by the process. However,
> note that normally a process only has a single connection to the bus
> open (possibly two, if it opens a connection to both the system and
> the user bus). Now, sharing the receiver buffer could certainly be
> considered an optimization, but it would have no effect on
> "usefulness", though, as just allocating space from a single shared
> per-process receiver won't give you any new possibilities...
>
> We have thought about this, but decided to delay it for now. Shared
> pools can easily be added as an extension later on.
>
> [snip]
>>> +KDBUS_CMD_BUS_MAKE, and KDBUS_CMD_ENDPOINT_MAKE take a
>>> +struct kdbus_cmd_make argument.
>>> +
>>> +struct kdbus_cmd_make {
>>> +  __u64 size;
>>> +    The overall size of the struct, including its items.
>>> +
>>> +  __u64 flags;
>>> +    The flags for creation.
>>> +
>>> +    KDBUS_MAKE_ACCESS_GROUP
>>> +      Make the device file group-accessible
>>> +
>>> +    KDBUS_MAKE_ACCESS_WORLD
>>> +      Make the device file world-accessible
>>
>> This thing is a file.  What's wrong with using a normal POSIX mode?
>> (And what do the read, write, and exec modes do?)
>
> Domains and buses are directories, endpoints are files. Domains also
> create control-files implicitly.
>
> For kdbus clients there is just access or no access, but no
> distinction between read, write and execute access. Due to that we
> just break this down to per-group and world access bits, since doing
> more is pointless, and we shouldn't allow shoehorning more stuff into
> the access mode.
>
> [snip]
>>> +      KDBUS_ITEM_CREDS
>>> +      KDBUS_ITEM_SECLABEL
>>> +        Privileged bus users may submit these types in order to create
>>> +        connections with faked credentials. The only real use case for this
>>> +        is a proxy service which acts on behalf of some other tasks. For a
>>> +        connection that runs in that mode, the message's metadata items will
>>> +        be limited to what's specified here. See section 13 for more
>>> +        information.
>>
>> This is still confusing.  There are multiple places in which metadata
>> is attached.  Which does this apply to?  And why are only creds and
>> seclabel listed?
>
> Yes, and there are multiple places where metadata is *gathered*. This
> ioctl creates connections, so only the items that are actually
> *gathered* by that ioctl are documented here. These items are not part
> of any messages, but are used as identification of the connection
> owner (and in this particular case, to allow privileged proxies to
> overwrite the items so they can properly proxy a legacy-dbus peer
> connection).

But don't proxies need to override the per-message metadata, too?
This is why I'm confused (and my confusion about what's happening goes
down into the code, too).  IMO it would be great if all the variables
were named things like message_metadata, conn_metadata, bus_metadata,
etc.

>
> [snip]
>>> +6.5 Getting information about a connection's bus creator
>>> +--------------------------------------------------------
>>> +
>>> +The KDBUS_CMD_BUS_CREATOR_INFO ioctl takes the same struct as
>>> +KDBUS_CMD_CONN_INFO but is used to retrieve information about the creator of
>>> +the bus the connection is attached to. The metadata returned by this call is
>>> +collected during the creation of the bus and is never altered afterwards, so
>>> +it provides pristine information on the task that created the bus, at the
>>> +moment when it did so.
>>
>> What's this for?  I understand the need for the creator of busses to
>> be authenticated, but doing it like this means that anyone who will
>> *fail* authentication can DoS the authentic creator.
>
> This returns information on a bus owner, to determine whether a
> connection is connected to a system, user or session bus. Note that
> the bus-creator itself is not a valid peer on the bus, so you cannot
> send messages to them. Which kind of DoS do you have in mind?

I assume that the logic is something like:

connect to bus
request bus metadata
if (bus metadata matches expectations) {
  great, trust the bus!
} else {
  oh crap!
}

If I'm understanding it right, then user code only really has two
outcomes: the good case and the "oh crap!" case.  The problem is that
"oh crap!" isn't a clean failure -- if it happens, then the
application has just been DoSed, because in that case, one of two
things happened:

1. Some policy mismatch means that the legitimate bus owner did create
the bus, but the user application is confused.  This will result in
difficult-to-diagnose failures.

2. A malicious or confused program created the bus.  This is a DoS --
even the legitimate bus creator can't actually create the bus now.

So I think that the policy should be applied at the time that the bus
name is claimed, not at the time that someone else tries to use the
bus.  IOW, the way that you verify you're talking to the system bus
should be by checking that the bus is called "system", not by checking
that UID 0 created the bus.

>
>>> +
>>> +7.3 Passing of Payload Data
>>> +---------------------------
>>> +
>>> +When connecting to the bus, receivers request a memory pool of a given size,
>>> +large enough to carry all backlog of data enqueued for the connection. The
>>> +pool is internally backed by a shared memory file which can be mmap()ed by
>>> +the receiver.
>>> +
>>> +KDBUS_MSG_PAYLOAD_VEC:
>>> +  Messages are directly copied by the sending process into the receiver's pool,
>>> +  that way two peers can exchange data by effectively doing a single-copy from
>>> +  one process to another, the kernel will not buffer the data anywhere else.
>>> +
>>> +KDBUS_MSG_PAYLOAD_MEMFD:
>>> +  Messages can reference memfd files which contain the data.
>>> +  memfd files are tmpfs-backed files that allow sealing of the content of the
>>> +  file, which prevents all writable access to the file content.
>>> +  Only sealed memfd files are accepted as payload data, which enforces
>>> +  reliable passing of data; the receiver can assume that neither the sender nor
>>> +  anyone else can alter the content after the message is sent.
>>
>> This should specify *which* seals are checked.
>
> True. Will be added to the documentation. For the record, it's the
> full set of seals right now, so
> F_SEAL_SHRINK|F_SEAL_GROW|F_SEAL_WRITE|F_SEAL_SEAL.

Makes sense.
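
For reference, a minimal userspace sketch of producing a memfd that
carries the full seal set quoted above. It assumes a glibc that exposes
memfd_create(); make_sealed_memfd() and is_fully_sealed() are
hypothetical helper names and not part of kdbus.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* The full seal set named in the reply above. */
#define FULL_SEALS (F_SEAL_SHRINK | F_SEAL_GROW | F_SEAL_WRITE | F_SEAL_SEAL)

/*
 * Hypothetical helper: create a memfd, fill it with 'len' bytes from
 * 'data' and apply all four seals. Returns the fd, or -1 on failure.
 */
int make_sealed_memfd(const void *data, size_t len)
{
	int fd;

	assert(data != NULL || len == 0);

	fd = memfd_create("payload", MFD_CLOEXEC | MFD_ALLOW_SEALING);
	if (fd < 0)
		return -1;

	if (write(fd, data, len) != (ssize_t)len ||
	    fcntl(fd, F_ADD_SEALS, FULL_SEALS) < 0) {
		close(fd);
		return -1;
	}

	return fd;
}

/* Returns 1 if 'fd' carries every seal in FULL_SEALS, 0 otherwise. */
int is_fully_sealed(int fd)
{
	int seals = fcntl(fd, F_GET_SEALS);

	return seals >= 0 && (seals & FULL_SEALS) == FULL_SEALS;
}
```

A receiver-side check along the lines of is_fully_sealed() is what makes
it safe to parse the content without copying it first: once F_SEAL_WRITE
and F_SEAL_SEAL are set, neither the content nor the seal set can change.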

>
>>> +
>>> +Apart from the sender filling-in the content into memfd files, the data will
>>> +be passed as zero-copy from one process to another, read-only, shared between
>>> +the peers.
>>> +
>>> +
>>> +7.4 Receiving messages
>>> +----------------------
>>> +
>>> +Messages are received by the client with the KDBUS_CMD_MSG_RECV ioctl. The
>>> +endpoint file of the bus supports poll() to wake up the receiving process when
>>> +new messages are queued up to be received.
>>> +
>>> +With the KDBUS_CMD_MSG_RECV ioctl, a struct kdbus_cmd_recv is used.
>>> +
>>> +struct kdbus_cmd_recv {
>>> +  __u64 flags;
>>> +    Flags to control the receive command.
>>> +
>>> +    KDBUS_RECV_PEEK
>>> +      Just return the location of the next message. Do not install file
>>> +      descriptors or anything else. This is usually used to determine the
>>> +      sender of the next queued message.
>>> +
>>> +    KDBUS_RECV_DROP
>>> +      Drop the next message without doing anything else with it, and free the
>>> +      pool slice. This is a short-cut for KDBUS_RECV_PEEK and KDBUS_CMD_FREE.
>>> +
>>> +    KDBUS_RECV_USE_PRIORITY
>>> +      Use the priority field (see below).
>>> +
>>> +  __u64 kernel_flags;
>>> +    Valid flags for this command, returned by the kernel upon each call.
>>> +
>>> +  __s64 priority;
>>> +      With KDBUS_RECV_USE_PRIORITY set in flags, receive the next message in
>>> +      the queue with at least the given priority. If no such message is waiting
>>> +      in the queue, -ENOMSG is returned.
>>> +
>>> +  __u64 offset;
>>> +      Upon return of the ioctl, this field contains the offset in the
>>> +      receiver's memory pool.
>>> +};
>>> +
>>> +Unless KDBUS_RECV_DROP was passed, and given that the ioctl succeeded, the
>>> +offset field contains the location of the new message inside the receiver's
>>> +pool. The message is stored as struct kdbus_msg at this offset, and can be
>>> +interpreted with the semantics described above.
>>
>> I'm confused here.  Is sent data written to the pool when send is
>> called or when recv is called?
>
> When send is called.
>
>> If the former, what prevents DoS, especially DoS due to sending too many fds?
>
> FDs are installed into the task only when the receiver issues
> CMD_RECV, so a task won't explode unless it issues that ioctl. Plus,
> receiving FDs is already a per-connection opt-in. Furthermore, you can
> CMD_PEEK messages which allows you to look at the full message
> _without_ installing FDs. If you don't want the FDs, you can CMD_DROP
> the message.
>
>> If the latter, where is the data buffered in the mean time?
>>
>>> +
>>> +Also, if the connection allowed for file descriptors to be passed
>>> +(KDBUS_HELLO_ACCEPT_FD), and if the message contained any, they will be
>>> +installed into the receiving process after the KDBUS_CMD_MSG_RECV ioctl
>>> +returns. The receiving task is obliged to close all of them appropriately.
>>
>> This makes it sound like fds are installed at receive time.  What
>> prevents resource exhaustion due to having excessive numbers of fds in
>> transit (that are presumably not accounted to anyone)?
>
> We have a per-user message accounting for undelivered messages, as
> well as a maximum number of pending messages per connection on the
> receiving end. These limits are accounted on a "user<->user" basis, so
> the limit of a user A will not affect two other users (B and C)
> talking.

But you can shove tons of fds in a message, and you can have lots of
messages, and some of the fds can be fds of unix sockets that have fds
queued up in them, and one of those fds could be the fd to the kdbus
connection that sent the fd...

This is not advice as to what to do about it, but I think that it will
be a problem at some point.


>>> +11. Policy
>>> +===============================================================================
>>> +
>>> +A policy database restricts the possibilities of connections to own, see and
>>> +talk to well-known names. It can be associated with a bus (through a policy
>>> +holder connection) or a custom endpoint.
>>
>> ISTM metadata items on bus names should be replaced with policy that
>> applies to the domain as a whole and governs bus creation.
>
> No, well-known names are bound to buses, so a bus is really the right
> place to hold policy about which process is allowed to claim them.
> Every user is allowed to create a bus of its own, there's no policy
> for that, and there shouldn't be.
>
> It has nothing to do with metadata items.

But it does -- the creator of the bus binds metadata to that bus at
creation time.

I think that a better solution would be to have a global policy that
says, for example, "to create the bus called 'system', the creator
must have selinux label xyz" or "to create a user bus called
uid-1000-privileged-ui-bus the creator must have some cgroup" or
whatever.

Although maybe a better solution would leave this in the kernel but
allow any cgroup to create a bus with a name that indicates the
creating cgroup.  Then I could have my desktop shell create a
"/cgroup/path/to/desktop" for per-user privileged things.

>
>>> +A set of policy rules is described by a name and multiple access rules, defined
>>> +by the following struct.
>>> +
>>> +struct kdbus_policy_access {
>>> +  __u64 type;  /* USER, GROUP, WORLD */
>>> +    One of the following.
>>> +
>>> +    KDBUS_POLICY_ACCESS_USER
>>> +      Grant access to a user with the uid stored in the 'id' field.
>>> +
>>> +    KDBUS_POLICY_ACCESS_GROUP
>>> +      Grant access to a user with the gid stored in the 'id' field.
>>> +
>>> +    KDBUS_POLICY_ACCESS_WORLD
>>> +      Grant access to everyone. The 'id' field is ignored.
>>> +
>>> +  __u64 access;        /* OWN, TALK, SEE */
>>> +    The access to grant.
>>> +
>>> +    KDBUS_POLICY_SEE
>>> +      Allow the name to be seen.
>>> +
>>> +    KDBUS_POLICY_TALK
>>> +      Allow the name to be talked to.
>>> +
>>> +    KDBUS_POLICY_OWN
>>> +      Allow the name to be owned.
>>> +
>>> +  __u64 id;
>>> +    For KDBUS_POLICY_ACCESS_USER, stores the uid.
>>> +    For KDBUS_POLICY_ACCESS_GROUP, stores the gid.
>>> +};
>>
>>
>> What happens if there are multiple matches?
>
> We only have _granting_ policy entries. We search through the
> policy-db until we find an entry that grants access. We do _not_ stop
> on the first item that matches.

Yay!  Can you document that more clearly?
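
To make the grant-only semantics concrete, here is a minimal sketch of
the lookup as I understand it from the description above. The type and
function names are illustrative stand-ins, not the kdbus ones, and the
ordering SEE < TALK < OWN (with a higher level implying the lower ones)
is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative stand-ins; names and numeric values are assumptions. */
enum access_type { ACCESS_USER, ACCESS_GROUP, ACCESS_WORLD };
enum access_level { POLICY_SEE, POLICY_TALK, POLICY_OWN };

struct policy_access {
	enum access_type type;
	enum access_level access;
	uint64_t id;		/* uid or gid, ignored for ACCESS_WORLD */
};

/* Does this entry apply to the given caller at all? */
static bool entry_matches(const struct policy_access *e,
			  uint64_t uid, uint64_t gid)
{
	switch (e->type) {
	case ACCESS_USER:
		return e->id == uid;
	case ACCESS_GROUP:
		return e->id == gid;
	case ACCESS_WORLD:
		return true;
	}
	return false;
}

/*
 * Grant-only lookup: skip matching entries that do not grant enough,
 * and succeed as soon as any entry grants at least 'wanted'.
 */
bool policy_check(const struct policy_access *db, size_t n,
		  uint64_t uid, uint64_t gid, enum access_level wanted)
{
	assert(db != NULL || n == 0);

	for (size_t i = 0; i < n; i++)
		if (entry_matches(&db[i], uid, gid) && db[i].access >= wanted)
			return true;

	return false;
}
```

The point is that a matching entry which grants too little never ends
the search; with only granting entries, the order of entries cannot
change the outcome.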

>
>>
>>> +
>>> +11.4 TALK access and multiple well-known names per connection
>>> +-------------------------------------------------------------
>>> +
>>> +Note that TALK access is checked against all names of a connection.
>>> +For example, if a connection owns both 'org.foo.bar' and 'org.blah.baz', and
>>> +the policy database allows 'org.blah.baz' to be talked to by WORLD, then this
>>> +permission is also granted to 'org.foo.bar'. That might sound illogical, but
>>> +after all, we allow messages to be directed to either the ID or a well-known
>>> +name, and policy is applied to the connection, not the name. In other words,
>>> +the effective TALK policy for a connection is the most permissive of all names
>>> +the connection owns.
>>
>> This does seem illogical.  Does the recipient at least know which
>> well-known name was addressed?
>
> If the sender addressed it to a well-known name, yes. If the sender
> addressed the message to a unique-ID, there will be no such name, of
> course. Still, the policy applies to such transactions either way
> (standard D-Bus behavior).
>
> Note however that dbus1 will not pass along the destination well-known
> name, hence most userspace libraries will ignore this information too,
> even if they run on kdbus which might pass this information around.
> The right way for services that carry multiple service names to
> discern which actual service is being talked to is having separate
> object paths for the different functionality to hide between the
> services.

It seems unfortunate to keep around this really weird behavior for the
benefit of legacy applications.  Could there perhaps be a flag that a
connection can set to indicate that it understands per-destination
access control and therefore wants stricter policy enforcement?

>
>>> +11.5 Implicit policies
>>> +----------------------
>>> +
>>> +Depending on the type of the endpoint, a set of implicit rules might be
>>> +enforced. On default endpoints, the following set is enforced:
>>> +
>>
>> How do these rules interact with installed policy?
>
> As said before, all policy entries _grant_ access. We look through all
> entries until we find one that grants access.
>
>>> +  * Privileged connections always override any installed policy. Those
>>> +    connections could easily install their own policies, so there is no
>>> +    reason to enforce installed policies.
>>> +  * Connections can always talk to connections of the same user. This
>>> +    includes broadcast messages.
>>
>> Why?
>
> All limits on buses are enforced on a user<->user basis. We don't want
> to provide policies that are more fine-grained than our accounting.

This seems completely at odds with all the fine-grained metadata
stuff.  Also, anything that relies on this may get very confused when
the LSM hooks go in, because I'm reasonably sure that the intent is
for them to *not* follow this principle.

>
>> If anyone ever strengthens the concept of identity to include
>> things other than users (hmm -- there are already groups), this could
>> be very limiting.
>
> If user-based accounting is not suitable, you can create custom
> endpoints. Future extensions to that are always welcome. So far, the
> default user-based accounting was enough. And I think it's suitable as
> default.
>
>>> +  * Connections that own names might send broadcast messages to other
>>> +    connections that belong to a different user, but only if that
>>> +    destination connection does not own any name.
>>> +

(Also, what does "might" mean here?)

>>
>> This is weird.  It is also differently illogical than the "illogical"
>> thing above.
>
> Actually it follows the same model described above. If two connections
> are running under the same user then broadcasts are allowed, but if
> they are running under different users *and* if the destination owns a
> well-known name, then broadcasts are subject to TALK policy checks
> since that destination may own a restricted well-known name that is
> not interested in broadcasts. So this implicit policy is just
> fast-path for the common case where the target is subscribed to a
> broadcast and does not own any name.

Huh?

Say I have two users, "Sender" and "Receiver", each with a single
connection.  If Receiver owns no well-known names, then Sender can
send to it.  If Receiver owns one well-known name, then Sender needs
to pass a TALK check on that name.  If Receiver owns two well-known
names, then Sender only needs to pass a TALK check on one of them.

Am I understanding this right?  If I am, then I think this is in the
category of baroque and inconsistent security rules which everyone
will screw up and therefore introduce security vulnerabilities.

Can you really not enforce the much simpler rule that, to send to a
name, you must have permission to send to *that* name?  If legacy
dbus1 receivers register two names and don't validate everything
correctly, then only the legacy receivers have problems.
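
The difference between the two rules can be sketched as follows. This
is a toy model, not kdbus code: struct owned_name and both functions
are hypothetical, the per-name TALK decision is assumed to have been
looked up already, and the implicit rule for receivers that own no
name is not modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* One well-known name owned by the receiver (hypothetical model). */
struct owned_name {
	const char *name;
	bool sender_may_talk;	/* result of the TALK lookup for this name */
};

/* Documented behavior: TALK on any owned name is sufficient. */
bool talk_allowed_any(const struct owned_name *names, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (names[i].sender_may_talk)
			return true;
	return false;
}

/* Proposed behavior: TALK is required on the name actually addressed. */
bool talk_allowed_addressed(const struct owned_name *names, size_t n,
			    const char *dest)
{
	assert(dest != NULL);

	for (size_t i = 0; i < n; i++)
		if (strcmp(names[i].name, dest) == 0)
			return names[i].sender_may_talk;

	return false;	/* receiver does not own the addressed name */
}
```

With a receiver owning 'org.foo.bar' (no TALK for the sender) and
'org.blah.baz' (TALK granted), the first function lets a message
addressed to 'org.foo.bar' through and the second one does not.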

>>> +
>>> +To access the memory, the caller is expected to mmap() it to its task, like
>>> +this:
>>> +
>>> +  /*
>>> +   * POOL_SIZE has to be a multiple of PAGE_SIZE, and it must match the
>>> +   * value that was previously passed in the .pool_size field of struct
>>> +   * kdbus_cmd_hello.
>>> +   */
>>> +
>>> +  buf = mmap(NULL, POOL_SIZE, PROT_READ, MAP_PRIVATE, conn_fd, 0);
>>> +
>>
>> Will mapping with PROT_WRITE fail?  What about MAP_SHARED?
>
> PROT_WRITE will fail and VM_MAYWRITE is cleared.
>
>> And why are you suggesting MAP_PRIVATE?  That's just strange.
>
> This was a leftover from pre-3.17. memfds require you to use
> MAP_PRIVATE if SEAL_WRITE is set (which is a linux-specific behavior,
> where MAP_SHARED is always accounted as writable mapping as
> mprotect(VM_WRITE) can be called any time). I fixed this up.

Thanks.  (I assume that memfd has nothing to do with this directly,
since these are pools, not memfds.)

>
>>> +
>>> +13. Metadata
>>> +===============================================================================
> [snip]
>>> +13.1 Known item types
>>> +---------------------
>>> +
>>> +The following attach flags are currently supported.
>>> +
>>> +  KDBUS_ATTACH_TIMESTAMP
>>> +    Attaches an item of type KDBUS_ITEM_TIMESTAMP which contains both the
>>> +    monotonic and the realtime timestamp, taken when the message was
>>> +    processed on the kernel side.
>>> +
>>> +  KDBUS_ATTACH_CREDS
>>> +    Attaches an item of type KDBUS_ITEM_CREDS, containing credentials as
>>> +    described in kdbus_creds: the uid, gid, pid, tid and starttime of the task.
>>> +
>>
>> As mentioned last time, please remove or justify starttime.
>
> starttime allows detecting PID overflows. Exposing the process
> starttime is useful to detect when a PID is getting reused.
> Unfortunately, we don't have 64bit pids, so we need the pid+time
> combination to avoid ambiguity.

NAK, I think.

I agree that PID overflow is a real issue and should be addressed
somehow.  But please address it for real instead of adding Yet Another
Hack (tm).  In the mean time, leave that hack out, please.

I would *love* to see PIDs have extra high bits at the end, done in a
way that supports CRIU and that guarantees no reuse unless something
privileged intentionally mis-programs it.  But starttime isn't that
mechanism.

>
>>> +  KDBUS_ATTACH_AUXGROUPS
>>> +    Attaches an item of type KDBUS_ITEM_AUXGROUPS, containing a dynamic
>>> +    number of auxiliary groups the sending task was a member of.
>>> +
>>> +  KDBUS_ATTACH_NAMES
>>> +    Attaches items of type KDBUS_ITEM_OWNED_NAME, one for each name the sending
>>> +    connection currently owns. The name and flags are stored in kdbus_item.name
>>> +    for each of them.
>>> +
>>
>> That's interesting.  What's it for?
>
> It is a valuable piece of information for receivers to know which bus
> names a sender has claimed. For instance, we need this information for
> the D-Bus proxy service, because we have to apply D-Bus1 policy in
> that case, and we need to get a list of owned names in a race-free
> manner to check the policy against.

But if you change the rule to the sensible one where you need
permission to TALK to the name that you're talking to, this goes away,
right?

>
>>> +  KDBUS_ATTACH_TID_COMM
>>> +    Attaches an item of type KDBUS_ITEM_TID_COMM, transporting the sending
>>> +    task's 'comm', for the tid.  The string is stored in kdbus_item.str.
>>> +
>>> +  KDBUS_ATTACH_PID_COMM
>>> +    Attaches an item of type KDBUS_ITEM_PID_COMM, transporting the sending
>>> +    task's 'comm', for the pid.  The string is stored in kdbus_item.str.
>>> +
>>> +  KDBUS_ATTACH_EXE
>>> +    Attaches an item of type KDBUS_ITEM_EXE, containing the path to the
>>> +    executable of the sending task, stored in kdbus_item.str.
>>> +
>>> +  KDBUS_ATTACH_CMDLINE
>>> +    Attaches an item of type KDBUS_ITEM_CMDLINE, containing the command line
>>> +    arguments of the sending task, as an array of strings, stored in
>>> +    kdbus_item.str.
>>
>> Please remove these four items.  They are genuinely useless.  Anything
>> that uses them for anything is either buggy or should have asked the
>> sender to put the value in the payload (and immediately wondered why
>> it was doing that).
>
> We use them for logging, debugging and monitoring. With our wireshark
> extension it's pretty useful to know the comm-name of a process when
> monitoring a bus. As we explained last time, this is not about
> security. We're aware that a process can modify them. We use them only
> as additional meta-data for logging and debugging.

Use the PID.  Really.  Your wireshark extension can look this crap up
in /proc and, if it fails due to a race, big frickin' deal.

>
> If we put those items into the payload, we have to transmit this data
> even if the destination process is not interested in this.
> Furthermore, each caller has to run multiple syscalls on each message
> to retrieve those values.
>
> We use these items heavily for filtering and debugging, regardless of
> the payload protocol that is transmitted on the bus.
>
> To give another specific use-case here: dbus supports bus activation,
> where a message sent to a non-running service causes it to be spawned
> implicitly without losing the message. Now, with such a scheme it is
> incredibly useful to be able to log which client caused a service to
> be triggered, hence we want to know the cmdline/exe/comm of the
> client. Not knowing this is a major pita when trying to trace the boot
> process and figuring out why a specific service got activated.

Again, use the PID for tracing, please.

At the very least, make it impossible to specify these fields in the
"must be received" set and rename them to something like
KDBUS_INSTALL_UNRELIABLE_CMDLINE, etc, because they're unreliable.

Finally, this stuff should only be readable by privileged users.  And
using the PID accomplishes that.

>
> Also note that since v2 of the patch there's actually a per-sender
> mask for meta-data like this, hence a peer which doesn't want to pass
> its exec/cmdline/comm along can do that. Of course, this will
> seriously hamper debuggability and transparency...

Transparency is a terrible thing here.

How many users put passwords into things on the command line?  Yes,
it's a bad idea (for reasons that are entirely stupid), but now those
passwords get *logged*.

If this is in the kernel, and someone complains that sensitive data is
showing up on ten different logs on their system, they'll *correctly*
blame the kernel.  If you at least use the PID and restrict it to the
logging code, then at least the bug report will go to the logging
daemon, which will be *correctly* accused of doing something daft, and
it can be fixed.

>
>>> +
>>> +  KDBUS_ATTACH_CGROUP
>>> +    Attaches an item of type KDBUS_ITEM_CGROUP with the task's cgroup path.
>>> +
>>> +  KDBUS_ATTACH_CAPS
>>> +    Attaches an item of type KDBUS_ITEM_CAPS, carrying sets of capabilities
>>> +    that should be accessed via kdbus_item.caps.caps. Also, userspace should
>>> +    be written in a way that it takes kdbus_item.caps.last_cap into account,
>>> +    and derive the number of sets and rows from the item size and the reported
>>> +    number of valid capability bits.
>>> +
>>
>> Please remove this, too, or justify its use.
>
> cgroup information tells us which service is doing a bus request. This
> is useful for a variety of things. For example, the bus activation
> logging item above benefits from it. In general, if some message shall
> be logged about any client it is useful to know its service name.
>
> Capabilities are useful to authenticate specific method calls. For
> example, when a client asks systemd to reboot, without this concept we
> can only check for UID==0 to decide whether to allow this. Breaking
> this down to capabilities in a race-free way has the benefit of
> allowing systemd to bind this to CAP_SYS_BOOT instead. There is no
> reason to deny a process with CAP_SYS_BOOT to reboot via bus-APIs, as
> they could just enforce it via syscalls, anyway.

With all due respect, BS.

I admit that there is probably no reason to deny systemd-based reboot
to a CAP_SYS_BOOT-capable process, but there is absolutely no reason
to give processes that are supposed to reboot using systemd the
CAP_SYS_BOOT capability.

In any event, I suspect you'll have a hard time justifying this for
anything other than CAP_SYS_BOOT.  Just because CAP_SYS_ADMIN users
can probably do whatever they want doesn't mean that systemd should
make that a built-in policy.

Also, wtf is the bounding set and such for?  At the very least this
should only be the effective set.

>
> We think it's a useful and reliable authentication method. Why should
> we remove it?

Because the implementation is buggy and therefore it's insecure?
Remember that caps are namespaced in an interesting way.

>
> Anyway, these items are just optional. The sender can refuse to
> reveal them, and the item is only transmitted if the receiver opted in
> for it, too. So there's no need to drop any item type from the
> protocol.

No.

Because if receivers opt in to most of these, *they're doing it
wrong*, and the kernel shouldn't be in the business of helping them.

>
>>> +  KDBUS_ATTACH_SECLABEL
>>> +    Attaches an item of type KDBUS_ITEM_SECLABEL, which contains the SELinux
>>> +    security label of the sending task. Access via kdbus_item->str.
>>> +
>>
>> This one, too, and please justify why code that uses it will work in
>> containers and on non-selinux systems.
>
> This maps to SCM_PEERSEC on AF_UNIX sockets. This is actually heavily
> used already on dbus1. For example, systemd actually uses this
> information about a bus peer to check whether certain operations are
> allowed, on top of the normal policy. To give an explicit example:
> when starting a service systemd gets the client's peer label from
> AF_UNIX via SCM_PEERSEC, reads the security label of the service file
> in question, and then checks the selinux database (via one of
> libselinux APIs) whether to allow this change.
>
> Note that this is really nothing we came up with, it's code from the
> SELinux folks, it's simple enough, and has been exposed and used that
> way since ages in dbus1. libselinux offers all the right APIs to make
> use of this, and kdbus really needs to provide the same functionality
> as dbus1 in this regard here.
>
> Let me stress that checking the selinux database here is always *on
> top* of the normal UID/caps based security checks services do. This is
> exactly how selinux enforces checks on files on top of UID/caps
> checks, or on process or anything else that is selinux-managed.

But I thought that the LSM hooks were going to replace all of that.

Can you get a selinux person to confirm that this is actually necessary?

>
>>> +  KDBUS_ATTACH_AUDIT
>>> +    Attaches an item of type KDBUS_ITEM_AUDIT, which contains the audit label
>>> +    of the sending task. Access via kdbus_item->str.
>>> +
>>
>> I will NAK the hell out of this until, at the very least, someone
>> documents what this means, how to parse it, what its stability rules
>> are, who is allowed to see the value it contains, why that value will
>> never evolve to become *more* security sensitive than it is now, etc.
>>
>> Audit gets to do crazy sh*t because it's restricted to privileged
>> receivers.  This isn't restricted like that, so it doesn't deserve the
>> same dispensation.  (And, honestly, I'm not sure that audit really
>> deserves its free pass on making sense.)
>
> This was based on a misunderstanding, so I will ignore it here. Let's
> discuss this on the metadata-patch, in case it's still unclear.

Yeah, I think you're right about this.

If I'm understanding this right, then I think it just needs a
documentation change and possibly a renaming of the metadata item
name.

>
>>> +  KDBUS_ATTACH_CONN_DESCRIPTION
>>> +    Attaches an item of type KDBUS_ITEM_CONN_DESCRIPTION that contains the
>>> +    sender connection's current name in kdbus_item.str.
>>> +
>>
>> Which name?  Can't there be several?
>
> No. There is only one connection description string, copied verbatim
> from what the connection supplied during HELLO.
>
> Note that this is really just about debugging, since in some cases
> processes might end up with multiple kdbus fds, and it is useful in
> tools like "busctl" to know which one is which.

Fair enough.

>
>>> +
>>> +13.2 Metadata and namespaces
>>> +----------------------------
>>> +
>>> +Metadata such as PIDs, UIDs or GIDs are automatically translated to the
>>> +namespaces of the domain that is used to send a message over. The namespaces
>>> +of a domain are pinned at creation time, which is when the filesystem has been
>>> +mounted.
>>> +
>>> +Metadata items that cannot be translated are dropped.
>>
>> What if the receiver said that the item was mandatory?
>
> It is still dropped. It's the responsibility of the receiver to reject
> messages that lack required metadata.
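
Since untranslatable items are silently dropped, a receiver that
depends on an item has to verify its presence itself. A minimal sketch,
assuming the receiver has already walked the message items and recorded
the types it saw in a bitmask; the ITEM_* bits and missing_required()
are hypothetical names, not kdbus constants.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical "item seen" bits, one per metadata item type. */
#define ITEM_CREDS    (UINT64_C(1) << 0)
#define ITEM_SECLABEL (UINT64_C(1) << 1)
#define ITEM_CGROUP   (UINT64_C(1) << 2)
#define ITEM_ALL      (ITEM_CREDS | ITEM_SECLABEL | ITEM_CGROUP)

/*
 * Returns the required bits that are absent from 'present'.
 * A result of 0 means the message carries everything the receiver needs.
 */
uint64_t missing_required(uint64_t present, uint64_t required)
{
	assert((required & ~ITEM_ALL) == 0);	/* only known item bits */

	return required & ~present;
}
```

A receiver would reject (or drop) any message for which
missing_required() returns a non-zero mask.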

Thanks,
Andy

* Re: kdbus: add documentation
@ 2014-11-26 11:55           ` David Herrmann
  0 siblings, 0 replies; 73+ messages in thread
From: David Herrmann @ 2014-11-26 11:55 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Greg Kroah-Hartman, Arnd Bergmann, Eric W. Biederman,
	One Thousand Gnomes, Tom Gundersen, Jiri Kosina, Linux API,
	linux-kernel, Daniel Mack, Djalal Harouni

Hi

On Mon, Nov 24, 2014 at 9:57 PM, Andy Lutomirski <luto@amacapital.net> wrote:
> On Mon, Nov 24, 2014 at 12:16 PM, David Herrmann <dh.herrmann@gmail.com> wrote:
>> [snip]
>>>> +6.5 Getting information about a connection's bus creator
>>>> +--------------------------------------------------------
>>>> +
>>>> +The KDBUS_CMD_BUS_CREATOR_INFO ioctl takes the same struct as
>>>> +KDBUS_CMD_CONN_INFO but is used to retrieve information about the creator of
>>>> +the bus the connection is attached to. The metadata returned by this call is
>>>> +collected during the creation of the bus and is never altered afterwards, so
>>>> +it provides pristine information on the task that created the bus, at the
>>>> +moment when it did so.
>>>
>>> What's this for?  I understand the need for the creator of busses to
>>> be authenticated, but doing it like this means that anyone who will
>>> *fail* authentication can DoS the authentic creator.
>>
>> This returns information on a bus owner, to determine whether a
>> connection is connected to a system, user or session bus. Note that
>> the bus-creator itself is not a valid peer on the bus, so you cannot
>> send messages to them. Which kind of DoS do you have in mind?
>
> I assume that the logic is something like:
>
> connect to bus
> request bus metadata
> if (bus metadata matches expectations) {
>   great, trust the bus!
> } else {
>   oh crap!
> }

Uh, no, this is really not the logic that should be assumed. It's more
for code where you want to simply pass a bus fd, and the code knows
nothing about it. Now, the code can derive some information from the
bus fd, like for example who owns it. Then, depending on some of the
creds returned it can determine whether to read configuration file set
A or B and so on. This is particularly useful for all kinds of
unprivileged bus services that end up running on any kind of bus and
need to be able to figure out what they are actually operating on.
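
A rough sketch of that use case, assuming the creator's uid has already
been obtained through KDBUS_CMD_BUS_CREATOR_INFO (the ioctl itself is
not shown). The uid 0 rule, the helper names and the config paths are
all made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum bus_kind { BUS_SYSTEM, BUS_USER };

/* Illustration only: a bus created by uid 0 is treated as a system bus. */
enum bus_kind classify_bus(uint64_t creator_uid)
{
	return creator_uid == 0 ? BUS_SYSTEM : BUS_USER;
}

/* Pick configuration set A or B; both paths are hypothetical. */
const char *config_for_bus(uint64_t creator_uid)
{
	switch (classify_bus(creator_uid)) {
	case BUS_SYSTEM:
		return "/etc/myservice/system.conf";
	case BUS_USER:
		return "/etc/myservice/user.conf";
	}

	assert(0 && "unreachable");
	return NULL;
}
```

The service only adapts its own behavior to the bus it was handed; it
does not use the creator information to decide whether to trust the bus.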

> If I'm understanding it right, then user code only really has two
> outcomes: the good case and the "oh crap!" case.  The problem is that
> "oh crap!" isn't a clean failure -- if it happens, then the
> application has just been DoSed, because in that case, one of two
> things happened:
>
> 1. Some policy mismatch means that the legitimate bus owner did create
> the bus, but the user application is confused.  This will result in
> difficult-to-diagnose failures.
>
> 2. A malicious or confused program created the bus.  This is a DoS --
> even the legitimate bus creator can't actually create the bus now.
>
> So I think that the policy should be applied at the time that the bus
> name is claimed, not at the time that someone else tries to use the
> bus.  IOW, the way that you verify you're talking to the system bus
> should be by checking that the bus is called "system", not by checking
> that UID 0 created the bus.
>

[snip]

>>>> +
>>>> +Also, if the connection allowed for file descriptors to be passed
>>>> +(KDBUS_HELLO_ACCEPT_FD), and if the message contained any, they will be
>>>> +installed into the receiving process after the KDBUS_CMD_MSG_RECV ioctl
>>>> +returns. The receiving task is obliged to close all of them appropriately.
>>>
>>> This makes it sound like fds are installed at receive time.  What
>>> prevents resource exhaustion due to having excessive numbers of fds in
>>> transit (that are presumably not accounted to anyone)?
>>
>> We have a per-user message accounting for undelivered messages, as
>> well as a maximum number of pending messages per connection on the
>> receiving end. These limits are accounted on a "user<->user" basis, so
>> the limit of a user A will not affect two other users (B and C)
>> talking.
>
> But you can shove tons of fds in a message, and you can have lots of
> messages, and some of the fds can be fds of unix sockets that have fds
> queued up in them, and one of those fds could be the fd to the kdbus
> connection that sent the fd...

You cannot send kdbus-fds or unix-fds over kdbus, right now. We have
people working on the AF_UNIX gc to make it more generic and include
external types. Until then, we simply prevent recursive fd passing.

> This is not advice as to what to do about it, but I think that it will
> be a problem at some point.
>
>
>>>> +11. Policy
>>>> +===============================================================================
>>>> +
>>>> +A policy database restricts the possibilities of connections to own, see and
>>>> +talk to well-known names. It can be associated with a bus (through a policy
>>>> +holder connection) or a custom endpoint.
>>>
>>> ISTM metadata items on bus names should be replaced with policy that
>>> applies to the domain as a whole and governs bus creation.
>>
>> No, well-known names are bound to buses, so a bus is really the right
>> place to hold policy about which process is allowed to claim them.
>> Every user is allowed to create a bus of its own, there's no policy
>> for that, and there shouldn't be.
>>
>> It has nothing to do with metadata items.
>
> But it does -- the creator of the bus binds metadata to that bus at
> creation time.
>
> I think that a better solution would be to have a global policy that
> says, for example, "to create the bus called 'system', the creator
> must have selinux label xyz" or "to create a user bus called
> uid-1000-privileged-ui-bus the creator must have some cgroup" or
> whatever.
>
> Although maybe a better solution would leave this in the kernel but
> allow any cgroup to create a bus with a name that indicates the
> creating cgroup.  Then I could have my desktop shell create a
> "/cgroup/path/to/desktop" for per-user privileged things.

We enforce the UID as first entity of the bus name. Again, this is our
default policy because we rely on user-based access control. If we
want more fine-grained access-control, we can introduce other policies
at any time. For instance, we could enforce "cg-<cgroup>-<busname>"
later on, where the kernel requires the caller to prefix the bus with
"cg-<cgroup>-", where <cgroup> is the cgroup-path encoded in some way.

We provide one policy as default, and we have a use-case for it.
Further policies are always welcome as extensions later on. I don't
see why we should provide all those right from the beginning without
any users right now.
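
To make the default naming rule concrete, here is a minimal sketch of a
check that a bus name starts with the caller's UID. The exact
"<uid>-<name>" layout and the helper name bus_name_has_uid_prefix() are
assumptions made for illustration; they are not taken from the patch.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: returns true if 'name' begins with "<uid>-" and
 * has at least one character after the prefix. The "<uid>-" layout is an
 * assumption of this sketch.
 */
bool bus_name_has_uid_prefix(const char *name, unsigned long uid)
{
	char prefix[32];
	int len = snprintf(prefix, sizeof(prefix), "%lu-", uid);

	if (len < 0 || (size_t)len >= sizeof(prefix))
		return false;

	/* The prefix must match and something must follow it. */
	return strncmp(name, prefix, (size_t)len) == 0 && name[len] != '\0';
}
```

A "cg-<cgroup>-<busname>" policy, as mentioned above, could presumably be
checked the same way with a different prefix.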

>>
>>>> +A set of policy rules is described by a name and multiple access rules, defined
>>>> +by the following struct.
>>>> +
>>>> +struct kdbus_policy_access {
>>>> +  __u64 type;  /* USER, GROUP, WORLD */
>>>> +    One of the following.
>>>> +
>>>> +    KDBUS_POLICY_ACCESS_USER
>>>> +      Grant access to a user with the uid stored in the 'id' field.
>>>> +
>>>> +    KDBUS_POLICY_ACCESS_GROUP
>>>> +      Grant access to a user with the gid stored in the 'id' field.
>>>> +
>>>> +    KDBUS_POLICY_ACCESS_WORLD
>>>> +      Grant access to everyone. The 'id' field is ignored.
>>>> +
>>>> +  __u64 access;        /* OWN, TALK, SEE */
>>>> +    The access to grant.
>>>> +
>>>> +    KDBUS_POLICY_SEE
>>>> +      Allow the name to be seen.
>>>> +
>>>> +    KDBUS_POLICY_TALK
>>>> +      Allow the name to be talked to.
>>>> +
>>>> +    KDBUS_POLICY_OWN
>>>> +      Allow the name to be owned.
>>>> +
>>>> +  __u64 id;
>>>> +    For KDBUS_POLICY_ACCESS_USER, stores the uid.
>>>> +    For KDBUS_POLICY_ACCESS_GROUP, stores the gid.
>>>> +};
>>>
>>>
>>> What happens if there are multiple matches?
>>
>> We only have _granting_ policy entries. We search through the
>> policy-db until we find an entry that grants access. We do _not_ stop
>> on the first item that matches.
>
> Yay!  Can you document that more clearly?

Sure!
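
As a starting point for that documentation, the lookup could be sketched
roughly as below. This is only an illustration: the struct mirrors the
fields quoted above, but the numeric values of the SKETCH_* constants are
placeholders, the function names are hypothetical, and the sketch matches
the requested access level exactly (it does not model any implication
between OWN, TALK and SEE).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder values; only the concepts come from the documentation. */
enum { SKETCH_ACCESS_USER = 1, SKETCH_ACCESS_GROUP, SKETCH_ACCESS_WORLD };
enum { SKETCH_POLICY_SEE = 1, SKETCH_POLICY_TALK, SKETCH_POLICY_OWN };

struct kdbus_policy_access {
	uint64_t type;   /* USER, GROUP, WORLD */
	uint64_t access; /* OWN, TALK, SEE */
	uint64_t id;     /* uid or gid, ignored for WORLD */
};

/* Does this single entry grant 'access' to the given uid/gid? */
bool entry_grants(const struct kdbus_policy_access *e, uint64_t access,
		  uint64_t uid, uint64_t gid)
{
	if (e->access != access)
		return false;

	switch (e->type) {
	case SKETCH_ACCESS_USER:
		return e->id == uid;
	case SKETCH_ACCESS_GROUP:
		return e->id == gid;
	case SKETCH_ACCESS_WORLD:
		return true;
	default:
		return false;
	}
}

/*
 * Entries only ever grant. A non-granting entry never ends the search;
 * access is denied only if no entry in the whole list grants it.
 */
bool policy_grants(const struct kdbus_policy_access *db, size_t n,
		   uint64_t access, uint64_t uid, uint64_t gid)
{
	for (size_t i = 0; i < n; i++)
		if (entry_grants(&db[i], access, uid, gid))
			return true;
	return false;
}
```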

>>
>>>
>>>> +
>>>> +11.4 TALK access and multiple well-known names per connection
>>>> +-------------------------------------------------------------
>>>> +
>>>> +Note that TALK access is checked against all names of a connection.
>>>> +For example, if a connection owns both 'org.foo.bar' and 'org.blah.baz', and
>>>> +the policy database allows 'org.blah.baz' to be talked to by WORLD, then this
>>>> +permission is also granted to 'org.foo.bar'. That might sound illogical, but
>>>> +after all, we allow messages to be directed to either the name or a well-known
>>>> +name, and policy is applied to the connection, not the name. In other words,
>>>> +the effective TALK policy for a connection is the most permissive of all names
>>>> +the connection owns.
>>>
>>> This does seem illogical.  Does the recipient at least know which
>>> well-known name was addressed?
>>
>> If the sender addressed it to a well-known name, yes. If the sender
>> addressed the message to a unique-ID, there will be no such name, of
>> course. Still, the policy applies to such transactions either way
>> (standard D-Bus behavior).
>>
>> Note however that dbus1 will not pass along the destination well-known
>> name, hence most userspace libraries will ignore this information too,
>> even if they run on kdbus which might pass this information around.
>> The right way for services that carry multiple service names to
>> discern which actual service is being talked to is having separate
>> object paths for the different functionality to hide between the
>> services.
>
> It seems unfortunate to keep around this really weird behavior for the
> benefit of legacy applications.  Could there perhaps be a flag that a
> connection can set to indicate that it understands per-destination
> access control and therefore wants stricter policy enforcement?

We actually think that it's a good idea not to use the destination
name for doing different things, to make things more transparent. For
example, we have tools that can explore the bus, introspect things
(like d-bus), and they should show the same objects regardless of
which name you use to access a service. It's more transparent then, and
really just reduces names to some labels that make addressing things
easier, but that are actually completely unnecessary for actual method
invocations.

This behaviour is also relied on by a number of current bindings. For
example GLib's implementation usually caches the unique name of a
peer, and uses that for talking to remote objects (rather than always
using the well-known name), in order to get an error back when a name
becomes unavailable (maybe because a service died) or is moved to a
different peer. If daemons would always take the destination name into
account this kind of logic could never work.

It does take some time to get used to the fact that names are
exclusively used for message routing and policies, not as
target-entity of actual method calls. But the current dbus1 behaviour
makes a ton of sense, and is really something we want to keep. It
improves how clients can do life-cycle tracking of remote objects.

Note that D-Bus modeled unique-names and well-known-names after IP
addresses and DNS names. It's a very similar model, and, like DNS
names, well-known names have no effect on the routing of messages.
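
For reference, the "most permissive of all names" rule from section 11.4
can be sketched as follows. The policy is flattened to one WORLD-may-talk
flag per name purely for illustration; struct talk_rule and both function
names are hypothetical and do not appear in the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, flattened policy: one flag per well-known name. */
struct talk_rule {
	const char *name;
	bool world_may_talk;
};

/* Does the policy grant TALK on this one name? */
bool name_allows_talk(const struct talk_rule *rules, size_t n_rules,
		      const char *name)
{
	for (size_t i = 0; i < n_rules; i++)
		if (strcmp(rules[i].name, name) == 0 && rules[i].world_may_talk)
			return true;
	return false;
}

/*
 * Policy is applied to the connection, not to the addressed name: the
 * sender may talk to the connection if any name it owns grants TALK.
 */
bool conn_allows_talk(const struct talk_rule *rules, size_t n_rules,
		      const char *const *owned, size_t n_owned)
{
	for (size_t i = 0; i < n_owned; i++)
		if (name_allows_talk(rules, n_rules, owned[i]))
			return true;
	return false;
}
```

With the example from the documentation, a connection owning both
'org.foo.bar' and 'org.blah.baz' is reachable as soon as 'org.blah.baz'
grants TALK to WORLD, no matter which of the two names (or the unique
name) the sender used.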

>>
>>>> +11.5 Implicit policies
>>>> +----------------------
>>>> +
>>>> +Depending on the type of the endpoint, a set of implicit rules might be
>>>> +enforced. On default endpoints, the following set is enforced:
>>>> +
>>>
>>> How do these rules interact with installed policy?
>>
>> As said before, all policy entries _grant_ access. We look through all
>> entries until we find one that grants access.
>>
>>>> +  * Privileged connections always override any installed policy. Those
>>>> +    connections could easily install their own policies, so there is no
>>>> +    reason to enforce installed policies.
>>>> +  * Connections can always talk to connections of the same user. This
>>>> +    includes broadcast messages.
>>>
>>> Why?
>>
>> All limits on buses are enforced on a user<->user basis. We don't want
>> to provide policies that are more fine-grained than our accounting.
>
> This seems completely at odds with all the fine-grained metadata
> stuff.  Also, anything that relies on this may get very confused when
> the LSM hooks go in, because I'm reasonably sure that the intent is
> for them to *not* follow this principle.

User-based accounting has always been the default, right? We are open
to extend the API to support any other accounting scheme (LSM,
cgroup-based, ...). But like bus-name-policies, I think it's fine to
keep this as future extension. If you think the current design
precludes LSM-based accounting, lemme know and we can fix it. But we
have talked to LSM people before (and there have been patches on
LKML), and they seemed fine with it.

>>
>>> If anyone ever strengthens the concept of identity to include
>>> things other than users (hmm -- there are already groups), this could
>>> be very limiting.
>>
>> If user-based accounting is not suitable, you can create custom
>> endpoints. Future extensions to that are always welcome. So far, the
>> default user-based accounting was enough. And I think it's suitable as
>> default.
>>
>>>> +  * Connections that own names might send broadcast messages to other
>>>> +    connections that belong to a different user, but only if that
>>>> +    destination connection does not own any name.
>>>> +
>
> (Also, what does "might" mean here?)
>
>>>
>>> This is weird.  It is also differently illogical than the "illogical"
>>> thing above.
>>
>> Actually it follows the same model described above. If two connections
>> are running under the same user then broadcasts are allowed, but if
>> they are running under different users *and* if the destination owns a
>> well-known name, then broadcasts are subject to TALK policy checks
>> since that destination may own a restricted well-known name that is
>> not interested in broadcasts. So this implicit policy is just
>> fast-path for the common case where the target is subscribed to a
>> broadcast and does not own any name.
>
> Huh?
>
> Say I have two users, "Sender" and "Receiver", each with a single
> connection.  If Receiver owns no well-known names, then Sender can
> send to it.  If Receiver owns one well-known name, then Sender needs
> to pass a TALK check on that name.  If Receiver owns two well-known
> names, then Sender only needs to pass a TALK check on one of them.
>
> Am I understanding this right?  If I am, then I think this is in the
> category of baroque and inconsistent security rules which everyone
> will screw up and therefore introduce security vulnerabilities.
>
> Can you really not enforce the much simpler rule that, to send to a
> name, you must have permission to send to *that* name?  If legacy
> dbus1 receivers register two names and don't validate everything
> correctly, then only the legacy receivers have problems.

Sorry, I got confused here. That implicit policy is now dropped.

>>
>>>> +
>>>> +13. Metadata
>>>> +===============================================================================
>> [snip]
>>>> +13.1 Known item types
>>>> +---------------------
>>>> +
>>>> +The following attach flags are currently supported.
>>>> +
>>>> +  KDBUS_ATTACH_TIMESTAMP
>>>> +    Attaches an item of type KDBUS_ITEM_TIMESTAMP which contains both the
>>>> +    monotonic and the realtime timestamp, taken when the message was
>>>> +    processed on the kernel side.
>>>> +
>>>> +  KDBUS_ATTACH_CREDS
>>>> +    Attaches an item of type KDBUS_ITEM_CREDS, containing credentials as
>>>> +    described in kdbus_creds: the uid, gid, pid, tid and starttime of the task.
>>>> +
>>>
>>> As mentioned last time, please remove or justify starttime.
>>
>> starttime allows detecting PID overflows. Exposing the process
>> starttime is useful to detect when a PID is getting reused.
>> Unfortunately, we don't have 64bit pids, so we need the pid+time
>> combination to avoid ambiguity.
>
> NAK, I think.
>
> I agree that PID overflow is a real issue and should be addressed
> somehow.  But please address it for real instead of adding Yet Another
> Hack (tm).  In the mean time, leave that hack out, please.
>
> I would *love* to see PIDs have extra high bits at the end, done in a
> way that supports CRIU and that guarantees no reuse unless something
> privileged intentionally mis-programs it.  But starttime isn't that
> mechanism.

The starttime logic sufficiently fixes the issue. If one great day in
the future somebody invents some completely new concept for making
this problem go away, we can look into that, but even then the field
is still valuable for informational purposes. I mean, the kernel
tracks this and exposes this in /proc for a reason...

In the meantime, we don't have any other way of solving this problem,
so we'll leave this in.
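
To illustrate what a receiver would do with it, here is a minimal sketch
that treats the (pid, starttime) pair as the identity of a task. The
struct and function names are hypothetical; the real item layout is the
one described for kdbus_creds in the documentation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical snapshot of the two fields relevant for reuse detection. */
struct task_id {
	uint64_t pid;
	uint64_t starttime;
};

/*
 * Two snapshots refer to the same task only if both fields match. A
 * matching pid with a different starttime means the pid was reused.
 */
bool same_task(const struct task_id *a, const struct task_id *b)
{
	return a->pid == b->pid && a->starttime == b->starttime;
}
```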

>>
>>>> +  KDBUS_ATTACH_AUXGROUPS
>>>> +    Attaches an item of type KDBUS_ITEM_AUXGROUPS, containing a dynamic
>>>> +    number of auxiliary groups the sending task was a member of.
>>>> +
>>>> +  KDBUS_ATTACH_NAMES
>>>> +    Attaches items of type KDBUS_ITEM_OWNED_NAME, one for each name the sending
>>>> +    connection currently owns. The name and flags are stored in kdbus_item.name
>>>> +    for each of them.
>>>> +
>>>
>>> That's interesting.  What's it for?
>>
>> It's a valuable piece of information for receivers to know which bus
>> names a sender has claimed. For instance, we need this information for
>> the D-Bus proxy service, because we have to apply D-Bus1 policy in
>> that case, and we need to get a list of owned names in a race-free
>> manner to check the policy against.
>
> But if you change the rule to the sensible one where you need
> permission to TALK to the name that you're talking to, this goes away,
> right?

This does not work if a message is directed at a unique-name, as
explained above (or, think broadcasts).

>>
>>>> +  KDBUS_ATTACH_TID_COMM
>>>> +    Attaches an item of type KDBUS_ITEM_TID_COMM, transporting the sending
>>>> +    task's 'comm', for the tid.  The string is stored in kdbus_item.str.
>>>> +
>>>> +  KDBUS_ATTACH_PID_COMM
>>>> +    Attaches an item of type KDBUS_ITEM_PID_COMM, transporting the sending
>>>> +    task's 'comm', for the pid.  The string is stored in kdbus_item.str.
>>>> +
>>>> +  KDBUS_ATTACH_EXE
>>>> +    Attaches an item of type KDBUS_ITEM_EXE, containing the path to the
>>>> +    executable of the sending task, stored in kdbus_item.str.
>>>> +
>>>> +  KDBUS_ATTACH_CMDLINE
>>>> +    Attaches an item of type KDBUS_ITEM_CMDLINE, containing the command line
>>>> +    arguments of the sending task, as an array of strings, stored in
>>>> +    kdbus_item.str.
>>>
>>> Please remove these four items.  They are genuinely useless.  Anything
>>> that uses them for anything is either buggy or should have asked the
>>> sender to put the value in the payload (and immediately wondered why
>>> it was doing that).
>>
>> We use them for logging, debugging and monitoring. With our wireshark
>> extension it's pretty useful to know the comm-name of a process when
>> monitoring a bus. As we explained last time, this is not about
>> security. We're aware that a process can modify them. We use them only
>> as additional meta-data for logging and debugging.
>
> Use the PID.  Really.  Your wireshark extension can look this crap up
> in /proc and, if it fails due to a race, big frickin' deal.

I see no reason for leaving it up to the client to do this extra work
if it can as well be attached by the kernel, really.

We use the PID on dbus1 systems for cases like this. But it's actually
too racy to be useful. For example, in systemd we ship a tiny binary
that is used as cgroup agent, which just pushes a message about the
fact it just got called for a cgroup onto the bus and then exits.
Since it only runs for a very very short time, any peer which then
tries to read the metadata off it is pretty likely to fail. And there
are quite a number of processes like that, that just do one thing and
die, especially in the early boot process. For none of them we
currently can generate useful log metadata, because all we have is a
PID that we have barely any useful information about.
This is a real race we get lots of bug-reports for. With kdbus we want
to fix this, by optionally attaching this useful data to the message,
so that the receiver can get the information when it wants to.

>>
>> If we put those items into the payload, we have to transmit this data
>> even if the destination process is not interested in this.
>> Furthermore, each caller has to run multiple syscalls on each message
>> to retrieve those values.
>>
>> We use these items heavily for filtering and debugging, regardless of
>> the payload protocol that is transmitted on the bus.
>>
>> To give another specific use-case here: dbus supports bus activation,
>> where a message sent to a non-running service causes it to be spawned
>> implicitly without losing the message. Now, with such a scheme it is
>> incredibly useful to be able to log which client caused a service to
>> be triggered, hence we want to know the cmdline/exe/comm of the
>> client. Not knowing this is a major pita when trying to trace the boot
>> process and figuring out why a specific service got activated.
>
> Again, use the PID for tracing, please.

No. It's racy. Processes can die too quickly. And this is not a race
that would be about security, but it's a real-life race that is just
awful to run into when you try to trace what's going on on your
system.

> At the very least, make it impossible to specify these fields in the
> "must be received" set and rename them to something like
> KDBUS_INSTALL_UNRELIABLE_CMDLINE, etc, because they're unreliable.
>
> Finally, this stuff should only be readable by privileged users.  And
> using the PID accomplishes that.
>
>>
>> Also note that since v2 of the patch there's actually a per-sender
>> mask for meta-data like this, hence a peer which doesn't want to pass
>> its exec/cmdline/comm along can do that. Of course, this will
>> seriously hamper debuggability and transparency...
>
> Transparency is a terrible thing here.
>
> How many users put passwords into things on the command line?  Yes,
> it's a bad idea (for reasons that are entirely stupid), but now those
> passwords get *logged*.
>
> If this is in the kernel, and someone complains that sensitive data is
> showing up on ten different logs on their system, they'll *correctly*
> blame the kernel.  If you at least use the PID and restrict it to the
> logging code, then at least the bug report will go to the logging
> daemon, which will be *correctly* accused of doing something daft, and
> it can be fixed.

Hmm? Not following here. This information is visible via /proc too. If
you hide it from /proc via the hide_pid logic, then it is also gone
from the kdbus meta-data. Also, note again that clients that don't
want this information to be passed to services can declare that now
with their sender creds mask, introduced with v2.
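
Put differently, an item only travels if both sides agree. A minimal
sketch of that rule, assuming the sender mask and the receiver's opt-in
are plain bitmasks that get ANDed together (the SKETCH_ATTACH_* bit
values and the function name are placeholders, not the real ABI):

```c
#include <stdint.h>

/* Placeholder bit positions for illustration only. */
#define SKETCH_ATTACH_CREDS    (UINT64_C(1) << 0)
#define SKETCH_ATTACH_PID_COMM (UINT64_C(1) << 1)
#define SKETCH_ATTACH_EXE      (UINT64_C(1) << 2)
#define SKETCH_ATTACH_CMDLINE  (UINT64_C(1) << 3)

/*
 * An item is attached only if the sender permits it and the receiver
 * asked for it. Either side can suppress an item on its own.
 */
uint64_t effective_attach_flags(uint64_t sender_mask, uint64_t receiver_mask)
{
	return sender_mask & receiver_mask;
}
```

A client that does not want its command line passed along would simply
leave the corresponding bit out of its sender mask.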

>>
>>>> +
>>>> +  KDBUS_ATTACH_CGROUP
>>>> +    Attaches an item of type KDBUS_ITEM_CGROUP with the task's cgroup path.
>>>> +
>>>> +  KDBUS_ATTACH_CAPS
>>>> +    Attaches an item of type KDBUS_ITEM_CAPS, carrying sets of capabilities
>>>> +    that should be accessed via kdbus_item.caps.caps. Also, userspace should
>>>> +    be written in a way that it takes kdbus_item.caps.last_cap into account,
>>>> +    and derive the number of sets and rows from the item size and the reported
>>>> +    number of valid capability bits.
>>>> +
>>>
>>> Please remove this, too, or justify its use.
>>
>> cgroup information tells us which service is doing a bus request. This
>> is useful for a variety of things. For example, the bus activation
>> logging item above benefits from it. In general, if some message shall
>> be logged about any client it is useful to know its service name.
>>
>> Capabilities are useful to authenticate specific method calls. For
>> example, when a client asks systemd to reboot, without this concept we
>> can only check for UID==0 to decide whether to allow this. Breaking
>> this down to capabilities in a race-free way has the benefit of
>> allowing systemd to bind this to CAP_SYS_BOOT instead. There is no
>> reason to deny a process with CAP_SYS_BOOT to reboot via bus-APIs, as
>> they could just enforce it via syscalls, anyway.
>
> With all due respect, BS.
>
> I admit that there is probably no reason to deny systemd-based reboot
> to a CAP_SYS_BOOT-capable process, but there is absolutely no reason
> to give processes that are supposed to reboot using systemd the
> CAP_SYS_BOOT capability.

No, and this is not how this works. Note that for unprivileged clients
systemd checks PolicyKit in order to identify whether to allow certain
privileged operations. However, PolicyKit requires bus chatter, is slow
and quite complex, hence the code shortcuts things if it knows that
the client is privileged anyway, and doesn't even bother with
PolicyKit at all. This is where the caps come into play, since this
shortcutting really should *NOT* be done for every single client, but
only for those with the rights to do the operation anyway.

Hence, this is really not about overloading capabilities with new
meanings. Instead it is about shortcutting policykit for privileged
clients. And this shortcutting should be as restricted as possible.

> In any event, I suspect you'll have a hard time justifying this for
> anything other than CAP_SYS_BOOT.  Just because CAP_SYS_ADMIN users
> can probably do whatever they want doesn't mean that systemd should
> make that a built-in policy.

Well, the other option for these APIs is to use the euid, which is
hardly any better.

systemd-timedated shortcuts policykit if the client has CAP_SYS_TIME
and tries to change the system clock. Similarly, if a process asks
logind to kill a session, we bypass pk if the client has CAP_KILL.
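
A rough sketch of that shortcut logic, to show where the caps item comes
in. It assumes the effective set arrives as an array of 32-bit words and
that bits above last_cap must be ignored; the function names are
hypothetical, and the SKETCH_CAP_* numbers are copied from
<linux/capability.h> so that the example stands alone.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Capability numbers as defined in <linux/capability.h>. */
#define SKETCH_CAP_KILL      5
#define SKETCH_CAP_SYS_BOOT 22
#define SKETCH_CAP_SYS_TIME 25

/*
 * Test one bit in the effective set. 'eff' must hold at least
 * last_cap / 32 + 1 words. Bits above last_cap are not valid.
 */
bool cap_is_set(const uint32_t *eff, uint32_t last_cap, uint32_t cap)
{
	if (cap > last_cap)
		return false;
	return (eff[cap / 32] >> (cap % 32)) & 1u;
}

/*
 * The shortcut only applies if the caps item is present and carries the
 * capability. A missing item (eff == NULL) means the slow path is taken,
 * so dropping the item can only reduce privileges.
 */
bool must_ask_policykit(const uint32_t *eff, uint32_t last_cap, uint32_t cap)
{
	if (eff == NULL)
		return true;
	return !cap_is_set(eff, last_cap, cap);
}
```

So a reboot request from a client whose effective set contains
CAP_SYS_BOOT would skip PolicyKit, while every other client goes through
the normal PolicyKit check.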

> Also, wtf is the bounding set and such for?  At the very least this
> should only be the effective set.

Yes, the code that makes use of this for shortcutting pk uses the
effective set only. The other ones we allow sending across for
enhancing logging of security operations.

In general: all creds we collect at the very least are incredibly
useful for generating log records in a race-free fashion. As pointed
out above the "race-free" bit alone solves real-world issues that are
highly annoying if we don't have it.

>>
>> We think it's a useful and reliable authentication method. Why should
>> we remove it?
>
> Because the implementation is buggy and therefore it's insecure?
> Remember that caps are namespaced in an interesting way.

Yes, we are well aware of the fact that we currently have no good way
to translate a full set of capabilities from one user-ns to another.
Hence, the only sane thing to do in such situations is to drop the
entire item, which is what we do. Once we have a reliable way of
translating things, we can add that to our code. Note that a set
capability flag will only gain you more access level, so if caps are
missing from a message, a user might only have *less* privileges, not
more.

>>
>> Anyway, these items are just optional. The sender can refuse to
>> reveal them, and the item is only transmitted if the receiver opted in
>> for it, too. So there's no need to drop any item type from the
>> protocol.
>
> No.
>
> Because if receivers opt in to most of these, *they're doing it
> wrong*, and the kernel shouldn't be in the business of helping them.

No, they are not doing it "wrong". The services would do things wrong
if they made security decisions on bits that cannot be acquired in a
race-free way. And services would do things in a dirty way if they
generated logging data in a racy way (like you suggest), by
reading things from /proc.

Thanks
David

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: kdbus: add documentation
@ 2014-11-26 11:55           ` David Herrmann
  0 siblings, 0 replies; 73+ messages in thread
From: David Herrmann @ 2014-11-26 11:55 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Greg Kroah-Hartman, Arnd Bergmann, Eric W. Biederman,
	One Thousand Gnomes, Tom Gundersen, Jiri Kosina, Linux API,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, Daniel Mack, Djalal Harouni

Hi

On Mon, Nov 24, 2014 at 9:57 PM, Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org> wrote:
> On Mon, Nov 24, 2014 at 12:16 PM, David Herrmann <dh.herrmann-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>> [snip]
>>>> +6.5 Getting information about a connection's bus creator
>>>> +--------------------------------------------------------
>>>> +
>>>> +The KDBUS_CMD_BUS_CREATOR_INFO ioctl takes the same struct as
>>>> +KDBUS_CMD_CONN_INFO but is used to retrieve information about the creator of
>>>> +the bus the connection is attached to. The metadata returned by this call is
>>>> +collected during the creation of the bus and is never altered afterwards, so
>>>> +it provides pristine information on the task that created the bus, at the
>>>> +moment when it did so.
>>>
>>> What's this for?  I understand the need for the creator of busses to
>>> be authenticated, but doing it like this mean that anyone who will
>>> *fail* authentication can DoS the authentic creator.
>>
>> This returns information on a bus owner, to determine whether a
>> connection is connected to a system, user or session bus. Note that
>> the bus-creator itself is not a valid peer on the bus, so you cannot
>> send messages to them. Which kind of DoS do you have in mind?
>
> I assume that the logic is something like:
>
> connect to bus
> request bus metadata
> if (bus metadata matches expectations) {
>   great, trust the bus!
> } else {
>   oh crap!
> }

Uh, no, this is really not the logic that should be assumed. It's more
for code where you want to simply pass a bus fd, and the code knows
nothing about it. Now, the code can derive some information from the
bus fd, like for example who owns it. Then, depending on some of the
creds returned it can determine whether to read configuration file set
A or B and so on. This is particularly useful for all kinds of
unprivileged bus services that end up running on any kind of bus and
need to be able to figure out what they are actually operating on.

> If I'm understanding it right, then user code only really has two
> outcomes: the good case and the "oh crap!" case.  The problem is that
> "oh crap!" isn't a clean failure -- if it happens, then the
> application has just been DoSed, because in that case, one of two
> things happened:
>
> 1. Some policy mismatch means that the legitimate bus owner did create
> the bus, but the user application is confused.  This will result in
> difficult-to-diagnose failures.
>
> 2. A malicious or confused program created the bus.  This is a DoS --
> even the legitimate bus creator can't actually create the bus now.
>
> So I think that the policy should be applied at the time that the bus
> name is claimed, not at the time that someone else tries to use the
> bus.  IOW, the way that you verify you're talking to the system bus
> should be by checking that the bus is called "system", not by checking
> that UID 0 created the bus.
>

[snip]

>>>> +
>>>> +Also, if the connection allowed for file descriptors to be passed
>>>> +(KDBUS_HELLO_ACCEPT_FD), and if the message contained any, they will be
>>>> +installed into the receiving process after the KDBUS_CMD_MSG_RECV ioctl
>>>> +returns. The receiving task is obliged to close all of them appropriately.
>>>
>>> This makes it sound like fds are installed at receive time.  What
>>> prevents resource exhaustion due to having excessive numbers of fds in
>>> transit (that are presumably not accounted to anyone)?
>>
>> We have a per-user message accounting for undelivered messages, as
>> well as a maximum number of pending messages per connection on the
>> receiving end. These limits are accounted on a "user<->user" basis, so
>> the limit of a user A will not affect two other users (B and C)
>> talking.
>
> But you can shove tons of fds in a message, and you can have lots of
> messages, and some of the fds can be fds of unix sockets that have fds
> queued up in them, and one of those fds could be the fd to the kdbus
> connection that sent the fd...

You cannot send kdbus-fds or unix-fds over kdbus, right now. We have
people working on the AF_UNIX gc to make it more generic and include
external types. Until then, we simply prevent recursive fd passing.

> This is not advice as to what to do about it, but I think that it will
> be a problem at some point.
>
>
>>>> +11. Policy
>>>> +===============================================================================
>>>> +
>>>> +A policy database restricts the possibilities of connections to own, see and
>>>> +talk to well-known names. It can be associated with a bus (through a policy
>>>> +holder connection) or a custom endpoint.
>>>
>>> ISTM metadata items on bus names should be replaced with policy that
>>> applies to the domain as a whole and governs bus creation.
>>
>> No, well-known names are bound to buses, so a bus is really the right
>> place to hold policy about which process is allowed to claim them.
>> Every user is allowed to create a bus of its own, there's no policy
>> for that, and there shouldn't be.
>>
>> It has nothing to do with metadata items.
>
> But it does -- the creator of the bus binds metadata to that bus at
> creation time.
>
> I think that a better solution would be to have a global policy that
> says, for example, "to create the bus called 'system', the creator
> must have selinux label xyz" or "to create a user bus called
> uid-1000-privileged-ui-bus the creator must have some cgroup" or
> whatever.
>
> Although maybe a better solution would leave this in the kernel but
> allow any cgroup to create a bus with a name that indicates the
> creating cgroup.  Then I could have my desktop shell create a
> "/cgroup/path/to/desktop" for per-user privileged things.

We enforce the UID as first entity of the bus name. Again, this is our
default policy because we rely on user-based access control. If we
want more fine-grained access-control, we can introduce other policies
at any time. For instance, we could enforce "cg-<cgroup>-<busname>"
later on, where the kernel requires the caller to prefix the bus with
"cg-<cgroup>-", where <cgroup> is the cgroup-path encoded in some way.

We provide one policy as default, and we have a use-case for it.
Further policies are always welcome as extensions later on. I don't
see why we should provide all those right from the beginning without
any users right now.

>>
>>>> +A set of policy rules is described by a name and multiple access rules, defined
>>>> +by the following struct.
>>>> +
>>>> +struct kdbus_policy_access {
>>>> +  __u64 type;  /* USER, GROUP, WORLD */
>>>> +    One of the following.
>>>> +
>>>> +    KDBUS_POLICY_ACCESS_USER
>>>> +      Grant access to a user with the uid stored in the 'id' field.
>>>> +
>>>> +    KDBUS_POLICY_ACCESS_GROUP
>>>> +      Grant access to a user with the gid stored in the 'id' field.
>>>> +
>>>> +    KDBUS_POLICY_ACCESS_WORLD
>>>> +      Grant access to everyone. The 'id' field is ignored.
>>>> +
>>>> +  __u64 access;        /* OWN, TALK, SEE */
>>>> +    The access to grant.
>>>> +
>>>> +    KDBUS_POLICY_SEE
>>>> +      Allow the name to be seen.
>>>> +
>>>> +    KDBUS_POLICY_TALK
>>>> +      Allow the name to be talked to.
>>>> +
>>>> +    KDBUS_POLICY_OWN
>>>> +      Allow the name to be owned.
>>>> +
>>>> +  __u64 id;
>>>> +    For KDBUS_POLICY_ACCESS_USER, stores the uid.
>>>> +    For KDBUS_POLICY_ACCESS_GROUP, stores the gid.
>>>> +};
>>>
>>>
>>> What happens if there are multiple matches?
>>
>> We only have _granting_ policy entries. We search through the
>> policy-db until we find an entry that grants access. We do _not_ stop
>> on the first item that matches.
>
> Yay!  Can you document that more clearly?

Sure!
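
As a starting point for that documentation, the lookup can be sketched
roughly as below. This is an illustration only: the ex_* names and the
enum values are made up, only a single gid is considered, and the sketch
assumes an entry grants exactly the access level it names (whether a
higher level implies a lower one is not covered here).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative values only; the real constants live in the kdbus header. */
enum { EX_ACCESS_USER, EX_ACCESS_GROUP, EX_ACCESS_WORLD };
enum { EX_POLICY_SEE, EX_POLICY_TALK, EX_POLICY_OWN };

/* Mirrors the three fields of struct kdbus_policy_access. */
struct ex_policy_access {
	uint64_t type;
	uint64_t access;
	uint64_t id;
};

/* Does this entry apply to the caller at all? */
static bool ex_entry_matches(const struct ex_policy_access *e,
			     uint64_t uid, uint64_t gid)
{
	switch (e->type) {
	case EX_ACCESS_USER:
		return e->id == uid;
	case EX_ACCESS_GROUP:
		return e->id == gid;
	case EX_ACCESS_WORLD:
		return true;
	default:
		return false;
	}
}

/*
 * Entries only ever grant. An entry that matches the caller but does
 * not grant the wanted access does NOT end the search; we keep looking
 * until some entry grants or the list is exhausted.
 */
static bool ex_policy_grants(const struct ex_policy_access *db, size_t n,
			     uint64_t uid, uint64_t gid, uint64_t wanted)
{
	assert(db != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (!ex_entry_matches(&db[i], uid, gid))
			continue;
		if (db[i].access == wanted)
			return true;
	}
	return false;
}
```

With a USER entry that grants only SEE followed by a WORLD entry that
grants TALK, that user still gets TALK, because the first match does not
terminate the lookup.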

>>
>>>
>>>> +
>>>> +11.4 TALK access and multiple well-known names per connection
>>>> +-------------------------------------------------------------
>>>> +
>>>> +Note that TALK access is checked against all names of a connection.
>>>> +For example, if a connection owns both 'org.foo.bar' and 'org.blah.baz', and
>>>> +the policy database allows 'org.blah.baz' to be talked to by WORLD, then this
>>>> +permission is also granted to 'org.foo.bar'. That might sound illogical, but
>>>> +after all, we allow messages to be directed to either the name or a well-known
>>>> +name, and policy is applied to the connection, not the name. In other words,
>>>> +the effective TALK policy for a connection is the most permissive of all names
>>>> +the connection owns.
>>>
>>> This does seem illogical.  Does the recipient at least know which
>>> well-known name was addressed?
>>
>> If the sender addressed it to a well-known name, yes. If the sender
>> addressed the message to a unique-ID, there will be no such name, of
>> course. Still, the policy applies to such transactions either way
>> (standard D-Bus behavior).
>>
>> Note however that dbus1 will not pass along the destination well-known
>> name, hence most userspace libraries will ignore this information too,
>> even if they run on kdbus which might pass this information around.
>> The right way for services that carry multiple service names to
>> discern which actual service is being talked to is having separate
>> object paths for the different functionality to hide between the
>> services.
>
> It seems unfortunate to keep around this really weird behavior for the
> benefit of legacy applications.  Could there perhaps be a flag that a
> connection can set to indicate that it understands per-destination
> access control and therefore wants stricter policy enforcement?

We actually think that it's a good idea not to use the destination
name for doing different things, to make things more transparent. For
example, we have tools that can explore the bus, introspect things
(like d-bus), and they should show the same objects regardless of
which name you access a service. It's more transparent then, and
really just reduces names to some labels that make addressing things
easier, but that are actually completely unnecessary for actual method
invocations.

This behaviour is also relied on by a number of current bindings. For
example GLib's implementation usually caches the unique name of a
peer, and uses that for talking to remote objects (rather than always
using the well-known name), in order to get an error back when a name
becomes unavailable (maybe because a service died) or is moved to a
different peer. If daemons would always take the destination name into
account this kind of logic could never work.

It does take some time to get used to the fact that names are
exclusively used for message routing and policies, not as
target-entity of actual method calls. But the current dbus1 behaviour
makes a ton of sense, and is really something we want to keep. It
improves how clients can do life-cycle tracking of remote objects.

Note that D-Bus modeled unique-names and well-known-names after IP
addresses and DNS names. It's a very similar model, and, like DNS
names, well-known names have no effect on the routing of messages.
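
For reference, the rule from section 11.4 (the effective TALK policy of
a connection is the most permissive of all names it owns) can be
sketched like this. The struct and function names are hypothetical, and
the policy database is reduced to a single "WORLD may talk" flag per
name:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Simplified policy entry: may WORLD talk to this well-known name? */
struct ex_talk_rule {
	const char *name;
	bool world_can_talk;
};

/* Look up a single name in the simplified policy database. */
static bool ex_name_talk_allowed(const struct ex_talk_rule *db, size_t n_db,
				 const char *name)
{
	for (size_t i = 0; i < n_db; i++)
		if (strcmp(db[i].name, name) == 0 && db[i].world_can_talk)
			return true;
	return false;
}

/*
 * The check is made against the connection, not against the name the
 * message was addressed to: TALK is allowed as soon as ANY name owned
 * by the destination connection allows it.
 */
static bool ex_conn_talk_allowed(const struct ex_talk_rule *db, size_t n_db,
				 const char *const *owned, size_t n_owned)
{
	assert(db != NULL || n_db == 0);
	assert(owned != NULL || n_owned == 0);

	for (size_t i = 0; i < n_owned; i++)
		if (ex_name_talk_allowed(db, n_db, owned[i]))
			return true;
	return false;
}
```

With the names from the documentation, a connection owning both
'org.foo.bar' and 'org.blah.baz' is reachable by everyone once
'org.blah.baz' is open to WORLD, while a connection owning only
'org.foo.bar' is not.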

>>
>>>> +11.5 Implicit policies
>>>> +----------------------
>>>> +
>>>> +Depending on the type of the endpoint, a set of implicit rules might be
>>>> +enforced. On default endpoints, the following set is enforced:
>>>> +
>>>
>>> How do these rules interact with installed policy?
>>
>> As said before, all policy entries _grant_ access. We look through all
>> entries until we find one that grants access.
>>
>>>> +  * Privileged connections always override any installed policy. Those
>>>> +    connections could easily install their own policies, so there is no
>>>> +    reason to enforce installed policies.
>>>> +  * Connections can always talk to connections of the same user. This
>>>> +    includes broadcast messages.
>>>
>>> Why?
>>
>> All limits on buses are enforced on a user<->user basis. We don't want
>> to provide policies that are more fine-grained than our accounting.
>
> This seems completely at odds with all the fine-grained metadata
> stuff.  Also, anything that relies on this may get very confused when
> the LSM hooks go in, because I'm reasonably sure that the intent is
> for them to *not* follow this principle.

User-based accounting has always been the default, right? We are open
to extend the API to support any other accounting scheme (LSM,
cgroup-based, ...). But like bus-name-policies, I think it's fine to
keep this as future extension. If you think the current design
precludes LSM-based accounting, lemme know and we can fix it. But we
have talked to LSM people before (and there have been patches on
LKML), and they seemed fine with it.

>>
>>> If anyone ever strengthens the concept of identity to include
>>> things other than users (hmm -- there are already groups), this could
>>> be very limiting.
>>
>> If user-based accounting is not suitable, you can create custom
>> endpoints. Future extensions to that are always welcome. So far, the
>> default user-based accounting was enough. And I think it's suitable as
>> default.
>>
>>>> +  * Connections that own names might send broadcast messages to other
>>>> +    connections that belong to a different user, but only if that
>>>> +    destination connection does not own any name.
>>>> +
>
> (Also, what does "might" mean here?)
>
>>>
>>> This is weird.  It is also differently illogical than the "illogical"
>>> thing above.
>>
>> Actually it follows the same model described above. If two connections
>> are running under the same user then broadcasts are allowed, but if
>> they are running under different users *and* if the destination owns a
>> well-known name, then broadcasts are subject to TALK policy checks
>> since that destination may own a restricted well-known name that is
>> not interested in broadcasts. So this implicit policy is just
>> fast-path for the common case where the target is subscribed to a
>> broadcast and does not own any name.
>
> Huh?
>
> Say I have two users, "Sender" and "Receiver", each with a single
> connection.  If Receiver owns no well-known names, then Sender can
> send to it.  If Receiver owns one well-known name, then Sender needs
> to pass a TALK check on that name.  If Receiver owns two well-known
> names, then Sender only needs to pass a TALK check on one of them.
>
> Am I understanding this right?  If I am, then I think this is in the
> category of baroque and inconsistent security rules which everyone
> will screw up and therefore introduce security vulnerabilities.
>
> Can you really not enforce the much simpler rule that, to send to a
> name, you must have permission to send to *that* name?  If legacy
> dbus1 receivers register two names and don't validate everything
> correctly, then only the legacy receivers have problems.

Sorry, I got confused here. That implicit policy is now dropped.

>>
>>>> +
>>>> +13. Metadata
>>>> +===============================================================================
>> [snip]
>>>> +13.1 Known item types
>>>> +---------------------
>>>> +
>>>> +The following attach flags are currently supported.
>>>> +
>>>> +  KDBUS_ATTACH_TIMESTAMP
>>>> +    Attaches an item of type KDBUS_ITEM_TIMESTAMP which contains both the
>>>> +    monotonic and the realtime timestamp, taken when the message was
>>>> +    processed on the kernel side.
>>>> +
>>>> +  KDBUS_ATTACH_CREDS
>>>> +    Attaches an item of type KDBUS_ITEM_CREDS, containing credentials as
>>>> +    described in kdbus_creds: the uid, gid, pid, tid and starttime of the task.
>>>> +
>>>
>>> As mentioned last time, please remove or justify starttime.
>>
>> starttime allows detecting PID overflows. Exposing the process
>> starttime is useful to detect when a PID is getting reused.
>> Unfortunately, we don't have 64bit pids, so we need the pid+time
>> combination to avoid ambiguity.
>
> NAK, I think.
>
> I agree that PID overflow is a real issue and should be addressed
> somehow.  But please address it for real instead of adding Yet Another
> Hack (tm).  In the mean time, leave that hack out, please.
>
> I would *love* to see PIDs have extra high bits at the end, done in a
> way that supports CRIU and that guarantees no reuse unless something
> privileged intentionally mis-programs it.  But starttime isn't that
> mechanism.

The starttime logic sufficiently fixes the issue. If one great day in
the future somebody invents some completely new concept for making
this problem go away, we can look into that, but even then the field
is still valuable for informational purposes. I mean, the kernel
tracks this and exposes this in /proc for a reason...

In the meantime, we don't have any other way of solving this problem,
so we'll leave this in.

>>
>>>> +  KDBUS_ATTACH_AUXGROUPS
>>>> +    Attaches an item of type KDBUS_ITEM_AUXGROUPS, containing a dynamic
>>>> +    number of auxiliary groups the sending task was a member of.
>>>> +
>>>> +  KDBUS_ATTACH_NAMES
>>>> +    Attaches items of type KDBUS_ITEM_OWNED_NAME, one for each name the sending
>>>> +    connection currently owns. The name and flags are stored in kdbus_item.name
>>>> +    for each of them.
>>>> +
>>>
>>> That's interesting.  What's it for?
>>
>> It is a valuable piece of information for receivers to know which bus
>> names a sender has claimed. For instance, we need this information for
>> the D-Bus proxy service, because we have to apply D-Bus1 policy in
>> that case, and we need to get a list of owned names in a race-free
>> manner to check the policy against.
>
> But if you change the rule to the sensible one where you need
> permission to TALK to the name that you're talking to, this goes away,
> right?

This does not work if a message is directed at a unique-name, as
explained above (or, think broadcasts).

>>
>>>> +  KDBUS_ATTACH_TID_COMM
>>>> +    Attaches an item of type KDBUS_ITEM_TID_COMM, transporting the sending
>>>> +    task's 'comm', for the tid.  The string is stored in kdbus_item.str.
>>>> +
>>>> +  KDBUS_ATTACH_PID_COMM
>>>> +    Attaches an item of type KDBUS_ITEM_PID_COMM, transporting the sending
>>>> +    task's 'comm', for the pid.  The string is stored in kdbus_item.str.
>>>> +
>>>> +  KDBUS_ATTACH_EXE
>>>> +    Attaches an item of type KDBUS_ITEM_EXE, containing the path to the
>>>> +    executable of the sending task, stored in kdbus_item.str.
>>>> +
>>>> +  KDBUS_ATTACH_CMDLINE
>>>> +    Attaches an item of type KDBUS_ITEM_CMDLINE, containing the command line
>>>> +    arguments of the sending task, as an array of strings, stored in
>>>> +    kdbus_item.str.
>>>
>>> Please remove these four items.  They are genuinely useless.  Anything
>>> that uses them for anything is either buggy or should have asked the
>>> sender to put the value in the payload (and immediately wondered why
>>> it was doing that).
>>
>> We use them for logging, debugging and monitoring. With our wireshark
>> extension it's pretty useful to know the comm-name of a process when
>> monitoring a bus. As we explained last time, this is not about
>> security. We're aware that a process can modify them. We use them only
>> as additional meta-data for logging and debugging.
>
> Use the PID.  Really.  Your wireshark extension can look this crap up
> in /proc and, if it fails due to a race, big frickin' deal.

I see no reason for leaving it up to the client to do this extra work
if it can as well be attached by the kernel, really.

We use the PID on dbus1 systems for cases like this. But it's actually
too racy to be useful. For example, in systemd we ship a tiny binary
that is used as cgroup agent, which just pushes a message about the
fact it just got called for a cgroup onto the bus and then exits.
Since it only runs for a very very short time, any peer which then
tries to read the metadata off it is pretty likely to fail. And there
are quite a number of processes like that, that just do one thing and
die, especially in the early boot process. For none of them we
currently can generate useful log metadata, because all we have is a
PID that we have barely any useful information about.
This is a real race we get lots of bug-reports for. With kdbus we want
to fix this, by optionally attaching this useful data to the message,
so that the receiver can get the information when it wants to.

>>
>> If we put those items into the payload, we have to transmit this data
>> even if the destination process is not interested in this.
>> Furthermore, each caller has to run multiple syscalls on each message
>> to retrieve those values.
>>
>> We use these items heavily for filtering and debugging, regardless of
>> the payload protocol that is transmitted on the bus.
>>
>> To give another specific use-case here: dbus supports bus activation,
>> where a message sent to a non-running service causes it to be spawned
>> implicitly without losing the message. Now, with such a scheme it is
>> incredibly useful to be able to log which client caused a service to
>> be triggered, hence we want to know the cmdline/exe/comm of the
>> client. Not knowing this is a major pita when trying to trace the boot
>> process and figuring out why a specific service got activated.
>
> Again, use the PID for tracing, please.

No. It's racy. Processes can die too quickly. And this is not a race
that would be about security, but it's a real-life race that is just
awful to run into when you try to trace what's going on on your
system.

> At the very least, make it impossible to specify these fields in the
> "must be received" set and rename them to something like
> KDBUS_INSTALL_UNRELIABLE_CMDLINE, etc, because they're unreliable.
>
> Finally, this stuff should only be readable by privileged users.  And
> using the PID accomplishes that.
>
>>
>> Also note that since v2 of the patch there's actually a per-sender
>> mask for meta-data like this, hence a peer which doesn't want to pass
>> its exec/cmdline/comm along can do that. Of course, this will
>> seriously hamper debuggability and transparency...
>
> Transparency is a terrible thing here.
>
> How many users put passwords into things on the command line?  Yes,
> it's a bad idea (for reasons that are entirely stupid), but now those
> passwords get *logged*.
>
> If this is in the kernel, and someone complains that sensitive data is
> showing up on ten different logs on their system, they'll *correctly*
> blame the kernel.  If you at least use the PID and restrict it to the
> logging code, then at least the bug report will go to the logging
> daemon, which will be *correctly* accused of doing something daft, and
> it can be fixed.

Hmm? Not following here. This information is visible via /proc too. If
you hide it from /proc via the hide_pid logic, then it is also gone
from the kdbus meta-data. Also, note again that clients that don't
want this information to be passed to services can declare that now
with their sender creds mask, introduced with v2.
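
Put differently, an item only travels if both sides agree. A minimal
sketch of that rule, assuming both the sender creds mask and the
receiver's opt-in are plain bitmasks. The EX_ATTACH_* bit values are
made up for the example and are not the real KDBUS_ATTACH_* values:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit assignments, not the values from the kdbus header. */
#define EX_ATTACH_TIMESTAMP	(1ULL << 0)
#define EX_ATTACH_CREDS		(1ULL << 1)
#define EX_ATTACH_PID_COMM	(1ULL << 2)
#define EX_ATTACH_CMDLINE	(1ULL << 3)

/*
 * An item is attached only if the sender allows it in its creds mask
 * AND the receiver asked for it. Either side can veto on its own.
 */
static bool ex_item_is_attached(uint64_t sender_mask, uint64_t receiver_mask,
				uint64_t item)
{
	/* 'item' must be exactly one flag bit. */
	assert(item != 0 && (item & (item - 1)) == 0);

	return (sender_mask & receiver_mask & item) != 0;
}
```

A sender that clears the cmdline bit in its mask never has its command
line attached, no matter what the receiver requested.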

>>
>>>> +
>>>> +  KDBUS_ATTACH_CGROUP
>>>> +    Attaches an item of type KDBUS_ITEM_CGROUP with the task's cgroup path.
>>>> +
>>>> +  KDBUS_ATTACH_CAPS
>>>> +    Attaches an item of type KDBUS_ITEM_CAPS, carrying sets of capabilities
>>>> +    that should be accessed via kdbus_item.caps.caps. Also, userspace should
>>>> +    be written in a way that it takes kdbus_item.caps.last_cap into account,
>>>> +    and derive the number of sets and rows from the item size and the reported
>>>> +    number of valid capability bits.
>>>> +
>>>
>>> Please remove this, too, or justify its use.
>>
>> cgroup information tells us which service is doing a bus request. This
>> is useful for a variety of things. For example, the bus activation
>> logging item above benefits from it. In general, if some message shall
>> be logged about any client it is useful to know its service name.
>>
>> Capabilities are useful to authenticate specific method calls. For
>> example, when a client asks systemd to reboot, without this concept we
>> can only check for UID==0 to decide whether to allow this. Breaking
>> this down to capabilities in a race-free way has the benefit of
>> allowing systemd to bind this to CAP_SYS_BOOT instead. There is no
>> reason to deny a process with CAP_SYS_BOOT to reboot via bus-APIs, as
>> they could just enforce it via syscalls, anyway.
>
> With all due respect, BS.
>
> I admit that there is probably no reason to deny systemd-based reboot
> to a CAP_SYS_BOOT-capable process, but there is absolutely no reason
> to give processes that are supposed to reboot using systemd the
> CAP_SYS_BOOT capability.

No, and this is not how this works. Note that for unprivileged clients
systemd checks PolicyKit in order to identify whether to allow certain
privileged operations. However, PolicyKit requires bus chatter, is slow
and quite complex, hence the code shortcuts things if it knows that
the client is privileged anyway, and doesn't even bother with
PolicyKit at all. This is where the caps come into play, since this
shortcutting really should *NOT* be done for every single client, but
only for those with the rights to do the operation anyway.

Hence, this is really not about overloading capabilities with new
meanings. Instead it is about shortcutting policykit for privileged
clients. And this shortcutting should be as restricted as possible.
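
The shortcut can be sketched as follows. This is an illustration, not
the systemd code: it assumes the effective capability set arrives as an
array of 32-bit words, and the ex_* names are hypothetical.
EX_CAP_SYS_BOOT uses the same number as CAP_SYS_BOOT in
<linux/capability.h>:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Same number as CAP_SYS_BOOT in <linux/capability.h>. */
#define EX_CAP_SYS_BOOT 22

/* Test one capability bit in a set stored as 32-bit words. */
static bool ex_cap_is_set(const uint32_t *effective, size_t n_words,
			  unsigned int cap)
{
	size_t word = cap / 32;

	assert(effective != NULL || n_words == 0);

	/* A bit beyond the transmitted set counts as "not set". */
	if (word >= n_words)
		return false;

	return ((effective[word] >> (cap % 32)) & 1u) != 0;
}

/*
 * Reboot request: skip the PolicyKit round-trip only if the sender's
 * effective set contains CAP_SYS_BOOT. If the caps item is missing
 * (n_words == 0), fall back to the full PolicyKit check.
 */
static bool ex_reboot_needs_policykit(const uint32_t *effective,
				      size_t n_words)
{
	return !ex_cap_is_set(effective, n_words, EX_CAP_SYS_BOOT);
}
```

A missing or empty caps item therefore only ever leads to the slower,
stricter path, never to more access.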

> In any event, I suspect you'll have a hard time justifying this for
> anything other than CAP_SYS_BOOT.  Just because CAP_SYS_ADMIN users
> can probably do whatever they want doesn't mean that systemd should
> make that a built-in policy.

Well, the other option for these APIs is to use the euid, which is
hardly any better.

systemd-timedated shortcuts policykit if the client has CAP_SYS_TIME
and tries to change the system clock. Similarly, if a process asks
logind to kill a session, we bypass pk if the client has CAP_KILL.

> Also, wtf is the bounding set and such for?  At the very least this
> should only be the effective set.

Yes, the code that makes use of this for shortcutting pk uses the
effective set only. The other ones we allow sending across for
enhancing logging of security operations.

In general: all creds we collect at the very least are incredibly
useful for generating log records in a race-free fashion. As pointed
out above the "race-free" bit alone solves real-world issues that are
highly annoying if we don't have it.

>>
>> We think it's a useful and reliable authentication method. Why should
>> we remove it?
>
> Because the implementation is buggy and therefore it's insecure?
> Remember that caps are namespaced in an interesting way.

Yes, we are well aware of the fact that we currently have no good way
to translate a full set of capabilities from one user-ns to another.
Hence, the only sane thing to do in such situations is to drop the
entire item, which is what we do. Once we have a reliable way of
translating things, we can add that to our code. Note that a set
capability flag will only gain you more access level, so if caps are
missing from a message, a user might only have *less* privileges, not
more.

>>
>> Anyway, these items are just optional. The sender can refuse to
>> reveal them, and the item is only transmitted if the receiver opted in
>> for it, too. So there's no need to drop any item type from the
>> protocol.
>
> No.
>
> Because if receivers opt in to most of these, *they're doing it
> wrong*, and the kernel shouldn't be in the business of helping them.

No, they are not doing it "wrong". The services would do things wrong
if they'd make security decisions on bits that cannot be acquired in a
race-free way. And services do things in a dirty way if they
generate logging data in a racy way (like you suggest), by
reading things from /proc.

Thanks
David


* Re: kdbus: add documentation
  2014-11-26 11:55           ` David Herrmann
  (?)
@ 2014-11-26 15:30           ` Andy Lutomirski
  2014-11-26 15:39               ` Andy Lutomirski
  -1 siblings, 1 reply; 73+ messages in thread
From: Andy Lutomirski @ 2014-11-26 15:30 UTC (permalink / raw)
  To: David Herrmann
  Cc: Greg Kroah-Hartman, Arnd Bergmann, Eric W. Biederman,
	One Thousand Gnomes, Tom Gundersen, Jiri Kosina, Linux API,
	linux-kernel, Daniel Mack, Djalal Harouni

On Wed, Nov 26, 2014 at 3:55 AM, David Herrmann <dh.herrmann@gmail.com> wrote:
> Hi
>
> On Mon, Nov 24, 2014 at 9:57 PM, Andy Lutomirski <luto@amacapital.net> wrote:
>> On Mon, Nov 24, 2014 at 12:16 PM, David Herrmann <dh.herrmann@gmail.com> wrote:
>>> [snip]
>>>>> +6.5 Getting information about a connection's bus creator
>>>>> +--------------------------------------------------------
>>>>> +
>>>>> +The KDBUS_CMD_BUS_CREATOR_INFO ioctl takes the same struct as
>>>>> +KDBUS_CMD_CONN_INFO but is used to retrieve information about the creator of
>>>>> +the bus the connection is attached to. The metadata returned by this call is
>>>>> +collected during the creation of the bus and is never altered afterwards, so
>>>>> +it provides pristine information on the task that created the bus, at the
>>>>> +moment when it did so.
>>>>
>>>> What's this for?  I understand the need for the creator of busses to
>>>> be authenticated, but doing it like this means that anyone who will
>>>> *fail* authentication can DoS the authentic creator.
>>>
>>> This returns information on a bus owner, to determine whether a
>>> connection is connected to a system, user or session bus. Note that
>>> the bus-creator itself is not a valid peer on the bus, so you cannot
>>> send messages to them. Which kind of DoS do you have in mind?
>>
>> I assume that the logic is something like:
>>
>> connect to bus
>> request bus metadata
>> if (bus metadata matches expectations) {
>>   great, trust the bus!
>> } else {
>>   oh crap!
>> }
>
> Uh, no, this is really not the logic that should be assumed. It's more
> for code where you want to simply pass a bus fd, and the code knows
> nothing about it. Now, the code can derive some information from the
> bus fd, like for example who owns it. Then, depending on some of the
> creds returned it can determine whether to read configuration file set
> A or B and so on. This is particularly useful for all kinds of
> unprivileged bus services that end up running on any kind of bus and
> need to be able to figure out what they are actually operating on.

The logic you've described is more or less the same thing that I
described, with a process transition thrown in.  It's:

connect to bus
send kdbus fd to another process, which does the rest:
request bus metadata
if (bus metadata matches expectations) {
  great, trust the bus!
} else {
  oh crap! (or malfunction or whatever)
}

ISTM you should have an API to get the *name* of the bus and check that.

Except that, if the service you pass that fd to is privileged, then
you're completely screwed, because none of this checks that the
*domain* is correct.

[snip]

>
>>>>> +
>>>>> +Also, if the connection allowed for file descriptors to be passed
>>>>> +(KDBUS_HELLO_ACCEPT_FD), and if the message contained any, they will be
>>>>> +installed into the receiving process after the KDBUS_CMD_MSG_RECV ioctl
>>>>> +returns. The receiving task is obliged to close all of them appropriately.
>>>>
>>>> This makes it sound like fds are installed at receive time.  What
>>>> prevents resource exhaustion due to having excessive numbers of fds in
>>>> transit (that are presumably not accounted to anyone)?
>>>
>>> We have a per-user message accounting for undelivered messages, as
>>> well as a maximum number of pending messages per connection on the
>>> receiving end. These limits are accounted on a "user<->user" basis, so
>>> the limit of a user A will not affect two other users (B and C)
>>> talking.
>>
>> But you can shove tons of fds in a message, and you can have lots of
>> messages, and some of the fds can be fds of unix sockets that have fds
>> queued up in them, and one of those fds could be the fd to the kdbus
>> connection that sent the fd...
>
> You cannot send kdbus-fds or unix-fds over kdbus, right now. We have
> people working on the AF_UNIX gc to make it more generic and include
> external types. Until then, we simply prevent recursive fd passing.

OK, fair enough.

>
>> This is not advice as to what to do about it, but I think that it will
>> be a problem at some point.
>>
>>
>>>>> +11. Policy
>>>>> +===============================================================================
>>>>> +
>>>>> +A policy database restricts the possibilities of connections to own, see and
>>>>> +talk to well-known names. It can be associated with a bus (through a policy
>>>>> +holder connection) or a custom endpoint.
>>>>
>>>> ISTM metadata items on bus names should be replaced with policy that
>>>> applies to the domain as a whole and governs bus creation.
>>>
>>> No, well-known names are bound to buses, so a bus is really the right
>>> place to hold policy about which process is allowed to claim them.
>>> Every user is allowed to create a bus of its own, there's no policy
>>> for that, and there shouldn't be.
>>>
>>> It has nothing to do with metadata items.
>>
>> But it does -- the creator of the bus binds metadata to that bus at
>> creation time.
>>
>> I think that a better solution would be to have a global policy that
>> says, for example, "to create the bus called 'system', the creator
>> must have selinux label xyz" or "to create a user bus called
>> uid-1000-privileged-ui-bus the creator must have some cgroup" or
>> whatever.
>>
>> Although maybe a better solution would leave this in the kernel but
>> allow any cgroup to create a bus with a name that indicates the
>> creating cgroup.  Then I could have my desktop shell create a
>> "/cgroup/path/to/desktop" for per-user privileged things.
>
> We enforce the UID as first entity of the bus name. Again, this is our
> default policy because we rely on user-based access control. If we
> want more fine-grained access-control, we can introduce other policies
> at any time. For instance, we could enforce "cg-<cgroup>-<busname>"
> later on, where the kernel requires the caller to prefix the bus with
> "cg-<cgroup>-", where <cgroup> is the cgroup-path encoded in some way.
>
> We provide one policy as default, and we have a use-case for it.
> Further policies are always welcome as extensions later on. I don't
> see why we should provide all those right from the beginning without
> any users right now.

OK

>
>>>
>>>>> +A set of policy rules is described by a name and multiple access rules, defined
>>>>> +by the following struct.
>>>>> +
>>>>> +struct kdbus_policy_access {
>>>>> +  __u64 type;  /* USER, GROUP, WORLD */
>>>>> +    One of the following.
>>>>> +
>>>>> +    KDBUS_POLICY_ACCESS_USER
>>>>> +      Grant access to a user with the uid stored in the 'id' field.
>>>>> +
>>>>> +    KDBUS_POLICY_ACCESS_GROUP
>>>>> +      Grant access to a user with the gid stored in the 'id' field.
>>>>> +
>>>>> +    KDBUS_POLICY_ACCESS_WORLD
>>>>> +      Grant access to everyone. The 'id' field is ignored.
>>>>> +
>>>>> +  __u64 access;        /* OWN, TALK, SEE */
>>>>> +    The access to grant.
>>>>> +
>>>>> +    KDBUS_POLICY_SEE
>>>>> +      Allow the name to be seen.
>>>>> +
>>>>> +    KDBUS_POLICY_TALK
>>>>> +      Allow the name to be talked to.
>>>>> +
>>>>> +    KDBUS_POLICY_OWN
>>>>> +      Allow the name to be owned.
>>>>> +
>>>>> +  __u64 id;
>>>>> +    For KDBUS_POLICY_ACCESS_USER, stores the uid.
>>>>> +    For KDBUS_POLICY_ACCESS_GROUP, stores the gid.
>>>>> +};
>>>>
>>>>
>>>> What happens if there are multiple matches?
>>>
>>> We only have _granting_ policy entries. We search through the
>>> policy-db until we find an entry that grants access. We do _not_ stop
>>> on the first item that matches.
>>
>> Yay!  Can you document that more clearly?
>
> Sure!
>
>>>
>>>>
>>>>> +
>>>>> +11.4 TALK access and multiple well-known names per connection
>>>>> +-------------------------------------------------------------
>>>>> +
>>>>> +Note that TALK access is checked against all names of a connection.
>>>>> +For example, if a connection owns both 'org.foo.bar' and 'org.blah.baz', and
>>>>> +the policy database allows 'org.blah.baz' to be talked to by WORLD, then this
>>>>> +permission is also granted to 'org.foo.bar'. That might sound illogical, but
>>>>> +after all, we allow messages to be directed to either the name or a well-known
>>>>> +name, and policy is applied to the connection, not the name. In other words,
>>>>> +the effective TALK policy for a connection is the most permissive of all names
>>>>> +the connection owns.
>>>>
>>>> This does seem illogical.  Does the recipient at least know which
>>>> well-known name was addressed?
>>>
>>> If the sender addressed it to a well-known name, yes. If the sender
>>> addressed the message to a unique-ID, there will be no such name, of
>>> course. Still, the policy applies to such transactions either way
>>> (standard D-Bus behavior).
>>>
>>> Note however that dbus1 will not pass along the destination well-known
>>> name, hence most userspace libraries will ignore this information too,
>>> even if they run on kdbus which might pass this information around.
>>> The right way for services that carry multiple service names to
>>> discern which actual service is being talked to is having separate
>>> object paths for the different functionality to hide between the
>>> services.
>>
>> It seems unfortunate to keep around this really weird behavior for the
>> benefit of legacy applications.  Could there perhaps be a flag that a
>> connection can set to indicate that it understands per-destination
>> access control and therefore wants stricter policy enforcement?
>
> We actually think that it's a good idea not to use the destination
> name for doing different things, to make things more transparent. For
> example, we have tools that can explore the bus, introspect things
> (like d-bus), and they should show the same objects regardless of
> which name you access a service. It's more transparent then, and
> really just reduces names to some labels that make addressing things
> easier, but that are actually completely unnecessary for actual method
> invocations.
>
> This behaviour is also relied on by a number of current bindings. For
> example GLib's implementation usually caches the unique name of a
> peer, and uses that for talking to remote objects (rather than always
> using the well-known name), in order to get an error back when a name
> becomes unavailable (maybe because a service died) or is moved to a
> different peer. If daemons would always take the destination name into
> account this kind of logic could never work.
>
> It does take some time to get used to the fact that names are
> exclusively used for message routing and policies, not as
> target-entity of actual method calls. But the current dbus1 behaviour
> makes a ton of sense, and is really something we want to keep. It
> improves how clients can do life-cycle tracking of remote objects.
>
> Note that D-Bus modeled unique-names and well-known-names after IP
> addresses and DNS names. It's a very similar model, and, like DNS
> names, well-known names have no effect on the routing of messages.

DNS names also have no effect on your ability to send to an IP.

In any event, I think you can keep doing everything that you're
currently doing if you check policy on the actual sent-to name.  Users
will just apply the *same* policy to all of their names (or just use
one name), and everything will work.  And, for bonus points, the
security rules will make sense.
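
As a rough illustration of the rule argued for here (policy is checked
against the name the message was actually sent to, and nothing else), a
minimal sketch might look like the following. The struct and function
names are hypothetical and are not part of the kdbus patches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical policy entry: grants `uid` the right to TALK to `name`. */
struct talk_grant {
	const char *name;
	unsigned int uid;
};

/*
 * Return true if `sender_uid` may send to `dst_name`. Only the name that
 * was actually addressed is consulted; other names owned by the receiving
 * connection play no part in the decision.
 */
static bool may_talk(const struct talk_grant *grants, size_t n,
		     unsigned int sender_uid, const char *dst_name)
{
	assert(dst_name != NULL);

	/* Entries only ever grant access, so the first match wins. */
	for (size_t i = 0; i < n; i++) {
		if (grants[i].uid == sender_uid &&
		    strcmp(grants[i].name, dst_name) == 0)
			return true;
	}
	return false;
}
```

Under this sketch a receiver that owns several names and wants uniform
access simply installs the same grants for each of them.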

>
>>>
>>>>> +11.5 Implicit policies
>>>>> +----------------------
>>>>> +
>>>>> +Depending on the type of the endpoint, a set of implicit rules might be
>>>>> +enforced. On default endpoints, the following set is enforced:
>>>>> +
>>>>
>>>> How do these rules interact with installed policy?
>>>
>>> As said before, all policy entries _grant_ access. We look through all
>>> entries until we find one that grants access.
>>>
>>>>> +  * Privileged connections always override any installed policy. Those
>>>>> +    connections could easily install their own policies, so there is no
>>>>> +    reason to enforce installed policies.
>>>>> +  * Connections can always talk to connections of the same user. This
>>>>> +    includes broadcast messages.
>>>>
>>>> Why?
>>>
>>> All limits on buses are enforced on a user<->user basis. We don't want
>>> to provide policies that are more fine-grained than our accounting.
>>
>> This seems completely at odds with all the fine-grained metadata
>> stuff.  Also, anything that relies on this may get very confused when
>> the LSM hooks go in, because I'm reasonably sure that the intent is
>> for them to *not* follow this principle.
>
> User-based accounting has always been the default, right? We are open
> to extend the API to support any other accounting scheme (LSM,
> cgroup-based, ...). But like bus-name-policies, I think it's fine to
> keep this as future extension. If you think the current design
> precludes LSM-based accounting, lemme know and we can fix it. But we
> have talked to LSM people before (and there have been patches on
> LKML), and they seemed fine with it.

OK, as long as restricted endpoints don't work like this.

>
>>>
>>>> If anyone ever strengthens the concept of identity to include
>>>> things other than users (hmm -- there are already groups), this could
>>>> be very limiting.
>>>
>>> If user-based accounting is not suitable, you can create custom
>>> endpoints. Future extensions to that are always welcome. So far, the
>>> default user-based accounting was enough. And I think it's suitable as
>>> default.
>>>
>>>>> +  * Connections that own names might send broadcast messages to other
>>>>> +    connections that belong to a different user, but only if that
>>>>> +    destination connection does not own any name.
>>>>> +
>>
>> (Also, what does "might" mean here?)
>>
>>>>
>>>> This is weird.  It is also differently illogical than the "illogical"
>>>> thing above.
>>>
>>> Actually it follows the same model described above. If two connections
>>> are running under the same user then broadcasts are allowed, but if
>>> they are running under different users *and* if the destination owns a
>>> well-known name, then broadcasts are subject to TALK policy checks
>>> since that destination may own a restricted well-known name that is
>>> not interested in broadcasts. So this implicit policy is just
>>> fast-path for the common case where the target is subscribed to a
>>> broadcast and does not own any name.
>>
>> Huh?
>>
>> Say I have two users, "Sender" and "Receiver", each with a single
>> connection.  If Receiver owns no well-known names, then Sender can
>> send to it.  If Receiver owns one well-known name, then Sender needs
>> to pass a TALK check on that name.  If Receiver owns two well-known
>> names, then Sender only needs to pass a TALK check on one of them.
>>
>> Am I understanding this right?  If I am, then I think this is in the
>> category of baroque and inconsistent security rules which everyone
>> will screw up and therefore introduce security vulnerabilities.
>>
>> Can you really not enforce the much simpler rule that, to send to a
>> name, you must have permission to send to *that* name?  If legacy
>> dbus1 receivers register two names and don't validate everything
>> correctly, then only the legacy receivers have problems.
>
> Sorry, I got confused here. That implicit policy is now dropped.
>

Sounds good.

>>>
>>>>> +
>>>>> +13. Metadata
>>>>> +===============================================================================
>>> [snip]
>>>>> +13.1 Known item types
>>>>> +---------------------
>>>>> +
>>>>> +The following attach flags are currently supported.
>>>>> +
>>>>> +  KDBUS_ATTACH_TIMESTAMP
>>>>> +    Attaches an item of type KDBUS_ITEM_TIMESTAMP which contains both the
>>>>> +    monotonic and the realtime timestamp, taken when the message was
>>>>> +    processed on the kernel side.
>>>>> +
>>>>> +  KDBUS_ATTACH_CREDS
>>>>> +    Attaches an item of type KDBUS_ITEM_CREDS, containing credentials as
>>>>> +    described in kdbus_creds: the uid, gid, pid, tid and starttime of the task.
>>>>> +
>>>>
>>>> As mentioned last time, please remove or justify starttime.
>>>
>>> starttime allows detecting PID overflows. Exposing the process
>>> starttime is useful to detect when a PID is getting reused.
>>> Unfortunately, we don't have 64bit pids, so we need the pid+time
>>> combination to avoid ambiguity.
>>
>> NAK, I think.
>>
>> I agree that PID overflow is a real issue and should be addressed
>> somehow.  But please address it for real instead of adding Yet Another
>> Hack (tm).  In the mean time, leave that hack out, please.
>>
>> I would *love* to see PIDs have extra high bits at the end, done in a
>> way that supports CRIU and that guarantees no reuse unless something
>> privileged intentionally mis-programs it.  But starttime isn't that
>> mechanism.
>
> The starttime logic sufficiently fixes the issue. If one great day in
> the future somebody invents some completely new concept for making
> this problem go away, we can look into that, but even then the field
> is still valuable for informational purposes. I mean, the kernel
> tracks this and exposes this in /proc for a reason...
>
> In the meantime, we don't have any other way of solving this problem,
> so we'll leave this in.
>

But this set of patches doesn't "leave this in".  This set of patches
*adds it*.  Designing a poor new API is not excused by the fact that
no one has developed the better one yet.

>>>
>>>>> +  KDBUS_ATTACH_AUXGROUPS
>>>>> +    Attaches an item of type KDBUS_ITEM_AUXGROUPS, containing a dynamic
>>>>> +    number of auxiliary groups the sending task was a member of.
>>>>> +
>>>>> +  KDBUS_ATTACH_NAMES
>>>>> +    Attaches items of type KDBUS_ITEM_OWNED_NAME, one for each name the sending
>>>>> +    connection currently owns. The name and flags are stored in kdbus_item.name
>>>>> +    for each of them.
>>>>> +
>>>>
>>>> That's interesting.  What's it for?
>>>
>>> It is a valuable piece of information for receivers to know which bus
>>> names a sender has claimed. For instance, we need this information for
>>> the D-Bus proxy service, because we have to apply D-Bus1 policy in
>>> that case, and we need to get a list of owned names in a race-free
>>> manner to check the policy against.
>>
>> But if you change the rule to the sensible one where you need
>> permission to TALK to the name that you're talking to, this goes away,
>> right?
>
> This does not work if a message is directed at a unique-name, as
> explained above (or, think broadcasts).

Is the issue that, if I send a proxied request to unique name foo, and
unique name foo is held by someone who's claimed /a/b/c, then the
proxy service needs to check for permission to talk to /a/b/c?

To me, this still feels like this is only an issue because of this
weird concept of how names and policies interact.  I really think that
you should put some extra effort into making the policy matches
self-contained in the sense of only matching on addressing information
that's actually in the message.

For example, HELLO could indicate one of "anyone can talk to me" or
"no one can talk to me without using a well-known name", then the
problem is solved.  Protocols that switch to using the unique name
(e.g. glib) can send *both* the unique and well-known name and the
access control will still work.
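
A minimal sketch of that suggestion, assuming a two-valued mode chosen
at HELLO time. The enum, struct, and function names are invented for
illustration and do not exist in the kdbus patches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical receiver preference, declared once at HELLO time. */
enum hello_mode {
	HELLO_ANYONE_CAN_TALK,
	HELLO_REQUIRE_WELL_KNOWN_NAME,
};

/* Addressing information carried in the message itself. */
struct msg_addr {
	uint64_t dst_id;	/* unique name (connection id) */
	const char *dst_name;	/* well-known name, or NULL if not supplied */
};

/*
 * Decide whether the addressing alone is acceptable to the receiver.
 * A sender that caches the unique name can still set dst_name, so the
 * policy check has a name to match against.
 */
static bool addressing_ok(enum hello_mode mode, const struct msg_addr *m)
{
	assert(m != NULL);
	if (mode == HELLO_ANYONE_CAN_TALK)
		return true;
	return m->dst_name != NULL;
}
```

The TALK policy lookup on dst_name would then run as a separate step,
using only fields that are present in the message.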

>
>>>
>>>>> +  KDBUS_ATTACH_TID_COMM
>>>>> +    Attaches an item of type KDBUS_ITEM_TID_COMM, transporting the sending
>>>>> +    task's 'comm', for the tid.  The string is stored in kdbus_item.str.
>>>>> +
>>>>> +  KDBUS_ATTACH_PID_COMM
>>>>> +    Attaches an item of type KDBUS_ITEM_PID_COMM, transporting the sending
>>>>> +    task's 'comm', for the pid.  The string is stored in kdbus_item.str.
>>>>> +
>>>>> +  KDBUS_ATTACH_EXE
>>>>> +    Attaches an item of type KDBUS_ITEM_EXE, containing the path to the
>>>>> +    executable of the sending task, stored in kdbus_item.str.
>>>>> +
>>>>> +  KDBUS_ATTACH_CMDLINE
>>>>> +    Attaches an item of type KDBUS_ITEM_CMDLINE, containing the command line
>>>>> +    arguments of the sending task, as an array of strings, stored in
>>>>> +    kdbus_item.str.
>>>>
>>>> Please remove these four items.  They are genuinely useless.  Anything
>>>> that uses them for anything is either buggy or should have asked the
>>>> sender to put the value in the payload (and immediately wondered why
>>>> it was doing that).
>>>
>>> We use them for logging, debugging and monitoring. With our wireshark
>>> extension it's pretty useful to know the comm-name of a process when
>>> monitoring a bus. As we explained last time, this is not about
>>> security. We're aware that a process can modify them. We use them only
>>> as additional meta-data for logging and debugging.
>>
>> Use the PID.  Really.  Your wireshark extension can look this crap up
>> in /proc and, if it fails due to a race, big frickin' deal.
>
> I see no reason for leaving it up to the client to do this extra work
> if it can as well be attached by the kernel, really.
>
> We use the PID on dbus1 systems for cases like this. But it's actually
> too racy to be useful. For example, in systemd we ship a tiny binary
> that is used as cgroup agent, which just pushes a message about the
> fact it just got called for a cgroup onto the bus and then exits.
> Since it only runs for a very very short time, any peer which then
> tries to read the metadata off it is pretty likely to fail. And there
> are quite a number of processes like that, that just do one thing and
> die, especially in the early boot process. For none of them we
> currently can generate useful log metadata, because all we have is a
> PID that we have barely any useful information about.
> This is a real race we get lots of bug-reports for. With kdbus we want
> to fix this, by optionally attaching this useful data to the message,
> so that the receiver can get the information when it wants to.
>

I really don't think that this kind of mostly indiscriminate
broadcasting of almost-invariably unnecessary and potentially
dangerous information is justified by the limited logging and
debugging benefit.  I think you should find a better design that
solves your problem.

[snip]

>> Transparency is a terrible thing here.
>>
>> How many users put passwords into things on the command line?  Yes,
>> it's a bad idea (for reasons that are entirely stupid), but now those
>> passwords get *logged*.
>>
>> If this is in the kernel, and someone complains that sensitive data is
>> showing up on ten different logs on their system, they'll *correctly*
>> blame the kernel.  If you at least use the PID and restrict it to the
>> logging code, then at least the bug report will go to the logging
>> daemon, which will be *correctly* accused of doing something daft, and
>> it can be fixed.
>
> Hmm? Not following here. This information is visible via /proc too. If
> you hide it from /proc via the hide_pid logic, then it is also gone
> from the kdbus meta-data. Also, note again that clients that don't
> want this information to be passed to services can declare that now
> with their sender creds mask, introduced with v2.
>

We're talking about users, not senders, and users aren't going to
expect that they're sending their command line around the bus.  And
there could also be receivers who are sandboxed and can't access /proc.

My point is that this information is probably okay to expose to
privileged logging daemons, and *maybe* to journald, but it's not okay
to pass anywhere else.

>>>
>>>>> +
>>>>> +  KDBUS_ATTACH_CGROUP
>>>>> +    Attaches an item of type KDBUS_ITEM_CGROUP with the task's cgroup path.
>>>>> +
>>>>> +  KDBUS_ATTACH_CAPS
>>>>> +    Attaches an item of type KDBUS_ITEM_CAPS, carrying sets of capabilities
>>>>> +    that should be accessed via kdbus_item.caps.caps. Also, userspace should
>>>>> +    be written in a way that it takes kdbus_item.caps.last_cap into account,
>>>>> +    and derive the number of sets and rows from the item size and the reported
>>>>> +    number of valid capability bits.
>>>>> +
>>>>
>>>> Please remove this, too, or justify its use.
>>>
>>> cgroup information tells us which service is doing a bus request. This
>>> is useful for a variety of things. For example, the bus activation
>>> logging item above benefits from it. In general, if some message shall
>>> be logged about any client it is useful to know its service name.
>>>
>>> Capabilities are useful to authenticate specific method calls. For
>>> example, when a client asks systemd to reboot, without this concept we
>>> can only check for UID==0 to decide whether to allow this. Breaking
>>> this down to capabilities in a race-free way has the benefit of
>>> allowing systemd to bind this to CAP_SYS_BOOT instead. There is no
>>> reason to deny a process with CAP_SYS_BOOT to reboot via bus-APIs, as
>>> they could just enforce it via syscalls, anyway.
>>
>> With all due respect, BS.
>>
>> I admit that there is probably no reason to deny systemd-based reboot
>> to a CAP_SYS_BOOT-capable process, but there is absolutely no reason
>> to give processes that are supposed to reboot using systemd the
>> CAP_SYS_BOOT capability.
>
> No, and this is not how this works. Note that for unprivileged clients
> systemd checks PolicyKit in order to identify whether to allow certain
> privileged operations. However, PolicyKit requires bus chatter, is slow
> and quite complex, hence the code shortcuts things if it knows that
> the client is privileged anyway, and doesn't even bother with
> PolicyKit at all. This is where the caps come into play, since this
> shortcutting really should *NOT* be done for every single client, but
> only for those with the rights to do the operation anyway.
>
> Hence, this is really not about overloading capabilities with new
> meanings. Instead it is about shortcutting policykit for privileged
> clients. And this shortcutting should be as restricted as possible.

So this mechanism is there to speed up reboot and similar things?

>
>> In any event, I suspect you'll have a hard time justifying this for
>> anything other than CAP_SYS_BOOT.  Just because CAP_SYS_ADMIN users
>> can probably do whatever they want doesn't mean that systemd should
>> make that a built-in policy.
>
> Well, the other option for these APIs is to use the euid, which is
> hardly any better.

It absolutely is better.  You can, in software, change the rules, and
euid works intelligently in namespaces.  Aside from being a bad idea,
caps are just *broken* without knowledge of the userns hierarchy *and*
userns ownership information.

ns_capable is the in-kernel interface, but the return value of
ns_capable can't be calculated from just the cap mask.

Also, have you ever tried to usefully assign caps to euid != 0 users?
It's essentially entirely broken, especially if you try to use execve.
Don't force the use of the crappy fscaps interface to allow non-root
to reboot cleanly without invoking policykit -- you won't be doing
anyone any favors.  Just set your userspace policy the way you want
it.

I've tried to fix this, I've written patches, I've designed
alternative and IMO far superior cap transition rules.  Guess what?
They're not in the kernel because no one wants to touch caps.  But
that doesn't mean that the existing cap system is anything other than
a giant pile of cr*p that happens to sort of work.  Don't make user
code have to deal with the mess more than it already has to.

>
> systemd-timedated shortcuts policykit if the client has CAP_SYS_TIME
> and tries to change the system clock. Similarly, if a process asks
> logind to kill a session, we bypass pk if the client has CAP_KILL.
>
>> Also, wtf is the bounding set and such for?  At the very least this
>> should only be the effective set.
>
> Yes, the code that makes use of this for shortcutting pk uses the
> effective set only. The other ones we allow sending across for
> enhancing logging of security operations.
>
> In general: all creds we collect at the very least are incredibly
> useful for generating log records in a race-free fashion. As pointed
> out above the "race-free" bit alone solves real-world issues that are
> highly annoying if we don't have it.

I find it highly implausible that kdbus logging users really benefit
from knowing the *caps* of other participants in any context in which
their euid isn't enough.
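
For concreteness, the effective-set shortcut described in the quoted
text boils down to a bit test like the one sketched here. The array
layout (32-bit words, bounded by last_cap) is an assumption based on the
KDBUS_ATTACH_CAPS description earlier in this mail, and the helper name
is made up. As noted above, a bare mask says nothing about user
namespaces, so this is not a substitute for ns_capable().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Same numeric value as CAP_SYS_BOOT in <linux/capability.h>. */
#define EX_CAP_SYS_BOOT 22u

/*
 * Test one capability bit in a set stored as 32-bit words. Bits above
 * last_cap, or beyond the end of the array, are treated as not set.
 */
static bool cap_set_has(const uint32_t *set, size_t n_words,
			uint32_t last_cap, uint32_t cap)
{
	assert(set != NULL);
	if (cap > last_cap || cap / 32 >= n_words)
		return false;
	return (set[cap / 32] >> (cap % 32)) & 1u;
}
```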

>
>>>
>>> We think it's a useful and reliable authentication method. Why should
>>> we remove it?
>>
>> Because the implementation is buggy and therefore it's insecure?
>> Remember that caps are namespaced in an interesting way.
>
> Yes, we are well aware of the fact that we currently have no good way
> to translate a full set of capabilities from one user-ns to another.
> Hence, the only sane thing to do in such situations is to drop the
> entire item, which is what we do. Once we have a reliable way of
> translating things, we can add that to our code. Note that a set
> capability flag will only gain you more access level, so if caps are
> missing from a message, a user might only have *less* privileges, not
> more.

Or you could remove it in the absence of a real legitimate use case.

>
>>>
>>> Anyway, these items are just optional. The sender can refuse to
>>> reveal them, and the item is only transmitted if the receiver opted in
>>> for it, too. So there's no need to drop any item type from the
>>> protocol.
>>
>> No.
>>
>> Because if receivers opt in to most of these, *they're doing it
>> wrong*, and the kernel shouldn't be in the business of helping them.
>
> No, they are not doing it "wrong". The services would do things wrong
> if they'd make security decisions on bits that cannot be acquired in a
> race-free way. And services do things in a dirty way if they'd
> generate logging data in a race-ful way (like you suggest), by
> reading things from /proc.

Then find a clean way that's gated on having the right /proc access,
which is not guaranteed to exist on all of your eventual users'
systems, and, if that access doesn't exist because the admin or
sandbox designer has sensibly revoked it, then kdbus shouldn't
override them.

--Andy

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: kdbus: add documentation
@ 2014-11-26 15:39               ` Andy Lutomirski
  0 siblings, 0 replies; 73+ messages in thread
From: Andy Lutomirski @ 2014-11-26 15:39 UTC (permalink / raw)
  To: David Herrmann
  Cc: Greg Kroah-Hartman, Arnd Bergmann, Eric W. Biederman,
	One Thousand Gnomes, Tom Gundersen, Jiri Kosina, Linux API,
	linux-kernel, Daniel Mack, Djalal Harouni

On Wed, Nov 26, 2014 at 7:30 AM, Andy Lutomirski <luto@amacapital.net> wrote:
> Then find a clean way that's gated on having the right /proc access,
> which is not guaranteed to exist on all of your eventual users'
> systems, and, if that access doesn't exist because the admin or
> sandbox designer has sensibly revoked it, then kdbus shouldn't
> override them.

One idea: add a sysctl that defaults to off that enables these
metadata items, and keep it disabled on production systems.  Then you
get your debugging and everyone else gets unsurprising behavior.

--Andy

>
> --Andy



-- 
Andy Lutomirski
AMA Capital Management, LLC

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: kdbus: add documentation
  2014-11-21  5:02 ` kdbus: add documentation Greg Kroah-Hartman
  2014-11-21  8:29   ` Harald Hoyer
  2014-11-21 17:12     ` Andy Lutomirski
@ 2014-11-30  8:56   ` Florian Weimer
  2014-11-30 17:17       ` David Herrmann
  2014-11-30  9:02     ` Florian Weimer
  3 siblings, 1 reply; 73+ messages in thread
From: Florian Weimer @ 2014-11-30  8:56 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: arnd, ebiederm, gnomes, teg, jkosina, luto, linux-api,
	linux-kernel, daniel, dh.herrmann, tixxdz

* Greg Kroah-Hartman:

> +The focus of this document is an overview of the low-level, native kernel D-Bus
> +transport called kdbus. Kdbus exposes its functionality via files in a
> +filesystem called 'kdbusfs'. All communication between processes takes place
> +via ioctls on files exposed through the mount point of a kdbusfs. The default
> +mount point of kdbusfs is /sys/fs/kdbus.

Does this mean the bus does not enforce the correctness of the D-Bus
introspection metadata?  That's really unfortunate.  Classic D-Bus
does not do this, either, and combined with the variety of approaches
used to implement D-Bus endpoints, it makes it really difficult to
figure out what D-Bus services, exactly, a process provides.

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: kdbus: add documentation
@ 2014-11-30  9:02     ` Florian Weimer
  0 siblings, 0 replies; 73+ messages in thread
From: Florian Weimer @ 2014-11-30  9:02 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: arnd, ebiederm, gnomes, teg, jkosina, luto, linux-api,
	linux-kernel, daniel, dh.herrmann, tixxdz

* Greg Kroah-Hartman:

> +7.4 Receiving messages

> +Also, if the connection allowed for file descriptors to be passed
> +(KDBUS_HELLO_ACCEPT_FD), and if the message contained any, they will be
> +installed into the receiving process after the KDBUS_CMD_MSG_RECV ioctl
> +returns. The receiving task is obliged to close all of them appropriately.

What happens if this is not possible because the file descriptor limit
of the process would be exceeded?  EMFILE, and the message will not
be received?

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: kdbus: add documentation
  2014-11-21 17:12     ` Andy Lutomirski
@ 2014-11-30  9:08       ` Florian Weimer
  -1 siblings, 0 replies; 73+ messages in thread
From: Florian Weimer @ 2014-11-30  9:08 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Greg Kroah-Hartman, Arnd Bergmann, Eric W. Biederman,
	One Thousand Gnomes, Tom Gundersen, Jiri Kosina, Linux API,
	linux-kernel, Daniel Mack, David Herrmann, Djalal Harouni

* Andy Lutomirski:

> At the risk of opening a can of worms, wouldn't this be much more
> useful if you could share a pool between multiple connections?

They would also be useful to reduce context switches when receiving
data from all kinds of descriptors.  At present, when polling, you
receive notification, and then you have to call into the kernel,
again, to actually fetch the data and associated information.  The
kernel could also queue the data for one specific recipient,
addressing the same issue that SO_REUSEPORT tries to solve (on poll
notification, the kernel does not know which recipient will eventually
retrieve the data, so it has to notify and wake up all of them).

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: kdbus: add documentation
  2014-11-30  9:08       ` Florian Weimer
  (?)
@ 2014-11-30 17:12       ` David Herrmann
  2014-11-30 17:22           ` Florian Weimer
  -1 siblings, 1 reply; 73+ messages in thread
From: David Herrmann @ 2014-11-30 17:12 UTC (permalink / raw)
  To: Florian Weimer
  Cc: Andy Lutomirski, Greg Kroah-Hartman, Arnd Bergmann,
	Eric W. Biederman, One Thousand Gnomes, Tom Gundersen,
	Jiri Kosina, Linux API, linux-kernel, Daniel Mack,
	Djalal Harouni

Hi

On Sun, Nov 30, 2014 at 10:08 AM, Florian Weimer <fw@deneb.enyo.de> wrote:
> * Andy Lutomirski:
>
>> At the risk of opening a can of worms, wouldn't this be much more
>> useful if you could share a pool between multiple connections?
>
> They would also be useful to reduce context switches when receiving
> data from all kinds of descriptors.  At present, when polling, you
> receive notification, and then you have to call into the kernel,
> again, to actually fetch the data and associated information.

poll(2) and friends cannot return data for changed descriptors. I
think a single trap for each KDBUS_CMD_MSG_RECV is acceptable. If this
turns out to be a bottleneck, we can provide bulk-operations in the
future. Anyway, I don't see how a _shared_ pool would change any of
this?
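
The two-step pattern under discussion (readiness notification first,
then a separate call to fetch the data) can be sketched with plain
POSIX calls. A pipe or socket stands in for a kdbus connection here;
with kdbus the second step would be the KDBUS_CMD_MSG_RECV ioctl rather
than read(2). The helper name is made up for illustration.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/*
 * Wait until `fd` is readable, then fetch up to `len` bytes.
 * Returns the number of bytes read, or -1 on error or timeout.
 */
static ssize_t wait_then_read(int fd, void *buf, size_t len, int timeout_ms)
{
	struct pollfd p = { .fd = fd, .events = POLLIN };

	assert(buf != NULL);

	/* Trap 1: poll() only reports readiness, never the data itself. */
	if (poll(&p, 1, timeout_ms) <= 0 || !(p.revents & POLLIN))
		return -1;

	/* Trap 2: a separate call is needed to actually fetch the payload. */
	return read(fd, buf, len);
}
```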

> kernel could also queue the data for one specific recipient,
> addressing the same issue that SO_REUSEPORT tries to solve (on poll
> notification, the kernel does not know which recipient will eventually
> retrieve the data, so it has to notify and wake up all of them).

We already queue data only for the addressed recipients. We *do* know
all recipients of a message at poll-notification time. We only wake up
recipients that actually got a message queued.

Not sure how this is related to SO_REUSEPORT. Can you elaborate on
your optimizations?

Thanks
David

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: kdbus: add documentation
@ 2014-11-30 17:15       ` David Herrmann
  0 siblings, 0 replies; 73+ messages in thread
From: David Herrmann @ 2014-11-30 17:15 UTC (permalink / raw)
  To: Florian Weimer
  Cc: Greg Kroah-Hartman, Arnd Bergmann, Eric W. Biederman,
	One Thousand Gnomes, Tom Gundersen, Jiri Kosina, Andy Lutomirski,
	Linux API, linux-kernel, Daniel Mack, Djalal Harouni

Hi

On Sun, Nov 30, 2014 at 10:02 AM, Florian Weimer <fw@deneb.enyo.de> wrote:
> * Greg Kroah-Hartman:
>
>> +7.4 Receiving messages
>
>> +Also, if the connection allowed for file descriptors to be passed
>> +(KDBUS_HELLO_ACCEPT_FD), and if the message contained any, they will be
>> +installed into the receiving process after the KDBUS_CMD_MSG_RECV ioctl
>> +returns. The receiving task is obliged to close all of them appropriately.
>
> What happens if this is not possible because the file descriptor limit
> of the process would be exceeded?  EMFILE, and the message will not
> be received?

The message is returned without installing the FDs. This is signaled
by EMFILE, but a valid pool offset.

Thanks
David

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: kdbus: add documentation
@ 2014-11-30 17:17       ` David Herrmann
  0 siblings, 0 replies; 73+ messages in thread
From: David Herrmann @ 2014-11-30 17:17 UTC (permalink / raw)
  To: Florian Weimer
  Cc: Greg Kroah-Hartman, Arnd Bergmann, Eric W. Biederman,
	One Thousand Gnomes, Tom Gundersen, Jiri Kosina, Andy Lutomirski,
	Linux API, linux-kernel, Daniel Mack, Djalal Harouni

Hi

On Sun, Nov 30, 2014 at 9:56 AM, Florian Weimer <fw@deneb.enyo.de> wrote:
> * Greg Kroah-Hartman:
>
>> +The focus of this document is an overview of the low-level, native kernel D-Bus
>> +transport called kdbus. Kdbus exposes its functionality via files in a
>> +filesystem called 'kdbusfs'. All communication between processes takes place
>> +via ioctls on files exposed through the mount point of a kdbusfs. The default
>> +mount point of kdbusfs is /sys/fs/kdbus.
>
> Does this mean the bus does not enforce the correctness of the D-Bus
> introspection metadata?  That's really unfortunate.  Classic D-Bus
> does not do this, either, and combined with the variety of approaches
> used to implement D-Bus endpoints, it makes it really difficult to
> figure out what D-Bus services, exactly, a process provides.

kdbus operates on the transport-level only. We never touch or look at
transferred data. As such, DBus introspection data as defined by
org.freedesktop.DBus.Introspectable is not verified by the transport
layer.

Thanks
David

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: kdbus: add documentation
@ 2014-11-30 17:22           ` Florian Weimer
  0 siblings, 0 replies; 73+ messages in thread
From: Florian Weimer @ 2014-11-30 17:22 UTC (permalink / raw)
  To: David Herrmann
  Cc: Andy Lutomirski, Greg Kroah-Hartman, Arnd Bergmann,
	Eric W. Biederman, One Thousand Gnomes, Tom Gundersen,
	Jiri Kosina, Linux API, linux-kernel, Daniel Mack,
	Djalal Harouni

* David Herrmann:

> poll(2) and friends cannot return data for changed descriptors. I
> think a single trap for each KDBUS_CMD_MSG_RECV is acceptable. If this
> turns out to be a bottleneck, we can provide bulk-operations in the
> future. Anyway, I don't see how a _shared_ pool would change any of
> this?

I responded to Andy's messages because it seemed to be about
generalizing the pool functionality.

>> kernel could also queue the data for one specific recipient,
>> addressing the same issue that SO_REUSEPORT tries to solve (on poll
>> notification, the kernel does not know which recipient will eventually
>> retrieve the data, so it has to notify and wake up all of them).
>
> We already queue data only for the addressed recipients. We *do* know
> all recipients of a message at poll-notification time. We only wake up
> recipients that actually got a message queued.

Exactly, but poll on, say, UDP sockets, does not work this way.  What
I'm trying to say is that this functionality is interesting for much
more than kdbus.

> Not sure how this is related to SO_REUSEPORT. Can you elaborate on
> your optimizations?

Without something like SO_REUSEPORT, it is a bad idea to have multiple
threads polling the same socket.  The semantics are such that the
kernel has to wake *all* the waiting threads, and one of them will
eventually pick up the datagram with a separate system call.  But the
kernel does not know which thread this will be.

With SO_REUSEPORT and a separately created socket for each polling
thread, the kernel will only signal one poll operation because it
assumes that any of the waiting threads will process the datagram, so
it's sufficient just to notify one of them.

kdbus behaves like the latter, but also saves the need to separately
obtain the datagram and related data from the kernel.

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: kdbus: add documentation
@ 2014-11-30 17:23         ` Florian Weimer
  0 siblings, 0 replies; 73+ messages in thread
From: Florian Weimer @ 2014-11-30 17:23 UTC (permalink / raw)
  To: David Herrmann
  Cc: Greg Kroah-Hartman, Arnd Bergmann, Eric W. Biederman,
	One Thousand Gnomes, Tom Gundersen, Jiri Kosina, Andy Lutomirski,
	Linux API, linux-kernel, Daniel Mack, Djalal Harouni

* David Herrmann:

> On Sun, Nov 30, 2014 at 10:02 AM, Florian Weimer <fw@deneb.enyo.de> wrote:
>> * Greg Kroah-Hartman:
>>
>>> +7.4 Receiving messages

>> What happens if this is not possible because the file descriptor limit
>> of the processes would be exceeded?  EMFILE, and the message will not
>> be received?
>
> The message is returned without installing the FDs. This is signaled
> by EMFILE, but a valid pool offset.

Oh.  This is really surprising, so it needs documentation.  But it's
probably better than the alternative (return EMFILE and leave the
message stuck, so that you receive it immediately again—this behavior
makes non-blocking accept rather difficult to use correctly).

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: kdbus: add code to gather metadata
@ 2014-12-01 13:50       ` Daniel Mack
  0 siblings, 0 replies; 73+ messages in thread
From: Daniel Mack @ 2014-12-01 13:50 UTC (permalink / raw)
  To: Andy Lutomirski, Greg Kroah-Hartman, arnd, ebiederm, gnomes, teg,
	jkosina, linux-api, linux-kernel
  Cc: dh.herrmann, tixxdz

Hi Andy,

Sorry for the late response.

On 11/21/2014 08:50 PM, Andy Lutomirski wrote:
>> +static int kdbus_meta_append_cred(struct kdbus_meta *meta,
>> +				  const struct kdbus_domain *domain)
>> +{
>> +	struct kdbus_creds creds = {
>> +		.uid = from_kuid_munged(domain->user_namespace, current_uid()),
>> +		.gid = from_kgid_munged(domain->user_namespace, current_gid()),
>> +		.pid = task_pid_nr_ns(current, domain->pid_namespace),
>> +		.tid = task_tgid_nr_ns(current, domain->pid_namespace),
> 
> This is better than before -- at least it gets translation right part of
> the way.  But it's still wrong if the receiver's namespace doesn't match
> the domain.

Alright. The code now translates the items into each receiver's
namespaces individually, so we can also take possible chroot()
environments into account to do path translations.

> Also, please move pid and tgid into their own item.  They suck for
> reasons that have been beaten to death.  Let's make it possible to
> deprecate them separately in the future.

Ok, done now. As mentioned in the highpid thread, we can easily add
another item once we have a better source of information.

>> +static int kdbus_meta_append_caps(struct kdbus_meta *meta)
>> +{
>> +	struct caps {
>> +		u32 last_cap;
>> +		struct {
>> +			u32 caps[_KERNEL_CAPABILITY_U32S];
>> +		} set[4];
>> +	} caps;
>> +	unsigned int i;
>> +	const struct cred *cred = current_cred();
>> +
>> +	caps.last_cap = CAP_LAST_CAP;
>> +
>> +	for (i = 0; i < _KERNEL_CAPABILITY_U32S; i++) {
>> +		caps.set[0].caps[i] = cred->cap_inheritable.cap[i];
>> +		caps.set[1].caps[i] = cred->cap_permitted.cap[i];
>> +		caps.set[2].caps[i] = cred->cap_effective.cap[i];
>> +		caps.set[3].caps[i] = cred->cap_bset.cap[i];
>> +	}
> 
> Please leave this in so that I can root every single kdbus-using system.
>  It'll be lots of fun.
> 
> Snark aside, the correct fix is IMO to delete this function entirely.
> Even if you could find a way to implement it safely (which will be
> distinctly nontrivial), it seems like a bad idea to begin with.

Yes, we currently have no way of translating caps from one user
namespace to another, which means we cannot allow such an item to cross
user namespaces. We can add this functionality later once we have the
necessary APIs.

>> +#ifdef CONFIG_CGROUPS
>> +static int kdbus_meta_append_cgroup(struct kdbus_meta *meta)
>> +{
>> +	char *buf, *path;
>> +	int ret;
>> +
>> +	buf = (char *)__get_free_page(GFP_TEMPORARY | __GFP_ZERO);
>> +	if (!buf)
>> +		return -ENOMEM;
>> +
>> +	path = task_cgroup_path(current, buf, PAGE_SIZE);
> 
> This may have strange interactions with cgroupns.  It's fixable, though,
> but only once you implement translation at receive time, and I think
> you'll have to do that to get any of this to work right.

Yes, agreed. We'll add translation to this item once cgroup namespaces
are in place. Until then, we'll expose the same information as other
parts of the Linux API already do.

>> +#ifdef CONFIG_AUDITSYSCALL
>> +static int kdbus_meta_append_audit(struct kdbus_meta *meta,
>> +				   const struct kdbus_domain *domain)
>> +{
>> +	struct kdbus_audit audit;
>> +
>> +	audit.loginuid = from_kuid_munged(domain->user_namespace,
>> +					  audit_get_loginuid(current));
>> +	audit.sessionid = audit_get_sessionid(current);
>> +
>> +	return kdbus_meta_append_data(meta, KDBUS_ITEM_AUDIT,
>> +				      &audit, sizeof(audit));
> 
> So *that's* what audit means.  Please document this and consider
> renaming it to something like AUDIT_LOGINUID_AND_SESSIONID.

The documentation was indeed bogus and is fixed now, along with many
other details. Thanks for spotting this!

>> +#ifdef CONFIG_SECURITY
>> +static int kdbus_meta_append_seclabel(struct kdbus_meta *meta)
>> +{
>> +	u32 len, sid;
>> +	char *label;
>> +	int ret;
>> +
>> +	security_task_getsecid(current, &sid);
>> +	ret = security_secid_to_secctx(sid, &label, &len);
>> +	if (ret == -EOPNOTSUPP)
>> +		return 0;
>> +	if (ret < 0)
>> +		return ret;
>> +
>> +	if (label && len > 0)
>> +		ret = kdbus_meta_append_data(meta, KDBUS_ITEM_SECLABEL,
>> +					     label, len);
> 
> This thing needs a clear, valid use case.  I think that the use case
> should document how non-enforcing mode is supposed to work, too.

kdbus just passes the seclabel along. If people want to use this for
security checks, then they need to call into libselinux, and should do
that by taking the enforcing mode into consideration. This is already
done in a lot of software that way.

> Also, there should be a justification for why the LSM hooks by
> themselves aren't good enough to remove the need for this.

SELinux and other MACs might want to do additional per-service security
checks. For example, a service manager might want to check the security
label of a service file against the security label of the client process
using the SELinux database before allowing access. For this, we need to be
able to pass the client's security label race-free to libselinux so that
it can make its decision.

I've added the above to the documentation now.



Thanks for your review again - much appreciated!

Daniel




^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: kdbus: add code to gather metadata
  2014-12-01 13:50       ` Daniel Mack
  (?)
@ 2014-12-01 14:46       ` Andy Lutomirski
  -1 siblings, 0 replies; 73+ messages in thread
From: Andy Lutomirski @ 2014-12-01 14:46 UTC (permalink / raw)
  To: Daniel Mack
  Cc: Greg Kroah-Hartman, Arnd Bergmann, Eric W. Biederman,
	One Thousand Gnomes, Tom Gundersen, Jiri Kosina, Linux API,
	linux-kernel, David Herrmann, Djalal Harouni

On Mon, Dec 1, 2014 at 5:50 AM, Daniel Mack <daniel@zonque.org> wrote:
> Hi Andy,
>
> Sorry for the late response.
>
> On 11/21/2014 08:50 PM, Andy Lutomirski wrote:
>>> +static int kdbus_meta_append_cred(struct kdbus_meta *meta,
>>> +                              const struct kdbus_domain *domain)
>>> +{
>>> +    struct kdbus_creds creds = {
>>> +            .uid = from_kuid_munged(domain->user_namespace, current_uid()),
>>> +            .gid = from_kgid_munged(domain->user_namespace, current_gid()),
>>> +            .pid = task_pid_nr_ns(current, domain->pid_namespace),
>>> +            .tid = task_tgid_nr_ns(current, domain->pid_namespace),
>>
>> This is better than before -- at least it gets translation right part of
>> the way.  But it's still wrong if the receiver's namespace doesn't match
>> the domain.
>
> Alright. The code now translates the items into each receiver's
> namespaces individually, so we can also take possible chroot()
> environments into account to do path translations.

Thanks!

I suspect you'll get a lot of "(unreachable)/foo/bar".

>
>> Also, please move pid and tgid into their own item.  They suck for
>> reasons that have been beaten to death.  Let's make it possible to
>> deprecate them separately in the future.
>
> Ok, done now. As mentioned in the highpid thread, we can easily add
> another item once we have a better source of information.

I'll see if I can get highpid in soon enough that starttime can be
dropped entirely.  Once it shows up, there will be an API that can
probably never be removed that isn't namespaced correctly and can't be
checkpointed.

>
>>> +static int kdbus_meta_append_caps(struct kdbus_meta *meta)
>>> +{
>>> +    struct caps {
>>> +            u32 last_cap;
>>> +            struct {
>>> +                    u32 caps[_KERNEL_CAPABILITY_U32S];
>>> +            } set[4];
>>> +    } caps;
>>> +    unsigned int i;
>>> +    const struct cred *cred = current_cred();
>>> +
>>> +    caps.last_cap = CAP_LAST_CAP;
>>> +
>>> +    for (i = 0; i < _KERNEL_CAPABILITY_U32S; i++) {
>>> +            caps.set[0].caps[i] = cred->cap_inheritable.cap[i];
>>> +            caps.set[1].caps[i] = cred->cap_permitted.cap[i];
>>> +            caps.set[2].caps[i] = cred->cap_effective.cap[i];
>>> +            caps.set[3].caps[i] = cred->cap_bset.cap[i];
>>> +    }
>>
>> Please leave this in so that I can root every single kdbus-using system.
>>  It'll be lots of fun.
>>
>> Snark aside, the correct fix is IMO to delete this function entirely.
>> Even if you could find a way to implement it safely (which will be
>> distinctly nontrivial), it seems like a bad idea to begin with.
>
> Yes, we currently have no way of translating caps from one user
> namespace to another, which means we cannot allow such an item to cross
> user namespaces. We can add this functionality later once we have the
> necessary APIs.

I still don't expect this metadata item to ever make sense to use.
The best argument I've heard is that it would enable CAP_SYS_BOOT to
ask systemd for a reboot.  I think this is bogus:

Realistically, this probably just means that users of /sbin/reboot
might continue to work as expected.  Except:

1. CAP_SYS_BOOT was never sufficient for /sbin/reboot to reboot
cleanly -- clean reboots historically went through /dev/initctl.

2. This only makes sense anyway if /sbin/reboot gets updated to speak
dbus.  So you have a new reboot binary that you want to work as long
as it ends up with CAP_SYS_BOOT.  But caps are so broken that this
won't work intelligently no matter what kdbus does.  A user can have
CAP_SYS_BOOT or not, but whether or not /sbin/reboot inherits that bit
is almost entirely a function of that user's euid and has very little
to do with whether that user has CAP_SYS_BOOT.  So you might as well
just check euid == 0 in the first place.
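
A simplified sketch of the exec-time capability rules behind point 2,
following the transformation in capabilities(7) as it stood before
ambient capabilities existed. The struct and function names are
hypothetical, and the model deliberately ignores securebits and the
"real UID is 0 but effective UID is not" case. It shows that a non-root
task holding CAP_SYS_BOOT loses it across execve() of a binary without
file capabilities, while any task that ends up with euid 0 gains it.

```c
#include <stdbool.h>
#include <stdint.h>

#define CAP_SYS_BOOT_BIT 22	/* CAP_SYS_BOOT in <linux/capability.h> */

/* Per-process capability sets, one bit per capability. */
struct cap_sets {
	uint64_t inheritable;
	uint64_t permitted;
	uint64_t effective;
	uint64_t bounding;
};

/* Capabilities attached to the executed file (all zero if none). */
struct file_caps {
	uint64_t inheritable;
	uint64_t permitted;
	bool effective;
};

/*
 * Model of the transformation applied by execve():
 *   P'(permitted)   = (P(inheritable) & F(inheritable)) |
 *                     (F(permitted) & bounding set)
 *   P'(effective)   = F(effective) ? P'(permitted) : 0
 *   P'(inheritable) = P(inheritable)
 * If the new effective UID is 0, the file sets are treated as all ones
 * and the file effective bit as set.
 */
static struct cap_sets caps_after_exec(struct cap_sets p, struct file_caps f,
				       bool euid_is_root)
{
	struct cap_sets n;

	if (euid_is_root) {
		f.inheritable = ~0ULL;
		f.permitted = ~0ULL;
		f.effective = true;
	}

	n.inheritable = p.inheritable;
	n.bounding = p.bounding;
	n.permitted = (p.inheritable & f.inheritable) |
		      (f.permitted & p.bounding);
	n.effective = f.effective ? n.permitted : 0;
	return n;
}

/* True if the given capability bit is set in a capability mask. */
static bool has_cap(uint64_t set, unsigned int cap)
{
	return (set >> cap) & 1;
}
```

With these rules, what the receiver would see in the effective set of a
freshly executed /sbin/reboot mostly reflects the caller's euid (or file
capabilities on the binary), not whether the invoking user "had"
CAP_SYS_BOOT beforehand.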

And setting fscap bits on /sbin/reboot so that a new reboot binary can
prove to systemd that it has CAP_SYS_BOOT makes no sense.  Either put
the policy in systemd in the first place, or make /sbin/reboot setuid
root.

And passing the permitted, inheritable, and bounding sets makes even
less sense.  Yes, you could pass the entirety of /proc/PID/stat and
/proc/PID/status, but that's silly, and passing caps around via kdbus
for diagnostics seems ridiculous.

If you put it in now, it's here to stay even if it proves to be a bad
idea, whereas if you don't put it in at first, you can always add it
if a good reason comes up.

>
>>> +#ifdef CONFIG_CGROUPS
>>> +static int kdbus_meta_append_cgroup(struct kdbus_meta *meta)
>>> +{
>>> +    char *buf, *path;
>>> +    int ret;
>>> +
>>> +    buf = (char *)__get_free_page(GFP_TEMPORARY | __GFP_ZERO);
>>> +    if (!buf)
>>> +            return -ENOMEM;
>>> +
>>> +    path = task_cgroup_path(current, buf, PAGE_SIZE);
>>
>> This may have strange interactions with cgroupns.  It's fixable, though,
>> but only once you implement translation at receive time, and I think
>> you'll have to do that to get any of this to work right.
>
> Yes, agreed. We'll add translation to this item once cgroup namespaces
> are in place. Until then, we'll expose the same information as other
> parts of the Linux API already do.

I don't expect this to become a problem.

>
>>> +#ifdef CONFIG_AUDITSYSCALL
>>> +static int kdbus_meta_append_audit(struct kdbus_meta *meta,
>>> +                               const struct kdbus_domain *domain)
>>> +{
>>> +    struct kdbus_audit audit;
>>> +
>>> +    audit.loginuid = from_kuid_munged(domain->user_namespace,
>>> +                                      audit_get_loginuid(current));
>>> +    audit.sessionid = audit_get_sessionid(current);
>>> +
>>> +    return kdbus_meta_append_data(meta, KDBUS_ITEM_AUDIT,
>>> +                                  &audit, sizeof(audit));
>>
>> So *that's* what audit means.  Please document this and consider
>> renaming it to something like AUDIT_LOGINUID_AND_SESSIONID.
>
> The documentation was indeed bogus and is fixed now, along with many
> other details. Thanks for spotting this!

:)


>
>>> +#ifdef CONFIG_SECURITY
>>> +static int kdbus_meta_append_seclabel(struct kdbus_meta *meta)
>>> +{
>>> +    u32 len, sid;
>>> +    char *label;
>>> +    int ret;
>>> +
>>> +    security_task_getsecid(current, &sid);
>>> +    ret = security_secid_to_secctx(sid, &label, &len);
>>> +    if (ret == -EOPNOTSUPP)
>>> +            return 0;
>>> +    if (ret < 0)
>>> +            return ret;
>>> +
>>> +    if (label && len > 0)
>>> +            ret = kdbus_meta_append_data(meta, KDBUS_ITEM_SECLABEL,
>>> +                                         label, len);
>>
>> This thing needs a clear, valid use case.  I think that the use case
>> should document how non-enforcing mode is supposed to work, too.
>
> kdbus just passes the seclabel along. If people want to use this for
> security checks, then they need to call into libselinux, and should do
> that by taking the enforcing mode into consideration. This is already
> done in a lot of software that way.
>
>> Also, there should be a justification for why the LSM hooks by
>> themselves aren't good enough to remove the need for this.
>
> SELinux and other MACs might want to do additional per-service security
> checks. For example, a service manager might want to check the security
> label of a service file against the security label of the client process
> using the SELinux database before allowing access. For this, we need to be
> able to pass the client's security label race-free to libselinux so that
> it can make its decision.

Yeah, but what happens on systems that use a different LSM?  And isn't
the point of the new LSM kdbus hooks to do this automatically without
userspace's help?

Admittedly, I don't see much of a problem with this feature existing,
as long as the non-selinux people are okay with it.

>
> I've added the above to the documentation now.
>
>
>
> Thanks for your review again - much appreciated!

And thanks for addressing most of the issues.  The code is starting to
look much better to me.

--Andy

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: kdbus: add documentation
@ 2015-01-20  8:09           ` Michael Kerrisk (man-pages)
  0 siblings, 0 replies; 73+ messages in thread
From: Michael Kerrisk (man-pages) @ 2015-01-20  8:09 UTC (permalink / raw)
  To: Florian Weimer, David Herrmann, Greg Kroah-Hartman, Daniel Mack
  Cc: mtk.manpages, Arnd Bergmann, Eric W. Biederman,
	One Thousand Gnomes, Tom Gundersen, Jiri Kosina, Andy Lutomirski,
	Linux API, linux-kernel, Djalal Harouni

Daniel,  David,

On 11/30/2014 06:23 PM, Florian Weimer wrote:
> * David Herrmann:
> 
>> On Sun, Nov 30, 2014 at 10:02 AM, Florian Weimer <fw@deneb.enyo.de> wrote:
>>> * Greg Kroah-Hartman:
>>>
>>>> +7.4 Receiving messages
> 
>>> What happens if this is not possible because the file descriptor limit
>>> of the processes would be exceeded?  EMFILE, and the message will not
>>> be received?
>>
>> The message is returned without installing the FDs. This is signaled
>> by EMFILE, but a valid pool offset.
> 
> Oh.  This is really surprising, so it needs documentation.  But it's
> probably better than the alternative (return EMFILE and leave the
> message stuck, so that you receive it immediately again—this behavior
> makes non-blocking accept rather difficult to use correctly).

So, was this point explicitly documented in the end? It's not
obvious that it is documented in the revised kdbus.txt that
Greg K-H sent out 4 days ago.

Thanks,

Michael


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: kdbus: add documentation
@ 2015-01-20  8:25             ` Daniel Mack
  0 siblings, 0 replies; 73+ messages in thread
From: Daniel Mack @ 2015-01-20  8:25 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages),
	Florian Weimer, David Herrmann, Greg Kroah-Hartman
  Cc: Arnd Bergmann, Eric W. Biederman, One Thousand Gnomes,
	Tom Gundersen, Jiri Kosina, Andy Lutomirski, Linux API,
	linux-kernel, Djalal Harouni

Hi Michael,

On 01/20/2015 09:09 AM, Michael Kerrisk (man-pages) wrote:
> On 11/30/2014 06:23 PM, Florian Weimer wrote:
>> * David Herrmann:
>>
>>> On Sun, Nov 30, 2014 at 10:02 AM, Florian Weimer <fw@deneb.enyo.de> wrote:
>>>> * Greg Kroah-Hartman:
>>>>
>>>>> +7.4 Receiving messages
>>
>>>> What happens if this is not possible because the file descriptor limit
>>>> of the processes would be exceeded?  EMFILE, and the message will not
>>>> be received?
>>>
>>> The message is returned without installing the FDs. This is signaled
>>> by EMFILE, but a valid pool offset.
>>
>> Oh.  This is really surprising, so it needs documentation.  But it's
>> probably better than the alternative (return EMFILE and leave the
>> message stuck, so that you receive it immediately again—this behavior
>> makes non-blocking accept rather difficult to use correctly).
> 
> So, was this point in the end explicitly documented? It is not
> obvious that it is documented in the revised kdbus.txt that
> Greg K-H sent out 4 days ago.

No, we've revisited this point and changed the kernel behavior again in
v3. We no longer return -EMFILE in this case, but instead set
KDBUS_RECV_RETURN_INCOMPLETE_FDS in 'return_flags', a new field in the
receive ioctl struct. We believe that's a nicer way of signaling
specific errors. The message will carry -1 for every FD that failed to
get installed, so the user can actually see which ones are missing.

That's also documented in kdbus.txt, but we missed putting it into the
Changelog - sorry for that.


Hope this helps,
Daniel
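
A minimal sketch of how a receiver might act on the v3 behavior described
above, assuming the FD array of a received message has already been
located. The struct, the flag value and the helper name are hypothetical
stand-ins for illustration only; they are not copied from the kdbus
header, and the real layout of the receive ioctl struct may differ.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed bit value; the real definition lives in the kdbus header. */
#define SKETCH_RECV_RETURN_INCOMPLETE_FDS (1ULL << 0)

/* Hypothetical stand-in for the receive ioctl struct: only the
 * 'return_flags' field discussed in this thread is modeled. */
struct sketch_recv {
	uint64_t return_flags;
};

/*
 * Count the FDs that the kernel failed to install.
 *
 * Returns 0 without scanning when the incomplete-FDs flag is not set.
 * Otherwise every slot holding -1 is counted as a missing FD.
 */
static size_t sketch_count_missing_fds(const struct sketch_recv *recv,
				       const int *fds, size_t n_fds)
{
	size_t i, missing = 0;

	if (!(recv->return_flags & SKETCH_RECV_RETURN_INCOMPLETE_FDS))
		return 0;

	for (i = 0; i < n_fds; i++)
		if (fds[i] == -1)
			missing++;

	return missing;
}
```

A caller would check the result after a successful receive and decide
whether the message is still usable without the missing FDs, instead of
treating the whole receive as failed.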

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: kdbus: add documentation
  2015-01-20  8:25             ` Daniel Mack
  (?)
@ 2015-01-20 12:54             ` Michael Kerrisk (man-pages)
  -1 siblings, 0 replies; 73+ messages in thread
From: Michael Kerrisk (man-pages) @ 2015-01-20 12:54 UTC (permalink / raw)
  To: Daniel Mack, Florian Weimer, David Herrmann, Greg Kroah-Hartman
  Cc: mtk.manpages, Arnd Bergmann, Eric W. Biederman,
	One Thousand Gnomes, Tom Gundersen, Jiri Kosina, Andy Lutomirski,
	Linux API, linux-kernel, Djalal Harouni

On 01/20/2015 09:25 AM, Daniel Mack wrote:
> Hi Michael,
> 
> On 01/20/2015 09:09 AM, Michael Kerrisk (man-pages) wrote:
>> On 11/30/2014 06:23 PM, Florian Weimer wrote:
>>> * David Herrmann:
>>>
>>>> On Sun, Nov 30, 2014 at 10:02 AM, Florian Weimer <fw@deneb.enyo.de> wrote:
>>>>> * Greg Kroah-Hartman:
>>>>>
>>>>>> +7.4 Receiving messages
>>>
>>>>> What happens if this is not possible because the file descriptor limit
>>>>> of the processes would be exceeded?  EMFILE, and the message will not
>>>>> be received?
>>>>
>>>> The message is returned without installing the FDs. This is signaled
>>>> by EMFILE, but a valid pool offset.
>>>
>>> Oh.  This is really surprising, so it needs documentation.  But it's
>>> probably better than the alternative (return EMFILE and leave the
>>> message stuck, so that you receive it immediately again—this behavior
>>> makes non-blocking accept rather difficult to use correctly).
>>
>> So, was this point in the end explicitly documented? It is not
>> obvious that it is documented in the revised kdbus.txt that
>> Greg K-H sent out 4 days ago.
> 
> No, we've revisited this point and changed the kernel behavior again in
> v3. We're no longer returning -EMFILE in this case, but rather set
> KDBUS_RECV_RETURN_INCOMPLETE_FDS in a new field in the receive ioctl
> struct called 'return_flags'. We believe that's a nicer way of signaling
> specific errors. The message will carry -1 for all FDs that failed to
> get installed, so the user can actually see which one is missing.
> 
> That's also documented in kdbus.txt, but we missed putting it into the
> Changelog - sorry for that.

Thanks for the info, Daniel.

Cheers,

Michael


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: [PATCH v2 00/13] Add kdbus implementation
@ 2014-12-05  8:49 ` Hillf Danton
  0 siblings, 0 replies; 73+ messages in thread
From: Hillf Danton @ 2014-12-05  8:49 UTC (permalink / raw)
  To: 'Greg Kroah-Hartman'
  Cc: arnd, ebiederm, gnomes, teg, jkosina, luto, linux-api,
	linux-kernel, daniel, dh.herrmann, tixxdz, Hillf Danton,
	赵东(辅周)

Hey all
> 
> kdbus is a kernel-level IPC implementation that aims for resemblance to
> the the protocol layer with the existing userspace D-Bus daemon while
> enabling some features that couldn't be implemented before in userspace.
> 
[...]
> 
> This can also be found in a git tree, the kdbus branch of char-misc.git at:
>         https://git.kernel.org/cgit/linux/kernel/git/gregkh/char-misc.git/
> 
In the environment:
	Android Kitkat
	Linux-3.4.67
	CPU MTK MT6582 ARMv7 Processor rev 3 (v7l)

root@cwet_td_a800:/ # kdbus-test 
Testing bus make functions (bus-make) .................................. OK
Testing the HELLO command (hello) ...................................... OK
Testing the BYEBYE command (byebye) .................................... OK
Testing a chat pattern (chat) .......................................... OK
Testing a simple dameon (daemon) ....................................... OK
Testing file descriptor passing (fd-passing) ........................... OK
Testing custom endpoint (endpoint) ..................................... OK
Testing monitor functionality (monitor) ................................ OK
Testing basic name registry functions (name-basics) .................... OK
Testing name registry conflict details (name-conflict) ................. OK
Testing queuing of names (name-queue) .................................. OK
Testing basic message handling (message-basic) ......................... OK
Testing handling of messages with priority (message-prio) .............. OK
Testing message quotas are enforced (message-quota) .................... OK
Testing timeout (timeout) .............................................. OK
Testing synchronous replies vs. BYEBYE (sync-byebye) ................... OK
Testing synchronous replies (sync-reply) ............................... OK
Testing freeing of memory (message-free) ............................... OK
Testing retrieving connection information (connection-info) ............ OK
Testing updating connection information (connection-update) ............ OK
Testing verifying pools are never writable (writable-pool) ............. OK
Testing policy (policy) ................................................ OK
Testing unprivileged bus access (policy-priv) .......................... OK
Testing policy in user namespaces (policy-ns) .......................... OK
Testing metadata in user namespaces (metadata-ns) ...................... OK
Testing adding of matches by id (match-id-add) ......................... OK
Testing removing of matches by id (match-id-remove) .................... OK
Testing replace of matches with the same cookie (match-replace) ........ OK
Testing adding of matches by name (match-name-add) ..................... OK
Testing removing of matches by name (match-name-remove) ................ OK
Testing matching for name changes (match-name-change) .................. OK
Testing matching with bloom filters (match-bloom) ...................... OK
Testing activator connections (activator) .............................. OK
Testing benchmark (benchmark) .......................................... OK
Testing race multiple byebyes (race-byebye) ............................ OK
Testing race byebye vs match removal (race-byebye-match) ............... OK

SUMMARY: 36 tests passed, 0 skipped, 0 failed


And we would like to test newer versions of kdbus, if any, on our phone.

Thanks
Hillf



^ permalink raw reply	[flat|nested] 73+ messages in thread
