* [PATCH v9 0/8] Landlock: IOCTL support
@ 2024-02-09 17:06 Günther Noack
  2024-02-09 17:06 ` [PATCH v9 1/8] landlock: Add IOCTL access right Günther Noack
                   ` (7 more replies)
  0 siblings, 8 replies; 50+ messages in thread
From: Günther Noack @ 2024-02-09 17:06 UTC
  To: linux-security-module, Mickaël Salaün
  Cc: Jeff Xu, Arnd Bergmann, Jorge Lucangeli Obes, Allen Webb,
	Dmitry Torokhov, Paul Moore, Konstantin Meskhidze,
	Matt Bobrowski, linux-fsdevel, Günther Noack

Hello!

These patches add simple ioctl(2) support to Landlock.

Objective
~~~~~~~~~

Make ioctl(2) requests restrictable with Landlock,
in a way that is useful for real-world applications.

Proposed approach
~~~~~~~~~~~~~~~~~

Introduce the LANDLOCK_ACCESS_FS_IOCTL right, which restricts the use
of ioctl(2) on file descriptors.

We attach IOCTL access rights to opened file descriptors, as we
already do for LANDLOCK_ACCESS_FS_TRUNCATE.

If LANDLOCK_ACCESS_FS_IOCTL is handled (restricted in the ruleset),
the LANDLOCK_ACCESS_FS_IOCTL access right governs the use of all IOCTL
commands.

We make an exception for the common and known-harmless IOCTL commands
FIOCLEX, FIONCLEX, FIONBIO and FIOASYNC.  These IOCTL commands are
always permitted.  Their functionality is already available through
fcntl(2).

If additionally(!) the access rights LANDLOCK_ACCESS_FS_READ_FILE,
LANDLOCK_ACCESS_FS_WRITE_FILE or LANDLOCK_ACCESS_FS_READ_DIR are
handled, these access rights also unlock some IOCTL commands which are
considered safe for use with files opened in these ways.

As soon as these access rights are handled, the affected IOCTL
commands can no longer be permitted through LANDLOCK_ACCESS_FS_IOCTL,
but only through the respective more specific access rights.  A full
list of the affected IOCTL commands is given below in this cover
letter and in the documentation.

I believe that this approach works for the majority of use cases, and
offers a good trade-off between the complexity of the Landlock API and
implementation on the one hand, and flexibility when using the feature
on the other.
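
The way these rights combine can be sketched as a small userspace
model.  This is only an illustration, not kernel code: it assumes a
single ruleset layer with a single rule covering the file, the short
names (FS_IOCTL, GROUP_RW, ioctl_allowed() and so on) are hypothetical,
and the logic is modelled on expand_ioctl() and hook_file_ioctl() from
patch 1/8.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t access_mask_t;

/* Public rights; bit positions as in include/uapi/linux/landlock.h. */
#define FS_WRITE_FILE (1U << 1)
#define FS_READ_FILE  (1U << 2)
#define FS_READ_DIR   (1U << 3)
#define FS_IOCTL      (1U << 15)

/* Hypothetical short names for the kernel-internal synthetic groups. */
#define GROUP_RW      (FS_IOCTL << 1) /* FIOQSIZE, FIONREAD, FIGETBSZ */
#define GROUP_RW_FILE (FS_IOCTL << 2) /* FS_IOC_FIEMAP, FICLONE, ... */
#define MASK_ALL      ((GROUP_RW_FILE << 1) - 1)

/*
 * Returns dst if @access contains @src (when @src is handled) or
 * FS_IOCTL (when @src is not handled), and 0 otherwise.
 */
static access_mask_t expand_one(access_mask_t handled, access_mask_t access,
				access_mask_t src, access_mask_t dst)
{
	access_mask_t copy_from;

	if (!(handled & FS_IOCTL))
		return 0;
	copy_from = (handled & src) ? src : FS_IOCTL;
	return (access & copy_from) ? dst : 0;
}

/* Adds the synthetic group bits to @access where they are unlocked. */
static access_mask_t expand_access(access_mask_t handled, access_mask_t access)
{
	return access |
	       expand_one(handled, access, FS_WRITE_FILE,
			  GROUP_RW | GROUP_RW_FILE) |
	       expand_one(handled, access, FS_READ_FILE,
			  GROUP_RW | GROUP_RW_FILE) |
	       expand_one(handled, access, FS_READ_DIR, GROUP_RW);
}

/*
 * @required is 0 for FIOCLEX and friends, GROUP_RW or GROUP_RW_FILE for
 * the grouped commands, and FS_IOCTL for every other command.
 */
static bool ioctl_allowed(access_mask_t handled, access_mask_t rule,
			  access_mask_t required)
{
	access_mask_t handled_all, allowed;

	/* A rule may only grant rights which the ruleset handles. */
	assert((rule & ~handled) == 0);

	handled_all = expand_access(handled, handled);
	/* Rights which are not handled stay implicitly allowed. */
	allowed = expand_access(handled, rule) | (MASK_ALL & ~handled_all);
	return (allowed & required) == required;
}
```

In this model, with FS_IOCTL, FS_READ_FILE, FS_WRITE_FILE and
FS_READ_DIR all handled, a rule granting only FS_READ_FILE allows the
GROUP_RW and GROUP_RW_FILE commands but none of the remaining ones,
whereas a rule granting only FS_IOCTL allows the remaining commands
but not the grouped ones.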

Current limitations
~~~~~~~~~~~~~~~~~~~

With this patch set, ioctl(2) requests can *not* be filtered based on
file type, device number (dev_t) or on the ioctl(2) request number.

In the discussion of the initial RFC patch set [1], we reached
consensus to start with this simpler coarse-grained approach, and to
build additional IOCTL restriction capabilities on top in subsequent
steps.

[1] https://lore.kernel.org/linux-security-module/d4f1395c-d2d4-1860-3a02-2a0c023dd761@digikod.net/

Notable implications of this approach
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

* Existing inherited file descriptors stay unaffected
  when a program enables Landlock.

  This means in particular that in common scenarios,
  the terminal's IOCTLs (ioctl_tty(2)) continue to work.

* ioctl(2) continues to be available for file descriptors acquired
  through means other than open(2).  Example: Network sockets,
  memfd_create(2), file descriptors that are already open before the
  Landlock ruleset is enabled.

Examples
~~~~~~~~

Starting a sandboxed shell from $HOME with samples/landlock/sandboxer:

  LL_FS_RO=/ LL_FS_RW=. ./sandboxer /bin/bash

The LANDLOCK_ACCESS_FS_IOCTL right is part of the "read-write" rights
here, so we expect that newly opened files outside of $HOME don't work
with most IOCTL commands.

  * "stty" works: It probes terminal properties

  * "stty </dev/tty" fails: /dev/tty can be reopened, but the IOCTL is
    denied.

  * "eject" fails: the IOCTLs for using the CD-ROM drive are denied.

  * "ls /dev" works: It uses ioctl to get the terminal size for
    columnar layout

  * The text editors "vim" and "mg" work.  (GNU Emacs fails because it
    attempts to reopen /dev/tty.)

IOCTL groups
~~~~~~~~~~~~

To decide which IOCTL commands should be blanket-permitted we went
through the list of IOCTL commands mentioned in fs/ioctl.c and looked
at them individually to understand what they are about.  The following
list is for reference.

We should always allow the following IOCTL commands, which are also
available through fcntl(2) with the F_SETFD and F_SETFL commands:

 * FIOCLEX, FIONCLEX - these work on the file descriptor and
   manipulate the close-on-exec flag
 * FIONBIO, FIOASYNC - these work on the struct file and enable
   nonblocking-IO and async flags
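
The equivalence with fcntl(2) can be observed from userspace.  The
following is a minimal sketch, assuming Linux with glibc (where
<sys/ioctl.h> provides FIOCLEX, FIONCLEX and FIONBIO); the helper
names are made up for this example.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Same effect as fcntl(fd, F_SETFD, FD_CLOEXEC). */
static int set_cloexec(int fd)
{
	assert(fd >= 0);
	return ioctl(fd, FIOCLEX);
}

/* Same effect as clearing FD_CLOEXEC through fcntl(fd, F_SETFD, ...). */
static int clear_cloexec(int fd)
{
	assert(fd >= 0);
	return ioctl(fd, FIONCLEX);
}

/* Same effect as toggling O_NONBLOCK through fcntl(fd, F_SETFL, ...). */
static int set_nonblocking(int fd, bool enable)
{
	int on = enable;

	assert(fd >= 0);
	return ioctl(fd, FIONBIO, &on);
}

/* The flags are read back through fcntl(2) to show the equivalence. */
static bool has_cloexec(int fd)
{
	const int flags = fcntl(fd, F_GETFD);

	return flags != -1 && (flags & FD_CLOEXEC);
}

static bool has_nonblock(int fd)
{
	const int flags = fcntl(fd, F_GETFL);

	return flags != -1 && (flags & O_NONBLOCK);
}
```

For example, on a file descriptor freshly created with pipe(2),
has_cloexec() is false, becomes true after set_cloexec(), and false
again after clear_cloexec().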

The following commands are guarded and enabled by any of
LANDLOCK_ACCESS_FS_WRITE_FILE, LANDLOCK_ACCESS_FS_READ_FILE or
LANDLOCK_ACCESS_FS_READ_DIR, once one of them is handled
(otherwise by LANDLOCK_ACCESS_FS_IOCTL):

 * FIOQSIZE - get the size of the opened file
 * FIONREAD - get the number of bytes available for reading (the
   implementation is defined per file type)
 * FIGETBSZ - get file system blocksize

The following commands are guarded and enabled by either of
LANDLOCK_ACCESS_FS_WRITE_FILE or LANDLOCK_ACCESS_FS_READ_FILE,
once one of them is handled (otherwise by LANDLOCK_ACCESS_FS_IOCTL):

 * FS_IOC_FIEMAP - get information about file extent mapping
   (cf. https://www.kernel.org/doc/Documentation/filesystems/fiemap.txt)
 * FIBMAP - get the file system block numbers underlying a file
 * FIDEDUPERANGE, FICLONE, FICLONERANGE - manipulating shared physical storage
   between multiple files.  These only work on some COW file systems, by design.
 * FS_IOC_RESVSP, FS_IOC_RESVSP64, FS_IOC_UNRESVSP, FS_IOC_UNRESVSP64,
   FS_IOC_ZERO_RANGE: Backwards compatibility with legacy XFS
   preallocation syscalls which predate fallocate(2).

The following commands are also mentioned in fs/ioctl.c, but are not
handled specially and are managed by LANDLOCK_ACCESS_FS_IOCTL together
with all other remaining IOCTL commands:

 * FIFREEZE, FITHAW - work on superblock(!) to freeze/thaw the file
   system. Requires CAP_SYS_ADMIN.
 * Accessing file attributes:
   * FS_IOC_GETFLAGS, FS_IOC_SETFLAGS - manipulate inode flags (ioctl_iflags(2))
   * FS_IOC_FSGETXATTR, FS_IOC_FSSETXATTR - more attributes

Related Work
~~~~~~~~~~~~

OpenBSD's pledge(2) [2] restricts ioctl(2) independent of the file
descriptor which is used.  The implementers maintain multiple
allow-lists of predefined ioctl(2) operations required for different
application domains such as "audio", "bpf", "tty" and "inet".

OpenBSD does not guarantee ABI backwards compatibility to the same
extent as Linux does, so it's easier for them to update these lists in
later versions.  It might not be a feasible approach for Linux though.

[2] https://man.openbsd.org/OpenBSD-7.3/pledge.2

Open questions
~~~~~~~~~~~~~~

 * Can the FIONREAD IOCTL command number be overloaded?
 
   We allow the use of FIONREAD quite liberally, but for non-regular files, this
   IOCTL command is not implemented in the VFS layer; it is passed on to the
   file's own IOCTL implementation.  It is *technically* possible that file
   implementations overload the FIONREAD IOCTL number for other purposes which
   we don't want to permit.

   With what certainty can we assume that FIONREAD implementations are actually
   implementing that FIONREAD functionality?  If it were to happen anyway, would
   that be considered a kernel bug that has to be fixed?

 * I still need to write a test for the COW file system "reflink" IOCTLs, but it
   felt like the changes in V9 of the patch set were already large enough to
   send them out.

Changes
~~~~~~~

V9:
 * in “landlock: Add IOCTL access right”:
   * Change IOCTL group names and grouping as discussed with Mickaël.
     This makes the grouping coarser, and we occasionally rely on the
     underlying implementation to perform the appropriate read/write
     checks.
     * Group IOCTL_RW (one of READ_FILE, WRITE_FILE or READ_DIR):
       FIONREAD, FIOQSIZE, FIGETBSZ
     * Group IOCTL_RWF (one of READ_FILE or WRITE_FILE):
       FS_IOC_FIEMAP, FIBMAP, FIDEDUPERANGE, FICLONE, FICLONERANGE,
       FS_IOC_RESVSP, FS_IOC_RESVSP64, FS_IOC_UNRESVSP, FS_IOC_UNRESVSP64,
       FS_IOC_ZERO_RANGE
   * Exempt pipe file descriptors from IOCTL restrictions,
     even for named pipes which are opened from the file system.
     This is to be consistent with anonymous pipes created with pipe(2).
     As discussed in https://lore.kernel.org/r/ZP7lxmXklksadvz+@google.com
   * Document rationale for the IOCTL grouping in the code
   * Use __attribute_const__
   * Rename required_ioctl_access() to get_required_ioctl_access()
 * Selftests
   * Simplify IOCTL test fixtures as a result of simpler grouping.
   * Test that IOCTLs are permitted on named pipe FDs.
   * Test that IOCTLs are permitted on named Unix Domain Socket FDs.
   * Work around compilation issue with old GCC / glibc.
     https://sourceware.org/glibc/wiki/Synchronizing_Headers
     Thanks to Huyadi <hu.yadi@h3c.com>, who pointed this out in
     https://lore.kernel.org/all/f25be6663bcc4608adf630509f045a76@h3c.com/
     and Mickaël, who fixed it through #include reordering.
 * Documentation changes
   * Reword "IOCTL commands" section a bit
   * s/permit/allow/
   * s/access right/right/, if preceded by LANDLOCK_ACCESS_FS_*
   * s/IOCTL/FS_IOCTL/ in ASCII table
   * Update IOCTL grouping documentation in header file
 * Removed a few of the earlier commits in this patch set,
   which have already been merged.

V8:
 * Documentation changes
   * userspace-api/landlock.rst:
     * Add an extra paragraph about how the IOCTL right combines
       when used with other access rights.
     * Explain better the circumstances under which passing of
       file descriptors between different Landlock domains can happen
   * limits.h: Add comment to explain public vs internal FS access rights
   * Add a paragraph in the commit to explain better why the IOCTL
     right works as it does

V7:
 * in “landlock: Add IOCTL access right”:
   * Make IOCTL_GROUPS a #define so that static_assert works even on
     old compilers (bug reported by Intel about PowerPC GCC9 config)
   * Adapt indentation of IOCTL_GROUPS definition
   * Add missing dots in kernel-doc comments.
 * in “landlock: Remove remaining "inline" modifiers in .c files”:
   * explain reasoning in commit message

V6:
 * Implementation:
   * Check that only publicly visible access rights can be used when adding a
     rule (rather than the synthetic ones).  Thanks Mickaël for spotting that!
   * Move all functionality related to IOCTL groups and synthetic access rights
     into the same place at the top of fs.c
   * Move kernel doc to the .c file in one instance
   * Smaller code style issues (upcase IOCTL, vardecl at block start)
   * Remove inline modifier from functions in .c files
 * Tests:
   * use SKIP
   * Rename 'fd' to dir_fd and file_fd where appropriate
   * Remove duplicate "ioctl" mentions from test names
   * Rename "permitted" to "allowed", in ioctl and ftruncate tests
   * Do not add rules if access is 0, in test helper

V5:
 * Implementation:
   * move IOCTL group expansion logic into fs.c (implementation suggested by
     mic)
   * rename IOCTL_CMD_G* constants to LANDLOCK_ACCESS_FS_IOCTL_GROUP*
   * fs.c: create ioctl_groups constant
   * add "const" to some variables
 * Formatting and docstring fixes (including wrong kernel-doc format)
 * samples/landlock: fix ABI version and fallback attribute (mic)
 * Documentation
   * move header documentation changes into the implementation commit
   * spell out how FIFREEZE, FITHAW and attribute-manipulation ioctls from
     fs/ioctl.c are handled
   * change ABI 4 to ABI 5 in some missing places
   
V4:
 * use "synthetic" IOCTL access rights, as previously discussed
 * testing changes
   * use a large fixture-based test, for more exhaustive coverage,
     and replace some of the earlier tests with it
 * rebased on mic-next

V3:
 * always permit the IOCTL commands FIOCLEX, FIONCLEX, FIONBIO, FIOASYNC and
   FIONREAD, independent of LANDLOCK_ACCESS_FS_IOCTL
 * increment ABI version in the same commit where the feature is introduced
 * testing changes
   * use FIOQSIZE instead of TTY IOCTL commands
     (FIOQSIZE works with regular files, directories and memfds)
   * run the memfd test with both Landlock enabled and disabled
   * add a test for the always-permitted IOCTL commands

V2:
 * rebased on mic-next
 * added documentation
 * exercise ioctl(2) in the memfd test
 * test: Use layout0 for the test

---

V1: https://lore.kernel.org/linux-security-module/20230502171755.9788-1-gnoack3000@gmail.com/
V2: https://lore.kernel.org/linux-security-module/20230623144329.136541-1-gnoack@google.com/
V3: https://lore.kernel.org/linux-security-module/20230814172816.3907299-1-gnoack@google.com/
V4: https://lore.kernel.org/linux-security-module/20231103155717.78042-1-gnoack@google.com/
V5: https://lore.kernel.org/linux-security-module/20231117154920.1706371-1-gnoack@google.com/
V6: https://lore.kernel.org/linux-security-module/20231124173026.3257122-1-gnoack@google.com/
V7: https://lore.kernel.org/linux-security-module/20231201143042.3276833-1-gnoack@google.com/
V8: https://lore.kernel.org/linux-security-module/20231208155121.1943775-1-gnoack@google.com/

Günther Noack (8):
  landlock: Add IOCTL access right
  selftests/landlock: Test IOCTL support
  selftests/landlock: Test IOCTL with memfds
  selftests/landlock: Test ioctl(2) and ftruncate(2) with open(O_PATH)
  selftests/landlock: Test IOCTLs on named pipes
  selftests/landlock: Check IOCTL restrictions for named UNIX domain
    sockets
  samples/landlock: Add support for LANDLOCK_ACCESS_FS_IOCTL
  landlock: Document IOCTL support

 Documentation/userspace-api/landlock.rst     | 121 +++-
 include/uapi/linux/landlock.h                |  55 +-
 samples/landlock/sandboxer.c                 |  13 +-
 security/landlock/fs.c                       | 227 +++++++-
 security/landlock/fs.h                       |   3 +
 security/landlock/limits.h                   |  11 +-
 security/landlock/ruleset.h                  |   2 +-
 security/landlock/syscalls.c                 |  19 +-
 tools/testing/selftests/landlock/base_test.c |   2 +-
 tools/testing/selftests/landlock/fs_test.c   | 559 ++++++++++++++++++-
 10 files changed, 963 insertions(+), 49 deletions(-)


base-commit: 5b921b7dbe3e0df48a1d947b3813ac9ae18858c1
-- 
2.43.0.687.g38aa6559b0-goog



* [PATCH v9 1/8] landlock: Add IOCTL access right
  2024-02-09 17:06 [PATCH v9 0/8] Landlock: IOCTL support Günther Noack
@ 2024-02-09 17:06 ` Günther Noack
  2024-02-10 11:06   ` Günther Noack
                     ` (3 more replies)
  2024-02-09 17:06 ` [PATCH v9 2/8] selftests/landlock: Test IOCTL support Günther Noack
                   ` (6 subsequent siblings)
  7 siblings, 4 replies; 50+ messages in thread
From: Günther Noack @ 2024-02-09 17:06 UTC
  To: linux-security-module, Mickaël Salaün
  Cc: Jeff Xu, Arnd Bergmann, Jorge Lucangeli Obes, Allen Webb,
	Dmitry Torokhov, Paul Moore, Konstantin Meskhidze,
	Matt Bobrowski, linux-fsdevel, Günther Noack

Introduces the LANDLOCK_ACCESS_FS_IOCTL access right
and increments the Landlock ABI version to 5.

Like the truncate right, these rights are associated with a file
descriptor at the time of open(2), and get respected even when the
file descriptor is used outside of the thread which it was originally
opened in.

A newly enabled Landlock policy therefore does not apply to file
descriptors which are already open.

If the LANDLOCK_ACCESS_FS_IOCTL right is handled, only a small number
of safe IOCTL commands will be permitted on newly opened files.  The
permitted IOCTLs can be configured through the ruleset in limited ways
now.  (See documentation for details.)

Specifically, when LANDLOCK_ACCESS_FS_IOCTL is handled, granting this
right on a file or directory will *not* permit all IOCTL commands, but
will only influence the IOCTL commands which are not already handled
through other access rights.  The intent is to keep the groups of
IOCTL commands more fine-grained.

Noteworthy scenarios which require special attention:

TTY devices are often passed into a process from the parent process,
and so a newly enabled Landlock policy does not retroactively apply to
them automatically.  In the past, TTY devices have often supported
IOCTL commands like TIOCSTI and some TIOCLINUX subcommands, which
let callers control the TTY input buffer (and simulate keypresses).
On modern kernels, this should be restricted to CAP_SYS_ADMIN
programs though.

Some legitimate file system features, like setting up fscrypt, are
exposed as IOCTL commands on regular files and directories -- users of
Landlock are advised to double check that the sandboxed process does
not need to invoke these IOCTLs.

Known limitations:

The LANDLOCK_ACCESS_FS_IOCTL access right is a coarse-grained control
over IOCTL commands.  Future work will enable a more fine-grained
access control for IOCTLs.

In the meantime, Landlock users may use path-based restrictions in
combination with their knowledge about the file system layout to
control what IOCTLs can be done.  Mounting file systems with the nodev
option can help to distinguish regular files and devices, and give
guarantees about the affected files, which Landlock alone can not give
yet.

Signed-off-by: Günther Noack <gnoack@google.com>
---
 include/uapi/linux/landlock.h                |  55 ++++-
 security/landlock/fs.c                       | 227 ++++++++++++++++++-
 security/landlock/fs.h                       |   3 +
 security/landlock/limits.h                   |  11 +-
 security/landlock/ruleset.h                  |   2 +-
 security/landlock/syscalls.c                 |  19 +-
 tools/testing/selftests/landlock/base_test.c |   2 +-
 tools/testing/selftests/landlock/fs_test.c   |   5 +-
 8 files changed, 302 insertions(+), 22 deletions(-)

diff --git a/include/uapi/linux/landlock.h b/include/uapi/linux/landlock.h
index 25c8d7677539..16d7d72804f8 100644
--- a/include/uapi/linux/landlock.h
+++ b/include/uapi/linux/landlock.h
@@ -128,7 +128,7 @@ struct landlock_net_port_attr {
  * files and directories.  Files or directories opened before the sandboxing
  * are not subject to these restrictions.
  *
- * A file can only receive these access rights:
+ * The following access rights apply only to files:
  *
  * - %LANDLOCK_ACCESS_FS_EXECUTE: Execute a file.
  * - %LANDLOCK_ACCESS_FS_WRITE_FILE: Open a file with write access. Note that
@@ -138,12 +138,13 @@ struct landlock_net_port_attr {
  * - %LANDLOCK_ACCESS_FS_READ_FILE: Open a file with read access.
  * - %LANDLOCK_ACCESS_FS_TRUNCATE: Truncate a file with :manpage:`truncate(2)`,
  *   :manpage:`ftruncate(2)`, :manpage:`creat(2)`, or :manpage:`open(2)` with
- *   ``O_TRUNC``. Whether an opened file can be truncated with
- *   :manpage:`ftruncate(2)` is determined during :manpage:`open(2)`, in the
- *   same way as read and write permissions are checked during
- *   :manpage:`open(2)` using %LANDLOCK_ACCESS_FS_READ_FILE and
- *   %LANDLOCK_ACCESS_FS_WRITE_FILE. This access right is available since the
- *   third version of the Landlock ABI.
+ *   ``O_TRUNC``.  This access right is available since the third version of the
+ *   Landlock ABI.
+ *
+ * Whether an opened file can be truncated with :manpage:`ftruncate(2)` or used
+ * with :manpage:`ioctl(2)` is determined during :manpage:`open(2)`, in the
+ * same way as read and write permissions are checked during :manpage:`open(2)`
+ * using %LANDLOCK_ACCESS_FS_READ_FILE and %LANDLOCK_ACCESS_FS_WRITE_FILE.
  *
  * A directory can receive access rights related to files or directories.  The
  * following access right is applied to the directory itself, and the
@@ -198,13 +199,50 @@ struct landlock_net_port_attr {
  *   If multiple requirements are not met, the ``EACCES`` error code takes
  *   precedence over ``EXDEV``.
  *
+ * The following access right applies both to files and directories:
+ *
+ * - %LANDLOCK_ACCESS_FS_IOCTL: Invoke :manpage:`ioctl(2)` commands on an opened
+ *   file or directory.
+ *
+ *   This access right applies to all :manpage:`ioctl(2)` commands, except for
+ *   ``FIOCLEX``, ``FIONCLEX``, ``FIONBIO`` and ``FIOASYNC``.  These commands
+ *   continue to be invokable independent of the %LANDLOCK_ACCESS_FS_IOCTL
+ *   access right.
+ *
+ *   When certain other access rights are handled in the ruleset, in addition to
+ *   %LANDLOCK_ACCESS_FS_IOCTL, granting these access rights will unlock access
+ *   to additional groups of IOCTL commands, on the affected files:
+ *
+ *   * %LANDLOCK_ACCESS_FS_READ_FILE and %LANDLOCK_ACCESS_FS_WRITE_FILE unlock
+ *     access to ``FIOQSIZE``, ``FIONREAD``, ``FIGETBSZ``, ``FS_IOC_FIEMAP``,
+ *     ``FIBMAP``, ``FIDEDUPERANGE``, ``FICLONE``, ``FICLONERANGE``,
+ *     ``FS_IOC_RESVSP``, ``FS_IOC_RESVSP64``, ``FS_IOC_UNRESVSP``,
+ *     ``FS_IOC_UNRESVSP64``, ``FS_IOC_ZERO_RANGE``.
+ *
+ *   * %LANDLOCK_ACCESS_FS_READ_DIR unlocks access to ``FIOQSIZE``,
+ *     ``FIONREAD``, ``FIGETBSZ``.
+ *
+ *   When these access rights are handled in the ruleset, the availability of
+ *   the affected IOCTL commands is not governed by %LANDLOCK_ACCESS_FS_IOCTL
+ *   any more, but by the respective access right.
+ *
+ *   All other IOCTL commands are not handled specially, and are governed by
+ *   %LANDLOCK_ACCESS_FS_IOCTL.  This includes %FS_IOC_GETFLAGS and
+ *   %FS_IOC_SETFLAGS for manipulating inode flags (:manpage:`ioctl_iflags(2)`),
+ *   %FS_IOC_FSGETXATTR and %FS_IOC_FSSETXATTR for manipulating extended
+ *   attributes, as well as %FIFREEZE and %FITHAW for freezing and thawing file
+ *   systems.
+ *
+ *   This access right is available since the fifth version of the Landlock
+ *   ABI.
+ *
  * .. warning::
  *
  *   It is currently not possible to restrict some file-related actions
  *   accessible through these syscall families: :manpage:`chdir(2)`,
  *   :manpage:`stat(2)`, :manpage:`flock(2)`, :manpage:`chmod(2)`,
  *   :manpage:`chown(2)`, :manpage:`setxattr(2)`, :manpage:`utime(2)`,
- *   :manpage:`ioctl(2)`, :manpage:`fcntl(2)`, :manpage:`access(2)`.
+ *   :manpage:`fcntl(2)`, :manpage:`access(2)`.
  *   Future Landlock evolutions will enable to restrict them.
  */
 /* clang-format off */
@@ -223,6 +261,7 @@ struct landlock_net_port_attr {
 #define LANDLOCK_ACCESS_FS_MAKE_SYM			(1ULL << 12)
 #define LANDLOCK_ACCESS_FS_REFER			(1ULL << 13)
 #define LANDLOCK_ACCESS_FS_TRUNCATE			(1ULL << 14)
+#define LANDLOCK_ACCESS_FS_IOCTL			(1ULL << 15)
 /* clang-format on */
 
 /**
diff --git a/security/landlock/fs.c b/security/landlock/fs.c
index 73997e63734f..84efea3f7c0f 100644
--- a/security/landlock/fs.c
+++ b/security/landlock/fs.c
@@ -7,6 +7,7 @@
  * Copyright © 2021-2022 Microsoft Corporation
  */
 
+#include <asm/ioctls.h>
 #include <kunit/test.h>
 #include <linux/atomic.h>
 #include <linux/bitops.h>
@@ -14,6 +15,7 @@
 #include <linux/compiler_types.h>
 #include <linux/dcache.h>
 #include <linux/err.h>
+#include <linux/falloc.h>
 #include <linux/fs.h>
 #include <linux/init.h>
 #include <linux/kernel.h>
@@ -29,6 +31,7 @@
 #include <linux/types.h>
 #include <linux/wait_bit.h>
 #include <linux/workqueue.h>
+#include <uapi/linux/fiemap.h>
 #include <uapi/linux/landlock.h>
 
 #include "common.h"
@@ -84,6 +87,186 @@ static const struct landlock_object_underops landlock_fs_underops = {
 	.release = release_inode
 };
 
+/* IOCTL helpers */
+
+/*
+ * These are synthetic access rights, which are only used within the kernel, but
+ * not exposed to callers in userspace.  The mapping between these access rights
+ * and IOCTL commands is defined in the get_required_ioctl_access() helper function.
+ */
+#define LANDLOCK_ACCESS_FS_IOCTL_RW (LANDLOCK_LAST_PUBLIC_ACCESS_FS << 1)
+#define LANDLOCK_ACCESS_FS_IOCTL_RW_FILE (LANDLOCK_LAST_PUBLIC_ACCESS_FS << 2)
+
+/* IOCTL_GROUPS - all synthetic access rights for IOCTL command groups */
+/* clang-format off */
+#define IOCTL_GROUPS (				\
+	LANDLOCK_ACCESS_FS_IOCTL_RW |		\
+	LANDLOCK_ACCESS_FS_IOCTL_RW_FILE)
+/* clang-format on */
+
+static_assert((IOCTL_GROUPS & LANDLOCK_MASK_ACCESS_FS) == IOCTL_GROUPS);
+
+/**
+ * get_required_ioctl_access() - Determine required IOCTL access rights.
+ *
+ * @cmd: The IOCTL command that is supposed to be run.
+ *
+ * Any new IOCTL commands that are implemented in fs/ioctl.c's do_vfs_ioctl()
+ * should be considered for inclusion here.
+ *
+ * Returns: The access rights that must be granted on an opened file in order to
+ * use the given @cmd.
+ */
+static __attribute_const__ access_mask_t
+get_required_ioctl_access(const unsigned int cmd)
+{
+	switch (cmd) {
+	case FIOCLEX:
+	case FIONCLEX:
+	case FIONBIO:
+	case FIOASYNC:
+		/*
+		 * FIOCLEX, FIONCLEX, FIONBIO and FIOASYNC manipulate the FD's
+		 * close-on-exec and the file's nonblocking-IO and async flags.
+		 * These operations are also available through fcntl(2), and are
+		 * unconditionally permitted in Landlock.
+		 */
+		return 0;
+	case FIONREAD:
+	case FIOQSIZE:
+	case FIGETBSZ:
+		/*
+		 * FIONREAD returns the number of bytes available for reading,
+		 * that is, the number of immediately readable bytes for
+		 * a file.
+		 *
+		 * FIOQSIZE queries the size of a file or directory.
+		 *
+		 * FIGETBSZ queries the file system's block size for a file or
+		 * directory.
+		 *
+		 * These IOCTL commands are permitted for files which are opened
+		 * with LANDLOCK_ACCESS_FS_READ_DIR,
+		 * LANDLOCK_ACCESS_FS_READ_FILE, or
+		 * LANDLOCK_ACCESS_FS_WRITE_FILE.
+		 */
+		return LANDLOCK_ACCESS_FS_IOCTL_RW;
+	case FS_IOC_FIEMAP:
+	case FIBMAP:
+		/*
+		 * FS_IOC_FIEMAP and FIBMAP query information about the
+		 * allocation of blocks within a file.  They are permitted for
+		 * files which are opened with LANDLOCK_ACCESS_FS_READ_FILE or
+		 * LANDLOCK_ACCESS_FS_WRITE_FILE.
+		 */
+		fallthrough;
+	case FIDEDUPERANGE:
+	case FICLONE:
+	case FICLONERANGE:
+		/*
+		 * FIDEDUPERANGE, FICLONE and FICLONERANGE make files share
+		 * their underlying storage ("reflink") between source and
+		 * destination FDs, on file systems which support that.
+		 *
+		 * The underlying implementations are already checking whether
+		 * the involved files are opened with the appropriate read/write
+		 * modes.  We rely on this being implemented correctly.
+		 *
+		 * These IOCTLs are permitted for files which are opened with
+		 * LANDLOCK_ACCESS_FS_READ_FILE or
+		 * LANDLOCK_ACCESS_FS_WRITE_FILE.
+		 */
+		fallthrough;
+	case FS_IOC_RESVSP:
+	case FS_IOC_RESVSP64:
+	case FS_IOC_UNRESVSP:
+	case FS_IOC_UNRESVSP64:
+	case FS_IOC_ZERO_RANGE:
+		/*
+		 * These IOCTLs reserve space, or create holes like
+		 * fallocate(2).  We rely on the implementations checking the
+		 * files' read/write modes.
+		 *
+		 * These IOCTLs are permitted for files which are opened with
+		 * LANDLOCK_ACCESS_FS_READ_FILE or
+		 * LANDLOCK_ACCESS_FS_WRITE_FILE.
+		 */
+		return LANDLOCK_ACCESS_FS_IOCTL_RW_FILE;
+	default:
+		/*
+		 * Other commands are guarded by the catch-all access right.
+		 */
+		return LANDLOCK_ACCESS_FS_IOCTL;
+	}
+}
+
+/**
+ * expand_ioctl() - Return the dst flags from either the src flag or the
+ * %LANDLOCK_ACCESS_FS_IOCTL flag, depending on whether the
+ * %LANDLOCK_ACCESS_FS_IOCTL and src access rights are handled or not.
+ *
+ * @handled: Handled access rights.
+ * @access: The access mask to copy values from.
+ * @src: A single access right to copy from in @access.
+ * @dst: One or more access rights to copy to.
+ *
+ * Returns: @dst, or 0.
+ */
+static __attribute_const__ access_mask_t
+expand_ioctl(const access_mask_t handled, const access_mask_t access,
+	     const access_mask_t src, const access_mask_t dst)
+{
+	access_mask_t copy_from;
+
+	if (!(handled & LANDLOCK_ACCESS_FS_IOCTL))
+		return 0;
+
+	copy_from = (handled & src) ? src : LANDLOCK_ACCESS_FS_IOCTL;
+	if (access & copy_from)
+		return dst;
+
+	return 0;
+}
+
+/**
+ * landlock_expand_access_fs() - Returns @access with the synthetic IOCTL group
+ * flags enabled if necessary.
+ *
+ * @handled: Handled FS access rights.
+ * @access: FS access rights to expand.
+ *
+ * Returns: @access expanded by the necessary flags for the synthetic IOCTL
+ * access rights.
+ */
+static __attribute_const__ access_mask_t landlock_expand_access_fs(
+	const access_mask_t handled, const access_mask_t access)
+{
+	return access |
+	       expand_ioctl(handled, access, LANDLOCK_ACCESS_FS_WRITE_FILE,
+			    LANDLOCK_ACCESS_FS_IOCTL_RW |
+				    LANDLOCK_ACCESS_FS_IOCTL_RW_FILE) |
+	       expand_ioctl(handled, access, LANDLOCK_ACCESS_FS_READ_FILE,
+			    LANDLOCK_ACCESS_FS_IOCTL_RW |
+				    LANDLOCK_ACCESS_FS_IOCTL_RW_FILE) |
+	       expand_ioctl(handled, access, LANDLOCK_ACCESS_FS_READ_DIR,
+			    LANDLOCK_ACCESS_FS_IOCTL_RW);
+}
+
+/**
+ * landlock_expand_handled_access_fs() - add synthetic IOCTL access rights to an
+ * access mask of handled accesses.
+ *
+ * @handled: The handled accesses of a ruleset that is being created.
+ *
+ * Returns: @handled, with the bits for the synthetic IOCTL access rights set,
+ * if %LANDLOCK_ACCESS_FS_IOCTL is handled.
+ */
+__attribute_const__ access_mask_t
+landlock_expand_handled_access_fs(const access_mask_t handled)
+{
+	return landlock_expand_access_fs(handled, handled);
+}
+
 /* Ruleset management */
 
 static struct landlock_object *get_inode_object(struct inode *const inode)
@@ -148,7 +331,8 @@ static struct landlock_object *get_inode_object(struct inode *const inode)
 	LANDLOCK_ACCESS_FS_EXECUTE | \
 	LANDLOCK_ACCESS_FS_WRITE_FILE | \
 	LANDLOCK_ACCESS_FS_READ_FILE | \
-	LANDLOCK_ACCESS_FS_TRUNCATE)
+	LANDLOCK_ACCESS_FS_TRUNCATE | \
+	LANDLOCK_ACCESS_FS_IOCTL)
 /* clang-format on */
 
 /*
@@ -158,6 +342,7 @@ int landlock_append_fs_rule(struct landlock_ruleset *const ruleset,
 			    const struct path *const path,
 			    access_mask_t access_rights)
 {
+	access_mask_t handled;
 	int err;
 	struct landlock_id id = {
 		.type = LANDLOCK_KEY_INODE,
@@ -170,9 +355,11 @@ int landlock_append_fs_rule(struct landlock_ruleset *const ruleset,
 	if (WARN_ON_ONCE(ruleset->num_layers != 1))
 		return -EINVAL;
 
+	handled = landlock_get_fs_access_mask(ruleset, 0);
+	/* Expands the synthetic IOCTL groups. */
+	access_rights |= landlock_expand_access_fs(handled, access_rights);
 	/* Transforms relative access rights to absolute ones. */
-	access_rights |= LANDLOCK_MASK_ACCESS_FS &
-			 ~landlock_get_fs_access_mask(ruleset, 0);
+	access_rights |= LANDLOCK_MASK_ACCESS_FS & ~handled;
 	id.key.object = get_inode_object(d_backing_inode(path->dentry));
 	if (IS_ERR(id.key.object))
 		return PTR_ERR(id.key.object);
@@ -1333,7 +1520,9 @@ static int hook_file_open(struct file *const file)
 {
 	layer_mask_t layer_masks[LANDLOCK_NUM_ACCESS_FS] = {};
 	access_mask_t open_access_request, full_access_request, allowed_access;
-	const access_mask_t optional_access = LANDLOCK_ACCESS_FS_TRUNCATE;
+	const access_mask_t optional_access = LANDLOCK_ACCESS_FS_TRUNCATE |
+					      LANDLOCK_ACCESS_FS_IOCTL |
+					      IOCTL_GROUPS;
 	const struct landlock_ruleset *const dom = get_current_fs_domain();
 
 	if (!dom)
@@ -1375,6 +1564,16 @@ static int hook_file_open(struct file *const file)
 		}
 	}
 
+	/*
+	 * Named pipes should be treated just like anonymous pipes.
+	 * Therefore, we permit all IOCTLs on them.
+	 */
+	if (S_ISFIFO(file_inode(file)->i_mode)) {
+		allowed_access |= LANDLOCK_ACCESS_FS_IOCTL |
+				  LANDLOCK_ACCESS_FS_IOCTL_RW |
+				  LANDLOCK_ACCESS_FS_IOCTL_RW_FILE;
+	}
+
 	/*
 	 * For operations on already opened files (i.e. ftruncate()), it is the
 	 * access rights at the time of open() which decide whether the
@@ -1406,6 +1605,25 @@ static int hook_file_truncate(struct file *const file)
 	return -EACCES;
 }
 
+static int hook_file_ioctl(struct file *file, unsigned int cmd,
+			   unsigned long arg)
+{
+	const access_mask_t required_access = get_required_ioctl_access(cmd);
+	const access_mask_t allowed_access =
+		landlock_file(file)->allowed_access;
+
+	/*
+	 * It is the access rights at the time of opening the file which
+	 * determine whether IOCTL can be used on the opened file later.
+	 *
+	 * The access right is attached to the opened file in hook_file_open().
+	 */
+	if ((allowed_access & required_access) == required_access)
+		return 0;
+
+	return -EACCES;
+}
+
 static struct security_hook_list landlock_hooks[] __ro_after_init = {
 	LSM_HOOK_INIT(inode_free_security, hook_inode_free_security),
 
@@ -1428,6 +1646,7 @@ static struct security_hook_list landlock_hooks[] __ro_after_init = {
 	LSM_HOOK_INIT(file_alloc_security, hook_file_alloc_security),
 	LSM_HOOK_INIT(file_open, hook_file_open),
 	LSM_HOOK_INIT(file_truncate, hook_file_truncate),
+	LSM_HOOK_INIT(file_ioctl, hook_file_ioctl),
 };
 
 __init void landlock_add_fs_hooks(void)
diff --git a/security/landlock/fs.h b/security/landlock/fs.h
index 488e4813680a..086576b8386b 100644
--- a/security/landlock/fs.h
+++ b/security/landlock/fs.h
@@ -92,4 +92,7 @@ int landlock_append_fs_rule(struct landlock_ruleset *const ruleset,
 			    const struct path *const path,
 			    access_mask_t access_hierarchy);
 
+__attribute_const__ access_mask_t
+landlock_expand_handled_access_fs(const access_mask_t handled);
+
 #endif /* _SECURITY_LANDLOCK_FS_H */
diff --git a/security/landlock/limits.h b/security/landlock/limits.h
index 93c9c6f91556..ecbdc8bbf906 100644
--- a/security/landlock/limits.h
+++ b/security/landlock/limits.h
@@ -18,7 +18,16 @@
 #define LANDLOCK_MAX_NUM_LAYERS		16
 #define LANDLOCK_MAX_NUM_RULES		U32_MAX
 
-#define LANDLOCK_LAST_ACCESS_FS		LANDLOCK_ACCESS_FS_TRUNCATE
+/*
+ * For file system access rights, Landlock distinguishes between the publicly
+ * visible access rights (1 to LANDLOCK_LAST_PUBLIC_ACCESS_FS) and the private
+ * ones which are not exposed to userspace (LANDLOCK_LAST_PUBLIC_ACCESS_FS + 1
+ * to LANDLOCK_LAST_ACCESS_FS).  The private access rights are defined in fs.c.
+ */
+#define LANDLOCK_LAST_PUBLIC_ACCESS_FS	LANDLOCK_ACCESS_FS_IOCTL
+#define LANDLOCK_MASK_PUBLIC_ACCESS_FS	((LANDLOCK_LAST_PUBLIC_ACCESS_FS << 1) - 1)
+
+#define LANDLOCK_LAST_ACCESS_FS		(LANDLOCK_LAST_PUBLIC_ACCESS_FS << 2)
 #define LANDLOCK_MASK_ACCESS_FS		((LANDLOCK_LAST_ACCESS_FS << 1) - 1)
 #define LANDLOCK_NUM_ACCESS_FS		__const_hweight64(LANDLOCK_MASK_ACCESS_FS)
 #define LANDLOCK_SHIFT_ACCESS_FS	0
diff --git a/security/landlock/ruleset.h b/security/landlock/ruleset.h
index c7f1526784fd..5a28ea8e1c3d 100644
--- a/security/landlock/ruleset.h
+++ b/security/landlock/ruleset.h
@@ -30,7 +30,7 @@
 	LANDLOCK_ACCESS_FS_REFER)
 /* clang-format on */
 
-typedef u16 access_mask_t;
+typedef u32 access_mask_t;
 /* Makes sure all filesystem access rights can be stored. */
 static_assert(BITS_PER_TYPE(access_mask_t) >= LANDLOCK_NUM_ACCESS_FS);
 /* Makes sure all network access rights can be stored. */
diff --git a/security/landlock/syscalls.c b/security/landlock/syscalls.c
index 898358f57fa0..f0bc50003b46 100644
--- a/security/landlock/syscalls.c
+++ b/security/landlock/syscalls.c
@@ -137,7 +137,7 @@ static const struct file_operations ruleset_fops = {
 	.write = fop_dummy_write,
 };
 
-#define LANDLOCK_ABI_VERSION 4
+#define LANDLOCK_ABI_VERSION 5
 
 /**
  * sys_landlock_create_ruleset - Create a new ruleset
@@ -192,8 +192,8 @@ SYSCALL_DEFINE3(landlock_create_ruleset,
 		return err;
 
 	/* Checks content (and 32-bits cast). */
-	if ((ruleset_attr.handled_access_fs | LANDLOCK_MASK_ACCESS_FS) !=
-	    LANDLOCK_MASK_ACCESS_FS)
+	if ((ruleset_attr.handled_access_fs | LANDLOCK_MASK_PUBLIC_ACCESS_FS) !=
+	    LANDLOCK_MASK_PUBLIC_ACCESS_FS)
 		return -EINVAL;
 
 	/* Checks network content (and 32-bits cast). */
@@ -201,6 +201,10 @@ SYSCALL_DEFINE3(landlock_create_ruleset,
 	    LANDLOCK_MASK_ACCESS_NET)
 		return -EINVAL;
 
+	/* Expands synthetic IOCTL groups. */
+	ruleset_attr.handled_access_fs = landlock_expand_handled_access_fs(
+		ruleset_attr.handled_access_fs);
+
 	/* Checks arguments and transforms to kernel struct. */
 	ruleset = landlock_create_ruleset(ruleset_attr.handled_access_fs,
 					  ruleset_attr.handled_access_net);
@@ -309,8 +313,13 @@ static int add_rule_path_beneath(struct landlock_ruleset *const ruleset,
 	if (!path_beneath_attr.allowed_access)
 		return -ENOMSG;
 
-	/* Checks that allowed_access matches the @ruleset constraints. */
-	mask = landlock_get_raw_fs_access_mask(ruleset, 0);
+	/*
+	 * Checks that allowed_access matches the @ruleset constraints and only
+	 * consists of publicly visible access rights (as opposed to synthetic
+	 * ones).
+	 */
+	mask = landlock_get_raw_fs_access_mask(ruleset, 0) &
+	       LANDLOCK_MASK_PUBLIC_ACCESS_FS;
 	if ((path_beneath_attr.allowed_access | mask) != mask)
 		return -EINVAL;
 
diff --git a/tools/testing/selftests/landlock/base_test.c b/tools/testing/selftests/landlock/base_test.c
index 646f778dfb1e..d292b419ccba 100644
--- a/tools/testing/selftests/landlock/base_test.c
+++ b/tools/testing/selftests/landlock/base_test.c
@@ -75,7 +75,7 @@ TEST(abi_version)
 	const struct landlock_ruleset_attr ruleset_attr = {
 		.handled_access_fs = LANDLOCK_ACCESS_FS_READ_FILE,
 	};
-	ASSERT_EQ(4, landlock_create_ruleset(NULL, 0,
+	ASSERT_EQ(5, landlock_create_ruleset(NULL, 0,
 					     LANDLOCK_CREATE_RULESET_VERSION));
 
 	ASSERT_EQ(-1, landlock_create_ruleset(&ruleset_attr, 0,
diff --git a/tools/testing/selftests/landlock/fs_test.c b/tools/testing/selftests/landlock/fs_test.c
index 2d6d9b43d958..3203f4a5bc85 100644
--- a/tools/testing/selftests/landlock/fs_test.c
+++ b/tools/testing/selftests/landlock/fs_test.c
@@ -527,9 +527,10 @@ TEST_F_FORK(layout1, inval)
 	LANDLOCK_ACCESS_FS_EXECUTE | \
 	LANDLOCK_ACCESS_FS_WRITE_FILE | \
 	LANDLOCK_ACCESS_FS_READ_FILE | \
-	LANDLOCK_ACCESS_FS_TRUNCATE)
+	LANDLOCK_ACCESS_FS_TRUNCATE | \
+	LANDLOCK_ACCESS_FS_IOCTL)
 
-#define ACCESS_LAST LANDLOCK_ACCESS_FS_TRUNCATE
+#define ACCESS_LAST LANDLOCK_ACCESS_FS_IOCTL
 
 #define ACCESS_ALL ( \
 	ACCESS_FILE | \
-- 
2.43.0.687.g38aa6559b0-goog
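
A minimal sketch, not part of the patch, of the mask arithmetic that the
limits.h and syscalls.c hunks above rely on.  It assumes that the uapi
header (not shown in this excerpt) assigns bit 15 to
LANDLOCK_ACCESS_FS_IOCTL; the shortened macro names and both helper
functions are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: the new public right occupies bit 15. */
#define ACCESS_FS_IOCTL		(UINT64_C(1) << 15)

/* Same construction as in limits.h, with shortened names. */
#define LAST_PUBLIC_ACCESS_FS	ACCESS_FS_IOCTL
#define MASK_PUBLIC_ACCESS_FS	((LAST_PUBLIC_ACCESS_FS << 1) - 1)
#define LAST_ACCESS_FS		(LAST_PUBLIC_ACCESS_FS << 2)
#define MASK_ACCESS_FS		((LAST_ACCESS_FS << 1) - 1)

/* True if x has no bits outside of mask: the "(x | mask) != mask" idiom. */
static inline bool within_mask(uint64_t x, uint64_t mask)
{
	return (x | mask) == mask;
}

/* Counts the set bits, standing in for __const_hweight64(). */
static inline unsigned int count_bits(uint64_t x)
{
	unsigned int n = 0;

	for (; x; x &= x - 1)
		n++;
	return n;
}
```

Under that assumption the public mask covers bits 0 to 15 and the full
mask covers bits 0 to 17, so 18 access bits have to be stored.  That no
longer fits into a u16, which is consistent with the access_mask_t
typedef change in ruleset.h.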



* [PATCH v9 2/8] selftests/landlock: Test IOCTL support
  2024-02-09 17:06 [PATCH v9 0/8] Landlock: IOCTL support Günther Noack
  2024-02-09 17:06 ` [PATCH v9 1/8] landlock: Add IOCTL access right Günther Noack
@ 2024-02-09 17:06 ` Günther Noack
  2024-02-09 17:06 ` [PATCH v9 3/8] selftests/landlock: Test IOCTL with memfds Günther Noack
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 50+ messages in thread
From: Günther Noack @ 2024-02-09 17:06 UTC (permalink / raw)
  To: linux-security-module, Mickaël Salaün
  Cc: Jeff Xu, Arnd Bergmann, Jorge Lucangeli Obes, Allen Webb,
	Dmitry Torokhov, Paul Moore, Konstantin Meskhidze,
	Matt Bobrowski, linux-fsdevel, Günther Noack

Exercises Landlock's IOCTL feature in different combinations of
handling and permitting the rights LANDLOCK_ACCESS_FS_IOCTL,
LANDLOCK_ACCESS_FS_READ_FILE, LANDLOCK_ACCESS_FS_WRITE_FILE and
LANDLOCK_ACCESS_FS_READ_DIR, and in different combinations of using
files and directories.
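
As a rough mental model for the simplest variants below (a sketch, not
the kernel implementation): a right that the ruleset does not handle is
never restricted, and a handled right must be granted by a rule.  The
helper name and the bit value of the IOCTL right are assumptions, and
the model deliberately ignores the synthetic IOCTL groups which come
into play once READ_FILE, WRITE_FILE or READ_DIR are handled as well.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FS_EXECUTE	(UINT64_C(1) << 0)
/* Assumption: the IOCTL right proposed by this series occupies bit 15. */
#define FS_IOCTL	(UINT64_C(1) << 15)

/*
 * Hypothetical helper: decides whether an IOCTL governed only by the
 * IOCTL right would be permitted, given the handled rights of the
 * ruleset and the rights granted by rules covering the file.
 */
static inline bool ioctl_permitted(uint64_t handled, uint64_t allowed)
{
	/* Rights which are not handled are implicitly granted. */
	const uint64_t effective = allowed | ~handled;

	return (effective & FS_IOCTL) == FS_IOCTL;
}
```

This reproduces the expectations of the handled_i_allowed_none,
handled_i_allowed_i and unhandled variants; the remaining variants
depend on the group expansion and are not covered by it.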

Signed-off-by: Günther Noack <gnoack@google.com>
---
 tools/testing/selftests/landlock/fs_test.c | 379 ++++++++++++++++++++-
 1 file changed, 376 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/landlock/fs_test.c b/tools/testing/selftests/landlock/fs_test.c
index 3203f4a5bc85..6ff1026c26c2 100644
--- a/tools/testing/selftests/landlock/fs_test.c
+++ b/tools/testing/selftests/landlock/fs_test.c
@@ -23,6 +23,12 @@
 #include <sys/vfs.h>
 #include <unistd.h>
 
+/*
+ * Intentionally included last to work around header conflict.
+ * See https://sourceware.org/glibc/wiki/Synchronizing_Headers.
+ */
+#include <linux/fs.h>
+
 #include "common.h"
 
 #ifndef renameat2
@@ -735,6 +741,9 @@ static int create_ruleset(struct __test_metadata *const _metadata,
 	}
 
 	for (i = 0; rules[i].path; i++) {
+		if (!rules[i].access)
+			continue;
+
 		add_path_beneath(_metadata, ruleset_fd, rules[i].access,
 				 rules[i].path);
 	}
@@ -3443,7 +3452,7 @@ TEST_F_FORK(layout1, truncate_unhandled)
 			      LANDLOCK_ACCESS_FS_WRITE_FILE;
 	int ruleset_fd;
 
-	/* Enable Landlock. */
+	/* Enables Landlock. */
 	ruleset_fd = create_ruleset(_metadata, handled, rules);
 
 	ASSERT_LE(0, ruleset_fd);
@@ -3526,7 +3535,7 @@ TEST_F_FORK(layout1, truncate)
 			      LANDLOCK_ACCESS_FS_TRUNCATE;
 	int ruleset_fd;
 
-	/* Enable Landlock. */
+	/* Enables Landlock. */
 	ruleset_fd = create_ruleset(_metadata, handled, rules);
 
 	ASSERT_LE(0, ruleset_fd);
@@ -3752,7 +3761,7 @@ TEST_F_FORK(ftruncate, open_and_ftruncate)
 	};
 	int fd, ruleset_fd;
 
-	/* Enable Landlock. */
+	/* Enables Landlock. */
 	ruleset_fd = create_ruleset(_metadata, variant->handled, rules);
 	ASSERT_LE(0, ruleset_fd);
 	enforce_ruleset(_metadata, ruleset_fd);
@@ -3829,6 +3838,16 @@ TEST_F_FORK(ftruncate, open_and_ftruncate_in_different_processes)
 	ASSERT_EQ(0, close(socket_fds[1]));
 }
 
+/* Invokes the FS_IOC_GETFLAGS IOCTL and returns its errno or 0. */
+static int test_fs_ioc_getflags_ioctl(int fd)
+{
+	uint32_t flags;
+
+	if (ioctl(fd, FS_IOC_GETFLAGS, &flags) < 0)
+		return errno;
+	return 0;
+}
+
 TEST(memfd_ftruncate)
 {
 	int fd;
@@ -3845,6 +3864,360 @@ TEST(memfd_ftruncate)
 	ASSERT_EQ(0, close(fd));
 }
 
+/* clang-format off */
+FIXTURE(ioctl) {};
+/* clang-format on */
+
+FIXTURE_SETUP(ioctl)
+{
+	prepare_layout(_metadata);
+	create_file(_metadata, file1_s1d1);
+}
+
+FIXTURE_TEARDOWN(ioctl)
+{
+	EXPECT_EQ(0, remove_path(file1_s1d1));
+	cleanup_layout(_metadata);
+}
+
+FIXTURE_VARIANT(ioctl)
+{
+	const __u64 handled;
+	const __u64 allowed;
+	const mode_t open_mode;
+	/*
+	 * These are the expected IOCTL results for a representative IOCTL from
+	 * each of the IOCTL groups.  We only distinguish the 0 and EACCES
+	 * results here, and treat other errors as 0.
+	 */
+	const int expected_fioqsize_result; /* RW */
+	const int expected_fibmap_result; /* RW_FILE */
+	const int expected_fionread_result; /* special */
+};
+
+/* clang-format off */
+FIXTURE_VARIANT_ADD(ioctl, handled_i_allowed_none) {
+	/* clang-format on */
+	.handled = LANDLOCK_ACCESS_FS_IOCTL,
+	.allowed = 0,
+	.open_mode = O_RDWR,
+	/*
+	 * If LANDLOCK_ACCESS_FS_IOCTL is handled, but nothing else is
+	 * explicitly handled, almost all IOCTL commands will be governed by the
+	 * LANDLOCK_ACCESS_FS_IOCTL right.  Files can be opened, but IOCTLs are
+	 * disallowed.
+	 */
+	.expected_fioqsize_result = EACCES,
+	.expected_fibmap_result = EACCES,
+	.expected_fionread_result = EACCES,
+};
+
+/* clang-format off */
+FIXTURE_VARIANT_ADD(ioctl, handled_i_allowed_i) {
+	/* clang-format on */
+	.handled = LANDLOCK_ACCESS_FS_IOCTL,
+	.allowed = LANDLOCK_ACCESS_FS_IOCTL,
+	.open_mode = O_RDWR,
+	.expected_fioqsize_result = 0,
+	.expected_fibmap_result = 0,
+	.expected_fionread_result = 0,
+};
+
+/* clang-format off */
+FIXTURE_VARIANT_ADD(ioctl, unhandled) {
+	/* clang-format on */
+	.handled = LANDLOCK_ACCESS_FS_EXECUTE,
+	.allowed = LANDLOCK_ACCESS_FS_EXECUTE,
+	.open_mode = O_RDWR,
+	.expected_fioqsize_result = 0,
+	.expected_fibmap_result = 0,
+	.expected_fionread_result = 0,
+};
+
+/* clang-format off */
+FIXTURE_VARIANT_ADD(ioctl, handled_rwd_allowed_r) {
+	/* clang-format on */
+	.handled = LANDLOCK_ACCESS_FS_READ_FILE |
+		   LANDLOCK_ACCESS_FS_WRITE_FILE | LANDLOCK_ACCESS_FS_READ_DIR,
+	.allowed = LANDLOCK_ACCESS_FS_READ_FILE,
+	.open_mode = O_RDONLY,
+	/* If LANDLOCK_ACCESS_FS_IOCTL is not handled, all IOCTLs work. */
+	.expected_fioqsize_result = 0,
+	.expected_fibmap_result = 0,
+	.expected_fionread_result = 0,
+};
+
+/* clang-format off */
+FIXTURE_VARIANT_ADD(ioctl, handled_rwd_allowed_w) {
+	/* clang-format on */
+	.handled = LANDLOCK_ACCESS_FS_READ_FILE |
+		   LANDLOCK_ACCESS_FS_WRITE_FILE | LANDLOCK_ACCESS_FS_READ_DIR,
+	.allowed = LANDLOCK_ACCESS_FS_WRITE_FILE,
+	.open_mode = O_WRONLY,
+	/* If LANDLOCK_ACCESS_FS_IOCTL is not handled, all IOCTLs work. */
+	.expected_fioqsize_result = 0,
+	.expected_fibmap_result = 0,
+	.expected_fionread_result = 0,
+};
+
+/* clang-format off */
+FIXTURE_VARIANT_ADD(ioctl, handled_ri_allowed_r) {
+	/* clang-format on */
+	.handled = LANDLOCK_ACCESS_FS_READ_FILE | LANDLOCK_ACCESS_FS_IOCTL,
+	.allowed = LANDLOCK_ACCESS_FS_READ_FILE,
+	.open_mode = O_RDONLY,
+	.expected_fioqsize_result = 0,
+	.expected_fibmap_result = 0,
+	.expected_fionread_result = 0,
+};
+
+/* clang-format off */
+FIXTURE_VARIANT_ADD(ioctl, handled_wi_allowed_w) {
+	/* clang-format on */
+	.handled = LANDLOCK_ACCESS_FS_WRITE_FILE | LANDLOCK_ACCESS_FS_IOCTL,
+	.allowed = LANDLOCK_ACCESS_FS_WRITE_FILE,
+	.open_mode = O_WRONLY,
+	.expected_fioqsize_result = 0,
+	.expected_fibmap_result = 0,
+	.expected_fionread_result = 0,
+};
+
+/* clang-format off */
+FIXTURE_VARIANT_ADD(ioctl, handled_di_allowed_d) {
+	/* clang-format on */
+	.handled = LANDLOCK_ACCESS_FS_READ_DIR | LANDLOCK_ACCESS_FS_IOCTL,
+	.allowed = LANDLOCK_ACCESS_FS_READ_DIR,
+	.open_mode = O_RDWR,
+	.expected_fioqsize_result = 0,
+	.expected_fibmap_result = EACCES,
+	.expected_fionread_result = 0,
+};
+
+/* clang-format off */
+FIXTURE_VARIANT_ADD(ioctl, handled_rwi_allowed_rw) {
+	/* clang-format on */
+	.handled = LANDLOCK_ACCESS_FS_READ_FILE |
+		   LANDLOCK_ACCESS_FS_WRITE_FILE | LANDLOCK_ACCESS_FS_IOCTL,
+	.allowed = LANDLOCK_ACCESS_FS_READ_FILE | LANDLOCK_ACCESS_FS_WRITE_FILE,
+	.open_mode = O_RDWR,
+	.expected_fioqsize_result = 0,
+	.expected_fibmap_result = 0,
+	.expected_fionread_result = 0,
+};
+
+/* clang-format off */
+FIXTURE_VARIANT_ADD(ioctl, handled_rwi_allowed_r) {
+	/* clang-format on */
+	.handled = LANDLOCK_ACCESS_FS_READ_FILE |
+		   LANDLOCK_ACCESS_FS_WRITE_FILE | LANDLOCK_ACCESS_FS_IOCTL,
+	.allowed = LANDLOCK_ACCESS_FS_READ_FILE,
+	.open_mode = O_RDONLY,
+	.expected_fioqsize_result = 0,
+	.expected_fibmap_result = 0,
+	.expected_fionread_result = 0,
+};
+
+/* clang-format off */
+FIXTURE_VARIANT_ADD(ioctl, handled_rwi_allowed_ri) {
+	/* clang-format on */
+	.handled = LANDLOCK_ACCESS_FS_READ_FILE |
+		   LANDLOCK_ACCESS_FS_WRITE_FILE | LANDLOCK_ACCESS_FS_IOCTL,
+	.allowed = LANDLOCK_ACCESS_FS_READ_FILE | LANDLOCK_ACCESS_FS_IOCTL,
+	.open_mode = O_RDONLY,
+	.expected_fioqsize_result = 0,
+	.expected_fibmap_result = 0,
+	.expected_fionread_result = 0,
+};
+
+/* clang-format off */
+FIXTURE_VARIANT_ADD(ioctl, handled_rwi_allowed_w) {
+	/* clang-format on */
+	.handled = LANDLOCK_ACCESS_FS_READ_FILE |
+		   LANDLOCK_ACCESS_FS_WRITE_FILE | LANDLOCK_ACCESS_FS_IOCTL,
+	.allowed = LANDLOCK_ACCESS_FS_WRITE_FILE,
+	.open_mode = O_WRONLY,
+	.expected_fioqsize_result = 0,
+	.expected_fibmap_result = 0,
+	.expected_fionread_result = 0,
+};
+
+/* clang-format off */
+FIXTURE_VARIANT_ADD(ioctl, handled_rwi_allowed_wi) {
+	/* clang-format on */
+	.handled = LANDLOCK_ACCESS_FS_READ_FILE |
+		   LANDLOCK_ACCESS_FS_WRITE_FILE | LANDLOCK_ACCESS_FS_IOCTL,
+	.allowed = LANDLOCK_ACCESS_FS_WRITE_FILE | LANDLOCK_ACCESS_FS_IOCTL,
+	.open_mode = O_WRONLY,
+	.expected_fioqsize_result = 0,
+	.expected_fibmap_result = 0,
+	.expected_fionread_result = 0,
+};
+
+static int test_fioqsize_ioctl(int fd)
+{
+	size_t sz;
+
+	if (ioctl(fd, FIOQSIZE, &sz) < 0)
+		return errno;
+	return 0;
+}
+
+static int test_fibmap_ioctl(int fd)
+{
+	int blk = 0;
+
+	/*
+	 * We only want to distinguish here whether Landlock already caught it,
+	 * so we treat anything but EACCES as success.  (It commonly returns
+	 * EPERM when missing CAP_SYS_RAWIO.)
+	 */
+	if (ioctl(fd, FIBMAP, &blk) < 0 && errno == EACCES)
+		return errno;
+	return 0;
+}
+
+static int test_fionread_ioctl(int fd)
+{
+	size_t sz = 0;
+
+	if (ioctl(fd, FIONREAD, &sz) < 0 && errno == EACCES)
+		return errno;
+	return 0;
+}
+
+TEST_F_FORK(ioctl, handle_dir_access_file)
+{
+	const int flag = 0;
+	const struct rule rules[] = {
+		{
+			.path = dir_s1d1,
+			.access = variant->allowed,
+		},
+		{},
+	};
+	int file_fd, ruleset_fd;
+
+	/* Enables Landlock. */
+	ruleset_fd = create_ruleset(_metadata, variant->handled, rules);
+	ASSERT_LE(0, ruleset_fd);
+	enforce_ruleset(_metadata, ruleset_fd);
+	ASSERT_EQ(0, close(ruleset_fd));
+
+	file_fd = open(file1_s1d1, variant->open_mode);
+	ASSERT_LE(0, file_fd);
+
+	/*
+	 * Checks that IOCTL commands in each IOCTL group return the expected
+	 * errors.
+	 */
+	EXPECT_EQ(variant->expected_fioqsize_result,
+		  test_fioqsize_ioctl(file_fd));
+	EXPECT_EQ(variant->expected_fibmap_result, test_fibmap_ioctl(file_fd));
+	EXPECT_EQ(variant->expected_fionread_result,
+		  test_fionread_ioctl(file_fd));
+
+	/* Checks that unrestrictable commands are unrestricted. */
+	EXPECT_EQ(0, ioctl(file_fd, FIOCLEX));
+	EXPECT_EQ(0, ioctl(file_fd, FIONCLEX));
+	EXPECT_EQ(0, ioctl(file_fd, FIONBIO, &flag));
+	EXPECT_EQ(0, ioctl(file_fd, FIOASYNC, &flag));
+
+	ASSERT_EQ(0, close(file_fd));
+}
+
+TEST_F_FORK(ioctl, handle_dir_access_dir)
+{
+	const char *const path = dir_s1d1;
+	const int flag = 0;
+	const struct rule rules[] = {
+		{
+			.path = path,
+			.access = variant->allowed,
+		},
+		{},
+	};
+	int dir_fd, ruleset_fd;
+
+	/* Enables Landlock. */
+	ruleset_fd = create_ruleset(_metadata, variant->handled, rules);
+	ASSERT_LE(0, ruleset_fd);
+	enforce_ruleset(_metadata, ruleset_fd);
+	ASSERT_EQ(0, close(ruleset_fd));
+
+	/*
+	 * Ignores variant->open_mode for this test, as we intend to open a
+	 * directory.  If the directory can not be opened, the variant is
+	 * infeasible to test with an opened directory.
+	 */
+	dir_fd = open(path, O_RDONLY);
+	if (dir_fd < 0)
+		return;
+
+	/*
+	 * Checks that IOCTL commands in each IOCTL group return the expected
+	 * errors.
+	 */
+	EXPECT_EQ(variant->expected_fioqsize_result,
+		  test_fioqsize_ioctl(dir_fd));
+	EXPECT_EQ(variant->expected_fibmap_result, test_fibmap_ioctl(dir_fd));
+	EXPECT_EQ(variant->expected_fionread_result,
+		  test_fionread_ioctl(dir_fd));
+
+	/* Checks that unrestrictable commands are unrestricted. */
+	EXPECT_EQ(0, ioctl(dir_fd, FIOCLEX));
+	EXPECT_EQ(0, ioctl(dir_fd, FIONCLEX));
+	EXPECT_EQ(0, ioctl(dir_fd, FIONBIO, &flag));
+	EXPECT_EQ(0, ioctl(dir_fd, FIOASYNC, &flag));
+
+	ASSERT_EQ(0, close(dir_fd));
+}
+
+TEST_F_FORK(ioctl, handle_file_access_file)
+{
+	const char *const path = file1_s1d1;
+	const int flag = 0;
+	const struct rule rules[] = {
+		{
+			.path = path,
+			.access = variant->allowed,
+		},
+		{},
+	};
+	int file_fd, ruleset_fd;
+
+	if (variant->allowed & LANDLOCK_ACCESS_FS_READ_DIR) {
+		SKIP(return, "LANDLOCK_ACCESS_FS_READ_DIR "
+			     "can not be granted on files");
+	}
+
+	/* Enables Landlock. */
+	ruleset_fd = create_ruleset(_metadata, variant->handled, rules);
+	ASSERT_LE(0, ruleset_fd);
+	enforce_ruleset(_metadata, ruleset_fd);
+	ASSERT_EQ(0, close(ruleset_fd));
+
+	file_fd = open(path, variant->open_mode);
+	ASSERT_LE(0, file_fd);
+
+	/*
+	 * Checks that IOCTL commands in each IOCTL group return the expected
+	 * errors.
+	 */
+	EXPECT_EQ(variant->expected_fioqsize_result,
+		  test_fioqsize_ioctl(file_fd));
+	EXPECT_EQ(variant->expected_fibmap_result, test_fibmap_ioctl(file_fd));
+	EXPECT_EQ(variant->expected_fionread_result,
+		  test_fionread_ioctl(file_fd));
+
+	/* Checks that unrestrictable commands are unrestricted. */
+	EXPECT_EQ(0, ioctl(file_fd, FIOCLEX));
+	EXPECT_EQ(0, ioctl(file_fd, FIONCLEX));
+	EXPECT_EQ(0, ioctl(file_fd, FIONBIO, &flag));
+	EXPECT_EQ(0, ioctl(file_fd, FIOASYNC, &flag));
+
+	ASSERT_EQ(0, close(file_fd));
+}
+
 /* clang-format off */
 FIXTURE(layout1_bind) {};
 /* clang-format on */
-- 
2.43.0.687.g38aa6559b0-goog



* [PATCH v9 3/8] selftests/landlock: Test IOCTL with memfds
  2024-02-09 17:06 [PATCH v9 0/8] Landlock: IOCTL support Günther Noack
  2024-02-09 17:06 ` [PATCH v9 1/8] landlock: Add IOCTL access right Günther Noack
  2024-02-09 17:06 ` [PATCH v9 2/8] selftests/landlock: Test IOCTL support Günther Noack
@ 2024-02-09 17:06 ` Günther Noack
  2024-02-09 17:06 ` [PATCH v9 4/8] selftests/landlock: Test ioctl(2) and ftruncate(2) with open(O_PATH) Günther Noack
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 50+ messages in thread
From: Günther Noack @ 2024-02-09 17:06 UTC (permalink / raw)
  To: linux-security-module, Mickaël Salaün
  Cc: Jeff Xu, Arnd Bergmann, Jorge Lucangeli Obes, Allen Webb,
	Dmitry Torokhov, Paul Moore, Konstantin Meskhidze,
	Matt Bobrowski, linux-fsdevel, Günther Noack

Because the LANDLOCK_ACCESS_FS_IOCTL right is associated with the
opened file during open(2), IOCTLs are expected to keep working on
files which are created by means other than open(2), such as
memfd_create(2).

Signed-off-by: Günther Noack <gnoack@google.com>
---
 tools/testing/selftests/landlock/fs_test.c | 36 ++++++++++++++++------
 1 file changed, 27 insertions(+), 9 deletions(-)

diff --git a/tools/testing/selftests/landlock/fs_test.c b/tools/testing/selftests/landlock/fs_test.c
index 6ff1026c26c2..aa4e5524b22f 100644
--- a/tools/testing/selftests/landlock/fs_test.c
+++ b/tools/testing/selftests/landlock/fs_test.c
@@ -3848,20 +3848,38 @@ static int test_fs_ioc_getflags_ioctl(int fd)
 	return 0;
 }
 
-TEST(memfd_ftruncate)
+TEST(memfd_ftruncate_and_ioctl)
 {
-	int fd;
-
-	fd = memfd_create("name", MFD_CLOEXEC);
-	ASSERT_LE(0, fd);
+	const struct landlock_ruleset_attr attr = {
+		.handled_access_fs = ACCESS_ALL,
+	};
+	int ruleset_fd, fd, i;
 
 	/*
-	 * Checks that ftruncate is permitted on file descriptors that are
-	 * created in ways other than open(2).
+	 * We exercise the same test both with and without Landlock enabled, to
+	 * ensure that it behaves the same in both cases.
 	 */
-	EXPECT_EQ(0, test_ftruncate(fd));
+	for (i = 0; i < 2; i++) {
+		/* Creates a new memfd. */
+		fd = memfd_create("name", MFD_CLOEXEC);
+		ASSERT_LE(0, fd);
 
-	ASSERT_EQ(0, close(fd));
+		/*
+		 * Checks that operations associated with the opened file
+		 * (ftruncate, ioctl) are permitted on file descriptors that are
+		 * created in ways other than open(2).
+		 */
+		EXPECT_EQ(0, test_ftruncate(fd));
+		EXPECT_EQ(0, test_fs_ioc_getflags_ioctl(fd));
+
+		ASSERT_EQ(0, close(fd));
+
+		/* Enables Landlock. */
+		ruleset_fd = landlock_create_ruleset(&attr, sizeof(attr), 0);
+		ASSERT_LE(0, ruleset_fd);
+		enforce_ruleset(_metadata, ruleset_fd);
+		ASSERT_EQ(0, close(ruleset_fd));
+	}
 }
 
 /* clang-format off */
-- 
2.43.0.687.g38aa6559b0-goog



* [PATCH v9 4/8] selftests/landlock: Test ioctl(2) and ftruncate(2) with open(O_PATH)
  2024-02-09 17:06 [PATCH v9 0/8] Landlock: IOCTL support Günther Noack
                   ` (2 preceding siblings ...)
  2024-02-09 17:06 ` [PATCH v9 3/8] selftests/landlock: Test IOCTL with memfds Günther Noack
@ 2024-02-09 17:06 ` Günther Noack
  2024-02-09 17:06 ` [PATCH v9 5/8] selftests/landlock: Test IOCTLs on named pipes Günther Noack
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 50+ messages in thread
From: Günther Noack @ 2024-02-09 17:06 UTC (permalink / raw)
  To: linux-security-module, Mickaël Salaün
  Cc: Jeff Xu, Arnd Bergmann, Jorge Lucangeli Obes, Allen Webb,
	Dmitry Torokhov, Paul Moore, Konstantin Meskhidze,
	Matt Bobrowski, linux-fsdevel, Günther Noack

ioctl(2) and ftruncate(2) operations on files opened with O_PATH
should always return EBADF, independent of the
LANDLOCK_ACCESS_FS_TRUNCATE and LANDLOCK_ACCESS_FS_IOCTL access rights
in that file hierarchy.
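
The baseline behaviour can be observed without any Landlock ruleset.
The following is a minimal sketch, not part of the patch; the helper
names are made up, and FIONREAD merely stands in for an arbitrary IOCTL
command.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>

#ifndef O_PATH
/* Fallback: generic value from asm-generic; a few architectures differ. */
#define O_PATH 010000000
#endif

/* Returns the errno of ioctl(2) on an O_PATH fd, or -1 if open(2) fails. */
static inline int o_path_ioctl_errno(const char *path)
{
	int fd, n = 0, err = 0;

	fd = open(path, O_PATH | O_CLOEXEC);
	if (fd < 0)
		return -1;
	if (ioctl(fd, FIONREAD, &n) < 0)
		err = errno;
	close(fd);
	return err;
}

/* Returns the errno of ftruncate(2) on an O_PATH fd, or -1 if open(2) fails. */
static inline int o_path_ftruncate_errno(const char *path)
{
	int fd, err = 0;

	fd = open(path, O_PATH | O_CLOEXEC);
	if (fd < 0)
		return -1;
	if (ftruncate(fd, 0) < 0)
		err = errno;
	close(fd);
	return err;
}
```

Calling both helpers on "/" should yield EBADF in each case; the test
added below checks that this stays the same once Landlock is enforced.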

Suggested-by: Mickaël Salaün <mic@digikod.net>
Signed-off-by: Günther Noack <gnoack@google.com>
---
 tools/testing/selftests/landlock/fs_test.c | 40 ++++++++++++++++++++++
 1 file changed, 40 insertions(+)

diff --git a/tools/testing/selftests/landlock/fs_test.c b/tools/testing/selftests/landlock/fs_test.c
index aa4e5524b22f..9e9b828a898b 100644
--- a/tools/testing/selftests/landlock/fs_test.c
+++ b/tools/testing/selftests/landlock/fs_test.c
@@ -3882,6 +3882,46 @@ TEST(memfd_ftruncate_and_ioctl)
 	}
 }
 
+TEST_F_FORK(layout1, o_path_ftruncate_and_ioctl)
+{
+	const struct landlock_ruleset_attr attr = {
+		.handled_access_fs = ACCESS_ALL,
+	};
+	int ruleset_fd, fd;
+
+	/*
+	 * Checks that for files opened with O_PATH, both ioctl(2) and
+	 * ftruncate(2) yield EBADF, as it is documented in open(2) for the
+	 * O_PATH flag.
+	 */
+	fd = open(dir_s1d1, O_PATH | O_CLOEXEC);
+	ASSERT_LE(0, fd);
+
+	EXPECT_EQ(EBADF, test_ftruncate(fd));
+	EXPECT_EQ(EBADF, test_fs_ioc_getflags_ioctl(fd));
+
+	ASSERT_EQ(0, close(fd));
+
+	/* Enables Landlock. */
+	ruleset_fd = landlock_create_ruleset(&attr, sizeof(attr), 0);
+	ASSERT_LE(0, ruleset_fd);
+	enforce_ruleset(_metadata, ruleset_fd);
+	ASSERT_EQ(0, close(ruleset_fd));
+
+	/*
+	 * Checks that after enabling Landlock,
+	 * - the file can still be opened with O_PATH
+	 * - both ioctl and ftruncate still yield EBADF (not EACCES).
+	 */
+	fd = open(dir_s1d1, O_PATH | O_CLOEXEC);
+	ASSERT_LE(0, fd);
+
+	EXPECT_EQ(EBADF, test_ftruncate(fd));
+	EXPECT_EQ(EBADF, test_fs_ioc_getflags_ioctl(fd));
+
+	ASSERT_EQ(0, close(fd));
+}
+
 /* clang-format off */
 FIXTURE(ioctl) {};
 /* clang-format on */
-- 
2.43.0.687.g38aa6559b0-goog



* [PATCH v9 5/8] selftests/landlock: Test IOCTLs on named pipes
  2024-02-09 17:06 [PATCH v9 0/8] Landlock: IOCTL support Günther Noack
                   ` (3 preceding siblings ...)
  2024-02-09 17:06 ` [PATCH v9 4/8] selftests/landlock: Test ioctl(2) and ftruncate(2) with open(O_PATH) Günther Noack
@ 2024-02-09 17:06 ` Günther Noack
  2024-02-09 17:06 ` [PATCH v9 6/8] selftests/landlock: Check IOCTL restrictions for named UNIX domain sockets Günther Noack
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 50+ messages in thread
From: Günther Noack @ 2024-02-09 17:06 UTC (permalink / raw)
  To: linux-security-module, Mickaël Salaün
  Cc: Jeff Xu, Arnd Bergmann, Jorge Lucangeli Obes, Allen Webb,
	Dmitry Torokhov, Paul Moore, Konstantin Meskhidze,
	Matt Bobrowski, linux-fsdevel, Günther Noack

Named pipes should behave like pipes created with pipe(2),
so we don't want to restrict IOCTLs on them.
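
For reference, this is what FIONREAD does on an anonymous pipe when no
Landlock policy is in place.  It is a minimal sketch, not part of the
patch, and the helper name is made up for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <sys/ioctl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Writes len bytes into a fresh anonymous pipe and returns the number of
 * unread bytes which FIONREAD reports on the read end, or a negative
 * errno value on failure.
 */
static inline int pipe_unread_bytes(const char *buf, size_t len)
{
	int fds[2], n = 0, ret;

	if (pipe(fds) < 0)
		return -errno;

	if (write(fds[1], buf, len) != (ssize_t)len)
		ret = -EIO;
	else if (ioctl(fds[0], FIONREAD, &n) < 0)
		ret = -errno;
	else
		ret = n;

	close(fds[0]);
	close(fds[1]);
	return ret;
}
```

A named pipe opened with open(2) is expected to answer the same query
in the same way, also when LANDLOCK_ACCESS_FS_IOCTL is handled and not
granted.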

Suggested-by: Mickaël Salaün <mic@digikod.net>
Signed-off-by: Günther Noack <gnoack@google.com>
---
 tools/testing/selftests/landlock/fs_test.c | 70 +++++++++++++++++++---
 1 file changed, 61 insertions(+), 9 deletions(-)

diff --git a/tools/testing/selftests/landlock/fs_test.c b/tools/testing/selftests/landlock/fs_test.c
index 9e9b828a898b..ae8b8b412828 100644
--- a/tools/testing/selftests/landlock/fs_test.c
+++ b/tools/testing/selftests/landlock/fs_test.c
@@ -3922,6 +3922,67 @@ TEST_F_FORK(layout1, o_path_ftruncate_and_ioctl)
 	ASSERT_EQ(0, close(fd));
 }
 
+static int test_fionread_ioctl(int fd)
+{
+	size_t sz = 0;
+
+	if (ioctl(fd, FIONREAD, &sz) < 0 && errno == EACCES)
+		return errno;
+	return 0;
+}
+
+/*
+ * For named pipes, the same rules should apply as for anonymous pipes.
+ *
+ * That means, if the pipe is opened, we should permit the IOCTLs which are
+ * implemented by pipefifo_fops (fs/pipe.c), even if they were otherwise
+ * forbidden by Landlock policy.
+ */
+TEST_F_FORK(layout1, named_pipe_ioctl)
+{
+	pid_t child_pid;
+	int fd, ruleset_fd;
+	const char *const path = file1_s1d1;
+	const struct landlock_ruleset_attr attr = {
+		.handled_access_fs = LANDLOCK_ACCESS_FS_IOCTL,
+	};
+
+	ASSERT_EQ(0, unlink(path));
+	ASSERT_EQ(0, mkfifo(path, 0600));
+
+	/* Enables Landlock. */
+	ruleset_fd = landlock_create_ruleset(&attr, sizeof(attr), 0);
+	ASSERT_LE(0, ruleset_fd);
+	enforce_ruleset(_metadata, ruleset_fd);
+	ASSERT_EQ(0, close(ruleset_fd));
+
+	/* The child process opens the pipe for writing. */
+	child_pid = fork();
+	ASSERT_NE(-1, child_pid);
+	if (child_pid == 0) {
+		fd = open(path, O_WRONLY);
+		close(fd);
+		exit(0);
+	}
+
+	fd = open(path, O_RDONLY);
+	ASSERT_LE(0, fd);
+
+	/* FIONREAD is implemented by pipefifo_fops. */
+	EXPECT_EQ(0, test_fionread_ioctl(fd));
+
+	ASSERT_EQ(0, close(fd));
+	ASSERT_EQ(0, unlink(path));
+
+	/* Under the same conditions, FIONREAD on a regular file fails. */
+	fd = open(file2_s1d1, O_RDONLY);
+	ASSERT_LE(0, fd);
+	EXPECT_EQ(EACCES, test_fionread_ioctl(fd));
+	ASSERT_EQ(0, close(fd));
+
+	ASSERT_EQ(child_pid, waitpid(child_pid, NULL, 0));
+}
+
 /* clang-format off */
 FIXTURE(ioctl) {};
 /* clang-format on */
@@ -4134,15 +4195,6 @@ static int test_fibmap_ioctl(int fd)
 	return 0;
 }
 
-static int test_fionread_ioctl(int fd)
-{
-	size_t sz = 0;
-
-	if (ioctl(fd, FIONREAD, &sz) < 0 && errno == EACCES)
-		return errno;
-	return 0;
-}
-
 TEST_F_FORK(ioctl, handle_dir_access_file)
 {
 	const int flag = 0;
-- 
2.43.0.687.g38aa6559b0-goog



* [PATCH v9 6/8] selftests/landlock: Check IOCTL restrictions for named UNIX domain sockets
  2024-02-09 17:06 [PATCH v9 0/8] Landlock: IOCTL support Günther Noack
                   ` (4 preceding siblings ...)
  2024-02-09 17:06 ` [PATCH v9 5/8] selftests/landlock: Test IOCTLs on named pipes Günther Noack
@ 2024-02-09 17:06 ` Günther Noack
  2024-02-09 17:06 ` [PATCH v9 7/8] samples/landlock: Add support for LANDLOCK_ACCESS_FS_IOCTL Günther Noack
  2024-02-09 17:06 ` [PATCH v9 8/8] landlock: Document IOCTL support Günther Noack
  7 siblings, 0 replies; 50+ messages in thread
From: Günther Noack @ 2024-02-09 17:06 UTC (permalink / raw)
  To: linux-security-module, Mickaël Salaün
  Cc: Jeff Xu, Arnd Bergmann, Jorge Lucangeli Obes, Allen Webb,
	Dmitry Torokhov, Paul Moore, Konstantin Meskhidze,
	Matt Bobrowski, linux-fsdevel, Günther Noack

Suggested-by: Mickaël Salaün <mic@digikod.net>
Signed-off-by: Günther Noack <gnoack@google.com>
---
 tools/testing/selftests/landlock/fs_test.c | 53 ++++++++++++++++++++++
 1 file changed, 53 insertions(+)

diff --git a/tools/testing/selftests/landlock/fs_test.c b/tools/testing/selftests/landlock/fs_test.c
index ae8b8b412828..59b57ff6915b 100644
--- a/tools/testing/selftests/landlock/fs_test.c
+++ b/tools/testing/selftests/landlock/fs_test.c
@@ -18,8 +18,10 @@
 #include <sys/mount.h>
 #include <sys/prctl.h>
 #include <sys/sendfile.h>
+#include <sys/socket.h>
 #include <sys/stat.h>
 #include <sys/sysmacros.h>
+#include <sys/un.h>
 #include <sys/vfs.h>
 #include <unistd.h>
 
@@ -3983,6 +3985,57 @@ TEST_F_FORK(layout1, named_pipe_ioctl)
 	ASSERT_EQ(child_pid, waitpid(child_pid, NULL, 0));
 }
 
+/* For named UNIX domain sockets, no IOCTL restrictions apply. */
+TEST_F_FORK(layout1, named_unix_domain_socket_ioctl)
+{
+	const char *const path = file1_s1d1;
+	int srv_fd, cli_fd, ruleset_fd;
+	socklen_t size;
+	struct sockaddr_un srv_un, cli_un;
+	const struct landlock_ruleset_attr attr = {
+		.handled_access_fs = LANDLOCK_ACCESS_FS_IOCTL,
+	};
+
+	/* Sets up a server. */
+	srv_un.sun_family = AF_UNIX;
+	strncpy(srv_un.sun_path, path, sizeof(srv_un.sun_path));
+
+	ASSERT_EQ(0, unlink(path));
+	ASSERT_LE(0, (srv_fd = socket(AF_UNIX, SOCK_STREAM, 0)));
+
+	size = offsetof(struct sockaddr_un, sun_path) + strlen(srv_un.sun_path);
+	ASSERT_EQ(0, bind(srv_fd, (struct sockaddr *)&srv_un, size));
+	ASSERT_EQ(0, listen(srv_fd, 10 /* qlen */));
+
+	/* Enables Landlock. */
+	ruleset_fd = landlock_create_ruleset(&attr, sizeof(attr), 0);
+	ASSERT_LE(0, ruleset_fd);
+	enforce_ruleset(_metadata, ruleset_fd);
+	ASSERT_EQ(0, close(ruleset_fd));
+
+	/* Sets up a client connection to it. */
+	cli_un.sun_family = AF_UNIX;
+	snprintf(cli_un.sun_path, sizeof(cli_un.sun_path), "%s%ld", path,
+		 (long)getpid());
+
+	ASSERT_LE(0, (cli_fd = socket(AF_UNIX, SOCK_STREAM, 0)));
+
+	size = offsetof(struct sockaddr_un, sun_path) + strlen(cli_un.sun_path);
+	ASSERT_EQ(0, bind(cli_fd, (struct sockaddr *)&cli_un, size));
+
+	bzero(&cli_un, sizeof(cli_un));
+	cli_un.sun_family = AF_UNIX;
+	strncpy(cli_un.sun_path, path, sizeof(cli_un.sun_path));
+	size = offsetof(struct sockaddr_un, sun_path) + strlen(cli_un.sun_path);
+
+	ASSERT_EQ(0, connect(cli_fd, (struct sockaddr *)&cli_un, size));
+
+	/* FIONREAD and other IOCTLs should not be forbidden. */
+	EXPECT_EQ(0, test_fionread_ioctl(cli_fd));
+
+	ASSERT_EQ(0, close(cli_fd));
+}
+
 /* clang-format off */
 FIXTURE(ioctl) {};
 /* clang-format on */
-- 
2.43.0.687.g38aa6559b0-goog



* [PATCH v9 7/8] samples/landlock: Add support for LANDLOCK_ACCESS_FS_IOCTL
  2024-02-09 17:06 [PATCH v9 0/8] Landlock: IOCTL support Günther Noack
                   ` (5 preceding siblings ...)
  2024-02-09 17:06 ` [PATCH v9 6/8] selftests/landlock: Check IOCTL restrictions for named UNIX domain sockets Günther Noack
@ 2024-02-09 17:06 ` Günther Noack
  2024-02-09 17:06 ` [PATCH v9 8/8] landlock: Document IOCTL support Günther Noack
  7 siblings, 0 replies; 50+ messages in thread
From: Günther Noack @ 2024-02-09 17:06 UTC (permalink / raw)
  To: linux-security-module, Mickaël Salaün
  Cc: Jeff Xu, Arnd Bergmann, Jorge Lucangeli Obes, Allen Webb,
	Dmitry Torokhov, Paul Moore, Konstantin Meskhidze,
	Matt Bobrowski, linux-fsdevel, Günther Noack

Add ioctl support to the Landlock sample tool.

The ioctl right is grouped with the read-write rights in the sample
tool, as some ioctl requests provide features that mutate state.

Signed-off-by: Günther Noack <gnoack@google.com>
---
 samples/landlock/sandboxer.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/samples/landlock/sandboxer.c b/samples/landlock/sandboxer.c
index 08596c0ef070..d7323e5526be 100644
--- a/samples/landlock/sandboxer.c
+++ b/samples/landlock/sandboxer.c
@@ -81,7 +81,8 @@ static int parse_path(char *env_path, const char ***const path_list)
 	LANDLOCK_ACCESS_FS_EXECUTE | \
 	LANDLOCK_ACCESS_FS_WRITE_FILE | \
 	LANDLOCK_ACCESS_FS_READ_FILE | \
-	LANDLOCK_ACCESS_FS_TRUNCATE)
+	LANDLOCK_ACCESS_FS_TRUNCATE | \
+	LANDLOCK_ACCESS_FS_IOCTL)
 
 /* clang-format on */
 
@@ -199,11 +200,12 @@ static int populate_ruleset_net(const char *const env_var, const int ruleset_fd,
 	LANDLOCK_ACCESS_FS_MAKE_BLOCK | \
 	LANDLOCK_ACCESS_FS_MAKE_SYM | \
 	LANDLOCK_ACCESS_FS_REFER | \
-	LANDLOCK_ACCESS_FS_TRUNCATE)
+	LANDLOCK_ACCESS_FS_TRUNCATE | \
+	LANDLOCK_ACCESS_FS_IOCTL)
 
 /* clang-format on */
 
-#define LANDLOCK_ABI_LAST 4
+#define LANDLOCK_ABI_LAST 5
 
 int main(const int argc, char *const argv[], char *const *const envp)
 {
@@ -317,6 +319,11 @@ int main(const int argc, char *const argv[], char *const *const envp)
 		ruleset_attr.handled_access_net &=
 			~(LANDLOCK_ACCESS_NET_BIND_TCP |
 			  LANDLOCK_ACCESS_NET_CONNECT_TCP);
+		__attribute__((fallthrough));
+	case 4:
+		/* Removes LANDLOCK_ACCESS_FS_IOCTL for ABI < 5 */
+		ruleset_attr.handled_access_fs &= ~LANDLOCK_ACCESS_FS_IOCTL;
+
 		fprintf(stderr,
 			"Hint: You should update the running kernel "
 			"to leverage Landlock features "
-- 
2.43.0.687.g38aa6559b0-goog



* [PATCH v9 8/8] landlock: Document IOCTL support
  2024-02-09 17:06 [PATCH v9 0/8] Landlock: IOCTL support Günther Noack
                   ` (6 preceding siblings ...)
  2024-02-09 17:06 ` [PATCH v9 7/8] samples/landlock: Add support for LANDLOCK_ACCESS_FS_IOCTL Günther Noack
@ 2024-02-09 17:06 ` Günther Noack
  7 siblings, 0 replies; 50+ messages in thread
From: Günther Noack @ 2024-02-09 17:06 UTC (permalink / raw)
  To: linux-security-module, Mickaël Salaün
  Cc: Jeff Xu, Arnd Bergmann, Jorge Lucangeli Obes, Allen Webb,
	Dmitry Torokhov, Paul Moore, Konstantin Meskhidze,
	Matt Bobrowski, linux-fsdevel, Günther Noack

In the paragraph above the fallback logic, use the shorter phrasing
from the landlock(7) man page.

Signed-off-by: Günther Noack <gnoack@google.com>
---
 Documentation/userspace-api/landlock.rst | 121 ++++++++++++++++++++---
 1 file changed, 106 insertions(+), 15 deletions(-)

diff --git a/Documentation/userspace-api/landlock.rst b/Documentation/userspace-api/landlock.rst
index 2e3822677061..a6e55912139b 100644
--- a/Documentation/userspace-api/landlock.rst
+++ b/Documentation/userspace-api/landlock.rst
@@ -75,7 +75,8 @@ to be explicit about the denied-by-default access rights.
             LANDLOCK_ACCESS_FS_MAKE_BLOCK |
             LANDLOCK_ACCESS_FS_MAKE_SYM |
             LANDLOCK_ACCESS_FS_REFER |
-            LANDLOCK_ACCESS_FS_TRUNCATE,
+            LANDLOCK_ACCESS_FS_TRUNCATE |
+            LANDLOCK_ACCESS_FS_IOCTL,
         .handled_access_net =
             LANDLOCK_ACCESS_NET_BIND_TCP |
             LANDLOCK_ACCESS_NET_CONNECT_TCP,
@@ -84,10 +85,10 @@ to be explicit about the denied-by-default access rights.
 Because we may not know on which kernel version an application will be
 executed, it is safer to follow a best-effort security approach.  Indeed, we
 should try to protect users as much as possible whatever the kernel they are
-using.  To avoid binary enforcement (i.e. either all security features or
-none), we can leverage a dedicated Landlock command to get the current version
-of the Landlock ABI and adapt the handled accesses.  Let's check if we should
-remove access rights which are only supported in higher versions of the ABI.
+using.
+
+To be compatible with older Linux versions, we detect the available Landlock ABI
+version, and only use the available subset of access rights:
 
 .. code-block:: c
 
@@ -113,6 +114,10 @@ remove access rights which are only supported in higher versions of the ABI.
         ruleset_attr.handled_access_net &=
             ~(LANDLOCK_ACCESS_NET_BIND_TCP |
               LANDLOCK_ACCESS_NET_CONNECT_TCP);
+        __attribute__((fallthrough));
+    case 4:
+        /* Removes LANDLOCK_ACCESS_FS_IOCTL for ABI < 5 */
+        ruleset_attr.handled_access_fs &= ~LANDLOCK_ACCESS_FS_IOCTL;
     }
 
 This enables to create an inclusive ruleset that will contain our rules.
@@ -224,6 +229,7 @@ access rights per directory enables to change the location of such directory
 without relying on the destination directory access rights (except those that
 are required for this operation, see ``LANDLOCK_ACCESS_FS_REFER``
 documentation).
+
 Having self-sufficient hierarchies also helps to tighten the required access
 rights to the minimal set of data.  This also helps avoid sinkhole directories,
 i.e.  directories where data can be linked to but not linked from.  However,
@@ -317,18 +323,72 @@ It should also be noted that truncating files does not require the
 system call, this can also be done through :manpage:`open(2)` with the flags
 ``O_RDONLY | O_TRUNC``.
 
-When opening a file, the availability of the ``LANDLOCK_ACCESS_FS_TRUNCATE``
-right is associated with the newly created file descriptor and will be used for
-subsequent truncation attempts using :manpage:`ftruncate(2)`.  The behavior is
-similar to opening a file for reading or writing, where permissions are checked
-during :manpage:`open(2)`, but not during the subsequent :manpage:`read(2)` and
+The truncate right is associated with the opened file (see below).
+
+Rights associated with file descriptors
+---------------------------------------
+
+When opening a file, the availability of the ``LANDLOCK_ACCESS_FS_TRUNCATE`` and
+``LANDLOCK_ACCESS_FS_IOCTL`` rights is associated with the newly created file
+descriptor and will be used for subsequent truncation and ioctl attempts using
+:manpage:`ftruncate(2)` and :manpage:`ioctl(2)`.  The behavior is similar to
+opening a file for reading or writing, where permissions are checked during
+:manpage:`open(2)`, but not during the subsequent :manpage:`read(2)` and
 :manpage:`write(2)` calls.
 
-As a consequence, it is possible to have multiple open file descriptors for the
-same file, where one grants the right to truncate the file and the other does
-not.  It is also possible to pass such file descriptors between processes,
-keeping their Landlock properties, even when these processes do not have an
-enforced Landlock ruleset.
+As a consequence, it is possible that a process has multiple open file
+descriptors referring to the same file, but Landlock enforces different things
+when operating with these file descriptors.  This can happen when a Landlock
+ruleset gets enforced and the process keeps file descriptors which were opened
+both before and after the enforcement.  It is also possible to pass such file
+descriptors between processes, keeping their Landlock properties, even when some
+of the involved processes do not have an enforced Landlock ruleset.
+
+Restricting IOCTL commands
+--------------------------
+
+When the ``LANDLOCK_ACCESS_FS_IOCTL`` right is handled, Landlock will restrict
+the invocation of IOCTL commands.  However, to *allow* these IOCTL commands
+again, some of these IOCTL commands are then granted through other, preexisting
+access rights.
+
+For example, consider a program which handles ``LANDLOCK_ACCESS_FS_IOCTL`` and
+``LANDLOCK_ACCESS_FS_READ_FILE``.  The program *allows*
+``LANDLOCK_ACCESS_FS_READ_FILE`` on a file ``foo.log``.
+
+By virtue of granting this access on the ``foo.log`` file, it is now possible to
+use common and harmless IOCTL commands which are useful when reading files, such
+as ``FIONREAD``.
+
+When both ``LANDLOCK_ACCESS_FS_IOCTL`` and other access rights are
+handled in the ruleset, these other access rights may start governing
+the use of individual IOCTL commands instead of
+``LANDLOCK_ACCESS_FS_IOCTL``.  For instance, if both
+``LANDLOCK_ACCESS_FS_IOCTL`` and ``LANDLOCK_ACCESS_FS_READ_FILE`` are
+handled, allowing ``LANDLOCK_ACCESS_FS_READ_FILE`` will make it
+possible to use ``FIONREAD`` and other IOCTL commands.
+
+The following table illustrates how IOCTL attempts for ``FIONREAD`` are
+filtered, depending on how a Landlock ruleset handles and allows the
+``LANDLOCK_ACCESS_FS_IOCTL`` and ``LANDLOCK_ACCESS_FS_READ_FILE`` rights:
+
++-------------------------+--------------+--------------+--------------+
+|                         | ``FS_IOCTL`` | ``FS_IOCTL`` | ``FS_IOCTL`` |
+|                         | not handled  | handled and  | handled and  |
+|                         |              | allowed      | not allowed  |
++-------------------------+--------------+--------------+--------------+
+| ``FS_READ_FILE``        | allow        | allow        | deny         |
+| not handled             |              |              |              |
++-------------------------+              +--------------+--------------+
+| ``FS_READ_FILE``        |              | allow                       |
+| handled and allowed     |              |                             |
++-------------------------+              +-----------------------------+
+| ``FS_READ_FILE``        |              | deny                        |
+| handled and not allowed |              |                             |
++-------------------------+--------------+-----------------------------+
+
+The full list of IOCTL commands and the access rights which affect them is
+documented below.
 
 Compatibility
 =============
@@ -457,6 +517,27 @@ Memory usage
 Kernel memory allocated to create rulesets is accounted and can be restricted
 by the Documentation/admin-guide/cgroup-v1/memory.rst.
 
+IOCTL support
+-------------
+
+The ``LANDLOCK_ACCESS_FS_IOCTL`` right restricts the use of :manpage:`ioctl(2)`,
+but it only applies to newly opened files.  This means specifically that
+pre-existing file descriptors like stdin, stdout and stderr are unaffected.
+
+Users should be aware that TTY devices have traditionally permitted controlling
+other processes on the same TTY through the ``TIOCSTI`` and ``TIOCLINUX`` IOCTL
+commands.  It is therefore recommended to close inherited TTY file descriptors,
+or to reopen them from ``/proc/self/fd/*`` without the
+``LANDLOCK_ACCESS_FS_IOCTL`` right, if possible.  The :manpage:`isatty(3)`
+function checks whether a given file descriptor is a TTY.
+
+Landlock's IOCTL support is coarse-grained at the moment, but may become more
+fine-grained in the future.  Until then, users are advised to establish the
+guarantees that they need through the file hierarchy, by only allowing the
+``LANDLOCK_ACCESS_FS_IOCTL`` right on files where it is really harmless.  In
+cases where you can control the mounts, the ``nodev`` mount option can help to
+rule out that device files can be accessed.
+
 Previous limitations
 ====================
 
@@ -494,6 +575,16 @@ bind and connect actions to only a set of allowed ports thanks to the new
 ``LANDLOCK_ACCESS_NET_BIND_TCP`` and ``LANDLOCK_ACCESS_NET_CONNECT_TCP``
 access rights.
 
+IOCTL (ABI < 5)
+---------------
+
+IOCTL operations could not be denied before the fifth Landlock ABI, so
+:manpage:`ioctl(2)` is always allowed when using a kernel that only supports an
+earlier ABI.
+
+Starting with the Landlock ABI version 5, it is possible to restrict the use of
+:manpage:`ioctl(2)` using the new ``LANDLOCK_ACCESS_FS_IOCTL`` access right.
+
 .. _kernel_support:
 
 Kernel support
-- 
2.43.0.687.g38aa6559b0-goog



* Re: [PATCH v9 1/8] landlock: Add IOCTL access right
  2024-02-09 17:06 ` [PATCH v9 1/8] landlock: Add IOCTL access right Günther Noack
@ 2024-02-10 11:06   ` Günther Noack
  2024-02-10 11:49     ` Arnd Bergmann
  2024-02-10 11:18   ` Günther Noack
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 50+ messages in thread
From: Günther Noack @ 2024-02-10 11:06 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: linux-security-module, Mickaël Salaün,
	Christian Brauner, Jeff Xu, Jorge Lucangeli Obes, Allen Webb,
	Dmitry Torokhov, Paul Moore, Konstantin Meskhidze,
	Matt Bobrowski, linux-fsdevel

Hello Arnd!

On Fri, Feb 09, 2024 at 06:06:05PM +0100, Günther Noack wrote:
> Introduces the LANDLOCK_ACCESS_FS_IOCTL access right
> and increments the Landlock ABI version to 5.
>
> [...]
>
> diff --git a/security/landlock/fs.c b/security/landlock/fs.c
> index 73997e63734f..84efea3f7c0f 100644
> --- a/security/landlock/fs.c
> +++ b/security/landlock/fs.c
> +/**
> + * get_required_ioctl_access(): Determine required IOCTL access rights.
> + *
> + * @cmd: The IOCTL command that is supposed to be run.
> + *
> + * Any new IOCTL commands that are implemented in fs/ioctl.c's do_vfs_ioctl()
> + * should be considered for inclusion here.
> + *
> + * Returns: The access rights that must be granted on an opened file in order to
> + * use the given @cmd.
> + */
> +static __attribute_const__ access_mask_t
> +get_required_ioctl_access(const unsigned int cmd)
> +{
> +	switch (cmd) {

  [...]

> +	case FIONREAD:

Hello Arnd!  Christian Brauner suggested at FOSDEM that you would be a
good person to reach out to regarding this question -- we would
appreciate if you could have a short look! :)

Context: This patch set lets processes restrict the use of IOCTLs with
the Landlock LSM.  To make the use of this feature more practical, we
are relaxing the rules for some common and harmless IOCTL commands,
which are directly implemented in fs/ioctl.c.

The IOCTL command in question is FIONREAD: fs/ioctl.c implements
FIONREAD directly for S_ISREG files, but it does call the FIONREAD
implementation in the VFS layer for other file types.

The question we are asking ourselves is:

* Can we let processes safely use FIONREAD for all files which get
  opened for the purpose of reading, or do we run the risk of
  accidentally exposing surprising IOCTL implementations which have
  completely different purposes?

  Is it safe to assume that the VFS layer FIONREAD implementations are
  actually implementing FIONREAD semantics?

* I know there have been accidental collisions of IOCTL command
  numbers in the past -- Hypothetically, if this were to happen in one
  of the VFS implementations of FIONREAD, would that be considered a
  bug that would need to get fixed in that implementation?
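
For readers less familiar with FIONREAD, here is a minimal userspace
sketch of the semantics in question.  It is not part of the patch set;
the helper names and the scratch path under /tmp are hypothetical and
only serve the illustration.  It assumes a Linux system where the
regular-file case is handled generically in fs/ioctl.c, as described
above, and the pipe case by the pipe implementation.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Returns the byte count that FIONREAD reports for fd, or -1 on error. */
int fionread_count(int fd)
{
	int n = 0;

	if (ioctl(fd, FIONREAD, &n) < 0)
		return -1;
	return n;
}

/* Pipe: write 5 bytes, then ask how many bytes are readable. */
int fionread_on_pipe(void)
{
	int fds[2], n = -1;

	if (pipe(fds) < 0)
		return -1;
	if (write(fds[1], "hello", 5) == 5)
		n = fionread_count(fds[0]);
	close(fds[0]);
	close(fds[1]);
	return n;
}

/* Regular file: write 10 bytes, rewind, consume 4, ask for the rest. */
int fionread_on_regular_file(void)
{
	char path[64], buf[4];
	int fd, n = -1;

	/* Hypothetical scratch path; any writable location works. */
	snprintf(path, sizeof(path), "/tmp/fionread-demo-%ld", (long)getpid());
	fd = open(path, O_RDWR | O_CREAT | O_EXCL, 0600);
	if (fd < 0)
		return -1;
	unlink(path);

	if (write(fd, "0123456789", 10) == 10 &&
	    lseek(fd, 0, SEEK_SET) == 0 &&
	    read(fd, buf, sizeof(buf)) == (ssize_t)sizeof(buf))
		n = fionread_count(fd);
	close(fd);
	return n;
}
```

Calling both helpers from a small main() shows the two code paths: the
pipe reports the bytes still queued, while the regular file reports the
file size minus the current file offset.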


> +	case FIOQSIZE:
> +	case FIGETBSZ:
> +		/*
> +		 * FIONREAD returns the number of bytes available for reading.
> +		 * FIONREAD returns the number of immediately readable bytes for
> +		 * a file.

(Oops, repetitive doc, probably mis-merged.  Fixing it in next version.)


Thanks!
—Günther

-- 
Sent using Mutt 🐕 Woof Woof


* Re: [PATCH v9 1/8] landlock: Add IOCTL access right
  2024-02-09 17:06 ` [PATCH v9 1/8] landlock: Add IOCTL access right Günther Noack
  2024-02-10 11:06   ` Günther Noack
@ 2024-02-10 11:18   ` Günther Noack
  2024-02-16 14:11     ` Mickaël Salaün
  2024-02-16 17:19   ` Mickaël Salaün
  2024-02-19 18:34   ` Mickaël Salaün
  3 siblings, 1 reply; 50+ messages in thread
From: Günther Noack @ 2024-02-10 11:18 UTC (permalink / raw)
  To: linux-security-module, Mickaël Salaün
  Cc: Jeff Xu, Arnd Bergmann, Christian Brauner, Jorge Lucangeli Obes,
	Allen Webb, Dmitry Torokhov, Paul Moore, Konstantin Meskhidze,
	Matt Bobrowski, linux-fsdevel

On Fri, Feb 09, 2024 at 06:06:05PM +0100, Günther Noack wrote:
> diff --git a/security/landlock/fs.c b/security/landlock/fs.c
> index 73997e63734f..84efea3f7c0f 100644
> --- a/security/landlock/fs.c
> +++ b/security/landlock/fs.c
> @@ -1333,7 +1520,9 @@ static int hook_file_open(struct file *const file)
>  {
>  	layer_mask_t layer_masks[LANDLOCK_NUM_ACCESS_FS] = {};
>  	access_mask_t open_access_request, full_access_request, allowed_access;
> -	const access_mask_t optional_access = LANDLOCK_ACCESS_FS_TRUNCATE;
> +	const access_mask_t optional_access = LANDLOCK_ACCESS_FS_TRUNCATE |
> +					      LANDLOCK_ACCESS_FS_IOCTL |
> +					      IOCTL_GROUPS;
>  	const struct landlock_ruleset *const dom = get_current_fs_domain();
>  
>  	if (!dom)
> @@ -1375,6 +1564,16 @@ static int hook_file_open(struct file *const file)
>  		}
>  	}
>  
> +	/*
> +	 * Named pipes should be treated just like anonymous pipes.
> +	 * Therefore, we permit all IOCTLs on them.
> +	 */
> +	if (S_ISFIFO(file_inode(file)->i_mode)) {
> +		allowed_access |= LANDLOCK_ACCESS_FS_IOCTL |
> +				  LANDLOCK_ACCESS_FS_IOCTL_RW |
> +				  LANDLOCK_ACCESS_FS_IOCTL_RW_FILE;
> +	}
> +

Hello Mickaël, this "if" is a change I'd like to draw your attention
to -- this special case was necessary so that all IOCTLs are permitted
on named pipes. (There is also a test for it in another commit.)

Open questions here are:

 - I'm a bit on the fence about whether it's worth it to have these
   special cases here.  After all, users can very easily just permit
   all IOCTLs through the ruleset if needed, and it might simplify the
   mental model that we have to explain in the documentation.

 - I've put the special case into the file open hook, under the
   assumption that it would simplify the Landlock audit support to
   have the correct rights on the struct file.  The implementation
   could alternatively also be done in the ioctl hook. Let me know
   which one makes more sense to you.

BTW, named UNIX domain sockets can apparently not be opened with open() and
therefore they don't hit the LSM file_open hook.  (It is done with the BSD
socket API instead.)
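
As an illustration of that last remark, the following userspace sketch
binds a UNIX domain socket to a path and then tries to open(2) the same
path.  The function name and the scratch path are hypothetical; the
sketch assumes Linux, where open(2) is documented to fail with ENXIO
for a UNIX domain socket.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/*
 * Binds a UNIX domain socket to a scratch path and tries to open(2) it.
 * Returns the errno of the failed open(2), 0 if open(2) unexpectedly
 * succeeded, or -1 if the socket could not be set up.
 */
int errno_of_open_on_unix_socket(void)
{
	struct sockaddr_un addr;
	int srv_fd, fd, result;

	memset(&addr, 0, sizeof(addr));
	addr.sun_family = AF_UNIX;
	/* Hypothetical scratch path; any writable location works. */
	snprintf(addr.sun_path, sizeof(addr.sun_path), "/tmp/sock-demo-%ld",
		 (long)getpid());
	unlink(addr.sun_path);

	srv_fd = socket(AF_UNIX, SOCK_STREAM, 0);
	if (srv_fd < 0)
		return -1;
	if (bind(srv_fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
		close(srv_fd);
		return -1;
	}

	/* The socket inode exists on disk, but open(2) refuses it. */
	fd = open(addr.sun_path, O_RDONLY);
	if (fd >= 0) {
		result = 0;
		close(fd);
	} else {
		result = errno;
	}

	close(srv_fd);
	unlink(addr.sun_path);
	return result;
}
```

Since open(2) never succeeds here, such a socket never passes through
the file_open hook; the only way to obtain a file descriptor for it is
socket(2) plus connect(2) or accept(2).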

Thanks!
—Günther

-- 
Sent using Mutt 🐕 Woof Woof


* Re: [PATCH v9 1/8] landlock: Add IOCTL access right
  2024-02-10 11:06   ` Günther Noack
@ 2024-02-10 11:49     ` Arnd Bergmann
  2024-02-12 11:09       ` Christian Brauner
  0 siblings, 1 reply; 50+ messages in thread
From: Arnd Bergmann @ 2024-02-10 11:49 UTC (permalink / raw)
  To: Günther Noack
  Cc: linux-security-module, Mickaël Salaün,
	Christian Brauner, Jeff Xu, Jorge Lucangeli Obes, Allen Webb,
	Dmitry Torokhov, Paul Moore, Konstantin Meskhidze,
	Matt Bobrowski, linux-fsdevel

On Sat, Feb 10, 2024, at 12:06, Günther Noack wrote:
> On Fri, Feb 09, 2024 at 06:06:05PM +0100, Günther Noack wrote:
>
> The IOCTL command in question is FIONREAD: fs/ioctl.c implements
> FIONREAD directly for S_ISREG files, but it does call the FIONREAD
> implementation in the VFS layer for other file types.
>
> The question we are asking ourselves is:
>
> * Can we let processes safely use FIONREAD for all files which get
>   opened for the purpose of reading, or do we run the risk of
>   accidentally exposing surprising IOCTL implementations which have
>   completely different purposes?
>
>   Is it safe to assume that the VFS layer FIONREAD implementations are
>   actually implementing FIONREAD semantics?
>
> * I know there have been accidental collisions of IOCTL command
>   numbers in the past -- Hypothetically, if this were to happen in one
>   of the VFS implementations of FIONREAD, would that be considered a
>   bug that would need to get fixed in that implementation?

Clearly it's impossible to be sure no driver has a conflict
on this particular ioctl, but the risk for one intentionally
overriding it should be fairly low.

There are a couple of possible issues I can think of:

- the numeric value of FIONREAD is different depending
  on the architecture, with at least four different numbers
  aliasing to it. This is probably harmless but makes it
  harder to look for accidental conflicts.

- Aside from FIONREAD, it is sometimes called SIOCINQ
  (for sockets) or TIOCINQ (for tty). These still go
  through the same VFS entry point and as far as I can
  tell always have the same semantics (writing 4 bytes
  of data with the count of the remaining bytes in the
  fd).

- There are probably a couple of drivers that do something
  in their ioctl handler without actually looking at
  the command number.

If you want to be really sure you get this right, you
could add a new callback to struct file_operations
that handles this for all drivers, something like

static int ioctl_fionread(struct file *filp, int __user *arg)
{
     struct inode *inode = file_inode(filp);
     int n;

     if (S_ISREG(inode->i_mode))
         return put_user(i_size_read(inode) - filp->f_pos, arg);

     if (!filp->f_op->fionread)
         return -ENOIOCTLCMD;

     n = filp->f_op->fionread(filp);

     if (n < 0)
         return n;

     return put_user(n, arg);
}

With this, you can go through any driver implementing
FIONREAD/SIOCINQ/TIOCINQ and move the code from .ioctl
into .fionread. This probably results in cleaner code
overall, especially in drivers that have no other ioctl
commands besides this one.

Since sockets and ttys tend to have both SIOCINQ/TIOCINQ
and SIOCOUTQ/TIOCOUTQ (unlike regular files), it's
probably best to do both at the same time, or maybe
have a single callback pointer with an in/out flag.

       Arnd


* Re: [PATCH v9 1/8] landlock: Add IOCTL access right
  2024-02-10 11:49     ` Arnd Bergmann
@ 2024-02-12 11:09       ` Christian Brauner
  2024-02-12 22:10         ` Günther Noack
  0 siblings, 1 reply; 50+ messages in thread
From: Christian Brauner @ 2024-02-12 11:09 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Günther Noack, linux-security-module,
	Mickaël Salaün, Jeff Xu, Jorge Lucangeli Obes,
	Allen Webb, Dmitry Torokhov, Paul Moore, Konstantin Meskhidze,
	Matt Bobrowski, linux-fsdevel

On Sat, Feb 10, 2024 at 12:49:23PM +0100, Arnd Bergmann wrote:
> On Sat, Feb 10, 2024, at 12:06, Günther Noack wrote:
> > On Fri, Feb 09, 2024 at 06:06:05PM +0100, Günther Noack wrote:
> >
> > The IOCTL command in question is FIONREAD: fs/ioctl.c implements
> > FIONREAD directly for S_ISREG files, but it does call the FIONREAD
> > implementation in the VFS layer for other file types.
> >
> > The question we are asking ourselves is:
> >
> > * Can we let processes safely use FIONREAD for all files which get
> >   opened for the purpose of reading, or do we run the risk of
> >   accidentally exposing surprising IOCTL implementations which have
> >   completely different purposes?
> >
> >   Is it safe to assume that the VFS layer FIONREAD implementations are
> >   actually implementing FIONREAD semantics?

Yes, otherwise this should be considered a bug.

> >
> > * I know there have been accidental collisions of IOCTL command
> >   numbers in the past -- Hypothetically, if this were to happen in one
> >   of the VFS implementations of FIONREAD, would that be considered a
> >   bug that would need to get fixed in that implementation?
> 
> Clearly it's impossible to be sure no driver has a conflict
> on this particular ioctl, but the risk for one intentionally
> overriding it should be fairly low.
> 
> There are a couple of possible issues I can think of:
> 
> - the numeric value of FIONREAD is different depending
>   on the architecture, with at least four different numbers
>   aliasing to it. This is probably harmless but makes it
>   harder to look for accidental conflicts.
> 
> - Aside from FIONREAD, it is sometimes called SIOCINQ
>   (for sockets) or TIOCINQ (for tty). These still go
>   through the same VFS entry point and as far as I can
>   tell always have the same semantics (writing 4 bytes
>   of data with the count of the remaining bytes in the
>   fd).
> 
> - There are probably a couple of drivers that do something
>   in their ioctl handler without actually looking at
>   the command number.
> 
> If you want to be really sure you get this right, you
> could add a new callback to struct file_operations
> that handles this for all drivers, something like
> 
> static int ioctl_fionread(struct file *filp, int __user *arg)
> {
>      struct inode *inode = file_inode(filp);
>      int n;
> 
>      if (S_ISREG(inode->i_mode))
>          return put_user(i_size_read(inode) - filp->f_pos, arg);
> 
>      if (!filp->f_op->fionread)
>          return -ENOIOCTLCMD;
> 
>      n = filp->f_op->fionread(filp);
> 
>      if (n < 0)
>          return n;
> 
>      return put_user(n, arg);
> }
> 
> With this, you can go through any driver implementing
> FIONREAD/SIOCINQ/TIOCINQ and move the code from .ioctl
> into .fionread. This probably results in cleaner code
> overall, especially in drivers that have no other ioctl
> commands besides this one.
> 
> Since sockets and ttys tend to have both SIOCINQ/TIOCINQ
> and SIOCOUTQ/TIOCOUTQ (unlike regular files), it's

I'm not excited about adding a bunch of methods to struct
file_operations for this stuff.


* Re: [PATCH v9 1/8] landlock: Add IOCTL access right
  2024-02-12 11:09       ` Christian Brauner
@ 2024-02-12 22:10         ` Günther Noack
  0 siblings, 0 replies; 50+ messages in thread
From: Günther Noack @ 2024-02-12 22:10 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Arnd Bergmann, linux-security-module, Mickaël Salaün,
	Jeff Xu, Jorge Lucangeli Obes, Allen Webb, Dmitry Torokhov,
	Paul Moore, Konstantin Meskhidze, Matt Bobrowski, linux-fsdevel

Thank you, Arnd and Christian, for the detailed insights!
This is very helpful!

On Mon, Feb 12, 2024 at 12:09:47PM +0100, Christian Brauner wrote:
> On Sat, Feb 10, 2024 at 12:49:23PM +0100, Arnd Bergmann wrote:
> > On Sat, Feb 10, 2024, at 12:06, Günther Noack wrote:
> > > On Fri, Feb 09, 2024 at 06:06:05PM +0100, Günther Noack wrote:
> > >
> > > The IOCTL command in question is FIONREAD: fs/ioctl.c implements
> > > FIONREAD directly for S_ISREG files, but it does call the FIONREAD
> > > implementation in the VFS layer for other file types.
> > >
> > > The question we are asking ourselves is:
> > >
> > > * Can we let processes safely use FIONREAD for all files which get
> > >   opened for the purpose of reading, or do we run the risk of
> > >   accidentally exposing surprising IOCTL implementations which have
> > >   completely different purposes?
> > >
> > >   Is it safe to assume that the VFS layer FIONREAD implementations are
> > >   actually implementing FIONREAD semantics?
> 
> Yes, otherwise this should be considered a bug.

Excellent, thanks :)


> > > * I know there have been accidental collisions of IOCTL command
> > >   numbers in the past -- Hypothetically, if this were to happen in one
> > >   of the VFS implementations of FIONREAD, would that be considered a
> > >   bug that would need to get fixed in that implementation?
> > 
> > Clearly it's impossible to be sure no driver has a conflict
> > on this particular ioctl, but the risk for one intentionally
> > overriding it should be fairly low.
> > 
> > There are a couple of possible issues I can think of:
> > 
> > - the numeric value of FIONREAD is different depending
> >   on the architecture, with at least four different numbers
> >   aliasing to it. This is probably harmless but makes it
> >   harder to look for accidental conflicts.
> > 
> > - Aside from FIONREAD, it is sometimes called SIOCINQ
> >   (for sockets) or TIOCINQ (for tty). These still go
> >   through the same VFS entry point and as far as I can
> >   tell always have the same semantics (writing 4 bytes
> >   of data with the count of the remaining bytes in the
> >   fd).

Thanks, good pointer!

Grepping for these three names, I found:

* ~10 FIONREAD implementations in various drivers
* ~10 TIOCINQ implementations (all in net/, mostly in net/*/af_*.c files)
* ~20 SIOCINQ implementations (all in net/, related to network protocols?)

The implementations seem mostly related to networking, which gives me
hope that special casing and denying FIONREAD in the common case might
not make a big difference after all.

(The ioctl filtering in this patch set only applies to files opened
through open(2), but not to network sockets and other files acquired
through socket(2), pipe(2), socketpair(2), fanotify_init(2) and the
like. -- It just gets messy if such files are opened through
/proc/*/fd/*)


> > - There are probably a couple of drivers that do something
> >   in their ioctl handler without actually looking at
> >   the command number.

Thank you for the pointers!

You are right, it is surprisingly common that ioctl handlers do work
without first looking at the command number.  I spot checked a few
ioctl handler implementations and it is easy to dig up examples.

If this is done, the pattern is often this:

   preparation_work();
   
   switch (cmd) {
   case A:   /* impl */
   case B:   /* impl */
   default:  /* set error */
   }
   
   cleanup_work();

Common types of preparation work are the acquisition and release of
locks, allocation and release of commonly used buffers, copying memory
to and from userspace, and flushing buffers.
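
To make the pattern concrete, here is a small userspace model of such a
handler.  It is purely illustrative: the command numbers, the struct
and the function name are hypothetical and do not correspond to any
real driver.  The counters stand in for side effects such as taking a
lock or flushing a buffer.

```c
#include <errno.h>

/* Hypothetical command numbers, for illustration only. */
enum { DEMO_CMD_A = 1, DEMO_CMD_B = 2, DEMO_CMD_UNKNOWN = 99 };

struct demo_dev {
	int lock_acquisitions;	/* how often the "lock" was taken */
	int buffer_flushes;	/* how often cleanup "flushed" the buffer */
	int value;		/* state set by A and read by B */
};

int demo_ioctl(struct demo_dev *dev, unsigned int cmd, int arg)
{
	int ret;

	/* Preparation work: runs before the command is even looked at. */
	dev->lock_acquisitions++;

	switch (cmd) {
	case DEMO_CMD_A:
		dev->value = arg;
		ret = 0;
		break;
	case DEMO_CMD_B:
		ret = dev->value;
		break;
	default:
		ret = -ENOTTY;
		break;
	}

	/* Cleanup work: also runs for commands the handler rejects. */
	dev->buffer_flushes++;
	return ret;
}
```

Calling demo_ioctl() with DEMO_CMD_UNKNOWN returns -ENOTTY, yet both
counters are incremented: the caller has triggered the preparation and
cleanup side effects without the handler implementing the command.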

One of the larger examples which I found was video_ioctl2() from
drivers/media/v4l2-core/v4l2-ioctl.c, which is used from multiple
video drivers.

It also seems to me that ioctl handlers doing work independent of
command number might be a bigger problem than the hypothetical command
number collision I originally asked about.  -- If we allow FIONREAD to
be called on files too liberally, we are not only exposing [...]


> > If you want to be really sure you get this right, you
> > could add a new callback to struct file_operations
> > that handles this for all drivers, something like
> > 
> > static int ioctl_fionread(struct file *filp, int __user *arg)
> > {
> >      struct inode *inode = file_inode(filp);
> >      int n;
> > 
> >      if (S_ISREG(inode->i_mode))
> >          return put_user(i_size_read(inode) - filp->f_pos, arg);
> > 
> >      if (!filp->f_op->fionread)
> >          return -ENOIOCTLCMD;
> > 
> >      n = filp->f_op->fionread(filp);
> > 
> >      if (n < 0)
> >          return n;
> > 
> >      return put_user(n, arg);
> > }
> > 
> > With this, you can go through any driver implementing
> > FIONREAD/SIOCINQ/TIOCINQ and move the code from .ioctl
> > into .fionread. This probably results in cleaner code
> > overall, especially in drivers that have no other ioctl
> > commands besides this one.
> > 
> > Since sockets and ttys tend to have both SIOCINQ/TIOCINQ
> > and SIOCOUTQ/TIOCOUTQ (unlike regular files), it's
> > probably best to do both at the same time, or maybe
> > have a single callback pointer with an in/out flag.
> 
> I'm not excited about adding a bunch of methods to struct
> file_operations for this stuff.

Agreed.  As far as I understand, if we added a special .fionread
handler to struct file_operations, the pros and cons would be:

 + For files that don't implement FIONREAD, calling ioctl(fd,
   FIONREAD, ...) could not accidentally execute ioctl handler
   preparation and cleanup work.

 - It would use a bit more space in struct file_operations.

 - It might not be obvious to driver implementers that they'd need to
   hook into .fionread.  There is a slight risk that some ioctl
   implementers would still accidentally implement FIONREAD the "old"
   way, and then it would not get called.

 - It would set a weird precedent to have a special handler for a
   single IOCTL command (why is this one command special?); this can't
   be done for all IOCTL commands like that.

If I weigh these against each other, I am not convinced that it
wouldn't cause more complications later on.  But it was a good idea
that I had not considered and it was worth looking into, I think.


In summary, I need to digest this a bit longer ;-)

I think these two were the key insights for me from this discussion:

 * There are fewer implementations of FIONREAD than I was afraid we
   would find.  This gives me some hope that we can maybe special case
   it after all in Landlock, and solve the 99% of realistic scenarios
   with it (and the remaining 1% can still ask for the "all IOCTLs"
   right, if needed).

 * Some IOCTL handlers are messier than I had expected, and it seems
   less realistic that we can convince ourselves that these are safe.
   I particularly did not realize before that ioctl handlers that
   don't even implement FIONREAD might actually still do work when
   called with FIONREAD... who would have thought! o_O

I'll try to explore the direction of special casing FIONREAD for now,
and think about it some more.  Thank you for the insightful input,
I'll loop you in when I have something more concrete.

—Günther


* Re: [PATCH v9 1/8] landlock: Add IOCTL access right
  2024-02-10 11:18   ` Günther Noack
@ 2024-02-16 14:11     ` Mickaël Salaün
  2024-02-16 15:51       ` Mickaël Salaün
  0 siblings, 1 reply; 50+ messages in thread
From: Mickaël Salaün @ 2024-02-16 14:11 UTC (permalink / raw)
  To: Günther Noack
  Cc: linux-security-module, Jeff Xu, Arnd Bergmann, Christian Brauner,
	Jorge Lucangeli Obes, Allen Webb, Dmitry Torokhov, Paul Moore,
	Konstantin Meskhidze, Matt Bobrowski, linux-fsdevel

On Sat, Feb 10, 2024 at 12:18:06PM +0100, Günther Noack wrote:
> On Fri, Feb 09, 2024 at 06:06:05PM +0100, Günther Noack wrote:
> > diff --git a/security/landlock/fs.c b/security/landlock/fs.c
> > index 73997e63734f..84efea3f7c0f 100644
> > --- a/security/landlock/fs.c
> > +++ b/security/landlock/fs.c
> > @@ -1333,7 +1520,9 @@ static int hook_file_open(struct file *const file)
> >  {
> >  	layer_mask_t layer_masks[LANDLOCK_NUM_ACCESS_FS] = {};
> >  	access_mask_t open_access_request, full_access_request, allowed_access;
> > -	const access_mask_t optional_access = LANDLOCK_ACCESS_FS_TRUNCATE;
> > +	const access_mask_t optional_access = LANDLOCK_ACCESS_FS_TRUNCATE |
> > +					      LANDLOCK_ACCESS_FS_IOCTL |
> > +					      IOCTL_GROUPS;
> >  	const struct landlock_ruleset *const dom = get_current_fs_domain();
> >  
> >  	if (!dom)
> > @@ -1375,6 +1564,16 @@ static int hook_file_open(struct file *const file)
> >  		}
> >  	}
> >  
> > +	/*
> > +	 * Named pipes should be treated just like anonymous pipes.
> > +	 * Therefore, we permit all IOCTLs on them.
> > +	 */
> > +	if (S_ISFIFO(file_inode(file)->i_mode)) {
> > +		allowed_access |= LANDLOCK_ACCESS_FS_IOCTL |
> > +				  LANDLOCK_ACCESS_FS_IOCTL_RW |
> > +				  LANDLOCK_ACCESS_FS_IOCTL_RW_FILE;

Why not LANDLOCK_ACCESS_FS_IOCTL | IOCTL_GROUPS instead?

> > +	}
> > +
> 
> Hello Mickaël, this "if" is a change I'd like to draw your attention
> to -- this special case was necessary so that all IOCTLs are permitted
> on named pipes. (There is also a test for it in another commit.)
> 
> Open questions here are:
> 
>  - I'm a bit on the edge whether it's worth it to have these special
>    cases here.  After all, users can very easily just permit all
>    IOCTLs through the ruleset if needed, and it might simplify the
>    mental model that we have to explain in the documentation

It might simplify the kernel implementation a bit but it would make the
Landlock security policies more complex, and could encourage people to
allow all IOCTLs on a directory because this directory might contain
(dynamically created) named pipes.

I suggest extending this check to S_ISFIFO(mode) || S_ISSOCK(mode).
A comment should explain that LANDLOCK_ACCESS_FS_* rights are not meant
to restrict IPCs.
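
A minimal userspace sketch of that predicate, for illustration only.  The
helper name is_ipc_mode() is hypothetical and not part of the patch; the
in-kernel version would read the mode from file_inode(file)->i_mode.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdbool.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: true for file types that are IPC endpoints
 * (named pipes and sockets), which LANDLOCK_ACCESS_FS_* rights are
 * not meant to restrict.
 */
static bool is_ipc_mode(mode_t mode)
{
	return S_ISFIFO(mode) || S_ISSOCK(mode);
}

/* A few examples of which file types the predicate matches. */
static void check_ipc_mode(void)
{
	assert(is_ipc_mode(S_IFIFO | 0600));
	assert(is_ipc_mode(S_IFSOCK | 0600));
	assert(!is_ipc_mode(S_IFREG | 0644));
	assert(!is_ipc_mode(S_IFCHR | 0600));
}
```

Character and block devices are deliberately not matched, so their IOCTLs
would still be governed by the ruleset.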

> 
>  - I've put the special case into the file open hook, under the
>    assumption that it would simplify the Landlock audit support to
>    have the correct rights on the struct file.  The implementation
>    could alternatively also be done in the ioctl hook. Let me know
>    which one makes more sense to you.

I like your approach, thanks!  Also, in theory this approach should be
better for performance reasons, even if it should not be visible in
practice. Anyway, keeping a consistent set of access rights is
definitely useful for observability.

I'm wondering if we should do the same mode check for
LANDLOCK_ACCESS_FS_TRUNCATE too... It would not be visible to user space
anyway because the LSM hooks are called after the file mode checks for
truncate(2) and ftruncate(2). But because we need this kind of check for
IOCTL, it might be a good idea to make it common to all optional_access
values, at least to document what is really handled. Adding dedicated
truncate and ftruncate tests (before this commit) would guarantee that
the returned error codes are unchanged.

Moving this check before the is_access_to_paths_allowed() call would
avoid looking up always-allowed access rights, because they could be
removed from full_access_request. This could improve performance when
opening a named pipe, because no optional_access would be requested.

A new helper similar to get_required_file_open_access() could help.

> 
> BTW, named UNIX domain sockets can apparently not be opened with open() and
> therefore they don't hit the LSM file_open hook.  (It is done with the BSD
> socket API instead.)

What about /proc/*/fd/* ? We can test with open_proc_fd() to make sure
our assumptions are correct.


* Re: [PATCH v9 1/8] landlock: Add IOCTL access right
  2024-02-16 14:11     ` Mickaël Salaün
@ 2024-02-16 15:51       ` Mickaël Salaün
  2024-02-18  8:34         ` Günther Noack
  0 siblings, 1 reply; 50+ messages in thread
From: Mickaël Salaün @ 2024-02-16 15:51 UTC (permalink / raw)
  To: Günther Noack
  Cc: linux-security-module, Jeff Xu, Arnd Bergmann, Christian Brauner,
	Jorge Lucangeli Obes, Allen Webb, Dmitry Torokhov, Paul Moore,
	Konstantin Meskhidze, Matt Bobrowski, linux-fsdevel

On Fri, Feb 16, 2024 at 03:11:18PM +0100, Mickaël Salaün wrote:
> On Sat, Feb 10, 2024 at 12:18:06PM +0100, Günther Noack wrote:
> > On Fri, Feb 09, 2024 at 06:06:05PM +0100, Günther Noack wrote:
> > > diff --git a/security/landlock/fs.c b/security/landlock/fs.c
> > > index 73997e63734f..84efea3f7c0f 100644
> > > --- a/security/landlock/fs.c
> > > +++ b/security/landlock/fs.c
> > > @@ -1333,7 +1520,9 @@ static int hook_file_open(struct file *const file)
> > >  {
> > >  	layer_mask_t layer_masks[LANDLOCK_NUM_ACCESS_FS] = {};
> > >  	access_mask_t open_access_request, full_access_request, allowed_access;
> > > -	const access_mask_t optional_access = LANDLOCK_ACCESS_FS_TRUNCATE;
> > > +	const access_mask_t optional_access = LANDLOCK_ACCESS_FS_TRUNCATE |
> > > +					      LANDLOCK_ACCESS_FS_IOCTL |
> > > +					      IOCTL_GROUPS;
> > >  	const struct landlock_ruleset *const dom = get_current_fs_domain();
> > >  
> > >  	if (!dom)
> > > @@ -1375,6 +1564,16 @@ static int hook_file_open(struct file *const file)
> > >  		}
> > >  	}
> > >  
> > > +	/*
> > > +	 * Named pipes should be treated just like anonymous pipes.
> > > +	 * Therefore, we permit all IOCTLs on them.
> > > +	 */
> > > +	if (S_ISFIFO(file_inode(file)->i_mode)) {
> > > +		allowed_access |= LANDLOCK_ACCESS_FS_IOCTL |
> > > +				  LANDLOCK_ACCESS_FS_IOCTL_RW |
> > > +				  LANDLOCK_ACCESS_FS_IOCTL_RW_FILE;
> 
> Why not LANDLOCK_ACCESS_FS_IOCTL | IOCTL_GROUPS instead?
> 
> > > +	}
> > > +
> > 
> > Hello Mickaël, this "if" is a change I'd like to draw your attention
> > to -- this special case was necessary so that all IOCTLs are permitted
> > on named pipes. (There is also a test for it in another commit.)
> > 
> > Open questions here are:
> > 
> >  - I'm a bit on the edge whether it's worth it to have these special
> >    cases here.  After all, users can very easily just permit all
> >    IOCTLs through the ruleset if needed, and it might simplify the
> >    mental model that we have to explain in the documentation
> 
> It might simplify the kernel implementation a bit but it would make the
> Landlock security policies more complex, and could encourage people to
> allow all IOCTLs on a directory because this directory might contain
> (dynamically created) named pipes.
> 
> I suggest extending this check to S_ISFIFO(mode) || S_ISSOCK(mode).
> A comment should explain that LANDLOCK_ACCESS_FS_* rights are not meant
> to restrict IPCs.
> 
> > 
> >  - I've put the special case into the file open hook, under the
> >    assumption that it would simplify the Landlock audit support to
> >    have the correct rights on the struct file.  The implementation
> >    could alternatively also be done in the ioctl hook. Let me know
> >    which one makes more sense to you.
> 
> I like your approach, thanks!  Also, in theory this approach should be
> better for performance reasons, even if it should not be visible in
> practice. Anyway, keeping a consistent set of access rights is
> definitely useful for observability.
> 
> I'm wondering if we should do the same mode check for
> LANDLOCK_ACCESS_FS_TRUNCATE too... It would not be visible to user space
> anyway because the LSM hooks are called after the file mode checks for
> truncate(2) and ftruncate(2). But because we need this kind of check for
> IOCTL, it might be a good idea to make it common to all optional_access
> values, at least to document what is really handled. Adding dedicated
> truncate and ftruncate tests (before this commit) would guarantee that
> the returned error codes are unchanged.
> 
> Moving this check before the is_access_to_paths_allowed() call would
> avoid looking up always-allowed access rights, because they could be
> removed from full_access_request. This could improve performance when
> opening a named pipe, because no optional_access would be requested.
> 
> A new helper similar to get_required_file_open_access() could help.
> 
> > 
> > BTW, named UNIX domain sockets can apparently not be opened with open() and
> > therefore they don't hit the LSM file_open hook.  (It is done with the BSD
> > socket API instead.)
> 
> What about /proc/*/fd/* ? We can test with open_proc_fd() to make sure
> our assumptions are correct.

Actually, these fifo and socket checks (and related optimizations)
should already be handled with is_nouser_or_private() called by
is_access_to_paths_allowed(). Some new dedicated tests should help
though.
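
As a possible starting point for such a test, here is a hedged userspace
sketch that only exercises the baseline behaviour: FIONREAD on a freshly
created named pipe.  It does not set up any Landlock ruleset; a real
selftest would presumably enforce one before the open() and check that the
IOCTL still works.  The function name and the /tmp path are assumptions.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Creates a named pipe in a temporary directory, writes @len bytes
 * into it and returns the byte count reported by FIONREAD, or -1 on
 * any error.
 */
static int fifo_fionread_after_write(const char *data, size_t len)
{
	char dir[] = "/tmp/fifo-test-XXXXXX";
	char path[64];
	int fd, n = -1;

	if (!mkdtemp(dir))
		return -1;
	snprintf(path, sizeof(path), "%s/fifo", dir);
	if (mkfifo(path, 0600) == 0) {
		/* On Linux, O_RDWR on a FIFO does not wait for a peer. */
		fd = open(path, O_RDWR | O_CLOEXEC);
		if (fd >= 0) {
			if (write(fd, data, len) != (ssize_t)len ||
			    ioctl(fd, FIONREAD, &n) != 0)
				n = -1;
			close(fd);
		}
		unlink(path);
	}
	rmdir(dir);
	return n;
}

/* Five bytes written, so five bytes should be readable. */
static void check_fifo_fionread(void)
{
	assert(fifo_fionread_after_write("hello", 5) == 5);
}
```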


* Re: [PATCH v9 1/8] landlock: Add IOCTL access right
  2024-02-09 17:06 ` [PATCH v9 1/8] landlock: Add IOCTL access right Günther Noack
  2024-02-10 11:06   ` Günther Noack
  2024-02-10 11:18   ` Günther Noack
@ 2024-02-16 17:19   ` Mickaël Salaün
  2024-02-19 18:34   ` Mickaël Salaün
  3 siblings, 0 replies; 50+ messages in thread
From: Mickaël Salaün @ 2024-02-16 17:19 UTC (permalink / raw)
  To: Günther Noack
  Cc: linux-security-module, Jeff Xu, Arnd Bergmann,
	Jorge Lucangeli Obes, Allen Webb, Dmitry Torokhov, Paul Moore,
	Konstantin Meskhidze, Matt Bobrowski, linux-fsdevel

On Fri, Feb 09, 2024 at 06:06:05PM +0100, Günther Noack wrote:
> Introduces the LANDLOCK_ACCESS_FS_IOCTL access right
> and increments the Landlock ABI version to 5.
> 
> Like the truncate right, these rights are associated with a file
> descriptor at the time of open(2), and get respected even when the
> file descriptor is used outside of the thread which it was originally
> opened in.
> 
> A newly enabled Landlock policy therefore does not apply to file
> descriptors which are already open.
> 
> If the LANDLOCK_ACCESS_FS_IOCTL right is handled, only a small number
> of safe IOCTL commands will be permitted on newly opened files.  The
> permitted IOCTLs can be configured through the ruleset in limited ways
> now.  (See documentation for details.)
> 
> Specifically, when LANDLOCK_ACCESS_FS_IOCTL is handled, granting this
> right on a file or directory will *not* permit to do all IOCTL
> commands, but only influence the IOCTL commands which are not already
> handled through other access rights.  The intent is to keep the groups
> of IOCTL commands more fine-grained.
> 
> Noteworthy scenarios which require special attention:
> 
> TTY devices are often passed into a process from the parent process,
> and so a newly enabled Landlock policy does not retroactively apply to
> them automatically.  In the past, TTY devices have often supported
> IOCTL commands like TIOCSTI and some TIOCLINUX subcommands, which were
> letting callers control the TTY input buffer (and simulate
> keypresses).  This should be restricted to CAP_SYS_ADMIN programs on
> modern kernels though.
> 
> Some legitimate file system features, like setting up fscrypt, are
> exposed as IOCTL commands on regular files and directories -- users of
> Landlock are advised to double check that the sandboxed process does
> not need to invoke these IOCTLs.
> 
> Known limitations:
> 
> The LANDLOCK_ACCESS_FS_IOCTL access right is a coarse-grained control
> over IOCTL commands.  Future work will enable a more fine-grained
> access control for IOCTLs.
> 
> In the meantime, Landlock users may use path-based restrictions in
> combination with their knowledge about the file system layout to
> control what IOCTLs can be done.  Mounting file systems with the nodev
> option can help to distinguish regular files and devices, and give
> guarantees about the affected files, which Landlock alone can not give
> yet.
> 
> Signed-off-by: Günther Noack <gnoack@google.com>
> ---
>  include/uapi/linux/landlock.h                |  55 ++++-
>  security/landlock/fs.c                       | 227 ++++++++++++++++++-
>  security/landlock/fs.h                       |   3 +
>  security/landlock/limits.h                   |  11 +-
>  security/landlock/ruleset.h                  |   2 +-
>  security/landlock/syscalls.c                 |  19 +-
>  tools/testing/selftests/landlock/base_test.c |   2 +-
>  tools/testing/selftests/landlock/fs_test.c   |   5 +-
>  8 files changed, 302 insertions(+), 22 deletions(-)
> 
> diff --git a/include/uapi/linux/landlock.h b/include/uapi/linux/landlock.h
> index 25c8d7677539..16d7d72804f8 100644
> --- a/include/uapi/linux/landlock.h
> +++ b/include/uapi/linux/landlock.h
> @@ -128,7 +128,7 @@ struct landlock_net_port_attr {
>   * files and directories.  Files or directories opened before the sandboxing
>   * are not subject to these restrictions.
>   *
> - * A file can only receive these access rights:
> + * The following access rights apply only to files:
>   *
>   * - %LANDLOCK_ACCESS_FS_EXECUTE: Execute a file.
>   * - %LANDLOCK_ACCESS_FS_WRITE_FILE: Open a file with write access. Note that
> @@ -138,12 +138,13 @@ struct landlock_net_port_attr {
>   * - %LANDLOCK_ACCESS_FS_READ_FILE: Open a file with read access.
>   * - %LANDLOCK_ACCESS_FS_TRUNCATE: Truncate a file with :manpage:`truncate(2)`,
>   *   :manpage:`ftruncate(2)`, :manpage:`creat(2)`, or :manpage:`open(2)` with
> - *   ``O_TRUNC``. Whether an opened file can be truncated with
> - *   :manpage:`ftruncate(2)` is determined during :manpage:`open(2)`, in the
> - *   same way as read and write permissions are checked during
> - *   :manpage:`open(2)` using %LANDLOCK_ACCESS_FS_READ_FILE and
> - *   %LANDLOCK_ACCESS_FS_WRITE_FILE. This access right is available since the
> - *   third version of the Landlock ABI.
> + *   ``O_TRUNC``.  This access right is available since the third version of the
> + *   Landlock ABI.
> + *
> + * Whether an opened file can be truncated with :manpage:`ftruncate(2)` or used
> + * with `ioctl(2)` is determined during :manpage:`open(2)`, in the same way as
> + * read and write permissions are checked during :manpage:`open(2)` using
> + * %LANDLOCK_ACCESS_FS_READ_FILE and %LANDLOCK_ACCESS_FS_WRITE_FILE.
>   *
>   * A directory can receive access rights related to files or directories.  The
>   * following access right is applied to the directory itself, and the
> @@ -198,13 +199,50 @@ struct landlock_net_port_attr {
>   *   If multiple requirements are not met, the ``EACCES`` error code takes
>   *   precedence over ``EXDEV``.
>   *
> + * The following access right applies both to files and directories:
> + *
> + * - %LANDLOCK_ACCESS_FS_IOCTL: Invoke :manpage:`ioctl(2)` commands on an opened
> + *   file or directory.
> + *
> + *   This access right applies to all :manpage:`ioctl(2)` commands, except for
> + *   ``FIOCLEX``, ``FIONCLEX``, ``FIONBIO`` and ``FIOASYNC``.  These commands
> + *   continue to be invokable independent of the %LANDLOCK_ACCESS_FS_IOCTL
> + *   access right.
> + *
> + *   When certain other access rights are handled in the ruleset, in addition to
> + *   %LANDLOCK_ACCESS_FS_IOCTL, granting these access rights will unlock access
> + *   to additional groups of IOCTL commands, on the affected files:
> + *
> + *   * %LANDLOCK_ACCESS_FS_READ_FILE and %LANDLOCK_ACCESS_FS_WRITE_FILE unlock
> + *     access to ``FIOQSIZE``, ``FIONREAD``, ``FIGETBSZ``, ``FS_IOC_FIEMAP``,
> + *     ``FIBMAP``, ``FIDEDUPERANGE``, ``FICLONE``, ``FICLONERANGE``,
> + *     ``FS_IOC_RESVSP``, ``FS_IOC_RESVSP64``, ``FS_IOC_UNRESVSP``,
> + *     ``FS_IOC_UNRESVSP64``, ``FS_IOC_ZERO_RANGE``.
> + *
> + *   * %LANDLOCK_ACCESS_FS_READ_DIR unlocks access to ``FIOQSIZE``,
> + *     ``FIONREAD``, ``FIGETBSZ``.
> + *
> + *   When these access rights are handled in the ruleset, the availability of
> + *   the affected IOCTL commands is not governed by %LANDLOCK_ACCESS_FS_IOCTL
> + *   any more, but by the respective access right.
> + *
> + *   All other IOCTL commands are not handled specially, and are governed by
> + *   %LANDLOCK_ACCESS_FS_IOCTL.  This includes %FS_IOC_GETFLAGS and
> + *   %FS_IOC_SETFLAGS for manipulating inode flags (:manpage:`ioctl_iflags(2)`),
> + *   %FS_IOC_FSGETXATTR and %FS_IOC_FSSETXATTR for manipulating extended
> + *   attributes, as well as %FIFREEZE and %FITHAW for freezing and thawing file
> + *   systems.
> + *
> + *   This access right is available since the fifth version of the Landlock
> + *   ABI.
> + *
>   * .. warning::
>   *
>   *   It is currently not possible to restrict some file-related actions
>   *   accessible through these syscall families: :manpage:`chdir(2)`,
>   *   :manpage:`stat(2)`, :manpage:`flock(2)`, :manpage:`chmod(2)`,
>   *   :manpage:`chown(2)`, :manpage:`setxattr(2)`, :manpage:`utime(2)`,
> - *   :manpage:`ioctl(2)`, :manpage:`fcntl(2)`, :manpage:`access(2)`.
> + *   :manpage:`fcntl(2)`, :manpage:`access(2)`.
>   *   Future Landlock evolutions will enable to restrict them.
>   */
>  /* clang-format off */
> @@ -223,6 +261,7 @@ struct landlock_net_port_attr {
>  #define LANDLOCK_ACCESS_FS_MAKE_SYM			(1ULL << 12)
>  #define LANDLOCK_ACCESS_FS_REFER			(1ULL << 13)
>  #define LANDLOCK_ACCESS_FS_TRUNCATE			(1ULL << 14)
> +#define LANDLOCK_ACCESS_FS_IOCTL			(1ULL << 15)
>  /* clang-format on */
>  
>  /**
> diff --git a/security/landlock/fs.c b/security/landlock/fs.c
> index 73997e63734f..84efea3f7c0f 100644
> --- a/security/landlock/fs.c
> +++ b/security/landlock/fs.c
> @@ -7,6 +7,7 @@
>   * Copyright © 2021-2022 Microsoft Corporation
>   */
>  
> +#include <asm/ioctls.h>
>  #include <kunit/test.h>
>  #include <linux/atomic.h>
>  #include <linux/bitops.h>
> @@ -14,6 +15,7 @@
>  #include <linux/compiler_types.h>
>  #include <linux/dcache.h>
>  #include <linux/err.h>
> +#include <linux/falloc.h>
>  #include <linux/fs.h>
>  #include <linux/init.h>
>  #include <linux/kernel.h>
> @@ -29,6 +31,7 @@
>  #include <linux/types.h>
>  #include <linux/wait_bit.h>
>  #include <linux/workqueue.h>
> +#include <uapi/linux/fiemap.h>
>  #include <uapi/linux/landlock.h>
>  
>  #include "common.h"
> @@ -84,6 +87,186 @@ static const struct landlock_object_underops landlock_fs_underops = {
>  	.release = release_inode
>  };
>  
> +/* IOCTL helpers */
> +
> +/*
> + * These are synthetic access rights, which are only used within the kernel, but
> + * not exposed to callers in userspace.  The mapping between these access rights
> + * and IOCTL commands is defined in the get_required_ioctl_access() helper function.
> + */
> +#define LANDLOCK_ACCESS_FS_IOCTL_RW (LANDLOCK_LAST_PUBLIC_ACCESS_FS << 1)
> +#define LANDLOCK_ACCESS_FS_IOCTL_RW_FILE (LANDLOCK_LAST_PUBLIC_ACCESS_FS << 2)
> +
> +/* ioctl_groups - all synthetic access rights for IOCTL command groups */
> +/* clang-format off */
> +#define IOCTL_GROUPS (				\
> +	LANDLOCK_ACCESS_FS_IOCTL_RW |		\
> +	LANDLOCK_ACCESS_FS_IOCTL_RW_FILE)
> +/* clang-format on */
> +
> +static_assert((IOCTL_GROUPS & LANDLOCK_MASK_ACCESS_FS) == IOCTL_GROUPS);
> +
> +/**
> + * get_required_ioctl_access(): Determine required IOCTL access rights.
> + *
> + * @cmd: The IOCTL command that is supposed to be run.
> + *
> + * Any new IOCTL commands that are implemented in fs/ioctl.c's do_vfs_ioctl()
> + * should be considered for inclusion here.

It might be a good idea to add a similar comment in
fs/ioctl.c:do_vfs_ioctl(), just before the "default" case, to make sure
nobody forgets to Cc us if a new command is added.

> + *
> + * Returns: The access rights that must be granted on an opened file in order to
> + * use the given @cmd.
> + */
> +static __attribute_const__ access_mask_t
> +get_required_ioctl_access(const unsigned int cmd)
> +{
> +	switch (cmd) {
> +	case FIOCLEX:
> +	case FIONCLEX:
> +	case FIONBIO:
> +	case FIOASYNC:
> +		/*
> +		 * FIOCLEX, FIONCLEX, FIONBIO and FIOASYNC manipulate the FD's
> +		 * close-on-exec and the file's non-blocking-IO and async flags.
> +		 * These operations are also available through fcntl(2), and are
> +		 * unconditionally permitted in Landlock.
> +		 */
> +		return 0;
> +	case FIONREAD:
> +	case FIOQSIZE:
> +	case FIGETBSZ:
> +		/*
> +		 * FIONREAD returns the number of bytes available for reading.
> +		 * FIONREAD returns the number of immediately readable bytes for
> +		 * a file.
> +		 *
> +		 * FIOQSIZE queries the size of a file or directory.
> +		 *
> +		 * FIGETBSZ queries the file system's block size for a file or
> +		 * directory.
> +		 *
> +		 * These IOCTL commands are permitted for files which are opened
> +		 * with LANDLOCK_ACCESS_FS_READ_DIR,
> +		 * LANDLOCK_ACCESS_FS_READ_FILE, or
> +		 * LANDLOCK_ACCESS_FS_WRITE_FILE.
> +		 */
> +		return LANDLOCK_ACCESS_FS_IOCTL_RW;
> +	case FS_IOC_FIEMAP:
> +	case FIBMAP:
> +		/*
> +		 * FS_IOC_FIEMAP and FIBMAP query information about the
> +		 * allocation of blocks within a file.  They are permitted for
> +		 * files which are opened with LANDLOCK_ACCESS_FS_READ_FILE or
> +		 * LANDLOCK_ACCESS_FS_WRITE_FILE.
> +		 */
> +		fallthrough;
> +	case FIDEDUPERANGE:
> +	case FICLONE:
> +	case FICLONERANGE:
> +		/*
> +		 * FIDEDUPERANGE, FICLONE and FICLONERANGE make files share
> +		 * their underlying storage ("reflink") between source and
> +		 * destination FDs, on file systems which support that.
> +		 *
> +		 * The underlying implementations are already checking whether
> +		 * the involved files are opened with the appropriate read/write
> +		 * modes.  We rely on this being implemented correctly.
> +		 *
> +		 * These IOCTLs are permitted for files which are opened with
> +		 * LANDLOCK_ACCESS_FS_READ_FILE or
> +		 * LANDLOCK_ACCESS_FS_WRITE_FILE.
> +		 */
> +		fallthrough;
> +	case FS_IOC_RESVSP:
> +	case FS_IOC_RESVSP64:
> +	case FS_IOC_UNRESVSP:
> +	case FS_IOC_UNRESVSP64:
> +	case FS_IOC_ZERO_RANGE:
> +		/*
> +		 * These IOCTLs reserve space, or create holes like
> +		 * fallocate(2).  We rely on the implementations checking the
> +		 * files' read/write modes.
> +		 *
> +		 * These IOCTLs are permitted for files which are opened with
> +		 * LANDLOCK_ACCESS_FS_READ_FILE or
> +		 * LANDLOCK_ACCESS_FS_WRITE_FILE.
> +		 */
> +		return LANDLOCK_ACCESS_FS_IOCTL_RW_FILE;
> +	default:
> +		/*
> +		 * Other commands are guarded by the catch-all access right.
> +		 */
> +		return LANDLOCK_ACCESS_FS_IOCTL;
> +	}

Good documentation and better grouping!

> +}
> +
> +/**
> + * expand_ioctl() - Return the dst flags from either the src flag or the
> + * %LANDLOCK_ACCESS_FS_IOCTL flag, depending on whether the
> + * %LANDLOCK_ACCESS_FS_IOCTL and src access rights are handled or not.
> + *
> + * @handled: Handled access rights.
> + * @access: The access mask to copy values from.
> + * @src: A single access right to copy from in @access.
> + * @dst: One or more access rights to copy to.
> + *
> + * Returns: @dst, or 0.
> + */
> +static __attribute_const__ access_mask_t
> +expand_ioctl(const access_mask_t handled, const access_mask_t access,
> +	     const access_mask_t src, const access_mask_t dst)
> +{
> +	access_mask_t copy_from;
> +
> +	if (!(handled & LANDLOCK_ACCESS_FS_IOCTL))
> +		return 0;
> +
> +	copy_from = (handled & src) ? src : LANDLOCK_ACCESS_FS_IOCTL;
> +	if (access & copy_from)
> +		return dst;
> +
> +	return 0;
> +}
> +
> +/**
> + * landlock_expand_access_fs() - Returns @access with the synthetic IOCTL group
> + * flags enabled if necessary.
> + *
> + * @handled: Handled FS access rights.
> + * @access: FS access rights to expand.
> + *
> + * Returns: @access expanded by the necessary flags for the synthetic IOCTL
> + * access rights.
> + */
> +static __attribute_const__ access_mask_t landlock_expand_access_fs(
> +	const access_mask_t handled, const access_mask_t access)
> +{
> +	return access |
> +	       expand_ioctl(handled, access, LANDLOCK_ACCESS_FS_WRITE_FILE,
> +			    LANDLOCK_ACCESS_FS_IOCTL_RW |
> +				    LANDLOCK_ACCESS_FS_IOCTL_RW_FILE) |
> +	       expand_ioctl(handled, access, LANDLOCK_ACCESS_FS_READ_FILE,
> +			    LANDLOCK_ACCESS_FS_IOCTL_RW |
> +				    LANDLOCK_ACCESS_FS_IOCTL_RW_FILE) |
> +	       expand_ioctl(handled, access, LANDLOCK_ACCESS_FS_READ_DIR,
> +			    LANDLOCK_ACCESS_FS_IOCTL_RW);
> +}
> +
> +/**
> + * landlock_expand_handled_access_fs() - add synthetic IOCTL access rights to an
> + * access mask of handled accesses.
> + *
> + * @handled: The handled accesses of a ruleset that is being created.
> + *
> + * Returns: @handled, with the bits for the synthetic IOCTL access rights set,
> + * if %LANDLOCK_ACCESS_FS_IOCTL is handled.
> + */
> +__attribute_const__ access_mask_t
> +landlock_expand_handled_access_fs(const access_mask_t handled)
> +{
> +	return landlock_expand_access_fs(handled, handled);
> +}
> +
>  /* Ruleset management */
>  
>  static struct landlock_object *get_inode_object(struct inode *const inode)
> @@ -148,7 +331,8 @@ static struct landlock_object *get_inode_object(struct inode *const inode)
>  	LANDLOCK_ACCESS_FS_EXECUTE | \
>  	LANDLOCK_ACCESS_FS_WRITE_FILE | \
>  	LANDLOCK_ACCESS_FS_READ_FILE | \
> -	LANDLOCK_ACCESS_FS_TRUNCATE)
> +	LANDLOCK_ACCESS_FS_TRUNCATE | \
> +	LANDLOCK_ACCESS_FS_IOCTL)
>  /* clang-format on */
>  
>  /*
> @@ -158,6 +342,7 @@ int landlock_append_fs_rule(struct landlock_ruleset *const ruleset,
>  			    const struct path *const path,
>  			    access_mask_t access_rights)
>  {
> +	access_mask_t handled;
>  	int err;
>  	struct landlock_id id = {
>  		.type = LANDLOCK_KEY_INODE,
> @@ -170,9 +355,11 @@ int landlock_append_fs_rule(struct landlock_ruleset *const ruleset,
>  	if (WARN_ON_ONCE(ruleset->num_layers != 1))
>  		return -EINVAL;
>  
> +	handled = landlock_get_fs_access_mask(ruleset, 0);
> +	/* Expands the synthetic IOCTL groups. */
> +	access_rights |= landlock_expand_access_fs(handled, access_rights);
>  	/* Transforms relative access rights to absolute ones. */
> -	access_rights |= LANDLOCK_MASK_ACCESS_FS &
> -			 ~landlock_get_fs_access_mask(ruleset, 0);
> +	access_rights |= LANDLOCK_MASK_ACCESS_FS & ~handled;
>  	id.key.object = get_inode_object(d_backing_inode(path->dentry));
>  	if (IS_ERR(id.key.object))
>  		return PTR_ERR(id.key.object);
> @@ -1333,7 +1520,9 @@ static int hook_file_open(struct file *const file)
>  {
>  	layer_mask_t layer_masks[LANDLOCK_NUM_ACCESS_FS] = {};
>  	access_mask_t open_access_request, full_access_request, allowed_access;
> -	const access_mask_t optional_access = LANDLOCK_ACCESS_FS_TRUNCATE;
> +	const access_mask_t optional_access = LANDLOCK_ACCESS_FS_TRUNCATE |
> +					      LANDLOCK_ACCESS_FS_IOCTL |
> +					      IOCTL_GROUPS;
>  	const struct landlock_ruleset *const dom = get_current_fs_domain();
>  
>  	if (!dom)
> @@ -1375,6 +1564,16 @@ static int hook_file_open(struct file *const file)
>  		}
>  	}
>  
> +	/*
> +	 * Named pipes should be treated just like anonymous pipes.
> +	 * Therefore, we permit all IOCTLs on them.
> +	 */
> +	if (S_ISFIFO(file_inode(file)->i_mode)) {
> +		allowed_access |= LANDLOCK_ACCESS_FS_IOCTL |
> +				  LANDLOCK_ACCESS_FS_IOCTL_RW |
> +				  LANDLOCK_ACCESS_FS_IOCTL_RW_FILE;
> +	}

This should not be required, cf. other thread.
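
For readers following the expansion logic in the hunk further up, here is
a userspace model of expand_ioctl(), landlock_expand_access_fs() and the
check in hook_file_ioctl(), with a few worked cases.  It is a sketch, not
the kernel code: the macro names are shortened for this example, and the
"relative to absolute" step from landlock_append_fs_rule() is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t access_mask_t;

/* Public rights; bit positions as in include/uapi/linux/landlock.h. */
#define ACCESS_FS_WRITE_FILE	(1U << 1)
#define ACCESS_FS_READ_FILE	(1U << 2)
#define ACCESS_FS_READ_DIR	(1U << 3)
#define ACCESS_FS_IOCTL		(1U << 15)
/* Synthetic rights from the patch: the two bits above the last public one. */
#define ACCESS_FS_IOCTL_RW	(ACCESS_FS_IOCTL << 1)
#define ACCESS_FS_IOCTL_RW_FILE	(ACCESS_FS_IOCTL << 2)
#define BOTH_GROUPS		(ACCESS_FS_IOCTL_RW | ACCESS_FS_IOCTL_RW_FILE)

/* Same logic as expand_ioctl() in the quoted hunk. */
static access_mask_t expand_ioctl(access_mask_t handled, access_mask_t access,
				  access_mask_t src, access_mask_t dst)
{
	access_mask_t copy_from;

	if (!(handled & ACCESS_FS_IOCTL))
		return 0;
	copy_from = (handled & src) ? src : ACCESS_FS_IOCTL;
	return (access & copy_from) ? dst : 0;
}

/* Same logic as landlock_expand_access_fs() in the quoted hunk. */
static access_mask_t expand_access_fs(access_mask_t handled,
				      access_mask_t access)
{
	return access |
	       expand_ioctl(handled, access, ACCESS_FS_WRITE_FILE, BOTH_GROUPS) |
	       expand_ioctl(handled, access, ACCESS_FS_READ_FILE, BOTH_GROUPS) |
	       expand_ioctl(handled, access, ACCESS_FS_READ_DIR,
			    ACCESS_FS_IOCTL_RW);
}

/* Same comparison as in hook_file_ioctl(). */
static bool ioctl_allowed(access_mask_t allowed, access_mask_t required)
{
	return (allowed & required) == required;
}

/* Worked cases; FIONREAD requires ACCESS_FS_IOCTL_RW in the patch. */
static void check_examples(void)
{
	const access_mask_t all = ACCESS_FS_IOCTL | ACCESS_FS_READ_FILE |
				  ACCESS_FS_WRITE_FILE | ACCESS_FS_READ_DIR;

	/* Only IOCTL handled: granting it unlocks both groups as well. */
	assert(expand_access_fs(ACCESS_FS_IOCTL, ACCESS_FS_IOCTL) ==
	       (ACCESS_FS_IOCTL | BOTH_GROUPS));
	/* All handled: a READ_FILE-only rule permits FIONREAD ... */
	assert(ioctl_allowed(expand_access_fs(all, ACCESS_FS_READ_FILE),
			     ACCESS_FS_IOCTL_RW));
	/* ... but an IOCTL-only rule no longer does. */
	assert(!ioctl_allowed(expand_access_fs(all, ACCESS_FS_IOCTL),
			      ACCESS_FS_IOCTL_RW));
	/* READ_DIR unlocks the RW group only, not the RW_FILE group. */
	assert(!ioctl_allowed(expand_access_fs(all, ACCESS_FS_READ_DIR),
			      ACCESS_FS_IOCTL_RW_FILE));
}
```

The third case is the one that tends to surprise: once the read/write
rights are handled, LANDLOCK_ACCESS_FS_IOCTL alone no longer covers the
commands in the two groups.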

> +
>  	/*
>  	 * For operations on already opened files (i.e. ftruncate()), it is the
>  	 * access rights at the time of open() which decide whether the
> @@ -1406,6 +1605,25 @@ static int hook_file_truncate(struct file *const file)
>  	return -EACCES;
>  }
>  
> +static int hook_file_ioctl(struct file *file, unsigned int cmd,
> +			   unsigned long arg)
> +{
> +	const access_mask_t required_access = get_required_ioctl_access(cmd);
> +	const access_mask_t allowed_access =
> +		landlock_file(file)->allowed_access;
> +
> +	/*
> +	 * It is the access rights at the time of opening the file which
> +	 * determine whether IOCTL can be used on the opened file later.
> +	 *
> +	 * The access right is attached to the opened file in hook_file_open().
> +	 */
> +	if ((allowed_access & required_access) == required_access)
> +		return 0;
> +
> +	return -EACCES;
> +}
> +
>  static struct security_hook_list landlock_hooks[] __ro_after_init = {
>  	LSM_HOOK_INIT(inode_free_security, hook_inode_free_security),
>  
> @@ -1428,6 +1646,7 @@ static struct security_hook_list landlock_hooks[] __ro_after_init = {
>  	LSM_HOOK_INIT(file_alloc_security, hook_file_alloc_security),
>  	LSM_HOOK_INIT(file_open, hook_file_open),
>  	LSM_HOOK_INIT(file_truncate, hook_file_truncate),
> +	LSM_HOOK_INIT(file_ioctl, hook_file_ioctl),
>  };
>  
>  __init void landlock_add_fs_hooks(void)
> diff --git a/security/landlock/fs.h b/security/landlock/fs.h
> index 488e4813680a..086576b8386b 100644
> --- a/security/landlock/fs.h
> +++ b/security/landlock/fs.h
> @@ -92,4 +92,7 @@ int landlock_append_fs_rule(struct landlock_ruleset *const ruleset,
>  			    const struct path *const path,
>  			    access_mask_t access_hierarchy);
>  
> +__attribute_const__ access_mask_t
> +landlock_expand_handled_access_fs(const access_mask_t handled);
> +
>  #endif /* _SECURITY_LANDLOCK_FS_H */
> diff --git a/security/landlock/limits.h b/security/landlock/limits.h
> index 93c9c6f91556..ecbdc8bbf906 100644
> --- a/security/landlock/limits.h
> +++ b/security/landlock/limits.h
> @@ -18,7 +18,16 @@
>  #define LANDLOCK_MAX_NUM_LAYERS		16
>  #define LANDLOCK_MAX_NUM_RULES		U32_MAX
>  
> -#define LANDLOCK_LAST_ACCESS_FS		LANDLOCK_ACCESS_FS_TRUNCATE
> +/*
> + * For file system access rights, Landlock distinguishes between the publicly
> + * visible access rights (1 to LANDLOCK_LAST_PUBLIC_ACCESS_FS) and the private
> + * ones which are not exposed to userspace (LANDLOCK_LAST_PUBLIC_ACCESS_FS + 1
> + * to LANDLOCK_LAST_ACCESS_FS).  The private access rights are defined in fs.c.
> + */
> +#define LANDLOCK_LAST_PUBLIC_ACCESS_FS	LANDLOCK_ACCESS_FS_IOCTL
> +#define LANDLOCK_MASK_PUBLIC_ACCESS_FS	((LANDLOCK_LAST_PUBLIC_ACCESS_FS << 1) - 1)
> +
> +#define LANDLOCK_LAST_ACCESS_FS		(LANDLOCK_LAST_PUBLIC_ACCESS_FS << 2)
>  #define LANDLOCK_MASK_ACCESS_FS		((LANDLOCK_LAST_ACCESS_FS << 1) - 1)
>  #define LANDLOCK_NUM_ACCESS_FS		__const_hweight64(LANDLOCK_MASK_ACCESS_FS)
>  #define LANDLOCK_SHIFT_ACCESS_FS	0
> diff --git a/security/landlock/ruleset.h b/security/landlock/ruleset.h
> index c7f1526784fd..5a28ea8e1c3d 100644
> --- a/security/landlock/ruleset.h
> +++ b/security/landlock/ruleset.h
> @@ -30,7 +30,7 @@
>  	LANDLOCK_ACCESS_FS_REFER)
>  /* clang-format on */
>  
> -typedef u16 access_mask_t;
> +typedef u32 access_mask_t;
>  /* Makes sure all filesystem access rights can be stored. */
>  static_assert(BITS_PER_TYPE(access_mask_t) >= LANDLOCK_NUM_ACCESS_FS);
>  /* Makes sure all network access rights can be stored. */
> diff --git a/security/landlock/syscalls.c b/security/landlock/syscalls.c
> index 898358f57fa0..f0bc50003b46 100644
> --- a/security/landlock/syscalls.c
> +++ b/security/landlock/syscalls.c
> @@ -137,7 +137,7 @@ static const struct file_operations ruleset_fops = {
>  	.write = fop_dummy_write,
>  };
>  
> -#define LANDLOCK_ABI_VERSION 4
> +#define LANDLOCK_ABI_VERSION 5
>  
>  /**
>   * sys_landlock_create_ruleset - Create a new ruleset
> @@ -192,8 +192,8 @@ SYSCALL_DEFINE3(landlock_create_ruleset,
>  		return err;
>  
>  	/* Checks content (and 32-bits cast). */
> -	if ((ruleset_attr.handled_access_fs | LANDLOCK_MASK_ACCESS_FS) !=
> -	    LANDLOCK_MASK_ACCESS_FS)
> +	if ((ruleset_attr.handled_access_fs | LANDLOCK_MASK_PUBLIC_ACCESS_FS) !=
> +	    LANDLOCK_MASK_PUBLIC_ACCESS_FS)
>  		return -EINVAL;
>  
>  	/* Checks network content (and 32-bits cast). */
> @@ -201,6 +201,10 @@ SYSCALL_DEFINE3(landlock_create_ruleset,
>  	    LANDLOCK_MASK_ACCESS_NET)
>  		return -EINVAL;
>  
> +	/* Expands synthetic IOCTL groups. */
> +	ruleset_attr.handled_access_fs = landlock_expand_handled_access_fs(
> +		ruleset_attr.handled_access_fs);
> +
>  	/* Checks arguments and transforms to kernel struct. */
>  	ruleset = landlock_create_ruleset(ruleset_attr.handled_access_fs,
>  					  ruleset_attr.handled_access_net);
> @@ -309,8 +313,13 @@ static int add_rule_path_beneath(struct landlock_ruleset *const ruleset,
>  	if (!path_beneath_attr.allowed_access)
>  		return -ENOMSG;
>  
> -	/* Checks that allowed_access matches the @ruleset constraints. */
> -	mask = landlock_get_raw_fs_access_mask(ruleset, 0);
> +	/*
> +	 * Checks that allowed_access matches the @ruleset constraints and only
> +	 * consists of publicly visible access rights (as opposed to synthetic
> +	 * ones).
> +	 */
> +	mask = landlock_get_raw_fs_access_mask(ruleset, 0) &
> +	       LANDLOCK_MASK_PUBLIC_ACCESS_FS;
>  	if ((path_beneath_attr.allowed_access | mask) != mask)
>  		return -EINVAL;
>  
> diff --git a/tools/testing/selftests/landlock/base_test.c b/tools/testing/selftests/landlock/base_test.c
> index 646f778dfb1e..d292b419ccba 100644
> --- a/tools/testing/selftests/landlock/base_test.c
> +++ b/tools/testing/selftests/landlock/base_test.c
> @@ -75,7 +75,7 @@ TEST(abi_version)
>  	const struct landlock_ruleset_attr ruleset_attr = {
>  		.handled_access_fs = LANDLOCK_ACCESS_FS_READ_FILE,
>  	};
> -	ASSERT_EQ(4, landlock_create_ruleset(NULL, 0,
> +	ASSERT_EQ(5, landlock_create_ruleset(NULL, 0,
>  					     LANDLOCK_CREATE_RULESET_VERSION));
>  
>  	ASSERT_EQ(-1, landlock_create_ruleset(&ruleset_attr, 0,
> diff --git a/tools/testing/selftests/landlock/fs_test.c b/tools/testing/selftests/landlock/fs_test.c
> index 2d6d9b43d958..3203f4a5bc85 100644
> --- a/tools/testing/selftests/landlock/fs_test.c
> +++ b/tools/testing/selftests/landlock/fs_test.c
> @@ -527,9 +527,10 @@ TEST_F_FORK(layout1, inval)
>  	LANDLOCK_ACCESS_FS_EXECUTE | \
>  	LANDLOCK_ACCESS_FS_WRITE_FILE | \
>  	LANDLOCK_ACCESS_FS_READ_FILE | \
> -	LANDLOCK_ACCESS_FS_TRUNCATE)
> +	LANDLOCK_ACCESS_FS_TRUNCATE | \
> +	LANDLOCK_ACCESS_FS_IOCTL)
>  
> -#define ACCESS_LAST LANDLOCK_ACCESS_FS_TRUNCATE
> +#define ACCESS_LAST LANDLOCK_ACCESS_FS_IOCTL
>  
>  #define ACCESS_ALL ( \
>  	ACCESS_FILE | \
> -- 
> 2.43.0.687.g38aa6559b0-goog
> 
> 

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v9 1/8] landlock: Add IOCTL access right
  2024-02-16 15:51       ` Mickaël Salaün
@ 2024-02-18  8:34         ` Günther Noack
  2024-02-19 21:44           ` Günther Noack
  0 siblings, 1 reply; 50+ messages in thread
From: Günther Noack @ 2024-02-18  8:34 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: Günther Noack, linux-security-module, Jeff Xu,
	Arnd Bergmann, Christian Brauner, Jorge Lucangeli Obes,
	Allen Webb, Dmitry Torokhov, Paul Moore, Konstantin Meskhidze,
	Matt Bobrowski, linux-fsdevel, Michael Kerrisk

On Fri, Feb 16, 2024 at 04:51:40PM +0100, Mickaël Salaün wrote:
> On Fri, Feb 16, 2024 at 03:11:18PM +0100, Mickaël Salaün wrote:
> > What about /proc/*/fd/* ? We can test with open_proc_fd() to make sure
> > our assumptions are correct.
> 
> Actually, these fifo and socket checks (and related optimizations)
> should already be handled with is_nouser_or_private() called by
> is_access_to_paths_allowed(). Some new dedicated tests should help
> though.

I am generally a bit confused about how opening /proc/*/fd/* works.

Specifically:

* Do we have to worry about the scenario where the file_open hook gets
  called with the same struct file* twice (overwriting the access
  rights)?

* I had trouble finding the place in fs/proc/ where the re-opening is
  implemented.

Do you happen to understand this in more detail?  At what point do the
re-opened files start sharing the same kernel objects?  Is that at the
inode level?

The documentation I consulted unfortunately did not explain it either:

* The man page (proc_pid_fd(5), or previously proc(5)) does not
  discuss the behavior on open() much, apart from using it in some
  examples.

* Michael Kerrisk's "Linux Programming Interface" book claims that the
  behaviour of opening /dev/fd/1 is like doing dup(1) (section 5.11)
  -- that is true on other UNIXes, but on Linux the resulting file
  descriptors do not share the same struct file* apparently.  This
  makes a difference for regular files, where the two FDs subsequently
  use two separate offsets into the file (f_pos).
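
To make the difference observable from userspace, here is a minimal
sketch.  It assumes a Linux system with procfs mounted on /proc and a
writable /tmp; the helper names (make_tmpfile, reopen_via_proc,
shares_offset) are made up for this illustration and are not part of
any existing API.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Creates an already-unlinked temporary regular file; returns an FD or -1. */
static int make_tmpfile(void)
{
	char path[] = "/tmp/procfd-demo-XXXXXX";
	int fd = mkstemp(path);

	if (fd >= 0)
		unlink(path);
	return fd;
}

/* Re-opens fd through its procfs link; returns a new FD or -1. */
static int reopen_via_proc(int fd)
{
	char link[64];
	int n = snprintf(link, sizeof(link), "/proc/self/fd/%d", fd);

	assert(n > 0 && (size_t)n < sizeof(link));
	return open(link, O_RDONLY);
}

/*
 * Returns 1 if a seek on fd is visible through other (shared f_pos),
 * 0 if the two offsets are independent, -1 on error.
 */
static int shares_offset(int fd, int other)
{
	if (lseek(fd, 0, SEEK_SET) < 0 || lseek(other, 0, SEEK_SET) < 0)
		return -1;
	if (lseek(fd, 3, SEEK_SET) != 3)
		return -1;
	return lseek(other, 0, SEEK_CUR) == 3;
}
```

With fd = make_tmpfile(), I would expect shares_offset(fd, dup(fd)) to
report a shared offset, and shares_offset(fd, reopen_via_proc(fd)) to
report independent offsets.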

Thanks,
–Günther

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v9 1/8] landlock: Add IOCTL access right
  2024-02-09 17:06 ` [PATCH v9 1/8] landlock: Add IOCTL access right Günther Noack
                     ` (2 preceding siblings ...)
  2024-02-16 17:19   ` Mickaël Salaün
@ 2024-02-19 18:34   ` Mickaël Salaün
  2024-02-19 18:35     ` [RFC PATCH] fs: Add vfs_masks_device_ioctl*() helpers Mickaël Salaün
  2024-02-28 12:57     ` [PATCH v9 1/8] landlock: Add IOCTL access right Günther Noack
  3 siblings, 2 replies; 50+ messages in thread
From: Mickaël Salaün @ 2024-02-19 18:34 UTC (permalink / raw)
  To: Günther Noack, Arnd Bergmann, Christian Brauner
  Cc: linux-security-module, Jeff Xu, Jorge Lucangeli Obes, Allen Webb,
	Dmitry Torokhov, Paul Moore, Konstantin Meskhidze,
	Matt Bobrowski, linux-fsdevel

Arnd, Christian, please take a look at the following RFC patch and the
rationale explained here.

On Fri, Feb 09, 2024 at 06:06:05PM +0100, Günther Noack wrote:
> Introduces the LANDLOCK_ACCESS_FS_IOCTL access right
> and increments the Landlock ABI version to 5.
> 
> Like the truncate right, these rights are associated with a file
> descriptor at the time of open(2), and get respected even when the
> file descriptor is used outside of the thread which it was originally
> opened in.
> 
> A newly enabled Landlock policy therefore does not apply to file
> descriptors which are already open.
> 
> If the LANDLOCK_ACCESS_FS_IOCTL right is handled, only a small number
> of safe IOCTL commands will be permitted on newly opened files.  The
> permitted IOCTLs can be configured through the ruleset in limited ways
> now.  (See documentation for details.)
> 
> Specifically, when LANDLOCK_ACCESS_FS_IOCTL is handled, granting this
> right on a file or directory will *not* permit to do all IOCTL
> commands, but only influence the IOCTL commands which are not already
> handled through other access rights.  The intent is to keep the groups
> of IOCTL commands more fine-grained.
> 
> Noteworthy scenarios which require special attention:
> 
> TTY devices are often passed into a process from the parent process,
> and so a newly enabled Landlock policy does not retroactively apply to
> them automatically.  In the past, TTY devices have often supported
> IOCTL commands like TIOCSTI and some TIOCLINUX subcommands, which were
> letting callers control the TTY input buffer (and simulate
> keypresses).  This should be restricted to CAP_SYS_ADMIN programs on
> modern kernels though.
> 
> Some legitimate file system features, like setting up fscrypt, are
> exposed as IOCTL commands on regular files and directories -- users of
> Landlock are advised to double check that the sandboxed process does
> not need to invoke these IOCTLs.

I think we really need to allow fscrypt and fs-verity IOCTLs.

> 
> Known limitations:
> 
> The LANDLOCK_ACCESS_FS_IOCTL access right is a coarse-grained control
> over IOCTL commands.  Future work will enable a more fine-grained
> access control for IOCTLs.
> 
> In the meantime, Landlock users may use path-based restrictions in
> combination with their knowledge about the file system layout to
> control what IOCTLs can be done.  Mounting file systems with the nodev
> option can help to distinguish regular files and devices, and give
> guarantees about the affected files, which Landlock alone can not give
> yet.

I had second thoughts about our current approach, and it looks like we
can make it simpler and more generic, with less handling that is
specific to individual IOCTL commands.

What we didn't take into account is that an IOCTL needs an opened file,
which means that the caller must already have been allowed to open this
file in read or write mode.

I think most FS-specific IOCTL commands check access rights (i.e. access
mode or required capability), other than implicit ones (at least read or
write), when appropriate.  We don't get such a guarantee with device
drivers.

The main threat is IOCTLs on character or block devices because their
impact may be unknown (if we only look at the IOCTL command, not the
backing file), but we should allow IOCTLs on filesystems (e.g. fscrypt,
fs-verity, clone extents).  I think we should only implement a
LANDLOCK_ACCESS_FS_IOCTL_DEV right, which would be more explicit.  This
change would impact the IOCTLs grouping (not required anymore), but
we'll still need the list of VFS IOCTLs.


> 
> Signed-off-by: Günther Noack <gnoack@google.com>
> ---
>  include/uapi/linux/landlock.h                |  55 ++++-
>  security/landlock/fs.c                       | 227 ++++++++++++++++++-
>  security/landlock/fs.h                       |   3 +
>  security/landlock/limits.h                   |  11 +-
>  security/landlock/ruleset.h                  |   2 +-
>  security/landlock/syscalls.c                 |  19 +-
>  tools/testing/selftests/landlock/base_test.c |   2 +-
>  tools/testing/selftests/landlock/fs_test.c   |   5 +-
>  8 files changed, 302 insertions(+), 22 deletions(-)

> diff --git a/security/landlock/fs.c b/security/landlock/fs.c
> index 73997e63734f..84efea3f7c0f 100644
> --- a/security/landlock/fs.c
> +++ b/security/landlock/fs.c

> @@ -84,6 +87,186 @@ static const struct landlock_object_underops landlock_fs_underops = {
>  	.release = release_inode
>  };
>  
> +/* IOCTL helpers */
> +
> +/*
> + * These are synthetic access rights, which are only used within the kernel, but
> + * not exposed to callers in userspace.  The mapping between these access rights
> + * and IOCTL commands is defined in the get_required_ioctl_access() helper function.
> + */
> +#define LANDLOCK_ACCESS_FS_IOCTL_RW (LANDLOCK_LAST_PUBLIC_ACCESS_FS << 1)
> +#define LANDLOCK_ACCESS_FS_IOCTL_RW_FILE (LANDLOCK_LAST_PUBLIC_ACCESS_FS << 2)
> +
> +/* ioctl_groups - all synthetic access rights for IOCTL command groups */
> +/* clang-format off */
> +#define IOCTL_GROUPS (				\
> +	LANDLOCK_ACCESS_FS_IOCTL_RW |		\
> +	LANDLOCK_ACCESS_FS_IOCTL_RW_FILE)
> +/* clang-format on */
> +
> +static_assert((IOCTL_GROUPS & LANDLOCK_MASK_ACCESS_FS) == IOCTL_GROUPS);
> +
> +/**
> + * get_required_ioctl_access(): Determine required IOCTL access rights.
> + *
> + * @cmd: The IOCTL command that is supposed to be run.
> + *
> + * Any new IOCTL commands that are implemented in fs/ioctl.c's do_vfs_ioctl()
> + * should be considered for inclusion here.
> + *
> + * Returns: The access rights that must be granted on an opened file in order to
> + * use the given @cmd.
> + */
> +static __attribute_const__ access_mask_t
> +get_required_ioctl_access(const unsigned int cmd)
> +{
> +	switch (cmd) {
> +	case FIOCLEX:
> +	case FIONCLEX:
> +	case FIONBIO:
> +	case FIOASYNC:
> +		/*
> +		 * FIOCLEX, FIONCLEX, FIONBIO and FIOASYNC manipulate the FD's
> +		 * close-on-exec and the file's non-blocking-IO and async flags.
> +		 * These operations are also available through fcntl(2), and are
> +		 * unconditionally permitted in Landlock.
> +		 */
> +		return 0;
> +	case FIONREAD:
> +	case FIOQSIZE:
> +	case FIGETBSZ:
> +		/*
> +		 * FIONREAD returns the number of bytes available for reading.
> +		 * FIONREAD returns the number of immediately readable bytes for
> +		 * a file.
> +		 *
> +		 * FIOQSIZE queries the size of a file or directory.
> +		 *
> +		 * FIGETBSZ queries the file system's block size for a file or
> +		 * directory.
> +		 *
> +		 * These IOCTL commands are permitted for files which are opened
> +		 * with LANDLOCK_ACCESS_FS_READ_DIR,
> +		 * LANDLOCK_ACCESS_FS_READ_FILE, or
> +		 * LANDLOCK_ACCESS_FS_WRITE_FILE.
> +		 */

Because files or directories can only be opened with
LANDLOCK_ACCESS_FS_{READ,WRITE}_{FILE,DIR}, and because IOCTLs can only
be sent on a file descriptor, this means that we can always allow these
3 commands (for opened files).

> +		return LANDLOCK_ACCESS_FS_IOCTL_RW;
> +	case FS_IOC_FIEMAP:
> +	case FIBMAP:
> +		/*
> +		 * FS_IOC_FIEMAP and FIBMAP query information about the
> +		 * allocation of blocks within a file.  They are permitted for
> +		 * files which are opened with LANDLOCK_ACCESS_FS_READ_FILE or
> +		 * LANDLOCK_ACCESS_FS_WRITE_FILE.
> +		 */
> +		fallthrough;
> +	case FIDEDUPERANGE:
> +	case FICLONE:
> +	case FICLONERANGE:
> +		/*
> +		 * FIDEDUPERANGE, FICLONE and FICLONERANGE make files share
> +		 * their underlying storage ("reflink") between source and
> +		 * destination FDs, on file systems which support that.
> +		 *
> +		 * The underlying implementations are already checking whether
> +		 * the involved files are opened with the appropriate read/write
> +		 * modes.  We rely on this being implemented correctly.
> +		 *
> +		 * These IOCTLs are permitted for files which are opened with
> +		 * LANDLOCK_ACCESS_FS_READ_FILE or
> +		 * LANDLOCK_ACCESS_FS_WRITE_FILE.
> +		 */
> +		fallthrough;
> +	case FS_IOC_RESVSP:
> +	case FS_IOC_RESVSP64:
> +	case FS_IOC_UNRESVSP:
> +	case FS_IOC_UNRESVSP64:
> +	case FS_IOC_ZERO_RANGE:
> +		/*
> +		 * These IOCTLs reserve space, or create holes like
> +		 * fallocate(2).  We rely on the implementations checking the
> +		 * files' read/write modes.
> +		 *
> +		 * These IOCTLs are permitted for files which are opened with
> +		 * LANDLOCK_ACCESS_FS_READ_FILE or
> +		 * LANDLOCK_ACCESS_FS_WRITE_FILE.
> +		 */

These 10 commands only make sense on regular files, so we could also
always allow them on file descriptors.

> +		return LANDLOCK_ACCESS_FS_IOCTL_RW_FILE;
> +	default:
> +		/*
> +		 * Other commands are guarded by the catch-all access right.
> +		 */
> +		return LANDLOCK_ACCESS_FS_IOCTL;
> +	}
> +}
> +
> +/**
> + * expand_ioctl() - Return the dst flags from either the src flag or the
> + * %LANDLOCK_ACCESS_FS_IOCTL flag, depending on whether the
> + * %LANDLOCK_ACCESS_FS_IOCTL and src access rights are handled or not.
> + *
> + * @handled: Handled access rights.
> + * @access: The access mask to copy values from.
> + * @src: A single access right to copy from in @access.
> + * @dst: One or more access rights to copy to.
> + *
> + * Returns: @dst, or 0.
> + */
> +static __attribute_const__ access_mask_t
> +expand_ioctl(const access_mask_t handled, const access_mask_t access,
> +	     const access_mask_t src, const access_mask_t dst)
> +{
> +	access_mask_t copy_from;
> +
> +	if (!(handled & LANDLOCK_ACCESS_FS_IOCTL))
> +		return 0;
> +
> +	copy_from = (handled & src) ? src : LANDLOCK_ACCESS_FS_IOCTL;
> +	if (access & copy_from)
> +		return dst;
> +
> +	return 0;
> +}
> +
> +/**
> + * landlock_expand_access_fs() - Returns @access with the synthetic IOCTL group
> + * flags enabled if necessary.
> + *
> + * @handled: Handled FS access rights.
> + * @access: FS access rights to expand.
> + *
> + * Returns: @access expanded by the necessary flags for the synthetic IOCTL
> + * access rights.
> + */
> +static __attribute_const__ access_mask_t landlock_expand_access_fs(
> +	const access_mask_t handled, const access_mask_t access)
> +{
> +	return access |
> +	       expand_ioctl(handled, access, LANDLOCK_ACCESS_FS_WRITE_FILE,
> +			    LANDLOCK_ACCESS_FS_IOCTL_RW |
> +				    LANDLOCK_ACCESS_FS_IOCTL_RW_FILE) |
> +	       expand_ioctl(handled, access, LANDLOCK_ACCESS_FS_READ_FILE,
> +			    LANDLOCK_ACCESS_FS_IOCTL_RW |
> +				    LANDLOCK_ACCESS_FS_IOCTL_RW_FILE) |
> +	       expand_ioctl(handled, access, LANDLOCK_ACCESS_FS_READ_DIR,
> +			    LANDLOCK_ACCESS_FS_IOCTL_RW);
> +}
> +
> +/**
> + * landlock_expand_handled_access_fs() - add synthetic IOCTL access rights to an
> + * access mask of handled accesses.
> + *
> + * @handled: The handled accesses of a ruleset that is being created.
> + *
> + * Returns: @handled, with the bits for the synthetic IOCTL access rights set,
> + * if %LANDLOCK_ACCESS_FS_IOCTL is handled.
> + */
> +__attribute_const__ access_mask_t
> +landlock_expand_handled_access_fs(const access_mask_t handled)
> +{
> +	return landlock_expand_access_fs(handled, handled);
> +}
> +
>  /* Ruleset management */
>  
>  static struct landlock_object *get_inode_object(struct inode *const inode)
> @@ -148,7 +331,8 @@ static struct landlock_object *get_inode_object(struct inode *const inode)
>  	LANDLOCK_ACCESS_FS_EXECUTE | \
>  	LANDLOCK_ACCESS_FS_WRITE_FILE | \
>  	LANDLOCK_ACCESS_FS_READ_FILE | \
> -	LANDLOCK_ACCESS_FS_TRUNCATE)
> +	LANDLOCK_ACCESS_FS_TRUNCATE | \
> +	LANDLOCK_ACCESS_FS_IOCTL)
>  /* clang-format on */
>  
>  /*
> @@ -158,6 +342,7 @@ int landlock_append_fs_rule(struct landlock_ruleset *const ruleset,
>  			    const struct path *const path,
>  			    access_mask_t access_rights)
>  {
> +	access_mask_t handled;
>  	int err;
>  	struct landlock_id id = {
>  		.type = LANDLOCK_KEY_INODE,
> @@ -170,9 +355,11 @@ int landlock_append_fs_rule(struct landlock_ruleset *const ruleset,
>  	if (WARN_ON_ONCE(ruleset->num_layers != 1))
>  		return -EINVAL;
>  
> +	handled = landlock_get_fs_access_mask(ruleset, 0);
> +	/* Expands the synthetic IOCTL groups. */
> +	access_rights |= landlock_expand_access_fs(handled, access_rights);
>  	/* Transforms relative access rights to absolute ones. */
> -	access_rights |= LANDLOCK_MASK_ACCESS_FS &
> -			 ~landlock_get_fs_access_mask(ruleset, 0);
> +	access_rights |= LANDLOCK_MASK_ACCESS_FS & ~handled;
>  	id.key.object = get_inode_object(d_backing_inode(path->dentry));
>  	if (IS_ERR(id.key.object))
>  		return PTR_ERR(id.key.object);
> @@ -1333,7 +1520,9 @@ static int hook_file_open(struct file *const file)
>  {
>  	layer_mask_t layer_masks[LANDLOCK_NUM_ACCESS_FS] = {};
>  	access_mask_t open_access_request, full_access_request, allowed_access;
> -	const access_mask_t optional_access = LANDLOCK_ACCESS_FS_TRUNCATE;
> +	const access_mask_t optional_access = LANDLOCK_ACCESS_FS_TRUNCATE |
> +					      LANDLOCK_ACCESS_FS_IOCTL |
> +					      IOCTL_GROUPS;
>  	const struct landlock_ruleset *const dom = get_current_fs_domain();
>  
>  	if (!dom)

We should set optional_access according to the file type before
`full_access_request = open_access_request | optional_access;`

const bool is_device = S_ISBLK(inode->i_mode) || S_ISCHR(inode->i_mode);

optional_access = LANDLOCK_ACCESS_FS_TRUNCATE;
if (is_device)
    optional_access |= LANDLOCK_ACCESS_FS_IOCTL_DEV;


Because LANDLOCK_ACCESS_FS_IOCTL_DEV is dedicated to character or block
devices, we may want landlock_add_rule() to only allow this access right
to be tied to directories, or character devices, or block devices.  Even
if it would be more consistent with constraints on directory-only access
rights, I'm not sure about that.


> @@ -1375,6 +1564,16 @@ static int hook_file_open(struct file *const file)
>  		}
>  	}
>  
> +	/*
> +	 * Named pipes should be treated just like anonymous pipes.
> +	 * Therefore, we permit all IOCTLs on them.
> +	 */
> +	if (S_ISFIFO(file_inode(file)->i_mode)) {
> +		allowed_access |= LANDLOCK_ACCESS_FS_IOCTL |
> +				  LANDLOCK_ACCESS_FS_IOCTL_RW |
> +				  LANDLOCK_ACCESS_FS_IOCTL_RW_FILE;
> +	}

Instead of this S_ISFIFO check:

if (!is_device)
    allowed_access |= LANDLOCK_ACCESS_FS_IOCTL_DEV;
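
To illustrate the two open-time snippets above, here is a userspace
model.  It is only a sketch: the DEMO_* bit values and the function
names are placeholders chosen for this example, and it does not
reproduce the real hook_file_open() logic around them.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <sys/stat.h>
#include <sys/types.h>

typedef uint32_t access_mask_t;

/* Placeholder bit values, for illustration only. */
#define DEMO_ACCESS_FS_TRUNCATE  ((access_mask_t)1 << 14)
#define DEMO_ACCESS_FS_IOCTL_DEV ((access_mask_t)1 << 15)

static_assert((DEMO_ACCESS_FS_TRUNCATE & DEMO_ACCESS_FS_IOCTL_DEV) == 0,
	      "access rights must use distinct bits");

static bool is_device(mode_t mode)
{
	return S_ISBLK(mode) || S_ISCHR(mode);
}

/* Rights that are evaluated at open() time but not required to open. */
static access_mask_t optional_access_for(mode_t mode)
{
	access_mask_t optional = DEMO_ACCESS_FS_TRUNCATE;

	if (is_device(mode))
		optional |= DEMO_ACCESS_FS_IOCTL_DEV;
	return optional;
}

/*
 * Non-device files always get the IOCTL_DEV bit, so that a later check
 * for this bit never blocks IOCTLs on them.
 */
static access_mask_t finalize_allowed(mode_t mode, access_mask_t allowed)
{
	if (!is_device(mode))
		allowed |= DEMO_ACCESS_FS_IOCTL_DEV;
	return allowed;
}
```

In this model, a FIFO or regular file ends up with the IOCTL_DEV bit
set regardless of the ruleset, whereas a character or block device only
has it if a rule granted it.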

> +
>  	/*
>  	 * For operations on already opened files (i.e. ftruncate()), it is the
>  	 * access rights at the time of open() which decide whether the
> @@ -1406,6 +1605,25 @@ static int hook_file_truncate(struct file *const file)
>  	return -EACCES;
>  }
>  
> +static int hook_file_ioctl(struct file *file, unsigned int cmd,
> +			   unsigned long arg)
> +{
> +	const access_mask_t required_access = get_required_ioctl_access(cmd);

const access_mask_t required_access = LANDLOCK_ACCESS_FS_IOCTL_DEV;


> +	const access_mask_t allowed_access =
> +		landlock_file(file)->allowed_access;
> +
> +	/*
> +	 * It is the access rights at the time of opening the file which
> +	 * determine whether IOCTL can be used on the opened file later.
> +	 *
> +	 * The access right is attached to the opened file in hook_file_open().
> +	 */
> +	if ((allowed_access & required_access) == required_access)
> +		return 0;

We could then check against do_vfs_ioctl()'s commands, excluding
FIONREAD and file_ioctl()'s commands, to always allow VFS-related
commands:

if (vfs_masked_device_ioctl(cmd))
    return 0;

As a safeguard, we could define vfs_masked_device_ioctl(cmd) in
fs/ioctl.c and have do_vfs_ioctl() call it, to make sure we keep an
accurate list of VFS IOCTL commands (see next RFC patch).
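
The resulting decision in the IOCTL hook could be modelled in userspace
roughly as follows.  This is a sketch under assumptions: the demo_*
names and the DEMO_ACCESS_FS_IOCTL_DEV bit value are invented for the
example, and the command list is a userspace copy that may not match
what the kernel helper ends up containing.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/fiemap.h>
#include <linux/fs.h>

typedef uint32_t access_mask_t;

/* Placeholder bit value, for illustration only. */
#define DEMO_ACCESS_FS_IOCTL_DEV ((access_mask_t)1 << 15)

/* Makes sure the demo right fits into the mask type. */
static_assert(sizeof(access_mask_t) * 8 > 15, "access_mask_t too small");

/* Commands handled by the VFS itself, never forwarded to a device. */
static bool demo_vfs_masked_device_ioctl(unsigned int cmd)
{
	switch (cmd) {
	case FIOCLEX:
	case FIONCLEX:
	case FIONBIO:
	case FIOASYNC:
	case FIOQSIZE:
	case FIFREEZE:
	case FITHAW:
	case FS_IOC_FIEMAP:
	case FIGETBSZ:
	case FICLONE:
	case FICLONERANGE:
	case FIDEDUPERANGE:
	case FS_IOC_GETFLAGS:
	case FS_IOC_SETFLAGS:
	case FS_IOC_FSGETXATTR:
	case FS_IOC_FSSETXATTR:
		return true;
	default:
		/* FIONREAD and driver-specific commands end up here. */
		return false;
	}
}

/* Returns 0 if the IOCTL would be permitted, -EACCES otherwise. */
static int demo_check_ioctl(access_mask_t allowed_access, unsigned int cmd)
{
	if (allowed_access & DEMO_ACCESS_FS_IOCTL_DEV)
		return 0;
	if (demo_vfs_masked_device_ioctl(cmd))
		return 0;
	return -EACCES;
}
```

In this model, a device FD opened without the IOCTL_DEV right can still
use FIOCLEX, but FIONREAD or TIOCSTI would be denied.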

The compat IOCTL hook must also be implemented.

What do you think? Any better idea?


> +
> +	return -EACCES;
> +}
> +
>  static struct security_hook_list landlock_hooks[] __ro_after_init = {
>  	LSM_HOOK_INIT(inode_free_security, hook_inode_free_security),
>  
> @@ -1428,6 +1646,7 @@ static struct security_hook_list landlock_hooks[] __ro_after_init = {

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [RFC PATCH] fs: Add vfs_masks_device_ioctl*() helpers
  2024-02-19 18:34   ` Mickaël Salaün
@ 2024-02-19 18:35     ` Mickaël Salaün
  2024-03-01 13:42       ` Mickaël Salaün
                         ` (2 more replies)
  2024-02-28 12:57     ` [PATCH v9 1/8] landlock: Add IOCTL access right Günther Noack
  1 sibling, 3 replies; 50+ messages in thread
From: Mickaël Salaün @ 2024-02-19 18:35 UTC (permalink / raw)
  To: Arnd Bergmann, Christian Brauner, Günther Noack
  Cc: Mickaël Salaün, Allen Webb, Dmitry Torokhov, Jeff Xu,
	Jorge Lucangeli Obes, Konstantin Meskhidze, Matt Bobrowski,
	Paul Moore, linux-fsdevel, linux-security-module

vfs_masked_device_ioctl() and vfs_masked_device_ioctl_compat() are useful
to differentiate between device driver IOCTL implementations and
filesystem ones.  The goal is to be able to filter well-defined IOCTLs
from per-device (i.e. namespaced) IOCTLs and control such access.

Add a new ioctl_compat() helper, similar to vfs_ioctl(), to wrap
compat_ioctl() calls and handle error conversions.

Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Günther Noack <gnoack@google.com>
---
 fs/ioctl.c         | 101 +++++++++++++++++++++++++++++++++++++++++----
 include/linux/fs.h |  12 ++++++
 2 files changed, 105 insertions(+), 8 deletions(-)

diff --git a/fs/ioctl.c b/fs/ioctl.c
index 76cf22ac97d7..f72c8da47d21 100644
--- a/fs/ioctl.c
+++ b/fs/ioctl.c
@@ -763,6 +763,38 @@ static int ioctl_fssetxattr(struct file *file, void __user *argp)
 	return err;
 }
 
+/*
+ * Safeguard to maintain a list of valid IOCTLs handled by do_vfs_ioctl()
+ * instead of def_blk_fops or def_chr_fops (see init_special_inode).
+ */
+__attribute_const__ bool vfs_masked_device_ioctl(const unsigned int cmd)
+{
+	switch (cmd) {
+	case FIOCLEX:
+	case FIONCLEX:
+	case FIONBIO:
+	case FIOASYNC:
+	case FIOQSIZE:
+	case FIFREEZE:
+	case FITHAW:
+	case FS_IOC_FIEMAP:
+	case FIGETBSZ:
+	case FICLONE:
+	case FICLONERANGE:
+	case FIDEDUPERANGE:
+	/* FIONREAD is forwarded to device implementations. */
+	case FS_IOC_GETFLAGS:
+	case FS_IOC_SETFLAGS:
+	case FS_IOC_FSGETXATTR:
+	case FS_IOC_FSSETXATTR:
+	/* file_ioctl()'s IOCTLs are forwarded to device implementations. */
+		return true;
+	default:
+		return false;
+	}
+}
+EXPORT_SYMBOL(vfs_masked_device_ioctl);
+
 /*
  * do_vfs_ioctl() is not for drivers and not intended to be EXPORT_SYMBOL()'d.
  * It's just a simple helper for sys_ioctl and compat_sys_ioctl.
@@ -858,6 +890,8 @@ SYSCALL_DEFINE3(ioctl, unsigned int, fd, unsigned int, cmd, unsigned long, arg)
 {
 	struct fd f = fdget(fd);
 	int error;
+	const struct inode *inode;
+	bool is_device;
 
 	if (!f.file)
 		return -EBADF;
@@ -866,9 +900,18 @@ SYSCALL_DEFINE3(ioctl, unsigned int, fd, unsigned int, cmd, unsigned long, arg)
 	if (error)
 		goto out;
 
+	inode = file_inode(f.file);
+	is_device = S_ISBLK(inode->i_mode) || S_ISCHR(inode->i_mode);
+	if (is_device && !vfs_masked_device_ioctl(cmd)) {
+		error = vfs_ioctl(f.file, cmd, arg);
+		goto out;
+	}
+
 	error = do_vfs_ioctl(f.file, fd, cmd, arg);
-	if (error == -ENOIOCTLCMD)
+	if (error == -ENOIOCTLCMD) {
+		WARN_ON_ONCE(is_device);
 		error = vfs_ioctl(f.file, cmd, arg);
+	}
 
 out:
 	fdput(f);
@@ -911,11 +954,49 @@ long compat_ptr_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 }
 EXPORT_SYMBOL(compat_ptr_ioctl);
 
+static long ioctl_compat(struct file *filp, unsigned int cmd,
+			 compat_ulong_t arg)
+{
+	int error = -ENOTTY;
+
+	if (!filp->f_op->compat_ioctl)
+		goto out;
+
+	error = filp->f_op->compat_ioctl(filp, cmd, arg);
+	if (error == -ENOIOCTLCMD)
+		error = -ENOTTY;
+
+out:
+	return error;
+}
+
+__attribute_const__ bool vfs_masked_device_ioctl_compat(const unsigned int cmd)
+{
+	switch (cmd) {
+	case FICLONE:
+#if defined(CONFIG_X86_64)
+	case FS_IOC_RESVSP_32:
+	case FS_IOC_RESVSP64_32:
+	case FS_IOC_UNRESVSP_32:
+	case FS_IOC_UNRESVSP64_32:
+	case FS_IOC_ZERO_RANGE_32:
+#endif
+	case FS_IOC32_GETFLAGS:
+	case FS_IOC32_SETFLAGS:
+		return true;
+	default:
+		return vfs_masked_device_ioctl(cmd);
+	}
+}
+EXPORT_SYMBOL(vfs_masked_device_ioctl_compat);
+
 COMPAT_SYSCALL_DEFINE3(ioctl, unsigned int, fd, unsigned int, cmd,
 		       compat_ulong_t, arg)
 {
 	struct fd f = fdget(fd);
 	int error;
+	const struct inode *inode;
+	bool is_device;
 
 	if (!f.file)
 		return -EBADF;
@@ -924,6 +1005,13 @@ COMPAT_SYSCALL_DEFINE3(ioctl, unsigned int, fd, unsigned int, cmd,
 	if (error)
 		goto out;
 
+	inode = file_inode(f.file);
+	is_device = S_ISBLK(inode->i_mode) || S_ISCHR(inode->i_mode);
+	if (is_device && !vfs_masked_device_ioctl_compat(cmd)) {
+		error = ioctl_compat(f.file, cmd, arg);
+		goto out;
+	}
+
 	switch (cmd) {
 	/* FICLONE takes an int argument, so don't use compat_ptr() */
 	case FICLONE:
@@ -964,13 +1052,10 @@ COMPAT_SYSCALL_DEFINE3(ioctl, unsigned int, fd, unsigned int, cmd,
 	default:
 		error = do_vfs_ioctl(f.file, fd, cmd,
 				     (unsigned long)compat_ptr(arg));
-		if (error != -ENOIOCTLCMD)
-			break;
-
-		if (f.file->f_op->compat_ioctl)
-			error = f.file->f_op->compat_ioctl(f.file, cmd, arg);
-		if (error == -ENOIOCTLCMD)
-			error = -ENOTTY;
+		if (error == -ENOIOCTLCMD) {
+			WARN_ON_ONCE(is_device);
+			error = ioctl_compat(f.file, cmd, arg);
+		}
 		break;
 	}
 
diff --git a/include/linux/fs.h b/include/linux/fs.h
index ed5966a70495..b620d0c00e16 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1902,6 +1902,18 @@ extern long compat_ptr_ioctl(struct file *file, unsigned int cmd,
 #define compat_ptr_ioctl NULL
 #endif
 
+extern __attribute_const__ bool vfs_masked_device_ioctl(const unsigned int cmd);
+#ifdef CONFIG_COMPAT
+extern __attribute_const__ bool
+vfs_masked_device_ioctl_compat(const unsigned int cmd);
+#else /* CONFIG_COMPAT */
+static inline __attribute_const__ bool
+vfs_masked_device_ioctl_compat(const unsigned int cmd)
+{
+	return vfs_masked_device_ioctl(cmd);
+}
+#endif /* CONFIG_COMPAT */
+
 /*
  * VFS file helper functions.
  */
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* Re: [PATCH v9 1/8] landlock: Add IOCTL access right
  2024-02-18  8:34         ` Günther Noack
@ 2024-02-19 21:44           ` Günther Noack
  0 siblings, 0 replies; 50+ messages in thread
From: Günther Noack @ 2024-02-19 21:44 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: Günther Noack, linux-security-module, Jeff Xu,
	Arnd Bergmann, Christian Brauner, Jorge Lucangeli Obes,
	Allen Webb, Dmitry Torokhov, Paul Moore, Konstantin Meskhidze,
	Matt Bobrowski, linux-fsdevel, Michael Kerrisk

Hello!

On Sun, Feb 18, 2024 at 09:34:39AM +0100, Günther Noack wrote:
> On Fri, Feb 16, 2024 at 04:51:40PM +0100, Mickaël Salaün wrote:
> > On Fri, Feb 16, 2024 at 03:11:18PM +0100, Mickaël Salaün wrote:
> > > What about /proc/*/fd/* ? We can test with open_proc_fd() to make sure
> > > our assumptions are correct.
> > 
> > Actually, these fifo and socket checks (and related optimizations)
> > should already be handled with is_nouser_or_private() called by
> > is_access_to_paths_allowed(). Some new dedicated tests should help
> > though.
> 
> I am generally a bit confused about how opening /proc/*/fd/* works.
> 
> Specifically:
> 
> * Do we have to worry about the scenario where the file_open hook gets
>   called with the same struct file* twice (overwriting the access
>   rights)?
> 
> * I had trouble finding the place in fs/proc/ where the re-opening is
>   implemented.
> 
> Do you happen to understand this in more detail?  At what point do the
> re-opened files start sharing the same kernel objects?  Is that at the
> inode level?

FYI, I figured it out —

 - every call to open(2) results in a new struct file
 - the resulting struct file refers to an existing inode
 - this is not supported for all inode types;
   a rough categorization happens in inode.c:init_special_inode()

The open(2) syscall creates a struct file and populates it
based on the origin fd's underlying inode through the .open function
in file_operations.

The procfs implementation for the lookup is in proc_pid_get_link /
proc_fd_link on the proc side.  It patches up the current task's
nameidata struct as a side effect by calling nd_jump_link().

For reference, I described this in more detail at
https://blog.gnoack.org/post/proc-fd-is-not-dup/.

—Günther

-- 

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v9 1/8] landlock: Add IOCTL access right
  2024-02-19 18:34   ` Mickaël Salaün
  2024-02-19 18:35     ` [RFC PATCH] fs: Add vfs_masks_device_ioctl*() helpers Mickaël Salaün
@ 2024-02-28 12:57     ` Günther Noack
  2024-03-01 12:59       ` Mickaël Salaün
  1 sibling, 1 reply; 50+ messages in thread
From: Günther Noack @ 2024-02-28 12:57 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: Arnd Bergmann, Christian Brauner, linux-security-module, Jeff Xu,
	Jorge Lucangeli Obes, Allen Webb, Dmitry Torokhov, Paul Moore,
	Konstantin Meskhidze, Matt Bobrowski, linux-fsdevel

Hello Mickaël!

On Mon, Feb 19, 2024 at 07:34:42PM +0100, Mickaël Salaün wrote:
> Arnd, Christian, please take a look at the following RFC patch and the
> rationale explained here.
> 
> On Fri, Feb 09, 2024 at 06:06:05PM +0100, Günther Noack wrote:
> > Introduces the LANDLOCK_ACCESS_FS_IOCTL access right
> > and increments the Landlock ABI version to 5.
> > 
> > Like the truncate right, these rights are associated with a file
> > descriptor at the time of open(2), and get respected even when the
> > file descriptor is used outside of the thread which it was originally
> > opened in.
> > 
> > A newly enabled Landlock policy therefore does not apply to file
> > descriptors which are already open.
> > 
> > If the LANDLOCK_ACCESS_FS_IOCTL right is handled, only a small number
> > of safe IOCTL commands will be permitted on newly opened files.  The
> > permitted IOCTLs can be configured through the ruleset in limited ways
> > now.  (See documentation for details.)
> > 
> > Specifically, when LANDLOCK_ACCESS_FS_IOCTL is handled, granting this
> > right on a file or directory will *not* permit to do all IOCTL
> > commands, but only influence the IOCTL commands which are not already
> > handled through other access rights.  The intent is to keep the groups
> > of IOCTL commands more fine-grained.
> > 
> > Noteworthy scenarios which require special attention:
> > 
> > TTY devices are often passed into a process from the parent process,
> > and so a newly enabled Landlock policy does not retroactively apply to
> > them automatically.  In the past, TTY devices have often supported
> > IOCTL commands like TIOCSTI and some TIOCLINUX subcommands, which were
> > letting callers control the TTY input buffer (and simulate
> > keypresses).  This should be restricted to CAP_SYS_ADMIN programs on
> > modern kernels though.
> > 
> > Some legitimate file system features, like setting up fscrypt, are
> > exposed as IOCTL commands on regular files and directories -- users of
> > Landlock are advised to double check that the sandboxed process does
> > not need to invoke these IOCTLs.
> 
> I think we really need to allow fscrypt and fs-verity IOCTLs.
> 
> > 
> > Known limitations:
> > 
> > The LANDLOCK_ACCESS_FS_IOCTL access right is a coarse-grained control
> > over IOCTL commands.  Future work will enable a more fine-grained
> > access control for IOCTLs.
> > 
> > In the meantime, Landlock users may use path-based restrictions in
> > combination with their knowledge about the file system layout to
> > control what IOCTLs can be done.  Mounting file systems with the nodev
> > option can help to distinguish regular files and devices, and give
> > guarantees about the affected files, which Landlock alone can not give
> > yet.
> 
> I had second thoughts about our current approach, and it looks like we
> can make it simpler and more generic, with less IOCTL-command-specific
> handling.
> 
> What we didn't take into account is that an IOCTL needs an opened file,
> which means that the caller must already have been allowed to open this
> file in read or write mode.
> 
> I think most FS-specific IOCTL commands check access rights (i.e. access
> mode or required capability), other than implicit ones (at least read or
> write), when appropriate.  We don't get such guarantee with device
> drivers.
> 
> The main threat is IOCTLs on character or block devices because their
> impact may be unknown (if we only look at the IOCTL command, not the
> backing file), but we should allow IOCTLs on filesystems (e.g. fscrypt,
> fs-verity, clone extents).  I think we should only implement a
> LANDLOCK_ACCESS_FS_IOCTL_DEV right, which would be more explicit.  This
> change would impact the IOCTLs grouping (not required anymore), but
> we'll still need the list of VFS IOCTLs.


I am fine with dropping the IOCTL grouping and going for this simpler approach.

This must have been a misunderstanding - I thought you wanted to align the
access checks in Landlock with the ones done by the kernel already, so that we
can reason about it more locally.  But I'm fine with doing it just for device
files as well, if that is what it takes.  It's definitely simpler.

Before I jump into the implementation, let me paraphrase your proposal to make
sure I understood it correctly:

 * We *only* introduce the LANDLOCK_ACCESS_FS_IOCTL_DEV right.

 * This access right governs the use of nontrivial IOCTL commands on
   character and block device files.

   * On open()ed files which are not character or block devices,
     all IOCTL commands keep working.

     This includes pipes and sockets, but also a variety of "anonymous" file
     types which are possibly openable through /proc/self/*/fd/*?

 * The trivial IOCTL commands are identified using the proposed function
   vfs_masked_device_ioctl().

   * For these commands, the implementations are in fs/ioctl.c, except for
     FIONREAD, in some cases.  We trust these implementations to check the
     file's type (dir/regular) and access rights (r/w) correctly.


Open questions I have:

* What about files which are neither devices nor regular files or directories?

  The obvious ones which can be open()ed are pipes, where only FIONREAD and two
  harmless-looking watch queue IOCTLs are implemented.

  But then I think that /proc/*/fd/* is a way through which other non-device
  files can become accessible?  What do we do for these?  (I am getting EACCES
  when trying to open some anon_inodes that way... is this something we can
  count on?)

* How did you come up with the list in vfs_masked_device_ioctl()?  I notice that
  some of these are from the switch() statement we had before, but not all of
  them are included.

  I can kind of see that for the fallocate()-like ones and for FIBMAP, because
  these **only** make sense for regular files, and IOCTLs on regular files are
  permitted anyway.

* What do we do for FIONREAD?  Your patch says that it should be forwarded to
  device implementations.  But technically, devices can implement all kinds of
  surprising behaviour for that.

  If you look at the ioctl implementations of different drivers, you can very
  quickly find a surprising number of things that happen completely
  independently of the IOCTL command.  (Some implementations acquire locks and
  other resources before they even check what the cmd value is, and we would be
  exposing that if we let devices handle FIONREAD.)
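
For the pipe case mentioned in the first question, FIONREAD behaves as a plain
query.  A minimal sketch, with bytes_readable() being a helper name made up
for this example: it reports how many unread bytes are queued on a descriptor.

```c
#include <sys/ioctl.h>
#include <unistd.h>

/* Returns the number of unread bytes queued on fd, or -1 on error. */
static int bytes_readable(int fd)
{
	int count = 0;

	if (ioctl(fd, FIONREAD, &count) < 0)
		return -1;
	return count;
}
```

On the read end of a fresh pipe(2) this returns 0, and after writing five
bytes to the write end it returns 5.  Because a pipe is not a character or
block device, this call would stay unrestricted under the *_IOCTL_DEV
approach.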


Please let me know whether I understood you correctly there.

Regarding the implementation notes you left below, I think they mostly derive
from the *_IOCTL_DEV approach in a direct way.


> > +static __attribute_const__ access_mask_t
> > +get_required_ioctl_access(const unsigned int cmd)
> > +{
> > +	switch (cmd) {
> > +	case FIOCLEX:
> > +	case FIONCLEX:
> > +	case FIONBIO:
> > +	case FIOASYNC:
> > +		/*
> > +		 * FIOCLEX, FIONCLEX, FIONBIO and FIOASYNC manipulate the FD's
> > +		 * close-on-exec and the file's buffered-IO and async flags.
> > +		 * These operations are also available through fcntl(2), and are
> > +		 * unconditionally permitted in Landlock.
> > +		 */
> > +		return 0;
> > +	case FIONREAD:
> > +	case FIOQSIZE:
> > +	case FIGETBSZ:
> > +		/*
> > +		 * FIONREAD returns the number of bytes available for reading.
> > +		 * FIONREAD returns the number of immediately readable bytes for
> > +		 * a file.
> > +		 *
> > +		 * FIOQSIZE queries the size of a file or directory.
> > +		 *
> > +		 * FIGETBSZ queries the file system's block size for a file or
> > +		 * directory.
> > +		 *
> > +		 * These IOCTL commands are permitted for files which are opened
> > +		 * with LANDLOCK_ACCESS_FS_READ_DIR,
> > +		 * LANDLOCK_ACCESS_FS_READ_FILE, or
> > +		 * LANDLOCK_ACCESS_FS_WRITE_FILE.
> > +		 */
> 
> Because files or directories can only be opened with
> LANDLOCK_ACCESS_FS_{READ,WRITE}_{FILE,DIR}, and because IOCTLs can only
> be sent on a file descriptor, this means that we can always allow these
> 3 commands (for opened files).
> 
> > +		return LANDLOCK_ACCESS_FS_IOCTL_RW;
> > +	case FS_IOC_FIEMAP:
> > +	case FIBMAP:
> > +		/*
> > +		 * FS_IOC_FIEMAP and FIBMAP query information about the
> > +		 * allocation of blocks within a file.  They are permitted for
> > +		 * files which are opened with LANDLOCK_ACCESS_FS_READ_FILE or
> > +		 * LANDLOCK_ACCESS_FS_WRITE_FILE.
> > +		 */
> > +		fallthrough;
> > +	case FIDEDUPERANGE:
> > +	case FICLONE:
> > +	case FICLONERANGE:
> > +		/*
> > +		 * FIDEDUPERANGE, FICLONE and FICLONERANGE make files share
> > +		 * their underlying storage ("reflink") between source and
> > +		 * destination FDs, on file systems which support that.
> > +		 *
> > +		 * The underlying implementations are already checking whether
> > +		 * the involved files are opened with the appropriate read/write
> > +		 * modes.  We rely on this being implemented correctly.
> > +		 *
> > +		 * These IOCTLs are permitted for files which are opened with
> > +		 * LANDLOCK_ACCESS_FS_READ_FILE or
> > +		 * LANDLOCK_ACCESS_FS_WRITE_FILE.
> > +		 */
> > +		fallthrough;
> > +	case FS_IOC_RESVSP:
> > +	case FS_IOC_RESVSP64:
> > +	case FS_IOC_UNRESVSP:
> > +	case FS_IOC_UNRESVSP64:
> > +	case FS_IOC_ZERO_RANGE:
> > +		/*
> > +		 * These IOCTLs reserve space, or create holes like
> > +		 * fallocate(2).  We rely on the implementations checking the
> > +		 * files' read/write modes.
> > +		 *
> > +		 * These IOCTLs are permitted for files which are opened with
> > +		 * LANDLOCK_ACCESS_FS_READ_FILE or
> > +		 * LANDLOCK_ACCESS_FS_WRITE_FILE.
> > +		 */
> 
> These 10 commands only make sense on directories, so we could also
> always allow them on file descriptors.

I imagine that's a typo?  The commands above do make sense on regular files.


> > +		return LANDLOCK_ACCESS_FS_IOCTL_RW_FILE;
> > +	default:
> > +		/*
> > +		 * Other commands are guarded by the catch-all access right.
> > +		 */
> > +		return LANDLOCK_ACCESS_FS_IOCTL;
> > +	}
> > +}
> > +
> > +/**
> > + * expand_ioctl() - Return the dst flags from either the src flag or the
> > + * %LANDLOCK_ACCESS_FS_IOCTL flag, depending on whether the
> > + * %LANDLOCK_ACCESS_FS_IOCTL and src access rights are handled or not.
> > + *
> > + * @handled: Handled access rights.
> > + * @access: The access mask to copy values from.
> > + * @src: A single access right to copy from in @access.
> > + * @dst: One or more access rights to copy to.
> > + *
> > + * Returns: @dst, or 0.
> > + */
> > +static __attribute_const__ access_mask_t
> > +expand_ioctl(const access_mask_t handled, const access_mask_t access,
> > +	     const access_mask_t src, const access_mask_t dst)
> > +{
> > +	access_mask_t copy_from;
> > +
> > +	if (!(handled & LANDLOCK_ACCESS_FS_IOCTL))
> > +		return 0;
> > +
> > +	copy_from = (handled & src) ? src : LANDLOCK_ACCESS_FS_IOCTL;
> > +	if (access & copy_from)
> > +		return dst;
> > +
> > +	return 0;
> > +}
> > +
> > +/**
> > + * landlock_expand_access_fs() - Returns @access with the synthetic IOCTL group
> > + * flags enabled if necessary.
> > + *
> > + * @handled: Handled FS access rights.
> > + * @access: FS access rights to expand.
> > + *
> > + * Returns: @access expanded by the necessary flags for the synthetic IOCTL
> > + * access rights.
> > + */
> > +static __attribute_const__ access_mask_t landlock_expand_access_fs(
> > +	const access_mask_t handled, const access_mask_t access)
> > +{
> > +	return access |
> > +	       expand_ioctl(handled, access, LANDLOCK_ACCESS_FS_WRITE_FILE,
> > +			    LANDLOCK_ACCESS_FS_IOCTL_RW |
> > +				    LANDLOCK_ACCESS_FS_IOCTL_RW_FILE) |
> > +	       expand_ioctl(handled, access, LANDLOCK_ACCESS_FS_READ_FILE,
> > +			    LANDLOCK_ACCESS_FS_IOCTL_RW |
> > +				    LANDLOCK_ACCESS_FS_IOCTL_RW_FILE) |
> > +	       expand_ioctl(handled, access, LANDLOCK_ACCESS_FS_READ_DIR,
> > +			    LANDLOCK_ACCESS_FS_IOCTL_RW);
> > +}
> > +
> > +/**
> > + * landlock_expand_handled_access_fs() - add synthetic IOCTL access rights to an
> > + * access mask of handled accesses.
> > + *
> > + * @handled: The handled accesses of a ruleset that is being created.
> > + *
> > + * Returns: @handled, with the bits for the synthetic IOCTL access rights set,
> > + * if %LANDLOCK_ACCESS_FS_IOCTL is handled.
> > + */
> > +__attribute_const__ access_mask_t
> > +landlock_expand_handled_access_fs(const access_mask_t handled)
> > +{
> > +	return landlock_expand_access_fs(handled, handled);
> > +}
> > +
> >  /* Ruleset management */
> >  
> >  static struct landlock_object *get_inode_object(struct inode *const inode)
> > @@ -148,7 +331,8 @@ static struct landlock_object *get_inode_object(struct inode *const inode)
> >  	LANDLOCK_ACCESS_FS_EXECUTE | \
> >  	LANDLOCK_ACCESS_FS_WRITE_FILE | \
> >  	LANDLOCK_ACCESS_FS_READ_FILE | \
> > -	LANDLOCK_ACCESS_FS_TRUNCATE)
> > +	LANDLOCK_ACCESS_FS_TRUNCATE | \
> > +	LANDLOCK_ACCESS_FS_IOCTL)
> >  /* clang-format on */
> >  
> >  /*
> > @@ -158,6 +342,7 @@ int landlock_append_fs_rule(struct landlock_ruleset *const ruleset,
> >  			    const struct path *const path,
> >  			    access_mask_t access_rights)
> >  {
> > +	access_mask_t handled;
> >  	int err;
> >  	struct landlock_id id = {
> >  		.type = LANDLOCK_KEY_INODE,
> > @@ -170,9 +355,11 @@ int landlock_append_fs_rule(struct landlock_ruleset *const ruleset,
> >  	if (WARN_ON_ONCE(ruleset->num_layers != 1))
> >  		return -EINVAL;
> >  
> > +	handled = landlock_get_fs_access_mask(ruleset, 0);
> > +	/* Expands the synthetic IOCTL groups. */
> > +	access_rights |= landlock_expand_access_fs(handled, access_rights);
> >  	/* Transforms relative access rights to absolute ones. */
> > -	access_rights |= LANDLOCK_MASK_ACCESS_FS &
> > -			 ~landlock_get_fs_access_mask(ruleset, 0);
> > +	access_rights |= LANDLOCK_MASK_ACCESS_FS & ~handled;
> >  	id.key.object = get_inode_object(d_backing_inode(path->dentry));
> >  	if (IS_ERR(id.key.object))
> >  		return PTR_ERR(id.key.object);
> > @@ -1333,7 +1520,9 @@ static int hook_file_open(struct file *const file)
> >  {
> >  	layer_mask_t layer_masks[LANDLOCK_NUM_ACCESS_FS] = {};
> >  	access_mask_t open_access_request, full_access_request, allowed_access;
> > -	const access_mask_t optional_access = LANDLOCK_ACCESS_FS_TRUNCATE;
> > +	const access_mask_t optional_access = LANDLOCK_ACCESS_FS_TRUNCATE |
> > +					      LANDLOCK_ACCESS_FS_IOCTL |
> > +					      IOCTL_GROUPS;
> >  	const struct landlock_ruleset *const dom = get_current_fs_domain();
> >  
> >  	if (!dom)
> 
> We should set optional_access according to the file type before
> `full_access_request = open_access_request | optional_access;`
> 
> const bool is_device = S_ISBLK(inode->i_mode) || S_ISCHR(inode->i_mode);
> 
> optional_access = LANDLOCK_ACCESS_FS_TRUNCATE;
> if (is_device)
>     optional_access |= LANDLOCK_ACCESS_FS_IOCTL_DEV;
> 
> 
> Because LANDLOCK_ACCESS_FS_IOCTL_DEV is dedicated to character or block
> devices, we may want landlock_add_rule() to only allow this access right
> to be tied to directories, or character devices, or block devices.  Even
> if it would be more consistent with constraints on directory-only access
> rights, I'm not sure about that.
> 
> 
> > @@ -1375,6 +1564,16 @@ static int hook_file_open(struct file *const file)
> >  		}
> >  	}
> >  
> > +	/*
> > +	 * Named pipes should be treated just like anonymous pipes.
> > +	 * Therefore, we permit all IOCTLs on them.
> > +	 */
> > +	if (S_ISFIFO(file_inode(file)->i_mode)) {
> > +		allowed_access |= LANDLOCK_ACCESS_FS_IOCTL |
> > +				  LANDLOCK_ACCESS_FS_IOCTL_RW |
> > +				  LANDLOCK_ACCESS_FS_IOCTL_RW_FILE;
> > +	}
> 
> Instead of this S_ISFIFO check:
> 
> if (!is_device)
>     allowed_access |= LANDLOCK_ACCESS_FS_IOCTL_DEV;
> 
> > +
> >  	/*
> >  	 * For operations on already opened files (i.e. ftruncate()), it is the
> >  	 * access rights at the time of open() which decide whether the
> > @@ -1406,6 +1605,25 @@ static int hook_file_truncate(struct file *const file)
> >  	return -EACCES;
> >  }
> >  
> > +static int hook_file_ioctl(struct file *file, unsigned int cmd,
> > +			   unsigned long arg)
> > +{
> > +	const access_mask_t required_access = get_required_ioctl_access(cmd);
> 
> const access_mask_t required_access = LANDLOCK_ACCESS_FS_IOCTL_DEV;
> 
> 
> > +	const access_mask_t allowed_access =
> > +		landlock_file(file)->allowed_access;
> > +
> > +	/*
> > +	 * It is the access rights at the time of opening the file which
> > +	 * determine whether IOCTL can be used on the opened file later.
> > +	 *
> > +	 * The access right is attached to the opened file in hook_file_open().
> > +	 */
> > +	if ((allowed_access & required_access) == required_access)
> > +		return 0;
> 
> We could then check against the do_vfs_ioctl()'s commands, excluding
> FIONREAD and file_ioctl()'s commands, to always allow VFS-related
> commands:
> 
> if (vfs_masked_device_ioctl(cmd))
>     return 0;
> 
> As a safeguard, we could define vfs_masked_device_ioctl(cmd) in
> fs/ioctl.c and have do_vfs_ioctl() call it, to make sure we keep an
> accurate list of VFS IOCTL commands (see next RFC patch).


> The compat IOCTL hook must also be implemented.

Thanks!  I can't believe I missed that one.


> What do you think? Any better idea?

It seems like a reasonable approach.  I'd like to double check with you that we
are on the same page about it before doing the next implementation step.  (These
iterations seem cheaper when we do them in English than when we do them in C.)

Thanks for the review!
—Günther

* Re: [PATCH v9 1/8] landlock: Add IOCTL access right
  2024-02-28 12:57     ` [PATCH v9 1/8] landlock: Add IOCTL access right Günther Noack
@ 2024-03-01 12:59       ` Mickaël Salaün
  2024-03-01 13:38         ` Mickaël Salaün
  0 siblings, 1 reply; 50+ messages in thread
From: Mickaël Salaün @ 2024-03-01 12:59 UTC (permalink / raw)
  To: Günther Noack
  Cc: Arnd Bergmann, Christian Brauner, linux-security-module, Jeff Xu,
	Jorge Lucangeli Obes, Allen Webb, Dmitry Torokhov, Paul Moore,
	Konstantin Meskhidze, Matt Bobrowski, linux-fsdevel

On Wed, Feb 28, 2024 at 01:57:42PM +0100, Günther Noack wrote:
> Hello Mickaël!
> 
> On Mon, Feb 19, 2024 at 07:34:42PM +0100, Mickaël Salaün wrote:
> > Arnd, Christian, please take a look at the following RFC patch and the
> > rationale explained here.
> > 
> > On Fri, Feb 09, 2024 at 06:06:05PM +0100, Günther Noack wrote:
> > > Introduces the LANDLOCK_ACCESS_FS_IOCTL access right
> > > and increments the Landlock ABI version to 5.
> > > 
> > > Like the truncate right, these rights are associated with a file
> > > descriptor at the time of open(2), and get respected even when the
> > > file descriptor is used outside of the thread which it was originally
> > > opened in.
> > > 
> > > A newly enabled Landlock policy therefore does not apply to file
> > > descriptors which are already open.
> > > 
> > > If the LANDLOCK_ACCESS_FS_IOCTL right is handled, only a small number
> > > of safe IOCTL commands will be permitted on newly opened files.  The
> > > permitted IOCTLs can be configured through the ruleset in limited ways
> > > now.  (See documentation for details.)
> > > 
> > > Specifically, when LANDLOCK_ACCESS_FS_IOCTL is handled, granting this
> > > right on a file or directory will *not* permit to do all IOCTL
> > > commands, but only influence the IOCTL commands which are not already
> > > handled through other access rights.  The intent is to keep the groups
> > > of IOCTL commands more fine-grained.
> > > 
> > > Noteworthy scenarios which require special attention:
> > > 
> > > TTY devices are often passed into a process from the parent process,
> > > and so a newly enabled Landlock policy does not retroactively apply to
> > > them automatically.  In the past, TTY devices have often supported
> > > IOCTL commands like TIOCSTI and some TIOCLINUX subcommands, which were
> > > letting callers control the TTY input buffer (and simulate
> > > keypresses).  This should be restricted to CAP_SYS_ADMIN programs on
> > > modern kernels though.
> > > 
> > > Some legitimate file system features, like setting up fscrypt, are
> > > exposed as IOCTL commands on regular files and directories -- users of
> > > Landlock are advised to double check that the sandboxed process does
> > > not need to invoke these IOCTLs.
> > 
> > I think we really need to allow fscrypt and fs-verity IOCTLs.
> > 
> > > 
> > > Known limitations:
> > > 
> > > The LANDLOCK_ACCESS_FS_IOCTL access right is a coarse-grained control
> > > over IOCTL commands.  Future work will enable a more fine-grained
> > > access control for IOCTLs.
> > > 
> > > In the meantime, Landlock users may use path-based restrictions in
> > > combination with their knowledge about the file system layout to
> > > control what IOCTLs can be done.  Mounting file systems with the nodev
> > > option can help to distinguish regular files and devices, and give
> > > guarantees about the affected files, which Landlock alone can not give
> > > yet.
> > 
> > I had second thoughts about our current approach, and it looks like we
> > can make it simpler and more generic, with less IOCTL-command-specific
> > handling.
> > 
> > What we didn't take into account is that an IOCTL needs an opened file,
> > which means that the caller must already have been allowed to open this
> > file in read or write mode.
> > 
> > I think most FS-specific IOCTL commands check access rights (i.e. access
> > mode or required capability), other than implicit ones (at least read or
> > write), when appropriate.  We don't get such guarantee with device
> > drivers.
> > 
> > The main threat is IOCTLs on character or block devices because their
> > impact may be unknown (if we only look at the IOCTL command, not the
> > backing file), but we should allow IOCTLs on filesystems (e.g. fscrypt,
> > fs-verity, clone extents).  I think we should only implement a
> > LANDLOCK_ACCESS_FS_IOCTL_DEV right, which would be more explicit.  This
> > change would impact the IOCTLs grouping (not required anymore), but
> > we'll still need the list of VFS IOCTLs.
> 
> 
> I am fine with dropping the IOCTL grouping and going for this simpler approach.
> 
> This must have been a misunderstanding - I thought you wanted to align the
> access checks in Landlock with the ones done by the kernel already, so that we
> can reason about it more locally.  But I'm fine with doing it just for device
> files as well, if that is what it takes.  It's definitely simpler.

I still think we should align existing Landlock access rights with the VFS
IOCTL semantics (i.e. mostly defined in do_vfs_ioctl(), but also in the compat
ioctl syscall).  However, according to our investigations and
discussions, it looks like the groups we defined should already be
enforced by the VFS code, which means we should not need such groups
after all.  My last proposal is to still delegate access for VFS IOCTLs
to the current Landlock access rights, but it doesn't seem required to
add specific access checks if we are able to identify these VFS IOCTLs.

> 
> Before I jump into the implementation, let me paraphrase your proposal to make
> sure I understood it correctly:
> 
>  * We *only* introduce the LANDLOCK_ACCESS_FS_IOCTL_DEV right.

Yes

> 
>  * This access right governs the use of nontrivial IOCTL commands on
>    character and block device files.
> 
>    * On open()ed files which are not character or block devices,
>      all IOCTL commands keep working.

Yes

> 
>      This includes pipes and sockets, but also a variety of "anonymous" file
>      types which are possibly openable through /proc/self/*/fd/*?

Indeed, and we should document that.  It should be noted that these
"anonymous" file types only come from dedicated syscalls (which are not
currently controlled by Landlock) or from this synthetic proc interface.
One thing to keep in mind is that /proc/*/fd/* can only be opened on
tasks under the same sandbox (or a child one), so we should consider
that they are explicitly allowed by the policy the same way
pre-sandboxed inherited file descriptors are.

It might be interesting to list a few of such anonymous file types.  Are
there any that can act on global resources (like block/char devices
can)?

I also think that most anonymous file types should check for FD's read
and write mode when it makes sense (which is not the case for most
block/char IOCTLs), but I might be wrong.

I think this LANDLOCK_ACCESS_FS_IOCTL_DEV design would be good for now,
and probably enough for most use cases.  This would fill a major gap in
an easy-to-understand-and-document way.

> 
>  * The trivial IOCTL commands are identified using the proposed function
>    vfs_masked_device_ioctl().
> 
>    * For these commands, the implementations are in fs/ioctl.c, except for
>      FIONREAD, in some cases.  We trust these implementations to check the
>      file's type (dir/regular) and access rights (r/w) correctly.

FIONREAD is explicitly not part of vfs_masked_device_ioctl() because it
is only defined for regular files (and forwarded to the underlying
implementation otherwise), hence the "masked_device" name.  If the
underlying filesystem handles this IOCTL command for directories, that's
fine, and we don't need an explicit exception.

> 
> 
> Open questions I have:
> 
> * What about files which are neither devices nor regular files or directories?
> 
>   The obvious ones which can be open()ed are pipes, where only FIONREAD and two
>   harmless-looking watch queue IOCTLs are implemented.
> 
>   But then I think that /proc/*/fd/* is a way through which other non-device
>   files can become accessible?  What do we do for these?  (I am getting EACCES
>   when trying to open some anon_inodes that way... is this something we can
>   count on?)

As explained above, /proc/*/fd/* is already restricted per sandbox
scope, which seems enough.

> 
> * How did you come up with the list in vfs_masked_device_ioctl()?  I notice that
>   some of these are from the switch() statement we had before, but not all of
>   them are included.
> 
>   I can kind of see that for the fallocate()-like ones and for FIBMAP, because
>   these **only** make sense for regular files, and IOCTLs on regular files are
>   permitted anyway.

I took inspiration from get_required_ioctl_access(), and built this list
looking at which of the VFS IOCTLs go through the VFS implementation
(mostly do_vfs_ioctl() but also the compat syscall) for IOCTL requests
on *block and character devices*.

The initial assumption is that file systems cannot implement block or
character device IOCTLs, which is why this approach seems safe and
consistent.

> 
> * What do we do for FIONREAD?  Your patch says that it should be forwarded to
>   device implementations.  But technically, devices can implement all kinds of
>   surprising behaviour for that.

FIONREAD should always be allowed for non-device files (which means on
allowed-to-be-opened and non-device files), and controlled with
LANDLOCK_ACCESS_FS_IOCTL_DEV for character and block devices.

> 
>   If you look at the ioctl implementations of different drivers, you can very
>   quickly find a surprising number of things that happen completely
>   independently of the IOCTL command.  (Some implementations acquire locks and
>   other resources before they even check what the cmd value is, and we would be
>   exposing that if we let devices handle FIONREAD.)

Correct, which is why FIONREAD on devices should be controlled by
LANDLOCK_ACCESS_FS_IOCTL_DEV.  See my previous email (below) with the
"is_device" checks.

> 
> 
> Please let me know whether I understood you correctly there.

I think so, but I guess you missed the "is_device" part.

> 
> Regarding the implementation notes you left below, I think they mostly derive
> from the *_IOCTL_DEV approach in a direct way.

Yes

> 
> 
> > > +static __attribute_const__ access_mask_t
> > > +get_required_ioctl_access(const unsigned int cmd)
> > > +{
> > > +	switch (cmd) {
> > > +	case FIOCLEX:
> > > +	case FIONCLEX:
> > > +	case FIONBIO:
> > > +	case FIOASYNC:
> > > +		/*
> > > +		 * FIOCLEX, FIONCLEX, FIONBIO and FIOASYNC manipulate the FD's
> > > +		 * close-on-exec and the file's buffered-IO and async flags.
> > > +		 * These operations are also available through fcntl(2), and are
> > > +		 * unconditionally permitted in Landlock.
> > > +		 */
> > > +		return 0;
> > > +	case FIONREAD:
> > > +	case FIOQSIZE:
> > > +	case FIGETBSZ:
> > > +		/*
> > > +		 * FIONREAD returns the number of bytes available for reading.
> > > +		 * FIONREAD returns the number of immediately readable bytes for
> > > +		 * a file.
> > > +		 *
> > > +		 * FIOQSIZE queries the size of a file or directory.
> > > +		 *
> > > +		 * FIGETBSZ queries the file system's block size for a file or
> > > +		 * directory.
> > > +		 *
> > > +		 * These IOCTL commands are permitted for files which are opened
> > > +		 * with LANDLOCK_ACCESS_FS_READ_DIR,
> > > +		 * LANDLOCK_ACCESS_FS_READ_FILE, or
> > > +		 * LANDLOCK_ACCESS_FS_WRITE_FILE.
> > > +		 */
> > 
> > Because files or directories can only be opened with
> > LANDLOCK_ACCESS_FS_{READ,WRITE}_{FILE,DIR}, and because IOCTLs can only
> > be sent on a file descriptor, this means that we can always allow these
> > 3 commands (for opened files).
> > 
> > > +		return LANDLOCK_ACCESS_FS_IOCTL_RW;
> > > +	case FS_IOC_FIEMAP:
> > > +	case FIBMAP:
> > > +		/*
> > > +		 * FS_IOC_FIEMAP and FIBMAP query information about the
> > > +		 * allocation of blocks within a file.  They are permitted for
> > > +		 * files which are opened with LANDLOCK_ACCESS_FS_READ_FILE or
> > > +		 * LANDLOCK_ACCESS_FS_WRITE_FILE.
> > > +		 */
> > > +		fallthrough;
> > > +	case FIDEDUPERANGE:
> > > +	case FICLONE:
> > > +	case FICLONERANGE:
> > > +		/*
> > > +		 * FIDEDUPERANGE, FICLONE and FICLONERANGE make files share
> > > +		 * their underlying storage ("reflink") between source and
> > > +		 * destination FDs, on file systems which support that.
> > > +		 *
> > > +		 * The underlying implementations are already checking whether
> > > +		 * the involved files are opened with the appropriate read/write
> > > +		 * modes.  We rely on this being implemented correctly.
> > > +		 *
> > > +		 * These IOCTLs are permitted for files which are opened with
> > > +		 * LANDLOCK_ACCESS_FS_READ_FILE or
> > > +		 * LANDLOCK_ACCESS_FS_WRITE_FILE.
> > > +		 */
> > > +		fallthrough;
> > > +	case FS_IOC_RESVSP:
> > > +	case FS_IOC_RESVSP64:
> > > +	case FS_IOC_UNRESVSP:
> > > +	case FS_IOC_UNRESVSP64:
> > > +	case FS_IOC_ZERO_RANGE:
> > > +		/*
> > > +		 * These IOCTLs reserve space, or create holes like
> > > +		 * fallocate(2).  We rely on the implementations checking the
> > > +		 * files' read/write modes.
> > > +		 *
> > > +		 * These IOCTLs are permitted for files which are opened with
> > > +		 * LANDLOCK_ACCESS_FS_READ_FILE or
> > > +		 * LANDLOCK_ACCESS_FS_WRITE_FILE.
> > > +		 */
> > 
> > These 10 commands only make sense on directories, so we could also
> > always allow them on file descriptors.
> 
> I imagine that's a typo?  The commands above do make sense on regular files.

Yes, I meant they "only make sense on regular files".

> 
> 
> > > +		return LANDLOCK_ACCESS_FS_IOCTL_RW_FILE;
> > > +	default:
> > > +		/*
> > > +		 * Other commands are guarded by the catch-all access right.
> > > +		 */
> > > +		return LANDLOCK_ACCESS_FS_IOCTL;
> > > +	}
> > > +}
> > > +
> > > +/**
> > > + * expand_ioctl() - Return the dst flags from either the src flag or the
> > > + * %LANDLOCK_ACCESS_FS_IOCTL flag, depending on whether the
> > > + * %LANDLOCK_ACCESS_FS_IOCTL and src access rights are handled or not.
> > > + *
> > > + * @handled: Handled access rights.
> > > + * @access: The access mask to copy values from.
> > > + * @src: A single access right to copy from in @access.
> > > + * @dst: One or more access rights to copy to.
> > > + *
> > > + * Returns: @dst, or 0.
> > > + */
> > > +static __attribute_const__ access_mask_t
> > > +expand_ioctl(const access_mask_t handled, const access_mask_t access,
> > > +	     const access_mask_t src, const access_mask_t dst)
> > > +{
> > > +	access_mask_t copy_from;
> > > +
> > > +	if (!(handled & LANDLOCK_ACCESS_FS_IOCTL))
> > > +		return 0;
> > > +
> > > +	copy_from = (handled & src) ? src : LANDLOCK_ACCESS_FS_IOCTL;
> > > +	if (access & copy_from)
> > > +		return dst;
> > > +
> > > +	return 0;
> > > +}
> > > +
> > > +/**
> > > + * landlock_expand_access_fs() - Returns @access with the synthetic IOCTL group
> > > + * flags enabled if necessary.
> > > + *
> > > + * @handled: Handled FS access rights.
> > > + * @access: FS access rights to expand.
> > > + *
> > > + * Returns: @access expanded by the necessary flags for the synthetic IOCTL
> > > + * access rights.
> > > + */
> > > +static __attribute_const__ access_mask_t landlock_expand_access_fs(
> > > +	const access_mask_t handled, const access_mask_t access)
> > > +{
> > > +	return access |
> > > +	       expand_ioctl(handled, access, LANDLOCK_ACCESS_FS_WRITE_FILE,
> > > +			    LANDLOCK_ACCESS_FS_IOCTL_RW |
> > > +				    LANDLOCK_ACCESS_FS_IOCTL_RW_FILE) |
> > > +	       expand_ioctl(handled, access, LANDLOCK_ACCESS_FS_READ_FILE,
> > > +			    LANDLOCK_ACCESS_FS_IOCTL_RW |
> > > +				    LANDLOCK_ACCESS_FS_IOCTL_RW_FILE) |
> > > +	       expand_ioctl(handled, access, LANDLOCK_ACCESS_FS_READ_DIR,
> > > +			    LANDLOCK_ACCESS_FS_IOCTL_RW);
> > > +}
> > > +
> > > +/**
> > > + * landlock_expand_handled_access_fs() - add synthetic IOCTL access rights to an
> > > + * access mask of handled accesses.
> > > + *
> > > + * @handled: The handled accesses of a ruleset that is being created.
> > > + *
> > > + * Returns: @handled, with the bits for the synthetic IOCTL access rights set,
> > > + * if %LANDLOCK_ACCESS_FS_IOCTL is handled.
> > > + */
> > > +__attribute_const__ access_mask_t
> > > +landlock_expand_handled_access_fs(const access_mask_t handled)
> > > +{
> > > +	return landlock_expand_access_fs(handled, handled);
> > > +}
> > > +
> > >  /* Ruleset management */
> > >  
> > >  static struct landlock_object *get_inode_object(struct inode *const inode)
> > > @@ -148,7 +331,8 @@ static struct landlock_object *get_inode_object(struct inode *const inode)
> > >  	LANDLOCK_ACCESS_FS_EXECUTE | \
> > >  	LANDLOCK_ACCESS_FS_WRITE_FILE | \
> > >  	LANDLOCK_ACCESS_FS_READ_FILE | \
> > > -	LANDLOCK_ACCESS_FS_TRUNCATE)
> > > +	LANDLOCK_ACCESS_FS_TRUNCATE | \
> > > +	LANDLOCK_ACCESS_FS_IOCTL)
> > >  /* clang-format on */
> > >  
> > >  /*
> > > @@ -158,6 +342,7 @@ int landlock_append_fs_rule(struct landlock_ruleset *const ruleset,
> > >  			    const struct path *const path,
> > >  			    access_mask_t access_rights)
> > >  {
> > > +	access_mask_t handled;
> > >  	int err;
> > >  	struct landlock_id id = {
> > >  		.type = LANDLOCK_KEY_INODE,
> > > @@ -170,9 +355,11 @@ int landlock_append_fs_rule(struct landlock_ruleset *const ruleset,
> > >  	if (WARN_ON_ONCE(ruleset->num_layers != 1))
> > >  		return -EINVAL;
> > >  
> > > +	handled = landlock_get_fs_access_mask(ruleset, 0);
> > > +	/* Expands the synthetic IOCTL groups. */
> > > +	access_rights |= landlock_expand_access_fs(handled, access_rights);
> > >  	/* Transforms relative access rights to absolute ones. */
> > > -	access_rights |= LANDLOCK_MASK_ACCESS_FS &
> > > -			 ~landlock_get_fs_access_mask(ruleset, 0);
> > > +	access_rights |= LANDLOCK_MASK_ACCESS_FS & ~handled;
> > >  	id.key.object = get_inode_object(d_backing_inode(path->dentry));
> > >  	if (IS_ERR(id.key.object))
> > >  		return PTR_ERR(id.key.object);
> > > @@ -1333,7 +1520,9 @@ static int hook_file_open(struct file *const file)
> > >  {
> > >  	layer_mask_t layer_masks[LANDLOCK_NUM_ACCESS_FS] = {};
> > >  	access_mask_t open_access_request, full_access_request, allowed_access;
> > > -	const access_mask_t optional_access = LANDLOCK_ACCESS_FS_TRUNCATE;
> > > +	const access_mask_t optional_access = LANDLOCK_ACCESS_FS_TRUNCATE |
> > > +					      LANDLOCK_ACCESS_FS_IOCTL |
> > > +					      IOCTL_GROUPS;
> > >  	const struct landlock_ruleset *const dom = get_current_fs_domain();
> > >  
> > >  	if (!dom)
> > 
> > We should set optional_access according to the file type before
> > `full_access_request = open_access_request | optional_access;`
> > 
> > const bool is_device = S_ISBLK(inode->i_mode) || S_ISCHR(inode->i_mode);
> > 
> > optional_access = LANDLOCK_ACCESS_FS_TRUNCATE;
> > if (is_device)
> >     optional_access |= LANDLOCK_ACCESS_FS_IOCTL_DEV;
> > 
> > 
> > Because LANDLOCK_ACCESS_FS_IOCTL_DEV is dedicated to character or block
> > devices, we may want landlock_add_rule() to only allow this access right
> > to be tied to directories, or character devices, or block devices.  Even
> > if it would be more consistent with constraints on directory-only access
> > rights, I'm not sure about that.
> > 
> > 
> > > @@ -1375,6 +1564,16 @@ static int hook_file_open(struct file *const file)
> > >  		}
> > >  	}
> > >  
> > > +	/*
> > > +	 * Named pipes should be treated just like anonymous pipes.
> > > +	 * Therefore, we permit all IOCTLs on them.
> > > +	 */
> > > +	if (S_ISFIFO(file_inode(file)->i_mode)) {
> > > +		allowed_access |= LANDLOCK_ACCESS_FS_IOCTL |
> > > +				  LANDLOCK_ACCESS_FS_IOCTL_RW |
> > > +				  LANDLOCK_ACCESS_FS_IOCTL_RW_FILE;
> > > +	}
> > 
> > Instead of this S_ISFIFO check:
> > 
> > if (!is_device)
> >     allowed_access |= LANDLOCK_ACCESS_FS_IOCTL_DEV;
> > 
> > > +
> > >  	/*
> > >  	 * For operations on already opened files (i.e. ftruncate()), it is the
> > >  	 * access rights at the time of open() which decide whether the
> > > @@ -1406,6 +1605,25 @@ static int hook_file_truncate(struct file *const file)
> > >  	return -EACCES;
> > >  }
> > >  
> > > +static int hook_file_ioctl(struct file *file, unsigned int cmd,
> > > +			   unsigned long arg)
> > > +{
> > > +	const access_mask_t required_access = get_required_ioctl_access(cmd);
> > 
> > const access_mask_t required_access = LANDLOCK_ACCESS_FS_IOCTL_DEV;
> > 
> > 
> > > +	const access_mask_t allowed_access =
> > > +		landlock_file(file)->allowed_access;
> > > +
> > > +	/*
> > > +	 * It is the access rights at the time of opening the file which
> > > +	 * determine whether IOCTL can be used on the opened file later.
> > > +	 *
> > > +	 * The access right is attached to the opened file in hook_file_open().
> > > +	 */
> > > +	if ((allowed_access & required_access) == required_access)
> > > +		return 0;
> > 
> > We could then check against the do_vfs_ioctl()'s commands, excluding
> > FIONREAD and file_ioctl()'s commands, to always allow VFS-related
> > commands:
> > 
> > if (vfs_masked_device_ioctl(cmd))
> >     return 0;
> > 
> > > As a safeguard, we could define vfs_masked_device_ioctl(cmd) in
> > > fs/ioctl.c and have do_vfs_ioctl() call it, to make sure we keep an
> > > accurate list of VFS IOCTL commands (see next RFC patch).
> 
> 
> > The compat IOCTL hook must also be implemented.
> 
> Thanks!  I can't believe I missed that one.
> 
> 
> > What do you think? Any better idea?
> 
> It seems like a reasonable approach.  I'd like to double check with you that we
> are on the same page about it before doing the next implementation step.  (These
> iterations seem cheaper when we do them in English than when we do them in C.)

We only reached this design because of the previous iterations, reviews
and discussions.  Implementation details matter in this case and it's
good to take time to convince ourselves of the best approach (and to
understand how underlying implementations work).  Finding a "simple"
interface that makes sense to control IOCTLs in an efficient way wasn't
obvious but I'm convinced we got it now.

Thanks for your perseverance!

> 
> Thanks for the review!
> —Günther
> 


* Re: [PATCH v9 1/8] landlock: Add IOCTL access right
  2024-03-01 12:59       ` Mickaël Salaün
@ 2024-03-01 13:38         ` Mickaël Salaün
  0 siblings, 0 replies; 50+ messages in thread
From: Mickaël Salaün @ 2024-03-01 13:38 UTC (permalink / raw)
  To: Günther Noack
  Cc: Arnd Bergmann, Christian Brauner, linux-security-module, Jeff Xu,
	Jorge Lucangeli Obes, Allen Webb, Dmitry Torokhov, Paul Moore,
	Konstantin Meskhidze, Matt Bobrowski, linux-fsdevel

On Fri, Mar 01, 2024 at 01:59:13PM +0100, Mickaël Salaün wrote:
> On Wed, Feb 28, 2024 at 01:57:42PM +0100, Günther Noack wrote:
> > Hello Mickaël!
> > 
> > On Mon, Feb 19, 2024 at 07:34:42PM +0100, Mickaël Salaün wrote:
> > > Arnd, Christian, please take a look at the following RFC patch and the
> > > rationale explained here.
> > > 
> > > On Fri, Feb 09, 2024 at 06:06:05PM +0100, Günther Noack wrote:
> > > > Introduces the LANDLOCK_ACCESS_FS_IOCTL access right
> > > > and increments the Landlock ABI version to 5.
> > > > 
> > > > Like the truncate right, these rights are associated with a file
> > > > descriptor at the time of open(2), and get respected even when the
> > > > file descriptor is used outside of the thread which it was originally
> > > > opened in.
> > > > 
> > > > A newly enabled Landlock policy therefore does not apply to file
> > > > descriptors which are already open.
> > > > 
> > > > If the LANDLOCK_ACCESS_FS_IOCTL right is handled, only a small number
> > > > of safe IOCTL commands will be permitted on newly opened files.  The
> > > > permitted IOCTLs can be configured through the ruleset in limited ways
> > > > now.  (See documentation for details.)
> > > > 
> > > > Specifically, when LANDLOCK_ACCESS_FS_IOCTL is handled, granting this
> > > > right on a file or directory will *not* permit to do all IOCTL
> > > > commands, but only influence the IOCTL commands which are not already
> > > > handled through other access rights.  The intent is to keep the groups
> > > > of IOCTL commands more fine-grained.
> > > > 
> > > > Noteworthy scenarios which require special attention:
> > > > 
> > > > TTY devices are often passed into a process from the parent process,
> > > > and so a newly enabled Landlock policy does not retroactively apply to
> > > > them automatically.  In the past, TTY devices have often supported
> > > > IOCTL commands like TIOCSTI and some TIOCLINUX subcommands, which were
> > > > letting callers control the TTY input buffer (and simulate
> > > > keypresses).  This should be restricted to CAP_SYS_ADMIN programs on
> > > > modern kernels though.
> > > > 
> > > > Some legitimate file system features, like setting up fscrypt, are
> > > > exposed as IOCTL commands on regular files and directories -- users of
> > > > Landlock are advised to double check that the sandboxed process does
> > > > not need to invoke these IOCTLs.
> > > 
> > > I think we really need to allow fscrypt and fs-verity IOCTLs.
> > > 
> > > > 
> > > > Known limitations:
> > > > 
> > > > The LANDLOCK_ACCESS_FS_IOCTL access right is a coarse-grained control
> > > > over IOCTL commands.  Future work will enable a more fine-grained
> > > > access control for IOCTLs.
> > > > 
> > > > In the meantime, Landlock users may use path-based restrictions in
> > > > combination with their knowledge about the file system layout to
> > > > control what IOCTLs can be done.  Mounting file systems with the nodev
> > > > option can help to distinguish regular files and devices, and give
> > > > guarantees about the affected files, which Landlock alone can not give
> > > > yet.
> > > 
> > > I had second thoughts about our current approach, and it looks like
> > > we can make it simpler and more generic, with less handling that is
> > > specific to individual IOCTL commands.
> > > 
> > > What we didn't take into account is that an IOCTL needs an opened file,
> > > which means that the caller must already have been allowed to open this
> > > file in read or write mode.
> > > 
> > > I think most FS-specific IOCTL commands check access rights (i.e. access
> > > mode or required capability), other than implicit ones (at least read or
> > > write), when appropriate.  We don't get such guarantee with device
> > > drivers.
> > > 
> > > The main threat is IOCTLs on character or block devices because their
> > > impact may be unknown (if we only look at the IOCTL command, not the
> > > backing file), but we should allow IOCTLs on filesystems (e.g. fscrypt,
> > > fs-verity, clone extents).  I think we should only implement a
> > > LANDLOCK_ACCESS_FS_IOCTL_DEV right, which would be more explicit.  This
> > > change would impact the IOCTLs grouping (not required anymore), but
> > > we'll still need the list of VFS IOCTLs.
> > 
> > 
> > I am fine with dropping the IOCTL grouping and going for this simpler approach.
> > 
> > This must have been a misunderstanding - I thought you wanted to align the
> > access checks in Landlock with the ones done by the kernel already, so that we
> > can reason about it more locally.  But I'm fine with doing it just for device
> > files as well, if that is what it takes.  It's definitely simpler.
> 
> I still think we should align existing Landlock access rights with the VFS IOCTL
> semantic (i.e. mostly defined in do_vfs_ioctl(), but also in the compat
> ioctl syscall).  However, according to our investigations and
> discussions, it looks like the groups we defined should already be
> enforced by the VFS code, which means we should not need such groups
> after all.  My last proposal is to still delegate access for VFS IOCTLs
> to the current Landlock access rights, but it doesn't seem required to
> add specific access check if we are able to identify these VFS IOCTLs.

To say it another way, at least one of the read/write Landlock rights
is already required to open a file/directory, and according to the new
get_required_ioctl_access() grouping we can simplify it further to
fully rely on the meta "open" access right, and then replace
get_required_ioctl_access() with the file type and
vfs_masked_device_ioctl() checks.

For now, the only "optional" access right is
LANDLOCK_ACCESS_FS_TRUNCATE, and I don't think it needs to be tied to
any VFS IOCTLs.

Because the IOCTL_DEV access right is introduced now, future access rights that
may need to also check IOCTL (e.g. change file attribute) should be much
simpler to implement.  Indeed, they will only impact VFS IOCTLs which
would always be allowed with LANDLOCK_ACCESS_FS_IOCTL_DEV.  It should
then be trivial to add a new control layer for a subset of the VFS
IOCTLs.


* Re: [RFC PATCH] fs: Add vfs_masks_device_ioctl*() helpers
  2024-02-19 18:35     ` [RFC PATCH] fs: Add vfs_masks_device_ioctl*() helpers Mickaël Salaün
@ 2024-03-01 13:42       ` Mickaël Salaün
  2024-03-01 16:24       ` Arnd Bergmann
  2024-03-05 18:13       ` Günther Noack
  2 siblings, 0 replies; 50+ messages in thread
From: Mickaël Salaün @ 2024-03-01 13:42 UTC (permalink / raw)
  To: Arnd Bergmann, Christian Brauner, Günther Noack
  Cc: Allen Webb, Dmitry Torokhov, Jeff Xu, Jorge Lucangeli Obes,
	Konstantin Meskhidze, Matt Bobrowski, Paul Moore, linux-fsdevel,
	linux-security-module

Arnd, Christian, are you OK with this approach to identify VFS IOCTLs?

If yes, Günther should include it in his next patch series.

On Mon, Feb 19, 2024 at 07:35:39PM +0100, Mickaël Salaün wrote:
> vfs_masks_device_ioctl() and vfs_masks_device_ioctl_compat() are useful
> to differentiate between device driver IOCTL implementations and
> filesystem ones.  The goal is to be able to filter well-defined IOCTLs
> from per-device (i.e. namespaced) IOCTLs and control such access.
> 
> Add a new ioctl_compat() helper, similar to vfs_ioctl(), to wrap
> compat_ioctl() calls and handle error conversions.
> 
> Cc: Arnd Bergmann <arnd@arndb.de>
> Cc: Christian Brauner <brauner@kernel.org>
> Cc: Günther Noack <gnoack@google.com>
> ---
>  fs/ioctl.c         | 101 +++++++++++++++++++++++++++++++++++++++++----
>  include/linux/fs.h |  12 ++++++
>  2 files changed, 105 insertions(+), 8 deletions(-)
> 
> diff --git a/fs/ioctl.c b/fs/ioctl.c
> index 76cf22ac97d7..f72c8da47d21 100644
> --- a/fs/ioctl.c
> +++ b/fs/ioctl.c
> @@ -763,6 +763,38 @@ static int ioctl_fssetxattr(struct file *file, void __user *argp)
>  	return err;
>  }
>  
> +/*
> + * Safeguard to maintain a list of valid IOCTLs handled by do_vfs_ioctl()
> + * instead of def_blk_fops or def_chr_fops (see init_special_inode).
> + */
> +__attribute_const__ bool vfs_masked_device_ioctl(const unsigned int cmd)
> +{
> +	switch (cmd) {
> +	case FIOCLEX:
> +	case FIONCLEX:
> +	case FIONBIO:
> +	case FIOASYNC:
> +	case FIOQSIZE:
> +	case FIFREEZE:
> +	case FITHAW:
> +	case FS_IOC_FIEMAP:
> +	case FIGETBSZ:
> +	case FICLONE:
> +	case FICLONERANGE:
> +	case FIDEDUPERANGE:
> +	/* FIONREAD is forwarded to device implementations. */
> +	case FS_IOC_GETFLAGS:
> +	case FS_IOC_SETFLAGS:
> +	case FS_IOC_FSGETXATTR:
> +	case FS_IOC_FSSETXATTR:
> +	/* file_ioctl()'s IOCTLs are forwarded to device implementations. */
> +		return true;
> +	default:
> +		return false;
> +	}
> +}
> +EXPORT_SYMBOL(vfs_masked_device_ioctl);
> +
>  /*
>   * do_vfs_ioctl() is not for drivers and not intended to be EXPORT_SYMBOL()'d.
>   * It's just a simple helper for sys_ioctl and compat_sys_ioctl.
> @@ -858,6 +890,8 @@ SYSCALL_DEFINE3(ioctl, unsigned int, fd, unsigned int, cmd, unsigned long, arg)
>  {
>  	struct fd f = fdget(fd);
>  	int error;
> +	const struct inode *inode;
> +	bool is_device;
>  
>  	if (!f.file)
>  		return -EBADF;
> @@ -866,9 +900,18 @@ SYSCALL_DEFINE3(ioctl, unsigned int, fd, unsigned int, cmd, unsigned long, arg)
>  	if (error)
>  		goto out;
>  
> +	inode = file_inode(f.file);
> +	is_device = S_ISBLK(inode->i_mode) || S_ISCHR(inode->i_mode);
> +	if (is_device && !vfs_masked_device_ioctl(cmd)) {
> +		error = vfs_ioctl(f.file, cmd, arg);
> +		goto out;
> +	}
> +
>  	error = do_vfs_ioctl(f.file, fd, cmd, arg);
> -	if (error == -ENOIOCTLCMD)
> +	if (error == -ENOIOCTLCMD) {
> +		WARN_ON_ONCE(is_device);
>  		error = vfs_ioctl(f.file, cmd, arg);
> +	}
>  
>  out:
>  	fdput(f);
> @@ -911,11 +954,49 @@ long compat_ptr_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
>  }
>  EXPORT_SYMBOL(compat_ptr_ioctl);
>  
> +static long ioctl_compat(struct file *filp, unsigned int cmd,
> +			 compat_ulong_t arg)
> +{
> +	int error = -ENOTTY;
> +
> +	if (!filp->f_op->compat_ioctl)
> +		goto out;
> +
> +	error = filp->f_op->compat_ioctl(filp, cmd, arg);
> +	if (error == -ENOIOCTLCMD)
> +		error = -ENOTTY;
> +
> +out:
> +	return error;
> +}
> +
> +__attribute_const__ bool vfs_masked_device_ioctl_compat(const unsigned int cmd)
> +{
> +	switch (cmd) {
> +	case FICLONE:
> +#if defined(CONFIG_X86_64)
> +	case FS_IOC_RESVSP_32:
> +	case FS_IOC_RESVSP64_32:
> +	case FS_IOC_UNRESVSP_32:
> +	case FS_IOC_UNRESVSP64_32:
> +	case FS_IOC_ZERO_RANGE_32:
> +#endif
> +	case FS_IOC32_GETFLAGS:
> +	case FS_IOC32_SETFLAGS:
> +		return true;
> +	default:
> +		return vfs_masked_device_ioctl(cmd);
> +	}
> +}
> +EXPORT_SYMBOL(vfs_masked_device_ioctl_compat);
> +
>  COMPAT_SYSCALL_DEFINE3(ioctl, unsigned int, fd, unsigned int, cmd,
>  		       compat_ulong_t, arg)
>  {
>  	struct fd f = fdget(fd);
>  	int error;
> +	const struct inode *inode;
> +	bool is_device;
>  
>  	if (!f.file)
>  		return -EBADF;
> @@ -924,6 +1005,13 @@ COMPAT_SYSCALL_DEFINE3(ioctl, unsigned int, fd, unsigned int, cmd,
>  	if (error)
>  		goto out;
>  
> +	inode = file_inode(f.file);
> +	is_device = S_ISBLK(inode->i_mode) || S_ISCHR(inode->i_mode);
> +	if (is_device && !vfs_masked_device_ioctl_compat(cmd)) {
> +		error = ioctl_compat(f.file, cmd, arg);
> +		goto out;
> +	}
> +
>  	switch (cmd) {
>  	/* FICLONE takes an int argument, so don't use compat_ptr() */
>  	case FICLONE:
> @@ -964,13 +1052,10 @@ COMPAT_SYSCALL_DEFINE3(ioctl, unsigned int, fd, unsigned int, cmd,
>  	default:
>  		error = do_vfs_ioctl(f.file, fd, cmd,
>  				     (unsigned long)compat_ptr(arg));
> -		if (error != -ENOIOCTLCMD)
> -			break;
> -
> -		if (f.file->f_op->compat_ioctl)
> -			error = f.file->f_op->compat_ioctl(f.file, cmd, arg);
> -		if (error == -ENOIOCTLCMD)
> -			error = -ENOTTY;
> +		if (error == -ENOIOCTLCMD) {
> +			WARN_ON_ONCE(is_device);
> +			error = ioctl_compat(f.file, cmd, arg);
> +		}
>  		break;
>  	}
>  
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index ed5966a70495..b620d0c00e16 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -1902,6 +1902,18 @@ extern long compat_ptr_ioctl(struct file *file, unsigned int cmd,
>  #define compat_ptr_ioctl NULL
>  #endif
>  
> +extern __attribute_const__ bool vfs_masked_device_ioctl(const unsigned int cmd);
> +#ifdef CONFIG_COMPAT
> +extern __attribute_const__ bool
> +vfs_masked_device_ioctl_compat(const unsigned int cmd);
> +#else /* CONFIG_COMPAT */
> +static inline __attribute_const__ bool
> +vfs_masked_device_ioctl_compat(const unsigned int cmd)
> +{
> +	return vfs_masked_device_ioctl(cmd);
> +}
> +#endif /* CONFIG_COMPAT */
> +
>  /*
>   * VFS file helper functions.
>   */
> -- 
> 2.43.0
> 
> 


* Re: [RFC PATCH] fs: Add vfs_masks_device_ioctl*() helpers
  2024-02-19 18:35     ` [RFC PATCH] fs: Add vfs_masks_device_ioctl*() helpers Mickaël Salaün
  2024-03-01 13:42       ` Mickaël Salaün
@ 2024-03-01 16:24       ` Arnd Bergmann
  2024-03-01 18:35         ` Mickaël Salaün
  2024-03-05 18:13       ` Günther Noack
  2 siblings, 1 reply; 50+ messages in thread
From: Arnd Bergmann @ 2024-03-01 16:24 UTC (permalink / raw)
  To: Mickaël Salaün, Christian Brauner, Günther Noack
  Cc: Allen Webb, Dmitry Torokhov, Jeff Xu, Jorge Lucangeli Obes,
	Konstantin Meskhidze, Matt Bobrowski, Paul Moore, linux-fsdevel,
	linux-security-module

On Mon, Feb 19, 2024, at 19:35, Mickaël Salaün wrote:
> vfs_masks_device_ioctl() and vfs_masks_device_ioctl_compat() are useful
> to differentiate between device driver IOCTL implementations and
> filesystem ones.  The goal is to be able to filter well-defined IOCTLs
> from per-device (i.e. namespaced) IOCTLs and control such access.
>
> Add a new ioctl_compat() helper, similar to vfs_ioctl(), to wrap
> compat_ioctl() calls and handle error conversions.

I'm still a bit confused by what your goal is here. I see
the code getting more complex but don't see the payoff in this
patch.

> Cc: Arnd Bergmann <arnd@arndb.de>
> Cc: Christian Brauner <brauner@kernel.org>
> Cc: Günther Noack <gnoack@google.com>

I assume the missing Signed-off-by is intentional while you are
still gathering feedback?

> +/*
> + * Safeguard to maintain a list of valid IOCTLs handled by do_vfs_ioctl()
> + * instead of def_blk_fops or def_chr_fops (see init_special_inode).
> + */
> +__attribute_const__ bool vfs_masked_device_ioctl(const unsigned int cmd)
> +{
> +	switch (cmd) {
> +	case FIOCLEX:
> +	case FIONCLEX:
> +	case FIONBIO:
> +	case FIOASYNC:
> +	case FIOQSIZE:
> +	case FIFREEZE:
> +	case FITHAW:
> +	case FS_IOC_FIEMAP:
> +	case FIGETBSZ:
> +	case FICLONE:
> +	case FICLONERANGE:
> +	case FIDEDUPERANGE:
> +	/* FIONREAD is forwarded to device implementations. */
> +	case FS_IOC_GETFLAGS:
> +	case FS_IOC_SETFLAGS:
> +	case FS_IOC_FSGETXATTR:
> +	case FS_IOC_FSSETXATTR:
> +	/* file_ioctl()'s IOCTLs are forwarded to device implementations. */
> +		return true;
> +	default:
> +		return false;
> +	}
> +}
> +EXPORT_SYMBOL(vfs_masked_device_ioctl);

It looks like this gets added into the hot path of every
ioctl command, which is not ideal, especially when this
no longer gets inlined into the caller.

> +	inode = file_inode(f.file);
> +	is_device = S_ISBLK(inode->i_mode) || S_ISCHR(inode->i_mode);
> +	if (is_device && !vfs_masked_device_ioctl(cmd)) {
> +		error = vfs_ioctl(f.file, cmd, arg);
> +		goto out;
> +	}

The S_ISBLK() || S_ISCHR() check here looks like it changes
behavior, at least for sockets. If that is intentional,
it should probably be a separate patch with a detailed
explanation.

>  	error = do_vfs_ioctl(f.file, fd, cmd, arg);
> -	if (error == -ENOIOCTLCMD)
> +	if (error == -ENOIOCTLCMD) {
> +		WARN_ON_ONCE(is_device);
>  		error = vfs_ioctl(f.file, cmd, arg);
> +	}

The WARN_ON_ONCE() looks like it can be triggered from
userspace, which is generally a bad idea.
 
> +extern __attribute_const__ bool vfs_masked_device_ioctl(const unsigned int cmd);
> +#ifdef CONFIG_COMPAT
> +extern __attribute_const__ bool
> +vfs_masked_device_ioctl_compat(const unsigned int cmd);
> +#else /* CONFIG_COMPAT */
> +static inline __attribute_const__ bool
> +vfs_masked_device_ioctl_compat(const unsigned int cmd)
> +{
> +	return vfs_masked_device_ioctl(cmd);
> +}
> +#endif /* CONFIG_COMPAT */

I don't understand the purpose of the #else path here,
this should not be needed.

      Arnd


* Re: [RFC PATCH] fs: Add vfs_masks_device_ioctl*() helpers
  2024-03-01 16:24       ` Arnd Bergmann
@ 2024-03-01 18:35         ` Mickaël Salaün
  0 siblings, 0 replies; 50+ messages in thread
From: Mickaël Salaün @ 2024-03-01 18:35 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Christian Brauner, Günther Noack, Allen Webb,
	Dmitry Torokhov, Jeff Xu, Jorge Lucangeli Obes,
	Konstantin Meskhidze, Matt Bobrowski, Paul Moore, linux-fsdevel,
	linux-security-module

On Fri, Mar 01, 2024 at 05:24:52PM +0100, Arnd Bergmann wrote:
> On Mon, Feb 19, 2024, at 19:35, Mickaël Salaün wrote:
> > vfs_masks_device_ioctl() and vfs_masks_device_ioctl_compat() are useful
> > to differentiate between device driver IOCTL implementations and
> > filesystem ones.  The goal is to be able to filter well-defined IOCTLs
> > from per-device (i.e. namespaced) IOCTLs and control such access.
> >
> > Add a new ioctl_compat() helper, similar to vfs_ioctl(), to wrap
> > compat_ioctl() calls and handle error conversions.
> 
> I'm still a bit confused by what your goal is here. I see
> the code getting more complex but don't see the payoff in this
> patch.

The main idea is to be able to identify if an IOCTL is handled by a
device driver (i.e. block or character devices) or not.  This would be
used at least by Landlock to control such IOCTL (according to the
char/block device file, which can already be identified) while allowing
other VFS/FS IOCTLs.

> 
> > Cc: Arnd Bergmann <arnd@arndb.de>
> > Cc: Christian Brauner <brauner@kernel.org>
> > Cc: Günther Noack <gnoack@google.com>
> 
> I assume the missing Signed-off-by is intentional while you are
> still gathering feedback?

No, I sent it too quickly and forgot to add it.

Signed-off-by: Mickaël Salaün <mic@digikod.net>

> 
> > +/*
> > + * Safeguard to maintain a list of valid IOCTLs handled by do_vfs_ioctl()
> > + * instead of def_blk_fops or def_chr_fops (see init_special_inode).
> > + */
> > +__attribute_const__ bool vfs_masked_device_ioctl(const unsigned int cmd)
> > +{
> > +	switch (cmd) {
> > +	case FIOCLEX:
> > +	case FIONCLEX:
> > +	case FIONBIO:
> > +	case FIOASYNC:
> > +	case FIOQSIZE:
> > +	case FIFREEZE:
> > +	case FITHAW:
> > +	case FS_IOC_FIEMAP:
> > +	case FIGETBSZ:
> > +	case FICLONE:
> > +	case FICLONERANGE:
> > +	case FIDEDUPERANGE:
> > +	/* FIONREAD is forwarded to device implementations. */
> > +	case FS_IOC_GETFLAGS:
> > +	case FS_IOC_SETFLAGS:
> > +	case FS_IOC_FSGETXATTR:
> > +	case FS_IOC_FSSETXATTR:
> > +	/* file_ioctl()'s IOCTLs are forwarded to device implementations. */
> > +		return true;
> > +	default:
> > +		return false;
> > +	}
> > +}
> > +EXPORT_SYMBOL(vfs_masked_device_ioctl);
> 
> It looks like this gets added into the hot path of every
> ioctl command, which is not ideal, especially when this
> no longer gets inlined into the caller.

I'm looking for a way to guarantee that the list of IOCTLs in this
helper will be kept up-to-date. This kind of run time check might be
too much though. Do you have other suggestions? Do you think a simple
comment to remind contributors to update this helper would be enough?
I guess VFS IOCTLs should not be added often, but I'm worried that this
list could get out of sync...

> 
> > +	inode = file_inode(f.file);
> > +	is_device = S_ISBLK(inode->i_mode) || S_ISCHR(inode->i_mode);
> > +	if (is_device && !vfs_masked_device_ioctl(cmd)) {
> > +		error = vfs_ioctl(f.file, cmd, arg);
> > +		goto out;
> > +	}
> 
> The S_ISBLK() || S_ISCHR() check here looks like it changes
> behavior, at least for sockets. If that is intentional,
> it should probably be a separate patch with a detailed
> explanation.

I don't think this changes the behavior for sockets, and at least that's
not intentional.  This patch should not change the current behavior at
all.

The path to reach socket IOCTLs goes through a vfs_ioctl() call, which
is...

> 
> >  	error = do_vfs_ioctl(f.file, fd, cmd, arg);
> > -	if (error == -ENOIOCTLCMD)
> > +	if (error == -ENOIOCTLCMD) {
> > +		WARN_ON_ONCE(is_device);
> >  		error = vfs_ioctl(f.file, cmd, arg);

...here!

> > +	}
> 
> The WARN_ON_ONCE() looks like it can be triggered from
> userspace, which is generally a bad idea.

This WARN_ON_ONCE() should never be triggered because a device IOCTL
either goes through the previous vfs_ioctl() call (for a
device-specific command) or is handled by this do_vfs_ioctl() call (for
a VFS-specific command).

>  
> > +extern __attribute_const__ bool vfs_masked_device_ioctl(const unsigned int cmd);
> > +#ifdef CONFIG_COMPAT
> > +extern __attribute_const__ bool
> > +vfs_masked_device_ioctl_compat(const unsigned int cmd);
> > +#else /* CONFIG_COMPAT */
> > +static inline __attribute_const__ bool
> > +vfs_masked_device_ioctl_compat(const unsigned int cmd)
> > +{
> > +	return vfs_masked_device_ioctl(cmd);
> > +}
> > +#endif /* CONFIG_COMPAT */
> 
> I don't understand the purpose of the #else path here,
> this should not be needed.

Correct, this else branch is useless.

> 
>       Arnd
> 


* Re: [RFC PATCH] fs: Add vfs_masks_device_ioctl*() helpers
  2024-02-19 18:35     ` [RFC PATCH] fs: Add vfs_masks_device_ioctl*() helpers Mickaël Salaün
  2024-03-01 13:42       ` Mickaël Salaün
  2024-03-01 16:24       ` Arnd Bergmann
@ 2024-03-05 18:13       ` Günther Noack
  2024-03-06 13:47         ` Mickaël Salaün
  2 siblings, 1 reply; 50+ messages in thread
From: Günther Noack @ 2024-03-05 18:13 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: Arnd Bergmann, Christian Brauner, Allen Webb, Dmitry Torokhov,
	Jeff Xu, Jorge Lucangeli Obes, Konstantin Meskhidze,
	Matt Bobrowski, Paul Moore, linux-fsdevel, linux-security-module

Hello!

More questions than answers in this code review, but maybe this discussion will
help to get a clearer picture about what we are going for here.

On Mon, Feb 19, 2024 at 07:35:39PM +0100, Mickaël Salaün wrote:
> vfs_masks_device_ioctl() and vfs_masks_device_ioctl_compat() are useful
> to differentiate between device driver IOCTL implementations and
> filesystem ones.  The goal is to be able to filter well-defined IOCTLs
> from per-device (i.e. namespaced) IOCTLs and control such access.
> 
> Add a new ioctl_compat() helper, similar to vfs_ioctl(), to wrap
> compat_ioctl() calls and handle error conversions.
> 
> Cc: Arnd Bergmann <arnd@arndb.de>
> Cc: Christian Brauner <brauner@kernel.org>
> Cc: Günther Noack <gnoack@google.com>
> ---
>  fs/ioctl.c         | 101 +++++++++++++++++++++++++++++++++++++++++----
>  include/linux/fs.h |  12 ++++++
>  2 files changed, 105 insertions(+), 8 deletions(-)
> 
> diff --git a/fs/ioctl.c b/fs/ioctl.c
> index 76cf22ac97d7..f72c8da47d21 100644
> --- a/fs/ioctl.c
> +++ b/fs/ioctl.c
> @@ -763,6 +763,38 @@ static int ioctl_fssetxattr(struct file *file, void __user *argp)
>  	return err;
>  }
>  
> +/*
> + * Safeguard to maintain a list of valid IOCTLs handled by do_vfs_ioctl()
> + * instead of def_blk_fops or def_chr_fops (see init_special_inode).
> + */
> +__attribute_const__ bool vfs_masked_device_ioctl(const unsigned int cmd)
> +{
> +	switch (cmd) {
> +	case FIOCLEX:
> +	case FIONCLEX:
> +	case FIONBIO:
> +	case FIOASYNC:
> +	case FIOQSIZE:
> +	case FIFREEZE:
> +	case FITHAW:
> +	case FS_IOC_FIEMAP:
> +	case FIGETBSZ:
> +	case FICLONE:
> +	case FICLONERANGE:
> +	case FIDEDUPERANGE:
> +	/* FIONREAD is forwarded to device implementations. */
> +	case FS_IOC_GETFLAGS:
> +	case FS_IOC_SETFLAGS:
> +	case FS_IOC_FSGETXATTR:
> +	case FS_IOC_FSSETXATTR:
> +	/* file_ioctl()'s IOCTLs are forwarded to device implementations. */
> +		return true;
> +	default:
> +		return false;
> +	}
> +}
> +EXPORT_SYMBOL(vfs_masked_device_ioctl);

[
Technical implementation notes about this function: the IOCTLs listed here are
the same ones which do_vfs_ioctl() implements directly.

There are only two cases in which do_vfs_ioctl() does more complicated handling:

(1) FIONREAD falls back to the device's ioctl implemenetation.
    Therefore, we omit FIONREAD in our own list - we do not want to allow that.
(2) The default case falls back to the file_ioctl() function, but *only* for
    S_ISREG() files, so it does not matter for the Landlock case.
]


## What we are actually trying to do (?)

Let me try to take a step back and paraphrase what I think we are *actually*
trying to do here -- please correct me if I am wrong about that:

I think what we *really* are trying to do is to control from the Landlock LSM
whether the filp->f_op->unlocked_ioctl() or filp->f_op->compat_ioctl()
operations are getting called for device files.

So in a world where we cared only about correctness, we could create a new LSM
hook security_file_vfs_ioctl(), which gets checked just before these two f_op
operations get called.  With that, we could permit all IOCTLs that are
implemented in fs/ioctl.c, and we could deny all IOCTL commands that are
implemented in the device implementation.

I guess the reasons why we are not using that approach are performance, and that
it might mess up the LSM hook interface with special cases that only Landlock
needs?  But it seems like it would be easier to reason about?  Or maybe we can
find a middle ground, where we have the existing hook return a special value
with the meaning "permit this IOCTL, but do not invoke the f_op hook"?


## What we implemented

Of course, the existing security_file_ioctl LSM hook works differently, and so
with that hook, we need to make our blocking decision purely based on the struct
file*, the IOCTL command number and the IOCTL argument.

So in order to make that decision correctly based on that information, we end up
listing all the IOCTLs which are directly(!) implemented in do_vfs_ioctl(),
because for Landlock, this is the list of IOCTL commands which is safe to permit
on device files.  And we need to keep that list in sync with fs/ioctl.c, which
is why it ended up in the same place in this commit.


(Is it maybe possible to check with a KUnit test whether such lists are in sync?
It sounds superficially like it should be feasible to create a device file which
records whether its ioctl implementation was called.  So we could at least check
that the Landlock command list is a subset of the do_vfs_ioctl() one.)
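
To make the KUnit idea a bit more concrete, here is a rough userspace model of
the check; it is not actual KUnit code.  The command names, struct model_dev and
model_sys_ioctl() are all made up for illustration.  A real test would have to
register a device and go through the real ioctl path instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical command IDs, standing in for the real FI* / FS_IOC_* values. */
enum model_cmd { CMD_FIOCLEX, CMD_FITHAW, CMD_FIONREAD, CMD_DRIVER_PRIVATE };

/* Fake device file: remembers whether its ioctl handler was reached. */
struct model_dev {
	bool ioctl_called;
};

int model_dev_ioctl(struct model_dev *dev)
{
	dev->ioctl_called = true;
	return 0;
}

/* Model of the syscall path: some commands stay in the VFS, the rest reach the device. */
int model_sys_ioctl(struct model_dev *dev, enum model_cmd cmd)
{
	assert(dev != NULL);
	switch (cmd) {
	case CMD_FIOCLEX:
	case CMD_FITHAW:
		return 0;			/* handled entirely in the VFS layer */
	default:
		return model_dev_ioctl(dev);	/* forwarded to the device */
	}
}

/* Returns true if none of the listed commands reaches the device handler. */
bool list_never_reaches_device(const enum model_cmd *list, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		struct model_dev dev = { .ioctl_called = false };

		model_sys_ioctl(&dev, list[i]);
		if (dev.ioctl_called)
			return false;
	}
	return true;
}
```

With such a helper, a list containing only VFS-handled commands passes, while a
list that includes a forwarded command (FIONREAD in this model) fails.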


> +
>  /*
>   * do_vfs_ioctl() is not for drivers and not intended to be EXPORT_SYMBOL()'d.
>   * It's just a simple helper for sys_ioctl and compat_sys_ioctl.
> @@ -858,6 +890,8 @@ SYSCALL_DEFINE3(ioctl, unsigned int, fd, unsigned int, cmd, unsigned long, arg)
>  {
>  	struct fd f = fdget(fd);
>  	int error;
> +	const struct inode *inode;
> +	bool is_device;
>  
>  	if (!f.file)
>  		return -EBADF;
> @@ -866,9 +900,18 @@ SYSCALL_DEFINE3(ioctl, unsigned int, fd, unsigned int, cmd, unsigned long, arg)
>  	if (error)
>  		goto out;
>  
> +	inode = file_inode(f.file);
> +	is_device = S_ISBLK(inode->i_mode) || S_ISCHR(inode->i_mode);
> +	if (is_device && !vfs_masked_device_ioctl(cmd)) {
> +		error = vfs_ioctl(f.file, cmd, arg);
> +		goto out;
> +	}
> +
>  	error = do_vfs_ioctl(f.file, fd, cmd, arg);
> -	if (error == -ENOIOCTLCMD)
> +	if (error == -ENOIOCTLCMD) {
> +		WARN_ON_ONCE(is_device);
>  		error = vfs_ioctl(f.file, cmd, arg);
> +	}

It is not obvious at first that adding this list requires a change to the ioctl
syscall implementations.  If I understand this right, the idea is that you want
to be 100% sure that we are not calling vfs_ioctl() for the commands in that
list.  And there is a scenario where this could potentially happen:

do_vfs_ioctl() implements most things like this:

static int do_vfs_ioctl(...) {
	switch (cmd) {
	/* many cases like the following: */
	case FITHAW:
		return ioctl_fsthaw(filp);
	/* ... */
	}
	return -ENOIOCTLCMD;
}

So I believe the scenario you want to avoid is the one where ioctl_fsthaw() or
one of the other functions returns -ENOIOCTLCMD by accident, and where that will
then make the surrounding syscall implementation fall back to vfs_ioctl()
despite the cmd being listed as safe for Landlock?  Is that right?

Looking at do_vfs_ioctl() and its helper functions, I am getting the impression
that -ENOIOCTLCMD is only supposed to be returned at the very end of it, but not
by any of the helper functions?  If that were the case, we could maybe just as
well solve that problem locally in do_vfs_ioctl()?

A bit inelegant maybe, but just to get the idea across:

static int sanitize_enoioctlcmd(int res) {
	if (res == -ENOIOCTLCMD)
		return -ENOTTY;
	return res;
}

static int do_vfs_ioctl(...) {
	switch (cmd) {
	/* many cases like the following: */
	case FITHAW:
		return sanitize_enoioctlcmd(ioctl_fsthaw(filp));
	/* ... */
	}
	return -ENOIOCTLCMD;
}

Would that be better?
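
For illustration, a small userspace model of the fallback problem and of the
sanitizing wrapper.  MODEL_ENOIOCTLCMD mirrors the kernel-internal ENOIOCTLCMD
value (515), which userspace headers do not export; buggy_helper() and
model_sys_ioctl() are hypothetical stand-ins and not code from fs/ioctl.c.

```c
#include <assert.h>
#include <errno.h>

/* Kernel-internal value, not exported to userspace; defined here for the model. */
#define MODEL_ENOIOCTLCMD 515

/* Counts how often the fallback (the model's vfs_ioctl()) was taken. */
int fallback_calls;

/* A helper that returns -ENOIOCTLCMD by accident. */
int buggy_helper(void)
{
	return -MODEL_ENOIOCTLCMD;
}

/* Turns a stray -ENOIOCTLCMD into -ENOTTY; other results pass through. */
int sanitize_enoioctlcmd(int res)
{
	if (res == -MODEL_ENOIOCTLCMD)
		return -ENOTTY;
	return res;
}

/* Model of the syscall: fall back to the device only on -ENOIOCTLCMD. */
int model_sys_ioctl(int vfs_result)
{
	assert(vfs_result <= 0);
	if (vfs_result == -MODEL_ENOIOCTLCMD) {
		fallback_calls++;
		return 0;	/* pretend the device handled it */
	}
	return vfs_result;
}
```

In this model, model_sys_ioctl(buggy_helper()) takes the fallback, whereas
model_sys_ioctl(sanitize_enoioctlcmd(buggy_helper())) returns -ENOTTY without
ever reaching the device.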

—Günther


* Re: [RFC PATCH] fs: Add vfs_masks_device_ioctl*() helpers
  2024-03-05 18:13       ` Günther Noack
@ 2024-03-06 13:47         ` Mickaël Salaün
  2024-03-06 15:18           ` Arnd Bergmann
  0 siblings, 1 reply; 50+ messages in thread
From: Mickaël Salaün @ 2024-03-06 13:47 UTC (permalink / raw)
  To: Günther Noack, Paul Moore, Arnd Bergmann, Christian Brauner
  Cc: Allen Webb, Dmitry Torokhov, Jeff Xu, Jorge Lucangeli Obes,
	Konstantin Meskhidze, Matt Bobrowski, linux-fsdevel,
	linux-security-module

On Tue, Mar 05, 2024 at 07:13:33PM +0100, Günther Noack wrote:
> Hello!
> 
> More questions than answers in this code review, but maybe this discussion will
> help to get a clearer picture about what we are going for here.
> 
> On Mon, Feb 19, 2024 at 07:35:39PM +0100, Mickaël Salaün wrote:
> > vfs_masks_device_ioctl() and vfs_masks_device_ioctl_compat() are useful
> > to differentiate between device driver IOCTL implementations and
> > filesystem ones.  The goal is to be able to filter well-defined IOCTLs
> > from per-device (i.e. namespaced) IOCTLs and control such access.
> > 
> > Add a new ioctl_compat() helper, similar to vfs_ioctl(), to wrap
> > compat_ioctl() calls and handle error conversions.
> > 
> > Cc: Arnd Bergmann <arnd@arndb.de>
> > Cc: Christian Brauner <brauner@kernel.org>
> > Cc: Günther Noack <gnoack@google.com>
> > ---
> >  fs/ioctl.c         | 101 +++++++++++++++++++++++++++++++++++++++++----
> >  include/linux/fs.h |  12 ++++++
> >  2 files changed, 105 insertions(+), 8 deletions(-)
> > 
> > diff --git a/fs/ioctl.c b/fs/ioctl.c
> > index 76cf22ac97d7..f72c8da47d21 100644
> > --- a/fs/ioctl.c
> > +++ b/fs/ioctl.c
> > @@ -763,6 +763,38 @@ static int ioctl_fssetxattr(struct file *file, void __user *argp)
> >  	return err;
> >  }
> >  
> > +/*
> > + * Safeguard to maintain a list of valid IOCTLs handled by do_vfs_ioctl()
> > + * instead of def_blk_fops or def_chr_fops (see init_special_inode).
> > + */
> > +__attribute_const__ bool vfs_masked_device_ioctl(const unsigned int cmd)
> > +{
> > +	switch (cmd) {
> > +	case FIOCLEX:
> > +	case FIONCLEX:
> > +	case FIONBIO:
> > +	case FIOASYNC:
> > +	case FIOQSIZE:
> > +	case FIFREEZE:
> > +	case FITHAW:
> > +	case FS_IOC_FIEMAP:
> > +	case FIGETBSZ:
> > +	case FICLONE:
> > +	case FICLONERANGE:
> > +	case FIDEDUPERANGE:
> > +	/* FIONREAD is forwarded to device implementations. */
> > +	case FS_IOC_GETFLAGS:
> > +	case FS_IOC_SETFLAGS:
> > +	case FS_IOC_FSGETXATTR:
> > +	case FS_IOC_FSSETXATTR:
> > +	/* file_ioctl()'s IOCTLs are forwarded to device implementations. */
> > +		return true;
> > +	default:
> > +		return false;
> > +	}
> > +}
> > +EXPORT_SYMBOL(vfs_masked_device_ioctl);
> 
> [
> Technical implementation notes about this function: the list of IOCTLs here is
> the same as the ones which do_vfs_ioctl() implements directly.
> 
> There are only two cases in which do_vfs_ioctl() does more complicated handling:
> 
> (1) FIONREAD falls back to the device's ioctl implementation.
>     Therefore, we omit FIONREAD in our own list - we do not want to allow that.
> (2) The default case falls back to the file_ioctl() function, but *only* for
>     S_ISREG() files, so it does not matter for the Landlock case.
> ]
> 
> 
> ## What we are actually trying to do (?)
> 
> Let me try to take a step back and paraphrase what I think we are *actually*
> trying to do here -- please correct me if I am wrong about that:
> 
> I think what we *really* are trying to do is to control from the Landlock LSM
> whether the filp->f_op->unlocked_ioctl() or filp->f_op->compat_ioctl()
> operations are getting called for device files.
> 
> So in a world where we cared only about correctness, we could create a new LSM
> hook security_file_vfs_ioctl(), which gets checked just before these two f_op
> operations get called.  With that, we could permit all IOCTLs that are
> implemented in fs/ioctl.c, and we could deny all IOCTL commands that are
> implemented in the device implementation.
> 
> I guess the reasons why we are not using that approach are performance, and that
> it might mess up the LSM hook interface with special cases that only Landlock
> needs?  But it seems like it would be easier to reason about?  Or maybe we can
> find a middle ground, where we have the existing hook return a special value
> with the meaning "permit this IOCTL, but do not invoke the f_op hook"?

Your security_file_vfs_ioctl() approach is simpler and better, I like
it!  From a performance point of view it should not change much because
either an LSM would use the current IOCTL hook or this new one.  Using a
flag with the current IOCTL hook would be a missed opportunity for
performance improvements because this hook could be called even if it is
not needed.

I don't think it would be worth it to create a new hook for compat and
non-compat mode because we want to control these IOCTLs the same way for
now, so it would not have a performance impact, but for consistency with
the current IOCTL hooks I guess Paul would prefer two new hooks:
security_file_vfs_ioctl() and security_file_vfs_ioctl_compat()?

Another approach would be to split the IOCTL hook into two: one for the
VFS layer and another for the underlying implementations.  However, it
looks like a difficult and brittle approach according to the current
IOCTL implementations.

Arnd, Christian, Paul, are you OK with this new hook proposal?

> 
> 
> ## What we implemented
> 
> Of course, the existing security_file_ioctl LSM hook works differently, and so
> with that hook, we need to make our blocking decision purely based on the struct
> file*, the IOCTL command number and the IOCTL argument.
> 
> So in order to make that decision correctly based on that information, we end up
> listing all the IOCTLs which are directly(!) implemented in do_vfs_ioctl(),
> because for Landlock, this is the list of IOCTL commands which is safe to permit
> on device files.  And we need to keep that list in sync with fs/ioctl.c, which
> is why it ended up in the same place in this commit.
> 
> 
> (Is it maybe possible to check with a KUnit test whether such lists are in sync?
> It sounds superficially like it should be feasible to create a device file which
> records whether its ioctl implementation was called.  So we could at least check
> that the Landlock command list is a subset of the do_vfs_ioctl() one.)
> 
> 
> > +
> >  /*
> >   * do_vfs_ioctl() is not for drivers and not intended to be EXPORT_SYMBOL()'d.
> >   * It's just a simple helper for sys_ioctl and compat_sys_ioctl.
> > @@ -858,6 +890,8 @@ SYSCALL_DEFINE3(ioctl, unsigned int, fd, unsigned int, cmd, unsigned long, arg)
> >  {
> >  	struct fd f = fdget(fd);
> >  	int error;
> > +	const struct inode *inode;
> > +	bool is_device;
> >  
> >  	if (!f.file)
> >  		return -EBADF;
> > @@ -866,9 +900,18 @@ SYSCALL_DEFINE3(ioctl, unsigned int, fd, unsigned int, cmd, unsigned long, arg)
> >  	if (error)
> >  		goto out;
> >  
> > +	inode = file_inode(f.file);
> > +	is_device = S_ISBLK(inode->i_mode) || S_ISCHR(inode->i_mode);
> > +	if (is_device && !vfs_masked_device_ioctl(cmd)) {
> > +		error = vfs_ioctl(f.file, cmd, arg);
> > +		goto out;
> > +	}
> > +
> >  	error = do_vfs_ioctl(f.file, fd, cmd, arg);
> > -	if (error == -ENOIOCTLCMD)
> > +	if (error == -ENOIOCTLCMD) {
> > +		WARN_ON_ONCE(is_device);
> >  		error = vfs_ioctl(f.file, cmd, arg);
> > +	}
> 
> It is not obvious at first that adding this list requires a change to the ioctl
> syscall implementations.  If I understand this right, the idea is that you want
> to be 100% sure that we are not calling vfs_ioctl() for the commands in that
> list.

Correct

> And there is a scenario where this could potentially happen:
> 
> do_vfs_ioctl() implements most things like this:
> 
> static int do_vfs_ioctl(...) {
> 	switch (cmd) {
> 	/* many cases like the following: */
> 	case FITHAW:
> 		return ioctl_fsthaw(filp);
> 	/* ... */
> 	}
> 	return -ENOIOCTLCMD;
> }
> 
> So I believe the scenario you want to avoid is the one where ioctl_fsthaw() or
> one of the other functions returns -ENOIOCTLCMD by accident, and where that will
> then make the surrounding syscall implementation fall back to vfs_ioctl()
> despite the cmd being listed as safe for Landlock?  Is that right?

Yes

> 
> Looking at do_vfs_ioctl() and its helper functions, I am getting the impression
> that -ENOIOCTLCMD is only supposed to be returned at the very end of it, but not
> by any of the helper functions?  If that were the case, we could maybe just as
> well just solve that problem local to do_vfs_ioctl()?
> 
> A bit inelegant maybe, but just to get the idea across:
> 
> static int sanitize_enoioctlcmd(int res) {
> 	if (res == -ENOIOCTLCMD)
> 		return -ENOTTY;
> 	return res;
> }
> 
> static int do_vfs_ioctl(...) {
> 	switch (cmd) {
> 	/* many cases like the following: */
> 	case FITHAW:
> 		return sanitize_enoioctlcmd(ioctl_fsthaw(filp));
> 	/* ... */
> 	}
> 	return -ENOIOCTLCMD;
> }
> 
> Would that be better?

I guess so, but a bit more intrusive. Anyway, the new LSM hook would be
much cleaner and would require less intrusive changes in fs/ioctl.c

The ioctl_compat() helper from this patch could still be useful though.

> 
> —Günther
> 
> 

* Re: [RFC PATCH] fs: Add vfs_masks_device_ioctl*() helpers
  2024-03-06 13:47         ` Mickaël Salaün
@ 2024-03-06 15:18           ` Arnd Bergmann
  2024-03-07 12:15             ` Christian Brauner
  0 siblings, 1 reply; 50+ messages in thread
From: Arnd Bergmann @ 2024-03-06 15:18 UTC (permalink / raw)
  To: Mickaël Salaün, Günther Noack, Paul Moore,
	Christian Brauner
  Cc: Allen Webb, Dmitry Torokhov, Jeff Xu, Jorge Lucangeli Obes,
	Konstantin Meskhidze, Matt Bobrowski, linux-fsdevel,
	linux-security-module

On Wed, Mar 6, 2024, at 14:47, Mickaël Salaün wrote:
> On Tue, Mar 05, 2024 at 07:13:33PM +0100, Günther Noack wrote:
>> On Mon, Feb 19, 2024 at 07:35:39PM +0100, Mickaël Salaün wrote:

>> > +	case FS_IOC_FSGETXATTR:
>> > +	case FS_IOC_FSSETXATTR:
>> > +	/* file_ioctl()'s IOCTLs are forwarded to device implementations. */
>> > +		return true;
>> > +	default:
>> > +		return false;
>> > +	}
>> > +}
>> > +EXPORT_SYMBOL(vfs_masked_device_ioctl);
>> 
>> [
>> Technical implementation notes about this function: the list of IOCTLs here is
>> the same as the ones which do_vfs_ioctl() implements directly.
>> 
>> There are only two cases in which do_vfs_ioctl() does more complicated handling:
>> 
>> (1) FIONREAD falls back to the device's ioctl implementation.
>>     Therefore, we omit FIONREAD in our own list - we do not want to allow that.

>> (2) The default case falls back to the file_ioctl() function, but *only* for
>>     S_ISREG() files, so it does not matter for the Landlock case.

How about changing do_vfs_ioctl() to return -ENOIOCTLCMD for
FIONREAD on special files? That way, the two cases become the
same.
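
Roughly like this, as a simplified userspace model rather than a patch against
fs/ioctl.c.  MODEL_ENOIOCTLCMD stands in for the kernel-internal ENOIOCTLCMD
(515), and model_do_vfs_ioctl() with its parameters is a hypothetical
reduction of the real function to the one case that matters here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_ENOIOCTLCMD 515	/* kernel-internal value, defined here for the model */

enum model_cmd { CMD_FIONREAD, CMD_OTHER };

/*
 * Model of do_vfs_ioctl() with the suggested change: FIONREAD is answered
 * here only for regular files.  For everything else the function reports
 * "not handled here", exactly like the default case does.
 */
int model_do_vfs_ioctl(enum model_cmd cmd, bool is_regular, int bytes_left,
		       int *out)
{
	assert(out != NULL);
	switch (cmd) {
	case CMD_FIONREAD:
		if (!is_regular)
			return -MODEL_ENOIOCTLCMD;
		*out = bytes_left;
		return 0;
	default:
		return -MODEL_ENOIOCTLCMD;
	}
}
```

The caller then has a single fallback path for both FIONREAD on special files
and unknown commands.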

>> I guess the reasons why we are not using that approach are performance, and that
>> it might mess up the LSM hook interface with special cases that only Landlock
>> needs?  But it seems like it would be easier to reason about?  Or maybe we can
>> find a middle ground, where we have the existing hook return a special value
>> with the meaning "permit this IOCTL, but do not invoke the f_op hook"?
>
> Your security_file_vfs_ioctl() approach is simpler and better, I like
> it!  From a performance point of view it should not change much because
> either an LSM would use the current IOCTL hook or this new one.  Using a
> flag with the current IOCTL hook would be a missed opportunity for
> performance improvements because this hook could be called even if it is
> not needed.
>
> I don't think it would be worth it to create a new hook for compat and
> non-compat mode because we want to control these IOCTLs the same way for
> now, so it would not have a performance impact, but for consistency with
> the current IOCTL hooks I guess Paul would prefer two new hooks:
> security_file_vfs_ioctl() and security_file_vfs_ioctl_compat()?
>
> Another approach would be to split the IOCTL hook into two: one for the
> VFS layer and another for the underlying implementations.  However, it
> looks like a difficult and brittle approach according to the current
> IOCTL implementations.
>
> Arnd, Christian, Paul, are you OK with this new hook proposal?

I think this sounds better. It would fit more closely into
the overall structure of the ioctl handlers with their multiple
levels, where below vfs_ioctl() calling into f_ops->unlocked_ioctl,
you have the same structure for sockets and blockdev, and
then additional levels below that and some weirdness for
things like tty, scsi or cdrom.

>> And there is a scenario where this could potentially happen:
>> 
>> do_vfs_ioctl() implements most things like this:
>> 
>> static int do_vfs_ioctl(...) {
>> 	switch (cmd) {
>> 	/* many cases like the following: */
>> 	case FITHAW:
>> 		return ioctl_fsthaw(filp);
>> 	/* ... */
>> 	}
>> 	return -ENOIOCTLCMD;
>> }
>> 
>> So I believe the scenario you want to avoid is the one where ioctl_fsthaw() or
>> one of the other functions returns -ENOIOCTLCMD by accident, and where that will
>> then make the surrounding syscall implementation fall back to vfs_ioctl()
>> despite the cmd being listed as safe for Landlock?  Is that right?
>
> Yes

This does go against the normal structure a bit then, where
any of the commands is allowed to return -ENOIOCTLCMD specifically
for the purpose of passing control to the next level of
callbacks. Having the Landlock hook explicitly at the
place where the callback is entered, as Günther suggested, makes
much more sense to me then.

      Arnd

* Re: [RFC PATCH] fs: Add vfs_masks_device_ioctl*() helpers
  2024-03-06 15:18           ` Arnd Bergmann
@ 2024-03-07 12:15             ` Christian Brauner
  2024-03-07 12:21               ` Arnd Bergmann
  0 siblings, 1 reply; 50+ messages in thread
From: Christian Brauner @ 2024-03-07 12:15 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Mickaël Salaün, Günther Noack, Paul Moore,
	Allen Webb, Dmitry Torokhov, Jeff Xu, Jorge Lucangeli Obes,
	Konstantin Meskhidze, Matt Bobrowski, linux-fsdevel,
	linux-security-module

On Wed, Mar 06, 2024 at 04:18:53PM +0100, Arnd Bergmann wrote:
> On Wed, Mar 6, 2024, at 14:47, Mickaël Salaün wrote:
> > On Tue, Mar 05, 2024 at 07:13:33PM +0100, Günther Noack wrote:
> >> On Mon, Feb 19, 2024 at 07:35:39PM +0100, Mickaël Salaün wrote:
> 
> >> > +	case FS_IOC_FSGETXATTR:
> >> > +	case FS_IOC_FSSETXATTR:
> >> > +	/* file_ioctl()'s IOCTLs are forwarded to device implementations. */
> >> > +		return true;
> >> > +	default:
> >> > +		return false;
> >> > +	}
> >> > +}
> >> > +EXPORT_SYMBOL(vfs_masked_device_ioctl);
> >> 
> >> [
> >> Technical implementation notes about this function: the list of IOCTLs here is
> >> the same as the ones which do_vfs_ioctl() implements directly.
> >> 
> >> There are only two cases in which do_vfs_ioctl() does more complicated handling:
> >> 
> >> (1) FIONREAD falls back to the device's ioctl implementation.
> >>     Therefore, we omit FIONREAD in our own list - we do not want to allow that.
> 
> >> (2) The default case falls back to the file_ioctl() function, but *only* for
> >>     S_ISREG() files, so it does not matter for the Landlock case.
> 
> How about changing do_vfs_ioctl() to return -ENOIOCTLCMD for
> FIONREAD on special files? That way, the two cases become the
> same.
> 
> >> I guess the reasons why we are not using that approach are performance, and that
> >> it might mess up the LSM hook interface with special cases that only Landlock
> >> needs?  But it seems like it would be easier to reason about?  Or maybe we can
> >> find a middle ground, where we have the existing hook return a special value
> >> with the meaning "permit this IOCTL, but do not invoke the f_op hook"?
> >
> > Your security_file_vfs_ioctl() approach is simpler and better, I like
> > it!  From a performance point of view it should not change much because
> > either an LSM would use the current IOCTL hook or this new one.  Using a
> > flag with the current IOCTL hook would be a missed opportunity for
> > performance improvements because this hook could be called even if it is
> > not needed.
> >
> > I don't think it would be worth it to create a new hook for compat and
> > non-compat mode because we want to control these IOCTLs the same way for
> > now, so it would not have a performance impact, but for consistency with
> > the current IOCTL hooks I guess Paul would prefer two new hooks:
> > security_file_vfs_ioctl() and security_file_vfs_ioctl_compat()?
> >
> > Another approach would be to split the IOCTL hook into two: one for the
> > VFS layer and another for the underlying implementations.  However, it
> > looks like a difficult and brittle approach according to the current
> > IOCTL implementations.
> >
> > Arnd, Christian, Paul, are you OK with this new hook proposal?
> 
> I think this sounds better. It would fit more closely into
> the overall structure of the ioctl handlers with their multiple
> levels, where below vfs_ioctl() calling into f_ops->unlocked_ioctl,
> you have the same structure for sockets and blockdev, and
> then additional levels below that and some weirdness for
> things like tty, scsi or cdrom.

So an additional security hook called from tty, scsi, or cdrom?
And the original hook is left where it is right now?

* Re: [RFC PATCH] fs: Add vfs_masks_device_ioctl*() helpers
  2024-03-07 12:15             ` Christian Brauner
@ 2024-03-07 12:21               ` Arnd Bergmann
  2024-03-07 12:57                 ` Günther Noack
  0 siblings, 1 reply; 50+ messages in thread
From: Arnd Bergmann @ 2024-03-07 12:21 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Mickaël Salaün, Günther Noack, Paul Moore,
	Allen Webb, Dmitry Torokhov, Jeff Xu, Jorge Lucangeli Obes,
	Konstantin Meskhidze, Matt Bobrowski, linux-fsdevel,
	linux-security-module

On Thu, Mar 7, 2024, at 13:15, Christian Brauner wrote:
> On Wed, Mar 06, 2024 at 04:18:53PM +0100, Arnd Bergmann wrote:
>> On Wed, Mar 6, 2024, at 14:47, Mickaël Salaün wrote:
>> >
>> > Arnd, Christian, Paul, are you OK with this new hook proposal?
>> 
>> I think this sounds better. It would fit more closely into
>> the overall structure of the ioctl handlers with their multiple
>> levels, where below vfs_ioctl() calling into f_ops->unlocked_ioctl,
>> you have the same structure for sockets and blockdev, and
>> then additional levels below that and some weirdness for
>> things like tty, scsi or cdrom.
>
> So an additional security hook called from tty, scsi, or cdrom?
> And the original hook is left where it is right now?

For the moment, I think adding another hook in vfs_ioctl()
and the corresponding compat path would do what Mickaël
wants. Beyond that, we could consider having hooks in
socket and block ioctls if needed as they are easy to
filter out based on inode->i_mode.

The tty/scsi/cdrom hooks would be harder to do, let's assume
for now that we don't need them.

      Arnd

* Re: [RFC PATCH] fs: Add vfs_masks_device_ioctl*() helpers
  2024-03-07 12:21               ` Arnd Bergmann
@ 2024-03-07 12:57                 ` Günther Noack
  2024-03-07 20:40                   ` Paul Moore
  0 siblings, 1 reply; 50+ messages in thread
From: Günther Noack @ 2024-03-07 12:57 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Christian Brauner, Mickaël Salaün, Paul Moore,
	Allen Webb, Dmitry Torokhov, Jeff Xu, Jorge Lucangeli Obes,
	Konstantin Meskhidze, Matt Bobrowski, linux-fsdevel,
	linux-security-module

On Thu, Mar 07, 2024 at 01:21:48PM +0100, Arnd Bergmann wrote:
> On Thu, Mar 7, 2024, at 13:15, Christian Brauner wrote:
> > On Wed, Mar 06, 2024 at 04:18:53PM +0100, Arnd Bergmann wrote:
> >> On Wed, Mar 6, 2024, at 14:47, Mickaël Salaün wrote:
> >> >
> >> > Arnd, Christian, Paul, are you OK with this new hook proposal?
> >> 
> >> I think this sounds better. It would fit more closely into
> >> the overall structure of the ioctl handlers with their multiple
> >> levels, where below vfs_ioctl() calling into f_ops->unlocked_ioctl,
> >> you have the same structure for sockets and blockdev, and
> >> then additional levels below that and some weirdness for
> >> things like tty, scsi or cdrom.
> >
> > So an additional security hook called from tty, scsi, or cdrom?
> > And the original hook is left where it is right now?
> 
> For the moment, I think adding another hook in vfs_ioctl()
> and the corresponding compat path would do what Mickaël
> wants. Beyond that, we could consider having hooks in
> socket and block ioctls if needed as they are easy to
> filter out based on inode->i_mode.
> 
> The tty/scsi/cdrom hooks would be harder to do, let's assume
> for now that we don't need them.

Thank you all for the help!

Yes, tty/scsi/cdrom are just examples.  We do not need special features for
these for Landlock right now.

What I would do is to invoke the new LSM hook in the following two places in
fs/ioctl.c:

1) at the top of vfs_ioctl()
2) at the top of ioctl_compat()

(Both of these functions are just invoking the f_op->unlocked_ioctl() and
f_op->compat_ioctl() operations with a safeguard for that being a NULL pointer.)

The intent is that the new hook gets called every time before an ioctl is sent to
these IOCTL operations in f_op, so that the LSM can distinguish cleanly between
the "safe" IOCTLs that are implemented fully within fs/ioctl.c and the
"potentially unsafe" IOCTLs which are implemented by these hooks (as it is
unrealistic for us to holistically reason about the safety of all possible
implementations).
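
As a rough sketch of where the check would sit, here is a userspace model of
vfs_ioctl() with the hook at the top.  The hook name follows the proposal in
this thread; struct model_file, its lsm_denies flag and the -EACCES return
value are assumptions made up for the model.  The real vfs_ioctl() also
converts -ENOIOCTLCMD to -ENOTTY, which is left out here.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct model_file;
typedef long (*model_ioctl_fn)(struct model_file *f, unsigned int cmd,
			       unsigned long arg);

struct model_file {
	model_ioctl_fn unlocked_ioctl;	/* stands in for f_op->unlocked_ioctl */
	int lsm_denies;			/* nonzero: the hypothetical LSM denies the call */
	int handler_calls;		/* how often the handler above was reached */
};

/* Hypothetical new hook; in the kernel this would call into the LSMs. */
int model_security_file_vfs_ioctl(const struct model_file *f)
{
	return f->lsm_denies ? -EACCES : 0;
}

/* Model of vfs_ioctl() with the hook checked before f_op is invoked. */
long model_vfs_ioctl(struct model_file *f, unsigned int cmd, unsigned long arg)
{
	int err;

	assert(f != NULL);
	err = model_security_file_vfs_ioctl(f);
	if (err)
		return err;
	if (!f->unlocked_ioctl)
		return -ENOTTY;
	return f->unlocked_ioctl(f, cmd, arg);
}

/* Example device handler that only records that it ran. */
long model_dev_handler(struct model_file *f, unsigned int cmd, unsigned long arg)
{
	(void)cmd;
	(void)arg;
	f->handler_calls++;
	return 0;
}
```

The point of the model is only the ordering: when the hook denies the call, the
device handler is never entered, and commands handled earlier in fs/ioctl.c
never get this far in the first place.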

The alternative approach where we try to do the same based on the existing LSM
IOCTL hook resulted in the patch further up in this mail thread - it involves
maintaining a list of "safe" IOCTL commands, and it is difficult to guarantee
that these lists of IOCTL commands stay in sync.

Christian, does that make sense in your mind?

Thanks,
—Günther

* Re: [RFC PATCH] fs: Add vfs_masks_device_ioctl*() helpers
  2024-03-07 12:57                 ` Günther Noack
@ 2024-03-07 20:40                   ` Paul Moore
  2024-03-07 23:09                     ` Dave Chinner
  0 siblings, 1 reply; 50+ messages in thread
From: Paul Moore @ 2024-03-07 20:40 UTC (permalink / raw)
  To: Günther Noack, Mickaël Salaün
  Cc: Arnd Bergmann, Christian Brauner, Allen Webb, Dmitry Torokhov,
	Jeff Xu, Jorge Lucangeli Obes, Konstantin Meskhidze,
	Matt Bobrowski, linux-fsdevel, linux-security-module

On Thu, Mar 7, 2024 at 7:57 AM Günther Noack <gnoack@google.com> wrote:
> On Thu, Mar 07, 2024 at 01:21:48PM +0100, Arnd Bergmann wrote:
> > On Thu, Mar 7, 2024, at 13:15, Christian Brauner wrote:
> > > On Wed, Mar 06, 2024 at 04:18:53PM +0100, Arnd Bergmann wrote:
> > >> On Wed, Mar 6, 2024, at 14:47, Mickaël Salaün wrote:
> > >> >
> > >> > Arnd, Christian, Paul, are you OK with this new hook proposal?
> > >>
> > >> I think this sounds better. It would fit more closely into
> > >> the overall structure of the ioctl handlers with their multiple
> > >> levels, where below vfs_ioctl() calling into f_ops->unlocked_ioctl,
> > >> you have the same structure for sockets and blockdev, and
> > >> then additional levels below that and some weirdness for
> > >> things like tty, scsi or cdrom.
> > >
> > > So an additional security hook called from tty, scsi, or cdrom?
> > > And the original hook is left where it is right now?
> >
> > For the moment, I think adding another hook in vfs_ioctl()
> > and the corresponding compat path would do what Mickaël
> > wants. Beyond that, we could consider having hooks in
> > socket and block ioctls if needed as they are easy to
> > filter out based on inode->i_mode.
> >
> > The tty/scsi/cdrom hooks would be harder to do, let's assume
> > for now that we don't need them.
>
> Thank you all for the help!
>
> Yes, tty/scsi/cdrom are just examples.  We do not need special features for
> these for Landlock right now.
>
> What I would do is to invoke the new LSM hook in the following two places in
> fs/ioctl.c:
>
> 1) at the top of vfs_ioctl()
> 2) at the top of ioctl_compat()
>
> (Both of these functions are just invoking the f_op->unlocked_ioctl() and
> f_op->compat_ioctl() operations with a safeguard for that being a NULL pointer.)
>
> The intent is that the new hook gets called every time before an ioctl is sent to
> these IOCTL operations in f_op, so that the LSM can distinguish cleanly between
> the "safe" IOCTLs that are implemented fully within fs/ioctl.c and the
> "potentially unsafe" IOCTLs which are implemented by these hooks (as it is
> unrealistic for us to holistically reason about the safety of all possible
> implementations).
>
> The alternative approach where we try to do the same based on the existing LSM
> IOCTL hook resulted in the patch further up in this mail thread - it involves
> maintaining a list of "safe" IOCTL commands, and it is difficult to guarantee
> that these lists of IOCTL commands stay in sync.

I need some more convincing as to why we need to introduce these new
hooks, or even the vfs_masked_device_ioctl() classifier as originally
proposed at the top of this thread.  I believe I understand why
Landlock wants this, but I worry that we all might have different
definitions of a "safe" ioctl list, and encoding a definition into the
LSM hooks seems like a bad idea to me.

At this point in time, I think I'd rather see LSMs that care about
ioctls maintaining their own list of "safe" ioctls and after a while
if it looks like everyone is in agreement (VFS folks, individual LSMs,
etc.) we can look into either an ioctl classifier or multiple LSM
ioctl hooks focused on different categories of ioctls.

-- 
paul-moore.com

* Re: [RFC PATCH] fs: Add vfs_masks_device_ioctl*() helpers
  2024-03-07 20:40                   ` Paul Moore
@ 2024-03-07 23:09                     ` Dave Chinner
  2024-03-07 23:35                       ` Paul Moore
  2024-03-08  7:02                       ` Arnd Bergmann
  0 siblings, 2 replies; 50+ messages in thread
From: Dave Chinner @ 2024-03-07 23:09 UTC (permalink / raw)
  To: Paul Moore
  Cc: Günther Noack, Mickaël Salaün, Arnd Bergmann,
	Christian Brauner, Allen Webb, Dmitry Torokhov, Jeff Xu,
	Jorge Lucangeli Obes, Konstantin Meskhidze, Matt Bobrowski,
	linux-fsdevel, linux-security-module

On Thu, Mar 07, 2024 at 03:40:44PM -0500, Paul Moore wrote:
> On Thu, Mar 7, 2024 at 7:57 AM Günther Noack <gnoack@google.com> wrote:
> > On Thu, Mar 07, 2024 at 01:21:48PM +0100, Arnd Bergmann wrote:
> > > On Thu, Mar 7, 2024, at 13:15, Christian Brauner wrote:
> > > > On Wed, Mar 06, 2024 at 04:18:53PM +0100, Arnd Bergmann wrote:
> > > >> On Wed, Mar 6, 2024, at 14:47, Mickaël Salaün wrote:
> > > >> >
> > > >> > Arnd, Christian, Paul, are you OK with this new hook proposal?
> > > >>
> > > >> I think this sounds better. It would fit more closely into
> > > >> the overall structure of the ioctl handlers with their multiple
> > > >> levels, where below vfs_ioctl() calling into f_ops->unlocked_ioctl,
> > > >> you have the same structure for sockets and blockdev, and
> > > >> then additional levels below that and some weirdness for
> > > >> things like tty, scsi or cdrom.
> > > >
> > > > So an additional security hook called from tty, scsi, or cdrom?
> > > > And the original hook is left where it is right now?
> > >
> > > For the moment, I think adding another hook in vfs_ioctl()
> > > and the corresponding compat path would do what Mickaël
> > > wants. Beyond that, we could consider having hooks in
> > > socket and block ioctls if needed as they are easy to
> > > filter out based on inode->i_mode.
> > >
> > > The tty/scsi/cdrom hooks would be harder to do, let's assume
> > > for now that we don't need them.
> >
> > Thank you all for the help!
> >
> > Yes, tty/scsi/cdrom are just examples.  We do not need special features for
> > these for Landlock right now.
> >
> > What I would do is to invoke the new LSM hook in the following two places in
> > fs/ioctl.c:
> >
> > 1) at the top of vfs_ioctl()
> > 2) at the top of ioctl_compat()
> >
> > (Both of these functions are just invoking the f_op->unlocked_ioctl() and
> > f_op->compat_ioctl() operations with a safeguard for that being a NULL pointer.)
> >
> > The intent is that the new hook gets called every time before an ioctl is sent to
> > these IOCTL operations in f_op, so that the LSM can distinguish cleanly between
> > the "safe" IOCTLs that are implemented fully within fs/ioctl.c and the
> > "potentially unsafe" IOCTLs which are implemented by these hooks (as it is
> > unrealistic for us to holistically reason about the safety of all possible
> > implementations).
> >
> > The alternative approach where we try to do the same based on the existing LSM
> > IOCTL hook resulted in the patch further up in this mail thread - it involves
> > maintaining a list of "safe" IOCTL commands, and it is difficult to guarantee
> > that these lists of IOCTL commands stay in sync.
> 
> I need some more convincing as to why we need to introduce these new
> hooks, or even the vfs_masked_device_ioctl() classifier as originally
> proposed at the top of this thread.  I believe I understand why
> Landlock wants this, but I worry that we all might have different
> definitions of a "safe" ioctl list, and encoding a definition into the
> LSM hooks seems like a bad idea to me.

I have no idea what a "safe" ioctl means here. Subsystems already
restrict ioctls that can do damage if misused to CAP_SYS_ADMIN, so
"safe" clearly means something different here.

> At this point in time, I think I'd rather see LSMs that care about
> ioctls maintaining their own list of "safe" ioctls and after a while
> if it looks like everyone is in agreement (VFS folks, individual LSMs,
> etc.) we can look into either an ioctl classifier or multiple LSM
> ioctl hooks focused on different categories of ioctls.

From the perspective of a VFS and subsystem developer, I really have
no clue what would make a "safe" ioctl from a LSM perspective, and I
very much doubt an LSM developer has any clue whether deep, dark
subsystem ioctls are "safe" to allow, or even what would stop
working if they decided something was not "safe".

This just seems like a complex recipe for creating unusable and/or
impossible to configure/secure systems to me.

-Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: [RFC PATCH] fs: Add vfs_masks_device_ioctl*() helpers
  2024-03-07 23:09                     ` Dave Chinner
@ 2024-03-07 23:35                       ` Paul Moore
  2024-03-08  7:02                       ` Arnd Bergmann
  1 sibling, 0 replies; 50+ messages in thread
From: Paul Moore @ 2024-03-07 23:35 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Günther Noack, Mickaël Salaün, Arnd Bergmann,
	Christian Brauner, Allen Webb, Dmitry Torokhov, Jeff Xu,
	Jorge Lucangeli Obes, Konstantin Meskhidze, Matt Bobrowski,
	linux-fsdevel, linux-security-module

On Thu, Mar 7, 2024 at 6:09 PM Dave Chinner <david@fromorbit.com> wrote:
> On Thu, Mar 07, 2024 at 03:40:44PM -0500, Paul Moore wrote:
> > On Thu, Mar 7, 2024 at 7:57 AM Günther Noack <gnoack@google.com> wrote:
> > > On Thu, Mar 07, 2024 at 01:21:48PM +0100, Arnd Bergmann wrote:
> > > > On Thu, Mar 7, 2024, at 13:15, Christian Brauner wrote:
> > > > > On Wed, Mar 06, 2024 at 04:18:53PM +0100, Arnd Bergmann wrote:
> > > > >> On Wed, Mar 6, 2024, at 14:47, Mickaël Salaün wrote:
> > > > >> >
> > > > >> > Arnd, Christian, Paul, are you OK with this new hook proposal?
> > > > >>
> > > > >> I think this sounds better. It would fit more closely into
> > > > >> the overall structure of the ioctl handlers with their multiple
> > > > >> levels, where below vfs_ioctl() calling into f_ops->unlocked_ioctl,
> > > > >> you have the same structure for sockets and blockdev, and
> > > > >> then additional levels below that and some weirdness for
> > > > >> things like tty, scsi or cdrom.
> > > > >
> > > > > So an additional security hook called from tty, scsi, or cdrom?
> > > > > And the original hook is left where it is right now?
> > > >
> > > > For the moment, I think adding another hook in vfs_ioctl()
> > > > and the corresponding compat path would do what Mickaël
> > > > wants. Beyond that, we could consider having hooks in
> > > > socket and block ioctls if needed as they are easy to
> > > > filter out based on inode->i_mode.
> > > >
> > > > The tty/scsi/cdrom hooks would be harder to do, let's assume
> > > > for now that we don't need them.
> > >
> > > Thank you all for the help!
> > >
> > > Yes, tty/scsi/cdrom are just examples.  We do not need special features for
> > > these for Landlock right now.
> > >
> > > What I would do is to invoke the new LSM hook in the following two places in
> > > fs/ioctl.c:
> > >
> > > 1) at the top of vfs_ioctl()
> > > 2) at the top of ioctl_compat()
> > >
> > > (Both of these functions are just invoking the f_op->unlocked_ioctl() and
> > > f_op->compat_ioctl() operations with a safeguard for that being a NULL pointer.)
> > >
> > > The intent is that the new hook gets called every time before an ioctl is sent to
> > > these IOCTL operations in f_op, so that the LSM can distinguish cleanly between
> > > the "safe" IOCTLs that are implemented fully within fs/ioctl.c and the
> > > "potentially unsafe" IOCTLs which are implemented by these hooks (as it is
> > > unrealistic for us to holistically reason about the safety of all possible
> > > implementations).
> > >
> > > The alternative approach where we try to do the same based on the existing LSM
> > > IOCTL hook resulted in the patch further up in this mail thread - it involves
> > > maintaining a list of "safe" IOCTL commands, and it is difficult to guarantee
> > > that these lists of IOCTL commands stay in sync.
> >
> > I need some more convincing as to why we need to introduce these new
> > hooks, or even the vfs_masked_device_ioctl() classifier as originally
> > proposed at the top of this thread.  I believe I understand why
> > Landlock wants this, but I worry that we all might have different
> > definitions of a "safe" ioctl list, and encoding a definition into the
> > LSM hooks seems like a bad idea to me.
>
> I have no idea what a "safe" ioctl means here. Subsystems already
> restrict ioctls that can do damage if misused to CAP_SYS_ADMIN, so
> "safe" clearly means something different here.

That's the point I was trying to make.  I'm not sure exactly what
Günther meant either (I was simply copying his idea of a "safe" ioctl,
complete with all of the associations around the double quotes), which
helps underscore the idea that different groups are likely to have
different ideas of what ioctls they want to allow based on their
security model, environment, etc.

> > At this point in time, I think I'd rather see LSMs that care about
> > ioctls maintaining their own list of "safe" ioctls and after a while
> > if it looks like everyone is in agreement (VFS folks, individual LSMs,
> > etc.) we can look into either an ioctl classifier or multiple LSM
> > ioctl hooks focused on different categories of ioctls.
>
> From the perspective of a VFS and subsystem developer, I really have
> no clue what would make a "safe" ioctl from a LSM perspective ...

We also need to keep in mind that we have multiple LSM implementations
and we need to support different ideas around how to control access to
ioctls, including which ioctls are "safe" for multiple definitions of
the word.

> ... and I
> very much doubt an LSM developer has any clue whether deep, dark
> subsystem ioctls are "safe" to allow, or even what would stop
> working if they decided something was not "safe".

... or for those LSMs with configurable security policies, which
ioctls are allowed by the LSM's policy developer to fit their
particular needs.

> This just seems like a complex recipe for creating unusable and/or
> impossible to configure/secure systems to me.

FWIW, Android has been using the existing LSM ioctl controls for
several years now to help increase the security of Android devices.

https://security.googleblog.com/2016/07/protecting-android-with-more-linux.html
https://kernsec.org/files/lss2015/vanderstoep.pdf

-- 
paul-moore.com


* Re: [RFC PATCH] fs: Add vfs_masks_device_ioctl*() helpers
  2024-03-07 23:09                     ` Dave Chinner
  2024-03-07 23:35                       ` Paul Moore
@ 2024-03-08  7:02                       ` Arnd Bergmann
  2024-03-08  9:29                         ` Mickaël Salaün
  2024-03-08 11:03                         ` Günther Noack
  1 sibling, 2 replies; 50+ messages in thread
From: Arnd Bergmann @ 2024-03-08  7:02 UTC (permalink / raw)
  To: Dave Chinner, Paul Moore
  Cc: Günther Noack, Mickaël Salaün, Christian Brauner,
	Allen Webb, Dmitry Torokhov, Jeff Xu, Jorge Lucangeli Obes,
	Konstantin Meskhidze, Matt Bobrowski, linux-fsdevel,
	linux-security-module

On Fri, Mar 8, 2024, at 00:09, Dave Chinner wrote:
> On Thu, Mar 07, 2024 at 03:40:44PM -0500, Paul Moore wrote:
>> On Thu, Mar 7, 2024 at 7:57 AM Günther Noack <gnoack@google.com> wrote:
>> I need some more convincing as to why we need to introduce these new
>> hooks, or even the vfs_masked_device_ioctl() classifier as originally
>> proposed at the top of this thread.  I believe I understand why
>> Landlock wants this, but I worry that we all might have different
>> definitions of a "safe" ioctl list, and encoding a definition into the
>> LSM hooks seems like a bad idea to me.
>
> I have no idea what a "safe" ioctl means here. Subsystems already
> restrict ioctls that can do damage if misused to CAP_SYS_ADMIN, so
> "safe" clearly means something different here.

That was my problem with the first version as well, but I think
drawing the line between "implemented in fs/ioctl.c" and
"implemented in a random device driver fops->unlocked_ioctl()"
seems like a more helpful definition.

This won't just protect from calling into drivers that are lacking
a CAP_SYS_ADMIN check, but also from those that end up being
harmful regardless of the ioctl command code passed into them
because of stupid driver bugs.

      Arnd


* Re: [RFC PATCH] fs: Add vfs_masks_device_ioctl*() helpers
  2024-03-08  7:02                       ` Arnd Bergmann
@ 2024-03-08  9:29                         ` Mickaël Salaün
  2024-03-08 19:22                           ` Paul Moore
  2024-03-08 11:03                         ` Günther Noack
  1 sibling, 1 reply; 50+ messages in thread
From: Mickaël Salaün @ 2024-03-08  9:29 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Dave Chinner, Paul Moore, Günther Noack, Christian Brauner,
	Allen Webb, Dmitry Torokhov, Jeff Xu, Jorge Lucangeli Obes,
	Konstantin Meskhidze, Matt Bobrowski, linux-fsdevel,
	linux-security-module

On Fri, Mar 08, 2024 at 08:02:13AM +0100, Arnd Bergmann wrote:
> On Fri, Mar 8, 2024, at 00:09, Dave Chinner wrote:
> > On Thu, Mar 07, 2024 at 03:40:44PM -0500, Paul Moore wrote:
> >> On Thu, Mar 7, 2024 at 7:57 AM Günther Noack <gnoack@google.com> wrote:
> >> I need some more convincing as to why we need to introduce these new
> >> hooks, or even the vfs_masked_device_ioctl() classifier as originally
> >> proposed at the top of this thread.  I believe I understand why
> >> Landlock wants this, but I worry that we all might have different
> >> definitions of a "safe" ioctl list, and encoding a definition into the
> >> LSM hooks seems like a bad idea to me.
> >
> > I have no idea what a "safe" ioctl means here. Subsystems already
> > restrict ioctls that can do damage if misused to CAP_SYS_ADMIN, so
> > "safe" clearly means something different here.
> 
> That was my problem with the first version as well, but I think
> drawing the line between "implemented in fs/ioctl.c" and
> "implemented in a random device driver fops->unlocked_ioctl()"
> seems like a more helpful definition.
> 
> This won't just protect from calling into drivers that are lacking
> a CAP_SYS_ADMIN check, but also from those that end up being
> harmful regardless of the ioctl command code passed into them
> because of stupid driver bugs.

Indeed.

"safe" is definitely not the right word, it is too broad, relative to
use cases and threat models.  There is no "safe" IOCTL.

Let's replace "safe IOCTL" with "IOCTL always allowed in a Landlock
sandbox".

Our assumptions are (in the context of Landlock):

1. There are IOCTLs tied to file types (e.g. block device with
   major/minor) that can easily be identified from user space (e.g. with
   the path name and file's metadata).  /dev/* files make sense for user
   space and they scope to a specific use case (with relative
   privileges).  This category of IOCTLs is implemented in standalone
   device drivers (for most of them).

2. Most user space processes should not be denied access to IOCTLs that
   are managed by the VFS layer or the underlying filesystem
   implementations.  For instance, the do_vfs_ioctl()'s ones (e.g.
   FIOCLEX, FIONREAD) should always be allowed because they may be
   required to legitimately use files, and for performance and security
   reasons (e.g. fscrypt, fsverity implemented at the filesystem layer).
   Moreover, these IOCTLs should already check the read/write permission
   (on the related FD), which is not the case for most block/char device
   IOCTL.

3. IOCTLs to pipes and sockets are out of scope.  They should always be
   allowed for now because they don't directly expose files' data but
   IPCs instead, and we are focusing on FS access rights for now.

We want to add a new LANDLOCK_ACCESS_FS_IOCTL_DEV right that could match
on char/block device's specific IOCTLs, but it would not have any impact
on other IOCTLs which would then always be allowed (if the sandboxed
process is allowed to open the file).

Because IOCTLs are implemented in layers and all IOCTL commands live in
the same 32-bit namespace, we need a way to identify the layer
implemented by block and character devices.  The new LSM hook proposal
enables us to cleanly and efficiently identify the char/block device
IOCTL layer with an additional check on the file type.


* Re: [RFC PATCH] fs: Add vfs_masks_device_ioctl*() helpers
  2024-03-08  7:02                       ` Arnd Bergmann
  2024-03-08  9:29                         ` Mickaël Salaün
@ 2024-03-08 11:03                         ` Günther Noack
  2024-03-11  1:03                           ` Dave Chinner
  1 sibling, 1 reply; 50+ messages in thread
From: Günther Noack @ 2024-03-08 11:03 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Dave Chinner, Paul Moore, Mickaël Salaün,
	Christian Brauner, Allen Webb, Dmitry Torokhov, Jeff Xu,
	Jorge Lucangeli Obes, Konstantin Meskhidze, Matt Bobrowski,
	linux-fsdevel, linux-security-module

On Fri, Mar 08, 2024 at 08:02:13AM +0100, Arnd Bergmann wrote:
> On Fri, Mar 8, 2024, at 00:09, Dave Chinner wrote:
> > On Thu, Mar 07, 2024 at 03:40:44PM -0500, Paul Moore wrote:
> >> On Thu, Mar 7, 2024 at 7:57 AM Günther Noack <gnoack@google.com> wrote:
> >> I need some more convincing as to why we need to introduce these new
> >> hooks, or even the vfs_masked_device_ioctl() classifier as originally
> >> proposed at the top of this thread.  I believe I understand why
> >> Landlock wants this, but I worry that we all might have different
> >> definitions of a "safe" ioctl list, and encoding a definition into the
> >> LSM hooks seems like a bad idea to me.
> >
> > I have no idea what a "safe" ioctl means here. Subsystems already
> > restrict ioctls that can do damage if misused to CAP_SYS_ADMIN, so
> > "safe" clearly means something different here.
> 
> That was my problem with the first version as well, but I think
> drawing the line between "implemented in fs/ioctl.c" and
> "implemented in a random device driver fops->unlocked_ioctl()"
> seems like a more helpful definition.

Yes, sorry for the confusion - that is exactly what I meant by "safe":

Those are the IOCTL commands implemented in fs/ioctl.c which do not go through
f_ops->unlocked_ioctl (or the compat equivalent).

With Landlock, we want to give people a way to restrict the use of IOCTLs
implemented by device drivers, while still letting them use the bulk of the
more harmless IOCTLs in fs/ioctl.c.

> This won't just protect from calling into drivers that are lacking
> a CAP_SYS_ADMIN check, but also from those that end up being
> harmful regardless of the ioctl command code passed into them
> because of stupid driver bugs.

Exactly -- there is a surprising number of f_ops->unlocked_ioctl implementations
that already run various resource management and locking logic before even
looking at the command number.  That means that even the command numbers that
are not implemented there are executing code in the driver layer, before the
IOCTL returns with an error.

So the f_ops->unlocked_ioctl() invocation in itself increases the surface of
exposed functionality, independent of the command number.  That makes the
invocation of f_ops->unlocked_ioctl() a security boundary that we would like to
restrict.
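
To make the pattern concrete, here is a minimal user-space model of such a
handler.  It is a sketch, not taken from any real driver: struct fake_dev,
FAKE_IOC_GET_VALUE and fake_unlocked_ioctl() are made-up names, and the
counter merely stands in for a mutex_lock()/mutex_unlock() pair.

```c
#include <errno.h>

/* Hypothetical stand-in for a driver's private state. */
struct fake_dev {
	int lock_acquisitions;	/* how often driver code took its "lock" */
	int value;
};

/* Made-up command number, for illustration only. */
#define FAKE_IOC_GET_VALUE 0x1001u

/*
 * Models the shape of an f_op->unlocked_ioctl() handler that does its
 * locking before it inspects cmd.
 */
long fake_unlocked_ioctl(struct fake_dev *dev, unsigned int cmd)
{
	long ret;

	dev->lock_acquisitions++;	/* stands in for mutex_lock() */

	switch (cmd) {
	case FAKE_IOC_GET_VALUE:
		ret = dev->value;
		break;
	default:
		/* Unknown command, but driver code has already run. */
		ret = -ENOTTY;
		break;
	}

	/* A real driver would call mutex_unlock() here. */
	return ret;
}
```

Calling the model with an unknown command returns -ENOTTY and still
increments lock_acquisitions, which is the effect described above.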

—Günther


* Re: [RFC PATCH] fs: Add vfs_masks_device_ioctl*() helpers
  2024-03-08  9:29                         ` Mickaël Salaün
@ 2024-03-08 19:22                           ` Paul Moore
  2024-03-08 20:12                             ` Mickaël Salaün
  0 siblings, 1 reply; 50+ messages in thread
From: Paul Moore @ 2024-03-08 19:22 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: Arnd Bergmann, Dave Chinner, Günther Noack,
	Christian Brauner, Allen Webb, Dmitry Torokhov, Jeff Xu,
	Jorge Lucangeli Obes, Konstantin Meskhidze, Matt Bobrowski,
	linux-fsdevel, linux-security-module

On Fri, Mar 8, 2024 at 4:29 AM Mickaël Salaün <mic@digikod.net> wrote:
> On Fri, Mar 08, 2024 at 08:02:13AM +0100, Arnd Bergmann wrote:
> > On Fri, Mar 8, 2024, at 00:09, Dave Chinner wrote:
> > > On Thu, Mar 07, 2024 at 03:40:44PM -0500, Paul Moore wrote:
> > >> On Thu, Mar 7, 2024 at 7:57 AM Günther Noack <gnoack@google.com> wrote:
> > >> I need some more convincing as to why we need to introduce these new
> > >> hooks, or even the vfs_masked_device_ioctl() classifier as originally
> > >> proposed at the top of this thread.  I believe I understand why
> > >> Landlock wants this, but I worry that we all might have different
> > >> definitions of a "safe" ioctl list, and encoding a definition into the
> > >> LSM hooks seems like a bad idea to me.
> > >
> > > I have no idea what a "safe" ioctl means here. Subsystems already
> > > restrict ioctls that can do damage if misused to CAP_SYS_ADMIN, so
> > > "safe" clearly means something different here.
> >
> > That was my problem with the first version as well, but I think
> > drawing the line between "implemented in fs/ioctl.c" and
> > "implemented in a random device driver fops->unlocked_ioctl()"
> > seems like a more helpful definition.
> >
> > This won't just protect from calling into drivers that are lacking
> > a CAP_SYS_ADMIN check, but also from those that end up being
> > harmful regardless of the ioctl command code passed into them
> > because of stupid driver bugs.
>
> Indeed.
>
> "safe" is definitely not the right word, it is too broad, relative to
> use cases and threat models.  There is no "safe" IOCTL.
>
> Let's replace "safe IOCTL" with "IOCTL always allowed in a Landlock
> sandbox".

Which is a problem from a LSM perspective as we want to avoid hooks
which are tightly bound to a single LSM or security model.  It's okay
if a new hook only has a single LSM implementation, but the hook's
definition should be such that it is reasonably generalized to support
multiple LSM/models.

> Our assumptions are (in the context of Landlock):
>
> 1. There are IOCTLs tied to file types (e.g. block device with
>    major/minor) that can easily be identified from user space (e.g. with
>    the path name and file's metadata).  /dev/* files make sense for user
>    space and they scope to a specific use case (with relative
>    privileges).  This category of IOCTLs is implemented in standalone
>    device drivers (for most of them).
>
> 2. Most user space processes should not be denied access to IOCTLs that
>    are managed by the VFS layer or the underlying filesystem
>    implementations.  For instance, the do_vfs_ioctl()'s ones (e.g.
>    FIOCLEX, FIONREAD) should always be allowed because they may be
>    required to legitimately use files, and for performance and security
>    reasons (e.g. fs-crypt, fsverity implemented at the filesystem layer).
>    Moreover, these IOCTLs should already check the read/write permission
>    (on the related FD), which is not the case for most block/char device
>    IOCTL.
>
> 3. IOCTLs to pipes and sockets are out of scope.  They should always be
>    allowed for now because they don't directly expose files' data but
>    IPCs instead, and we are focusing on FS access rights for now.
>
> We want to add a new LANDLOCK_ACCESS_FS_IOCTL_DEV right that could match
> on char/block device's specific IOCTLs, but it would not have any impact
> on other IOCTLs which would then always be allowed (if the sandboxed
> process is allowed to open the file).
>
> Because IOCTLs are implemented in layers and all IOCTLs commands live in
> the same 32-bit namespace, we need a way to identify the layer
> implemented by block and character devices.  The new LSM hook proposal
> enables us to cleanly and efficiently identify the char/block device
> IOCTL layer with an additional check on the file type.

I guess I should wait until there is an actual patch, but as of right
now a VFS ioctl specific LSM hook looks far too limited to me and
isn't something I can support at this point in time.  It's obviously
limited to only a subset of the ioctls, meaning that in order to have
comprehensive coverage we would either need to implement a full range
of subsystem ioctl hooks (ugh), or just use the existing
security_file_ioctl().  I understand that this makes things a bit more
complicated for Landlock's initial ioctl implementation, but
considering my thoughts above and the fact that Landlock's ioctl
protections are still evolving I'd rather not add a lot of extra hooks
right now.

-- 
paul-moore.com


* Re: [RFC PATCH] fs: Add vfs_masks_device_ioctl*() helpers
  2024-03-08 19:22                           ` Paul Moore
@ 2024-03-08 20:12                             ` Mickaël Salaün
  2024-03-08 22:04                               ` Casey Schaufler
  2024-03-08 22:25                               ` Paul Moore
  0 siblings, 2 replies; 50+ messages in thread
From: Mickaël Salaün @ 2024-03-08 20:12 UTC (permalink / raw)
  To: Paul Moore
  Cc: Arnd Bergmann, Dave Chinner, Günther Noack,
	Christian Brauner, Allen Webb, Dmitry Torokhov, Jeff Xu,
	Jorge Lucangeli Obes, Konstantin Meskhidze, Matt Bobrowski,
	linux-fsdevel, linux-security-module

On Fri, Mar 08, 2024 at 02:22:58PM -0500, Paul Moore wrote:
> On Fri, Mar 8, 2024 at 4:29 AM Mickaël Salaün <mic@digikod.net> wrote:
> > On Fri, Mar 08, 2024 at 08:02:13AM +0100, Arnd Bergmann wrote:
> > > On Fri, Mar 8, 2024, at 00:09, Dave Chinner wrote:
> > > > On Thu, Mar 07, 2024 at 03:40:44PM -0500, Paul Moore wrote:
> > > >> On Thu, Mar 7, 2024 at 7:57 AM Günther Noack <gnoack@google.com> wrote:
> > > >> I need some more convincing as to why we need to introduce these new
> > > >> hooks, or even the vfs_masked_device_ioctl() classifier as originally
> > > >> proposed at the top of this thread.  I believe I understand why
> > > >> Landlock wants this, but I worry that we all might have different
> > > >> definitions of a "safe" ioctl list, and encoding a definition into the
> > > >> LSM hooks seems like a bad idea to me.
> > > >
> > > > I have no idea what a "safe" ioctl means here. Subsystems already
> > > > restrict ioctls that can do damage if misused to CAP_SYS_ADMIN, so
> > > > "safe" clearly means something different here.
> > >
> > > That was my problem with the first version as well, but I think
> > > drawing the line between "implemented in fs/ioctl.c" and
> > > "implemented in a random device driver fops->unlocked_ioctl()"
> > > seems like a more helpful definition.
> > >
> > > This won't just protect from calling into drivers that are lacking
> > > a CAP_SYS_ADMIN check, but also from those that end up being
> > > harmful regardless of the ioctl command code passed into them
> > > because of stupid driver bugs.
> >
> > Indeed.
> >
> > "safe" is definitely not the right word, it is too broad, relative to
> > use cases and threat models.  There is no "safe" IOCTL.
> >
> > Let's replace "safe IOCTL" with "IOCTL always allowed in a Landlock
> > sandbox".
> 
> Which is a problem from a LSM perspective as we want to avoid hooks
> which are tightly bound to a single LSM or security model.  It's okay
> if a new hook only has a single LSM implementation, but the hook's
> definition should be such that it is reasonably generalized to support
> multiple LSM/models.

As with any new hook, there is a first user.  Obviously this new hook would
not be restricted to Landlock, it is a generic approach.  I'm pretty
sure a few hooks are only used by one LSM though. ;)

> 
> > Our assumptions are (in the context of Landlock):
> >
> > 1. There are IOCTLs tied to file types (e.g. block device with
> >    major/minor) that can easily be identified from user space (e.g. with
> >    the path name and file's metadata).  /dev/* files make sense for user
> >    space and they scope to a specific use case (with relative
> >    privileges).  This category of IOCTLs is implemented in standalone
> >    device drivers (for most of them).
> >
> > 2. Most user space processes should not be denied access to IOCTLs that
> >    are managed by the VFS layer or the underlying filesystem
> >    implementations.  For instance, the do_vfs_ioctl()'s ones (e.g.
> >    FIOCLEX, FIONREAD) should always be allowed because they may be
> >    required to legitimately use files, and for performance and security
> >    reasons (e.g. fs-crypt, fsverity implemented at the filesystem layer).
> >    Moreover, these IOCTLs should already check the read/write permission
> >    (on the related FD), which is not the case for most block/char device
> >    IOCTL.
> >
> > 3. IOCTLs to pipes and sockets are out of scope.  They should always be
> >    allowed for now because they don't directly expose files' data but
> >    IPCs instead, and we are focusing on FS access rights for now.
> >
> > We want to add a new LANDLOCK_ACCESS_FS_IOCTL_DEV right that could match
> > on char/block device's specific IOCTLs, but it would not have any impact
> > on other IOCTLs which would then always be allowed (if the sandboxed
> > process is allowed to open the file).
> >
> > Because IOCTLs are implemented in layers and all IOCTLs commands live in
> > the same 32-bit namespace, we need a way to identify the layer
> > implemented by block and character devices.  The new LSM hook proposal
> > enables us to cleanly and efficiently identify the char/block device
> > IOCTL layer with an additional check on the file type.
> 
> I guess I should wait until there is an actual patch, but as of right
> now a VFS ioctl specific LSM hook looks far too limited to me and
> isn't something I can support at this point in time.  It's obviously
> limited to only a subset of the ioctls, meaning that in order to have
> comprehensive coverage we would either need to implement a full range
> of subsystem ioctl hooks (ugh), or just use the existing
> security_file_ioctl().

I think there is a misunderstanding.  The subset of IOCTL commands the
new hook will see would be 99% of them (i.e. all except those
implemented in fs/ioctl.c).  Being able to only handle this (big) subset
would empower LSMs to control IOCTL commands without collision (e.g. the
same command/value may have different meanings according to the
implementation/layer), which is not currently possible (without manual
tweaking).

This proposal is to add a new hook for the layer just beneath the VFS
catch-all IOCTL implementation.  This layer can then differentiate
between the underlying implementation according to the file properties.
There is no need for additional hooks for other layers/subsystems.

The existing security_file_ioctl() hook is useful to catch all IOCTL
commands, but it does not make it possible to identify the underlying
target, and therefore the semantics of the command.  Furthermore, as
Günther said, an IOCTL call can already trigger kernel operations before
the command is even looked at, but with the new hook we would be able to
identify that case, for instance by looking at the char/block device file.
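
A small user-space model may help show where such a hook would sit.  This
is only a sketch under simplifying assumptions: model_ioctl(),
model_new_hook(), handled_generically() and enum file_kind are
hypothetical names, and only four of the commands handled generically in
fs/ioctl.c (FIOCLEX, FIONCLEX, FIONBIO, FIOASYNC) are listed.

```c
#include <errno.h>
#include <stdbool.h>
#include <sys/ioctl.h>	/* FIOCLEX, FIONCLEX, FIONBIO, FIOASYNC */

enum file_kind { KIND_REGULAR, KIND_CHR, KIND_BLK };

/* Small subset of the commands handled by the generic VFS layer. */
bool handled_generically(unsigned long cmd)
{
	switch (cmd) {
	case FIOCLEX:
	case FIONCLEX:
	case FIONBIO:
	case FIOASYNC:
		return true;
	default:
		return false;
	}
}

/*
 * Hypothetical stand-in for the proposed hook: it only sees commands
 * that are about to be passed to f_op->unlocked_ioctl(), and it can
 * decide based on the file type.  Returns 0 to allow, -EACCES to deny.
 */
int model_new_hook(enum file_kind kind, bool dev_ioctl_allowed)
{
	bool is_dev = (kind == KIND_CHR || kind == KIND_BLK);

	if (is_dev && !dev_ioctl_allowed)
		return -EACCES;
	return 0;
}

/* Models the two-level dispatch: generic layer first, then the hook. */
int model_ioctl(enum file_kind kind, unsigned long cmd,
		bool dev_ioctl_allowed)
{
	if (handled_generically(cmd))
		return 0;	/* never reaches the driver layer */

	/* Real code would call f_op->unlocked_ioctl() if this allows it. */
	return model_new_hook(kind, dev_ioctl_allowed);
}
```

In this model a command handled by the generic layer never reaches the
hook, so a policy attached to the hook does not need to know the list of
generic commands.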

> I understand that this makes things a bit more
> complicated for Landlock's initial ioctl implementation, but
> considering my thoughts above and the fact that Landlock's ioctl
> protections are still evolving I'd rather not add a lot of extra hooks
> right now.

Without this hook, we'll need to rely on a list of allowed IOCTLs, which
will eventually get out of sync.  It would be a maintenance burden and a
hacky approach.

We're definitely open to new proposals, but so far this is the best
approach we have found from a maintenance, performance, and security
point of view.


* Re: [RFC PATCH] fs: Add vfs_masks_device_ioctl*() helpers
  2024-03-08 20:12                             ` Mickaël Salaün
@ 2024-03-08 22:04                               ` Casey Schaufler
  2024-03-08 22:25                               ` Paul Moore
  1 sibling, 0 replies; 50+ messages in thread
From: Casey Schaufler @ 2024-03-08 22:04 UTC (permalink / raw)
  To: Mickaël Salaün, Paul Moore
  Cc: Arnd Bergmann, Dave Chinner, Günther Noack,
	Christian Brauner, Allen Webb, Dmitry Torokhov, Jeff Xu,
	Jorge Lucangeli Obes, Konstantin Meskhidze, Matt Bobrowski,
	linux-fsdevel, linux-security-module, Casey Schaufler

On 3/8/2024 12:12 PM, Mickaël Salaün wrote:
> On Fri, Mar 08, 2024 at 02:22:58PM -0500, Paul Moore wrote:
>> On Fri, Mar 8, 2024 at 4:29 AM Mickaël Salaün <mic@digikod.net> wrote:
>>> On Fri, Mar 08, 2024 at 08:02:13AM +0100, Arnd Bergmann wrote:
>>>> On Fri, Mar 8, 2024, at 00:09, Dave Chinner wrote:
>>>>> On Thu, Mar 07, 2024 at 03:40:44PM -0500, Paul Moore wrote:
>>>>>> On Thu, Mar 7, 2024 at 7:57 AM Günther Noack <gnoack@google.com> wrote:
>>>>>> I need some more convincing as to why we need to introduce these new
>>>>>> hooks, or even the vfs_masked_device_ioctl() classifier as originally
>>>>>> proposed at the top of this thread.  I believe I understand why
>>>>>> Landlock wants this, but I worry that we all might have different
>>>>>> definitions of a "safe" ioctl list, and encoding a definition into the
>>>>>> LSM hooks seems like a bad idea to me.
>>>>> I have no idea what a "safe" ioctl means here. Subsystems already
>>>>> restrict ioctls that can do damage if misused to CAP_SYS_ADMIN, so
>>>>> "safe" clearly means something different here.
>>>> That was my problem with the first version as well, but I think
>>>> drawing the line between "implemented in fs/ioctl.c" and
>>>> "implemented in a random device driver fops->unlocked_ioctl()"
>>>> seems like a more helpful definition.
>>>>
>>>> This won't just protect from calling into drivers that are lacking
>>>> a CAP_SYS_ADMIN check, but also from those that end up being
>>>> harmful regardless of the ioctl command code passed into them
>>>> because of stupid driver bugs.
>>> Indeed.
>>>
>>> "safe" is definitely not the right word, it is too broad, relative to
>>> use cases and threat models.  There is no "safe" IOCTL.
>>>
>>> Let's replace "safe IOCTL" with "IOCTL always allowed in a Landlock
>>> sandbox".
>> Which is a problem from a LSM perspective as we want to avoid hooks
>> which are tightly bound to a single LSM or security model.  It's okay
>> if a new hook only has a single LSM implementation, but the hook's
>> definition should be such that it is reasonably generalized to support
>> multiple LSM/models.

I've been watching this thread with some interest, as one of my side projects
has been an attempt to address the "CAP_SYS_ADMIN problem", and there looks
to be a lot of similarity between that and the "ioctl problem". In both cases
it comes down to a matter of:
	1. uniquely identifying the action
	2. providing the information to code that can act upon it
	3. providing "policy" to determine what to do about it

My thought for the CAP_SYS_ADMIN case was to provide a new LSM hook
security_sysadmin() that takes a single parameter which is the action ID.
I called the action ID a "chit", because it's short and all the good,
more descriptive words were taken. Calls to capable(CAP_SYS_ADMIN) could
be replaced by calls to security_sysadmin(chit). security_sysadmin() would
first call capable(CAP_SYS_ADMIN) and, if that succeeded, allow LSMs with
registered hooks (selinux_sysadmin() etc) the opportunity to disallow the
operation. I planned to include a small LSM (chits) that would allow the
operation only if the process had the chit on its chit list. Landlock could
add policy to deal with chits if so inclined.

A generalization of this scheme would be to leave the capable(CAP_SYS_ADMIN)
checks as they are and add an optional security_chit() hook for places where
additional enforcement information is desired. Adding

	security_chit(CHIT_IOCTL_TTY_SOMETHING)

to an ioctl would allow any LSM to make policy decisions about that ioctl
operation. Adding

	security_chit(CHIT_ERASE_TAPE_REGISTERS)

after a capable(CAP_SYS_ADMIN) could appease the driver writer who would
otherwise be begging for CAP_ERASE_TAPE_REGISTERS. My biggest concern with
this scheme is the management of chit values, which would have to be kept
in a uapi header.
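
To make the idea concrete, here is a rough userspace model of the check a
small chit LSM might perform.  Everything in it is hypothetical: the enum
values, the chit_list structure and the extra list parameter are invented
for illustration (a real hook would take only the chit and look at the
current task), and none of it comes from the pending patches.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical action IDs; in the proposal these would live in a uapi header. */
enum chit {
	CHIT_IOCTL_TTY_SOMETHING = 1,
	CHIT_ERASE_TAPE_REGISTERS = 2,
};

/* Hypothetical per-process chit list, as a small "chits" LSM might keep it. */
struct chit_list {
	const enum chit *chits;
	size_t count;
};

/*
 * Userspace stand-in for security_chit(): returns 0 if the process holds
 * the chit and -EPERM otherwise.
 */
static int security_chit(const struct chit_list *list, enum chit chit)
{
	for (size_t i = 0; i < list->count; i++) {
		if (list->chits[i] == chit)
			return 0;
	}
	return -EPERM;
}
```

A caller that already passed its capable(CAP_SYS_ADMIN) check would then
call security_chit() with its own action ID and bail out on a non-zero
return.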

A major advantage of this is that the security_chit() calls would only have
to be added where someone wants to take advantage of the mechanism. People
who are happy with CAP_SYS_ADMIN or ioctl as it is don't have to do anything,
and their code won't get churned for the new world order. The downside is the
potentially onerous process of deciding if an LSM cares about an action known
only by its number.

I have patches close to ready. The chit LSM isn't passing the laugh test quite
yet, so I'm holding it back for now. I wanted to bring this up before we go too
far down a more complicated path.



* Re: [RFC PATCH] fs: Add vfs_masks_device_ioctl*() helpers
  2024-03-08 20:12                             ` Mickaël Salaün
  2024-03-08 22:04                               ` Casey Schaufler
@ 2024-03-08 22:25                               ` Paul Moore
  2024-03-09  8:14                                 ` Günther Noack
  1 sibling, 1 reply; 50+ messages in thread
From: Paul Moore @ 2024-03-08 22:25 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: Arnd Bergmann, Dave Chinner, Günther Noack,
	Christian Brauner, Allen Webb, Dmitry Torokhov, Jeff Xu,
	Jorge Lucangeli Obes, Konstantin Meskhidze, Matt Bobrowski,
	linux-fsdevel, linux-security-module

On Fri, Mar 8, 2024 at 3:12 PM Mickaël Salaün <mic@digikod.net> wrote:
> On Fri, Mar 08, 2024 at 02:22:58PM -0500, Paul Moore wrote:
> > On Fri, Mar 8, 2024 at 4:29 AM Mickaël Salaün <mic@digikod.net> wrote:
> > > On Fri, Mar 08, 2024 at 08:02:13AM +0100, Arnd Bergmann wrote:
> > > > On Fri, Mar 8, 2024, at 00:09, Dave Chinner wrote:
> > > > > On Thu, Mar 07, 2024 at 03:40:44PM -0500, Paul Moore wrote:
> > > > >> On Thu, Mar 7, 2024 at 7:57 AM Günther Noack <gnoack@google.com> wrote:
> > > > >> I need some more convincing as to why we need to introduce these new
> > > > >> hooks, or even the vfs_masked_device_ioctl() classifier as originally
> > > > >> proposed at the top of this thread.  I believe I understand why
> > > > >> Landlock wants this, but I worry that we all might have different
> > > > >> definitions of a "safe" ioctl list, and encoding a definition into the
> > > > >> LSM hooks seems like a bad idea to me.
> > > > >
> > > > > I have no idea what a "safe" ioctl means here. Subsystems already
> > > > > restrict ioctls that can do damage if misused to CAP_SYS_ADMIN, so
> > > > > "safe" clearly means something different here.
> > > >
> > > > That was my problem with the first version as well, but I think
> > > > drawing the line between "implemented in fs/ioctl.c" and
> > > > "implemented in a random device driver fops->unlocked_ioctl()"
> > > > seems like a more helpful definition.
> > > >
> > > > This won't just protect from calling into drivers that are lacking
> > > > a CAP_SYS_ADMIN check, but also from those that end up being
> > > > harmful regardless of the ioctl command code passed into them
> > > > because of stupid driver bugs.
> > >
> > > Indeed.
> > >
> > > "safe" is definitely not the right word, it is too broad, relative to
> > > use cases and threat models.  There is no "safe" IOCTL.
> > >
> > > Let's replace "safe IOCTL" with "IOCTL always allowed in a Landlock
> > > sandbox".
> >
> > Which is a problem from a LSM perspective as we want to avoid hooks
> > which are tightly bound to a single LSM or security model.  It's okay
> > if a new hook only has a single LSM implementation, but the hook's
> > definition should be such that it is reasonably generalized to support
> > multiple LSM/models.
>
> As any new hook, there is a first user.  Obviously this new hook would
> not be restricted to Landlock, it is a generic approach.  I'm pretty
> sure a few hooks are only used by one LSM though. ;)

Sure, as I said above, it's okay for there to only be a single LSM
implementation, but the basic idea behind the hook needs to have some
hope of being generic.  Your "let's redefine a safe ioctl as 'IOCTL
always allowed in a Landlock sandbox'" doesn't fill me with confidence
about the hook being generic; who knows, maybe it will be, but in the
absence of a patch, I'm left with descriptions like those.

> > > Our assumptions are (in the context of Landlock):
> > >
> > > 1. There are IOCTLs tied to file types (e.g. block device with
> > >    major/minor) that can easily be identified from user space (e.g. with
> > >    the path name and file's metadata).  /dev/* files make sense for user
> > >    space and they scope to a specific use case (with relative
> > >    privileges).  This category of IOCTLs is implemented in standalone
> > >    device drivers (for most of them).
> > >
> > > 2. Most user space processes should not be denied access to IOCTLs that
> > >    are managed by the VFS layer or the underlying filesystem
> > >    implementations.  For instance, the do_vfs_ioctl()'s ones (e.g.
> > >    FIOCLEX, FIONREAD) should always be allowed because they may be
> > >    required to legitimately use files, and for performance and security
> > >    reasons (e.g. fs-crypt, fsverity implemented at the filesystem layer).
> > >    Moreover, these IOCTLs should already check the read/write permission
> > >    (on the related FD), which is not the case for most block/char device
> > >    IOCTL.
> > >
> > > 3. IOCTLs to pipes and sockets are out of scope.  They should always be
> > >    allowed for now because they don't directly expose files' data but
> > >    IPCs instead, and we are focusing on FS access rights for now.
> > >
> > > We want to add a new LANDLOCK_ACCESS_FS_IOCTL_DEV right that could match
> > > on char/block device's specific IOCTLs, but it would not have any impact
> > > on other IOCTLs which would then always be allowed (if the sandboxed
> > > process is allowed to open the file).
> > >
> > > Because IOCTLs are implemented in layers and all IOCTLs commands live in
> > > the same 32-bit namespace, we need a way to identify the layer
> > > implemented by block and character devices.  The new LSM hook proposal
> > > enables us to cleanly and efficiently identify the char/block device
> > > IOCTL layer with an additional check on the file type.
> >
> > I guess I should wait until there is an actual patch, but as of right
> > now a VFS ioctl specific LSM hook looks far too limited to me and
> > isn't something I can support at this point in time.  It's obviously
> > limited to only a subset of the ioctls, meaning that in order to have
> > comprehensive coverage we would either need to implement a full range
> > of subsystem ioctl hooks (ugh), or just use the existing
> > security_file_ioctl().
>
> I think there is a misunderstanding.  The subset of IOCTL commands the
> new hook will see would be 99% of them (i.e. all except those
> implemented in fs/ioctl.c).

*cough* 99% != 100% *cough*

> Being able to only handle this (big) subset
> would empower LSMs to control IOCTL commands without collision (e.g. the
> same command/value may have different meanings according to the
> implementation/layer), which is not currently possible (without manual
> tweaking).
>
> This proposal is to add a new hook for the layer just beneath the VFS
> catch-all IOCTL implementation.  This layer can then differentiate
> between the underlying implementation according to the file properties.
> There is no need for additional hooks for other layers/subsystems.

I'm not sure how you reconcile less than 100% coverage, the need for a
generic hook, and the idea that there will not be a need for
additional hooks.  That still seems like a problem to me.

> The existing security_file_ioctl() hook is useful to catch all IOCTL
> commands, but it doesn't enable to identify the underlying target and
> then the semantic of the command.

The LSM hook gets the file pointer, the command, and the argument, how
is a LSM not able to identify the underlying target?

> Furthermore, as Günther said, an
> IOCTL call can already do kernel operations without looking at the
> command, but we would then be able to identify that by looking at the
> char/block device file for instance.
>
> > I understand that this makes things a bit more
> > complicated for Landlock's initial ioctl implementation, but
> > considering my thoughts above and the fact that Landlock's ioctl
> > protections are still evolving I'd rather not add a lot of extra hooks
> > right now.
>
> Without this hook, we'll need to rely on a list of allowed IOCTLs, which
> will be out-of-sync eventually.  It would be a maintenance burden and a
> hacky approach.

Welcome to the painful world of a LSM developer, ioctls are not the
only place where this is a problem, and it should be easy enough to
watch for changes in the ioctl list and update your favorite LSM
accordingly.  Honestly, I think that is kinda the right thing anyway,
I'm skeptical that one could have a generic solution that would
automatically allow or disallow a new ioctl without potentially
breaking your favorite LSM's security model.  If a new ioctl is
introduced it seems like having someone manually review its impact on
your LSM would be a good idea.

> We're definitely open to new proposals, but until now this is the best
> approach we found from a maintenance, performance, and security point of
> view.

At this point it's probably a good idea to post another RFC patch with
your revised idea, if nothing else it will help rule out any
confusion.  While I remain skeptical, perhaps I am misunderstanding
the design and you'll get my apology and an ACK, but be warned that as
of right now I'm not convinced.

--
paul-moore.com


* Re: [RFC PATCH] fs: Add vfs_masks_device_ioctl*() helpers
  2024-03-08 22:25                               ` Paul Moore
@ 2024-03-09  8:14                                 ` Günther Noack
  2024-03-09 17:41                                   ` Casey Schaufler
  2024-03-11 19:04                                   ` Paul Moore
  0 siblings, 2 replies; 50+ messages in thread
From: Günther Noack @ 2024-03-09  8:14 UTC (permalink / raw)
  To: Paul Moore
  Cc: Mickaël Salaün, Arnd Bergmann, Dave Chinner,
	Christian Brauner, Allen Webb, Dmitry Torokhov, Jeff Xu,
	Jorge Lucangeli Obes, Konstantin Meskhidze, Matt Bobrowski,
	linux-fsdevel, linux-security-module

On Fri, Mar 08, 2024 at 05:25:21PM -0500, Paul Moore wrote:
> On Fri, Mar 8, 2024 at 3:12 PM Mickaël Salaün <mic@digikod.net> wrote:
> > On Fri, Mar 08, 2024 at 02:22:58PM -0500, Paul Moore wrote:
> > > On Fri, Mar 8, 2024 at 4:29 AM Mickaël Salaün <mic@digikod.net> wrote:
> > > > Let's replace "safe IOCTL" with "IOCTL always allowed in a Landlock
> > > > sandbox".
> > >
> > > Which is a problem from a LSM perspective as we want to avoid hooks
> > > which are tightly bound to a single LSM or security model.  It's okay
> > > if a new hook only has a single LSM implementation, but the hook's
> > > definition should be such that it is reasonably generalized to support
> > > multiple LSM/models.
> >
> > As any new hook, there is a first user.  Obviously this new hook would
> > not be restricted to Landlock, it is a generic approach.  I'm pretty
> > sure a few hooks are only used by one LSM though. ;)
> 
> Sure, as I said above, it's okay for there to only be a single LSM
> implementation, but the basic idea behind the hook needs to have some
> hope of being generic.  Your "let's redefine a safe ioctl as 'IOCTL
> always allowed in a Landlock sandbox'" doesn't fill me with confidence
> about the hook being generic; who knows, maybe it will be, but in the
> absence of a patch, I'm left with descriptions like those.

FWIW, the existing IOCTL hook is used in the following places:

* TOMOYO: seemingly configurable per IOCTL command?  (I did not dig deeper)
* SELinux: has a hardcoded switch of IOCTL commands, some with special checks.
  These are also a subset of the do_vfs_ioctl() commands,
  plus KDSKBENT, KDSKBSENT (from ioctl_console(2)).
* Smack: Decomposes the IOCTL command number to look at the _IOC_WRITE and
  _IOC_READ bits. (This is a known problematic approach, because (1) these bits
  describe whether the argument is getting read or written, not whether the
  operation is a mutating one, and (2) some IOCTL commands do not adhere to the
  convention and don't use these macros)

AppArmor does not use the LSM IOCTL hook.
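
To illustrate the two problems with decoding the direction bits, here is a
small sketch.  The helper names and the 'x' magic number used in the checks
below are made up for illustration; the 0x541B value is what
asm-generic/ioctls.h uses for FIONREAD (x86, arm64 and others), so the
result for that command is architecture dependent.

```c
#include <stdbool.h>
#include <linux/ioctl.h>	/* _IOC_DIR(), _IOC_READ, _IOC_WRITE, _IOR(), _IOW() */

/* FIONREAD as defined in asm-generic/ioctls.h; it predates the _IOC() encoding. */
#define LEGACY_FIONREAD 0x541BUL

/*
 * Hypothetical helper: true if the command number claims that the kernel
 * copies data out to the user argument.  This says nothing about whether
 * the operation mutates kernel or device state.
 */
static bool cmd_encodes_read(unsigned long cmd)
{
	return (_IOC_DIR(cmd) & _IOC_READ) != 0;
}

/* Hypothetical helper: true if the command number claims the kernel reads the argument. */
static bool cmd_encodes_write(unsigned long cmd)
{
	return (_IOC_DIR(cmd) & _IOC_WRITE) != 0;
}
```

A command built with _IOR('x', 1, int) is reported as a "read", while
LEGACY_FIONREAD encodes no direction at all on those architectures, even
though it writes an int back to user space.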


> > > I understand that this makes things a bit more
> > > complicated for Landlock's initial ioctl implementation, but
> > > considering my thoughts above and the fact that Landlock's ioctl
> > > protections are still evolving I'd rather not add a lot of extra hooks
> > > right now.
> >
> > Without this hook, we'll need to rely on a list of allowed IOCTLs, which
> > will be out-of-sync eventually.  It would be a maintenance burden and a
> > hacky approach.
> 
> Welcome to the painful world of a LSM developer, ioctls are not the
> only place where this is a problem, and it should be easy enough to
> watch for changes in the ioctl list and update your favorite LSM
> accordingly.  Honestly, I think that is kinda the right thing anyway,
> I'm skeptical that one could have a generic solution that would
> automatically allow or disallow a new ioctl without potentially
> breaking your favorite LSM's security model.  If a new ioctl is
> introduced it seems like having someone manually review its impact on
> your LSM would be a good idea.

We are concerned that we will miss a change in do_vfs_ioctl(), which we would
like to reflect in the matching Landlock code.  Do other LSMs have any
approaches for that which go beyond just watching the do_vfs_ioctl()
implementation for changes?
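
For reference, the list-based alternative would boil down to something
like the sketch below.  The function name is hypothetical and the four
commands are only the ones the v9 cover letter names as always permitted;
this is not the actual Landlock list, and it is exactly the kind of switch
that has to be kept in sync with do_vfs_ioctl() by hand.

```c
#include <stdbool.h>
#include <sys/ioctl.h>	/* FIOCLEX, FIONCLEX, FIONBIO, FIOASYNC */

/*
 * Hypothetical allow-list check: true if the command is always permitted
 * regardless of the sandbox policy.
 */
static bool cmd_in_allow_list(unsigned long cmd)
{
	switch (cmd) {
	case FIOCLEX:
	case FIONCLEX:
	case FIONBIO:
	case FIOASYNC:
		return true;
	default:
		/* Anything added to do_vfs_ioctl() later lands here unnoticed. */
		return false;
	}
}
```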


> > We're definitely open to new proposals, but until now this is the best
> > approach we found from a maintenance, performance, and security point of
> > view.
> 
> At this point it's probably a good idea to post another RFC patch with
> your revised idea, if nothing else it will help rule out any
> confusion.  While I remain skeptical, perhaps I am misunderstanding
> the design and you'll get my apology and an ACK, but be warned that as
> of right now I'm not convinced.

Thank you for your feedback!

Here is V10 with the approach where we use a new LSM hook:
https://lore.kernel.org/all/20240309075320.160128-1-gnoack@google.com/

I hope this helps to clarify the approach a bit.  I'm explaining it in more
detail again in the commit which adds the LSM hook, including a call graph, and
avoiding the word "safe" this time ;-)

Let me know what you think!

Thanks!
—Günther


* Re: [RFC PATCH] fs: Add vfs_masks_device_ioctl*() helpers
  2024-03-09  8:14                                 ` Günther Noack
@ 2024-03-09 17:41                                   ` Casey Schaufler
  2024-03-11 19:04                                   ` Paul Moore
  1 sibling, 0 replies; 50+ messages in thread
From: Casey Schaufler @ 2024-03-09 17:41 UTC (permalink / raw)
  To: Günther Noack, Paul Moore
  Cc: Mickaël Salaün, Arnd Bergmann, Dave Chinner,
	Christian Brauner, Allen Webb, Dmitry Torokhov, Jeff Xu,
	Jorge Lucangeli Obes, Konstantin Meskhidze, Matt Bobrowski,
	linux-fsdevel, linux-security-module, Casey Schaufler

On 3/9/2024 12:14 AM, Günther Noack wrote:
> On Fri, Mar 08, 2024 at 05:25:21PM -0500, Paul Moore wrote:
>> On Fri, Mar 8, 2024 at 3:12 PM Mickaël Salaün <mic@digikod.net> wrote:
>>> On Fri, Mar 08, 2024 at 02:22:58PM -0500, Paul Moore wrote:
>>>> On Fri, Mar 8, 2024 at 4:29 AM Mickaël Salaün <mic@digikod.net> wrote:
>>>>> Let's replace "safe IOCTL" with "IOCTL always allowed in a Landlock
>>>>> sandbox".
>>>> Which is a problem from a LSM perspective as we want to avoid hooks
>>>> which are tightly bound to a single LSM or security model.  It's okay
>>>> if a new hook only has a single LSM implementation, but the hook's
>>>> definition should be such that it is reasonably generalized to support
>>>> multiple LSM/models.
>>> As any new hook, there is a first user.  Obviously this new hook would
>>> not be restricted to Landlock, it is a generic approach.  I'm pretty
>>> sure a few hooks are only used by one LSM though. ;)
>> Sure, as I said above, it's okay for there to only be a single LSM
>> implementation, but the basic idea behind the hook needs to have some
>> hope of being generic.  Your "let's redefine a safe ioctl as 'IOCTL
>> always allowed in a Landlock sandbox'" doesn't fill me with confidence
>> about the hook being generic; who knows, maybe it will be, but in the
>> absence of a patch, I'm left with descriptions like those.
> FWIW, the existing IOCTL hook is used in the following places:
>
> * TOMOYO: seemingly configurable per IOCTL command?  (I did not dig deeper)
> * SELinux: has a hardcoded switch of IOCTL commands, some with special checks.
>   These are also a subset of the do_vfs_ioctl() commands,
>   plus KDSKBENT, KDSKBSENT (from ioctl_console(2)).
> * Smack: Decomposes the IOCTL command number to look at the _IOC_WRITE and
>   _IOC_READ bits. (This is a known problematic approach, because (1) these bits
>   describe whether the argument is getting read or written, not whether the
>   operation is a mutating one, and (2) some IOCTL commands do not adhere to the
>   convention and don't use these macros)

These shortcomings are well understood. It's a whole lot better than what was
done originally, but definitely not up to formal scrutiny. Back in the bad old
days of UNIX security evaluations we spent as much time on ioctl() as we did
on the rest of the system. Or so it seemed.

>
> AppArmor does not use the LSM IOCTL hook.
>
>
>>>> I understand that this makes things a bit more
>>>> complicated for Landlock's initial ioctl implementation, but
>>>> considering my thoughts above and the fact that Landlock's ioctl
>>>> protections are still evolving I'd rather not add a lot of extra hooks
>>>> right now.
>>> Without this hook, we'll need to rely on a list of allowed IOCTLs, which
>>> will be out-of-sync eventually.  It would be a maintenance burden and a
>>> hacky approach.
>> Welcome to the painful world of a LSM developer, ioctls are not the
>> only place where this is a problem, and it should be easy enough to
>> watch for changes in the ioctl list and update your favorite LSM
>> accordingly.  Honestly, I think that is kinda the right thing anyway,
>> I'm skeptical that one could have a generic solution that would
>> automatically allow or disallow a new ioctl without potentially
>> breaking your favorite LSM's security model.  If a new ioctl is
>> introduced it seems like having someone manually review its impact on
>> your LSM would be a good idea.
> We are concerned that we will miss a change in do_vfs_ioctl(), which we would
> like to reflect in the matching Landlock code.  Do other LSMs have any
> approaches for that which go beyond just watching the do_vfs_ioctl()
> implementation for changes?
>
>
>>> We're definitely open to new proposals, but until now this is the best
>>> approach we found from a maintenance, performance, and security point of
>>> view.
>> At this point it's probably a good idea to post another RFC patch with
>> your revised idea, if nothing else it will help rule out any
>> confusion.  While I remain skeptical, perhaps I am misunderstanding
>> the design and you'll get my apology and an ACK, but be warned that as
>> of right now I'm not convinced.
> Thank you for your feedback!
>
> Here is V10 with the approach where we use a new LSM hook:
> https://lore.kernel.org/all/20240309075320.160128-1-gnoack@google.com/
>
> I hope this helps to clarify the approach a bit.  I'm explaining it in more
> detail again in the commit which adds the LSM hook, including a call graph, and
> avoiding the word "safe" this time ;-)
>
> Let me know what you think!
>
> Thanks!
> —Günther
>


* Re: [RFC PATCH] fs: Add vfs_masks_device_ioctl*() helpers
  2024-03-08 11:03                         ` Günther Noack
@ 2024-03-11  1:03                           ` Dave Chinner
  2024-03-11  9:01                             ` Günther Noack
  0 siblings, 1 reply; 50+ messages in thread
From: Dave Chinner @ 2024-03-11  1:03 UTC (permalink / raw)
  To: Günther Noack
  Cc: Arnd Bergmann, Paul Moore, Mickaël Salaün,
	Christian Brauner, Allen Webb, Dmitry Torokhov, Jeff Xu,
	Jorge Lucangeli Obes, Konstantin Meskhidze, Matt Bobrowski,
	linux-fsdevel, linux-security-module

On Fri, Mar 08, 2024 at 12:03:01PM +0100, Günther Noack wrote:
> On Fri, Mar 08, 2024 at 08:02:13AM +0100, Arnd Bergmann wrote:
> > On Fri, Mar 8, 2024, at 00:09, Dave Chinner wrote:
> > > On Thu, Mar 07, 2024 at 03:40:44PM -0500, Paul Moore wrote:
> > >> On Thu, Mar 7, 2024 at 7:57 AM Günther Noack <gnoack@google.com> wrote:
> > >> I need some more convincing as to why we need to introduce these new
> > >> hooks, or even the vfs_masked_device_ioctl() classifier as originally
> > >> proposed at the top of this thread.  I believe I understand why
> > >> Landlock wants this, but I worry that we all might have different
> > >> definitions of a "safe" ioctl list, and encoding a definition into the
> > >> LSM hooks seems like a bad idea to me.
> > >
> > > I have no idea what a "safe" ioctl means here. Subsystems already
> > > restrict ioctls that can do damage if misused to CAP_SYS_ADMIN, so
> > > "safe" clearly means something different here.
> > 
> > That was my problem with the first version as well, but I think
> > drawing the line between "implemented in fs/ioctl.c" and
> > "implemented in a random device driver fops->unlocked_ioctl()"
> > seems like a more helpful definition.
> 
> Yes, sorry for the confusion - that is exactly what I meant to say with "safe".:
> 
> Those are the IOCTL commands implemented in fs/ioctl.c which do not go through
> f_ops->unlocked_ioctl (or the compat equivalent).

Which means all the ioctls we require to manage filesystems are
going to be considered "unsafe" and barred, yes?

That means you'll break basic commands like 'xfs_info' that tell you
the configuration of the filesystem. Things like online growing and
shrinking, online defrag, fstrim, online scrubbing and repair, etc.
will not work anymore. It will break backup utilities like xfsdump,
and break -all- the device management of btrfs and bcachefs
filesystems.

Further, all the setup and management of -VFS functionality- like
fsverity and fscrypt is actually done at the filesystem level (i.e.
through ->unlocked_ioctl, not do_vfs_ioctl()) so those are all going
to get broken as well despite them being "vfs features".

Hence from a filesystem perspective, this is a fundamentally
unworkable definition of "safe".

> We want to give people a way with Landlock so that they can restrict the use of
> device-driver implemented IOCTLs, but where they can keep using the bulk of
> more harmless IOCTLs in fs/ioctl.c.

Hah! There's plenty of "harm" that can be done through those ioctls.
It's the entry point for things like filesystem freeze/thaw, FIEMAP
(returns physical data location information), file cloning,
deduplication and per-inode feature manipulation. Lots of this stuff
is under CAP_SYS_ADMIN because they aren't safe to be exposed to
general users...

So, yeah, I don't think this definition of "safe" is actually useful
in any way. It's arbitrary, and will require both widespread
whitelisting of ioctls to maintain a useful working system and
widespread blacklisting to create a secure system....

-Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: [RFC PATCH] fs: Add vfs_masks_device_ioctl*() helpers
  2024-03-11  1:03                           ` Dave Chinner
@ 2024-03-11  9:01                             ` Günther Noack
  2024-03-11 22:12                               ` Dave Chinner
  0 siblings, 1 reply; 50+ messages in thread
From: Günther Noack @ 2024-03-11  9:01 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Arnd Bergmann, Paul Moore, Mickaël Salaün,
	Christian Brauner, Allen Webb, Dmitry Torokhov, Jeff Xu,
	Jorge Lucangeli Obes, Konstantin Meskhidze, Matt Bobrowski,
	linux-fsdevel, linux-security-module

On Mon, Mar 11, 2024 at 12:03:13PM +1100, Dave Chinner wrote:
> On Fri, Mar 08, 2024 at 12:03:01PM +0100, Günther Noack wrote:
> > On Fri, Mar 08, 2024 at 08:02:13AM +0100, Arnd Bergmann wrote:
> > > On Fri, Mar 8, 2024, at 00:09, Dave Chinner wrote:
> > > > I have no idea what a "safe" ioctl means here. Subsystems already
> > > > restrict ioctls that can do damage if misused to CAP_SYS_ADMIN, so
> > > > "safe" clearly means something different here.
> > > 
> > > That was my problem with the first version as well, but I think
> > > drawing the line between "implemented in fs/ioctl.c" and
> > > "implemented in a random device driver fops->unlocked_ioctl()"
> > > seems like a more helpful definition.
> > 
> > Yes, sorry for the confusion - that is exactly what I meant to say with "safe".:
> > 
> > Those are the IOCTL commands implemented in fs/ioctl.c which do not go through
> > f_ops->unlocked_ioctl (or the compat equivalent).
> 
> Which means all the ioctls we require to manage filesystems are
> going to be considered "unsafe" and barred, yes?
> 
> That means you'll break basic commands like 'xfs_info' that tell you
> the configuration of the filesystem. Things like online growing and
> shrinking, online defrag, fstrim, online scrubbing and repair, etc.
> will not work anymore. It will break backup utilities like xfsdump,
> and break -all- the device management of btrfs and bcachefs
> filesystems.
> 
> Further, all the setup and management of -VFS functionality- like
> fsverity and fscrypt is actually done at the filesystem level (i.e.
> through ->unlocked_ioctl, not do_vfs_ioctl()) so those are all going
> to get broken as well despite them being "vfs features".
> 
> Hence from a filesystem perspective, this is a fundamentally
> unworkable definition of "safe".

As discussed further up in this thread[1], we want to only apply the IOCTL
command filtering to block and character devices.  I think this should resolve
your concerns about file system specific IOCTLs?  This is implemented in patch
V10 going forward[2].

[1] https://lore.kernel.org/all/20240219.chu4Yeegh3oo@digikod.net/
[2] https://lore.kernel.org/all/20240309075320.160128-1-gnoack@google.com/


> > We want to give people a way with Landlock so that they can restrict the use of
> > device-driver implemented IOCTLs, but where they can keep using the bulk of
> > more harmless IOCTLs in fs/ioctl.c.
> 
> Hah! There's plenty of "harm" that can be done through those ioctls.
> It's the entry point for things like filesystem freeze/thaw, FIEMAP
> (returns physical data location information), file cloning,
> deduplication and per-inode feature manipulation. Lots of this stuff
> is under CAP_SYS_ADMIN because they aren't safe to be exposed to
> general users...

The operations themselves are not all harmless, but they are harmless to permit
from the Landlock perspective, because (as you point out as well) their use is
already adequately controlled in their existing implementations.

The proposed patch v10 only influences IOCTL operations on device files,
so the "reflink" deduplication IOCTLs, FIEMAP, etc. should not matter.

—Günther


* Re: [RFC PATCH] fs: Add vfs_masks_device_ioctl*() helpers
  2024-03-09  8:14                                 ` Günther Noack
  2024-03-09 17:41                                   ` Casey Schaufler
@ 2024-03-11 19:04                                   ` Paul Moore
  1 sibling, 0 replies; 50+ messages in thread
From: Paul Moore @ 2024-03-11 19:04 UTC (permalink / raw)
  To: Günther Noack
  Cc: Mickaël Salaün, Arnd Bergmann, Dave Chinner,
	Christian Brauner, Allen Webb, Dmitry Torokhov, Jeff Xu,
	Jorge Lucangeli Obes, Konstantin Meskhidze, Matt Bobrowski,
	linux-fsdevel, linux-security-module

On Sat, Mar 9, 2024 at 3:14 AM Günther Noack <gnoack@google.com> wrote:
> On Fri, Mar 08, 2024 at 05:25:21PM -0500, Paul Moore wrote:
> > On Fri, Mar 8, 2024 at 3:12 PM Mickaël Salaün <mic@digikod.net> wrote:
> > > On Fri, Mar 08, 2024 at 02:22:58PM -0500, Paul Moore wrote:
> > > > On Fri, Mar 8, 2024 at 4:29 AM Mickaël Salaün <mic@digikod.net> wrote:
> > > > > Let's replace "safe IOCTL" with "IOCTL always allowed in a Landlock
> > > > > sandbox".
> > > >
> > > > Which is a problem from a LSM perspective as we want to avoid hooks
> > > > which are tightly bound to a single LSM or security model.  It's okay
> > > > if a new hook only has a single LSM implementation, but the hook's
> > > > definition should be such that it is reasonably generalized to support
> > > > multiple LSM/models.
> > >
> > > As any new hook, there is a first user.  Obviously this new hook would
> > > not be restricted to Landlock, it is a generic approach.  I'm pretty
> > > sure a few hooks are only used by one LSM though. ;)
> >
> > Sure, as I said above, it's okay for there to only be a single LSM
> > implementation, but the basic idea behind the hook needs to have some
> > hope of being generic.  Your "let's redefine a safe ioctl as 'IOCTL
> > always allowed in a Landlock sandbox'" doesn't fill me with confidence
> > about the hook being generic; who knows, maybe it will be, but in the
> > absence of a patch, I'm left with descriptions like those.
>
> FWIW, the existing IOCTL hook is used in the following places:
>
> * TOMOYO: seemingly configurable per IOCTL command?  (I did not dig deeper)
> * SELinux: has a hardcoded switch of IOCTL commands, some with special checks.
>   These are also a subset of the do_vfs_ioctl() commands,
>   plus KDSKBENT, KDSKBSENT (from ioctl_console(2)).

One should be careful using the term "hardcoded" here as I believe it
is misleading in the SELinux case.  SELinux has 11 explicitly defined
ioctls, with an additional two configurable on a per-policy basis
depending on the state of the SELinux IOCTL_SKIP_CLOEXEC policy
capability.  The security policy associated with these explicit ioctl
checks is not hardcoded into the kernel, it is defined as part of the
greater SELinux security policy.  One could make an argument that
FIONBIO and FIOASYNC look a bit hardcoded, but there is some subtlety
there that is probably not worth exploring further in this context but
I'm happy to discuss in a different thread if that is helpful.

All the ioctls that are not explicitly defined in the SELinux code,
are still subject to SELinux policy through both the file/ioctl
permission and the extended permission (xperm) functionality.  The
SELinux xperm functionality, when tied to an ioctl operation, allows
policy developers to allow or deny specific ioctl operations on a
per-domain basis.

> * Smack: Decomposes the IOCTL command number to look at the _IOC_WRITE and
>   _IOC_READ bits. (This is a known problematic approach, because (1) these bits
>   describe whether the argument is getting read or written, not whether the
>   operation is a mutating one, and (2) some IOCTL commands do not adhere to the
>   convention and don't use these macros)
>
> AppArmor does not use the LSM IOCTL hook.
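
The caveat about decoding the direction bits can be illustrated with a
minimal sketch. The helper names below are hypothetical, and the literal
0x5421 is assumed to be the value of FIONBIO on x86 and other asm-generic
architectures. This is not Smack's code, only the decoding technique
described in the quoted text.

```c
#include <stdbool.h>
#include <linux/ioctl.h>	/* _IOC_DIR, _IOC_READ, _IOC_WRITE, _IOR, _IOW */

/*
 * Hypothetical helper: true if the command number encodes _IOC_READ,
 * meaning user space reads the argument back (the kernel fills it in).
 */
static bool cmd_encodes_read(unsigned int cmd)
{
	return (_IOC_DIR(cmd) & _IOC_READ) != 0;
}

/*
 * Hypothetical helper: true if the command number encodes _IOC_WRITE,
 * meaning user space passes argument data to the kernel.
 */
static bool cmd_encodes_write(unsigned int cmd)
{
	return (_IOC_DIR(cmd) & _IOC_WRITE) != 0;
}

/*
 * Assumption: 0x5421 is FIONBIO on asm-generic architectures.  It predates
 * the _IOR()/_IOW() macros, so it carries no direction bits at all, even
 * though the kernel does read an int from user space for it.
 */
#define LEGACY_FIONBIO_VALUE 0x5421u
```

For a command built with _IOR() only cmd_encodes_read() is true, and for
one built with _IOW() only cmd_encodes_write() is true.  For
LEGACY_FIONBIO_VALUE both helpers return false.  In every case the bits
describe the argument transfer, not whether the operation mutates state.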

-- 
paul-moore.com


* Re: [RFC PATCH] fs: Add vfs_masks_device_ioctl*() helpers
  2024-03-11  9:01                             ` Günther Noack
@ 2024-03-11 22:12                               ` Dave Chinner
  2024-03-12 10:58                                 ` Mickaël Salaün
  0 siblings, 1 reply; 50+ messages in thread
From: Dave Chinner @ 2024-03-11 22:12 UTC (permalink / raw)
  To: Günther Noack
  Cc: Arnd Bergmann, Paul Moore, Mickaël Salaün,
	Christian Brauner, Allen Webb, Dmitry Torokhov, Jeff Xu,
	Jorge Lucangeli Obes, Konstantin Meskhidze, Matt Bobrowski,
	linux-fsdevel, linux-security-module

On Mon, Mar 11, 2024 at 10:01:33AM +0100, Günther Noack wrote:
> On Mon, Mar 11, 2024 at 12:03:13PM +1100, Dave Chinner wrote:
> > On Fri, Mar 08, 2024 at 12:03:01PM +0100, Günther Noack wrote:
> > > On Fri, Mar 08, 2024 at 08:02:13AM +0100, Arnd Bergmann wrote:
> > > > On Fri, Mar 8, 2024, at 00:09, Dave Chinner wrote:
> > > > > I have no idea what a "safe" ioctl means here. Subsystems already
> > > > > restrict ioctls that can do damage if misused to CAP_SYS_ADMIN, so
> > > > > "safe" clearly means something different here.
> > > > 
> > > > That was my problem with the first version as well, but I think
> > > > drawing the line between "implemented in fs/ioctl.c" and
> > > > "implemented in a random device driver fops->unlocked_ioctl()"
> > > > seems like a more helpful definition.
> > > 
> > > Yes, sorry for the confusion - that is exactly what I meant to say with "safe":
> > > 
> > > Those are the IOCTL commands implemented in fs/ioctl.c which do not go through
> > > f_ops->unlocked_ioctl (or the compat equivalent).
> > 
> > Which means all the ioctls we require to manage filesystems are
> > going to be considered "unsafe" and barred, yes?
> > 
> > That means you'll break basic commands like 'xfs_info' that tell you
> > the configuration of the filesystem. Things like online growing
> > and shrinking, online defrag, fstrim, online scrubbing and
> > repair, etc. will not work anymore. It will break
> > backup utilities like xfsdump, and break -all- the device management
> > of btrfs and bcachefs filesystems.
> > 
> > Further, all the setup and management of -VFS functionality- like
> > fsverity and fscrypt is actually done at the filesystem level (i.e.
> > through ->unlocked_ioctl, not do_vfs_ioctl()) so those are all going
> > to get broken as well despite them being "vfs features".
> > 
> > Hence from a filesystem perspective, this is a fundamentally
> > unworkable definition of "safe".
> 
> As discussed further up in this thread[1], we want to only apply the IOCTL
> command filtering to block and character devices.  I think this should resolve
> your concerns about file system specific IOCTLs?  This is implemented in patch
> V10 going forward[2].

I think you misunderstand. I used filesystem ioctls as an obvious
counter argument to this "VFS-only ioctls are safe" proposal to show
that it fundamentally breaks core filesystem boot and management
interfaces. Operations to prepare filesystems for mount may require
block device ioctls to be run. i.e. block device ioctls are required
core boot and management interfaces.

Disallowing ioctls on block devices will break udev rules that set
up block devices on kernel device instantiation events. It will
break partitioning tools that need to read/modify/rescan the
partition table. This will prevent discard, block zeroing and
*secure erase* operations. It may prevent libblkid from reporting
optimal device IO parameters to filesystem utilities like mkfs. You
won't be able to mark block devices as read only.  Management of
zoned block devices will be impossible.

Then stuff like DM and MD devices (e.g. LVM, RAID, etc) simply won't
appear on the system because they can't be scanned, configured,
assembled, etc.

And so on.

The fundamental fact is that system critical block device ioctls are
implemented by generic infrastructure below the VFS layer. They have
their own generic ioctl layer - blkdev_ioctl() is the equivalent of
do_vfs_ioctl() for the block layer.  But if we cut off everything
below ->unlocked_ioctl() at the VFS, then we simply can't run any
of these generic block device ioctls.

As I said: this proposal is fundamentally unworkable without
extensive white- and black-listing of individual ioctls in the
security policies. That's not really a viable situation, because
we're going to change code and hence likely silently break those
security policy lists regularly....

-Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: [RFC PATCH] fs: Add vfs_masks_device_ioctl*() helpers
  2024-03-11 22:12                               ` Dave Chinner
@ 2024-03-12 10:58                                 ` Mickaël Salaün
  0 siblings, 0 replies; 50+ messages in thread
From: Mickaël Salaün @ 2024-03-12 10:58 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Günther Noack, Arnd Bergmann, Paul Moore, Christian Brauner,
	Allen Webb, Dmitry Torokhov, Jeff Xu, Jorge Lucangeli Obes,
	Konstantin Meskhidze, Matt Bobrowski, linux-fsdevel,
	linux-security-module

On Tue, Mar 12, 2024 at 09:12:28AM +1100, Dave Chinner wrote:
> On Mon, Mar 11, 2024 at 10:01:33AM +0100, Günther Noack wrote:
> > On Mon, Mar 11, 2024 at 12:03:13PM +1100, Dave Chinner wrote:
> > > On Fri, Mar 08, 2024 at 12:03:01PM +0100, Günther Noack wrote:
> > > > On Fri, Mar 08, 2024 at 08:02:13AM +0100, Arnd Bergmann wrote:
> > > > > On Fri, Mar 8, 2024, at 00:09, Dave Chinner wrote:
> > > > > > I have no idea what a "safe" ioctl means here. Subsystems already
> > > > > > restrict ioctls that can do damage if misused to CAP_SYS_ADMIN, so
> > > > > > "safe" clearly means something different here.
> > > > > 
> > > > > That was my problem with the first version as well, but I think
> > > > > drawing the line between "implemented in fs/ioctl.c" and
> > > > > "implemented in a random device driver fops->unlocked_ioctl()"
> > > > > seems like a more helpful definition.
> > > > 
> > > > Yes, sorry for the confusion - that is exactly what I meant to say with "safe":
> > > > 
> > > > Those are the IOCTL commands implemented in fs/ioctl.c which do not go through
> > > > f_ops->unlocked_ioctl (or the compat equivalent).
> > > 
> > > Which means all the ioctls we require to manage filesystems are
> > > going to be considered "unsafe" and barred, yes?
> > > 
> > > That means you'll break basic commands like 'xfs_info' that tell you
> > > the configuration of the filesystem. Things like online growing
> > > and shrinking, online defrag, fstrim, online scrubbing and
> > > repair, etc. will not work anymore. It will break
> > > backup utilities like xfsdump, and break -all- the device management
> > > of btrfs and bcachefs filesystems.
> > > 
> > > Further, all the setup and management of -VFS functionality- like
> > > fsverity and fscrypt is actually done at the filesystem level (i.e.
> > > through ->unlocked_ioctl, not do_vfs_ioctl()) so those are all going
> > > to get broken as well despite them being "vfs features".
> > > 
> > > Hence from a filesystem perspective, this is a fundamentally
> > > unworkable definition of "safe".
> > 
> > As discussed further up in this thread[1], we want to only apply the IOCTL
> > command filtering to block and character devices.  I think this should resolve
> > your concerns about file system specific IOCTLs?  This is implemented in patch
> > V10 going forward[2].
> 
> I think you misunderstand. I used filesystem ioctls as an obvious
> counter argument to this "VFS-only ioctls are safe" proposal to show
> that it fundamentally breaks core filesystem boot and management
> interfaces. Operations to prepare filesystems for mount may require
> block device ioctls to be run. i.e. block device ioctls are required
> core boot and management interfaces.
> 
> Disallowing ioctls on block devices will break udev rules that set
> up block devices on kernel device instantiation events. It will
> break partitioning tools that need to read/modify/rescan the
> partition table. This will prevent discard, block zeroing and
> *secure erase* operations. It may prevent libblkid from reporting
> optimal device IO parameters to filesystem utilities like mkfs. You
> won't be able to mark block devices as read only.  Management of
> zoned block devices will be impossible.
> 
> Then stuff like DM and MD devices (e.g. LVM, RAID, etc) simply won't
> appear on the system because they can't be scanned, configured,
> assembled, etc.
> 
> And so on.
> 
> The fundamental fact is that system critical block device ioctls are
> implemented by generic infrastructure below the VFS layer. They have
> their own generic ioctl layer - blkdev_ioctl() is the equivalent of
> do_vfs_ioctl() for the block layer.  But if we cut off everything
> below ->unlocked_ioctl() at the VFS, then we simply can't run any
> of these generic block device ioctls.
> 
> As I said: this proposal is fundamentally unworkable without
> extensive white- and black-listing of individual ioctls in the
> security policies. That's not really a viable situation, because
> we're going to change code and hence likely silently break those
> security policy lists regularly....

Landlock is an optional sandboxing mechanism targeting unprivileged
users/processes (even if it can of course be used by privileged ones).
This means that there is no global security policy for the whole system
(unlike SELinux, AppArmor...).  System administrators that need to
manage a file system or any block devices would just not sandbox
themselves.  Moreover, most block devices should only be accessible to
the root user (which makes root the only one able to send IOCTL commands
to these block devices).  In a nutshell, processes using boot and
management interfaces are already privileged and they don't use
Landlock.  For instance, a landlocked process cannot do any mount
action, which is documented and it makes sense for the sandboxing use
case (to avoid sandbox bypass).

However, it would be interesting to know if unprivileged users can
request legitimate IOCTL commands on block devices (on a generic
distro), and if this is required for a common file system use (i.e.
excluding administration tasks).  I think all IOCTLs required for common
file system use are available through the file system, not block devices,
but please correct me if I'm wrong.  What is nice with this
LANDLOCK_ACCESS_FS_IOCTL_DEV approach is that user space can identify
(with path and dev major/minor) on which device IOCTLs should be
allowed.  This is simple to understand and the information to identify
such devices is already well known.  We can also allow IOCTLs on a set
of devices, e.g. /dev/snd/.

The goal of this patch series is to enable applications to sandbox
themselves and prevent an attacker (exploiting a bug in this application)
from sending arbitrary IOCTL commands to any devices available to the user
running this application.  For this sandboxing use case, I think it
wouldn't be useful to differentiate between blkdev_ioctl()'s commands
and device-specific commands because we want to either allow all IOCTLs
on a block device or deny most of them (not those handled by
do_vfs_ioctl(), e.g. FIOCLEX, but that's a detail because of the file
access rights).  This is a trade off to ease sandboxing while being able
to limit access to unneeded features (which could potentially be used to
bypass the sandbox, e.g. TTY's IOCTLs).


Thread overview: 50+ messages
2024-02-09 17:06 [PATCH v9 0/8] Landlock: IOCTL support Günther Noack
2024-02-09 17:06 ` [PATCH v9 1/8] landlock: Add IOCTL access right Günther Noack
2024-02-10 11:06   ` Günther Noack
2024-02-10 11:49     ` Arnd Bergmann
2024-02-12 11:09       ` Christian Brauner
2024-02-12 22:10         ` Günther Noack
2024-02-10 11:18   ` Günther Noack
2024-02-16 14:11     ` Mickaël Salaün
2024-02-16 15:51       ` Mickaël Salaün
2024-02-18  8:34         ` Günther Noack
2024-02-19 21:44           ` Günther Noack
2024-02-16 17:19   ` Mickaël Salaün
2024-02-19 18:34   ` Mickaël Salaün
2024-02-19 18:35     ` [RFC PATCH] fs: Add vfs_masks_device_ioctl*() helpers Mickaël Salaün
2024-03-01 13:42       ` Mickaël Salaün
2024-03-01 16:24       ` Arnd Bergmann
2024-03-01 18:35         ` Mickaël Salaün
2024-03-05 18:13       ` Günther Noack
2024-03-06 13:47         ` Mickaël Salaün
2024-03-06 15:18           ` Arnd Bergmann
2024-03-07 12:15             ` Christian Brauner
2024-03-07 12:21               ` Arnd Bergmann
2024-03-07 12:57                 ` Günther Noack
2024-03-07 20:40                   ` Paul Moore
2024-03-07 23:09                     ` Dave Chinner
2024-03-07 23:35                       ` Paul Moore
2024-03-08  7:02                       ` Arnd Bergmann
2024-03-08  9:29                         ` Mickaël Salaün
2024-03-08 19:22                           ` Paul Moore
2024-03-08 20:12                             ` Mickaël Salaün
2024-03-08 22:04                               ` Casey Schaufler
2024-03-08 22:25                               ` Paul Moore
2024-03-09  8:14                                 ` Günther Noack
2024-03-09 17:41                                   ` Casey Schaufler
2024-03-11 19:04                                   ` Paul Moore
2024-03-08 11:03                         ` Günther Noack
2024-03-11  1:03                           ` Dave Chinner
2024-03-11  9:01                             ` Günther Noack
2024-03-11 22:12                               ` Dave Chinner
2024-03-12 10:58                                 ` Mickaël Salaün
2024-02-28 12:57     ` [PATCH v9 1/8] landlock: Add IOCTL access right Günther Noack
2024-03-01 12:59       ` Mickaël Salaün
2024-03-01 13:38         ` Mickaël Salaün
2024-02-09 17:06 ` [PATCH v9 2/8] selftests/landlock: Test IOCTL support Günther Noack
2024-02-09 17:06 ` [PATCH v9 3/8] selftests/landlock: Test IOCTL with memfds Günther Noack
2024-02-09 17:06 ` [PATCH v9 4/8] selftests/landlock: Test ioctl(2) and ftruncate(2) with open(O_PATH) Günther Noack
2024-02-09 17:06 ` [PATCH v9 5/8] selftests/landlock: Test IOCTLs on named pipes Günther Noack
2024-02-09 17:06 ` [PATCH v9 6/8] selftests/landlock: Check IOCTL restrictions for named UNIX domain sockets Günther Noack
2024-02-09 17:06 ` [PATCH v9 7/8] samples/landlock: Add support for LANDLOCK_ACCESS_FS_IOCTL Günther Noack
2024-02-09 17:06 ` [PATCH v9 8/8] landlock: Document IOCTL support Günther Noack
