LKML Archive on lore.kernel.org
From: Christian Brauner <christian.brauner@ubuntu.com>
To: Arnd Bergmann <arnd@arndb.de>
Cc: "Mickaël Salaün" <mic@digikod.net>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"Al Viro" <viro@zeniv.linux.org.uk>,
	"Andy Lutomirski" <luto@amacapital.net>,
	"Anton Ivanov" <anton.ivanov@cambridgegreys.com>,
	"Casey Schaufler" <casey@schaufler-ca.com>,
	"James Morris" <jmorris@namei.org>,
	"Jann Horn" <jannh@google.com>, "Jeff Dike" <jdike@addtoit.com>,
	"Jonathan Corbet" <corbet@lwn.net>,
	"Kees Cook" <keescook@chromium.org>,
	"Michael Kerrisk" <mtk.manpages@gmail.com>,
	"Mickaël Salaün" <mickael.salaun@ssi.gouv.fr>,
	"Richard Weinberger" <richard@nod.at>,
	"Serge E . Hallyn" <serge@hallyn.com>,
	"Shuah Khan" <shuah@kernel.org>,
	"Vincent Dagonneau" <vincent.dagonneau@ssi.gouv.fr>,
	"Kernel Hardening" <kernel-hardening@lists.openwall.com>,
	"Linux API" <linux-api@vger.kernel.org>,
	linux-arch <linux-arch@vger.kernel.org>,
	"open list:DOCUMENTATION" <linux-doc@vger.kernel.org>,
	"Linux FS-devel Mailing List" <linux-fsdevel@vger.kernel.org>,
	"open list:KERNEL SELFTEST FRAMEWORK"
	<linux-kselftest@vger.kernel.org>,
	"LSM List" <linux-security-module@vger.kernel.org>,
	"the arch/x86 maintainers" <x86@kernel.org>
Subject: Re: [PATCH v19 08/12] landlock: Add syscall implementation
Date: Thu, 9 Jul 2020 19:47:23 +0200
Message-ID: <20200709174723.3m7iuma4re2v3xod@wittgenstein>
In-Reply-To: <CAK8P3a34X1qfDhn8u3nR+aQA_g+V2i35L0oTnvhNAs83YJPB_w@mail.gmail.com>

On Thu, Jul 09, 2020 at 07:26:18PM +0200, Arnd Bergmann wrote:
> On Wed, Jul 8, 2020 at 7:50 PM Mickaël Salaün <mic@digikod.net> wrote:
> > On 08/07/2020 15:49, Arnd Bergmann wrote:
> > > On Wed, Jul 8, 2020 at 3:04 PM Mickaël Salaün <mic@digikod.net> wrote:
> > >> On 08/07/2020 10:57, Arnd Bergmann wrote:
> > >>> On Tue, Jul 7, 2020 at 8:10 PM Mickaël Salaün <mic@digikod.net> wrote:
> > >>>
> > >>> It looks like all you need here today is a single argument bit, plus
> > >>> possibly some room for extensibility. I would suggest removing all
> > >>> the extra bits and using a syscall like
> > >>>
> > >>> SYSCALL_DEFINE1(landlock_create_ruleset, u32, flags);
> > >>>
> > >>> I don't really see how this needs any variable-length arguments,
> > >>> it really doesn't do much.
> > >>
> > >> We need the attr_ptr/attr_size pattern because the number of ruleset
> > >> properties will increase (e.g. network access mask).
> > >
> > > But how many bits do you think you will *actually* need in total that
> > > this needs to be a two-dimensional set of flags? At the moment you
> > > only have a single bit that you interpret.
> >
> > I think there is a misunderstanding. For this syscall I wasn't talking
> > about the "options" field but about the "handled_access_fs" field which
> > has 14 bits dedicated to control access to the file system:
> > https://landlock.io/linux-doc/landlock-v19/security/landlock/user.html#filesystem-flags
> 
> Ok, got it. I didn't read far enough there.
> 
> > The idea is to add other handled_access_* fields for other kernel object
> > types (e.g. network, process, etc.).
> >
> > The "options" field is fine as a raw __u32 syscall argument.
> 
> I'd still like to avoid having it variable-length and structured though.
> How about having a __u32 "options" flag, plus an indirect argument
> with 32 fixed-length (all 32 bit or all 64 bit) flag words, each of which
> corresponds to one of the option bits?
> 
> It's still fairly complex that way, but not as much as the version
> you have right now that can be extended in multiple dimensions.
> 
> This could possibly also help avoid the need for the get_features

What is this fresh hell again, please?

> syscall: If user space just passes the bitmap of all the access flags
> it wants to use in a fixed-size structure, the kernel can update the
> bits to mask out the ones it does not understand and write back
> that bitmap as the result of create_ruleset().
> 
> > >>> To be on the safe side, you might split up the flags into either the
> > >>> upper/lower 16 bits or two u32 arguments, to allow both compatible
> > >>> (ignored by older kernels if flag is set) and incompatible (return error
> > >>> when an unknown flag is set) bits.
> > >>
> > >> This may be a good idea in general, but in the case of Landlock, because
> > >> this kind of (discretionary) sandboxing should be a best-effort security
> > >> feature, we should avoid incompatible behavior. In practice, every
> > >> unknown bit returns an error because userland can probe for available
> > >> bits thanks to the get_features command. This kind of (in)compatibility
> > >> can then be handled by userland.
> > >
> > > If there are not going to be incompatible extensions, then just ignore
> > > all unknown bits and never return an error but get rid of the user
> > > space probing that just complicates the interface.
> >
> > There were multiple discussions about ABI compatibility, especially
> > inspired by open(2) vs. openat2(2), and ignoring flags seems to be a bad
> > idea. In the "sandboxer" example, we first probe the supported features
> > and then mask unknown bits (i.e. access rights) at run time in userland.
> > This strategy is quite straightforward, backward compatible and
> > future-proof.
> 
> For behavior changing flags, I agree they should be seen as
> incompatible flags (i.e. return an error if an unknown bit is set).
> 
> However, for the flags you pass in in an allowlist, treating them
> as compatible (i.e. ignore any unknown flags, allowing everything
> you are not forbidding already) seems completely reasonable
> to me. Do you foresee user space doing anything other than masking
> out the bits that the kernel doesn't know about? If not, then doing
> it in the kernel should always be simpler.
> 
> > >> I suggest this syscall signature:
> > >> SYSCALL_DEFINE3(landlock_create_ruleset, __u32, options, const struct
> > >> landlock_attr_ruleset __user *, ruleset_ptr, size_t, ruleset_size);
> > >
> > > The other problem here is that indirect variable-size structured arguments
> > > are a pain to instrument with things like strace or seccomp, so you
> > > should first try to use a fixed argument list, and fall back to a fixed
> > > structure if that fails.
> >
> > I agree that it is not perfect with the current tools but this kind of
> > extensible structs are becoming common and well defined (e.g. openat2).
> > Moreover there is some work going on for seccomp to support "extensible
> > argument" syscalls: https://lwn.net/Articles/822256/
> 
> openat2() is already more complex than we'd ideally want, I think we
> should try hard to make new syscalls simpler than that, following the
> rule that any interface should be as simple as possible, but no simpler.

Extensible structs are targeted at system calls that are known to grow
a lot of features, that already have prior versions which have
accumulated quite a lot of features, or that by their nature need to be
more complex.
openat2() is not really complex per se (at least not yet; it will likely
grow quite a bit in the future...). Since clone3(), and later generalized
with openat2(), the kernel has infrastructure and a consistent API to deal
with such syscalls, so I don't see how this is really an issue in the
first place. Yes, syscalls should be kept as simple as possible, but we
don't need to lock ourselves into a "structs as arguments are inherently
bad" mindset. That will also cause us to end up with crappy syscalls that
are awkward to use for userspace.
(Second-level pointers are a whole different issue, of course.)
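
For readers unfamiliar with the convention referred to here, the following is
a minimal userspace sketch of the size-versioned struct copy that clone3() and
openat2() rely on. It models the semantics of the kernel's
copy_struct_from_user() helper but is not the kernel implementation. The struct
names, the handled_access_net field, and the functions copy_struct_sized() and
example_usage() are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical first revision of an extensible argument struct. */
struct ruleset_attr_v1 {
	uint64_t handled_access_fs;
};

/* Hypothetical later revision: fields are only ever appended. */
struct ruleset_attr_v2 {
	uint64_t handled_access_fs;
	uint64_t handled_access_net;	/* assumed future field */
};

/*
 * Userspace model of a size-versioned copy. "ksize" is the struct size the
 * callee was built with, "usize" is the size the caller passed in.
 */
static int copy_struct_sized(void *dst, size_t ksize,
			     const void *src, size_t usize)
{
	const unsigned char *bytes = src;
	size_t size = usize < ksize ? usize : ksize;
	size_t i;

	/* Newer caller, older callee: unknown trailing bytes must be zero. */
	for (i = ksize; i < usize; i++)
		if (bytes[i] != 0)
			return -E2BIG;

	/* Older caller, newer callee: fields not passed read as zero. */
	memset(dst, 0, ksize);
	memcpy(dst, src, size);
	return 0;
}

/* Exercises the size relationships between caller and callee. */
static void example_usage(void)
{
	struct ruleset_attr_v1 old_in = { .handled_access_fs = 0x3 };
	struct ruleset_attr_v2 new_in = { .handled_access_fs = 0x3 };
	struct ruleset_attr_v1 k1;
	struct ruleset_attr_v2 k2;

	/* Old caller, new callee: the missing field is zero-filled. */
	assert(copy_struct_sized(&k2, sizeof(k2), &old_in, sizeof(old_in)) == 0);
	assert(k2.handled_access_fs == 0x3 && k2.handled_access_net == 0);

	/* New caller, old callee: fine while the unknown field is zero. */
	assert(copy_struct_sized(&k1, sizeof(k1), &new_in, sizeof(new_in)) == 0);

	/* New caller sets a field the old callee does not know: rejected. */
	new_in.handled_access_net = 0x1;
	assert(copy_struct_sized(&k1, sizeof(k1), &new_in, sizeof(new_in)) == -E2BIG);
}
```

Calling example_usage() from a main() runs the three cases. The point of the
convention is that the struct size acts as the version number, so old and new
binaries interoperate in both directions as long as new fields are appended
and zero means "not requested".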

(Arnd, you should also note that we're giving a talk at kernel summit
about new syscall conventions and I'm syncing with Florian who'll be
talking about the userspace side and requirements of this.)

Christian

> 
> > >>>> +static int syscall_add_rule_path_beneath(const void __user *const attr_ptr,
> > >>>> +               const size_t attr_size)
> > >>>> +{
> > >>>> +       struct landlock_attr_path_beneath attr_path_beneath;
> > >>>> +       struct path path;
> > >>>> +       struct landlock_ruleset *ruleset;
> > >>>> +       int err;
> > >>>
> > >>> Similarly, it looks like this wants to be
> > >>>
> > >>> SYSCALL_DEFINE3(landlock_add_rule_path_beneath, int, ruleset, int,
> > >>> path, __u32, flags)
> > >>>
> > >>> I don't see any need to extend this in a way that wouldn't already
> > >>> be served better by adding another system call. You might argue
> > >>> that 'flags' and 'allowed_access' could be separate, with the latter
> > >>> being an indirect in/out argument here, like
> > >>>
> > >>> SYSCALL_DEFINE4(landlock_add_rule_path_beneath, int, ruleset, int, path,
> > >>>                            __u64 *, allowed_access, __u32, flags)
> > >>
> > >> To avoid adding a new syscall for each new rule type (e.g. path_beneath,
> > >> path_range, net_ipv4_range, etc.), I think it would be better to keep
> > >> the attr_ptr/attr_size pattern and to explicitly set a dedicated option
> > >> flag to specify the attr type.
> > >>
> > >> This would look like this:
> > >> SYSCALL_DEFINE4(landlock_add_rule, __u32, options, int, ruleset, const
> > >> void __user *, rule_ptr, size_t, rule_size);
> > >>
> > >> The rule_ptr could then point to multiple types like struct
> > >> landlock_attr_path_beneath (without the current ruleset_fd field).
> > >
> > > This again introduces variable-sized structured data. How many different
> > > kinds of rule types do you think there will be (most likely, and maybe an
> > > upper bound)?
> >
> > I don't know how many rule types will come, but right now I think it may
> > be less than 10.
> 
> Ok,
> 
> > > Could (some of) these be generalized to use the same data structure?
> >
> > I don't think so, file path and network addresses are an example of very
> > different types.
> 
> Clearly the target object is something different, but maybe there is
> enough commonality to still make them fit into a more regular form.
> 
> For the file system case, you have an identifier for an object
> (the file descriptor) and the '__u64 allowed_access'. I would
> expect that the 'allowed_access' concept is generic enough that
> you can make it a direct argument (32 bit register arg, or pointer
> to a __u64). Do you expect others to need something besides
> an object identifier and a permission bitmask? Maybe it could
> be something like
> 
>  SYSCALL_DEFINE5(landlock_add_rule, int, ruleset, __u32, options,
>                        const void __user *, object,
>                        const __u64 __user *, allowed_access,
>                        __u32, flags);
> 
> with a fixed-length 'object' identifier type (file descriptor,
> sockaddr_storage, ...) for each option.
> 
>     Arnd


Thread overview: 27+ messages
2020-07-07 18:09 [PATCH v19 00/12] Landlock LSM Mickaël Salaün
2020-07-07 18:09 ` [PATCH v19 01/12] landlock: Add object management Mickaël Salaün
2020-07-07 18:09 ` [PATCH v19 02/12] landlock: Add ruleset and domain management Mickaël Salaün
2020-07-07 18:09 ` [PATCH v19 03/12] landlock: Set up the security framework and manage credentials Mickaël Salaün
2020-07-07 18:09 ` [PATCH v19 04/12] landlock: Add ptrace restrictions Mickaël Salaün
2020-07-07 18:09 ` [PATCH v19 05/12] LSM: Infrastructure management of the superblock Mickaël Salaün
2020-07-07 18:09 ` [PATCH v19 06/12] fs,security: Add sb_delete hook Mickaël Salaün
2020-07-07 18:09 ` [PATCH v19 07/12] landlock: Support filesystem access-control Mickaël Salaün
2020-07-07 20:11   ` Randy Dunlap
2020-07-08  7:03     ` Mickaël Salaün
2020-07-07 18:09 ` [PATCH v19 08/12] landlock: Add syscall implementation Mickaël Salaün
2020-07-08  8:57   ` Arnd Bergmann
2020-07-08 13:04     ` Mickaël Salaün
2020-07-08 13:49       ` Arnd Bergmann
2020-07-08 17:50         ` Mickaël Salaün
2020-07-09 17:26           ` Arnd Bergmann
2020-07-09 17:47             ` Christian Brauner [this message]
2020-07-10 12:57               ` Mickaël Salaün
2020-07-07 18:09 ` [PATCH v19 09/12] arch: Wire up landlock() syscall Mickaël Salaün
2020-07-08  7:22   ` Arnd Bergmann
2020-07-08  7:31     ` Mickaël Salaün
2020-07-08  7:47       ` Arnd Bergmann
2020-07-08  8:23         ` Mickaël Salaün
2020-07-08  8:58           ` Arnd Bergmann
2020-07-07 18:09 ` [PATCH v19 10/12] selftests/landlock: Add initial tests Mickaël Salaün
2020-07-07 18:09 ` [PATCH v19 11/12] samples/landlock: Add a sandbox manager example Mickaël Salaün
2020-07-07 18:09 ` [PATCH v19 12/12] landlock: Add user and kernel documentation Mickaël Salaün
