util-linux.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFD] A mount api that notices previous mounts
@ 2019-01-29 21:44 Eric W. Biederman
  2019-01-29 23:01 ` Casey Schaufler
                   ` (3 more replies)
  0 siblings, 4 replies; 14+ messages in thread
From: Eric W. Biederman @ 2019-01-29 21:44 UTC (permalink / raw)
  To: linux-api
  Cc: linux-fsdevel, linux-kernel, Al Viro, David Howells,
	Miklos Szeredi, Linus Torvalds, Karel Zak, util-linux,
	Andy Lutomirski

[-- Attachment #1: Type: text/plain, Size: 3563 bytes --]


All,

With the existing mount API it is possible to mount a filesystem
like:

mount -t ext4 /dev/sda1 -o user_xattr /some/path
mount -t ext4 /dev/sda1 -o nouser_xattr /some/other/path

And have both mount commands succeed and have two mounts of the same
filesystem.  If the mounter is not attentive or the first mount is added
earlier it may not be immediately noticed that a mount option that is
needed for the correct operation or the security of the system is lost.

We have seen this failure mode with both devpts and proc.  So it is not
theoretical, and it has resulted in CVEs.

In some cases the existing mount API (such as a conflict between ro and
rw) handles this by returning -EBUSY.  So we may be able to correct this
in the existing mount API.  But it is always very tricky to to get
adequate testing for a change like that to avoid regressions, so I am
proposing we change this in the new mount api.

This has been brought up before and I have been told it is technically
infeasible to make this work.  To counter that I have sat down and
implemented it.

The basic idea is:
 - get a handle to a filesystem
   (passing enough options to uniquely identify the super block).
   Also capture enough state in the file handle to let you know if
   the file system has it's mount options changed between system calls.
   (essentially this is just the fs code that calls sget)

 - If the super block has not been configured allow setting the file
   systems options.

 - If the super block has already been configured require reading the
   file systems mount options before setting/updating the file systems
   mount options.

To complement that I have functionality that:
 - Allows reading a file systems current mount options.
 - Allows reading the mount options that are needed to get a handle to
   the filesystem.  For most filesystems it is just the block device
   name.  For nfs is is essentially all mount options.  For btrfs
   it is the block device name, and the "devices=" mount option for
   secondary block device names.

Please find below a tree where all of this is implemented and working.
Not all file systems have been converted but the most of the unconverted
ones are just a couple minutes work as I have converted all of the file
system mount helper routines.

Also please find below an example mount program that takes the same set
of mount options as mount(8) today and mounts filesystems with the
proposed new mount api.
 - Without having any filesystem mount knowledge it sucessfully figures

out which system calls different mount options needs to be applied
   to.

- Without having any filesystem specific knowledge in most cases it
   can detect if a mount option you specify is already specified to an
   existing mount or not.  For duplicates it can detect it ignores them.
   For the other cases it fails the mount as it thinks the mount options
   are different.

 - Which demonstrates it safe to put the detection and remediation of
   multiple mount commands resolving to the same filesystem in user
   space.

I really don't care whose code gets used as long as it works.  I do very
much care that we don't add a new mount api that has the confusion flaws
of the existing mount api.

Along the way I have also detected a lot of room for improvement on the
mount path for filesystems.  Those cleanup patches are in my tree below
and will be extracting them and sending them along shortly.

Comments?

git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace.git no-losing-mount-options-proof-of-concept


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: example_mount.c --]
[-- Type: text/x-csrc, Size: 9637 bytes --]

/* gcc -Wall -O2 -g example_mount.c -o example_mount */
#define _ATFILE_SOURCE
#include <sys/syscall.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/syscall.h>

#ifdef __x86_64__
# ifndef __NR_open_tree
#  define __NR_open_tree 335
# endif
# ifndef __NR_move_mount
#  define __NR_move_mount 336
# endif
# ifndef __NR_fsopen
#  define  __NR_fsopen 337
# endif
# ifndef __NR_fsspecifiers
#  define  __NR_fsspecifiers 338
# endif
# ifndef __NR_fsinstance
#  define  __NR_fsinstance 339
# endif
# ifndef __NR_fspick
#  define  __NR_fspick 340
# endif
# ifndef __NR_fsset
#  define  __NR_fsset 341
# endif
# ifndef __NR_fsmount
#  define  __NR_fsmount 342
# endif
# ifndef __NR_fsoptions
#  define  __NR_fsoptions 343
# endif
# ifndef __NR_fsname
#  define  __NR_fsname 344
# endif
# ifndef __NR_fstype
#  define  __NR_fstype 345
# endif
#endif /* x86_64 */

#ifndef MOVE_MOUNT_F_SYMLINKS
# define MOVE_MOUNT_F_SYMLINKS		0x00000001 /* Follow symlinks on from path */
#endif
#ifndef MOVE_MOUNT_F_AUTOMOUNTS
# define MOVE_MOUNT_F_AUTOMOUNTS	0x00000002 /* Follow automounts on from path */
#endif
#ifndef MOVE_MOUNT_F_EMPTY_PATH
# define MOVE_MOUNT_F_EMPTY_PATH	0x00000004 /* Empty from path permitted */
#endif
#ifndef MOVE_MOUNT_T_SYMLINKS
# define MOVE_MOUNT_T_SYMLINKS		0x00000010 /* Follow symlinks on to path */
#endif
#ifndef MOVE_MOUNT_T_AUTOMOUNTS
# define MOVE_MOUNT_T_AUTOMOUNTS	0x00000020 /* Follow automounts on to path */
#endif
#ifndef MOVE_MOUNT_T_EMPTY_PATH
# define MOVE_MOUNT_T_EMPTY_PATH	0x00000040 /* Empty to path permitted */
#endif

int open_tree(int dfd, const char *path, unsigned int flags)
{
	return syscall(__NR_open_tree, dfd, path, flags);
}

int move_mount(int from_dfd, const char *from_path,
	       int to_dfd, const char *to_path, unsigned int flags)
{
	return syscall(__NR_move_mount, from_dfd, from_path, to_dfd, to_path, flags);
}

int fsopen(const char *type)
{
	return syscall(__NR_fsopen, type);
}


int fsspecifiers(int driverfd, char *specifiers, size_t specifiers_len)
{
	return syscall(__NR_fsspecifiers, driverfd, specifiers, specifiers_len);
}


int fsinstance(int driverfd, const char *name, const char *specifiers)
{
	return syscall(__NR_fsinstance, driverfd, name, specifiers);
}


int fspick(int dfd, const char *path, unsigned int flags)
{
	return syscall(__NR_fspick, dfd, path, flags);
}

int fsset(int instancefd, const char *options)
{
	return syscall(__NR_fsset, instancefd, options);
}

int fsmount(int instancefd, const char *options)
{
	return syscall(__NR_fsmount, instancefd, options);
}

int fsoptions(int instancefd, char *options, size_t options_len)
{
	return syscall(__NR_fsoptions, instancefd, options, options_len);
}

int fsname(int instancefd, char *name, size_t name_len)
{
	return syscall(__NR_fsname, instancefd, name, name_len);
}

int fstype(int instancefd, char *type, size_t type_len)
{
	return syscall(__NR_fstype, instancefd, type, type_len);
}


static const char *mount_vec[] = {
	"ro",
	"rw",
	"nosuid",
	"nodev",
	"dev",
	"noexec",
	"nodiratime",
	"diratime",
	"noatime",
	"relatime",
	"strictatime",
	NULL
};

static const char *joint_vec[] = {
	"ro",
	"rw",
	NULL
};

static char *parse_mnt_opt(char *opt, char *end)
{
	char *p, *equal = NULL;
	int open_quote;

	open_quote = 0;
	for (p = opt; (p < end) && *p; p++) {
		if (equal && (*p == '"'))
			open_quote ^= 1;
		if (open_quote)
			continue;
		if (!equal && (*p == '='))
			equal = p;
		else if (*p == ',') {
			end = p;
			break;
		}
	}
	/* Unbalaned quotes? */
	if (open_quote) {
		errno = -EINVAL;
		return NULL;
	}

	return end;
}

int split_options(char *options, const char ***optvp)
{
	const char **vec = NULL;
	size_t count = 0, index = 0;
	char *opt, *end, *comma;

	opt = options;
	end = opt + strlen(opt);
	for (comma = opt; comma != end; opt = comma + 1) {
		comma = parse_mnt_opt(opt, end);
		if (!comma)
			return -1;

		if (opt == comma)
			continue;

		count++;
	}

	vec = calloc(count + 1, sizeof(char *));
	if (!vec)
		return -1;

	opt = options;
	for (comma = opt; comma != end; opt = comma + 1) {
		comma = parse_mnt_opt(opt, end);
		if (!comma)
			abort();

		if (opt == comma)
			continue;

		*comma = '\0';
		vec[index++] = opt;
	}
	vec[index] = NULL;
	*optvp = vec;
	return 0;
}

static char *join_options(const char *optv[])
{
	char *flat, *flat_opt;
	const char **opt;
	size_t bytes = 0;

	for (opt = optv; *opt; opt++) {
		size_t len = strlen(*opt);
		/* An extra byte for the comma */
		bytes += len + 1;
	}

	flat_opt = flat = malloc(bytes + 1);
	if (!flat) {
		fprintf(stderr, "malloc failed: %s\n", strerror(errno));
		exit(EXIT_FAILURE);
	}

	for (opt = optv; *opt; opt++) {
		size_t len = strlen(*opt);
		if (flat_opt != flat) {
			*flat_opt = ',';
			flat_opt += 1;
		}
		memcpy(flat_opt, *opt, len);
		flat_opt += len;
	}
	*flat_opt = '\0';
	return flat;
}


void build_option_strings(int dfd,
	const char **specifiers, const char **ioptions, const char **moptions,
	const char *input_options)
{
	char accepted_specifiers[1024*1024];
	const char **ovec, **asvec, **svec, **ivec, **mvec;
	const char **opt, **sopt, **iopt, **mopt;
	char *options;
	size_t count;
	int ret;

	ret = fsspecifiers(dfd, accepted_specifiers, sizeof(accepted_specifiers));
	if (ret < 0) {
		fprintf(stderr, "fsspecifiers failed: %s\n", strerror(errno));
		exit(EXIT_FAILURE);
	}

	options = strdup(input_options);
	if (!options) {
		fprintf(stderr, "strdup failed: %s\n", strerror(errno));
		exit(EXIT_FAILURE);
	}
	if (split_options(options, &ovec) != 0) {
		fprintf(stderr, "split_options failed: %s\n", strerror(errno));
		exit(EXIT_FAILURE);
	}
	if (split_options(accepted_specifiers, &asvec) != 0) {
		fprintf(stderr, "split_options failed: %s\n", strerror(errno));
		exit(EXIT_FAILURE);
	}

	for (count = 0, opt = ovec; *opt; opt++) {
		count++;
	}

	svec = calloc(count + 1, sizeof(char *));
	if (!svec) {
		fprintf(stderr, "calloc failed: %s\n", strerror(errno));
		exit(EXIT_FAILURE);
	}
	ivec = calloc(count + 1, sizeof(char *));
	if (!ivec) {
		fprintf(stderr, "calloc failed: %s\n", strerror(errno));
		exit(EXIT_FAILURE);
	}
	mvec = calloc(count + 1, sizeof(char *));
	if (!mvec) {
		fprintf(stderr, "calloc failed: %s\n", strerror(errno));
		exit(EXIT_FAILURE);
	}

	sopt = svec;
	iopt = ivec;
	mopt = mvec;
	for (opt = ovec; *opt; opt++) {
		const char **topt;
		const char *sep;
		int len;

		sep = strchr(*opt, '=');
		len = sep ? sep - *opt : strlen(*opt);

		/* Is the option both a mount an a normal option? */
		for (topt = joint_vec; *topt; topt++) {
			if (strcmp(*opt, *topt) == 0) {
				*mopt++ = *opt;
				goto fsoption;
			}
		}
		/* Is the option just a mount option? */
		for (topt = mount_vec; *topt; topt++) {
			if (strcmp(*opt, *topt) == 0) {
				*mopt++ = *opt;
				goto placed;
			}
		}
	fsoption:
		for (topt = asvec; *topt; topt++) {
			char tail;
			if (strncmp(*opt, *topt, len) != 0)
				continue;
			tail = (*opt)[len];
			if (tail != (sep ? '=' : '\0'))
				continue;
			*sopt++ = *opt;
			goto placed;
		}
		*iopt++ = *opt;
	placed:
		;
	}
	*sopt = NULL;
	*iopt = NULL;
	*mopt = NULL;
	*specifiers = join_options(svec);
	*ioptions = join_options(ivec);
	*moptions = join_options(mvec);
	free(svec);
	free(ivec);
	free(mvec);
	return;
}

int main(int argc, char **argv)
{
	const char *type, *name, *specifiers, *options, *moptions, *dir;
	const char *input_options;
	int dfd, ifd, mfd, ret;

	if (argc != 5) {
		fprintf(stderr, "usage: new_mount <type> <name> <options> <dir>\n");
		return EXIT_FAILURE;
	}

	type = argv[1];
	name = argv[2];
	input_options = argv[3];
	dir = argv[4];

	dfd = fsopen(type);
	if (dfd < 0) {
		fprintf(stderr, "fsopen failed: %s\n", strerror(errno));
		return EXIT_FAILURE;
	}

	build_option_strings(dfd, &specifiers, &options, &moptions, input_options);

	ifd = fsinstance(dfd, name, specifiers);
	if (ifd < 0) {
		fprintf(stderr, "fsinstance failed: %s\n", strerror(errno));
		return EXIT_FAILURE;
	}

	errno = 0;
	ret = fsset(ifd, options);
	if (ret && (errno != ESTALE)) {
		fprintf(stderr, "fsset.1 failed: %s\n", strerror(errno));
		return EXIT_FAILURE;
	}
	if (errno == ESTALE) {
		char existing_options[1024*1024];
		const char **evec, **nvec, **eopt, **nopt;
		char *new_options;
		ret = fsoptions(ifd, existing_options, sizeof(existing_options));
		if (ret < 0) {
			fprintf(stderr, "fsoptions failed: %s\n", strerror(errno));
			return EXIT_FAILURE;
		}

		new_options = strdup(options);
		if (!new_options) {
			fprintf(stderr, "strdup failed: %s\n", strerror(errno));
			return EXIT_FAILURE;
		}
		if (split_options(existing_options, &evec) != 0) {
			fprintf(stderr, "split_options failed: %s\n", strerror(errno));
			return EXIT_FAILURE;
		}
		if (split_options(new_options, &nvec) != 0) {
			fprintf(stderr, "split_options failed: %s\n", strerror(errno));
			return EXIT_FAILURE;
		}
		for (nopt = nvec; *nopt; nopt++) {
			for (eopt = evec; *eopt; eopt++) {
				if (strcmp(*eopt, *nopt) == 0)
					goto found;
			}
			fprintf(stderr, "Setting '%s' would change the existing file system options\n",
				*nopt);
			return EXIT_FAILURE;
		found:
			;
		}
		ret = fsset(ifd, options);
		if (ret) {
			fprintf(stderr, "fsset.2 failed: %s\n", strerror(errno));
			return EXIT_FAILURE;
		}

	}

	mfd = fsmount(ifd, moptions);
	if (mfd < 0) {
		fprintf(stderr, "fsmount failed: %s\n", strerror(errno));
		return EXIT_FAILURE;
	}
	close(ifd);

	ret = move_mount(mfd, "", AT_FDCWD, dir, MOVE_MOUNT_F_EMPTY_PATH);
	if (ret) {
		fprintf(stderr, "move_mount failed: %s\n", strerror(errno));
		return EXIT_FAILURE;
	}

	return 0;
}

[-- Attachment #3: Type: text/plain, Size: 8 bytes --]



Eric


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFD] A mount api that notices previous mounts
  2019-01-29 21:44 [RFD] A mount api that notices previous mounts Eric W. Biederman
@ 2019-01-29 23:01 ` Casey Schaufler
  2019-01-30  1:15   ` Eric W. Biederman
  2019-01-30 12:06 ` Karel Zak
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 14+ messages in thread
From: Casey Schaufler @ 2019-01-29 23:01 UTC (permalink / raw)
  To: Eric W. Biederman, linux-api
  Cc: linux-fsdevel, linux-kernel, Al Viro, David Howells,
	Miklos Szeredi, Linus Torvalds, Karel Zak, util-linux,
	Andy Lutomirski, LSM

On 1/29/2019 1:44 PM, Eric W. Biederman wrote:
> All,
>
> With the existing mount API it is possible to mount a filesystem
> like:
>
> mount -t ext4 /dev/sda1 -o user_xattr /some/path
> mount -t ext4 /dev/sda1 -o nouser_xattr /some/other/path
>
> And have both mount commands succeed and have two mounts of the same
> filesystem.  If the mounter is not attentive or the first mount is added
> earlier it may not be immediately noticed that a mount option that is
> needed for the correct operation or the security of the system is lost.
>
> We have seen this failure mode with both devpts and proc.  So it is not
> theoretical, and it has resulted in CVEs.
>
> In some cases the existing mount API (such as a conflict between ro and
> rw) handles this by returning -EBUSY.  So we may be able to correct this
> in the existing mount API.  But it is always very tricky to to get
> adequate testing for a change like that to avoid regressions, so I am
> proposing we change this in the new mount api.
>
> This has been brought up before and I have been told it is technically
> infeasible to make this work.  To counter that I have sat down and
> implemented it.
>
> The basic idea is:
>  - get a handle to a filesystem
>    (passing enough options to uniquely identify the super block).
>    Also capture enough state in the file handle to let you know if
>    the file system has it's mount options changed between system calls.
>    (essentially this is just the fs code that calls sget)
>
>  - If the super block has not been configured allow setting the file
>    systems options.
>
>  - If the super block has already been configured require reading the
>    file systems mount options before setting/updating the file systems
>    mount options.
>
> To complement that I have functionality that:
>  - Allows reading a file systems current mount options.
>  - Allows reading the mount options that are needed to get a handle to
>    the filesystem.  For most filesystems it is just the block device
>    name.  For nfs is is essentially all mount options.  For btrfs
>    it is the block device name, and the "devices=" mount option for
>    secondary block device names.

Are you taking the LSM specific mount options into account?

>
> Please find below a tree where all of this is implemented and working.
> Not all file systems have been converted but the most of the unconverted
> ones are just a couple minutes work as I have converted all of the file
> system mount helper routines.
>
> Also please find below an example mount program that takes the same set
> of mount options as mount(8) today and mounts filesystems with the
> proposed new mount api.
>  - Without having any filesystem mount knowledge it sucessfully figures
>
> out which system calls different mount options needs to be applied
>    to.
>
> - Without having any filesystem specific knowledge in most cases it
>    can detect if a mount option you specify is already specified to an
>    existing mount or not.  For duplicates it can detect it ignores them.
>    For the other cases it fails the mount as it thinks the mount options
>    are different.
>
>  - Which demonstrates it safe to put the detection and remediation of
>    multiple mount commands resolving to the same filesystem in user
>    space.
>
> I really don't care whose code gets used as long as it works.  I do very
> much care that we don't add a new mount api that has the confusion flaws
> of the existing mount api.
>
> Along the way I have also detected a lot of room for improvement on the
> mount path for filesystems.  Those cleanup patches are in my tree below
> and will be extracting them and sending them along shortly.
>
> Comments?
>
> git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace.git no-losing-mount-options-proof-of-concept
>
>
>
> Eric
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFD] A mount api that notices previous mounts
  2019-01-29 23:01 ` Casey Schaufler
@ 2019-01-30  1:15   ` Eric W. Biederman
  2019-01-30  1:23     ` Eric W. Biederman
  0 siblings, 1 reply; 14+ messages in thread
From: Eric W. Biederman @ 2019-01-30  1:15 UTC (permalink / raw)
  To: Casey Schaufler
  Cc: linux-api, linux-fsdevel, linux-kernel, Al Viro, David Howells,
	Miklos Szeredi, Linus Torvalds, Karel Zak, util-linux,
	Andy Lutomirski, LSM

Casey Schaufler <casey@schaufler-ca.com> writes:

> On 1/29/2019 1:44 PM, Eric W. Biederman wrote:
>> All,
>>
>> With the existing mount API it is possible to mount a filesystem
>> like:
>>
>> mount -t ext4 /dev/sda1 -o user_xattr /some/path
>> mount -t ext4 /dev/sda1 -o nouser_xattr /some/other/path
>>
>> And have both mount commands succeed and have two mounts of the same
>> filesystem.  If the mounter is not attentive or the first mount is added
>> earlier it may not be immediately noticed that a mount option that is
>> needed for the correct operation or the security of the system is lost.
>>
>> We have seen this failure mode with both devpts and proc.  So it is not
>> theoretical, and it has resulted in CVEs.
>>
>> In some cases the existing mount API (such as a conflict between ro and
>> rw) handles this by returning -EBUSY.  So we may be able to correct this
>> in the existing mount API.  But it is always very tricky to to get
>> adequate testing for a change like that to avoid regressions, so I am
>> proposing we change this in the new mount api.
>>
>> This has been brought up before and I have been told it is technically
>> infeasible to make this work.  To counter that I have sat down and
>> implemented it.
>>
>> The basic idea is:
>>  - get a handle to a filesystem
>>    (passing enough options to uniquely identify the super block).
>>    Also capture enough state in the file handle to let you know if
>>    the file system has it's mount options changed between system calls.
>>    (essentially this is just the fs code that calls sget)
>>
>>  - If the super block has not been configured allow setting the file
>>    systems options.
>>
>>  - If the super block has already been configured require reading the
>>    file systems mount options before setting/updating the file systems
>>    mount options.
>>
>> To complement that I have functionality that:
>>  - Allows reading a file systems current mount options.
>>  - Allows reading the mount options that are needed to get a handle to
>>    the filesystem.  For most filesystems it is just the block device
>>    name.  For nfs is is essentially all mount options.  For btrfs
>>    it is the block device name, and the "devices=" mount option for
>>    secondary block device names.
>
> Are you taking the LSM specific mount options into account?

In the design yes, and I allow setting them.  It appears in the code
to retrieve the mount options I forgot to call security_sb_show_options.

For finding the super block that you are going to mount the LSM mount
options are not relevant.  Even nfs will not want to set those early as
they do not help determine the nfs super block.  So the only place where
there is anything interesting in my api is in reading back the security
options so they can be compared to the options the mounter is setting.

I will add the missing call to security_sb_show_options which is enough
to fix selinux.  Unfortunately smack does not currently implement
.sb_show_options.  Not implementing smack_sb_show_options means
/proc/mounts fails to match /etc/mtab which is a bug and it is likely
a real workd bug for the people who use smack and don't want to depend
on /etc/mtab, or are transitioning away from it.

Casey do you want to implement smack_sb_show_options or should I put it
on my todo list?

Eric

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFD] A mount api that notices previous mounts
  2019-01-30  1:15   ` Eric W. Biederman
@ 2019-01-30  1:23     ` Eric W. Biederman
  2019-01-30 12:47       ` Eric W. Biederman
  0 siblings, 1 reply; 14+ messages in thread
From: Eric W. Biederman @ 2019-01-30  1:23 UTC (permalink / raw)
  To: Casey Schaufler
  Cc: linux-api, linux-fsdevel, linux-kernel, Al Viro, David Howells,
	Miklos Szeredi, Linus Torvalds, Karel Zak, util-linux,
	Andy Lutomirski, LSM

ebiederm@xmission.com (Eric W. Biederman) writes:

> Casey Schaufler <casey@schaufler-ca.com> writes:
>> Are you taking the LSM specific mount options into account?
>
> In the design yes, and I allow setting them.  It appears in the code
> to retrieve the mount options I forgot to call security_sb_show_options.
>
> For finding the super block that you are going to mount the LSM mount
> options are not relevant.  Even nfs will not want to set those early as
> they do not help determine the nfs super block.  So the only place where
> there is anything interesting in my api is in reading back the security
> options so they can be compared to the options the mounter is setting.
>
> I will add the missing call to security_sb_show_options which is enough
> to fix selinux.  Unfortunately smack does not currently implement
> .sb_show_options.  Not implementing smack_sb_show_options means
> /proc/mounts fails to match /etc/mtab which is a bug and it is likely
> a real workd bug for the people who use smack and don't want to depend
> on /etc/mtab, or are transitioning away from it.
>
> Casey do you want to implement smack_sb_show_options or should I put it
> on my todo list?

Oh.  I should add that I am always parsing the LSM mount options out so
that there is not a chance of the individual filesystems implementing
comflicting options even when there are no LSMs active.  Without that I
am afraid we run the risk of having LSM mount otions in conflict with
ordinary filesystems options at some point and by the time we discover
it it would start introducing filesystem regressions.

That does help with stack though as there is no fundamental reason only
one LSM could process mount options.

Eric



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFD] A mount api that notices previous mounts
  2019-01-29 21:44 [RFD] A mount api that notices previous mounts Eric W. Biederman
  2019-01-29 23:01 ` Casey Schaufler
@ 2019-01-30 12:06 ` Karel Zak
  2019-01-30 13:45   ` Eric W. Biederman
  2019-01-30 12:50 ` David Howells
  2019-01-30 13:01 ` David Howells
  3 siblings, 1 reply; 14+ messages in thread
From: Karel Zak @ 2019-01-30 12:06 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: linux-api, linux-fsdevel, linux-kernel, Al Viro, David Howells,
	Miklos Szeredi, Linus Torvalds, util-linux, Andy Lutomirski

On Tue, Jan 29, 2019 at 03:44:22PM -0600, Eric W. Biederman wrote:
> so I am proposing we change this in the new mount api.

Well, this forces me to ask what the new API is? :-)

It seems that David uses fsconfig() and fsinfo() to set and get
mount options, and your patch introduces fsset() and fsoptions().

IMHO differentiate between FS driver and FS instance is a good idea it
makes things more extendable. The sequence number in the instance is a
good example.

But for me David's fsinfo() seems better that fsoptions() and
fsspecifier(). I'm not sure about "all mount options as one string"
From your example is pretty obvious how much energy is necessary to 
split and join the strings.

It seems more elegant is to ask for Nth option as expected by fsinfo().
It also seems that fsinfo() is able to replace fsname() and fstype().

It would be better to extend David's fsinfo() to work with FS instance
and to return specifiers. And use fsconfig() rather than fsset().

    Karel

-- 
 Karel Zak  <kzak@redhat.com>
 http://karelzak.blogspot.com

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFD] A mount api that notices previous mounts
  2019-01-30  1:23     ` Eric W. Biederman
@ 2019-01-30 12:47       ` Eric W. Biederman
  2019-01-30 16:19         ` Casey Schaufler
  0 siblings, 1 reply; 14+ messages in thread
From: Eric W. Biederman @ 2019-01-30 12:47 UTC (permalink / raw)
  To: Casey Schaufler
  Cc: linux-api, linux-fsdevel, linux-kernel, Al Viro, David Howells,
	Miklos Szeredi, Linus Torvalds, Karel Zak, util-linux,
	Andy Lutomirski, LSM

ebiederm@xmission.com (Eric W. Biederman) writes:

> ebiederm@xmission.com (Eric W. Biederman) writes:
>
>> Casey Schaufler <casey@schaufler-ca.com> writes:
>>> Are you taking the LSM specific mount options into account?
>>
>> In the design yes, and I allow setting them.  It appears in the code
>> to retrieve the mount options I forgot to call security_sb_show_options.
>>
>> For finding the super block that you are going to mount the LSM mount
>> options are not relevant.  Even nfs will not want to set those early as
>> they do not help determine the nfs super block.  So the only place where
>> there is anything interesting in my api is in reading back the security
>> options so they can be compared to the options the mounter is setting.
>>
>> I will add the missing call to security_sb_show_options which is enough
>> to fix selinux.  Unfortunately smack does not currently implement
>> .sb_show_options.  Not implementing smack_sb_show_options means
>> /proc/mounts fails to match /etc/mtab which is a bug and it is likely
>> a real workd bug for the people who use smack and don't want to depend
>> on /etc/mtab, or are transitioning away from it.
>>
>> Casey do you want to implement smack_sb_show_options or should I put it
>> on my todo list?
>
> Oh.  I should add that I am always parsing the LSM mount options out so
> that there is not a chance of the individual filesystems implementing
> comflicting options even when there are no LSMs active.  Without that I
> am afraid we run the risk of having LSM mount otions in conflict with
> ordinary filesystems options at some point and by the time we discover
> it it would start introducing filesystem regressions.
>
> That does help with stack though as there is no fundamental reason only
> one LSM could process mount options.

Sigh.  I just realized that there is a smack variant of the bug I am
working to fix.

smack on remount does not fail if you change the smack mount options.
It just silently ignores the smack mount options.  Which is exactly the
same poor interaction with userspace that has surprised user space
and caused CVEs.

How much do you think the smack users will care if you start verifying
that if smack options are present in remount that they are unchanged
from mount?

I suspect the smack userbase is small enough, and the corner case is
crazy enough we can fix this poor communication by smack.  Otherwise it
looks like there needs to be a new security hook so old and new remounts
can be distinguished by the LSMs, and smack can be fixed in the new
version.

Eric


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFD] A mount api that notices previous mounts
  2019-01-29 21:44 [RFD] A mount api that notices previous mounts Eric W. Biederman
  2019-01-29 23:01 ` Casey Schaufler
  2019-01-30 12:06 ` Karel Zak
@ 2019-01-30 12:50 ` David Howells
  2019-01-30 13:24   ` Eric W. Biederman
  2019-01-30 13:01 ` David Howells
  3 siblings, 1 reply; 14+ messages in thread
From: David Howells @ 2019-01-30 12:50 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: dhowells, linux-api, linux-fsdevel, linux-kernel, Al Viro,
	Miklos Szeredi, Linus Torvalds, Karel Zak, util-linux,
	Andy Lutomirski

You need to rebase on linus/master.  A bunch of your patches are obsoleted by
Al's security changes there.

David

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFD] A mount api that notices previous mounts
  2019-01-29 21:44 [RFD] A mount api that notices previous mounts Eric W. Biederman
                   ` (2 preceding siblings ...)
  2019-01-30 12:50 ` David Howells
@ 2019-01-30 13:01 ` David Howells
  2019-01-30 13:35   ` Eric W. Biederman
  2019-01-30 17:43   ` Karel Zak
  3 siblings, 2 replies; 14+ messages in thread
From: David Howells @ 2019-01-30 13:01 UTC (permalink / raw)
  To: Karel Zak
  Cc: dhowells, Eric W. Biederman, linux-api, linux-fsdevel,
	linux-kernel, Al Viro, Miklos Szeredi, Linus Torvalds,
	util-linux, Andy Lutomirski

Karel Zak <kzak@redhat.com> wrote:

> It seems more elegant is to ask for Nth option as expected by fsinfo().

More elegant yes, but there's an issue with atomiticity[*].  I'm in the
process of switching to something that returns you a single buffer with all
the options in, but each key and each value is preceded by a length count.

The reasons for not using separator characters are:

 (1) There's no separator char that cannot validly occur within an option[**].

 (2) Makes it possible to return binary values if we need to.

David

[*] Atomic with respect to remount calls, that is.

[**] Oh, and look at cifs where you can *change* the separator char during
     option parsing ("sep=<char>").

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFD] A mount api that notices previous mounts
  2019-01-30 12:50 ` David Howells
@ 2019-01-30 13:24   ` Eric W. Biederman
  0 siblings, 0 replies; 14+ messages in thread
From: Eric W. Biederman @ 2019-01-30 13:24 UTC (permalink / raw)
  To: David Howells
  Cc: linux-api, linux-fsdevel, linux-kernel, Al Viro, Miklos Szeredi,
	Linus Torvalds, Karel Zak, util-linux, Andy Lutomirski

David Howells <dhowells@redhat.com> writes:

> You need to rebase on linus/master.  A bunch of your patches are obsoleted by
> Al's security changes there.

Before anything is merged definitely.

Al dealt with mount options from the LSMs in a slightly different way
than I did.  At a practical level Als version of the changes to the LSMs
and mine are you say po-tae-toe I say po-tah-toe differences so I don't
see that influencing semantics up at the api level.

For purposes of disucssing an API (not of merging one) I chose to start
with code I had a reasonable amount of testing against, so that other
people could play with it without expecting trouble.

Eric


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFD] A mount api that notices previous mounts
  2019-01-30 13:01 ` David Howells
@ 2019-01-30 13:35   ` Eric W. Biederman
  2019-01-30 18:00     ` Karel Zak
  2019-01-30 17:43   ` Karel Zak
  1 sibling, 1 reply; 14+ messages in thread
From: Eric W. Biederman @ 2019-01-30 13:35 UTC (permalink / raw)
  To: David Howells
  Cc: Karel Zak, linux-api, linux-fsdevel, linux-kernel, Al Viro,
	Miklos Szeredi, Linus Torvalds, util-linux, Andy Lutomirski

David Howells <dhowells@redhat.com> writes:

> Karel Zak <kzak@redhat.com> wrote:
>
>> It seems more elegant is to ask for Nth option as expected by fsinfo().
>
> More elegant yes, but there's an issue with atomiticity[*].  I'm in the
> process of switching to something that returns you a single buffer with all
> the options in, but each key and each value is preceded by a length count.
>
> The reasons for not using separator characters are:
>
>  (1) There's no separator char that cannot validly occur within an option[**].

*Blink*  I had missed the cifs issue.  So yes we certainly need
a better way to encode things in the buffer.  I just used a single
string as an easy way to place everything in a buffer.

>  (2) Makes it possible to return binary values if we need to.

I don't totally disagree with this.  But I will point out that
except for coda passing a file descriptor there are no filesystems
that currently take or need binary options.

I suspect that as long as userspace supports /etc/fstab and we in turn
support /proc/mounts there is going to be a lot of pressure to keep
the majority of options so they  can be encoded in a string separated by
commas.

> David
>
> [*] Atomic with respect to remount calls, that is.

There are also mount options that depend on each other and whose order
matters with respect to other mount options extN's ("sb=<NNNN>") for
example.

> [**] Oh, and look at cifs where you can *change* the separator char during
>      option parsing ("sep=<char>").

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFD] A mount api that notices previous mounts
  2019-01-30 12:06 ` Karel Zak
@ 2019-01-30 13:45   ` Eric W. Biederman
  0 siblings, 0 replies; 14+ messages in thread
From: Eric W. Biederman @ 2019-01-30 13:45 UTC (permalink / raw)
  To: Karel Zak
  Cc: linux-api, linux-fsdevel, linux-kernel, Al Viro, David Howells,
	Miklos Szeredi, Linus Torvalds, util-linux, Andy Lutomirski

Karel Zak <kzak@redhat.com> writes:

> On Tue, Jan 29, 2019 at 03:44:22PM -0600, Eric W. Biederman wrote:
>> so I am proposing we change this in the new mount api.
>
> Well, this forces me to ask what the new API is? :-)
>
> It seems that David uses fsconfig() and fsinfo() to set and get
> mount options, and your patch introduces fsset() and fsoptions().
>
> IMHO differentiate between FS driver and FS instance is a good idea it
> makes things more extendable. The sequence number in the instance is a
> good example.
>
> But for me David's fsinfo() seems better that fsoptions() and
> fsspecifier(). I'm not sure about "all mount options as one string"
> From your example is pretty obvious how much energy is necessary to 
> split and join the strings.
>
> It seems more elegant is to ask for Nth option as expected by fsinfo().
> It also seems that fsinfo() is able to replace fsname() and fstype().
>
> It would be better to extend David's fsinfo() to work with FS instance
> and to return specifiers. And use fsconfig() rather than fsset().

As David has pointed out with cifs having a sep= option we need a better
story of parsing the options in the kernel.

What my branch does is demonstrate there is at least one way we can
avoid mount options being silently different from what userspace
expects.

Which means my branch is fine for looking at semantics and possible
system calls, but not much else.

I actually used multiple system calls just so I could avoid dealing
with multi-plexor systems calls.

Eric



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFD] A mount api that notices previous mounts
  2019-01-30 12:47       ` Eric W. Biederman
@ 2019-01-30 16:19         ` Casey Schaufler
  0 siblings, 0 replies; 14+ messages in thread
From: Casey Schaufler @ 2019-01-30 16:19 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: linux-api, linux-fsdevel, linux-kernel, Al Viro, David Howells,
	Miklos Szeredi, Linus Torvalds, Karel Zak, util-linux,
	Andy Lutomirski, LSM, SMACK-discuss

On 1/30/2019 4:47 AM, Eric W. Biederman wrote:
> ebiederm@xmission.com (Eric W. Biederman) writes:
>
>> ebiederm@xmission.com (Eric W. Biederman) writes:
>>
>>> Casey Schaufler <casey@schaufler-ca.com> writes:
>>>> Are you taking the LSM specific mount options into account?
>>> In the design yes, and I allow setting them.  It appears in the code
>>> to retrieve the mount options I forgot to call security_sb_show_options.
>>>
>>> For finding the super block that you are going to mount the LSM mount
>>> options are not relevant.  Even nfs will not want to set those early as
>>> they do not help determine the nfs super block.  So the only place where
>>> there is anything interesting in my api is in reading back the security
>>> options so they can be compared to the options the mounter is setting.
>>>
>>> I will add the missing call to security_sb_show_options which is enough
>>> to fix selinux.  Unfortunately smack does not currently implement
>>> .sb_show_options.  Not implementing smack_sb_show_options means
>>> /proc/mounts fails to match /etc/mtab which is a bug and it is likely
>>> a real workd bug for the people who use smack and don't want to depend
>>> on /etc/mtab, or are transitioning away from it.
>>>
>>> Casey do you want to implement smack_sb_show_options or should I put it
>>> on my todo list?
>> Oh.  I should add that I am always parsing the LSM mount options out so
>> that there is not a chance of the individual filesystems implementing
>> comflicting options even when there are no LSMs active.  Without that I
>> am afraid we run the risk of having LSM mount otions in conflict with
>> ordinary filesystems options at some point and by the time we discover
>> it it would start introducing filesystem regressions.
>>
>> That does help with stack though as there is no fundamental reason only
>> one LSM could process mount options.
> Sigh.  I just realized that there is a smack variant of the bug I am
> working to fix.
>
> smack on remount does not fail if you change the smack mount options.
> It just silently ignores the smack mount options.  Which is exactly the
> same poor interaction with userspace that has surprised user space
> and caused CVEs.
>
> How much do you think the smack users will care if you start verifying
> that if smack options are present in remount that they are unchanged
> from mount?

I've added the smack-discuss list to the conversation.

> I suspect the smack userbase is small enough, and the corner case is
> crazy enough we can fix this poor communication by smack.  Otherwise it
> looks like there needs to be a new security hook so old and new remounts
> can be distinguished by the LSMs, and smack can be fixed in the new
> version.

I fear that it may be worse than that. It's not enough to distinguish
a mount from a remount. On remount you need an LSM specific way to
compare mount options. Smack may decide that it's OK to remount a
filesystem with more restrictive smackfshat values, for example. Or,
it may allow smackfsroot=Pop for one and smackfstransmute=Pop on
the other. I'm not sure about the 2nd case, but you should get the idea.

>
> Eric
>
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFD] A mount api that notices previous mounts
  2019-01-30 13:01 ` David Howells
  2019-01-30 13:35   ` Eric W. Biederman
@ 2019-01-30 17:43   ` Karel Zak
  1 sibling, 0 replies; 14+ messages in thread
From: Karel Zak @ 2019-01-30 17:43 UTC (permalink / raw)
  To: David Howells
  Cc: Eric W. Biederman, linux-api, linux-fsdevel, linux-kernel,
	Al Viro, Miklos Szeredi, Linus Torvalds, util-linux,
	Andy Lutomirski

On Wed, Jan 30, 2019 at 01:01:54PM +0000, David Howells wrote:
> Karel Zak <kzak@redhat.com> wrote:
> 
> > It seems more elegant is to ask for Nth option as expected by fsinfo().
> 
> More elegant yes, but there's an issue with atomiticity[*].  I'm in the
> process of switching to something that returns you a single buffer with all
> the options in, but each key and each value is preceded by a length count.

Sounds good, for me is important to avoid all the split/join
operations with the strings.

> The reasons for not using separator characters are:
> 
>  (1) There's no separator char that cannot validly occur within an option[**].

Yes, it's pretty common for selinux mount options where "," is
used within an option, so mount options string looks like

    'context="system_u:object_r:tmp_t:s0:c127,c456",noexec'

and I have doubts all the parses in userspace are compatible with this
use case...

>  (2) Makes it possible to return binary values if we need to.

Yes.

> [**] Oh, and look at cifs where you can *change* the separator char during
>      option parsing ("sep=<char>").

No comment :-)

    Karel

-- 
 Karel Zak  <kzak@redhat.com>
 http://karelzak.blogspot.com

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFD] A mount api that notices previous mounts
  2019-01-30 13:35   ` Eric W. Biederman
@ 2019-01-30 18:00     ` Karel Zak
  0 siblings, 0 replies; 14+ messages in thread
From: Karel Zak @ 2019-01-30 18:00 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Howells, linux-api, linux-fsdevel, linux-kernel, Al Viro,
	Miklos Szeredi, Linus Torvalds, util-linux, Andy Lutomirski

On Wed, Jan 30, 2019 at 07:35:39AM -0600, Eric W. Biederman wrote:
> I suspect that as long as userspace supports /etc/fstab and we in turn
> support /proc/mounts there is going to be a lot of pressure to keep
> the majority of options so they  can be encoded in a string separated by
> commas.

Well, we're doing many crazy things in userspace ;-) For example we do
not distinguish between VFS flags (MS_*), FS specific mount options
and userspace mount options (loop=) in our config files. 

You already need to parse fstab/command line before you can use the 
strings for mount(2) syscall. It's already not straightforward, see
all the code in libmount...

/proc/mounts is only for backward compatibility, /proc/self/mountinfo
is better way, because it allows to see VFS and FS, etc.

IMHO it would be better to not care about way how we use mount options 
strings now (in userspace) when you think about the new API design.

    Karel

-- 
 Karel Zak  <kzak@redhat.com>
 http://karelzak.blogspot.com

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2019-01-30 18:01 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-01-29 21:44 [RFD] A mount api that notices previous mounts Eric W. Biederman
2019-01-29 23:01 ` Casey Schaufler
2019-01-30  1:15   ` Eric W. Biederman
2019-01-30  1:23     ` Eric W. Biederman
2019-01-30 12:47       ` Eric W. Biederman
2019-01-30 16:19         ` Casey Schaufler
2019-01-30 12:06 ` Karel Zak
2019-01-30 13:45   ` Eric W. Biederman
2019-01-30 12:50 ` David Howells
2019-01-30 13:24   ` Eric W. Biederman
2019-01-30 13:01 ` David Howells
2019-01-30 13:35   ` Eric W. Biederman
2019-01-30 18:00     ` Karel Zak
2019-01-30 17:43   ` Karel Zak

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).