All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v18 0/1] fs: Add VirtualBox guest shared folder (vboxsf) support
@ 2019-11-25 14:08 Hans de Goede
       [not found] ` <20191125140839.4956-2-hdegoede@redhat.com>
  0 siblings, 1 reply; 3+ messages in thread
From: Hans de Goede @ 2019-11-25 14:08 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Hans de Goede, Greg Kroah-Hartman, Andrew Morton, Alexander Viro,
	David Howells, Christoph Hellwig, linux-fsdevel

Hi Linus,

As discussed in relation to the brief sting of vboxsf in drivers/staging,
Al Viro seems to be too busy with other things to have time to take this
patch upstream, therefor I'm submitting it directly to you.

Al Viro did do an initial review around v2 or v3 of the patch, which I
believe I have fully addressed.

v10 and v11 have been reviewed by David Howell and all his comments have
been addressed in v12.

v12 and v13 have been reviewed by Christoph Hellwig and all his comments
have been addressed in v14.

Later versions contain some small changes, but have mostly been
resubmissions attempting to get this upstream.

Christoph responded to my v17 submission with:

"Please just send it to Linus directly.  This is the equivalent of
consumer hardware enablement and it is in a state as clean as it gets
for the rather messed up protocol."

And Christoph has given his Acked-by for this v18 submission.

Regards,

Hans

---

Full changelog:

Changes in v18:
- Move back to fs/vboxsf for direct submission to Linus
- Squash in a few small fixes done during vboxsf's brief stint in staging
- Add handling of short copies to vboxsf_write_end
- Add a comment about our simple_write_begin usage (Suggested by David Howell)

Changes in v17:
- Submit upstream through drivers/staging as this is not getting any
  traction on the fsdevel list
- Add TODO file
- vboxsf_free_inode uses sbi->idr, call rcu_barrier from put_super to make
  sure all delayed rcu free inodes are flushed

Changes in v16:
- Make vboxsf_parse_monolithic() static, reported-by: kbuild test robot

Changes in v15:
- Rebase on top of 5.4-rc1

Changes in v14
- Add a commment explaining possible read cache strategies and which one
  has been chosen
- Change pagecache-invalidation on open (inode_revalidate) to use
  invalidate_inode_pages2() so that mmap-ed pages are dropped from
  the cache too
- Add a comment to vboxsf_file_release explaining why the
  filemap_write_and_wait() call is done there
- Some minor code tweaks

Changes in v13
- Add SPDX tag to Makefile, use foo-y := to set objectfile list
- Drop kerneldoc headers stating the obvious from vfs callbacks,
  to avoid them going stale
- Replace sf_ prefix of functions and data-types with vboxsf_
- Use more normal naming scheme for sbi and private inode data:
    struct vboxsf_sbi *sbi = VBOXSF_SBI(inode->i_sb);
    struct vboxsf_inode *sf_i = VBOXSF_I(inode);
- Refactor directory reading code
- Use goto based unwinding instead of nested if-s in a number of places
- Consolidate dir unlink and rmdir inode_operations into a single function
- Use the page-cache for regular reads/writes too
- Directly set super_operations.free_inode to what used to be our
  vboxsf_i_callback, instead of setting super_operations.destroy_inode
  to a function which just does: call_rcu(&inode->i_rcu, vboxsf_i_callback);
- Use spinlock_irqsafe for ino_idr_lock
  vboxsf_free_inode may be called from a RCU callback, and thus from
  softirq context, so we need to use spinlock_irqsafe vboxsf_new_inode.
  On alloc_inode failure vboxsf_free_inode may be called from process
  context, so it too needs to use spinlock_irqsafe.

Changes in v12:
- Move make_kuid / make_kgid calls to option parsing time and add
  uid_valid / gid_valid checks.
- In init_fs_context call current_uid_gid() to init uid and gid
- Validate dmode, fmode, dmask and fmask options during option parsing
- Use correct types for various mount option variables (kuid_t, kgid_t, umode_t)
- Some small coding-style tweaks

Changes in v11:
- Convert to the new Documentation/filesystems/mount_api.txt mount API
- Fixed all the function kerneldoc comments to have things in the proper order
- Change type of d_type variable passed as type to dir_emit from int to
  unsigned int
- Replaced the fake-ino overflow test with the one suggested by David Howells
- Fixed various coding style issues

Changes in v10:
-Code-style fixes and remove some unneeded checks as suggested by Al Viro
-Stop handle reuse between sf_create_aux and sf_reg_open, the code for this
 was racy and the re-use meant the O_APPEND was not passed to the host for
 newly created files with O_APPEND set
-Use idr to generate unique inode number, modelled after the kernfs code
-Only read and write the contents of the passed in offset pointer once in
 sf_reg_write
-Keep a list of refcounted open handles in the inode, so that writepage can
 get a writeable handle this way. This replaces the old very racy code which
 was just storing a pointer to the last opened struct file inside the inode.
 This is modelled after how the cifs and fuse code do this

Changes in v9:
-Change license from GPL-2.0 or CDDL-1.0 to MIT, following upstream's
 license change from: https://www.virtualbox.org/changeset/72627/vbox
 I've gotten permission by email from VirtualBox upstream to retro-actively
 apply the license-change to my "fork" of the vboxsf code
-Fix not being able to mount any shared-folders when built with gcc9
-Adjust for recent vboxguest changes
-Fix potential buffer overrun in vboxsf_nlscpy
-Fix build errors in some configs, caught by buildbot
-Fix 3 sparse warnings
-Some changes from upstream VirtualBox svn:
 -Use 0x786f4256 /* 'VBox' little endian */ as super-magic matching upstream
 -Implement AT_STATX_SYNC_TYPE support
 -Properly return -EPERM when symlink creation is not supported

Changes in v8:
-Fix broken error-handling / oops when the vboxsf_map_folder() call fails
-Fix umount using umount.nfs to umount vboxsf mounts
-Prefixed the modules init and exit function names with vboxsf_
-Delay connecting to the vbox hypervisor until the first mount, this fixes
 vboxsf not working when it is builtin (in which case it may be initialized
 before the vboxguest driver has bound to the guest communication PCI device)
-Fix sf_write_end return value, return the number of bytes written or 0 on error:
 https://github.com/jwrdegoede/vboxsf/issues/2
-Use an ida id in the name passed to super_setup_bdi_name so that the same
 shared-folder can be mounted twice without causing a
 "sysfs: cannot create duplicate filename" error
 https://github.com/jwrdegoede/vboxsf/issues/3

Changes in v7:
-Do not propagate sgid / suid bits between guest-host, note hosts with
 VirtualBox version 5.2.6 or newer will filter these out regardless of what
 we do
-Better error messages when we cannot connect to the VirtualBox guest PCI
 device, which may e.g. happen when trying to use vboxsf outside a vbox vm

Changes in v6:
-Address: https://www.virtualbox.org/ticket/819 which really is multiple bugs:
 1) Fix MAP_SHARED not being supported
 2) Fix changes done through regular read/write on the guest side not being
    seen by guest apps using mmap() access
 3) Fix any changes done on the host side not being seen by guest apps using
    mmap() access

Changes in v5:
-Honor CONFIG_NLS_DEFAULT (reported-by michael.thayer@oracle.com)

Changes in v4:
-Drop "name=..." mount option, instead use the dev_name argument to the
 mount syscall, to keep compatibility with existing fstab entries
-Fix "nls=%" match_table_t entry to "nls=%s"

Changes in v3:
-Use text only mount options, instead of a custom data struct
-Stop caching full path in inode data, if parents gets renamed it will change
-Fixed negative dentries handling
-Dropped the force_reread flag for dirs, not sure what it was actually for
 but it is no good, doing a re-read on unlink of a file will lead to
 another file being skipped if the caller has already iterated over the
 entry for the unlinked file.
-Use file_inode(), file_dentry() and d_inode() helpers
-Prefix any non object-private symbols with vboxsf_ so as to not pollute
 the global namespace when builtin
-Add MAINTAINERS entry
-Misc. cleanups

Changes in v2:
-Removed various unused wrapper functions
-Don't use i_private, instead defined alloc_inode and destroy_inode
 methods and use container_of.
-Drop obsolete comment referencing people to
 http://www.atnf.csiro.au/people/rgooch/linux/vfs.txt
-move the single symlink op of from lnkops.c to file.c
-Use SPDX license headers
-Replace SHFLROOT / SHFLHANDLE defines with normal types
-Removed unnecessary S_ISREG checks
-Got rid of bounce_buffer in regops, instead add a "user" flag to
 vboxsf_read / vboxsf_write, re-using the existing __user address support
 in the vboxguest module
-Make vboxsf_wrappers return regular linux errno values
-Use i_size_write to update size on writing
-Convert doxygen style comments to kerneldoc style comments
 


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH v18] fs: Add VirtualBox guest shared folder (vboxsf) support
       [not found] ` <20191125140839.4956-2-hdegoede@redhat.com>
@ 2019-12-08  5:33   ` Al Viro
  2019-12-10 15:34     ` Hans de Goede
  0 siblings, 1 reply; 3+ messages in thread
From: Al Viro @ 2019-12-08  5:33 UTC (permalink / raw)
  To: Hans de Goede
  Cc: Linus Torvalds, Greg Kroah-Hartman, Andrew Morton, David Howells,
	Christoph Hellwig, linux-fsdevel

On Mon, Nov 25, 2019 at 03:08:39PM +0100, Hans de Goede wrote:

> +	list_for_each_entry(b, &sf_d->info_list, head) {
> +try_next_entry:
> +		if (ctx->pos >= cur + b->entries) {
> +			cur += b->entries;
> +			continue;
> +		}
> +
> +		/*
> +		 * Note the vboxsf_dir_info objects we are iterating over here
> +		 * are variable sized, so the info pointer may end up being
> +		 * unaligned. This is how we get the data from the host.
> +		 * Since vboxsf is only supported on x86 machines this is not
> +		 * a problem.
> +		 */
> +		for (i = 0, info = b->buf; i < ctx->pos - cur; i++) {
> +			size = offsetof(struct shfl_dirinfo, name.string) +
> +			       info->name.size;
> +			info = (struct shfl_dirinfo *)((uintptr_t)info + size);

Yecchhh...
	1) end = &info->name.string[info->name.size];
	   info = (struct shfl_dirinfo *)end;
please.  Compiler can and will optimize it just fine.
	2) what guarantees the lack of overruns here?

> +{
> +	bool keep_iterating;
> +
> +	for (keep_iterating = true; keep_iterating; ctx->pos += 1)
> +		keep_iterating = vboxsf_dir_emit(dir, ctx);

Are you sure you want to bump ctx->pos when vboxsf_dir_emit() returns false?

> +static int vboxsf_dir_create(struct inode *parent, struct dentry *dentry,
> +			     umode_t mode, int is_dir)
> +{
> +	struct vboxsf_inode *sf_parent_i = VBOXSF_I(parent);
> +	struct vboxsf_sbi *sbi = VBOXSF_SBI(parent->i_sb);
> +	struct shfl_createparms params = {};
> +	int err;
> +
> +	params.handle = SHFL_HANDLE_NIL;
> +	params.create_flags = SHFL_CF_ACT_CREATE_IF_NEW |
> +			      SHFL_CF_ACT_FAIL_IF_EXISTS |
> +			      SHFL_CF_ACCESS_READWRITE |
> +			      (is_dir ? SHFL_CF_DIRECTORY : 0);
> +	params.info.attr.mode = (mode & 0777) |
> +				(is_dir ? SHFL_TYPE_DIRECTORY : SHFL_TYPE_FILE);
> +	params.info.attr.additional = SHFLFSOBJATTRADD_NOTHING;
> +
> +	err = vboxsf_create_at_dentry(dentry, &params);

That's... interesting.  What should happen if you race with rename of
grandparent?  Note that *parent* is locked here; no deeper ancestors
are.

The same goes for removals.

> +static const char *vboxsf_get_link(struct dentry *dentry, struct inode *inode,
> +				   struct delayed_call *done)
> +{
> +	struct vboxsf_sbi *sbi = VBOXSF_SBI(inode->i_sb);
> +	struct shfl_string *path;
> +	char *link;
> +	int err;
> +
> +	if (!dentry)
> +		return ERR_PTR(-ECHILD);
> +
> +	path = vboxsf_path_from_dentry(sbi, dentry);
> +	if (IS_ERR(path))
> +		return (char *)path;

ERR_CAST(path)

> +	/** No additional information is available / requested. */
> +	SHFLFSOBJATTRADD_NOTHING = 1,

<unprintable>
Well, unpronounceable, actually...

> +	switch (opt) {
> +	case opt_nls:
> +		if (fc->purpose != FS_CONTEXT_FOR_MOUNT) {
> +			vbg_err("vboxsf: Cannot reconfigure nls option\n");
> +			return -EINVAL;
> +		}
> +		ctx->nls_name = param->string;
> +		param->string = NULL;

Umm...  What happens if you are given several such?  A leak?

> +{
> +	int err;
> +
> +	err = vboxsf_setup();
> +	if (err)
> +		return err;
> +
> +	return vfs_get_super(fc, vfs_get_independent_super, vboxsf_fill_super);

	return get_tree_nodev(fc, vboxsf_fill_super);
please,

> +static int vboxsf_reconfigure(struct fs_context *fc)
> +{
> +	struct vboxsf_sbi *sbi = VBOXSF_SBI(fc->root->d_sb);
> +	struct vboxsf_fs_context *ctx = fc->fs_private;
> +	struct inode *iroot;
> +
> +	iroot = ilookup(fc->root->d_sb, 0);
> +	if (!iroot)
> +		return -ENOENT;

Huh?  If that's supposed to be root directory inode, what's wrong
with ->d_sb->s_root->d_inode?

> +	path = dentry_path_raw(dentry, buf, PATH_MAX);
> +	if (IS_ERR(path)) {
> +		__putname(buf);
> +		return (struct shfl_string *)path;

ERR_CAST(path)...

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH v18] fs: Add VirtualBox guest shared folder (vboxsf) support
  2019-12-08  5:33   ` [PATCH v18] " Al Viro
@ 2019-12-10 15:34     ` Hans de Goede
  0 siblings, 0 replies; 3+ messages in thread
From: Hans de Goede @ 2019-12-10 15:34 UTC (permalink / raw)
  To: Al Viro
  Cc: Linus Torvalds, Greg Kroah-Hartman, Andrew Morton, David Howells,
	Christoph Hellwig, linux-fsdevel

Hi Al,

Thank you for the review.

You probably know this alreayd but to be sure here is some
background info on vboxsf. vboxsf is not really a filesystem, it
is a kernel driver for VirtualBox virtual machines to access part
of the host filesystem (a Shared-Folder) in the guest.

This uses an IPC interface between the guest and the host and we
are limited to what this interface offers us, in some cases this
is not ideal, e.g. there is no locking.


On 12/8/19 6:33 AM, Al Viro wrote:
> On Mon, Nov 25, 2019 at 03:08:39PM +0100, Hans de Goede wrote:
> 
>> +	list_for_each_entry(b, &sf_d->info_list, head) {
>> +try_next_entry:
>> +		if (ctx->pos >= cur + b->entries) {
>> +			cur += b->entries;
>> +			continue;
>> +		}
>> +
>> +		/*
>> +		 * Note the vboxsf_dir_info objects we are iterating over here
>> +		 * are variable sized, so the info pointer may end up being
>> +		 * unaligned. This is how we get the data from the host.
>> +		 * Since vboxsf is only supported on x86 machines this is not
>> +		 * a problem.
>> +		 */
>> +		for (i = 0, info = b->buf; i < ctx->pos - cur; i++) {
>> +			size = offsetof(struct shfl_dirinfo, name.string) +
>> +			       info->name.size;
>> +			info = (struct shfl_dirinfo *)((uintptr_t)info + size);
> 
> Yecchhh...
> 	1) end = &info->name.string[info->name.size];
> 	   info = (struct shfl_dirinfo *)end;
> please.  Compiler can and will optimize it just fine.

Ok I will rework this as you suggest for the next version.

> 	2) what guarantees the lack of overruns here?

We have checked that (ctx->pos - cur) < b->entries above,
so "i" will be limeted to the range of 0 - (b->entries - 1).

Or do you mean what guarantees that we do not overrun
b->used which is the total (used) number of bytes of the vboxsf_dir_buf?

Currently the code is trusting the host to have not filled the
vboxsf_dir_buf with nonsense and that it b->entries is accurate.

I guess we could add some checks for b->used here as an extra check.

> 
>> +{
>> +	bool keep_iterating;
>> +
>> +	for (keep_iterating = true; keep_iterating; ctx->pos += 1)
>> +		keep_iterating = vboxsf_dir_emit(dir, ctx);
> 
> Are you sure you want to bump ctx->pos when vboxsf_dir_emit() returns false?

No that seems to be a bug, thank you for catching that.

>> +static int vboxsf_dir_create(struct inode *parent, struct dentry *dentry,
>> +			     umode_t mode, int is_dir)
>> +{
>> +	struct vboxsf_inode *sf_parent_i = VBOXSF_I(parent);
>> +	struct vboxsf_sbi *sbi = VBOXSF_SBI(parent->i_sb);
>> +	struct shfl_createparms params = {};
>> +	int err;
>> +
>> +	params.handle = SHFL_HANDLE_NIL;
>> +	params.create_flags = SHFL_CF_ACT_CREATE_IF_NEW |
>> +			      SHFL_CF_ACT_FAIL_IF_EXISTS |
>> +			      SHFL_CF_ACCESS_READWRITE |
>> +			      (is_dir ? SHFL_CF_DIRECTORY : 0);
>> +	params.info.attr.mode = (mode & 0777) |
>> +				(is_dir ? SHFL_TYPE_DIRECTORY : SHFL_TYPE_FILE);
>> +	params.info.attr.additional = SHFLFSOBJATTRADD_NOTHING;
>> +
>> +	err = vboxsf_create_at_dentry(dentry, &params);
> 
> That's... interesting.  What should happen if you race with rename of
> grandparent?  Note that *parent* is locked here; no deeper ancestors
> are.
> 
> The same goes for removals.

vboxsf_create_at_dentry (and the removal and open paths) all use
vboxsf_path_from_dentry which uses dentry_path_raw to get a path
relative to the mount-point. The vboxsf IPC interface does not have
directory-handles or some-such, so we need to do everything with a path
relative to the mount-point. dentry_path_raw() itself does check for
renames happening while it is running, but a rename could still happen
between it finishing and us passing the path to the host, in which
case the IPC call will fail with an ENOENT error.

I see 2 possible solutions for this:

1) Leave this as is, a rename can not only happen on the guest
side (which we could catch) but also on the host side, so a rename
happening while we are doing another path based operation is always
possible. IOW a user can always shoot theirselves in the foot by
doing renames while other operations are in progress.

2) Add a rename mutex to the vboxsf code and lock that around all the cases
where we do a dentry_path_raw() followed by a path based IPC call (keeping
the mutex locke during the ipc call) and also lock the mutex around renames.

I would prefer option 1. I'm open to other suggestions.

>> +static const char *vboxsf_get_link(struct dentry *dentry, struct inode *inode,
>> +				   struct delayed_call *done)
>> +{
>> +	struct vboxsf_sbi *sbi = VBOXSF_SBI(inode->i_sb);
>> +	struct shfl_string *path;
>> +	char *link;
>> +	int err;
>> +
>> +	if (!dentry)
>> +		return ERR_PTR(-ECHILD);
>> +
>> +	path = vboxsf_path_from_dentry(sbi, dentry);
>> +	if (IS_ERR(path))
>> +		return (char *)path;
> 
> ERR_CAST(path)

Ok, will fix for the next version.

> 
>> +	/** No additional information is available / requested. */
>> +	SHFLFSOBJATTRADD_NOTHING = 1,
> 
> <unprintable>
> Well, unpronounceable, actually...

This comes from the upstream VirtualBox headers defining the IPC
interface (this header in essence is a much cleaned up copy of
the official header defining the IPC interface). As such
I would prefer to keep this as is.

>> +	switch (opt) {
>> +	case opt_nls:
>> +		if (fc->purpose != FS_CONTEXT_FOR_MOUNT) {
>> +			vbg_err("vboxsf: Cannot reconfigure nls option\n");
>> +			return -EINVAL;
>> +		}
>> +		ctx->nls_name = param->string;
>> +		param->string = NULL;
> 
> Umm...  What happens if you are given several such?  A leak?

Ah yes, I did not realize a user could specify the same option more then
once I will fix this for the next version.

>> +{
>> +	int err;
>> +
>> +	err = vboxsf_setup();
>> +	if (err)
>> +		return err;
>> +
>> +	return vfs_get_super(fc, vfs_get_independent_super, vboxsf_fill_super);
> 
> 	return get_tree_nodev(fc, vboxsf_fill_super);
> please,

Ok, will fix for the next version.

>> +static int vboxsf_reconfigure(struct fs_context *fc)
>> +{
>> +	struct vboxsf_sbi *sbi = VBOXSF_SBI(fc->root->d_sb);
>> +	struct vboxsf_fs_context *ctx = fc->fs_private;
>> +	struct inode *iroot;
>> +
>> +	iroot = ilookup(fc->root->d_sb, 0);
>> +	if (!iroot)
>> +		return -ENOENT;
> 
> Huh?  If that's supposed to be root directory inode, what's wrong
> with ->d_sb->s_root->d_inode?

That is indeed better, I will fix this for the next version.
>> +	path = dentry_path_raw(dentry, buf, PATH_MAX);
>> +	if (IS_ERR(path)) {
>> +		__putname(buf);
>> +		return (struct shfl_string *)path;
> 
> ERR_CAST(path)...

Ok, will fix for the next version.

Regards,

Hans


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2019-12-10 15:34 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-11-25 14:08 [PATCH v18 0/1] fs: Add VirtualBox guest shared folder (vboxsf) support Hans de Goede
     [not found] ` <20191125140839.4956-2-hdegoede@redhat.com>
2019-12-08  5:33   ` [PATCH v18] " Al Viro
2019-12-10 15:34     ` Hans de Goede

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.