From: David Howells <>
To: Michael Kerrisk <>
Subject: [MANPAGE PATCH] Add manpage for fsopen(2), fspick(2) and fsmount(2)
Date: Tue, 10 Jul 2018 23:54:09 +0100
Message-ID: <>
In-Reply-To: <>

Add a manual page to document the fsopen(), fspick() and fsmount() system
calls.

Signed-off-by: David Howells <>

 man2/fsmount.2 |    1 
 man2/fsopen.2  |  357 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 man2/fspick.2  |    1 
 3 files changed, 359 insertions(+)
 create mode 100644 man2/fsmount.2
 create mode 100644 man2/fsopen.2
 create mode 100644 man2/fspick.2

diff --git a/man2/fsmount.2 b/man2/fsmount.2
new file mode 100644
index 000000000..2bf59fc3e
--- /dev/null
+++ b/man2/fsmount.2
@@ -0,0 +1 @@
+.so man2/fsopen.2
diff --git a/man2/fsopen.2 b/man2/fsopen.2
new file mode 100644
index 000000000..1bc761ab4
--- /dev/null
+++ b/man2/fsopen.2
@@ -0,0 +1,357 @@
+'\" t
+.\" Copyright (c) 2018 David Howells <>
+.\" Permission is granted to make and distribute verbatim copies of this
+.\" manual provided the copyright notice and this permission notice are
+.\" preserved on all copies.
+.\" Permission is granted to copy and distribute modified versions of this
+.\" manual under the conditions for verbatim copying, provided that the
+.\" entire resulting derived work is distributed under the terms of a
+.\" permission notice identical to this one.
+.\" Since the Linux kernel and libraries are constantly changing, this
+.\" manual page may be incorrect or out-of-date.  The author(s) assume no
+.\" responsibility for errors or omissions, or for damages resulting from
+.\" the use of the information contained herein.  The author(s) may not
+.\" have taken the same level of care in the production of this manual,
+.\" which is licensed free of charge, as they might when working
+.\" professionally.
+.\" Formatted or processed versions of this manual, if unaccompanied by
+.\" the source, must acknowledge the copyright and authors of this work.
+.TH FSOPEN 2 2018-06-07 "Linux" "Linux Programmer's Manual"
+fsopen, fsmount, fspick \- Handle filesystem (re-)configuration and mounting
+.B #include <sys/types.h>
+.B #include <sys/mount.h>
+.B #include <unistd.h>
+.BR "#include <fcntl.h>           " "/* Definition of AT_* constants */"
+.BI "int fsopen(const char *" fsname ", unsigned int " flags );
+.BI "int fsmount(int " fd ", unsigned int " flags ", unsigned int " ms_flags );
+.BI "int fspick(int " dirfd ", const char *" pathname ", unsigned int " flags );
+.IR Note :
+There are no glibc wrappers for these system calls.
+.BR fsopen ()
+creates a new filesystem configuration context within the kernel for the
+filesystem named in the
+.I fsname
+parameter and attaches it to a file descriptor, which it then returns.  The
+file descriptor can be marked close-on-exec by setting
+.B FSOPEN_CLOEXEC
+in
+.IR flags .
+The file descriptor can then be used to configure the desired filesystem
+parameters and security parameters by using
+.BR write (2)
+to pass parameters to it and then writing a command to actually create the
+filesystem representation.
+The file descriptor also serves as a channel by which more comprehensive error,
+warning and information messages may be retrieved from the kernel using
+.BR read (2).
+Once the kernel's filesystem representation has been created, it can be queried
+by calling
+.BR fsinfo (2)
+on the file descriptor.  fsinfo() will spot that the target is actually a
+creation context and look inside that.
+.BR fsmount ()
+can then be called to create a mount object that refers to the newly created
+filesystem representation, with the propagation and mount restrictions to be
+applied specified in
+.IR ms_flags .
+The mount object is then attached to a new file descriptor that looks like one
+created by
+.BR open "(2) with " O_PATH " or " open_tree (2).
+This can be passed to
+.BR move_mount (2)
+to attach the mount object to a mountpoint, thereby completing the process.
+The file descriptor returned by fsmount() is marked close-on-exec if
+FSMOUNT_CLOEXEC is specified in
+.IR flags .
+After fsmount() has completed, the context created by fsopen() is reset and
+moved to reconfiguration state, allowing the new superblock to be reconfigured.
+.BR fspick ()
+creates a new filesystem context within the kernel, attaches to it the
+superblock specified by
+.IR dirfd ", " pathname ", " flags ,
+puts it into the reconfiguration state and attaches the context to a new
+file descriptor that can then be parameterised with
+.BR write (2)
+exactly as for the context created by fsopen() above.
+.I flags
+is an OR'd-together mask of a flag which indicates that the returned file
+descriptor should be marked close-on-exec and flags which control the
+pathwalk to the target object (see below).
+.SS Writable Command Interface
+Superblock (re-)configuration is achieved by writing command strings to the
+context file descriptor using
+.BR write (2).
+Each string is prefixed with a specifier indicating the class of command
+being specified.  The available commands include:
+\fB"o <option>"\fP
+Specify a filesystem or security parameter.
+.I <option>
+is typically a key or key=val format string.  Since the length of the option is
+given to write(), the option may include any sort of character, including
+spaces and commas or even binary data.
+\fB"s <name>"\fP
+Specify a device file, network server or other source specification.
+This may be optional, depending on the filesystem, and it may be possible to
+provide more than one of them to a filesystem.
+\fB"x create"\fP
+End the filesystem configuration phase and try to create a representation in
+the kernel with the parameters specified.  After this, the context is shifted
+to the mount-pending state waiting for an fsmount() call to occur.
+\fB"x reconfigure"\fP
+End a filesystem reconfiguration phase and try to apply the parameters to the
+filesystem representation.  After this, the context gets reset and put back to
+the start of the reconfiguration phase again.
+With this interface, option strings are not limited to 4096 bytes, either
+individually or in sum, and they are also not restricted to text-only options.
+Further, errors may be given individually for each option and not aggregated or
+dumped into the kernel log.
+.SS Message Retrieval Interface
+The context file descriptor may be queried for message strings at any time by
+calling
+.BR read (2)
+on the file descriptor.  This will return formatted messages that are prefixed
+to indicate their class:
+\fB"e <message>"\fP
+An error message string was logged.
+\fB"i <message>"\fP
+An informational message string was logged.
+\fB"w <message>"\fP
+A warning message string was logged.
+Messages are removed from the queue as they're read.
+To illustrate the process, here's an example whereby this can be used to mount
+an ext4 filesystem on /dev/sdb1 onto /mnt.  Note that the example ignores the
+fact that
+.BR write (2)
+has a length parameter and that errors might occur.
+.PP +4n
+sfd = fsopen("ext4", FSOPEN_CLOEXEC);
+write(sfd, "s /dev/sdb1");
+write(sfd, "o noatime");
+write(sfd, "o acl");
+write(sfd, "o user_attr");
+write(sfd, "o iversion");
+write(sfd, "x create");
+fsinfo(sfd, NULL, ...);
+mfd = fsmount(sfd, FSMOUNT_CLOEXEC, MS_RELATIME);
+move_mount(mfd, "", AT_FDCWD, "/mnt", MOVE_MOUNT_F_EMPTY_PATH);
+Here, an ext4 context is created first and attached to sfd.  This is then told
+where its source will be, given a bunch of options and created.
+.BR fsinfo (2)
+can then be used to query the filesystem.  Then fsmount() is called to create a
+mount object and
+.BR move_mount (2)
+is called to attach it to its intended mountpoint.
+And here's an example of mounting from an NFS server:
+.PP +4n
+sfd = fsopen("nfs", 0);
+write(sfd, "s");
+write(sfd, "o nfsvers=3");
+write(sfd, "o rsize=65536");
+write(sfd, "o wsize=65536");
+write(sfd, "o rdma");
+write(sfd, "x create");
+mfd = fsmount(sfd, 0, MS_NODEV);
+move_mount(mfd, "", AT_FDCWD, "/mnt", MOVE_MOUNT_F_EMPTY_PATH);
+Reconfiguration can be achieved by:
+.PP +4n
+write(sfd, "o ro");
+write(sfd, "x reconfigure");
+.PP +4n
+sfd = fsopen(...);
+mfd = fsmount(sfd, ...);
+write(sfd, "o ro");
+write(sfd, "x reconfigure");
+On success, all three functions return a file descriptor.  On error, \-1 is
+returned, and
+.I errno
+is set appropriately.
+The error values given below result from filesystem type independent
+errors.
+Each filesystem type may have its own special errors and its
+own special behavior.
+See the Linux kernel source code for details.
+A component of a path was not searchable.
+(See also
+.BR path_resolution (7).)
+Mounting a read-only filesystem was attempted without giving the
+The block device
+.I source
+is located on a filesystem mounted with the
+.\" mtk: Probably: write permission is required for MS_BIND, with
+.\" the error EPERM if not present; CAP_DAC_OVERRIDE is required.
+.I source
+cannot be reconfigured read-only, because it still holds files open for
+writing.
+One of the pointer arguments points outside the user address space.
+.I source
+had an invalid superblock.
+.I ms_flags
+includes more than one of
+An attempt was made to bind mount an unbindable mount.
+Too many links encountered during pathname resolution.
+The system has too many open files to create more.
+The process has too many open files to create more.
+A pathname was longer than
+.I fsname
+is not configured in the kernel.
+A pathname was empty or had a nonexistent component.
+The kernel could not allocate sufficient memory to complete the call.
+.I source
+is not a block device (and a device was required).
+.IR pathname ,
+or a prefix of
+.IR source ,
+is not a directory.
+The major number of the block device
+.I source
+is out of range.
+The caller does not have the required privileges.
+These functions are Linux-specific and should not be used in programs intended
+to be portable.
+.BR fsopen "(), " fsmount "() and " fspick ()
+were added to Linux in kernel 4.18.
+Glibc does not (yet) provide a wrapper for the
+.BR fsopen "(), " fsmount "() or " fspick "()"
+system calls; call them using
+.BR syscall (2).
+.BR mountpoint (1),
+.BR move_mount (2),
+.BR open_tree (2),
+.BR umount (2),
+.BR mount_namespaces (7),
+.BR path_resolution (7),
+.BR findmnt (8),
+.BR lsblk (8),
+.BR mount (8),
+.BR umount (8)
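
The fsopen.2 examples above deliberately leave out the write(2) length
argument and all error checking. A minimal C sketch of how a caller might
supply both follows, together with a classifier for the "e", "i" and "w"
message prefixes listed under "Message Retrieval Interface". It follows the
interface only as described in this patch. The helper names fs_command() and
fs_msg_classify() are hypothetical, and treating a short write as a failure
is an assumption rather than documented behaviour.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: send one command string, such as "o noatime" or
 * "x create", to a context fd in a single write(2) with its exact length.
 * Returns 0 on success, -1 with errno set on failure. */
static int fs_command(int fd, const char *cmd)
{
    size_t len;
    ssize_t n;

    assert(cmd != NULL);
    len = strlen(cmd);
    n = write(fd, cmd, len);
    if (n < 0)
        return -1;
    if ((size_t)n != len) {
        errno = EIO;    /* assumption: a short write counts as an error */
        return -1;
    }
    return 0;
}

/* Message classes, following the "e", "i" and "w" prefixes. */
enum fs_msg_class {
    FS_MSG_ERROR,
    FS_MSG_INFO,
    FS_MSG_WARNING,
    FS_MSG_UNKNOWN
};

/* Hypothetical helper: classify a message of len bytes, for example one
 * obtained with read(2) on the context fd.  On a recognised prefix, *text
 * points just past "<class> "; otherwise it points at the start of msg. */
static enum fs_msg_class fs_msg_classify(const char *msg, size_t len,
                                         const char **text)
{
    enum fs_msg_class cls = FS_MSG_UNKNOWN;

    if (len >= 2 && msg[1] == ' ') {
        if (msg[0] == 'e')
            cls = FS_MSG_ERROR;
        else if (msg[0] == 'i')
            cls = FS_MSG_INFO;
        else if (msg[0] == 'w')
            cls = FS_MSG_WARNING;
    }
    *text = (cls == FS_MSG_UNKNOWN) ? msg : msg + 2;
    return cls;
}
```

With these, write(sfd, "s /dev/sdb1") in the first example would become
fs_command(sfd, "s /dev/sdb1"), with the return value checked before the
next step. Because fs_command() relies on strlen(), it suits text options
only; an option carrying binary data would need a variant that takes an
explicit length.
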
diff --git a/man2/fspick.2 b/man2/fspick.2
new file mode 100644
index 000000000..2bf59fc3e
--- /dev/null
+++ b/man2/fspick.2
@@ -0,0 +1 @@
+.so man2/fsopen.2
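
Since the page states that glibc provides no wrappers and points callers at
syscall(2), thin wrappers might look like the sketch below. The raw_ names
are hypothetical, chosen to avoid clashing with any declarations a C library
may provide. The code assumes the installed headers define SYS_fsopen,
SYS_fsmount and SYS_fspick; where they do not, each wrapper fails with
ENOSYS rather than guessing a syscall number.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <errno.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical wrapper matching the fsopen() prototype in the SYNOPSIS. */
static int raw_fsopen(const char *fsname, unsigned int flags)
{
#ifdef SYS_fsopen
    return (int)syscall(SYS_fsopen, fsname, flags);
#else
    (void)fsname;
    (void)flags;
    errno = ENOSYS;     /* headers too old to know the syscall */
    return -1;
#endif
}

/* Hypothetical wrapper matching the fsmount() prototype in the SYNOPSIS. */
static int raw_fsmount(int fd, unsigned int flags, unsigned int ms_flags)
{
#ifdef SYS_fsmount
    return (int)syscall(SYS_fsmount, fd, flags, ms_flags);
#else
    (void)fd;
    (void)flags;
    (void)ms_flags;
    errno = ENOSYS;
    return -1;
#endif
}

/* Hypothetical wrapper matching the fspick() prototype in the SYNOPSIS. */
static int raw_fspick(int dirfd, const char *pathname, unsigned int flags)
{
#ifdef SYS_fspick
    return (int)syscall(SYS_fspick, dirfd, pathname, flags);
#else
    (void)dirfd;
    (void)pathname;
    (void)flags;
    errno = ENOSYS;
    return -1;
#endif
}
```

A call such as raw_fsopen("ext4", 0) would then stand in for the fsopen()
calls in the examples. Flag constants such as FSOPEN_CLOEXEC are not defined
here and would have to come from kernel headers that provide them.
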

Thread overview: 7+ messages
     [not found] <>
     [not found] ` <>
2018-07-10 22:52   ` [MANPAGE PATCH] Add manpages for move_mount(2) and open_tree(2) David Howells
2019-10-09  9:51     ` Michael Kerrisk (man-pages)
2018-07-10 22:54   ` David Howells [this message]
2019-10-09  9:52     ` [MANPAGE PATCH] Add manpage for fsopen(2), fspick(2) and fsmount(2) Michael Kerrisk (man-pages)
2018-07-10 22:55   ` [MANPAGE PATCH] Add manpage for fsinfo(2) David Howells
2019-10-09  9:52     ` Michael Kerrisk (man-pages)
2019-10-09 12:02     ` David Howells
