All of lore.kernel.org
 help / color / mirror / Atom feed
* For review: open_by_name_at(2) man page
@ 2014-03-17 15:57 Michael Kerrisk (man-pages)
  2014-03-17 22:00 ` NeilBrown
                   ` (2 more replies)
  0 siblings, 3 replies; 22+ messages in thread
From: Michael Kerrisk (man-pages) @ 2014-03-17 15:57 UTC (permalink / raw)
  To: Aneesh Kumar K.V, linux-man, Linux-Fsdevel, lkml
  Cc: mtk.manpages, Andreas Dilger, NeilBrown, Christoph Hellwig

Hi Aneesh, (and others)

Below is a man page I've written for name_to_handle_at(2) and
open_by_name_at(2). Would you be willing to review it please,
and let me know of any corrections/improvements?

Thanks,

Michael


'\" t -*- coding: UTF-8 -*-
.\" Copyright (c) 2014 by Michael Kerrisk <mtk.manpages@gmail.com>
.\"
.\" %%%LICENSE_START(VERBATIM)
.\" Permission is granted to make and distribute verbatim copies of this
.\" manual provided the copyright notice and this permission notice are
.\" preserved on all copies.
.\"
.\" Permission is granted to copy and distribute modified versions of this
.\" manual under the conditions for verbatim copying, provided that the
.\" entire resulting derived work is distributed under the terms of a
.\" permission notice identical to this one.
.\"
.\" Since the Linux kernel and libraries are constantly changing, this
.\" manual page may be incorrect or out-of-date.  The author(s) assume no
.\" responsibility for errors or omissions, or for damages resulting from
.\" the use of the information contained herein.  The author(s) may not
.\" have taken the same level of care in the production of this manual,
.\" which is licensed free of charge, as they might when working
.\" professionally.
.\"
.\" Formatted or processed versions of this manual, if unaccompanied by
.\" the source, must acknowledge the copyright and authors of this work.
.\" %%%LICENSE_END
.\"
.TH OPEN_BY_HANDLE_AT 2 2014-03-24 "Linux" "Linux Programmer's Manual"
.SH NAME
name_to_handle_at, open_by_handle_at \- obtain handle
for a pathname and open file via a handle
.SH SYNOPSIS
.nf
.B #define _GNU_SOURCE
.B #include <sys/types.h>
.B #include <sys/stat.h>
.B #include <fcntl.h>

.BI "int name_to_handle_at(int " dirfd ", const char *" pathname ,
.BI "                      struct file_handle *" handle ,
.BI "                      int *" mnt_id ", int " flags );

.BI "int open_by_handle_at(int " mountdirfd ", struct file_handle *" handle ,
.BI "                      int " flags );
.fi
.SH DESCRIPTION
The
.BR name_to_handle_at ()
and
.BR open_by_handle_at ()
system calls split the functionality of
.BR openat (2)
into two parts:
.BR name_to_handle_at ()
returns an opaque handle that corresponds to a specified file;
.BR open_by_handle_at ()
opens the file corresponding to a handle returned by a previous call to
.BR name_to_handle_at ()
and returns an open file descriptor.
.SS name_to_handle_at()
The
.BR name_to_handle_at ()
system call returns a file handle and a mount ID corresponding to
the file specified by
.IR pathname ,
which specifies the pathname of an existing file.
The file handle is returned via the argument
.IR handle ,
which is a pointer to a structure of the following form:

.in +4n
.nf
struct file_handle {
    unsigned int  handle_bytes;   /* Size of f_handle [in, out] */
    int           handle_type;    /* Handle type [out] */
    unsigned char f_handle[0];    /* File identifier (sized by
                                     caller) [out] */
};
.fi
.in
.PP
It is the caller's responsibility to allocate the structure
with a size large enough to hold the handle returned in
.IR f_handle .
Before the call, the
.IR handle_bytes
field should be initialized to contain the allocated size for
.IR f_handle .
(The constant
.BR MAX_HANDLE_SZ ,
defined in
.IR <fcntl.h> ,
specifies the maximum possible size for a file handle.)
Upon successful return, the
.IR handle_bytes
field is updated to contain the number of bytes actually written to
.IR f_handle .

The caller can discover the required size for the
.I file_handle
structure by making a call in which
.IR handle->handle_bytes
is zero;
in this case, the call fails with the error
.BR EOVERFLOW
and
.IR handle->handle_bytes
is set to indicate the required size;
the caller can then use this information to allocate a structure
of the correct size (see EXAMPLE below).

Other than the use of the
.IR handle_bytes
field, the caller should treat the
.IR file_handle
structure as an opaque data type: the
.IR handle_type
and
.IR f_handle
fields are needed only by a subsequent call to
.BR open_by_handle_at ().

The treatment of a relative pathname in
.I pathname
depends on the value of
.IR dirfd .
If
.I dirfd
has the special value
.BR AT_FDCWD ,
then
.I pathname
is interpreted relative to the current working
directory of the calling process.
(see
.BR openat (3)
for an explanation of why this is useful.)
Otherwise,
.IR dirfd
must be a file descriptor that refers to a directory, and
.I pathname
is interpreted relative to that directory.
If
.I pathname
is an absolute pathname, then
.I dirfd
is ignored.

The
.I mnt_id
argument returns an identifier for the filesystem
mount that corresponds to
.IR pathname .
This corresponds to the first field in one of the records in
.IR /proc/self/mountinfo .
Opening the pathname in the fifth field of that record yields a file
descriptor for the mount point;
that file descriptor can be used in a subsequent call to
.BR open_by_handle_at ().

The
.I flags
argument is a bit mask constructed by ORing together
zero or more of the following value:
.TP
.B AT_EMPTY_PATH
If
.I pathname
is an empty string,
then obtain a handle for the file referred to by
.IR dirfd
(which may have been obtained using the
.BR open (2)
.B O_PATH
flag).
In this case,
.I dirfd
can refer to any type of file, not just a directory.
.TP
.B AT_SYMLINK_FOLLOW
By default,
.BR name_to_handle_at ()
does not dereference
.I pathname
if it is a symbolic link.
The flag
.B AT_SYMLINK_FOLLOW
can be specified in
.I flags
to cause
.I pathname
to be dereferenced if it is a symbolic link.
.SS open_by_handle_at()
The
.BR open_by_handle_at ()
system call opens the file referred to by
.IR handle ,
a file handle returned by a previous call to
.BR name_to_handle_at ().

The
.IR mountdirfd
argument is a file descriptor for a directory under
the mount point with respect to which
.IR handle
should be interpreted.
The special value
.B AT_FDCWD
can be specified, meaning the current working directory of the caller.

The
.I flags
argument
is as for
.BR open (2).

The caller must have the
.B CAP_DAC_READ_SEARCH
capability to invoke
.BR open_by_handle_at ().
.SH RETURN VALUE
On success,
.BR name_to_handle_at ()
returns 0,
and
.BR open_by_handle_at ()
returns a nonnegative file descriptor.

In the event of an error, both system calls return \-1 and set
.I errno
to indicate the cause of the error.
.SH ERRORS
.BR name_to_handle_at ()
and
.BR open_by_handle_at ()
can fail for the same errors as
.BR open (2).
In addition, they can fail with the errors noted below.

.BR name_to_handle_at ()
can fail with the following errors:
.TP
.B EBADF
.IR dirfd
is not an open file descriptor.
.TP
.B EINVAL
.I flags
includes an invalid bit value.
.TP
.B EINVAL
.IR handle_bytes\->handle_bytes
is greater than
.BR MAX_HANDLE_SZ .
.TP
.B ENOTDIR
The file descriptor supplied in
.I dirfd
does not refer to a directory,
and it it is not the case that both
.I flags
includes
.BR AT_EMPTY_PATH
and
.I pathname
is an empty string.
.TP
.B EOPNOTSUPP
The filesystem does not support decoding of a pathname to a file handle.
.TP
.B EOVERFLOW
The
.I handle->handle_bytes
value passed into the call was too small.
When this error occurs,
.I handle->handle_bytes
is updated to indicate the required size for the handle.
.\"
.\"
.PP
.BR open_by_handle_at ()
can fail with the following errors:
.TP
.B EBADF
.IR mountdirfd
is not an open file descriptor.
.TP
.B EINVAL
.I handle->handle_bytes
is greater than
.BR MAX_HANDLE_SZ
or is equal to zero.
.TP
.B ENOMEM
Insufficient memory.
.TP
.B ENOTDIR
.IR mountdirfd
is not
.B AT_FDCWD
and does not refer to a directory.
.TP
.B EPERM
The caller does not have the
.BR CAP_DAC_READ_SEARCH
capability.
.TP
.B ESTALE
The specified
.I handle
is no longer valid.
.SH VERSIONS
These system calls first appeared in Linux 2.6.39.
.SH CONFORMING TO
These system calls are nonstandard Linux extensions.
.SH NOTES
A file handle can be generated in one process using
.BR name_to_handle_at ()
and later used in a different process that calls
.BR open_by_handle_at ().

These system calls are designed for use by user-space file servers.
For example, a user-space NFS server might generate a file handle
and pass it to an NFS client.
Later, when the client wants to open the file,
it could pass the handle back to the server.
.\" https://lwn.net/Articles/375888/
.\"	"Open by handle" - Jonathan Corbet, 2010-02-23
This sort of functionality allows a user-space file server to operate in
a stateless fashion with respect to the files it serves.

Specifying both
.BR O_PATH
and
.BR O_NOFOLLOW
in a call to
.BR name_to_handle_at ()
that operates on a symbolic link can be used to obtain a handle for the link.
.\" commit bcda76524cd1fa32af748536f27f674a13e56700
The process receiving the handle can later perform operations
on the symbolic link by converting the handle to a file descriptor using
.BR open_by_handle_at ()
and then passing the file descriptor as the
.IR dirfd
argument in system calls such as
.BR readlinkat (2)
and
.BR fchownat (2).
.SS Obtaining a persistent filesystem ID
The mount IDs in
.IR /proc/self/mountinfo
can be reused as filesystems are unmounted and mounted.
Therefore, the mount ID returned by
.BR name_to_handle_at (3)
(in
.IR *mnt_id )
should not be treated as a persistent identifier
for the corresponding mounted filesystem.
However, an application can use the information in the
.I mountinfo
record that corresponds to the mount ID
to derive a persistent identifier.

For example, one can use the device name in the fifth field of the
.I mountinfo
record to search for the corresponding device UUID via the symbolic links in
.IR /dev/disks/by-uuid .
(A more comfortable way of obtaining the UUID is to use the
.\" e.g., http://stackoverflow.com/questions/6748429/using-libblkid-to-find-uuid-of-a-partition
.BR libblkid (3)
library, which uses the
.I /sys
filesystem to obtain the same information.)
That process can then be reversed,
using the UUID to look up the device name,
and then obtaining the corresponding mount point,
in order to produce the
.IR mountdirfd
argument used by
.BR open_by_name_at ().
.SH EXAMPLE
The two programs below demonstrate the use of
.BR name_to_handle_at ()
and
.BR open_by_handle_at ().
The first program
.RI ( t_name_to_handle_at.c )
uses
.BR name_to_handle_at ()
to obtain the file handle and mount ID
for the file specified in its command-line argument;
the handle and ID are written to standard output.

The second program
.RI ( t_open_by_handle_at.c )
reads a mount ID and file handle from standard input.
The program then employs
.BR open_by_handle_at ()
to open the file using that handle.
If an optional command-line argument is supplied, then the
.IR mountdirfd
argument for
.BR open_by_handle_at ()
is obtained by opening the directory named in that argument.
Otherwise,
.IR mountdirfd
is obtained by scanning
.IR /proc/self/mountinfo
to find a record whose mount ID matches the mount ID
read from standard input,
and the mount directory specified in that record is opened.
(These programs do not deal with the fact that mount IDs are not persistent.)

The following shell session demonstrates the use of these two programs:

.in +4n
.nf
$ \fBecho 'Kannst du bitte überlegen?' > cecilia.txt\fP
$ \fB./t_name_to_handle_at cecilia.txt > fh\fP
$ \fB./t_open_by_handle_at < fh\fP
open_by_handle_at: Operation not permitted
$ \fBsudo ./t_open_by_handle_at < fh\fP      # Need CAP_SYS_ADMIN
Read 28 bytes
$ \fBrm cecilia.txt\fP
.fi
.in

Now delete and re-create the file with the same inode number;
.BR open_by_handle_at ()
recognizes that the file referred to by the file handle no longer exists.

.in +4n
.nf
$ \fBstat \-\-printf="%i\\n" cecilia.txt\fP       # Display inode number 
4072121
$ \fBecho 'Warum?' > cecilia.txt\fP
$ \fBstat \-\-printf="%i\\n" cecilia.txt\fP       # Check inode number
4072121
$ \fBsudo ./t_open_by_handle_at < fh\fP
open_by_handle_at: Stale NFS file handle
.fi
.in
.SS Program source: t_name_to_handle_at.c
\&
.nf
#define _GNU_SOURCE
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>
#include <string.h>

#define errExit(msg)    do { perror(msg); exit(EXIT_FAILURE); \\
                        } while (0)

int
main(int argc, char *argv[])
{
    struct file_handle *fhp;
    int mount_id, fhsize, s;

    if (argc < 2 || strcmp(argv[1], "\-\-help") == 0) {
        fprintf(stderr, "Usage: %s pathname\\n", argv[0]);
        exit(EXIT_FAILURE);
    }

    /* Allocate file_handle structure */

    fhsize = sizeof(struct file_handle *);
    fhp = malloc(fhsize);
    if (fhp == NULL)
        errExit("malloc");

    /* Make an initial call to name_to_handle_at() to discover
       the size required for file handle */

    fhp\->handle_bytes = 0;
    s = name_to_handle_at(AT_FDCWD, argv[1], fhp, &mount_id, 0);
    if (s != \-1 || errno != EOVERFLOW) {
        fprintf(stderr, "Unexpected result from name_to_handle_at()\\n");
        exit(EXIT_FAILURE);
    }

    /* Reallocate file_handle structure with correct size */

    fhsize = sizeof(struct file_handle) + fhp\->handle_bytes;
    fhp = realloc(fhp, fhsize);         /* Copies fhp\->handle_bytes */
    if (fhp == NULL)
        errExit("realloc");

    /* Get file handle from pathname supplied on command line */

    if (name_to_handle_at(AT_FDCWD, argv[1], fhp, &mount_id, 0) == \-1)
        errExit("name_to_handle_at");

    /* Write mount ID, file handle size, and file handle to stdout,
       for later reuse by t_open_by_handle_at.c */

    if (write(STDOUT_FILENO, &mount_id, sizeof(int)) != sizeof(int) ||
            write(STDOUT_FILENO, &fhsize, sizeof(int)) != sizeof(int) ||
            write(STDOUT_FILENO, fhp, fhsize) != fhsize) {
        fprintf(stderr, "Write failure\\n");
        exit(EXIT_FAILURE);
    }

    exit(EXIT_SUCCESS);
}
.fi
.SS Program source: t_open_by_handle_at.c
\&
.nf
#define _GNU_SOURCE
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>

#define errExit(msg)    do { perror(msg); exit(EXIT_FAILURE); \\
                        } while (0)

/* Scan /proc/self/mountinfo to find the line whose mount ID matches
   \(aqmount_id\(aq. (An easier way to do this is to install and use the
   \(aqlibmount\(aq library provided by the \(aqutil\-linux\(aq project.)
   Open the corresponding mount path and return the resulting file
   descriptor. */

static int
open_mount_path_by_id(int mount_id)
{
    char *linep;
    size_t lsize;
    char mount_path[PATH_MAX];
    int fmnt_id, fnd, nread;
    FILE *fp;

    fp  = fopen("/proc/self/mountinfo", "r");
    if (fp == NULL)
        errExit("fopen");

    for (fnd = 0; !fnd ; ) {
        linep = NULL;
        nread = getline(&linep, &lsize, fp);
        if (nread == \-1)
            break;

	nread = sscanf(linep, "%d %*d %*s %*s %s", &fmnt_id, mount_path);
        if (nread != 2) {
            fprintf(stderr, "Bad sscanf()\\n");
            exit(EXIT_FAILURE);
        }

        free(linep);

        if (fmnt_id == mount_id)
            fnd = 1;
    }

    fclose(fp);

    if (!fnd) {
        fprintf(stderr, "Could not find mount point\\n");
        exit(EXIT_FAILURE);
    }

    return open(mount_path, O_RDONLY | O_DIRECTORY);
}

int
main(int argc, char *argv[])
{
    struct file_handle *fhp;
    int mount_id, fd, mount_fd, fhsize;
    ssize_t nread;
#define BSIZE 1000
    char buf[BSIZE];

    if (argc > 1 && strcmp(argv[1], "\-\-help") == 0) {
        fprintf(stderr, "Usage: %s [mount\-dir]]\\n",
                argv[0]);
        exit(EXIT_FAILURE);
    }

    /* Read data produced by t_name_to_handle_at.c */

    if (read(STDIN_FILENO, &mount_id, sizeof(int)) != sizeof(int))
        errExit("read");

    if (read(STDIN_FILENO, &fhsize, sizeof(int)) != sizeof(int))
        errExit("read");

    fhp = malloc(fhsize);
    if (fhp == NULL)
        errExit("malloc");

    if (read(STDIN_FILENO, fhp, fhsize) != fhsize)
        errExit("read");

    /* Obtain file descriptor for mount point, either by opening
       the pathname specified on the command line, or by scanning
       /proc/self/mounts to find a mount that matches the \(aqmount_id\(aq
       obtained by name_to_handle_at() (in t_name_to_handle_at.c) */

    if (argc > 1)
        mount_fd = open(argv[1], O_RDONLY | O_DIRECTORY);
    else
        mount_fd = open_mount_path_by_id(mount_id);

    if (mount_fd == \-1)
        errExit("opening mount fd");

    /* Open name using handle and mount point */

    fd = open_by_handle_at(mount_fd, fhp, O_RDONLY);
    if (fd == \-1)
        errExit("open_by_handle_at");

    /* Try reading a few bytes from the file */

    nread = read(fd, buf, BSIZE);
    if (nread == \-1)
        errExit("read");
    printf("Read %ld bytes\\n", (long) nread);

    exit(EXIT_SUCCESS);
}
.fi
.SH SEE ALSO
.BR blkid (1),
.BR findfs (1),
.BR open (2),
.BR libblkid (3),
.BR mount (8)

The
.I libblkid
and
.I libmount
documentation under the latest
.I util-linux
release at
.UR https://www.kernel.org/pub/linux/utils/util-linux/
.UE




-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: For review: open_by_name_at(2) man page
  2014-03-17 15:57 For review: open_by_name_at(2) man page Michael Kerrisk (man-pages)
@ 2014-03-17 22:00 ` NeilBrown
  2014-03-18  9:43   ` Christoph Hellwig
  2014-03-18 12:35     ` Michael Kerrisk (man-pages)
  2014-03-18  9:37 ` Christoph Hellwig
  2014-03-18 12:55 ` For review: open_by_name_at(2) man page [v2] Michael Kerrisk (man-pages)
  2 siblings, 2 replies; 22+ messages in thread
From: NeilBrown @ 2014-03-17 22:00 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages)
  Cc: Aneesh Kumar K.V, linux-man, Linux-Fsdevel, lkml, Andreas Dilger,
	Christoph Hellwig

[-- Attachment #1: Type: text/plain, Size: 4127 bytes --]

On Mon, 17 Mar 2014 16:57:29 +0100 "Michael Kerrisk (man-pages)"
<mtk.manpages@gmail.com> wrote:

> Hi Aneesh, (and others)
> 
> Below is a man page I've written for name_to_handle_at(2) and
> open_by_name_at(2). Would you be willing to review it please,
> and let me know of any corrections/improvements?
> 
> Thanks,
> 
> Michael

Thanks for writing this Michael.  The fact that I can only find very small
points to comment on reflects the high quality...


> Otherwise,
> .IR dirfd
> must be a file descriptor that refers to a directory, and
  ^^^^^^^
> .I pathname
> is interpreted relative to that directory.

As you clarify later, "must be" is not correct.  Maybe this is just an issue
of style, in which case you should obviously keep a consistent style across
man pages, but to me it sounds wrong.  I would use "is generally" or similar.



> The
> .IR mountdirfd
> argument is a file descriptor for a directory under
> the mount point with respect to which
> .IR handle
> should be interpreted.

mountdirfd does not have to be for a directory.  It can be for any object in
the filesystem.  And I would say "in", not "under".
If /foo and /foo/bar are both mountpoints, and I want to look up a
filehandle for the filesystem mounted at /foo, then opening "/foo/bar"
wouldn't work even though /foo/bar is "under" /foo.  And opening "/foo" would
work even though "/foo" is not under "/foo/" (is it?).

  The
  .IR mountfd
  argument is a file descriptor for any object (file, directory, etc.) in the
  filesystem with respect to which
  .IR handle
  should be interpreted.

??



> .B ESTALE
> The specified
> .I handle
> is no longer valid.

ESTALE is also returned if the filesystem does not support file-handle ->
file mappings.
On filesystems which don't provide export_operations (/sys /proc ubifs
romfs cramfs nfs coda ... several others) name_to_handle_at will produce a
generic handle using the 32 bit inode and 32 bit i_generation.
open_by_name_at given this (or any) filehandle will fail with ESTALE.
I don't know how best to include this in the documentation.  Maybe a note
earlier noting that some filesystems do not support open_by_name_at(), and
you cannot programatically determine which do except by trying.
At the same time note that a file handle can become in valid if a file is
deleted or for any other reason as determined by the filesystem, and that the
error is the same as for when the filesystem doesn't support open_by_name_at.


> For example, one can use the device name in the fifth field of the
> .I mountinfo
> record to search for the corresponding device UUID via the symbolic links in
> .IR /dev/disks/by-uuid .
> (A more comfortable way of obtaining the UUID is to use the
> .\" e.g., http://stackoverflow.com/questions/6748429/using-libblkid-to-find-uuid-of-a-partition
> .BR libblkid (3)
> library, which uses the
> .I /sys
> filesystem to obtain the same information.)

Does it?  My understanding from "man libblkid" (it is a while since I've read
the code) is that it either uses info in /dev/disks/by-* or reads directly
from the block devices (maybe using /sys to find them?) and interprets the
superblock to extract a UUID.



> Now delete and re-create the file with the same inode number;
> .BR open_by_handle_at ()
> recognizes that the file referred to by the file handle no longer exists.
> 
> .in +4n
> .nf
> $ \fBstat \-\-printf="%i\\n" cecilia.txt\fP       # Display inode number 
> 4072121
> $ \fBecho 'Warum?' > cecilia.txt\fP
> $ \fBstat \-\-printf="%i\\n" cecilia.txt\fP       # Check inode number
> 4072121
> $ \fBsudo ./t_open_by_handle_at < fh\fP
> open_by_handle_at: Stale NFS file handle

Something is very wrong here.
   echo foo > somefile
does not "delete and re-create" the file.  It opens and truncates.
That operation should not invalidate the filehandle on any sane filesystem.


>     if (argc > 1)
>         mount_fd = open(argv[1], O_RDONLY | O_DIRECTORY);

O_DIRECTORY is not appropriate, as mentioned earlier.


Thanks,
NeilBrown


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: For review: open_by_name_at(2) man page
  2014-03-17 15:57 For review: open_by_name_at(2) man page Michael Kerrisk (man-pages)
  2014-03-17 22:00 ` NeilBrown
@ 2014-03-18  9:37 ` Christoph Hellwig
  2014-03-18 12:41   ` Michael Kerrisk (man-pages)
  2014-03-18 12:55 ` For review: open_by_name_at(2) man page [v2] Michael Kerrisk (man-pages)
  2 siblings, 1 reply; 22+ messages in thread
From: Christoph Hellwig @ 2014-03-18  9:37 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages)
  Cc: Aneesh Kumar K.V, linux-man, Linux-Fsdevel, lkml, Andreas Dilger,
	NeilBrown

Hi Michael,

the man page looks reasonable.  If you refer to openat(2) instead of
open(2) in the ERRORS section you could avoid duplicating a few of the
dirfd and flags related errors.


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: For review: open_by_name_at(2) man page
  2014-03-17 22:00 ` NeilBrown
@ 2014-03-18  9:43   ` Christoph Hellwig
  2014-03-18 12:37     ` Michael Kerrisk (man-pages)
  2014-03-18 12:35     ` Michael Kerrisk (man-pages)
  1 sibling, 1 reply; 22+ messages in thread
From: Christoph Hellwig @ 2014-03-18  9:43 UTC (permalink / raw)
  To: NeilBrown
  Cc: Michael Kerrisk (man-pages),
	Aneesh Kumar K.V, linux-man, Linux-Fsdevel, lkml, Andreas Dilger,
	Christoph Hellwig

On Tue, Mar 18, 2014 at 09:00:07AM +1100, NeilBrown wrote:
> ESTALE is also returned if the filesystem does not support file-handle ->
> file mappings.
> On filesystems which don't provide export_operations (/sys /proc ubifs
> romfs cramfs nfs coda ... several others) name_to_handle_at will produce a
> generic handle using the 32 bit inode and 32 bit i_generation.

Do we?  Seems like the code is erroring out early if there are no
export_ops?

> Does it?  My understanding from "man libblkid" (it is a while since I've read
> the code) is that it either uses info in /dev/disks/by-* or reads directly
> from the block devices (maybe using /sys to find them?) and interprets the
> superblock to extract a UUID.

It normally reads directly from disk, unless it has changed very
recently.


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: For review: open_by_name_at(2) man page
@ 2014-03-18 12:35     ` Michael Kerrisk (man-pages)
  0 siblings, 0 replies; 22+ messages in thread
From: Michael Kerrisk (man-pages) @ 2014-03-18 12:35 UTC (permalink / raw)
  To: NeilBrown
  Cc: mtk.manpages, Aneesh Kumar K.V, linux-man, Linux-Fsdevel, lkml,
	Andreas Dilger, Christoph Hellwig

On 03/17/2014 11:00 PM, NeilBrown wrote:
> On Mon, 17 Mar 2014 16:57:29 +0100 "Michael Kerrisk (man-pages)"
> <mtk.manpages@gmail.com> wrote:
> 
>> Hi Aneesh, (and others)
>>
>> Below is a man page I've written for name_to_handle_at(2) and
>> open_by_name_at(2). Would you be willing to review it please,
>> and let me know of any corrections/improvements?
>>
>> Thanks,
>>
>> Michael
> 
> Thanks for writing this Michael.  The fact that I can only find very small
> points to comment on reflects the high quality...

Thanks, Neil. But there was at least one good clanger below :-}.

> 
>> Otherwise,
>> .IR dirfd
>> must be a file descriptor that refers to a directory, and
>   ^^^^^^^
>> .I pathname
>> is interpreted relative to that directory.
> 
> As you clarify later, "must be" is not correct.  Maybe this is just an issue
> of style, in which case you should obviously keep a consistent style across
> man pages, but to me it sounds wrong.  I would use "is generally" or similar.

Yep, good point. In fact, what I did was rewrite that section completely, to 
more clearly describe the distinct cases based on dirfd/pathname/AT_EMPTY_PATH.


>> The
>> .IR mountdirfd
>> argument is a file descriptor for a directory under
>> the mount point with respect to which
>> .IR handle
>> should be interpreted.
> 
> mountdirfd does not have to be for a directory.  It can be for any object in
> the filesystem.  And I would say "in", not "under".
> If /foo and /foo/bar are both mountpoints, and I want to look up a
> filehandle for the filesystem mounted at /foo, then opening "/foo/bar"
> wouldn't work even though /foo/bar is "under" /foo.  And opening "/foo" would
> work even though "/foo" is not under "/foo/" (is it?).

Good catch. I got deceived by the name of the argument, which in the kernel
source is indeed 'mountdirfd', implying it must be a descriptor for a directory.
I'll rename the argument in the man page to 'mount_fd' and fix the description 
as you suggest here:

>   The
>   .IR mountfd
>   argument is a file descriptor for any object (file, directory, etc.) in the
>   filesystem with respect to which

I did s/filesystem/mounted filesystem/

>   .IR handle
>   should be interpreted.
> 
> ??
>> .B ESTALE
>> The specified
>> .I handle
>> is no longer valid.
> 
> ESTALE is also returned if the filesystem does not support file-handle ->
> file mappings.
> On filesystems which don't provide export_operations (/sys /proc ubifs
> romfs cramfs nfs coda ... several others) name_to_handle_at will produce a
> generic handle using the 32 bit inode and 32 bit i_generation.

Are you sure about this? When I try name_to_handle_at() on /proc and
/sys, it gives an error (EOPNOTSUPP). I haven't tested the other
FSes though, so maybe some of them do what you say.

> open_by_name_at given this (or any) filehandle will fail with ESTALE.
> I don't know how best to include this in the documentation.  Maybe a note
> earlier noting that some filesystems do not support open_by_name_at(), and
> you cannot programatically determine which do except by trying.
> At the same time note that a file handle can become in valid if a file is
> deleted or for any other reason as determined by the filesystem, and that the
> error is the same as for when the filesystem doesn't support open_by_name_at.

I've added text about invalid file handles into NOTES, and noted that not all
FSes support the production of file handles, but haven't noted ESTALE for the
latter, since I don't yet know if your statement above is true for some 
filesystems.


>> For example, one can use the device name in the fifth field of the
>> .I mountinfo
>> record to search for the corresponding device UUID via the symbolic links in
>> .IR /dev/disks/by-uuid .
>> (A more comfortable way of obtaining the UUID is to use the
>> .\" e.g., http://stackoverflow.com/questions/6748429/using-libblkid-to-find-uuid-of-a-partition
>> .BR libblkid (3)
>> library, which uses the
>> .I /sys
>> filesystem to obtain the same information.)
> 
> Does it?  My understanding from "man libblkid" (it is a while since I've read
> the code) is that it either uses info in /dev/disks/by-* or reads directly
> from the block devices (maybe using /sys to find them?) and interprets the
> superblock to extract a UUID.

Thanks (and to Christoph) -- I'll just remove the words "which uses the /sys
filesystem to obtain the same information"

>> Now delete and re-create the file with the same inode number;
>> .BR open_by_handle_at ()
>> recognizes that the file referred to by the file handle no longer exists.
>>
>> .in +4n
>> .nf
>> $ \fBstat \-\-printf="%i\\n" cecilia.txt\fP       # Display inode number 
>> 4072121
>> $ \fBecho 'Warum?' > cecilia.txt\fP
>> $ \fBstat \-\-printf="%i\\n" cecilia.txt\fP       # Check inode number
>> 4072121
>> $ \fBsudo ./t_open_by_handle_at < fh\fP
>> open_by_handle_at: Stale NFS file handle
> 
> Something is very wrong here.
>    echo foo > somefile
> does not "delete and re-create" the file.  It opens and truncates.
> That operation should not invalidate the filehandle on any sane filesystem.

Indeed! I don't know quite what I was smoking as I reviewed that piece.
In fact, I started writing this page a long time ago, but then other 
events intervened, and it was a long time before I came back to it recently.
Certainly, when I produced that shell session log, things proceeded
(almost) as shown. I'm guessing that what happened is that I by 
accident edited out a line

    rm cecilia.txt

just before

    echo 'Warum?' > cecilia.txt

Fixed now. (In that case of course, it is of course a matter of chance
whether the pathname is re-created with the same i-node number, but if 
you are quick, it often is. I'll add some explanation to the page.)

>>     if (argc > 1)
>>         mount_fd = open(argv[1], O_RDONLY | O_DIRECTORY);
> 
> O_DIRECTORY is not appropriate, as mentioned earlier.

Fixed (in two places).

Thanks for the review, Neil. That helped fix a lot of problems in the page.

Cheers,

Michael

-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: For review: open_by_name_at(2) man page
@ 2014-03-18 12:35     ` Michael Kerrisk (man-pages)
  0 siblings, 0 replies; 22+ messages in thread
From: Michael Kerrisk (man-pages) @ 2014-03-18 12:35 UTC (permalink / raw)
  To: NeilBrown
  Cc: mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w, Aneesh Kumar K.V,
	linux-man-u79uwXL29TY76Z2rM5mHXA, Linux-Fsdevel, lkml,
	Andreas Dilger, Christoph Hellwig

On 03/17/2014 11:00 PM, NeilBrown wrote:
> On Mon, 17 Mar 2014 16:57:29 +0100 "Michael Kerrisk (man-pages)"
> <mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> 
>> Hi Aneesh, (and others)
>>
>> Below is a man page I've written for name_to_handle_at(2) and
>> open_by_name_at(2). Would you be willing to review it please,
>> and let me know of any corrections/improvements?
>>
>> Thanks,
>>
>> Michael
> 
> Thanks for writing this Michael.  The fact that I can only find very small
> points to comment on reflects the high quality...

Thanks, Neil. But there was at least one good clanger below :-}.

> 
>> Otherwise,
>> .IR dirfd
>> must be a file descriptor that refers to a directory, and
>   ^^^^^^^
>> .I pathname
>> is interpreted relative to that directory.
> 
> As you clarify later, "must be" is not correct.  Maybe this is just an issue
> of style, in which case you should obviously keep a consistent style across
> man pages, but to me it sounds wrong.  I would use "is generally" or similar.

Yep, good point. In fact, what I did was rewrite that section completely, to 
more clearly describe the distinct cases based on dirfd/pathname/AT_EMPTY_PATH.


>> The
>> .IR mountdirfd
>> argument is a file descriptor for a directory under
>> the mount point with respect to which
>> .IR handle
>> should be interpreted.
> 
> mountdirfd does not have to be for a directory.  It can be for any object in
> the filesystem.  And I would say "in", not "under".
> If /foo and /foo/bar are both mountpoints, and I want to look up a
> filehandle for the filesystem mounted at /foo, then opening "/foo/bar"
> wouldn't work even though /foo/bar is "under" /foo.  And opening "/foo" would
> work even though "/foo" is not under "/foo/" (is it?).

Good catch. I got deceived by the name of the argument, which in the kernel
source is indeed 'mountdirfd', implying it must be a descriptor for a directory.
I'll rename the argument in the man page to 'mount_fd' and fix the description 
as you suggest here:

>   The
>   .IR mountfd
>   argument is a file descriptor for any object (file, directory, etc.) in the
>   filesystem with respect to which

I did s/filesystem/mounted filesystem/

>   .IR handle
>   should be interpreted.
> 
> ??
>> .B ESTALE
>> The specified
>> .I handle
>> is no longer valid.
> 
> ESTALE is also returned if the filesystem does not support file-handle ->
> file mappings.
> On filesystems which don't provide export_operations (/sys /proc ubifs
> romfs cramfs nfs coda ... several others) name_to_handle_at will produce a
> generic handle using the 32 bit inode and 32 bit i_generation.

Are you sure about this? When I try name_to_handle_at() on /proc and
/sys, it gives an error (EOPNOTSUPP). I haven't tested the other
FSes though, so maybe some of them do what you say.

> open_by_name_at given this (or any) filehandle will fail with ESTALE.
> I don't know how best to include this in the documentation.  Maybe a note
> earlier noting that some filesystems do not support open_by_name_at(), and
> you cannot programatically determine which do except by trying.
> At the same time note that a file handle can become in valid if a file is
> deleted or for any other reason as determined by the filesystem, and that the
> error is the same as for when the filesystem doesn't support open_by_name_at.

I've added text about invalid file handles into NOTES, and noted that not all
FSes support the production of file handles, but haven't noted ESTALE for the
latter, since I don't yet know if your statement above is true for some 
filesystems.


>> For example, one can use the device name in the fifth field of the
>> .I mountinfo
>> record to search for the corresponding device UUID via the symbolic links in
>> .IR /dev/disks/by-uuid .
>> (A more comfortable way of obtaining the UUID is to use the
>> .\" e.g., http://stackoverflow.com/questions/6748429/using-libblkid-to-find-uuid-of-a-partition
>> .BR libblkid (3)
>> library, which uses the
>> .I /sys
>> filesystem to obtain the same information.)
> 
> Does it?  My understanding from "man libblkid" (it is a while since I've read
> the code) is that it either uses info in /dev/disks/by-* or reads directly
> from the block devices (maybe using /sys to find them?) and interprets the
> superblock to extract a UUID.

Thanks (and to Christoph) -- I'll just remove the words "which uses the /sys
filesystem to obtain the same information"

>> Now delete and re-create the file with the same inode number;
>> .BR open_by_handle_at ()
>> recognizes that the file referred to by the file handle no longer exists.
>>
>> .in +4n
>> .nf
>> $ \fBstat \-\-printf="%i\\n" cecilia.txt\fP       # Display inode number 
>> 4072121
>> $ \fBecho 'Warum?' > cecilia.txt\fP
>> $ \fBstat \-\-printf="%i\\n" cecilia.txt\fP       # Check inode number
>> 4072121
>> $ \fBsudo ./t_open_by_handle_at < fh\fP
>> open_by_handle_at: Stale NFS file handle
> 
> Something is very wrong here.
>    echo foo > somefile
> does not "delete and re-create" the file.  It opens and truncates.
> That operation should not invalidate the filehandle on any sane filesystem.

Indeed! I don't know quite what I was smoking as I reviewed that piece.
In fact, I started writing this page a long time ago, but then other 
events intervened, and it was a long time before I came back to it recently.
Certainly, when I produced that shell session log, things proceeded
(almost) as shown. I'm guessing that what happened is that I by 
accident edited out a line

    rm cecilia.txt

just before

    echo 'Warum?' > cecilia.txt

Fixed now. (In that case of course, it is of course a matter of chance
whether the pathname is re-created with the same i-node number, but if 
you are quick, it often is. I'll add some explanation to the page.)

>>     if (argc > 1)
>>         mount_fd = open(argv[1], O_RDONLY | O_DIRECTORY);
> 
> O_DIRECTORY is not appropriate, as mentioned earlier.

Fixed (in two places).

Thanks for the review, Neil. That helped fix a lot of problems in the page.

Cheers,

Michael

-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/
--
To unsubscribe from this list: send the line "unsubscribe linux-man" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: For review: open_by_name_at(2) man page
  2014-03-18  9:43   ` Christoph Hellwig
@ 2014-03-18 12:37     ` Michael Kerrisk (man-pages)
  2014-03-18 22:24         ` NeilBrown
  0 siblings, 1 reply; 22+ messages in thread
From: Michael Kerrisk (man-pages) @ 2014-03-18 12:37 UTC (permalink / raw)
  To: Christoph Hellwig, NeilBrown
  Cc: mtk.manpages, Aneesh Kumar K.V, linux-man, Linux-Fsdevel, lkml,
	Andreas Dilger

On 03/18/2014 10:43 AM, Christoph Hellwig wrote:
> On Tue, Mar 18, 2014 at 09:00:07AM +1100, NeilBrown wrote:
>> ESTALE is also returned if the filesystem does not support file-handle ->
>> file mappings.
>> On filesystems which don't provide export_operations (/sys /proc ubifs
>> romfs cramfs nfs coda ... several others) name_to_handle_at will produce a
>> generic handle using the 32 bit inode and 32 bit i_generation.
> 
> Do we?  Seems like the code is erroring out early if there are no
> export_ops?

It appears to me that Neil's statement isn't correct, at least for /proc
and /sys (see my other mail, to Neil). I'm unsure about whether it is true
for some of those other FSes thought.

>> Does it?  My understanding from "man libblkid" (it is a while since I've read
>> the code) is that it either uses info in /dev/disks/by-* or reads directly
>> from the block devices (maybe using /sys to find them?) and interprets the
>> superblock to extract a UUID.
> 
> It normally reads directly from disk, unless it has changed very
> recently.

Thanks. As noted in my mail, I solved this one by just saying a little less
about libblkid.

Cheers,

Michael


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: For review: open_by_name_at(2) man page
  2014-03-18  9:37 ` Christoph Hellwig
@ 2014-03-18 12:41   ` Michael Kerrisk (man-pages)
  0 siblings, 0 replies; 22+ messages in thread
From: Michael Kerrisk (man-pages) @ 2014-03-18 12:41 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: mtk.manpages, Aneesh Kumar K.V, linux-man, Linux-Fsdevel, lkml,
	Andreas Dilger, NeilBrown

On 03/18/2014 10:37 AM, Christoph Hellwig wrote:
> Hi Michael,
> 
> the man page looks reasonable.  If you refer to openat(2) instead of
> open(2) in the ERRORS section you could avoid duplicating a few of the
> dirfd and flags related errors.

Good idea. Done.

Cheers,

Michael


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

^ permalink raw reply	[flat|nested] 22+ messages in thread

* For review: open_by_name_at(2) man page [v2]
  2014-03-17 15:57 For review: open_by_name_at(2) man page Michael Kerrisk (man-pages)
  2014-03-17 22:00 ` NeilBrown
  2014-03-18  9:37 ` Christoph Hellwig
@ 2014-03-18 12:55 ` Michael Kerrisk (man-pages)
  2014-03-19  4:13     ` NeilBrown
  2014-03-19  6:42   ` For review: open_by_name_at(2) man page [v2] Mike Frysinger
  2 siblings, 2 replies; 22+ messages in thread
From: Michael Kerrisk (man-pages) @ 2014-03-18 12:55 UTC (permalink / raw)
  To: Aneesh Kumar K.V, linux-man, Linux-Fsdevel, lkml
  Cc: mtk.manpages, Andreas Dilger, NeilBrown, Christoph Hellwig

Hi Aneesh, (and others)

After integrating review comments from NeilBown and Christoph Hellwig,
here is draft 2 of a man page I've written for name_to_handle_at(2) and
open_by_name_at(2). Especially thanks to Neil's comments, several parts
of the page underwent a substantial rewrite. Would you be willing to 
review it please, and let me know of any corrections/improvements?

There are some FIXMEs in the page that I would especially like some
help with.

Thanks,

Michael


'\" t -*- coding: UTF-8 -*-
.\" Copyright (c) 2014 by Michael Kerrisk <mtk.manpages@gmail.com>
.\"
.\" %%%LICENSE_START(VERBATIM)
.\" Permission is granted to make and distribute verbatim copies of this
.\" manual provided the copyright notice and this permission notice are
.\" preserved on all copies.
.\"
.\" Permission is granted to copy and distribute modified versions of this
.\" manual under the conditions for verbatim copying, provided that the
.\" entire resulting derived work is distributed under the terms of a
.\" permission notice identical to this one.
.\"
.\" Since the Linux kernel and libraries are constantly changing, this
.\" manual page may be incorrect or out-of-date.  The author(s) assume no
.\" responsibility for errors or omissions, or for damages resulting from
.\" the use of the information contained herein.  The author(s) may not
.\" have taken the same level of care in the production of this manual,
.\" which is licensed free of charge, as they might when working
.\" professionally.
.\"
.\" Formatted or processed versions of this manual, if unaccompanied by
.\" the source, must acknowledge the copyright and authors of this work.
.\" %%%LICENSE_END
.\"
.TH OPEN_BY_HANDLE_AT 2 2014-03-24 "Linux" "Linux Programmer's Manual"
.SH NAME
name_to_handle_at, open_by_handle_at \- obtain handle
for a pathname and open file via a handle
.SH SYNOPSIS
.nf
.B #define _GNU_SOURCE
.B #include <sys/types.h>
.B #include <sys/stat.h>
.B #include <fcntl.h>

.BI "int name_to_handle_at(int " dirfd ", const char *" pathname ,
.BI "                      struct file_handle *" handle ,
.BI "                      int *" mount_id ", int " flags );

.BI "int open_by_handle_at(int " mount_fd ", struct file_handle *" handle ,
.BI "                      int " flags );
.fi
.SH DESCRIPTION
The
.BR name_to_handle_at ()
and
.BR open_by_handle_at ()
system calls split the functionality of
.BR openat (2)
into two parts:
.BR name_to_handle_at ()
returns an opaque handle that corresponds to a specified file;
.BR open_by_handle_at ()
opens the file corresponding to a handle returned by a previous call to
.BR name_to_handle_at ()
and returns an open file descriptor.
.\"
.\"
.SS name_to_handle_at()
The
.BR name_to_handle_at ()
system call returns a file handle and a mount ID corresponding to
the file specified by the
.IR dirfd
and
.IR pathname
arguments.
The file handle is returned via the argument
.IR handle ,
which is a pointer to a structure of the following form:

.in +4n
.nf
struct file_handle {
    unsigned int  handle_bytes;   /* Size of f_handle [in, out] */
    int           handle_type;    /* Handle type [out] */
    unsigned char f_handle[0];    /* File identifier (sized by
                                     caller) [out] */
};
.fi
.in
.PP
It is the caller's responsibility to allocate the structure
with a size large enough to hold the handle returned in
.IR f_handle .
Before the call, the
.IR handle_bytes
field should be initialized to contain the allocated size for
.IR f_handle .
(The constant
.BR MAX_HANDLE_SZ ,
defined in
.IR <fcntl.h> ,
specifies the maximum possible size for a file handle.)
Upon successful return, the
.IR handle_bytes
field is updated to contain the number of bytes actually written to
.IR f_handle .

The caller can discover the required size for the
.I file_handle
structure by making a call in which
.IR handle->handle_bytes
is zero;
in this case, the call fails with the error
.BR EOVERFLOW
and
.IR handle->handle_bytes
is set to indicate the required size;
the caller can then use this information to allocate a structure
of the correct size (see EXAMPLE below).

Other than the use of the
.IR handle_bytes
field, the caller should treat the
.IR file_handle
structure as an opaque data type: the
.IR handle_type
and
.IR f_handle
fields are needed only by a subsequent call to
.BR open_by_handle_at ().

Together, the
.I pathname
and
.I dirfd
arguments identify the file for which a handle is to obtained.
There are four distinct cases:
.IP * 3
If
.I pathname
is a nonempty string containing an absolute pathname,
then a handle is returned for the file referred to by that pathname.
In this case,
.IR dirfd
is ignored.
.IP *
If
.I pathname
is a nonempty string containing a relative pathname and
.IR dirfd
has the special value
.BR AT_FDCWD ,
then
.I pathname
is interpreted relative to the current working directory of the caller,
and a handle is returned for the file to which it refers.
.IP *
If
.I pathname
is a nonempty string containing a relative pathname and
.IR dirfd
is a file descriptor referring to a directory, then
.I pathname
is interpreted relative to the directory referred to by
.IR dirfd ,
and a handle is returned for the file to which it refers.
(See
.BR openat (3)
for an explanation of why "directory file descriptors" are useful.)
.IP *
If
.I pathname
is an empty string and
.I flags
specifies the value
.BR AT_EMPTY_PATH ,
then
.IR dirfd
can be an open file descriptor referring to any type of file,
or
.BR AT_FDCWD ,
meaning the current working directory,
and a handle is returned for the file to which it refers.
.PP
The
.I mount_id
argument returns an identifier for the filesystem
mount that corresponds to
.IR pathname .
This corresponds to the first field in one of the records in
.IR /proc/self/mountinfo .
Opening the pathname in the fifth field of that record yields a file
descriptor for the mount point;
that file descriptor can be used in a subsequent call to
.BR open_by_handle_at ().

The
.I flags
argument is a bit mask constructed by ORing together
zero or more of the following value:
.TP
.B AT_EMPTY_PATH
Allow
.I pathname
to be an empty string.
See above.
(which may have been obtained using the
.BR open (2)
.B O_PATH
flag).
.TP
.B AT_SYMLINK_FOLLOW
By default,
.BR name_to_handle_at ()
does not dereference
.I pathname
if it is a symbolic link.
The flag
.B AT_SYMLINK_FOLLOW
can be specified in
.I flags
to cause
.I pathname
to be dereferenced if it is a symbolic link.
.SS open_by_handle_at()
The
.BR open_by_handle_at ()
system call opens the file referred to by
.IR handle ,
a file handle returned by a previous call to
.BR name_to_handle_at ().

The
.IR mount_fd
argument is a file descriptor for any object (file, directory, etc.)
in the mounted filesystem with respect to which
.IR handle
should be interpreted.
The special value
.B AT_FDCWD
can be specified, meaning the current working directory of the caller.

The
.I flags
argument
is as for
.BR open (2).
.\" FIXME: Confirm that the following is intended behavior.
.\"        (It certainly seems to be the behavior, from experimenting.)
If
.I handle
refers to a symbolic link, the caller must specify the
.B O_PATH
flag, and the symbolic link is not dereferenced (the
.B O_NOFOLLOW
flag, if specified, is ignored).

The caller must have the
.B CAP_DAC_READ_SEARCH
capability to invoke
.BR open_by_handle_at ().
.SH RETURN VALUE
On success,
.BR name_to_handle_at ()
returns 0,
and
.BR open_by_handle_at ()
returns a nonnegative file descriptor.

In the event of an error, both system calls return \-1 and set
.I errno
to indicate the cause of the error.
.SH ERRORS
.BR name_to_handle_at ()
and
.BR open_by_handle_at ()
can fail for the same errors as
.BR openat (2).
In addition, they can fail with the errors noted below.

.BR name_to_handle_at ()
can fail with the following errors:
.TP
.B EINVAL
.I flags
includes an invalid bit value.
.TP
.B EINVAL
.IR handle_bytes\->handle_bytes
is greater than
.BR MAX_HANDLE_SZ .
.TP
.B ENOENT
.I pathname
is an empty string, but
.BR AT_EMPTY_PATH
was not specified in
.IR flags .
.TP
.B ENOTDIR
The file descriptor supplied in
.I dirfd
does not refer to a directory,
and it it is not the case that both
.I flags
includes
.BR AT_EMPTY_PATH
and
.I pathname
is an empty string.
.TP
.B EOPNOTSUPP
The filesystem does not support decoding of a pathname to a file handle.
.TP
.B EOVERFLOW
The
.I handle->handle_bytes
value passed into the call was too small.
When this error occurs,
.I handle->handle_bytes
is updated to indicate the required size for the handle.
.\"
.\"
.PP
.BR open_by_handle_at ()
can fail with the following errors:
.TP
.B EBADF
.IR mount_fd
is not an open file descriptor.
.TP
.B EINVAL
.I handle->handle_bytes
is greater than
.BR MAX_HANDLE_SZ
or is equal to zero.
.TP
.B ELOOP
.\" FIXME (see earlier FIXME). Is this the intended behavior?
.I handle
refers to a symbolic link, but
.B O_PATH
was not specified in
.IR flags .
.TP
.B EPERM
The caller does not have the
.BR CAP_DAC_READ_SEARCH
capability.
.TP
.B ESTALE
The specified
.I handle
is no longer valid.
.SH VERSIONS
These system calls first appeared in Linux 2.6.39.
.SH CONFORMING TO
These system calls are nonstandard Linux extensions.
.SH NOTES
A file handle can be generated in one process using
.BR name_to_handle_at ()
and later used in a different process that calls
.BR open_by_handle_at ().

Not all filesystem types support the translation of pathnames to
file handles.
.\" FIXME NeilBrown noted:
.\"    ESTALE is also returned if the filesystem does not support
.\"    file-handle -> file mappings.
.\"    On filesystems which don't provide export_operations (/sys /proc
.\"    ubifs romfs cramfs nfs coda ... several others) name_to_handle_at
.\"    will produce a generic handle using the 32 bit inode and 32 bit
.\"    i_generation. open_by_name_at given this (or any) filehandle
.\"    will fail with ESTALE.
.\" However, on /proc and /sys, at least, name_to_handle_at() fails with
.\" EOPNOTSUPP. Are there really filesystems that can deliver ESTALE (the
.\" same error as for an invalid file handle) in the above circumstances?

A file handle may become invalid ("stale") if a file is deleted,
or for other filesystem-specific reasons.
Invalid handles are notified by an
.B ESTALE
error from
.BR open_by_name_at ().

These system calls are designed for use by user-space file servers.
For example, a user-space NFS server might generate a file handle
and pass it to an NFS client.
Later, when the client wants to open the file,
it could pass the handle back to the server.
.\" https://lwn.net/Articles/375888/
.\"	"Open by handle" - Jonathan Corbet, 2010-02-23
This sort of functionality allows a user-space file server to operate in
a stateless fashion with respect to the files it serves.

If
.I pathname
refers to a symbolic link and
.IR flags
does not specify
.BR AT_SYMLINK_FOLLOW ,
then
.BR name_to_handle_at ()
returns a handle for the link (rather than the file to which it refers).
.\" commit bcda76524cd1fa32af748536f27f674a13e56700
The process receiving the handle can later perform operations
on the symbolic link by converting the handle to a file descriptor using
.BR open_by_handle_at ()
with the
.BR O_PATH
flag, and then passing the file descriptor as the
.IR dirfd
argument in system calls such as
.BR readlinkat (2)
and
.BR fchownat (2).
.SS Obtaining a persistent filesystem ID
The mount IDs in
.IR /proc/self/mountinfo
can be reused as filesystems are unmounted and mounted.
Therefore, the mount ID returned by
.BR name_to_handle_at (3)
(in
.IR *mount_id )
should not be treated as a persistent identifier
for the corresponding mounted filesystem.
However, an application can use the information in the
.I mountinfo
record that corresponds to the mount ID
to derive a persistent identifier.

For example, one can use the device name in the fifth field of the
.I mountinfo
record to search for the corresponding device UUID via the symbolic links in
.IR /dev/disks/by-uuid .
(A more comfortable way of obtaining the UUID is to use the
.\" e.g., http://stackoverflow.com/questions/6748429/using-libblkid-to-find-uuid-of-a-partition
.BR libblkid (3)
library.)
That process can then be reversed,
using the UUID to look up the device name,
and then obtaining the corresponding mount point,
in order to produce the
.IR mount_fd
argument used by
.BR open_by_name_at ().
.SH EXAMPLE
The two programs below demonstrate the use of
.BR name_to_handle_at ()
and
.BR open_by_handle_at ().
The first program
.RI ( t_name_to_handle_at.c )
uses
.BR name_to_handle_at ()
to obtain the file handle and mount ID
for the file specified in its command-line argument;
the handle and ID are written to standard output.

The second program
.RI ( t_open_by_handle_at.c )
reads a mount ID and file handle from standard input.
The program then employs
.BR open_by_handle_at ()
to open the file using that handle.
If an optional command-line argument is supplied, then the
.IR mount_fd
argument for
.BR open_by_handle_at ()
is obtained by opening the directory named in that argument.
Otherwise,
.IR mount_fd
is obtained by scanning
.IR /proc/self/mountinfo
to find a record whose mount ID matches the mount ID
read from standard input,
and the mount directory specified in that record is opened.
(These programs do not deal with the fact that mount IDs are not persistent.)

The following shell session demonstrates the use of these two programs:

.in +4n
.nf
$ \fBecho 'Kannst du bitte überlegen?' > cecilia.txt\fP
$ \fB./t_name_to_handle_at cecilia.txt > fh\fP
$ \fB./t_open_by_handle_at < fh\fP
open_by_handle_at: Operation not permitted
$ \fBsudo ./t_open_by_handle_at < fh\fP      # Need CAP_SYS_ADMIN
Read 28 bytes
$ \fBrm cecilia.txt\fP
.fi
.in

Now we delete and (quickly) re-create the file so that
it has the same content and (by chance) the same inode.
Nevertheless,
.BR open_by_handle_at ()
recognizes that the original file referred to by the file handle
no longer exists.

.in +4n
.nf
$ \fBstat \-\-printf="%i\\n" cecilia.txt\fP       # Display inode number 
4072121
$ \fBrm cecilia.txt\fP
$ \fBecho 'Kannst du bitte überlegen?' > cecilia.txt\fP
$ \fBstat \-\-printf="%i\\n" cecilia.txt\fP       # Check inode number
4072121
$ \fBsudo ./t_open_by_handle_at < fh\fP
open_by_handle_at: Stale NFS file handle
.fi
.in
.SS Program source: t_name_to_handle_at.c
\&
.nf
#define _GNU_SOURCE
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>
#include <string.h>

#define errExit(msg)    do { perror(msg); exit(EXIT_FAILURE); \\
                        } while (0)

int
main(int argc, char *argv[])
{
    struct file_handle *fhp;
    int mount_id, fhsize, s;

    if (argc < 2 || strcmp(argv[1], "\-\-help") == 0) {
        fprintf(stderr, "Usage: %s pathname\\n", argv[0]);
        exit(EXIT_FAILURE);
    }

    /* Allocate file_handle structure */

    fhsize = sizeof(struct file_handle *);
    fhp = malloc(fhsize);
    if (fhp == NULL)
        errExit("malloc");

    /* Make an initial call to name_to_handle_at() to discover
       the size required for file handle */

    fhp\->handle_bytes = 0;
    s = name_to_handle_at(AT_FDCWD, argv[1], fhp, &mount_id, 0);
    if (s != \-1 || errno != EOVERFLOW) {
        fprintf(stderr, "Unexpected result from name_to_handle_at()\\n");
        exit(EXIT_FAILURE);
    }

    /* Reallocate file_handle structure with correct size */

    fhsize = sizeof(struct file_handle) + fhp\->handle_bytes;
    fhp = realloc(fhp, fhsize);         /* Copies fhp\->handle_bytes */
    if (fhp == NULL)
        errExit("realloc");

    /* Get file handle from pathname supplied on command line */

    if (name_to_handle_at(AT_FDCWD, argv[1], fhp, &mount_id, 0) == \-1)
        errExit("name_to_handle_at");

    /* Write mount ID, file handle size, and file handle to stdout,
       for later reuse by t_open_by_handle_at.c */

    if (write(STDOUT_FILENO, &mount_id, sizeof(int)) != sizeof(int) ||
            write(STDOUT_FILENO, &fhsize, sizeof(int)) != sizeof(int) ||
            write(STDOUT_FILENO, fhp, fhsize) != fhsize) {
        fprintf(stderr, "Write failure\\n");
        exit(EXIT_FAILURE);
    }

    exit(EXIT_SUCCESS);
}
.fi
.SS Program source: t_open_by_handle_at.c
\&
.nf
#define _GNU_SOURCE
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>

#define errExit(msg)    do { perror(msg); exit(EXIT_FAILURE); \\
                        } while (0)

/* Scan /proc/self/mountinfo to find the line whose mount ID matches
   \(aqmount_id\(aq. (An easier way to do this is to install and use the
   \(aqlibmount\(aq library provided by the \(aqutil\-linux\(aq project.)
   Open the corresponding mount path and return the resulting file
   descriptor. */

static int
open_mount_path_by_id(int mount_id)
{
    char *linep;
    size_t lsize;
    char mount_path[PATH_MAX];
    int fmnt_id, fnd, nread;
    FILE *fp;

    fp  = fopen("/proc/self/mountinfo", "r");
    if (fp == NULL)
        errExit("fopen");

    for (fnd = 0; !fnd ; ) {
        linep = NULL;
        nread = getline(&linep, &lsize, fp);
        if (nread == \-1)
            break;

	nread = sscanf(linep, "%d %*d %*s %*s %s", &fmnt_id, mount_path);
        if (nread != 2) {
            fprintf(stderr, "Bad sscanf()\\n");
            exit(EXIT_FAILURE);
        }

        free(linep);

        if (fmnt_id == mount_id)
            fnd = 1;
    }

    fclose(fp);

    if (!fnd) {
        fprintf(stderr, "Could not find mount point\\n");
        exit(EXIT_FAILURE);
    }

    return open(mount_path, O_RDONLY | O_DIRECTORY);
}

int
main(int argc, char *argv[])
{
    struct file_handle *fhp;
    int mount_id, fd, mount_fd, fhsize;
    ssize_t nread;
#define BSIZE 1000
    char buf[BSIZE];

    if (argc > 1 && strcmp(argv[1], "\-\-help") == 0) {
        fprintf(stderr, "Usage: %s [mount\-dir]]\\n",
                argv[0]);
        exit(EXIT_FAILURE);
    }

    /* Read data produced by t_name_to_handle_at.c */

    if (read(STDIN_FILENO, &mount_id, sizeof(int)) != sizeof(int))
        errExit("read");

    if (read(STDIN_FILENO, &fhsize, sizeof(int)) != sizeof(int))
        errExit("read");

    fhp = malloc(fhsize);
    if (fhp == NULL)
        errExit("malloc");

    if (read(STDIN_FILENO, fhp, fhsize) != fhsize)
        errExit("read");

    /* Obtain file descriptor for mount point, either by opening
       the pathname specified on the command line, or by scanning
       /proc/self/mounts to find a mount that matches the \(aqmount_id\(aq
       obtained by name_to_handle_at() (in t_name_to_handle_at.c) */

    if (argc > 1)
        mount_fd = open(argv[1], O_RDONLY | O_DIRECTORY);
    else
        mount_fd = open_mount_path_by_id(mount_id);

    if (mount_fd == \-1)
        errExit("opening mount fd");

    /* Open name using handle and mount point */

    fd = open_by_handle_at(mount_fd, fhp, O_RDONLY);
    if (fd == \-1)
        errExit("open_by_handle_at");

    /* Try reading a few bytes from the file */

    nread = read(fd, buf, BSIZE);
    if (nread == \-1)
        errExit("read");
    printf("Read %ld bytes\\n", (long) nread);

    exit(EXIT_SUCCESS);
}
.fi
.SH SEE ALSO
.BR blkid (1),
.BR findfs (1),
.BR open (2),
.BR libblkid (3),
.BR mount (8)

The
.I libblkid
and
.I libmount
documentation under the latest
.I util-linux
release at
.UR https://www.kernel.org/pub/linux/utils/util-linux/
.UE



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: For review: open_by_name_at(2) man page
  2014-03-18 12:35     ` Michael Kerrisk (man-pages)
  (?)
@ 2014-03-18 13:07     ` Christoph Hellwig
  2014-03-18 13:30       ` Michael Kerrisk (man-pages)
  -1 siblings, 1 reply; 22+ messages in thread
From: Christoph Hellwig @ 2014-03-18 13:07 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages)
  Cc: NeilBrown, Aneesh Kumar K.V, linux-man, Linux-Fsdevel, lkml,
	Andreas Dilger

On Tue, Mar 18, 2014 at 01:35:06PM +0100, Michael Kerrisk (man-pages) wrote:
> Indeed! I don't know quite what I was smoking as I reviewed that piece.
> In fact, I started writing this page a long time ago, but then other 
> events intervened, and it was a long time before I came back to it recently.
> Certainly, when I produced that shell session log, things proceeded
> (almost) as shown. I'm guessing that what happened is that I by 
> accident edited out a line
> 
>     rm cecilia.txt
> 
> just before
> 
>     echo 'Warum?' > cecilia.txt
> 
> Fixed now. (In that case of course, it is of course a matter of chance
> whether the pathname is re-created with the same i-node number, but if 
> you are quick, it often is. I'll add some explanation to the page.)

That's why the file handles contain a generation counter that gets
incremented in this case.


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: For review: open_by_name_at(2) man page
  2014-03-18 13:07     ` Christoph Hellwig
@ 2014-03-18 13:30       ` Michael Kerrisk (man-pages)
  0 siblings, 0 replies; 22+ messages in thread
From: Michael Kerrisk (man-pages) @ 2014-03-18 13:30 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: mtk.manpages, NeilBrown, Aneesh Kumar K.V, linux-man,
	Linux-Fsdevel, lkml, Andreas Dilger

On 03/18/2014 02:07 PM, Christoph Hellwig wrote:
> That's why the file handles contain a generation counter that gets
> incremented in this case.

Ahh, yes. Thanks for the reminder/clue.

Cheers,

Michael

-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: For review: open_by_name_at(2) man page
@ 2014-03-18 22:24         ` NeilBrown
  0 siblings, 0 replies; 22+ messages in thread
From: NeilBrown @ 2014-03-18 22:24 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages)
  Cc: Christoph Hellwig, Aneesh Kumar K.V, linux-man, Linux-Fsdevel,
	lkml, Andreas Dilger

[-- Attachment #1: Type: text/plain, Size: 2560 bytes --]

On Tue, 18 Mar 2014 13:37:15 +0100 "Michael Kerrisk (man-pages)"
<mtk.manpages@gmail.com> wrote:

> On 03/18/2014 10:43 AM, Christoph Hellwig wrote:
> > On Tue, Mar 18, 2014 at 09:00:07AM +1100, NeilBrown wrote:
> >> ESTALE is also returned if the filesystem does not support file-handle ->
> >> file mappings.
> >> On filesystems which don't provide export_operations (/sys /proc ubifs
> >> romfs cramfs nfs coda ... several others) name_to_handle_at will produce a
> >> generic handle using the 32 bit inode and 32 bit i_generation.
> > 
> > Do we?  Seems like the code is erroring out early if there are no
> > export_ops?
> 
> It appears to me that Neil's statement isn't correct, at least for /proc
> and /sys (see my other mail, to Neil). I'm unsure about whether it is true
> for some of those other FSes thought.


Indeed, I was wrong.

I was looking at

int exportfs_encode_inode_fh(struct inode *inode, struct fid *fid,
			     int *max_len, struct inode *parent)
{
	const struct export_operations *nop = inode->i_sb->s_export_op;

	if (nop && nop->encode_fh)
		return nop->encode_fh(inode, fid->raw, max_len, parent);

	return export_encode_fh(inode, fid, max_len, parent);
}


which uses a default if there is no 'nop'.

However do_sys_name_to_handle() contains

	if (!path->dentry->d_sb->s_export_op ||
	    !path->dentry->d_sb->s_export_op->fh_to_dentry)
		return -EOPNOTSUPP;

long before export_encode_inode_fh() gets called.  So the default isn't used.

I would have thought that exportfs_encode_inode_fh would never get called if
there were no s_export_op pointer - certainly name_to_handle_at and nfsd
would never call it in that case.
However it seems that

    This routine will be used to generate a file handle in fdinfo output for
    inotify subsystem, where if no s_export_op present the general
    export_encode_fh should be used.  Thus add a test if s_export_op present
    inside exportfs_encode_fh itself.

according to

commit ab49bdecc3ebb46ab661f5f05d5c5ea9606406c6
Author: Cyrill Gorcunov <gorcunov@openvz.org>
Date:   Mon Dec 17 16:05:06 2012 -0800


I guess that means that you can extract filehandles from /proc/self/fdinfo/$FD
when $FD is an inotify fd which is watching the particular file.....  I
wouldn't have expected that, but maybe it is a good idea.

So yes: if the filesystem doesn't support filehandles you get EOPNOTSUPP.
So if you get ESTALE from open_by_handle_at(), then it really is a stale
handle.  Sorry for the confusion.

NeilBrown

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: For review: open_by_name_at(2) man page
@ 2014-03-18 22:24         ` NeilBrown
  0 siblings, 0 replies; 22+ messages in thread
From: NeilBrown @ 2014-03-18 22:24 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages)
  Cc: Christoph Hellwig, Aneesh Kumar K.V,
	linux-man-u79uwXL29TY76Z2rM5mHXA, Linux-Fsdevel, lkml,
	Andreas Dilger

[-- Attachment #1: Type: text/plain, Size: 2619 bytes --]

On Tue, 18 Mar 2014 13:37:15 +0100 "Michael Kerrisk (man-pages)"
<mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:

> On 03/18/2014 10:43 AM, Christoph Hellwig wrote:
> > On Tue, Mar 18, 2014 at 09:00:07AM +1100, NeilBrown wrote:
> >> ESTALE is also returned if the filesystem does not support file-handle ->
> >> file mappings.
> >> On filesystems which don't provide export_operations (/sys /proc ubifs
> >> romfs cramfs nfs coda ... several others) name_to_handle_at will produce a
> >> generic handle using the 32 bit inode and 32 bit i_generation.
> > 
> > Do we?  Seems like the code is erroring out early if there are no
> > export_ops?
> 
> It appears to me that Neil's statement isn't correct, at least for /proc
> and /sys (see my other mail, to Neil). I'm unsure about whether it is true
> for some of those other FSes thought.


Indeed, I was wrong.

I was looking at

int exportfs_encode_inode_fh(struct inode *inode, struct fid *fid,
			     int *max_len, struct inode *parent)
{
	const struct export_operations *nop = inode->i_sb->s_export_op;

	if (nop && nop->encode_fh)
		return nop->encode_fh(inode, fid->raw, max_len, parent);

	return export_encode_fh(inode, fid, max_len, parent);
}


which uses a default if there is no 'nop'.

However do_sys_name_to_handle() contains

	if (!path->dentry->d_sb->s_export_op ||
	    !path->dentry->d_sb->s_export_op->fh_to_dentry)
		return -EOPNOTSUPP;

long before export_encode_inode_fh() gets called.  So the default isn't used.

I would have thought that exportfs_encode_inode_fh would never get called if
there were no s_export_op pointer - certainly name_to_handle_at and nfsd
would never call it in that case.
However it seems that

    This routine will be used to generate a file handle in fdinfo output for
    inotify subsystem, where if no s_export_op present the general
    export_encode_fh should be used.  Thus add a test if s_export_op present
    inside exportfs_encode_fh itself.

according to

commit ab49bdecc3ebb46ab661f5f05d5c5ea9606406c6
Author: Cyrill Gorcunov <gorcunov-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org>
Date:   Mon Dec 17 16:05:06 2012 -0800


I guess that means that you can extract filehandles from /proc/self/fdinfo/$FD
when $FD is an inotify fd which is watching the particular file.....  I
wouldn't have expected that, but maybe it is a good idea.

So yes: if the filesystem doesn't support filehandles you get EOPNOTSUPP.
So if you get ESTALE from open_by_handle_at(), then it really is a stale
handle.  Sorry for the confusion.

NeilBrown

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: For review: open_by_name_at(2) man page [v2]
@ 2014-03-19  4:13     ` NeilBrown
  0 siblings, 0 replies; 22+ messages in thread
From: NeilBrown @ 2014-03-19  4:13 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages)
  Cc: Aneesh Kumar K.V, linux-man, Linux-Fsdevel, lkml, Andreas Dilger,
	Christoph Hellwig

[-- Attachment #1: Type: text/plain, Size: 3075 bytes --]

On Tue, 18 Mar 2014 13:55:15 +0100 "Michael Kerrisk (man-pages)"
<mtk.manpages@gmail.com> wrote:

> Hi Aneesh, (and others)
> 
> After integrating review comments from NeilBown and Christoph Hellwig,
> here is draft 2 of a man page I've written for name_to_handle_at(2) and
> open_by_name_at(2). Especially thanks to Neil's comments, several parts
> of the page underwent a substantial rewrite. Would you be willing to 
> review it please, and let me know of any corrections/improvements?

I didn't notice before but above and in $SUBJ I see "open_by_name_at", which
is fictitious :-)

> 
> Together, the
> .I pathname
> and
> .I dirfd
> arguments identify the file for which a handle is to obtained.
                                                      ^be


> 
> The
> .I flags
> argument is a bit mask constructed by ORing together
> zero or more of the following value:
                                     ^s


> .TP
> .B AT_EMPTY_PATH
> Allow
> .I pathname
> to be an empty string.
> See above.
> (which may have been obtained using the
> .BR open (2)
> .B O_PATH
> flag).

What "may have been obtained" ??


> The
> .I flags
> argument
> is as for
> .BR open (2).
> .\" FIXME: Confirm that the following is intended behavior.
> .\"        (It certainly seems to be the behavior, from experimenting.)
> If
> .I handle
> refers to a symbolic link, the caller must specify the
> .B O_PATH
> flag, and the symbolic link is not dereferenced (the
> .B O_NOFOLLOW
> flag, if specified, is ignored).

It certainly sounds like reasonable behaviour.  I cannot comment on intention
though.
Are you bothered that O_PATH is needed for symlinks?  An fd on a symlink is a
sufficiently unusual thing that it seems reasonable for a programmer to
explicitly say they are expecting one.


> 
> In the event of an error, both system calls return \-1 and set
> .I errno
> to indicate the cause of the error.
> .SH ERRORS
> .BR name_to_handle_at ()
> and
> .BR open_by_handle_at ()
> can fail for the same errors as
> .BR openat (2).
> In addition, they can fail with the errors noted below.

Should you mention EFAULT if mount_id or handle are not valid pointers?


> 
> Not all filesystem types support the translation of pathnames to
> file handles.
> .\" FIXME NeilBrown noted:
> .\"    ESTALE is also returned if the filesystem does not support
> .\"    file-handle -> file mappings.
> .\"    On filesystems which don't provide export_operations (/sys /proc
> .\"    ubifs romfs cramfs nfs coda ... several others) name_to_handle_at
> .\"    will produce a generic handle using the 32 bit inode and 32 bit
> .\"    i_generation. open_by_name_at given this (or any) filehandle
> .\"    will fail with ESTALE.
> .\" However, on /proc and /sys, at least, name_to_handle_at() fails with
> .\" EOPNOTSUPP. Are there really filesystems that can deliver ESTALE (the
> .\" same error as for an invalid file handle) in the above circumstances?

This is all wrong - discard it :-)

NeilBrown


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: For review: open_by_name_at(2) man page [v2]
@ 2014-03-19  4:13     ` NeilBrown
  0 siblings, 0 replies; 22+ messages in thread
From: NeilBrown @ 2014-03-19  4:13 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages)
  Cc: Aneesh Kumar K.V, linux-man-u79uwXL29TY76Z2rM5mHXA,
	Linux-Fsdevel, lkml, Andreas Dilger, Christoph Hellwig

[-- Attachment #1: Type: text/plain, Size: 3105 bytes --]

On Tue, 18 Mar 2014 13:55:15 +0100 "Michael Kerrisk (man-pages)"
<mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:

> Hi Aneesh, (and others)
> 
> After integrating review comments from NeilBown and Christoph Hellwig,
> here is draft 2 of a man page I've written for name_to_handle_at(2) and
> open_by_name_at(2). Especially thanks to Neil's comments, several parts
> of the page underwent a substantial rewrite. Would you be willing to 
> review it please, and let me know of any corrections/improvements?

I didn't notice before but above and in $SUBJ I see "open_by_name_at", which
is fictitious :-)

> 
> Together, the
> .I pathname
> and
> .I dirfd
> arguments identify the file for which a handle is to obtained.
                                                      ^be


> 
> The
> .I flags
> argument is a bit mask constructed by ORing together
> zero or more of the following value:
                                     ^s


> .TP
> .B AT_EMPTY_PATH
> Allow
> .I pathname
> to be an empty string.
> See above.
> (which may have been obtained using the
> .BR open (2)
> .B O_PATH
> flag).

What "may have been obtained" ??


> The
> .I flags
> argument
> is as for
> .BR open (2).
> .\" FIXME: Confirm that the following is intended behavior.
> .\"        (It certainly seems to be the behavior, from experimenting.)
> If
> .I handle
> refers to a symbolic link, the caller must specify the
> .B O_PATH
> flag, and the symbolic link is not dereferenced (the
> .B O_NOFOLLOW
> flag, if specified, is ignored).

It certainly sounds like reasonable behaviour.  I cannot comment on intention
though.
Are you bothered that O_PATH is needed for symlinks?  An fd on a symlink is a
sufficiently unusual thing that it seems reasonable for a programmer to
explicitly say they are expecting one.


> 
> In the event of an error, both system calls return \-1 and set
> .I errno
> to indicate the cause of the error.
> .SH ERRORS
> .BR name_to_handle_at ()
> and
> .BR open_by_handle_at ()
> can fail for the same errors as
> .BR openat (2).
> In addition, they can fail with the errors noted below.

Should you mention EFAULT if mount_id or handle are not valid pointers?


> 
> Not all filesystem types support the translation of pathnames to
> file handles.
> .\" FIXME NeilBrown noted:
> .\"    ESTALE is also returned if the filesystem does not support
> .\"    file-handle -> file mappings.
> .\"    On filesystems which don't provide export_operations (/sys /proc
> .\"    ubifs romfs cramfs nfs coda ... several others) name_to_handle_at
> .\"    will produce a generic handle using the 32 bit inode and 32 bit
> .\"    i_generation. open_by_name_at given this (or any) filehandle
> .\"    will fail with ESTALE.
> .\" However, on /proc and /sys, at least, name_to_handle_at() fails with
> .\" EOPNOTSUPP. Are there really filesystems that can deliver ESTALE (the
> .\" same error as for an invalid file handle) in the above circumstances?

This is all wrong - discard it :-)

NeilBrown


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: For review: open_by_name_at(2) man page [v2]
  2014-03-18 12:55 ` For review: open_by_name_at(2) man page [v2] Michael Kerrisk (man-pages)
  2014-03-19  4:13     ` NeilBrown
@ 2014-03-19  6:42   ` Mike Frysinger
  2014-03-19 13:11     ` Michael Kerrisk (man-pages)
  1 sibling, 1 reply; 22+ messages in thread
From: Mike Frysinger @ 2014-03-19  6:42 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages)
  Cc: Aneesh Kumar K.V, linux-man, Linux-Fsdevel, lkml, Andreas Dilger,
	NeilBrown, Christoph Hellwig

[-- Attachment #1: Type: text/plain, Size: 5942 bytes --]

On Tue 18 Mar 2014 13:55:15 Michael Kerrisk wrote:
> The
> .I flags
> argument is a bit mask constructed by ORing together
> zero or more of the following value:
> .TP
> .B AT_EMPTY_PATH
> Allow
> .I pathname
> to be an empty string.
> See above.
> (which may have been obtained using the
> .BR open (2)
> .B O_PATH
> flag).
> .TP
> .B AT_SYMLINK_FOLLOW
> By default,
> .BR name_to_handle_at ()
> does not dereference
> .I pathname
> if it is a symbolic link.
> The flag
> .B AT_SYMLINK_FOLLOW
> can be specified in
> .I flags
> to cause
> .I pathname
> to be dereferenced if it is a symbolic link.

this section is only talking about |flags|, and further this part is only 
talking about AT_SYMLINK_FOLLOW.  so this last sentence sounds super 
redundant.

how about reversing the sentence order so that both are implicit like is done 
in the openat() page and the description of O_NOFOLLOW ?

> .B ENOTDIR
> The file descriptor supplied in
> .I dirfd
> does not refer to a directory,
> and it it is not the case that both

"it" is duplicated

> .SS Obtaining a persistent filesystem ID
> The mount IDs in
> .IR /proc/self/mountinfo
> can be reused as filesystems are unmounted and mounted.
> Therefore, the mount ID returned by
> .BR name_to_handle_at (3)

should be () and not (3)

side note: this seems like an easy error to script for ...

> $ \fBecho 'Kannst du bitte überlegen?' > cecilia.txt\fP

aber, ich spreche kein Deutsch :(

do we have a standard about sticking to english ?  i wonder if people are more 
likely to be confused or to appreciate it ...

> #define errExit(msg)    do { perror(msg); exit(EXIT_FAILURE); \\
>                         } while (0)

i wonder if err.h makes sense now that this is a man page for completely 
linux-specific syscalls :).  and you use _GNU_SOURCE.

> int
> main(int argc, char *argv[])
> {
>     struct file_handle *fhp;
>     int mount_id, fhsize, s;
> 
>     if (argc < 2 || strcmp(argv[1], "\-\-help") == 0) {

argc != 2 ?

>     /* Allocate file_handle structure */
> 
>     fhsize = sizeof(struct file_handle *);

pretty sure this is wrong as sizeof() here returns the size of a pointer, not 
the size of the struct.  it's why i prefer the form:

	fhsize = sizeof(*fhp);

less typing and harder to screw up by accident.

granted, the case below won't crash since the kernel only reads/writes 
sizeof(unsigned int) and i'm not aware of any system where that is larger than 
sizeof(void *), but it's still wrong :).

>     s = name_to_handle_at(AT_FDCWD, argv[1], fhp, &mount_id, 0);

another personal style: create dedicated variables for each arg you unpack out 
of argv[1].  it's generally OK when you only take one arg, but when you get 
more than one, you end up flipping back and forth between the usage trying to 
figure out what index 1 represents instead of focusing on what the code is 
doing.
	const char *pathname = argv[1];

>     fhsize = sizeof(struct file_handle) + fhp\->handle_bytes;

fhsize += fhp->handle_bytes ?

it's the same, but i think nicer ;)

>     /* Write mount ID, file handle size, and file handle to stdout,
>        for later reuse by t_open_by_handle_at.c */
> 
>     if (write(STDOUT_FILENO, &mount_id, sizeof(int)) != sizeof(int) ||
>             write(STDOUT_FILENO, &fhsize, sizeof(int)) != sizeof(int) ||
>             write(STDOUT_FILENO, fhp, fhsize) != fhsize) {

seems like a whole lot of code spew for a simple printf() ?  you'd have to 
adjust the other program to use scanf(), but seems like the end result would 
be nicer for users to experiment with ?

> static int
> open_mount_path_by_id(int mount_id)
> {
>     char *linep;
>     size_t lsize;
>     char mount_path[PATH_MAX];
>     int fmnt_id, fnd, nread;

could we buy a few more letters for these vars ?  i guess fmnt_id is the 
filesystem mount id, and fnd is "find".

also, getline() returns a ssize_t, not an int.

>     FILE *fp;
> 
>     fp  = fopen("/proc/self/mountinfo", "r");

only one space before the =

i would encourage using the "e" flag whenever possible in the hopes that 
someone might start using it in their own code base.

	fp = fopen("/proc/self/mountinfo", "re");

>     for (fnd = 0; !fnd ; ) {

in my experience, seems like a while() loop makes more sense when you're 
implementing a while() loop ...
	fnd = 0;
	while (!fnd) {

>         linep = NULL;
>         nread = getline(&linep, &lsize, fp);

this works, but it's unusual when using getline() as it kind of defeats the 
purpose of using the dyn allocation feature.

	fnd = 0;
	linep = NULL;
	while (!fnd) {
		nread = getline(&linep, &lsize, fp);
		...
	}
	free(linep);

i don't think it complicates the code much more ?

>         if (nread == \-1)
>             break;
> 
> 	nread = sscanf(linep, "%d %*d %*s %*s %s", &fmnt_id, mount_path);

indent is off here

>     return open(mount_path, O_RDONLY | O_DIRECTORY);

O_CLOEXEC for funsies ?

> int
> main(int argc, char *argv[])
> {
>     struct file_handle *fhp;
>     int mount_id, fd, mount_fd, fhsize;
>     ssize_t nread;
> #define BSIZE 1000
>     char buf[BSIZE];

why not sizeof(buf) and avoid the define ?

>     if (argc > 1 && strcmp(argv[1], "\-\-help") == 0) {
>         fprintf(stderr, "Usage: %s [mount\-dir]]\\n",
>                 argv[0]);

how about also aborting when argc > 2 ?

>     if (argc > 1)
>         mount_fd = open(argv[1], O_RDONLY | O_DIRECTORY);

O_CLOEXEC ?

>     nread = read(fd, buf, BSIZE);
>     if (nread == \-1)
>         errExit("read");
>     printf("Read %ld bytes\\n", (long) nread);

yikes, that's a bad habit to encourage.  read() returns a ssize_t, so print it 
out using %zd.

> .SH SEE ALSO
> .BR blkid (1),
> .BR findfs (1),

i don't have a findfs(1).  i do have a findfs(8) ...
-mike

[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: For review: open_by_name_at(2) man page
  2014-03-18 22:24         ` NeilBrown
  (?)
@ 2014-03-19  9:09         ` Michael Kerrisk (man-pages)
  -1 siblings, 0 replies; 22+ messages in thread
From: Michael Kerrisk (man-pages) @ 2014-03-19  9:09 UTC (permalink / raw)
  To: NeilBrown
  Cc: mtk.manpages, Christoph Hellwig, Aneesh Kumar K.V, linux-man,
	Linux-Fsdevel, lkml, Andreas Dilger

Hi Neil,

On 03/18/2014 11:24 PM, NeilBrown wrote:
> On Tue, 18 Mar 2014 13:37:15 +0100 "Michael Kerrisk (man-pages)"
> <mtk.manpages@gmail.com> wrote:
> 
>> On 03/18/2014 10:43 AM, Christoph Hellwig wrote:
>>> On Tue, Mar 18, 2014 at 09:00:07AM +1100, NeilBrown wrote:
>>>> ESTALE is also returned if the filesystem does not support file-handle ->
>>>> file mappings.
>>>> On filesystems which don't provide export_operations (/sys /proc ubifs
>>>> romfs cramfs nfs coda ... several others) name_to_handle_at will produce a
>>>> generic handle using the 32 bit inode and 32 bit i_generation.
>>>
>>> Do we?  Seems like the code is erroring out early if there are no
>>> export_ops?
>>
>> It appears to me that Neil's statement isn't correct, at least for /proc
>> and /sys (see my other mail, to Neil). I'm unsure about whether it is true
>> for some of those other FSes thought.
> 
> 
> Indeed, I was wrong.
> 
> I was looking at
> 
> int exportfs_encode_inode_fh(struct inode *inode, struct fid *fid,
> 			     int *max_len, struct inode *parent)
> {
> 	const struct export_operations *nop = inode->i_sb->s_export_op;
> 
> 	if (nop && nop->encode_fh)
> 		return nop->encode_fh(inode, fid->raw, max_len, parent);
> 
> 	return export_encode_fh(inode, fid, max_len, parent);
> }
> 
> 
> which uses a default if there is no 'nop'.
> 
> However do_sys_name_to_handle() contains
> 
> 	if (!path->dentry->d_sb->s_export_op ||
> 	    !path->dentry->d_sb->s_export_op->fh_to_dentry)
> 		return -EOPNOTSUPP;
> 
> long before export_encode_inode_fh() gets called.  So the default isn't used.

Okay.

> I would have thought that exportfs_encode_inode_fh would never get called if
> there were no s_export_op pointer - certainly name_to_handle_at and nfsd
> would never call it in that case.
> However it seems that
> 
>     This routine will be used to generate a file handle in fdinfo output for
>     inotify subsystem, where if no s_export_op present the general
>     export_encode_fh should be used.  Thus add a test if s_export_op present
>     inside exportfs_encode_fh itself.
> 
> according to
> 
> commit ab49bdecc3ebb46ab661f5f05d5c5ea9606406c6
> Author: Cyrill Gorcunov <gorcunov@openvz.org>
> Date:   Mon Dec 17 16:05:06 2012 -0800
> 
> 
> I guess that means that you can extract filehandles from /proc/self/fdinfo/$FD
> when $FD is an inotify fd which is watching the particular file.....  I
> wouldn't have expected that, but maybe it is a good idea.

Yes, it does--I tested it, and it works! I was unaware of this feature,
though I'm not sure that I'll add anything to a man page just yet.

> So yes: if the filesystem doesn't support filehandles you get EOPNOTSUPP.
> So if you get ESTALE from open_by_handle_at(), then it really is a stale
> handle.  Sorry for the confusion.

Yup, I've updated the page now.

Cheers,

Michael


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: For review: open_by_name_at(2) man page [v2]
@ 2014-03-19  9:09       ` Michael Kerrisk (man-pages)
  0 siblings, 0 replies; 22+ messages in thread
From: Michael Kerrisk (man-pages) @ 2014-03-19  9:09 UTC (permalink / raw)
  To: NeilBrown
  Cc: mtk.manpages, Aneesh Kumar K.V, linux-man, Linux-Fsdevel, lkml,
	Andreas Dilger, Christoph Hellwig, Al Viro

[CC =+ Al Viro]

Hi Neil,

On 03/19/2014 05:13 AM, NeilBrown wrote:
> On Tue, 18 Mar 2014 13:55:15 +0100 "Michael Kerrisk (man-pages)"
> <mtk.manpages@gmail.com> wrote:
> 
>> Hi Aneesh, (and others)
>>
>> After integrating review comments from NeilBown and Christoph Hellwig,
>> here is draft 2 of a man page I've written for name_to_handle_at(2) and
>> open_by_name_at(2). Especially thanks to Neil's comments, several parts
>> of the page underwent a substantial rewrite. Would you be willing to 
>> review it please, and let me know of any corrections/improvements?

[various typos you reported fixed now.]

>> .TP
>> .B AT_EMPTY_PATH
>> Allow
>> .I pathname
>> to be an empty string.
>> See above.
>> (which may have been obtained using the
>> .BR open (2)
>> .B O_PATH
>> flag).
> 
> What "may have been obtained" ??

Crufty text. gone now.

>> The
>> .I flags
>> argument
>> is as for
>> .BR open (2).
>> .\" FIXME: Confirm that the following is intended behavior.
>> .\"        (It certainly seems to be the behavior, from experimenting.)
>> If
>> .I handle
>> refers to a symbolic link, the caller must specify the
>> .B O_PATH
>> flag, and the symbolic link is not dereferenced (the
>> .B O_NOFOLLOW
>> flag, if specified, is ignored).
> 
> It certainly sounds like reasonable behaviour.  I cannot comment on intention
> though.
> Are you bothered that O_PATH is needed for symlinks?  

No.

> An fd on a symlink is a
> sufficiently unusual thing that it seems reasonable for a programmer to
> explicitly say they are expecting one.

I think the point is this: If you have a file handle for a symlink, then
you can't follow the symlink, which is why you must specify O_PATH and
O_NOFOLLOW becomes irrelevant. I'm curious about the rationale though.
I suspect it's something like: the process receiving the handle doesn't
have enough information for the symlink to be interpreted, I think because
it can;t reliably determine what directory the link lives in.
Possibly Al Viro or Aneesh can confirm.

>> In the event of an error, both system calls return \-1 and set
>> .I errno
>> to indicate the cause of the error.
>> .SH ERRORS
>> .BR name_to_handle_at ()
>> and
>> .BR open_by_handle_at ()
>> can fail for the same errors as
>> .BR openat (2).
>> In addition, they can fail with the errors noted below.
> 
> Should you mention EFAULT if mount_id or handle are not valid pointers?

Done.

>> Not all filesystem types support the translation of pathnames to
>> file handles.
>> .\" FIXME NeilBrown noted:
>> .\"    ESTALE is also returned if the filesystem does not support
>> .\"    file-handle -> file mappings.
>> .\"    On filesystems which don't provide export_operations (/sys /proc
>> .\"    ubifs romfs cramfs nfs coda ... several others) name_to_handle_at
>> .\"    will produce a generic handle using the 32 bit inode and 32 bit
>> .\"    i_generation. open_by_name_at given this (or any) filehandle
>> .\"    will fail with ESTALE.
>> .\" However, on /proc and /sys, at least, name_to_handle_at() fails with
>> .\" EOPNOTSUPP. Are there really filesystems that can deliver ESTALE (the
>> .\" same error as for an invalid file handle) in the above circumstances?
> 
> This is all wrong - discard it :-)

Yup. Gone now ;-).

Cheers,

Michael


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: For review: open_by_name_at(2) man page [v2]
@ 2014-03-19  9:09       ` Michael Kerrisk (man-pages)
  0 siblings, 0 replies; 22+ messages in thread
From: Michael Kerrisk (man-pages) @ 2014-03-19  9:09 UTC (permalink / raw)
  To: NeilBrown
  Cc: mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w, Aneesh Kumar K.V,
	linux-man-u79uwXL29TY76Z2rM5mHXA, Linux-Fsdevel, lkml,
	Andreas Dilger, Christoph Hellwig, Al Viro

[CC =+ Al Viro]

Hi Neil,

On 03/19/2014 05:13 AM, NeilBrown wrote:
> On Tue, 18 Mar 2014 13:55:15 +0100 "Michael Kerrisk (man-pages)"
> <mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> 
>> Hi Aneesh, (and others)
>>
>> After integrating review comments from NeilBown and Christoph Hellwig,
>> here is draft 2 of a man page I've written for name_to_handle_at(2) and
>> open_by_name_at(2). Especially thanks to Neil's comments, several parts
>> of the page underwent a substantial rewrite. Would you be willing to 
>> review it please, and let me know of any corrections/improvements?

[various typos you reported fixed now.]

>> .TP
>> .B AT_EMPTY_PATH
>> Allow
>> .I pathname
>> to be an empty string.
>> See above.
>> (which may have been obtained using the
>> .BR open (2)
>> .B O_PATH
>> flag).
> 
> What "may have been obtained" ??

Crufty text. gone now.

>> The
>> .I flags
>> argument
>> is as for
>> .BR open (2).
>> .\" FIXME: Confirm that the following is intended behavior.
>> .\"        (It certainly seems to be the behavior, from experimenting.)
>> If
>> .I handle
>> refers to a symbolic link, the caller must specify the
>> .B O_PATH
>> flag, and the symbolic link is not dereferenced (the
>> .B O_NOFOLLOW
>> flag, if specified, is ignored).
> 
> It certainly sounds like reasonable behaviour.  I cannot comment on intention
> though.
> Are you bothered that O_PATH is needed for symlinks?  

No.

> An fd on a symlink is a
> sufficiently unusual thing that it seems reasonable for a programmer to
> explicitly say they are expecting one.

I think the point is this: If you have a file handle for a symlink, then
you can't follow the symlink, which is why you must specify O_PATH and
O_NOFOLLOW becomes irrelevant. I'm curious about the rationale though.
I suspect it's something like: the process receiving the handle doesn't
have enough information for the symlink to be interpreted, I think because
it can;t reliably determine what directory the link lives in.
Possibly Al Viro or Aneesh can confirm.

>> In the event of an error, both system calls return \-1 and set
>> .I errno
>> to indicate the cause of the error.
>> .SH ERRORS
>> .BR name_to_handle_at ()
>> and
>> .BR open_by_handle_at ()
>> can fail for the same errors as
>> .BR openat (2).
>> In addition, they can fail with the errors noted below.
> 
> Should you mention EFAULT if mount_id or handle are not valid pointers?

Done.

>> Not all filesystem types support the translation of pathnames to
>> file handles.
>> .\" FIXME NeilBrown noted:
>> .\"    ESTALE is also returned if the filesystem does not support
>> .\"    file-handle -> file mappings.
>> .\"    On filesystems which don't provide export_operations (/sys /proc
>> .\"    ubifs romfs cramfs nfs coda ... several others) name_to_handle_at
>> .\"    will produce a generic handle using the 32 bit inode and 32 bit
>> .\"    i_generation. open_by_name_at given this (or any) filehandle
>> .\"    will fail with ESTALE.
>> .\" However, on /proc and /sys, at least, name_to_handle_at() fails with
>> .\" EOPNOTSUPP. Are there really filesystems that can deliver ESTALE (the
>> .\" same error as for an invalid file handle) in the above circumstances?
> 
> This is all wrong - discard it :-)

Yup. Gone now ;-).

Cheers,

Michael


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/
--
To unsubscribe from this list: send the line "unsubscribe linux-man" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: For review: open_by_name_at(2) man page [v2]
  2014-03-19  6:42   ` For review: open_by_name_at(2) man page [v2] Mike Frysinger
@ 2014-03-19 13:11     ` Michael Kerrisk (man-pages)
  0 siblings, 0 replies; 22+ messages in thread
From: Michael Kerrisk (man-pages) @ 2014-03-19 13:11 UTC (permalink / raw)
  To: Mike Frysinger
  Cc: mtk.manpages, Aneesh Kumar K.V, linux-man, Linux-Fsdevel, lkml,
	Andreas Dilger, NeilBrown, Christoph Hellwig

Hi Mike,

On 03/19/2014 07:42 AM, Mike Frysinger wrote:
> On Tue 18 Mar 2014 13:55:15 Michael Kerrisk wrote:
>> The
>> .I flags
>> argument is a bit mask constructed by ORing together
>> zero or more of the following value:
>> .TP
>> .B AT_EMPTY_PATH
>> Allow
>> .I pathname
>> to be an empty string.
>> See above.
>> (which may have been obtained using the
>> .BR open (2)
>> .B O_PATH
>> flag).
>> .TP
>> .B AT_SYMLINK_FOLLOW
>> By default,
>> .BR name_to_handle_at ()
>> does not dereference
>> .I pathname
>> if it is a symbolic link.
>> The flag
>> .B AT_SYMLINK_FOLLOW
>> can be specified in
>> .I flags
>> to cause
>> .I pathname
>> to be dereferenced if it is a symbolic link.
> 
> this section is only talking about |flags|, and further this part is only 
> talking about AT_SYMLINK_FOLLOW.  so this last sentence sounds super 
> redundant.
> how about reversing the sentence order so that both are implicit like is done 
> in the openat() page and the description of O_NOFOLLOW ?

I'm not sure that I completely understand you here, but I agree that this could 
be better. I've rewritten somewhat.

>> .B ENOTDIR
>> The file descriptor supplied in
>> .I dirfd
>> does not refer to a directory,
>> and it it is not the case that both
> 
> "it" is duplicated

Fixed.


>> .SS Obtaining a persistent filesystem ID
>> The mount IDs in
>> .IR /proc/self/mountinfo
>> can be reused as filesystems are unmounted and mounted.
>> Therefore, the mount ID returned by
>> .BR name_to_handle_at (3)
> 
> should be () and not (3)

Fixed.

> side note: this seems like an easy error to script for ...

Yep, I've got some scripts that I run manually now and then to 
check for this sort of stuff.

>> $ \fBecho 'Kannst du bitte überlegen?' > cecilia.txt\fP
> 
> aber, ich spreche kein Deutsch :(
> 
> do we have a standard about sticking to english ?  i wonder if people are more 
> likely to be confused or to appreciate it ...

Fair enough. I'm too influenced by recent work on the locale pages (and 
family conversations ;-)). I'll switch it to English

>> #define errExit(msg)    do { perror(msg); exit(EXIT_FAILURE); \\
>>                         } while (0)
> 
> i wonder if err.h makes sense now that this is a man page for completely 
> linux-specific syscalls :).  and you use _GNU_SOURCE.

I'm not really convinced about using these functions, but I'll reflect 
on it more.

>> int
>> main(int argc, char *argv[])
>> {
>>     struct file_handle *fhp;
>>     int mount_id, fhsize, s;
>>
>>     if (argc < 2 || strcmp(argv[1], "\-\-help") == 0) {
> 
> argc != 2 ?

Yes, some cruft crept in.

>>     /* Allocate file_handle structure */
>>
>>     fhsize = sizeof(struct file_handle *);
> 
> pretty sure this is wrong 

<blush>

> as sizeof() here returns the size of a pointer, not 
> the size of the struct.  it's why i prefer the form:
> 
> 	fhsize = sizeof(*fhp);
> 
> less typing and harder to screw up by accident.

Yep, changed.

> granted, the case below won't crash since the kernel only reads/writes 
> sizeof(unsigned int) and i'm not aware of any system where that is larger than 
> sizeof(void *), but it's still wrong :).
> 
>>     s = name_to_handle_at(AT_FDCWD, argv[1], fhp, &mount_id, 0);
> 
> another personal style: create dedicated variables for each arg you unpack out 
> of argv[1].  it's generally OK when you only take one arg, but when you get 
> more than one, you end up flipping back and forth between the usage trying to 
> figure out what index 1 represents instead of focusing on what the code is 
> doing.
> 	const char *pathname = argv[1];

Yup.

>>     fhsize = sizeof(struct file_handle) + fhp\->handle_bytes;
> 
> fhsize += fhp->handle_bytes ?
> 
> it's the same, but i think nicer ;)

Depends on your perspective. It relies on no one changing the code
so that fhsize is modified after the earlier initialization. And
also, with this line, I see exactly what is going on, in one place.
I'll leave as is.

>>     /* Write mount ID, file handle size, and file handle to stdout,
>>        for later reuse by t_open_by_handle_at.c */
>>
>>     if (write(STDOUT_FILENO, &mount_id, sizeof(int)) != sizeof(int) ||
>>             write(STDOUT_FILENO, &fhsize, sizeof(int)) != sizeof(int) ||
>>             write(STDOUT_FILENO, fhp, fhsize) != fhsize) {
> 
> seems like a whole lot of code spew for a simple printf() ?  you'd have to 
> adjust the other program to use scanf(), but seems like the end result would 
> be nicer for users to experiment with ?

Yes. I'd already reflected on exactly that and made a change to using text 
formats.

>> static int
>> open_mount_path_by_id(int mount_id)
>> {
>>     char *linep;
>>     size_t lsize;
>>     char mount_path[PATH_MAX];
>>     int fmnt_id, fnd, nread;
> 
> could we buy a few more letters for these vars ?  i guess fmnt_id is the 
> filesystem mount id, and fnd is "find".

When I was a kid, you had to pay a dollar for each letter...
(I've made a few changes.)

> also, getline() returns a ssize_t, not an int.

Fixed.

>>     FILE *fp;
>>
>>     fp  = fopen("/proc/self/mountinfo", "r");
> 
> only one space before the =

Yup.

> i would encourage using the "e" flag whenever possible in the hopes that 
> someone might start using it in their own code base.
> 
> 	fp = fopen("/proc/self/mountinfo", "re");

I'm of two minds about this. I foresee the day when I get a bug report that
says: "Why did you use 'e' here (or O_CLOEXEC)? It's not needed". So, I'm 
inclined to leave this.

>>     for (fnd = 0; !fnd ; ) {
> 
> in my experience, seems like a while() loop makes more sense when you're 
> implementing a while() loop ...
> 	fnd = 0;
> 	while (!fnd) {

Yup. ;-}.

>>         linep = NULL;
>>         nread = getline(&linep, &lsize, fp);
> 
> this works, but it's unusual when using getline() as it kind of defeats the 
> purpose of using the dyn allocation feature.
> 
> 	fnd = 0;
> 	linep = NULL;
> 	while (!fnd) {
> 		nread = getline(&linep, &lsize, fp);
> 		...
> 	}
> 	free(linep);
> 
> i don't think it complicates the code much more ?

Yes. Fixed.

>>         if (nread == \-1)
>>             break;
>>
>> 	nread = sscanf(linep, "%d %*d %*s %*s %s", &fmnt_id, mount_path);
> 
> indent is off here

Fixed.

>>     return open(mount_path, O_RDONLY | O_DIRECTORY);
> 
> O_CLOEXEC for funsies ?

See above comment.

>> int
>> main(int argc, char *argv[])
>> {
>>     struct file_handle *fhp;
>>     int mount_id, fd, mount_fd, fhsize;
>>     ssize_t nread;
>> #define BSIZE 1000
>>     char buf[BSIZE];
> 
> why not sizeof(buf) and avoid the define ?

Done.

>>     if (argc > 1 && strcmp(argv[1], "\-\-help") == 0) {
>>         fprintf(stderr, "Usage: %s [mount\-dir]]\\n",
>>                 argv[0]);
> 
> how about also aborting when argc > 2 ?

Yes.

>>     if (argc > 1)
>>         mount_fd = open(argv[1], O_RDONLY | O_DIRECTORY);
> 
> O_CLOEXEC ?

See comment above.

>>     nread = read(fd, buf, BSIZE);
>>     if (nread == \-1)
>>         errExit("read");
>>     printf("Read %ld bytes\\n", (long) nread);
> 
> yikes, that's a bad habit to encourage.  read() returns a ssize_t, so print it 
> out using %zd.

Calling it a bad habit seems a bit too strong. It's a habit conditioned by writing 
code that runs on systems that don't have C99. Less important these days, of course.
I've changed it.

>> .SH SEE ALSO
>> .BR blkid (1),
>> .BR findfs (1),
> 
> i don't have a findfs(1).  i do have a findfs(8) ...

Thanks. blkid(8) also, actually.

Thanks for the comments, Mike.

Cheers,

Michael


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

^ permalink raw reply	[flat|nested] 22+ messages in thread

* For review: open_by_handle_at(2) man page [v3]
@ 2014-03-19 14:14         ` Michael Kerrisk (man-pages)
  0 siblings, 0 replies; 22+ messages in thread
From: Michael Kerrisk (man-pages) @ 2014-03-19 14:14 UTC (permalink / raw)
  To: NeilBrown
  Cc: mtk.manpages, Aneesh Kumar K.V, linux-man, Linux-Fsdevel, lkml,
	Andreas Dilger, Christoph Hellwig, Al Viro, Mike Frysinger

Hi Aneesh, (and others),

After integrating review comments from NeilBown, Christoph Hellwig,
and Mike Frysinger, here is draft 3 of a man page I've written for 
name_to_handle_at(2) and open_by_handle_at(2). Would you be willing to 
review it please, and let me know of any corrections/improvements?

There are some FIXMEs in the page that I would especially like some
help with.

Thanks,

Michael

.\" Copyright (c) 2014 by Michael Kerrisk <mtk.manpages@gmail.com>
.\"
.\" %%%LICENSE_START(VERBATIM)
.\" Permission is granted to make and distribute verbatim copies of this
.\" manual provided the copyright notice and this permission notice are
.\" preserved on all copies.
.\"
.\" Permission is granted to copy and distribute modified versions of this
.\" manual under the conditions for verbatim copying, provided that the
.\" entire resulting derived work is distributed under the terms of a
.\" permission notice identical to this one.
.\"
.\" Since the Linux kernel and libraries are constantly changing, this
.\" manual page may be incorrect or out-of-date.  The author(s) assume no
.\" responsibility for errors or omissions, or for damages resulting from
.\" the use of the information contained herein.  The author(s) may not
.\" have taken the same level of care in the production of this manual,
.\" which is licensed free of charge, as they might when working
.\" professionally.
.\"
.\" Formatted or processed versions of this manual, if unaccompanied by
.\" the source, must acknowledge the copyright and authors of this work.
.\" %%%LICENSE_END
.\"
.TH OPEN_BY_HANDLE_AT 2 2014-03-24 "Linux" "Linux Programmer's Manual"
.SH NAME
name_to_handle_at, open_by_handle_at \- obtain handle
for a pathname and open file via a handle
.SH SYNOPSIS
.nf
.B #define _GNU_SOURCE
.B #include <sys/types.h>
.B #include <sys/stat.h>
.B #include <fcntl.h>

.BI "int name_to_handle_at(int " dirfd ", const char *" pathname ,
.BI "                      struct file_handle *" handle ,
.BI "                      int *" mount_id ", int " flags );

.BI "int open_by_handle_at(int " mount_fd ", struct file_handle *" handle ,
.BI "                      int " flags );
.fi
.SH DESCRIPTION
The
.BR name_to_handle_at ()
and
.BR open_by_handle_at ()
system calls split the functionality of
.BR openat (2)
into two parts:
.BR name_to_handle_at ()
returns an opaque handle that corresponds to a specified file;
.BR open_by_handle_at ()
opens the file corresponding to a handle returned by a previous call to
.BR name_to_handle_at ()
and returns an open file descriptor.
.\"
.\"
.SS name_to_handle_at()
The
.BR name_to_handle_at ()
system call returns a file handle and a mount ID corresponding to
the file specified by the
.IR dirfd
and
.IR pathname
arguments.
The file handle is returned via the argument
.IR handle ,
which is a pointer to a structure of the following form:

.in +4n
.nf
struct file_handle {
    unsigned int  handle_bytes;   /* Size of f_handle [in, out] */
    int           handle_type;    /* Handle type [out] */
    unsigned char f_handle[0];    /* File identifier (sized by
                                     caller) [out] */
};
.fi
.in
.PP
It is the caller's responsibility to allocate the structure
with a size large enough to hold the handle returned in
.IR f_handle .
Before the call, the
.IR handle_bytes
field should be initialized to contain the allocated size for
.IR f_handle .
(The constant
.BR MAX_HANDLE_SZ ,
defined in
.IR <fcntl.h> ,
specifies the maximum possible size for a file handle.)
Upon successful return, the
.IR handle_bytes
field is updated to contain the number of bytes actually written to
.IR f_handle .

The caller can discover the required size for the
.I file_handle
structure by making a call in which
.IR handle->handle_bytes
is zero;
in this case, the call fails with the error
.BR EOVERFLOW
and
.IR handle->handle_bytes
is set to indicate the required size;
the caller can then use this information to allocate a structure
of the correct size (see EXAMPLE below).

Other than the use of the
.IR handle_bytes
field, the caller should treat the
.IR file_handle
structure as an opaque data type: the
.IR handle_type
and
.IR f_handle
fields are needed only by a subsequent call to
.BR open_by_handle_at ().

The
.I flags
argument is a bit mask constructed by ORing together zero or more of
.BR AT_EMPTY_PATH
and
.BR AT_SYMLINK_FOLLOW ,
described below.

Together, the
.I pathname
and
.I dirfd
arguments identify the file for which a handle is to be obtained.
There are four distinct cases:
.IP * 3
If
.I pathname
is a nonempty string containing an absolute pathname,
then a handle is returned for the file referred to by that pathname.
In this case,
.IR dirfd
is ignored.
.IP *
If
.I pathname
is a nonempty string containing a relative pathname and
.IR dirfd
has the special value
.BR AT_FDCWD ,
then
.I pathname
is interpreted relative to the current working directory of the caller,
and a handle is returned for the file to which it refers.
.IP *
If
.I pathname
is a nonempty string containing a relative pathname and
.IR dirfd
is a file descriptor referring to a directory, then
.I pathname
is interpreted relative to the directory referred to by
.IR dirfd ,
and a handle is returned for the file to which it refers.
(See
.BR openat (3)
for an explanation of why "directory file descriptors" are useful.)
.IP *
If
.I pathname
is an empty string and
.I flags
specifies the value
.BR AT_EMPTY_PATH ,
then
.IR dirfd
can be an open file descriptor referring to any type of file,
or
.BR AT_FDCWD ,
meaning the current working directory,
and a handle is returned for the file to which it refers.
.PP
The
.I mount_id
argument returns an identifier for the filesystem
mount that corresponds to
.IR pathname .
This corresponds to the first field in one of the records in
.IR /proc/self/mountinfo .
Opening the pathname in the fifth field of that record yields a file
descriptor for the mount point;
that file descriptor can be used in a subsequent call to
.BR open_by_handle_at ().

By default,
.BR name_to_handle_at ()
does not dereference
.I pathname
if it is a symbolic link, and thus returns a handle for the link itself.
If
.B AT_SYMLINK_FOLLOW
is specified in
.IR flags ,
.I pathname
is dereferenced if it is a symbolic link
(so that the call returns a handle for the file referred to by the link).
.SS open_by_handle_at()
The
.BR open_by_handle_at ()
system call opens the file referred to by
.IR handle ,
a file handle returned by a previous call to
.BR name_to_handle_at ().

The
.IR mount_fd
argument is a file descriptor for any object (file, directory, etc.)
in the mounted filesystem with respect to which
.IR handle
should be interpreted.
The special value
.B AT_FDCWD
can be specified, meaning the current working directory of the caller.

The
.I flags
argument
is as for
.BR open (2).
.\" FIXME: Confirm that the following is intended behavior.
.\"        (It certainly seems to be the behavior, from experimenting.)
If
.I handle
refers to a symbolic link, the caller must specify the
.B O_PATH
flag, and the symbolic link is not dereferenced; the
.B O_NOFOLLOW
flag, if specified, is ignored.


The caller must have the
.B CAP_DAC_READ_SEARCH
capability to invoke
.BR open_by_handle_at ().
.SH RETURN VALUE
On success,
.BR name_to_handle_at ()
returns 0,
and
.BR open_by_handle_at ()
returns a nonnegative file descriptor.

In the event of an error, both system calls return \-1 and set
.I errno
to indicate the cause of the error.
.SH ERRORS
.BR name_to_handle_at ()
and
.BR open_by_handle_at ()
can fail for the same errors as
.BR openat (2).
In addition, they can fail with the errors noted below.

.BR name_to_handle_at ()
can fail with the following errors:
.TP
.B EFAULT
.IR pathname ,
.IR mount_id ,
or
.IR handle
points outside your accessible address space.
.TP
.B EINVAL
.I flags
includes an invalid bit value.
.TP
.B EINVAL
.IR handle_bytes\->handle_bytes
is greater than
.BR MAX_HANDLE_SZ .
.TP
.B ENOENT
.I pathname
is an empty string, but
.BR AT_EMPTY_PATH
was not specified in
.IR flags .
.TP
.B ENOTDIR
The file descriptor supplied in
.I dirfd
does not refer to a directory,
and it is not the case that both
.I flags
includes
.BR AT_EMPTY_PATH
and
.I pathname
is an empty string.
.TP
.B EOPNOTSUPP
The filesystem does not support decoding of a pathname to a file handle.
.TP
.B EOVERFLOW
The
.I handle->handle_bytes
value passed into the call was too small.
When this error occurs,
.I handle->handle_bytes
is updated to indicate the required size for the handle.
.\"
.\"
.PP
.BR open_by_handle_at ()
can fail with the following errors:
.TP
.B EBADF
.IR mount_fd
is not an open file descriptor.
.TP
.B EFAULT
.IR handle
points outside your accessible address space.
.TP
.B EINVAL
.I handle->handle_bytes
is greater than
.BR MAX_HANDLE_SZ
or is equal to zero.
.TP
.B ELOOP
.\" FIXME (see earlier FIXME). Is this the intended behavior?
.I handle
refers to a symbolic link, but
.B O_PATH
was not specified in
.IR flags .
.TP
.B EPERM
The caller does not have the
.BR CAP_DAC_READ_SEARCH
capability.
.TP
.B ESTALE
The specified
.I handle
is not valid.
This error will occur if, for example, the file has been deleted.
.SH VERSIONS
These system calls first appeared in Linux 2.6.39.
.SH CONFORMING TO
These system calls are nonstandard Linux extensions.
.SH NOTES
A file handle can be generated in one process using
.BR name_to_handle_at ()
and later used in a different process that calls
.BR open_by_handle_at ().

Some filesystem don't support the translation of pathnames to
file handles, for example,
.IR /proc ,
.IR /sys ,
and various network filesystems.

A file handle may become invalid ("stale") if a file is deleted,
or for other filesystem-specific reasons.
Invalid handles are notified by an
.B ESTALE
error from
.BR open_by_handle_at ().

These system calls are designed for use by user-space file servers.
For example, a user-space NFS server might generate a file handle
and pass it to an NFS client.
Later, when the client wants to open the file,
it could pass the handle back to the server.
.\" https://lwn.net/Articles/375888/
.\"	"Open by handle" - Jonathan Corbet, 2010-02-23
This sort of functionality allows a user-space file server to operate in
a stateless fashion with respect to the files it serves.

If
.I pathname
refers to a symbolic link and
.IR flags
does not specify
.BR AT_SYMLINK_FOLLOW ,
then
.BR name_to_handle_at ()
returns a handle for the link (rather than the file to which it refers).
.\" commit bcda76524cd1fa32af748536f27f674a13e56700
The process receiving the handle can later perform operations
on the symbolic link by converting the handle to a file descriptor using
.BR open_by_handle_at ()
with the
.BR O_PATH
flag, and then passing the file descriptor as the
.IR dirfd
argument in system calls such as
.BR readlinkat (2)
and
.BR fchownat (2).
.SS Obtaining a persistent filesystem ID
The mount IDs in
.IR /proc/self/mountinfo
can be reused as filesystems are unmounted and mounted.
Therefore, the mount ID returned by
.BR name_to_handle_at ()
(in
.IR *mount_id )
should not be treated as a persistent identifier
for the corresponding mounted filesystem.
However, an application can use the information in the
.I mountinfo
record that corresponds to the mount ID
to derive a persistent identifier.

For example, one can use the device name in the fifth field of the
.I mountinfo
record to search for the corresponding device UUID via the symbolic links in
.IR /dev/disks/by-uuid .
(A more comfortable way of obtaining the UUID is to use the
.\" e.g., http://stackoverflow.com/questions/6748429/using-libblkid-to-find-uuid-of-a-partition
.BR libblkid (3)
library.)
That process can then be reversed,
using the UUID to look up the device name,
and then obtaining the corresponding mount point,
in order to produce the
.IR mount_fd
argument used by
.BR open_by_handle_at ().
.SH EXAMPLE
The two programs below demonstrate the use of
.BR name_to_handle_at ()
and
.BR open_by_handle_at ().
The first program
.RI ( t_name_to_handle_at.c )
uses
.BR name_to_handle_at ()
to obtain the file handle and mount ID
for the file specified in its command-line argument;
the handle and mount ID are written to standard output.

The second program
.RI ( t_open_by_handle_at.c )
reads a mount ID and file handle from standard input.
The program then employs
.BR open_by_handle_at ()
to open the file using that handle.
If an optional command-line argument is supplied, then the
.IR mount_fd
argument for
.BR open_by_handle_at ()
is obtained by opening the directory named in that argument.
Otherwise,
.IR mount_fd
is obtained by scanning
.IR /proc/self/mountinfo
to find a record whose mount ID matches the mount ID
read from standard input,
and the mount directory specified in that record is opened.
(These programs do not deal with the fact that mount IDs are not persistent.)

The following shell session demonstrates the use of these two programs:

.in +4n
.nf
$ \fBecho 'Can you please think about it?' > cecilia.txt\fP
$ \fB./t_name_to_handle_at cecilia.txt > fh\fP
$ \fB./t_open_by_handle_at < fh\fP
open_by_handle_at: Operation not permitted
$ \fBsudo ./t_open_by_handle_at < fh\fP      # Need CAP_SYS_ADMIN
Read 31 bytes
$ \fBrm cecilia.txt\fP
.fi
.in

Now we delete and (quickly) re-create the file so that
it has the same content and (by chance) the same inode.
Nevertheless,
.BR open_by_handle_at ()
.\" Christoph Hellwig: That's why the file handles contain a generation
.\" counter that gets incremented in this case.
recognizes that the original file referred to by the file handle
no longer exists.

.in +4n
.nf
$ \fBstat \-\-printf="%i\\n" cecilia.txt\fP     # Display inode number
4072121
$ \fBrm cecilia.txt\fP
$ \fBecho 'Can you please think about it?' > cecilia.txt\fP
$ \fBstat \-\-printf="%i\\n" cecilia.txt\fP     # Check inode number
4072121
$ \fBsudo ./t_open_by_handle_at < fh\fP
open_by_handle_at: Stale NFS file handle
.fi
.in
.SS Program source: t_name_to_handle_at.c
\&
.nf
#define _GNU_SOURCE
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>
#include <string.h>

#define errExit(msg)    do { perror(msg); exit(EXIT_FAILURE); \\
                        } while (0)

int
main(int argc, char *argv[])
{
    struct file_handle *fhp;
    int mount_id, fhsize, flags, dirfd, j;
    char *pathname;

    if (argc != 2) {
        fprintf(stderr, "Usage: %s pathname\\n", argv[0]);
        exit(EXIT_FAILURE);
    }

    pathname = argv[1];

    /* Allocate file_handle structure */

    fhsize = sizeof(*fhp);
    fhp = malloc(fhsize);
    if (fhp == NULL)
        errExit("malloc");

    /* Make an initial call to name_to_handle_at() to discover
       the size required for file handle */

    dirfd = AT_FDCWD;           /* For name_to_handle_at() calls */
    flags = 0;                  /* For name_to_handle_at() calls */
    fhp\->handle_bytes = 0;
    if (name_to_handle_at(dirfd, pathname, fhp,
                &mount_id, flags) != \-1 || errno != EOVERFLOW) {
        fprintf(stderr, "Unexpected result from name_to_handle_at()\\n");
        exit(EXIT_FAILURE);
    }

    /* Reallocate file_handle structure with correct size */

    fhsize = sizeof(struct file_handle) + fhp\->handle_bytes;
    fhp = realloc(fhp, fhsize);         /* Copies fhp\->handle_bytes */
    if (fhp == NULL)
        errExit("realloc");

    /* Get file handle from pathname supplied on command line */

    if (name_to_handle_at(dirfd, pathname, fhp, &mount_id, flags) == \-1)
        errExit("name_to_handle_at");

    /* Write mount ID, file handle size, and file handle to stdout,
       for later reuse by t_open_by_handle_at.c */

    printf("%d\\n", mount_id);
    printf("%d %d   ", fhp\->handle_bytes, fhp\->handle_type);
    for (j = 0; j < fhp\->handle_bytes; j++)
        printf(" %02x", fhp\->f_handle[j]);
    printf("\\n");

    exit(EXIT_SUCCESS);
}
.fi
.SS Program source: t_open_by_handle_at.c
\&
.nf
#define _GNU_SOURCE
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>

#define errExit(msg)    do { perror(msg); exit(EXIT_FAILURE); \\
                        } while (0)

/* Scan /proc/self/mountinfo to find the line whose mount ID matches
   \(aqmount_id\(aq. (An easier way to do this is to install and use the
   \(aqlibmount\(aq library provided by the \(aqutil\-linux\(aq project.)
   Open the corresponding mount path and return the resulting file
   descriptor. */

static int
open_mount_path_by_id(int mount_id)
{
    char *linep;
    size_t lsize;
    char mount_path[PATH_MAX];
    int mi_mount_id, found;
    ssize_t nread;
    FILE *fp;

    fp = fopen("/proc/self/mountinfo", "r");
    if (fp == NULL)
        errExit("fopen");

    found = 0;
    linep = NULL;
    while (!found) {
        nread = getline(&linep, &lsize, fp);
        if (nread == \-1)
            break;

        nread = sscanf(linep, "%d %*d %*s %*s %s",
                       &mi_mount_id, mount_path);
        if (nread != 2) {
            fprintf(stderr, "Bad sscanf()\\n");
            exit(EXIT_FAILURE);
        }

        if (mi_mount_id == mount_id)
            found = 1;
    }
    free(linep);

    fclose(fp);

    if (!found) {
        fprintf(stderr, "Could not find mount point\\n");
        exit(EXIT_FAILURE);
    }

    return open(mount_path, O_RDONLY);
}

int
main(int argc, char *argv[])
{
    struct file_handle *fhp;
    int mount_id, fd, mount_fd, handle_bytes, j;
    ssize_t nread;
    char buf[1000];
#define LINE_SIZE 100
    char line1[LINE_SIZE], line2[LINE_SIZE];
    char *nextp;

    if ((argc > 1 && strcmp(argv[1], "\-\-help") == 0) || argc > 2) {
        fprintf(stderr, "Usage: %s [mount\-path]\\n", argv[0]);
        exit(EXIT_FAILURE);
    }

    /* Standard input contains mount ID and file handle information:

         Line 1: <mount_id>
         Line 2: <handle_bytes> <handle_type>   <bytes of handle in hex>
    */

    if ((fgets(line1, sizeof(line1), stdin) == NULL) ||
           (fgets(line2, sizeof(line2), stdin) == NULL)) {
        fprintf(stderr, "Missing mount_id / file handle\\n");
        exit(EXIT_FAILURE);
    }

    mount_id = atoi(line1);

    handle_bytes = strtoul(line2, &nextp, 0);

    /* Given handle_bytes, we can now allocate file_handle structure */

    fhp = malloc(sizeof(struct file_handle) + handle_bytes);
    if (fhp == NULL)
        errExit("malloc");

    fhp\->handle_bytes = handle_bytes;

    fhp\->handle_type = strtoul(nextp, &nextp, 0);

    for (j = 0; j < fhp\->handle_bytes; j++)
        fhp\->f_handle[j] = strtoul(nextp, &nextp, 16);

    /* Obtain file descriptor for mount point, either by opening
       the pathname specified on the command line, or by scanning
       /proc/self/mounts to find a mount that matches the \(aqmount_id\(aq
       that we received from stdin. */

    if (argc > 1)
        mount_fd = open(argv[1], O_RDONLY);
    else
        mount_fd = open_mount_path_by_id(mount_id);

    if (mount_fd == \-1)
        errExit("opening mount fd");

    /* Open file using handle and mount point */

    fd = open_by_handle_at(mount_fd, fhp, O_RDONLY);
    if (fd == \-1)
        errExit("open_by_handle_at");

    /* Try reading a few bytes from the file */

    nread = read(fd, buf, sizeof(buf));
    if (nread == \-1)
        errExit("read");

    printf("Read %zd bytes\\n", nread);

    exit(EXIT_SUCCESS);
}
.fi
.SH SEE ALSO
.BR open (2),
.BR libblkid (3),
.BR blkid (8),
.BR findfs (8),
.BR mount (8)

The
.I libblkid
and
.I libmount
documentation in the latest
.I util-linux
release at
.UR https://www.kernel.org/pub/linux/utils/util-linux/
.UE

^ permalink raw reply	[flat|nested] 22+ messages in thread

* For review: open_by_handle_at(2) man page [v3]
@ 2014-03-19 14:14         ` Michael Kerrisk (man-pages)
  0 siblings, 0 replies; 22+ messages in thread
From: Michael Kerrisk (man-pages) @ 2014-03-19 14:14 UTC (permalink / raw)
  To: NeilBrown
  Cc: mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w, Aneesh Kumar K.V,
	linux-man-u79uwXL29TY76Z2rM5mHXA, Linux-Fsdevel, lkml,
	Andreas Dilger, Christoph Hellwig, Al Viro, Mike Frysinger

Hi Aneesh, (and others),

After integrating review comments from NeilBown, Christoph Hellwig,
and Mike Frysinger, here is draft 3 of a man page I've written for 
name_to_handle_at(2) and open_by_handle_at(2). Would you be willing to 
review it please, and let me know of any corrections/improvements?

There are some FIXMEs in the page that I would especially like some
help with.

Thanks,

Michael

.\" Copyright (c) 2014 by Michael Kerrisk <mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
.\"
.\" %%%LICENSE_START(VERBATIM)
.\" Permission is granted to make and distribute verbatim copies of this
.\" manual provided the copyright notice and this permission notice are
.\" preserved on all copies.
.\"
.\" Permission is granted to copy and distribute modified versions of this
.\" manual under the conditions for verbatim copying, provided that the
.\" entire resulting derived work is distributed under the terms of a
.\" permission notice identical to this one.
.\"
.\" Since the Linux kernel and libraries are constantly changing, this
.\" manual page may be incorrect or out-of-date.  The author(s) assume no
.\" responsibility for errors or omissions, or for damages resulting from
.\" the use of the information contained herein.  The author(s) may not
.\" have taken the same level of care in the production of this manual,
.\" which is licensed free of charge, as they might when working
.\" professionally.
.\"
.\" Formatted or processed versions of this manual, if unaccompanied by
.\" the source, must acknowledge the copyright and authors of this work.
.\" %%%LICENSE_END
.\"
.TH OPEN_BY_HANDLE_AT 2 2014-03-24 "Linux" "Linux Programmer's Manual"
.SH NAME
name_to_handle_at, open_by_handle_at \- obtain handle
for a pathname and open file via a handle
.SH SYNOPSIS
.nf
.B #define _GNU_SOURCE
.B #include <sys/types.h>
.B #include <sys/stat.h>
.B #include <fcntl.h>

.BI "int name_to_handle_at(int " dirfd ", const char *" pathname ,
.BI "                      struct file_handle *" handle ,
.BI "                      int *" mount_id ", int " flags );

.BI "int open_by_handle_at(int " mount_fd ", struct file_handle *" handle ,
.BI "                      int " flags );
.fi
.SH DESCRIPTION
The
.BR name_to_handle_at ()
and
.BR open_by_handle_at ()
system calls split the functionality of
.BR openat (2)
into two parts:
.BR name_to_handle_at ()
returns an opaque handle that corresponds to a specified file;
.BR open_by_handle_at ()
opens the file corresponding to a handle returned by a previous call to
.BR name_to_handle_at ()
and returns an open file descriptor.
.\"
.\"
.SS name_to_handle_at()
The
.BR name_to_handle_at ()
system call returns a file handle and a mount ID corresponding to
the file specified by the
.IR dirfd
and
.IR pathname
arguments.
The file handle is returned via the argument
.IR handle ,
which is a pointer to a structure of the following form:

.in +4n
.nf
struct file_handle {
    unsigned int  handle_bytes;   /* Size of f_handle [in, out] */
    int           handle_type;    /* Handle type [out] */
    unsigned char f_handle[0];    /* File identifier (sized by
                                     caller) [out] */
};
.fi
.in
.PP
It is the caller's responsibility to allocate the structure
with a size large enough to hold the handle returned in
.IR f_handle .
Before the call, the
.IR handle_bytes
field should be initialized to contain the allocated size for
.IR f_handle .
(The constant
.BR MAX_HANDLE_SZ ,
defined in
.IR <fcntl.h> ,
specifies the maximum possible size for a file handle.)
Upon successful return, the
.IR handle_bytes
field is updated to contain the number of bytes actually written to
.IR f_handle .

The caller can discover the required size for the
.I file_handle
structure by making a call in which
.IR handle->handle_bytes
is zero;
in this case, the call fails with the error
.BR EOVERFLOW
and
.IR handle->handle_bytes
is set to indicate the required size;
the caller can then use this information to allocate a structure
of the correct size (see EXAMPLE below).

Other than the use of the
.IR handle_bytes
field, the caller should treat the
.IR file_handle
structure as an opaque data type: the
.IR handle_type
and
.IR f_handle
fields are needed only by a subsequent call to
.BR open_by_handle_at ().

The
.I flags
argument is a bit mask constructed by ORing together zero or more of
.BR AT_EMPTY_PATH
and
.BR AT_SYMLINK_FOLLOW ,
described below.

Together, the
.I pathname
and
.I dirfd
arguments identify the file for which a handle is to be obtained.
There are four distinct cases:
.IP * 3
If
.I pathname
is a nonempty string containing an absolute pathname,
then a handle is returned for the file referred to by that pathname.
In this case,
.IR dirfd
is ignored.
.IP *
If
.I pathname
is a nonempty string containing a relative pathname and
.IR dirfd
has the special value
.BR AT_FDCWD ,
then
.I pathname
is interpreted relative to the current working directory of the caller,
and a handle is returned for the file to which it refers.
.IP *
If
.I pathname
is a nonempty string containing a relative pathname and
.IR dirfd
is a file descriptor referring to a directory, then
.I pathname
is interpreted relative to the directory referred to by
.IR dirfd ,
and a handle is returned for the file to which it refers.
(See
.BR openat (3)
for an explanation of why "directory file descriptors" are useful.)
.IP *
If
.I pathname
is an empty string and
.I flags
specifies the value
.BR AT_EMPTY_PATH ,
then
.IR dirfd
can be an open file descriptor referring to any type of file,
or
.BR AT_FDCWD ,
meaning the current working directory,
and a handle is returned for the file to which it refers.
.PP
The
.I mount_id
argument returns an identifier for the filesystem
mount that corresponds to
.IR pathname .
This corresponds to the first field in one of the records in
.IR /proc/self/mountinfo .
Opening the pathname in the fifth field of that record yields a file
descriptor for the mount point;
that file descriptor can be used in a subsequent call to
.BR open_by_handle_at ().

By default,
.BR name_to_handle_at ()
does not dereference
.I pathname
if it is a symbolic link, and thus returns a handle for the link itself.
If
.B AT_SYMLINK_FOLLOW
is specified in
.IR flags ,
.I pathname
is dereferenced if it is a symbolic link
(so that the call returns a handle for the file referred to by the link).
.SS open_by_handle_at()
The
.BR open_by_handle_at ()
system call opens the file referred to by
.IR handle ,
a file handle returned by a previous call to
.BR name_to_handle_at ().

The
.IR mount_fd
argument is a file descriptor for any object (file, directory, etc.)
in the mounted filesystem with respect to which
.IR handle
should be interpreted.
The special value
.B AT_FDCWD
can be specified, meaning the current working directory of the caller.

The
.I flags
argument
is as for
.BR open (2).
.\" FIXME: Confirm that the following is intended behavior.
.\"        (It certainly seems to be the behavior, from experimenting.)
If
.I handle
refers to a symbolic link, the caller must specify the
.B O_PATH
flag, and the symbolic link is not dereferenced; the
.B O_NOFOLLOW
flag, if specified, is ignored.


The caller must have the
.B CAP_DAC_READ_SEARCH
capability to invoke
.BR open_by_handle_at ().
.SH RETURN VALUE
On success,
.BR name_to_handle_at ()
returns 0,
and
.BR open_by_handle_at ()
returns a nonnegative file descriptor.

In the event of an error, both system calls return \-1 and set
.I errno
to indicate the cause of the error.
.SH ERRORS
.BR name_to_handle_at ()
and
.BR open_by_handle_at ()
can fail for the same errors as
.BR openat (2).
In addition, they can fail with the errors noted below.

.BR name_to_handle_at ()
can fail with the following errors:
.TP
.B EFAULT
.IR pathname ,
.IR mount_id ,
or
.IR handle
points outside your accessible address space.
.TP
.B EINVAL
.I flags
includes an invalid bit value.
.TP
.B EINVAL
.IR handle_bytes\->handle_bytes
is greater than
.BR MAX_HANDLE_SZ .
.TP
.B ENOENT
.I pathname
is an empty string, but
.BR AT_EMPTY_PATH
was not specified in
.IR flags .
.TP
.B ENOTDIR
The file descriptor supplied in
.I dirfd
does not refer to a directory,
and it is not the case that both
.I flags
includes
.BR AT_EMPTY_PATH
and
.I pathname
is an empty string.
.TP
.B EOPNOTSUPP
The filesystem does not support decoding of a pathname to a file handle.
.TP
.B EOVERFLOW
The
.I handle->handle_bytes
value passed into the call was too small.
When this error occurs,
.I handle->handle_bytes
is updated to indicate the required size for the handle.
.\"
.\"
.PP
.BR open_by_handle_at ()
can fail with the following errors:
.TP
.B EBADF
.IR mount_fd
is not an open file descriptor.
.TP
.B EFAULT
.IR handle
points outside your accessible address space.
.TP
.B EINVAL
.I handle->handle_bytes
is greater than
.BR MAX_HANDLE_SZ
or is equal to zero.
.TP
.B ELOOP
.\" FIXME (see earlier FIXME). Is this the intended behavior?
.I handle
refers to a symbolic link, but
.B O_PATH
was not specified in
.IR flags .
.TP
.B EPERM
The caller does not have the
.BR CAP_DAC_READ_SEARCH
capability.
.TP
.B ESTALE
The specified
.I handle
is not valid.
This error will occur if, for example, the file has been deleted.
.SH VERSIONS
These system calls first appeared in Linux 2.6.39.
.SH CONFORMING TO
These system calls are nonstandard Linux extensions.
.SH NOTES
A file handle can be generated in one process using
.BR name_to_handle_at ()
and later used in a different process that calls
.BR open_by_handle_at ().

Some filesystem don't support the translation of pathnames to
file handles, for example,
.IR /proc ,
.IR /sys ,
and various network filesystems.

A file handle may become invalid ("stale") if a file is deleted,
or for other filesystem-specific reasons.
Invalid handles are notified by an
.B ESTALE
error from
.BR open_by_handle_at ().

These system calls are designed for use by user-space file servers.
For example, a user-space NFS server might generate a file handle
and pass it to an NFS client.
Later, when the client wants to open the file,
it could pass the handle back to the server.
.\" https://lwn.net/Articles/375888/
.\"	"Open by handle" - Jonathan Corbet, 2010-02-23
This sort of functionality allows a user-space file server to operate in
a stateless fashion with respect to the files it serves.

If
.I pathname
refers to a symbolic link and
.IR flags
does not specify
.BR AT_SYMLINK_FOLLOW ,
then
.BR name_to_handle_at ()
returns a handle for the link (rather than the file to which it refers).
.\" commit bcda76524cd1fa32af748536f27f674a13e56700
The process receiving the handle can later perform operations
on the symbolic link by converting the handle to a file descriptor using
.BR open_by_handle_at ()
with the
.BR O_PATH
flag, and then passing the file descriptor as the
.IR dirfd
argument in system calls such as
.BR readlinkat (2)
and
.BR fchownat (2).
.SS Obtaining a persistent filesystem ID
The mount IDs in
.IR /proc/self/mountinfo
can be reused as filesystems are unmounted and mounted.
Therefore, the mount ID returned by
.BR name_to_handle_at ()
(in
.IR *mount_id )
should not be treated as a persistent identifier
for the corresponding mounted filesystem.
However, an application can use the information in the
.I mountinfo
record that corresponds to the mount ID
to derive a persistent identifier.

For example, one can use the device name in the fifth field of the
.I mountinfo
record to search for the corresponding device UUID via the symbolic links in
.IR /dev/disks/by-uuid .
(A more comfortable way of obtaining the UUID is to use the
.\" e.g., http://stackoverflow.com/questions/6748429/using-libblkid-to-find-uuid-of-a-partition
.BR libblkid (3)
library.)
That process can then be reversed,
using the UUID to look up the device name,
and then obtaining the corresponding mount point,
in order to produce the
.IR mount_fd
argument used by
.BR open_by_handle_at ().
.SH EXAMPLE
The two programs below demonstrate the use of
.BR name_to_handle_at ()
and
.BR open_by_handle_at ().
The first program
.RI ( t_name_to_handle_at.c )
uses
.BR name_to_handle_at ()
to obtain the file handle and mount ID
for the file specified in its command-line argument;
the handle and mount ID are written to standard output.

The second program
.RI ( t_open_by_handle_at.c )
reads a mount ID and file handle from standard input.
The program then employs
.BR open_by_handle_at ()
to open the file using that handle.
If an optional command-line argument is supplied, then the
.IR mount_fd
argument for
.BR open_by_handle_at ()
is obtained by opening the directory named in that argument.
Otherwise,
.IR mount_fd
is obtained by scanning
.IR /proc/self/mountinfo
to find a record whose mount ID matches the mount ID
read from standard input,
and the mount directory specified in that record is opened.
(These programs do not deal with the fact that mount IDs are not persistent.)

The following shell session demonstrates the use of these two programs:

.in +4n
.nf
$ \fBecho 'Can you please think about it?' > cecilia.txt\fP
$ \fB./t_name_to_handle_at cecilia.txt > fh\fP
$ \fB./t_open_by_handle_at < fh\fP
open_by_handle_at: Operation not permitted
$ \fBsudo ./t_open_by_handle_at < fh\fP      # Need CAP_SYS_ADMIN
Read 31 bytes
$ \fBrm cecilia.txt\fP
.fi
.in

Now we delete and (quickly) re-create the file so that
it has the same content and (by chance) the same inode.
Nevertheless,
.BR open_by_handle_at ()
.\" Christoph Hellwig: That's why the file handles contain a generation
.\" counter that gets incremented in this case.
recognizes that the original file referred to by the file handle
no longer exists.

.in +4n
.nf
$ \fBstat \-\-printf="%i\\n" cecilia.txt\fP     # Display inode number
4072121
$ \fBrm cecilia.txt\fP
$ \fBecho 'Can you please think about it?' > cecilia.txt\fP
$ \fBstat \-\-printf="%i\\n" cecilia.txt\fP     # Check inode number
4072121
$ \fBsudo ./t_open_by_handle_at < fh\fP
open_by_handle_at: Stale NFS file handle
.fi
.in
.SS Program source: t_name_to_handle_at.c
\&
.nf
#define _GNU_SOURCE
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>
#include <string.h>

#define errExit(msg)    do { perror(msg); exit(EXIT_FAILURE); \\
                        } while (0)

int
main(int argc, char *argv[])
{
    struct file_handle *fhp;
    int mount_id, fhsize, flags, dirfd, j;
    char *pathname;

    if (argc != 2) {
        fprintf(stderr, "Usage: %s pathname\\n", argv[0]);
        exit(EXIT_FAILURE);
    }

    pathname = argv[1];

    /* Allocate file_handle structure */

    fhsize = sizeof(*fhp);
    fhp = malloc(fhsize);
    if (fhp == NULL)
        errExit("malloc");

    /* Make an initial call to name_to_handle_at() to discover
       the size required for file handle */

    dirfd = AT_FDCWD;           /* For name_to_handle_at() calls */
    flags = 0;                  /* For name_to_handle_at() calls */
    fhp\->handle_bytes = 0;
    if (name_to_handle_at(dirfd, pathname, fhp,
                &mount_id, flags) != \-1 || errno != EOVERFLOW) {
        fprintf(stderr, "Unexpected result from name_to_handle_at()\\n");
        exit(EXIT_FAILURE);
    }

    /* Reallocate file_handle structure with correct size */

    fhsize = sizeof(struct file_handle) + fhp\->handle_bytes;
    fhp = realloc(fhp, fhsize);         /* Copies fhp\->handle_bytes */
    if (fhp == NULL)
        errExit("realloc");

    /* Get file handle from pathname supplied on command line */

    if (name_to_handle_at(dirfd, pathname, fhp, &mount_id, flags) == \-1)
        errExit("name_to_handle_at");

    /* Write mount ID, file handle size, and file handle to stdout,
       for later reuse by t_open_by_handle_at.c */

    printf("%d\\n", mount_id);
    printf("%d %d   ", fhp\->handle_bytes, fhp\->handle_type);
    for (j = 0; j < fhp\->handle_bytes; j++)
        printf(" %02x", fhp\->f_handle[j]);
    printf("\\n");

    exit(EXIT_SUCCESS);
}
.fi
.SS Program source: t_open_by_handle_at.c
\&
.nf
#define _GNU_SOURCE
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>

#define errExit(msg)    do { perror(msg); exit(EXIT_FAILURE); \\
                        } while (0)

/* Scan /proc/self/mountinfo to find the line whose mount ID matches
   \(aqmount_id\(aq. (An easier way to do this is to install and use the
   \(aqlibmount\(aq library provided by the \(aqutil\-linux\(aq project.)
   Open the corresponding mount path and return the resulting file
   descriptor. */

static int
open_mount_path_by_id(int mount_id)
{
    char *linep;
    size_t lsize;
    char mount_path[PATH_MAX];
    int mi_mount_id, found;
    ssize_t nread;
    FILE *fp;

    fp = fopen("/proc/self/mountinfo", "r");
    if (fp == NULL)
        errExit("fopen");

    found = 0;
    linep = NULL;
    while (!found) {
        nread = getline(&linep, &lsize, fp);
        if (nread == \-1)
            break;

        nread = sscanf(linep, "%d %*d %*s %*s %s",
                       &mi_mount_id, mount_path);
        if (nread != 2) {
            fprintf(stderr, "Bad sscanf()\\n");
            exit(EXIT_FAILURE);
        }

        if (mi_mount_id == mount_id)
            found = 1;
    }
    free(linep);

    fclose(fp);

    if (!found) {
        fprintf(stderr, "Could not find mount point\\n");
        exit(EXIT_FAILURE);
    }

    return open(mount_path, O_RDONLY);
}

int
main(int argc, char *argv[])
{
    struct file_handle *fhp;
    int mount_id, fd, mount_fd, handle_bytes, j;
    ssize_t nread;
    char buf[1000];
#define LINE_SIZE 100
    char line1[LINE_SIZE], line2[LINE_SIZE];
    char *nextp;

    if ((argc > 1 && strcmp(argv[1], "\-\-help") == 0) || argc > 2) {
        fprintf(stderr, "Usage: %s [mount\-path]\\n", argv[0]);
        exit(EXIT_FAILURE);
    }

    /* Standard input contains mount ID and file handle information:

         Line 1: <mount_id>
         Line 2: <handle_bytes> <handle_type>   <bytes of handle in hex>
    */

    if ((fgets(line1, sizeof(line1), stdin) == NULL) ||
           (fgets(line2, sizeof(line2), stdin) == NULL)) {
        fprintf(stderr, "Missing mount_id / file handle\\n");
        exit(EXIT_FAILURE);
    }

    mount_id = atoi(line1);

    handle_bytes = strtoul(line2, &nextp, 0);

    /* Given handle_bytes, we can now allocate file_handle structure */

    fhp = malloc(sizeof(struct file_handle) + handle_bytes);
    if (fhp == NULL)
        errExit("malloc");

    fhp\->handle_bytes = handle_bytes;

    fhp\->handle_type = strtoul(nextp, &nextp, 0);

    for (j = 0; j < fhp\->handle_bytes; j++)
        fhp\->f_handle[j] = strtoul(nextp, &nextp, 16);

    /* Obtain file descriptor for mount point, either by opening
       the pathname specified on the command line, or by scanning
       /proc/self/mounts to find a mount that matches the \(aqmount_id\(aq
       that we received from stdin. */

    if (argc > 1)
        mount_fd = open(argv[1], O_RDONLY);
    else
        mount_fd = open_mount_path_by_id(mount_id);

    if (mount_fd == \-1)
        errExit("opening mount fd");

    /* Open file using handle and mount point */

    fd = open_by_handle_at(mount_fd, fhp, O_RDONLY);
    if (fd == \-1)
        errExit("open_by_handle_at");

    /* Try reading a few bytes from the file */

    nread = read(fd, buf, sizeof(buf));
    if (nread == \-1)
        errExit("read");

    printf("Read %zd bytes\\n", nread);

    exit(EXIT_SUCCESS);
}
.fi
.SH SEE ALSO
.BR open (2),
.BR libblkid (3),
.BR blkid (8),
.BR findfs (8),
.BR mount (8)

The
.I libblkid
and
.I libmount
documentation in the latest
.I util-linux
release at
.UR https://www.kernel.org/pub/linux/utils/util-linux/
.UE
--
To unsubscribe from this list: send the line "unsubscribe linux-man" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2014-03-19 14:14 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-03-17 15:57 For review: open_by_name_at(2) man page Michael Kerrisk (man-pages)
2014-03-17 22:00 ` NeilBrown
2014-03-18  9:43   ` Christoph Hellwig
2014-03-18 12:37     ` Michael Kerrisk (man-pages)
2014-03-18 22:24       ` NeilBrown
2014-03-18 22:24         ` NeilBrown
2014-03-19  9:09         ` Michael Kerrisk (man-pages)
2014-03-18 12:35   ` Michael Kerrisk (man-pages)
2014-03-18 12:35     ` Michael Kerrisk (man-pages)
2014-03-18 13:07     ` Christoph Hellwig
2014-03-18 13:30       ` Michael Kerrisk (man-pages)
2014-03-18  9:37 ` Christoph Hellwig
2014-03-18 12:41   ` Michael Kerrisk (man-pages)
2014-03-18 12:55 ` For review: open_by_name_at(2) man page [v2] Michael Kerrisk (man-pages)
2014-03-19  4:13   ` NeilBrown
2014-03-19  4:13     ` NeilBrown
2014-03-19  9:09     ` Michael Kerrisk (man-pages)
2014-03-19  9:09       ` Michael Kerrisk (man-pages)
2014-03-19 14:14       ` For review: open_by_handle_at(2) man page [v3] Michael Kerrisk (man-pages)
2014-03-19 14:14         ` Michael Kerrisk (man-pages)
2014-03-19  6:42   ` For review: open_by_name_at(2) man page [v2] Mike Frysinger
2014-03-19 13:11     ` Michael Kerrisk (man-pages)

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.