* For review: open_by_name_at(2) man page @ 2014-03-17 15:57 Michael Kerrisk (man-pages) 2014-03-17 22:00 ` NeilBrown ` (2 more replies) 0 siblings, 3 replies; 22+ messages in thread From: Michael Kerrisk (man-pages) @ 2014-03-17 15:57 UTC (permalink / raw) To: Aneesh Kumar K.V, linux-man, Linux-Fsdevel, lkml Cc: mtk.manpages, Andreas Dilger, NeilBrown, Christoph Hellwig Hi Aneesh, (and others) Below is a man page I've written for name_to_handle_at(2) and open_by_name_at(2). Would you be willing to review it please, and let me know of any corrections/improvements? Thanks, Michael '\" t -*- coding: UTF-8 -*- .\" Copyright (c) 2014 by Michael Kerrisk <mtk.manpages@gmail.com> .\" .\" %%%LICENSE_START(VERBATIM) .\" Permission is granted to make and distribute verbatim copies of this .\" manual provided the copyright notice and this permission notice are .\" preserved on all copies. .\" .\" Permission is granted to copy and distribute modified versions of this .\" manual under the conditions for verbatim copying, provided that the .\" entire resulting derived work is distributed under the terms of a .\" permission notice identical to this one. .\" .\" Since the Linux kernel and libraries are constantly changing, this .\" manual page may be incorrect or out-of-date. The author(s) assume no .\" responsibility for errors or omissions, or for damages resulting from .\" the use of the information contained herein. The author(s) may not .\" have taken the same level of care in the production of this manual, .\" which is licensed free of charge, as they might when working .\" professionally. .\" .\" Formatted or processed versions of this manual, if unaccompanied by .\" the source, must acknowledge the copyright and authors of this work. 
.\" %%%LICENSE_END .\" .TH OPEN_BY_HANDLE_AT 2 2014-03-24 "Linux" "Linux Programmer's Manual" .SH NAME name_to_handle_at, open_by_handle_at \- obtain handle for a pathname and open file via a handle .SH SYNOPSIS .nf .B #define _GNU_SOURCE .B #include <sys/types.h> .B #include <sys/stat.h> .B #include <fcntl.h> .BI "int name_to_handle_at(int " dirfd ", const char *" pathname , .BI " struct file_handle *" handle , .BI " int *" mnt_id ", int " flags ); .BI "int open_by_handle_at(int " mountdirfd ", struct file_handle *" handle , .BI " int " flags ); .fi .SH DESCRIPTION The .BR name_to_handle_at () and .BR open_by_handle_at () system calls split the functionality of .BR openat (2) into two parts: .BR name_to_handle_at () returns an opaque handle that corresponds to a specified file; .BR open_by_handle_at () opens the file corresponding to a handle returned by a previous call to .BR name_to_handle_at () and returns an open file descriptor. .SS name_to_handle_at() The .BR name_to_handle_at () system call returns a file handle and a mount ID corresponding to the file specified by .IR pathname , which specifies the pathname of an existing file. The file handle is returned via the argument .IR handle , which is a pointer to a structure of the following form: .in +4n .nf struct file_handle { unsigned int handle_bytes; /* Size of f_handle [in, out] */ int handle_type; /* Handle type [out] */ unsigned char f_handle[0]; /* File identifier (sized by caller) [out] */ }; .fi .in .PP It is the caller's responsibility to allocate the structure with a size large enough to hold the handle returned in .IR f_handle . Before the call, the .IR handle_bytes field should be initialized to contain the allocated size for .IR f_handle . (The constant .BR MAX_HANDLE_SZ , defined in .IR <fcntl.h> , specifies the maximum possible size for a file handle.) Upon successful return, the .IR handle_bytes field is updated to contain the number of bytes actually written to .IR f_handle . 
The caller can discover the required size for the .I file_handle structure by making a call in which .IR handle->handle_bytes is zero; in this case, the call fails with the error .BR EOVERFLOW and .IR handle->handle_bytes is set to indicate the required size; the caller can then use this information to allocate a structure of the correct size (see EXAMPLE below). Other than the use of the .IR handle_bytes field, the caller should treat the .IR file_handle structure as an opaque data type: the .IR handle_type and .IR f_handle fields are needed only by a subsequent call to .BR open_by_handle_at (). The treatment of a relative pathname in .I pathname depends on the value of .IR dirfd . If .I dirfd has the special value .BR AT_FDCWD , then .I pathname is interpreted relative to the current working directory of the calling process. (See .BR openat (2) for an explanation of why this is useful.) Otherwise, .IR dirfd must be a file descriptor that refers to a directory, and .I pathname is interpreted relative to that directory. If .I pathname is an absolute pathname, then .I dirfd is ignored. The .I mnt_id argument returns an identifier for the filesystem mount that corresponds to .IR pathname . This corresponds to the first field in one of the records in .IR /proc/self/mountinfo . Opening the pathname in the fifth field of that record yields a file descriptor for the mount point; that file descriptor can be used in a subsequent call to .BR open_by_handle_at (). The .I flags argument is a bit mask constructed by ORing together zero or more of the following values: .TP .B AT_EMPTY_PATH If .I pathname is an empty string, then obtain a handle for the file referred to by .IR dirfd (which may have been obtained using the .BR open (2) .B O_PATH flag). In this case, .I dirfd can refer to any type of file, not just a directory. .TP .B AT_SYMLINK_FOLLOW By default, .BR name_to_handle_at () does not dereference .I pathname if it is a symbolic link.
The flag .B AT_SYMLINK_FOLLOW can be specified in .I flags to cause .I pathname to be dereferenced if it is a symbolic link. .SS open_by_handle_at() The .BR open_by_handle_at () system call opens the file referred to by .IR handle , a file handle returned by a previous call to .BR name_to_handle_at (). The .IR mountdirfd argument is a file descriptor for a directory under the mount point with respect to which .IR handle should be interpreted. The special value .B AT_FDCWD can be specified, meaning the current working directory of the caller. The .I flags argument is as for .BR open (2). The caller must have the .B CAP_DAC_READ_SEARCH capability to invoke .BR open_by_handle_at (). .SH RETURN VALUE On success, .BR name_to_handle_at () returns 0, and .BR open_by_handle_at () returns a nonnegative file descriptor. In the event of an error, both system calls return \-1 and set .I errno to indicate the cause of the error. .SH ERRORS .BR name_to_handle_at () and .BR open_by_handle_at () can fail for the same errors as .BR open (2). In addition, they can fail with the errors noted below. .BR name_to_handle_at () can fail with the following errors: .TP .B EBADF .IR dirfd is not an open file descriptor. .TP .B EINVAL .I flags includes an invalid bit value. .TP .B EINVAL .IR handle\->handle_bytes is greater than .BR MAX_HANDLE_SZ . .TP .B ENOTDIR The file descriptor supplied in .I dirfd does not refer to a directory, and it is not the case that both .I flags includes .BR AT_EMPTY_PATH and .I pathname is an empty string. .TP .B EOPNOTSUPP The filesystem does not support decoding of a pathname to a file handle. .TP .B EOVERFLOW The .I handle->handle_bytes value passed into the call was too small. When this error occurs, .I handle->handle_bytes is updated to indicate the required size for the handle. .\" .\" .PP .BR open_by_handle_at () can fail with the following errors: .TP .B EBADF .IR mountdirfd is not an open file descriptor.
.TP .B EINVAL .I handle->handle_bytes is greater than .BR MAX_HANDLE_SZ or is equal to zero. .TP .B ENOMEM Insufficient memory. .TP .B ENOTDIR .IR mountdirfd is not .B AT_FDCWD and does not refer to a directory. .TP .B EPERM The caller does not have the .BR CAP_DAC_READ_SEARCH capability. .TP .B ESTALE The specified .I handle is no longer valid. .SH VERSIONS These system calls first appeared in Linux 2.6.39. .SH CONFORMING TO These system calls are nonstandard Linux extensions. .SH NOTES A file handle can be generated in one process using .BR name_to_handle_at () and later used in a different process that calls .BR open_by_handle_at (). These system calls are designed for use by user-space file servers. For example, a user-space NFS server might generate a file handle and pass it to an NFS client. Later, when the client wants to open the file, it could pass the handle back to the server. .\" https://lwn.net/Articles/375888/ .\" "Open by handle" - Jonathan Corbet, 2010-02-23 This sort of functionality allows a user-space file server to operate in a stateless fashion with respect to the files it serves. Specifying both .BR O_PATH and .BR O_NOFOLLOW in a call to .BR name_to_handle_at () that operates on a symbolic link can be used to obtain a handle for the link. .\" commit bcda76524cd1fa32af748536f27f674a13e56700 The process receiving the handle can later perform operations on the symbolic link by converting the handle to a file descriptor using .BR open_by_handle_at () and then passing the file descriptor as the .IR dirfd argument in system calls such as .BR readlinkat (2) and .BR fchownat (2). .SS Obtaining a persistent filesystem ID The mount IDs in .IR /proc/self/mountinfo can be reused as filesystems are unmounted and mounted. Therefore, the mount ID returned by .BR name_to_handle_at (2) (in .IR *mnt_id ) should not be treated as a persistent identifier for the corresponding mounted filesystem.
However, an application can use the information in the .I mountinfo record that corresponds to the mount ID to derive a persistent identifier. For example, one can use the device name in the fifth field of the .I mountinfo record to search for the corresponding device UUID via the symbolic links in .IR /dev/disk/by-uuid . (A more comfortable way of obtaining the UUID is to use the .\" e.g., http://stackoverflow.com/questions/6748429/using-libblkid-to-find-uuid-of-a-partition .BR libblkid (3) library, which uses the .I /sys filesystem to obtain the same information.) That process can then be reversed, using the UUID to look up the device name, and then obtaining the corresponding mount point, in order to produce the .IR mountdirfd argument used by .BR open_by_handle_at (). .SH EXAMPLE The two programs below demonstrate the use of .BR name_to_handle_at () and .BR open_by_handle_at (). The first program .RI ( t_name_to_handle_at.c ) uses .BR name_to_handle_at () to obtain the file handle and mount ID for the file specified in its command-line argument; the handle and ID are written to standard output. The second program .RI ( t_open_by_handle_at.c ) reads a mount ID and file handle from standard input. The program then employs .BR open_by_handle_at () to open the file using that handle. If an optional command-line argument is supplied, then the .IR mountdirfd argument for .BR open_by_handle_at () is obtained by opening the directory named in that argument. Otherwise, .IR mountdirfd is obtained by scanning .IR /proc/self/mountinfo to find a record whose mount ID matches the mount ID read from standard input, and the mount directory specified in that record is opened. (These programs do not deal with the fact that mount IDs are not persistent.) The following shell session demonstrates the use of these two programs: .in +4n .nf $ \fBecho 'Kannst du bitte überlegen?'
> cecilia.txt\fP $ \fB./t_name_to_handle_at cecilia.txt > fh\fP $ \fB./t_open_by_handle_at < fh\fP open_by_handle_at: Operation not permitted $ \fBsudo ./t_open_by_handle_at < fh\fP # Need CAP_DAC_READ_SEARCH Read 28 bytes $ \fBrm cecilia.txt\fP .fi .in Now delete and re-create the file with the same inode number; .BR open_by_handle_at () recognizes that the file referred to by the file handle no longer exists. .in +4n .nf $ \fBstat \-\-printf="%i\\n" cecilia.txt\fP # Display inode number 4072121 $ \fBecho 'Warum?' > cecilia.txt\fP $ \fBstat \-\-printf="%i\\n" cecilia.txt\fP # Check inode number 4072121 $ \fBsudo ./t_open_by_handle_at < fh\fP open_by_handle_at: Stale NFS file handle .fi .in .SS Program source: t_name_to_handle_at.c \& .nf #define _GNU_SOURCE #include <sys/types.h> #include <sys/stat.h> #include <fcntl.h> #include <stdio.h> #include <stdlib.h> #include <unistd.h> #include <errno.h> #include <string.h> #define errExit(msg) do { perror(msg); exit(EXIT_FAILURE); \\ } while (0) int main(int argc, char *argv[]) { struct file_handle *fhp; int mount_id, fhsize, s; if (argc < 2 || strcmp(argv[1], "\-\-help") == 0) { fprintf(stderr, "Usage: %s pathname\\n", argv[0]); exit(EXIT_FAILURE); } /* Allocate file_handle structure */ fhsize = sizeof(struct file_handle); fhp = malloc(fhsize); if (fhp == NULL) errExit("malloc"); /* Make an initial call to name_to_handle_at() to discover the size required for file handle */ fhp\->handle_bytes = 0; s = name_to_handle_at(AT_FDCWD, argv[1], fhp, &mount_id, 0); if (s != \-1 || errno != EOVERFLOW) { fprintf(stderr, "Unexpected result from name_to_handle_at()\\n"); exit(EXIT_FAILURE); } /* Reallocate file_handle structure with correct size */ fhsize = sizeof(struct file_handle) + fhp\->handle_bytes; fhp = realloc(fhp, fhsize); /* Copies fhp\->handle_bytes */ if (fhp == NULL) errExit("realloc"); /* Get file handle from pathname supplied on command line */ if (name_to_handle_at(AT_FDCWD, argv[1], fhp, &mount_id, 0) == \-1)
errExit("name_to_handle_at"); /* Write mount ID, file handle size, and file handle to stdout, for later reuse by t_open_by_handle_at.c */ if (write(STDOUT_FILENO, &mount_id, sizeof(int)) != sizeof(int) || write(STDOUT_FILENO, &fhsize, sizeof(int)) != sizeof(int) || write(STDOUT_FILENO, fhp, fhsize) != fhsize) { fprintf(stderr, "Write failure\\n"); exit(EXIT_FAILURE); } exit(EXIT_SUCCESS); } .fi .SS Program source: t_open_by_handle_at.c \& .nf #define _GNU_SOURCE #include <sys/types.h> #include <sys/stat.h> #include <fcntl.h> #include <limits.h> #include <stdio.h> #include <stdlib.h> #include <unistd.h> #include <string.h> #define errExit(msg) do { perror(msg); exit(EXIT_FAILURE); \\ } while (0) /* Scan /proc/self/mountinfo to find the line whose mount ID matches \(aqmount_id\(aq. (An easier way to do this is to install and use the \(aqlibmount\(aq library provided by the \(aqutil\-linux\(aq project.) Open the corresponding mount path and return the resulting file descriptor. */ static int open_mount_path_by_id(int mount_id) { char *linep; size_t lsize; char mount_path[PATH_MAX]; int fmnt_id, fnd, nread; FILE *fp; fp = fopen("/proc/self/mountinfo", "r"); if (fp == NULL) errExit("fopen"); for (fnd = 0; !fnd ; ) { linep = NULL; nread = getline(&linep, &lsize, fp); if (nread == \-1) break; nread = sscanf(linep, "%d %*d %*s %*s %s", &fmnt_id, mount_path); if (nread != 2) { fprintf(stderr, "Bad sscanf()\\n"); exit(EXIT_FAILURE); } free(linep); if (fmnt_id == mount_id) fnd = 1; } fclose(fp); if (!fnd) { fprintf(stderr, "Could not find mount point\\n"); exit(EXIT_FAILURE); } return open(mount_path, O_RDONLY | O_DIRECTORY); } int main(int argc, char *argv[]) { struct file_handle *fhp; int mount_id, fd, mount_fd, fhsize; ssize_t nread; #define BSIZE 1000 char buf[BSIZE]; if (argc > 1 && strcmp(argv[1], "\-\-help") == 0) { fprintf(stderr, "Usage: %s [mount\-dir]\\n", argv[0]); exit(EXIT_FAILURE); } /* Read data produced by t_name_to_handle_at.c */ if (read(STDIN_FILENO,
&mount_id, sizeof(int)) != sizeof(int)) errExit("read"); if (read(STDIN_FILENO, &fhsize, sizeof(int)) != sizeof(int)) errExit("read"); fhp = malloc(fhsize); if (fhp == NULL) errExit("malloc"); if (read(STDIN_FILENO, fhp, fhsize) != fhsize) errExit("read"); /* Obtain file descriptor for mount point, either by opening the pathname specified on the command line, or by scanning /proc/self/mountinfo to find a mount that matches the \(aqmount_id\(aq obtained by name_to_handle_at() (in t_name_to_handle_at.c) */ if (argc > 1) mount_fd = open(argv[1], O_RDONLY | O_DIRECTORY); else mount_fd = open_mount_path_by_id(mount_id); if (mount_fd == \-1) errExit("opening mount fd"); /* Open name using handle and mount point */ fd = open_by_handle_at(mount_fd, fhp, O_RDONLY); if (fd == \-1) errExit("open_by_handle_at"); /* Try reading a few bytes from the file */ nread = read(fd, buf, BSIZE); if (nread == \-1) errExit("read"); printf("Read %ld bytes\\n", (long) nread); exit(EXIT_SUCCESS); } .fi .SH SEE ALSO .BR blkid (1), .BR findfs (1), .BR open (2), .BR libblkid (3), .BR mount (8) The .I libblkid and .I libmount documentation under the latest .I util-linux release at .UR https://www.kernel.org/pub/linux/utils/util-linux/ .UE -- Michael Kerrisk Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/ Linux/UNIX System Programming Training: http://man7.org/training/ ^ permalink raw reply [flat|nested] 22+ messages in thread
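Editorial aside on the mountinfo scan in t_open_by_handle_at.c above: the following is a minimal sketch, assuming the record layout the draft itself relies on (mount ID in the first field, mount point in the fifth). The helper name parse_mountinfo_line() is hypothetical and is not part of the posted programs. The only intended difference from the posted sscanf() call is that the string conversion is bounded, so an overlong mount point cannot overrun the PATH_MAX buffer.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the posted programs): extract the mount ID
   (field 1) and the mount point (field 5) from one mountinfo record.
   'mount_path' must point to a buffer of at least PATH_MAX bytes.
   Returns 0 on success, or -1 if the line does not have the expected shape. */
int
parse_mountinfo_line(const char *line, int *mount_id, char *mount_path)
{
    char fmt[64];

    assert(line != NULL && mount_id != NULL && mount_path != NULL);

    /* Build "%d %*d %*s %*s %<PATH_MAX-1>s" so that the final string
       conversion is limited to the size of the caller's buffer */
    snprintf(fmt, sizeof(fmt), "%%d %%*d %%*s %%*s %%%ds", PATH_MAX - 1);

    if (sscanf(line, fmt, mount_id, mount_path) != 2)
        return -1;

    return 0;
}
```

A caller would feed each line returned by getline() to this helper and compare the returned ID with the one read from standard input, as open_mount_path_by_id() does.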
* Re: For review: open_by_name_at(2) man page 2014-03-17 15:57 For review: open_by_name_at(2) man page Michael Kerrisk (man-pages) @ 2014-03-17 22:00 ` NeilBrown 2014-03-18 9:43 ` Christoph Hellwig 2014-03-18 12:35 ` Michael Kerrisk (man-pages) 2014-03-18 9:37 ` Christoph Hellwig 2014-03-18 12:55 ` For review: open_by_name_at(2) man page [v2] Michael Kerrisk (man-pages) 2 siblings, 2 replies; 22+ messages in thread From: NeilBrown @ 2014-03-17 22:00 UTC (permalink / raw) To: Michael Kerrisk (man-pages) Cc: Aneesh Kumar K.V, linux-man, Linux-Fsdevel, lkml, Andreas Dilger, Christoph Hellwig [-- Attachment #1: Type: text/plain, Size: 4127 bytes --] On Mon, 17 Mar 2014 16:57:29 +0100 "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com> wrote: > Hi Aneesh, (and others) > > Below is a man page I've written for name_to_handle_at(2) and > open_by_name_at(2). Would you be willing to review it please, > and let me know of any corrections/improvements? > > Thanks, > > Michael Thanks for writing this Michael. The fact that I can only find very small points to comment on reflects the high quality... > Otherwise, > .IR dirfd > must be a file descriptor that refers to a directory, and ^^^^^^^ > .I pathname > is interpreted relative to that directory. As you clarify later, "must be" is not correct. Maybe this is just an issue of style, in which case you should obviously keep a consistent style across man pages, but to me it sounds wrong. I would use "is generally" or similar. > The > .IR mountdirfd > argument is a file descriptor for a directory under > the mount point with respect to which > .IR handle > should be interpreted. mountdirfd does not have to be for a directory. It can be for any object in the filesystem. And I would say "in", not "under". If /foo and /foo/bar are both mountpoints, and I want to look up a filehandle for the filesystem mounted at /foo, then opening "/foo/bar" wouldn't work even though /foo/bar is "under" /foo. 
And opening "/foo" would work even though "/foo" is not under "/foo/" (is it?). The .IR mountfd argument is a file descriptor for any object (file, directory, etc.) in the filesystem with respect to which .IR handle should be interpreted. ?? > .B ESTALE > The specified > .I handle > is no longer valid. ESTALE is also returned if the filesystem does not support file-handle -> file mappings. On filesystems which don't provide export_operations (/sys /proc ubifs romfs cramfs nfs coda ... several others) name_to_handle_at will produce a generic handle using the 32 bit inode and 32 bit i_generation. open_by_name_at given this (or any) filehandle will fail with ESTALE. I don't know how best to include this in the documentation. Maybe a note earlier noting that some filesystems do not support open_by_name_at(), and you cannot programmatically determine which do except by trying. At the same time note that a file handle can become invalid if a file is deleted or for any other reason as determined by the filesystem, and that the error is the same as for when the filesystem doesn't support open_by_name_at. > For example, one can use the device name in the fifth field of the > .I mountinfo > record to search for the corresponding device UUID via the symbolic links in > .IR /dev/disk/by-uuid . > (A more comfortable way of obtaining the UUID is to use the > .\" e.g., http://stackoverflow.com/questions/6748429/using-libblkid-to-find-uuid-of-a-partition > .BR libblkid (3) > library, which uses the > .I /sys > filesystem to obtain the same information.) Does it? My understanding from "man libblkid" (it is a while since I've read the code) is that it either uses info in /dev/disk/by-* or reads directly from the block devices (maybe using /sys to find them?) and interprets the superblock to extract a UUID. > Now delete and re-create the file with the same inode number; > .BR open_by_handle_at () > recognizes that the file referred to by the file handle no longer exists.
> > .in +4n > .nf > $ \fBstat \-\-printf="%i\\n" cecilia.txt\fP # Display inode number > 4072121 > $ \fBecho 'Warum?' > cecilia.txt\fP > $ \fBstat \-\-printf="%i\\n" cecilia.txt\fP # Check inode number > 4072121 > $ \fBsudo ./t_open_by_handle_at < fh\fP > open_by_handle_at: Stale NFS file handle Something is very wrong here. echo foo > somefile does not "delete and re-create" the file. It opens and truncates. That operation should not invalidate the filehandle on any sane filesystem. > if (argc > 1) > mount_fd = open(argv[1], O_RDONLY | O_DIRECTORY); O_DIRECTORY is not appropriate, as mentioned earlier. Thanks, NeilBrown [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 828 bytes --] ^ permalink raw reply [flat|nested] 22+ messages in thread
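Editorial aside illustrating Neil's point about the shell session: a minimal sketch, assuming that the shell implements "> file" by opening the file with O_WRONLY | O_CREAT | O_TRUNC, which is the behaviour Neil describes as "opens and truncates". The helper names and the scratch file name are hypothetical. The check reports whether the inode number survives such an overwrite; it says nothing about the file handle itself.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Return the inode number of 'path', or 0 if stat() fails */
ino_t
inode_of(const char *path)
{
    struct stat sb;

    return (stat(path, &sb) == 0) ? sb.st_ino : 0;
}

/* Overwrite 'path' the way "echo text > path" is assumed to:
   open with O_CREAT | O_TRUNC and write. Returns 0 on success, -1 on error. */
int
overwrite_like_shell(const char *path, const char *text)
{
    size_t len = strlen(text);
    ssize_t n;
    int fd;

    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd == -1)
        return -1;

    n = write(fd, text, len);
    close(fd);
    return (n == (ssize_t) len) ? 0 : -1;
}

/* Returns 1 if the inode number is unchanged after a second overwrite,
   0 if it changed, or -1 on error. The scratch file is removed afterwards. */
int
inode_survives_overwrite(const char *path)
{
    ino_t before, after;

    if (overwrite_like_shell(path, "first\n") == -1)
        return -1;
    before = inode_of(path);

    if (overwrite_like_shell(path, "second\n") == -1) {
        unlink(path);
        return -1;
    }
    after = inode_of(path);

    unlink(path);
    if (before == 0 || after == 0)
        return -1;
    return before == after;
}
```

Because the second open() reuses the existing file rather than unlinking it, the same file is still there afterwards, which is why the redirection in the draft's session cannot by itself make the handle stale.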
* Re: For review: open_by_name_at(2) man page 2014-03-17 22:00 ` NeilBrown @ 2014-03-18 9:43 ` Christoph Hellwig 2014-03-18 12:37 ` Michael Kerrisk (man-pages) 2014-03-18 12:35 ` Michael Kerrisk (man-pages) 1 sibling, 1 reply; 22+ messages in thread From: Christoph Hellwig @ 2014-03-18 9:43 UTC (permalink / raw) To: NeilBrown Cc: Michael Kerrisk (man-pages), Aneesh Kumar K.V, linux-man, Linux-Fsdevel, lkml, Andreas Dilger, Christoph Hellwig On Tue, Mar 18, 2014 at 09:00:07AM +1100, NeilBrown wrote: > ESTALE is also returned if the filesystem does not support file-handle -> > file mappings. > On filesystems which don't provide export_operations (/sys /proc ubifs > romfs cramfs nfs coda ... several others) name_to_handle_at will produce a > generic handle using the 32 bit inode and 32 bit i_generation. Do we? Seems like the code is erroring out early if there are no export_ops? > Does it? My understanding from "man libblkid" (it is a while since I've read > the code) is that it either uses info in /dev/disks/by-* or reads directly > from the block devices (maybe using /sys to find them?) and interprets the > superblock to extract a UUID. It normally reads directly from disk, unless it has changed very recently. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: For review: open_by_name_at(2) man page 2014-03-18 9:43 ` Christoph Hellwig @ 2014-03-18 12:37 ` Michael Kerrisk (man-pages) 2014-03-18 22:24 ` NeilBrown 0 siblings, 1 reply; 22+ messages in thread From: Michael Kerrisk (man-pages) @ 2014-03-18 12:37 UTC (permalink / raw) To: Christoph Hellwig, NeilBrown Cc: mtk.manpages, Aneesh Kumar K.V, linux-man, Linux-Fsdevel, lkml, Andreas Dilger On 03/18/2014 10:43 AM, Christoph Hellwig wrote: > On Tue, Mar 18, 2014 at 09:00:07AM +1100, NeilBrown wrote: >> ESTALE is also returned if the filesystem does not support file-handle -> >> file mappings. >> On filesystems which don't provide export_operations (/sys /proc ubifs >> romfs cramfs nfs coda ... several others) name_to_handle_at will produce a >> generic handle using the 32 bit inode and 32 bit i_generation. > > Do we? Seems like the code is erroring out early if there are no > export_ops? It appears to me that Neil's statement isn't correct, at least for /proc and /sys (see my other mail, to Neil). I'm unsure about whether it is true for some of those other FSes though. >> Does it? My understanding from "man libblkid" (it is a while since I've read >> the code) is that it either uses info in /dev/disks/by-* or reads directly >> from the block devices (maybe using /sys to find them?) and interprets the >> superblock to extract a UUID. > > It normally reads directly from disk, unless it has changed very > recently. Thanks. As noted in my mail, I solved this one by just saying a little less about libblkid. Cheers, Michael -- Michael Kerrisk Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/ Linux/UNIX System Programming Training: http://man7.org/training/ ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: For review: open_by_name_at(2) man page @ 2014-03-18 22:24 ` NeilBrown 0 siblings, 0 replies; 22+ messages in thread From: NeilBrown @ 2014-03-18 22:24 UTC (permalink / raw) To: Michael Kerrisk (man-pages) Cc: Christoph Hellwig, Aneesh Kumar K.V, linux-man, Linux-Fsdevel, lkml, Andreas Dilger [-- Attachment #1: Type: text/plain, Size: 2560 bytes --] On Tue, 18 Mar 2014 13:37:15 +0100 "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com> wrote: > On 03/18/2014 10:43 AM, Christoph Hellwig wrote: > > On Tue, Mar 18, 2014 at 09:00:07AM +1100, NeilBrown wrote: > >> ESTALE is also returned if the filesystem does not support file-handle -> > >> file mappings. > >> On filesystems which don't provide export_operations (/sys /proc ubifs > >> romfs cramfs nfs coda ... several others) name_to_handle_at will produce a > >> generic handle using the 32 bit inode and 32 bit i_generation. > > > > Do we? Seems like the code is erroring out early if there are no > > export_ops? > > It appears to me that Neil's statement isn't correct, at least for /proc > and /sys (see my other mail, to Neil). I'm unsure about whether it is true > for some of those other FSes thought. Indeed, I was wrong. I was looking at int exportfs_encode_inode_fh(struct inode *inode, struct fid *fid, int *max_len, struct inode *parent) { const struct export_operations *nop = inode->i_sb->s_export_op; if (nop && nop->encode_fh) return nop->encode_fh(inode, fid->raw, max_len, parent); return export_encode_fh(inode, fid, max_len, parent); } which uses a default if there is no 'nop'. However do_sys_name_to_handle() contains if (!path->dentry->d_sb->s_export_op || !path->dentry->d_sb->s_export_op->fh_to_dentry) return -EOPNOTSUPP; long before export_encode_inode_fh() gets called. So the default isn't used. I would have thought that exportfs_encode_inode_fh would never get called if there were no s_export_op pointer - certainly name_to_handle_at and nfsd would never call it in that case. 
However it seems that This routine will be used to generate a file handle in fdinfo output for inotify subsystem, where if no s_export_op present the general export_encode_fh should be used. Thus add a test if s_export_op present inside exportfs_encode_fh itself. according to commit ab49bdecc3ebb46ab661f5f05d5c5ea9606406c6 Author: Cyrill Gorcunov <gorcunov@openvz.org> Date: Mon Dec 17 16:05:06 2012 -0800 I guess that means that you can extract filehandles from /proc/self/fdinfo/$FD when $FD is an inotify fd which is watching the particular file..... I wouldn't have expected that, but maybe it is a good idea. So yes: if the filesystem doesn't support filehandles you get EOPNOTSUPP. So if you get ESTALE from open_by_handle_at(), then it really is a stale handle. Sorry for the confusion. NeilBrown [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 828 bytes --] ^ permalink raw reply [flat|nested] 22+ messages in thread
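Editorial aside on the conclusion above: a minimal sketch of a user-space probe, assuming the ordering Neil quotes from do_sys_name_to_handle(), where the s_export_op check returns EOPNOTSUPP before any handle is encoded. Under that assumption, a call with handle_bytes set to zero should fail with EOVERFLOW on a filesystem that supports handles and with EOPNOTSUPP on one that does not. The helper name handle_support() is hypothetical, and the behaviour on any particular kernel or filesystem is not verified here.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>

/* Hypothetical probe: returns 1 if the filesystem containing 'path' appears
   to support name_to_handle_at(), 0 if the call fails with EOPNOTSUPP, and
   -1 (with errno set) for any other outcome, such as a nonexistent path. */
int
handle_support(const char *path)
{
    struct file_handle *fhp;
    int mount_id, rc, saved_errno;

    assert(path != NULL);

    fhp = malloc(sizeof(*fhp));
    if (fhp == NULL)
        return -1;

    /* A zero-sized buffer can never hold a handle, so the call is expected
       to fail; the error number tells us why */
    fhp->handle_bytes = 0;
    rc = name_to_handle_at(AT_FDCWD, path, fhp, &mount_id, 0);
    saved_errno = (rc == -1) ? errno : 0;
    free(fhp);

    if (rc == -1 && saved_errno == EOVERFLOW)
        return 1;               /* Handles supported; buffer was too small */
    if (rc == -1 && saved_errno == EOPNOTSUPP)
        return 0;               /* No usable export operations */

    errno = saved_errno;
    return -1;
}
```

With such a probe in hand, an ESTALE from a later open_by_handle_at() on the same filesystem can be read as a genuinely stale handle, which matches the conclusion reached in this message.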
* Re: For review: open_by_name_at(2) man page 2014-03-18 22:24 ` NeilBrown (?) @ 2014-03-19 9:09 ` Michael Kerrisk (man-pages) -1 siblings, 0 replies; 22+ messages in thread
From: Michael Kerrisk (man-pages) @ 2014-03-19 9:09 UTC (permalink / raw)
To: NeilBrown
Cc: mtk.manpages, Christoph Hellwig, Aneesh Kumar K.V, linux-man, Linux-Fsdevel, lkml, Andreas Dilger

Hi Neil,

On 03/18/2014 11:24 PM, NeilBrown wrote:
> On Tue, 18 Mar 2014 13:37:15 +0100 "Michael Kerrisk (man-pages)"
> <mtk.manpages@gmail.com> wrote:
>
>> On 03/18/2014 10:43 AM, Christoph Hellwig wrote:
>>> On Tue, Mar 18, 2014 at 09:00:07AM +1100, NeilBrown wrote:
>>>> ESTALE is also returned if the filesystem does not support file-handle ->
>>>> file mappings.
>>>> On filesystems which don't provide export_operations (/sys /proc ubifs
>>>> romfs cramfs nfs coda ... several others) name_to_handle_at will produce a
>>>> generic handle using the 32 bit inode and 32 bit i_generation.
>>>
>>> Do we? Seems like the code is erroring out early if there are no
>>> export_ops?
>>
>> It appears to me that Neil's statement isn't correct, at least for /proc
>> and /sys (see my other mail, to Neil). I'm unsure about whether it is true
>> for some of those other FSes though.
>
>
> Indeed, I was wrong.
>
> I was looking at
>
> int exportfs_encode_inode_fh(struct inode *inode, struct fid *fid,
>                              int *max_len, struct inode *parent)
> {
>         const struct export_operations *nop = inode->i_sb->s_export_op;
>
>         if (nop && nop->encode_fh)
>                 return nop->encode_fh(inode, fid->raw, max_len, parent);
>
>         return export_encode_fh(inode, fid, max_len, parent);
> }
>
>
> which uses a default if there is no 'nop'.
>
> However do_sys_name_to_handle() contains
>
>         if (!path->dentry->d_sb->s_export_op ||
>             !path->dentry->d_sb->s_export_op->fh_to_dentry)
>                 return -EOPNOTSUPP;
>
> long before exportfs_encode_inode_fh() gets called. So the default isn't used.

Okay.

> I would have thought that exportfs_encode_inode_fh would never get called if
> there were no s_export_op pointer - certainly name_to_handle_at and nfsd
> would never call it in that case.
> However it seems that
>
>     This routine will be used to generate a file handle in fdinfo output for
>     inotify subsystem, where if no s_export_op present the general
>     export_encode_fh should be used. Thus add a test if s_export_op present
>     inside exportfs_encode_fh itself.
>
> according to
>
>   commit ab49bdecc3ebb46ab661f5f05d5c5ea9606406c6
>   Author: Cyrill Gorcunov <gorcunov@openvz.org>
>   Date:   Mon Dec 17 16:05:06 2012 -0800
>
>
> I guess that means that you can extract filehandles from /proc/self/fdinfo/$FD
> when $FD is an inotify fd which is watching the particular file..... I
> wouldn't have expected that, but maybe it is a good idea.

Yes, it does--I tested it, and it works! I was unaware of this feature,
though I'm not sure that I'll add anything to a man page just yet.

> So yes: if the filesystem doesn't support filehandles you get EOPNOTSUPP.
> So if you get ESTALE from open_by_handle_at(), then it really is a stale
> handle. Sorry for the confusion.

Yup, I've updated the page now.

Cheers,

Michael

--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

^ permalink raw reply [flat|nested] 22+ messages in thread
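The fdinfo behavior mentioned above can be observed from user space. The following is only a minimal sketch: the helper name dump_inotify_fdinfo() is hypothetical, and it assumes a kernel that emits one line beginning with "inotify" per watch in /proc/self/fdinfo/<fd>, as the commit quoted above suggests. The exact fields on that line are not checked here.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/inotify.h>
#include <unistd.h>

/* Hypothetical helper (not from the thread): watch 'path' with inotify,
   then copy the "inotify" lines of /proc/self/fdinfo/<fd> to 'out'.
   Returns the number of such lines, or -1 if the watch or the fdinfo
   file could not be set up. */
int
dump_inotify_fdinfo(const char *path, FILE *out)
{
    char fdinfo_path[64], line[1024];
    int ifd, count = 0;
    FILE *fp;

    assert(path != NULL && out != NULL);

    ifd = inotify_init1(IN_CLOEXEC);
    if (ifd == -1)
        return -1;

    if (inotify_add_watch(ifd, path, IN_MODIFY) == -1) {
        close(ifd);
        return -1;
    }

    snprintf(fdinfo_path, sizeof(fdinfo_path), "/proc/self/fdinfo/%d", ifd);
    fp = fopen(fdinfo_path, "r");
    if (fp == NULL) {
        close(ifd);
        return -1;
    }

    /* Keep only the per-watch lines; other lines (pos, flags, ...) are
       skipped */
    while (fgets(line, sizeof(line), fp) != NULL) {
        if (strncmp(line, "inotify ", 8) == 0) {
            fputs(line, out);
            count++;
        }
    }

    fclose(fp);
    close(ifd);
    return count;
}
```

Calling dump_inotify_fdinfo("cecilia.txt", stdout) should print whatever the running kernel reports for that watch; whether file-handle fields appear there depends on the kernel configuration.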
* Re: For review: open_by_name_at(2) man page @ 2014-03-18 12:35 ` Michael Kerrisk (man-pages) 0 siblings, 0 replies; 22+ messages in thread
From: Michael Kerrisk (man-pages) @ 2014-03-18 12:35 UTC (permalink / raw)
To: NeilBrown
Cc: mtk.manpages, Aneesh Kumar K.V, linux-man, Linux-Fsdevel, lkml, Andreas Dilger, Christoph Hellwig

On 03/17/2014 11:00 PM, NeilBrown wrote:
> On Mon, 17 Mar 2014 16:57:29 +0100 "Michael Kerrisk (man-pages)"
> <mtk.manpages@gmail.com> wrote:
>
>> Hi Aneesh, (and others)
>>
>> Below is a man page I've written for name_to_handle_at(2) and
>> open_by_name_at(2). Would you be willing to review it please,
>> and let me know of any corrections/improvements?
>>
>> Thanks,
>>
>> Michael
>
> Thanks for writing this Michael. The fact that I can only find very small
> points to comment on reflects the high quality...

Thanks, Neil. But there was at least one good clanger below :-}.

>
>> Otherwise,
>> .IR dirfd
>> must be a file descriptor that refers to a directory, and
>   ^^^^^^^
>> .I pathname
>> is interpreted relative to that directory.
>
> As you clarify later, "must be" is not correct. Maybe this is just an issue
> of style, in which case you should obviously keep a consistent style across
> man pages, but to me it sounds wrong. I would use "is generally" or similar.

Yep, good point. In fact, what I did was rewrite that section
completely, to more clearly describe the distinct cases based on
dirfd/pathname/AT_EMPTY_PATH.

>> The
>> .IR mountdirfd
>> argument is a file descriptor for a directory under
>> the mount point with respect to which
>> .IR handle
>> should be interpreted.
>
> mountdirfd does not have to be for a directory. It can be for any object in
> the filesystem. And I would say "in", not "under".
> If /foo and /foo/bar are both mountpoints, and I want to look up a
> filehandle for the filesystem mounted at /foo, then opening "/foo/bar"
> wouldn't work even though /foo/bar is "under" /foo. And opening "/foo" would
> work even though "/foo" is not under "/foo/" (is it?).

Good catch. I got deceived by the name of the argument, which in the
kernel source is indeed 'mountdirfd', implying it must be a descriptor
for a directory. I'll rename the argument in the man page to 'mount_fd'
and fix the description as you suggest here:

> The
> .IR mountfd
> argument is a file descriptor for any object (file, directory, etc.) in the
> filesystem with respect to which

I did s/filesystem/mounted filesystem/

> .IR handle
> should be interpreted.
>
> ??

>> .B ESTALE
>> The specified
>> .I handle
>> is no longer valid.
>
> ESTALE is also returned if the filesystem does not support file-handle ->
> file mappings.
> On filesystems which don't provide export_operations (/sys /proc ubifs
> romfs cramfs nfs coda ... several others) name_to_handle_at will produce a
> generic handle using the 32 bit inode and 32 bit i_generation.

Are you sure about this? When I try name_to_handle_at() on /proc and /sys,
it gives an error (EOPNOTSUPP). I haven't tested the other FSes though,
so maybe some of them do what you say.

> open_by_name_at given this (or any) filehandle will fail with ESTALE.
> I don't know how best to include this in the documentation. Maybe a note
> earlier noting that some filesystems do not support open_by_name_at(), and
> you cannot programmatically determine which do except by trying.
> At the same time note that a file handle can become invalid if a file is
> deleted or for any other reason as determined by the filesystem, and that the
> error is the same as for when the filesystem doesn't support open_by_name_at.

I've added text about invalid file handles into NOTES, and noted that
not all FSes support the production of file handles, but haven't noted
ESTALE for the latter, since I don't yet know if your statement above
is true for some filesystems.

>> For example, one can use the device name in the fifth field of the
>> .I mountinfo
>> record to search for the corresponding device UUID via the symbolic links in
>> .IR /dev/disks/by-uuid .
>> (A more comfortable way of obtaining the UUID is to use the
>> .\" e.g., http://stackoverflow.com/questions/6748429/using-libblkid-to-find-uuid-of-a-partition
>> .BR libblkid (3)
>> library, which uses the
>> .I /sys
>> filesystem to obtain the same information.)
>
> Does it? My understanding from "man libblkid" (it is a while since I've read
> the code) is that it either uses info in /dev/disks/by-* or reads directly
> from the block devices (maybe using /sys to find them?) and interprets the
> superblock to extract a UUID.

Thanks (and to Christoph) -- I'll just remove the words
"which uses the /sys filesystem to obtain the same information"

>> Now delete and re-create the file with the same inode number;
>> .BR open_by_handle_at ()
>> recognizes that the file referred to by the file handle no longer exists.
>>
>> .in +4n
>> .nf
>> $ \fBstat \-\-printf="%i\\n" cecilia.txt\fP     # Display inode number
>> 4072121
>> $ \fBecho 'Warum?' > cecilia.txt\fP
>> $ \fBstat \-\-printf="%i\\n" cecilia.txt\fP     # Check inode number
>> 4072121
>> $ \fBsudo ./t_open_by_handle_at < fh\fP
>> open_by_handle_at: Stale NFS file handle
>
> Something is very wrong here.
>     echo foo > somefile
> does not "delete and re-create" the file. It opens and truncates.
> That operation should not invalidate the filehandle on any sane filesystem.

Indeed! I don't know quite what I was smoking as I reviewed that piece.
In fact, I started writing this page a long time ago, but then other
events intervened, and it was a long time before I came back to it recently.
Certainly, when I produced that shell session log, things proceeded
(almost) as shown. I'm guessing that what happened is that I by
accident edited out a line

    rm cecilia.txt

just before

    echo 'Warum?' > cecilia.txt

Fixed now. (In that case of course, it is of course a matter of chance
whether the pathname is re-created with the same i-node number, but if
you are quick, it often is. I'll add some explanation to the page.)

>> if (argc > 1)
>>     mount_fd = open(argv[1], O_RDONLY | O_DIRECTORY);
>
> O_DIRECTORY is not appropriate, as mentioned earlier.

Fixed (in two places).

Thanks for the review, Neil. That helped fix a lot of problems in the page.

Cheers,

Michael

--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: For review: open_by_name_at(2) man page 2014-03-18 12:35 ` Michael Kerrisk (man-pages) (?) @ 2014-03-18 13:07 ` Christoph Hellwig 2014-03-18 13:30 ` Michael Kerrisk (man-pages) -1 siblings, 1 reply; 22+ messages in thread
From: Christoph Hellwig @ 2014-03-18 13:07 UTC (permalink / raw)
To: Michael Kerrisk (man-pages)
Cc: NeilBrown, Aneesh Kumar K.V, linux-man, Linux-Fsdevel, lkml, Andreas Dilger

On Tue, Mar 18, 2014 at 01:35:06PM +0100, Michael Kerrisk (man-pages) wrote:
> Indeed! I don't know quite what I was smoking as I reviewed that piece.
> In fact, I started writing this page a long time ago, but then other
> events intervened, and it was a long time before I came back to it recently.
> Certainly, when I produced that shell session log, things proceeded
> (almost) as shown. I'm guessing that what happened is that I by
> accident edited out a line
>
>     rm cecilia.txt
>
> just before
>
>     echo 'Warum?' > cecilia.txt
>
> Fixed now. (In that case of course, it is of course a matter of chance
> whether the pathname is re-created with the same i-node number, but if
> you are quick, it often is. I'll add some explanation to the page.)

That's why the file handles contain a generation counter that gets
incremented in this case.

^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: For review: open_by_name_at(2) man page 2014-03-18 13:07 ` Christoph Hellwig @ 2014-03-18 13:30 ` Michael Kerrisk (man-pages) 0 siblings, 0 replies; 22+ messages in thread
From: Michael Kerrisk (man-pages) @ 2014-03-18 13:30 UTC (permalink / raw)
To: Christoph Hellwig
Cc: mtk.manpages, NeilBrown, Aneesh Kumar K.V, linux-man, Linux-Fsdevel, lkml, Andreas Dilger

On 03/18/2014 02:07 PM, Christoph Hellwig wrote:
> That's why the file handles contain a generation counter that gets
> incremented in this case.

Ahh, yes. Thanks for the reminder/clue.

Cheers,

Michael

--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

^ permalink raw reply [flat|nested] 22+ messages in thread
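The effect of the generation counter can be made visible by comparing the handle obtained before a delete/re-create cycle with the one obtained afterwards. The sketch below is illustrative only: get_handle() and handles_equal() are hypothetical helper names, the handle contents are filesystem-specific and opaque, and the comparison is meaningful only for two handles from the same mounted filesystem.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a malloc'd handle for 'path', or NULL on
   failure. The buffer is sized with MAX_HANDLE_SZ so that a single call
   is enough. The caller frees the result. */
struct file_handle *
get_handle(const char *path)
{
    struct file_handle *fhp;
    int mount_id;

    assert(path != NULL);

    fhp = malloc(sizeof(*fhp) + MAX_HANDLE_SZ);
    if (fhp == NULL)
        return NULL;

    fhp->handle_bytes = MAX_HANDLE_SZ;
    if (name_to_handle_at(AT_FDCWD, path, fhp, &mount_id, 0) == -1) {
        free(fhp);
        return NULL;
    }
    return fhp;
}

/* Hypothetical helper: two handles are treated as equal only if the
   type, the length, and every byte of the opaque identifier match. */
int
handles_equal(const struct file_handle *a, const struct file_handle *b)
{
    return a->handle_type == b->handle_type &&
           a->handle_bytes == b->handle_bytes &&
           memcmp(a->f_handle, b->f_handle, a->handle_bytes) == 0;
}
```

Taking one handle for cecilia.txt, removing and re-creating the file, and taking a second handle would then be expected to give handles_equal() == 0 on a filesystem that encodes a generation number, even when stat reports the same inode number.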
* Re: For review: open_by_name_at(2) man page 2014-03-17 15:57 For review: open_by_name_at(2) man page Michael Kerrisk (man-pages) 2014-03-17 22:00 ` NeilBrown @ 2014-03-18 9:37 ` Christoph Hellwig 2014-03-18 12:41 ` Michael Kerrisk (man-pages) 2014-03-18 12:55 ` For review: open_by_name_at(2) man page [v2] Michael Kerrisk (man-pages) 2 siblings, 1 reply; 22+ messages in thread
From: Christoph Hellwig @ 2014-03-18 9:37 UTC (permalink / raw)
To: Michael Kerrisk (man-pages)
Cc: Aneesh Kumar K.V, linux-man, Linux-Fsdevel, lkml, Andreas Dilger, NeilBrown

Hi Michael,

the man page looks reasonable. If you refer to openat(2) instead of
open(2) in the ERRORS section you could avoid duplicating a few of the
dirfd and flags related errors.

^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: For review: open_by_name_at(2) man page 2014-03-18 9:37 ` Christoph Hellwig @ 2014-03-18 12:41 ` Michael Kerrisk (man-pages) 0 siblings, 0 replies; 22+ messages in thread
From: Michael Kerrisk (man-pages) @ 2014-03-18 12:41 UTC (permalink / raw)
To: Christoph Hellwig
Cc: mtk.manpages, Aneesh Kumar K.V, linux-man, Linux-Fsdevel, lkml, Andreas Dilger, NeilBrown

On 03/18/2014 10:37 AM, Christoph Hellwig wrote:
> Hi Michael,
>
> the man page looks reasonable. If you refer to openat(2) instead of
> open(2) in the ERRORS section you could avoid duplicating a few of the
> dirfd and flags related errors.

Good idea. Done.

Cheers,

Michael

--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

^ permalink raw reply [flat|nested] 22+ messages in thread
* For review: open_by_name_at(2) man page [v2] 2014-03-17 15:57 For review: open_by_name_at(2) man page Michael Kerrisk (man-pages) 2014-03-17 22:00 ` NeilBrown 2014-03-18 9:37 ` Christoph Hellwig @ 2014-03-18 12:55 ` Michael Kerrisk (man-pages) 2014-03-19 4:13 ` NeilBrown 2014-03-19 6:42 ` For review: open_by_name_at(2) man page [v2] Mike Frysinger 2 siblings, 2 replies; 22+ messages in thread
From: Michael Kerrisk (man-pages) @ 2014-03-18 12:55 UTC (permalink / raw)
To: Aneesh Kumar K.V, linux-man, Linux-Fsdevel, lkml
Cc: mtk.manpages, Andreas Dilger, NeilBrown, Christoph Hellwig

Hi Aneesh, (and others)

After integrating review comments from NeilBrown and Christoph Hellwig,
here is draft 2 of a man page I've written for name_to_handle_at(2) and
open_by_handle_at(2). Especially thanks to Neil's comments, several parts
of the page underwent a substantial rewrite. Would you be willing
to review it please, and let me know of any corrections/improvements?
There are some FIXMEs in the page that I would especially like some help with.

Thanks,

Michael

'\" t -*- coding: UTF-8 -*-
.\" Copyright (c) 2014 by Michael Kerrisk <mtk.manpages@gmail.com>
.\"
.\" %%%LICENSE_START(VERBATIM)
.\" Permission is granted to make and distribute verbatim copies of this
.\" manual provided the copyright notice and this permission notice are
.\" preserved on all copies.
.\"
.\" Permission is granted to copy and distribute modified versions of this
.\" manual under the conditions for verbatim copying, provided that the
.\" entire resulting derived work is distributed under the terms of a
.\" permission notice identical to this one.
.\"
.\" Since the Linux kernel and libraries are constantly changing, this
.\" manual page may be incorrect or out-of-date. The author(s) assume no
.\" responsibility for errors or omissions, or for damages resulting from
.\" the use of the information contained herein. The author(s) may not
.\" have taken the same level of care in the production of this manual,
.\" which is licensed free of charge, as they might when working
.\" professionally.
.\"
.\" Formatted or processed versions of this manual, if unaccompanied by
.\" the source, must acknowledge the copyright and authors of this work.
.\" %%%LICENSE_END
.\"
.TH OPEN_BY_HANDLE_AT 2 2014-03-24 "Linux" "Linux Programmer's Manual"
.SH NAME
name_to_handle_at, open_by_handle_at \- obtain handle
for a pathname and open file via a handle
.SH SYNOPSIS
.nf
.B #define _GNU_SOURCE
.B #include <sys/types.h>
.B #include <sys/stat.h>
.B #include <fcntl.h>

.BI "int name_to_handle_at(int " dirfd ", const char *" pathname ,
.BI "                      struct file_handle *" handle ,
.BI "                      int *" mount_id ", int " flags );

.BI "int open_by_handle_at(int " mount_fd ", struct file_handle *" handle ,
.BI "                      int " flags );
.fi
.SH DESCRIPTION
The
.BR name_to_handle_at ()
and
.BR open_by_handle_at ()
system calls split the functionality of
.BR openat (2)
into two parts:
.BR name_to_handle_at ()
returns an opaque handle that corresponds to a specified file;
.BR open_by_handle_at ()
opens the file corresponding to a handle returned by a previous call to
.BR name_to_handle_at ()
and returns an open file descriptor.
.\"
.\"
.SS name_to_handle_at()
The
.BR name_to_handle_at ()
system call returns a file handle and a mount ID corresponding to
the file specified by the
.IR dirfd
and
.IR pathname
arguments.
The file handle is returned via the argument
.IR handle ,
which is a pointer to a structure of the following form:

.in +4n
.nf
struct file_handle {
    unsigned int  handle_bytes;   /* Size of f_handle [in, out] */
    int           handle_type;    /* Handle type [out] */
    unsigned char f_handle[0];    /* File identifier (sized by
                                     caller) [out] */
};
.fi
.in
.PP
It is the caller's responsibility to allocate the structure
with a size large enough to hold the handle returned in
.IR f_handle .
Before the call, the
.IR handle_bytes
field should be initialized to contain the allocated size for
.IR f_handle .
(The constant
.BR MAX_HANDLE_SZ ,
defined in
.IR <fcntl.h> ,
specifies the maximum possible size for a file handle.)
Upon successful return, the
.IR handle_bytes
field is updated to contain the number of bytes actually written to
.IR f_handle .

The caller can discover the required size for the
.I file_handle
structure by making a call in which
.IR handle->handle_bytes
is zero;
in this case, the call fails with the error
.BR EOVERFLOW
and
.IR handle->handle_bytes
is set to indicate the required size;
the caller can then use this information to allocate a structure
of the correct size (see EXAMPLE below).

Other than the use of the
.IR handle_bytes
field, the caller should treat the
.IR file_handle
structure as an opaque data type: the
.IR handle_type
and
.IR f_handle
fields are needed only by a subsequent call to
.BR open_by_handle_at ().

Together, the
.I pathname
and
.I dirfd
arguments identify the file for which a handle is to be obtained.
There are four distinct cases:
.IP * 3
If
.I pathname
is a nonempty string containing an absolute pathname,
then a handle is returned for the file referred to by that pathname.
In this case,
.IR dirfd
is ignored.
.IP *
If
.I pathname
is a nonempty string containing a relative pathname and
.IR dirfd
has the special value
.BR AT_FDCWD ,
then
.I pathname
is interpreted relative to the current working directory of the caller,
and a handle is returned for the file to which it refers.
.IP *
If
.I pathname
is a nonempty string containing a relative pathname and
.IR dirfd
is a file descriptor referring to a directory, then
.I pathname
is interpreted relative to the directory referred to by
.IR dirfd ,
and a handle is returned for the file to which it refers.
(See
.BR openat (2)
for an explanation of why "directory file descriptors" are useful.)
.IP *
If
.I pathname
is an empty string and
.I flags
specifies the value
.BR AT_EMPTY_PATH ,
then
.IR dirfd
can be an open file descriptor referring to any type of file,
or
.BR AT_FDCWD ,
meaning the current working directory,
and a handle is returned for the file to which it refers.
.PP
The
.I mount_id
argument returns an identifier for the filesystem
mount that corresponds to
.IR pathname .
This corresponds to the first field in one of the records in
.IR /proc/self/mountinfo .
Opening the pathname in the fifth field of that record yields a file
descriptor for the mount point;
that file descriptor can be used in a subsequent call to
.BR open_by_handle_at ().

The
.I flags
argument is a bit mask constructed by ORing together
zero or more of the following values:
.TP
.B AT_EMPTY_PATH
Allow
.I pathname
to be an empty string.
See above.
(The file descriptor in
.I dirfd
may have been obtained using the
.BR open (2)
.B O_PATH
flag.)
.TP
.B AT_SYMLINK_FOLLOW
By default,
.BR name_to_handle_at ()
does not dereference
.I pathname
if it is a symbolic link.
The flag
.B AT_SYMLINK_FOLLOW
can be specified in
.I flags
to cause
.I pathname
to be dereferenced if it is a symbolic link.
.SS open_by_handle_at()
The
.BR open_by_handle_at ()
system call opens the file referred to by
.IR handle ,
a file handle returned by a previous call to
.BR name_to_handle_at ().

The
.IR mount_fd
argument is a file descriptor for any object (file, directory, etc.)
in the mounted filesystem with respect to which
.IR handle
should be interpreted.
The special value
.B AT_FDCWD
can be specified, meaning the current working directory of the caller.

The
.I flags
argument is as for
.BR open (2).
.\" FIXME: Confirm that the following is intended behavior.
.\"        (It certainly seems to be the behavior, from experimenting.)
If
.I handle
refers to a symbolic link, the caller must specify the
.B O_PATH
flag, and the symbolic link is not dereferenced (the
.B O_NOFOLLOW
flag, if specified, is ignored).

The caller must have the
.B CAP_DAC_READ_SEARCH
capability to invoke
.BR open_by_handle_at ().
.SH RETURN VALUE
On success,
.BR name_to_handle_at ()
returns 0,
and
.BR open_by_handle_at ()
returns a nonnegative file descriptor.

In the event of an error, both system calls return \-1 and set
.I errno
to indicate the cause of the error.
.SH ERRORS
.BR name_to_handle_at ()
and
.BR open_by_handle_at ()
can fail for the same errors as
.BR openat (2).
In addition, they can fail with the errors noted below.

.BR name_to_handle_at ()
can fail with the following errors:
.TP
.B EINVAL
.I flags
includes an invalid bit value.
.TP
.B EINVAL
.IR handle\->handle_bytes
is greater than
.BR MAX_HANDLE_SZ .
.TP
.B ENOENT
.I pathname
is an empty string, but
.BR AT_EMPTY_PATH
was not specified in
.IR flags .
.TP
.B ENOTDIR
The file descriptor supplied in
.I dirfd
does not refer to a directory,
and it is not the case that both
.I flags
includes
.BR AT_EMPTY_PATH
and
.I pathname
is an empty string.
.TP
.B EOPNOTSUPP
The filesystem does not support decoding of a pathname to a file handle.
.TP
.B EOVERFLOW
The
.I handle->handle_bytes
value passed into the call was too small.
When this error occurs,
.I handle->handle_bytes
is updated to indicate the required size for the handle.
.\"
.\"
.PP
.BR open_by_handle_at ()
can fail with the following errors:
.TP
.B EBADF
.IR mount_fd
is not an open file descriptor.
.TP
.B EINVAL
.I handle->handle_bytes
is greater than
.BR MAX_HANDLE_SZ
or is equal to zero.
.TP
.B ELOOP
.\" FIXME (see earlier FIXME). Is this the intended behavior?
.I handle
refers to a symbolic link, but
.B O_PATH
was not specified in
.IR flags .
.TP
.B EPERM
The caller does not have the
.BR CAP_DAC_READ_SEARCH
capability.
.TP
.B ESTALE
The specified
.I handle
is no longer valid.
.SH VERSIONS
These system calls first appeared in Linux 2.6.39.
.SH CONFORMING TO
These system calls are nonstandard Linux extensions.
.SH NOTES
A file handle can be generated in one process using
.BR name_to_handle_at ()
and later used in a different process that calls
.BR open_by_handle_at ().

Not all filesystem types support the translation of pathnames to
file handles.
.\" FIXME NeilBrown noted:
.\"    ESTALE is also returned if the filesystem does not support
.\"    file-handle -> file mappings.
.\"    On filesystems which don't provide export_operations (/sys /proc
.\"    ubifs romfs cramfs nfs coda ... several others) name_to_handle_at
.\"    will produce a generic handle using the 32 bit inode and 32 bit
.\"    i_generation. open_by_name_at given this (or any) filehandle
.\"    will fail with ESTALE.
.\" However, on /proc and /sys, at least, name_to_handle_at() fails with
.\" EOPNOTSUPP. Are there really filesystems that can deliver ESTALE (the
.\" same error as for an invalid file handle) in the above circumstances?

A file handle may become invalid ("stale") if a file is deleted,
or for other filesystem-specific reasons.
An invalid handle is reported by an
.B ESTALE
error from
.BR open_by_handle_at ().

These system calls are designed for use by user-space file servers.
For example, a user-space NFS server might generate a file handle
and pass it to an NFS client.
Later, when the client wants to open the file,
it could pass the handle back to the server.
.\" https://lwn.net/Articles/375888/
.\"    "Open by handle" - Jonathan Corbet, 2010-02-23
This sort of functionality allows a user-space file server to operate in
a stateless fashion with respect to the files it serves.

If
.I pathname
refers to a symbolic link and
.IR flags
does not specify
.BR AT_SYMLINK_FOLLOW ,
then
.BR name_to_handle_at ()
returns a handle for the link (rather than the file to which it refers).
.\" commit bcda76524cd1fa32af748536f27f674a13e56700
The process receiving the handle can later perform operations
on the symbolic link by converting the handle to a file descriptor using
.BR open_by_handle_at ()
with the
.BR O_PATH
flag, and then passing the file descriptor as the
.IR dirfd
argument in system calls such as
.BR readlinkat (2)
and
.BR fchownat (2).
.SS Obtaining a persistent filesystem ID
The mount IDs in
.IR /proc/self/mountinfo
can be reused as filesystems are unmounted and mounted.
Therefore, the mount ID returned by
.BR name_to_handle_at ()
(in
.IR *mount_id )
should not be treated as a persistent identifier
for the corresponding mounted filesystem.
However, an application can use the information in the
.I mountinfo
record that corresponds to the mount ID
to derive a persistent identifier.

For example, one can use the device name in the fifth field of the
.I mountinfo
record to search for the corresponding device UUID via the symbolic links in
.IR /dev/disk/by-uuid .
(A more comfortable way of obtaining the UUID is to use the
.\" e.g., http://stackoverflow.com/questions/6748429/using-libblkid-to-find-uuid-of-a-partition
.BR libblkid (3)
library.)
That process can then be reversed,
using the UUID to look up the device name,
and then obtaining the corresponding mount point,
in order to produce the
.IR mount_fd
argument used by
.BR open_by_handle_at ().
.SH EXAMPLE
The two programs below demonstrate the use of
.BR name_to_handle_at ()
and
.BR open_by_handle_at ().
The first program
.RI ( t_name_to_handle_at.c )
uses
.BR name_to_handle_at ()
to obtain the file handle and mount ID
for the file specified in its command-line argument;
the handle and ID are written to standard output.

The second program
.RI ( t_open_by_handle_at.c )
reads a mount ID and file handle from standard input.
The program then employs
.BR open_by_handle_at ()
to open the file using that handle.
If an optional command-line argument is supplied, then the .IR mount_fd argument for .BR open_by_handle_at () is obtained by opening the directory named in that argument. Otherwise, .IR mount_fd is obtained by scanning .IR /proc/self/mountinfo to find a record whose mount ID matches the mount ID read from standard input, and the mount directory specified in that record is opened. (These programs do not deal with the fact that mount IDs are not persistent.) The following shell session demonstrates the use of these two programs: .in +4n .nf $ \fBecho 'Kannst du bitte überlegen?' > cecilia.txt\fP $ \fB./t_name_to_handle_at cecilia.txt > fh\fP $ \fB./t_open_by_handle_at < fh\fP open_by_handle_at: Operation not permitted $ \fBsudo ./t_open_by_handle_at < fh\fP # Need CAP_SYS_ADMIN Read 28 bytes $ \fBrm cecilia.txt\fP .fi .in Now we delete and (quickly) re-create the file so that it has the same content and (by chance) the same inode. Nevertheless, .BR open_by_handle_at () recognizes that the original file referred to by the file handle no longer exists. .in +4n .nf $ \fBstat \-\-printf="%i\\n" cecilia.txt\fP # Display inode number 4072121 $ \fBrm cecilia.txt\fP $ \fBecho 'Kannst du bitte überlegen?' 
> cecilia.txt\fP $ \fBstat \-\-printf="%i\\n" cecilia.txt\fP # Check inode number 4072121 $ \fBsudo ./t_open_by_handle_at < fh\fP open_by_handle_at: Stale NFS file handle .fi .in .SS Program source: t_name_to_handle_at.c \& .nf #define _GNU_SOURCE #include <sys/types.h> #include <sys/stat.h> #include <fcntl.h> #include <stdio.h> #include <stdlib.h> #include <unistd.h> #include <errno.h> #include <string.h> #define errExit(msg) do { perror(msg); exit(EXIT_FAILURE); \\ } while (0) int main(int argc, char *argv[]) { struct file_handle *fhp; int mount_id, fhsize, s; if (argc < 2 || strcmp(argv[1], "\-\-help") == 0) { fprintf(stderr, "Usage: %s pathname\\n", argv[0]); exit(EXIT_FAILURE); } /* Allocate file_handle structure (the structure itself, not a pointer to it) */ fhsize = sizeof(struct file_handle); fhp = malloc(fhsize); if (fhp == NULL) errExit("malloc"); /* Make an initial call to name_to_handle_at() to discover the size required for file handle */ fhp\->handle_bytes = 0; s = name_to_handle_at(AT_FDCWD, argv[1], fhp, &mount_id, 0); if (s != \-1 || errno != EOVERFLOW) { fprintf(stderr, "Unexpected result from name_to_handle_at()\\n"); exit(EXIT_FAILURE); } /* Reallocate file_handle structure with correct size */ fhsize = sizeof(struct file_handle) + fhp\->handle_bytes; fhp = realloc(fhp, fhsize); /* Copies fhp\->handle_bytes */ if (fhp == NULL) errExit("realloc"); /* Get file handle from pathname supplied on command line */ if (name_to_handle_at(AT_FDCWD, argv[1], fhp, &mount_id, 0) == \-1) errExit("name_to_handle_at"); /* Write mount ID, file handle size, and file handle to stdout, for later reuse by t_open_by_handle_at.c */ if (write(STDOUT_FILENO, &mount_id, sizeof(int)) != sizeof(int) || write(STDOUT_FILENO, &fhsize, sizeof(int)) != sizeof(int) || write(STDOUT_FILENO, fhp, fhsize) != fhsize) { fprintf(stderr, "Write failure\\n"); exit(EXIT_FAILURE); } exit(EXIT_SUCCESS); } .fi .SS Program source: t_open_by_handle_at.c \& .nf #define _GNU_SOURCE #include <sys/types.h> #include <sys/stat.h> #include
<fcntl.h> #include <limits.h> #include <stdio.h> #include <stdlib.h> #include <unistd.h> #include <string.h> #define errExit(msg) do { perror(msg); exit(EXIT_FAILURE); \\ } while (0) /* Scan /proc/self/mountinfo to find the line whose mount ID matches \(aqmount_id\(aq. (An easier way to do this is to install and use the \(aqlibmount\(aq library provided by the \(aqutil\-linux\(aq project.) Open the corresponding mount path and return the resulting file descriptor. */ static int open_mount_path_by_id(int mount_id) { char *linep; size_t lsize; char mount_path[PATH_MAX]; int fmnt_id, fnd, nread; FILE *fp; fp = fopen("/proc/self/mountinfo", "r"); if (fp == NULL) errExit("fopen"); for (fnd = 0; !fnd ; ) { linep = NULL; nread = getline(&linep, &lsize, fp); if (nread == \-1) break; nread = sscanf(linep, "%d %*d %*s %*s %s", &fmnt_id, mount_path); if (nread != 2) { fprintf(stderr, "Bad sscanf()\\n"); exit(EXIT_FAILURE); } free(linep); if (fmnt_id == mount_id) fnd = 1; } fclose(fp); if (!fnd) { fprintf(stderr, "Could not find mount point\\n"); exit(EXIT_FAILURE); } return open(mount_path, O_RDONLY | O_DIRECTORY); } int main(int argc, char *argv[]) { struct file_handle *fhp; int mount_id, fd, mount_fd, fhsize; ssize_t nread; #define BSIZE 1000 char buf[BSIZE]; if (argc > 1 && strcmp(argv[1], "\-\-help") == 0) { fprintf(stderr, "Usage: %s [mount\-dir]\\n", argv[0]); exit(EXIT_FAILURE); } /* Read data produced by t_name_to_handle_at.c */ if (read(STDIN_FILENO, &mount_id, sizeof(int)) != sizeof(int)) errExit("read"); if (read(STDIN_FILENO, &fhsize, sizeof(int)) != sizeof(int)) errExit("read"); fhp = malloc(fhsize); if (fhp == NULL) errExit("malloc"); if (read(STDIN_FILENO, fhp, fhsize) != fhsize) errExit("read"); /* Obtain file descriptor for mount point, either by opening the pathname specified on the command line, or by scanning /proc/self/mountinfo to find a mount that matches the \(aqmount_id\(aq obtained by name_to_handle_at() (in t_name_to_handle_at.c) */ if (argc > 1)
mount_fd = open(argv[1], O_RDONLY | O_DIRECTORY); else mount_fd = open_mount_path_by_id(mount_id); if (mount_fd == \-1) errExit("opening mount fd"); /* Open name using handle and mount point */ fd = open_by_handle_at(mount_fd, fhp, O_RDONLY); if (fd == \-1) errExit("open_by_handle_at"); /* Try reading a few bytes from the file */ nread = read(fd, buf, BSIZE); if (nread == \-1) errExit("read"); printf("Read %ld bytes\\n", (long) nread); exit(EXIT_SUCCESS); } .fi .SH SEE ALSO .BR blkid (1), .BR findfs (1), .BR open (2), .BR libblkid (3), .BR mount (8) The .I libblkid and .I libmount documentation under the latest .I util-linux release at .UR https://www.kernel.org/pub/linux/utils/util-linux/ .UE
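Both open_mount_path_by_id() and the persistent-ID recipe depend on picking individual fields out of a /proc/self/mountinfo record. The following is a minimal sketch of that parsing step and is not part of the draft page. The names mount_rec, parse_mountinfo_line(), and FIELD_MAX are hypothetical. The sketch assumes the record layout described in proc(5), where the mount ID is the first field, the mount point is the fifth, and the mount source is the second field after the " - " separator. It does not decode octal escapes such as \040 in paths.

```c
#include <stdio.h>
#include <string.h>

#define FIELD_MAX 4096          /* Hypothetical limit for one field */

/* Hypothetical container for the fields of interest */
struct mount_rec {
    int  mount_id;                  /* Field 1: mount ID */
    char mount_point[FIELD_MAX];    /* Field 5: mount point */
    char source[FIELD_MAX];         /* Mount source (device name) */
};

/* Parse one mountinfo record. Returns 0 on success, or -1 if the
   line does not have the assumed layout. */
int
parse_mountinfo_line(const char *line, struct mount_rec *rec)
{
    const char *sep;

    /* Fields 2 to 4 (parent ID, major:minor, root) are skipped */
    if (sscanf(line, "%d %*d %*s %*s %4095s",
               &rec->mount_id, rec->mount_point) != 2)
        return -1;

    /* The optional fields vary in number, so locate the separator
       rather than counting fields */
    sep = strstr(line, " - ");
    if (sep == NULL)
        return -1;

    /* After the separator: filesystem type, then mount source */
    if (sscanf(sep + 3, "%*s %4095s", rec->source) != 1)
        return -1;

    return 0;
}
```

A caller would read records with getline(3), pass each one to parse_mountinfo_line(), and compare rec.mount_id against the mount ID returned by name_to_handle_at().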
* Re: For review: open_by_name_at(2) man page [v2] @ 2014-03-19 4:13 ` NeilBrown 0 siblings, 0 replies; 22+ messages in thread From: NeilBrown @ 2014-03-19 4:13 UTC (permalink / raw) To: Michael Kerrisk (man-pages) Cc: Aneesh Kumar K.V, linux-man, Linux-Fsdevel, lkml, Andreas Dilger, Christoph Hellwig [-- Attachment #1: Type: text/plain, Size: 3075 bytes --] On Tue, 18 Mar 2014 13:55:15 +0100 "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com> wrote: > Hi Aneesh, (and others) > > After integrating review comments from NeilBown and Christoph Hellwig, > here is draft 2 of a man page I've written for name_to_handle_at(2) and > open_by_name_at(2). Especially thanks to Neil's comments, several parts > of the page underwent a substantial rewrite. Would you be willing to > review it please, and let me know of any corrections/improvements? I didn't notice before but above and in $SUBJ I see "open_by_name_at", which is fictitious :-) > > Together, the > .I pathname > and > .I dirfd > arguments identify the file for which a handle is to obtained. ^be > > The > .I flags > argument is a bit mask constructed by ORing together > zero or more of the following value: ^s > .TP > .B AT_EMPTY_PATH > Allow > .I pathname > to be an empty string. > See above. > (which may have been obtained using the > .BR open (2) > .B O_PATH > flag). What "may have been obtained" ?? > The > .I flags > argument > is as for > .BR open (2). > .\" FIXME: Confirm that the following is intended behavior. > .\" (It certainly seems to be the behavior, from experimenting.) > If > .I handle > refers to a symbolic link, the caller must specify the > .B O_PATH > flag, and the symbolic link is not dereferenced (the > .B O_NOFOLLOW > flag, if specified, is ignored). It certainly sounds like reasonable behaviour. I cannot comment on intention though. Are you bothered that O_PATH is needed for symlinks? 
An fd on a symlink is a sufficiently unusual thing that it seems reasonable for a programmer to explicitly say they are expecting one. > > In the event of an error, both system calls return \-1 and set > .I errno > to indicate the cause of the error. > .SH ERRORS > .BR name_to_handle_at () > and > .BR open_by_handle_at () > can fail for the same errors as > .BR openat (2). > In addition, they can fail with the errors noted below. Should you mention EFAULT if mount_id or handle are not valid pointers? > > Not all filesystem types support the translation of pathnames to > file handles. > .\" FIXME NeilBrown noted: > .\" ESTALE is also returned if the filesystem does not support > .\" file-handle -> file mappings. > .\" On filesystems which don't provide export_operations (/sys /proc > .\" ubifs romfs cramfs nfs coda ... several others) name_to_handle_at > .\" will produce a generic handle using the 32 bit inode and 32 bit > .\" i_generation. open_by_name_at given this (or any) filehandle > .\" will fail with ESTALE. > .\" However, on /proc and /sys, at least, name_to_handle_at() fails with > .\" EOPNOTSUPP. Are there really filesystems that can deliver ESTALE (the > .\" same error as for an invalid file handle) in the above circumstances? This is all wrong - discard it :-) NeilBrown [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 828 bytes --] ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: For review: open_by_name_at(2) man page [v2] @ 2014-03-19 9:09 ` Michael Kerrisk (man-pages) 0 siblings, 0 replies; 22+ messages in thread From: Michael Kerrisk (man-pages) @ 2014-03-19 9:09 UTC (permalink / raw) To: NeilBrown Cc: mtk.manpages, Aneesh Kumar K.V, linux-man, Linux-Fsdevel, lkml, Andreas Dilger, Christoph Hellwig, Al Viro [CC =+ Al Viro] Hi Neil, On 03/19/2014 05:13 AM, NeilBrown wrote: > On Tue, 18 Mar 2014 13:55:15 +0100 "Michael Kerrisk (man-pages)" > <mtk.manpages@gmail.com> wrote: > >> Hi Aneesh, (and others) >> >> After integrating review comments from NeilBown and Christoph Hellwig, >> here is draft 2 of a man page I've written for name_to_handle_at(2) and >> open_by_name_at(2). Especially thanks to Neil's comments, several parts >> of the page underwent a substantial rewrite. Would you be willing to >> review it please, and let me know of any corrections/improvements? [various typos you reported fixed now.] >> .TP >> .B AT_EMPTY_PATH >> Allow >> .I pathname >> to be an empty string. >> See above. >> (which may have been obtained using the >> .BR open (2) >> .B O_PATH >> flag). > > What "may have been obtained" ?? Crufty text. gone now. >> The >> .I flags >> argument >> is as for >> .BR open (2). >> .\" FIXME: Confirm that the following is intended behavior. >> .\" (It certainly seems to be the behavior, from experimenting.) >> If >> .I handle >> refers to a symbolic link, the caller must specify the >> .B O_PATH >> flag, and the symbolic link is not dereferenced (the >> .B O_NOFOLLOW >> flag, if specified, is ignored). > > It certainly sounds like reasonable behaviour. I cannot comment on intention > though. > Are you bothered that O_PATH is needed for symlinks? No. > An fd on a symlink is a > sufficiently unusual thing that it seems reasonable for a programmer to > explicitly say they are expecting one. 
I think the point is this: If you have a file handle for a symlink, then you can't follow the symlink, which is why you must specify O_PATH and O_NOFOLLOW becomes irrelevant. I'm curious about the rationale though. I suspect it's something like: the process receiving the handle doesn't have enough information for the symlink to be interpreted, I think because it can't reliably determine what directory the link lives in. Possibly Al Viro or Aneesh can confirm. >> In the event of an error, both system calls return \-1 and set >> .I errno >> to indicate the cause of the error. >> .SH ERRORS >> .BR name_to_handle_at () >> and >> .BR open_by_handle_at () >> can fail for the same errors as >> .BR openat (2). >> In addition, they can fail with the errors noted below. > > Should you mention EFAULT if mount_id or handle are not valid pointers? Done. >> Not all filesystem types support the translation of pathnames to >> file handles. >> .\" FIXME NeilBrown noted: >> .\" ESTALE is also returned if the filesystem does not support >> .\" file-handle -> file mappings. >> .\" On filesystems which don't provide export_operations (/sys /proc >> .\" ubifs romfs cramfs nfs coda ... several others) name_to_handle_at >> .\" will produce a generic handle using the 32 bit inode and 32 bit >> .\" i_generation. open_by_name_at given this (or any) filehandle >> .\" will fail with ESTALE. >> .\" However, on /proc and /sys, at least, name_to_handle_at() fails with >> .\" EOPNOTSUPP. Are there really filesystems that can deliver ESTALE (the >> .\" same error as for an invalid file handle) in the above circumstances? > > This is all wrong - discard it :-) Yup. Gone now ;-). Cheers, Michael -- Michael Kerrisk Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/ Linux/UNIX System Programming Training: http://man7.org/training/
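For readers following the symlink discussion, here is a minimal sketch of what a handle recipient might do with a handle that refers to a symbolic link: open it with O_PATH and read the link target through the resulting descriptor. The helper name readlink_by_handle() is hypothetical. The sketch assumes the caller has CAP_DAC_READ_SEARCH and relies on readlinkat(2) accepting an empty pathname for a descriptor opened with O_PATH (Linux 2.6.39 and later). It illustrates the behavior described above and is not a statement of the intended semantics.

```c
#define _GNU_SOURCE
#include <sys/types.h>
#include <fcntl.h>
#include <unistd.h>
#include <errno.h>

/* Hypothetical helper: read the target of the symbolic link that
   'fhp' refers to into 'buf' (always null-terminated on success).
   Returns the target length, or -1 with errno set. */
ssize_t
readlink_by_handle(int mount_fd, struct file_handle *fhp,
                   char *buf, size_t bufsiz)
{
    int fd, saved_errno;
    ssize_t n;

    if (bufsiz == 0) {
        errno = EINVAL;
        return -1;
    }

    /* O_PATH is required when the handle refers to a symbolic link;
       the link itself is opened, not its target */
    fd = open_by_handle_at(mount_fd, fhp, O_PATH);
    if (fd == -1)
        return -1;

    /* An empty pathname makes readlinkat() operate on 'fd' itself */
    n = readlinkat(fd, "", buf, bufsiz - 1);

    saved_errno = errno;        /* close() must not clobber errno */
    close(fd);
    errno = saved_errno;

    if (n >= 0)
        buf[n] = '\0';
    return n;
}
```

Without the capability, the call is expected to fail in open_by_handle_at() with EPERM before any link is read.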
* For review: open_by_handle_at(2) man page [v3] From: Michael Kerrisk (man-pages) @ 2014-03-19 14:14 UTC To: NeilBrown Cc: mtk.manpages, Aneesh Kumar K.V, linux-man, Linux-Fsdevel, lkml, Andreas Dilger, Christoph Hellwig, Al Viro, Mike Frysinger Hi Aneesh, (and others), After integrating review comments from NeilBrown, Christoph Hellwig, and Mike Frysinger, here is draft 3 of a man page I've written for name_to_handle_at(2) and open_by_handle_at(2). Would you be willing to review it please, and let me know of any corrections/improvements? There are some FIXMEs in the page that I would especially like some help with. Thanks, Michael .\" Copyright (c) 2014 by Michael Kerrisk <mtk.manpages@gmail.com> .\" .\" %%%LICENSE_START(VERBATIM) .\" Permission is granted to make and distribute verbatim copies of this .\" manual provided the copyright notice and this permission notice are .\" preserved on all copies. .\" .\" Permission is granted to copy and distribute modified versions of this .\" manual under the conditions for verbatim copying, provided that the .\" entire resulting derived work is distributed under the terms of a .\" permission notice identical to this one. .\" .\" Since the Linux kernel and libraries are constantly changing, this .\" manual page may be incorrect or out-of-date. The author(s) assume no .\" responsibility for errors or omissions, or for damages resulting from .\" the use of the information contained herein. The author(s) may not .\" have taken the same level of care in the production of this manual, .\" which is licensed free of charge, as they might when working .\" professionally. .\" .\" Formatted or processed versions of this manual, if unaccompanied by .\" the source, must acknowledge the copyright and authors of this work.
.\" %%%LICENSE_END .\" .TH OPEN_BY_HANDLE_AT 2 2014-03-24 "Linux" "Linux Programmer's Manual" .SH NAME name_to_handle_at, open_by_handle_at \- obtain handle for a pathname and open file via a handle .SH SYNOPSIS .nf .B #define _GNU_SOURCE .B #include <sys/types.h> .B #include <sys/stat.h> .B #include <fcntl.h> .BI "int name_to_handle_at(int " dirfd ", const char *" pathname , .BI " struct file_handle *" handle , .BI " int *" mount_id ", int " flags ); .BI "int open_by_handle_at(int " mount_fd ", struct file_handle *" handle , .BI " int " flags ); .fi .SH DESCRIPTION The .BR name_to_handle_at () and .BR open_by_handle_at () system calls split the functionality of .BR openat (2) into two parts: .BR name_to_handle_at () returns an opaque handle that corresponds to a specified file; .BR open_by_handle_at () opens the file corresponding to a handle returned by a previous call to .BR name_to_handle_at () and returns an open file descriptor. .\" .\" .SS name_to_handle_at() The .BR name_to_handle_at () system call returns a file handle and a mount ID corresponding to the file specified by the .IR dirfd and .IR pathname arguments. The file handle is returned via the argument .IR handle , which is a pointer to a structure of the following form: .in +4n .nf struct file_handle { unsigned int handle_bytes; /* Size of f_handle [in, out] */ int handle_type; /* Handle type [out] */ unsigned char f_handle[0]; /* File identifier (sized by caller) [out] */ }; .fi .in .PP It is the caller's responsibility to allocate the structure with a size large enough to hold the handle returned in .IR f_handle . Before the call, the .IR handle_bytes field should be initialized to contain the allocated size for .IR f_handle . (The constant .BR MAX_HANDLE_SZ , defined in .IR <fcntl.h> , specifies the maximum possible size for a file handle.) Upon successful return, the .IR handle_bytes field is updated to contain the number of bytes actually written to .IR f_handle . 
The caller can discover the required size for the .I file_handle structure by making a call in which .IR handle->handle_bytes is zero; in this case, the call fails with the error .BR EOVERFLOW and .IR handle->handle_bytes is set to indicate the required size; the caller can then use this information to allocate a structure of the correct size (see EXAMPLE below). Other than the use of the .IR handle_bytes field, the caller should treat the .IR file_handle structure as an opaque data type: the .IR handle_type and .IR f_handle fields are needed only by a subsequent call to .BR open_by_handle_at (). The .I flags argument is a bit mask constructed by ORing together zero or more of .BR AT_EMPTY_PATH and .BR AT_SYMLINK_FOLLOW , described below. Together, the .I pathname and .I dirfd arguments identify the file for which a handle is to be obtained. There are four distinct cases: .IP * 3 If .I pathname is a nonempty string containing an absolute pathname, then a handle is returned for the file referred to by that pathname. In this case, .IR dirfd is ignored. .IP * If .I pathname is a nonempty string containing a relative pathname and .IR dirfd has the special value .BR AT_FDCWD , then .I pathname is interpreted relative to the current working directory of the caller, and a handle is returned for the file to which it refers. .IP * If .I pathname is a nonempty string containing a relative pathname and .IR dirfd is a file descriptor referring to a directory, then .I pathname is interpreted relative to the directory referred to by .IR dirfd , and a handle is returned for the file to which it refers. (See .BR openat (2) for an explanation of why "directory file descriptors" are useful.) .IP * If .I pathname is an empty string and .I flags specifies the value .BR AT_EMPTY_PATH , then .IR dirfd can be an open file descriptor referring to any type of file, or .BR AT_FDCWD , meaning the current working directory, and a handle is returned for the file to which it refers.
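The fourth case can be sketched as follows. This is an illustration only: the helper name handle_for_fd() is hypothetical, and instead of the two-call size discovery it simply allocates room for MAX_HANDLE_SZ bytes, which the text above describes as the maximum possible handle size. The call can still fail, for example with EOPNOTSUPP on a filesystem that does not support file handles.

```c
#define _GNU_SOURCE
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdlib.h>
#include <errno.h>

/* Hypothetical helper: obtain a handle for the file that the open
   descriptor 'fd' refers to, using an empty pathname together with
   AT_EMPTY_PATH. Returns a malloc()ed handle that the caller must
   free(), or NULL with errno set. */
struct file_handle *
handle_for_fd(int fd, int *mount_id)
{
    struct file_handle *fhp;
    int saved_errno;

    /* Allocate the fixed header plus the largest possible handle */
    fhp = malloc(sizeof(*fhp) + MAX_HANDLE_SZ);
    if (fhp == NULL)
        return NULL;

    /* Tell the kernel how much space follows the header */
    fhp->handle_bytes = MAX_HANDLE_SZ;

    if (name_to_handle_at(fd, "", fhp, mount_id, AT_EMPTY_PATH) == -1) {
        saved_errno = errno;    /* free() must not clobber errno */
        free(fhp);
        errno = saved_errno;
        return NULL;
    }

    return fhp;
}
```

On success, fhp->handle_bytes holds the number of bytes actually written to f_handle, so a caller that stores or transmits the handle needs only sizeof(struct file_handle) plus that many bytes.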
.PP The .I mount_id argument returns an identifier for the filesystem mount that corresponds to .IR pathname . This corresponds to the first field in one of the records in .IR /proc/self/mountinfo . Opening the pathname in the fifth field of that record yields a file descriptor for the mount point; that file descriptor can be used in a subsequent call to .BR open_by_handle_at (). By default, .BR name_to_handle_at () does not dereference .I pathname if it is a symbolic link, and thus returns a handle for the link itself. If .B AT_SYMLINK_FOLLOW is specified in .IR flags , .I pathname is dereferenced if it is a symbolic link (so that the call returns a handle for the file referred to by the link). .SS open_by_handle_at() The .BR open_by_handle_at () system call opens the file referred to by .IR handle , a file handle returned by a previous call to .BR name_to_handle_at (). The .IR mount_fd argument is a file descriptor for any object (file, directory, etc.) in the mounted filesystem with respect to which .IR handle should be interpreted. The special value .B AT_FDCWD can be specified, meaning the current working directory of the caller. The .I flags argument is as for .BR open (2). .\" FIXME: Confirm that the following is intended behavior. .\" (It certainly seems to be the behavior, from experimenting.) If .I handle refers to a symbolic link, the caller must specify the .B O_PATH flag, and the symbolic link is not dereferenced; the .B O_NOFOLLOW flag, if specified, is ignored. The caller must have the .B CAP_DAC_READ_SEARCH capability to invoke .BR open_by_handle_at (). .SH RETURN VALUE On success, .BR name_to_handle_at () returns 0, and .BR open_by_handle_at () returns a nonnegative file descriptor. In the event of an error, both system calls return \-1 and set .I errno to indicate the cause of the error. .SH ERRORS .BR name_to_handle_at () and .BR open_by_handle_at () can fail for the same errors as .BR openat (2). 
In addition, they can fail with the errors noted below. .BR name_to_handle_at () can fail with the following errors: .TP .B EFAULT .IR pathname , .IR mount_id , or .IR handle points outside your accessible address space. .TP .B EINVAL .I flags includes an invalid bit value. .TP .B EINVAL .IR handle\->handle_bytes is greater than .BR MAX_HANDLE_SZ . .TP .B ENOENT .I pathname is an empty string, but .BR AT_EMPTY_PATH was not specified in .IR flags . .TP .B ENOTDIR The file descriptor supplied in .I dirfd does not refer to a directory, and it is not the case that both .I flags includes .BR AT_EMPTY_PATH and .I pathname is an empty string. .TP .B EOPNOTSUPP The filesystem does not support decoding of a pathname to a file handle. .TP .B EOVERFLOW The .I handle->handle_bytes value passed into the call was too small. When this error occurs, .I handle->handle_bytes is updated to indicate the required size for the handle. .\" .\" .PP .BR open_by_handle_at () can fail with the following errors: .TP .B EBADF .IR mount_fd is not an open file descriptor. .TP .B EFAULT .IR handle points outside your accessible address space. .TP .B EINVAL .I handle->handle_bytes is greater than .BR MAX_HANDLE_SZ or is equal to zero. .TP .B ELOOP .\" FIXME (see earlier FIXME). Is this the intended behavior? .I handle refers to a symbolic link, but .B O_PATH was not specified in .IR flags . .TP .B EPERM The caller does not have the .BR CAP_DAC_READ_SEARCH capability. .TP .B ESTALE The specified .I handle is not valid. This error will occur if, for example, the file has been deleted. .SH VERSIONS These system calls first appeared in Linux 2.6.39. .SH CONFORMING TO These system calls are nonstandard Linux extensions. .SH NOTES A file handle can be generated in one process using .BR name_to_handle_at () and later used in a different process that calls .BR open_by_handle_at ().
Some filesystems don't support the translation of pathnames to file handles, for example, .IR /proc , .IR /sys , and various network filesystems. A file handle may become invalid ("stale") if a file is deleted, or for other filesystem-specific reasons. An invalid handle is reported by an .B ESTALE error from .BR open_by_handle_at (). These system calls are designed for use by user-space file servers. For example, a user-space NFS server might generate a file handle and pass it to an NFS client. Later, when the client wants to open the file, it could pass the handle back to the server. .\" https://lwn.net/Articles/375888/ .\" "Open by handle" - Jonathan Corbet, 2010-02-23 This sort of functionality allows a user-space file server to operate in a stateless fashion with respect to the files it serves. If .I pathname refers to a symbolic link and .IR flags does not specify .BR AT_SYMLINK_FOLLOW , then .BR name_to_handle_at () returns a handle for the link (rather than the file to which it refers). .\" commit bcda76524cd1fa32af748536f27f674a13e56700 The process receiving the handle can later perform operations on the symbolic link by converting the handle to a file descriptor using .BR open_by_handle_at () with the .BR O_PATH flag, and then passing the file descriptor as the .IR dirfd argument in system calls such as .BR readlinkat (2) and .BR fchownat (2). .SS Obtaining a persistent filesystem ID The mount IDs in .IR /proc/self/mountinfo can be reused as filesystems are unmounted and mounted. Therefore, the mount ID returned by .BR name_to_handle_at () (in .IR *mount_id ) should not be treated as a persistent identifier for the corresponding mounted filesystem. However, an application can use the information in the .I mountinfo record that corresponds to the mount ID to derive a persistent identifier.
For example, one can use the device name in the mount source field of the .I mountinfo record to search for the corresponding device UUID via the symbolic links in .IR /dev/disk/by-uuid . (A more comfortable way of obtaining the UUID is to use the .\" e.g., http://stackoverflow.com/questions/6748429/using-libblkid-to-find-uuid-of-a-partition .BR libblkid (3) library.) That process can then be reversed, using the UUID to look up the device name, and then obtaining the corresponding mount point, in order to produce the .IR mount_fd argument used by .BR open_by_handle_at (). .SH EXAMPLE The two programs below demonstrate the use of .BR name_to_handle_at () and .BR open_by_handle_at (). The first program .RI ( t_name_to_handle_at.c ) uses .BR name_to_handle_at () to obtain the file handle and mount ID for the file specified in its command-line argument; the handle and mount ID are written to standard output. The second program .RI ( t_open_by_handle_at.c ) reads a mount ID and file handle from standard input. The program then employs .BR open_by_handle_at () to open the file using that handle. If an optional command-line argument is supplied, then the .IR mount_fd argument for .BR open_by_handle_at () is obtained by opening the directory named in that argument. Otherwise, .IR mount_fd is obtained by scanning .IR /proc/self/mountinfo to find a record whose mount ID matches the mount ID read from standard input, and the mount directory specified in that record is opened. (These programs do not deal with the fact that mount IDs are not persistent.) The following shell session demonstrates the use of these two programs: .in +4n .nf $ \fBecho 'Can you please think about it?'
> cecilia.txt\fP $ \fB./t_name_to_handle_at cecilia.txt > fh\fP $ \fB./t_open_by_handle_at < fh\fP open_by_handle_at: Operation not permitted $ \fBsudo ./t_open_by_handle_at < fh\fP # Need CAP_SYS_ADMIN Read 31 bytes $ \fBrm cecilia.txt\fP .fi .in Now we delete and (quickly) re-create the file so that it has the same content and (by chance) the same inode. Nevertheless, .BR open_by_handle_at () .\" Christoph Hellwig: That's why the file handles contain a generation .\" counter that gets incremented in this case. recognizes that the original file referred to by the file handle no longer exists. .in +4n .nf $ \fBstat \-\-printf="%i\\n" cecilia.txt\fP # Display inode number 4072121 $ \fBrm cecilia.txt\fP $ \fBecho 'Can you please think about it?' > cecilia.txt\fP $ \fBstat \-\-printf="%i\\n" cecilia.txt\fP # Check inode number 4072121 $ \fBsudo ./t_open_by_handle_at < fh\fP open_by_handle_at: Stale NFS file handle .fi .in .SS Program source: t_name_to_handle_at.c \& .nf #define _GNU_SOURCE #include <sys/types.h> #include <sys/stat.h> #include <fcntl.h> #include <stdio.h> #include <stdlib.h> #include <unistd.h> #include <errno.h> #include <string.h> #define errExit(msg) do { perror(msg); exit(EXIT_FAILURE); \\ } while (0) int main(int argc, char *argv[]) { struct file_handle *fhp; int mount_id, fhsize, flags, dirfd, j; char *pathname; if (argc != 2) { fprintf(stderr, "Usage: %s pathname\\n", argv[0]); exit(EXIT_FAILURE); } pathname = argv[1]; /* Allocate file_handle structure */ fhsize = sizeof(*fhp); fhp = malloc(fhsize); if (fhp == NULL) errExit("malloc"); /* Make an initial call to name_to_handle_at() to discover the size required for file handle */ dirfd = AT_FDCWD; /* For name_to_handle_at() calls */ flags = 0; /* For name_to_handle_at() calls */ fhp\->handle_bytes = 0; if (name_to_handle_at(dirfd, pathname, fhp, &mount_id, flags) != \-1 || errno != EOVERFLOW) { fprintf(stderr, "Unexpected result from name_to_handle_at()\\n"); exit(EXIT_FAILURE); } /* Reallocate 
file_handle structure with correct size */ fhsize = sizeof(struct file_handle) + fhp\->handle_bytes; fhp = realloc(fhp, fhsize); /* Copies fhp\->handle_bytes */ if (fhp == NULL) errExit("realloc"); /* Get file handle from pathname supplied on command line */ if (name_to_handle_at(dirfd, pathname, fhp, &mount_id, flags) == \-1) errExit("name_to_handle_at"); /* Write mount ID, file handle size, and file handle to stdout, for later reuse by t_open_by_handle_at.c */ printf("%d\\n", mount_id); printf("%d %d ", fhp\->handle_bytes, fhp\->handle_type); for (j = 0; j < fhp\->handle_bytes; j++) printf(" %02x", fhp\->f_handle[j]); printf("\\n"); exit(EXIT_SUCCESS); } .fi .SS Program source: t_open_by_handle_at.c \& .nf #define _GNU_SOURCE #include <sys/types.h> #include <sys/stat.h> #include <fcntl.h> #include <limits.h> #include <stdio.h> #include <stdlib.h> #include <unistd.h> #include <string.h> #define errExit(msg) do { perror(msg); exit(EXIT_FAILURE); \\ } while (0) /* Scan /proc/self/mountinfo to find the line whose mount ID matches \(aqmount_id\(aq. (An easier way to do this is to install and use the \(aqlibmount\(aq library provided by the \(aqutil\-linux\(aq project.) Open the corresponding mount path and return the resulting file descriptor. 
*/ static int open_mount_path_by_id(int mount_id) { char *linep; size_t lsize; char mount_path[PATH_MAX]; int mi_mount_id, found; ssize_t nread; FILE *fp; fp = fopen("/proc/self/mountinfo", "r"); if (fp == NULL) errExit("fopen"); found = 0; linep = NULL; while (!found) { nread = getline(&linep, &lsize, fp); if (nread == \-1) break; nread = sscanf(linep, "%d %*d %*s %*s %s", &mi_mount_id, mount_path); if (nread != 2) { fprintf(stderr, "Bad sscanf()\\n"); exit(EXIT_FAILURE); } if (mi_mount_id == mount_id) found = 1; } free(linep); fclose(fp); if (!found) { fprintf(stderr, "Could not find mount point\\n"); exit(EXIT_FAILURE); } return open(mount_path, O_RDONLY); } int main(int argc, char *argv[]) { struct file_handle *fhp; int mount_id, fd, mount_fd, handle_bytes, j; ssize_t nread; char buf[1000]; #define LINE_SIZE 100 char line1[LINE_SIZE], line2[LINE_SIZE]; char *nextp; if ((argc > 1 && strcmp(argv[1], "\-\-help") == 0) || argc > 2) { fprintf(stderr, "Usage: %s [mount\-path]\\n", argv[0]); exit(EXIT_FAILURE); } /* Standard input contains mount ID and file handle information: Line 1: <mount_id> Line 2: <handle_bytes> <handle_type> <bytes of handle in hex> */ if ((fgets(line1, sizeof(line1), stdin) == NULL) || (fgets(line2, sizeof(line2), stdin) == NULL)) { fprintf(stderr, "Missing mount_id / file handle\\n"); exit(EXIT_FAILURE); } mount_id = atoi(line1); handle_bytes = strtoul(line2, &nextp, 0); /* Given handle_bytes, we can now allocate file_handle structure */ fhp = malloc(sizeof(struct file_handle) + handle_bytes); if (fhp == NULL) errExit("malloc"); fhp\->handle_bytes = handle_bytes; fhp\->handle_type = strtoul(nextp, &nextp, 0); for (j = 0; j < fhp\->handle_bytes; j++) fhp\->f_handle[j] = strtoul(nextp, &nextp, 16); /* Obtain file descriptor for mount point, either by opening the pathname specified on the command line, or by scanning /proc/self/mountinfo to find a mount that matches the \(aqmount_id\(aq that we received from stdin.
*/ if (argc > 1) mount_fd = open(argv[1], O_RDONLY); else mount_fd = open_mount_path_by_id(mount_id); if (mount_fd == \-1) errExit("opening mount fd"); /* Open file using handle and mount point */ fd = open_by_handle_at(mount_fd, fhp, O_RDONLY); if (fd == \-1) errExit("open_by_handle_at"); /* Try reading a few bytes from the file */ nread = read(fd, buf, sizeof(buf)); if (nread == \-1) errExit("read"); printf("Read %zd bytes\\n", nread); exit(EXIT_SUCCESS); } .fi .SH SEE ALSO .BR open (2), .BR libblkid (3), .BR blkid (8), .BR findfs (8), .BR mount (8) The .I libblkid and .I libmount documentation in the latest .I util-linux release at .UR https://www.kernel.org/pub/linux/utils/util-linux/ .UE ^ permalink raw reply [flat|nested] 22+ messages in thread
* For review: open_by_handle_at(2) man page [v3] @ 2014-03-19 14:14 ` Michael Kerrisk (man-pages) 0 siblings, 0 replies; 22+ messages in thread From: Michael Kerrisk (man-pages) @ 2014-03-19 14:14 UTC (permalink / raw) To: NeilBrown Cc: mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w, Aneesh Kumar K.V, linux-man-u79uwXL29TY76Z2rM5mHXA, Linux-Fsdevel, lkml, Andreas Dilger, Christoph Hellwig, Al Viro, Mike Frysinger Hi Aneesh, (and others), After integrating review comments from NeilBrown, Christoph Hellwig, and Mike Frysinger, here is draft 3 of a man page I've written for name_to_handle_at(2) and open_by_handle_at(2). Would you be willing to review it please, and let me know of any corrections/improvements? There are some FIXMEs in the page that I would especially like some help with. Thanks, Michael .\" Copyright (c) 2014 by Michael Kerrisk <mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> .\" .\" %%%LICENSE_START(VERBATIM) .\" Permission is granted to make and distribute verbatim copies of this .\" manual provided the copyright notice and this permission notice are .\" preserved on all copies. .\" .\" Permission is granted to copy and distribute modified versions of this .\" manual under the conditions for verbatim copying, provided that the .\" entire resulting derived work is distributed under the terms of a .\" permission notice identical to this one. .\" .\" Since the Linux kernel and libraries are constantly changing, this .\" manual page may be incorrect or out-of-date. The author(s) assume no .\" responsibility for errors or omissions, or for damages resulting from .\" the use of the information contained herein. The author(s) may not .\" have taken the same level of care in the production of this manual, .\" which is licensed free of charge, as they might when working .\" professionally. .\" .\" Formatted or processed versions of this manual, if unaccompanied by .\" the source, must acknowledge the copyright and authors of this work.
.\" %%%LICENSE_END .\" .TH OPEN_BY_HANDLE_AT 2 2014-03-24 "Linux" "Linux Programmer's Manual" .SH NAME name_to_handle_at, open_by_handle_at \- obtain handle for a pathname and open file via a handle .SH SYNOPSIS .nf .B #define _GNU_SOURCE .B #include <sys/types.h> .B #include <sys/stat.h> .B #include <fcntl.h> .BI "int name_to_handle_at(int " dirfd ", const char *" pathname , .BI " struct file_handle *" handle , .BI " int *" mount_id ", int " flags ); .BI "int open_by_handle_at(int " mount_fd ", struct file_handle *" handle , .BI " int " flags ); .fi .SH DESCRIPTION The .BR name_to_handle_at () and .BR open_by_handle_at () system calls split the functionality of .BR openat (2) into two parts: .BR name_to_handle_at () returns an opaque handle that corresponds to a specified file; .BR open_by_handle_at () opens the file corresponding to a handle returned by a previous call to .BR name_to_handle_at () and returns an open file descriptor. .\" .\" .SS name_to_handle_at() The .BR name_to_handle_at () system call returns a file handle and a mount ID corresponding to the file specified by the .IR dirfd and .IR pathname arguments. The file handle is returned via the argument .IR handle , which is a pointer to a structure of the following form: .in +4n .nf struct file_handle { unsigned int handle_bytes; /* Size of f_handle [in, out] */ int handle_type; /* Handle type [out] */ unsigned char f_handle[0]; /* File identifier (sized by caller) [out] */ }; .fi .in .PP It is the caller's responsibility to allocate the structure with a size large enough to hold the handle returned in .IR f_handle . Before the call, the .IR handle_bytes field should be initialized to contain the allocated size for .IR f_handle . (The constant .BR MAX_HANDLE_SZ , defined in .IR <fcntl.h> , specifies the maximum possible size for a file handle.) Upon successful return, the .IR handle_bytes field is updated to contain the number of bytes actually written to .IR f_handle . 
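The allocation rule described above can be sketched as a small helper. This is a minimal illustration, assuming glibc's <fcntl.h> exposes struct file_handle and MAX_HANDLE_SZ when _GNU_SOURCE is defined; new_file_handle() is a hypothetical name chosen for this note, not part of any API.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>

/* Hypothetical helper (not a library function): allocate a file_handle
   with room for 'nbytes' of f_handle data and record that capacity in
   handle_bytes, as the caller must do before name_to_handle_at().
   Returns NULL if nbytes exceeds MAX_HANDLE_SZ or allocation fails. */
static struct file_handle *
new_file_handle(unsigned int nbytes)
{
    struct file_handle *fhp;

    if (nbytes > MAX_HANDLE_SZ)
        return NULL;

    /* Header plus the caller-sized f_handle array, zero-filled */
    fhp = calloc(1, sizeof(*fhp) + nbytes);
    if (fhp == NULL)
        return NULL;

    fhp->handle_bytes = nbytes;   /* [in]: size available in f_handle */
    return fhp;
}
```

A caller that does not want to probe for the size first could pass MAX_HANDLE_SZ, at the cost of a slightly larger allocation than most filesystems need.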
The caller can discover the required size for the .I file_handle structure by making a call in which .IR handle->handle_bytes is zero; in this case, the call fails with the error .BR EOVERFLOW and .IR handle->handle_bytes is set to indicate the required size; the caller can then use this information to allocate a structure of the correct size (see EXAMPLE below). Other than the use of the .IR handle_bytes field, the caller should treat the .IR file_handle structure as an opaque data type: the .IR handle_type and .IR f_handle fields are needed only by a subsequent call to .BR open_by_handle_at (). The .I flags argument is a bit mask constructed by ORing together zero or more of .BR AT_EMPTY_PATH and .BR AT_SYMLINK_FOLLOW , described below. Together, the .I pathname and .I dirfd arguments identify the file for which a handle is to be obtained. There are four distinct cases: .IP * 3 If .I pathname is a nonempty string containing an absolute pathname, then a handle is returned for the file referred to by that pathname. In this case, .IR dirfd is ignored. .IP * If .I pathname is a nonempty string containing a relative pathname and .IR dirfd has the special value .BR AT_FDCWD , then .I pathname is interpreted relative to the current working directory of the caller, and a handle is returned for the file to which it refers. .IP * If .I pathname is a nonempty string containing a relative pathname and .IR dirfd is a file descriptor referring to a directory, then .I pathname is interpreted relative to the directory referred to by .IR dirfd , and a handle is returned for the file to which it refers. (See .BR openat (2) for an explanation of why "directory file descriptors" are useful.) .IP * If .I pathname is an empty string and .I flags specifies the value .BR AT_EMPTY_PATH , then .IR dirfd can be an open file descriptor referring to any type of file, or .BR AT_FDCWD , meaning the current working directory, and a handle is returned for the file to which it refers.
.PP The .I mount_id argument returns an identifier for the filesystem mount that corresponds to .IR pathname . This corresponds to the first field in one of the records in .IR /proc/self/mountinfo . Opening the pathname in the fifth field of that record yields a file descriptor for the mount point; that file descriptor can be used in a subsequent call to .BR open_by_handle_at (). By default, .BR name_to_handle_at () does not dereference .I pathname if it is a symbolic link, and thus returns a handle for the link itself. If .B AT_SYMLINK_FOLLOW is specified in .IR flags , .I pathname is dereferenced if it is a symbolic link (so that the call returns a handle for the file referred to by the link). .SS open_by_handle_at() The .BR open_by_handle_at () system call opens the file referred to by .IR handle , a file handle returned by a previous call to .BR name_to_handle_at (). The .IR mount_fd argument is a file descriptor for any object (file, directory, etc.) in the mounted filesystem with respect to which .IR handle should be interpreted. The special value .B AT_FDCWD can be specified, meaning the current working directory of the caller. The .I flags argument is as for .BR open (2). .\" FIXME: Confirm that the following is intended behavior. .\" (It certainly seems to be the behavior, from experimenting.) If .I handle refers to a symbolic link, the caller must specify the .B O_PATH flag, and the symbolic link is not dereferenced; the .B O_NOFOLLOW flag, if specified, is ignored. The caller must have the .B CAP_DAC_READ_SEARCH capability to invoke .BR open_by_handle_at (). .SH RETURN VALUE On success, .BR name_to_handle_at () returns 0, and .BR open_by_handle_at () returns a nonnegative file descriptor. In the event of an error, both system calls return \-1 and set .I errno to indicate the cause of the error. .SH ERRORS .BR name_to_handle_at () and .BR open_by_handle_at () can fail for the same errors as .BR openat (2). 
In addition, they can fail with the errors noted below. .BR name_to_handle_at () can fail with the following errors: .TP .B EFAULT .IR pathname , .IR mount_id , or .IR handle points outside your accessible address space. .TP .B EINVAL .I flags includes an invalid bit value. .TP .B EINVAL .IR handle\->handle_bytes is greater than .BR MAX_HANDLE_SZ . .TP .B ENOENT .I pathname is an empty string, but .BR AT_EMPTY_PATH was not specified in .IR flags . .TP .B ENOTDIR The file descriptor supplied in .I dirfd does not refer to a directory, and it is not the case that both .I flags includes .BR AT_EMPTY_PATH and .I pathname is an empty string. .TP .B EOPNOTSUPP The filesystem does not support decoding of a pathname to a file handle. .TP .B EOVERFLOW The .I handle->handle_bytes value passed into the call was too small. When this error occurs, .I handle->handle_bytes is updated to indicate the required size for the handle. .\" .\" .PP .BR open_by_handle_at () can fail with the following errors: .TP .B EBADF .IR mount_fd is not an open file descriptor. .TP .B EFAULT .IR handle points outside your accessible address space. .TP .B EINVAL .I handle->handle_bytes is greater than .BR MAX_HANDLE_SZ or is equal to zero. .TP .B ELOOP .\" FIXME (see earlier FIXME). Is this the intended behavior? .I handle refers to a symbolic link, but .B O_PATH was not specified in .IR flags . .TP .B EPERM The caller does not have the .BR CAP_DAC_READ_SEARCH capability. .TP .B ESTALE The specified .I handle is not valid. This error will occur if, for example, the file has been deleted. .SH VERSIONS These system calls first appeared in Linux 2.6.39. .SH CONFORMING TO These system calls are nonstandard Linux extensions. .SH NOTES A file handle can be generated in one process using .BR name_to_handle_at () and later used in a different process that calls .BR open_by_handle_at ().
Some filesystems don't support the translation of pathnames to file handles, for example, .IR /proc , .IR /sys , and various network filesystems. A file handle may become invalid ("stale") if a file is deleted, or for other filesystem-specific reasons. An invalid handle is reported by an .B ESTALE error from .BR open_by_handle_at (). These system calls are designed for use by user-space file servers. For example, a user-space NFS server might generate a file handle and pass it to an NFS client. Later, when the client wants to open the file, it could pass the handle back to the server. .\" https://lwn.net/Articles/375888/ .\" "Open by handle" - Jonathan Corbet, 2010-02-23 This sort of functionality allows a user-space file server to operate in a stateless fashion with respect to the files it serves. If .I pathname refers to a symbolic link and .IR flags does not specify .BR AT_SYMLINK_FOLLOW , then .BR name_to_handle_at () returns a handle for the link (rather than the file to which it refers). .\" commit bcda76524cd1fa32af748536f27f674a13e56700 The process receiving the handle can later perform operations on the symbolic link by converting the handle to a file descriptor using .BR open_by_handle_at () with the .BR O_PATH flag, and then passing the file descriptor as the .IR dirfd argument in system calls such as .BR readlinkat (2) and .BR fchownat (2). .SS Obtaining a persistent filesystem ID The mount IDs in .IR /proc/self/mountinfo can be reused as filesystems are unmounted and mounted. Therefore, the mount ID returned by .BR name_to_handle_at () (in .IR *mount_id ) should not be treated as a persistent identifier for the corresponding mounted filesystem. However, an application can use the information in the .I mountinfo record that corresponds to the mount ID to derive a persistent identifier.
For example, one can use the device name in the mount source field (the second field after the hyphen separator) of the .I mountinfo record to search for the corresponding device UUID via the symbolic links in .IR /dev/disk/by-uuid . (A more convenient way of obtaining the UUID is to use the .\" e.g., http://stackoverflow.com/questions/6748429/using-libblkid-to-find-uuid-of-a-partition .BR libblkid (3) library.) That process can then be reversed, using the UUID to look up the device name, and then obtaining the corresponding mount point, in order to produce the .IR mount_fd argument used by .BR open_by_handle_at ().
.SH EXAMPLE The two programs below demonstrate the use of .BR name_to_handle_at () and .BR open_by_handle_at (). The first program .RI ( t_name_to_handle_at.c ) uses .BR name_to_handle_at () to obtain the file handle and mount ID for the file specified in its command-line argument; the handle and mount ID are written to standard output. The second program .RI ( t_open_by_handle_at.c ) reads a mount ID and file handle from standard input. The program then employs .BR open_by_handle_at () to open the file using that handle. If an optional command-line argument is supplied, then the .IR mount_fd argument for .BR open_by_handle_at () is obtained by opening the directory named in that argument. Otherwise, .IR mount_fd is obtained by scanning .IR /proc/self/mountinfo to find a record whose mount ID matches the mount ID read from standard input, and the mount directory specified in that record is opened. (These programs do not deal with the fact that mount IDs are not persistent.) The following shell session demonstrates the use of these two programs: .in +4n .nf $ \fBecho 'Can you please think about it?' > cecilia.txt\fP $ \fB./t_name_to_handle_at cecilia.txt > fh\fP $ \fB./t_open_by_handle_at < fh\fP open_by_handle_at: Operation not permitted $ \fBsudo ./t_open_by_handle_at < fh\fP # Need CAP_DAC_READ_SEARCH Read 31 bytes .fi .in
Now we delete and (quickly) re-create the file so that it has the same content and (by chance) the same inode. Nevertheless, .BR open_by_handle_at () .\" Christoph Hellwig: That's why the file handles contain a generation .\" counter that gets incremented in this case. recognizes that the original file referred to by the file handle no longer exists. .in +4n .nf $ \fBstat \-\-printf="%i\\n" cecilia.txt\fP # Display inode number 4072121 $ \fBrm cecilia.txt\fP $ \fBecho 'Can you please think about it?' > cecilia.txt\fP $ \fBstat \-\-printf="%i\\n" cecilia.txt\fP # Check inode number 4072121 $ \fBsudo ./t_open_by_handle_at < fh\fP open_by_handle_at: Stale NFS file handle .fi .in .SS Program source: t_name_to_handle_at.c \& .nf #define _GNU_SOURCE #include <sys/types.h> #include <sys/stat.h> #include <fcntl.h> #include <stdio.h> #include <stdlib.h> #include <unistd.h> #include <errno.h> #include <string.h> #define errExit(msg) do { perror(msg); exit(EXIT_FAILURE); \\ } while (0) int main(int argc, char *argv[]) { struct file_handle *fhp; int mount_id, fhsize, flags, dirfd, j; char *pathname; if (argc != 2) { fprintf(stderr, "Usage: %s pathname\\n", argv[0]); exit(EXIT_FAILURE); } pathname = argv[1]; /* Allocate file_handle structure */ fhsize = sizeof(*fhp); fhp = malloc(fhsize); if (fhp == NULL) errExit("malloc"); /* Make an initial call to name_to_handle_at() to discover the size required for file handle */ dirfd = AT_FDCWD; /* For name_to_handle_at() calls */ flags = 0; /* For name_to_handle_at() calls */ fhp\->handle_bytes = 0; if (name_to_handle_at(dirfd, pathname, fhp, &mount_id, flags) != \-1 || errno != EOVERFLOW) { fprintf(stderr, "Unexpected result from name_to_handle_at()\\n"); exit(EXIT_FAILURE); } /* Reallocate
file_handle structure with correct size */ fhsize = sizeof(struct file_handle) + fhp\->handle_bytes; fhp = realloc(fhp, fhsize); /* Copies fhp\->handle_bytes */ if (fhp == NULL) errExit("realloc"); /* Get file handle from pathname supplied on command line */ if (name_to_handle_at(dirfd, pathname, fhp, &mount_id, flags) == \-1) errExit("name_to_handle_at"); /* Write mount ID, file handle size, and file handle to stdout, for later reuse by t_open_by_handle_at.c */ printf("%d\\n", mount_id); printf("%d %d ", fhp\->handle_bytes, fhp\->handle_type); for (j = 0; j < fhp\->handle_bytes; j++) printf(" %02x", fhp\->f_handle[j]); printf("\\n"); exit(EXIT_SUCCESS); } .fi .SS Program source: t_open_by_handle_at.c \& .nf #define _GNU_SOURCE #include <sys/types.h> #include <sys/stat.h> #include <fcntl.h> #include <limits.h> #include <stdio.h> #include <stdlib.h> #include <unistd.h> #include <string.h> #define errExit(msg) do { perror(msg); exit(EXIT_FAILURE); \\ } while (0) /* Scan /proc/self/mountinfo to find the line whose mount ID matches \(aqmount_id\(aq. (An easier way to do this is to install and use the \(aqlibmount\(aq library provided by the \(aqutil\-linux\(aq project.) Open the corresponding mount path and return the resulting file descriptor. 
*/ static int open_mount_path_by_id(int mount_id) { char *linep; size_t lsize; char mount_path[PATH_MAX]; int mi_mount_id, found; ssize_t nread; FILE *fp; fp = fopen("/proc/self/mountinfo", "r"); if (fp == NULL) errExit("fopen"); found = 0; linep = NULL; while (!found) { nread = getline(&linep, &lsize, fp); if (nread == \-1) break; nread = sscanf(linep, "%d %*d %*s %*s %s", &mi_mount_id, mount_path); if (nread != 2) { fprintf(stderr, "Bad sscanf()\\n"); exit(EXIT_FAILURE); } if (mi_mount_id == mount_id) found = 1; } free(linep); fclose(fp); if (!found) { fprintf(stderr, "Could not find mount point\\n"); exit(EXIT_FAILURE); } return open(mount_path, O_RDONLY); } int main(int argc, char *argv[]) { struct file_handle *fhp; int mount_id, fd, mount_fd, handle_bytes, j; ssize_t nread; char buf[1000]; #define LINE_SIZE 100 char line1[LINE_SIZE], line2[LINE_SIZE]; char *nextp; if ((argc > 1 && strcmp(argv[1], "\-\-help") == 0) || argc > 2) { fprintf(stderr, "Usage: %s [mount\-path]\\n", argv[0]); exit(EXIT_FAILURE); } /* Standard input contains mount ID and file handle information: Line 1: <mount_id> Line 2: <handle_bytes> <handle_type> <bytes of handle in hex> */ if ((fgets(line1, sizeof(line1), stdin) == NULL) || (fgets(line2, sizeof(line2), stdin) == NULL)) { fprintf(stderr, "Missing mount_id / file handle\\n"); exit(EXIT_FAILURE); } mount_id = atoi(line1); handle_bytes = strtoul(line2, &nextp, 0); /* Given handle_bytes, we can now allocate file_handle structure */ fhp = malloc(sizeof(struct file_handle) + handle_bytes); if (fhp == NULL) errExit("malloc"); fhp\->handle_bytes = handle_bytes; fhp\->handle_type = strtoul(nextp, &nextp, 0); for (j = 0; j < fhp\->handle_bytes; j++) fhp\->f_handle[j] = strtoul(nextp, &nextp, 16); /* Obtain file descriptor for mount point, either by opening the pathname specified on the command line, or by scanning /proc/self/mountinfo to find a mount that matches the \(aqmount_id\(aq that we received from stdin.
*/ if (argc > 1) mount_fd = open(argv[1], O_RDONLY); else mount_fd = open_mount_path_by_id(mount_id); if (mount_fd == \-1) errExit("opening mount fd"); /* Open file using handle and mount point */ fd = open_by_handle_at(mount_fd, fhp, O_RDONLY); if (fd == \-1) errExit("open_by_handle_at"); /* Try reading a few bytes from the file */ nread = read(fd, buf, sizeof(buf)); if (nread == \-1) errExit("read"); printf("Read %zd bytes\\n", nread); exit(EXIT_SUCCESS); } .fi .SH SEE ALSO .BR open (2), .BR libblkid (3), .BR blkid (8), .BR findfs (8), .BR mount (8) The .I libblkid and .I libmount documentation in the latest .I util-linux release at .UR https://www.kernel.org/pub/linux/utils/util-linux/ .UE -- To unsubscribe from this list: send the line "unsubscribe linux-man" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 22+ messages in thread
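The draft's notes on mount IDs and persistent filesystem identifiers depend on picking the right fields out of a /proc/self/mountinfo record. The sketch below is one way to extract the mount ID, the mount point (fifth field), and the filesystem type and mount source (the two fields after the hyphen separator). The names struct mount_rec, parse_mountinfo_line(), and FIELD_MAX are hypothetical, chosen for this illustration, and the sketch assumes no field exceeds FIELD_MAX - 1 bytes.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define FIELD_MAX 4096          /* Assumed upper bound on a field length */

/* Hypothetical record type holding the fields of interest */
struct mount_rec {
    int  mount_id;                  /* Field 1 */
    char mount_point[FIELD_MAX];    /* Field 5 */
    char fs_type[64];               /* First field after the separator */
    char source[FIELD_MAX];         /* Second field after the separator */
};

/* Parse one mountinfo line into 'rec'.
   Returns 0 on success, or -1 if the line is malformed. */
static int
parse_mountinfo_line(const char *line, struct mount_rec *rec)
{
    const char *sep;

    /* Fields 1-5: mount ID, parent ID, major:minor, root, mount point */
    if (sscanf(line, "%d %*d %*s %*s %4095s",
               &rec->mount_id, rec->mount_point) != 2)
        return -1;

    /* A variable number of optional fields ends at a lone hyphen */
    sep = strstr(line, " - ");
    if (sep == NULL)
        return -1;

    /* After the separator: filesystem type, then mount source */
    if (sscanf(sep + 3, "%63s %4095s", rec->fs_type, rec->source) != 2)
        return -1;

    return 0;
}
```

For a record such as "36 35 98:0 /mnt1 /mnt2 rw,noatime master:1 - ext3 /dev/root rw,errors=continue", this yields mount ID 36, mount point /mnt2, and mount source /dev/root. Spaces inside pathnames appear in mountinfo as octal escapes, which this sketch does not decode.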
* Re: For review: open_by_name_at(2) man page [v2] 2014-03-18 12:55 ` For review: open_by_name_at(2) man page [v2] Michael Kerrisk (man-pages) 2014-03-19 4:13 ` NeilBrown @ 2014-03-19 6:42 ` Mike Frysinger 2014-03-19 13:11 ` Michael Kerrisk (man-pages) 1 sibling, 1 reply; 22+ messages in thread From: Mike Frysinger @ 2014-03-19 6:42 UTC (permalink / raw) To: Michael Kerrisk (man-pages) Cc: Aneesh Kumar K.V, linux-man, Linux-Fsdevel, lkml, Andreas Dilger, NeilBrown, Christoph Hellwig [-- Attachment #1: Type: text/plain, Size: 5942 bytes --] On Tue 18 Mar 2014 13:55:15 Michael Kerrisk wrote: > The > .I flags > argument is a bit mask constructed by ORing together > zero or more of the following value: > .TP > .B AT_EMPTY_PATH > Allow > .I pathname > to be an empty string. > See above. > (which may have been obtained using the > .BR open (2) > .B O_PATH > flag). > .TP > .B AT_SYMLINK_FOLLOW > By default, > .BR name_to_handle_at () > does not dereference > .I pathname > if it is a symbolic link. > The flag > .B AT_SYMLINK_FOLLOW > can be specified in > .I flags > to cause > .I pathname > to be dereferenced if it is a symbolic link. this section is only talking about |flags|, and further this part is only talking about AT_SYMLINK_FOLLOW. so this last sentence sounds super redundant. how about reversing the sentence order so that both are implicit like is done in the openat() page and the description of O_NOFOLLOW ? > .B ENOTDIR > The file descriptor supplied in > .I dirfd > does not refer to a directory, > and it it is not the case that both "it" is duplicated > .SS Obtaining a persistent filesystem ID > The mount IDs in > .IR /proc/self/mountinfo > can be reused as filesystems are unmounted and mounted. > Therefore, the mount ID returned by > .BR name_to_handle_at (3) should be () and not (3) side note: this seems like an easy error to script for ... > $ \fBecho 'Kannst du bitte überlegen?' 
> cecilia.txt\fP aber, ich spreche kein Deutsch :( do we have a standard about sticking to english ? i wonder if people are more likely to be confused or to appreciate it ... > #define errExit(msg) do { perror(msg); exit(EXIT_FAILURE); \\ > } while (0) i wonder if err.h makes sense now that this is a man page for completely linux-specific syscalls :). and you use _GNU_SOURCE. > int > main(int argc, char *argv[]) > { > struct file_handle *fhp; > int mount_id, fhsize, s; > > if (argc < 2 || strcmp(argv[1], "\-\-help") == 0) { argc != 2 ? > /* Allocate file_handle structure */ > > fhsize = sizeof(struct file_handle *); pretty sure this is wrong as sizeof() here returns the size of a pointer, not the size of the struct. it's why i prefer the form: fhsize = sizeof(*fhp); less typing and harder to screw up by accident. granted, the case below won't crash since the kernel only reads/writes sizeof(unsigned int) and i'm not aware of any system where that is larger than sizeof(void *), but it's still wrong :). > s = name_to_handle_at(AT_FDCWD, argv[1], fhp, &mount_id, 0); another personal style: create dedicated variables for each arg you unpack out of argv[1]. it's generally OK when you only take one arg, but when you get more than one, you end up flipping back and forth between the usage trying to figure out what index 1 represents instead of focusing on what the code is doing. const char *pathname = argv[1]; > fhsize = sizeof(struct file_handle) + fhp\->handle_bytes; fhsize += fhp->handle_bytes ? it's the same, but i think nicer ;) > /* Write mount ID, file handle size, and file handle to stdout, > for later reuse by t_open_by_handle_at.c */ > > if (write(STDOUT_FILENO, &mount_id, sizeof(int)) != sizeof(int) || > write(STDOUT_FILENO, &fhsize, sizeof(int)) != sizeof(int) || > write(STDOUT_FILENO, fhp, fhsize) != fhsize) { seems like a whole lot of code spew for a simple printf() ? 
you'd have to adjust the other program to use scanf(), but seems like the end result would be nicer for users to experiment with ? > static int > open_mount_path_by_id(int mount_id) > { > char *linep; > size_t lsize; > char mount_path[PATH_MAX]; > int fmnt_id, fnd, nread; could we buy a few more letters for these vars ? i guess fmnt_id is the filesystem mount id, and fnd is "find". also, getline() returns a ssize_t, not an int. > FILE *fp; > > fp = fopen("/proc/self/mountinfo", "r"); only one space before the = i would encourage using the "e" flag whenever possible in the hopes that someone might start using it in their own code base. fp = fopen("/proc/self/mountinfo", "re"); > for (fnd = 0; !fnd ; ) { in my experience, seems like a while() loop makes more sense when you're implementing a while() loop ... fnd = 0; while (!fnd) { > linep = NULL; > nread = getline(&linep, &lsize, fp); this works, but it's unusual when using getline() as it kind of defeats the purpose of using the dyn allocation feature. fnd = 0; linep = NULL; while (!fnd) { nread = getline(&linep, &lsize, fp); ... } free(linep); i don't think it complicates the code much more ? > if (nread == \-1) > break; > > nread = sscanf(linep, "%d %*d %*s %*s %s", &fmnt_id, mount_path); indent is off here > return open(mount_path, O_RDONLY | O_DIRECTORY); O_CLOEXEC for funsies ? > int > main(int argc, char *argv[]) > { > struct file_handle *fhp; > int mount_id, fd, mount_fd, fhsize; > ssize_t nread; > #define BSIZE 1000 > char buf[BSIZE]; why not sizeof(buf) and avoid the define ? > if (argc > 1 && strcmp(argv[1], "\-\-help") == 0) { > fprintf(stderr, "Usage: %s [mount\-dir]]\\n", > argv[0]); how about also aborting when argc > 2 ? > if (argc > 1) > mount_fd = open(argv[1], O_RDONLY | O_DIRECTORY); O_CLOEXEC ? > nread = read(fd, buf, BSIZE); > if (nread == \-1) > errExit("read"); > printf("Read %ld bytes\\n", (long) nread); yikes, that's a bad habit to encourage. 
read() returns a ssize_t, so print it out using %zd. > .SH SEE ALSO > .BR blkid (1), > .BR findfs (1), i don't have a findfs(1). i do have a findfs(8) ... -mike [-- Attachment #2: This is a digitally signed message part. --] [-- Type: application/pgp-signature, Size: 836 bytes --] ^ permalink raw reply [flat|nested] 22+ messages in thread
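The suggestion above to exchange the handle as text instead of raw write() calls can be sketched as a pair of helpers that round-trip a struct file_handle through the "<handle_bytes> <handle_type> <hex bytes>" layout. This is an illustrative sketch only: handle_to_text() and text_to_handle() are hypothetical names, and it assumes glibc's <fcntl.h> provides struct file_handle and MAX_HANDLE_SZ under _GNU_SOURCE.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: format a handle as
   "<handle_bytes> <handle_type> <hex byte> <hex byte> ..." in 'buf'.
   Returns 0 on success, or -1 if 'buf' is too small. */
static int
handle_to_text(const struct file_handle *fhp, char *buf, size_t size)
{
    int n;
    size_t used;
    unsigned int j;

    n = snprintf(buf, size, "%u %d", fhp->handle_bytes, fhp->handle_type);
    if (n < 0 || (size_t) n >= size)
        return -1;
    used = n;

    for (j = 0; j < fhp->handle_bytes; j++) {
        n = snprintf(buf + used, size - used, " %02x", fhp->f_handle[j]);
        if (n < 0 || (size_t) n >= size - used)
            return -1;
        used += n;
    }
    return 0;
}

/* Hypothetical helper: parse the text form back into a newly
   allocated handle. Returns NULL on error; caller frees the result. */
static struct file_handle *
text_to_handle(const char *text)
{
    struct file_handle *fhp;
    unsigned long nbytes;
    unsigned int j;
    char *nextp;

    nbytes = strtoul(text, &nextp, 10);
    if (nextp == text || nbytes > MAX_HANDLE_SZ)
        return NULL;

    /* sizeof(*fhp) is the struct size, not the size of a pointer */
    fhp = malloc(sizeof(*fhp) + nbytes);
    if (fhp == NULL)
        return NULL;

    fhp->handle_bytes = nbytes;
    fhp->handle_type = strtol(nextp, &nextp, 10);
    for (j = 0; j < fhp->handle_bytes; j++)
        fhp->f_handle[j] = strtoul(nextp, &nextp, 16);

    return fhp;
}
```

Keeping the format textual means the output of one program can be inspected or edited by hand before being fed to the other, which is convenient when experimenting with stale or corrupted handles.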
* Re: For review: open_by_name_at(2) man page [v2] 2014-03-19 6:42 ` For review: open_by_name_at(2) man page [v2] Mike Frysinger @ 2014-03-19 13:11 ` Michael Kerrisk (man-pages) 0 siblings, 0 replies; 22+ messages in thread From: Michael Kerrisk (man-pages) @ 2014-03-19 13:11 UTC (permalink / raw) To: Mike Frysinger Cc: mtk.manpages, Aneesh Kumar K.V, linux-man, Linux-Fsdevel, lkml, Andreas Dilger, NeilBrown, Christoph Hellwig Hi Mike, On 03/19/2014 07:42 AM, Mike Frysinger wrote: > On Tue 18 Mar 2014 13:55:15 Michael Kerrisk wrote: >> The >> .I flags >> argument is a bit mask constructed by ORing together >> zero or more of the following value: >> .TP >> .B AT_EMPTY_PATH >> Allow >> .I pathname >> to be an empty string. >> See above. >> (which may have been obtained using the >> .BR open (2) >> .B O_PATH >> flag). >> .TP >> .B AT_SYMLINK_FOLLOW >> By default, >> .BR name_to_handle_at () >> does not dereference >> .I pathname >> if it is a symbolic link. >> The flag >> .B AT_SYMLINK_FOLLOW >> can be specified in >> .I flags >> to cause >> .I pathname >> to be dereferenced if it is a symbolic link. > > this section is only talking about |flags|, and further this part is only > talking about AT_SYMLINK_FOLLOW. so this last sentence sounds super > redundant. > how about reversing the sentence order so that both are implicit like is done > in the openat() page and the description of O_NOFOLLOW ? I'm not sure that I completely understand you here, but I agree that this could be better. I've rewritten somewhat. >> .B ENOTDIR >> The file descriptor supplied in >> .I dirfd >> does not refer to a directory, >> and it it is not the case that both > > "it" is duplicated Fixed. >> .SS Obtaining a persistent filesystem ID >> The mount IDs in >> .IR /proc/self/mountinfo >> can be reused as filesystems are unmounted and mounted. >> Therefore, the mount ID returned by >> .BR name_to_handle_at (3) > > should be () and not (3) Fixed. 
> side note: this seems like an easy error to script for ...

Yep, I've got some scripts that I run manually now and then to check
for this sort of stuff.

>> $ \fBecho 'Kannst du bitte überlegen?' > cecilia.txt\fP
>
> aber, ich spreche kein Deutsch :(
>
> do we have a standard about sticking to english ?  i wonder if people are more
> likely to be confused or to appreciate it ...

Fair enough. I'm too influenced by recent work on the locale pages
(and family conversations ;-)). I'll switch it to English.

>> #define errExit(msg) do { perror(msg); exit(EXIT_FAILURE); \\
>>                      } while (0)
>
> i wonder if err.h makes sense now that this is a man page for completely
> linux-specific syscalls :).  and you use _GNU_SOURCE.

I'm not really convinced about using these functions, but I'll reflect
on it more.

>> int
>> main(int argc, char *argv[])
>> {
>>     struct file_handle *fhp;
>>     int mount_id, fhsize, s;
>>
>>     if (argc < 2 || strcmp(argv[1], "\-\-help") == 0) {
>
> argc != 2 ?

Yes, some cruft crept in.

>>     /* Allocate file_handle structure */
>>
>>     fhsize = sizeof(struct file_handle *);
>
> pretty sure this is wrong

<blush>

> as sizeof() here returns the size of a pointer, not
> the size of the struct.  it's why i prefer the form:
>
>     fhsize = sizeof(*fhp);
>
> less typing and harder to screw up by accident.

Yep, changed.

> granted, the case below won't crash since the kernel only reads/writes
> sizeof(unsigned int) and i'm not aware of any system where that is larger than
> sizeof(void *), but it's still wrong :).
>
>>     s = name_to_handle_at(AT_FDCWD, argv[1], fhp, &mount_id, 0);
>
> another personal style: create dedicated variables for each arg you unpack out
> of argv[1].  it's generally OK when you only take one arg, but when you get
> more than one, you end up flipping back and forth between the usage trying to
> figure out what index 1 represents instead of focusing on what the code is
> doing.
>     const char *pathname = argv[1];

Yup.
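The sizeof() point above is easy to see in isolation. The following is only a minimal sketch: `struct demo_handle` and `demo_handle_alloc()` are hypothetical stand-ins invented for illustration (the layout mirrors the `struct file_handle` shown in the page, but this is not the glibc definition), so that the fragment compiles without `_GNU_SOURCE`.

```c
#include <stdlib.h>

/* Hypothetical stand-in mirroring the layout of struct file_handle;
   an assumption for illustration, not the real glibc type */
struct demo_handle {
    unsigned int  handle_bytes;   /* Size of f_handle */
    int           handle_type;    /* Handle type */
    unsigned char f_handle[];     /* File identifier (sized by caller) */
};

/* Hypothetical helper: allocate a handle with room for 'payload'
   bytes of file identifier. Returns NULL on allocation failure. */
struct demo_handle *
demo_handle_alloc(unsigned int payload)
{
    struct demo_handle *fhp;

    /* sizeof(*fhp) is the size of the structure itself;
       sizeof(struct demo_handle *) would only be the size of a pointer */
    fhp = malloc(sizeof(*fhp) + payload);
    if (fhp == NULL)
        return NULL;

    fhp->handle_bytes = payload;
    fhp->handle_type = 0;
    return fhp;
}
```

With the real API, the same pattern would presumably read `malloc(sizeof(*fhp) + MAX_HANDLE_SZ)` followed by setting `fhp->handle_bytes = MAX_HANDLE_SZ` before calling name_to_handle_at(). The caller releases the result with free().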
>>     fhsize = sizeof(struct file_handle) + fhp\->handle_bytes;
>
> fhsize += fhp->handle_bytes ?
>
> it's the same, but i think nicer ;)

Depends on your perspective. It relies on no one changing the code so
that fhsize is modified after the earlier initialization. And also,
with this line, I see exactly what is going on, in one place. I'll
leave as is.

>>     /* Write mount ID, file handle size, and file handle to stdout,
>>        for later reuse by t_open_by_handle_at.c */
>>
>>     if (write(STDOUT_FILENO, &mount_id, sizeof(int)) != sizeof(int) ||
>>             write(STDOUT_FILENO, &fhsize, sizeof(int)) != sizeof(int) ||
>>             write(STDOUT_FILENO, fhp, fhsize) != fhsize) {
>
> seems like a whole lot of code spew for a simple printf() ?  you'd have to
> adjust the other program to use scanf(), but seems like the end result would
> be nicer for users to experiment with ?

Yes. I'd already reflected on exactly that and made a change to using
text formats.

>> static int
>> open_mount_path_by_id(int mount_id)
>> {
>>     char *linep;
>>     size_t lsize;
>>     char mount_path[PATH_MAX];
>>     int fmnt_id, fnd, nread;
>
> could we buy a few more letters for these vars ?  i guess fmnt_id is the
> filesystem mount id, and fnd is "find".

When I was a kid, you had to pay a dollar for each letter...
(I've made a few changes.)

> also, getline() returns a ssize_t, not an int.

Fixed.

>>     FILE *fp;
>>
>>     fp = fopen("/proc/self/mountinfo", "r");
>
> only one space before the =

Yup.

> i would encourage using the "e" flag whenever possible in the hopes that
> someone might start using it in their own code base.
>
>     fp = fopen("/proc/self/mountinfo", "re");

I'm of two minds about this. I foresee the day when I get a bug report
that says: "Why did you use 'e' here (or O_CLOEXEC)? It's not needed".
So, I'm inclined to leave this.

>>     for (fnd = 0; !fnd ; ) {
>
> in my experience, seems like a while() loop makes more sense when you're
> implementing a while() loop ...
>     fnd = 0;
>     while (!fnd) {

Yup. ;-}.
>>         linep = NULL;
>>         nread = getline(&linep, &lsize, fp);
>
> this works, but it's unusual when using getline() as it kind of defeats the
> purpose of using the dyn allocation feature.
>
>     fnd = 0;
>     linep = NULL;
>     while (!fnd) {
>         nread = getline(&linep, &lsize, fp);
>         ...
>     }
>     free(linep);
>
> i don't think it complicates the code much more ?

Yes. Fixed.

>>         if (nread == \-1)
>>             break;
>>
>>         nread = sscanf(linep, "%d %*d %*s %*s %s", &fmnt_id, mount_path);
>
> indent is off here

Fixed.

>>     return open(mount_path, O_RDONLY | O_DIRECTORY);
>
> O_CLOEXEC for funsies ?

See above comment.

>> int
>> main(int argc, char *argv[])
>> {
>>     struct file_handle *fhp;
>>     int mount_id, fd, mount_fd, fhsize;
>>     ssize_t nread;
>> #define BSIZE 1000
>>     char buf[BSIZE];
>
> why not sizeof(buf) and avoid the define ?

Done.

>>     if (argc > 1 && strcmp(argv[1], "\-\-help") == 0) {
>>         fprintf(stderr, "Usage: %s [mount\-dir]]\\n",
>>                 argv[0]);
>
> how about also aborting when argc > 2 ?

Yes.

>>     if (argc > 1)
>>         mount_fd = open(argv[1], O_RDONLY | O_DIRECTORY);
>
> O_CLOEXEC ?

See comment above.

>>     nread = read(fd, buf, BSIZE);
>>     if (nread == \-1)
>>         errExit("read");
>>     printf("Read %ld bytes\\n", (long) nread);
>
> yikes, that's a bad habit to encourage.  read() returns a ssize_t, so print it
> out using %zd.

Calling it a bad habit seems a bit too strong. It's a habit conditioned
by writing code that runs on systems that don't have C99. Less important
these days, of course. I've changed it.

>> .SH SEE ALSO
>> .BR blkid (1),
>> .BR findfs (1),
>
> i don't have a findfs(1).  i do have a findfs(8) ...

Thanks. blkid(8) also, actually.

Thanks for the comments, Mike.

Cheers,

Michael

-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/