From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from userp1040.oracle.com ([156.151.31.81]:40664 "EHLO userp1040.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755412AbdCGSDI (ORCPT ); Tue, 7 Mar 2017 13:03:08 -0500 Date: Tue, 7 Mar 2017 10:02:17 -0800 From: "Darrick J. Wong" To: David Howells Cc: Christoph Hellwig , mtk.manpages@gmail.com, linux-fsdevel , xfs Subject: Re: statx manpage Message-ID: <20170307180217.GF5281@birch.djwong.org> References: <20170307050140.GA12946@infradead.org> <20170307000609.GG5280@birch.djwong.org> <10435.1488907375@warthog.procyon.org.uk> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <10435.1488907375@warthog.procyon.org.uk> Sender: linux-fsdevel-owner@vger.kernel.org List-ID: On Tue, Mar 07, 2017 at 05:22:55PM +0000, David Howells wrote: > Christoph Hellwig wrote: > > > This would be great to have in 4.11 together with the initial statx > > implementation. But until I see documentation and testcases for statx > > I don't really feel comfortable reviewing anything related to it. > > Well, since you asked for documentation, here's a manual page for you to > review:-) HA, ok. This came in while I was scribbling a reply to Christoph's email. I'll have a look. > Note that as it isn't in glibc yet, I've left out all the > set-this-and-that-#define-to-make-it-appear stuff except where it is pertinent > to particular constants. > > I don't suppose you know where the documentation on writing xfstests tests is? > xfstests-dev/doc/ only contains an old and out of date changelog. /me usually just copies one of the newer tests and changes it. 
> David > --- > '\" t > .\" Copyright (c) 1992 Drew Eckhardt (drew@cs.colorado.edu), March 28, 1992 > .\" Parts Copyright (c) 1995 Nicolai Langfeldt (janl@ifi.uio.no), 1/1/95 > .\" and Copyright (c) 2006, 2007, 2014 Michael Kerrisk > .\" and Copyright (c) 2017 David Howells > .\" > .\" %%%LICENSE_START(VERBATIM) > .\" Permission is granted to make and distribute verbatim copies of this > .\" manual provided the copyright notice and this permission notice are > .\" preserved on all copies. > .\" > .\" Permission is granted to copy and distribute modified versions of this > .\" manual under the conditions for verbatim copying, provided that the > .\" entire resulting derived work is distributed under the terms of a > .\" permission notice identical to this one. > .\" > .\" Since the Linux kernel and libraries are constantly changing, this > .\" manual page may be incorrect or out-of-date. The author(s) assume no > .\" responsibility for errors or omissions, or for damages resulting from > .\" the use of the information contained herein. The author(s) may not > .\" have taken the same level of care in the production of this manual, > .\" which is licensed free of charge, as they might when working > .\" professionally. > .\" > .\" Formatted or processed versions of this manual, if unaccompanied by > .\" the source, must acknowledge the copyright and authors of this work. 
> .\" %%%LICENSE_END > .\" > .TH STATX 2 2017-03-07 "Linux" "Linux Programmer's Manual" > .SH NAME > statx \- Get file status (extended) > .SH SYNOPSIS > .nf > .B #include > .br > .B #include > .br > .B #include > .br > .BR "#include " "/* Definition of AT_* constants */" > .sp > .BI "int statx(int " dirfd ", const char *" pathname ", int " flags "," > .BI " unsigned int " mask ", struct statx *" buf ); > .fi > .sp > .in -4n > Feature Test Macro Requirements for glibc (see > .BR feature_test_macros (7)): > .in > .ad l > .PD 0 > .sp > .RS 4 > > .RE > .PD > .ad > .SH DESCRIPTION > .PP > This function returns information about a file, storing it in the buffer > pointed to by > .IR buf . > The buffer is filled in according to the following type: > .PP > .in +4n > .nf > struct statx { > __u32 stx_mask; -- Mask of bits indicating filled fields > __u32 stx_blksize; -- Block size for filesystem I/O > __u64 stx_attributes; -- Extra file attribute indicators > __u32 stx_nlink; -- Number of hard links > __u32 stx_uid; -- User ID of owner > __u32 stx_gid; -- Group ID of owner > __u16 stx_mode; -- File type and mode > __u64 stx_ino; -- Inode number > __u64 stx_size; -- Total size in bytes > __u64 stx_blocks; -- Number of 512B blocks allocated > struct statx_timestamp stx_atime; -- Time of last access > struct statx_timestamp stx_btime; -- Time of creation > struct statx_timestamp stx_ctime; -- Time of last status change > struct statx_timestamp stx_mtime; -- Time of last modification > __u32 stx_rdev_major; } Device number if device file "of device file" ? > __u32 stx_rdev_minor; } > __u32 stx_dev_major; } Device number of containing file > __u32 stx_dev_minor; } "Device number of device containing file"? Or perhaps just "ID of device containing file" from the stat(2) manpage? 
> }; > .fi > .in > .PP > Where the timestamps are defined as: > .PP > .in +4n > .nf > struct statx_timestamp { > __s64 tv_sec; -- Number of seconds before or since 1970 > __s32 tv_nsec; -- Number of nanoseconds before or since tv_sec > }; > .fi > .in > .PP > (Note that reserved space and padding are omitted.) > .SS > Invoking \fBstatx\fR(): > .PP > To access a file's status, no permissions are required on the file itself, but > in the case of > .BR statx () > with a path, execute (search) permission is required on all of the directories > in > .I pathname > that lead to the file. > .PP > .BR statx () > uses > .IR pathname ", " dirfd " and " flags > to locate the target file in one of a variety of ways: > .TP > [*] By absolute path. > .I pathname > points to an absolute path and > .I dirfd > is ignored. The file is looked up by name, starting from the root of the > filesystem as seen by the calling process. > .TP > [*] By cwd-relative path. > .I pathname > points to a relative path and > .IR dirfd " is " AT_FDCWD . > The file is looked up by name, starting from the current working directory. > .TP > [*] By dir-relative path. > .I pathname > points to a relative path and > .I dirfd > indicates a file descriptor pointing to a directory. The file is looked up by > name, starting from the directory specified by > .IR dirfd . > .TP > [*] By file descriptor. > .IR pathname " is " NULL " and " dirfd > indicates a file descriptor. The file attached to the file descriptor is > queried directly. The file descriptor may point to any type of file, not just > a directory. > .PP > .I flags > can be used to influence a path-based lookup. 
A value for > .I flags > is constructed by OR'ing together zero or more of the following constants: > .TP > .BR AT_EMPTY_PATH " (since Linux 2.6.39)" > .\" commit 65cfc6722361570bfe255698d9cd4dccaf47570d > If > .I pathname > is an empty string, operate on the file referred to by > .IR dirfd > (which may have been obtained using the > .BR open (2) > .B O_PATH > flag). > If > .I dirfd > is > .BR AT_FDCWD , > the call operates on the current working directory. > In this case, > .I dirfd > can refer to any type of file, not just a directory. Is (flags & AT_EMPTY_PATH) the same as (pathname == NULL)? > This flag is Linux-specific; define > .B _GNU_SOURCE > .\" Before glibc 2.16, defining _ATFILE_SOURCE sufficed > to obtain its definition. > .TP > .BR AT_NO_AUTOMOUNT " (since Linux 2.6.38)" > Don't automount the terminal ("basename") component of > .I pathname > if it is a directory that is an automount point. > This allows the caller to gather attributes of an automount point > (rather than the location it would mount). > This flag can be used in tools that scan directories > to prevent mass-automounting of a directory of automount points. > The > .B AT_NO_AUTOMOUNT > flag has no effect if the mount point has already been mounted over. > This flag is Linux-specific; define > .B _GNU_SOURCE > .\" Before glibc 2.16, defining _ATFILE_SOURCE sufficed > to obtain its definition. > .TP > .B AT_SYMLINK_NOFOLLOW > If > .I pathname > is a symbolic link, do not dereference it: > instead return information about the link itself, like > .BR lstat (). > .PP > .I flags > can also be used to control what sort of synchronisation the kernel will do > when querying a file on a remote filesystem. This is done by OR'ing in one of > the following values: > .TP > AT_STATX_SYNC_AS_STAT > Do whatever > .BR stat () > does. This is the default and is very much filesystem specific. > .TP > AT_STATX_FORCE_SYNC > Force the attributes to be synchronised with the server. 
This may require that > a network filesystem perform a data writeback to get the timestamps correct. > .TP > AT_STATX_DONT_SYNC > Don't synchronise anything, but rather just take whatever the system has cached > if possible. This may mean that the information returned is approximate, but, > on a network filesystem, it may not involve a round trip to the server - even > if no lease is held. > .PP > The > .I mask > argument to > .BR statx () > is used to tell the kernel which fields the caller is interested in. > .I mask > is an OR'ed combination of the following constants: > .PP > .in +4n > .TS > lB l. > STATX_TYPE Want stx_mode & S_IFMT > STATX_MODE Want stx_mode & ~S_IFMT > STATX_NLINK Want stx_nlink > STATX_UID Want stx_uid > STATX_GID Want stx_gid > STATX_ATIME Want stx_atime{,_ns} > STATX_MTIME Want stx_mtime{,_ns} > STATX_CTIME Want stx_ctime{,_ns} > STATX_INO Want stx_ino > STATX_SIZE Want stx_size > STATX_BLOCKS Want stx_blocks > STATX_BASIC_STATS [The stuff in the normal stat struct] > STATX_BTIME Want stx_btime{,_ns} > STATX_ALL [All currently available stuff] > .TE > .in > .PP > .B "Do not" > simply set > .I mask > to UINT_MAX as one or more bits may, in future, be used to specify an extension > to the buffer. > .SS > The returned information > .PP > The status information for the target file is returned in the > .I statx > structure pointed to by > .IR buf . > Included in this is > .I stx_mask > which indicates what other information has been returned. > .I stx_mask > has the same format as the mask argument and bits are set in it to indicate > which fields have been filled in. > .PP > It should be noted that the kernel may return fields that weren't requested and > may fail to return fields that were requested, depending on what the backing > filesystem supports. In either case, > .I stx_mask > will not be equal to > .IR mask . 
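Since the request mask and stx_mask can differ, it might be worth showing callers that they have to test stx_mask before trusting a field. A rough sketch of what I mean follows. It assumes the struct and the STATX_* constants come from <linux/stat.h>, that <sys/syscall.h> defines SYS_statx (so fairly new kernel headers), and that we go through syscall(2) because there is no glibc wrapper yet. get_btime() is a name I made up for illustration, not anything from the patchset.

```c
#define _GNU_SOURCE
#include <fcntl.h>          /* AT_FDCWD */
#include <linux/stat.h>     /* struct statx, STATX_BTIME (assumed location) */
#include <sys/syscall.h>    /* SYS_statx (assumes new-enough kernel headers) */
#include <unistd.h>         /* syscall() */

/*
 * Hypothetical helper, for illustration only.
 * Returns 1 if the creation time was filled in, 0 if the filesystem
 * did not supply it, and -1 if the statx call itself failed.
 */
static int get_btime(const char *path, struct statx_timestamp *out)
{
    struct statx stx;

    /* Ask only for btime; the kernel may still return other fields. */
    if (syscall(SYS_statx, AT_FDCWD, path, 0, STATX_BTIME, &stx) == -1)
        return -1;

    /* The bit is cleared in stx_mask when the field is unsupported. */
    if (!(stx.stx_mask & STATX_BTIME))
        return 0;

    *out = stx.stx_btime;
    return 1;
}
```

The point is just that a return value of 0 from the syscall does not by itself mean stx_btime is valid.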
> .PP > If a filesystem does not support a field or if it has an unrepresentable value > (for instance, a file with an exotic type), then the mask bit corresponding to > that field will be cleared in > .I stx_mask > even if the user asked for it and a dummy value will be filled in for > compatibility purposes if one is available (e.g. a dummy uid and gid may be > specified to mount under some circumstances). > .PP > A filesystem may also fill in fields that the caller didn't ask for if it has > values for them available at no extra cost. If this happens, the corresponding > bits will be set in > .IR stx_mask . > .PP > > .\" Background: inode attributes are modified with i_mutex held, but > .\" read by stat() without taking the mutex. > .I Note: > For performance and simplicity reasons, different fields in the > .I statx > structure may contain state information from different moments > during the execution of the system call. Hm. Judging from the ext4 patch you proposed, I gather this is expected, at least in the btime case. --D > For example, if > .IR stx_mode > or > .IR stx_uid > is changed by another process by calling > .BR chmod (2) > or > .BR chown (2), > .BR stat () > might return the old > .I stx_mode > together with the new > .IR stx_uid , > or the old > .I stx_uid > together with the new > .IR stx_mode . > .PP > Apart from stx_mask (which is described above), the fields in the > .I statx > structure are: > .TP > .I stx_mode > The file type and mode. This is described in more detail below. > .TP > .I stx_size > The size of the file (if it is a regular file or a symbolic link) in bytes. > The size of a symbolic link is the length of the pathname it contains, without > a terminating null byte. > .TP > .I stx_blocks > The number of blocks allocated to the file on the medium, in 512-byte units. > (This may be smaller than > .IR stx_size /512 > when the file has holes.) > .TP > .I stx_blksize > The "preferred" blocksize for efficient filesystem I/O. 
(Writing to a file in > smaller chunks may cause an inefficient read-modify-rewrite.) > .TP > .I stx_nlink > The number of hard links on a file. > .TP > .I stx_uid > The user ID of the file's owner. > .TP > .I stx_gid > The ID of the group that may access the file. > .TP > .IR stx_dev_major " and " stx_dev_minor > The device on which this file (inode) resides. > .TP > .IR stx_rdev_major " and " stx_rdev_minor > The device that this file (inode) represents if the file is of block or > character device type. > .TP > .I stx_attributes > Further status information about the file. This consists of zero or more of > the following constants OR'ed together: > .in +4n > .TS > lB l. > STATX_ATTR_COMPRESSED File is compressed by the fs > STATX_ATTR_IMMUTABLE File is marked immutable > STATX_ATTR_APPEND File is append-only > STATX_ATTR_NODUMP File is not to be dumped > STATX_ATTR_ENCRYPTED File requires key to decrypt in fs > .TE > .in > .TP > .I stx_atime > The file's last access timestamp. > This field is changed by file accesses, for example, by > .BR execve (2), > .BR mknod (2), > .BR pipe (2), > .BR utime (2), > and > .BR read (2) > (of more than zero bytes). > Other routines, such as > .BR mmap (2), > may or may not update it. > .TP > .I stx_btime > The file's creation timestamp. This is set on file creation and not changed > subsequently. > .TP > .I stx_ctime > The file's last status change timestamp. This field is changed by writing or > by setting inode information (i.e., owner, group, link count, mode, etc.). > .TP > .I stx_mtime > The file's last modification timestamp. This is changed by file modifications, > for example, by > .BR mknod (2), > .BR truncate (2), > .BR utime (2), > and > .BR write (2) > (of more than zero bytes). Moreover, the modification time of a directory is > changed by the creation or deletion of files in that directory. This field is > .I not > changed for changes in owner, group, hard link count, or mode. 
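For the stx_attributes table above, a small decoder might make the "zero or more constants OR'ed together" wording more concrete. This is only a sketch: it assumes the five STATX_ATTR_* constants listed in the page are visible via <linux/stat.h>, and attrs_to_string() and the short names it prints are my own invention.

```c
#include <linux/stat.h>     /* STATX_ATTR_* and __u64 (assumed location) */
#include <stdio.h>          /* snprintf() */
#include <string.h>

/*
 * Hypothetical helper, for illustration only: write a comma-separated
 * list of the attribute bits set in attrs into buf (at most len bytes).
 */
static void attrs_to_string(__u64 attrs, char *buf, size_t len)
{
    static const struct {
        __u64 bit;
        const char *name;
    } tbl[] = {
        { STATX_ATTR_COMPRESSED, "compressed" },
        { STATX_ATTR_IMMUTABLE,  "immutable"  },
        { STATX_ATTR_APPEND,     "append"     },
        { STATX_ATTR_NODUMP,     "nodump"     },
        { STATX_ATTR_ENCRYPTED,  "encrypted"  },
    };
    size_t i, used = 0;

    if (len == 0)
        return;
    buf[0] = '\0';

    for (i = 0; i < sizeof(tbl) / sizeof(tbl[0]); i++) {
        int n;

        if (!(attrs & tbl[i].bit))
            continue;
        n = snprintf(buf + used, len - used, "%s%s",
                     used ? "," : "", tbl[i].name);
        if (n < 0 || (size_t)n >= len - used)
            break;          /* output was truncated; stop here */
        used += (size_t)n;
    }
}
```

Bits the table does not know about are silently ignored, which seems like the right behaviour if more attributes get added later.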
> > > > .PP > Not all of the Linux filesystems implement all of the timestamp fields. Some > filesystems allow mounting in such a way that file and/or directory accesses do > not cause an update of the > .I stx_atime > field. > (See > .IR noatime , > .IR nodiratime , > and > .I relatime > in > .BR mount (8), > and related information in > .BR mount (2).) > In addition, > .I stx_atime > is not updated if a file is opened with the > .BR O_NOATIME ; > see > .BR open (2). > > .SS File type and mode > .PP > The > .I stx_mode > field contains the combined file type and mode. POSIX refers to the bits in > this field corresponding to the mask > .B S_IFMT > (see below) as the > .IR "file type" , > the 12 bits corresponding to the mask 07777 as the > .IR "file mode bits" > and the least significant 9 bits (0777) as the > .IR "file permission bits" . > .IP > The following mask values are defined for the file type of the > .I stx_mode > field: > .in +4n > .TS > lB l l. > S_IFMT 0170000 bit mask for the file type bit field > > S_IFSOCK 0140000 socket > S_IFLNK 0120000 symbolic link > S_IFREG 0100000 regular file > S_IFBLK 0060000 block device > S_IFDIR 0040000 directory > S_IFCHR 0020000 character device > S_IFIFO 0010000 FIFO > .TE > .in > .IP > Note that > .I stx_mode > has two mask flags covering it: one for the type and one for the mode bits. > .PP > To test for a regular file (for example), one could write: > .nf > .in +4n > statx(AT_FDCWD, pathname, 0, STATX_BASIC_STATS, &sb); > if ((sb.stx_mode & S_IFMT) == S_IFREG) { > /* Handle regular file */ > } > .in > .fi > .PP > Because tests of the above form are common, additional macros are defined by > POSIX to allow the test of the file type in > .I stx_mode > to be written more concisely: > .RS 4 > .TS > lB l. > \fBS_ISREG\fR(m) Is it a regular file? > \fBS_ISDIR\fR(m) Is it a directory? > \fBS_ISCHR\fR(m) Is it a character device? > \fBS_ISBLK\fR(m) Is it a block device? > \fBS_ISFIFO\fR(m) Is it a FIFO (named pipe)? 
> \fBS_ISLNK\fR(m) Is it a symbolic link? (Not in POSIX.1-1996.) > \fBS_ISSOCK\fR(m) Is it a socket? (Not in POSIX.1-1996.) > .TE > .RE > .PP > The preceding code snippet could thus be rewritten as: > > .nf > .in +4n > statx(AT_FDCWD, pathname, 0, STATX_BASIC_STATS, &sb); > if (S_ISREG(sb.stx_mode)) { > /* Handle regular file */ > } > .in > .fi > .PP > The definitions of most of the above file type test macros > are provided if any of the following feature test macros is defined: > .BR _BSD_SOURCE > (in glibc 2.19 and earlier), > .BR _SVID_SOURCE > (in glibc 2.19 and earlier), > or > .BR _DEFAULT_SOURCE > (in glibc 2.20 and later). > In addition, definitions of all of the above macros except > .BR S_IFSOCK > and > .BR S_ISSOCK () > are provided if > .BR _XOPEN_SOURCE > is defined. > The definition of > .BR S_IFSOCK > can also be exposed by defining > .BR _XOPEN_SOURCE > with a value of 500 or greater. > > The definition of > .BR S_ISSOCK () > is exposed if any of the following feature test macros is defined: > .BR _BSD_SOURCE > (in glibc 2.19 and earlier), > .BR _DEFAULT_SOURCE > (in glibc 2.20 and later), > .BR _XOPEN_SOURCE > with a value of 500 or greater, or > .BR _POSIX_C_SOURCE > with a value of 200112L or greater. > .PP > The following mask values are defined for > the file mode component of the > .I stx_mode > field: > .in +4n > .TS > lB l l. 
> S_ISUID 04000 set-user-ID bit > S_ISGID 02000 set-group-ID bit (see below) > S_ISVTX 01000 sticky bit (see below) > > S_IRWXU 00700 owner has read, write, and execute permission > S_IRUSR 00400 owner has read permission > S_IWUSR 00200 owner has write permission > S_IXUSR 00100 owner has execute permission > > S_IRWXG 00070 group has read, write, and execute permission > S_IRGRP 00040 group has read permission > S_IWGRP 00020 group has write permission > S_IXGRP 00010 group has execute permission > > S_IRWXO 00007 T{ > others (not in group) have read, write, and execute permission > T} > S_IROTH 00004 others have read permission > S_IWOTH 00002 others have write permission > S_IXOTH 00001 others have execute permission > .TE > .in > .P > The set-group-ID bit > .RB ( S_ISGID ) > has several special uses. > For a directory, it indicates that BSD semantics is to be used > for that directory: files created there inherit their group ID from > the directory, not from the effective group ID of the creating process, > and directories created there will also get the > .B S_ISGID > bit set. > For a file that does not have the group execution bit > .RB ( S_IXGRP ) > set, > the set-group-ID bit indicates mandatory file/record locking. > .P > The sticky bit > .RB ( S_ISVTX ) > on a directory means that a file > in that directory can be renamed or deleted only by the owner > of the file, by the owner of the directory, and by a privileged > process. > > > .SH RETURN VALUE > On success, zero is returned. > On error, \-1 is returned, and > .I errno > is set appropriately. > .SH ERRORS > .TP > .B EINVAL > Invalid flag specified in > .IR flags . > .TP > .B EACCES > Search permission is denied for one of the directories > in the path prefix of > .IR pathname . > (See also > .BR path_resolution (7).) > .TP > .B EBADF > .I dirfd > is not a valid open file descriptor. > .TP > .B EFAULT > Bad address. > .TP > .B ELOOP > Too many symbolic links encountered while traversing the path. 
> .TP > .B ENAMETOOLONG > .I pathname > is too long. > .TP > .B ENOENT > A component of > .I pathname > does not exist, or > .I pathname > is an empty string. > .TP > .B ENOMEM > Out of memory (i.e., kernel memory). > .TP > .B ENOTDIR > A component of the path prefix of > .I pathname > is not a directory or > .I pathname > is relative and > .I dirfd > is a file descriptor referring to a file other than a directory. > .SH VERSIONS > .BR statx () > was added to Linux in kernel 4.11; > library support is not yet added to glibc. > .SH SEE ALSO > .BR ls (1), > .BR stat (1), > .BR access (2), > .BR chmod (2), > .BR chown (2), > .BR readlink (2), > .BR utime (2), > .BR capabilities (7), > .BR symlink (7)
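One more thought on the regular-file snippets in the "File type and mode" section: they ignore the return value and never look at stx_mask. If an EXAMPLE section gets added, something along these lines might be a safer thing for people to copy. Same assumptions as usual: definitions from <linux/stat.h>, SYS_statx available from <sys/syscall.h>, no glibc wrapper so syscall(2) is used, and is_regular_file() is a made-up name.

```c
#define _GNU_SOURCE
#include <fcntl.h>          /* AT_FDCWD */
#include <linux/stat.h>     /* struct statx, STATX_TYPE (assumed location) */
#include <sys/stat.h>       /* S_ISREG() */
#include <sys/syscall.h>    /* SYS_statx (assumes new-enough kernel headers) */
#include <unistd.h>         /* syscall() */

/*
 * Hypothetical helper, for illustration only.
 * Returns 1 for a regular file, 0 for anything else (including the
 * case where the file type was not reported), -1 if statx failed.
 */
static int is_regular_file(const char *pathname)
{
    struct statx sb;

    /* Only the type bits of stx_mode are needed for this test. */
    if (syscall(SYS_statx, AT_FDCWD, pathname, 0, STATX_TYPE, &sb) == -1)
        return -1;

    /* Don't interpret stx_mode unless the kernel says it is valid. */
    if (!(sb.stx_mask & STATX_TYPE))
        return 0;

    return S_ISREG(sb.stx_mode) ? 1 : 0;
}
```

Requesting STATX_TYPE instead of STATX_BASIC_STATS is a design choice on my part; it asks for no more than the test needs.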