From: NeilBrown <neilb@suse.de>
To: Josef Bacik <josef@toxicpanda.com>, Chris Mason <clm@fb.com>,
David Sterba <dsterba@suse.com>
Cc: linux-fsdevel@vger.kernel.org,
Linux NFS list <linux-nfs@vger.kernel.org>,
Btrfs BTRFS <linux-btrfs@vger.kernel.org>
Subject: [PATCH 3/4] VFS/btrfs: add STATX_TREE_ID
Date: Mon, 09 Aug 2021 13:55:27 +1000 [thread overview]
Message-ID: <162848132775.25823.2813836616908535300.stgit@noble.brown> (raw)
In-Reply-To: <162848123483.25823.15844774651164477866.stgit@noble.brown>
A new 64bit field is added to the data that can be returned by statx() -
the "tree id".
The tree id serves two needs.
1/ it extends the available inode number space. While a filesystem
SHOULD ensure the inode number is unique across the filesystem,
this is sometimes impractical. In such situations, 'tree id'
may be used to guarantee uniqueness. It can identify a separate
allocation domain.
A particular case when separate allocation domains is useful
is when a directory tree can be effectively "reflink"ed.
Updating all inode numbers in such a tree is prohibitively expensive.
2/ it can identify a collection of objects that provide a coherent
"tree" in some locally-defined sense.
This patch uses STATX_TREE_ID to export the subvol id for btrfs.
samples/vfs/test_statx.c is extended to report the treeid.
Also: a new superblock field is added: s_tree_id_bits. This can store
the number of significant bits in the reported treeid. It is
currently unused, but could be used by overlayfs to determine how
to add a filesystem number to the treeid to differentiate files in
different underlying filesystems.
Signed-off-by: NeilBrown <neilb@suse.de>
---
fs/btrfs/inode.c | 4 ++++
fs/btrfs/super.c | 1 +
fs/stat.c | 1 +
include/linux/fs.h | 2 +-
include/linux/stat.h | 13 +++++++++++++
include/uapi/linux/stat.h | 3 ++-
samples/vfs/test-statx.c | 4 +++-
7 files changed, 25 insertions(+), 3 deletions(-)
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 30fa64cbe6dc..c878726d090c 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -9215,6 +9215,10 @@ static int btrfs_getattr(struct user_namespace *mnt_userns,
case BTRFS_MANY_DEVS:
stat->dev = BTRFS_I(inode)->root->anon_dev;
}
+ if (request_mask & STATX_TREE_ID) {
+ stat->tree_id = BTRFS_I(inode)->root->root_key.objectid;
+ stat->result_mask |= STATX_TREE_ID;
+ }
spin_lock(&BTRFS_I(inode)->lock);
delalloc_bytes = BTRFS_I(inode)->new_delalloc_bytes;
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index b1aecb834234..e6d166150660 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -1410,6 +1410,7 @@ static int btrfs_fill_super(struct super_block *sb,
#endif
sb->s_flags |= SB_I_VERSION;
sb->s_iflags |= SB_I_CGROUPWB;
+ sb->s_tree_id_bits = 48;
err = super_setup_bdi(sb);
if (err) {
diff --git a/fs/stat.c b/fs/stat.c
index 1fa38bdec1a6..2dd5d3d67793 100644
--- a/fs/stat.c
+++ b/fs/stat.c
@@ -580,6 +580,7 @@ cp_statx(const struct kstat *stat, struct statx __user *buffer)
tmp.stx_dev_major = MAJOR(stat->dev);
tmp.stx_dev_minor = MINOR(stat->dev);
tmp.stx_mnt_id = stat->mnt_id;
+ tmp.stx_tree_id = stat->tree_id;
return copy_to_user(buffer, &tmp, sizeof(tmp)) ? -EFAULT : 0;
}
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 640574294216..a777c1b1706a 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1482,7 +1482,7 @@ struct super_block {
unsigned int s_max_links;
fmode_t s_mode;
-
+ short s_tree_id_bits;
/*
* The next field is for VFS *only*. No filesystems have any business
* even looking at it. You had been warned.
diff --git a/include/linux/stat.h b/include/linux/stat.h
index fff27e603814..08ee409786b3 100644
--- a/include/linux/stat.h
+++ b/include/linux/stat.h
@@ -46,6 +46,19 @@ struct kstat {
struct timespec64 btime; /* File creation time */
u64 blocks;
u64 mnt_id;
+ /* Treeid can be used to extend the inode number space. Two inodes
+ * with different 'tree_id' are different, even if 'ino' is the same
+ * (though fs should make ino different as often as possible).
+ * When tree_id is requested and STATX_TREE_ID is set in result_mask,
+ * 'ino' MUST be unique across the filesystem. Specifically, two
+ * open files that report the same dev, ino, and tree_id MUST be
+ * the same.
+ * If a directory and an object in that directory have the same dev
+ * and tree_id, they can be assumed to be in a meaningful tree, though
+ * the meaning is subject to local interpretation. The set of inodes
+ * with a common tree_id is not required to be contiguous.
+ */
+ u64 tree_id;
};
#endif
diff --git a/include/uapi/linux/stat.h b/include/uapi/linux/stat.h
index 1500a0f58041..725cf3f8e873 100644
--- a/include/uapi/linux/stat.h
+++ b/include/uapi/linux/stat.h
@@ -124,7 +124,7 @@ struct statx {
__u32 stx_dev_minor;
/* 0x90 */
__u64 stx_mnt_id;
- __u64 __spare2;
+ __u64 stx_tree_id;
/* 0xa0 */
__u64 __spare3[12]; /* Spare space for future expansion */
/* 0x100 */
@@ -152,6 +152,7 @@ struct statx {
#define STATX_BASIC_STATS 0x000007ffU /* The stuff in the normal stat struct */
#define STATX_BTIME 0x00000800U /* Want/got stx_btime */
#define STATX_MNT_ID 0x00001000U /* Got stx_mnt_id */
+#define STATX_TREE_ID 0x00002000U /* Want/got stx_treeid and clean stX_ino */
#define STATX__RESERVED 0x80000000U /* Reserved for future struct statx expansion */
diff --git a/samples/vfs/test-statx.c b/samples/vfs/test-statx.c
index 49c7a46cee07..c1141764fa2e 100644
--- a/samples/vfs/test-statx.c
+++ b/samples/vfs/test-statx.c
@@ -118,6 +118,8 @@ static void dump_statx(struct statx *stx)
break;
}
}
+ if (stx->stx_mask & STATX_TREE_ID)
+ printf(" Tree: %-12llu", (unsigned long long) stx->stx_tree_id);
printf("\n");
if (stx->stx_mask & STATX_MODE)
@@ -218,7 +220,7 @@ int main(int argc, char **argv)
struct statx stx;
int ret, raw = 0, atflag = AT_SYMLINK_NOFOLLOW;
- unsigned int mask = STATX_BASIC_STATS | STATX_BTIME;
+ unsigned int mask = STATX_BASIC_STATS | STATX_BTIME | STATX_TREE_ID;
for (argv++; *argv; argv++) {
if (strcmp(*argv, "-F") == 0) {
next prev parent reply other threads:[~2021-08-09 3:58 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-08-09 3:55 [PATCH/RFC 0/4] Attempt to make progress with btrfs dev number strangeness NeilBrown
2021-08-09 3:55 ` [PATCH 4/4] Add "tree" number to "inode" number in various /proc files NeilBrown
2021-08-09 3:55 ` NeilBrown [this message]
2021-08-09 3:55 ` [PATCH 1/4] btrfs: include subvol identifier in inode number if -o inumbits= NeilBrown
2021-08-09 3:55 ` [PATCH 2/4] btrfs: add numdevs= mount option NeilBrown
2021-08-09 7:50 ` kernel test robot
2021-08-10 20:51 ` [PATCH/RFC 0/4] Attempt to make progress with btrfs dev number strangeness Josef Bacik
2021-08-11 22:13 ` NeilBrown
2021-08-12 13:54 ` Josef Bacik
2021-08-12 14:06 ` Hugo Mills
2021-08-12 22:35 ` NeilBrown
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=162848132775.25823.2813836616908535300.stgit@noble.brown \
--to=neilb@suse.de \
--cc=clm@fb.com \
--cc=dsterba@suse.com \
--cc=josef@toxicpanda.com \
--cc=linux-btrfs@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-nfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).