* [PATCH 0/6] Sort out overlay layers and fs arrays
@ 2019-11-17 15:43 Amir Goldstein
2019-11-17 15:43 ` [PATCH 1/6] ovl: fix corner case of non-unique st_dev;st_ino Amir Goldstein
` (6 more replies)
0 siblings, 7 replies; 12+ messages in thread
From: Amir Goldstein @ 2019-11-17 15:43 UTC (permalink / raw)
To: Miklos Szeredi; +Cc: Colin Ian King, linux-unionfs
Miklos,
When I started generalizing the lower_layers/lower_fs arrays
I noticed a bug that was introduced in v4.17 with xino.
In the case of lower layer on upper fs, we do not have a pseudo_dev
assigned to lower layer and we expose the real lower st_dev;st_ino.
This happens on non-samefs when xino is disabled (default).
This is a very real bug, not really a corner case and I have an
an xfstest [1] for it that I will post later.
In the mean while, I also pushed a fix to unionmount-testsuite devel
branch [2] to demonstrate the issue.
With upstream kernel, this test ends up with a copied up file
from middle layer, whose on same fs as upper and its exposed
st_dev;st_ino are invalid:
./run --ov=1 --verify hard-link
...
/mnt/a/no_foo110: File unexpectedly on upper layer
Patch 1 in the series is a small fix for stable that fixes the
v4.17 regression in favor of a different, less severe regression.
The new regression can be demonstrated with:
./run --ov=1 --verify --xino hard-link
...
/mnt/a/no_foo110: inode number/layer changed on copy up
(got 39:24707, was 39:24700)
Patches 2-4 generalize the lower_{layer/fs} arrays to layer/fs arrays
and get rid of some special casing of upper layer.
Patches 5-6 use the cleanup to solve the corner case that you pointed
out with bas_uuid [3] and to fix the regression introduced by patch 1.
After patch 6, both unionmount-testsuite configurations
above pass the test st_dev;st_ino verifications.
I doubt if patches 2-6 are stable material, because not sure the
corner cases they fix are worth the trouble.
The series depends on the bad_uuid patch v5 that I posted on Thursday.
I was also considering setting xino=on by default if xino_auto
is enabled, because what have we got to loose?
The inodes whose st_ino fit in lower bits (by far more common) will
use overlay st_dev and the inodes whose st_ino overflow the lower bits
will use pseudo_dev. Seems like a win-win situation, but I wanted to
get your feedback on this before sending out a patch.
Thanks,
Amir.
[1] https://github.com/amir73il/xfstests/commit/c667f26839ae487c509b95abae670fdca1c535c8
[2] https://github.com/amir73il/unionmount-testsuite/commit/1724ef2245c5e56f73e436b37407d00ef498f9bc
[3] https://lore.kernel.org/lkml/CAJfpegufS=OGcvFbWEVumNSCPO_JXyEuJNAbmO5ubscSarVtRQ@mail.gmail.com/
Amir Goldstein (6):
ovl: fix corner case of non-unique st_dev;st_ino
ovl: generalize the lower_layers[] array
ovl: simplify ovl_same_sb() helper
ovl: generalize the lower_fs[] array
ovl: fix corner case of conflicting lower layer uuid
ovl: fix corner case of non-constant st_dev;st_ino
fs/overlayfs/export.c | 6 +-
fs/overlayfs/inode.c | 35 +++++------
fs/overlayfs/namei.c | 10 ++--
fs/overlayfs/overlayfs.h | 23 ++++++-
fs/overlayfs/ovl_entry.h | 14 +++--
fs/overlayfs/readdir.c | 11 ++--
fs/overlayfs/super.c | 125 ++++++++++++++++++++++-----------------
fs/overlayfs/util.c | 18 ++----
8 files changed, 132 insertions(+), 110 deletions(-)
--
2.17.1
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH 1/6] ovl: fix corner case of non-unique st_dev;st_ino
2019-11-17 15:43 [PATCH 0/6] Sort out overlay layers and fs arrays Amir Goldstein
@ 2019-11-17 15:43 ` Amir Goldstein
2019-11-17 15:43 ` [PATCH 2/6] ovl: generalize the lower_layers[] array Amir Goldstein
` (5 subsequent siblings)
6 siblings, 0 replies; 12+ messages in thread
From: Amir Goldstein @ 2019-11-17 15:43 UTC (permalink / raw)
To: Miklos Szeredi; +Cc: Colin Ian King, linux-unionfs
On non-samefs overlay without xino, non pure upper inodes should use a
pseudo_dev assigned to each unique lower fs and pure upper inodes use
the real upper st_dev.
It is fine for an overlay pure upper inode to use the same st_dev;st_ino
values as the real upper inode, because the content of those two
different filesystem objects is always the same.
In this case, however:
- two filesystems, A and B
- upper layer is on A
- lower layer 1 is also on A
- lower layer 2 is on B
Non pure upper overlay inode, whose origin is in layer 1 will have the
same st_dev;st_ino values as the real lower inode. This may result with
a false positive results of 'diff' between the real lower and copied up
overlay inode.
Fix this by using the upper st_dev;st_ino values in this case.
This breaks the property of constant st_dev;st_ino across copy up
of this case. This breakage will be fixed by a later patch.
Fixes: 5148626b806a ("ovl: allocate anon bdev per unique lower fs")
Cc: stable@vger.kernel.org # v4.17+
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
fs/overlayfs/inode.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index bc14781886bf..b045cf1826fc 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -200,8 +200,14 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
if (ovl_test_flag(OVL_INDEX, d_inode(dentry)) ||
(!ovl_verify_lower(dentry->d_sb) &&
(is_dir || lowerstat.nlink == 1))) {
- stat->ino = lowerstat.ino;
lower_layer = ovl_layer_lower(dentry);
+ /*
+ * Cannot use origin st_dev;st_ino because
+ * origin inode content may differ from overlay
+ * inode content.
+ */
+ if (samefs || lower_layer->fsid)
+ stat->ino = lowerstat.ino;
}
/*
--
2.17.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 2/6] ovl: generalize the lower_layers[] array
2019-11-17 15:43 [PATCH 0/6] Sort out overlay layers and fs arrays Amir Goldstein
2019-11-17 15:43 ` [PATCH 1/6] ovl: fix corner case of non-unique st_dev;st_ino Amir Goldstein
@ 2019-11-17 15:43 ` Amir Goldstein
2019-11-17 15:43 ` [PATCH 3/6] ovl: simplify ovl_same_sb() helper Amir Goldstein
` (4 subsequent siblings)
6 siblings, 0 replies; 12+ messages in thread
From: Amir Goldstein @ 2019-11-17 15:43 UTC (permalink / raw)
To: Miklos Szeredi; +Cc: Colin Ian King, linux-unionfs
Rename lower_layers[] array to layers[], extend its size by one
and initialize layers[0] with upper layer values.
Lower layers are now addressed with index 1..numlower.
layers[0] is reserved even with lower only overlay.
This gets rid of special casing upper layer in ovl_iterate_real().
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
fs/overlayfs/export.c | 6 ++----
fs/overlayfs/inode.c | 4 ++--
fs/overlayfs/namei.c | 10 +++++-----
fs/overlayfs/overlayfs.h | 2 +-
fs/overlayfs/ovl_entry.h | 2 +-
fs/overlayfs/readdir.c | 7 +++----
fs/overlayfs/super.c | 43 ++++++++++++++++++++++------------------
fs/overlayfs/util.c | 6 ++++--
8 files changed, 42 insertions(+), 38 deletions(-)
diff --git a/fs/overlayfs/export.c b/fs/overlayfs/export.c
index 70e55588aedc..9128cbb3b198 100644
--- a/fs/overlayfs/export.c
+++ b/fs/overlayfs/export.c
@@ -424,7 +424,6 @@ static struct dentry *ovl_lookup_real_inode(struct super_block *sb,
struct ovl_layer *layer)
{
struct ovl_fs *ofs = sb->s_fs_info;
- struct ovl_layer upper_layer = { .mnt = ofs->upper_mnt };
struct dentry *index = NULL;
struct dentry *this = NULL;
struct inode *inode;
@@ -466,7 +465,7 @@ static struct dentry *ovl_lookup_real_inode(struct super_block *sb,
* recursive call walks back from indexed upper to the topmost
* connected/hashed upper parent (or up to root).
*/
- this = ovl_lookup_real(sb, upper, &upper_layer);
+ this = ovl_lookup_real(sb, upper, &ofs->layers[0]);
dput(upper);
}
@@ -646,8 +645,7 @@ static struct dentry *ovl_get_dentry(struct super_block *sb,
struct dentry *index)
{
struct ovl_fs *ofs = sb->s_fs_info;
- struct ovl_layer upper_layer = { .mnt = ofs->upper_mnt };
- struct ovl_layer *layer = upper ? &upper_layer : lowerpath->layer;
+ struct ovl_layer *layer = upper ? &ofs->layers[0] : lowerpath->layer;
struct dentry *real = upper ?: (index ?: lowerpath->dentry);
/*
diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index b045cf1826fc..dab39933bbd4 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -170,7 +170,7 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
*/
if (!is_dir || samefs || ovl_xino_bits(dentry->d_sb)) {
if (!OVL_TYPE_UPPER(type)) {
- lower_layer = ovl_layer_lower(dentry);
+ lower_layer = ovl_dentry_layer(dentry);
} else if (OVL_TYPE_ORIGIN(type)) {
struct kstat lowerstat;
u32 lowermask = STATX_INO | STATX_BLOCKS |
@@ -200,7 +200,7 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
if (ovl_test_flag(OVL_INDEX, d_inode(dentry)) ||
(!ovl_verify_lower(dentry->d_sb) &&
(is_dir || lowerstat.nlink == 1))) {
- lower_layer = ovl_layer_lower(dentry);
+ lower_layer = ovl_dentry_layer(dentry);
/*
* Cannot use origin st_dev;st_ino because
* origin inode content may differ from overlay
diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
index 80fbca5219d4..f16bf608eed7 100644
--- a/fs/overlayfs/namei.c
+++ b/fs/overlayfs/namei.c
@@ -324,16 +324,16 @@ int ovl_check_origin_fh(struct ovl_fs *ofs, struct ovl_fh *fh, bool connected,
struct dentry *origin = NULL;
int i;
- for (i = 0; i < ofs->numlower; i++) {
+ for (i = 1; i <= ofs->numlower; i++) {
/*
* If lower fs uuid is not unique among lower fs we cannot match
* fh->uuid to layer.
*/
- if (ofs->lower_layers[i].fsid &&
- ofs->lower_layers[i].fs->bad_uuid)
+ if (ofs->layers[i].fsid &&
+ ofs->layers[i].fs->bad_uuid)
continue;
- origin = ovl_decode_real_fh(fh, ofs->lower_layers[i].mnt,
+ origin = ovl_decode_real_fh(fh, ofs->layers[i].mnt,
connected);
if (origin)
break;
@@ -356,7 +356,7 @@ int ovl_check_origin_fh(struct ovl_fs *ofs, struct ovl_fh *fh, bool connected,
}
**stackp = (struct ovl_path){
.dentry = origin,
- .layer = &ofs->lower_layers[i]
+ .layer = &ofs->layers[i]
};
return 0;
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index f283b1d69a9e..50d41a314308 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -237,7 +237,7 @@ enum ovl_path_type ovl_path_real(struct dentry *dentry, struct path *path);
struct dentry *ovl_dentry_upper(struct dentry *dentry);
struct dentry *ovl_dentry_lower(struct dentry *dentry);
struct dentry *ovl_dentry_lowerdata(struct dentry *dentry);
-struct ovl_layer *ovl_layer_lower(struct dentry *dentry);
+struct ovl_layer *ovl_dentry_layer(struct dentry *dentry);
struct dentry *ovl_dentry_real(struct dentry *dentry);
struct dentry *ovl_i_dentry_upper(struct inode *inode);
struct inode *ovl_inode_upper(struct inode *inode);
diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
index 28348c44ea5b..ffaf7376f4ab 100644
--- a/fs/overlayfs/ovl_entry.h
+++ b/fs/overlayfs/ovl_entry.h
@@ -48,7 +48,7 @@ struct ovl_fs {
unsigned int numlower;
/* Number of unique lower sb that differ from upper sb */
unsigned int numlowerfs;
- struct ovl_layer *lower_layers;
+ struct ovl_layer *layers;
struct ovl_sb *lower_fs;
/* workbasedir is the path at workdir= mount option */
struct dentry *workbasedir;
diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
index 47a91c9733a5..32a7f8a38091 100644
--- a/fs/overlayfs/readdir.c
+++ b/fs/overlayfs/readdir.c
@@ -508,7 +508,7 @@ static int ovl_cache_update_ino(struct path *path, struct ovl_cache_entry *p)
ino = stat.ino;
} else if (xinobits && !OVL_TYPE_UPPER(type)) {
ino = ovl_remap_lower_ino(ino, xinobits,
- ovl_layer_lower(this)->fsid,
+ ovl_dentry_layer(this)->fsid,
p->name, p->len);
}
@@ -685,15 +685,14 @@ static int ovl_iterate_real(struct file *file, struct dir_context *ctx)
int err;
struct ovl_dir_file *od = file->private_data;
struct dentry *dir = file->f_path.dentry;
- struct ovl_layer *lower_layer = ovl_layer_lower(dir);
struct ovl_readdir_translate rdt = {
.ctx.actor = ovl_fill_real,
.orig_ctx = ctx,
.xinobits = ovl_xino_bits(dir->d_sb),
};
- if (rdt.xinobits && lower_layer)
- rdt.fsid = lower_layer->fsid;
+ if (rdt.xinobits)
+ rdt.fsid = ovl_dentry_layer(dir)->fsid;
if (OVL_TYPE_MERGE(ovl_path_type(dir->d_parent))) {
struct kstat stat;
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index cfb236e37fd9..4ff6eede5f90 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -224,13 +224,13 @@ static void ovl_free_fs(struct ovl_fs *ofs)
if (ofs->upperdir_locked)
ovl_inuse_unlock(ofs->upper_mnt->mnt_root);
mntput(ofs->upper_mnt);
- for (i = 0; i < ofs->numlower; i++) {
- iput(ofs->lower_layers[i].trap);
- mntput(ofs->lower_layers[i].mnt);
+ for (i = 1; i <= ofs->numlower; i++) {
+ iput(ofs->layers[i].trap);
+ mntput(ofs->layers[i].mnt);
}
for (i = 0; i < ofs->numlowerfs; i++)
free_anon_bdev(ofs->lower_fs[i].pseudo_dev);
- kfree(ofs->lower_layers);
+ kfree(ofs->layers);
kfree(ofs->lower_fs);
kfree(ofs->config.lowerdir);
@@ -1318,16 +1318,16 @@ static int ovl_get_fsid(struct ovl_fs *ofs, const struct path *path)
return ofs->numlowerfs;
}
-static int ovl_get_lower_layers(struct super_block *sb, struct ovl_fs *ofs,
- struct path *stack, unsigned int numlower)
+static int ovl_get_layers(struct super_block *sb, struct ovl_fs *ofs,
+ struct path *stack, unsigned int numlower)
{
int err;
unsigned int i;
err = -ENOMEM;
- ofs->lower_layers = kcalloc(numlower, sizeof(struct ovl_layer),
- GFP_KERNEL);
- if (ofs->lower_layers == NULL)
+ ofs->layers = kcalloc(numlower + 1, sizeof(struct ovl_layer),
+ GFP_KERNEL);
+ if (ofs->layers == NULL)
goto out;
ofs->lower_fs = kcalloc(numlower, sizeof(struct ovl_sb),
@@ -1335,6 +1335,11 @@ static int ovl_get_lower_layers(struct super_block *sb, struct ovl_fs *ofs,
if (ofs->lower_fs == NULL)
goto out;
+ /* idx 0 is reserved for upper fs even with lower only overlay */
+ ofs->layers[0].mnt = ofs->upper_mnt;
+ ofs->layers[0].idx = 0;
+ ofs->layers[0].fsid = 0;
+
for (i = 0; i < numlower; i++) {
struct vfsmount *mnt;
struct inode *trap;
@@ -1368,15 +1373,15 @@ static int ovl_get_lower_layers(struct super_block *sb, struct ovl_fs *ofs,
*/
mnt->mnt_flags |= MNT_READONLY | MNT_NOATIME;
- ofs->lower_layers[ofs->numlower].trap = trap;
- ofs->lower_layers[ofs->numlower].mnt = mnt;
- ofs->lower_layers[ofs->numlower].idx = i + 1;
- ofs->lower_layers[ofs->numlower].fsid = fsid;
+ ofs->numlower++;
+ ofs->layers[ofs->numlower].trap = trap;
+ ofs->layers[ofs->numlower].mnt = mnt;
+ ofs->layers[ofs->numlower].idx = ofs->numlower;
+ ofs->layers[ofs->numlower].fsid = fsid;
if (fsid) {
- ofs->lower_layers[ofs->numlower].fs =
+ ofs->layers[ofs->numlower].fs =
&ofs->lower_fs[fsid - 1];
}
- ofs->numlower++;
}
/*
@@ -1463,7 +1468,7 @@ static struct ovl_entry *ovl_get_lowerstack(struct super_block *sb,
goto out_err;
}
- err = ovl_get_lower_layers(sb, ofs, stack, numlower);
+ err = ovl_get_layers(sb, ofs, stack, numlower);
if (err)
goto out_err;
@@ -1474,7 +1479,7 @@ static struct ovl_entry *ovl_get_lowerstack(struct super_block *sb,
for (i = 0; i < numlower; i++) {
oe->lowerstack[i].dentry = dget(stack[i].dentry);
- oe->lowerstack[i].layer = &ofs->lower_layers[i];
+ oe->lowerstack[i].layer = &ofs->layers[i+1];
}
if (remote)
@@ -1555,9 +1560,9 @@ static int ovl_check_overlapping_layers(struct super_block *sb,
return err;
}
- for (i = 0; i < ofs->numlower; i++) {
+ for (i = 1; i <= ofs->numlower; i++) {
err = ovl_check_layer(sb, ofs,
- ofs->lower_layers[i].mnt->mnt_root,
+ ofs->layers[i].mnt->mnt_root,
"lowerdir");
if (err)
return err;
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index f5678a3f8350..3fa1ca8ddd48 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -198,11 +198,13 @@ struct dentry *ovl_dentry_lower(struct dentry *dentry)
return oe->numlower ? oe->lowerstack[0].dentry : NULL;
}
-struct ovl_layer *ovl_layer_lower(struct dentry *dentry)
+/* Either top most lower layer or upper layer for pure upper */
+struct ovl_layer *ovl_dentry_layer(struct dentry *dentry)
{
+ struct ovl_fs *ofs = dentry->d_sb->s_fs_info;
struct ovl_entry *oe = dentry->d_fsdata;
- return oe->numlower ? oe->lowerstack[0].layer : NULL;
+ return oe->numlower ? oe->lowerstack[0].layer : &ofs->layers[0];
}
/*
--
2.17.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 3/6] ovl: simplify ovl_same_sb() helper
2019-11-17 15:43 [PATCH 0/6] Sort out overlay layers and fs arrays Amir Goldstein
2019-11-17 15:43 ` [PATCH 1/6] ovl: fix corner case of non-unique st_dev;st_ino Amir Goldstein
2019-11-17 15:43 ` [PATCH 2/6] ovl: generalize the lower_layers[] array Amir Goldstein
@ 2019-11-17 15:43 ` Amir Goldstein
2019-11-17 15:43 ` [PATCH 4/6] ovl: generalize the lower_fs[] array Amir Goldstein
` (3 subsequent siblings)
6 siblings, 0 replies; 12+ messages in thread
From: Amir Goldstein @ 2019-11-17 15:43 UTC (permalink / raw)
To: Miklos Szeredi; +Cc: Colin Ian King, linux-unionfs
No code uses the sb returned from this helper, so make it retrun
a boolean and rename it to ovl_same_fs().
The xino mode is irrelevant when all layers are on same fs, so
instead of describing samefs with mode OVL_XINO_OFF, use a new mode
OVL_XINO_SAME_FS, which is different than the case of non-samefs
with xino=off.
Create a new helper ovl_same_dev(), to use instead of the common check
for (ovl_same_fs() || xinobits).
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
fs/overlayfs/inode.c | 7 +++----
fs/overlayfs/overlayfs.h | 21 ++++++++++++++++++++-
fs/overlayfs/ovl_entry.h | 5 +++++
fs/overlayfs/readdir.c | 4 ++--
fs/overlayfs/super.c | 13 +++++--------
fs/overlayfs/util.c | 12 ------------
6 files changed, 35 insertions(+), 27 deletions(-)
diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index dab39933bbd4..717337f2b3fb 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -78,7 +78,7 @@ int ovl_setattr(struct dentry *dentry, struct iattr *attr)
static int ovl_map_dev_ino(struct dentry *dentry, struct kstat *stat,
struct ovl_layer *lower_layer)
{
- bool samefs = ovl_same_sb(dentry->d_sb);
+ bool samefs = ovl_same_fs(dentry->d_sb);
unsigned int xinobits = ovl_xino_bits(dentry->d_sb);
if (samefs) {
@@ -146,7 +146,6 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
struct path realpath;
const struct cred *old_cred;
bool is_dir = S_ISDIR(dentry->d_inode->i_mode);
- bool samefs = ovl_same_sb(dentry->d_sb);
struct ovl_layer *lower_layer = NULL;
int err;
bool metacopy_blocks = false;
@@ -168,7 +167,7 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
* If lower filesystem supports NFS file handles, this also guaranties
* persistent st_ino across mount cycle.
*/
- if (!is_dir || samefs || ovl_xino_bits(dentry->d_sb)) {
+ if (!is_dir || ovl_same_dev(dentry->d_sb)) {
if (!OVL_TYPE_UPPER(type)) {
lower_layer = ovl_dentry_layer(dentry);
} else if (OVL_TYPE_ORIGIN(type)) {
@@ -565,7 +564,7 @@ static void ovl_fill_inode(struct inode *inode, umode_t mode, dev_t rdev,
* ovl_new_inode(), ino arg is 0, so i_ino will be updated to real
* upper inode i_ino on ovl_inode_init() or ovl_inode_update().
*/
- if (ovl_same_sb(inode->i_sb) || xinobits) {
+ if (ovl_same_dev(inode->i_sb)) {
inode->i_ino = ino;
if (xinobits && fsid && !(ino >> (64 - xinobits)))
inode->i_ino |= (unsigned long)fsid << (64 - xinobits);
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 50d41a314308..f7d01c06cdaf 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -45,6 +45,14 @@ enum ovl_entry_flag {
OVL_E_CONNECTED,
};
+enum {
+ OVL_XINO_OFF,
+ OVL_XINO_AUTO,
+ OVL_XINO_ON,
+ /* With samefs, xino is irrelevant */
+ OVL_XINO_SAME_FS,
+};
+
/*
* The tuple (fh,uuid) is a universal unique identifier for a copy up origin,
* where:
@@ -221,7 +229,6 @@ int ovl_want_write(struct dentry *dentry);
void ovl_drop_write(struct dentry *dentry);
struct dentry *ovl_workdir(struct dentry *dentry);
const struct cred *ovl_override_creds(struct super_block *sb);
-struct super_block *ovl_same_sb(struct super_block *sb);
int ovl_can_decode_fh(struct super_block *sb);
struct dentry *ovl_indexdir(struct super_block *sb);
bool ovl_index_all(struct super_block *sb);
@@ -306,6 +313,18 @@ static inline unsigned int ovl_xino_bits(struct super_block *sb)
return ofs->xino_bits;
}
+/* All layers on same fs? */
+static inline bool ovl_same_fs(struct super_block *sb)
+{
+ return OVL_FS(sb)->config.xino == OVL_XINO_SAME_FS;
+}
+
+/* All overlay inodes have same st_dev? */
+static inline bool ovl_same_dev(struct super_block *sb)
+{
+ return OVL_FS(sb)->config.xino != OVL_XINO_OFF;
+}
+
static inline int ovl_inode_lock(struct inode *inode)
{
return mutex_lock_interruptible(&OVL_I(inode)->lock);
diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
index ffaf7376f4ab..ef05817d8d89 100644
--- a/fs/overlayfs/ovl_entry.h
+++ b/fs/overlayfs/ovl_entry.h
@@ -75,6 +75,11 @@ struct ovl_fs {
unsigned int xino_bits;
};
+static inline struct ovl_fs *OVL_FS(struct super_block *sb)
+{
+ return (struct ovl_fs *)sb->s_fs_info;
+}
+
/* private information held for every overlayfs dentry */
struct ovl_entry {
union {
diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
index 32a7f8a38091..56f13e9ccbe6 100644
--- a/fs/overlayfs/readdir.c
+++ b/fs/overlayfs/readdir.c
@@ -469,7 +469,7 @@ static int ovl_cache_update_ino(struct path *path, struct ovl_cache_entry *p)
int xinobits = ovl_xino_bits(dir->d_sb);
int err = 0;
- if (!ovl_same_sb(dir->d_sb) && !xinobits)
+ if (!ovl_same_dev(dir->d_sb))
goto out;
if (p->name[0] == '.') {
@@ -737,7 +737,7 @@ static int ovl_iterate(struct file *file, struct dir_context *ctx)
* entries.
*/
if (ovl_xino_bits(dentry->d_sb) ||
- (ovl_same_sb(dentry->d_sb) &&
+ (ovl_same_fs(dentry->d_sb) &&
(ovl_is_impure_dir(file) ||
OVL_TYPE_MERGE(ovl_path_type(dentry->d_parent))))) {
return ovl_iterate_real(file, ctx);
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index 4ff6eede5f90..ae22c3363c33 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -316,12 +316,6 @@ static const char *ovl_redirect_mode_def(void)
return ovl_redirect_dir_def ? "on" : "off";
}
-enum {
- OVL_XINO_OFF,
- OVL_XINO_AUTO,
- OVL_XINO_ON,
-};
-
static const char * const ovl_xino_str[] = {
"off",
"auto",
@@ -358,7 +352,8 @@ static int ovl_show_options(struct seq_file *m, struct dentry *dentry)
if (ofs->config.nfs_export != ovl_nfs_export_def)
seq_printf(m, ",nfs_export=%s", ofs->config.nfs_export ?
"on" : "off");
- if (ofs->config.xino != ovl_xino_def())
+ if (ofs->config.xino != ovl_xino_def() &&
+ ofs->config.xino != OVL_XINO_SAME_FS)
seq_printf(m, ",xino=%s", ovl_xino_str[ofs->config.xino]);
if (ofs->config.metacopy != ovl_metacopy_def)
seq_printf(m, ",metacopy=%s",
@@ -1393,8 +1388,10 @@ static int ovl_get_layers(struct super_block *sb, struct ovl_fs *ofs,
* inode number.
*/
if (!ofs->numlowerfs || (ofs->numlowerfs == 1 && !ofs->upper_mnt)) {
+ if (ofs->config.xino == OVL_XINO_ON)
+ pr_info("overlayfs: \"xino=on\" is useless with all layers on same fs, ignore.\n");
ofs->xino_bits = 0;
- ofs->config.xino = OVL_XINO_OFF;
+ ofs->config.xino = OVL_XINO_SAME_FS;
} else if (ofs->config.xino == OVL_XINO_ON && !ofs->xino_bits) {
/*
* This is a roundup of number of bits needed for numlowerfs+1
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index 3fa1ca8ddd48..256f166b4a17 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -40,18 +40,6 @@ const struct cred *ovl_override_creds(struct super_block *sb)
return override_creds(ofs->creator_cred);
}
-struct super_block *ovl_same_sb(struct super_block *sb)
-{
- struct ovl_fs *ofs = sb->s_fs_info;
-
- if (!ofs->numlowerfs)
- return ofs->upper_mnt->mnt_sb;
- else if (ofs->numlowerfs == 1 && !ofs->upper_mnt)
- return ofs->lower_fs[0].sb;
- else
- return NULL;
-}
-
/*
* Check if underlying fs supports file handles and try to determine encoding
* type, in order to deduce maximum inode number used by fs.
--
2.17.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 4/6] ovl: generalize the lower_fs[] array
2019-11-17 15:43 [PATCH 0/6] Sort out overlay layers and fs arrays Amir Goldstein
` (2 preceding siblings ...)
2019-11-17 15:43 ` [PATCH 3/6] ovl: simplify ovl_same_sb() helper Amir Goldstein
@ 2019-11-17 15:43 ` Amir Goldstein
2019-11-18 17:01 ` Amir Goldstein
2019-11-17 15:43 ` [PATCH 5/6] ovl: fix corner case of conflicting lower layer uuid Amir Goldstein
` (2 subsequent siblings)
6 siblings, 1 reply; 12+ messages in thread
From: Amir Goldstein @ 2019-11-17 15:43 UTC (permalink / raw)
To: Miklos Szeredi; +Cc: Colin Ian King, linux-unionfs
Rename lower_fs[] array to fs[], extend its size by one and initialize
fs[0] with upper fs values. fsid 0 is already reserved even with lower
only overlay, so fs[0] remains uninitialized in this case as well.
Rename numlowerfs to maxfsid and use index 0..maxfsid to access the
fs[] array.
This gets rid of special casing upper layer in ovl_map_dev_ino().
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
fs/overlayfs/inode.c | 31 +++++++++----------
fs/overlayfs/ovl_entry.h | 5 ++-
fs/overlayfs/super.c | 66 +++++++++++++++++++++-------------------
3 files changed, 51 insertions(+), 51 deletions(-)
diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index 717337f2b3fb..9e894f5e19b4 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -75,8 +75,7 @@ int ovl_setattr(struct dentry *dentry, struct iattr *attr)
return err;
}
-static int ovl_map_dev_ino(struct dentry *dentry, struct kstat *stat,
- struct ovl_layer *lower_layer)
+static int ovl_map_dev_ino(struct dentry *dentry, struct kstat *stat, int fsid)
{
bool samefs = ovl_same_fs(dentry->d_sb);
unsigned int xinobits = ovl_xino_bits(dentry->d_sb);
@@ -103,9 +102,7 @@ static int ovl_map_dev_ino(struct dentry *dentry, struct kstat *stat,
pr_warn_ratelimited("overlayfs: inode number too big (%pd2, ino=%llu, xinobits=%d)\n",
dentry, stat->ino, xinobits);
} else {
- if (lower_layer)
- stat->ino |= ((u64)lower_layer->fsid) << shift;
-
+ stat->ino |= ((u64)fsid) << shift;
stat->dev = dentry->d_sb->s_dev;
return 0;
}
@@ -124,15 +121,15 @@ static int ovl_map_dev_ino(struct dentry *dentry, struct kstat *stat,
*/
stat->dev = dentry->d_sb->s_dev;
stat->ino = dentry->d_inode->i_ino;
- } else if (lower_layer && lower_layer->fsid) {
+ } else {
/*
* For non-samefs setup, if we cannot map all layers st_ino
* to a unified address space, we need to make sure that st_dev
- * is unique per lower fs. Upper layer uses real st_dev and
- * lower layers use the unique anonymous bdev assigned to the
- * lower fs.
+ * is unique per lower fs. Layers that are on the same fs as
+ * upper layer use real upper st_dev and other lower layers use
+ * the unique anonymous bdev assigned to the lower fs.
*/
- stat->dev = lower_layer->fs->pseudo_dev;
+ stat->dev = OVL_FS(dentry->d_sb)->fs[fsid].pseudo_dev;
}
return 0;
@@ -146,7 +143,7 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
struct path realpath;
const struct cred *old_cred;
bool is_dir = S_ISDIR(dentry->d_inode->i_mode);
- struct ovl_layer *lower_layer = NULL;
+ int fsid;
int err;
bool metacopy_blocks = false;
@@ -168,9 +165,8 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
* persistent st_ino across mount cycle.
*/
if (!is_dir || ovl_same_dev(dentry->d_sb)) {
- if (!OVL_TYPE_UPPER(type)) {
- lower_layer = ovl_dentry_layer(dentry);
- } else if (OVL_TYPE_ORIGIN(type)) {
+ fsid = ovl_dentry_layer(dentry)->fsid;
+ if (OVL_TYPE_ORIGIN(type)) {
struct kstat lowerstat;
u32 lowermask = STATX_INO | STATX_BLOCKS |
(!is_dir ? STATX_NLINK : 0);
@@ -199,14 +195,15 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
if (ovl_test_flag(OVL_INDEX, d_inode(dentry)) ||
(!ovl_verify_lower(dentry->d_sb) &&
(is_dir || lowerstat.nlink == 1))) {
- lower_layer = ovl_dentry_layer(dentry);
/*
* Cannot use origin st_dev;st_ino because
* origin inode content may differ from overlay
* inode content.
*/
- if (samefs || lower_layer->fsid)
+ if (samefs || fsid)
stat->ino = lowerstat.ino;
+ } else {
+ fsid = 0;
}
/*
@@ -240,7 +237,7 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
}
}
- err = ovl_map_dev_ino(dentry, stat, lower_layer);
+ err = ovl_map_dev_ino(dentry, stat, fsid);
if (err)
goto out;
diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
index ef05817d8d89..93e9fd5fba9d 100644
--- a/fs/overlayfs/ovl_entry.h
+++ b/fs/overlayfs/ovl_entry.h
@@ -46,10 +46,9 @@ struct ovl_path {
struct ovl_fs {
struct vfsmount *upper_mnt;
unsigned int numlower;
- /* Number of unique lower sb that differ from upper sb */
- unsigned int numlowerfs;
+ unsigned int maxfsid;
struct ovl_layer *layers;
- struct ovl_sb *lower_fs;
+ struct ovl_sb *fs;
/* workbasedir is the path at workdir= mount option */
struct dentry *workbasedir;
/* workdir is the 'work' directory under workbasedir */
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index ae22c3363c33..09e53780d9bc 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -228,10 +228,13 @@ static void ovl_free_fs(struct ovl_fs *ofs)
iput(ofs->layers[i].trap);
mntput(ofs->layers[i].mnt);
}
- for (i = 0; i < ofs->numlowerfs; i++)
- free_anon_bdev(ofs->lower_fs[i].pseudo_dev);
kfree(ofs->layers);
- kfree(ofs->lower_fs);
+ if (ofs->fs) {
+ /* fs[0].pseudo_dev is either null or real upper st_dev */
+ for (i = 1; i <= ofs->maxfsid; i++)
+ free_anon_bdev(ofs->fs[i].pseudo_dev);
+ kfree(ofs->fs);
+ }
kfree(ofs->config.lowerdir);
kfree(ofs->config.upperdir);
@@ -1253,7 +1256,7 @@ static bool ovl_lower_uuid_ok(struct ovl_fs *ofs, const uuid_t *uuid)
if (!ofs->config.nfs_export && !ofs->upper_mnt)
return true;
- for (i = 0; i < ofs->numlowerfs; i++) {
+ for (i = 1; i <= ofs->maxfsid; i++) {
/*
* We use uuid to associate an overlay lower file handle with a
* lower layer, so we can accept lower fs with null uuid as long
@@ -1261,8 +1264,8 @@ static bool ovl_lower_uuid_ok(struct ovl_fs *ofs, const uuid_t *uuid)
* if we detect multiple lower fs with the same uuid, we
* disable lower file handle decoding on all of them.
*/
- if (uuid_equal(&ofs->lower_fs[i].sb->s_uuid, uuid)) {
- ofs->lower_fs[i].bad_uuid = true;
+ if (uuid_equal(&ofs->fs[i].sb->s_uuid, uuid)) {
+ ofs->fs[i].bad_uuid = true;
return false;
}
}
@@ -1278,13 +1281,9 @@ static int ovl_get_fsid(struct ovl_fs *ofs, const struct path *path)
int err;
bool bad_uuid = false;
- /* fsid 0 is reserved for upper fs even with non upper overlay */
- if (ofs->upper_mnt && ofs->upper_mnt->mnt_sb == sb)
- return 0;
-
- for (i = 0; i < ofs->numlowerfs; i++) {
- if (ofs->lower_fs[i].sb == sb)
- return i + 1;
+ for (i = 0; i <= ofs->maxfsid; i++) {
+ if (ofs->fs[i].sb == sb)
+ return i;
}
if (!ovl_lower_uuid_ok(ofs, &sb->s_uuid)) {
@@ -1305,12 +1304,12 @@ static int ovl_get_fsid(struct ovl_fs *ofs, const struct path *path)
return err;
}
- ofs->lower_fs[ofs->numlowerfs].sb = sb;
- ofs->lower_fs[ofs->numlowerfs].pseudo_dev = dev;
- ofs->lower_fs[ofs->numlowerfs].bad_uuid = bad_uuid;
- ofs->numlowerfs++;
+ ofs->maxfsid++;
+ ofs->fs[ofs->maxfsid].sb = sb;
+ ofs->fs[ofs->maxfsid].pseudo_dev = dev;
+ ofs->fs[ofs->maxfsid].bad_uuid = bad_uuid;
- return ofs->numlowerfs;
+ return ofs->maxfsid;
}
static int ovl_get_layers(struct super_block *sb, struct ovl_fs *ofs,
@@ -1325,16 +1324,24 @@ static int ovl_get_layers(struct super_block *sb, struct ovl_fs *ofs,
if (ofs->layers == NULL)
goto out;
- ofs->lower_fs = kcalloc(numlower, sizeof(struct ovl_sb),
- GFP_KERNEL);
- if (ofs->lower_fs == NULL)
+ ofs->fs = kcalloc(numlower + 1, sizeof(struct ovl_sb), GFP_KERNEL);
+ if (ofs->fs == NULL)
goto out;
- /* idx 0 is reserved for upper fs even with lower only overlay */
+ /* idx/fsid 0 are reserved for upper fs even with lower only overlay */
ofs->layers[0].mnt = ofs->upper_mnt;
ofs->layers[0].idx = 0;
ofs->layers[0].fsid = 0;
+ /*
+ * All lower layers that share the same fs as upper layer, use the real
+ * upper st_dev.
+ */
+ if (ofs->upper_mnt) {
+ ofs->fs[0].sb = ofs->upper_mnt->mnt_sb;
+ ofs->fs[0].pseudo_dev = ofs->upper_mnt->mnt_sb->s_dev;
+ }
+
for (i = 0; i < numlower; i++) {
struct vfsmount *mnt;
struct inode *trap;
@@ -1373,10 +1380,7 @@ static int ovl_get_layers(struct super_block *sb, struct ovl_fs *ofs,
ofs->layers[ofs->numlower].mnt = mnt;
ofs->layers[ofs->numlower].idx = ofs->numlower;
ofs->layers[ofs->numlower].fsid = fsid;
- if (fsid) {
- ofs->layers[ofs->numlower].fs =
- &ofs->lower_fs[fsid - 1];
- }
+ ofs->layers[ofs->numlower].fs = &ofs->fs[fsid];
}
/*
@@ -1387,19 +1391,19 @@ static int ovl_get_layers(struct super_block *sb, struct ovl_fs *ofs,
* bits reserved for fsid, it emits a warning and uses the original
* inode number.
*/
- if (!ofs->numlowerfs || (ofs->numlowerfs == 1 && !ofs->upper_mnt)) {
+ if (ofs->maxfsid == 0 || (ofs->maxfsid == 1 && !ofs->upper_mnt)) {
if (ofs->config.xino == OVL_XINO_ON)
pr_info("overlayfs: \"xino=on\" is useless with all layers on same fs, ignore.\n");
ofs->xino_bits = 0;
ofs->config.xino = OVL_XINO_SAME_FS;
} else if (ofs->config.xino == OVL_XINO_ON && !ofs->xino_bits) {
/*
- * This is a roundup of number of bits needed for numlowerfs+1
- * (i.e. ilog2(numlowerfs+1 - 1) + 1). fsid 0 is reserved for
- * upper fs even with non upper overlay.
+ * This is a roundup of number of bits needed for encoding
+ * fsid, where fsid 0 is reserved for upper fs even with
+ * lower only overlay.
*/
BUILD_BUG_ON(ilog2(OVL_MAX_STACK) > 31);
- ofs->xino_bits = ilog2(ofs->numlowerfs) + 1;
+ ofs->xino_bits = ilog2(ofs->maxfsid) + 1;
}
if (ofs->xino_bits) {
--
2.17.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 5/6] ovl: fix corner case of conflicting lower layer uuid
2019-11-17 15:43 [PATCH 0/6] Sort out overlay layers and fs arrays Amir Goldstein
` (3 preceding siblings ...)
2019-11-17 15:43 ` [PATCH 4/6] ovl: generalize the lower_fs[] array Amir Goldstein
@ 2019-11-17 15:43 ` Amir Goldstein
2019-11-17 15:43 ` [PATCH 6/6] ovl: fix corner case of non-constant st_dev;st_ino Amir Goldstein
2019-11-18 6:03 ` [PATCH 0/6] Sort out overlay layers and fs arrays Amir Goldstein
6 siblings, 0 replies; 12+ messages in thread
From: Amir Goldstein @ 2019-11-17 15:43 UTC (permalink / raw)
To: Miklos Szeredi; +Cc: Colin Ian King, linux-unionfs
This fixes ovl_lower_uuid_ok() to correctly detect the corner case:
- two filesystems, A and B, both have null uuid
- upper layer is on A
- lower layer 1 is also on A
- lower layer 2 is on B
In this case, bad_uuid would not have been set for B, because the check
only involved the list of lower fs. Hence we'll try to decode a layer 2
origin on layer 1 and fail.
We check for conflicting (and null) uuid among all lower layers,
including those layers that are on the same fs as the upper layer.
Reported-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
fs/overlayfs/ovl_entry.h | 2 ++
fs/overlayfs/super.c | 8 ++++++--
2 files changed, 8 insertions(+), 2 deletions(-)
diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
index 93e9fd5fba9d..6c04b6a4e6bf 100644
--- a/fs/overlayfs/ovl_entry.h
+++ b/fs/overlayfs/ovl_entry.h
@@ -24,6 +24,8 @@ struct ovl_sb {
dev_t pseudo_dev;
/* Unusable (conflicting) uuid */
bool bad_uuid;
+ /* Used as a lower layer (but maybe also as upper) */
+ bool is_lower;
};
struct ovl_layer {
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index 09e53780d9bc..e80f79bb8a4e 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -1256,7 +1256,7 @@ static bool ovl_lower_uuid_ok(struct ovl_fs *ofs, const uuid_t *uuid)
if (!ofs->config.nfs_export && !ofs->upper_mnt)
return true;
- for (i = 1; i <= ofs->maxfsid; i++) {
+ for (i = 0; i <= ofs->maxfsid; i++) {
/*
* We use uuid to associate an overlay lower file handle with a
* lower layer, so we can accept lower fs with null uuid as long
@@ -1264,7 +1264,8 @@ static bool ovl_lower_uuid_ok(struct ovl_fs *ofs, const uuid_t *uuid)
* if we detect multiple lower fs with the same uuid, we
* disable lower file handle decoding on all of them.
*/
- if (uuid_equal(&ofs->fs[i].sb->s_uuid, uuid)) {
+ if (ofs->fs[i].is_lower &&
+ uuid_equal(&ofs->fs[i].sb->s_uuid, uuid)) {
ofs->fs[i].bad_uuid = true;
return false;
}
@@ -1336,10 +1337,12 @@ static int ovl_get_layers(struct super_block *sb, struct ovl_fs *ofs,
/*
* All lower layers that share the same fs as upper layer, use the real
* upper st_dev.
+ * is_lower will be set if upper fs is shared with a lower layer.
*/
if (ofs->upper_mnt) {
ofs->fs[0].sb = ofs->upper_mnt->mnt_sb;
ofs->fs[0].pseudo_dev = ofs->upper_mnt->mnt_sb->s_dev;
+ ofs->fs[0].is_lower = false;
}
for (i = 0; i < numlower; i++) {
@@ -1381,6 +1384,7 @@ static int ovl_get_layers(struct super_block *sb, struct ovl_fs *ofs,
ofs->layers[ofs->numlower].idx = ofs->numlower;
ofs->layers[ofs->numlower].fsid = fsid;
ofs->layers[ofs->numlower].fs = &ofs->fs[fsid];
+ ofs->fs[fsid].is_lower = true;
}
/*
--
2.17.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 6/6] ovl: fix corner case of non-constant st_dev;st_ino
2019-11-17 15:43 [PATCH 0/6] Sort out overlay layers and fs arrays Amir Goldstein
` (4 preceding siblings ...)
2019-11-17 15:43 ` [PATCH 5/6] ovl: fix corner case of conflicting lower layer uuid Amir Goldstein
@ 2019-11-17 15:43 ` Amir Goldstein
2019-11-18 6:03 ` [PATCH 0/6] Sort out overlay layers and fs arrays Amir Goldstein
6 siblings, 0 replies; 12+ messages in thread
From: Amir Goldstein @ 2019-11-17 15:43 UTC (permalink / raw)
To: Miklos Szeredi; +Cc: Colin Ian King, linux-unionfs
On non-samefs overlay without xino, non pure upper inodes should use a
pseudo_dev assigned to each unique lower fs, but if lower layer is on
the same fs and upper layer, it has no pseudo_dev assigned.
In this overlay layers setup:
- two filesystems, A and B
- upper layer is on A
- lower layer 1 is also on A
- lower layer 2 is on B
Non pure upper overlay inode, whose origin is in layer 1 will have the
st_dev;st_ino values of the real lower inode before copy up and the
st_dev;st_ino values of the real upper inode after copy up.
Fix this inconsitency by assigning a unique pseudo_dev also for upper fs,
that will be used as st_dev value along with the lower inode st_dev for
overlay inodes in the case above.
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
fs/overlayfs/inode.c | 13 +++----------
fs/overlayfs/super.c | 15 ++++++++++-----
2 files changed, 13 insertions(+), 15 deletions(-)
diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index 9e894f5e19b4..00c4e9c17492 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -125,9 +125,8 @@ static int ovl_map_dev_ino(struct dentry *dentry, struct kstat *stat, int fsid)
/*
* For non-samefs setup, if we cannot map all layers st_ino
* to a unified address space, we need to make sure that st_dev
- * is unique per lower fs. Layers that are on the same fs as
- * upper layer use real upper st_dev and other lower layers use
- * the unique anonymous bdev assigned to the lower fs.
+ * is unique per underlying fs, so we use the unique anonymous
+ * bdev assigned to the underlying fs.
*/
stat->dev = OVL_FS(dentry->d_sb)->fs[fsid].pseudo_dev;
}
@@ -195,13 +194,7 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
if (ovl_test_flag(OVL_INDEX, d_inode(dentry)) ||
(!ovl_verify_lower(dentry->d_sb) &&
(is_dir || lowerstat.nlink == 1))) {
- /*
- * Cannot use origin st_dev;st_ino because
- * origin inode content may differ from overlay
- * inode content.
- */
- if (samefs || fsid)
- stat->ino = lowerstat.ino;
+ stat->ino = lowerstat.ino;
} else {
fsid = 0;
}
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index e80f79bb8a4e..9270e059eb9b 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -230,8 +230,7 @@ static void ovl_free_fs(struct ovl_fs *ofs)
}
kfree(ofs->layers);
if (ofs->fs) {
- /* fs[0].pseudo_dev is either null or real upper st_dev */
- for (i = 1; i <= ofs->maxfsid; i++)
+ for (i = 0; i <= ofs->maxfsid; i++)
free_anon_bdev(ofs->fs[i].pseudo_dev);
kfree(ofs->fs);
}
@@ -1335,13 +1334,19 @@ static int ovl_get_layers(struct super_block *sb, struct ovl_fs *ofs,
ofs->layers[0].fsid = 0;
/*
- * All lower layers that share the same fs as upper layer, use the real
- * upper st_dev.
+ * All lower layers that share the same fs as upper layer, use the same
+ * pseudo_dev as upper layer. Allocate fs[0].pseudo_dev even for lower
+ * only overlay to simplify ovl_fs_free().
* is_lower will be set if upper fs is shared with a lower layer.
*/
+ err = get_anon_bdev(&ofs->fs[0].pseudo_dev);
+ if (err) {
+ pr_err("overlayfs: failed to get anonymous bdev for upper fs\n");
+ goto out;
+ }
+
if (ofs->upper_mnt) {
ofs->fs[0].sb = ofs->upper_mnt->mnt_sb;
- ofs->fs[0].pseudo_dev = ofs->upper_mnt->mnt_sb->s_dev;
ofs->fs[0].is_lower = false;
}
--
2.17.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH 0/6] Sort out overlay layers and fs arrays
2019-11-17 15:43 [PATCH 0/6] Sort out overlay layers and fs arrays Amir Goldstein
` (5 preceding siblings ...)
2019-11-17 15:43 ` [PATCH 6/6] ovl: fix corner case of non-constant st_dev;st_ino Amir Goldstein
@ 2019-11-18 6:03 ` Amir Goldstein
2019-11-18 7:57 ` Amir Goldstein
6 siblings, 1 reply; 12+ messages in thread
From: Amir Goldstein @ 2019-11-18 6:03 UTC (permalink / raw)
To: Miklos Szeredi; +Cc: Colin Ian King, overlayfs
On Sun, Nov 17, 2019 at 5:43 PM Amir Goldstein <amir73il@gmail.com> wrote:
>
> Miklos,
>
> When I started generalizing the lower_layers/lower_fs arrays
> I noticed a bug that was introduced in v4.17 with xino.
>
> In the case of lower layer on upper fs, we do not have a pseudo_dev
> assigned to lower layer and we expose the real lower st_dev;st_ino.
> This happens on non-samefs when xino is disabled (default).
> This is a very real bug, not really a corner case and I have an
> an xfstest [1] for it that I will post later.
>
> In the mean while, I also pushed a fix to unionmount-testsuite devel
> branch [2] to demonstrate the issue.
>
> With upstream kernel, this test ends up with a copied up file
> from middle layer, whose on same fs as upper and its exposed
> st_dev;st_ino are invalid:
>
> ./run --ov=1 --verify hard-link
> ...
> /mnt/a/no_foo110: File unexpectedly on upper layer
>
> Patch 1 in the series is a small fix for stable that fixes the
> v4.17 regression in favor of a different, less severe regression.
> The new regression can be demonstrated with:
>
> ./run --ov=1 --verify --xino hard-link
> ...
> /mnt/a/no_foo110: inode number/layer changed on copy up
> (got 39:24707, was 39:24700)
>
> Patches 2-4 generalize the lower_{layer/fs} arrays to layer/fs arrays
> and get rid of some special casing of upper layer.
>
> Patches 5-6 use the cleanup to solve the corner case that you pointed
> out with bas_uuid [3] and to fix the regression introduced by patch 1.
>
> After patch 6, both unionmount-testsuite configurations
> above pass the test st_dev;st_ino verifications.
>
> I doubt if patches 2-6 are stable material, because not sure the
> corner cases they fix are worth the trouble.
>
> The series depends on the bad_uuid patch v5 that I posted on Thursday.
>
> I was also considering setting xino=on by default if xino_auto
> is enabled, because what have we got to loose?
>
> The inodes whose st_ino fit in lower bits (by far more common) will
> use overlay st_dev and the inodes whose st_ino overflow the lower bits
> will use pseudo_dev. Seems like a win-win situation, but I wanted to
> get your feedback on this before sending out a patch.
>
Arrr.. yes, there is a catch.
Overflowing lower bits has a price beyond just using pseudo_dev.
It introduces the possibility of inode number conflicts on directories,
because directories never use pseudo_dev.
Thanks,
Amir.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/6] Sort out overlay layers and fs arrays
2019-11-18 6:03 ` [PATCH 0/6] Sort out overlay layers and fs arrays Amir Goldstein
@ 2019-11-18 7:57 ` Amir Goldstein
2019-11-22 9:31 ` Amir Goldstein
0 siblings, 1 reply; 12+ messages in thread
From: Amir Goldstein @ 2019-11-18 7:57 UTC (permalink / raw)
To: Miklos Szeredi; +Cc: Colin Ian King, overlayfs
On Mon, Nov 18, 2019 at 8:03 AM Amir Goldstein <amir73il@gmail.com> wrote:
>
> On Sun, Nov 17, 2019 at 5:43 PM Amir Goldstein <amir73il@gmail.com> wrote:
> >
> > Miklos,
> >
> > When I started generalizing the lower_layers/lower_fs arrays
> > I noticed a bug that was introduced in v4.17 with xino.
> >
> > In the case of lower layer on upper fs, we do not have a pseudo_dev
> > assigned to lower layer and we expose the real lower st_dev;st_ino.
> > This happens on non-samefs when xino is disabled (default).
> > This is a very real bug, not really a corner case and I have an
> > an xfstest [1] for it that I will post later.
> >
> > In the mean while, I also pushed a fix to unionmount-testsuite devel
> > branch [2] to demonstrate the issue.
> >
> > With upstream kernel, this test ends up with a copied up file
> > from middle layer, whose on same fs as upper and its exposed
> > st_dev;st_ino are invalid:
> >
> > ./run --ov=1 --verify hard-link
> > ...
> > /mnt/a/no_foo110: File unexpectedly on upper layer
> >
> > Patch 1 in the series is a small fix for stable that fixes the
> > v4.17 regression in favor of a different, less severe regression.
> > The new regression can be demonstrated with:
> >
> > ./run --ov=1 --verify --xino hard-link
> > ...
> > /mnt/a/no_foo110: inode number/layer changed on copy up
> > (got 39:24707, was 39:24700)
> >
> > Patches 2-4 generalize the lower_{layer/fs} arrays to layer/fs arrays
> > and get rid of some special casing of upper layer.
> >
> > Patches 5-6 use the cleanup to solve the corner case that you pointed
> > out with bas_uuid [3] and to fix the regression introduced by patch 1.
> >
> > After patch 6, both unionmount-testsuite configurations
> > above pass the test st_dev;st_ino verifications.
> >
> > I doubt if patches 2-6 are stable material, because not sure the
> > corner cases they fix are worth the trouble.
> >
> > The series depends on the bad_uuid patch v5 that I posted on Thursday.
> >
> > I was also considering setting xino=on by default if xino_auto
> > is enabled, because what have we got to loose?
> >
> > The inodes whose st_ino fit in lower bits (by far more common) will
> > use overlay st_dev and the inodes whose st_ino overflow the lower bits
> > will use pseudo_dev. Seems like a win-win situation, but I wanted to
> > get your feedback on this before sending out a patch.
> >
>
> Arrr.. yes, there is a catch.
> Overflowing lower bits has a price beyond just using pseudo_dev.
> It introduces the possibility of inode number conflicts on directories,
> because directories never use pseudo_dev.
>
But we could mitigate that problem if we reserve an fsid for volatile
directory inode numbers. get_next_ino is 32bit anyway.
I am going to take a swing at having xino=auto always enabling xino.
Thanks,
Amir.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 4/6] ovl: generalize the lower_fs[] array
2019-11-17 15:43 ` [PATCH 4/6] ovl: generalize the lower_fs[] array Amir Goldstein
@ 2019-11-18 17:01 ` Amir Goldstein
0 siblings, 0 replies; 12+ messages in thread
From: Amir Goldstein @ 2019-11-18 17:01 UTC (permalink / raw)
To: Miklos Szeredi; +Cc: Colin Ian King, overlayfs
On Sun, Nov 17, 2019 at 5:44 PM Amir Goldstein <amir73il@gmail.com> wrote:
>
> Rename lower_fs[] array to fs[], extend its size by one and initialize
> fs[0] with upper fs values. fsid 0 is already reserved even with lower
> only overlay, so fs[0] remains uninitialized in this case as well.
>
> Rename numlowerfs to maxfsid and use index 0..maxfsid to access the
> fs[] array.
>
FYI, I don't know what I was thinking with the conversion to maxfsid.
I don't like that and intend to change it back to numfs.
Thanks,
Amir.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/6] Sort out overlay layers and fs arrays
2019-11-18 7:57 ` Amir Goldstein
@ 2019-11-22 9:31 ` Amir Goldstein
2019-11-25 14:45 ` Amir Goldstein
0 siblings, 1 reply; 12+ messages in thread
From: Amir Goldstein @ 2019-11-22 9:31 UTC (permalink / raw)
To: Miklos Szeredi; +Cc: Colin Ian King, overlayfs
> > > I was also considering setting xino=on by default if xino_auto
> > > is enabled, because what have we got to loose?
> > >
> > > The inodes whose st_ino fit in lower bits (by far more common) will
> > > use overlay st_dev and the inodes whose st_ino overflow the lower bits
> > > will use pseudo_dev. Seems like a win-win situation, but I wanted to
> > > get your feedback on this before sending out a patch.
> > >
> >
> > Arrr.. yes, there is a catch.
> > Overflowing lower bits has a price beyond just using pseudo_dev.
> > It introduces the possibility of inode number conflicts on directories,
> > because directories never use pseudo_dev.
> >
>
> But we could mitigate that problem if we reserve an fsid for volatile
> directory inode numbers. get_next_ino is 32bit anyway.
> I am going to take a swing at having xino=auto always enabling xino.
>
FWIW, pushed WIP branch to:
https://github.com/amir73il/linux/commits/ovl-ino
It is based on an updated ovl-layers branch of the $SUBJECT series.
During cleanup, I've found several corner cases bugs of setting
i_ino value and fixed them.
None of those bugs are critical.
AFAIK, the only user that complained on inconsistent i_ino is
an xfstest that is checking ino in /proc/locks.
However, I do think that the cleanup I made simplifies the code
which was a bit spaghetti in that area and with some more TLC
we can get to enabling xino from xino=auto even for filesystems
that have seldom high ino bits.
That could be a real benefit, because it is unlikely that users
will have enough knowledge or certainty about underlying fs
to declare xino=on.
I need to clear up some time to test i_ino changes and
the collision avoidance code, but for now, at least the ovl-ino branch
passes the existing regression tests.
Thanks,
Amir.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/6] Sort out overlay layers and fs arrays
2019-11-22 9:31 ` Amir Goldstein
@ 2019-11-25 14:45 ` Amir Goldstein
0 siblings, 0 replies; 12+ messages in thread
From: Amir Goldstein @ 2019-11-25 14:45 UTC (permalink / raw)
To: Miklos Szeredi; +Cc: Colin Ian King, overlayfs
On Fri, Nov 22, 2019 at 11:31 AM Amir Goldstein <amir73il@gmail.com> wrote:
>
> > > > I was also considering setting xino=on by default if xino_auto
> > > > is enabled, because what have we got to loose?
> > > >
> > > > The inodes whose st_ino fit in lower bits (by far more common) will
> > > > use overlay st_dev and the inodes whose st_ino overflow the lower bits
> > > > will use pseudo_dev. Seems like a win-win situation, but I wanted to
> > > > get your feedback on this before sending out a patch.
> > > >
> > >
> > > Arrr.. yes, there is a catch.
> > > Overflowing lower bits has a price beyond just using pseudo_dev.
> > > It introduces the possibility of inode number conflicts on directories,
> > > because directories never use pseudo_dev.
> > >
> >
> > But we could mitigate that problem if we reserve an fsid for volatile
> > directory inode numbers. get_next_ino is 32bit anyway.
> > I am going to take a swing at having xino=auto always enabling xino.
> >
>
> FWIW, pushed WIP branch to:
> https://github.com/amir73il/linux/commits/ovl-ino
>
> It is based on an updated ovl-layers branch of the $SUBJECT series.
>
> During cleanup, I've found several corner cases bugs of setting
> i_ino value and fixed them.
> None of those bugs are critical.
> AFAIK, the only user that complained on inconsistent i_ino is
> an xfstest that is checking ino in /proc/locks.
>
> However, I do think that the cleanup I made simplifies the code
> which was a bit spaghetti in that area and with some more TLC
> we can get to enabling xino from xino=auto even for filesystems
> that have seldom high ino bits.
>
> That could be a real benefit, because it is unlikely that users
> will have enough knowledge or certainty about underlying fs
> to declare xino=on.
>
> I need to clear up some time to test i_ino changes and
> the collision avoidance code, but for now, at least the ovl-ino branch
> passes the existing regression tests.
>
Documenting all this in words has become too complex for me,
so I resorted to visual aids [1].
The table doesn't mention this explicitly, but xino=auto can now
(code in ovl-ino) provide all desired features for any non-samefs setup.
Tests are still WIP. Will post the work when they are done.
I am using a nested overlay [2] as a way to test xino overflow behavior.
Thanks,
Amir.
[1] https://github.com/amir73il/linux/blob/ovl-ino/Documentation/filesystems/overlayfs.rst#inode-properties
[2] https://github.com/amir73il/unionmount-testsuite/commits/ovl-nested
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2019-11-25 14:45 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-11-17 15:43 [PATCH 0/6] Sort out overlay layers and fs arrays Amir Goldstein
2019-11-17 15:43 ` [PATCH 1/6] ovl: fix corner case of non-unique st_dev;st_ino Amir Goldstein
2019-11-17 15:43 ` [PATCH 2/6] ovl: generalize the lower_layers[] array Amir Goldstein
2019-11-17 15:43 ` [PATCH 3/6] ovl: simplify ovl_same_sb() helper Amir Goldstein
2019-11-17 15:43 ` [PATCH 4/6] ovl: generalize the lower_fs[] array Amir Goldstein
2019-11-18 17:01 ` Amir Goldstein
2019-11-17 15:43 ` [PATCH 5/6] ovl: fix corner case of conflicting lower layer uuid Amir Goldstein
2019-11-17 15:43 ` [PATCH 6/6] ovl: fix corner case of non-constant st_dev;st_ino Amir Goldstein
2019-11-18 6:03 ` [PATCH 0/6] Sort out overlay layers and fs arrays Amir Goldstein
2019-11-18 7:57 ` Amir Goldstein
2019-11-22 9:31 ` Amir Goldstein
2019-11-25 14:45 ` Amir Goldstein
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).