linux-kernel.vger.kernel.org archive mirror
* Duplicate inode number when mount --bind some directories to same mountpoint. (from Fedora18 to 4.10-rc3)
@ 2017-01-12  9:16 Nakajima Akira
  2017-01-12 10:24 ` Al Viro
  0 siblings, 1 reply; 5+ messages in thread
From: Nakajima Akira @ 2017-01-12  9:16 UTC
  To: linux-kernel

Bug:
Duplicate inode numbers appear when several directories are bind-mounted
(mount --bind) on the same mountpoint. This affects Fedora 18 through
kernel 4.10-rc3.
Fedora 17 and earlier work correctly.


Also, kernels from 3.6 onward (Fedora 18, up to and including 4.10-rc3)
create many more struct mount instances than 3.3 (Fedora 17) does.
Is this the intended behavior?
The kernel appears to create many identical struct mount instances.


================================================================
SystemTap script showing that the kernel creates many struct mount instances.

[root@fedora17 home]# cat /root/mnt.stp
#! /usr/bin/stap

probe kernel.function("alloc_vfsmnt").return {
	printf("%s() new_mnt:%p\n", probefunc(), $return);
}

probe kernel.function("clone_mnt").return {	// do_mount, copy_tree
	name = @cast($return, "mount")->mnt_mountpoint->d_iname;
	inode = @cast($return, "mount")->mnt_mountpoint->d_inode;
	ino = @cast($return, "mount")->mnt_mountpoint->d_inode->i_ino;
	printf("%s() mnt:%p d_iname:%s inode:%p ino:%u\n", probefunc(), $return, kernel_string(name), inode, ino);
}

================================================================
SystemTap script output on Fedora 17
The kernel creates one struct mount for each bind mount.

[root@fedora17 home]# mkdir a b
[root@fedora17 home]# ls -i
655540 a  655542 b

[root@fedora17 home]# /root/mnt.stp &
[root@fedora17 home]# mount --bind a /mnt
[root@fedora17 home]# alloc_vfsmnt() new_mnt:0xffff880136bdaf00
clone_mnt() mnt:0xffff880136bdaf00 d_iname:a inode:0xffff88013081cb00 ino:655540

[root@fedora17 home]# mount --bind b /mnt
[root@fedora17 home]# alloc_vfsmnt() new_mnt:0xffff8801355b4f00
clone_mnt() mnt:0xffff8801355b4f00 d_iname:b inode:0xffff88013081c790 ino:655542

[root@fedora17 home]# ls -i
655540 a  655542 b

================================================================
SystemTap script output on Fedora 25
The kernel creates many struct mount instances,
and the inode number shown for "a" changes to 547586, the inode number of "b".


[root@fedora25 home]# mkdir a b
[root@fedora25 home]# ls -i
547584 a  547586 b

[root@fedora25 home]# /root/mnt.stp &
[root@fedora25 home]# mount --bind a /mnt
[root@fedora25 home]# clone_mnt() new_mnt:0xffff99e4b7cdc900
do_mount() mnt:0xffff99e4b7cdc900 d_iname:a inode:0xffff99e4b9dcc948 ino:547584
clone_mnt() new_mnt:0xffff99e4b7cdcc00
copy_tree() mnt:0xffff99e4b7cdcc00 d_iname:a inode:0xffff99e4b9dcc948 ino:547584
clone_mnt() new_mnt:0xffff99e4b7cdc000
copy_tree() mnt:0xffff99e4b7cdc000 d_iname:a inode:0xffff99e4b9dcc948 ino:547584
clone_mnt() new_mnt:0xffff99e4b7cdc480
copy_tree() mnt:0xffff99e4b7cdc480 d_iname:a inode:0xffff99e4b9dcc948 ino:547584
clone_mnt() new_mnt:0xffff99e4b7cdc180
copy_tree() mnt:0xffff99e4b7cdc180 d_iname:a inode:0xffff99e4b9dcc948 ino:547584

[root@fedora25 home]# mount --bind b /mnt
clone_mnt() new_mnt:0xffff99e4b7cb1480
do_mount() mnt:0xffff99e4b7cb1480 d_iname:b inode:0xffff99e4b9dceac8 ino:547586
clone_mnt() new_mnt:0xffff99e4b7cb1180
copy_tree() mnt:0xffff99e4b7cb1180 d_iname:b inode:0xffff99e4b9dceac8 ino:547586
clone_mnt() new_mnt:0xffff99e4b7cb1000
copy_tree() mnt:0xffff99e4b7cb1000 d_iname:b inode:0xffff99e4b9dceac8 ino:547586
clone_mnt() new_mnt:0xffff99e436df9d80
copy_tree() mnt:0xffff99e436df9d80 d_iname:b inode:0xffff99e4b9dceac8 ino:547586
clone_mnt() new_mnt:0xffff99e436df9600
copy_tree() mnt:0xffff99e436df9600 d_iname:b inode:0xffff99e4b9dceac8 ino:547586
clone_mnt() new_mnt:0xffff99e436df9780
copy_tree() mnt:0xffff99e436df9780 d_iname:b inode:0xffff99e4b9dceac8 ino:547586
clone_mnt() new_mnt:0xffff99e436df9a80
copy_tree() mnt:0xffff99e436df9a80 d_iname:b inode:0xffff99e4b9dceac8 ino:547586
clone_mnt() new_mnt:0xffff99e436df9900
copy_tree() mnt:0xffff99e436df9900 d_iname:b inode:0xffff99e4b9dceac8 ino:547586
clone_mnt() new_mnt:0xffff99e436df9c00
copy_tree() mnt:0xffff99e436df9c00 d_iname:b inode:0xffff99e4b9dceac8 ino:547586
clone_mnt() new_mnt:0xffff99e436df9180
copy_tree() mnt:0xffff99e436df9180 d_iname:b inode:0xffff99e4b9dceac8 ino:547586
clone_mnt() new_mnt:0xffff99e436df9480
copy_tree() mnt:0xffff99e436df9480 d_iname:b inode:0xffff99e4b9dceac8 ino:547586

[root@fedora25 home]# ls -i
547586 a  547586 b
   ******** Duplicate inode number ********

[root@fedora25 home]# echo ok > /mnt/zzz
[root@fedora25 home]# ls /home/*
/home/a:
zzz

/home/b:
zzz

   ******** /home/a/zzz does not actually exist, but it is visible ********


-----
Akira Nakajima


* Re: Duplicate inode number when mount --bind some directories to same mountpoint. (from Fedora18 to 4.10-rc3)
  2017-01-12  9:16 Duplicate inode number when mount --bind some directories to same mountpoint. (from Fedora18 to 4.10-rc3) Nakajima Akira
@ 2017-01-12 10:24 ` Al Viro
  2017-01-13  1:40   ` Nakajima Akira
  0 siblings, 1 reply; 5+ messages in thread
From: Al Viro @ 2017-01-12 10:24 UTC
  To: Nakajima Akira; +Cc: linux-kernel

On Thu, Jan 12, 2017 at 06:16:35PM +0900, Nakajima Akira wrote:
> Bug:
> Duplicate inode number when mount --bind some directories to same
> mountpoint. (from Fedora18 to 4.10-rc3)
> Fedora17 and earlier works correctly.

Explain, please.  "Duplicate inode number" between what and what?

> And,
> Above kernel ver 3.6 (Fedora18 including 4.10-rc3) creates many structs of
> mount than ver 3.3 (Fedora17).
> Is this a correct specification?
> Looks kernel creates same many structs of mount.

alloc_vfsmnt() and clone_mnt() are internal functions, no promises of
stability had ever been given...  As for the differences between these
setups... almost certainly an effect of changed shared-subtree settings.
Userland, not kernel.

> [root@fedora17 home]# mkdir a b
> [root@fedora17 home]# ls -i
> 655540 a  655542 b
> 
> [root@fedora17 home]# /root/mnt.stp &
> [root@fedora17 home]# mount --bind a /mnt
> [root@fedora17 home]# alloc_vfsmnt() new_mnt:0xffff880136bdaf00
> clone_mnt() mnt:0xffff880136bdaf00 d_iname:a inode:0xffff88013081cb00
> ino:655540
> 
> [root@fedora17 home]# mount --bind b /mnt
> [root@fedora17 home]# alloc_vfsmnt() new_mnt:0xffff8801355b4f00
> clone_mnt() mnt:0xffff8801355b4f00 d_iname:b inode:0xffff88013081c790
> ino:655542
> 
> [root@fedora17 home]# ls -i
> 655540 a  655542 b

> Systemtap script result on Fedora25
> Kernel create many structs of mount.
> And, inode number of "a" changes to 547586 of "b".
> 
> 
> [root@fedora25 home]# mkdir a b
> [root@fedora25 home]# ls -i
> 547584 a  547586 b
> 
> [root@fedora25 home]# /root/mnt.stp &
> [root@fedora25 home]# mount --bind a /mnt
> [root@fedora25 home]# clone_mnt() new_mnt:0xffff99e4b7cdc900
> do_mount() mnt:0xffff99e4b7cdc900 d_iname:a inode:0xffff99e4b9dcc948
> ino:547584
> clone_mnt() new_mnt:0xffff99e4b7cdcc00
> copy_tree() mnt:0xffff99e4b7cdcc00 d_iname:a inode:0xffff99e4b9dcc948
> ino:547584
> clone_mnt() new_mnt:0xffff99e4b7cdc000
> copy_tree() mnt:0xffff99e4b7cdc000 d_iname:a inode:0xffff99e4b9dcc948
> ino:547584
> clone_mnt() new_mnt:0xffff99e4b7cdc480
> copy_tree() mnt:0xffff99e4b7cdc480 d_iname:a inode:0xffff99e4b9dcc948
> ino:547584
> clone_mnt() new_mnt:0xffff99e4b7cdc180
> copy_tree() mnt:0xffff99e4b7cdc180 d_iname:a inode:0xffff99e4b9dcc948
> ino:547584
> 
> [root@fedora25 home]# mount --bind b /mnt
> clone_mnt() new_mnt:0xffff99e4b7cb1480
> do_mount() mnt:0xffff99e4b7cb1480 d_iname:b inode:0xffff99e4b9dceac8
> ino:547586
> clone_mnt() new_mnt:0xffff99e4b7cb1180
> copy_tree() mnt:0xffff99e4b7cb1180 d_iname:b inode:0xffff99e4b9dceac8
> ino:547586
> clone_mnt() new_mnt:0xffff99e4b7cb1000
> copy_tree() mnt:0xffff99e4b7cb1000 d_iname:b inode:0xffff99e4b9dceac8
> ino:547586
> clone_mnt() new_mnt:0xffff99e436df9d80
> copy_tree() mnt:0xffff99e436df9d80 d_iname:b inode:0xffff99e4b9dceac8
> ino:547586
> clone_mnt() new_mnt:0xffff99e436df9600
> copy_tree() mnt:0xffff99e436df9600 d_iname:b inode:0xffff99e4b9dceac8
> ino:547586
> clone_mnt() new_mnt:0xffff99e436df9780
> copy_tree() mnt:0xffff99e436df9780 d_iname:b inode:0xffff99e4b9dceac8
> ino:547586
> clone_mnt() new_mnt:0xffff99e436df9a80
> copy_tree() mnt:0xffff99e436df9a80 d_iname:b inode:0xffff99e4b9dceac8
> ino:547586
> clone_mnt() new_mnt:0xffff99e436df9900
> copy_tree() mnt:0xffff99e436df9900 d_iname:b inode:0xffff99e4b9dceac8
> ino:547586
> clone_mnt() new_mnt:0xffff99e436df9c00
> copy_tree() mnt:0xffff99e436df9c00 d_iname:b inode:0xffff99e4b9dceac8
> ino:547586
> clone_mnt() new_mnt:0xffff99e436df9180
> copy_tree() mnt:0xffff99e436df9180 d_iname:b inode:0xffff99e4b9dceac8
> ino:547586
> clone_mnt() new_mnt:0xffff99e436df9480
> copy_tree() mnt:0xffff99e436df9480 d_iname:b inode:0xffff99e4b9dceac8
> ino:547586
> 
> [root@fedora25 home]# ls -i
> 547586 a  547586 b

What I would like to see is the contents of /proc/self/mountinfo -
systemtap be damned, there is a sane interface for getting the
mount tree setup.  BTW, what's in that /root/mnt.stp thing?


* Re: Duplicate inode number when mount --bind some directories to same mountpoint. (from Fedora18 to 4.10-rc3)
  2017-01-12 10:24 ` Al Viro
@ 2017-01-13  1:40   ` Nakajima Akira
  2017-01-13  3:26     ` Al Viro
  0 siblings, 1 reply; 5+ messages in thread
From: Nakajima Akira @ 2017-01-13  1:40 UTC
  To: Al Viro; +Cc: linux-kernel

On 2017/01/12 19:24, Al Viro wrote:
> On Thu, Jan 12, 2017 at 06:16:35PM +0900, Nakajima Akira wrote:
>> Bug:
>> Duplicate inode number when mount --bind some directories to same
>> mountpoint. (from Fedora18 to 4.10-rc3)
>> Fedora17 and earlier works correctly.
>
> Explain, please.  "Duplicate inode number" between what and what?

Duplicate inode number between mounted directories.

Example)
# cd /home
# mkdir a b
# ls -i
100 a  999 b
# mount --bind a /mnt
# mount --bind b /mnt
# ls -i
999 a  999 b
	
The inode number of directory "a" changes to that of "b",
and listing "a" then shows the contents of "b".
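
The same symptom can be checked without reading ls -i output by eye. The
following is only a minimal sketch in Python: same_directory() is a
hypothetical helper name, not something from this thread, and it simply
compares the (device, inode) pair that stat reports for each path.

```python
import os


def same_directory(path_a, path_b):
    """Return True when both paths resolve to the same (device, inode) pair.

    stat follows mounts, so if something is mounted on top of path_a the
    result describes the root of that mount, not the covered directory.
    """
    st_a = os.stat(path_a)
    st_b = os.stat(path_b)
    return (st_a.st_dev, st_a.st_ino) == (st_b.st_dev, st_b.st_ino)


if __name__ == "__main__":
    # /home/a and /home/b are the directories from the example above;
    # the paths are assumptions and must exist on the machine running this.
    print(same_directory("/home/a", "/home/b"))
```

On a system showing the behaviour described above, the call would be
expected to report True after the second bind mount and False before it.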


>> And,
>> Above kernel ver 3.6 (Fedora18 including 4.10-rc3) creates many structs of
>> mount than ver 3.3 (Fedora17).
>> Is this a correct specification?
>> Looks kernel creates same many structs of mount.

> alloc_vfsmnt() and clone_mnt() are internal functions, no promises of
> stability had ever been given...  As for the differences between these
> setups... almost certainly an effect of changed shared-subtree settings.
> Userland, not kernel.


>> Systemtap script result on Fedora25
>> Kernel create many structs of mount.
>> And, inode number of "a" changes to 547586 of "b".

> What I would like to see is the contents of /proc/self/mountinfo -
> systemtap be damned, there is a sane interface for getting the
> mount tree setup.  BTW, what's in that /root/mnt.stp thing?

The contents of /root/mnt.stp follow.

According to the script output, the kernel creates many identical
struct mount instances, which looks like a waste of memory.
I don't know whether this is the intended behavior.

================================================================
# cat /root/mnt.stp
#! /usr/bin/stap

probe kernel.function("alloc_vfsmnt").return {
     printf("%s() new_mnt:%p\n", probefunc(), $return);
}

probe kernel.function("clone_mnt").return {    // do_mount, copy_tree
     name = @cast($return, "mount")->mnt_mountpoint->d_iname;
     inode = @cast($return, "mount")->mnt_mountpoint->d_inode;
     ino = @cast($return, "mount")->mnt_mountpoint->d_inode->i_ino;
     printf("%s() mnt:%p d_iname:%s inode:%p ino:%u\n", probefunc(), $return, kernel_string(name), inode, ino);
}

================================================================
The contents of /proc/self/mountinfo follow.

17 61 0:17 / /sys rw,nosuid,nodev,noexec,relatime shared:6 - sysfs sysfs rw
18 61 0:4 / /proc rw,nosuid,nodev,noexec,relatime shared:5 - proc proc rw
19 61 0:6 / /dev rw,nosuid shared:2 - devtmpfs devtmpfs rw,size=2013132k,nr_inodes=503283,mode=755
20 17 0:18 / /sys/kernel/security rw,nosuid,nodev,noexec,relatime shared:7 - securityfs securityfs rw
21 19 0:19 / /dev/shm rw,nosuid,nodev shared:3 - tmpfs tmpfs rw
22 19 0:20 / /dev/pts rw,nosuid,noexec,relatime shared:4 - devpts devpts rw,gid=5,mode=620,ptmxmode=000
23 61 0:21 / /run rw,nosuid,nodev shared:22 - tmpfs tmpfs rw,mode=755
24 17 0:22 / /sys/fs/cgroup ro,nosuid,nodev,noexec shared:8 - tmpfs tmpfs ro,mode=755
25 24 0:23 / /sys/fs/cgroup/systemd rw,nosuid,nodev,noexec,relatime shared:9 - cgroup cgroup rw,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd
26 17 0:24 / /sys/fs/pstore rw,nosuid,nodev,noexec,relatime shared:20 - pstore pstore rw
27 24 0:25 / /sys/fs/cgroup/cpu,cpuacct rw,nosuid,nodev,noexec,relatime shared:10 - cgroup cgroup rw,cpu,cpuacct
28 24 0:26 / /sys/fs/cgroup/blkio rw,nosuid,nodev,noexec,relatime shared:11 - cgroup cgroup rw,blkio
29 24 0:27 / /sys/fs/cgroup/perf_event rw,nosuid,nodev,noexec,relatime shared:12 - cgroup cgroup rw,perf_event
30 24 0:28 / /sys/fs/cgroup/pids rw,nosuid,nodev,noexec,relatime shared:13 - cgroup cgroup rw,pids
31 24 0:29 / /sys/fs/cgroup/freezer rw,nosuid,nodev,noexec,relatime shared:14 - cgroup cgroup rw,freezer
32 24 0:30 / /sys/fs/cgroup/net_cls,net_prio rw,nosuid,nodev,noexec,relatime shared:15 - cgroup cgroup rw,net_cls,net_prio
33 24 0:31 / /sys/fs/cgroup/cpuset rw,nosuid,nodev,noexec,relatime shared:16 - cgroup cgroup rw,cpuset
34 24 0:32 / /sys/fs/cgroup/devices rw,nosuid,nodev,noexec,relatime shared:17 - cgroup cgroup rw,devices
35 24 0:33 / /sys/fs/cgroup/hugetlb rw,nosuid,nodev,noexec,relatime shared:18 - cgroup cgroup rw,hugetlb
36 24 0:34 / /sys/fs/cgroup/memory rw,nosuid,nodev,noexec,relatime shared:19 - cgroup cgroup rw,memory
58 17 0:35 / /sys/kernel/config rw,relatime shared:21 - configfs configfs rw
61 0 252:1 / / rw,relatime shared:1 - ext4 /dev/vda1 rw,data=ordered
16 19 0:15 / /dev/mqueue rw,relatime shared:23 - mqueue mqueue rw
37 19 0:16 / /dev/hugepages rw,relatime shared:24 - hugetlbfs hugetlbfs rw
38 61 0:37 / /tmp rw,nosuid,nodev shared:25 - tmpfs tmpfs rw
39 18 0:38 / /proc/sys/fs/binfmt_misc rw,relatime shared:26 - autofs systemd-1 rw,fd=38,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=12648
40 17 0:7 / /sys/kernel/debug rw,relatime shared:27 - debugfs debugfs rw
72 61 0:39 / /var/lib/nfs/rpc_pipefs rw,relatime shared:28 - rpc_pipefs sunrpc rw
74 18 0:40 / /proc/fs/nfsd rw,relatime shared:29 - nfsd nfsd rw
111 23 0:41 / /run/user/0 rw,nosuid,nodev,relatime shared:64 - tmpfs tmpfs rw,size=404708k,mode=700
116 61 252:1 /home/a /mnt rw,relatime shared:1 - ext4 /dev/vda1 rw,data=ordered
120 116 252:1 /home/b /mnt rw,relatime shared:1 - ext4 /dev/vda1 rw,data=ordered
121 61 252:1 /home/b /home/a rw,relatime shared:1 - ext4 /dev/vda1 rw,data=ordered
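
The interesting entries in a listing like this are the ones whose root
field is not "/", i.e. mounts that expose a subdirectory of a filesystem.
As a rough sketch (Python; the names MountEntry, parse_mountinfo_line and
subtree_mounts are made up for illustration, and octal escapes such as
\040 in paths are not decoded), they can be picked out like this:

```python
from typing import List, NamedTuple, Tuple


class MountEntry(NamedTuple):
    mount_id: int
    parent_id: int
    root: str         # directory inside the filesystem that is exposed
    mount_point: str  # where it is attached in the namespace


def parse_mountinfo_line(line: str) -> MountEntry:
    """Parse the leading fields of one /proc/self/mountinfo line."""
    # Everything before " - " is: mount ID, parent ID, major:minor, root,
    # mount point, mount options, then zero or more optional fields.
    left, _, _ = line.partition(" - ")
    fields = left.split()
    return MountEntry(int(fields[0]), int(fields[1]), fields[3], fields[4])


def subtree_mounts(text: str) -> List[Tuple[str, str]]:
    """Return (root, mount_point) for mounts whose root is not "/"."""
    entries = [parse_mountinfo_line(l) for l in text.splitlines() if l.strip()]
    return [(e.root, e.mount_point) for e in entries if e.root != "/"]


SAMPLE = """\
61 0 252:1 / / rw,relatime shared:1 - ext4 /dev/vda1 rw,data=ordered
116 61 252:1 /home/a /mnt rw,relatime shared:1 - ext4 /dev/vda1 rw,data=ordered
120 116 252:1 /home/b /mnt rw,relatime shared:1 - ext4 /dev/vda1 rw,data=ordered
121 61 252:1 /home/b /home/a rw,relatime shared:1 - ext4 /dev/vda1 rw,data=ordered
"""

if __name__ == "__main__":
    for root, where in subtree_mounts(SAMPLE):
        print(root, "on", where)
```

Fed the four ext4 lines above, this lists /home/a on /mnt, /home/b on
/mnt and /home/b on /home/a; the last of these is the mount that was
never requested explicitly.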


* Re: Duplicate inode number when mount --bind some directories to same mountpoint. (from Fedora18 to 4.10-rc3)
  2017-01-13  1:40   ` Nakajima Akira
@ 2017-01-13  3:26     ` Al Viro
  2017-01-16  1:33       ` Nakajima Akira
  0 siblings, 1 reply; 5+ messages in thread
From: Al Viro @ 2017-01-13  3:26 UTC
  To: Nakajima Akira; +Cc: linux-kernel

On Fri, Jan 13, 2017 at 10:40:08AM +0900, Nakajima Akira wrote:
> On 2017/01/12 19:24, Al Viro wrote:
> > On Thu, Jan 12, 2017 at 06:16:35PM +0900, Nakajima Akira wrote:
> > > Bug:
> > > Duplicate inode number when mount --bind some directories to same
> > > mountpoint. (from Fedora18 to 4.10-rc3)
> > > Fedora17 and earlier works correctly.
> > 
> > Explain, please.  "Duplicate inode number" between what and what?
> 
> Duplicate inode number between mounted directories.
> 
> Example)
> # cd /home
> # mkdir a b
> # ls -i
> 100 a  999 b
> # mount --bind a /mnt
> # mount --bind b /mnt
> # ls -i
> 999 a  999 b
> 	
> Inode number of directory "a" is changed to "b".
> Then we see directory "b" when ls "a".

61 0 252:1 / / rw,relatime shared:1 - ext4 /dev/vda1 rw,data=ordered

Root, marked shared (peer group 1).  /home is not a mountpoint, /mnt
wasn't one until your mounts (i.e. both are within the same mount as /).

Since /home/a is a subtree of a shared mount, any clone of it will, by
default, join the same peer group.  Which means that binding it on /mnt
results in

116 61 252:1 /home/a /mnt rw,relatime shared:1 - ext4 /dev/vda1 rw,data=ordered

i.e. ext4[vda1]home/a being mounted on /mnt and marked peer of root mount.
Accordingly, any mount/umount event in either will be duplicated to all
peers (provided that they contain a counterpart of affected mountpoint).
In particular, binding /home/b on /mnt (i.e. on top of ext4[vda1]home/a)
propagates to the corresponding points in all peers - including the root
mount, where it corresponds to /home/a.  Result:

120 116 252:1 /home/b /mnt rw,relatime shared:1 - ext4 /dev/vda1 rw,data=ordered
121 61 252:1 /home/b /home/a rw,relatime shared:1 - ext4 /dev/vda1 rw,data=ordered

The same tree (ext4[vda1]home/b) is mounted on root in mount 116
(i.e. the thing found on /mnt) and on /home/a in mount 61 (i.e. /home/a).

Since /home/b is on a shared mount, both clones are put in the same peer
group (i.e. the same group 1).
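
The propagation described above can be sketched as a toy model. This is
not kernel code and deliberately ignores slave mounts, stacked mounts and
multiple filesystems; the Mount class and the bind_targets() function are
hypothetical names used only for this illustration (Python). It assumes
every member of a peer group sits on the same filesystem.

```python
import posixpath
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Mount:
    mount_id: int
    root: str                  # directory of the filesystem this mount exposes
    mount_point: str           # where the mount is attached in the namespace
    peer_group: Optional[int]  # N from "shared:N", None for a private mount


def covers(prefix: str, path: str) -> bool:
    """True if path equals prefix or lies beneath it."""
    return prefix == "/" or path == prefix or path.startswith(prefix + "/")


def mount_at(mounts: List[Mount], path: str) -> Mount:
    """Pick the mount with the deepest mount point covering path."""
    candidates = [m for m in mounts if covers(m.mount_point, path)]
    return max(candidates, key=lambda m: len(m.mount_point))


def bind_targets(mounts: List[Mount], target: str) -> List[Tuple[int, str]]:
    """List (parent mount ID, location) pairs a new mount on target lands on."""
    dest = mount_at(mounts, target)
    # Translate the namespace path into a path inside the filesystem.
    rel = posixpath.relpath(target, dest.mount_point)
    inside = posixpath.normpath(posixpath.join(dest.root, rel))
    targets = [(dest.mount_id, target)]
    if dest.peer_group is None:
        return targets  # private mount: nothing propagates
    for peer in mounts:
        if peer is dest or peer.peer_group != dest.peer_group:
            continue
        if not covers(peer.root, inside):
            continue  # this peer has no counterpart of the mountpoint
        rel_in_peer = posixpath.relpath(inside, peer.root)
        where = posixpath.normpath(posixpath.join(peer.mount_point, rel_in_peer))
        targets.append((peer.mount_id, where))
    return targets


if __name__ == "__main__":
    # State after "mount --bind a /mnt": root mount 61 and its peer 116.
    before = [Mount(61, "/", "/", 1), Mount(116, "/home/a", "/mnt", 1)]
    print(bind_targets(before, "/mnt"))  # [(116, '/mnt'), (61, '/home/a')]
```

The two pairs correspond to mounts 120 and 121 in the mountinfo lines
quoted above: one on top of mount 116 at /mnt, one on mount 61 at /home/a.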

You asked for it, you've got it...  Well, fedora folks did, actually.
I'm none too fond of their default setup (root made shared), but that has
nothing to do with the kernel.  Userland (systemd, as far as I can tell)
is setting the things up that way, and it's even documented in fedora
release notes...  Kernel mechanisms involved in that had been there for
a long time and they are also documented (man 2 mount, look for MS_SHARED
and related flags in there).

Take it up with fedora folks...


* Re: Duplicate inode number when mount --bind some directories to same mountpoint. (from Fedora18 to 4.10-rc3)
  2017-01-13  3:26     ` Al Viro
@ 2017-01-16  1:33       ` Nakajima Akira
  0 siblings, 0 replies; 5+ messages in thread
From: Nakajima Akira @ 2017-01-16  1:33 UTC
  To: Al Viro; +Cc: linux-kernel

> 120 116 252:1 /home/b /mnt rw,relatime shared:1 - ext4 /dev/vda1 rw,data=ordered
> 121 61 252:1 /home/b /home/a rw,relatime shared:1 - ext4 /dev/vda1 rw,data=ordered

Thank you for the detailed explanation.
I now understand that this is the intended kernel behavior,
and that Fedora 18 and later (and RHEL 7.0) use shared mounts by default.


I should have done it as follows.

# mount --make-private --bind a /mnt
# mount --make-private --bind b /mnt
# ls -i
100 a  999 b

# tail /proc/self/mountinfo
200 59 8:3 /home/a /mnt rw,relatime - ext4 /dev/sda3 rw,stripe=64,data=ordered
205 200 8:3 /home/b /mnt rw,relatime - ext4 /dev/sda3 rw,stripe=64,data=ordered
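
What distinguishes these two lines from the earlier ones is that no
"shared:N" tag appears between the mount options and the "-" separator.
A small sketch of that check (Python; propagation_tags() and is_private()
are hypothetical helper names, and the rule that a mount with no
propagation tags is private follows the description of the optional
fields in proc(5)):

```python
from typing import List


def propagation_tags(mountinfo_line: str) -> List[str]:
    """Return the optional fields (e.g. "shared:1") of one mountinfo line."""
    fields = mountinfo_line.split()
    # Fields 0-5 are fixed; optional fields run from index 6 up to the "-".
    separator = fields.index("-")
    return fields[6:separator]


def is_private(mountinfo_line: str) -> bool:
    """True when the line carries no propagation tags at all."""
    return not propagation_tags(mountinfo_line)


if __name__ == "__main__":
    line = ("200 59 8:3 /home/a /mnt rw,relatime - ext4 /dev/sda3 "
            "rw,stripe=64,data=ordered")
    print(propagation_tags(line), is_private(line))
```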


