From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751067AbdAMD0y (ORCPT ); Thu, 12 Jan 2017 22:26:54 -0500
Received: from zeniv.linux.org.uk ([195.92.253.2]:32888 "EHLO
	ZenIV.linux.org.uk" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1750897AbdAMD0x (ORCPT );
	Thu, 12 Jan 2017 22:26:53 -0500
Date: Fri, 13 Jan 2017 03:26:51 +0000
From: Al Viro
To: Nakajima Akira
Cc: linux-kernel@vger.kernel.org
Subject: Re: Duplicate inode number when mount --bind some directories to
	same mountpoint. (from Fedora18 to 4.10-rc3)
Message-ID: <20170113032651.GF1555@ZenIV.linux.org.uk>
References: <9795d554-d236-2096-497d-e25622042d41@nttcom.co.jp>
	<20170112102425.GW1555@ZenIV.linux.org.uk>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.7.1 (2016-10-04)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Jan 13, 2017 at 10:40:08AM +0900, Nakajima Akira wrote:
> On 2017/01/12 19:24, Al Viro wrote:
> > On Thu, Jan 12, 2017 at 06:16:35PM +0900, Nakajima Akira wrote:
> > > Bug:
> > > Duplicate inode number when mount --bind some directories to same
> > > mountpoint. (from Fedora18 to 4.10-rc3)
> > > Fedora17 and earlier works correctly.
> >
> > Explain, please. "Duplicate inode number" between what and what?
>
> Duplicate inode number between mounted directories.
>
> Example)
> # cd /home
> # mkdir a b
> # ls -i
> 100 a 999 b
> # mount --bind a /mnt
> # mount --bind b /mnt
> # ls -i
> 999 a 999 b
>
> Inode number of directory "a" is changed to "b".
> Then we see directory "b" when ls "a".

61 0 252:1 / / rw,relatime shared:1 - ext4 /dev/vda1 rw,data=ordered

Root, marked shared (peer group 1). /home is not a mountpoint, /mnt wasn't
one until your mounts (i.e. both are within the same mount as /).

Since /home/a is a subtree of a shared mount, any clone of it will, by
default, join the same peer group. Which means that binding it on /mnt
results in

116 61 252:1 /home/a /mnt rw,relatime shared:1 - ext4 /dev/vda1 rw,data=ordered

i.e. ext4[vda1]home/a being mounted on /mnt and marked peer of root mount.
Accordingly, any mount/umount event in either will be duplicated to all
peers (provided that they contain a counterpart of the affected mountpoint).

In particular, binding /home/b on /mnt (i.e. on top of ext4[vda1]home/a)
propagates to the corresponding points in all peers - including the root
mount, where it corresponds to /home/a. Result:

120 116 252:1 /home/b /mnt rw,relatime shared:1 - ext4 /dev/vda1 rw,data=ordered
121 61 252:1 /home/b /home/a rw,relatime shared:1 - ext4 /dev/vda1 rw,data=ordered

The same tree (ext4[vda1]home/b) is mounted on root in mount 116 (i.e. the
thing found on /mnt) and on /home/a in mount 61 (i.e. the root mount).
Since /home/b is on a shared mount, both clones are put in the same peer
group (i.e. the same group 1).

You asked for it, you've got it... Well, Fedora folks did, actually.
I'm none too fond of their default setup (root made shared), but that has
nothing to do with the kernel. Userland (systemd, as far as I can tell)
is setting the things up that way, and it's even documented in Fedora
release notes... Kernel mechanisms involved in that have been there for
a long time and they are also documented (man 2 mount, look for MS_SHARED
and related flags in there).

Take it up with Fedora folks...
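
For anyone who wants to check the four mountinfo lines quoted above
mechanically, here is a minimal sketch in Python 3. It is an illustration
only, not something from the kernel or from util-linux: the names
parse_mountinfo_line, groups and by_root are made up for this example, and
the input is just the lines from this message rather than a live
/proc/self/mountinfo. It groups the mounts by peer group and by the subtree
they expose, which should make it visible that mounts 120 and 121 both show
/home/b, one on /mnt and one on /home/a.

```python
from collections import defaultdict

# The four mountinfo lines quoted in this message (not read from a live system).
MOUNTINFO = """\
61 0 252:1 / / rw,relatime shared:1 - ext4 /dev/vda1 rw,data=ordered
116 61 252:1 /home/a /mnt rw,relatime shared:1 - ext4 /dev/vda1 rw,data=ordered
120 116 252:1 /home/b /mnt rw,relatime shared:1 - ext4 /dev/vda1 rw,data=ordered
121 61 252:1 /home/b /home/a rw,relatime shared:1 - ext4 /dev/vda1 rw,data=ordered
"""


def parse_mountinfo_line(line):
    """Hypothetical helper: pick out the mountinfo fields used below."""
    # Everything before " - " is per-mount data, everything after is per-fs.
    left, _, right = line.partition(" - ")
    fields = left.split()
    fstype, source = right.split()[:2]
    # Optional fields such as "shared:N" come after the six fixed fields.
    peer_group = None
    for tag in fields[6:]:
        if tag.startswith("shared:"):
            peer_group = int(tag.split(":", 1)[1])
    return {
        "id": int(fields[0]),        # mount ID
        "parent": int(fields[1]),    # ID of the mount it sits on
        "root": fields[3],           # subtree of the filesystem being shown
        "mountpoint": fields[4],     # where it is attached
        "peer_group": peer_group,
        "fstype": fstype,
        "source": source,
    }


mounts = [parse_mountinfo_line(line) for line in MOUNTINFO.splitlines()]

# Mount IDs per peer group: all four mounts end up in group 1.
groups = defaultdict(list)
for m in mounts:
    groups[m["peer_group"]].append(m["id"])

# Mounts per exposed subtree: /home/b appears twice, on /mnt and on /home/a.
by_root = defaultdict(list)
for m in mounts:
    by_root[m["root"]].append((m["id"], m["parent"], m["mountpoint"]))

for root, entries in by_root.items():
    for mount_id, parent, mountpoint in entries:
        print(f"{root} -> mount {mount_id} on {mountpoint} (parent {parent})")
```

To try it against a real system, replace MOUNTINFO with the contents of
/proc/self/mountinfo; that file usually also contains mounts without a
shared: tag, which this sketch files under the key None.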