From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Date: Tue, 17 Jan 2017 16:46:37 +0100
From: Jan Kara
To: Miklos Szeredi
Cc: Jan Kara, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, linux-btrfs@vger.kernel.org, lsf-pc@lists.linux-foundation.org
Subject: Re: [Lsf-pc] [LSF/MM TOPIC] sharing pages between mappings
Message-ID: <20170117154637.GT2517@quack2.suse.cz>
References: <20170111115143.GJ16116@quack2.suse.cz>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
Sender: owner-linux-mm@kvack.org
List-ID:

On Wed 11-01-17 15:13:19, Miklos Szeredi wrote:
> On Wed, Jan 11, 2017 at 12:51 PM, Jan Kara wrote:
> > On Wed 11-01-17 11:29:28, Miklos Szeredi wrote:
> >> I know there's work on this for xfs, but could this be done in generic mm
> >> code?
> >>
> >> What are the obstacles? page->mapping and page->index are the obvious
> >> ones.
> >
> > Yes, these two are the main that come to my mind. Also you'd need to
> > somehow share the mapping->i_mmap tree so that unmap_mapping_range() works.
> >
> >> If that's too difficult is it maybe enough to share mappings between
> >> files while they are completely identical and clone the mapping when
> >> necessary?
> >
> > Well, but how would the page->mapping->host indirection work? Even if you
> > have identical contents of the mappings, you still need to be aware there
> > are several inodes behind them and you need to pick the right one
> > somehow...
>
> When do we actually need page->mapping->host? The only place where
> it's not available is page writeback. Then we can know that the
> original page was already cow-ed and after being cowed, the page
> belong only to a single inode.

Yeah, in principle the information may exist, however propagating it to all
appropriate place may be a mess.

> What then happens if the newly written data is cloned before being
> written back? We can either write back the page during the clone, so
> that only clean pages are ever shared. Or we can let dirty pages be
> shared between inodes.

The former is what I'd suggest for sanity... I.e. share only read-only
pages.

> In that latter case the question is: do we
> care about which inode we use for writing back the data? Is the inode
> needed at all? I don't know enough about filesystem internals to see
> clearly what happens in such a situation.
>
> >> All COW filesystems would benefit, as well as layered ones: lots of
> >> fuse fs, and in some cases overlayfs too.
> >>
> >> Related: what can DAX do in the presence of cloned block?
> >
> > For DAX handling a block COW should be doable if that is what you are
> > asking about. Handling of blocks that can be written to while they are
> > shared will be rather difficult (you have problems with keeping dirty bits
> > in the radix tree consistent if nothing else).
>
> What happens if you do:
>
> - clone_file_range(A, off1, B, off2, len);
>
> - mmap both A and B using DAX.
>
> The mapping will contain the same struct page for two different mappings, no?

Not the same struct page, as DAX does not have pages with struct page.
However the same pfn will be underlying off1 of A and off2 of B. And for
reads this is just fine. Once you want to write, you have to make sure you
COW before you start modifying the data or you'll get data corruption (we
synchronize operations using the exceptional entries in mapping->page_tree
in DAX and these are separate for A and B).

Honza
--
Jan Kara
SUSE Labs, CR

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Subject: Re: [LSF/MM TOPIC] sharing pages between mappings
Mime-Version: 1.0 (Mac OS X Mail 9.3 \(3124\))
Content-Type: multipart/signed; boundary="Apple-Mail=_24F3C8FA-2453-4243-89D1-AB7658AA1EF6"; protocol="application/pgp-signature"; micalg=pgp-sha256
From: Andreas Dilger
In-Reply-To:
Date: Wed, 11 Jan 2017 13:35:02 -0700
Cc: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, linux-btrfs@vger.kernel.org, lsf-pc@lists.linux-foundation.org
Message-Id:
References:
To: Miklos Szeredi
Sender: owner-linux-mm@kvack.org
List-ID:

On Jan 11, 2017, at 3:29 AM, Miklos Szeredi wrote:
>
> I know there's work on this for xfs, but could this be done in generic mm code?
>
> What are the obstacles? page->mapping and page->index are the obvious ones.
>
> If that's too difficult is it maybe enough to share mappings between
> files while they are completely identical and clone the mapping when
> necessary?
>
> All COW filesystems would benefit, as well as layered ones: lots of
> fuse fs, and in some cases overlayfs too.

For layered filesystems it would also be useful to have an API to move
pages between mappings easily.

> Related: what can DAX do in the presence of cloned block?
>
> Thanks,
> Miklos
> --
> To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html

Cheers, Andreas

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Date: Wed, 11 Jan 2017 10:05:37 -0800
From: "Darrick J. Wong"
To: Jan Kara
Cc: Miklos Szeredi, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, linux-btrfs@vger.kernel.org, lsf-pc@lists.linux-foundation.org
Subject: Re: [Lsf-pc] [LSF/MM TOPIC] sharing pages between mappings
Message-ID: <20170111180537.GA10498@birch.djwong.org>
References: <20170111115143.GJ16116@quack2.suse.cz>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20170111115143.GJ16116@quack2.suse.cz>
Sender: owner-linux-mm@kvack.org
List-ID:

On Wed, Jan 11, 2017 at 12:51:43PM +0100, Jan Kara wrote:
> On Wed 11-01-17 11:29:28, Miklos Szeredi wrote:
> > I know there's work on this for xfs, but could this be done in generic mm
> > code?
> >
> > What are the obstacles? page->mapping and page->index are the obvious
> > ones.
>
> Yes, these two are the main that come to my mind. Also you'd need to
> somehow share the mapping->i_mmap tree so that unmap_mapping_range() works.
>
> > If that's too difficult is it maybe enough to share mappings between
> > files while they are completely identical and clone the mapping when
> > necessary?
>
> Well, but how would the page->mapping->host indirection work? Even if you
> have identical contents of the mappings, you still need to be aware there
> are several inodes behind them and you need to pick the right one
> somehow...
>
> > All COW filesystems would benefit, as well as layered ones: lots of
> > fuse fs, and in some cases overlayfs too.
> >
> > Related: what can DAX do in the presence of cloned block?
>
> For DAX handling a block COW should be doable if that is what you are
> asking about. Handling of blocks that can be written to while they are
> shared will be rather difficult (you have problems with keeping dirty bits
> in the radix tree consistent if nothing else).

I'm also interested in this topic, though I haven't gotten any further
than a hand-wavy notion of handling cow by allocating new blocks, memcpy
the contents to the new blocks (how?), then update the mappings to point
to the new blocks (how?). It looks a lot easier now with the iomap
stuff, but that's as far as I got. :)

(IOWs it basically took all the time since the last LSF to get reflink
polished enough to handle regular files reasonably well.)

--D

>
> Honza
> --
> Jan Kara
> SUSE Labs, CR

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
MIME-Version: 1.0
In-Reply-To: <20170111115143.GJ16116@quack2.suse.cz>
References: <20170111115143.GJ16116@quack2.suse.cz>
From: Miklos Szeredi
Date: Wed, 11 Jan 2017 15:13:19 +0100
Message-ID:
Subject: Re: [Lsf-pc] [LSF/MM TOPIC] sharing pages between mappings
To: Jan Kara
Cc: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, linux-btrfs@vger.kernel.org, lsf-pc@lists.linux-foundation.org
Content-Type: text/plain; charset=UTF-8
Sender: owner-linux-mm@kvack.org
List-ID:

On Wed, Jan 11, 2017 at 12:51 PM, Jan Kara wrote:
> On Wed 11-01-17 11:29:28, Miklos Szeredi wrote:
>> I know there's work on this for xfs, but could this be done in generic mm
>> code?
>>
>> What are the obstacles? page->mapping and page->index are the obvious
>> ones.
>
> Yes, these two are the main that come to my mind. Also you'd need to
> somehow share the mapping->i_mmap tree so that unmap_mapping_range() works.
>
>> If that's too difficult is it maybe enough to share mappings between
>> files while they are completely identical and clone the mapping when
>> necessary?
>
> Well, but how would the page->mapping->host indirection work? Even if you
> have identical contents of the mappings, you still need to be aware there
> are several inodes behind them and you need to pick the right one
> somehow...

When do we actually need page->mapping->host? The only place where
it's not available is page writeback. Then we can know that the
original page was already cow-ed and after being cowed, the page
belong only to a single inode.

What then happens if the newly written data is cloned before being
written back? We can either write back the page during the clone, so
that only clean pages are ever shared. Or we can let dirty pages be
shared between inodes. In that latter case the question is: do we
care about which inode we use for writing back the data? Is the inode
needed at all? I don't know enough about filesystem internals to see
clearly what happens in such a situation.

>> All COW filesystems would benefit, as well as layered ones: lots of
>> fuse fs, and in some cases overlayfs too.
>>
>> Related: what can DAX do in the presence of cloned block?
>
> For DAX handling a block COW should be doable if that is what you are
> asking about. Handling of blocks that can be written to while they are
> shared will be rather difficult (you have problems with keeping dirty bits
> in the radix tree consistent if nothing else).

What happens if you do:

- clone_file_range(A, off1, B, off2, len);

- mmap both A and B using DAX.

The mapping will contain the same struct page for two different mappings, no?

Thanks,
Miklos

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Date: Wed, 11 Jan 2017 12:51:43 +0100
From: Jan Kara
To: Miklos Szeredi
Cc: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, linux-btrfs@vger.kernel.org, lsf-pc@lists.linux-foundation.org
Subject: Re: [Lsf-pc] [LSF/MM TOPIC] sharing pages between mappings
Message-ID: <20170111115143.GJ16116@quack2.suse.cz>
References:
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
Sender: owner-linux-mm@kvack.org
List-ID:

On Wed 11-01-17 11:29:28, Miklos Szeredi wrote:
> I know there's work on this for xfs, but could this be done in generic mm
> code?
>
> What are the obstacles? page->mapping and page->index are the obvious
> ones.

Yes, these two are the main that come to my mind. Also you'd need to
somehow share the mapping->i_mmap tree so that unmap_mapping_range() works.

> If that's too difficult is it maybe enough to share mappings between
> files while they are completely identical and clone the mapping when
> necessary?

Well, but how would the page->mapping->host indirection work? Even if you
have identical contents of the mappings, you still need to be aware there
are several inodes behind them and you need to pick the right one
somehow...

> All COW filesystems would benefit, as well as layered ones: lots of
> fuse fs, and in some cases overlayfs too.
>
> Related: what can DAX do in the presence of cloned block?

For DAX handling a block COW should be doable if that is what you are
asking about. Handling of blocks that can be written to while they are
shared will be rather difficult (you have problems with keeping dirty bits
in the radix tree consistent if nothing else).

Honza
--
Jan Kara
SUSE Labs, CR

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
MIME-Version: 1.0
From: Miklos Szeredi
Date: Wed, 11 Jan 2017 11:29:28 +0100
Message-ID:
Subject: [LSF/MM TOPIC] sharing pages between mappings
To: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, linux-btrfs@vger.kernel.org, lsf-pc@lists.linux-foundation.org
Content-Type: text/plain; charset=UTF-8
Sender: owner-linux-mm@kvack.org
List-ID:

I know there's work on this for xfs, but could this be done in generic mm
code?

What are the obstacles? page->mapping and page->index are the obvious
ones.

If that's too difficult is it maybe enough to share mappings between
files while they are completely identical and clone the mapping when
necessary?

All COW filesystems would benefit, as well as layered ones: lots of
fuse fs, and in some cases overlayfs too.

Related: what can DAX do in the presence of cloned block?

Thanks,
Miklos
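
Editorial sketch (not part of any message in the thread): the policy discussed above, where dirty pages are written back during the clone so that only clean pages are ever shared and sharing is broken by a copy before any write, can be illustrated with a small userspace toy model. This is not kernel code and not anyone's proposed implementation. The names `Page`, `Mapping`, `clone_range`, the `sharers` counter and the `disk` dictionary are all hypothetical, chosen only for illustration, and the model assumes the cloned source pages are already present in the source mapping.

```python
"""Toy model of sharing page-cache pages between two files (not kernel code)."""

PAGE_SIZE = 4096


class Page:
    def __init__(self, data=b""):
        # Pad to a full page so offsets inside the page are always valid.
        self.data = bytearray(data.ljust(PAGE_SIZE, b"\0"))
        self.dirty = False
        self.sharers = 1  # how many mappings currently reference this page


class Mapping:
    """Hypothetical stand-in for a per-inode address_space."""

    def __init__(self, name):
        self.name = name
        self.pages = {}  # page index -> Page
        self.disk = {}   # page index -> bytes, stands in for on-disk blocks

    def read(self, index):
        return bytes(self.pages[index].data)

    def write(self, index, offset, data):
        page = self.pages.get(index)
        if page is None:
            page = self.pages[index] = Page()
        elif page.sharers > 1:
            # COW: copy before modifying so the other file never sees the write.
            page.sharers -= 1
            page = self.pages[index] = Page(bytes(page.data))
        page.data[offset:offset + len(data)] = data
        page.dirty = True

    def writeback(self, index):
        page = self.pages[index]
        if page.dirty:
            self.disk[index] = bytes(page.data)
            page.dirty = False


def clone_range(src, src_index, dst, dst_index, count):
    """Share `count` pages of src with dst, cleaning each page first."""
    for i in range(count):
        src.writeback(src_index + i)  # only clean pages are ever shared
        page = src.pages[src_index + i]
        old = dst.pages.get(dst_index + i)
        if old is not None:
            old.sharers -= 1
        page.sharers += 1
        dst.pages[dst_index + i] = page
        dst.disk[dst_index + i] = src.disk[src_index + i]


if __name__ == "__main__":
    a, b = Mapping("A"), Mapping("B")
    a.write(0, 0, b"hello")
    clone_range(a, 0, b, 0, 1)
    assert a.pages[0] is b.pages[0]      # one page behind two mappings
    b.write(0, 0, b"J")                  # triggers the copy
    assert a.read(0)[:5] == b"hello"
    assert b.read(0)[:5] == b"Jello"
```

Because a shared page is always clean in this model, writeback never has to decide which inode a shared page belongs to: a page only becomes dirty after the copy, when exactly one mapping references it. The model ignores the harder problems raised in the thread, such as page->mapping and page->index pointing at a single owner, the i_mmap tree, and DAX.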