* sharing page cache pages between multiple mappings
From: Miklos Szeredi @ 2016-05-19 8:20 UTC
To: linux-mm, linux-fsdevel, linux-kernel, linux-btrfs

Has anyone thought about sharing pages between multiple files?

The obvious application is for COW filesystems where there are
logically distinct files that physically share data and could easily
share the cache as well if there was infrastructure for it.

Thanks,
Miklos

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

* Re: sharing page cache pages between multiple mappings
From: Michal Hocko @ 2016-05-19 9:05 UTC
To: Miklos Szeredi
Cc: linux-mm, linux-fsdevel, linux-kernel, linux-btrfs

On Thu 19-05-16 10:20:13, Miklos Szeredi wrote:
> Has anyone thought about sharing pages between multiple files?
>
> The obvious application is for COW filesystems where there are
> logically distinct files that physically share data and could easily
> share the cache as well if there was infrastructure for it.

FYI this has been discussed at LSFMM this year[1]. I wasn't at the
session so cannot tell you any details but the LWN article covers it at
least briefly.

[1] https://lwn.net/Articles/684826/
--
Michal Hocko
SUSE Labs

* Re: sharing page cache pages between multiple mappings
From: Miklos Szeredi @ 2016-05-19 10:17 UTC
To: Michal Hocko
Cc: linux-mm, linux-fsdevel, linux-kernel, linux-btrfs, Darrick J. Wong

On Thu, May 19, 2016 at 11:05 AM, Michal Hocko <mhocko@kernel.org> wrote:
> On Thu 19-05-16 10:20:13, Miklos Szeredi wrote:
>> Has anyone thought about sharing pages between multiple files?
>>
>> The obvious application is for COW filesystems where there are
>> logically distinct files that physically share data and could easily
>> share the cache as well if there was infrastructure for it.
>
> FYI this has been discussed at LSFMM this year[1]. I wasn't at the
> session so cannot tell you any details but the LWN article covers it at
> least briefly.

Cool, so it's not such a crazy idea.

Darrick, would you mind briefly sharing your ideas regarding this?

The use case I have is fixing overlayfs weird behavior. The following
may result in "buf" not matching "data":

    int fr = open("foo", O_RDONLY);
    int fw = open("foo", O_RDWR);
    write(fw, data, sizeof(data));
    read(fr, buf, sizeof(data));

The reason is that "foo" is on a read-only layer, and opening it for
read-write triggers copy-up into a read-write layer. However the old,
read-only open still refers to the unmodified file.

Fixing this properly requires that when opening a file, we don't
delegate operations fully to the underlying file, but rather allow
sharing of pages from underlying file until the file is copied up. At
that point we switch to sharing pages with the read-write copy.

Another use case is direct access in fuse: people often want I/O
operations on a fuse file to go directly to an underlying file. Doing
this properly requires sharing pages between the real, underlying file
and the fuse file.

Thanks,
Miklos
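
The open/write/read sequence described in the message above can be
wrapped into a small self-contained check. This is only a minimal
sketch: the helper name read_sees_write() and the payload string are
illustrative and do not come from the thread. It returns 1 when the
read-only descriptor observes the write, 0 when it returns other data
(the copy-up case described above), and -1 on error.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper (the name is not from the thread): open `path`
 * read-only first, then read-write, write through the second
 * descriptor and read back through the first one.
 *
 * Returns 1 if the read sees the written data, 0 if it does not,
 * and -1 on any error.
 */
int read_sees_write(const char *path)
{
    static const char data[] = "fresh data";
    char buf[sizeof(data)];
    int result = -1;

    int fr = open(path, O_RDONLY);
    if (fr < 0)
        return -1;

    int fw = open(path, O_RDWR);
    if (fw < 0) {
        close(fr);
        return -1;
    }

    /* Same order as in the message: write via fw, then read via fr. */
    if (write(fw, data, sizeof(data)) == (ssize_t)sizeof(data)) {
        ssize_t n = read(fr, buf, sizeof(buf));
        if (n >= 0)
            result = (n == (ssize_t)sizeof(data) &&
                      memcmp(buf, data, sizeof(data)) == 0);
    }

    close(fw);
    close(fr);
    return result;
}
```

On an ordinary filesystem this is expected to return 1. Going by the
description above, calling it on an overlayfs file that still lives on
the read-only layer may return 0, because the first descriptor keeps
referring to the unmodified lower file after copy-up.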

* Re: sharing page cache pages between multiple mappings
From: Michal Hocko @ 2016-05-19 10:53 UTC
To: Miklos Szeredi
Cc: linux-mm, linux-fsdevel, linux-kernel, linux-btrfs, Darrick J. Wong

On Thu 19-05-16 12:17:14, Miklos Szeredi wrote:
> On Thu, May 19, 2016 at 11:05 AM, Michal Hocko <mhocko@kernel.org> wrote:
> > On Thu 19-05-16 10:20:13, Miklos Szeredi wrote:
> >> Has anyone thought about sharing pages between multiple files?
> >>
> >> The obvious application is for COW filesystems where there are
> >> logically distinct files that physically share data and could easily
> >> share the cache as well if there was infrastructure for it.
> >
> > FYI this has been discussed at LSFMM this year[1]. I wasn't at the
> > session so cannot tell you any details but the LWN article covers it at
> > least briefly.
>
> Cool, so it's not such a crazy idea.

FWIW it is ;)
--
Michal Hocko
SUSE Labs

* Re: sharing page cache pages between multiple mappings
From: Dave Chinner @ 2016-05-19 23:48 UTC
To: Miklos Szeredi
Cc: Michal Hocko, linux-mm, linux-fsdevel, linux-kernel, linux-btrfs,
    Darrick J. Wong

On Thu, May 19, 2016 at 12:17:14PM +0200, Miklos Szeredi wrote:
> On Thu, May 19, 2016 at 11:05 AM, Michal Hocko <mhocko@kernel.org> wrote:
> > On Thu 19-05-16 10:20:13, Miklos Szeredi wrote:
> >> Has anyone thought about sharing pages between multiple files?
> >>
> >> The obvious application is for COW filesystems where there are
> >> logically distinct files that physically share data and could easily
> >> share the cache as well if there was infrastructure for it.
> >
> > FYI this has been discussed at LSFMM this year[1]. I wasn't at the
> > session so cannot tell you any details but the LWN article covers it at
> > least briefly.
>
> Cool, so it's not such a crazy idea.

Oh, it most certainly is crazy. :P

> Darrick, would you mind briefly sharing your ideas regarding this?

The current line of thought is that we'll only attempt this in XFS on
inodes that are known to share underlying physical extents. i.e.
files that have blocks that have been reflinked or deduped. That
way we can overload the breaking of reflink blocks (via copy on
write) with unsharing the pages in the page cache for that inode.
i.e. shared pages can propagate upwards in overlay if it uses
reflink for copy-up and writes will then break the sharing with the
underlying source without overlay having to do anything special.

Right now I'm not sure what mechanism we will use - we want to
support files that have a mix of private and shared pages, so that
implies we are not going to be sharing mappings but sharing pages
instead.
However, we've been looking at this as being completely
encapsulated within the filesystem because it's tightly linked to
changes in the physical layout of the filesystem, not as general
"share this mapping between two unrelated inodes" infrastructure.
That may change as we dig deeper into it...

> The use case I have is fixing overlayfs weird behavior. The following
> may result in "buf" not matching "data":
>
>     int fr = open("foo", O_RDONLY);
>     int fw = open("foo", O_RDWR);
>     write(fw, data, sizeof(data));
>     read(fr, buf, sizeof(data));
>
> The reason is that "foo" is on a read-only layer, and opening it for
> read-write triggers copy-up into a read-write layer. However the old,
> read-only open still refers to the unmodified file.
>
> Fixing this properly requires that when opening a file, we don't
> delegate operations fully to the underlying file, but rather allow
> sharing of pages from underlying file until the file is copied up. At
> that point we switch to sharing pages with the read-write copy.

Unless I'm missing something here (quite possible!), I'm not sure
we can fix that problem with page cache sharing or reflink. It
implies we are sharing pages in a downwards direction - private
overlay pages/mappings from multiple inodes would need to be shared
with a single underlying shared read-only inode, and I lack the
imagination to see how that works...

Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com

* Re: sharing page cache pages between multiple mappings
From: Miklos Szeredi @ 2016-05-20 10:37 UTC
To: Dave Chinner
Cc: Michal Hocko, linux-mm, linux-fsdevel, linux-kernel, linux-btrfs,
    Darrick J. Wong

On Fri, May 20, 2016 at 1:48 AM, Dave Chinner <david@fromorbit.com> wrote:
> On Thu, May 19, 2016 at 12:17:14PM +0200, Miklos Szeredi wrote:
>> On Thu, May 19, 2016 at 11:05 AM, Michal Hocko <mhocko@kernel.org> wrote:
>> > On Thu 19-05-16 10:20:13, Miklos Szeredi wrote:
>> >> Has anyone thought about sharing pages between multiple files?
>> >>
>> >> The obvious application is for COW filesystems where there are
>> >> logically distinct files that physically share data and could easily
>> >> share the cache as well if there was infrastructure for it.
>> >
>> > FYI this has been discussed at LSFMM this year[1]. I wasn't at the
>> > session so cannot tell you any details but the LWN article covers it at
>> > least briefly.
>>
>> Cool, so it's not such a crazy idea.
>
> Oh, it most certainly is crazy. :P
>
>> Darrick, would you mind briefly sharing your ideas regarding this?
>
> The current line of thought is that we'll only attempt this in XFS on
> inodes that are known to share underlying physical extents. i.e.
> files that have blocks that have been reflinked or deduped. That
> way we can overload the breaking of reflink blocks (via copy on
> write) with unsharing the pages in the page cache for that inode.
> i.e. shared pages can propagate upwards in overlay if it uses
> reflink for copy-up and writes will then break the sharing with the
> underlying source without overlay having to do anything special.
>
> Right now I'm not sure what mechanism we will use - we want to
> support files that have a mix of private and shared pages, so that
> implies we are not going to be sharing mappings but sharing pages
> instead.
> However, we've been looking at this as being completely
> encapsulated within the filesystem because it's tightly linked to
> changes in the physical layout of the filesystem, not as general
> "share this mapping between two unrelated inodes" infrastructure.
> That may change as we dig deeper into it...
>
>> The use case I have is fixing overlayfs weird behavior. The following
>> may result in "buf" not matching "data":
>>
>>     int fr = open("foo", O_RDONLY);
>>     int fw = open("foo", O_RDWR);
>>     write(fw, data, sizeof(data));
>>     read(fr, buf, sizeof(data));
>>
>> The reason is that "foo" is on a read-only layer, and opening it for
>> read-write triggers copy-up into a read-write layer. However the old,
>> read-only open still refers to the unmodified file.
>>
>> Fixing this properly requires that when opening a file, we don't
>> delegate operations fully to the underlying file, but rather allow
>> sharing of pages from underlying file until the file is copied up. At
>> that point we switch to sharing pages with the read-write copy.
>
> Unless I'm missing something here (quite possible!), I'm not sure
> we can fix that problem with page cache sharing or reflink. It
> implies we are sharing pages in a downwards direction - private
> overlay pages/mappings from multiple inodes would need to be shared
> with a single underlying shared read-only inode, and I lack the
> imagination to see how that works...

Indeed, reflink doesn't make this work.

We could reflink-up on any open (or on lookup), not just on write;
it's a trivial change in overlayfs. Drawback is slower first
open/lookup and space used by duplicate trees even without
modification on the overlay. Not sure if that's a problem in
practice.

I'll think about the generic downwards sharing. For overlayfs it
doesn't need to be per-page, so that might make it a somewhat simpler
problem.

Thanks,
Miklos
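
The "reflink-up" idea discussed above can be illustrated from userspace
with the FICLONE ioctl, which asks the filesystem to make the
destination share the source's extents. This is only a sketch of the
concept, not overlayfs's in-kernel copy-up: the function name copy_up()
and the fallback to a plain data copy are assumptions made for the
example. It returns 1 if the blocks were shared via reflink, 0 if the
data had to be copied, and -1 on error.

```c
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fs.h>   /* FICLONE, available since Linux 4.5 */

/*
 * Hypothetical userspace analogue of a reflink-based copy-up (the name
 * and the fallback policy are illustrative, not from the thread).
 */
int copy_up(const char *lower, const char *upper)
{
    char buf[65536];
    int cloned = 0;
    int failed = 0;

    int src = open(lower, O_RDONLY);
    if (src < 0)
        return -1;

    int dst = open(upper, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (dst < 0) {
        close(src);
        return -1;
    }

#ifdef FICLONE
    /* Succeeds only on filesystems that support reflink. */
    if (ioctl(dst, FICLONE, src) == 0)
        cloned = 1;
#endif

    if (!cloned) {
        /* Fallback: ordinary data copy, no extents are shared. */
        for (;;) {
            ssize_t n = read(src, buf, sizeof(buf));
            if (n == 0)
                break;                  /* end of file */
            if (n < 0) {
                failed = 1;
                break;
            }
            ssize_t off = 0;
            while (off < n) {
                ssize_t w = write(dst, buf + off, (size_t)(n - off));
                if (w <= 0) {
                    failed = 1;
                    break;
                }
                off += w;
            }
            if (failed)
                break;
        }
    }

    close(src);
    if (close(dst) < 0)
        failed = 1;
    return failed ? -1 : cloned;
}
```

When the ioctl succeeds, the copy costs no extra data blocks until one
of the files is written, which is why the drawback mentioned above is
mainly the slower first open/lookup and the duplicated tree metadata.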

end of thread, last message 2016-05-20 10:37 UTC

Thread overview: 6+ messages
2016-05-19  8:20 sharing page cache pages between multiple mappings Miklos Szeredi
2016-05-19  9:05 ` Michal Hocko
2016-05-19 10:17   ` Miklos Szeredi
2016-05-19 10:53     ` Michal Hocko
2016-05-19 23:48     ` Dave Chinner
2016-05-20 10:37       ` Miklos Szeredi