linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jerome Glisse <jglisse@redhat.com>
To: Matthew Wilcox <willy@infradead.org>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	linux-fsdevel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Tejun Heo <tj@kernel.org>, Jan Kara <jack@suse.cz>,
	Josef Bacik <jbacik@fb.com>
Subject: Re: [PATCH 00/14] Small step toward KSM for file back page.
Date: Thu, 8 Oct 2020 14:48:49 -0400	[thread overview]
Message-ID: <20201008184849.GA3514601@redhat.com> (raw)
In-Reply-To: <20201008154341.GJ20115@casper.infradead.org>

On Thu, Oct 08, 2020 at 04:43:41PM +0100, Matthew Wilcox wrote:
> On Thu, Oct 08, 2020 at 11:30:28AM -0400, Jerome Glisse wrote:
> > On Wed, Oct 07, 2020 at 11:09:16PM +0100, Matthew Wilcox wrote:
> > > So ... why don't you put a PageKsm page in the page cache?  That way you
> > > can share code with the current KSM implementation.  You'd need
> > > something like this:
> > 
> > I do just that but there is no need to change anything in page cache.
> 
> That's clearly untrue.  If you just put a PageKsm page in the page
> cache today, here's what will happen on a truncate:
> 
> void truncate_inode_pages_range(struct address_space *mapping,
>                                 loff_t lstart, loff_t lend)
> {
> ...
>                 struct page *page = find_lock_page(mapping, start - 1);
> 
> find_lock_page() does this:
>         return pagecache_get_page(mapping, offset, FGP_LOCK, 0);
> 
> pagecache_get_page():
> 
> repeat:
>         page = find_get_entry(mapping, index);
> ...
>         if (fgp_flags & FGP_LOCK) {
> ...
>                 if (unlikely(compound_head(page)->mapping != mapping)) {
>                         unlock_page(page);
>                         put_page(page);
>                         goto repeat;
> 
> so it's just going to spin.  There are plenty of other codepaths that
> would need to be checked.  If you haven't found them, that shows you
> don't understand the problem deeply enough yet.

I also change truncate, splice and few other special cases that do
not goes through GUP/page fault/mkwrite (memory debug too but that's
a different beast).


> I believe we should solve this problem, but I don't think you're going
> about it the right way.

I have done much more than what i posted but there is bug that i
need to hammer down before posting everything and i wanted to get
the discussion started. I guess i will finish tracking that one
down and post the whole thing.


> > So flow is:
> > 
> >   Same as before:
> >     1 - write fault (address, vma)
> >     2 - regular write fault handler -> find page in page cache
> > 
> >   New to common page fault code:
> >     3 - ksm check in write fault common code (same as ksm today
> >         for anonymous page fault code path).
> >     4 - break ksm (address, vma) -> (file offset, mapping)
> >         4.a - use mapping and file offset to lookup the proper
> >               fs specific information that were save when the
> >               page was made ksm.
> >         4.b - allocate new page and initialize it with that
> >               information (and page content), update page cache
> >               and mappings ie all the pte who where pointing to
> >               the ksm for that mapping at that offset to now use
> >               the new page (like KSM for anonymous page today).
> 
> But by putting that logic in the page fault path, you've missed
> the truncate path.  And maybe other places.  Putting the logic
> down in pagecache_get_page() means you _don't_ need to find
> all the places that call pagecache_get_page().

They are cases where pagecache is not even in the loop ie you
already have the page and you do not need to look it up (page
fault, some fs common code, anything that goes through GUP,
memory reclaim, ...). Making all those places having to go
through page cache all the times will slow them down and many
are hot code path that i do not believe we want to slow even
if a feature is not use.

Cheers,
Jérôme


      reply	other threads:[~2020-10-08 18:49 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-07  1:05 [PATCH 00/14] Small step toward KSM for file back page jglisse
2020-10-07  1:05 ` [PATCH 01/14] mm/pxa: page exclusive access add header file for all helpers jglisse
2020-10-07  1:05 ` [PATCH 02/14] fs: define filler_t as a function pointer type jglisse
2020-10-07  1:05 ` [PATCH 03/14] fs: directly use a_ops->freepage() instead of a local copy of it jglisse
2020-10-07  1:05 ` [PATCH 04/14] mm: add struct address_space to readpage() callback jglisse
2020-10-07  1:05 ` [PATCH 05/14] mm: add struct address_space to writepage() callback jglisse
2020-10-07  1:05 ` [PATCH 06/14] mm: add struct address_space to set_page_dirty() callback jglisse
2020-10-07  1:05 ` [PATCH 07/14] mm: add struct address_space to invalidatepage() callback jglisse
2020-10-07  1:05 ` [PATCH 08/14] mm: add struct address_space to releasepage() callback jglisse
2020-10-07  1:05 ` [PATCH 09/14] mm: add struct address_space to freepage() callback jglisse
2020-10-07  1:05 ` [PATCH 10/14] mm: add struct address_space to putback_page() callback jglisse
2020-10-07  1:06 ` [PATCH 11/14] mm: add struct address_space to launder_page() callback jglisse
2020-10-07  1:06 ` [PATCH 12/14] mm: add struct address_space to is_partially_uptodate() callback jglisse
2020-10-07  1:06 ` [PATCH 13/14] mm: add struct address_space to isolate_page() callback jglisse
2020-10-07  1:06 ` [PATCH 14/14] mm: add struct address_space to is_dirty_writeback() callback jglisse
2020-10-07  3:20 ` [PATCH 00/14] Small step toward KSM for file back page Matthew Wilcox
2020-10-07 14:48   ` Jerome Glisse
2020-10-07 17:05     ` Matthew Wilcox
2020-10-07 17:54       ` Jerome Glisse
2020-10-07 18:33         ` Matthew Wilcox
2020-10-07 21:45           ` Jerome Glisse
2020-10-07 22:09         ` Matthew Wilcox
2020-10-08 15:30           ` Jerome Glisse
2020-10-08 15:43             ` Matthew Wilcox
2020-10-08 18:48               ` Jerome Glisse [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201008184849.GA3514601@redhat.com \
    --to=jglisse@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=jack@suse.cz \
    --cc=jbacik@fb.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=tj@kernel.org \
    --cc=viro@zeniv.linux.org.uk \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).