From: Mel Gorman <mgorman@techsingularity.net>
To: Jan Kara <jack@suse.cz>
Cc: Linux-MM <linux-mm@kvack.org>,
Linux-FSDevel <linux-fsdevel@vger.kernel.org>,
LKML <linux-kernel@vger.kernel.org>,
Andi Kleen <ak@linux.intel.com>,
Dave Hansen <dave.hansen@intel.com>,
Dave Chinner <david@fromorbit.com>
Subject: Re: [PATCH 3/8] mm, truncate: Remove all exceptional entries from pagevec under one lock
Date: Thu, 12 Oct 2017 15:53:06 +0100
Message-ID: <20171012145306.2lepcjtpdxshua6j@techsingularity.net>
In-Reply-To: <20171012133323.GB29293@quack2.suse.cz>

On Thu, Oct 12, 2017 at 03:33:23PM +0200, Jan Kara wrote:
> >  		return;
> >  
> > -	if (dax_mapping(mapping)) {
> > -		dax_delete_mapping_entry(mapping, index);
> > -		return;
> > +	dax = dax_mapping(mapping);
> > +	if (!dax)
> > +		spin_lock_irq(&mapping->tree_lock);
> > +
> > +	for (i = ei, j = ei; i < pagevec_count(pvec); i++) {
> > +		struct page *page = pvec->pages[i];
> > +		pgoff_t index = indices[i];
> > +
> > +		if (!radix_tree_exceptional_entry(page)) {
> > +			pvec->pages[j++] = page;
> > +			continue;
> > +		}
> > +
> > +		if (unlikely(dax)) {
> > +			dax_delete_mapping_entry(mapping, index);
> > +			continue;
> > +		}
> > +
> > +		__clear_shadow_entry(mapping, index, page);
> >  	}
> > -	clear_shadow_entry(mapping, index, entry);
> > +
> > +	if (!dax)
> > +		spin_unlock_irq(&mapping->tree_lock);
> > +	pvec->nr = j;
> >  }
>
> When I look at this I think we could make things cleaner. I have the following
> observations:
>
> 1) All of truncate_inode_pages(), invalidate_mapping_pages(), and
> invalidate_inode_pages2_range() essentially do a very similar thing and would
> benefit from a similar kind of batching.
>
While this is true, the benefit is marginal enough that I didn't feel
the level of churn was justified. Primarily it would help fadvise() and
invalidation when buffered and direct IO are mixed. I didn't think the code
would be that much cleaner as a result so I left it.
> 2) As you observed and measured, batching of radix tree operations makes
> sense both when removing pages and shadow entries. I'm very confident it
> would make sense for DAX exceptional entries as well.
>
True, but I didn't have a suitable setup for testing DAX so I wasn't
comfortable with making the change. dax_delete_mapping_entry() can sleep but
it should be as simple as not taking the spinlock in dax_delete_mapping_entry()
and always locking in truncate_exceptional_pvec_entries(). DAX already
releases mapping->tree_lock if it needs to sleep and I didn't spot
any other gotchas, but I'd prefer that change be made by someone who can
verify it works properly.
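
For readers following along, the batching pattern in the quoted hunk can be
modelled in userspace. This is only a toy sketch, not kernel code: every
toy_* name is hypothetical, odd values stand in for exceptional entries, and
the "lock" is a counter rather than mapping->tree_lock. It takes the lock
unconditionally, which is the shape the DAX suggestion above would lead to,
but it does not model sleeping and says nothing about whether that change is
safe for DAX.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_VEC_SIZE 14	/* illustrative capacity, assumption only */

/* Toy lock that records how many times it was taken. */
struct toy_lock {
	bool held;
	unsigned int acquisitions;
};

static void toy_lock_acquire(struct toy_lock *l)
{
	assert(!l->held);
	l->held = true;
	l->acquisitions++;
}

static void toy_lock_release(struct toy_lock *l)
{
	assert(l->held);
	l->held = false;
}

/* Toy stand-in for a pagevec. */
struct toy_vec {
	unsigned int nr;
	unsigned long entries[TOY_VEC_SIZE];
};

/* Assumption for this sketch: odd values are "exceptional". */
static bool toy_is_exceptional(unsigned long entry)
{
	return (entry & 1UL) != 0;
}

/*
 * Drop every exceptional entry at or after index ei, compacting the
 * remaining entries in place. The lock is taken once for the whole
 * vector instead of once per entry. Returns the number removed.
 */
static unsigned int toy_truncate_exceptional(struct toy_lock *lock,
					     struct toy_vec *vec,
					     unsigned int ei)
{
	unsigned int i, j, removed = 0;

	if (ei >= vec->nr)
		return 0;	/* nothing exceptional was recorded */

	toy_lock_acquire(lock);
	for (i = ei, j = ei; i < vec->nr; i++) {
		unsigned long entry = vec->entries[i];

		if (!toy_is_exceptional(entry)) {
			vec->entries[j++] = entry;	/* keep, shifted down */
			continue;
		}
		removed++;	/* stand-in for clearing the tree slot */
	}
	toy_lock_release(lock);

	vec->nr = j;
	return removed;
}
```

With entries { 2, 4, 3, 6, 5 } and ei = 2, the vector is left holding
{ 2, 4, 6 } after a single lock acquisition.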
> 3) In all cases (i.e., those three functions and for all entry types) the
> workflow seems to be:
> * lockless lookup of entries
> * prepare entry for reclaim (or determine it is not eligible)
> * lock mapping->tree_lock
> * verify entry is still eligible for reclaim (otherwise bail)
> * clear radix tree entry
> * unlock mapping->tree_lock
> * final cleanup of the entry
>
> So I'm wondering whether we cannot somehow refactor stuff so that batching
> of radix tree operations could be shared and we wouldn't have to duplicate
> it in all those cases.
>
> But it would be a rather large overhaul of the code so it may be a bit out of
> scope for these improvements...
>
I think it would be out of scope for this series but I can look into
it if the series is accepted. I think it would be a lot of churn for a fairly
marginal benefit though.
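
As a rough illustration of what sharing the workflow listed above might look
like, here is a userspace sketch. It is not taken from the series and is not
kernel code: the toy_* names, the flat slot array standing in for the radix
tree, and the three hooks are all assumptions made for the example. The
shared loop does the lookup, one lock round trip and the cleanup, while the
per-entry-type decisions live in the hooks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_TREE_SLOTS 16	/* illustrative sizes, assumptions only */
#define TOY_BATCH_MAX  14

/* Toy stand-in for the mapping: a flat slot array, 0 meaning "empty". */
struct toy_tree {
	unsigned long slots[TOY_TREE_SLOTS];
	unsigned int lock_round_trips;	/* counts lock/unlock pairs */
};

/* Hypothetical per-entry-type hooks. */
struct toy_batch_ops {
	bool (*prepare)(unsigned long entry);
	bool (*still_eligible)(unsigned long entry);
	void (*cleanup)(unsigned long entry);
};

static unsigned int toy_batch_remove(struct toy_tree *tree,
				     const size_t *indices, size_t nr,
				     const struct toy_batch_ops *ops)
{
	size_t cand[TOY_BATCH_MAX];
	unsigned long victims[TOY_BATCH_MAX];
	size_t i, ncand = 0, nvict = 0;

	assert(nr <= TOY_BATCH_MAX);

	/* Lockless lookup, then prepare (or reject) each entry. */
	for (i = 0; i < nr; i++) {
		unsigned long entry;

		assert(indices[i] < TOY_TREE_SLOTS);
		entry = tree->slots[indices[i]];
		if (entry && ops->prepare(entry))
			cand[ncand++] = indices[i];
	}

	/* "Lock": one round trip covers the whole batch. */
	tree->lock_round_trips++;
	for (i = 0; i < ncand; i++) {
		unsigned long entry = tree->slots[cand[i]];

		/* Re-verify under the lock; skip entries that changed. */
		if (!entry || !ops->still_eligible(entry))
			continue;
		tree->slots[cand[i]] = 0;	/* clear the tree entry */
		victims[nvict++] = entry;
	}
	/* "Unlock" would happen here. */

	/* Final cleanup runs outside the lock. */
	for (i = 0; i < nvict; i++)
		ops->cleanup(victims[i]);

	return (unsigned int)nvict;
}

/* Example hooks: odd values are removable, 7 fails the recheck. */
static unsigned int toy_cleaned;
static bool toy_prepare(unsigned long e) { return (e & 1UL) != 0; }
static bool toy_still_eligible(unsigned long e) { return e != 7; }
static void toy_cleanup(unsigned long e) { (void)e; toy_cleaned++; }

static const struct toy_batch_ops toy_ops = {
	.prepare = toy_prepare,
	.still_eligible = toy_still_eligible,
	.cleanup = toy_cleanup,
};
```

Whether an indirection like this would be cleaner than the three open-coded
loops is exactly the churn-versus-benefit question raised above; the sketch
only shows that the steps can be factored this way.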
> > @@ -409,8 +445,8 @@ void truncate_inode_pages_range(struct address_space *mapping,
> >  			}
> >  
> >  			if (radix_tree_exceptional_entry(page)) {
> > -				truncate_exceptional_entry(mapping, index,
> > -								page);
> > +				if (ei != PAGEVEC_SIZE)
> > +					ei = i;
>
> This should be ei == PAGEVEC_SIZE I think.
>
> Otherwise the patch looks good to me so feel free to add:
>
Fixed.
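
For illustration, the corrected check can be exercised in userspace. This is
a toy sketch under stated assumptions: ei is assumed to start at
PAGEVEC_SIZE as a "nothing found yet" sentinel (the initialisation is not
shown in the quoted hunk), the value 14 is only illustrative, and
toy_first_exceptional() and the odd-value convention are hypothetical
stand-ins.

```c
#include <assert.h>

#define PAGEVEC_SIZE 14	/* illustrative value for this sketch */

/* Assumption for this sketch: odd values are "exceptional". */
static int toy_is_exceptional(unsigned long entry)
{
	return (entry & 1UL) != 0;
}

/*
 * Return the index of the first exceptional entry, or PAGEVEC_SIZE
 * if the vector holds none.
 */
static int toy_first_exceptional(const unsigned long *entries, int nr)
{
	int ei = PAGEVEC_SIZE;	/* sentinel: nothing found yet */
	int i;

	assert(nr <= PAGEVEC_SIZE);

	for (i = 0; i < nr; i++) {
		if (!toy_is_exceptional(entries[i]))
			continue;
		/* Record only the first hit; later hits must not move ei. */
		if (ei == PAGEVEC_SIZE)
			ei = i;
	}
	return ei;
}
```

Under the same assumption, the original `ei != PAGEVEC_SIZE` test is false
on the first hit and on every later one, so ei would never leave the
sentinel value.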
> Reviewed-by: Jan Kara <jack@suse.cz>
Thanks
--
Mel Gorman
SUSE Labs