From: Ryan Roberts <ryan.roberts@arm.com>
To: Matthew Wilcox <willy@infradead.org>
Cc: Will Deacon <will@kernel.org>,
"Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>,
Andrew Morton <akpm@linux-foundation.org>,
Nick Piggin <npiggin@gmail.com>,
Peter Zijlstra <peterz@infradead.org>,
Christian Borntraeger <borntraeger@linux.ibm.com>,
Sven Schnelle <svens@linux.ibm.com>,
Arnd Bergmann <arnd@arndb.de>,
David Hildenbrand <david@redhat.com>, Yu Zhao <yuzhao@google.com>,
"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
Yin Fengwei <fengwei.yin@intel.com>,
Yang Shi <shy828301@gmail.com>,
"Huang, Ying" <ying.huang@intel.com>, Zi Yan <ziy@nvidia.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 1/5] mm: Implement folio_remove_rmap_range()
Date: Wed, 30 Aug 2023 16:42:07 +0100 [thread overview]
Message-ID: <1696b47d-fcaa-4d60-b9b2-3f2178127dcb@arm.com> (raw)
In-Reply-To: <ZO9XgttGXnx5dxFZ@casper.infradead.org>
On 30/08/2023 15:51, Matthew Wilcox wrote:
> On Wed, Aug 30, 2023 at 10:50:07AM +0100, Ryan Roberts wrote:
>> Like page_remove_rmap() but batch-removes the rmap for a range of pages
>> belonging to a folio. This can provide a small speedup due to less
>> manipuation of the various counters. But more crucially, if removing the
>> rmap for all pages of a folio in a batch, there is no need to
>> (spuriously) add it to the deferred split list, which saves significant
>> cost when there is contention for the split queue lock.
>>
>> All contained pages are accounted using the order-0 folio (or base page)
>> scheme.
>>
>> page_remove_rmap() is refactored so that it forwards to
>> folio_remove_rmap_range() for !compound cases, and both functions now
>> share a common epilogue function. The intention here is to avoid
>> duplication of code.
>
> What would you think to doing it like this instead? This probably doesn't
> even compile and it's definitely not sanity checked; just trying to get
> across an idea of the shape of this code. I think this is more like
> what DavidH was asking for (but he's on holiday this week so won't be
> able to confirm).
I think it was actually Yu Zhou who was arguing for something more like this?
I don't love it, because as I've expressed previously, I don't think an API that
operates on a "range" of pages has any business manipulating a PMD mapping,
because it can't be subdivided. So my preference is for what I've already got.
But certainly your proposal to use nr=-1 to mean compound is the nicest way I've
seen since you no longer make nr redundant in this case.
Given you and Yu both have a lot more experience with this code than me, I'll
follow your advice and do it this way for the next rev.
Thanks,
Ryan
>
>
> diff --git a/include/linux/rmap.h b/include/linux/rmap.h
> index a3825ce81102..d442d1e5425d 100644
> --- a/include/linux/rmap.h
> +++ b/include/linux/rmap.h
> @@ -202,6 +202,8 @@ void folio_add_file_rmap_range(struct folio *, struct page *, unsigned int nr,
> struct vm_area_struct *, bool compound);
> void page_remove_rmap(struct page *, struct vm_area_struct *,
> bool compound);
> +void folio_remove_rmap_range(struct folio *folio, struct page *page,
> + int nr, struct vm_area_struct *vma);
>
> void hugepage_add_anon_rmap(struct page *, struct vm_area_struct *,
> unsigned long address, rmap_t flags);
> diff --git a/mm/rmap.c b/mm/rmap.c
> index ec7f8e6c9e48..2592be47452e 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -1380,24 +1380,26 @@ void page_add_file_rmap(struct page *page, struct vm_area_struct *vma,
> }
>
> /**
> - * page_remove_rmap - take down pte mapping from a page
> - * @page: page to remove mapping from
> - * @vma: the vm area from which the mapping is removed
> - * @compound: uncharge the page as compound or small page
> + * folio_remove_rmap_range - Take down PTE mappings from a range of pages.
> + * @folio: Folio containing all pages in range.
> + * @page: First page in range to unmap.
> + * @nr: Number of pages to unmap. -1 to unmap a PMD.
> + * @vma: The VM area containing the range.
> *
> - * The caller needs to hold the pte lock.
> + * All pages in the range must belong to the same VMA & folio.
> + *
> + * Context: Caller holds the pte lock.
> */
> -void page_remove_rmap(struct page *page, struct vm_area_struct *vma,
> - bool compound)
> +void folio_remove_rmap_range(struct folio *folio, struct page *page,
> + int pages, struct vm_area_struct *vma)
> {
> - struct folio *folio = page_folio(page);
> atomic_t *mapped = &folio->_nr_pages_mapped;
> + int nr_unmapped = 0;
> + int nr_mapped = 0;
> int nr = 0, nr_pmdmapped = 0;
> bool last;
> enum node_stat_item idx;
>
> - VM_BUG_ON_PAGE(compound && !PageHead(page), page);
> -
> /* Hugetlb pages are not counted in NR_*MAPPED */
> if (unlikely(folio_test_hugetlb(folio))) {
> /* hugetlb pages are always mapped with pmds */
> @@ -1405,14 +1407,25 @@ void page_remove_rmap(struct page *page, struct vm_area_struct *vma,
> return;
> }
>
> - /* Is page being unmapped by PTE? Is this its last map to be removed? */
> - if (likely(!compound)) {
> - last = atomic_add_negative(-1, &page->_mapcount);
> - nr = last;
> - if (last && folio_test_large(folio)) {
> - nr = atomic_dec_return_relaxed(mapped);
> - nr = (nr < COMPOUND_MAPPED);
> + /* Are we taking down a PMD mapping? */
> + if (likely(pages > 0)) {
> + VM_WARN_ON_ONCE(page < folio_page(folio, 0) ||
> + page + pages > folio_page(folio,
> + folio_nr_pages(folio)));
> + while (pages) {
> + /* Is this the page's last map to be removed? */
> + last = atomic_add_negative(-1, &page->_mapcount);
> + if (last)
> + nr_unmapped++;
> + pages--;
> + page++;
> }
> +
> + /* Pages still mapped if folio mapped entirely */
> + nr_mapped = atomic_sub_return_relaxed(nr_unmapped, mapped);
> + if (nr_mapped >= COMPOUND_MAPPED)
> + nr_unmapped = 0;
> +
> } else if (folio_test_pmd_mappable(folio)) {
> /* That test is redundant: it's for safety or to optimize out */
>
> @@ -1441,18 +1454,19 @@ void page_remove_rmap(struct page *page, struct vm_area_struct *vma,
> idx = NR_FILE_PMDMAPPED;
> __lruvec_stat_mod_folio(folio, idx, -nr_pmdmapped);
> }
> +
> if (nr) {
> idx = folio_test_anon(folio) ? NR_ANON_MAPPED : NR_FILE_MAPPED;
> __lruvec_stat_mod_folio(folio, idx, -nr);
>
> /*
> - * Queue anon THP for deferred split if at least one
> - * page of the folio is unmapped and at least one page
> - * is still mapped.
> + * Queue large anon folio for deferred split if at least one
> + * page of the folio is unmapped and at least one page is still
> + * mapped.
> */
> - if (folio_test_pmd_mappable(folio) && folio_test_anon(folio))
> - if (!compound || nr < nr_pmdmapped)
> - deferred_split_folio(folio);
> + if (folio_test_large(folio) && folio_test_anon(folio) &&
> + nr_mapped)
> + deferred_split_folio(folio);
> }
>
> /*
> @@ -1466,6 +1480,20 @@ void page_remove_rmap(struct page *page, struct vm_area_struct *vma,
> munlock_vma_folio(folio, vma, compound);
> }
>
> +/**
> + * page_remove_rmap - take down pte mapping from a page
> + * @page: page to remove mapping from
> + * @vma: the vm area from which the mapping is removed
> + * @compound: uncharge the page as compound or small page
> + *
> + * The caller needs to hold the pte lock.
> + */
> +void page_remove_rmap(struct page *page, struct vm_area_struct *vma,
> + bool compound)
> +{
> + folio_remove_rmap_range(page_folio(page), page, compound ? -1 : 1, vma);
> +}
> +
> /*
> * @arg: enum ttu_flags will be passed to this argument
> */
next prev parent reply other threads:[~2023-08-30 15:42 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-08-30 9:50 [PATCH v2 0/5] Optimize mmap_exit for large folios Ryan Roberts
2023-08-30 9:50 ` [PATCH v2 1/5] mm: Implement folio_remove_rmap_range() Ryan Roberts
2023-08-30 14:51 ` Matthew Wilcox
2023-08-30 15:42 ` Ryan Roberts [this message]
2023-08-30 16:24 ` David Hildenbrand
2023-08-30 9:50 ` [PATCH v2 2/5] mm/mmu_gather: generalize mmu_gather rmap removal mechanism Ryan Roberts
2023-08-30 9:50 ` [PATCH v2 3/5] mm/mmu_gather: Remove encoded_page infrastructure Ryan Roberts
2023-08-30 9:50 ` [PATCH v2 4/5] mm: Refector release_pages() Ryan Roberts
2023-08-30 19:11 ` Matthew Wilcox
2023-08-30 21:17 ` Matthew Wilcox
2023-08-31 19:57 ` Ryan Roberts
2023-09-01 4:36 ` Matthew Wilcox
2023-08-30 9:50 ` [PATCH v2 5/5] mm/mmu_gather: Store and process pages in contig ranges Ryan Roberts
2023-08-30 15:07 ` Matthew Wilcox
2023-08-30 15:32 ` Ryan Roberts
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1696b47d-fcaa-4d60-b9b2-3f2178127dcb@arm.com \
--to=ryan.roberts@arm.com \
--cc=akpm@linux-foundation.org \
--cc=aneesh.kumar@linux.ibm.com \
--cc=arnd@arndb.de \
--cc=borntraeger@linux.ibm.com \
--cc=david@redhat.com \
--cc=fengwei.yin@intel.com \
--cc=kirill.shutemov@linux.intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=npiggin@gmail.com \
--cc=peterz@infradead.org \
--cc=shy828301@gmail.com \
--cc=svens@linux.ibm.com \
--cc=will@kernel.org \
--cc=willy@infradead.org \
--cc=ying.huang@intel.com \
--cc=yuzhao@google.com \
--cc=ziy@nvidia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).