All of lore.kernel.org
 help / color / mirror / Atom feed
From: Zi Yan <ziy@nvidia.com>
To: Huang Ying <ying.huang@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Yang Shi <shy828301@gmail.com>,
	Baolin Wang <baolin.wang@linux.alibaba.com>,
	Oscar Salvador <osalvador@suse.de>,
	Matthew Wilcox <willy@infradead.org>,
	Bharata B Rao <bharata@amd.com>,
	Alistair Popple <apopple@nvidia.com>,
	haoxin <xhao@linux.alibaba.com>
Subject: Re: [PATCH 8/8] migrate_pages: batch flushing TLB
Date: Tue, 03 Jan 2023 14:19:23 -0500	[thread overview]
Message-ID: <D8F02562-AA7E-4868-BE58-F1144728A352@nvidia.com> (raw)
In-Reply-To: <20221227002859.27740-9-ying.huang@intel.com>

[-- Attachment #1: Type: text/plain, Size: 4490 bytes --]

On 26 Dec 2022, at 19:28, Huang Ying wrote:

> The TLB flushing will cost quite some CPU cycles during the folio
> migration in some situations.  For example, when migrate a folio of a
> process with multiple active threads that run on multiple CPUs.  After
> batching the _unmap and _move in migrate_pages(), the TLB flushing can
> be batched easily with the existing TLB flush batching mechanism.
> This patch implements that.
>
> We use the following test case to test the patch.
>
> On a 2-socket Intel server,
>
> - Run pmbench memory accessing benchmark
>
> - Run `migratepages` to migrate pages of pmbench between node 0 and
>   node 1 back and forth.
>
> With the patch, the TLB flushing IPI reduces 99.1% during the test and
> the number of pages migrated successfully per second increases 291.7%.
>
> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
> Cc: Zi Yan <ziy@nvidia.com>
> Cc: Yang Shi <shy828301@gmail.com>
> Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: Matthew Wilcox <willy@infradead.org>
> Cc: Bharata B Rao <bharata@amd.com>
> Cc: Alistair Popple <apopple@nvidia.com>
> Cc: haoxin <xhao@linux.alibaba.com>
> ---
>  mm/migrate.c |  4 +++-
>  mm/rmap.c    | 20 +++++++++++++++++---
>  2 files changed, 20 insertions(+), 4 deletions(-)
>
> diff --git a/mm/migrate.c b/mm/migrate.c
> index 70a40b8fee1f..d7413164e748 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -1215,7 +1215,7 @@ static int migrate_folio_unmap(new_page_t get_new_page, free_page_t put_new_page
>  		/* Establish migration ptes */
>  		VM_BUG_ON_FOLIO(folio_test_anon(src) &&
>  			       !folio_test_ksm(src) && !anon_vma, src);
> -		try_to_migrate(src, 0);
> +		try_to_migrate(src, TTU_BATCH_FLUSH);
>  		page_was_mapped = 1;
>  	}
>
> @@ -1732,6 +1732,8 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
>  	stats->nr_thp_failed += thp_retry;
>  	stats->nr_failed_pages += nr_retry_pages;
>  move:
> +	try_to_unmap_flush();
> +
>  	retry = 1;
>  	for (pass = 0; pass < 10 && (retry || large_retry); pass++) {
>  		retry = 0;
> diff --git a/mm/rmap.c b/mm/rmap.c
> index b616870a09be..2e125f3e462e 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -1976,7 +1976,21 @@ static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma,
>  		} else {
>  			flush_cache_page(vma, address, pte_pfn(*pvmw.pte));
>  			/* Nuke the page table entry. */
> -			pteval = ptep_clear_flush(vma, address, pvmw.pte);
> +			if (should_defer_flush(mm, flags)) {
> +				/*
> +				 * We clear the PTE but do not flush so potentially
> +				 * a remote CPU could still be writing to the folio.
> +				 * If the entry was previously clean then the
> +				 * architecture must guarantee that a clear->dirty
> +				 * transition on a cached TLB entry is written through
> +				 * and traps if the PTE is unmapped.
> +				 */
> +				pteval = ptep_get_and_clear(mm, address, pvmw.pte);
> +
> +				set_tlb_ubc_flush_pending(mm, pte_dirty(pteval));
> +			} else {
> +				pteval = ptep_clear_flush(vma, address, pvmw.pte);
> +			}
>  		}
>

This is only for PTE mapped pages, right? We also need something similar
in set_pmd_migration_entry() in mm/huge_memory.c for PMD-mapped THPs.
Oh, since you limit NR_MAX_BATCHED_MIGRATION to HPAGE_PMD_NR and count
nr_pages with folio_nr_pages(), THPs will only be migrated one by one.
This is not obvious from the cover letter.

Are you planning to support batched THP migration? If not, it might be
better to update cover letter to be explicit about it and add comments
in migrate_pages(). It would be nice to also note that we need to
increase NR_MAX_BATCHED_MIGRATION beyond HPAGE_PMD_NR and make similar
changes in set_pmd_migration_entry() to get batched THP migration support.

>  		/* Set the dirty flag on the folio now the pte is gone. */
> @@ -2148,10 +2162,10 @@ void try_to_migrate(struct folio *folio, enum ttu_flags flags)
>
>  	/*
>  	 * Migration always ignores mlock and only supports TTU_RMAP_LOCKED and
> -	 * TTU_SPLIT_HUGE_PMD and TTU_SYNC flags.
> +	 * TTU_SPLIT_HUGE_PMD, TTU_SYNC, and TTU_BATCH_FLUSH flags.
>  	 */
>  	if (WARN_ON_ONCE(flags & ~(TTU_RMAP_LOCKED | TTU_SPLIT_HUGE_PMD |
> -					TTU_SYNC)))
> +					TTU_SYNC | TTU_BATCH_FLUSH)))
>  		return;
>
>  	if (folio_is_zone_device(folio) &&
> -- 
> 2.35.1


--
Best Regards,
Yan, Zi

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 854 bytes --]

  reply	other threads:[~2023-01-03 19:23 UTC|newest]

Thread overview: 39+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-12-27  0:28 [PATCH 0/8] migrate_pages(): batch TLB flushing Huang Ying
2022-12-27  0:28 ` [PATCH 1/8] migrate_pages: organize stats with struct migrate_pages_stats Huang Ying
2023-01-03 18:06   ` Zi Yan
2023-01-05  3:02   ` Alistair Popple
2023-01-05  5:53     ` Huang, Ying
2023-01-05  6:50       ` Alistair Popple
2023-01-05  7:06         ` Huang, Ying
2022-12-27  0:28 ` [PATCH 2/8] migrate_pages: separate hugetlb folios migration Huang Ying
2022-12-28 23:17   ` Andrew Morton
2023-01-02 23:53     ` Huang, Ying
2023-01-05  4:13   ` Alistair Popple
2023-01-05  5:51     ` Huang, Ying
2023-01-05  6:43       ` Alistair Popple
2023-01-05  7:31         ` Huang, Ying
2023-01-05  7:39           ` Alistair Popple
2023-01-09  7:23             ` Huang, Ying
2023-01-10  1:37               ` Alistair Popple
2022-12-27  0:28 ` [PATCH 3/8] migrate_pages: restrict number of pages to migrate in batch Huang Ying
2023-01-03 18:40   ` Zi Yan
2023-01-04  0:24     ` Huang, Ying
2022-12-27  0:28 ` [PATCH 4/8] migrate_pages: split unmap_and_move() to _unmap() and _move() Huang Ying
2023-01-03 18:55   ` Zi Yan
2023-01-05 18:26   ` Nathan Chancellor
2023-01-05 18:57     ` Kees Cook
2023-01-08 23:33       ` Huang, Ying
2022-12-27  0:28 ` [PATCH 5/8] migrate_pages: batch _unmap and _move Huang Ying
2022-12-28 23:22   ` Andrew Morton
2023-01-02 23:29     ` Huang, Ying
2023-01-03 19:01   ` Zi Yan
2023-01-04  0:34     ` Huang, Ying
2022-12-27  0:28 ` [PATCH 6/8] migrate_pages: move migrate_folio_done() and migrate_folio_unmap() Huang Ying
2023-01-03 19:02   ` Zi Yan
2023-01-04  1:26     ` Huang, Ying
2022-12-27  0:28 ` [PATCH 7/8] migrate_pages: share more code between _unmap and _move Huang Ying
2023-01-04  7:12   ` Alistair Popple
2023-01-06  4:15     ` Huang, Ying
2022-12-27  0:28 ` [PATCH 8/8] migrate_pages: batch flushing TLB Huang Ying
2023-01-03 19:19   ` Zi Yan [this message]
2023-01-04  1:41     ` Huang, Ying

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=D8F02562-AA7E-4868-BE58-F1144728A352@nvidia.com \
    --to=ziy@nvidia.com \
    --cc=akpm@linux-foundation.org \
    --cc=apopple@nvidia.com \
    --cc=baolin.wang@linux.alibaba.com \
    --cc=bharata@amd.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=osalvador@suse.de \
    --cc=shy828301@gmail.com \
    --cc=willy@infradead.org \
    --cc=xhao@linux.alibaba.com \
    --cc=ying.huang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.