[v2,6/7] mm: migrate: check mapcount for THP instead of ref count

Message ID 20210413212416.3273-7-shy828301@gmail.com
State New
Series
  • mm: thp: use generic THP migration for NUMA hinting fault

Commit Message

Yang Shi April 13, 2021, 9:24 p.m. UTC
The generic migration path will check the refcount, so there is no need to check it here.
But the old code actually prevented migrating shared THPs (mapped by multiple
processes), so bail out early if mapcount is > 1 to keep that behavior.

Signed-off-by: Yang Shi <shy828301@gmail.com>
---
 mm/migrate.c | 16 ++++------------
 1 file changed, 4 insertions(+), 12 deletions(-)
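The behavioral difference the patch preserves can be sketched as a small userspace model (illustrative only; `struct page_model` and both helpers are hypothetical stand-ins, not kernel code). The old code rejected any THP whose post-isolation refcount was not exactly 3, which also rejects pinned pages; the new code only rejects THPs mapped more than once and defers all refcount checking to the generic migration path.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page, for illustration only. */
struct page_model {
	int mapcount;   /* number of page-table mappings */
	int refcount;   /* total references, including pins */
	bool is_thp;
};

/* Old behavior: after isolation, expect exactly 3 references
 * (1 for the mapping, 1 for the caller's pin, 1 taken by
 * isolate_lru_page()); anything else blocks migration here. */
static bool old_check_allows_migration(const struct page_model *p)
{
	if (p->is_thp && p->refcount != 3)
		return false;
	return true;
}

/* New behavior: bail out early only if the THP is mapped by more
 * than one process; refcount checking (e.g. for GUP pins) is left
 * to the generic migration path. */
static bool new_check_allows_migration(const struct page_model *p)
{
	if (p->is_thp && p->mapcount > 1)
		return false;
	return true;
}
```

A THP with an extra pin (mapcount 1, refcount 4) is rejected outright by the old check, while the new check lets it proceed and relies on the generic path to fail the migration; a shared THP (mapcount > 1) is rejected by both.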

Comments

Huang, Ying April 14, 2021, 3 a.m. UTC | #1
Yang Shi <shy828301@gmail.com> writes:

> The generic migration path will check refcount, so no need check refcount here.
> But the old code actually prevents from migrating shared THP (mapped by multiple
> processes), so bail out early if mapcount is > 1 to keep the behavior.

What prevents us from migrating shared THP?  If nothing does, why not just
remove the old refcount check?

Best Regards,
Huang, Ying

> Signed-off-by: Yang Shi <shy828301@gmail.com>
> ---
>  mm/migrate.c | 16 ++++------------
>  1 file changed, 4 insertions(+), 12 deletions(-)
>
> diff --git a/mm/migrate.c b/mm/migrate.c
> index a72994c68ec6..dc7cc7f3a124 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -2067,6 +2067,10 @@ static int numamigrate_isolate_page(pg_data_t *pgdat, struct page *page)
>  
>  	VM_BUG_ON_PAGE(compound_order(page) && !PageTransHuge(page), page);
>  
> +	/* Do not migrate THP mapped by multiple processes */
> +	if (PageTransHuge(page) && page_mapcount(page) > 1)
> +		return 0;
> +
>  	/* Avoid migrating to a node that is nearly full */
>  	if (!migrate_balanced_pgdat(pgdat, compound_nr(page)))
>  		return 0;
> @@ -2074,18 +2078,6 @@ static int numamigrate_isolate_page(pg_data_t *pgdat, struct page *page)
>  	if (isolate_lru_page(page))
>  		return 0;
>  
> -	/*
> -	 * migrate_misplaced_transhuge_page() skips page migration's usual
> -	 * check on page_count(), so we must do it here, now that the page
> -	 * has been isolated: a GUP pin, or any other pin, prevents migration.
> -	 * The expected page count is 3: 1 for page's mapcount and 1 for the
> -	 * caller's pin and 1 for the reference taken by isolate_lru_page().
> -	 */
> -	if (PageTransHuge(page) && page_count(page) != 3) {
> -		putback_lru_page(page);
> -		return 0;
> -	}
> -
>  	page_lru = page_is_file_lru(page);
>  	mod_node_page_state(page_pgdat(page), NR_ISOLATED_ANON + page_lru,
>  				thp_nr_pages(page));
Zi Yan April 14, 2021, 3:02 p.m. UTC | #2
On 13 Apr 2021, at 23:00, Huang, Ying wrote:

> Yang Shi <shy828301@gmail.com> writes:
>
>> The generic migration path will check refcount, so no need check refcount here.
>> But the old code actually prevents from migrating shared THP (mapped by multiple
>> processes), so bail out early if mapcount is > 1 to keep the behavior.
>
> What prevents us from migrating shared THP?  If no, why not just remove
> the old refcount checking?

If two or more processes run on different NUMA nodes, a THP shared by them can be
migrated back and forth between the nodes, which is quite costly. Unless we have
a better way of figuring out a good location for such pages to reduce the number
of migrations, it might be better not to move them, right?
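The ping-pong effect described above can be shown with a toy model (purely illustrative; `fault_and_maybe_migrate` is a hypothetical helper, not kernel code): when each NUMA hinting fault migrates the page toward the faulting task, a page shared by tasks pinned to different nodes is migrated on every alternating access.

```c
#include <assert.h>

static int migrations;	/* count of migrations performed */

/* Toy NUMA hinting fault: if the page is not on the faulting task's
 * node, "migrate" it there and return its new node. */
static int fault_and_maybe_migrate(int page_node, int task_node)
{
	if (page_node != task_node) {
		migrations++;
		return task_node;	/* page moves to the faulting node */
	}
	return page_node;
}
```

With two tasks on nodes 0 and 1 taking turns, every single access triggers a migration, so the cost grows linearly with the number of accesses instead of settling down.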

>
> Best Regards,
> Huang, Ying
>
>> Signed-off-by: Yang Shi <shy828301@gmail.com>
>> ---
>>  mm/migrate.c | 16 ++++------------
>>  1 file changed, 4 insertions(+), 12 deletions(-)
>>
>> diff --git a/mm/migrate.c b/mm/migrate.c
>> index a72994c68ec6..dc7cc7f3a124 100644
>> --- a/mm/migrate.c
>> +++ b/mm/migrate.c
>> @@ -2067,6 +2067,10 @@ static int numamigrate_isolate_page(pg_data_t *pgdat, struct page *page)
>>
>>  	VM_BUG_ON_PAGE(compound_order(page) && !PageTransHuge(page), page);
>>
>> +	/* Do not migrate THP mapped by multiple processes */
>> +	if (PageTransHuge(page) && page_mapcount(page) > 1)
>> +		return 0;
>> +
>>  	/* Avoid migrating to a node that is nearly full */
>>  	if (!migrate_balanced_pgdat(pgdat, compound_nr(page)))
>>  		return 0;
>> @@ -2074,18 +2078,6 @@ static int numamigrate_isolate_page(pg_data_t *pgdat, struct page *page)
>>  	if (isolate_lru_page(page))
>>  		return 0;
>>
>> -	/*
>> -	 * migrate_misplaced_transhuge_page() skips page migration's usual
>> -	 * check on page_count(), so we must do it here, now that the page
>> -	 * has been isolated: a GUP pin, or any other pin, prevents migration.
>> -	 * The expected page count is 3: 1 for page's mapcount and 1 for the
>> -	 * caller's pin and 1 for the reference taken by isolate_lru_page().
>> -	 */
>> -	if (PageTransHuge(page) && page_count(page) != 3) {
>> -		putback_lru_page(page);
>> -		return 0;
>> -	}
>> -
>>  	page_lru = page_is_file_lru(page);
>>  	mod_node_page_state(page_pgdat(page), NR_ISOLATED_ANON + page_lru,
>>  				thp_nr_pages(page));


—
Best Regards,
Yan Zi
Yang Shi April 14, 2021, 5:23 p.m. UTC | #3
On Tue, Apr 13, 2021 at 8:00 PM Huang, Ying <ying.huang@intel.com> wrote:
>
> Yang Shi <shy828301@gmail.com> writes:
>
> > The generic migration path will check refcount, so no need check refcount here.
> > But the old code actually prevents from migrating shared THP (mapped by multiple
> > processes), so bail out early if mapcount is > 1 to keep the behavior.
>
> What prevents us from migrating shared THP?  If no, why not just remove
> the old refcount checking?

We could migrate shared THPs if we didn't care about them bouncing back and
forth between nodes as Zi Yan described. The other reason is, as I
mentioned in the cover letter, I'd like to keep the behavior before and
after the series as consistent as possible for now. The old behavior does
prevent migrating shared THPs, so I did the same in this series. We could
definitely optimize the behavior later on.

>
> Best Regards,
> Huang, Ying
>
> > Signed-off-by: Yang Shi <shy828301@gmail.com>
> > ---
> >  mm/migrate.c | 16 ++++------------
> >  1 file changed, 4 insertions(+), 12 deletions(-)
> >
> > diff --git a/mm/migrate.c b/mm/migrate.c
> > index a72994c68ec6..dc7cc7f3a124 100644
> > --- a/mm/migrate.c
> > +++ b/mm/migrate.c
> > @@ -2067,6 +2067,10 @@ static int numamigrate_isolate_page(pg_data_t *pgdat, struct page *page)
> >
> >       VM_BUG_ON_PAGE(compound_order(page) && !PageTransHuge(page), page);
> >
> > +     /* Do not migrate THP mapped by multiple processes */
> > +     if (PageTransHuge(page) && page_mapcount(page) > 1)
> > +             return 0;
> > +
> >       /* Avoid migrating to a node that is nearly full */
> >       if (!migrate_balanced_pgdat(pgdat, compound_nr(page)))
> >               return 0;
> > @@ -2074,18 +2078,6 @@ static int numamigrate_isolate_page(pg_data_t *pgdat, struct page *page)
> >       if (isolate_lru_page(page))
> >               return 0;
> >
> > -     /*
> > -      * migrate_misplaced_transhuge_page() skips page migration's usual
> > -      * check on page_count(), so we must do it here, now that the page
> > -      * has been isolated: a GUP pin, or any other pin, prevents migration.
> > -      * The expected page count is 3: 1 for page's mapcount and 1 for the
> > -      * caller's pin and 1 for the reference taken by isolate_lru_page().
> > -      */
> > -     if (PageTransHuge(page) && page_count(page) != 3) {
> > -             putback_lru_page(page);
> > -             return 0;
> > -     }
> > -
> >       page_lru = page_is_file_lru(page);
> >       mod_node_page_state(page_pgdat(page), NR_ISOLATED_ANON + page_lru,
> >                               thp_nr_pages(page));
Huang, Ying April 15, 2021, 6:45 a.m. UTC | #4
"Zi Yan" <ziy@nvidia.com> writes:

> On 13 Apr 2021, at 23:00, Huang, Ying wrote:
>
>> Yang Shi <shy828301@gmail.com> writes:
>>
>>> The generic migration path will check refcount, so no need check refcount here.
>>> But the old code actually prevents from migrating shared THP (mapped by multiple
>>> processes), so bail out early if mapcount is > 1 to keep the behavior.
>>
>> What prevents us from migrating shared THP?  If no, why not just remove
>> the old refcount checking?
>
> If two or more processes are in different NUMA nodes, a THP shared by them can be
> migrated back and forth between NUMA nodes, which is quite costly. Unless we have
> a better way of figuring out a good location for such pages to reduce the number
> of migration, it might be better not to move them, right?
>

A mechanism is already provided in should_numa_migrate_memory() to
distinguish shared pages from private pages.  Have you found that it
doesn't work well in some situations?

Multiple threads in one process running on different NUMA nodes may share
pages too, so excluding only pages shared by multiple processes isn't a
good solution.
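The private-vs-shared heuristic being referred to can be sketched roughly as follows (a simplified model under stated assumptions, not the kernel's exact logic: `struct page_hint` and `last_task` are hypothetical stand-ins for the page's last-CPUPID tracking). The idea is to remember which task last faulted on the page and treat a repeat fault from the same task as a private access worth migrating for.

```c
#include <assert.h>

/* Hypothetical stand-in for the per-page last-accessor record. */
struct page_hint {
	int last_task;	/* id of the task that last faulted here, -1 if none */
};

/* Treat a fault as "private" (a migration candidate) only when the
 * same task that touched the page last faults on it again; otherwise
 * the page looks shared and migration is skipped. */
static int fault_looks_private(struct page_hint *p, int task_id)
{
	int last = p->last_task;

	p->last_task = task_id;	/* record this accessor for the next fault */
	return last == task_id;
}
```

Note that because this keys on the accessing task rather than on mapcount, it also detects sharing between threads of one process, which is the case a mapcount-based check cannot see.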

Best Regards,
Huang, Ying
Zi Yan April 15, 2021, 6:57 p.m. UTC | #5
On 15 Apr 2021, at 2:45, Huang, Ying wrote:

> "Zi Yan" <ziy@nvidia.com> writes:
>
>> On 13 Apr 2021, at 23:00, Huang, Ying wrote:
>>
>>> Yang Shi <shy828301@gmail.com> writes:
>>>
>>>> The generic migration path will check refcount, so no need check refcount here.
>>>> But the old code actually prevents from migrating shared THP (mapped by multiple
>>>> processes), so bail out early if mapcount is > 1 to keep the behavior.
>>>
>>> What prevents us from migrating shared THP?  If no, why not just remove
>>> the old refcount checking?
>>
>> If two or more processes are in different NUMA nodes, a THP shared by them can be
>> migrated back and forth between NUMA nodes, which is quite costly. Unless we have
>> a better way of figuring out a good location for such pages to reduce the number
>> of migration, it might be better not to move them, right?
>>
>
> Some mechanism has been provided in should_numa_migrate_memory() to
> identify the shared pages from the private pages.  Do you find it
> doesn't work well in some situations?
>
> The multiple threads in one process which run on different NUMA nodes
> may share pages too.  So it isn't a good solution to exclude pages
> shared by multiple processes.

After rechecking the patch, it seems that not migrating shared THPs here is a
side effect of the original page_count() check, which might not have been
intended and could be worth fixing. But Yang just wants to solve one problem
at a time: simplifying THP NUMA migration. Maybe a separate patch would be
better for both discussing and fixing this problem.


—
Best Regards,
Yan Zi

Patch

diff --git a/mm/migrate.c b/mm/migrate.c
index a72994c68ec6..dc7cc7f3a124 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -2067,6 +2067,10 @@ static int numamigrate_isolate_page(pg_data_t *pgdat, struct page *page)
 
 	VM_BUG_ON_PAGE(compound_order(page) && !PageTransHuge(page), page);
 
+	/* Do not migrate THP mapped by multiple processes */
+	if (PageTransHuge(page) && page_mapcount(page) > 1)
+		return 0;
+
 	/* Avoid migrating to a node that is nearly full */
 	if (!migrate_balanced_pgdat(pgdat, compound_nr(page)))
 		return 0;
@@ -2074,18 +2078,6 @@ static int numamigrate_isolate_page(pg_data_t *pgdat, struct page *page)
 	if (isolate_lru_page(page))
 		return 0;
 
-	/*
-	 * migrate_misplaced_transhuge_page() skips page migration's usual
-	 * check on page_count(), so we must do it here, now that the page
-	 * has been isolated: a GUP pin, or any other pin, prevents migration.
-	 * The expected page count is 3: 1 for page's mapcount and 1 for the
-	 * caller's pin and 1 for the reference taken by isolate_lru_page().
-	 */
-	if (PageTransHuge(page) && page_count(page) != 3) {
-		putback_lru_page(page);
-		return 0;
-	}
-
 	page_lru = page_is_file_lru(page);
 	mod_node_page_state(page_pgdat(page), NR_ISOLATED_ANON + page_lru,
 				thp_nr_pages(page));