From: Alistair Popple <apopple@nvidia.com>
To: "Huang, Ying" <ying.huang@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Zi Yan <ziy@nvidia.com>, Yang Shi <shy828301@gmail.com>,
	Baolin Wang <baolin.wang@linux.alibaba.com>,
	Oscar Salvador <osalvador@suse.de>,
	Matthew Wilcox <willy@infradead.org>,
	Bharata B Rao <bharata@amd.com>, haoxin <xhao@linux.alibaba.com>
Subject: Re: [PATCH 2/8] migrate_pages: separate hugetlb folios migration
Date: Thu, 05 Jan 2023 17:43:05 +1100
Message-ID: <877cy1scg5.fsf@nvidia.com>
In-Reply-To: <87pmbtedfp.fsf@yhuang6-desk2.ccr.corp.intel.com>


"Huang, Ying" <ying.huang@intel.com> writes:

> Alistair Popple <apopple@nvidia.com> writes:
>
>> Huang Ying <ying.huang@intel.com> writes:
>>
>>> This is a preparation patch to batch the folio unmapping and moving
>>> for the non-hugetlb folios.  Based on that we can batch the TLB
>>> shootdown during the folio migration and make it possible to use some
>>> hardware accelerator for the folio copying.
>>>
>>> In this patch the hugetlb folios and non-hugetlb folios migration is
>>> separated in migrate_pages() to make it easy to change the non-hugetlb
>>> folios migration implementation.
>>>
>>> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
>>> Cc: Zi Yan <ziy@nvidia.com>
>>> Cc: Yang Shi <shy828301@gmail.com>
>>> Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
>>> Cc: Oscar Salvador <osalvador@suse.de>
>>> Cc: Matthew Wilcox <willy@infradead.org>
>>> Cc: Bharata B Rao <bharata@amd.com>
>>> Cc: Alistair Popple <apopple@nvidia.com>
>>> Cc: haoxin <xhao@linux.alibaba.com>
>>> ---
>>>  mm/migrate.c | 114 ++++++++++++++++++++++++++++++++++++++++++++-------
>>>  1 file changed, 99 insertions(+), 15 deletions(-)
>>>
>>> diff --git a/mm/migrate.c b/mm/migrate.c
>>> index ec9263a33d38..bdbe73fe2eb7 100644
>>> --- a/mm/migrate.c
>>> +++ b/mm/migrate.c
>>> @@ -1404,6 +1404,87 @@ struct migrate_pages_stats {
>>>  	int nr_thp_split;
>>>  };
>>>  
>>> +static int migrate_hugetlbs(struct list_head *from, new_page_t get_new_page,
>>> +			    free_page_t put_new_page, unsigned long private,
>>> +			    enum migrate_mode mode, int reason,
>>> +			    struct migrate_pages_stats *stats,
>>> +			    struct list_head *ret_folios)
>>> +{
>>> +	int retry = 1;
>>> +	int nr_failed = 0;
>>> +	int nr_retry_pages = 0;
>>> +	int pass = 0;
>>> +	struct folio *folio, *folio2;
>>> +	int rc = 0, nr_pages;
>>> +
>>> +	for (pass = 0; pass < 10 && retry; pass++) {
>>> +		retry = 0;
>>> +		nr_retry_pages = 0;
>>> +
>>> +		list_for_each_entry_safe(folio, folio2, from, lru) {
>>> +			if (!folio_test_hugetlb(folio))
>>> +				continue;
>>> +
>>> +			nr_pages = folio_nr_pages(folio);
>>> +
>>> +			cond_resched();
>>> +
>>> +			rc = unmap_and_move_huge_page(get_new_page,
>>> +						      put_new_page, private,
>>> +						      &folio->page, pass > 2, mode,
>>> +						      reason, ret_folios);
>>> +			/*
>>> +			 * The rules are:
>>> +			 *	Success: hugetlb folio will be put back
>>> +			 *	-EAGAIN: stay on the from list
>>> +			 *	-ENOMEM: stay on the from list
>>> +			 *	-ENOSYS: stay on the from list
>>> +			 *	Other errno: put on ret_folios list
>>> +			 */
>>> +			switch(rc) {
>>> +			case -ENOSYS:
>>> +				/* Hugetlb migration is unsupported */
>>> +				nr_failed++;
>>> +				stats->nr_failed_pages += nr_pages;
>>> +				list_move_tail(&folio->lru, ret_folios);
>>> +				break;
>>> +			case -ENOMEM:
>>> +				/*
>>> +				 * When memory is low, don't bother to try to migrate
>>> +				 * other folios, just exit.
>>> +				 */
>>> +				nr_failed++;
>>
>> This currently isn't relevant for -ENOMEM and I think it would be
>> clearer if it was dropped.
>
> OK.
>
>>> +				stats->nr_failed_pages += nr_pages;
>>
>> Makes sense not to continue migration with low memory, but shouldn't we
>> add the remaining unmigrated hugetlb folios to stats->nr_failed_pages as
>> well? I.e., don't we still have to continue the iteration to find and
>> account for these?
>
> I think nr_failed_pages only counts tried pages.  IIUC, it's the
> original behavior and behavior for non-hugetlb pages too.

Hmm, I agree this seems to be the original behaviour, but that behaviour
seems arbitrary and wrong IMHO. The page failed to migrate, therefore it
should count as such. The fact we didn't even try seems irrelevant.

Indeed it looks like this was introduced because it was confusing to see
no failures even though migrate_pages() was called - see dfef2ef4027b
("mm, migrate: increment fail count on ENOMEM").

But that seems inconsistent - why count this one folio as failed because
of the allocation failure while other folios which would also likely
cause allocation failures don't get counted? Fixing it is probably
outside the scope of this series so I won't insist, but it would be nice
as it could still lead to confusion in some scenarios.
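
To make the accounting difference concrete, here is a minimal userspace
sketch (not kernel code). struct sketch_folio and
failed_pages_on_enomem() are hypothetical names invented for this
illustration. The sketch assumes the folio at index enomem_at is the
hugetlb folio whose allocation failed, and it only compares the two
counting policies discussed above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a folio; not the kernel struct. */
struct sketch_folio {
	bool is_hugetlb;
	int nr_pages;
};

/*
 * Return the number of pages reported as failed when migration stops
 * with -ENOMEM at index enomem_at.
 *
 * count_untried == false: only the folio that was actually tried is
 *                         counted (the behaviour described as original).
 * count_untried == true:  the hugetlb folios that were never attempted
 *                         are counted as failed as well.
 */
static int failed_pages_on_enomem(const struct sketch_folio *folios, size_t n,
				  size_t enomem_at, bool count_untried)
{
	int nr_failed_pages = 0;
	size_t i;

	for (i = enomem_at; i < n; i++) {
		if (!folios[i].is_hugetlb)
			continue;
		nr_failed_pages += folios[i].nr_pages;
		if (!count_untried)
			break;
	}
	return nr_failed_pages;
}
```

With three 512-page hugetlb folios on the list and the failure on the
first one, the first policy reports 512 failed pages while the second
reports 1536.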

[...]

>>> @@ -1462,30 +1549,28 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
>>>  		nr_retry_pages = 0;
>>>  
>>>  		list_for_each_entry_safe(folio, folio2, from, lru) {
>>> +			if (folio_test_hugetlb(folio)) {
>>
>> How do we hit this case? Shouldn't migrate_hugetlbs() have already moved
>> any hugetlb folios off the from list?
>
> Retried hugetlb folios will be kept in from list.

Couldn't migrate_hugetlbs() remove the failing retried pages from the
list on the final pass? That seems cleaner to me.
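
A rough userspace model of that suggestion, for illustration only:
struct sketch_folio, its on_from_list flag and
drain_retried_hugetlbs() are hypothetical and do not exist in
mm/migrate.c. The flag stands in for list membership, so clearing it
models moving a folio from the from list to ret_folios.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a folio; not the kernel struct. */
struct sketch_folio {
	bool is_hugetlb;
	bool on_from_list;	/* false models "moved to ret_folios" */
};

/*
 * Model of what could run after the final retry pass: every hugetlb
 * folio still on the from list is moved off it, so a later loop over
 * the list never sees a hugetlb folio. Returns the number moved.
 */
static size_t drain_retried_hugetlbs(struct sketch_folio *folios, size_t n)
{
	size_t moved = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (folios[i].is_hugetlb && folios[i].on_from_list) {
			folios[i].on_from_list = false;
			moved++;
		}
	}
	return moved;
}
```

If something like this ran at the end of migrate_hugetlbs(), the
folio_test_hugetlb() check in the non-hugetlb loop below would no
longer be needed.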

>>> +				list_move_tail(&folio->lru, &ret_folios);
>>> +				continue;
>>> +			}
>>> +
>>>  			/*
>>>  			 * Large folio statistics is based on the source large
>>>  			 * folio. Capture required information that might get
>>>  			 * lost during migration.
>>>  			 */
>>> -			is_large = folio_test_large(folio) && !folio_test_hugetlb(folio);
>>> +			is_large = folio_test_large(folio);
>>>  			is_thp = is_large && folio_test_pmd_mappable(folio);
>>>  			nr_pages = folio_nr_pages(folio);
>>> +
>>>  			cond_resched();
>>>  
>>> -			if (folio_test_hugetlb(folio))
>>> -				rc = unmap_and_move_huge_page(get_new_page,
>>> -						put_new_page, private,
>>> -						&folio->page, pass > 2, mode,
>>> -						reason,
>>> -						&ret_folios);
>>> -			else
>>> -				rc = unmap_and_move(get_new_page, put_new_page,
>>> -						private, folio, pass > 2, mode,
>>> -						reason, &ret_folios);
>>> +			rc = unmap_and_move(get_new_page, put_new_page,
>>> +					    private, folio, pass > 2, mode,
>>> +					    reason, &ret_folios);
>>>  			/*
>>>  			 * The rules are:
>>> -			 *	Success: non hugetlb folio will be freed, hugetlb
>>> -			 *		 folio will be put back
>>> +			 *	Success: folio will be freed
>>>  			 *	-EAGAIN: stay on the from list
>>>  			 *	-ENOMEM: stay on the from list
>>>  			 *	-ENOSYS: stay on the from list
>>> @@ -1512,7 +1597,6 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
>>>  						stats.nr_thp_split += is_thp;
>>>  						break;
>>>  					}
>>> -				/* Hugetlb migration is unsupported */
>>>  				} else if (!no_split_folio_counting) {
>>>  					nr_failed++;
>>>  				}
>
> Best Regards,
> Huang, Ying

