All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] mm/migrate_device.c: Fix a misleading and out-dated comment
@ 2022-08-25  1:49 Alistair Popple
  2022-08-25 23:51 ` Peter Xu
  0 siblings, 1 reply; 8+ messages in thread
From: Alistair Popple @ 2022-08-25  1:49 UTC (permalink / raw)
  To: linux-mm, Andrew Morton
  Cc: peterx, John Hubbard, Ralph Campbell, Alistair Popple, Peter Xu

Commit ab09243aa95a ("mm/migrate.c: remove MIGRATE_PFN_LOCKED") changed
the way trylock_page() in migrate_vma_collect_pmd() works without
updating the comment. Reword the comment to be less misleading and a
better reflection of what happens.

Signed-off-by: Alistair Popple <apopple@nvidia.com>
Reported-by: Peter Xu <peterx@redhat.com>
Fixes: ab09243aa95a ("mm/migrate.c: remove MIGRATE_PFN_LOCKED")
---
 mm/migrate_device.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/mm/migrate_device.c b/mm/migrate_device.c
index 5052093d0262..0736f846de0b 100644
--- a/mm/migrate_device.c
+++ b/mm/migrate_device.c
@@ -179,9 +179,11 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
 		get_page(page);
 
 		/*
-		 * Optimize for the common case where page is only mapped once
-		 * in one process. If we can lock the page, then we can safely
-		 * set up a special migration page table entry now.
+		 * If we can't lock the page we can't migrate it. If we can it's
+		 * safe to set up a migration entry now. In the common case
+		 * where the page is mapped once in a single process setting up
+		 * the migration entry now is an optimisation to avoid walking
+		 * the rmap later with try_to_migrate().
 		 */
 		if (trylock_page(page)) {
 			bool anon_exclusive;
-- 
2.35.1



^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] mm/migrate_device.c: Fix a misleading and out-dated comment
  2022-08-25  1:49 [PATCH] mm/migrate_device.c: Fix a misleading and out-dated comment Alistair Popple
@ 2022-08-25 23:51 ` Peter Xu
  2022-08-26  0:34   ` Alistair Popple
  0 siblings, 1 reply; 8+ messages in thread
From: Peter Xu @ 2022-08-25 23:51 UTC (permalink / raw)
  To: Alistair Popple
  Cc: linux-mm, Andrew Morton, peterx, John Hubbard, Ralph Campbell

On Thu, Aug 25, 2022 at 11:49:05AM +1000, Alistair Popple wrote:
> Commit ab09243aa95a ("mm/migrate.c: remove MIGRATE_PFN_LOCKED") changed
> the way trylock_page() in migrate_vma_collect_pmd() works without
> updating the comment. Reword the comment to be less misleading and a
> better reflection of what happens.
> 
> Signed-off-by: Alistair Popple <apopple@nvidia.com>
> Reported-by: Peter Xu <peterx@redhat.com>
> Fixes: ab09243aa95a ("mm/migrate.c: remove MIGRATE_PFN_LOCKED")
> ---
>  mm/migrate_device.c | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/mm/migrate_device.c b/mm/migrate_device.c
> index 5052093d0262..0736f846de0b 100644
> --- a/mm/migrate_device.c
> +++ b/mm/migrate_device.c
> @@ -179,9 +179,11 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
>  		get_page(page);
>  
>  		/*
> -		 * Optimize for the common case where page is only mapped once
> -		 * in one process. If we can lock the page, then we can safely
> -		 * set up a special migration page table entry now.
> +		 * If we can't lock the page we can't migrate it. If we can it's
> +		 * safe to set up a migration entry now. In the common case
> +		 * where the page is mapped once in a single process setting up
> +		 * the migration entry now is an optimisation to avoid walking
> +		 * the rmap later with try_to_migrate().
>  		 */

IMHO the last sentence can still be a bit confusing - here we 100% rely on
the trylock() to proceed or we'll stop migration right away. IMHO that
means this is not an optimization, since optimizations should always be
optional but not the case here.

Meanwhile it'll be great to also mention about why trylock is needed and no
further attempt to use lock_page().  The comment in prepare() previously
was great but unfortunately that code clip was removed.

In short, do you think something like this might be clearer?

		/*
		 * We rely on the trylock() to migrate the pte.  If this
		 * fails, we'll fail the migration of this page.  IOW, the
		 * migration is very much best-effort, just like we'll also
		 * bail out if we found page pinned by other users after
		 * page being locked.
		 *
		 * We don't use lock_page() here or even later.  It's
		 * because if to do so we may need to take the locks for
		 * multiple pages one by one, and the order to take the
		 * page locks is unclear.  For example, if two processes
		 * want to lock multiple pages but in different order it
		 * can lead to deadlock.
		 */

Thanks,

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] mm/migrate_device.c: Fix a misleading and out-dated comment
  2022-08-25 23:51 ` Peter Xu
@ 2022-08-26  0:34   ` Alistair Popple
  2022-08-26 15:03     ` Peter Xu
  0 siblings, 1 reply; 8+ messages in thread
From: Alistair Popple @ 2022-08-26  0:34 UTC (permalink / raw)
  To: Peter Xu; +Cc: linux-mm, Andrew Morton, peterx, John Hubbard, Ralph Campbell


Peter Xu <peterx@redhat.com> writes:

> On Thu, Aug 25, 2022 at 11:49:05AM +1000, Alistair Popple wrote:
>> Commit ab09243aa95a ("mm/migrate.c: remove MIGRATE_PFN_LOCKED") changed
>> the way trylock_page() in migrate_vma_collect_pmd() works without
>> updating the comment. Reword the comment to be less misleading and a
>> better reflection of what happens.
>>
>> Signed-off-by: Alistair Popple <apopple@nvidia.com>
>> Reported-by: Peter Xu <peterx@redhat.com>
>> Fixes: ab09243aa95a ("mm/migrate.c: remove MIGRATE_PFN_LOCKED")
>> ---
>>  mm/migrate_device.c | 8 +++++---
>>  1 file changed, 5 insertions(+), 3 deletions(-)
>>
>> diff --git a/mm/migrate_device.c b/mm/migrate_device.c
>> index 5052093d0262..0736f846de0b 100644
>> --- a/mm/migrate_device.c
>> +++ b/mm/migrate_device.c
>> @@ -179,9 +179,11 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
>>  		get_page(page);
>>
>>  		/*
>> -		 * Optimize for the common case where page is only mapped once
>> -		 * in one process. If we can lock the page, then we can safely
>> -		 * set up a special migration page table entry now.
>> +		 * If we can't lock the page we can't migrate it. If we can it's
>> +		 * safe to set up a migration entry now. In the common case
>> +		 * where the page is mapped once in a single process setting up
>> +		 * the migration entry now is an optimisation to avoid walking
>> +		 * the rmap later with try_to_migrate().
>>  		 */
>
> IMHO the last sentence can still be a bit confusing - here we 100% rely on
> the trylock() to proceed or we'll stop migration right away. IMHO that
> means this is not an optimization, since optimizations should always be
> optional but not the case here.

We have to lock the page here, we don't have to install the migration
entries. Installing the migration entries here is optional and is the
optimisation.

> Meanwhile it'll be great to also mention about why trylock is needed and no
> further attempt to use lock_page().  The comment in prepare() previously
> was great but unfortunately that code clip was removed.

Will add.

> In short, do you think something like this might be clearer?

I think it's important to mention the optimisation, otherwise the
temptation is to remove the installation of migration entries here and
rely on try_to_migrate() to do it later. I would actually like to be
able to do that because it simplifies the code in many ways but based on
my testing the optimisation turns out to be very worth while.

> 		/*
> 		 * We rely on the trylock() to migrate the pte.  If this
> 		 * fails, we'll fail the migration of this page.  IOW, the
> 		 * migration is very much best-effort, just like we'll also
> 		 * bail out if we found page pinned by other users after
> 		 * page being locked.

Honestly I think this describes what the code does rather than why and
is likely to become outdated and confusing. IMHO it's quite clear from
the code that the migration will fail here if we can't lock the page.

And the fact that migration can fail is already covered in the existing
documentation. See for example the extensive comment on
migrate_vma_setup() about how the whole flow works. Eg:

 * Once migrate_vma_pages() returns the caller may inspect which pages were
 * successfully migrated, and which were not.  Successfully migrated pages will
 * have the MIGRATE_PFN_MIGRATE flag set for their src array entry.

> 		 *
> 		 * We don't use lock_page() here or even later.  It's
> 		 * because if to do so we may need to take the locks for
> 		 * multiple pages one by one, and the order to take the
> 		 * page locks is unclear.  For example, if two processes
> 		 * want to lock multiple pages but in different order it
> 		 * can lead to deadlock.
> 		 */
>
> Thanks,


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] mm/migrate_device.c: Fix a misleading and out-dated comment
  2022-08-26  0:34   ` Alistair Popple
@ 2022-08-26 15:03     ` Peter Xu
  2022-08-26 17:17       ` David Hildenbrand
  0 siblings, 1 reply; 8+ messages in thread
From: Peter Xu @ 2022-08-26 15:03 UTC (permalink / raw)
  To: Alistair Popple
  Cc: linux-mm, Andrew Morton, peterx, John Hubbard, Ralph Campbell

On Fri, Aug 26, 2022 at 10:34:15AM +1000, Alistair Popple wrote:
> 
> Peter Xu <peterx@redhat.com> writes:
> 
> > On Thu, Aug 25, 2022 at 11:49:05AM +1000, Alistair Popple wrote:
> >> Commit ab09243aa95a ("mm/migrate.c: remove MIGRATE_PFN_LOCKED") changed
> >> the way trylock_page() in migrate_vma_collect_pmd() works without
> >> updating the comment. Reword the comment to be less misleading and a
> >> better reflection of what happens.
> >>
> >> Signed-off-by: Alistair Popple <apopple@nvidia.com>
> >> Reported-by: Peter Xu <peterx@redhat.com>
> >> Fixes: ab09243aa95a ("mm/migrate.c: remove MIGRATE_PFN_LOCKED")
> >> ---
> >>  mm/migrate_device.c | 8 +++++---
> >>  1 file changed, 5 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/mm/migrate_device.c b/mm/migrate_device.c
> >> index 5052093d0262..0736f846de0b 100644
> >> --- a/mm/migrate_device.c
> >> +++ b/mm/migrate_device.c
> >> @@ -179,9 +179,11 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
> >>  		get_page(page);
> >>
> >>  		/*
> >> -		 * Optimize for the common case where page is only mapped once
> >> -		 * in one process. If we can lock the page, then we can safely
> >> -		 * set up a special migration page table entry now.
> >> +		 * If we can't lock the page we can't migrate it. If we can it's
> >> +		 * safe to set up a migration entry now. In the common case
> >> +		 * where the page is mapped once in a single process setting up
> >> +		 * the migration entry now is an optimisation to avoid walking
> >> +		 * the rmap later with try_to_migrate().
> >>  		 */
> >
> > IMHO the last sentence can still be a bit confusing - here we 100% rely on
> > the trylock() to proceed or we'll stop migration right away. IMHO that
> > means this is not an optimization, since optimizations should always be
> > optional but not the case here.
> 
> We have to lock the page here, we don't have to install the migration
> entries. Installing the migration entries here is optional and is the
> optimisation.

I see what you mean now.

> 
> > Meanwhile it'll be great to also mention about why trylock is needed and no
> > further attempt to use lock_page().  The comment in prepare() previously
> > was great but unfortunately that code clip was removed.
> 
> Will add.
> 
> > In short, do you think something like this might be clearer?
> 
> I think it's important to mention the optimisation, otherwise the
> temptation is to remove the installation of migration entries here and
> rely on try_to_migrate() to do it later. I would actually like to be
> able to do that because it simplifies the code in many ways but based on
> my testing the optimisation turns out to be very worth while.
> 
> > 		/*
> > 		 * We rely on the trylock() to migrate the pte.  If this
> > 		 * fails, we'll fail the migration of this page.  IOW, the
> > 		 * migration is very much best-effort, just like we'll also
> > 		 * bail out if we found page pinned by other users after
> > 		 * page being locked.
> 
> Honestly I think this describes what the code does rather than why and
> is likely to become outdated and confusing. IMHO it's quite clear from
> the code that the migration will fail here if we can't lock the page.

If you see that's what I was struggling to understand previously, so not
clear at least to me. :) Since normally a function like page migration
should (from the gut feeling) not rely on trylock only.

> 
> And the fact that migration can fail is already covered in the existing
> documentation. See for example the extensive comment on
> migrate_vma_setup() about how the whole flow works. Eg:
> 
>  * Once migrate_vma_pages() returns the caller may inspect which pages were
>  * successfully migrated, and which were not.  Successfully migrated pages will
>  * have the MIGRATE_PFN_MIGRATE flag set for their src array entry.

IIUC trylock failure could be detected even earlier than that.  E.g. the
ppc code has:

	ret = migrate_vma_setup(&mig);
	if (ret)
		return -1;

	spage = migrate_pfn_to_page(*mig.src);
	if (!spage || !(*mig.src & MIGRATE_PFN_MIGRATE))
		goto out_finalize;

So afaict it won't even reach migrate_vma_pages() by missing of the flag
MIGRATE_PFN_VALID in the src mpfn.

Probably because the migrate_vma_*() API is extensive enough so it's hard
to summarize its usage in a short paragraph.  I'm probably asking too much..

No strong opinion on the rest as long as we can add back some explanation
on why lock_page() is not used.

Thanks again for doing this.

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] mm/migrate_device.c: Fix a misleading and out-dated comment
  2022-08-26 15:03     ` Peter Xu
@ 2022-08-26 17:17       ` David Hildenbrand
  2022-08-26 21:48         ` Peter Xu
  0 siblings, 1 reply; 8+ messages in thread
From: David Hildenbrand @ 2022-08-26 17:17 UTC (permalink / raw)
  To: Peter Xu, Alistair Popple
  Cc: linux-mm, Andrew Morton, peterx, John Hubbard, Ralph Campbell

On 26.08.22 17:03, Peter Xu wrote:
> On Fri, Aug 26, 2022 at 10:34:15AM +1000, Alistair Popple wrote:
>>
>> Peter Xu <peterx@redhat.com> writes:
>>
>>> On Thu, Aug 25, 2022 at 11:49:05AM +1000, Alistair Popple wrote:
>>>> Commit ab09243aa95a ("mm/migrate.c: remove MIGRATE_PFN_LOCKED") changed
>>>> the way trylock_page() in migrate_vma_collect_pmd() works without
>>>> updating the comment. Reword the comment to be less misleading and a
>>>> better reflection of what happens.
>>>>
>>>> Signed-off-by: Alistair Popple <apopple@nvidia.com>
>>>> Reported-by: Peter Xu <peterx@redhat.com>
>>>> Fixes: ab09243aa95a ("mm/migrate.c: remove MIGRATE_PFN_LOCKED")
>>>> ---
>>>>  mm/migrate_device.c | 8 +++++---
>>>>  1 file changed, 5 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/mm/migrate_device.c b/mm/migrate_device.c
>>>> index 5052093d0262..0736f846de0b 100644
>>>> --- a/mm/migrate_device.c
>>>> +++ b/mm/migrate_device.c
>>>> @@ -179,9 +179,11 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
>>>>  		get_page(page);
>>>>
>>>>  		/*
>>>> -		 * Optimize for the common case where page is only mapped once
>>>> -		 * in one process. If we can lock the page, then we can safely
>>>> -		 * set up a special migration page table entry now.
>>>> +		 * If we can't lock the page we can't migrate it. If we can it's
>>>> +		 * safe to set up a migration entry now. In the common case
>>>> +		 * where the page is mapped once in a single process setting up
>>>> +		 * the migration entry now is an optimisation to avoid walking
>>>> +		 * the rmap later with try_to_migrate().
>>>>  		 */
>>>
>>> IMHO the last sentence can still be a bit confusing - here we 100% rely on
>>> the trylock() to proceed or we'll stop migration right away. IMHO that
>>> means this is not an optimization, since optimizations should always be
>>> optional but not the case here.
>>
>> We have to lock the page here, we don't have to install the migration
>> entries. Installing the migration entries here is optional and is the
>> optimisation.
> 
> I see what you mean now.
> 
>>
>>> Meanwhile it'll be great to also mention about why trylock is needed and no
>>> further attempt to use lock_page().  The comment in prepare() previously
>>> was great but unfortunately that code clip was removed.
>>
>> Will add.
>>
>>> In short, do you think something like this might be clearer?
>>
>> I think it's important to mention the optimisation, otherwise the
>> temptation is to remove the installation of migration entries here and
>> rely on try_to_migrate() to do it later. I would actually like to be
>> able to do that because it simplifies the code in many ways but based on
>> my testing the optimisation turns out to be very worth while.
>>
>>> 		/*
>>> 		 * We rely on the trylock() to migrate the pte.  If this
>>> 		 * fails, we'll fail the migration of this page.  IOW, the
>>> 		 * migration is very much best-effort, just like we'll also
>>> 		 * bail out if we found page pinned by other users after
>>> 		 * page being locked.
>>
>> Honestly I think this describes what the code does rather than why and
>> is likely to become outdated and confusing. IMHO it's quite clear from
>> the code that the migration will fail here if we can't lock the page.
> 
> If you see that's what I was struggling to understand previously, so not
> clear at least to me. :) Since normally a function like page migration
> should (from the gut feeling) not rely on trylock only.

IIRC, ordinary page migration will also only trylock. See
isolate_movable_page().


-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] mm/migrate_device.c: Fix a misleading and out-dated comment
  2022-08-26 17:17       ` David Hildenbrand
@ 2022-08-26 21:48         ` Peter Xu
  2022-08-29  7:09           ` Alistair Popple
  0 siblings, 1 reply; 8+ messages in thread
From: Peter Xu @ 2022-08-26 21:48 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Alistair Popple, linux-mm, Andrew Morton, peterx, John Hubbard,
	Ralph Campbell

On Fri, Aug 26, 2022 at 07:17:06PM +0200, David Hildenbrand wrote:
> On 26.08.22 17:03, Peter Xu wrote:
> > On Fri, Aug 26, 2022 at 10:34:15AM +1000, Alistair Popple wrote:
> >>
> >> Peter Xu <peterx@redhat.com> writes:
> >>
> >>> On Thu, Aug 25, 2022 at 11:49:05AM +1000, Alistair Popple wrote:
> >>>> Commit ab09243aa95a ("mm/migrate.c: remove MIGRATE_PFN_LOCKED") changed
> >>>> the way trylock_page() in migrate_vma_collect_pmd() works without
> >>>> updating the comment. Reword the comment to be less misleading and a
> >>>> better reflection of what happens.
> >>>>
> >>>> Signed-off-by: Alistair Popple <apopple@nvidia.com>
> >>>> Reported-by: Peter Xu <peterx@redhat.com>
> >>>> Fixes: ab09243aa95a ("mm/migrate.c: remove MIGRATE_PFN_LOCKED")
> >>>> ---
> >>>>  mm/migrate_device.c | 8 +++++---
> >>>>  1 file changed, 5 insertions(+), 3 deletions(-)
> >>>>
> >>>> diff --git a/mm/migrate_device.c b/mm/migrate_device.c
> >>>> index 5052093d0262..0736f846de0b 100644
> >>>> --- a/mm/migrate_device.c
> >>>> +++ b/mm/migrate_device.c
> >>>> @@ -179,9 +179,11 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
> >>>>  		get_page(page);
> >>>>
> >>>>  		/*
> >>>> -		 * Optimize for the common case where page is only mapped once
> >>>> -		 * in one process. If we can lock the page, then we can safely
> >>>> -		 * set up a special migration page table entry now.
> >>>> +		 * If we can't lock the page we can't migrate it. If we can it's
> >>>> +		 * safe to set up a migration entry now. In the common case
> >>>> +		 * where the page is mapped once in a single process setting up
> >>>> +		 * the migration entry now is an optimisation to avoid walking
> >>>> +		 * the rmap later with try_to_migrate().
> >>>>  		 */
> >>>
> >>> IMHO the last sentence can still be a bit confusing - here we 100% rely on
> >>> the trylock() to proceed or we'll stop migration right away. IMHO that
> >>> means this is not an optimization, since optimizations should always be
> >>> optional but not the case here.
> >>
> >> We have to lock the page here, we don't have to install the migration
> >> entries. Installing the migration entries here is optional and is the
> >> optimisation.
> > 
> > I see what you mean now.
> > 
> >>
> >>> Meanwhile it'll be great to also mention about why trylock is needed and no
> >>> further attempt to use lock_page().  The comment in prepare() previously
> >>> was great but unfortunately that code clip was removed.
> >>
> >> Will add.
> >>
> >>> In short, do you think something like this might be clearer?
> >>
> >> I think it's important to mention the optimisation, otherwise the
> >> temptation is to remove the installation of migration entries here and
> >> rely on try_to_migrate() to do it later. I would actually like to be
> >> able to do that because it simplifies the code in many ways but based on
> >> my testing the optimisation turns out to be very worth while.
> >>
> >>> 		/*
> >>> 		 * We rely on the trylock() to migrate the pte.  If this
> >>> 		 * fails, we'll fail the migration of this page.  IOW, the
> >>> 		 * migration is very much best-effort, just like we'll also
> >>> 		 * bail out if we found page pinned by other users after
> >>> 		 * page being locked.
> >>
> >> Honestly I think this describes what the code does rather than why and
> >> is likely to become outdated and confusing. IMHO it's quite clear from
> >> the code that the migration will fail here if we can't lock the page.
> > 
> > If you see that's what I was struggling to understand previously, so not
> > clear at least to me. :) Since normally a function like page migration
> > should (from the gut feeling) not rely on trylock only.
> 
> IIRC, ordinary page migration will also only trylock. See
> isolate_movable_page().

Fair enough.

I think it depends on what we want to do with the migration.  I think not
even all users of isolate_movable_page() only depend on trylock, and it can
retry with lock_page later.  E.g.:

- soft_offline_in_use_page
  - isolate_page
    - isolate_movable_page <----- trylock here
  - (if isolate_page fails...) migrate_pages
    - unmap_and_move       <----- lock_page here after 2 unsuccessful rounds

And I also agree migration may always fail (e.g. by spurious page refcounts
being taken), so feel free to ignore the "gut feeling" - that's indeed kind
of subjective.

Thanks,

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] mm/migrate_device.c: Fix a misleading and out-dated comment
  2022-08-26 21:48         ` Peter Xu
@ 2022-08-29  7:09           ` Alistair Popple
  2022-08-29 15:30             ` Peter Xu
  0 siblings, 1 reply; 8+ messages in thread
From: Alistair Popple @ 2022-08-29  7:09 UTC (permalink / raw)
  To: Peter Xu
  Cc: David Hildenbrand, linux-mm, Andrew Morton, peterx, John Hubbard,
	Ralph Campbell


Peter Xu <peterx@redhat.com> writes:

[...]

>> >>> Meanwhile it'll be great to also mention about why trylock is needed and no
>> >>> further attempt to use lock_page().  The comment in prepare() previously
>> >>> was great but unfortunately that code clip was removed.
>> >>
>> >> Will add.
>> >>
>> >>> In short, do you think something like this might be clearer?
>> >>
>> >> I think it's important to mention the optimisation, otherwise the
>> >> temptation is to remove the installation of migration entries here and
>> >> rely on try_to_migrate() to do it later. I would actually like to be
>> >> able to do that because it simplifies the code in many ways but based on
>> >> my testing the optimisation turns out to be very worth while.
>> >>
>> >>> 		/*
>> >>> 		 * We rely on the trylock() to migrate the pte.  If this
>> >>> 		 * fails, we'll fail the migration of this page.  IOW, the
>> >>> 		 * migration is very much best-effort, just like we'll also
>> >>> 		 * bail out if we found page pinned by other users after
>> >>> 		 * page being locked.
>> >>
>> >> Honestly I think this describes what the code does rather than why and
>> >> is likely to become outdated and confusing. IMHO it's quite clear from
>> >> the code that the migration will fail here if we can't lock the page.
>> >
>> > If you see that's what I was struggling to understand previously, so not
>> > clear at least to me. :) Since normally a function like page migration
>> > should (from the gut feeling) not rely on trylock only.
>>
>> IIRC, ordinary page migration will also only trylock. See
>> isolate_movable_page().
>
> Fair enough.
>
> I think it depends on what we want to do with the migration.  I think not
> even all users of isolate_movable_page() only depend on trylock, and it can
> retry with lock_page later.  E.g.:
>
> - soft_offline_in_use_page
>   - isolate_page
>     - isolate_movable_page <----- trylock here
>   - (if isolate_page fails...) migrate_pages
>     - unmap_and_move       <----- lock_page here after 2 unsuccessful rounds
>
> And I also agree migration may always fail (e.g. by spurious page refcounts
> being taken), so feel free to ignore the "gut feeling" - that's indeed kind
> of subjective.

So how about the following:

		/*
		 * We rely on trylock_page() to avoid deadlock between
		 * concurrent migrations where each is waiting on the others
		 * page lock. If we can't immediately lock the page we fail this
		 * migration as it is only best effort anyway.
		 *
		 * If we can lock the page it's safe to set up a migration entry
		 * now. In the common case where the page is mapped once in a
		 * single process setting up the migration entry now is an
		 * optimisation to avoid walking the rmap later with
		 * try_to_migrate().
		 */

Thanks.

 - Alistair


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] mm/migrate_device.c: Fix a misleading and out-dated comment
  2022-08-29  7:09           ` Alistair Popple
@ 2022-08-29 15:30             ` Peter Xu
  0 siblings, 0 replies; 8+ messages in thread
From: Peter Xu @ 2022-08-29 15:30 UTC (permalink / raw)
  To: Alistair Popple
  Cc: David Hildenbrand, linux-mm, Andrew Morton, peterx, John Hubbard,
	Ralph Campbell

On Mon, Aug 29, 2022 at 05:09:25PM +1000, Alistair Popple wrote:
> So how about the following:
> 
> 		/*
> 		 * We rely on trylock_page() to avoid deadlock between
> 		 * concurrent migrations where each is waiting on the others
> 		 * page lock. If we can't immediately lock the page we fail this
> 		 * migration as it is only best effort anyway.
> 		 *
> 		 * If we can lock the page it's safe to set up a migration entry
> 		 * now. In the common case where the page is mapped once in a
> 		 * single process setting up the migration entry now is an
> 		 * optimisation to avoid walking the rmap later with
> 		 * try_to_migrate().
> 		 */

Good to me, thanks.

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2022-08-29 15:30 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-08-25  1:49 [PATCH] mm/migrate_device.c: Fix a misleading and out-dated comment Alistair Popple
2022-08-25 23:51 ` Peter Xu
2022-08-26  0:34   ` Alistair Popple
2022-08-26 15:03     ` Peter Xu
2022-08-26 17:17       ` David Hildenbrand
2022-08-26 21:48         ` Peter Xu
2022-08-29  7:09           ` Alistair Popple
2022-08-29 15:30             ` Peter Xu

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.