linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH 0/2] Clarify huge_pte_offset() semantics
@ 2017-07-24 17:33 Punit Agrawal
  2017-07-24 17:33 ` [RFC PATCH 1/2] mm/hugetlb: Make huge_pte_offset() consistent between PUD and PMD entries Punit Agrawal
  2017-07-24 17:33 ` [RFC PATCH 2/2] mm/hugetlb: Support swap entries in huge_pte_offset() Punit Agrawal
  0 siblings, 2 replies; 5+ messages in thread
From: Punit Agrawal @ 2017-07-24 17:33 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Punit Agrawal, Naoya Horiguchi, linux-mm, linux-kernel,
	linux-arch, steve.capper, will.deacon, catalin.marinas,
	kirill.shutemov, Michal Hocko, Mike Kravetz

Hi,

The generic implementation of huge_pte_offset() has inconsistent
behaviour when looking up hugepage PUDs vs PMDs entries that are not
present (returning NULL vs pte_t*).

Similarly, it returns NULL when encountering swap entries although all
the callers have special checks to properly deal with swap entries.

Without clear semantics, it is difficult to determine what is the
expected behaviour of huge_pte_offset() without going through all the
scenarios where it used.

I faced this recently when updating the arm64 implementation of
huge_pte_offset() to handle swap entries (related to enabling poisoned
memeory)[0]. And will come across again when I update it for
contiguous hugepage support now that core changes have been merged.

To address these issues, this small series -

* makes huge_pte_offset() consistent between PUD and PMDs
* adds support for returning swap entries
* and most importantly, documents the expected behaviour of
  huge_pte_offset()

All feedback welcome.

Thanks,
Punit

[0]

Punit Agrawal (2):
  mm/hugetlb: Make huge_pte_offset() consistent between PUD and PMD
    entries
  mm/hugetlb: Support swap entries in huge_pte_offset()

 mm/hugetlb.c | 22 +++++++++++++++++++---
 1 file changed, 19 insertions(+), 3 deletions(-)

-- 
2.11.0

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [RFC PATCH 1/2] mm/hugetlb: Make huge_pte_offset() consistent between PUD and PMD entries
  2017-07-24 17:33 [RFC PATCH 0/2] Clarify huge_pte_offset() semantics Punit Agrawal
@ 2017-07-24 17:33 ` Punit Agrawal
  2017-07-25 12:29   ` Catalin Marinas
  2017-07-24 17:33 ` [RFC PATCH 2/2] mm/hugetlb: Support swap entries in huge_pte_offset() Punit Agrawal
  1 sibling, 1 reply; 5+ messages in thread
From: Punit Agrawal @ 2017-07-24 17:33 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Punit Agrawal, Naoya Horiguchi, linux-mm, linux-kernel,
	linux-arch, steve.capper, will.deacon, catalin.marinas,
	kirill.shutemov, Michal Hocko, Mike Kravetz

When walking the page tables to resolve an address that points to
!present_p*d() entry, huge_pte_offset() returns inconsistent values
depending on the level of page table (PUD or PMD).

In the case of a PUD entry, it returns NULL while in the case of a PMD
entry, it returns a pointer to the page table entry.

Make huge_pte_offset() consistent by always returning NULL on
encountering a !present_p*d() entry. Document the behaviour to clarify
the expected semantics of this function.

Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
---
 mm/hugetlb.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index bc48ee783dd9..686eb6fa9eb1 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4603,6 +4603,13 @@ pte_t *huge_pte_alloc(struct mm_struct *mm,
 	return pte;
 }
 
+/*
+ * huge_pte_offset() - Walk the page table to resolve the hugepage
+ * entry at address @addr
+ *
+ * Return: Pointer to page table entry (PUD or PMD) for address @addr
+ * or NULL if the entry is not present.
+ */
 pte_t *huge_pte_offset(struct mm_struct *mm,
 		       unsigned long addr, unsigned long sz)
 {
@@ -4617,13 +4624,20 @@ pte_t *huge_pte_offset(struct mm_struct *mm,
 	p4d = p4d_offset(pgd, addr);
 	if (!p4d_present(*p4d))
 		return NULL;
+
 	pud = pud_offset(p4d, addr);
 	if (!pud_present(*pud))
 		return NULL;
 	if (pud_huge(*pud))
 		return (pte_t *)pud;
+
 	pmd = pmd_offset(pud, addr);
-	return (pte_t *) pmd;
+	if (!pmd_present(*pmd))
+		return NULL;
+	if (pmd_huge(*pmd))
+		return (pte_t *) pmd;
+
+	return NULL;
 }
 
 #endif /* CONFIG_ARCH_WANT_GENERAL_HUGETLB */
-- 
2.11.0

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [RFC PATCH 2/2] mm/hugetlb: Support swap entries in huge_pte_offset()
  2017-07-24 17:33 [RFC PATCH 0/2] Clarify huge_pte_offset() semantics Punit Agrawal
  2017-07-24 17:33 ` [RFC PATCH 1/2] mm/hugetlb: Make huge_pte_offset() consistent between PUD and PMD entries Punit Agrawal
@ 2017-07-24 17:33 ` Punit Agrawal
  1 sibling, 0 replies; 5+ messages in thread
From: Punit Agrawal @ 2017-07-24 17:33 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Punit Agrawal, Naoya Horiguchi, linux-mm, linux-kernel,
	linux-arch, steve.capper, will.deacon, catalin.marinas,
	kirill.shutemov, Michal Hocko, Mike Kravetz

Although huge_pte_offset() returns NULL when encountering swap page
table entries, the callers of huge_pte_offset() correctly handling swap
entries.

Add support to the huge_pte_offset() to return the swap entries when it
encounters them during the page table walks.

Also update the function documentation to explicitly state this
behaviour. This is to help clarify expectations for architecture
specific implementations of huge_pte_offset().

Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
---
 mm/hugetlb.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 686eb6fa9eb1..72dd1139a8e4 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4607,8 +4607,8 @@ pte_t *huge_pte_alloc(struct mm_struct *mm,
  * huge_pte_offset() - Walk the page table to resolve the hugepage
  * entry at address @addr
  *
- * Return: Pointer to page table entry (PUD or PMD) for address @addr
- * or NULL if the entry is not present.
+ * Return: Pointer to page table or swap entry (PUD or PMD) for address @addr
+ * or NULL if the entry is p*d_none().
  */
 pte_t *huge_pte_offset(struct mm_struct *mm,
 		       unsigned long addr, unsigned long sz)
@@ -4626,15 +4626,17 @@ pte_t *huge_pte_offset(struct mm_struct *mm,
 		return NULL;
 
 	pud = pud_offset(p4d, addr);
-	if (!pud_present(*pud))
+	if (pud_none(*pud))
 		return NULL;
-	if (pud_huge(*pud))
+	/* hugepage or swap? */
+	if (pud_huge(*pud) || !pud_present(*pud))
 		return (pte_t *)pud;
 
 	pmd = pmd_offset(pud, addr);
-	if (!pmd_present(*pmd))
+	if (pmd_none(*pmd))
 		return NULL;
-	if (pmd_huge(*pmd))
+	/* hugepage or swap? */
+	if (pmd_huge(*pmd) || !pmd_present(*pmd))
 		return (pte_t *) pmd;
 
 	return NULL;
-- 
2.11.0

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [RFC PATCH 1/2] mm/hugetlb: Make huge_pte_offset() consistent between PUD and PMD entries
  2017-07-24 17:33 ` [RFC PATCH 1/2] mm/hugetlb: Make huge_pte_offset() consistent between PUD and PMD entries Punit Agrawal
@ 2017-07-25 12:29   ` Catalin Marinas
  2017-07-25 14:37     ` Punit Agrawal
  0 siblings, 1 reply; 5+ messages in thread
From: Catalin Marinas @ 2017-07-25 12:29 UTC (permalink / raw)
  To: Punit Agrawal
  Cc: Andrew Morton, Naoya Horiguchi, linux-mm, linux-kernel,
	linux-arch, steve.capper, will.deacon, kirill.shutemov,
	Michal Hocko, Mike Kravetz

Hi Punit,

On Mon, Jul 24, 2017 at 06:33:17PM +0100, Punit Agrawal wrote:
> When walking the page tables to resolve an address that points to
> !present_p*d() entry, huge_pte_offset() returns inconsistent values
> depending on the level of page table (PUD or PMD).
> 
> In the case of a PUD entry, it returns NULL while in the case of a PMD
> entry, it returns a pointer to the page table entry.
> 
> Make huge_pte_offset() consistent by always returning NULL on
> encountering a !present_p*d() entry. Document the behaviour to clarify
> the expected semantics of this function.

Nitpick: "p*d_present" instead of "present_p*d".

> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index bc48ee783dd9..686eb6fa9eb1 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -4603,6 +4603,13 @@ pte_t *huge_pte_alloc(struct mm_struct *mm,
>  	return pte;
>  }
>  
> +/*
> + * huge_pte_offset() - Walk the page table to resolve the hugepage
> + * entry at address @addr
> + *
> + * Return: Pointer to page table entry (PUD or PMD) for address @addr
> + * or NULL if the entry is not present.
> + */
>  pte_t *huge_pte_offset(struct mm_struct *mm,
>  		       unsigned long addr, unsigned long sz)
>  {
> @@ -4617,13 +4624,20 @@ pte_t *huge_pte_offset(struct mm_struct *mm,
>  	p4d = p4d_offset(pgd, addr);
>  	if (!p4d_present(*p4d))
>  		return NULL;
> +
>  	pud = pud_offset(p4d, addr);
>  	if (!pud_present(*pud))
>  		return NULL;
>  	if (pud_huge(*pud))
>  		return (pte_t *)pud;
> +
>  	pmd = pmd_offset(pud, addr);
> -	return (pte_t *) pmd;
> +	if (!pmd_present(*pmd))
> +		return NULL;

This breaks the current behaviour for swap entries in the pmd (for pud
is already broken but maybe no-one uses them). It is fixed in the
subsequent patch together with the pud but the series is no longer
bisectable. Maybe it's better if you fold the two patches together (or
change the order, though I'm not sure how readable it is).

-- 
Catalin

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [RFC PATCH 1/2] mm/hugetlb: Make huge_pte_offset() consistent between PUD and PMD entries
  2017-07-25 12:29   ` Catalin Marinas
@ 2017-07-25 14:37     ` Punit Agrawal
  0 siblings, 0 replies; 5+ messages in thread
From: Punit Agrawal @ 2017-07-25 14:37 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Andrew Morton, Naoya Horiguchi, linux-mm, linux-kernel,
	linux-arch, steve.capper, will.deacon, kirill.shutemov,
	Michal Hocko, Mike Kravetz

Catalin Marinas <catalin.marinas@arm.com> writes:

> Hi Punit,
>
> On Mon, Jul 24, 2017 at 06:33:17PM +0100, Punit Agrawal wrote:
>> When walking the page tables to resolve an address that points to
>> !present_p*d() entry, huge_pte_offset() returns inconsistent values
>> depending on the level of page table (PUD or PMD).
>> 
>> In the case of a PUD entry, it returns NULL while in the case of a PMD
>> entry, it returns a pointer to the page table entry.
>> 
>> Make huge_pte_offset() consistent by always returning NULL on
>> encountering a !present_p*d() entry. Document the behaviour to clarify
>> the expected semantics of this function.
>
> Nitpick: "p*d_present" instead of "present_p*d".

Thanks for spotting. Fixed both the instances locally.

>
>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
>> index bc48ee783dd9..686eb6fa9eb1 100644
>> --- a/mm/hugetlb.c
>> +++ b/mm/hugetlb.c
>> @@ -4603,6 +4603,13 @@ pte_t *huge_pte_alloc(struct mm_struct *mm,
>>  	return pte;
>>  }
>>  
>> +/*
>> + * huge_pte_offset() - Walk the page table to resolve the hugepage
>> + * entry at address @addr
>> + *
>> + * Return: Pointer to page table entry (PUD or PMD) for address @addr
>> + * or NULL if the entry is not present.
>> + */
>>  pte_t *huge_pte_offset(struct mm_struct *mm,
>>  		       unsigned long addr, unsigned long sz)
>>  {
>> @@ -4617,13 +4624,20 @@ pte_t *huge_pte_offset(struct mm_struct *mm,
>>  	p4d = p4d_offset(pgd, addr);
>>  	if (!p4d_present(*p4d))
>>  		return NULL;
>> +
>>  	pud = pud_offset(p4d, addr);
>>  	if (!pud_present(*pud))
>>  		return NULL;
>>  	if (pud_huge(*pud))
>>  		return (pte_t *)pud;
>> +
>>  	pmd = pmd_offset(pud, addr);
>> -	return (pte_t *) pmd;
>> +	if (!pmd_present(*pmd))
>> +		return NULL;
>
> This breaks the current behaviour for swap entries in the pmd (for pud
> is already broken but maybe no-one uses them). It is fixed in the
> subsequent patch together with the pud but the series is no longer
> bisectable. Maybe it's better if you fold the two patches together (or
> change the order, though I'm not sure how readable it is).

I missed the change in behaviour for pmd swap entries. I'll squash the
two patches and re-post.

Thanks for the review.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2017-07-25 14:38 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-07-24 17:33 [RFC PATCH 0/2] Clarify huge_pte_offset() semantics Punit Agrawal
2017-07-24 17:33 ` [RFC PATCH 1/2] mm/hugetlb: Make huge_pte_offset() consistent between PUD and PMD entries Punit Agrawal
2017-07-25 12:29   ` Catalin Marinas
2017-07-25 14:37     ` Punit Agrawal
2017-07-24 17:33 ` [RFC PATCH 2/2] mm/hugetlb: Support swap entries in huge_pte_offset() Punit Agrawal

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).