linux-nvdimm.lists.01.org archive mirror
* [PATCH] mm/pmem: Avoid inserting hugepage PTE entry with fsdax if hugepage support is disabled
@ 2021-02-05  2:39 Aneesh Kumar K.V
  2021-02-05  8:29 ` David Hildenbrand
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Aneesh Kumar K.V @ 2021-02-05  2:39 UTC
  To: linux-nvdimm, dan.j.williams, Kirill A . Shutemov, Jan Kara
  Cc: linux-mm, linuxppc-dev, Aneesh Kumar K.V

Differentiate between hardware not supporting hugepages and the user
disabling THP via
'echo never > /sys/kernel/mm/transparent_hugepage/enabled'.

For the devdax namespace, the kernel handles the above via the
supported_alignment attribute and by failing to initialize the namespace
if the namespace align value is not supported on the platform.

For the fsdax namespace, the kernel will continue to initialize
the namespace. This can result in the kernel creating a huge PTE
entry even though the hardware doesn't support it.

We do want hugepage support with pmem even if the end user disabled THP
via the sysfs file (/sys/kernel/mm/transparent_hugepage/enabled). Hence
differentiate between hardware/firmware lacking support and a
user-controlled disable of THP, and prevent a huge fault if the hardware
lacks hugepage support.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
---
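For illustration only (not part of the patch): a minimal userspace
sketch of the decision order that __transparent_hugepage_enabled()
follows after this change, limited to the checks visible in the hunks
below. All names ending in _model are hypothetical stand-ins and do not
exist in the kernel, and checks that the diff context elides are
omitted. It assumes that writing "never" to the sysfs file clears the
user-controlled bits while leaving the new NEVER_DAX bit untouched.

```c
#include <assert.h>
#include <stdbool.h>

/* Bit positions follow the enum order after the patch; only the first
 * two entries are modeled. */
enum thp_flag_model {
	MODEL_NEVER_DAX,	/* set at init when hardware lacks hugepages */
	MODEL_THP_FLAG,		/* stands in for TRANSPARENT_HUGEPAGE_FLAG */
};

/* Hypothetical stand-in for the VMA properties the check reads. */
struct vma_model {
	bool nohugepage;	/* stands in for VM_NOHUGEPAGE */
	bool is_dax;		/* stands in for vma_is_dax() */
};

static bool thp_enabled_model(unsigned long flags,
			      const struct vma_model *vma)
{
	/* Hardware/firmware veto comes first and cannot be overridden. */
	if (flags & (1UL << MODEL_NEVER_DAX))
		return false;
	if (vma->nohugepage)
		return false;
	if (flags & (1UL << MODEL_THP_FLAG))
		return true;
	/* DAX mappings still get hugepages when the user disabled THP. */
	if (vma->is_dax)
		return true;
	return false;
}

/* Walks through the two cases the commit message distinguishes. */
void thp_model_selfcheck(void)
{
	struct vma_model dax = { .nohugepage = false, .is_dax = true };

	/* User wrote "never": no user bits set, DAX is still allowed. */
	assert(thp_enabled_model(0, &dax));
	/* Hardware lacks hugepage support: DAX is refused as well. */
	assert(!thp_enabled_model(1UL << MODEL_NEVER_DAX, &dax));
}
```

Calling thp_model_selfcheck() from a small main() exercises both cases;
the point of the sketch is only that the hardware veto is tested before
the DAX shortcut.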
 include/linux/huge_mm.h | 15 +++++++++------
 mm/huge_memory.c        |  6 +++++-
 2 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 6a19f35f836b..ba973efcd369 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -78,6 +78,7 @@ static inline vm_fault_t vmf_insert_pfn_pud(struct vm_fault *vmf, pfn_t pfn,
 }
 
 enum transparent_hugepage_flag {
+	TRANSPARENT_HUGEPAGE_NEVER_DAX,
 	TRANSPARENT_HUGEPAGE_FLAG,
 	TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG,
 	TRANSPARENT_HUGEPAGE_DEFRAG_DIRECT_FLAG,
@@ -123,6 +124,13 @@ extern unsigned long transparent_hugepage_flags;
  */
 static inline bool __transparent_hugepage_enabled(struct vm_area_struct *vma)
 {
+
+	/*
+	 * If the hardware/firmware marked hugepage support disabled.
+	 */
+	if (transparent_hugepage_flags & (1 << TRANSPARENT_HUGEPAGE_NEVER_DAX))
+		return false;
+
 	if (vma->vm_flags & VM_NOHUGEPAGE)
 		return false;
 
@@ -134,12 +142,7 @@ static inline bool __transparent_hugepage_enabled(struct vm_area_struct *vma)
 
 	if (transparent_hugepage_flags & (1 << TRANSPARENT_HUGEPAGE_FLAG))
 		return true;
-	/*
-	 * For dax vmas, try to always use hugepage mappings. If the kernel does
-	 * not support hugepages, fsdax mappings will fallback to PAGE_SIZE
-	 * mappings, and device-dax namespaces, that try to guarantee a given
-	 * mapping size, will fail to enable
-	 */
+
 	if (vma_is_dax(vma))
 		return true;
 
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 9237976abe72..d698b7e27447 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -386,7 +386,11 @@ static int __init hugepage_init(void)
 	struct kobject *hugepage_kobj;
 
 	if (!has_transparent_hugepage()) {
-		transparent_hugepage_flags = 0;
+		/*
+		 * Hardware doesn't support hugepages, hence disable
+		 * DAX PMD support.
+		 */
+		transparent_hugepage_flags = 1 << TRANSPARENT_HUGEPAGE_NEVER_DAX;
 		return -EINVAL;
 	}
 
-- 
2.29.2
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org


* Re: [PATCH] mm/pmem: Avoid inserting hugepage PTE entry with fsdax if hugepage support is disabled
  2021-02-05  2:39 [PATCH] mm/pmem: Avoid inserting hugepage PTE entry with fsdax if hugepage support is disabled Aneesh Kumar K.V
@ 2021-02-05  8:29 ` David Hildenbrand
  2021-02-05 17:47 ` Dan Williams
  2021-02-10  5:18 ` Pankaj Gupta
  2 siblings, 0 replies; 4+ messages in thread
From: David Hildenbrand @ 2021-02-05  8:29 UTC
  To: Aneesh Kumar K.V, linux-nvdimm, dan.j.williams,
	Kirill A . Shutemov, Jan Kara
  Cc: linux-mm, linuxppc-dev

On 05.02.21 03:39, Aneesh Kumar K.V wrote:
> Differentiate between hardware not supporting hugepages and user disabling THP
> via 'echo never > /sys/kernel/mm/transparent_hugepage/enabled'
> 
> For the devdax namespace, the kernel handles the above via the
> supported_alignment attribute and failing to initialize the namespace
> if the namespace align value is not supported on the platform.
> 
> For the fsdax namespace, the kernel will continue to initialize
> the namespace. This can result in the kernel creating a huge pte
> entry even though the hardware doesn't support it.
> 
> We do want hugepage support with pmem even if the end-user disabled THP
> via sysfs file (/sys/kernel/mm/transparent_hugepage/enabled). Hence
> differentiate between hardware/firmware lacking support vs user-controlled
> disable of THP and prevent a huge fault if the hardware lacks hugepage
> support.
> 
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> ---
>   include/linux/huge_mm.h | 15 +++++++++------
>   mm/huge_memory.c        |  6 +++++-
>   2 files changed, 14 insertions(+), 7 deletions(-)
> 
> diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
> index 6a19f35f836b..ba973efcd369 100644
> --- a/include/linux/huge_mm.h
> +++ b/include/linux/huge_mm.h
> @@ -78,6 +78,7 @@ static inline vm_fault_t vmf_insert_pfn_pud(struct vm_fault *vmf, pfn_t pfn,
>   }
>   
>   enum transparent_hugepage_flag {
> +	TRANSPARENT_HUGEPAGE_NEVER_DAX,
>   	TRANSPARENT_HUGEPAGE_FLAG,
>   	TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG,
>   	TRANSPARENT_HUGEPAGE_DEFRAG_DIRECT_FLAG,
> @@ -123,6 +124,13 @@ extern unsigned long transparent_hugepage_flags;
>    */
>   static inline bool __transparent_hugepage_enabled(struct vm_area_struct *vma)
>   {
> +
> +	/*
> +	 * If the hardware/firmware marked hugepage support disabled.
> +	 */
> +	if (transparent_hugepage_flags & (1 << TRANSPARENT_HUGEPAGE_NEVER_DAX))
> +		return false;
> +
>   	if (vma->vm_flags & VM_NOHUGEPAGE)
>   		return false;
>   
> @@ -134,12 +142,7 @@ static inline bool __transparent_hugepage_enabled(struct vm_area_struct *vma)
>   
>   	if (transparent_hugepage_flags & (1 << TRANSPARENT_HUGEPAGE_FLAG))
>   		return true;
> -	/*
> -	 * For dax vmas, try to always use hugepage mappings. If the kernel does
> -	 * not support hugepages, fsdax mappings will fallback to PAGE_SIZE
> -	 * mappings, and device-dax namespaces, that try to guarantee a given
> -	 * mapping size, will fail to enable
> -	 */
> +
>   	if (vma_is_dax(vma))
>   		return true;
>   
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 9237976abe72..d698b7e27447 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -386,7 +386,11 @@ static int __init hugepage_init(void)
>   	struct kobject *hugepage_kobj;
>   
>   	if (!has_transparent_hugepage()) {
> -		transparent_hugepage_flags = 0;
> +		/*
> +		 * Hardware doesn't support hugepages, hence disable
> +		 * DAX PMD support.
> +		 */
> +		transparent_hugepage_flags = 1 << TRANSPARENT_HUGEPAGE_NEVER_DAX;
>   		return -EINVAL;
>   	}
>   
> 

Looks sane to me from my limited understanding of that code :)

-- 
Thanks,

David / dhildenb

* Re: [PATCH] mm/pmem: Avoid inserting hugepage PTE entry with fsdax if hugepage support is disabled
  2021-02-05  2:39 [PATCH] mm/pmem: Avoid inserting hugepage PTE entry with fsdax if hugepage support is disabled Aneesh Kumar K.V
  2021-02-05  8:29 ` David Hildenbrand
@ 2021-02-05 17:47 ` Dan Williams
  2021-02-10  5:18 ` Pankaj Gupta
  2 siblings, 0 replies; 4+ messages in thread
From: Dan Williams @ 2021-02-05 17:47 UTC
  To: Aneesh Kumar K.V
  Cc: linux-nvdimm, Kirill A . Shutemov, Jan Kara, Linux MM,
	linuxppc-dev, Andrew Morton

[ add Andrew ]

On Thu, Feb 4, 2021 at 6:40 PM Aneesh Kumar K.V
<aneesh.kumar@linux.ibm.com> wrote:
>
> Differentiate between hardware not supporting hugepages and user disabling THP
> via 'echo never > /sys/kernel/mm/transparent_hugepage/enabled'
>
> For the devdax namespace, the kernel handles the above via the
> supported_alignment attribute and failing to initialize the namespace
> if the namespace align value is not supported on the platform.
>
> For the fsdax namespace, the kernel will continue to initialize
> the namespace. This can result in the kernel creating a huge pte
> entry even though the hardware doesn't support it.
>
> We do want hugepage support with pmem even if the end-user disabled THP
> via sysfs file (/sys/kernel/mm/transparent_hugepage/enabled). Hence
> differentiate between hardware/firmware lacking support vs user-controlled
> disable of THP and prevent a huge fault if the hardware lacks hugepage
> support.

Looks good to me.

Reviewed-by: Dan Williams <dan.j.williams@intel.com>

I assume this will go through Andrew.

* Re: [PATCH] mm/pmem: Avoid inserting hugepage PTE entry with fsdax if hugepage support is disabled
  2021-02-05  2:39 [PATCH] mm/pmem: Avoid inserting hugepage PTE entry with fsdax if hugepage support is disabled Aneesh Kumar K.V
  2021-02-05  8:29 ` David Hildenbrand
  2021-02-05 17:47 ` Dan Williams
@ 2021-02-10  5:18 ` Pankaj Gupta
  2 siblings, 0 replies; 4+ messages in thread
From: Pankaj Gupta @ 2021-02-10  5:18 UTC
  To: Aneesh Kumar K.V
  Cc: linux-nvdimm, Kirill A . Shutemov, Jan Kara, Linux MM, linuxppc-dev

> Differentiate between hardware not supporting hugepages and user disabling THP
> via 'echo never > /sys/kernel/mm/transparent_hugepage/enabled'
>
> For the devdax namespace, the kernel handles the above via the
> supported_alignment attribute and failing to initialize the namespace
> if the namespace align value is not supported on the platform.
>
> For the fsdax namespace, the kernel will continue to initialize
> the namespace. This can result in the kernel creating a huge pte
> entry even though the hardware doesn't support it.
>
> We do want hugepage support with pmem even if the end-user disabled THP
> via sysfs file (/sys/kernel/mm/transparent_hugepage/enabled). Hence
> differentiate between hardware/firmware lacking support vs user-controlled
> disable of THP and prevent a huge fault if the hardware lacks hugepage
> support.
>
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> ---
>  include/linux/huge_mm.h | 15 +++++++++------
>  mm/huge_memory.c        |  6 +++++-
>  2 files changed, 14 insertions(+), 7 deletions(-)
>
> diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
> index 6a19f35f836b..ba973efcd369 100644
> --- a/include/linux/huge_mm.h
> +++ b/include/linux/huge_mm.h
> @@ -78,6 +78,7 @@ static inline vm_fault_t vmf_insert_pfn_pud(struct vm_fault *vmf, pfn_t pfn,
>  }
>
>  enum transparent_hugepage_flag {
> +       TRANSPARENT_HUGEPAGE_NEVER_DAX,
>         TRANSPARENT_HUGEPAGE_FLAG,
>         TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG,
>         TRANSPARENT_HUGEPAGE_DEFRAG_DIRECT_FLAG,
> @@ -123,6 +124,13 @@ extern unsigned long transparent_hugepage_flags;
>   */
>  static inline bool __transparent_hugepage_enabled(struct vm_area_struct *vma)
>  {
> +
> +       /*
> +        * If the hardware/firmware marked hugepage support disabled.
> +        */
> +       if (transparent_hugepage_flags & (1 << TRANSPARENT_HUGEPAGE_NEVER_DAX))
> +               return false;
> +
>         if (vma->vm_flags & VM_NOHUGEPAGE)
>                 return false;
>
> @@ -134,12 +142,7 @@ static inline bool __transparent_hugepage_enabled(struct vm_area_struct *vma)
>
>         if (transparent_hugepage_flags & (1 << TRANSPARENT_HUGEPAGE_FLAG))
>                 return true;
> -       /*
> -        * For dax vmas, try to always use hugepage mappings. If the kernel does
> -        * not support hugepages, fsdax mappings will fallback to PAGE_SIZE
> -        * mappings, and device-dax namespaces, that try to guarantee a given
> -        * mapping size, will fail to enable
> -        */
> +
>         if (vma_is_dax(vma))
>                 return true;
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 9237976abe72..d698b7e27447 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -386,7 +386,11 @@ static int __init hugepage_init(void)
>         struct kobject *hugepage_kobj;
>
>         if (!has_transparent_hugepage()) {
> -               transparent_hugepage_flags = 0;
> +               /*
> +                * Hardware doesn't support hugepages, hence disable
> +                * DAX PMD support.
> +                */
> +               transparent_hugepage_flags = 1 << TRANSPARENT_HUGEPAGE_NEVER_DAX;
>                 return -EINVAL;
>         }

 Reviewed-by: Pankaj Gupta <pankaj.gupta@cloud.ionos.com>
