* [PATCH 0/9] convert hugetlb fault functions to folios
@ 2023-01-19 21:14 Sidhartha Kumar
  2023-01-19 21:14 ` [PATCH 1/9] mm/hugetlb: convert hugetlb_install_page " Sidhartha Kumar
                   ` (8 more replies)
  0 siblings, 9 replies; 21+ messages in thread
From: Sidhartha Kumar @ 2023-01-19 21:14 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: akpm, songmuchun, mike.kravetz, willy, jhubbard, Sidhartha Kumar

============== OVERVIEW ===========================
This series converts the hugetlb page faulting functions to operate on
folios. These include hugetlb_no_page(), hugetlb_wp(),
copy_hugetlb_page_range(), and hugetlb_mcopy_atomic_pte().

patches 1-3:
	- convert prerequisite helper functions to folios.
patch 4:
	- add a folio variable to the hugetlb fault functions and start
	  a partial conversion to folios.
patch 5:
	- fully convert hugetlb fault functions to folios.
patches 6-8: 
	- convert three functions to take in a folio rather than a page as
	  all callers now use folios.
patch 9:
	- update documentation that references alloc_huge_page.
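
The series follows the usual page-to-folio conversion pattern. As a
simplified sketch (illustrative only, not lifted verbatim from any one
patch), page-based helpers and Huge Page flag macros are replaced with
their folio equivalents:

	/* before: page-based helpers and flag macros */
	__SetPageUptodate(new_page);
	SetHPageMigratable(new_page);

	/* after: operate on the folio instead */
	struct folio *new_folio = page_folio(new_page);
	__folio_mark_uptodate(new_folio);
	folio_set_hugetlb_migratable(new_folio);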

============== TEST COVERAGE ============================

Linux Test Project Hugetlb Test Fault Coverage

[opc@sidhakum-devel kernel (master)]$ sudo ./funccount 'hugetlb_no_page'
FUNC                              COUNT
hugetlb_no_page                    7796

[opc@sidhakum-devel kernel (master)]$ sudo ./funccount 'hugetlb_wp'
FUNC                              COUNT
hugetlb_wp                         4623
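
For reference, the LTP hugetlb cases ship in LTP's hugetlb runtest file
and can be driven with something like the following (assuming a standard
LTP install; the exact invocation used for these counts is not shown):

[opc@sidhakum-devel ltp]$ sudo ./runltp -f hugetlb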

Using fallocate commands to create files on hugetlbfs

[opc@sidhakum-devel kernel (master)]$ sudo ./funccount 'hugetlbfs*'
FUNC                              COUNT
hugetlbfs_fallocate                   1


Userfaultfd selftest
./userfaultfd hugetlb 256 50 /dev/hugepages/hugefile

[opc@sidhakum-devel kernel (master)]$ sudo ./funccount 'hugetlb_mcopy_atomic_pte' 
FUNC                              COUNT
hugetlb_mcopy_atomic_pte           6240


[opc@sidhakum-devel kernel (master)]$ sudo ./funccount 'copy_hugetlb_page_range' 
FUNC                              COUNT
copy_hugetlb_page_range               3

============== PERFORMANCE ============================
Using bpftrace to track time spent in the fault functions over 10 rounds of
the LTP hugetlb tests:
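
A minimal sketch of such a measurement, shown here for hugetlb_wp
(assuming kprobe/kretprobe attachment is available for these symbols; the
exact script used is not reproduced here):

[opc@sidhakum-devel kernel (master)]$ sudo bpftrace -e '
	kprobe:hugetlb_wp { @start[tid] = nsecs; }
	kretprobe:hugetlb_wp /@start[tid]/ {
		@hugetlb_wp_nsecs = hist(nsecs - @start[tid]);
		delete(@start[tid]);
	}'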

pre-patch:

@hugetlb_wp_nsecs:
[256, 512)          3675 |@@@@@@@@@@                                          |
[512, 1K)          18875 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[1K, 2K)            1366 |@@@                                                 |
[2K, 4K)              77 |                                                    |
[4K, 8K)              12 |                                                    |
[8K, 16K)             10 |                                                    |
[16K, 32K)             2 |                                                    |
[32K, 64K)             0 |                                                    |
[64K, 128K)            0 |                                                    |
[128K, 256K)           0 |                                                    |
[256K, 512K)           0 |                                                    |
[512K, 1M)             0 |                                                    |
[1M, 2M)               0 |                                                    |
[2M, 4M)               0 |                                                    |
[4M, 8M)               0 |                                                    |
[8M, 16M)              0 |                                                    |
[16M, 32M)             0 |                                                    |
[32M, 64M)             0 |                                                    |
[64M, 128M)            0 |                                                    |
[128M, 256M)           0 |                                                    |
[256M, 512M)           0 |                                                    |
[512M, 1G)             0 |                                                    |
[1G, 2G)               0 |                                                    |
[2G, 4G)               1 |                                                    |

@hugetlb_no_page_nsecs:
[64, 128)              1 |                                                    |
[128, 256)             0 |                                                    |
[256, 512)            67 |                                                    |
[512, 1K)             66 |                                                    |
[1K, 2K)              65 |                                                    |
[2K, 4K)             198 |                                                    |
[4K, 8K)              97 |                                                    |
[8K, 16K)              3 |                                                    |
[16K, 32K)             4 |                                                    |
[32K, 64K)           678 |                                                    |
[64K, 128K)         3401 |@                                                   |
[128K, 256K)       96746 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[256K, 512K)         107 |                                                    |
[512K, 1M)             0 |                                                    |
[1M, 2M)               1 |                                                    |
[2M, 4M)               0 |                                                    |
[4M, 8M)               1 |                                                    |
[8M, 16M)              0 |                                                    |
[16M, 32M)             0 |                                                    |
[32M, 64M)             0 |                                                    |
[64M, 128M)            0 |                                                    |
[128M, 256M)           0 |                                                    |
[256M, 512M)           0 |                                                    |
[512M, 1G)             0 |                                                    |
[1G, 2G)               0 |                                                    |
[2G, 4G)             283 |                                                    |


post-patch:

@hugetlb_wp_nsecs: 
[256, 512)          7640 |@@@@@@@@@@@@@@@@@                                   |
[512, 1K)          23314 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[1K, 2K)            1171 |@@                                                  |
[2K, 4K)              60 |                                                    |
[4K, 8K)              14 |                                                    |
[8K, 16K)             29 |                                                    |
[16K, 32K)             2 |                                                    |

@hugetlb_no_page_nsecs:
[256, 512)            64 |                                                    |
[512, 1K)             75 |                                                    |
[1K, 2K)              70 |                                                    |
[2K, 4K)             209 |                                                    |
[4K, 8K)              73 |                                                    |
[8K, 16K)              5 |                                                    |
[16K, 32K)             1 |                                                    |
[32K, 64K)          2562 |@                                                   |
[64K, 128K)         7084 |@@@                                                 |
[128K, 256K)       99292 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[256K, 512K)          66 |                                                    |
[512K, 1M)             0 |                                                    |
[1M, 2M)               1 |                                                    |
[2M, 4M)               0 |                                                    |
[4M, 8M)               0 |                                                    |
[8M, 16M)              0 |                                                    |
[16M, 32M)             0 |                                                    |
[32M, 64M)             0 |                                                    |
[64M, 128M)            0 |                                                    |
[128M, 256M)           0 |                                                    |
[256M, 512M)           0 |                                                    |
[512M, 1G)             0 |                                                    |
[1G, 2G)               0 |                                                    |
[2G, 4G)             428 |                                                    |



Rebased on 01/18/2023.

Sidhartha Kumar (9):
  mm/hugetlb: convert hugetlb_install_page to folios
  mm/hugetlb: convert hugetlbfs_pagecache_present() to folios
  mm/hugetlb: convert putback_active_hugepage to take in a folio
  mm/rmap: change hugepage_add_new_anon_rmap to take in a folio
  mm/hugetlb: convert alloc_huge_page to alloc_hugetlb_folio
  mm/hugetlb: convert restore_reserve_on_error to take in a folio
  mm/hugetlb: convert hugetlb_add_to_page_cache to take in a folio
  mm/hugetlb: convert hugetlb_wp() to take in a folio
  Documentation/mm: update hugetlbfs documentation to mention
    alloc_hugetlb_folio

 Documentation/mm/hugetlbfs_reserv.rst         |  21 +-
 .../zh_CN/mm/hugetlbfs_reserv.rst             |  14 +-
 fs/hugetlbfs/inode.c                          |  38 +--
 include/linux/hugetlb.h                       |  16 +-
 include/linux/rmap.h                          |   2 +-
 mm/hugetlb.c                                  | 249 +++++++++---------
 mm/mempolicy.c                                |   6 +-
 mm/migrate.c                                  |   8 +-
 mm/rmap.c                                     |   6 +-
 9 files changed, 179 insertions(+), 181 deletions(-)

-- 
2.39.0



* [PATCH 1/9] mm/hugetlb: convert hugetlb_install_page to folios
  2023-01-19 21:14 [PATCH 0/9] convert hugetlb fault functions to folios Sidhartha Kumar
@ 2023-01-19 21:14 ` Sidhartha Kumar
  2023-01-20 23:21   ` Mike Kravetz
  2023-01-19 21:14 ` [PATCH 2/9] mm/hugetlb: convert hugetlbfs_pagecache_present() " Sidhartha Kumar
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 21+ messages in thread
From: Sidhartha Kumar @ 2023-01-19 21:14 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: akpm, songmuchun, mike.kravetz, willy, jhubbard, Sidhartha Kumar

Change hugetlb_install_page() to hugetlb_install_folio(). This removes
one user of the Huge Page flag macros which take in a page.

Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
---
 mm/hugetlb.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index d20c8b09890e..849206e94742 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4946,14 +4946,14 @@ static bool is_hugetlb_entry_hwpoisoned(pte_t pte)
 }
 
 static void
-hugetlb_install_page(struct vm_area_struct *vma, pte_t *ptep, unsigned long addr,
-		     struct page *new_page)
+hugetlb_install_folio(struct vm_area_struct *vma, pte_t *ptep, unsigned long addr,
+		     struct folio *new_folio)
 {
-	__SetPageUptodate(new_page);
-	hugepage_add_new_anon_rmap(new_page, vma, addr);
-	set_huge_pte_at(vma->vm_mm, addr, ptep, make_huge_pte(vma, new_page, 1));
+	__folio_mark_uptodate(new_folio);
+	hugepage_add_new_anon_rmap(&new_folio->page, vma, addr);
+	set_huge_pte_at(vma->vm_mm, addr, ptep, make_huge_pte(vma, &new_folio->page, 1));
 	hugetlb_count_add(pages_per_huge_page(hstate_vma(vma)), vma->vm_mm);
-	SetHPageMigratable(new_page);
+	folio_set_hugetlb_migratable(new_folio);
 }
 
 int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
@@ -5107,7 +5107,7 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
 					/* huge_ptep of dst_pte won't change as in child */
 					goto again;
 				}
-				hugetlb_install_page(dst_vma, dst_pte, addr, new);
+				hugetlb_install_folio(dst_vma, dst_pte, addr, page_folio(new));
 				spin_unlock(src_ptl);
 				spin_unlock(dst_ptl);
 				continue;
-- 
2.39.0



* [PATCH 2/9] mm/hugetlb: convert hugetlbfs_pagecache_present() to folios
  2023-01-19 21:14 [PATCH 0/9] convert hugetlb fault functions to folios Sidhartha Kumar
  2023-01-19 21:14 ` [PATCH 1/9] mm/hugetlb: convert hugetlb_install_page " Sidhartha Kumar
@ 2023-01-19 21:14 ` Sidhartha Kumar
  2023-01-20  5:43   ` Matthew Wilcox
  2023-01-19 21:14 ` [PATCH 3/9] mm/hugetlb: convert putback_active_hugepage to take in a folio Sidhartha Kumar
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 21+ messages in thread
From: Sidhartha Kumar @ 2023-01-19 21:14 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: akpm, songmuchun, mike.kravetz, willy, jhubbard, Sidhartha Kumar

Convert hugetlbfs_pagecache_present() to use folios internally by
replacing a call to find_get_page() with filemap_get_folio().

Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
---
 mm/hugetlb.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 849206e94742..04cbdf5025a5 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5653,15 +5653,15 @@ static bool hugetlbfs_pagecache_present(struct hstate *h,
 {
 	struct address_space *mapping;
 	pgoff_t idx;
-	struct page *page;
+	struct folio *folio;
 
 	mapping = vma->vm_file->f_mapping;
 	idx = vma_hugecache_offset(h, vma, address);
 
-	page = find_get_page(mapping, idx);
-	if (page)
-		put_page(page);
-	return page != NULL;
+	folio = filemap_get_folio(mapping, idx);
+	if (folio)
+		folio_put(folio);
+	return folio != NULL;
 }
 
 int hugetlb_add_to_page_cache(struct page *page, struct address_space *mapping,
-- 
2.39.0



* [PATCH 3/9] mm/hugetlb: convert putback_active_hugepage to take in a folio
  2023-01-19 21:14 [PATCH 0/9] convert hugetlb fault functions to folios Sidhartha Kumar
  2023-01-19 21:14 ` [PATCH 1/9] mm/hugetlb: convert hugetlb_install_page " Sidhartha Kumar
  2023-01-19 21:14 ` [PATCH 2/9] mm/hugetlb: convert hugetlbfs_pagecache_present() " Sidhartha Kumar
@ 2023-01-19 21:14 ` Sidhartha Kumar
  2023-01-20  5:48   ` Matthew Wilcox
  2023-01-20 23:47   ` Mike Kravetz
  2023-01-19 21:14 ` [PATCH 4/9] mm/rmap: change hugepage_add_new_anon_rmap " Sidhartha Kumar
                   ` (5 subsequent siblings)
  8 siblings, 2 replies; 21+ messages in thread
From: Sidhartha Kumar @ 2023-01-19 21:14 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: akpm, songmuchun, mike.kravetz, willy, jhubbard, Sidhartha Kumar

Convert putback_active_hugepage() to putback_active_hugetlb_folio(); this
removes one user of the Huge Page macros which take in a page. The callers
in migrate.c are also cleaned up, as they can now use the src and dst
folio variables directly.

Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
---
 include/linux/hugetlb.h | 4 ++--
 mm/hugetlb.c            | 8 ++++----
 mm/migrate.c            | 8 ++++----
 3 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 06eb59711e4a..f88c832bdfa4 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -174,7 +174,7 @@ int isolate_hugetlb(struct folio *folio, struct list_head *list);
 int get_hwpoison_hugetlb_folio(struct folio *folio, bool *hugetlb, bool unpoison);
 int get_huge_page_for_hwpoison(unsigned long pfn, int flags,
 				bool *migratable_cleared);
-void putback_active_hugepage(struct page *page);
+void putback_active_hugetlb_folio(struct folio *folio);
 void move_hugetlb_state(struct folio *old_folio, struct folio *new_folio, int reason);
 void free_huge_page(struct page *page);
 void hugetlb_fix_reserve_counts(struct inode *inode);
@@ -428,7 +428,7 @@ static inline int get_huge_page_for_hwpoison(unsigned long pfn, int flags,
 	return 0;
 }
 
-static inline void putback_active_hugepage(struct page *page)
+static inline void putback_active_hugetlb_folio(struct folio *folio)
 {
 }
 
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 04cbdf5025a5..c37a26c8392c 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -7302,13 +7302,13 @@ int get_huge_page_for_hwpoison(unsigned long pfn, int flags,
 	return ret;
 }
 
-void putback_active_hugepage(struct page *page)
+void putback_active_hugetlb_folio(struct folio *folio)
 {
 	spin_lock_irq(&hugetlb_lock);
-	SetHPageMigratable(page);
-	list_move_tail(&page->lru, &(page_hstate(page))->hugepage_activelist);
+	folio_set_hugetlb_migratable(folio);
+	list_move_tail(&folio->lru, &(folio_hstate(folio))->hugepage_activelist);
 	spin_unlock_irq(&hugetlb_lock);
-	put_page(page);
+	folio_put(folio);
 }
 
 void move_hugetlb_state(struct folio *old_folio, struct folio *new_folio, int reason)
diff --git a/mm/migrate.c b/mm/migrate.c
index 92d53e0b609a..a35f2b36df01 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -151,7 +151,7 @@ void putback_movable_pages(struct list_head *l)
 
 	list_for_each_entry_safe(page, page2, l, lru) {
 		if (unlikely(PageHuge(page))) {
-			putback_active_hugepage(page);
+			putback_active_hugetlb_folio(page_folio(page));
 			continue;
 		}
 		list_del(&page->lru);
@@ -1374,7 +1374,7 @@ static int unmap_and_move_huge_page(new_page_t get_new_page,
 
 	if (folio_ref_count(src) == 1) {
 		/* page was freed from under us. So we are done. */
-		putback_active_hugepage(hpage);
+		putback_active_hugetlb_folio(src);
 		return MIGRATEPAGE_SUCCESS;
 	}
 
@@ -1459,7 +1459,7 @@ static int unmap_and_move_huge_page(new_page_t get_new_page,
 	folio_unlock(src);
 out:
 	if (rc == MIGRATEPAGE_SUCCESS)
-		putback_active_hugepage(hpage);
+		putback_active_hugetlb_folio(src);
 	else if (rc != -EAGAIN)
 		list_move_tail(&src->lru, ret);
 
@@ -1471,7 +1471,7 @@ static int unmap_and_move_huge_page(new_page_t get_new_page,
 	if (put_new_page)
 		put_new_page(new_hpage, private);
 	else
-		putback_active_hugepage(new_hpage);
+		putback_active_hugetlb_folio(dst);
 
 	return rc;
 }
-- 
2.39.0



* [PATCH 4/9] mm/rmap: change hugepage_add_new_anon_rmap to take in a folio
  2023-01-19 21:14 [PATCH 0/9] convert hugetlb fault functions to folios Sidhartha Kumar
                   ` (2 preceding siblings ...)
  2023-01-19 21:14 ` [PATCH 3/9] mm/hugetlb: convert putback_active_hugepage to take in a folio Sidhartha Kumar
@ 2023-01-19 21:14 ` Sidhartha Kumar
  2023-01-20  6:00   ` Matthew Wilcox
  2023-01-19 21:14 ` [PATCH 5/9] mm/hugetlb: convert alloc_huge_page to alloc_hugetlb_folio Sidhartha Kumar
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 21+ messages in thread
From: Sidhartha Kumar @ 2023-01-19 21:14 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: akpm, songmuchun, mike.kravetz, willy, jhubbard, Sidhartha Kumar

Change hugepage_add_new_anon_rmap() to take in a folio directly. While
adding a folio variable to the four call sites, replace the page functions
with their folio equivalents. This removes three callers of the Huge Page
macros which take in a page.

Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
---
 include/linux/rmap.h |  2 +-
 mm/hugetlb.c         | 90 +++++++++++++++++++++++++-------------------
 mm/rmap.c            |  6 +--
 3 files changed, 55 insertions(+), 43 deletions(-)

diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index a6bd1f0a183d..a4570da03e58 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -203,7 +203,7 @@ void page_remove_rmap(struct page *, struct vm_area_struct *,
 
 void hugepage_add_anon_rmap(struct page *, struct vm_area_struct *,
 		unsigned long address, rmap_t flags);
-void hugepage_add_new_anon_rmap(struct page *, struct vm_area_struct *,
+void hugepage_add_new_anon_rmap(struct folio *, struct vm_area_struct *,
 		unsigned long address);
 
 static inline void __page_dup_rmap(struct page *page, bool compound)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index c37a26c8392c..6f25055c3ba5 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4950,7 +4950,7 @@ hugetlb_install_folio(struct vm_area_struct *vma, pte_t *ptep, unsigned long add
 		     struct folio *new_folio)
 {
 	__folio_mark_uptodate(new_folio);
-	hugepage_add_new_anon_rmap(&new_folio->page, vma, addr);
+	hugepage_add_new_anon_rmap(new_folio, vma, addr);
 	set_huge_pte_at(vma->vm_mm, addr, ptep, make_huge_pte(vma, &new_folio->page, 1));
 	hugetlb_count_add(pages_per_huge_page(hstate_vma(vma)), vma->vm_mm);
 	folio_set_hugetlb_migratable(new_folio);
@@ -5479,6 +5479,7 @@ static vm_fault_t hugetlb_wp(struct mm_struct *mm, struct vm_area_struct *vma,
 	pte_t pte;
 	struct hstate *h = hstate_vma(vma);
 	struct page *old_page, *new_page;
+	struct folio *new_folio = NULL;
 	int outside_reserve = 0;
 	vm_fault_t ret = 0;
 	unsigned long haddr = address & huge_page_mask(h);
@@ -5590,6 +5591,9 @@ static vm_fault_t hugetlb_wp(struct mm_struct *mm, struct vm_area_struct *vma,
 		goto out_release_old;
 	}
 
+	if (new_page)
+		new_folio = page_folio(new_page);
+
 	/*
 	 * When the original hugepage is shared one, it does not have
 	 * anon_vma prepared.
@@ -5599,9 +5603,9 @@ static vm_fault_t hugetlb_wp(struct mm_struct *mm, struct vm_area_struct *vma,
 		goto out_release_all;
 	}
 
-	copy_user_huge_page(new_page, old_page, address, vma,
+	copy_user_huge_page(&new_folio->page, old_page, address, vma,
 			    pages_per_huge_page(h));
-	__SetPageUptodate(new_page);
+	__folio_mark_uptodate(new_folio);
 
 	mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, mm, haddr,
 				haddr + huge_page_size(h));
@@ -5618,10 +5622,10 @@ static vm_fault_t hugetlb_wp(struct mm_struct *mm, struct vm_area_struct *vma,
 		huge_ptep_clear_flush(vma, haddr, ptep);
 		mmu_notifier_invalidate_range(mm, range.start, range.end);
 		page_remove_rmap(old_page, vma, true);
-		hugepage_add_new_anon_rmap(new_page, vma, haddr);
+		hugepage_add_new_anon_rmap(new_folio, vma, haddr);
 		set_huge_pte_at(mm, haddr, ptep,
-				make_huge_pte(vma, new_page, !unshare));
-		SetHPageMigratable(new_page);
+				make_huge_pte(vma, &new_folio->page, !unshare));
+		folio_set_hugetlb_migratable(new_folio);
 		/* Make the old page be freed below */
 		new_page = old_page;
 	}
@@ -5633,8 +5637,8 @@ static vm_fault_t hugetlb_wp(struct mm_struct *mm, struct vm_area_struct *vma,
 	 * unshare)
 	 */
 	if (new_page != old_page)
-		restore_reserve_on_error(h, vma, haddr, new_page);
-	put_page(new_page);
+		restore_reserve_on_error(h, vma, haddr, &new_folio->page);
+	folio_put(new_folio);
 out_release_old:
 	put_page(old_page);
 
@@ -5756,6 +5760,7 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
 	int anon_rmap = 0;
 	unsigned long size;
 	struct page *page;
+	struct folio *folio = NULL;
 	pte_t new_pte;
 	spinlock_t *ptl;
 	unsigned long haddr = address & huge_page_mask(h);
@@ -5833,12 +5838,16 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
 				ret = 0;
 			goto out;
 		}
-		clear_huge_page(page, address, pages_per_huge_page(h));
-		__SetPageUptodate(page);
+
+		if (page)
+			folio = page_folio(page);
+
+		clear_huge_page(&folio->page, address, pages_per_huge_page(h));
+		__folio_mark_uptodate(folio);
 		new_page = true;
 
 		if (vma->vm_flags & VM_MAYSHARE) {
-			int err = hugetlb_add_to_page_cache(page, mapping, idx);
+			int err = hugetlb_add_to_page_cache(&folio->page, mapping, idx);
 			if (err) {
 				/*
 				 * err can't be -EEXIST which implies someone
@@ -5847,13 +5856,13 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
 				 * to the page cache. So it's safe to call
 				 * restore_reserve_on_error() here.
 				 */
-				restore_reserve_on_error(h, vma, haddr, page);
-				put_page(page);
+				restore_reserve_on_error(h, vma, haddr, &folio->page);
+				folio_put(folio);
 				goto out;
 			}
 			new_pagecache_page = true;
 		} else {
-			lock_page(page);
+			folio_lock(folio);
 			if (unlikely(anon_vma_prepare(vma))) {
 				ret = VM_FAULT_OOM;
 				goto backout_unlocked;
@@ -5861,12 +5870,13 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
 			anon_rmap = 1;
 		}
 	} else {
+		folio = page_folio(page);
 		/*
 		 * If memory error occurs between mmap() and fault, some process
 		 * don't have hwpoisoned swap entry for errored virtual address.
 		 * So we need to block hugepage fault by PG_hwpoison bit check.
 		 */
-		if (unlikely(PageHWPoison(page))) {
+		if (unlikely(folio_test_hwpoison(folio))) {
 			ret = VM_FAULT_HWPOISON_LARGE |
 				VM_FAULT_SET_HINDEX(hstate_index(h));
 			goto backout_unlocked;
@@ -5874,8 +5884,8 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
 
 		/* Check for page in userfault range. */
 		if (userfaultfd_minor(vma)) {
-			unlock_page(page);
-			put_page(page);
+			folio_unlock(folio);
+			folio_put(folio);
 			/* See comment in userfaultfd_missing() block above */
 			if (!hugetlb_pte_stable(h, mm, ptep, old_pte)) {
 				ret = 0;
@@ -5909,10 +5919,10 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
 		goto backout;
 
 	if (anon_rmap)
-		hugepage_add_new_anon_rmap(page, vma, haddr);
+		hugepage_add_new_anon_rmap(folio, vma, haddr);
 	else
-		page_dup_file_rmap(page, true);
-	new_pte = make_huge_pte(vma, page, ((vma->vm_flags & VM_WRITE)
+		page_dup_file_rmap(&folio->page, true);
+	new_pte = make_huge_pte(vma, &folio->page, ((vma->vm_flags & VM_WRITE)
 				&& (vma->vm_flags & VM_SHARED)));
 	/*
 	 * If this pte was previously wr-protected, keep it wr-protected even
@@ -5925,7 +5935,7 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
 	hugetlb_count_add(pages_per_huge_page(h), mm);
 	if ((flags & FAULT_FLAG_WRITE) && !(vma->vm_flags & VM_SHARED)) {
 		/* Optimization, do the COW without a second fault */
-		ret = hugetlb_wp(mm, vma, address, ptep, flags, page, ptl);
+		ret = hugetlb_wp(mm, vma, address, ptep, flags, &folio->page, ptl);
 	}
 
 	spin_unlock(ptl);
@@ -5936,9 +5946,9 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
 	 * been isolated for migration.
 	 */
 	if (new_page)
-		SetHPageMigratable(page);
+		folio_set_hugetlb_migratable(folio);
 
-	unlock_page(page);
+	folio_unlock(folio);
 out:
 	hugetlb_vma_unlock_read(vma);
 	mutex_unlock(&hugetlb_fault_mutex_table[hash]);
@@ -5948,10 +5958,10 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
 	spin_unlock(ptl);
 backout_unlocked:
 	if (new_page && !new_pagecache_page)
-		restore_reserve_on_error(h, vma, haddr, page);
+		restore_reserve_on_error(h, vma, haddr, &folio->page);
 
-	unlock_page(page);
-	put_page(page);
+	folio_unlock(folio);
+	folio_put(folio);
 	goto out;
 }
 
@@ -6176,6 +6186,7 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
 	spinlock_t *ptl;
 	int ret = -ENOMEM;
 	struct page *page;
+	struct folio *folio = NULL;
 	int writable;
 	bool page_in_pagecache = false;
 
@@ -6251,12 +6262,15 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
 		*pagep = NULL;
 	}
 
+	if (page)
+		folio = page_folio(page);
+
 	/*
-	 * The memory barrier inside __SetPageUptodate makes sure that
+	 * The memory barrier inside __folio_mark_uptodate makes sure that
 	 * preceding stores to the page contents become visible before
 	 * the set_pte_at() write.
 	 */
-	__SetPageUptodate(page);
+	__folio_mark_uptodate(folio);
 
 	/* Add shared, newly allocated pages to the page cache. */
 	if (vm_shared && !is_continue) {
@@ -6271,7 +6285,7 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
 		 * hugetlb_fault_mutex_table that here must be hold by
 		 * the caller.
 		 */
-		ret = hugetlb_add_to_page_cache(page, mapping, idx);
+		ret = hugetlb_add_to_page_cache(&folio->page, mapping, idx);
 		if (ret)
 			goto out_release_nounlock;
 		page_in_pagecache = true;
@@ -6280,7 +6294,7 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
 	ptl = huge_pte_lock(h, dst_mm, dst_pte);
 
 	ret = -EIO;
-	if (PageHWPoison(page))
+	if (folio_test_hwpoison(folio))
 		goto out_release_unlock;
 
 	/*
@@ -6293,9 +6307,9 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
 		goto out_release_unlock;
 
 	if (page_in_pagecache)
-		page_dup_file_rmap(page, true);
+		page_dup_file_rmap(&folio->page, true);
 	else
-		hugepage_add_new_anon_rmap(page, dst_vma, dst_addr);
+		hugepage_add_new_anon_rmap(folio, dst_vma, dst_addr);
 
 	/*
 	 * For either: (1) CONTINUE on a non-shared VMA, or (2) UFFDIO_COPY
@@ -6306,7 +6320,7 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
 	else
 		writable = dst_vma->vm_flags & VM_WRITE;
 
-	_dst_pte = make_huge_pte(dst_vma, page, writable);
+	_dst_pte = make_huge_pte(dst_vma, &folio->page, writable);
 	/*
 	 * Always mark UFFDIO_COPY page dirty; note that this may not be
 	 * extremely important for hugetlbfs for now since swapping is not
@@ -6328,20 +6342,20 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
 
 	spin_unlock(ptl);
 	if (!is_continue)
-		SetHPageMigratable(page);
+		folio_set_hugetlb_migratable(folio);
 	if (vm_shared || is_continue)
-		unlock_page(page);
+		folio_unlock(folio);
 	ret = 0;
 out:
 	return ret;
 out_release_unlock:
 	spin_unlock(ptl);
 	if (vm_shared || is_continue)
-		unlock_page(page);
+		folio_unlock(folio);
 out_release_nounlock:
 	if (!page_in_pagecache)
-		restore_reserve_on_error(h, dst_vma, dst_addr, page);
-	put_page(page);
+		restore_reserve_on_error(h, dst_vma, dst_addr, &folio->page);
+	folio_put(folio);
 	goto out;
 }
 #endif /* CONFIG_USERFAULTFD */
diff --git a/mm/rmap.c b/mm/rmap.c
index 948ca17a96ad..e6d94bd19879 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -2547,15 +2547,13 @@ void hugepage_add_anon_rmap(struct page *page, struct vm_area_struct *vma,
 				     !!(flags & RMAP_EXCLUSIVE));
 }
 
-void hugepage_add_new_anon_rmap(struct page *page,
+void hugepage_add_new_anon_rmap(struct folio *folio,
 			struct vm_area_struct *vma, unsigned long address)
 {
-	struct folio *folio = page_folio(page);
-
 	BUG_ON(address < vma->vm_start || address >= vma->vm_end);
 	/* increment count (starts at -1) */
 	atomic_set(&folio->_entire_mapcount, 0);
 	folio_clear_hugetlb_restore_reserve(folio);
-	__page_set_anon_rmap(folio, page, vma, address, 1);
+	__page_set_anon_rmap(folio, &folio->page, vma, address, 1);
 }
 #endif /* CONFIG_HUGETLB_PAGE */
-- 
2.39.0



* [PATCH 5/9] mm/hugetlb: convert alloc_huge_page to alloc_hugetlb_folio
  2023-01-19 21:14 [PATCH 0/9] convert hugetlb fault functions to folios Sidhartha Kumar
                   ` (3 preceding siblings ...)
  2023-01-19 21:14 ` [PATCH 4/9] mm/rmap: change hugepage_add_new_anon_rmap " Sidhartha Kumar
@ 2023-01-19 21:14 ` Sidhartha Kumar
  2023-01-19 21:14 ` [PATCH 6/9] mm/hugetlb: convert restore_reserve_on_error to take in a folio Sidhartha Kumar
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 21+ messages in thread
From: Sidhartha Kumar @ 2023-01-19 21:14 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: akpm, songmuchun, mike.kravetz, willy, jhubbard, Sidhartha Kumar

Change alloc_huge_page() to alloc_hugetlb_folio() by changing all callers
to handle the function's new folio return type. In this conversion,
alloc_huge_page_vma() is also changed to alloc_hugetlb_folio_vma(). Many
of the &folio->page uses introduced here are cleaned up in subsequent
patches.

Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
---
 fs/hugetlbfs/inode.c    |  38 +++++++-------
 include/linux/hugetlb.h |   8 +--
 mm/hugetlb.c            | 113 +++++++++++++++++-----------------------
 mm/mempolicy.c          |   6 ++-
 4 files changed, 76 insertions(+), 89 deletions(-)

diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 48f1a8ad2243..e2f8951103be 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -819,7 +819,7 @@ static long hugetlbfs_fallocate(struct file *file, int mode, loff_t offset,
 		 * This is supposed to be the vaddr where the page is being
 		 * faulted in, but we have no vaddr here.
 		 */
-		struct page *page;
+		struct folio *folio;
 		unsigned long addr;
 
 		cond_resched();
@@ -844,48 +844,48 @@ static long hugetlbfs_fallocate(struct file *file, int mode, loff_t offset,
 		mutex_lock(&hugetlb_fault_mutex_table[hash]);
 
 		/* See if already present in mapping to avoid alloc/free */
-		page = find_get_page(mapping, index);
-		if (page) {
-			put_page(page);
+		folio = filemap_get_folio(mapping, index);
+		if (folio) {
+			folio_put(folio);
 			mutex_unlock(&hugetlb_fault_mutex_table[hash]);
 			hugetlb_drop_vma_policy(&pseudo_vma);
 			continue;
 		}
 
 		/*
-		 * Allocate page without setting the avoid_reserve argument.
+		 * Allocate folio without setting the avoid_reserve argument.
 		 * There certainly are no reserves associated with the
 		 * pseudo_vma.  However, there could be shared mappings with
 		 * reserves for the file at the inode level.  If we fallocate
-		 * pages in these areas, we need to consume the reserves
+		 * folios in these areas, we need to consume the reserves
 		 * to keep reservation accounting consistent.
 		 */
-		page = alloc_huge_page(&pseudo_vma, addr, 0);
+		folio = alloc_hugetlb_folio(&pseudo_vma, addr, 0);
 		hugetlb_drop_vma_policy(&pseudo_vma);
-		if (IS_ERR(page)) {
+		if (IS_ERR(folio)) {
 			mutex_unlock(&hugetlb_fault_mutex_table[hash]);
-			error = PTR_ERR(page);
+			error = PTR_ERR(folio);
 			goto out;
 		}
-		clear_huge_page(page, addr, pages_per_huge_page(h));
-		__SetPageUptodate(page);
-		error = hugetlb_add_to_page_cache(page, mapping, index);
+		clear_huge_page(&folio->page, addr, pages_per_huge_page(h));
+		__folio_mark_uptodate(folio);
+		error = hugetlb_add_to_page_cache(&folio->page, mapping, index);
 		if (unlikely(error)) {
-			restore_reserve_on_error(h, &pseudo_vma, addr, page);
-			put_page(page);
+			restore_reserve_on_error(h, &pseudo_vma, addr, &folio->page);
+			folio_put(folio);
 			mutex_unlock(&hugetlb_fault_mutex_table[hash]);
 			goto out;
 		}
 
 		mutex_unlock(&hugetlb_fault_mutex_table[hash]);
 
-		SetHPageMigratable(page);
+		folio_set_hugetlb_migratable(folio);
 		/*
-		 * unlock_page because locked by hugetlb_add_to_page_cache()
-		 * put_page() due to reference from alloc_huge_page()
+		 * folio_unlock because locked by hugetlb_add_to_page_cache()
+		 * folio_put() due to reference from alloc_hugetlb_folio()
 		 */
-		unlock_page(page);
-		put_page(page);
+		folio_unlock(folio);
+		folio_put(folio);
 	}
 
 	if (!(mode & FALLOC_FL_KEEP_SIZE) && offset + len > inode->i_size)
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index f88c832bdfa4..8f054d81e63b 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -716,11 +716,11 @@ struct huge_bootmem_page {
 };
 
 int isolate_or_dissolve_huge_page(struct page *page, struct list_head *list);
-struct page *alloc_huge_page(struct vm_area_struct *vma,
+struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
 				unsigned long addr, int avoid_reserve);
 struct folio *alloc_hugetlb_folio_nodemask(struct hstate *h, int preferred_nid,
 				nodemask_t *nmask, gfp_t gfp_mask);
-struct page *alloc_huge_page_vma(struct hstate *h, struct vm_area_struct *vma,
+struct folio *alloc_hugetlb_folio_vma(struct hstate *h, struct vm_area_struct *vma,
 				unsigned long address);
 int hugetlb_add_to_page_cache(struct page *page, struct address_space *mapping,
 			pgoff_t idx);
@@ -1032,7 +1032,7 @@ static inline int isolate_or_dissolve_huge_page(struct page *page,
 	return -ENOMEM;
 }
 
-static inline struct page *alloc_huge_page(struct vm_area_struct *vma,
+static inline struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
 					   unsigned long addr,
 					   int avoid_reserve)
 {
@@ -1046,7 +1046,7 @@ alloc_hugetlb_folio_nodemask(struct hstate *h, int preferred_nid,
 	return NULL;
 }
 
-static inline struct page *alloc_huge_page_vma(struct hstate *h,
+static inline struct folio *alloc_hugetlb_folio_vma(struct hstate *h,
 					       struct vm_area_struct *vma,
 					       unsigned long address)
 {
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 6f25055c3ba5..46467e80716f 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2493,7 +2493,7 @@ struct folio *alloc_hugetlb_folio_nodemask(struct hstate *h, int preferred_nid,
 }
 
 /* mempolicy aware migration callback */
-struct page *alloc_huge_page_vma(struct hstate *h, struct vm_area_struct *vma,
+struct folio *alloc_hugetlb_folio_vma(struct hstate *h, struct vm_area_struct *vma,
 		unsigned long address)
 {
 	struct mempolicy *mpol;
@@ -2507,7 +2507,7 @@ struct page *alloc_huge_page_vma(struct hstate *h, struct vm_area_struct *vma,
 	folio = alloc_hugetlb_folio_nodemask(h, node, nodemask, gfp_mask);
 	mpol_cond_put(mpol);
 
-	return &folio->page;
+	return folio;
 }
 
 /*
@@ -2798,14 +2798,14 @@ static long vma_del_reservation(struct hstate *h,
 
 /*
  * This routine is called to restore reservation information on error paths.
- * It should ONLY be called for pages allocated via alloc_huge_page(), and
- * the hugetlb mutex should remain held when calling this routine.
+ * It should ONLY be called for folios allocated via alloc_hugetlb_folio(),
+ * and the hugetlb mutex should remain held when calling this routine.
  *
  * It handles two specific cases:
  * 1) A reservation was in place and the page consumed the reservation.
  *    HPageRestoreReserve is set in the page.
  * 2) No reservation was in place for the page, so HPageRestoreReserve is
- *    not set.  However, alloc_huge_page always updates the reserve map.
+ *    not set.  However, alloc_hugetlb_folio always updates the reserve map.
  *
  * In case 1, free_huge_page later in the error path will increment the
  * global reserve count.  But, free_huge_page does not have enough context
@@ -2814,7 +2814,7 @@ static long vma_del_reservation(struct hstate *h,
  * reserve count adjustments to be made by free_huge_page.  Make sure the
  * reserve map indicates there is a reservation present.
  *
- * In case 2, simply undo reserve map modifications done by alloc_huge_page.
+ * In case 2, simply undo reserve map modifications done by alloc_hugetlb_folio.
  */
 void restore_reserve_on_error(struct hstate *h, struct vm_area_struct *vma,
 			unsigned long address, struct page *page)
@@ -2844,8 +2844,8 @@ void restore_reserve_on_error(struct hstate *h, struct vm_area_struct *vma,
 		if (!rc) {
 			/*
 			 * This indicates there is an entry in the reserve map
-			 * not added by alloc_huge_page.  We know it was added
-			 * before the alloc_huge_page call, otherwise
+			 * not added by alloc_hugetlb_folio.  We know it was added
+			 * before the alloc_hugetlb_folio call, otherwise
 			 * hugetlb_restore_reserve would be set on the folio.
 			 * Remove the entry so that a subsequent allocation
 			 * does not consume a reservation.
@@ -3014,7 +3014,7 @@ int isolate_or_dissolve_huge_page(struct page *page, struct list_head *list)
 	return ret;
 }
 
-struct page *alloc_huge_page(struct vm_area_struct *vma,
+struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
 				    unsigned long addr, int avoid_reserve)
 {
 	struct hugepage_subpool *spool = subpool_vma(vma);
@@ -3130,7 +3130,7 @@ struct page *alloc_huge_page(struct vm_area_struct *vma,
 			hugetlb_cgroup_uncharge_folio_rsvd(hstate_index(h),
 					pages_per_huge_page(h), folio);
 	}
-	return &folio->page;
+	return folio;
 
 out_uncharge_cgroup:
 	hugetlb_cgroup_uncharge_cgroup(idx, pages_per_huge_page(h), h_cg);
@@ -5080,34 +5080,34 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
 			} else if (page_try_dup_anon_rmap(ptepage, true,
 							  src_vma)) {
 				pte_t src_pte_old = entry;
-				struct page *new;
+				struct folio *new_folio;
 
 				spin_unlock(src_ptl);
 				spin_unlock(dst_ptl);
 				/* Do not use reserve as it's private owned */
-				new = alloc_huge_page(dst_vma, addr, 1);
-				if (IS_ERR(new)) {
+				new_folio = alloc_hugetlb_folio(dst_vma, addr, 1);
+				if (IS_ERR(new_folio)) {
 					put_page(ptepage);
-					ret = PTR_ERR(new);
+					ret = PTR_ERR(new_folio);
 					break;
 				}
-				copy_user_huge_page(new, ptepage, addr, dst_vma,
+				copy_user_huge_page(&new_folio->page, ptepage, addr, dst_vma,
 						    npages);
 				put_page(ptepage);
 
-				/* Install the new huge page if src pte stable */
+				/* Install the new hugetlb folio if src pte stable */
 				dst_ptl = huge_pte_lock(h, dst, dst_pte);
 				src_ptl = huge_pte_lockptr(h, src, src_pte);
 				spin_lock_nested(src_ptl, SINGLE_DEPTH_NESTING);
 				entry = huge_ptep_get(src_pte);
 				if (!pte_same(src_pte_old, entry)) {
 					restore_reserve_on_error(h, dst_vma, addr,
-								new);
-					put_page(new);
+								&new_folio->page);
+					folio_put(new_folio);
 					/* huge_ptep of dst_pte won't change as in child */
 					goto again;
 				}
-				hugetlb_install_folio(dst_vma, dst_pte, addr, page_folio(new));
+				hugetlb_install_folio(dst_vma, dst_pte, addr, new_folio);
 				spin_unlock(src_ptl);
 				spin_unlock(dst_ptl);
 				continue;
@@ -5478,8 +5478,8 @@ static vm_fault_t hugetlb_wp(struct mm_struct *mm, struct vm_area_struct *vma,
 	const bool unshare = flags & FAULT_FLAG_UNSHARE;
 	pte_t pte;
 	struct hstate *h = hstate_vma(vma);
-	struct page *old_page, *new_page;
-	struct folio *new_folio = NULL;
+	struct page *old_page;
+	struct folio *new_folio;
 	int outside_reserve = 0;
 	vm_fault_t ret = 0;
 	unsigned long haddr = address & huge_page_mask(h);
@@ -5540,9 +5540,9 @@ static vm_fault_t hugetlb_wp(struct mm_struct *mm, struct vm_area_struct *vma,
 	 * be acquired again before returning to the caller, as expected.
 	 */
 	spin_unlock(ptl);
-	new_page = alloc_huge_page(vma, haddr, outside_reserve);
+	new_folio = alloc_hugetlb_folio(vma, haddr, outside_reserve);
 
-	if (IS_ERR(new_page)) {
+	if (IS_ERR(new_folio)) {
 		/*
 		 * If a process owning a MAP_PRIVATE mapping fails to COW,
 		 * it is due to references held by a child and an insufficient
@@ -5587,13 +5587,9 @@ static vm_fault_t hugetlb_wp(struct mm_struct *mm, struct vm_area_struct *vma,
 			return 0;
 		}
 
-		ret = vmf_error(PTR_ERR(new_page));
+		ret = vmf_error(PTR_ERR(new_folio));
 		goto out_release_old;
 	}
-
-	if (new_page)
-		new_folio = page_folio(new_page);
-
 	/*
 	 * When the original hugepage is shared one, it does not have
 	 * anon_vma prepared.
@@ -5627,7 +5623,7 @@ static vm_fault_t hugetlb_wp(struct mm_struct *mm, struct vm_area_struct *vma,
 				make_huge_pte(vma, &new_folio->page, !unshare));
 		folio_set_hugetlb_migratable(new_folio);
 		/* Make the old page be freed below */
-		new_page = old_page;
+		new_folio = page_folio(old_page);
 	}
 	spin_unlock(ptl);
 	mmu_notifier_invalidate_range_end(&range);
@@ -5636,7 +5632,7 @@ static vm_fault_t hugetlb_wp(struct mm_struct *mm, struct vm_area_struct *vma,
 	 * No restore in case of successful pagetable update (Break COW or
 	 * unshare)
 	 */
-	if (new_page != old_page)
+	if (new_folio != page_folio(old_page))
 		restore_reserve_on_error(h, vma, haddr, &new_folio->page);
 	folio_put(new_folio);
 out_release_old:
@@ -5759,8 +5755,7 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
 	vm_fault_t ret = VM_FAULT_SIGBUS;
 	int anon_rmap = 0;
 	unsigned long size;
-	struct page *page;
-	struct folio *folio = NULL;
+	struct folio *folio;
 	pte_t new_pte;
 	spinlock_t *ptl;
 	unsigned long haddr = address & huge_page_mask(h);
@@ -5784,8 +5779,8 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
 	 * before we get page_table_lock.
 	 */
 	new_page = false;
-	page = find_lock_page(mapping, idx);
-	if (!page) {
+	folio = filemap_lock_folio(mapping, idx);
+	if (!folio) {
 		size = i_size_read(mapping->host) >> huge_page_shift(h);
 		if (idx >= size)
 			goto out;
@@ -5818,8 +5813,8 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
 							VM_UFFD_MISSING);
 		}
 
-		page = alloc_huge_page(vma, haddr, 0);
-		if (IS_ERR(page)) {
+		folio = alloc_hugetlb_folio(vma, haddr, 0);
+		if (IS_ERR(folio)) {
 			/*
 			 * Returning error will result in faulting task being
 			 * sent SIGBUS.  The hugetlb fault mutex prevents two
@@ -5833,15 +5828,11 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
 			 * sure there really is no pte entry.
 			 */
 			if (hugetlb_pte_stable(h, mm, ptep, old_pte))
-				ret = vmf_error(PTR_ERR(page));
+				ret = vmf_error(PTR_ERR(folio));
 			else
 				ret = 0;
 			goto out;
 		}
-
-		if (page)
-			folio = page_folio(page);
-
 		clear_huge_page(&folio->page, address, pages_per_huge_page(h));
 		__folio_mark_uptodate(folio);
 		new_page = true;
@@ -5870,7 +5861,6 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
 			anon_rmap = 1;
 		}
 	} else {
-		folio = page_folio(page);
 		/*
 		 * If memory error occurs between mmap() and fault, some process
 		 * don't have hwpoisoned swap entry for errored virtual address.
@@ -6185,15 +6175,14 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
 	pte_t _dst_pte;
 	spinlock_t *ptl;
 	int ret = -ENOMEM;
-	struct page *page;
-	struct folio *folio = NULL;
+	struct folio *folio;
 	int writable;
 	bool page_in_pagecache = false;
 
 	if (is_continue) {
 		ret = -EFAULT;
-		page = find_lock_page(mapping, idx);
-		if (!page)
+		folio = filemap_lock_folio(mapping, idx);
+		if (!folio)
 			goto out;
 		page_in_pagecache = true;
 	} else if (!*pagep) {
@@ -6206,34 +6195,34 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
 			goto out;
 		}
 
-		page = alloc_huge_page(dst_vma, dst_addr, 0);
-		if (IS_ERR(page)) {
+		folio = alloc_hugetlb_folio(dst_vma, dst_addr, 0);
+		if (IS_ERR(folio)) {
 			ret = -ENOMEM;
 			goto out;
 		}
 
-		ret = copy_huge_page_from_user(page,
+		ret = copy_huge_page_from_user(&folio->page,
 						(const void __user *) src_addr,
 						pages_per_huge_page(h), false);
 
 		/* fallback to copy_from_user outside mmap_lock */
 		if (unlikely(ret)) {
 			ret = -ENOENT;
-			/* Free the allocated page which may have
+			/* Free the allocated folio which may have
 			 * consumed a reservation.
 			 */
-			restore_reserve_on_error(h, dst_vma, dst_addr, page);
-			put_page(page);
+			restore_reserve_on_error(h, dst_vma, dst_addr, &folio->page);
+			folio_put(folio);
 
-			/* Allocate a temporary page to hold the copied
+			/* Allocate a temporary folio to hold the copied
 			 * contents.
 			 */
-			page = alloc_huge_page_vma(h, dst_vma, dst_addr);
-			if (!page) {
+			folio = alloc_hugetlb_folio_vma(h, dst_vma, dst_addr);
+			if (!folio) {
 				ret = -ENOMEM;
 				goto out;
 			}
-			*pagep = page;
+			*pagep = &folio->page;
 			/* Set the outparam pagep and return to the caller to
 			 * copy the contents outside the lock. Don't free the
 			 * page.
@@ -6249,22 +6238,18 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
 			goto out;
 		}
 
-		page = alloc_huge_page(dst_vma, dst_addr, 0);
-		if (IS_ERR(page)) {
+		folio = alloc_hugetlb_folio(dst_vma, dst_addr, 0);
+		if (IS_ERR(folio)) {
 			put_page(*pagep);
 			ret = -ENOMEM;
 			*pagep = NULL;
 			goto out;
 		}
-		copy_user_huge_page(page, *pagep, dst_addr, dst_vma,
+		copy_user_huge_page(&folio->page, *pagep, dst_addr, dst_vma,
 				    pages_per_huge_page(h));
 		put_page(*pagep);
 		*pagep = NULL;
 	}
-
-	if (page)
-		folio = page_folio(page);
-
 	/*
 	 * The memory barrier inside __folio_mark_uptodate makes sure that
 	 * preceding stores to the page contents become visible before
@@ -6887,7 +6872,7 @@ bool hugetlb_reserve_pages(struct inode *inode,
 			/*
 			 * pages in this range were added to the reserve
 			 * map between region_chg and region_add.  This
-			 * indicates a race with alloc_huge_page.  Adjust
+			 * indicates a race with alloc_hugetlb_folio.  Adjust
 			 * the subpool and reserve counts modified above
 			 * based on the difference.
 			 */
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index fd99d303e34f..945b41c245a5 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -1218,9 +1218,11 @@ static struct page *new_page(struct page *page, unsigned long start)
 			break;
 	}
 
-	if (folio_test_hugetlb(src))
-		return alloc_huge_page_vma(page_hstate(&src->page),
+	if (folio_test_hugetlb(src)) {
+		dst = alloc_hugetlb_folio_vma(folio_hstate(src),
 				vma, address);
+		return &dst->page;
+	}
 
 	if (folio_test_large(src))
 		gfp = GFP_TRANSHUGE;
-- 
2.39.0



* [PATCH 6/9] mm/hugetlb: convert restore_reserve_on_error to take in a folio
  2023-01-19 21:14 [PATCH 0/9] convert hugetlb fault functions to folios Sidhartha Kumar
                   ` (4 preceding siblings ...)
  2023-01-19 21:14 ` [PATCH 5/9] mm/hugetlb: convert alloc_huge_page to alloc_hugetlb_folio Sidhartha Kumar
@ 2023-01-19 21:14 ` Sidhartha Kumar
  2023-01-19 21:14 ` [PATCH 7/9] mm/hugetlb: convert hugetlb_add_to_page_cache " Sidhartha Kumar
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 21+ messages in thread
From: Sidhartha Kumar @ 2023-01-19 21:14 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: akpm, songmuchun, mike.kravetz, willy, jhubbard, Sidhartha Kumar

Every caller of restore_reserve_on_error() now passes in &folio->page, so
change the function to take in a folio directly and clean up the call
sites.

Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
---
 fs/hugetlbfs/inode.c    |  2 +-
 include/linux/hugetlb.h |  2 +-
 mm/hugetlb.c            | 21 ++++++++++-----------
 3 files changed, 12 insertions(+), 13 deletions(-)

diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index e2f8951103be..880ebc58f330 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -871,7 +871,7 @@ static long hugetlbfs_fallocate(struct file *file, int mode, loff_t offset,
 		__folio_mark_uptodate(folio);
 		error = hugetlb_add_to_page_cache(&folio->page, mapping, index);
 		if (unlikely(error)) {
-			restore_reserve_on_error(h, &pseudo_vma, addr, &folio->page);
+			restore_reserve_on_error(h, &pseudo_vma, addr, folio);
 			folio_put(folio);
 			mutex_unlock(&hugetlb_fault_mutex_table[hash]);
 			goto out;
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 8f054d81e63b..e76be7e8df2c 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -725,7 +725,7 @@ struct folio *alloc_hugetlb_folio_vma(struct hstate *h, struct vm_area_struct *v
 int hugetlb_add_to_page_cache(struct page *page, struct address_space *mapping,
 			pgoff_t idx);
 void restore_reserve_on_error(struct hstate *h, struct vm_area_struct *vma,
-				unsigned long address, struct page *page);
+				unsigned long address, struct folio *folio);
 
 /* arch callback */
 int __init __alloc_bootmem_huge_page(struct hstate *h, int nid);
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 46467e80716f..9568d49c12d6 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2802,9 +2802,9 @@ static long vma_del_reservation(struct hstate *h,
  * and the hugetlb mutex should remain held when calling this routine.
  *
  * It handles two specific cases:
- * 1) A reservation was in place and the page consumed the reservation.
- *    HPageRestoreReserve is set in the page.
- * 2) No reservation was in place for the page, so HPageRestoreReserve is
+ * 1) A reservation was in place and the folio consumed the reservation.
+ *    hugetlb_restore_reserve is set in the folio.
+ * 2) No reservation was in place for the page, so hugetlb_restore_reserve is
  *    not set.  However, alloc_hugetlb_folio always updates the reserve map.
  *
  * In case 1, free_huge_page later in the error path will increment the
@@ -2817,9 +2817,8 @@ static long vma_del_reservation(struct hstate *h,
  * In case 2, simply undo reserve map modifications done by alloc_hugetlb_folio.
  */
 void restore_reserve_on_error(struct hstate *h, struct vm_area_struct *vma,
-			unsigned long address, struct page *page)
+			unsigned long address, struct folio *folio)
 {
-	struct folio *folio = page_folio(page);
 	long rc = vma_needs_reservation(h, vma, address);
 
 	if (folio_test_hugetlb_restore_reserve(folio)) {
@@ -5102,7 +5101,7 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
 				entry = huge_ptep_get(src_pte);
 				if (!pte_same(src_pte_old, entry)) {
 					restore_reserve_on_error(h, dst_vma, addr,
-								&new_folio->page);
+								new_folio);
 					folio_put(new_folio);
 					/* huge_ptep of dst_pte won't change as in child */
 					goto again;
@@ -5633,7 +5632,7 @@ static vm_fault_t hugetlb_wp(struct mm_struct *mm, struct vm_area_struct *vma,
 	 * unshare)
 	 */
 	if (new_folio != page_folio(old_page))
-		restore_reserve_on_error(h, vma, haddr, &new_folio->page);
+		restore_reserve_on_error(h, vma, haddr, new_folio);
 	folio_put(new_folio);
 out_release_old:
 	put_page(old_page);
@@ -5847,7 +5846,7 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
 				 * to the page cache. So it's safe to call
 				 * restore_reserve_on_error() here.
 				 */
-				restore_reserve_on_error(h, vma, haddr, &folio->page);
+				restore_reserve_on_error(h, vma, haddr, folio);
 				folio_put(folio);
 				goto out;
 			}
@@ -5948,7 +5947,7 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
 	spin_unlock(ptl);
 backout_unlocked:
 	if (new_page && !new_pagecache_page)
-		restore_reserve_on_error(h, vma, haddr, &folio->page);
+		restore_reserve_on_error(h, vma, haddr, folio);
 
 	folio_unlock(folio);
 	folio_put(folio);
@@ -6211,7 +6210,7 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
 			/* Free the allocated folio which may have
 			 * consumed a reservation.
 			 */
-			restore_reserve_on_error(h, dst_vma, dst_addr, &folio->page);
+			restore_reserve_on_error(h, dst_vma, dst_addr, folio);
 			folio_put(folio);
 
 			/* Allocate a temporary folio to hold the copied
@@ -6339,7 +6338,7 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
 		folio_unlock(folio);
 out_release_nounlock:
 	if (!page_in_pagecache)
-		restore_reserve_on_error(h, dst_vma, dst_addr, &folio->page);
+		restore_reserve_on_error(h, dst_vma, dst_addr, folio);
 	folio_put(folio);
 	goto out;
 }
-- 
2.39.0


* [PATCH 7/9] mm/hugetlb: convert hugetlb_add_to_page_cache to take in a folio
  2023-01-19 21:14 [PATCH 0/9] convert hugetlb fault functions to folios Sidhartha Kumar
                   ` (5 preceding siblings ...)
  2023-01-19 21:14 ` [PATCH 6/9] mm/hugetlb: convert restore_reserve_on_error to take in a folio Sidhartha Kumar
@ 2023-01-19 21:14 ` Sidhartha Kumar
  2023-01-19 21:14 ` [PATCH 8/9] mm/hugetlb: convert hugetlb_wp() " Sidhartha Kumar
  2023-01-19 21:14 ` [PATCH 9/9] Documentation/mm: update hugetlbfs documentation to mention alloc_hugetlb_folio Sidhartha Kumar
  8 siblings, 0 replies; 21+ messages in thread
From: Sidhartha Kumar @ 2023-01-19 21:14 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: akpm, songmuchun, mike.kravetz, willy, jhubbard, Sidhartha Kumar

Every caller of hugetlb_add_to_page_cache() now passes in
&folio->page, so change the function to take in a folio directly
and clean up the call sites.

Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
---
 fs/hugetlbfs/inode.c    | 2 +-
 include/linux/hugetlb.h | 2 +-
 mm/hugetlb.c            | 9 ++++-----
 3 files changed, 6 insertions(+), 7 deletions(-)

diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 880ebc58f330..76e146bdcba0 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -869,7 +869,7 @@ static long hugetlbfs_fallocate(struct file *file, int mode, loff_t offset,
 		}
 		clear_huge_page(&folio->page, addr, pages_per_huge_page(h));
 		__folio_mark_uptodate(folio);
-		error = hugetlb_add_to_page_cache(&folio->page, mapping, index);
+		error = hugetlb_add_to_page_cache(folio, mapping, index);
 		if (unlikely(error)) {
 			restore_reserve_on_error(h, &pseudo_vma, addr, folio);
 			folio_put(folio);
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index e76be7e8df2c..2dcae9bb4495 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -722,7 +722,7 @@ struct folio *alloc_hugetlb_folio_nodemask(struct hstate *h, int preferred_nid,
 				nodemask_t *nmask, gfp_t gfp_mask);
 struct folio *alloc_hugetlb_folio_vma(struct hstate *h, struct vm_area_struct *vma,
 				unsigned long address);
-int hugetlb_add_to_page_cache(struct page *page, struct address_space *mapping,
+int hugetlb_add_to_page_cache(struct folio *folio, struct address_space *mapping,
 			pgoff_t idx);
 void restore_reserve_on_error(struct hstate *h, struct vm_area_struct *vma,
 				unsigned long address, struct folio *folio);
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 9568d49c12d6..4ab3eda6db18 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5663,10 +5663,9 @@ static bool hugetlbfs_pagecache_present(struct hstate *h,
 	return folio != NULL;
 }
 
-int hugetlb_add_to_page_cache(struct page *page, struct address_space *mapping,
+int hugetlb_add_to_page_cache(struct folio *folio, struct address_space *mapping,
 			   pgoff_t idx)
 {
-	struct folio *folio = page_folio(page);
 	struct inode *inode = mapping->host;
 	struct hstate *h = hstate_inode(inode);
 	int err;
@@ -5678,7 +5677,7 @@ int hugetlb_add_to_page_cache(struct page *page, struct address_space *mapping,
 		__folio_clear_locked(folio);
 		return err;
 	}
-	ClearHPageRestoreReserve(page);
+	folio_clear_hugetlb_restore_reserve(folio);
 
 	/*
 	 * mark folio dirty so that it will not be removed from cache/file
@@ -5837,7 +5836,7 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
 		new_page = true;
 
 		if (vma->vm_flags & VM_MAYSHARE) {
-			int err = hugetlb_add_to_page_cache(&folio->page, mapping, idx);
+			int err = hugetlb_add_to_page_cache(folio, mapping, idx);
 			if (err) {
 				/*
 				 * err can't be -EEXIST which implies someone
@@ -6269,7 +6268,7 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
 		 * hugetlb_fault_mutex_table that here must be hold by
 		 * the caller.
 		 */
-		ret = hugetlb_add_to_page_cache(&folio->page, mapping, idx);
+		ret = hugetlb_add_to_page_cache(folio, mapping, idx);
 		if (ret)
 			goto out_release_nounlock;
 		page_in_pagecache = true;
-- 
2.39.0


* [PATCH 8/9] mm/hugetlb: convert hugetlb_wp() to take in a folio
  2023-01-19 21:14 [PATCH 0/9] convert hugetlb fault functions to folios Sidhartha Kumar
                   ` (6 preceding siblings ...)
  2023-01-19 21:14 ` [PATCH 7/9] mm/hugetlb: convert hugetlb_add_to_page_cache " Sidhartha Kumar
@ 2023-01-19 21:14 ` Sidhartha Kumar
  2023-01-24 15:23   ` Gerald Schaefer
  2023-01-19 21:14 ` [PATCH 9/9] Documentation/mm: update hugetlbfs documentation to mention alloc_hugetlb_folio Sidhartha Kumar
  8 siblings, 1 reply; 21+ messages in thread
From: Sidhartha Kumar @ 2023-01-19 21:14 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: akpm, songmuchun, mike.kravetz, willy, jhubbard, Sidhartha Kumar

Change the pagecache_page argument of hugetlb_wp() to pagecache_folio,
and replace the call to find_lock_page() with filemap_lock_folio().

Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
---
 mm/hugetlb.c | 34 +++++++++++++++++-----------------
 1 file changed, 17 insertions(+), 17 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 4ab3eda6db18..20127271b64c 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5472,7 +5472,7 @@ static void unmap_ref_private(struct mm_struct *mm, struct vm_area_struct *vma,
  */
 static vm_fault_t hugetlb_wp(struct mm_struct *mm, struct vm_area_struct *vma,
 		       unsigned long address, pte_t *ptep, unsigned int flags,
-		       struct page *pagecache_page, spinlock_t *ptl)
+		       struct folio *pagecache_folio, spinlock_t *ptl)
 {
 	const bool unshare = flags & FAULT_FLAG_UNSHARE;
 	pte_t pte;
@@ -5529,7 +5529,7 @@ static vm_fault_t hugetlb_wp(struct mm_struct *mm, struct vm_area_struct *vma,
 	 * of the full address range.
 	 */
 	if (is_vma_resv_set(vma, HPAGE_RESV_OWNER) &&
-			old_page != pagecache_page)
+			page_folio(old_page) != pagecache_folio)
 		outside_reserve = 1;
 
 	get_page(old_page);
@@ -5923,7 +5923,7 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
 	hugetlb_count_add(pages_per_huge_page(h), mm);
 	if ((flags & FAULT_FLAG_WRITE) && !(vma->vm_flags & VM_SHARED)) {
 		/* Optimization, do the COW without a second fault */
-		ret = hugetlb_wp(mm, vma, address, ptep, flags, &folio->page, ptl);
+		ret = hugetlb_wp(mm, vma, address, ptep, flags, folio, ptl);
 	}
 
 	spin_unlock(ptl);
@@ -5986,7 +5986,7 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 	u32 hash;
 	pgoff_t idx;
 	struct page *page = NULL;
-	struct page *pagecache_page = NULL;
+	struct folio *pagecache_folio = NULL;
 	struct hstate *h = hstate_vma(vma);
 	struct address_space *mapping;
 	int need_wait_lock = 0;
@@ -6068,7 +6068,7 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 		/* Just decrements count, does not deallocate */
 		vma_end_reservation(h, vma, haddr);
 
-		pagecache_page = find_lock_page(mapping, idx);
+		pagecache_folio = filemap_lock_folio(mapping, idx);
 	}
 
 	ptl = huge_pte_lock(h, mm, ptep);
@@ -6088,9 +6088,9 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 		};
 
 		spin_unlock(ptl);
-		if (pagecache_page) {
-			unlock_page(pagecache_page);
-			put_page(pagecache_page);
+		if (pagecache_folio) {
+			folio_unlock(pagecache_folio);
+			folio_put(pagecache_folio);
 		}
 		hugetlb_vma_unlock_read(vma);
 		mutex_unlock(&hugetlb_fault_mutex_table[hash]);
@@ -6099,22 +6099,22 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 
 	/*
 	 * hugetlb_wp() requires page locks of pte_page(entry) and
-	 * pagecache_page, so here we need take the former one
-	 * when page != pagecache_page or !pagecache_page.
+	 * pagecache_folio, so here we need take the former one
+	 * when page != pagecache_folio or !pagecache_folio.
 	 */
 	page = pte_page(entry);
-	if (page != pagecache_page)
+	if (page_folio(page) != pagecache_folio)
 		if (!trylock_page(page)) {
 			need_wait_lock = 1;
 			goto out_ptl;
 		}
 
-	get_page(page);
+	folio_get(pagecache_folio);
 
 	if (flags & (FAULT_FLAG_WRITE|FAULT_FLAG_UNSHARE)) {
 		if (!huge_pte_write(entry)) {
 			ret = hugetlb_wp(mm, vma, address, ptep, flags,
-					 pagecache_page, ptl);
+					 pagecache_folio, ptl);
 			goto out_put_page;
 		} else if (likely(flags & FAULT_FLAG_WRITE)) {
 			entry = huge_pte_mkdirty(entry);
@@ -6125,15 +6125,15 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 						flags & FAULT_FLAG_WRITE))
 		update_mmu_cache(vma, haddr, ptep);
 out_put_page:
-	if (page != pagecache_page)
+	if (page_folio(page) != pagecache_folio)
 		unlock_page(page);
 	put_page(page);
 out_ptl:
 	spin_unlock(ptl);
 
-	if (pagecache_page) {
-		unlock_page(pagecache_page);
-		put_page(pagecache_page);
+	if (pagecache_folio) {
+		folio_unlock(pagecache_folio);
+		folio_put(pagecache_folio);
 	}
 out_mutex:
 	hugetlb_vma_unlock_read(vma);
-- 
2.39.0


* [PATCH 9/9] Documentation/mm: update hugetlbfs documentation to mention alloc_hugetlb_folio
  2023-01-19 21:14 [PATCH 0/9] convert hugetlb fault functions to folios Sidhartha Kumar
                   ` (7 preceding siblings ...)
  2023-01-19 21:14 ` [PATCH 8/9] mm/hugetlb: convert hugetlb_wp() " Sidhartha Kumar
@ 2023-01-19 21:14 ` Sidhartha Kumar
  8 siblings, 0 replies; 21+ messages in thread
From: Sidhartha Kumar @ 2023-01-19 21:14 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: akpm, songmuchun, mike.kravetz, willy, jhubbard, Sidhartha Kumar

alloc_huge_page() has been renamed to alloc_hugetlb_folio(), so update
the documentation that mentions alloc_huge_page(). Subpool information
has also moved from page->private to a dedicated field in struct folio;
reflect this change in the documentation as well.
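
For reference, the dedicated field is reached through small inline
helpers (an editor's sketch modeled on include/linux/hugetlb.h around
the time of this series; treat the exact names as approximate):

	static inline struct hugepage_subpool *hugetlb_folio_subpool(struct folio *folio)
	{
		return folio->_hugetlb_subpool;
	}

	static inline void hugetlb_set_folio_subpool(struct folio *folio,
					struct hugepage_subpool *subpool)
	{
		folio->_hugetlb_subpool = subpool;
	}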

Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
---
 Documentation/mm/hugetlbfs_reserv.rst         | 21 ++++++++++---------
 .../zh_CN/mm/hugetlbfs_reserv.rst             | 14 ++++++-------
 2 files changed, 18 insertions(+), 17 deletions(-)

diff --git a/Documentation/mm/hugetlbfs_reserv.rst b/Documentation/mm/hugetlbfs_reserv.rst
index f143954e0d05..611728c49bff 100644
--- a/Documentation/mm/hugetlbfs_reserv.rst
+++ b/Documentation/mm/hugetlbfs_reserv.rst
@@ -181,14 +181,14 @@ Consuming Reservations/Allocating a Huge Page
 
 Reservations are consumed when huge pages associated with the reservations
 are allocated and instantiated in the corresponding mapping.  The allocation
-is performed within the routine alloc_huge_page()::
+is performed within the routine alloc_hugetlb_folio()::
 
-	struct page *alloc_huge_page(struct vm_area_struct *vma,
+	struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
 				     unsigned long addr, int avoid_reserve)
 
-alloc_huge_page is passed a VMA pointer and a virtual address, so it can
+alloc_hugetlb_folio is passed a VMA pointer and a virtual address, so it can
 consult the reservation map to determine if a reservation exists.  In addition,
-alloc_huge_page takes the argument avoid_reserve which indicates reserves
+alloc_hugetlb_folio takes the argument avoid_reserve which indicates reserves
 should not be used even if it appears they have been set aside for the
 specified address.  The avoid_reserve argument is most often used in the case
 of Copy on Write and Page Migration where additional copies of an existing
@@ -208,7 +208,8 @@ a reservation for the allocation.  After determining whether a reservation
 exists and can be used for the allocation, the routine dequeue_huge_page_vma()
 is called.  This routine takes two arguments related to reservations:
 
-- avoid_reserve, this is the same value/argument passed to alloc_huge_page()
+- avoid_reserve, this is the same value/argument passed to
+  alloc_hugetlb_folio().
 - chg, even though this argument is of type long only the values 0 or 1 are
   passed to dequeue_huge_page_vma.  If the value is 0, it indicates a
   reservation exists (see the section "Memory Policy and Reservations" for
@@ -233,9 +234,9 @@ the scope reservations.  Even if a surplus page is allocated, the same
 reservation based adjustments as above will be made: SetPagePrivate(page) and
 resv_huge_pages--.
 
-After obtaining a new huge page, (page)->private is set to the value of
-the subpool associated with the page if it exists.  This will be used for
-subpool accounting when the page is freed.
+After obtaining a new hugetlb folio, (folio)->_hugetlb_subpool is set to the
+value of the subpool associated with the page if it exists.  This will be used
+for subpool accounting when the folio is freed.
 
 The routine vma_commit_reservation() is then called to adjust the reserve
 map based on the consumption of the reservation.  In general, this involves
@@ -246,8 +247,8 @@ was no reservation in a shared mapping or this was a private mapping a new
 entry must be created.
 
 It is possible that the reserve map could have been changed between the call
-to vma_needs_reservation() at the beginning of alloc_huge_page() and the
-call to vma_commit_reservation() after the page was allocated.  This would
+to vma_needs_reservation() at the beginning of alloc_hugetlb_folio() and the
+call to vma_commit_reservation() after the folio was allocated.  This would
 be possible if hugetlb_reserve_pages was called for the same page in a shared
 mapping.  In such cases, the reservation count and subpool free page count
 will be off by one.  This rare condition can be identified by comparing the
diff --git a/Documentation/translations/zh_CN/mm/hugetlbfs_reserv.rst b/Documentation/translations/zh_CN/mm/hugetlbfs_reserv.rst
index 752e5696cd47..826a50c47389 100644
--- a/Documentation/translations/zh_CN/mm/hugetlbfs_reserv.rst
+++ b/Documentation/translations/zh_CN/mm/hugetlbfs_reserv.rst
@@ -142,14 +142,14 @@ HPAGE_RESV_OWNER标志被设置,以表明该VMA拥有预留。
 消耗预留/分配一个巨页
 ===========================
 
-当与预留相关的巨页在相应的映射中被分配和实例化时,预留就被消耗了。该分配是在函数alloc_huge_page()
+当与预留相关的巨页在相应的映射中被分配和实例化时,预留就被消耗了。该分配是在函数alloc_hugetlb_folio()
 中进行的::
 
-	struct page *alloc_huge_page(struct vm_area_struct *vma,
+	struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
 				     unsigned long addr, int avoid_reserve)
 
-alloc_huge_page被传递给一个VMA指针和一个虚拟地址,因此它可以查阅预留映射以确定是否存在预留。
-此外,alloc_huge_page需要一个参数avoid_reserve,该参数表示即使看起来已经为指定的地址预留了
+alloc_hugetlb_folio被传递给一个VMA指针和一个虚拟地址,因此它可以查阅预留映射以确定是否存在预留。
+此外,alloc_hugetlb_folio需要一个参数avoid_reserve,该参数表示即使看起来已经为指定的地址预留了
 预留,也不应该使用预留。avoid_reserve参数最常被用于写时拷贝和页面迁移的情况下,即现有页面的额
 外拷贝被分配。
 
@@ -162,7 +162,7 @@ vma_needs_reservation()返回的值通常为0或1。如果该地址存在预留
 确定预留是否存在并可用于分配后,调用dequeue_huge_page_vma()函数。这个函数需要两个与预留有关
 的参数:
 
-- avoid_reserve,这是传递给alloc_huge_page()的同一个值/参数。
+- avoid_reserve,这是传递给alloc_hugetlb_folio()的同一个值/参数。
 - chg,尽管这个参数的类型是long,但只有0或1的值被传递给dequeue_huge_page_vma。如果该值为0,
   则表明存在预留(关于可能的问题,请参见 “预留和内存策略” 一节)。如果值
   为1,则表示不存在预留,如果可能的话,必须从全局空闲池中取出该页。
@@ -179,7 +179,7 @@ free_huge_pages的值被递减。如果有一个与该页相关的预留,将
 的剩余巨页和超额分配的问题。即使分配了一个多余的页面,也会进行与上面一样的基于预留的调整:
 SetPagePrivate(page) 和 resv_huge_pages--.
 
-在获得一个新的巨页后,(page)->private被设置为与该页面相关的子池的值,如果它存在的话。当页
+在获得一个新的巨页后,(folio)->_hugetlb_subpool被设置为与该页面相关的子池的值,如果它存在的话。当页
 面被释放时,这将被用于子池的计数。
 
 然后调用函数vma_commit_reservation(),根据预留的消耗情况调整预留映射。一般来说,这涉及
@@ -199,7 +199,7 @@ SetPagePrivate(page)和resv_huge_pages-。
 已经存在,所以不做任何改变。然而,如果共享映射中没有预留,或者这是一个私有映射,则必须创建
 一个新的条目。
 
-在alloc_huge_page()开始调用vma_needs_reservation()和页面分配后调用
+在alloc_hugetlb_folio()开始调用vma_needs_reservation()和页面分配后调用
 vma_commit_reservation()之间,预留映射有可能被改变。如果hugetlb_reserve_pages在共
 享映射中为同一页面被调用,这将是可能的。在这种情况下,预留计数和子池空闲页计数会有一个偏差。
 这种罕见的情况可以通过比较vma_needs_reservation和vma_commit_reservation的返回值来
-- 
2.39.0


* Re: [PATCH 2/9] mm/hugetlb: convert hugetlbfs_pagecache_present() to folios
  2023-01-19 21:14 ` [PATCH 2/9] mm/hugetlb: convert hugetlbfs_pagecache_present() " Sidhartha Kumar
@ 2023-01-20  5:43   ` Matthew Wilcox
  2023-01-20 23:40     ` Mike Kravetz
  0 siblings, 1 reply; 21+ messages in thread
From: Matthew Wilcox @ 2023-01-20  5:43 UTC (permalink / raw)
  To: Sidhartha Kumar
  Cc: linux-kernel, linux-mm, akpm, songmuchun, mike.kravetz, jhubbard

On Thu, Jan 19, 2023 at 01:14:39PM -0800, Sidhartha Kumar wrote:
> +++ b/mm/hugetlb.c
> @@ -5653,15 +5653,15 @@ static bool hugetlbfs_pagecache_present(struct hstate *h,
>  {
>  	struct address_space *mapping;
>  	pgoff_t idx;
> -	struct page *page;
> +	struct folio *folio;
>  
>  	mapping = vma->vm_file->f_mapping;
>  	idx = vma_hugecache_offset(h, vma, address);
>  
> -	page = find_get_page(mapping, idx);
> -	if (page)
> -		put_page(page);
> -	return page != NULL;
> +	folio = filemap_get_folio(mapping, idx);
> +	if (folio)
> +		folio_put(folio);
> +	return folio != NULL;
>  }

Seems to me this function could be ...

	struct address_space *mapping = vma->vm_file->f_mapping;
	pgoff_t index = vma_hugecache_offset(h, vma, address);
	bool present;

	rcu_read_lock();
	present = page_cache_next_miss(mapping, index, 1) != index;
	rcu_read_unlock();

	return present;

No need to get/drop a refcount on the folio.  It's a bit similar to
filemap_range_has_page(), but the API is wrong.  Maybe there's room
for a little refactoring here.
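
Putting that together, the whole function could read (editor's sketch
of the rewrite suggested above, not code from this series):

	static bool hugetlbfs_pagecache_present(struct hstate *h,
			struct vm_area_struct *vma, unsigned long address)
	{
		struct address_space *mapping = vma->vm_file->f_mapping;
		pgoff_t index = vma_hugecache_offset(h, vma, address);
		bool present;

		/* page_cache_next_miss() returns @index itself when
		 * nothing is cached at @index */
		rcu_read_lock();
		present = page_cache_next_miss(mapping, index, 1) != index;
		rcu_read_unlock();

		return present;
	}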

* Re: [PATCH 3/9] mm/hugetlb: convert putback_active_hugepage to take in a folio
  2023-01-19 21:14 ` [PATCH 3/9] mm/hugetlb: convert putback_active_hugepage to take in a folio Sidhartha Kumar
@ 2023-01-20  5:48   ` Matthew Wilcox
  2023-01-20 23:47   ` Mike Kravetz
  1 sibling, 0 replies; 21+ messages in thread
From: Matthew Wilcox @ 2023-01-20  5:48 UTC (permalink / raw)
  To: Sidhartha Kumar
  Cc: linux-kernel, linux-mm, akpm, songmuchun, mike.kravetz, jhubbard

On Thu, Jan 19, 2023 at 01:14:40PM -0800, Sidhartha Kumar wrote:
> -void putback_active_hugepage(struct page *page)
> +void putback_active_hugetlb_folio(struct folio *folio)

Maybe call this folio_putback_active_hugetlb()?  It fits better
with folio_putback_lru().

> +++ b/mm/migrate.c
> @@ -151,7 +151,7 @@ void putback_movable_pages(struct list_head *l)
>  
>  	list_for_each_entry_safe(page, page2, l, lru) {
>  		if (unlikely(PageHuge(page))) {
> -			putback_active_hugepage(page);
> +			putback_active_hugetlb_folio(page_folio(page));
>  			continue;
>  		}
>  		list_del(&page->lru);

Maybe we need a patch first to convert this loop from an iteration of
page->lru to folio->lru to avoid that call to page_folio()?
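
A sketch of what that prerequisite conversion could look like
(editor's illustration, using the folio_putback_active_hugetlb()
name suggested above):

	struct folio *folio, *folio2;

	list_for_each_entry_safe(folio, folio2, l, lru) {
		if (unlikely(folio_test_hugetlb(folio))) {
			folio_putback_active_hugetlb(folio);
			continue;
		}
		list_del(&folio->lru);
		/* ... rest of the loop body then operates on folio ... */
	}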


* Re: [PATCH 4/9] mm/rmap: change hugepage_add_new_anon_rmap to take in a folio
  2023-01-19 21:14 ` [PATCH 4/9] mm/rmap: change hugepage_add_new_anon_rmap " Sidhartha Kumar
@ 2023-01-20  6:00   ` Matthew Wilcox
  2023-01-20 20:45     ` Sidhartha Kumar
  0 siblings, 1 reply; 21+ messages in thread
From: Matthew Wilcox @ 2023-01-20  6:00 UTC (permalink / raw)
  To: Sidhartha Kumar
  Cc: linux-kernel, linux-mm, akpm, songmuchun, mike.kravetz, jhubbard

On Thu, Jan 19, 2023 at 01:14:41PM -0800, Sidhartha Kumar wrote:
> @@ -5599,9 +5603,9 @@ static vm_fault_t hugetlb_wp(struct mm_struct *mm, struct vm_area_struct *vma,
>  		goto out_release_all;
>  	}
>  
> -	copy_user_huge_page(new_page, old_page, address, vma,
> +	copy_user_huge_page(&new_folio->page, old_page, address, vma,
>  			    pages_per_huge_page(h));

We have a folio_copy(), but it feels to me like we need a
folio_copy_user() so that we can use copy_user_page() on machines
with virtual caches.
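
For illustration, such a helper could mirror folio_copy() but route
each subpage through copy_user_highpage() (an editor's sketch; the
name folio_copy_user() and its exact shape are assumptions, not code
from this series):

	static void folio_copy_user(struct folio *dst, struct folio *src,
			unsigned long addr, struct vm_area_struct *vma)
	{
		long i = 0;
		long nr = folio_nr_pages(src);

		for (;;) {
			/* copy_user_highpage() lets architectures with
			 * virtually indexed caches handle aliasing */
			copy_user_highpage(folio_page(dst, i),
					folio_page(src, i),
					addr + i * PAGE_SIZE, vma);
			if (++i == nr)
				break;
			cond_resched();
		}
	}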

> @@ -6176,6 +6186,7 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
>  	spinlock_t *ptl;
>  	int ret = -ENOMEM;
>  	struct page *page;
> +	struct folio *folio = NULL;
>  	int writable;
>  	bool page_in_pagecache = false;
>  
> @@ -6251,12 +6262,15 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
>  		*pagep = NULL;
>  	}
>  
> +	if (page)
> +		folio = page_folio(page);
> +
>  	/*
> -	 * The memory barrier inside __SetPageUptodate makes sure that
> +	 * The memory barrier inside __folio_mark_uptodate makes sure that
>  	 * preceding stores to the page contents become visible before
>  	 * the set_pte_at() write.
>  	 */
> -	__SetPageUptodate(page);
> +	__folio_mark_uptodate(folio);

I suggest that "page" can never be NULL or __SetPageUptodate() would
have crashed.


* Re: [PATCH 4/9] mm/rmap: change hugepage_add_new_anon_rmap to take in a folio
  2023-01-20  6:00   ` Matthew Wilcox
@ 2023-01-20 20:45     ` Sidhartha Kumar
  2023-01-20 21:17       ` Matthew Wilcox
  0 siblings, 1 reply; 21+ messages in thread
From: Sidhartha Kumar @ 2023-01-20 20:45 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: linux-kernel, linux-mm, akpm, songmuchun, mike.kravetz, jhubbard

On 1/19/23 10:00 PM, Matthew Wilcox wrote:
> On Thu, Jan 19, 2023 at 01:14:41PM -0800, Sidhartha Kumar wrote:
>> @@ -5599,9 +5603,9 @@ static vm_fault_t hugetlb_wp(struct mm_struct *mm, struct vm_area_struct *vma,
>>   		goto out_release_all;
>>   	}
>>   
>> -	copy_user_huge_page(new_page, old_page, address, vma,
>> +	copy_user_huge_page(&new_folio->page, old_page, address, vma,
>>   			    pages_per_huge_page(h));
> 
> We have a folio_copy(), but it feels to me like we need a
> folio_copy_user() so that we can use copy_user_page() on machines
> with virtual caches.
> 
>> @@ -6176,6 +6186,7 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
>>   	spinlock_t *ptl;
>>   	int ret = -ENOMEM;
>>   	struct page *page;
>> +	struct folio *folio = NULL;
>>   	int writable;
>>   	bool page_in_pagecache = false;
>>   
>> @@ -6251,12 +6262,15 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
>>   		*pagep = NULL;
>>   	}
>>   
>> +	if (page)
>> +		folio = page_folio(page);
>> +
>>   	/*
>> -	 * The memory barrier inside __SetPageUptodate makes sure that
>> +	 * The memory barrier inside __folio_mark_uptodate makes sure that
>>   	 * preceding stores to the page contents become visible before
>>   	 * the set_pte_at() write.
>>   	 */
>> -	__SetPageUptodate(page);
>> +	__folio_mark_uptodate(folio);
> 

Hi Matthew,

In the snippet:

page = alloc_huge_page(dst_vma, dst_addr, 0);
if (IS_ERR(page)) {
	put_page(*pagep);
	ret = -ENOMEM;
	*pagep = NULL;
	goto out;
}
copy_user_huge_page(page, *pagep, dst_addr, dst_vma,
		pages_per_huge_page(h));

I thought the IS_ERR() call does not handle the NULL case and only
checks for error-encoded high memory addresses, and the
copy_user_huge_page() path does not seem to handle the NULL case
either, but alloc_huge_page() can possibly return NULL, so I was
unsure how to handle the folio conversion.

> I suggest that "page" can never be NULL or __SetPageUptodate() would
> have crashed.
> 


* Re: [PATCH 4/9] mm/rmap: change hugepage_add_new_anon_rmap to take in a folio
  2023-01-20 20:45     ` Sidhartha Kumar
@ 2023-01-20 21:17       ` Matthew Wilcox
  2023-01-20 22:05         ` Sidhartha Kumar
  0 siblings, 1 reply; 21+ messages in thread
From: Matthew Wilcox @ 2023-01-20 21:17 UTC (permalink / raw)
  To: Sidhartha Kumar
  Cc: linux-kernel, linux-mm, akpm, songmuchun, mike.kravetz, jhubbard

On Fri, Jan 20, 2023 at 12:45:38PM -0800, Sidhartha Kumar wrote:
> > > @@ -6176,6 +6186,7 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
> > >   	spinlock_t *ptl;
> > >   	int ret = -ENOMEM;
> > >   	struct page *page;
> > > +	struct folio *folio = NULL;
> > >   	int writable;
> > >   	bool page_in_pagecache = false;
> > > @@ -6251,12 +6262,15 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
> > >   		*pagep = NULL;
> > >   	}
> > > +	if (page)
> > > +		folio = page_folio(page);
> > > +
> > >   	/*
> > > -	 * The memory barrier inside __SetPageUptodate makes sure that
> > > +	 * The memory barrier inside __folio_mark_uptodate makes sure that
> > >   	 * preceding stores to the page contents become visible before
> > >   	 * the set_pte_at() write.
> > >   	 */
> > > -	__SetPageUptodate(page);
> > > +	__folio_mark_uptodate(folio);
> > 
> 
> Hi Matthew,
> 
> In the snippet:
> 
> page = alloc_huge_page(dst_vma, dst_addr, 0);
> if (IS_ERR(page)) {
> 	put_page(*pagep);
> 	ret = -ENOMEM;
> 	*pagep = NULL;
> 	goto out;
> }
> copy_user_huge_page(page, *pagep, dst_addr, dst_vma,
> 		pages_per_huge_page(h));
> 
> I thought the IS_ERR() call does not handle the NULL case and is a check for
> high memory addresses, and copy_user_huge_page() path does not seem to
> handle the NULL case as well but alloc_huge_page() can possibly return NULL
> so I was unsure about how to handle the folio conversion.

I'm not sure how alloc_huge_page() can return NULL.  It seems like it
returns ERR_PTR(-ENOSPC) or ERR_PTR(-ENOMEM) if it cannot allocate memory?
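
To spell out the convention in question (editor's illustration, using
only helpers from <linux/err.h>): IS_ERR() is true only for pointers
in the errno-encoding range at the top of the address space, so NULL
is deliberately not covered:

	#include <linux/err.h>

	static bool err_ptr_demo(void)
	{
		struct page *page = ERR_PTR(-ENOSPC);

		return IS_ERR(page) &&		/* encoded errno */
		       PTR_ERR(page) == -ENOSPC &&
		       !IS_ERR(NULL) &&		/* NULL is not an error */
		       IS_ERR_OR_NULL(NULL);	/* catches both cases */
	}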

* Re: [PATCH 4/9] mm/rmap: change hugepage_add_new_anon_rmap to take in a folio
  2023-01-20 21:17       ` Matthew Wilcox
@ 2023-01-20 22:05         ` Sidhartha Kumar
  0 siblings, 0 replies; 21+ messages in thread
From: Sidhartha Kumar @ 2023-01-20 22:05 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: linux-kernel, linux-mm, akpm, songmuchun, mike.kravetz, jhubbard

On 1/20/23 1:17 PM, Matthew Wilcox wrote:
> On Fri, Jan 20, 2023 at 12:45:38PM -0800, Sidhartha Kumar wrote:
>>>> @@ -6176,6 +6186,7 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
>>>>    	spinlock_t *ptl;
>>>>    	int ret = -ENOMEM;
>>>>    	struct page *page;
>>>> +	struct folio *folio = NULL;
>>>>    	int writable;
>>>>    	bool page_in_pagecache = false;
>>>> @@ -6251,12 +6262,15 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
>>>>    		*pagep = NULL;
>>>>    	}
>>>> +	if (page)
>>>> +		folio = page_folio(page);
>>>> +
>>>>    	/*
>>>> -	 * The memory barrier inside __SetPageUptodate makes sure that
>>>> +	 * The memory barrier inside __folio_mark_uptodate makes sure that
>>>>    	 * preceding stores to the page contents become visible before
>>>>    	 * the set_pte_at() write.
>>>>    	 */
>>>> -	__SetPageUptodate(page);
>>>> +	__folio_mark_uptodate(folio);
>>>
>>
>> Hi Matthew,
>>
>> In the snippet:
>>
>> page = alloc_huge_page(dst_vma, dst_addr, 0);
>> if (IS_ERR(page)) {
>> 	put_page(*pagep);
>> 	ret = -ENOMEM;
>> 	*pagep = NULL;
>> 	goto out;
>> }
>> copy_user_huge_page(page, *pagep, dst_addr, dst_vma,
>> 		pages_per_huge_page(h));
>>
>> I thought the IS_ERR() call does not handle the NULL case and is a check for
>> high memory addresses, and copy_user_huge_page() path does not seem to
>> handle the NULL case as well but alloc_huge_page() can possibly return NULL
>> so I was unsure about how to handle the folio conversion.
> 
> I'm not sure how alloc_huge_page() can return NULL.  It seems like it
> returns ERR_PTR(-ENOSPC) or ERR_PTR(-ENOMEM) if it cannot allocate memory?
> 
I see now; I agree that page cannot be NULL on return from
alloc_huge_page(), so I will make that change in v2.

Thanks,
Sidhartha Kumar


* Re: [PATCH 1/9] mm/hugetlb: convert hugetlb_install_page to folios
  2023-01-19 21:14 ` [PATCH 1/9] mm/hugetlb: convert hugetlb_install_page " Sidhartha Kumar
@ 2023-01-20 23:21   ` Mike Kravetz
  0 siblings, 0 replies; 21+ messages in thread
From: Mike Kravetz @ 2023-01-20 23:21 UTC (permalink / raw)
  To: Sidhartha Kumar; +Cc: linux-kernel, linux-mm, akpm, songmuchun, willy, jhubbard

On 01/19/23 13:14, Sidhartha Kumar wrote:
> Change hugetlb_install_page() to hugetlb_install_folio(). This reduces one
> user of the Huge Page flag macros which take in a page.
> 
> Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
> ---
>  mm/hugetlb.c | 14 +++++++-------
>  1 file changed, 7 insertions(+), 7 deletions(-)

Looks good, thanks!

Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
-- 
Mike Kravetz

* Re: [PATCH 2/9] mm/hugetlb: convert hugetlbfs_pagecache_present() to folios
  2023-01-20  5:43   ` Matthew Wilcox
@ 2023-01-20 23:40     ` Mike Kravetz
  0 siblings, 0 replies; 21+ messages in thread
From: Mike Kravetz @ 2023-01-20 23:40 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Sidhartha Kumar, linux-kernel, linux-mm, akpm, songmuchun, jhubbard

On 01/20/23 05:43, Matthew Wilcox wrote:
> On Thu, Jan 19, 2023 at 01:14:39PM -0800, Sidhartha Kumar wrote:
> > +++ b/mm/hugetlb.c
> > @@ -5653,15 +5653,15 @@ static bool hugetlbfs_pagecache_present(struct hstate *h,
> >  {
> >  	struct address_space *mapping;
> >  	pgoff_t idx;
> > -	struct page *page;
> > +	struct folio *folio;
> >  
> >  	mapping = vma->vm_file->f_mapping;
> >  	idx = vma_hugecache_offset(h, vma, address);
> >  
> > -	page = find_get_page(mapping, idx);
> > -	if (page)
> > -		put_page(page);
> > -	return page != NULL;
> > +	folio = filemap_get_folio(mapping, idx);
> > +	if (folio)
> > +		folio_put(folio);
> > +	return folio != NULL;
> >  }
> 
> Seems to me this function could be ...
> 
> 	struct address_space *mapping = vma->vm_file->f_mapping;
> 	pgoff_t index = vma_hugecache_offset(h, vma, address);
> 	bool present;
> 
> 	rcu_read_lock();
> 	present = page_cache_next_miss(mapping, index, 1) != index;
> 	rcu_read_unlock();
> 
> 	return present;
> 
> No need to get/drop a refcount on the folio.  It's a bit similar to
> filemap_range_has_page(), but the API is wrong.  Maybe there's room
> for a little refactoring here.

Thanks Matthew, I did not know those APIs were available.  Perhaps just
use page_cache_next_miss as suggested above for now.

FYI - There is the same pattern in hugetlbfs_fallocate()

		/* See if already present in mapping to avoid alloc/free */
		folio = filemap_get_folio(mapping, index);
		if (folio) {
			folio_put(folio);
			mutex_unlock(&hugetlb_fault_mutex_table[hash]);
			hugetlb_drop_vma_policy(&pseudo_vma);
			continue;
		}
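
For completeness, the same page_cache_next_miss() trick applied to
the hugetlbfs_fallocate() check quoted above might look like this
(editor's sketch, untested):

		/* See if already present in mapping to avoid alloc/free */
		bool present;

		rcu_read_lock();
		present = page_cache_next_miss(mapping, index, 1) != index;
		rcu_read_unlock();
		if (present) {
			mutex_unlock(&hugetlb_fault_mutex_table[hash]);
			hugetlb_drop_vma_policy(&pseudo_vma);
			continue;
		}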

-- 
Mike Kravetz

* Re: [PATCH 3/9] mm/hugetlb: convert putback_active_hugepage to take in a folio
  2023-01-19 21:14 ` [PATCH 3/9] mm/hugetlb: convert putback_active_hugepage to take in a folio Sidhartha Kumar
  2023-01-20  5:48   ` Matthew Wilcox
@ 2023-01-20 23:47   ` Mike Kravetz
  1 sibling, 0 replies; 21+ messages in thread
From: Mike Kravetz @ 2023-01-20 23:47 UTC (permalink / raw)
  To: Sidhartha Kumar; +Cc: linux-kernel, linux-mm, akpm, songmuchun, willy, jhubbard

On 01/19/23 13:14, Sidhartha Kumar wrote:
> Convert putback_active_hugepage() to putback_active_hugetlb_folio(), this
> removes one user of the Huge Page macros which take in a page. The callers
> in migrate.c are also cleaned up by being able to directly use the src and
> dst folio variables.
> 
> Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
> ---
>  include/linux/hugetlb.h | 4 ++--
>  mm/hugetlb.c            | 8 ++++----
>  mm/migrate.c            | 8 ++++----
>  3 files changed, 10 insertions(+), 10 deletions(-)

Thanks,

Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
-- 
Mike Kravetz

* Re: [PATCH 8/9] mm/hugetlb: convert hugetlb_wp() to take in a folio
  2023-01-19 21:14 ` [PATCH 8/9] mm/hugetlb: convert hugetlb_wp() " Sidhartha Kumar
@ 2023-01-24 15:23   ` Gerald Schaefer
  2023-01-24 18:38     ` Sidhartha Kumar
  0 siblings, 1 reply; 21+ messages in thread
From: Gerald Schaefer @ 2023-01-24 15:23 UTC (permalink / raw)
  To: Sidhartha Kumar
  Cc: linux-kernel, linux-mm, akpm, songmuchun, mike.kravetz, willy,
	jhubbard, linux-s390

On Thu, 19 Jan 2023 13:14:45 -0800
Sidhartha Kumar <sidhartha.kumar@oracle.com> wrote:

[...]
>  	page = pte_page(entry);
> -	if (page != pagecache_page)
> +	if (page_folio(page) != pagecache_folio)
>  		if (!trylock_page(page)) {
>  			need_wait_lock = 1;
>  			goto out_ptl;
>  		}
>  
> -	get_page(page);
> +	folio_get(pagecache_folio);
>  

We get a kernel crash on s390 in the mprotect testcase from the
libhugetlbfs testsuite, starting with next-20230120, bisected to this
commit.

We get here with pagecache_folio == NULL, and crash in folio_get().
It doesn't seem right to replace the get_page() with folio_get()
here; the matching put_page() at out_put_page: also wasn't changed
correspondingly. Also, pagecache_folio == NULL seems to be a valid
case here, on all architectures.

Reverting this folio_get() to get_page() fixes the crash. Not sure
though if I missed something. I think you only want to replace
pagecache_page with pagecache_folio, like in the rest of the commit,
and not page -> pagecache_folio for this get_page().
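
In other words, the hunk would go back to pinning the faulting page
itself (sketch of the suggested fix, as it might appear in a v2):

	page = pte_page(entry);
	if (page_folio(page) != pagecache_folio)
		if (!trylock_page(page)) {
			need_wait_lock = 1;
			goto out_ptl;
		}

	/* pagecache_folio may be NULL here, so keep the reference
	 * on the faulting page, as before the conversion */
	get_page(page);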

* Re: [PATCH 8/9] mm/hugetlb: convert hugetlb_wp() to take in a folio
  2023-01-24 15:23   ` Gerald Schaefer
@ 2023-01-24 18:38     ` Sidhartha Kumar
  0 siblings, 0 replies; 21+ messages in thread
From: Sidhartha Kumar @ 2023-01-24 18:38 UTC (permalink / raw)
  To: Gerald Schaefer
  Cc: linux-kernel, linux-mm, akpm, songmuchun, mike.kravetz, willy,
	jhubbard, linux-s390

On 1/24/23 7:23 AM, Gerald Schaefer wrote:
> On Thu, 19 Jan 2023 13:14:45 -0800
> Sidhartha Kumar <sidhartha.kumar@oracle.com> wrote:
> 
> [...]
>>   	page = pte_page(entry);
>> -	if (page != pagecache_page)
>> +	if (page_folio(page) != pagecache_folio)
>>   		if (!trylock_page(page)) {
>>   			need_wait_lock = 1;
>>   			goto out_ptl;
>>   		}
>>   
>> -	get_page(page);
>> +	folio_get(pagecache_folio);
>>   
> 
> We get a kernel crash on s390 in mprotect testcase from libhugetlbfs
> testsuite, starting with next-20230120, bisected to this commit.
> 
> We get here with pagecache_folio == NULL, and crash in folio_get().
> It doesn´t seem right to replace the get_page() with folio_get()
> here, the matching put_page() at out_put_page: also wasn't changed
> correspondingly. Also, pagecache_folio == NULL seems to be a valid
> case here, on all architectures.
> 
> Reverting this folio_get() to get_page() fixes the crash. Not sure
> though if I missed something. I think you only want to replace
> pagecache_page with pagecache_folio, like in the rest of the commit,
> and not page -> pagecache_folio for this get_page().
Yes, that get_page(page) line should have stayed as it was, since
pagecache_folio only replaces instances of pagecache_page. Thanks for
catching this; I'll fix it in v2 of this patch series.

Thanks,
Sidhartha Kumar

