linux-mm.kvack.org archive mirror
* [PATCH v1 0/5] mm, kpageflags: support folio and fix output for compound pages
@ 2023-10-10 14:27 Naoya Horiguchi
  2023-10-10 14:27 ` [PATCH v1 1/5] include/uapi/linux/kernel-page-flags.h: define KPF_FOLIO Naoya Horiguchi
                   ` (5 more replies)
  0 siblings, 6 replies; 15+ messages in thread
From: Naoya Horiguchi @ 2023-10-10 14:27 UTC
  To: linux-mm
  Cc: Andrew Morton, Matthew Wilcox, David Hildenbrand,
	Kirill A. Shutemov, Mike Kravetz, Miaohe Lin, Vlastimil Babka,
	Muchun Song, Naoya Horiguchi, linux-kernel

Hi everyone,

This patchset addresses 2 issues in /proc/kpageflags.

  1. We can't easily tell folio from thp, because currently both pages are
     judged as thp, and
  2. we see some garbage data in records of compound tail pages because
     we use tail pages to store some internal data.

These issues require userspace programs to do additional work to understand
the page status, which makes the situation more complicated.

This patchset tries to solve these by defining KPF_FOLIO for issue 1., and
by hiding part of page flag info on tail pages of compound pages for issue 2.
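
For reference, here is a minimal userspace sketch of how a reader consumes
/proc/kpageflags: the file holds one 64-bit flag word per page frame, indexed
by PFN.  This is only an illustration and not part of the patchset.  The
helper names (kpageflags_offset, kpf_test, kpageflags_read) are hypothetical,
and KPF_FOLIO uses the bit number proposed in patch 1/5.

```c
#define _XOPEN_SOURCE 700	/* for pread() under strict -std= modes */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Bit numbers from include/uapi/linux/kernel-page-flags.h. */
#define KPF_COMPOUND_HEAD	15
#define KPF_COMPOUND_TAIL	16
#define KPF_THP			22
#define KPF_FOLIO		27	/* proposed in patch 1/5 of this series */

/* Byte offset of the 64-bit record that describes page frame `pfn`. */
static inline off_t kpageflags_offset(uint64_t pfn)
{
	return (off_t)(pfn * sizeof(uint64_t));
}

/* Test one KPF_* bit in a flag word. */
static inline int kpf_test(uint64_t flags, unsigned int bit)
{
	return (int)((flags >> bit) & 1);
}

/*
 * Read the flag word for `pfn` from an fd opened on /proc/kpageflags
 * (typically readable by root only).  Returns 0 on success, -1 on a
 * short read or error.
 */
static inline int kpageflags_read(int fd, uint64_t pfn, uint64_t *flags)
{
	ssize_t n;

	assert(flags != NULL);
	n = pread(fd, flags, sizeof(*flags), kpageflags_offset(pfn));
	return n == (ssize_t)sizeof(*flags) ? 0 : -1;
}
```

With these helpers, a tool can open /proc/kpageflags read-only, call
kpageflags_read() for a PFN, and then use kpf_test(flags, KPF_THP) or
kpf_test(flags, KPF_FOLIO) to tell the two cases apart.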

I think that technically some compound pages like thp/hugetlb/slab could be
considered folios, but in this version KPF_FOLIO is set only on folios
in pagecache (so "folios in a narrower sense").  I'm not confident about
this choice, so if you have any thoughts on this, please let me know.

The output of tools/mm/page-types.c will change as shown below (focusing only
on compound pages).

Before patchset:

  // anonymous thp
  voffset         offset  len     flags
  ...
  700000000       156c00  1       ___U_l_____Ma_bH______t_____________f_d_____1
  700000001       156c01  1       L__U_______Ma___T_____t_____________f_______1
  700000002       156c02  1fe     ___________Ma___T_____t_____________f_______1

  // file thp
  700000000       15d600  1       __RUDl_____M__bH______t_____________f__I____1
  700000001       15d601  1       L__U_______M____T_____t_____________f_______1
  700000002       15d602  1fe     ___________M____T_____t_____________f_______1

  // large folio
  700000000       154f84  1       __RU_l_____M___H______t________P____f_____F_1
  700000001       154f85  1       ________W__M____T_____t_____________f_____F_1
  700000002       154f86  2       ___________M____T_____t_____________f_____F_1
  700000004       14d0a4  1       __RU_l_____M___H______t________P____f_____F_1
  700000005       14d0a5  1       ________W__M____T_____t_____________f_____F_1
  700000006       14d0a6  2       ___________M____T_____t_____________f_____F_1
  ...

  // free hugetlb (HVO disabled)
  offset  len     flags
  ...
  106a00  1       _______________H_G___________________________
  106a01  1       L__U__A_________TG___________________________
  106a02  1fe     ________________TG___________________________

  // anonymous hugetlb (HVO disabled)
  700000000       157200  1       ___U_______Ma__H_G__________________f_d_____1
  700000001       157201  1       L__U__A____Ma___TG__________________f_______1
  700000002       157202  1fe     ___________Ma___TG__________________f_______1

  // free hugetlb (HVO enabled)
  12a600  1       _______________H_G___________________________
  12a601  1       L__U__A_________TG___________________________
  12a602  3f      ________________TG___________________________
  12a641  1       L__U__A_________TG___________________________
  12a642  3f      ________________TG___________________________
  ...

  // anonymous hugetlb (HVO enabled)
  700000000       15e600  1       ___U_______Ma__H_G__________________f_d_____1
  700000001       15e601  1       L__U__A____Ma___TG__________________f_______1
  700000002       15e602  3e      ___________Ma___TG__________________f_______1
  700000040       15e640  1       ___U_______Ma___TG__________________f_d_____1
  700000041       15e641  1       L__U__A____Ma___TG__________________f_______1
  700000042       15e642  3e      ___________Ma___TG__________________f_______1
  ...

  // slab
               flags      page-count       MB  symbolic-flags                     long-symbolic-flags
  0x0000000000000080            5304       20  _______S_____________________________________      slab
  0x0000000000008080            1488        5  _______S_______H_____________________________      slab,compound_head
  0x0000000000010081             365        1  L______S________T____________________________      locked,slab,compound_tail
  0x0000000000010080            4142       16  _______S________T____________________________      slab,compound_tail
  0x0000000000010180             649        2  _______SW_______T____________________________      slab,writeback,compound_tail
  0x0000000000010181             474        1  L______SW_______T____________________________      locked,slab,writeback,compound_tail
  0x0000000000201080             192        0  _______S____a________x_______________________      slab,anonymous,ksm
  0x0000000000001080             427        1  _______S____a________________________________      slab,anonymous
  0x0000000000409080             237        0  _______S____a__H______t______________________      slab,anonymous,compound_head,thp
  0x0000000000411081              78        0  L______S____a___T_____t______________________      locked,slab,anonymous,compound_tail,thp
  0x0000000000609080              77        0  _______S____a__H_____xt______________________      slab,anonymous,compound_head,ksm,thp
  0x0000000000611081              32        0  L______S____a___T____xt______________________      locked,slab,anonymous,compound_tail,ksm,thp
  0x0000000000411080             698        2  _______S____a___T_____t______________________      slab,anonymous,compound_tail,thp
  0x0000000000611080             142        0  _______S____a___T____xt______________________      slab,anonymous,compound_tail,ksm,thp
  0x0000000000611180              32        0  _______SW___a___T____xt______________________      slab,writeback,anonymous,compound_tail,ksm,thp
  0x0000000000411181              95        0  L______SW___a___T_____t______________________      locked,slab,writeback,anonymous,compound_tail,thp
  0x0000000000411180              64        0  _______SW___a___T_____t______________________      slab,writeback,anonymous,compound_tail,thp
  0x0000000000611181              13        0  L______SW___a___T____xt______________________      locked,slab,writeback,anonymous,compound_tail,ksm,thp


After patchset:

  // anonymous thp
  700000000       117000  1       ___U_l_____Ma_bH______t_____________f_d_____1
  700000001       117001  1ff     ________________T_____t_____________f_______1

  // file thp
  700000000       118400  1       __RUDl_____M__bH______t_____________f__I____1
  700000001       118401  1ff     ________________T_____t_____________f_______1

  // large folio
  700000000       148da4  1       __RU_l_____M___H___________f___P____f_____F_1
  700000001       148da5  3       ________________T__________f________f_____F_1
  700000004       148da8  1       __RU_l_____M___H___________f___P____f_____F_1
  700000005       148da9  3       ________________T__________f________f_____F_1

  // free hugetlb (HVO disabled)
  116000  1       _______________H_G___________________________
  116001  1ff     ________________TG___________________________

  // anonymous hugetlb (HVO disabled)
  700000000       116000  1       ___U_______Ma__H_G__________________f_d_____1
  700000001       116001  1ff     ________________TG__________________f_______1

  // free hugetlb (HVO enabled)
  116000  1       _______________H_G___________________________
  116001  1ff     ________________TG___________________________

  // anonymous hugetlb (HVO enabled)
  700000000       116000  1       ___U_______Ma__H_G__________________f_d_____1
  700000001       116001  1ff     ________________TG__________________f_______1

  // slab
  0x0000000000000080            5659       22  _______S_____________________________________      slab
  0x0000000000008080            1644        6  _______S_______H_____________________________      slab,compound_head
  0x0000000000010080            6196       24  _______S________T____________________________      slab,compound_tail

Thanks,
Naoya Horiguchi
---
Summary:

Naoya Horiguchi (5):
      include/uapi/linux/kernel-page-flags.h: define KPF_FOLIO
      mm: kpageflags: distinguish thp and folio
      mm, kpageflags: separate code path for hugetlb pages
      mm, kpageflags: fix invalid output for PageSlab
      tools/mm/page-types.c: hide compound pages in non-raw mode

 fs/proc/page.c                         | 90 +++++++++++++++++++---------------
 include/uapi/linux/kernel-page-flags.h |  1 +
 tools/mm/page-types.c                  |  3 +-
 3 files changed, 53 insertions(+), 41 deletions(-)



* [PATCH v1 1/5] include/uapi/linux/kernel-page-flags.h: define KPF_FOLIO
  2023-10-10 14:27 [PATCH v1 0/5] mm, kpageflags: support folio and fix output for compound pages Naoya Horiguchi
@ 2023-10-10 14:27 ` Naoya Horiguchi
  2023-10-10 14:27 ` [PATCH v1 2/5] mm: kpageflags: distinguish thp and folio Naoya Horiguchi
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 15+ messages in thread
From: Naoya Horiguchi @ 2023-10-10 14:27 UTC
  To: linux-mm
  Cc: Andrew Morton, Matthew Wilcox, David Hildenbrand,
	Kirill A. Shutemov, Mike Kravetz, Miaohe Lin, Vlastimil Babka,
	Muchun Song, Naoya Horiguchi, linux-kernel

From: Naoya Horiguchi <naoya.horiguchi@nec.com>

Define a new KPF flag to represent folios in /proc/kpageflags and in the
in-tree userspace tool page-types.c.

Note that in page-types.c I chose 'f' as the character representing folio,
which conflicts with KPF_SOFTDIRTY, but there is no better choice because
all reasonable candidates ('f', 'F', 'o', 'O', 'l', 'L', 'i', and 'I') are
already used.  To tell the two apart, check the long flag names or the
position of 'f' in the short form.
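
To illustrate why the long names stay unambiguous even though the short
character is shared, here is a small sketch modeled on the page_flag_names[]
table in page-types.c.  It is an excerpt-style illustration, not the tool's
code: the helper names are hypothetical, and the KPF_SOFTDIRTY bit number and
its "f:softdirty" entry are quoted from memory of the current sources, so
treat them as assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define KPF_THP		22
#define KPF_FOLIO	27	/* proposed in this patch */
#define KPF_SOFTDIRTY	40	/* kernel-internal bit (assumed value) */

/* "c:long_name" entries, indexed by bit number as in page-types.c. */
static const char *const flag_names[64] = {
	[KPF_THP]	= "t:thp",
	[KPF_FOLIO]	= "f:folio",
	[KPF_SOFTDIRTY]	= "f:softdirty",
};

/* Short character for a bit, or '\0' if the bit has no name here. */
static inline char kpf_short_char(unsigned int bit)
{
	assert(bit < 64);
	return flag_names[bit] ? flag_names[bit][0] : '\0';
}

/* Bit number for a long name, or -1 if the name is unknown. */
static inline int kpf_bit_by_long_name(const char *name)
{
	int i;

	assert(name != NULL);
	for (i = 0; i < 64; i++) {
		/* Skip the "c:" prefix and compare the long name only. */
		if (flag_names[i] && strcmp(flag_names[i] + 2, name) == 0)
			return i;
	}
	return -1;
}
```

Looking up "folio" and "softdirty" yields two different bit numbers, while
kpf_short_char() returns 'f' for both, which is exactly the ambiguity the
note above warns about.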

Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
---
 include/uapi/linux/kernel-page-flags.h | 1 +
 tools/mm/page-types.c                  | 1 +
 2 files changed, 2 insertions(+)

diff --git a/include/uapi/linux/kernel-page-flags.h b/include/uapi/linux/kernel-page-flags.h
index 6f2f2720f3ac..9b43dadb7f49 100644
--- a/include/uapi/linux/kernel-page-flags.h
+++ b/include/uapi/linux/kernel-page-flags.h
@@ -36,5 +36,6 @@
 #define KPF_ZERO_PAGE		24
 #define KPF_IDLE		25
 #define KPF_PGTABLE		26
+#define KPF_FOLIO		27
 
 #endif /* _UAPILINUX_KERNEL_PAGE_FLAGS_H */
diff --git a/tools/mm/page-types.c b/tools/mm/page-types.c
index 8d5595b6c59f..b78448d19e88 100644
--- a/tools/mm/page-types.c
+++ b/tools/mm/page-types.c
@@ -126,6 +126,7 @@ static const char * const page_flag_names[] = {
 	[KPF_PGTABLE]		= "g:pgtable",
 	[KPF_ZERO_PAGE]		= "z:zero_page",
 	[KPF_IDLE]              = "i:idle_page",
+	[KPF_FOLIO]		= "f:folio",
 
 	[KPF_RESERVED]		= "r:reserved",
 	[KPF_MLOCKED]		= "m:mlocked",
-- 
2.25.1




* [PATCH v1 2/5] mm: kpageflags: distinguish thp and folio
  2023-10-10 14:27 [PATCH v1 0/5] mm, kpageflags: support folio and fix output for compound pages Naoya Horiguchi
  2023-10-10 14:27 ` [PATCH v1 1/5] include/uapi/linux/kernel-page-flags.h: define KPF_FOLIO Naoya Horiguchi
@ 2023-10-10 14:27 ` Naoya Horiguchi
  2023-10-10 14:27 ` [PATCH v1 3/5] mm, kpageflags: separate code path for hugetlb pages Naoya Horiguchi
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 15+ messages in thread
From: Naoya Horiguchi @ 2023-10-10 14:27 UTC
  To: linux-mm
  Cc: Andrew Morton, Matthew Wilcox, David Hildenbrand,
	Kirill A. Shutemov, Mike Kravetz, Miaohe Lin, Vlastimil Babka,
	Muchun Song, Naoya Horiguchi, linux-kernel

From: Naoya Horiguchi <naoya.horiguchi@nec.com>

Currently a large folio is reported as thp in the output of
/proc/kpageflags because stable_page_flags() cannot distinguish thp from
large folios.  This is confusing, and readers of /proc/kpageflags need
additional checks in userspace, which is inefficient.

Check the page order in stable_page_flags() to distinguish thp from large
folios.  Although thp (like other types of compound page) is a special form
of folio, KPF_FOLIO means "folio" in a narrower sense (folios in the page
cache), so KPF_FOLIO is not set on thp or hugetlb pages.  This is because
thp and hugetlb (and other compound pages) have their own KPF_* flags, so
they are already identifiable.

Thp and folios use some struct page fields of the first tail pages for
internal purposes.  There's no point in parsing and showing flag info based
on tail pages, so return immediately when a thp/folio tail page is found.
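
The decision described above can be modeled in plain C as follows.  This is
a userspace sketch of the logic only, not the kernel code: the function name
and its boolean parameters are hypothetical stand-ins for PageLRU()/PageAnon(),
is_huge_zero_page() and PageTail(), and MODEL_PMD_ORDER assumes x86-64 with
4 KiB base pages, where HPAGE_PMD_ORDER is 9.

```c
#include <stdbool.h>
#include <stdint.h>

#define KPF_COMPOUND_HEAD	15
#define KPF_COMPOUND_TAIL	16
#define KPF_THP			22
#define KPF_ZERO_PAGE		24
#define KPF_FOLIO		27	/* proposed in patch 1/5 */
#define MODEL_PMD_ORDER		9	/* assumption: x86-64, 4 KiB pages */

/*
 * Model of the PageTransCompound() branch after this patch.
 * head_lru_or_anon: PageLRU(head) || PageAnon(head)
 * huge_zero:        is_huge_zero_page(head)
 * order:            compound_order(head)
 * is_tail:          PageTail(page); otherwise the page is the head
 */
static inline uint64_t model_trans_compound_flags(bool head_lru_or_anon,
						  bool huge_zero,
						  unsigned int order,
						  bool is_tail)
{
	uint64_t u = 0;

	if (head_lru_or_anon) {
		/* PMD-sized compound pages are thp, other orders are folios. */
		if (order == MODEL_PMD_ORDER)
			u |= 1ULL << KPF_THP;
		else
			u |= 1ULL << KPF_FOLIO;
	} else if (huge_zero) {
		u |= 1ULL << KPF_ZERO_PAGE;
	}

	/* Tail pages report only the bits set so far plus the tail bit. */
	u |= 1ULL << (is_tail ? KPF_COMPOUND_TAIL : KPF_COMPOUND_HEAD);
	return u;
}
```

In this model an order-9 anonymous head page yields thp plus compound_head,
while an order-2 pagecache tail page yields folio plus compound_tail.  The
real function goes on to add further flags for head pages, which the model
omits.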

The output below shows how this patch changes the output of page-types.

Before patch:

  // anonymous thp
  voffset         offset  len     flags
  ...
  700000000       156c00  1       ___U_l_____Ma_bH______t_____________f_d_____1
  700000001       156c01  1       L__U_______Ma___T_____t_____________f_______1
  700000002       156c02  1fe     ___________Ma___T_____t_____________f_______1
                                                        ^
                                                    this 't' means thp
  // file thp
  700000000       15d600  1       __RUDl_____M__bH______t_____________f__I____1
  700000001       15d601  1       L__U_______M____T_____t_____________f_______1
  700000002       15d602  1fe     ___________M____T_____t_____________f_______1

  // large folio
  700000000       154f84  1       __RU_l_____M___H______t________P____f_____F_1
  700000001       154f85  1       ________W__M____T_____t_____________f_____F_1
  700000002       154f86  2       ___________M____T_____t_____________f_____F_1

After patch:

  // anonymous thp
  700000000       117000  1       ___U_l_____Ma_bH______t_____________f_d_____1
  700000001       117001  1ff     ________________T_____t_____________f_______1

  // file thp
  700000000       118400  1       __RUDl_____M__bH______t_____________f__I____1
  700000001       118401  1ff     ________________T_____t_____________f_______1

  // large folio
  700000000       148da4  1       __RU_l_____M___H___________f___P____f_____F_1
  700000001       148da5  3       ________________T__________f________f_____F_1
                                                             ^
                                                    this 'f' means folio

Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
---
 fs/proc/page.c | 21 +++++++++++++++++----
 1 file changed, 17 insertions(+), 4 deletions(-)

diff --git a/fs/proc/page.c b/fs/proc/page.c
index 195b077c0fac..78f675f791c1 100644
--- a/fs/proc/page.c
+++ b/fs/proc/page.c
@@ -154,11 +154,24 @@ u64 stable_page_flags(struct page *page)
 	else if (PageTransCompound(page)) {
 		struct page *head = compound_head(page);
 
-		if (PageLRU(head) || PageAnon(head))
-			u |= 1 << KPF_THP;
-		else if (is_huge_zero_page(head)) {
+		/*
+		 * We need to check PageLRU/PageAnon to make sure a given page
+		 * is a thp, not a huge zero page or a generic compound page
+		 * (allocated by drivers with __GFP_COMP).
+		 */
+		if (PageLRU(head) || PageAnon(head)) {
+			if (compound_order(head) == HPAGE_PMD_ORDER)
+				u |= 1 << KPF_THP;
+			else
+				u |= 1 << KPF_FOLIO;
+		} else if (is_huge_zero_page(head))
 			u |= 1 << KPF_ZERO_PAGE;
-			u |= 1 << KPF_THP;
+
+		if (PageHead(page))
+			u |= 1 << KPF_COMPOUND_HEAD;
+		if (PageTail(page)) {
+			u |= 1 << KPF_COMPOUND_TAIL;
+			return u;
 		}
 	} else if (is_zero_pfn(page_to_pfn(page)))
 		u |= 1 << KPF_ZERO_PAGE;
-- 
2.25.1




* [PATCH v1 3/5] mm, kpageflags: separate code path for hugetlb pages
  2023-10-10 14:27 [PATCH v1 0/5] mm, kpageflags: support folio and fix output for compound pages Naoya Horiguchi
  2023-10-10 14:27 ` [PATCH v1 1/5] include/uapi/linux/kernel-page-flags.h: define KPF_FOLIO Naoya Horiguchi
  2023-10-10 14:27 ` [PATCH v1 2/5] mm: kpageflags: distinguish thp and folio Naoya Horiguchi
@ 2023-10-10 14:27 ` Naoya Horiguchi
  2023-10-10 14:28 ` [PATCH v1 4/5] mm, kpageflags: fix invalid output for PageSlab Naoya Horiguchi
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 15+ messages in thread
From: Naoya Horiguchi @ 2023-10-10 14:27 UTC
  To: linux-mm
  Cc: Andrew Morton, Matthew Wilcox, David Hildenbrand,
	Kirill A. Shutemov, Mike Kravetz, Miaohe Lin, Vlastimil Babka,
	Muchun Song, Naoya Horiguchi, linux-kernel

From: Naoya Horiguchi <naoya.horiguchi@nec.com>

Hugetlb pages use some struct page fields of the first few tail pages for
internal purposes.  There's no point in parsing and showing kpageflags info
based on tail pages, so return immediately when a hugetlb tail page is found.

Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
---
 fs/proc/page.c | 25 +++++++++++--------------
 1 file changed, 11 insertions(+), 14 deletions(-)

diff --git a/fs/proc/page.c b/fs/proc/page.c
index 78f675f791c1..9b6ded8a2c90 100644
--- a/fs/proc/page.c
+++ b/fs/proc/page.c
@@ -135,23 +135,20 @@ u64 stable_page_flags(struct page *page)
 	if (PageKsm(page))
 		u |= 1 << KPF_KSM;
 
-	/*
-	 * compound pages: export both head/tail info
-	 * they together define a compound page's start/end pos and order
-	 */
-	if (PageHead(page))
-		u |= 1 << KPF_COMPOUND_HEAD;
-	if (PageTail(page))
-		u |= 1 << KPF_COMPOUND_TAIL;
-	if (PageHuge(page))
+	if (PageHuge(page)) {
 		u |= 1 << KPF_HUGE;
+		if (PageHead(page))
+			u |= 1 << KPF_COMPOUND_HEAD;
+		if (PageTail(page)) {
+			u |= 1 << KPF_COMPOUND_TAIL;
+			return u;
+		}
 	/*
-	 * PageTransCompound can be true for non-huge compound pages (slab
-	 * pages or pages allocated by drivers with __GFP_COMP) because it
-	 * just checks PG_head/PG_tail, so we need to check PageLRU/PageAnon
-	 * to make sure a given page is a thp, not a non-huge compound page.
+	 * PageTransCompound can be true for any types of compound pages,
+	 * because it just checks PG_head and PageTail, but at this point
+	 * PageSlab and PageHuge are already checked to be false.
 	 */
-	else if (PageTransCompound(page)) {
+	} else if (PageTransCompound(page)) {
 		struct page *head = compound_head(page);
 
 		/*
-- 
2.25.1




* [PATCH v1 4/5] mm, kpageflags: fix invalid output for PageSlab
  2023-10-10 14:27 [PATCH v1 0/5] mm, kpageflags: support folio and fix output for compound pages Naoya Horiguchi
                   ` (2 preceding siblings ...)
  2023-10-10 14:27 ` [PATCH v1 3/5] mm, kpageflags: separate code path for hugetlb pages Naoya Horiguchi
@ 2023-10-10 14:28 ` Naoya Horiguchi
  2023-10-10 14:28 ` [PATCH v1 5/5] tools/mm/page-types.c: hide compound pages in non-raw mode Naoya Horiguchi
  2023-10-12  8:33 ` [PATCH v1 0/5] mm, kpageflags: support folio and fix output for compound pages David Hildenbrand
  5 siblings, 0 replies; 15+ messages in thread
From: Naoya Horiguchi @ 2023-10-10 14:28 UTC
  To: linux-mm
  Cc: Andrew Morton, Matthew Wilcox, David Hildenbrand,
	Kirill A. Shutemov, Mike Kravetz, Miaohe Lin, Vlastimil Babka,
	Muchun Song, Naoya Horiguchi, linux-kernel

From: Naoya Horiguchi <naoya.horiguchi@nec.com>

The flag field of slab tail pages is used for internal purposes and
there's no point in exposing such info to userspace.

Here's the output of `page-types -r -b slab` command now:

               flags      page-count       MB  symbolic-flags                     long-symbolic-flags
  0x0000000000000080            5304       20  _______S_____________________________________      slab
  0x0000000000008080            1488        5  _______S_______H_____________________________      slab,compound_head
  0x0000000000010081             365        1  L______S________T____________________________      locked,slab,compound_tail
  0x0000000000010080            4142       16  _______S________T____________________________      slab,compound_tail
  0x0000000000010180             649        2  _______SW_______T____________________________      slab,writeback,compound_tail
  0x0000000000010181             474        1  L______SW_______T____________________________      locked,slab,writeback,compound_tail
  0x0000000000201080             192        0  _______S____a________x_______________________      slab,anonymous,ksm
  0x0000000000001080             427        1  _______S____a________________________________      slab,anonymous
  0x0000000000409080             237        0  _______S____a__H______t______________________      slab,anonymous,compound_head,thp
  0x0000000000411081              78        0  L______S____a___T_____t______________________      locked,slab,anonymous,compound_tail,thp
  0x0000000000609080              77        0  _______S____a__H_____xt______________________      slab,anonymous,compound_head,ksm,thp
  0x0000000000611081              32        0  L______S____a___T____xt______________________      locked,slab,anonymous,compound_tail,ksm,thp
  0x0000000000411080             698        2  _______S____a___T_____t______________________      slab,anonymous,compound_tail,thp
  0x0000000000611080             142        0  _______S____a___T____xt______________________      slab,anonymous,compound_tail,ksm,thp
  0x0000000000611180              32        0  _______SW___a___T____xt______________________      slab,writeback,anonymous,compound_tail,ksm,thp
  0x0000000000411181              95        0  L______SW___a___T_____t______________________      locked,slab,writeback,anonymous,compound_tail,thp
  0x0000000000411180              64        0  _______SW___a___T_____t______________________      slab,writeback,anonymous,compound_tail,thp
  0x0000000000611181              13        0  L______SW___a___T____xt______________________      locked,slab,writeback,anonymous,compound_tail,ksm,thp

In this output, the "locked" and "writeback" flags are meaningless
because these bits are encoded in folio->_flags_1 via folio_set_order(),
and those pages are actually neither locked nor under writeback.

As for the "anonymous" and "ksm" flags, these are encoded in folio->mapping,
whose actual value looks like 0xdead000000000003.  I'm not sure how this
value is set, but according to the comment in include/linux/page-flags.h:

  > * For slab pages, since slab reuses the bits in struct page to store its
  > * internal states, the page->mapping does not exist as such, nor do these
  > * flags below.  So in order to avoid testing non-existent bits, please
  > * make sure that PageSlab(page) actually evaluates to false before calling
  > * the following functions (e.g., PageAnon).  See mm/slab.h.

So we don't have to check PageAnon or PageKsm for slab pages, and
stable_page_flags() can return immediately when it finds a slab page
(head or tail).

Note that KPF_HWPOISON is special and it can be helpful to make it visible
in /proc/kpageflags even on compound tail pages.
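
The effect on the raw values listed above can be sketched as a simple mask.
This is a userspace model with a hypothetical function name, not the kernel
change itself: the patch builds the slab flags from scratch instead of
masking, but the bits that survive are the same four.

```c
#include <stdint.h>

/* Bit numbers from include/uapi/linux/kernel-page-flags.h. */
#define KPF_SLAB		7
#define KPF_COMPOUND_HEAD	15
#define KPF_COMPOUND_TAIL	16
#define KPF_HWPOISON		19

/*
 * Model of what a slab page reports after this patch: only slab,
 * compound head/tail and hwpoison are kept; everything else (locked,
 * writeback, anonymous, ksm, thp, ...) is dropped.
 */
static inline uint64_t model_slab_flags(uint64_t raw)
{
	const uint64_t keep = (1ULL << KPF_SLAB) |
			      (1ULL << KPF_COMPOUND_HEAD) |
			      (1ULL << KPF_COMPOUND_TAIL) |
			      (1ULL << KPF_HWPOISON);

	return raw & keep;
}
```

Applied to the "before" values, 0x10181 (locked,slab,writeback,compound_tail)
becomes 0x10080 and 0x409080 (slab,anonymous,compound_head,thp) becomes
0x8080, which matches the flag values in the output after the patch.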

After this patch, `page-types -r -b slab` command shows the following simpler
output (without any invalid flags).

  0x0000000000000080            5659       22  _______S_____________________________________      slab
  0x0000000000008080            1644        6  _______S_______H_____________________________      slab,compound_head
  0x0000000000010080            6196       24  _______S________T____________________________      slab,compound_tail

Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
---
 fs/proc/page.c | 44 ++++++++++++++++++++++----------------------
 1 file changed, 22 insertions(+), 22 deletions(-)

diff --git a/fs/proc/page.c b/fs/proc/page.c
index 9b6ded8a2c90..899b96a26fbd 100644
--- a/fs/proc/page.c
+++ b/fs/proc/page.c
@@ -122,18 +122,18 @@ u64 stable_page_flags(struct page *page)
 	k = page->flags;
 	u = 0;
 
-	/*
-	 * pseudo flags for the well known (anonymous) memory mapped pages
-	 *
-	 * Note that page->_mapcount is overloaded in SLAB, so the
-	 * simple test in page_mapped() is not enough.
-	 */
-	if (!PageSlab(page) && page_mapped(page))
-		u |= 1 << KPF_MMAP;
-	if (PageAnon(page))
-		u |= 1 << KPF_ANON;
-	if (PageKsm(page))
-		u |= 1 << KPF_KSM;
+#ifdef CONFIG_MEMORY_FAILURE
+	u |= kpf_copy_bit(k, KPF_HWPOISON, PG_hwpoison);
+#endif
+
+	if (PageSlab(page)) {
+		u |= 1 << KPF_SLAB;
+		if (PageHead(page))
+			u |= 1 << KPF_COMPOUND_HEAD;
+		if (PageTail(page))
+			u |= 1 << KPF_COMPOUND_TAIL;
+		return u;
+	}
 
 	if (PageHuge(page)) {
 		u |= 1 << KPF_HUGE;
@@ -173,9 +173,18 @@ u64 stable_page_flags(struct page *page)
 	} else if (is_zero_pfn(page_to_pfn(page)))
 		u |= 1 << KPF_ZERO_PAGE;
 
+	/*
+	 * pseudo flags for the well known (anonymous) memory mapped pages
+	 */
+	if (page_mapped(page))
+		u |= 1 << KPF_MMAP;
+	if (PageAnon(page))
+		u |= 1 << KPF_ANON;
+	if (PageKsm(page))
+		u |= 1 << KPF_KSM;
 
 	/*
-	 * Caveats on high order pages: PG_buddy and PG_slab will only be set
+	 * Caveats on high order pages: PG_buddy will only be set
 	 * on the head page.
 	 */
 	if (PageBuddy(page))
@@ -192,11 +201,6 @@ u64 stable_page_flags(struct page *page)
 		u |= 1 << KPF_IDLE;
 
 	u |= kpf_copy_bit(k, KPF_LOCKED,	PG_locked);
-
-	u |= kpf_copy_bit(k, KPF_SLAB,		PG_slab);
-	if (PageTail(page) && PageSlab(page))
-		u |= 1 << KPF_SLAB;
-
 	u |= kpf_copy_bit(k, KPF_ERROR,		PG_error);
 	u |= kpf_copy_bit(k, KPF_DIRTY,		PG_dirty);
 	u |= kpf_copy_bit(k, KPF_UPTODATE,	PG_uptodate);
@@ -214,10 +218,6 @@ u64 stable_page_flags(struct page *page)
 	u |= kpf_copy_bit(k, KPF_UNEVICTABLE,	PG_unevictable);
 	u |= kpf_copy_bit(k, KPF_MLOCKED,	PG_mlocked);
 
-#ifdef CONFIG_MEMORY_FAILURE
-	u |= kpf_copy_bit(k, KPF_HWPOISON,	PG_hwpoison);
-#endif
-
 #ifdef CONFIG_ARCH_USES_PG_UNCACHED
 	u |= kpf_copy_bit(k, KPF_UNCACHED,	PG_uncached);
 #endif
-- 
2.25.1




* [PATCH v1 5/5] tools/mm/page-types.c: hide compound pages in non-raw mode
  2023-10-10 14:27 [PATCH v1 0/5] mm, kpageflags: support folio and fix output for compound pages Naoya Horiguchi
                   ` (3 preceding siblings ...)
  2023-10-10 14:28 ` [PATCH v1 4/5] mm, kpageflags: fix invalid output for PageSlab Naoya Horiguchi
@ 2023-10-10 14:28 ` Naoya Horiguchi
  2023-10-12  8:33 ` [PATCH v1 0/5] mm, kpageflags: support folio and fix output for compound pages David Hildenbrand
  5 siblings, 0 replies; 15+ messages in thread
From: Naoya Horiguchi @ 2023-10-10 14:28 UTC
  To: linux-mm
  Cc: Andrew Morton, Matthew Wilcox, David Hildenbrand,
	Kirill A. Shutemov, Mike Kravetz, Miaohe Lin, Vlastimil Babka,
	Muchun Song, Naoya Horiguchi, linux-kernel

From: Naoya Horiguchi <naoya.horiguchi@nec.com>

In non-raw mode (i.e. calling page-types without the -r flag), any flags
for compound pages other than hugetlb are supposed to be hidden.
But currently KPF_THP is shown, and the newly added flag KPF_FOLIO is
also shown, which is unexpected.  So hide them.
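
The masking step can be sketched in isolation as below.  This models only
the last step of well_known_flags() with a hypothetical function name, and
BIT() here takes a bit number directly, unlike the name-pasting macro in
page-types.c.

```c
#include <stdint.h>

#define KPF_COMPOUND_HEAD	15
#define KPF_COMPOUND_TAIL	16
#define KPF_HUGE		17
#define KPF_THP			22
#define KPF_FOLIO		27	/* proposed in patch 1/5 */

#define BIT(n)		(1ULL << (n))
#define BITS_COMPOUND	(BIT(KPF_COMPOUND_HEAD) | BIT(KPF_COMPOUND_TAIL))

/* Hide compound-related bits unless the page is a hugetlb page. */
static inline uint64_t hide_non_hugetlb_compound(uint64_t flags)
{
	if ((flags & BITS_COMPOUND) && !(flags & BIT(KPF_HUGE)))
		flags &= ~(BITS_COMPOUND | BIT(KPF_THP) | BIT(KPF_FOLIO));

	return flags;
}
```

A thp head page (compound_head plus thp) comes out with both bits cleared,
whereas a hugetlb head page (compound_head plus huge) passes through
unchanged.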

Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
---
 tools/mm/page-types.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/mm/page-types.c b/tools/mm/page-types.c
index b78448d19e88..c37e1e79bc61 100644
--- a/tools/mm/page-types.c
+++ b/tools/mm/page-types.c
@@ -508,7 +508,7 @@ static uint64_t well_known_flags(uint64_t flags)
 
 	/* hide non-hugeTLB compound pages */
 	if ((flags & BITS_COMPOUND) && !(flags & BIT(HUGE)))
-		flags &= ~BITS_COMPOUND;
+		flags &= ~(BITS_COMPOUND|BIT(THP)|BIT(FOLIO));
 
 	return flags;
 }
-- 
2.25.1




* Re: [PATCH v1 0/5] mm, kpageflags: support folio and fix output for compound pages
  2023-10-10 14:27 [PATCH v1 0/5] mm, kpageflags: support folio and fix output for compound pages Naoya Horiguchi
                   ` (4 preceding siblings ...)
  2023-10-10 14:28 ` [PATCH v1 5/5] tools/mm/page-types.c: hide compound pages in non-raw mode Naoya Horiguchi
@ 2023-10-12  8:33 ` David Hildenbrand
  2023-10-12 15:02   ` Naoya Horiguchi
  5 siblings, 1 reply; 15+ messages in thread
From: David Hildenbrand @ 2023-10-12  8:33 UTC
  To: Naoya Horiguchi, linux-mm
  Cc: Andrew Morton, Matthew Wilcox, Kirill A. Shutemov, Mike Kravetz,
	Miaohe Lin, Vlastimil Babka, Muchun Song, Naoya Horiguchi,
	linux-kernel

On 10.10.23 16:27, Naoya Horiguchi wrote:
> Hi everyone,
> 
> This patchset addresses 2 issues in /proc/kpageflags.
> 
>    1. We can't easily tell folio from thp, because currently both pages are
>       judged as thp, and
>    2. we see some garbage data in records of compound tail pages because
>       we use tail pages to store some internal data.
> 
> These issues require userspace programs to do additional work to understand
> the page status, which makes situation more complicated.
> 
> This patchset tries to solve these by defining KPF_FOLIO for issue 1., and
> by hiding part of page flag info on tail pages of compound pages for issue 2.
> 
> I think that technically some compound pages like thp/hugetlb/slab could be
> considered as folio, but in this version KPF_FOLIO is set only on folios

At least thp+hugetlb are most certainly folios. Regarding slab, I
suspect we no longer call them folios (cannot be mapped to user space).
But I'm not sure about the type hierarchy.

> in pagecache (so "folios in narrower meaning").  I'm not confident about
> this choice, so if you have any idea about this, please let me know.

It does sound inconsistent. What exactly do you want to tell user space 
with the new flag?

-- 
Cheers,

David / dhildenb




* Re: [PATCH v1 0/5] mm, kpageflags: support folio and fix output for compound pages
  2023-10-12  8:33 ` [PATCH v1 0/5] mm, kpageflags: support folio and fix output for compound pages David Hildenbrand
@ 2023-10-12 15:02   ` Naoya Horiguchi
  2023-10-12 15:30     ` David Hildenbrand
  0 siblings, 1 reply; 15+ messages in thread
From: Naoya Horiguchi @ 2023-10-12 15:02 UTC
  To: David Hildenbrand
  Cc: linux-mm, Andrew Morton, Matthew Wilcox, Kirill A. Shutemov,
	Mike Kravetz, Miaohe Lin, Vlastimil Babka, Muchun Song,
	Naoya Horiguchi, linux-kernel

On Thu, Oct 12, 2023 at 10:33:04AM +0200, David Hildenbrand wrote:
> On 10.10.23 16:27, Naoya Horiguchi wrote:
> > Hi everyone,
> > 
> > This patchset addresses 2 issues in /proc/kpageflags.
> > 
> >    1. We can't easily tell folio from thp, because currently both pages are
> >       judged as thp, and
> >    2. we see some garbage data in records of compound tail pages because
> >       we use tail pages to store some internal data.
> > 
> > These issues require userspace programs to do additional work to understand
> > the page status, which makes situation more complicated.
> > 
> > This patchset tries to solve these by defining KPF_FOLIO for issue 1., and
> > by hiding part of page flag info on tail pages of compound pages for issue 2.
> > 
> > I think that technically some compound pages like thp/hugetlb/slab could be
> > considered as folio, but in this version KPF_FOLIO is set only on folios
> 
> At least thp+hugetlb are most certainly folios. Regarding slab, I suspect we
> no longer call them folios (cannot be mapped to user space). But Im not sure
> about the type hierarchy.

I'm not sure about the exact definition of "folio", and I think it's better
to set KPF_FOLIO based on that definition.
"Being mapped to userspace" can be one possible criterion for the definition.
But reading the source code, folio_slab() and slab_folio() convert between
struct slab and struct folio, so I feel that someone might think a slab is
a kind of folio.

> 
> > in pagecache (so "folios in narrower meaning").  I'm not confident about
> > this choice, so if you have any idea about this, please let me know.
> 
> It does sound inconsistent. What exactly do you want to tell user space with
> the new flag?

The most problematic behavior currently is reporting a folio as a thp (an
order-2 pagecache page is definitely a folio but not a thp), and this is what
the new flag is intended to tell user space.

Thanks,
Naoya Horiguchi


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v1 0/5] mm, kpageflags: support folio and fix output for compound pages
  2023-10-12 15:02   ` Naoya Horiguchi
@ 2023-10-12 15:30     ` David Hildenbrand
  2023-10-13  0:54       ` Naoya Horiguchi
  2023-10-13 15:03       ` Matthew Wilcox
  0 siblings, 2 replies; 15+ messages in thread
From: David Hildenbrand @ 2023-10-12 15:30 UTC (permalink / raw)
  To: Naoya Horiguchi
  Cc: linux-mm, Andrew Morton, Matthew Wilcox, Kirill A. Shutemov,
	Mike Kravetz, Miaohe Lin, Vlastimil Babka, Muchun Song,
	Naoya Horiguchi, linux-kernel

On 12.10.23 17:02, Naoya Horiguchi wrote:
> On Thu, Oct 12, 2023 at 10:33:04AM +0200, David Hildenbrand wrote:
>> On 10.10.23 16:27, Naoya Horiguchi wrote:
>>> Hi everyone,
>>>
>>> This patchset addresses 2 issues in /proc/kpageflags.
>>>
>>>     1. We can't easily tell folio from thp, because currently both pages are
>>>        judged as thp, and
>>>     2. we see some garbage data in records of compound tail pages because
>>>        we use tail pages to store some internal data.
>>>
>>> These issues require userspace programs to do additional work to understand
>>> the page status, which makes situation more complicated.
>>>
>>> This patchset tries to solve these by defining KPF_FOLIO for issue 1., and
>>> by hiding part of page flag info on tail pages of compound pages for issue 2.
>>>
>>> I think that technically some compound pages like thp/hugetlb/slab could be
>>> considered as folio, but in this version KPF_FOLIO is set only on folios
>>
>> At least thp+hugetlb are most certainly folios. Regarding slab, I suspect we
>> no longer call them folios (cannot be mapped to user space). But Im not sure
>> about the type hierarchy.
> 
> I'm not sure about the exact definition of "folio", and I think it's better
> to make KPF_FOLIO set based on the definition.

Me neither. But in any case a THP *is* a folio. So you'd have to set 
that flag in any case.

And any order-0 page (i.e., anon, pagecache) is also a folio. What you 
seem to imply with folio is "large folio". So KPF_FOLIO is really wrong 
as far as I can tell.

> "being mapped to userspace" can be one possible criteria for the definition.
> But reading source code, folio_slab() and slab_folio() convert between
> struct slab and struct folio, so I feel that someone might think a slab is
> a kind of folio.

I keep forgetting if "folio" is just the generic term for any order-0 or 
compound page, or only for some of them. I usually live in the "anon" 
world, so I don't get reminded that often :)


>>> in pagecache (so "folios in narrower meaning").  I'm not confident about
>>> this choice, so if you have any idea about this, please let me know.
>>
>> It does sound inconsistent. What exactly do you want to tell user space with
>> the new flag?
> 
> The current most problematic behavior is to report folio as thp (order-2
> pagecache page is definitely a folio but not a thp), and this is what the
> new flag is intended to tell.

We are currently considering calling these sub-PMD sized THPs 
"small-sized THP". [1] Arguably, we're starting with the anon part where 
we won't get around exposing them to the user in sysfs.

So I wouldn't immediately say that these things are not THPs. They are
not PMD-sized THPs. A slab/hugetlb is certainly not a thp but a folio.
Slabs can also be order-0 folios, whereas hugetlb can't.


Looking at other interfaces, we do expose:

include/uapi/linux/kernel-page-flags.h:#define KPF_COMPOUND_HEAD        15
include/uapi/linux/kernel-page-flags.h:#define KPF_COMPOUND_TAIL        16

So maybe we should just continue talking about compound pages or do we 
have to use both terms here in this interface?

[1] https://lkml.kernel.org/r/20230929114421.3761121-1-ryan.roberts@arm.com

-- 
Cheers,

David / dhildenb



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v1 0/5] mm, kpageflags: support folio and fix output for compound pages
  2023-10-12 15:30     ` David Hildenbrand
@ 2023-10-13  0:54       ` Naoya Horiguchi
  2023-10-13  7:46         ` David Hildenbrand
  2023-10-13 15:03       ` Matthew Wilcox
  1 sibling, 1 reply; 15+ messages in thread
From: Naoya Horiguchi @ 2023-10-13  0:54 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-mm, Andrew Morton, Matthew Wilcox, Kirill A. Shutemov,
	Mike Kravetz, Miaohe Lin, Vlastimil Babka, Muchun Song,
	Naoya Horiguchi, linux-kernel

On Thu, Oct 12, 2023 at 05:30:34PM +0200, David Hildenbrand wrote:
> On 12.10.23 17:02, Naoya Horiguchi wrote:
> > On Thu, Oct 12, 2023 at 10:33:04AM +0200, David Hildenbrand wrote:
> > > On 10.10.23 16:27, Naoya Horiguchi wrote:
> > > > Hi everyone,
> > > > 
> > > > This patchset addresses 2 issues in /proc/kpageflags.
> > > > 
> > > >     1. We can't easily tell folio from thp, because currently both pages are
> > > >        judged as thp, and
> > > >     2. we see some garbage data in records of compound tail pages because
> > > >        we use tail pages to store some internal data.
> > > > 
> > > > These issues require userspace programs to do additional work to understand
> > > > the page status, which makes situation more complicated.
> > > > 
> > > > This patchset tries to solve these by defining KPF_FOLIO for issue 1., and
> > > > by hiding part of page flag info on tail pages of compound pages for issue 2.
> > > > 
> > > > I think that technically some compound pages like thp/hugetlb/slab could be
> > > > considered as folio, but in this version KPF_FOLIO is set only on folios
> > > 
> > > At least thp+hugetlb are most certainly folios. Regarding slab, I suspect we
> > > no longer call them folios (cannot be mapped to user space). But Im not sure
> > > about the type hierarchy.
> > 
> > I'm not sure about the exact definition of "folio", and I think it's better
> > to make KPF_FOLIO set based on the definition.
> 
> Me neither. But in any case a THP *is* a folio. So you'd have to set that
> flag in any case.

OK.

> 
> And any order-0 page (i.e., anon, pagecache) is also a folio. What you seem
> to imply with folio is "large folio". So KPF_FOLIO is really wrong as far as
> I can tell.

Ah, I meant "large folio" for the new flag, so it might have been better to
name it KPF_LARGE_FOLIO.

> 
> > "being mapped to userspace" can be one possible criteria for the definition.
> > But reading source code, folio_slab() and slab_folio() convert between
> > struct slab and struct folio, so I feel that someone might think a slab is
> > a kind of folio.
> 
> I keep forgetting if "folio" is just the generic term for any order-0 or
> compound page, or only for some of them. I usually live in the "anon" world,
> so I don't get reminded that often :)

I didn't notice that an order-0 page is also a folio.

> 
> 
> > > > in pagecache (so "folios in narrower meaning").  I'm not confident about
> > > > this choice, so if you have any idea about this, please let me know.
> > > 
> > > It does sound inconsistent. What exactly do you want to tell user space with
> > > the new flag?
> > 
> > The current most problematic behavior is to report folio as thp (order-2
> > pagecache page is definitely a folio but not a thp), and this is what the
> > new flag is intended to tell.
> 
> We are currently considering calling these sub-PMD sized THPs "small-sized
> THP". [1] Arguably, we're starting with the anon part where we won't get
> around exposing them to the user in sysfs.
> 
> So I wouldn't immediately say that these things are not THPs. They are not
> PMD-sized THP. A slab/hugetlb is certainly not a thp but a folio. Whereby
> slabs can also be order-0 folios, but hugetlb can't.
> 
> 
> Looking at other interfaces, we do expose:
> 
> include/uapi/linux/kernel-page-flags.h:#define KPF_COMPOUND_HEAD        15
> include/uapi/linux/kernel-page-flags.h:#define KPF_COMPOUND_TAIL        16
> 
> So maybe we should just continue talking about compound pages or do we have
> to use both terms here in this interface?

Extending the concept of thp to arbitrary sizes sounds good to me.
If patchset [1] is merged, then setting KPF_THP on large folios is totally
fine and one of my problems in this patchset will be automatically resolved.
So I'm thinking of not adding a new flag and just focusing on the garbage data issue.

Thank you very much for sharing ideas.

Naoya Horiguchi

> 
> [1] https://lkml.kernel.org/r/20230929114421.3761121-1-ryan.roberts@arm.com
> 
> -- 
> Cheers,
> 
> David / dhildenb
> 


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v1 0/5] mm, kpageflags: support folio and fix output for compound pages
  2023-10-13  0:54       ` Naoya Horiguchi
@ 2023-10-13  7:46         ` David Hildenbrand
  0 siblings, 0 replies; 15+ messages in thread
From: David Hildenbrand @ 2023-10-13  7:46 UTC (permalink / raw)
  To: Naoya Horiguchi
  Cc: linux-mm, Andrew Morton, Matthew Wilcox, Kirill A. Shutemov,
	Mike Kravetz, Miaohe Lin, Vlastimil Babka, Muchun Song,
	Naoya Horiguchi, linux-kernel, Ryan Roberts

On 13.10.23 02:54, Naoya Horiguchi wrote:
> On Thu, Oct 12, 2023 at 05:30:34PM +0200, David Hildenbrand wrote:
>> On 12.10.23 17:02, Naoya Horiguchi wrote:
>>> On Thu, Oct 12, 2023 at 10:33:04AM +0200, David Hildenbrand wrote:
>>>> On 10.10.23 16:27, Naoya Horiguchi wrote:
>>>>> Hi everyone,
>>>>>
>>>>> This patchset addresses 2 issues in /proc/kpageflags.
>>>>>
>>>>>      1. We can't easily tell folio from thp, because currently both pages are
>>>>>         judged as thp, and
>>>>>      2. we see some garbage data in records of compound tail pages because
>>>>>         we use tail pages to store some internal data.
>>>>>
>>>>> These issues require userspace programs to do additional work to understand
>>>>> the page status, which makes situation more complicated.
>>>>>
>>>>> This patchset tries to solve these by defining KPF_FOLIO for issue 1., and
>>>>> by hiding part of page flag info on tail pages of compound pages for issue 2.
>>>>>
>>>>> I think that technically some compound pages like thp/hugetlb/slab could be
>>>>> considered as folio, but in this version KPF_FOLIO is set only on folios
>>>>
>>>> At least thp+hugetlb are most certainly folios. Regarding slab, I suspect we
>>>> no longer call them folios (cannot be mapped to user space). But Im not sure
>>>> about the type hierarchy.
>>>
>>> I'm not sure about the exact definition of "folio", and I think it's better
>>> to make KPF_FOLIO set based on the definition.
>>
>> Me neither. But in any case a THP *is* a folio. So you'd have to set that
>> flag in any case.
> 
> OK.
> 
>>
>> And any order-0 page (i.e., anon, pagecache) is also a folio. What you seem
>> to imply with folio is "large folio". So KPF_FOLIO is really wrong as far as
>> I can tell.
> 
> Ah, I meant "large folio" for the new flag, so it might have been better to
> name it KPF_LARGE_FOLIO.
> 
>>
>>> "being mapped to userspace" can be one possible criteria for the definition.
>>> But reading source code, folio_slab() and slab_folio() convert between
>>> struct slab and struct folio, so I feel that someone might think a slab is
>>> a kind of folio.
>>
>> I keep forgetting if "folio" is just the generic term for any order-0 or
>> compound page, or only for some of them. I usually live in the "anon" world,
>> so I don't get reminded that often :)
> 
> I didn't notice that an order-0 page is also a folio.
> 
>>
>>
>>>>> in pagecache (so "folios in narrower meaning").  I'm not confident about
>>>>> this choice, so if you have any idea about this, please let me know.
>>>>
>>>> It does sound inconsistent. What exactly do you want to tell user space with
>>>> the new flag?
>>>
>>> The current most problematic behavior is to report folio as thp (order-2
>>> pagecache page is definitely a folio but not a thp), and this is what the
>>> new flag is intended to tell.
>>
>> We are currently considering calling these sub-PMD sized THPs "small-sized
>> THP". [1] Arguably, we're starting with the anon part where we won't get
>> around exposing them to the user in sysfs.
>>
>> So I wouldn't immediately say that these things are not THPs. They are not
>> PMD-sized THP. A slab/hugetlb is certainly not a thp but a folio. Whereby
>> slabs can also be order-0 folios, but hugetlb can't.
>>
>>
>> Looking at other interfaces, we do expose:
>>
>> include/uapi/linux/kernel-page-flags.h:#define KPF_COMPOUND_HEAD        15
>> include/uapi/linux/kernel-page-flags.h:#define KPF_COMPOUND_TAIL        16
>>
>> So maybe we should just continue talking about compound pages or do we have
>> to use both terms here in this interface?
> 
> Extending the concept of thp to arbitrary size of thp sounds good to me.
> If patchset [1] will be merged, then setting KPF_THP on large folios is totally
> fine and one of my problem in this patchset will be automatically resolved.

CCing Ryan.

> So I'm thinking of not adding new flag and just focusing on garbage data issue.

That sounds minimal and reasonable! Flags/values that logically belong 
to the head (although are stored in the tail) should probably be exposed 
along with the head. Flags that apply to the actual tail pages should 
stay with the tail pages.

> 
> Thank you very much for sharing ideas.

Thank you!

-- 
Cheers,

David / dhildenb



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v1 0/5] mm, kpageflags: support folio and fix output for compound pages
  2023-10-12 15:30     ` David Hildenbrand
  2023-10-13  0:54       ` Naoya Horiguchi
@ 2023-10-13 15:03       ` Matthew Wilcox
  2023-10-16 10:13         ` David Hildenbrand
  1 sibling, 1 reply; 15+ messages in thread
From: Matthew Wilcox @ 2023-10-13 15:03 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Naoya Horiguchi, linux-mm, Andrew Morton, Kirill A. Shutemov,
	Mike Kravetz, Miaohe Lin, Vlastimil Babka, Muchun Song,
	Naoya Horiguchi, linux-kernel

On Thu, Oct 12, 2023 at 05:30:34PM +0200, David Hildenbrand wrote:
> On 12.10.23 17:02, Naoya Horiguchi wrote:
> > On Thu, Oct 12, 2023 at 10:33:04AM +0200, David Hildenbrand wrote:
> > > On 10.10.23 16:27, Naoya Horiguchi wrote:
> > > > Hi everyone,
> > > > 
> > > > This patchset addresses 2 issues in /proc/kpageflags.
> > > > 
> > > >     1. We can't easily tell folio from thp, because currently both pages are
> > > >        judged as thp, and
> > > >     2. we see some garbage data in records of compound tail pages because
> > > >        we use tail pages to store some internal data.
> > > > 
> > > > These issues require userspace programs to do additional work to understand
> > > > the page status, which makes situation more complicated.
> > > > 
> > > > This patchset tries to solve these by defining KPF_FOLIO for issue 1., and
> > > > by hiding part of page flag info on tail pages of compound pages for issue 2.
> > > > 
> > > > I think that technically some compound pages like thp/hugetlb/slab could be
> > > > considered as folio, but in this version KPF_FOLIO is set only on folios
> > > 
> > > At least thp+hugetlb are most certainly folios. Regarding slab, I suspect we
> > > no longer call them folios (cannot be mapped to user space). But Im not sure
> > > about the type hierarchy.
> > 
> > I'm not sure about the exact definition of "folio", and I think it's better
> > to make KPF_FOLIO set based on the definition.
> 
> Me neither. But in any case a THP *is* a folio. So you'd have to set that
> flag in any case.
> 
> And any order-0 page (i.e., anon, pagecache) is also a folio. What you seem
> to imply with folio is "large folio". So KPF_FOLIO is really wrong as far as
> I can tell.

Our type hierarchy is degenerate ... in both the neutral and negative
sense of the word.  A folio is simply not-a-tail-page.  So, as you said,
all head pages and all order-0 pages are folios.
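
As a rough illustration of the "not-a-tail-page" rule from a userspace point
of view, the sketch below classifies a single 64-bit /proc/kpageflags record.
The bit numbers are the KPF_COMPOUND_HEAD and KPF_COMPOUND_TAIL values from
include/uapi/linux/kernel-page-flags.h; the helper names kpf_is_tail(),
kpf_is_folio() and kpf_is_large_folio() are made up for this example and are
not kernel or page-types.c functions. It also glosses over pages that are
free, reserved, or have no struct page, which would need separate handling.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit numbers as defined in include/uapi/linux/kernel-page-flags.h */
#define KPF_COMPOUND_HEAD 15
#define KPF_COMPOUND_TAIL 16

/* Hypothetical helper: the record describes a tail page of a compound page. */
static bool kpf_is_tail(uint64_t flags)
{
	return (flags & (UINT64_C(1) << KPF_COMPOUND_TAIL)) != 0;
}

/* Hypothetical helper: "a folio is simply not-a-tail-page". */
static bool kpf_is_folio(uint64_t flags)
{
	return !kpf_is_tail(flags);
}

/* Hypothetical helper: a folio larger than order-0, i.e. a compound head. */
static bool kpf_is_large_folio(uint64_t flags)
{
	return (flags & (UINT64_C(1) << KPF_COMPOUND_HEAD)) != 0;
}
```

Under these assumptions, a record with only bit 15 set is treated as a large
folio, a record with neither bit set as an order-0 folio, and a record with
bit 16 set as not a folio at all.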

But we're still struggling against the legacy of our "struct page is
everything" mistake, and trying to fix that too.  The general term I've
chosen for this is "memdesc", but we aren't very far down the route of
disentangling the various types from either page or folio.  I'd imagined
that we'd convert everything to folio, then get into splitting them out,
but at least for ptdesc and slab we've gone for the direct conversion
approach.

At some point we probably want to disentangle anon folios from file
folios, but that's a fair ways down the list, after turning folios into a
separate allocation from struct page.  At least on my list ... if someone
wants to do that as a matter of urgency, I'm sure they can be accommodated.
It's not an easy task, for sure.  Our needs are better expressed as
(in Java terms) Interfaces rather than subclasses.  Or Traits/Generics
if you've started learning Rust.

We definitely have the concept of "mappable to userspace" which applies
to anon, file, netmem, some device driver allocations, some vmalloc
allocations, but not slab, page tables, or free memory.  Those memdescs
need refcount, mapcount, dirty flag, lock flag, maybe mapping?

Then we have "managed by the LRU" which applies to anon & file only.
Those memdescs need refcount, lru, and a pile of flags.

There's definitely scope for reordering and shrinking the various
memdescs.  Once they're fully separated from struct page.  What we _call_
them is a separate struggle.  Try to imagine how shrink_folio_list()
works if filemem & anonmem have different types ...

> > > It does sound inconsistent. What exactly do you want to tell user space with
> > > the new flag?
> > 
> > The current most problematic behavior is to report folio as thp (order-2
> > pagecache page is definitely a folio but not a thp), and this is what the
> > new flag is intended to tell.
> 
> We are currently considering calling these sub-PMD sized THPs "small-sized
> THP". [1] Arguably, we're starting with the anon part where we won't get
> around exposing them to the user in sysfs.
> 
> So I wouldn't immediately say that these things are not THPs. They are not
> PMD-sized THP. A slab/hugetlb is certainly not a thp but a folio. Whereby
> slabs can also be order-0 folios, but hugetlb can't.

I think this is a mistake.  Users expect THPs to be PMD sized.  We already
have the term "large folio" in use for file-backed memory; why do we
need to invent a new term for anon large folios?

> Looking at other interfaces, we do expose:
> 
> include/uapi/linux/kernel-page-flags.h:#define KPF_COMPOUND_HEAD        15
> include/uapi/linux/kernel-page-flags.h:#define KPF_COMPOUND_TAIL        16
> 
> So maybe we should just continue talking about compound pages or do we have
> to use both terms here in this interface?

I don't know how easy it's going to be to distinguish between a head
and tail page in the Glorious Future once pages and folios are separated.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v1 0/5] mm, kpageflags: support folio and fix output for compound pages
  2023-10-13 15:03       ` Matthew Wilcox
@ 2023-10-16 10:13         ` David Hildenbrand
  2023-10-16 11:36           ` Ryan Roberts
  0 siblings, 1 reply; 15+ messages in thread
From: David Hildenbrand @ 2023-10-16 10:13 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Naoya Horiguchi, linux-mm, Andrew Morton, Kirill A. Shutemov,
	Mike Kravetz, Miaohe Lin, Vlastimil Babka, Muchun Song,
	Naoya Horiguchi, linux-kernel, Ryan Roberts, Hugh Dickins

>>>> It does sound inconsistent. What exactly do you want to tell user space with
>>>> the new flag?
>>>
>>> The current most problematic behavior is to report folio as thp (order-2
>>> pagecache page is definitely a folio but not a thp), and this is what the
>>> new flag is intended to tell.
>>
>> We are currently considering calling these sub-PMD sized THPs "small-sized
>> THP". [1] Arguably, we're starting with the anon part where we won't get
>> around exposing them to the user in sysfs.
>>
>> So I wouldn't immediately say that these things are not THPs. They are not
>> PMD-sized THP. A slab/hugetlb is certainly not a thp but a folio. Whereby
>> slabs can also be order-0 folios, but hugetlb can't.
> 
> I think this is a mistake.  Users expect THPs to be PMD sized.  We already
> have the term "large folio" in use for file-backed memory; why do we
> need to invent a new term for anon large folios?

I changed my opinion two times, but I stabilized at "these are just huge 
pages of different size" when it comes to user-visible features.

Handling/calling them folios internally -- especially to abstract the 
page vs. compound page and how we manage/handle the metadata -- is a 
reasonable thing to do, because that's what we decided to pass around.


For future reference, here is a writeup about my findings and the reason 
for my opinion:


(1) OS-independent concept

Ignoring how the OS manages metadata (e.g., "struct page", "struct
folio", compound head/tail, memdesc, ...), the common term to describe
"the smallest fixed-length contiguous block of physical memory into
which memory pages are mapped by the operating system" [1] is a page
frame -- people usually simplify by dropping the "frame" part, so do I.

Larger pages (which we call "huge pages", FreeBSD "superpages", Windows 
"large pages") can come in different sizes and were traditionally based 
on architecture support, whereby architectures can support multiple ones 
[1]; I think what we see is that the OS might use intermediate sizes to 
manage memory more efficiently, abstracting/evolving that concept from 
the actual hardware page table mapping granularity.

But the foundation is that we are dealing with "blocks of physical 
memory" in a unit that is larger than the smallest page sizes. Larger pages.

[the comment about SGI IRIX on [1] is an interesting read; so are 
"scattered superpages"[3]]

Users learned the difference between a "page" and a "huge page". I'm 
confident that they can learn the difference between a "traditional huge 
page" and a "small-sized huge page", just like they did with hugetlb 
(below).

We just have to be careful with memory statistics and to default to the 
traditional huge pages for now. Slowly, the term "THP" will become more 
generic. Apart from that, I fail to see the big source of confusion.

Note: FreeBSD currently similarly calls these things on arm64 
"medium-sized superpages", and did not invent new terms for that so far 
[2].


(2) hugetlb

Traditional huge pages started out to be PMD-sized. Before 2008, we only 
supported a single huge page size. Ever since, we added support for 
sizes larger (gigantic) and smaller than that (cont-pte / cont-pmd).

So (a) users did not panic because we also supported huge pages that 
were not PMD-sized; (b) we managed to integrate it into the existing 
environment, defaulting to the old PMD-sized huge pages towards the user 
but still providing configuration knobs and (c) it is natural today to 
have multiple huge page sizes supported in hugetlb.

Nowadays, when somebody says that they are using hugetlb huge pages, the 
first question frequently is "which huge page size?". The same will 
happen with transparent huge pages I believe.


(3) THP preparation for multiple sizes

With
	/sys/kernel/mm/transparent_hugepage/hpage_pmd_size
added in 2016, we already provided a way for users to query the PMD size 
for THP, implying that there might be multiple sizes in the future.

Therefore, in commit 49920d28781d, Hugh already envisioned "some
transparent support for pud and pgd pages" and ended up calling it
"_pmd_size". Turns out, we want smaller THPs first, not larger ones.


(4) Metadata management

How the OS manages metadata for its memory -- and what it calls the
data structures involved -- is IMHO an implementation detail (an
important one regarding performance, robustness and metadata overhead as
we learned, though ;) ).

We were able to introduce folios without user-visible changes. We should 
be able to implement memdesc (or memory type hierarchies) without 
user-visible changes -- except for some interfaces that provide access
to bare "struct page" information (which classify as debugging interfaces IMHO).


Last but not least, we ended up consistently calling these "larger than 
a page" things that we map into user space "(transparent) huge page" 
towards the user in toggles, stats and documentation. Fortunately we 
didn't use the term "compound page" back then; it would have been a mistake.


Regarding the pagecache, we managed to not expose any toggles towards 
the user, because memory waste can be better controlled. So the term 
"folio" does not pop up as a toggle in /sys and /proc.

	t14s: ~  $ find /sys -name "*folio*" 2> /dev/null
	t14s: ~  $ find /proc -name "*folio*" 2> /dev/null

Once we want to remove the (sub)page mapcount, we'll likely have to 
remove _nr_pages_mapped. To make some workloads that are sensitive to 
memory consumption [4] play along when not accounting only the actually 
mapped parts, we might have to introduce other ways to control that, 
when "/sys/kernel/debug/fault_around_bytes" no longer does the trick. 
I'm hoping we can still find ways to avoid exposing any toggles for 
that; we'll see.


[1] https://en.wikipedia.org/wiki/Page_(computer_memory)
[2] https://www.freebsd.org/status/report-2022-04-2022-06/superpages/
[3] https://ieeexplore.ieee.org/document/6657040/similar#similar
[4] https://www.suse.com/support/kb/doc/?id=000019017


> 
>> Looking at other interfaces, we do expose:
>>
>> include/uapi/linux/kernel-page-flags.h:#define KPF_COMPOUND_HEAD        15
>> include/uapi/linux/kernel-page-flags.h:#define KPF_COMPOUND_TAIL        16
>>
>> So maybe we should just continue talking about compound pages or do we have
>> to use both terms here in this interface?
> 
> I don;t know how easy it's going to be to distinguish between a head
> and tail page in the Glorious Future once pages and folios are separated.

Probably a page-based interface would be the wrong interface for that; 
fortunately, this interface has a "debugging" smell to it, so we might 
be able to replace it.

-- 
Cheers,

David / dhildenb



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v1 0/5] mm, kpageflags: support folio and fix output for compound pages
  2023-10-16 10:13         ` David Hildenbrand
@ 2023-10-16 11:36           ` Ryan Roberts
  2023-10-18  5:25             ` Naoya Horiguchi
  0 siblings, 1 reply; 15+ messages in thread
From: Ryan Roberts @ 2023-10-16 11:36 UTC (permalink / raw)
  To: David Hildenbrand, Matthew Wilcox
  Cc: Naoya Horiguchi, linux-mm, Andrew Morton, Kirill A. Shutemov,
	Mike Kravetz, Miaohe Lin, Vlastimil Babka, Muchun Song,
	Naoya Horiguchi, linux-kernel, Hugh Dickins

On 16/10/2023 11:13, David Hildenbrand wrote:
>>>>> It does sound inconsistent. What exactly do you want to tell user space with
>>>>> the new flag?
>>>>
>>>> The current most problematic behavior is to report folio as thp (order-2
>>>> pagecache page is definitely a folio but not a thp), and this is what the
>>>> new flag is intended to tell.
>>>
>>> We are currently considering calling these sub-PMD sized THPs "small-sized
>>> THP". [1] Arguably, we're starting with the anon part where we won't get
>>> around exposing them to the user in sysfs.
>>>
>>> So I wouldn't immediately say that these things are not THPs. They are not
>>> PMD-sized THP. A slab/hugetlb is certainly not a thp but a folio. Whereby
>>> slabs can also be order-0 folios, but hugetlb can't.
>>
>> I think this is a mistake.  Users expect THPs to be PMD sized.  We already
>> have the term "large folio" in use for file-backed memory; why do we
>> need to invent a new term for anon large folios?
> 
> I changed my opinion two times, but I stabilized at "these are just huge pages
> of different size" when it comes to user-visible features.
> 
> Handling/calling them folios internally -- especially to abstract the page vs.
> compound page and how we manage/handle the metadata -- is a reasonable thing to
> do, because that's what we decided to pass around.
> 
> 
> For future reference, here is a writeup about my findings and the reason for my
> opinion:
> 
> 
> (1) OS-independent concept
> 
> Ignoring how the OS manages metadata (e.g., "struct page", "struct folio",
> compound head/tail, memdesc, ...), the common term to describe a "the smallest
> fixed-length contiguous block of physical memory into which memory pages are
> mapped by the operating system.["[1] is a page frame -- people usually simplify
> by dropping the "frame" part, so do I.
> 
> Larger pages (which we call "huge pages", FreeBSD "superpages", Windows "large
> pages") can come in different sizes and were traditionally based on architecture
> support, whereby architectures can support multiple ones [1]; I think what we
> see is that the OS might use intermediate sizes to manage memory more
> efficiently, abstracting/evolving that concept from the actual hardware page
> table mapping granularity.
> 
> But the foundation is that we are dealing with "blocks of physical memory" in a
> unit that is larger than the smallest page sizes. Larger pages.
> 
> [the comment about SGI IRIX on [1] is an interesting read; so are "scattered
> superpages"[3]]
> 
> Users learned the difference between a "page" and a "huge page". I'm confident
> that they can learn the difference between a "traditional huge page" and a
> "small-sized huge page", just like they did with hugetlb (below).
> 
> We just have to be careful with memory statistics and to default to the
> traditional huge pages for now. Slowly, the term "THP" will become more generic.
> Apart from that, I fail to see the big source of confusion.
> 
> Note: FreeBSD currently similarly calls these things on arm64 "medium-sized
> superpages", and did not invent new terms for that so far [2].
> 
> 
> (2) hugetlb
> 
> Traditional huge pages started out to be PMD-sized. Before 2008, we only
> supported a single huge page size. Ever since, we added support for sizes larger
> (gigantic) and smaller than that (cont-pte / cont-pmd).
> 
> So (a) users did not panic because we also supported huge pages that were not
> PMD-sized; (b) we managed to integrate it into the existing environment,
> defaulting to the old PMD-sized huge pages towards the user but still providing
> configuration knobs and (c) it is natural today to have multiple huge page sizes
> supported in hugetlb.
> 
> Nowadays, when somebody says that they are using hugetlb huge pages, the first
> question frequently is "which huge page size?". The same will happen with
> transparent huge pages I believe.
> 
> 
> (3) THP preparation for multiple sizes
> 
> With
>     /sys/kernel/mm/transparent_hugepage/hpage_pmd_size
> added in 2016, we already provided a way for users to query the PMD size for
> THP, implying that there might be multiple sizes in the future.
> 
> Therefore, in commit 49920d28781d, Hugh already envisioned " some transparent
> support for pud and pgd pages" and ended up calling it "_pmd_size". Turns out,
> we want smaller THPs first, not larger ones.
> 
> 
> (4) Metadata management
> 
> How the OS manages metadata for its memory -- and how it calls the involved
> datastructures -- is IMHO an implementation detail (an important one regarding
> performance, robustness and metadata overhead as we learned, though ;) ).
> 
> We were able to introduce folios without user-visible changes. We should be able
> to implement memdesc (or memory type hierarchies) without user-visible changes
> -- except for some interfaces that provide access to bare "struct page"
> information (classifies as debugging interfaces IMHO).
> 
> 
> Last but not least, we ended up consistently calling these "larger than a page"
> things that we map into user space "(transparent) huge page" towards the user in
> toggles, stats and documentation. Fortunately we didn't use the term "compound
> page" back then; it would have been a mistake.
> 
> 
> Regarding the pagecache, we managed to not expose any toggles towards the user,
> because memory waste can be better controlled. So the term "folio" does not pop
> up as a toggle in /sys and /proc.
> 
>     t14s: ~  $ find /sys -name "*folio*" 2> /dev/null
>     t14s: ~  $ find /proc -name "*folio*" 2> /dev/null
> 
> Once we want to remove the (sub)page mapcount, we'll likely have to remove
> _nr_pages_mapped. To make some workloads that are sensitive to memory
> consumption [4] play along when not accounting only the actually mapped parts,
> we might have to introduce other ways to control that, when
> "/sys/kernel/debug/fault_around_bytes" no longer does the trick. I'm hoping we
> can still find ways to avoid exposing any toggles for that; we'll see.
> 
> 
> [1] https://en.wikipedia.org/wiki/Page_(computer_memory)
> [2] https://www.freebsd.org/status/report-2022-04-2022-06/superpages/
> [3] https://ieeexplore.ieee.org/document/6657040/similar#similar
> [4] https://www.suse.com/support/kb/doc/?id=000019017

+1 for David's reasoning.

FWIW, the way I see it, everything is a folio; a folio is an implementation
detail that neatly abstracts a physically contiguous, power-of-2 number of pages
(including the single page case). So I'm not sure how useful it is to add the
proposed KPF_FOLIO flag. The only real thing I can imagine user space using it
for would be to tell if some extent of virtual memory is physically contiguous,
and you can already do that from the PFN.
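
For illustration, a minimal sketch of that PFN-based check. It assumes the
/proc/<pid>/pagemap layout described in Documentation/admin-guide/mm/pagemap.rst
(one 64-bit entry per virtual page, bit 63 = present, bits 0-54 = PFN) and a
4 KiB base page size. The helper names are hypothetical, and the PFN field
reads as zero without CAP_SYS_ADMIN, so the file needs to be read as root.

```python
import struct

PAGEMAP_ENTRY_BYTES = 8          # one 64-bit entry per virtual page
PM_PRESENT = 1 << 63             # bit 63: page is present in RAM
PM_PFN_MASK = (1 << 55) - 1      # bits 0-54: PFN (when present)


def decode_pagemap_entry(raw):
    """Return the PFN stored in one 8-byte pagemap entry, or None if not present."""
    (entry,) = struct.unpack("=Q", raw)
    if not entry & PM_PRESENT:
        return None
    return entry & PM_PFN_MASK


def is_physically_contiguous(pfns):
    """True if every PFN is known and each one follows the previous by exactly 1."""
    if not pfns or any(p is None for p in pfns):
        return False
    return all(b == a + 1 for a, b in zip(pfns, pfns[1:]))


def read_pfns(pid, vaddr, npages, page_size=4096):
    """Read the PFNs backing npages virtual pages starting at vaddr (needs root)."""
    with open(f"/proc/{pid}/pagemap", "rb") as f:
        f.seek((vaddr // page_size) * PAGEMAP_ENTRY_BYTES)
        data = f.read(npages * PAGEMAP_ENTRY_BYTES)
    return [decode_pagemap_entry(data[i:i + PAGEMAP_ENTRY_BYTES])
            for i in range(0, len(data), PAGEMAP_ENTRY_BYTES)]
```

Something like is_physically_contiguous(read_pfns("self", addr, 512)) would
then answer the question for one PMD-sized extent on a 4 KiB system. Note that
this only shows the frames are adjacent, not that they belong to one folio.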

Bigger picture interface-wise, I think it is simpler and more understandable to
the user to extend an existing concept (THP) rather than invent a new one
(folios) that substantially overlaps with the existing (PMD-sized) THP concept.

That said, if you have plans in the folio roadmap that I'm not aware of, then
perhaps those would change my mind. There is a thread here [1] where we are
discussing the best way to expose "small-sized THP" (anon large folios) to user
space - Matthew, if you have strong feelings, please do reply!

[1]
https://lore.kernel.org/linux-mm/6d89fdc9-ef55-d44e-bf12-fafff318aef8@redhat.com/

Thanks,
Ryan


> 
> 
>>
>>> Looking at other interfaces, we do expose:
>>>
>>> include/uapi/linux/kernel-page-flags.h:#define KPF_COMPOUND_HEAD        15
>>> include/uapi/linux/kernel-page-flags.h:#define KPF_COMPOUND_TAIL        16
>>>
>>> So maybe we should just continue talking about compound pages or do we have
>>> to use both terms here in this interface?
>>
>> I don't know how easy it's going to be to distinguish between a head
>> and tail page in the Glorious Future once pages and folios are separated.
> 
> Probably a page-based interface would be the wrong interface for that;
> fortunately, this interface has a "debugging" smell to it, so we might be able
> to replace it.
> 




* Re: [PATCH v1 0/5] mm, kpageflags: support folio and fix output for compound pages
  2023-10-16 11:36           ` Ryan Roberts
@ 2023-10-18  5:25             ` Naoya Horiguchi
  0 siblings, 0 replies; 15+ messages in thread
From: Naoya Horiguchi @ 2023-10-18  5:25 UTC (permalink / raw)
  To: Ryan Roberts
  Cc: David Hildenbrand, Matthew Wilcox, linux-mm, Andrew Morton,
	Kirill A. Shutemov, Mike Kravetz, Miaohe Lin, Vlastimil Babka,
	Muchun Song, Naoya Horiguchi, linux-kernel, Hugh Dickins

On Mon, Oct 16, 2023 at 12:36:22PM +0100, Ryan Roberts wrote:
> On 16/10/2023 11:13, David Hildenbrand wrote:
> >>>>> It does sound inconsistent. What exactly do you want to tell user space with
> >>>>> the new flag?
> >>>>
> >>>> The current most problematic behavior is to report folio as thp (order-2
> >>>> pagecache page is definitely a folio but not a thp), and this is what the
> >>>> new flag is intended to tell.
> >>>
> >>> We are currently considering calling these sub-PMD sized THPs "small-sized
> >>> THP". [1] Arguably, we're starting with the anon part where we won't get
> >>> around exposing them to the user in sysfs.
> >>>
> >>> So I wouldn't immediately say that these things are not THPs. They are not
> >>> PMD-sized THP. A slab/hugetlb is certainly not a thp but a folio. Whereby
> >>> slabs can also be order-0 folios, but hugetlb can't.
> >>
> >> I think this is a mistake.  Users expect THPs to be PMD sized.  We already
> >> have the term "large folio" in use for file-backed memory; why do we
> >> need to invent a new term for anon large folios?
> > 
> > I changed my opinion two times, but I stabilized at "these are just huge pages
> > of different size" when it comes to user-visible features.
> > 
> > Handling/calling them folios internally -- especially to abstract the page vs.
> > compound page and how we manage/handle the metadata -- is a reasonable thing to
> > do, because that's what we decided to pass around.
> > 
> > 
> > For future reference, here is a writeup about my findings and the reason for my
> > opinion:
> > 
> > 
> > (1) OS-independent concept
> > 
> > Ignoring how the OS manages metadata (e.g., "struct page", "struct folio",
> > compound head/tail, memdesc, ...), the common term to describe "the smallest
> > fixed-length contiguous block of physical memory into which memory pages are
> > mapped by the operating system"[1] is a page frame -- people usually simplify
> > by dropping the "frame" part, so do I.
> > 
> > Larger pages (which we call "huge pages", FreeBSD "superpages", Windows "large
> > pages") can come in different sizes and were traditionally based on architecture
> > support, whereby architectures can support multiple ones [1]; I think what we
> > see is that the OS might use intermediate sizes to manage memory more
> > efficiently, abstracting/evolving that concept from the actual hardware page
> > table mapping granularity.
> > 
> > But the foundation is that we are dealing with "blocks of physical memory" in a
> > unit that is larger than the smallest page sizes. Larger pages.
> > 
> > [the comment about SGI IRIX on [1] is an interesting read; so are "scattered
> > superpages"[3]]
> > 
> > Users learned the difference between a "page" and a "huge page". I'm confident
> > that they can learn the difference between a "traditional huge page" and a
> > "small-sized huge page", just like they did with hugetlb (below).
> > 
> > We just have to be careful with memory statistics and to default to the
> > traditional huge pages for now. Slowly, the term "THP" will become more generic.
> > Apart from that, I fail to see the big source of confusion.
> > 
> > Note: FreeBSD currently similarly calls these things on arm64 "medium-sized
> > superpages", and did not invent new terms for that so far [2].
> > 
> > 
> > (2) hugetlb
> > 
> > Traditional huge pages started out to be PMD-sized. Before 2008, we only
> > supported a single huge page size. Ever since, we added support for sizes larger
> > (gigantic) and smaller than that (cont-pte / cont-pmd).
> > 
> > So (a) users did not panic because we also supported huge pages that were not
> > PMD-sized; (b) we managed to integrate it into the existing environment,
> > defaulting to the old PMD-sized huge pages towards the user but still providing
> > configuration knobs and (c) it is natural today to have multiple huge page sizes
> > supported in hugetlb.
> > 
> > Nowadays, when somebody says that they are using hugetlb huge pages, the first
> > question frequently is "which huge page size?". The same will happen with
> > transparent huge pages I believe.
> > 
> > 
> > (3) THP preparation for multiple sizes
> > 
> > With
> >     /sys/kernel/mm/transparent_hugepage/hpage_pmd_size
> > added in 2016, we already provided a way for users to query the PMD size for
> > THP, implying that there might be multiple sizes in the future.
> > 
> > Therefore, in commit 49920d28781d, Hugh already envisioned "some transparent
> > support for pud and pgd pages" and ended up calling it "_pmd_size". Turns out,
> > we want smaller THPs first, not larger ones.
> > 
> > 
> > (4) Metadata management
> > 
> > How the OS manages metadata for its memory -- and how it calls the involved
> > datastructures -- is IMHO an implementation detail (an important one regarding
> > performance, robustness and metadata overhead as we learned, though ;) ).
> > 
> > We were able to introduce folios without user-visible changes. We should be able
> > to implement memdesc (or memory type hierarchies) without user-visible changes
> > -- except for some interfaces that provide access to bare "struct page"
> > information (classifies as debugging interfaces IMHO).
> > 
> > 
> > Last but not least, we ended up consistently calling these "larger than a page"
> > things that we map into user space "(transparent) huge page" towards the user in
> > toggles, stats and documentation. Fortunately we didn't use the term "compound
> > page" back then; it would have been a mistake.
> > 
> > 
> > Regarding the pagecache, we managed to not expose any toggles towards the user,
> > because memory waste can be better controlled. So the term "folio" does not pop
> > up as a toggle in /sys and /proc.
> > 
> >     t14s: ~  $ find /sys -name "*folio*" 2> /dev/null
> >     t14s: ~  $ find /proc -name "*folio*" 2> /dev/null
> > 
> > Once we want to remove the (sub)page mapcount, we'll likely have to remove
> > _nr_pages_mapped. To make some workloads that are sensitive to memory
> > consumption [4] play along when not accounting only the actually mapped parts,
> > we might have to introduce other ways to control that, when
> > "/sys/kernel/debug/fault_around_bytes" no longer does the trick. I'm hoping we
> > can still find ways to avoid exposing any toggles for that; we'll see.
> > 
> > 
> > [1] https://en.wikipedia.org/wiki/Page_(computer_memory)
> > [2] https://www.freebsd.org/status/report-2022-04-2022-06/superpages/
> > [3] https://ieeexplore.ieee.org/document/6657040/similar#similar
> > [4] https://www.suse.com/support/kb/doc/?id=000019017
> 
> +1 for David's reasoning.
> 
> FWIW, the way I see it, everything is a folio; a folio is an implementation
> detail that neatly abstracts a physically contiguous, power-of-2 number of pages
> (including the single page case). So I'm not sure how useful it is to add the
> proposed KPF_FOLIO flag. The only real thing I can imagine user space using it
> for would be to tell if some extent of virtual memory is physically contiguous,
> and you can already do that from the PFN.
> 
> Bigger picture interface-wise, I think it is simpler and more understandable to
> the user to extend an existing concept (THP) rather than invent a new one
> (folios) that substantially overlaps with the existing (PMD-sized) THP concept.
> 
> That said, if you have plans in the folio roadmap that I'm not aware of, then
> perhaps those would change my mind. There is a thread here [1] where we are
> discussing the best way to expose "small-sized THP" (anon large folios) to user
> space - Matthew, if you have strong feelings, please do reply!
> 
> [1]
> https://lore.kernel.org/linux-mm/6d89fdc9-ef55-d44e-bf12-fafff318aef8@redhat.com/
> 
> Thanks,
> Ryan
> 
> 
> > 
> > 
> >>
> >>> Looking at other interfaces, we do expose:
> >>>
> >>> include/uapi/linux/kernel-page-flags.h:#define KPF_COMPOUND_HEAD        15
> >>> include/uapi/linux/kernel-page-flags.h:#define KPF_COMPOUND_TAIL        16
> >>>
> >>> So maybe we should just continue talking about compound pages or do we have
> >>> to use both terms here in this interface?
> >>
> >> I don't know how easy it's going to be to distinguish between a head
> >> and tail page in the Glorious Future once pages and folios are separated.
> > 
> > Probably a page-based interface would be the wrong interface for that;
> > fortunately, this interface has a "debugging" smell to it, so we might be able
> > to replace it.

This interface exposes per-pfn (not per-page) data records, with the pfn
selected by file offset. It does not care about the distinction between head
and tail pages, so I don't think we can avoid referring to tail pages even
after the page-to-folio conversion is complete.
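
To make the per-pfn record layout concrete, here is a minimal sketch of a
reader. It assumes one 64-bit flags word per pfn at file offset pfn * 8, and
uses the bit numbers from include/uapi/linux/kernel-page-flags.h
(KPF_COMPOUND_HEAD = 15, KPF_COMPOUND_TAIL = 16). The function names are
hypothetical, and reading /proc/kpageflags requires root.

```python
import struct

KPF_ENTRY_BYTES = 8       # one 64-bit flags word per pfn
KPF_COMPOUND_HEAD = 15    # bit numbers from kernel-page-flags.h
KPF_COMPOUND_TAIL = 16


def kpageflags_offset(pfn):
    """File offset of the record for a given pfn."""
    return pfn * KPF_ENTRY_BYTES


def compound_role(flags):
    """Classify one flags word as 'head', 'tail' or 'none'."""
    if flags & (1 << KPF_COMPOUND_HEAD):
        return "head"
    if flags & (1 << KPF_COMPOUND_TAIL):
        return "tail"
    return "none"


def read_kpageflags(pfn, count=1, path="/proc/kpageflags"):
    """Return the flags words for count consecutive pfns starting at pfn."""
    with open(path, "rb") as f:
        f.seek(kpageflags_offset(pfn))
        data = f.read(count * KPF_ENTRY_BYTES)
    nwords = len(data) // KPF_ENTRY_BYTES
    return list(struct.unpack(f"={nwords}Q", data[:nwords * KPF_ENTRY_BYTES]))
```

Every pfn gets its own record regardless of whether it is a head or a tail,
so a reader walking a compound page sees one "head" record followed by "tail"
records, as in the page-types output in the cover letter.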

But I agree that this interface is for debugging or testing.  To make that
clear, we might consider relocating it to a more suitable location within
debugfs, making it effectively invisible to non-debugging processes.
The same could apply to the other similar interfaces under /proc/kpage*,
so all these files could be handled together.

Thanks,
Naoya Horiguchi



end of thread, other threads:[~2023-10-18  5:25 UTC | newest]

Thread overview: 15+ messages
2023-10-10 14:27 [PATCH v1 0/5] mm, kpageflags: support folio and fix output for compound pages Naoya Horiguchi
2023-10-10 14:27 ` [PATCH v1 1/5] include/uapi/linux/kernel-page-flags.h: define KPF_FOLIO Naoya Horiguchi
2023-10-10 14:27 ` [PATCH v1 2/5] mm: kpageflags: distinguish thp and folio Naoya Horiguchi
2023-10-10 14:27 ` [PATCH v1 3/5] mm, kpageflags: separate code path for hugetlb pages Naoya Horiguchi
2023-10-10 14:28 ` [PATCH v1 4/5] mm, kpageflags: fix invalid output for PageSlab Naoya Horiguchi
2023-10-10 14:28 ` [PATCH v1 5/5] tools/mm/page-types.c: hide compound pages in non-raw mode Naoya Horiguchi
2023-10-12  8:33 ` [PATCH v1 0/5] mm, kpageflags: support folio and fix output for compound pages David Hildenbrand
2023-10-12 15:02   ` Naoya Horiguchi
2023-10-12 15:30     ` David Hildenbrand
2023-10-13  0:54       ` Naoya Horiguchi
2023-10-13  7:46         ` David Hildenbrand
2023-10-13 15:03       ` Matthew Wilcox
2023-10-16 10:13         ` David Hildenbrand
2023-10-16 11:36           ` Ryan Roberts
2023-10-18  5:25             ` Naoya Horiguchi
