linux-mm.kvack.org archive mirror
* [PATCH] mm: move PG_slab flag to page_type
@ 2022-09-19 12:57 Hyeonggon Yoo
  2022-09-19 12:59 ` [RFC PATCH] " Hyeonggon Yoo
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Hyeonggon Yoo @ 2022-09-19 12:57 UTC (permalink / raw)
  To: linux-mm, linux-kernel
  Cc: Hyeonggon Yoo, Andrew Morton, Christoph Lameter, Pekka Enberg,
	David Rientjes, Joonsoo Kim, Vlastimil Babka, Roman Gushchin,
	Naoya Horiguchi, Miaohe Lin, Matthew Wilcox (Oracle),
	Minchan Kim, Mel Gorman, Andrea Arcangeli, Dan Williams,
	Hugh Dickins, Muchun Song, David Hildenbrand, Andrey Konovalov,
	Marco Elver

For now, only SLAB uses the _mapcount field to store the number of active
objects in a slab; the other slab allocators do not use it. As 16 bits are
enough for that count, use the remaining 16 bits of _mapcount as page_type
even when SLAB is used. And then move the PG_slab flag to page_type!

Note that page_type is always placed in the upper 16 bits of _mapcount to
avoid mistaking a normal _mapcount for a page_type. As underflow (or
rather, overflow) is no longer a concern, use more of the lower bits,
reserving only bit zero.

Add more folio helpers to PAGE_TYPE_OPS() so as not to break the existing
slab implementations.

Remove the PG_slab check from PAGE_FLAGS_CHECK_AT_FREE. The buddy
allocator will still check that _mapcount is properly set at free.

Exclude PG_slab from hwpoison handling and show_page_flags() for now.

Note that with this patch, page_mapped() and folio_mapped() always return
false for slab pages.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Marco Elver <elver@google.com>
Signed-off-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
---

I think this gives us two benefits:
- it frees a bit in the flags field
- it makes it simpler to distinguish user-mapped pages from
  non-user-mapped pages

Plus I'm writing a bit more code, including:
	0) a few cleanups for code that checks
	   !PageSlab() && page_mapped() or does something similar
	   (see the sketch below this list)
	1) providing a human-readable string for page_type in dump_page
	2) adding show_page_types() for tracepoints
	3) fixing hwpoison ...etc.

Anyway, this is an early RFC; I would very much appreciate feedback!
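
As a concrete illustration of item 0) - a hypothetical sketch, not code
from this patch (handle_user_mapped_page() is a made-up helper):

	/*
	 * Before: slab pages reuse _mapcount, so they can look
	 * user-mapped and must be excluded explicitly:
	 */
	if (!PageSlab(page) && page_mapped(page))
		handle_user_mapped_page(page);

	/*
	 * After: with PG_slab in page_type, page_mapped() is always
	 * false for slab pages, so the check simplifies to:
	 */
	if (page_mapped(page))
		handle_user_mapped_page(page);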

 include/linux/mm_types.h       | 22 +++++++--
 include/linux/page-flags.h     | 83 ++++++++++++++++++++++++++--------
 include/trace/events/mmflags.h |  1 -
 mm/memory-failure.c            |  8 ----
 mm/slab.h                      | 11 ++++-
 5 files changed, 92 insertions(+), 33 deletions(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index cf97f3884fda..4b217c6fbe1f 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -193,12 +193,24 @@ struct page {
 		atomic_t _mapcount;
 
 		/*
-		 * If the page is neither PageSlab nor mappable to userspace,
-		 * the value stored here may help determine what this page
-		 * is used for.  See page-flags.h for a list of page types
-		 * which are currently stored here.
+		 * If the page is not mappable to userspace, the value
+		 * stored here may help determine what this page is used for.
+		 * See page-flags.h for a list of page types which are currently
+		 * stored here.
 		 */
-		unsigned int page_type;
+		struct {
+			/*
+			 * Always place page_type in
+			 * upper 16 bits of _mapcount
+			 */
+#ifdef CPU_BIG_ENDIAN
+			__u16 page_type;
+			__u16 active;
+#else
+			__u16 active;
+			__u16 page_type;
+#endif
+		};
 	};
 
 	/* Usage count. *DO NOT USE DIRECTLY*. See page_ref.h */
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 465ff35a8c00..b414d0996639 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -107,7 +107,6 @@ enum pageflags {
 	PG_workingset,
 	PG_waiters,		/* Page has waiters, check its waitqueue. Must be bit #7 and in the same byte as "PG_locked" */
 	PG_error,
-	PG_slab,
 	PG_owner_priv_1,	/* Owner use. If pagecache, fs may use*/
 	PG_arch_1,
 	PG_reserved,
@@ -484,7 +483,6 @@ PAGEFLAG(Active, active, PF_HEAD) __CLEARPAGEFLAG(Active, active, PF_HEAD)
 	TESTCLEARFLAG(Active, active, PF_HEAD)
 PAGEFLAG(Workingset, workingset, PF_HEAD)
 	TESTCLEARFLAG(Workingset, workingset, PF_HEAD)
-__PAGEFLAG(Slab, slab, PF_NO_TAIL)
 __PAGEFLAG(SlobFree, slob_free, PF_NO_TAIL)
 PAGEFLAG(Checked, checked, PF_NO_COMPOUND)	   /* Used by some filesystems */
 
@@ -926,44 +924,90 @@ static inline bool is_page_hwpoison(struct page *page)
 }
 
 /*
- * For pages that are never mapped to userspace (and aren't PageSlab),
- * page_type may be used.  Because it is initialised to -1, we invert the
- * sense of the bit, so __SetPageFoo *clears* the bit used for PageFoo, and
- * __ClearPageFoo *sets* the bit used for PageFoo.  We reserve a few high and
- * low bits so that an underflow or overflow of page_mapcount() won't be
- * mistaken for a page type value.
+ * For pages that are never mapped to userspace, page_type may be used.
+ * Because it is initialised to -1, we invert the sense of the bit,
+ * so __SetPageFoo *clears* the bit used for PageFoo, and __ClearPageFoo
+ * *sets* the bit used for PageFoo.  We reserve a few high and low bits
+ * so that an underflow or overflow of page_mapcount() won't be mistaken
+ * for a page type value.
  */
 
-#define PAGE_TYPE_BASE	0xf0000000
-/* Reserve		0x0000007f to catch underflows of page_mapcount */
-#define PAGE_MAPCOUNT_RESERVE	-128
-#define PG_buddy	0x00000080
-#define PG_offline	0x00000100
-#define PG_table	0x00000200
-#define PG_guard	0x00000400
+#define PAGE_TYPE_BASE	0xf000
+#define PAGE_MAPCOUNT_RESERVE	-1
+#define PG_buddy	0x0002
+#define PG_offline	0x0004
+#define PG_table	0x0008
+#define PG_guard	0x0010
+#define PG_slab		0x0020
 
 #define PageType(page, flag)						\
 	((page->page_type & (PAGE_TYPE_BASE | flag)) == PAGE_TYPE_BASE)
 
-static inline int page_has_type(struct page *page)
+#define PAGE_TYPE_MASK	((1UL << (BITS_PER_BYTE * sizeof(__u16))) - 1)
+
+static inline bool page_has_type(struct page *page)
 {
 	return (int)page->page_type < PAGE_MAPCOUNT_RESERVE;
 }
 
+static inline bool page_type_valid(__u16 page_type)
+{
+	return (page_type & PAGE_TYPE_BASE) == PAGE_TYPE_BASE;
+}
+
 #define PAGE_TYPE_OPS(uname, lname)					\
 static __always_inline int Page##uname(struct page *page)		\
 {									\
 	return PageType(page, PG_##lname);				\
 }									\
+static __always_inline int folio_test_##lname(struct folio *folio)	\
+{									\
+	struct page *page = &folio->page;				\
+									\
+	VM_BUG_ON_PAGE(PageTail(page), page);				\
+	return PageType(page, PG_##lname);				\
+}									\
+static __always_inline int __folio_test_##lname(struct folio *folio)	\
+{									\
+	struct page *page = &folio->page;				\
+									\
+	return PageType(page, PG_##lname);				\
+}									\
 static __always_inline void __SetPage##uname(struct page *page)		\
 {									\
 	VM_BUG_ON_PAGE(!PageType(page, 0), page);			\
 	page->page_type &= ~PG_##lname;					\
 }									\
+static __always_inline void folio_set_##lname(struct folio *folio)	\
+{									\
+	struct page *page = &folio->page;				\
+									\
+	VM_BUG_ON_PAGE(PageTail(page), page);				\
+	__SetPage##uname(page);						\
+}									\
+static __always_inline void __folio_set_##lname(struct folio *folio)	\
+{									\
+	struct page *page = &folio->page;				\
+									\
+	__SetPage##uname(page);						\
+}									\
 static __always_inline void __ClearPage##uname(struct page *page)	\
 {									\
 	VM_BUG_ON_PAGE(!Page##uname(page), page);			\
 	page->page_type |= PG_##lname;					\
+}									\
+static __always_inline void folio_clear_##lname(struct folio *folio)	\
+{									\
+	struct page *page = &folio->page;				\
+									\
+	VM_BUG_ON_PAGE(PageTail(page), page);				\
+	__ClearPage##uname(page);					\
+}									\
+static __always_inline void __folio_clear_##lname(struct folio *folio)	\
+{									\
+	struct page *page = &folio->page;				\
+									\
+	__ClearPage##uname(page);					\
 }
 
 /*
@@ -996,6 +1040,9 @@ PAGE_TYPE_OPS(Buddy, buddy)
  */
 PAGE_TYPE_OPS(Offline, offline)
 
+/* PageSlab() indicates that the page is used by slab subsystem. */
+PAGE_TYPE_OPS(Slab, slab)
+
 extern void page_offline_freeze(void);
 extern void page_offline_thaw(void);
 extern void page_offline_begin(void);
@@ -1057,8 +1104,8 @@ static __always_inline void __ClearPageAnonExclusive(struct page *page)
 	(1UL << PG_lru		| 1UL << PG_locked	|	\
 	 1UL << PG_private	| 1UL << PG_private_2	|	\
 	 1UL << PG_writeback	| 1UL << PG_reserved	|	\
-	 1UL << PG_slab		| 1UL << PG_active 	|	\
-	 1UL << PG_unevictable	| __PG_MLOCKED)
+	 1UL << PG_active	| 1UL << PG_unevictable	|	\
+	 __PG_MLOCKED)
 
 /*
  * Flags checked when a page is prepped for return by the page allocator.
diff --git a/include/trace/events/mmflags.h b/include/trace/events/mmflags.h
index e87cb2b80ed3..fa5aa9e983ec 100644
--- a/include/trace/events/mmflags.h
+++ b/include/trace/events/mmflags.h
@@ -113,7 +113,6 @@
 	{1UL << PG_lru,			"lru"		},		\
 	{1UL << PG_active,		"active"	},		\
 	{1UL << PG_workingset,		"workingset"	},		\
-	{1UL << PG_slab,		"slab"		},		\
 	{1UL << PG_owner_priv_1,	"owner_priv_1"	},		\
 	{1UL << PG_arch_1,		"arch_1"	},		\
 	{1UL << PG_reserved,		"reserved"	},		\
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 14439806b5ef..9a25d10d7391 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1123,7 +1123,6 @@ static int me_huge_page(struct page_state *ps, struct page *p)
 #define mlock		(1UL << PG_mlocked)
 #define lru		(1UL << PG_lru)
 #define head		(1UL << PG_head)
-#define slab		(1UL << PG_slab)
 #define reserved	(1UL << PG_reserved)
 
 static struct page_state error_states[] = {
@@ -1133,13 +1132,6 @@ static struct page_state error_states[] = {
 	 * PG_buddy pages only make a small fraction of all free pages.
 	 */
 
-	/*
-	 * Could in theory check if slab page is free or if we can drop
-	 * currently unused objects without touching them. But just
-	 * treat it as standard kernel for now.
-	 */
-	{ slab,		slab,		MF_MSG_SLAB,	me_kernel },
-
 	{ head,		head,		MF_MSG_HUGE,		me_huge_page },
 
 	{ sc|dirty,	sc|dirty,	MF_MSG_DIRTY_SWAPCACHE,	me_swapcache_dirty },
diff --git a/mm/slab.h b/mm/slab.h
index 985820b9069b..a5273e189265 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -20,7 +20,16 @@ struct slab {
 		};
 		struct rcu_head rcu_head;
 	};
-	unsigned int active;
+	struct {
+		/* always place page_type in upper 16 bits of _mapcount */
+#ifdef CPU_BIG_ENDIAN
+		__u16 page_type;
+		__u16 active;
+#else
+		__u16 active;
+		__u16 page_type;
+#endif
+	};
 
 #elif defined(CONFIG_SLUB)
 
-- 
2.32.0




* Re: [RFC PATCH] mm: move PG_slab flag to page_type
  2022-09-19 12:57 [PATCH] mm: move PG_slab flag to page_type Hyeonggon Yoo
@ 2022-09-19 12:59 ` Hyeonggon Yoo
  2022-09-19 13:16 ` [PATCH] " Hyeonggon Yoo
  2022-09-24 23:04 ` Matthew Wilcox
  2 siblings, 0 replies; 8+ messages in thread
From: Hyeonggon Yoo @ 2022-09-19 12:59 UTC (permalink / raw)
  To: linux-mm, linux-kernel
  Cc: Andrew Morton, Christoph Lameter, Pekka Enberg, David Rientjes,
	Joonsoo Kim, Vlastimil Babka, Roman Gushchin, Naoya Horiguchi,
	Miaohe Lin, Matthew Wilcox (Oracle),
	Minchan Kim, Mel Gorman, Andrea Arcangeli, Dan Williams,
	Hugh Dickins, Muchun Song, David Hildenbrand, Andrey Konovalov,
	Marco Elver

On Mon, Sep 19, 2022 at 09:57:08PM +0900, Hyeonggon Yoo wrote:
> For now, only SLAB uses the _mapcount field to store the number of active
> objects in a slab; the other slab allocators do not use it. As 16 bits are
> enough for that count, use the remaining 16 bits of _mapcount as page_type
> even when SLAB is used. And then move the PG_slab flag to page_type!
> 
> Note that page_type is always placed in the upper 16 bits of _mapcount to
> avoid mistaking a normal _mapcount for a page_type. As underflow (or
> rather, overflow) is no longer a concern, use more of the lower bits,
> reserving only bit zero.
> 
> Add more folio helpers to PAGE_TYPE_OPS() so as not to break the existing
> slab implementations.
> 
> Remove the PG_slab check from PAGE_FLAGS_CHECK_AT_FREE. The buddy
> allocator will still check that _mapcount is properly set at free.
> 
> Exclude PG_slab from hwpoison handling and show_page_flags() for now.
> 
> Note that with this patch, page_mapped() and folio_mapped() always return
> false for slab pages.
> 
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Christoph Lameter <cl@linux.com>
> Cc: Pekka Enberg <penberg@kernel.org>
> Cc: David Rientjes <rientjes@google.com>
> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Cc: Roman Gushchin <roman.gushchin@linux.dev>
> Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
> Cc: Miaohe Lin <linmiaohe@huawei.com>
> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
> Cc: Minchan Kim <minchan@kernel.org>
> Cc: Mel Gorman <mgorman@techsingularity.net>
> Cc: Andrea Arcangeli <aarcange@redhat.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Hugh Dickins <hughd@google.com>
> Cc: Muchun Song <songmuchun@bytedance.com>
> Cc: David Hildenbrand <david@redhat.com>
> Cc: Andrey Konovalov <andreyknvl@gmail.com>
> Cc: Marco Elver <elver@google.com>
> Signed-off-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
> ---
> 
> I think this gives us two benefits:
> - it frees a bit in the flags field
> - it makes it simpler to distinguish user-mapped pages from
>   non-user-mapped pages
> 
> Plus I'm writing a bit more code, including:
> 	0) a few cleanups for code that checks
> 	   !PageSlab() && page_mapped() or does something similar
> 	1) providing a human-readable string for page_type in dump_page
> 	2) adding show_page_types() for tracepoints
> 	3) fixing hwpoison ...etc.
> 
> Anyway, this is an early RFC; I would very much appreciate feedback!

Looks like I forgot to add RFC to the subject :)



* Re: [PATCH] mm: move PG_slab flag to page_type
  2022-09-19 12:57 [PATCH] mm: move PG_slab flag to page_type Hyeonggon Yoo
  2022-09-19 12:59 ` [RFC PATCH] " Hyeonggon Yoo
@ 2022-09-19 13:16 ` Hyeonggon Yoo
  2022-09-24 23:04 ` Matthew Wilcox
  2 siblings, 0 replies; 8+ messages in thread
From: Hyeonggon Yoo @ 2022-09-19 13:16 UTC (permalink / raw)
  To: linux-mm, linux-kernel
  Cc: Andrew Morton, Christoph Lameter, Pekka Enberg, David Rientjes,
	Joonsoo Kim, Vlastimil Babka, Roman Gushchin, Naoya Horiguchi,
	Miaohe Lin, Matthew Wilcox (Oracle),
	Minchan Kim, Mel Gorman, Andrea Arcangeli, Dan Williams,
	Hugh Dickins, Muchun Song, David Hildenbrand, Andrey Konovalov,
	Marco Elver

On Mon, Sep 19, 2022 at 09:57:08PM +0900, Hyeonggon Yoo wrote:
> For now, only SLAB uses the _mapcount field to store the number of active
> objects in a slab; the other slab allocators do not use it. As 16 bits are
> enough for that count, use the remaining 16 bits of _mapcount as page_type
> even when SLAB is used. And then move the PG_slab flag to page_type!
> 
> Note that page_type is always placed in the upper 16 bits of _mapcount to
> avoid mistaking a normal _mapcount for a page_type. As underflow (or
> rather, overflow) is no longer a concern, use more of the lower bits,
> reserving only bit zero.
> 
> Add more folio helpers to PAGE_TYPE_OPS() so as not to break the existing
> slab implementations.
> 
> Remove the PG_slab check from PAGE_FLAGS_CHECK_AT_FREE. The buddy
> allocator will still check that _mapcount is properly set at free.
> 
> Exclude PG_slab from hwpoison handling and show_page_flags() for now.
> 
> Note that with this patch, page_mapped() and folio_mapped() always return
> false for slab pages.
> 

[...]

Hi, I made a silly mistake:

> 
>  include/linux/mm_types.h       | 22 +++++++--
>  include/linux/page-flags.h     | 83 ++++++++++++++++++++++++++--------
>  include/trace/events/mmflags.h |  1 -
>  mm/memory-failure.c            |  8 ----
>  mm/slab.h                      | 11 ++++-
>  5 files changed, 92 insertions(+), 33 deletions(-)
> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> index cf97f3884fda..4b217c6fbe1f 100644
> --- a/include/linux/mm_types.h
> +++ b/include/linux/mm_types.h
> @@ -193,12 +193,24 @@ struct page {
>  		atomic_t _mapcount;
>  
>  		/*
> -		 * If the page is neither PageSlab nor mappable to userspace,
> -		 * the value stored here may help determine what this page
> -		 * is used for.  See page-flags.h for a list of page types
> -		 * which are currently stored here.
> +		 * If the page is not mappable to userspace, the value
> +		 * stored here may help determine what this page is used for.
> +		 * See page-flags.h for a list of page types which are currently
> +		 * stored here.
>  		 */
> -		unsigned int page_type;
> +		struct {
> +			/*
> +			 * Always place page_type in
> +			 * upper 16 bits of _mapcount
> +			 */
> +#ifdef CPU_BIG_ENDIAN

s/CPU_BIG_ENDIAN/CONFIG_CPU_BIG_ENDIAN/g

> +			__u16 page_type;
> +			__u16 active;
> +#else
> +			__u16 active;
> +			__u16 page_type;
> +#endif
> +		};
>  	};
>  
>  	/* Usage count. *DO NOT USE DIRECTLY*. See page_ref.h */

[...]

> diff --git a/mm/slab.h b/mm/slab.h
> index 985820b9069b..a5273e189265 100644
> --- a/mm/slab.h
> +++ b/mm/slab.h
> @@ -20,7 +20,16 @@ struct slab {
>  		};
>  		struct rcu_head rcu_head;
>  	};
> -	unsigned int active;
> +	struct {
> +		/* always place page_type in upper 16 bits of _mapcount */
> +#ifdef CPU_BIG_ENDIAN

same here.

> +		__u16 page_type;
> +		__u16 active;
> +#else
> +		__u16 active;
> +		__u16 page_type;
> +#endif
> +	};
>  
>  #elif defined(CONFIG_SLUB)
>  
> -- 
> 2.32.0
> 

-- 
Thanks,
Hyeonggon



* Re: [PATCH] mm: move PG_slab flag to page_type
  2022-09-19 12:57 [PATCH] mm: move PG_slab flag to page_type Hyeonggon Yoo
  2022-09-19 12:59 ` [RFC PATCH] " Hyeonggon Yoo
  2022-09-19 13:16 ` [PATCH] " Hyeonggon Yoo
@ 2022-09-24 23:04 ` Matthew Wilcox
  2022-09-26  7:55   ` David Hildenbrand
  2022-10-07 13:36   ` Hyeonggon Yoo
  2 siblings, 2 replies; 8+ messages in thread
From: Matthew Wilcox @ 2022-09-24 23:04 UTC (permalink / raw)
  To: Hyeonggon Yoo
  Cc: linux-mm, linux-kernel, Andrew Morton, Christoph Lameter,
	Pekka Enberg, David Rientjes, Joonsoo Kim, Vlastimil Babka,
	Roman Gushchin, Naoya Horiguchi, Miaohe Lin, Minchan Kim,
	Mel Gorman, Andrea Arcangeli, Dan Williams, Hugh Dickins,
	Muchun Song, David Hildenbrand, Andrey Konovalov, Marco Elver

On Mon, Sep 19, 2022 at 09:57:08PM +0900, Hyeonggon Yoo wrote:
> For now, only SLAB uses the _mapcount field to store the number of active
> objects in a slab; the other slab allocators do not use it. As 16 bits are
> enough for that count, use the remaining 16 bits of _mapcount as page_type
> even when SLAB is used. And then move the PG_slab flag to page_type!
> 
> Note that page_type is always placed in the upper 16 bits of _mapcount to
> avoid mistaking a normal _mapcount for a page_type. As underflow (or
> rather, overflow) is no longer a concern, use more of the lower bits,
> reserving only bit zero.
> 
> Add more folio helpers to PAGE_TYPE_OPS() so as not to break the existing
> slab implementations.
> 
> Remove the PG_slab check from PAGE_FLAGS_CHECK_AT_FREE. The buddy
> allocator will still check that _mapcount is properly set at free.
> 
> Exclude PG_slab from hwpoison handling and show_page_flags() for now.
> 
> Note that with this patch, page_mapped() and folio_mapped() always return
> false for slab pages.

This is an interesting approach.  It raises some questions.

First, you say that folio_mapped() returns false for slab pages.  That's
only true for order-0 slab pages.  For larger pages,

        if (!folio_test_large(folio))
                return atomic_read(&folio->_mapcount) >= 0;
        if (atomic_read(folio_mapcount_ptr(folio)) >= 0)
                return true;

so that's going to depend what folio_mapcount_ptr() aliases with.

Second, this patch changes the behaviour of PageSlab() when applied to
tail pages.  Which raises the further question of what PageBuddy(),
PageTable(), PageGuard() and PageIsolated() should do for multi-page
folios, if that is even possible.

Third, can we do this without that awkward __u16 thing?  Perhaps

-#define PG_buddy        0x00000080
-#define PG_offline      0x00000100
-#define PG_table        0x00000200
-#define PG_guard        0x00000400
+#define PG_buddy        0x00010000
+#define PG_offline      0x00020000
+#define PG_table        0x00040000
+#define PG_guard        0x00080000
+#define PG_slab         0x00100000

... and then use wrappers in slab.c to access the bottom 16 bits?
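
Maybe something like this sketch (untested, names invented; it assumes
struct slab keeps a single 32-bit page_type field aliasing page->page_type):

static inline unsigned int slab_active(const struct slab *slab)
{
	/* SLAB's active-object count lives in the low 16 bits */
	return slab->page_type & 0xffff;
}

static inline void slab_set_active(struct slab *slab, unsigned int active)
{
	/* preserve the page-type bits in the high half */
	slab->page_type = (slab->page_type & ~0xffffu) | (active & 0xffff);
}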




* Re: [PATCH] mm: move PG_slab flag to page_type
  2022-09-24 23:04 ` Matthew Wilcox
@ 2022-09-26  7:55   ` David Hildenbrand
  2022-10-07 13:36   ` Hyeonggon Yoo
  1 sibling, 0 replies; 8+ messages in thread
From: David Hildenbrand @ 2022-09-26  7:55 UTC (permalink / raw)
  To: Matthew Wilcox, Hyeonggon Yoo
  Cc: linux-mm, linux-kernel, Andrew Morton, Christoph Lameter,
	Pekka Enberg, David Rientjes, Joonsoo Kim, Vlastimil Babka,
	Roman Gushchin, Naoya Horiguchi, Miaohe Lin, Minchan Kim,
	Mel Gorman, Andrea Arcangeli, Dan Williams, Hugh Dickins,
	Muchun Song, Andrey Konovalov, Marco Elver

On 25.09.22 01:04, Matthew Wilcox wrote:
> On Mon, Sep 19, 2022 at 09:57:08PM +0900, Hyeonggon Yoo wrote:
>> For now, only SLAB uses the _mapcount field to store the number of active
>> objects in a slab; the other slab allocators do not use it. As 16 bits are
>> enough for that count, use the remaining 16 bits of _mapcount as page_type
>> even when SLAB is used. And then move the PG_slab flag to page_type!
>>
>> Note that page_type is always placed in the upper 16 bits of _mapcount to
>> avoid mistaking a normal _mapcount for a page_type. As underflow (or
>> rather, overflow) is no longer a concern, use more of the lower bits,
>> reserving only bit zero.
>>
>> Add more folio helpers to PAGE_TYPE_OPS() so as not to break the existing
>> slab implementations.
>>
>> Remove the PG_slab check from PAGE_FLAGS_CHECK_AT_FREE. The buddy
>> allocator will still check that _mapcount is properly set at free.
>>
>> Exclude PG_slab from hwpoison handling and show_page_flags() for now.
>>
>> Note that with this patch, page_mapped() and folio_mapped() always return
>> false for slab pages.
> 
> This is an interesting approach.  It raises some questions.
> 
> First, you say that folio_mapped() returns false for slab pages.  That's
> only true for order-0 slab pages.  For larger pages,
> 
>          if (!folio_test_large(folio))
>                  return atomic_read(&folio->_mapcount) >= 0;
>          if (atomic_read(folio_mapcount_ptr(folio)) >= 0)
>                  return true;
> 
> so that's going to depend what folio_mapcount_ptr() aliases with.
> 
> Second, this patch changes the behaviour of PageSlab() when applied to
> tail pages.  Which raises the further question of what PageBuddy(),
> PageTable(), PageGuard() and PageIsolated() should do for multi-page
> folios, if that is even possible.

IIRC, these flags never apply to real compound pages so far. For 
example, PageBuddy() is only set on the first page of a (buddy-aligned) 
free chunk of memory, and all "remaining" (tail) pages have a refcount 
of zero and don't have the flag set.

There are page tables on some systems (e.g., s390x) that span multiple 
pages (pmd_alloc_one()). I think the behavior is similar -- no compound 
page, and only the first page has the flag set.

-- 
Thanks,

David / dhildenb




* Re: [PATCH] mm: move PG_slab flag to page_type
  2022-09-24 23:04 ` Matthew Wilcox
  2022-09-26  7:55   ` David Hildenbrand
@ 2022-10-07 13:36   ` Hyeonggon Yoo
  2022-10-07 18:02     ` Matthew Wilcox
  1 sibling, 1 reply; 8+ messages in thread
From: Hyeonggon Yoo @ 2022-10-07 13:36 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: linux-mm, linux-kernel, Andrew Morton, Christoph Lameter,
	Pekka Enberg, David Rientjes, Joonsoo Kim, Vlastimil Babka,
	Roman Gushchin, Naoya Horiguchi, Miaohe Lin, Minchan Kim,
	Mel Gorman, Andrea Arcangeli, Dan Williams, Hugh Dickins,
	Muchun Song, David Hildenbrand, Andrey Konovalov, Marco Elver

On Sun, Sep 25, 2022 at 12:04:40AM +0100, Matthew Wilcox wrote:
> On Mon, Sep 19, 2022 at 09:57:08PM +0900, Hyeonggon Yoo wrote:
> > For now, only SLAB uses the _mapcount field to store the number of active
> > objects in a slab; the other slab allocators do not use it. As 16 bits are
> > enough for that count, use the remaining 16 bits of _mapcount as page_type
> > even when SLAB is used. And then move the PG_slab flag to page_type!
> > 
> > Note that page_type is always placed in the upper 16 bits of _mapcount to
> > avoid mistaking a normal _mapcount for a page_type. As underflow (or
> > rather, overflow) is no longer a concern, use more of the lower bits,
> > reserving only bit zero.
> > 
> > Add more folio helpers to PAGE_TYPE_OPS() so as not to break the existing
> > slab implementations.
> > 
> > Remove the PG_slab check from PAGE_FLAGS_CHECK_AT_FREE. The buddy
> > allocator will still check that _mapcount is properly set at free.
> > 
> > Exclude PG_slab from hwpoison handling and show_page_flags() for now.
> > 
> > Note that with this patch, page_mapped() and folio_mapped() always return
> > false for slab pages.
> 
> This is an interesting approach.  It raises some questions.

Hello Matthew, sorry for the late reply; I didn't mean to ignore your
feedback. I realized compound pages and folios are a weak spot of mine
and needed some time to learn :)

> First, you say that folio_mapped() returns false for slab pages.  That's
> only true for order-0 slab pages.  For larger pages,
> 
>         if (!folio_test_large(folio))
>                 return atomic_read(&folio->_mapcount) >= 0;
>         if (atomic_read(folio_mapcount_ptr(folio)) >= 0)
>                 return true;
> 
> so that's going to depend what folio_mapcount_ptr() aliases with.

IIUC it's true for order > 0 slabs too.

As slab pages are not mapped to userspace at all, neither the entire
compound page nor its base pages are mapped to userspace.

AFAIK the following are true for an order > 0 slab:
        - (first tail page)->compound_mapcount is -1
        - _mapcount of the base pages is -1

So:
        folio_mapped() and page_mapped() (if applied to the head page)
        return false for larger pages with this patch.

I wrote a simple test case and checked that folio_mapped() and page_mapped()
return false for both order-0 and larger pages (SLAB returned true for
them before). A rough sketch of the test follows below.
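
Roughly, the test looked like this (a reconstructed sketch, not the exact
code - it assumes a throwaway test module, and the cache name is made up):

	#include <linux/module.h>
	#include <linux/slab.h>
	#include <linux/mm.h>

	static int __init slab_mapped_test_init(void)
	{
		void *small = kmalloc(64, GFP_KERNEL);	/* order-0 slab page */
		struct kmem_cache *tc;
		void *big;

		/* 8K objects force an order > 0 (compound) slab page */
		tc = kmem_cache_create("mapped_test", 8192, 0, 0, NULL);
		big = kmem_cache_alloc(tc, GFP_KERNEL);

		/* both should print 0 with this patch; SLAB printed 1 before */
		pr_info("order-0: page_mapped()=%d\n",
			page_mapped(virt_to_head_page(small)));
		pr_info("order>0: folio_mapped()=%d\n",
			folio_mapped(virt_to_folio(big)));

		kmem_cache_free(tc, big);
		kmem_cache_destroy(tc);
		kfree(small);
		return 0;
	}
	module_init(slab_mapped_test_init);
	MODULE_LICENSE("GPL");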

> Second, this patch changes the behaviour of PageSlab() when applied to
> tail pages.

Although it changes the way the flag is checked,

it does not change behavior when applied to tail pages - PageSlab() on a
tail page returns false with or without this patch.

If PageSlab() needs to return true for tail pages too,
we may make it check page_type on the head page.

But I'm not sure when that behavior is needed.
Can you please share your insight on this?

> Which raises the further question of what PageBuddy(),
> PageTable(), PageGuard() and PageIsolated() should do for multi-page
> folios, if that is even possible.

For users that use real compound pages, like slab, we can make it check
the page_type of the head page (if needed).

But for the cases David described, there isn't much we can do
except make them use real compound pages.

> Third, can we do this without that awkward __u16 thing?  Perhaps
> 
> -#define PG_buddy        0x00000080
> -#define PG_offline      0x00000100
> -#define PG_table        0x00000200
> -#define PG_guard        0x00000400
> +#define PG_buddy        0x00010000
> +#define PG_offline      0x00020000
> +#define PG_table        0x00040000
> +#define PG_guard        0x00080000
> +#define PG_slab         0x00100000
> 
> ... and then use wrappers in slab.c to access the bottom 16 bits?

Definitely! I prefer that way and will adjust in RFC v2.

Thank you for the precious feedback.

-- 
Hyeonggon



* Re: [PATCH] mm: move PG_slab flag to page_type
  2022-10-07 13:36   ` Hyeonggon Yoo
@ 2022-10-07 18:02     ` Matthew Wilcox
  2022-10-08  4:21       ` Hyeonggon Yoo
  0 siblings, 1 reply; 8+ messages in thread
From: Matthew Wilcox @ 2022-10-07 18:02 UTC (permalink / raw)
  To: Hyeonggon Yoo
  Cc: linux-mm, linux-kernel, Andrew Morton, Christoph Lameter,
	Pekka Enberg, David Rientjes, Joonsoo Kim, Vlastimil Babka,
	Roman Gushchin, Naoya Horiguchi, Miaohe Lin, Minchan Kim,
	Mel Gorman, Andrea Arcangeli, Dan Williams, Hugh Dickins,
	Muchun Song, David Hildenbrand, Andrey Konovalov, Marco Elver

On Fri, Oct 07, 2022 at 10:36:56PM +0900, Hyeonggon Yoo wrote:
> > First, you say that folio_mapped() returns false for slab pages.  That's
> > only true for order-0 slab pages.  For larger pages,
> > 
> >         if (!folio_test_large(folio))
> >                 return atomic_read(&folio->_mapcount) >= 0;
> >         if (atomic_read(folio_mapcount_ptr(folio)) >= 0)
> >                 return true;
> > 
> > so that's going to depend what folio_mapcount_ptr() aliases with.
> 
> IIUC it's true for order > 0 slabs too.
> 
> As slab pages are not mapped to userspace at all, neither the entire
> compound page nor its base pages are mapped to userspace.
> 
> AFAIK the following are true for an order > 0 slab:
>         - (first tail page)->compound_mapcount is -1

That's the part I wasn't sure of.  I think we do, in
prep_compound_head().

>         - _mapcount of the base pages is -1
> 
> So:
>         folio_mapped() and page_mapped() (if applied to the head page)
>         return false for larger pages with this patch.
> 
> I wrote a simple test case and checked that folio_mapped() and page_mapped()
> return false for both order-0 and larger pages (SLAB
> returned true for them before).
> 
> > Second, this patch changes the behaviour of PageSlab() when applied to
> > tail pages.
> 
> Although it changes the way the flag is checked,
> 
> it does not change behavior when applied to tail pages - PageSlab() on a
> tail page returns false with or without this patch.

Really?  It seems to me that it returns true at the moment.  Look:

__PAGEFLAG(Slab, slab, PF_NO_TAIL)

#define PF_NO_TAIL(page, enforce) ({                                    \
                VM_BUG_ON_PGFLAGS(enforce && PageTail(page), page);     \
                PF_POISONED_CHECK(compound_head(page)); })

so AFAICS, PageSlab checks the Slab bit on the head page, not the
tail page.

> If PageSlab() needs to return true for tail pages too,
> we may make it check page_type on the head page.
> 
> But I'm not sure when that behavior is needed.
> Can you please share your insight on this?

There are tools like tools/vm/page-types.c which expect PageSlab to
return true for tail pages.

> > Which raises the further question of what PageBuddy(),
> > PageTable(), PageGuard() and PageIsolated() should do for multi-page
> > folios, if that is even possible.
> 
> For users that use real compound pages, like slab, we can make it check
> the page_type of the head page (if needed).
> 
> But for the cases David described, there isn't much we can do
> except make them use real compound pages.
> 
> > Third, can we do this without that awkward __u16 thing?  Perhaps
> > 
> > -#define PG_buddy        0x00000080
> > -#define PG_offline      0x00000100
> > -#define PG_table        0x00000200
> > -#define PG_guard        0x00000400
> > +#define PG_buddy        0x00010000
> > +#define PG_offline      0x00020000
> > +#define PG_table        0x00040000
> > +#define PG_guard        0x00080000
> > +#define PG_slab         0x00100000
> > 
> > ... and then use wrappers in slab.c to access the bottom 16 bits?
> 
> Definitely! I prefer that way and will adjust in RFC v2.
> 
> Thank you for the precious feedback.

No problem.  I suggested (in an off-list email) that you consider counting
'active' by subtraction rather than addition because I have a feeling that

int active(struct slab *slab)
{
	return ~(slab->page_type | PG_slab);
}

would be better than

int active(struct slab *slab)
{
	return slab->page_type & 0xffff;
}

at least in part because you don't have to clear the bottom 16 bits of
page_type when you clear PG_slab, and you don't have to re-set them
when you set PG_slab.
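
To make that concrete, a worked example (hypothetical values, using the
PG_slab = 0x00100000 layout above):

/*
 * page_type starts at -1 (0xffffffff); __SetPageSlab clears PG_slab:
 *
 *	page_type == 0xffefffff  ->  active == ~(0xffefffff | PG_slab)
 *	                                    == ~0xffffffff == 0
 *
 * each allocated object then subtracts 1 from page_type:
 *
 *	page_type == 0xffeffffc  ->  active == ~(0xffeffffc | PG_slab)
 *	                                    == ~0xfffffffc == 3
 *
 * so the count never disturbs the type bits, and setting or clearing
 * PG_slab never has to touch the count.
 */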



* Re: [PATCH] mm: move PG_slab flag to page_type
  2022-10-07 18:02     ` Matthew Wilcox
@ 2022-10-08  4:21       ` Hyeonggon Yoo
  0 siblings, 0 replies; 8+ messages in thread
From: Hyeonggon Yoo @ 2022-10-08  4:21 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: linux-mm, linux-kernel, Andrew Morton, Christoph Lameter,
	Pekka Enberg, David Rientjes, Joonsoo Kim, Vlastimil Babka,
	Roman Gushchin, Naoya Horiguchi, Miaohe Lin, Minchan Kim,
	Mel Gorman, Andrea Arcangeli, Dan Williams, Hugh Dickins,
	Muchun Song, David Hildenbrand, Andrey Konovalov, Marco Elver

On Fri, Oct 07, 2022 at 07:02:35PM +0100, Matthew Wilcox wrote:
> On Fri, Oct 07, 2022 at 10:36:56PM +0900, Hyeonggon Yoo wrote:
> > > First, you say that folio_mapped() returns false for slab pages.  That's
> > > only true for order-0 slab pages.  For larger pages,
> > > 
> > >         if (!folio_test_large(folio))
> > >                 return atomic_read(&folio->_mapcount) >= 0;
> > >         if (atomic_read(folio_mapcount_ptr(folio)) >= 0)
> > >                 return true;
> > > 
> > > so that's going to depend what folio_mapcount_ptr() aliases with.
> > 
> > IIUC it's true for order > 0 slabs too.
> > 
> > As slab pages are not mapped to userspace at all, neither the entire
> > compound page nor its base pages are mapped to userspace.
> > 
> > AFAIK the following are true for an order > 0 slab:
> >         - (first tail page)->compound_mapcount is -1
> 
> That's the part I wasn't sure of.  I think we do, in
> prep_compound_head().

Right, exactly!

> 
> >         - _mapcount of the base pages is -1
> > 
> > So:
> >         folio_mapped() and page_mapped() (if applied to the head page)
> >         return false for larger pages with this patch.
> > 
> > I wrote a simple test case and checked that folio_mapped() and page_mapped()
> > return false for both order-0 and larger pages (SLAB
> > returned true for them before).

FYI, this is still true even after fixing my mistaken test case (see below).

> > 
> > > Second, this patch changes the behaviour of PageSlab() when applied to
> > > tail pages.
> > 
> > Although it changes the way the flag is checked,
> > 
> > it does not change behavior when applied to tail pages - PageSlab() on a
> > tail page returns false with or without this patch.
> 
> Really?  It seems to me that it returns true at the moment.  Look:
> 
> __PAGEFLAG(Slab, slab, PF_NO_TAIL)
> 
> #define PF_NO_TAIL(page, enforce) ({                                    \
>                 VM_BUG_ON_PGFLAGS(enforce && PageTail(page), page);     \
>                 PF_POISONED_CHECK(compound_head(page)); })
> 
> so AFAICS, PageSlab checks the Slab bit on the head page, not the
> tail page.

You are right. I misunderstood it because of my incorrectly written test
case (I forgot to pass __GFP_COMP... how silly of me :D)

Hmm okay, then I will implement a PF_NO_TAIL-style policy that works on
page_type.
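
Something along these lines, perhaps (just a sketch of the idea - the
names are invented, reusing PageType() from the patch):

	/* check the page type on the head page, mirroring PF_NO_TAIL */
	#define PAGE_TYPE_NO_TAIL(page, flag)				\
		PageType(compound_head(page), flag)

	static __always_inline int PageSlab(struct page *page)
	{
		return PAGE_TYPE_NO_TAIL(page, PG_slab);
	}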

> 
> > If PageSlab() needs to return true for tail pages too,
> > we may make it check page_type on the head page.
> > 
> > But I'm not sure when that behavior is needed.
> > Can you please share your insight on this?
> 
> There are tools like tools/vm/page-types.c which expect PageSlab to
> return true for tail pages.
>
> > > Which raises the further question of what PageBuddy(),
> > > PageTable(), PageGuard() and PageIsolated() should do for multi-page
> > > folios, if that is even possible.
> > 
> > For users that use real compound pages, like slab, we can make it check
> > the page_type of the head page (if needed).
> > 
> > But for the cases David described, there isn't much we can do
> > except make them use real compound pages.
> > 
> > > Third, can we do this without that awkward __u16 thing?  Perhaps
> > > 
> > > -#define PG_buddy        0x00000080
> > > -#define PG_offline      0x00000100
> > > -#define PG_table        0x00000200
> > > -#define PG_guard        0x00000400
> > > +#define PG_buddy        0x00010000
> > > +#define PG_offline      0x00020000
> > > +#define PG_table        0x00040000
> > > +#define PG_guard        0x00080000
> > > +#define PG_slab         0x00100000
> > > 
> > > ... and then use wrappers in slab.c to access the bottom 16 bits?
> > 
> > Definitely! I prefer that way and will adjust in RFC v2.
> > 
> > Thank you for the precious feedback.
> 
> No problem.  I suggested (in an off-list email) that you consider counting
> 'active' by subtraction rather than addition because I have a feeling that
> 
> int active(struct slab *slab)
> {
> 	return ~(slab->page_type | PG_slab);
> }
> 
> would be better than
> 
> int active(struct slab *slab)
> {
> 	return slab->page_type & 0xffff;
> }
> 
> at least in part because you don't have to clear the bottom 16 bits of
> page_type when you clear PG_slab, and you don't have to re-set them
> when you set PG_slab.

Yeah, I was wondering what the benefit of that approach was.
After implementing both approaches, your suggestion seems better to me too.

Many thanks, Matthew!

-- 
Hyeonggon



Thread overview: 8+ messages
2022-09-19 12:57 [PATCH] mm: move PG_slab flag to page_type Hyeonggon Yoo
2022-09-19 12:59 ` [RFC PATCH] " Hyeonggon Yoo
2022-09-19 13:16 ` [PATCH] " Hyeonggon Yoo
2022-09-24 23:04 ` Matthew Wilcox
2022-09-26  7:55   ` David Hildenbrand
2022-10-07 13:36   ` Hyeonggon Yoo
2022-10-07 18:02     ` Matthew Wilcox
2022-10-08  4:21       ` Hyeonggon Yoo
