linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH RFC v2 0/3] mm: Properly document tail pages for a folio
@ 2023-08-14 18:44 Peter Xu
  2023-08-14 18:44 ` [PATCH RFC v2 1/3] mm: Add TAIL_MAPPING_REUSED_MAX Peter Xu
                   ` (3 more replies)
  0 siblings, 4 replies; 16+ messages in thread
From: Peter Xu @ 2023-08-14 18:44 UTC (permalink / raw)
  To: linux-mm, linux-kernel
  Cc: Mike Kravetz, David Hildenbrand, Matthew Wilcox, Andrew Morton,
	peterx, Yu Zhao, Ryan Roberts, Yang Shi, Hugh Dickins,
	Kirill A . Shutemov

rfcv1: https://lore.kernel.org/r/20230810204944.53471-1-peterx@redhat.com

This is rfcv2 of the patch, where I split two small changes out from the
last patch.  Please refer to each patch for details.  The goal of the
series is to document clearly on how the fields in struct folio is reused
over tail pages, and make it clear on what can still be reused as free.

Smoke tested on x86_64 only, kernel-doc should have no change.

Comments welcomed.  Thanks.

Peter Xu (3):
  mm: Add TAIL_MAPPING_REUSED_MAX
  mm: Reorg and declare free spaces in struct folio tails
  mm: Proper document tail pages fields for folio

 include/linux/mm_types.h | 60 ++++++++++++++++++++++++++++++++++++----
 mm/huge_memory.c         |  6 ++--
 2 files changed, 58 insertions(+), 8 deletions(-)

-- 
2.41.0


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH RFC v2 1/3] mm: Add TAIL_MAPPING_REUSED_MAX
  2023-08-14 18:44 [PATCH RFC v2 0/3] mm: Properly document tail pages for a folio Peter Xu
@ 2023-08-14 18:44 ` Peter Xu
  2023-08-14 18:52   ` Peter Xu
  2023-08-14 19:08   ` Matthew Wilcox
  2023-08-14 18:44 ` [PATCH RFC v2 2/3] mm: Reorg and declare free spaces in struct folio tails Peter Xu
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 16+ messages in thread
From: Peter Xu @ 2023-08-14 18:44 UTC (permalink / raw)
  To: linux-mm, linux-kernel
  Cc: Mike Kravetz, David Hildenbrand, Matthew Wilcox, Andrew Morton,
	peterx, Yu Zhao, Ryan Roberts, Yang Shi, Hugh Dickins,
	Kirill A . Shutemov

Tail pages have a sanity check on ->mapping fields, not all of them but
only upon index>2.  It's because we reused ->mapping fields of the tail
pages index=0,1 for other things.

Define a macro for "max index of tail pages that got ->mapping field
reused" on top of folio definition, because when we grow folio tail pages
we'd want to boost this too together.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 include/linux/mm_types.h | 9 +++++++++
 mm/huge_memory.c         | 6 +++---
 2 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 291c05cacd48..3f2b0d46f5d6 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -248,6 +248,15 @@ static inline struct page *encoded_page_ptr(struct encoded_page *page)
 	return (struct page *)(~ENCODE_PAGE_BITS & (unsigned long)page);
 }
 
+/*
+ * This macro defines the maximum tail pages (of a folio) that can have the
+ * page->mapping field reused (offset 12 for 32bits, or 24 for 64bits).
+ *
+ * When the tail page's mapping field reused, it'll be exempted from
+ * ->mapping poisoning and checks.  Also see the macro TAIL_MAPPING.
+ */
+#define  TAIL_MAPPING_REUSED_MAX  (2)
+
 /**
  * struct folio - Represents a contiguous set of bytes.
  * @flags: Identical to the page flags.
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 0b709d2c46c6..72f244e16dcb 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2444,9 +2444,9 @@ static void __split_huge_page_tail(struct page *head, int tail,
 			 (1L << PG_dirty) |
 			 LRU_GEN_MASK | LRU_REFS_MASK));
 
-	/* ->mapping in first and second tail page is replaced by other uses */
-	VM_BUG_ON_PAGE(tail > 2 && page_tail->mapping != TAIL_MAPPING,
-			page_tail);
+	/* ->mapping in <=TAIL_MAPPING_REUSED_MAX tail pages are reused */
+	VM_BUG_ON_PAGE(tail > TAIL_MAPPING_REUSED_MAX &&
+		       page_tail->mapping != TAIL_MAPPING, page_tail);
 	page_tail->mapping = head->mapping;
 	page_tail->index = head->index + tail;
 
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH RFC v2 2/3] mm: Reorg and declare free spaces in struct folio tails
  2023-08-14 18:44 [PATCH RFC v2 0/3] mm: Properly document tail pages for a folio Peter Xu
  2023-08-14 18:44 ` [PATCH RFC v2 1/3] mm: Add TAIL_MAPPING_REUSED_MAX Peter Xu
@ 2023-08-14 18:44 ` Peter Xu
  2023-08-14 18:44 ` [PATCH RFC v2 3/3] mm: Proper document tail pages fields for folio Peter Xu
  2023-08-14 19:58 ` [PATCH RFC v2 0/3] mm: Properly document tail pages for a folio Matthew Wilcox
  3 siblings, 0 replies; 16+ messages in thread
From: Peter Xu @ 2023-08-14 18:44 UTC (permalink / raw)
  To: linux-mm, linux-kernel
  Cc: Mike Kravetz, David Hildenbrand, Matthew Wilcox, Andrew Morton,
	peterx, Yu Zhao, Ryan Roberts, Yang Shi, Hugh Dickins,
	Kirill A . Shutemov

It's not 100% clear on what are the free spaces in the folio tail pages.
Currently we defined fields for only tail pages 1-2 but they're not really
fully occupied.  Add the fields to show what is free, and also reorg them a
bit to make 32/64 bits alignment easy.

Here _free_1_0 should be a constant hole (of 2 bytes) on any system, make
them explicit so people know they can be reused at any time.

_free_1_1 is special and need some attention: this will shift tail page 1's
fields starting from _entire_mapcount to be 4 bytes later.  I don't expect
this change much on real performance - if it will it might be good to have
_entire_mapcount and _nr_pages_mapped to be put on the same 8B alignment,
assuming that _pincount should be rarer to be used in real life.  But in
all cases the movement shouldn't change much on x86 or anything that has
64B cachelines.  This is the major reason why I had this change separate
from the upcoming documentation update patch - it may need some attention,
and when unwanted things happen (I don't expect) we quickly know what's
wrong.

_free_1_2 / _free_2_1 just calls out extra free spaces elsewhere and
shouldn't affect a thing just like _free_1_0.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 include/linux/mm_types.h | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 3f2b0d46f5d6..829f5adfded1 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -329,11 +329,21 @@ struct folio {
 	/* public: */
 			unsigned char _folio_dtor;
 			unsigned char _folio_order;
+	/* private: 2 bytes can be reused later */
+			unsigned char _free_1_0[2];
+#ifdef CONFIG_64BIT
+	/* 4 bytes can be reused later (64 bits only) */
+			unsigned char _free_1_1[4];
+#endif
+	/* public: */
 			atomic_t _entire_mapcount;
 			atomic_t _nr_pages_mapped;
 			atomic_t _pincount;
 #ifdef CONFIG_64BIT
 			unsigned int _folio_nr_pages;
+	/* private: 4 bytes can be reused later (64 bits only) */
+			unsigned char _free_1_2[4];
+	/* public: */
 #endif
 	/* private: the union with struct page is transitional */
 		};
@@ -355,6 +365,8 @@ struct folio {
 			unsigned long _head_2a;
 	/* public: */
 			struct list_head _deferred_list;
+	/* private: 8 more free bytes for either 32/64 bits */
+			unsigned char _free_2_1[8];
 	/* private: the union with struct page is transitional */
 		};
 		struct page __page_2;
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH RFC v2 3/3] mm: Proper document tail pages fields for folio
  2023-08-14 18:44 [PATCH RFC v2 0/3] mm: Properly document tail pages for a folio Peter Xu
  2023-08-14 18:44 ` [PATCH RFC v2 1/3] mm: Add TAIL_MAPPING_REUSED_MAX Peter Xu
  2023-08-14 18:44 ` [PATCH RFC v2 2/3] mm: Reorg and declare free spaces in struct folio tails Peter Xu
@ 2023-08-14 18:44 ` Peter Xu
  2023-08-14 23:01   ` Randy Dunlap
  2023-08-14 19:58 ` [PATCH RFC v2 0/3] mm: Properly document tail pages for a folio Matthew Wilcox
  3 siblings, 1 reply; 16+ messages in thread
From: Peter Xu @ 2023-08-14 18:44 UTC (permalink / raw)
  To: linux-mm, linux-kernel
  Cc: Mike Kravetz, David Hildenbrand, Matthew Wilcox, Andrew Morton,
	peterx, Yu Zhao, Ryan Roberts, Yang Shi, Hugh Dickins,
	Kirill A . Shutemov

Tail page struct reuse is over-comlicated.  Not only because we have
implicit uses of tail page fields (mapcounts, or private for thp swap
support, etc., that we _may_ still use in the page structs, but not obvious
the relationship between that and the folio definitions), but also because
we have 32/64 bits layouts for struct page so it's unclear what we can use
and what we cannot when trying to find a new spot in folio struct.

We also have tricks like page->mapping, where we can reuse only the tail
page 1/2 but nothing more than tail page 2.  It is all mostly hidden, until
someone starts to read into a VM_BUG_ON_PAGE() of __split_huge_page_tail().

It's also unclear on how many fields we can reuse for a tail page.  The
real answer is (after help from Matthew): we have 7 WORDs guaranteed on 64
bits and 8 WORDs on 32 bits.  Nothing more than that is guaranteed to even
exist.

Let's document it clearly on what we can use and what we can't when
extending folio on reusing tail page fields, with 100% explanations on each
of them.  Hopefully after the doc update it will make it easier when:

  (1) Any reader to know exactly what field is where and for what, the
      relationships between folio tail pages and struct page definitions,

  (2) Any potential new fields to be added to a large folio, so we're clear
      which field one can still reuse.

This is assuming WORD is defined as sizeof(void *) on any archs, just like
the other comment in struct page we already have.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 include/linux/mm_types.h | 41 ++++++++++++++++++++++++++++++++++------
 1 file changed, 35 insertions(+), 6 deletions(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 829f5adfded1..9c744f70ae84 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -322,11 +322,40 @@ struct folio {
 		};
 		struct page page;
 	};
+	/*
+	 * Some of the tail page fields may not be reused by the folio
+	 * object because they're already been used by the page struct.  On
+	 * 32bits there're at least 8 WORDs while on 64 bits there're at
+	 * least 7 WORDs:
+	 *
+	 * |--------+-------------+-------------------|
+	 * |  index | 32 bits     | 64 bits           |
+	 * |--------+-------------+-------------------|
+	 * |      0 | flags       | flags             |
+	 * |      1 | head        | head              |
+	 * |      2 | FREE        | FREE              |
+	 * |      3 | FREE [1]    | FREE [1]          |
+	 * |      4 | FREE        | FREE              |
+	 * |      5 | FREE        | private [2]       |
+	 * |      6 | mapcnt      | mapcnt+refcnt [3] |
+	 * |      7 | refcnt [3]  |                   |
+	 * |--------+-------------+-------------------|
+	 *
+	 * [1] "mapping" field.  It is free to use but needs to be with
+	 *     some caution due to poisoning, see TAIL_MAPPING_REUSED_MAX.
+	 *
+	 * [2] "private" field, used when THP_SWAP is on (but disabled on
+	 *     32 bits, so this index is FREE on 32bit or hugetlb folios).
+	 *     May need to be fixed finally.
+	 *
+	 * [3] "refcount" field must be zero for all tail pages.  See e.g.
+	 *     has_unmovable_pages() on page_ref_count() check and comment.
+	 */
 	union {
 		struct {
 			unsigned long _flags_1;
 			unsigned long _head_1;
-	/* public: */
+	/* public: WORD 2 */
 			unsigned char _folio_dtor;
 			unsigned char _folio_order;
 	/* private: 2 bytes can be reused later */
@@ -335,7 +364,7 @@ struct folio {
 	/* 4 bytes can be reused later (64 bits only) */
 			unsigned char _free_1_1[4];
 #endif
-	/* public: */
+	/* public: WORD 3 */
 			atomic_t _entire_mapcount;
 			atomic_t _nr_pages_mapped;
 			atomic_t _pincount;
@@ -350,20 +379,20 @@ struct folio {
 		struct page __page_1;
 	};
 	union {
-		struct {
+		struct {	/* hugetlb folios */
 			unsigned long _flags_2;
 			unsigned long _head_2;
-	/* public: */
+	/* public: WORD 2 */
 			void *_hugetlb_subpool;
 			void *_hugetlb_cgroup;
 			void *_hugetlb_cgroup_rsvd;
 			void *_hugetlb_hwpoison;
 	/* private: the union with struct page is transitional */
 		};
-		struct {
+		struct {	/* non-hugetlb folios */
 			unsigned long _flags_2a;
 			unsigned long _head_2a;
-	/* public: */
+	/* public: WORD 2-3 */
 			struct list_head _deferred_list;
 	/* private: 8 more free bytes for either 32/64 bits */
 			unsigned char _free_2_1[8];
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH RFC v2 1/3] mm: Add TAIL_MAPPING_REUSED_MAX
  2023-08-14 18:44 ` [PATCH RFC v2 1/3] mm: Add TAIL_MAPPING_REUSED_MAX Peter Xu
@ 2023-08-14 18:52   ` Peter Xu
  2023-08-14 19:08   ` Matthew Wilcox
  1 sibling, 0 replies; 16+ messages in thread
From: Peter Xu @ 2023-08-14 18:52 UTC (permalink / raw)
  To: linux-mm, linux-kernel
  Cc: Mike Kravetz, David Hildenbrand, Matthew Wilcox, Andrew Morton,
	Yu Zhao, Ryan Roberts, Yang Shi, Hugh Dickins,
	Kirill A . Shutemov

On Mon, Aug 14, 2023 at 02:44:09PM -0400, Peter Xu wrote:
> Tail pages have a sanity check on ->mapping fields, not all of them but
> only upon index>2.  It's because we reused ->mapping fields of the tail
> pages index=0,1 for other things.
> 
> Define a macro for "max index of tail pages that got ->mapping field
> reused" on top of folio definition, because when we grow folio tail pages
> we'd want to boost this too together.
> 
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  include/linux/mm_types.h | 9 +++++++++
>  mm/huge_memory.c         | 6 +++---
>  2 files changed, 12 insertions(+), 3 deletions(-)
> 
> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> index 291c05cacd48..3f2b0d46f5d6 100644
> --- a/include/linux/mm_types.h
> +++ b/include/linux/mm_types.h
> @@ -248,6 +248,15 @@ static inline struct page *encoded_page_ptr(struct encoded_page *page)
>  	return (struct page *)(~ENCODE_PAGE_BITS & (unsigned long)page);
>  }
>  
> +/*
> + * This macro defines the maximum tail pages (of a folio) that can have the
> + * page->mapping field reused (offset 12 for 32bits, or 24 for 64bits).
> + *
> + * When the tail page's mapping field reused, it'll be exempted from
> + * ->mapping poisoning and checks.  Also see the macro TAIL_MAPPING.
> + */
> +#define  TAIL_MAPPING_REUSED_MAX  (2)
> +
>  /**
>   * struct folio - Represents a contiguous set of bytes.
>   * @flags: Identical to the page flags.
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 0b709d2c46c6..72f244e16dcb 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -2444,9 +2444,9 @@ static void __split_huge_page_tail(struct page *head, int tail,
>  			 (1L << PG_dirty) |
>  			 LRU_GEN_MASK | LRU_REFS_MASK));
>  
> -	/* ->mapping in first and second tail page is replaced by other uses */
> -	VM_BUG_ON_PAGE(tail > 2 && page_tail->mapping != TAIL_MAPPING,
> -			page_tail);
> +	/* ->mapping in <=TAIL_MAPPING_REUSED_MAX tail pages are reused */
> +	VM_BUG_ON_PAGE(tail > TAIL_MAPPING_REUSED_MAX &&
> +		       page_tail->mapping != TAIL_MAPPING, page_tail);
>  	page_tail->mapping = head->mapping;
>  	page_tail->index = head->index + tail;

I tend to also squash this in when repost:

diff --git a/mm/internal.h b/mm/internal.h
index 02a6fd36f46e..a75df0bd1f89 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -428,7 +428,8 @@ static inline void prep_compound_tail(struct page *head, int tail_idx)
 {
        struct page *p = head + tail_idx;

-       p->mapping = TAIL_MAPPING;
+       if (tail_idx > TAIL_MAPPING_REUSED_MAX)
+               p->mapping = TAIL_MAPPING;
        set_compound_head(p, head);
        set_page_private(p, 0);
 }

-- 
Peter Xu


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH RFC v2 1/3] mm: Add TAIL_MAPPING_REUSED_MAX
  2023-08-14 18:44 ` [PATCH RFC v2 1/3] mm: Add TAIL_MAPPING_REUSED_MAX Peter Xu
  2023-08-14 18:52   ` Peter Xu
@ 2023-08-14 19:08   ` Matthew Wilcox
  2023-08-14 19:51     ` Peter Xu
  1 sibling, 1 reply; 16+ messages in thread
From: Matthew Wilcox @ 2023-08-14 19:08 UTC (permalink / raw)
  To: Peter Xu
  Cc: linux-mm, linux-kernel, Mike Kravetz, David Hildenbrand,
	Andrew Morton, Yu Zhao, Ryan Roberts, Yang Shi, Hugh Dickins,
	Kirill A . Shutemov

On Mon, Aug 14, 2023 at 02:44:09PM -0400, Peter Xu wrote:
> +/*
> + * This macro defines the maximum tail pages (of a folio) that can have the
> + * page->mapping field reused (offset 12 for 32bits, or 24 for 64bits).

No, don't say how many bytes into the structure something is.  It'll
only get out of date.  If somebody needs to know, use pahole.

> + * When the tail page's mapping field reused, it'll be exempted from
> + * ->mapping poisoning and checks.  Also see the macro TAIL_MAPPING.
> + */
> +#define  TAIL_MAPPING_REUSED_MAX  (2)

More importantly, I think this is over-parametrisation.  If you start to
use extra fields in struct folio, just change the code in page_alloc.c
directly.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH RFC v2 1/3] mm: Add TAIL_MAPPING_REUSED_MAX
  2023-08-14 19:08   ` Matthew Wilcox
@ 2023-08-14 19:51     ` Peter Xu
  0 siblings, 0 replies; 16+ messages in thread
From: Peter Xu @ 2023-08-14 19:51 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: linux-mm, linux-kernel, Mike Kravetz, David Hildenbrand,
	Andrew Morton, Yu Zhao, Ryan Roberts, Yang Shi, Hugh Dickins,
	Kirill A . Shutemov

On Mon, Aug 14, 2023 at 08:08:57PM +0100, Matthew Wilcox wrote:
> On Mon, Aug 14, 2023 at 02:44:09PM -0400, Peter Xu wrote:
> > +/*
> > + * This macro defines the maximum tail pages (of a folio) that can have the
> > + * page->mapping field reused (offset 12 for 32bits, or 24 for 64bits).
> 
> No, don't say how many bytes into the structure something is.  It'll
> only get out of date.  If somebody needs to know, use pahole.

OK.

> 
> > + * When the tail page's mapping field reused, it'll be exempted from
> > + * ->mapping poisoning and checks.  Also see the macro TAIL_MAPPING.
> > + */
> > +#define  TAIL_MAPPING_REUSED_MAX  (2)
> 
> More importantly, I think this is over-parametrisation.  If you start to
> use extra fields in struct folio, just change the code in page_alloc.c
> directly.

One should at least also need to change __split_huge_page_tail() on the
BUG_ON() with the hard-coded "tail > 2"?

I wanted to link all these pieces together, and the use case is when anyone
would like to e.g. reuse tail page 3 of a folio, hence put the macro here.

-- 
Peter Xu


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH RFC v2 0/3] mm: Properly document tail pages for a folio
  2023-08-14 18:44 [PATCH RFC v2 0/3] mm: Properly document tail pages for a folio Peter Xu
                   ` (2 preceding siblings ...)
  2023-08-14 18:44 ` [PATCH RFC v2 3/3] mm: Proper document tail pages fields for folio Peter Xu
@ 2023-08-14 19:58 ` Matthew Wilcox
  2023-08-14 20:21   ` Peter Xu
  2023-08-14 23:01   ` Randy Dunlap
  3 siblings, 2 replies; 16+ messages in thread
From: Matthew Wilcox @ 2023-08-14 19:58 UTC (permalink / raw)
  To: Peter Xu
  Cc: linux-mm, linux-kernel, Mike Kravetz, David Hildenbrand,
	Andrew Morton, Yu Zhao, Ryan Roberts, Yang Shi, Hugh Dickins,
	Kirill A . Shutemov

On Mon, Aug 14, 2023 at 02:44:08PM -0400, Peter Xu wrote:

Look, this is all still too complicated.  And you're trying to make
something better that I'm trying to make disappear.  I'd really rather
you spent your time worrying about making userfaultfd use folios
than faffing with this.

How about this?

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 5e74ce4a28cd..873285bb5d45 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -334,11 +334,14 @@ struct folio {
 	/* public: */
 			unsigned char _folio_dtor;
 			unsigned char _folio_order;
+			/* two bytes available here */
 			atomic_t _entire_mapcount;
 			atomic_t _nr_pages_mapped;
 			atomic_t _pincount;
+			/* no more space on 32-bt */
 #ifdef CONFIG_64BIT
 			unsigned int _folio_nr_pages;
+			/* twelve bytes available on 64-bit */
 #endif
 	/* private: the union with struct page is transitional */
 		};
@@ -360,6 +363,7 @@ struct folio {
 			unsigned long _head_2a;
 	/* public: */
 			struct list_head _deferred_list;
+			/* three more words available here */
 	/* private: the union with struct page is transitional */
 		};
 		struct page __page_2;

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH RFC v2 0/3] mm: Properly document tail pages for a folio
  2023-08-14 19:58 ` [PATCH RFC v2 0/3] mm: Properly document tail pages for a folio Matthew Wilcox
@ 2023-08-14 20:21   ` Peter Xu
  2023-08-15  3:45     ` Matthew Wilcox
  2023-08-14 23:01   ` Randy Dunlap
  1 sibling, 1 reply; 16+ messages in thread
From: Peter Xu @ 2023-08-14 20:21 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: linux-mm, linux-kernel, Mike Kravetz, David Hildenbrand,
	Andrew Morton, Yu Zhao, Ryan Roberts, Yang Shi, Hugh Dickins,
	Kirill A . Shutemov

On Mon, Aug 14, 2023 at 08:58:44PM +0100, Matthew Wilcox wrote:
> On Mon, Aug 14, 2023 at 02:44:08PM -0400, Peter Xu wrote:
> 
> Look, this is all still too complicated.  And you're trying to make
> something better that I'm trying to make disappear.  I'd really rather
> you spent your time worrying about making userfaultfd use folios
> than faffing with this.

I saw that internally some of uffd already start to use folio, while I
don't think the syscall part needs changing yet - the ranged API should
work for folio when it comes, and other than that folio should be hidden
and transparent, afaiu.

Do you mean when large folios can land on anon/shmem we can start to
allocate large folios there for uffd operations?  Or something else?

> 
> How about this?

I still prefer my version, sorry. But I agree this is better than nothing
to guide what's free to use - it's really not obvious to me, and I suppose
true to most people.  Besides..

> 
> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> index 5e74ce4a28cd..873285bb5d45 100644
> --- a/include/linux/mm_types.h
> +++ b/include/linux/mm_types.h
> @@ -334,11 +334,14 @@ struct folio {
>  	/* public: */
>  			unsigned char _folio_dtor;
>  			unsigned char _folio_order;
> +			/* two bytes available here */
>  			atomic_t _entire_mapcount;
>  			atomic_t _nr_pages_mapped;
>  			atomic_t _pincount;
> +			/* no more space on 32-bt */
>  #ifdef CONFIG_64BIT
>  			unsigned int _folio_nr_pages;
> +			/* twelve bytes available on 64-bit */
>  #endif
>  	/* private: the union with struct page is transitional */
>  		};
> @@ -360,6 +363,7 @@ struct folio {
>  			unsigned long _head_2a;
>  	/* public: */
>  			struct list_head _deferred_list;
> +			/* three more words available here */

.. not really three more words here but 2 for 32 bits and 1 for 64 bits.
In my patch 3 I used "8 bytes free" so it's applicable to both.

I can figure it out in ten seconds now with my documents..

	 * |--------+-------------+-------------------|
	 * |  index | 32 bits     | 64 bits           |
	 * |--------+-------------+-------------------|
	 * |      0 | flags       | flags             |
	 * |      1 | head        | head              |
	 * |      2 | FREE        | FREE              |
	 * |      3 | FREE [1]    | FREE [1]          |
	 * |      4 | FREE        | FREE              |
	 * |      5 | FREE        | private [2]       |
	 * |      6 | mapcnt      | mapcnt+refcnt [3] |
	 * |      7 | refcnt [3]  |                   |
	 * |--------+-------------+-------------------|

Then...

	/* public: WORD 2-3 */
			struct list_head _deferred_list;
        <----- so after this we have WORDs 4/5 free on 32bits and 4 only on
               64 bits, because WORD 5 is used on 64bits

... but I won't be able to if without these documents.  I hope it justifies
that it's still worthwhile.

Thanks,

-- 
Peter Xu


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH RFC v2 0/3] mm: Properly document tail pages for a folio
  2023-08-14 19:58 ` [PATCH RFC v2 0/3] mm: Properly document tail pages for a folio Matthew Wilcox
  2023-08-14 20:21   ` Peter Xu
@ 2023-08-14 23:01   ` Randy Dunlap
  1 sibling, 0 replies; 16+ messages in thread
From: Randy Dunlap @ 2023-08-14 23:01 UTC (permalink / raw)
  To: Matthew Wilcox, Peter Xu
  Cc: linux-mm, linux-kernel, Mike Kravetz, David Hildenbrand,
	Andrew Morton, Yu Zhao, Ryan Roberts, Yang Shi, Hugh Dickins,
	Kirill A . Shutemov



On 8/14/23 12:58, Matthew Wilcox wrote:
> On Mon, Aug 14, 2023 at 02:44:08PM -0400, Peter Xu wrote:
> 
> Look, this is all still too complicated.  And you're trying to make
> something better that I'm trying to make disappear.  I'd really rather
> you spent your time worrying about making userfaultfd use folios
> than faffing with this.
> 
> How about this?
> 
> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> index 5e74ce4a28cd..873285bb5d45 100644
> --- a/include/linux/mm_types.h
> +++ b/include/linux/mm_types.h
> @@ -334,11 +334,14 @@ struct folio {
>   	/* public: */
>   			unsigned char _folio_dtor;
>   			unsigned char _folio_order;
> +			/* two bytes available here */
>   			atomic_t _entire_mapcount;
>   			atomic_t _nr_pages_mapped;
>   			atomic_t _pincount;
> +			/* no more space on 32-bt */

			                    32-bit

>   #ifdef CONFIG_64BIT
>   			unsigned int _folio_nr_pages;
> +			/* twelve bytes available on 64-bit */
>   #endif
>   	/* private: the union with struct page is transitional */
>   		};
> @@ -360,6 +363,7 @@ struct folio {
>   			unsigned long _head_2a;
>   	/* public: */
>   			struct list_head _deferred_list;
> +			/* three more words available here */
>   	/* private: the union with struct page is transitional */
>   		};
>   		struct page __page_2;

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH RFC v2 3/3] mm: Proper document tail pages fields for folio
  2023-08-14 18:44 ` [PATCH RFC v2 3/3] mm: Proper document tail pages fields for folio Peter Xu
@ 2023-08-14 23:01   ` Randy Dunlap
  0 siblings, 0 replies; 16+ messages in thread
From: Randy Dunlap @ 2023-08-14 23:01 UTC (permalink / raw)
  To: Peter Xu, linux-mm, linux-kernel
  Cc: Mike Kravetz, David Hildenbrand, Matthew Wilcox, Andrew Morton,
	Yu Zhao, Ryan Roberts, Yang Shi, Hugh Dickins,
	Kirill A . Shutemov

Hi--

On 8/14/23 11:44, Peter Xu wrote:
> Tail page struct reuse is over-comlicated.  Not only because we have
> implicit uses of tail page fields (mapcounts, or private for thp swap
> support, etc., that we _may_ still use in the page structs, but not obvious
> the relationship between that and the folio definitions), but also because
> we have 32/64 bits layouts for struct page so it's unclear what we can use
> and what we cannot when trying to find a new spot in folio struct.
> 
> We also have tricks like page->mapping, where we can reuse only the tail
> page 1/2 but nothing more than tail page 2.  It is all mostly hidden, until
> someone starts to read into a VM_BUG_ON_PAGE() of __split_huge_page_tail().
> 
> It's also unclear on how many fields we can reuse for a tail page.  The
> real answer is (after help from Matthew): we have 7 WORDs guaranteed on 64
> bits and 8 WORDs on 32 bits.  Nothing more than that is guaranteed to even
> exist.
> 
> Let's document it clearly on what we can use and what we can't when
> extending folio on reusing tail page fields, with 100% explanations on each
> of them.  Hopefully after the doc update it will make it easier when:
> 
>    (1) Any reader to know exactly what field is where and for what, the
>        relationships between folio tail pages and struct page definitions,
> 
>    (2) Any potential new fields to be added to a large folio, so we're clear
>        which field one can still reuse.
> 
> This is assuming WORD is defined as sizeof(void *) on any archs, just like
> the other comment in struct page we already have.
> 
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>   include/linux/mm_types.h | 41 ++++++++++++++++++++++++++++++++++------
>   1 file changed, 35 insertions(+), 6 deletions(-)
> 
> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> index 829f5adfded1..9c744f70ae84 100644
> --- a/include/linux/mm_types.h
> +++ b/include/linux/mm_types.h
> @@ -322,11 +322,40 @@ struct folio {
>   		};
>   		struct page page;
>   	};
> +	/*
> +	 * Some of the tail page fields may not be reused by the folio
> +	 * object because they're already been used by the page struct.  On

	                  they have

> +	 * 32bits there're at least 8 WORDs while on 64 bits there're at

preferably s/there're/there are/

> +	 * least 7 WORDs:
> +	 *
> +	 * |--------+-------------+-------------------|
> +	 * |  index | 32 bits     | 64 bits           |
> +	 * |--------+-------------+-------------------|
> +	 * |      0 | flags       | flags             |
> +	 * |      1 | head        | head              |
> +	 * |      2 | FREE        | FREE              |
> +	 * |      3 | FREE [1]    | FREE [1]          |
> +	 * |      4 | FREE        | FREE              |
> +	 * |      5 | FREE        | private [2]       |
> +	 * |      6 | mapcnt      | mapcnt+refcnt [3] |
> +	 * |      7 | refcnt [3]  |                   |
> +	 * |--------+-------------+-------------------|
> +	 *
> +	 * [1] "mapping" field.  It is free to use but needs to be with
> +	 *     some caution due to poisoning, see TAIL_MAPPING_REUSED_MAX.
> +	 *
> +	 * [2] "private" field, used when THP_SWAP is on (but disabled on
> +	 *     32 bits, so this index is FREE on 32bit or hugetlb folios).
> +	 *     May need to be fixed finally.
> +	 *
> +	 * [3] "refcount" field must be zero for all tail pages.  See e.g.
> +	 *     has_unmovable_pages() on page_ref_count() check and comment.
> +	 */
>   	union {
>   		struct {
>   			unsigned long _flags_1;
>   			unsigned long _head_1;
> -	/* public: */
> +	/* public: WORD 2 */
>   			unsigned char _folio_dtor;
>   			unsigned char _folio_order;
>   	/* private: 2 bytes can be reused later */
> @@ -335,7 +364,7 @@ struct folio {
>   	/* 4 bytes can be reused later (64 bits only) */
>   			unsigned char _free_1_1[4];
>   #endif
> -	/* public: */
> +	/* public: WORD 3 */
>   			atomic_t _entire_mapcount;
>   			atomic_t _nr_pages_mapped;
>   			atomic_t _pincount;
> @@ -350,20 +379,20 @@ struct folio {
>   		struct page __page_1;
>   	};
>   	union {
> -		struct {
> +		struct {	/* hugetlb folios */
>   			unsigned long _flags_2;
>   			unsigned long _head_2;
> -	/* public: */
> +	/* public: WORD 2 */
>   			void *_hugetlb_subpool;
>   			void *_hugetlb_cgroup;
>   			void *_hugetlb_cgroup_rsvd;
>   			void *_hugetlb_hwpoison;
>   	/* private: the union with struct page is transitional */
>   		};
> -		struct {
> +		struct {	/* non-hugetlb folios */
>   			unsigned long _flags_2a;
>   			unsigned long _head_2a;
> -	/* public: */
> +	/* public: WORD 2-3 */
>   			struct list_head _deferred_list;
>   	/* private: 8 more free bytes for either 32/64 bits */
>   			unsigned char _free_2_1[8];

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH RFC v2 0/3] mm: Properly document tail pages for a folio
  2023-08-14 20:21   ` Peter Xu
@ 2023-08-15  3:45     ` Matthew Wilcox
  2023-08-15 19:37       ` Peter Xu
  0 siblings, 1 reply; 16+ messages in thread
From: Matthew Wilcox @ 2023-08-15  3:45 UTC (permalink / raw)
  To: Peter Xu
  Cc: linux-mm, linux-kernel, Mike Kravetz, David Hildenbrand,
	Andrew Morton, Yu Zhao, Ryan Roberts, Yang Shi, Hugh Dickins,
	Kirill A . Shutemov

On Mon, Aug 14, 2023 at 04:21:55PM -0400, Peter Xu wrote:
> On Mon, Aug 14, 2023 at 08:58:44PM +0100, Matthew Wilcox wrote:
> > On Mon, Aug 14, 2023 at 02:44:08PM -0400, Peter Xu wrote:
> > 
> > Look, this is all still too complicated.  And you're trying to make
> > something better that I'm trying to make disappear.  I'd really rather
> > you spent your time worrying about making userfaultfd use folios
> > than faffing with this.
> 
> I saw that internally some of uffd already start to use folio, while I
> don't think the syscall part needs changing yet - the ranged API should
> work for folio when it comes, and other than that folio should be hidden
> and transparent, afaiu.
> 
> Do you mean when large folios can land on anon/shmem we can start to
> allocate large folios there for uffd operations?  Or something else?

Hm, I thought there were some parts that still needed to be converted.
But I don't see anything obvious right now.

> > @@ -360,6 +363,7 @@ struct folio {
> >  			unsigned long _head_2a;
> >  	/* public: */
> >  			struct list_head _deferred_list;
> > +			/* three more words available here */
> 
> .. not really three more words here but 2 for 32 bits and 1 for 64 bits.
> In my patch 3 I used "8 bytes free" so it's applicable to both.

I always forget about THP_SWAP using tail->private.  That actually needs
to be asserted by the compiler, not just documented.  Something along
these lines.

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 659c7b84726c..3880b3f2e321 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -340,8 +340,11 @@ struct folio {
 			atomic_t _pincount;
 #ifdef CONFIG_64BIT
 			unsigned int _folio_nr_pages;
-#endif
+			/* 4 byte gap here */
 	/* private: the union with struct page is transitional */
+			/* Fix THP_SWAP to not use tail->private */
+			unsigned long _private_1;
+#endif
 		};
 		struct page __page_1;
 	};
@@ -362,6 +365,9 @@ struct folio {
 	/* public: */
 			struct list_head _deferred_list;
 	/* private: the union with struct page is transitional */
+			unsigned long _avail_2a;
+			/* Fix THP_SWAP to not use tail->private */
+			unsigned long _private_2a;
 		};
 		struct page __page_2;
 	};
@@ -386,12 +392,18 @@ FOLIO_MATCH(memcg_data, memcg_data);
 			offsetof(struct page, pg) + sizeof(struct page))
 FOLIO_MATCH(flags, _flags_1);
 FOLIO_MATCH(compound_head, _head_1);
+#ifdef CONFIG_64BIT
+FOLIO_MATCH(private, _private_1);
+#endif
 #undef FOLIO_MATCH
 #define FOLIO_MATCH(pg, fl)						\
 	static_assert(offsetof(struct folio, fl) ==			\
 			offsetof(struct page, pg) + 2 * sizeof(struct page))
 FOLIO_MATCH(flags, _flags_2);
 FOLIO_MATCH(compound_head, _head_2);
+FOLIO_MATCH(flags, _flags_2a);
+FOLIO_MATCH(compound_head, _head_2a);
+FOLIO_MATCH(private, _private_2a);
 #undef FOLIO_MATCH
 
 /*

This is against the patchset I just posted which frees up a word in the
first tail page.


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH RFC v2 0/3] mm: Properly document tail pages for a folio
  2023-08-15  3:45     ` Matthew Wilcox
@ 2023-08-15 19:37       ` Peter Xu
  2023-08-15 20:16         ` Matthew Wilcox
  0 siblings, 1 reply; 16+ messages in thread
From: Peter Xu @ 2023-08-15 19:37 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: linux-mm, linux-kernel, Mike Kravetz, David Hildenbrand,
	Andrew Morton, Yu Zhao, Ryan Roberts, Yang Shi, Hugh Dickins,
	Kirill A . Shutemov

On Tue, Aug 15, 2023 at 04:45:52AM +0100, Matthew Wilcox wrote:
> I always forget about THP_SWAP using tail->private.  That actually needs
> to be asserted by the compiler, not just documented.  Something along
> these lines.
> 
> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> index 659c7b84726c..3880b3f2e321 100644
> --- a/include/linux/mm_types.h
> +++ b/include/linux/mm_types.h
> @@ -340,8 +340,11 @@ struct folio {
>  			atomic_t _pincount;
>  #ifdef CONFIG_64BIT
>  			unsigned int _folio_nr_pages;
> -#endif
> +			/* 4 byte gap here */
>  	/* private: the union with struct page is transitional */
> +			/* Fix THP_SWAP to not use tail->private */
> +			unsigned long _private_1;
> +#endif
>  		};
>  		struct page __page_1;
>  	};
> @@ -362,6 +365,9 @@ struct folio {
>  	/* public: */
>  			struct list_head _deferred_list;
>  	/* private: the union with struct page is transitional */
> +			unsigned long _avail_2a;
> +			/* Fix THP_SWAP to not use tail->private */
> +			unsigned long _private_2a;
>  		};
>  		struct page __page_2;
>  	};
> @@ -386,12 +392,18 @@ FOLIO_MATCH(memcg_data, memcg_data);
>  			offsetof(struct page, pg) + sizeof(struct page))
>  FOLIO_MATCH(flags, _flags_1);
>  FOLIO_MATCH(compound_head, _head_1);
> +#ifdef CONFIG_64BIT
> +FOLIO_MATCH(private, _private_1);
> +#endif
>  #undef FOLIO_MATCH
>  #define FOLIO_MATCH(pg, fl)						\
>  	static_assert(offsetof(struct folio, fl) ==			\
>  			offsetof(struct page, pg) + 2 * sizeof(struct page))
>  FOLIO_MATCH(flags, _flags_2);
>  FOLIO_MATCH(compound_head, _head_2);
> +FOLIO_MATCH(flags, _flags_2a);
> +FOLIO_MATCH(compound_head, _head_2a);
> +FOLIO_MATCH(private, _private_2a);
>  #undef FOLIO_MATCH
>  
>  /*
> 
> This is against the patchset I just posted which frees up a word in the
> first tail page.

Okay, I assume you meant to suggest leverage FOLIO_MATCH(), which I can
definitely try.  But then I'd hope it covers not only private field but all
the fields that the tail pages reuses; the goal is to document everything
no matter in what form.  I'll see what I can get..  Thanks.

-- 
Peter Xu


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH RFC v2 0/3] mm: Properly document tail pages for a folio
  2023-08-15 19:37       ` Peter Xu
@ 2023-08-15 20:16         ` Matthew Wilcox
  2023-08-15 20:39           ` Peter Xu
  0 siblings, 1 reply; 16+ messages in thread
From: Matthew Wilcox @ 2023-08-15 20:16 UTC (permalink / raw)
  To: Peter Xu
  Cc: linux-mm, linux-kernel, Mike Kravetz, David Hildenbrand,
	Andrew Morton, Yu Zhao, Ryan Roberts, Yang Shi, Hugh Dickins,
	Kirill A . Shutemov

On Tue, Aug 15, 2023 at 03:37:21PM -0400, Peter Xu wrote:
> On Tue, Aug 15, 2023 at 04:45:52AM +0100, Matthew Wilcox wrote:
> > I always forget about THP_SWAP using tail->private.  That actually needs
> > to be asserted by the compiler, not just documented.  Something along
> > these lines.
> > 
> > diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> > index 659c7b84726c..3880b3f2e321 100644
> > --- a/include/linux/mm_types.h
> > +++ b/include/linux/mm_types.h
> > @@ -340,8 +340,11 @@ struct folio {
> >  			atomic_t _pincount;
> >  #ifdef CONFIG_64BIT
> >  			unsigned int _folio_nr_pages;
> > -#endif
> > +			/* 4 byte gap here */
> >  	/* private: the union with struct page is transitional */
> > +			/* Fix THP_SWAP to not use tail->private */
> > +			unsigned long _private_1;
> > +#endif
> >  		};
> >  		struct page __page_1;
> >  	};
> > @@ -362,6 +365,9 @@ struct folio {
> >  	/* public: */
> >  			struct list_head _deferred_list;
> >  	/* private: the union with struct page is transitional */
> > +			unsigned long _avail_2a;
> > +			/* Fix THP_SWAP to not use tail->private */
> > +			unsigned long _private_2a;
> >  		};
> >  		struct page __page_2;
> >  	};
> > @@ -386,12 +392,18 @@ FOLIO_MATCH(memcg_data, memcg_data);
> >  			offsetof(struct page, pg) + sizeof(struct page))
> >  FOLIO_MATCH(flags, _flags_1);
> >  FOLIO_MATCH(compound_head, _head_1);
> > +#ifdef CONFIG_64BIT
> > +FOLIO_MATCH(private, _private_1);
> > +#endif
> >  #undef FOLIO_MATCH
> >  #define FOLIO_MATCH(pg, fl)						\
> >  	static_assert(offsetof(struct folio, fl) ==			\
> >  			offsetof(struct page, pg) + 2 * sizeof(struct page))
> >  FOLIO_MATCH(flags, _flags_2);
> >  FOLIO_MATCH(compound_head, _head_2);
> > +FOLIO_MATCH(flags, _flags_2a);
> > +FOLIO_MATCH(compound_head, _head_2a);
> > +FOLIO_MATCH(private, _private_2a);
> >  #undef FOLIO_MATCH
> >  
> >  /*
> > 
> > This is against the patchset I just posted which frees up a word in the
> > first tail page.
> 
> Okay, I assume you meant to suggest leverage FOLIO_MATCH(), which I can
> definitely try.  But then I'd hope it covers not only private field but all
> the fields that the tail pages reuses; the goal is to document everything
> no matter in what form.  I'll see what I can get..  Thanks.

No, sometimes there are things which shouldn't be documented because they
don't matter, and when changing code sometimes we forget to change the
documentation, and then people read the documentation which is different
from the code, and they get confused.

It matters that the various 'private' members line up.  It matters
that folio->index matches page->index.  It does not matter what
offset _entire_mapcount is at.  That can be moved around freely and no
documentation needs to be changed.

I don't want you to use FOLIO_MATCH to make any unnecessary assertions.
The only assertion missing is for _private_1 and _private_2a, and that's
why I wrote a patch to add them.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH RFC v2 0/3] mm: Properly document tail pages for a folio
  2023-08-15 20:16         ` Matthew Wilcox
@ 2023-08-15 20:39           ` Peter Xu
  2023-08-15 21:03             ` Matthew Wilcox
  0 siblings, 1 reply; 16+ messages in thread
From: Peter Xu @ 2023-08-15 20:39 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: linux-mm, linux-kernel, Mike Kravetz, David Hildenbrand,
	Andrew Morton, Yu Zhao, Ryan Roberts, Yang Shi, Hugh Dickins,
	Kirill A . Shutemov

On Tue, Aug 15, 2023 at 09:16:47PM +0100, Matthew Wilcox wrote:
> No, sometimes there are things which shouldn't be documented because they
> don't matter, and when changing code sometimes we forget to change the
> documentation, and then people read the documentation which is different
> from the code, and they get confused.
> 
> It matters that the various 'private' members line up.  It matters
> that folio->index matches page->index.  It does not matter what
> offset _entire_mapcount is at.  That can be moved around freely and no
> documentation needs to be changed.
> 
> I don't want you to use FOLIO_MATCH to make any unnecessary assertions.
> The only assertion missing is for _private_1 and _private_2a, and that's
> why I wrote a patch to add them.

I didn't mean to add assertions for _entire_mapcount (I don't even know
how..), but _mapcount and _refcount to clamp the fields, then all fields
can be clear, just like head/flags clamping the start of fields.

One thing I can understand that you'd like to avoid these "offset" things
is perhaps because you keep that in mind to, one day, have mmdesc replacing
folio so folio doesn't need to match struct page at all some day,
ideally. The order of fields, size of fields, etc. none of them should
matter, when it comes, and we should go toward that.  However my argument
would be that, before that day comes IMHO we need some good documentation
for us to know how the fields look like now, why they worked, and how to
reuse new fields.. when that comes, we can just safely remove these
documentations.

It's just that these 'offset's still matter and matters a lot now, imho,
but it's very confusing when read without some help.

Let me try one more time to see how you think about it on an rfcv3.  If
that still doesn't get any form of ack from you, I'll put this aside.

-- 
Peter Xu


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH RFC v2 0/3] mm: Properly document tail pages for a folio
  2023-08-15 20:39           ` Peter Xu
@ 2023-08-15 21:03             ` Matthew Wilcox
  0 siblings, 0 replies; 16+ messages in thread
From: Matthew Wilcox @ 2023-08-15 21:03 UTC (permalink / raw)
  To: Peter Xu
  Cc: linux-mm, linux-kernel, Mike Kravetz, David Hildenbrand,
	Andrew Morton, Yu Zhao, Ryan Roberts, Yang Shi, Hugh Dickins,
	Kirill A . Shutemov

On Tue, Aug 15, 2023 at 04:39:16PM -0400, Peter Xu wrote:
> On Tue, Aug 15, 2023 at 09:16:47PM +0100, Matthew Wilcox wrote:
> > No, sometimes there are things which shouldn't be documented because they
> > don't matter, and when changing code sometimes we forget to change the
> > documentation, and then people read the documentation which is different
> > from the code, and they get confused.
> > 
> > It matters that the various 'private' members line up.  It matters
> > that folio->index matches page->index.  It does not matter what
> > offset _entire_mapcount is at.  That can be moved around freely and no
> > documentation needs to be changed.
> > 
> > I don't want you to use FOLIO_MATCH to make any unnecessary assertions.
> > The only assertion missing is for _private_1 and _private_2a, and that's
> > why I wrote a patch to add them.
> 
> I didn't mean to add assertions for _entire_mapcount (I don't even know
> how..), but _mapcount and _refcount to clamp the fields, then all fields
> can be clear, just like head/flags clamping the start of fields.

Ah!  mapcount does make sense, yes.  We could just put a
			/* no more space here */
comment in, but an assert works too.

> One thing I can understand that you'd like to avoid these "offset" things
> is perhaps because you keep that in mind to, one day, have mmdesc replacing
> folio so folio doesn't need to match struct page at all some day,
> ideally. The order of fields, size of fields, etc. none of them should
> matter, when it comes, and we should go toward that.  However my argument
> would be that, before that day comes IMHO we need some good documentation
> for us to know how the fields look like now, why they worked, and how to
> reuse new fields.. when that comes, we can just safely remove these
> documentations.
> 
> It's just that these 'offset's still matter and matters a lot now, imho,
> but it's very confusing when read without some help.

No, that's not why I'm opposed to them.  I'm opposed to over-documenting
things, as I just outlined.  Documentation is necessary and good for all
kinds of reasons, but it should be meaningful and not prone to rot.  If
there's a tool that can tell you something, there's no point in
documenting it; that's why I pointed you towards pahole.  I also
like "documentation" which is checked by the compiler, hence the
existence of the FOLIO_MATCH macro which documents that the two
structures line up, and the compiler checks that they do.  FOLIO_MATCH
even caught a bug!

> Let me try one more time to see how you think about it on an rfcv3.  If
> that still doesn't get any form of ack from you, I'll put this aside.

At least we've got to something that I like the idea of ;-)

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2023-08-15 21:03 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-08-14 18:44 [PATCH RFC v2 0/3] mm: Properly document tail pages for a folio Peter Xu
2023-08-14 18:44 ` [PATCH RFC v2 1/3] mm: Add TAIL_MAPPING_REUSED_MAX Peter Xu
2023-08-14 18:52   ` Peter Xu
2023-08-14 19:08   ` Matthew Wilcox
2023-08-14 19:51     ` Peter Xu
2023-08-14 18:44 ` [PATCH RFC v2 2/3] mm: Reorg and declare free spaces in struct folio tails Peter Xu
2023-08-14 18:44 ` [PATCH RFC v2 3/3] mm: Proper document tail pages fields for folio Peter Xu
2023-08-14 23:01   ` Randy Dunlap
2023-08-14 19:58 ` [PATCH RFC v2 0/3] mm: Properly document tail pages for a folio Matthew Wilcox
2023-08-14 20:21   ` Peter Xu
2023-08-15  3:45     ` Matthew Wilcox
2023-08-15 19:37       ` Peter Xu
2023-08-15 20:16         ` Matthew Wilcox
2023-08-15 20:39           ` Peter Xu
2023-08-15 21:03             ` Matthew Wilcox
2023-08-14 23:01   ` Randy Dunlap

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).