From: Hugh Dickins <hughd@google.com>
To: Zi Yan <ziy@nvidia.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>,
	"Yang Shi" <shy828301@gmail.com>, "Yu Zhao" <yuzhao@google.com>,
	linux-mm@kvack.org,
	"Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>,
	"Ryan Roberts" <ryan.roberts@arm.com>,
	"Michal Koutný" <mkoutny@suse.com>,
	"Roman Gushchin" <roman.gushchin@linux.dev>,
	"Zach O'Keefe" <zokeefe@google.com>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, linux-kselftest@vger.kernel.org
Subject: Re: [PATCH v3 6/7] mm: truncate: split huge page cache page to a non-zero order if possible.
Date: Sun, 16 Apr 2023 12:44:53 -0700 (PDT)
Message-ID: <9dd96da-efa2-5123-20d4-4992136ef3ad@google.com>
In-Reply-To: <20230403201839.4097845-7-zi.yan@sent.com>

On Mon, 3 Apr 2023, Zi Yan wrote:

> From: Zi Yan <ziy@nvidia.com>
> 
> To minimize the number of pages after a huge page truncation, we do not
> need to split it all the way down to order-0. The huge page has at most
> three parts, the part before offset, the part to be truncated, the part
> remaining at the end. Find the greatest common divisor of them to
> calculate the new page order from it, so we can split the huge
> page to this order and keep the remaining pages as large and as few as
> possible.
> 
> Signed-off-by: Zi Yan <ziy@nvidia.com>
> ---
>  mm/truncate.c | 21 +++++++++++++++++++--
>  1 file changed, 19 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/truncate.c b/mm/truncate.c
> index 86de31ed4d32..817efd5e94b4 100644
> --- a/mm/truncate.c
> +++ b/mm/truncate.c
> @@ -22,6 +22,7 @@
>  #include <linux/buffer_head.h>	/* grr. try_to_release_page */
>  #include <linux/shmem_fs.h>
>  #include <linux/rmap.h>
> +#include <linux/gcd.h>

Really?

>  #include "internal.h"
>  
>  /*
> @@ -211,7 +212,8 @@ int truncate_inode_folio(struct address_space *mapping, struct folio *folio)
>  bool truncate_inode_partial_folio(struct folio *folio, loff_t start, loff_t end)
>  {
>  	loff_t pos = folio_pos(folio);
> -	unsigned int offset, length;
> +	unsigned int offset, length, remaining;
> +	unsigned int new_order = folio_order(folio);
>  
>  	if (pos < start)
>  		offset = start - pos;
> @@ -222,6 +224,7 @@ bool truncate_inode_partial_folio(struct folio *folio, loff_t start, loff_t end)
>  		length = length - offset;
>  	else
>  		length = end + 1 - pos - offset;
> +	remaining = folio_size(folio) - offset - length;
>  
>  	folio_wait_writeback(folio);
>  	if (length == folio_size(folio)) {
> @@ -236,11 +239,25 @@ bool truncate_inode_partial_folio(struct folio *folio, loff_t start, loff_t end)
>  	 */
>  	folio_zero_range(folio, offset, length);
>  
> +	/*
> +	 * Use the greatest common divisor of offset, length, and remaining
> +	 * as the smallest page size and compute the new order from it. So we
> +	 * can truncate a subpage as large as possible. Round up gcd to
> +	 * PAGE_SIZE, otherwise ilog2 can give -1 when gcd/PAGE_SIZE is 0.
> +	 */
> +	new_order = ilog2(round_up(gcd(gcd(offset, length), remaining),
> +				   PAGE_SIZE) / PAGE_SIZE);

Gosh.  In mm/readahead.c I can see "order = __ffs(index)",
and I think something along those lines would be more appropriate here.
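
A minimal userspace sketch of how an __ffs-style computation might sit
next to the gcd one. Assumptions: 4K pages; the GCC/Clang builtins
__builtin_ctzl() and __builtin_clzl() stand in for the kernel's __ffs()
and ilog2(); split_order_ffs() and split_order_gcd() are hypothetical
names; the order-1 downgrade from the patch is left out. Since
offset + length + remaining equals the folio size, a power of two, their
gcd is itself a power of two, so the lowest set bit of the three values
ORed together should give the same order.

```c
#include <assert.h>

#define PAGE_SHIFT 12UL			/* assumption: 4K pages */
#define PAGE_SIZE  (1UL << PAGE_SHIFT)

/* Userspace stand-in for the kernel's __ffs(): index of lowest set bit. */
static unsigned int lowest_set_bit(unsigned long word)
{
	assert(word != 0);
	return (unsigned int)__builtin_ctzl(word);
}

/* Userspace stand-in for ilog2(): floor(log2(v)) for v != 0. */
static unsigned int floor_log2(unsigned long v)
{
	assert(v != 0);
	return (unsigned int)(8 * sizeof(unsigned long) - 1) -
	       (unsigned int)__builtin_clzl(v);
}

/* Plain Euclidean gcd, standing in for gcd() from <linux/gcd.h>. */
static unsigned long gcd_ul(unsigned long a, unsigned long b)
{
	while (b) {
		unsigned long t = a % b;

		a = b;
		b = t;
	}
	return a;
}

/* The patch's computation: gcd, rounded up to a page, then log2. */
static unsigned int split_order_gcd(unsigned long offset, unsigned long length,
				    unsigned long remaining)
{
	unsigned long g = gcd_ul(gcd_ul(offset, length), remaining);
	unsigned long pages = (g + PAGE_SIZE - 1) / PAGE_SIZE;

	return floor_log2(pages);
}

/* Hypothetical alternative: alignment of the lowest set bit across all parts. */
static unsigned int split_order_ffs(unsigned long offset, unsigned long length,
				    unsigned long remaining)
{
	unsigned int shift = lowest_set_bit(offset | length | remaining);

	return shift > PAGE_SHIFT ? shift - (unsigned int)PAGE_SHIFT : 0;
}
```

For a 2M folio cut at 1M, both helpers return order 8; cut at 1M+1
bytes, both return order 0.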

But, if there's any value at all to choosing intermediate orders here in
truncation, I don't think choosing a single order is the right approach -
more easily implemented, yes, but is it worth doing?

What you'd actually want (if anything) is to choose the largest orders
possible, with smaller and smaller orders filling in the rest (I expect
there's a technical name for this, but I don't remember - bin packing
is something else, I think).

As this code stands, truncate a 2M huge page at 1M and you get two 1M
pieces (one then discarded) - nice; but truncate it at 1M+1 and you get
lots of order 2 (forced up from 1) pieces.  Seems weird, and not worth
the effort.
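
A rough userspace sketch of the "largest orders first, smaller ones
filling in" idea, not code from this series. It covers a page range
[start, end) with naturally aligned power-of-two chunks, each as large
as the current alignment and the remaining length allow. The struct
chunk type and decompose_range() are hypothetical names, the
__builtin_* calls are GCC/Clang builtins, and the order-1 restriction is
ignored here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record of one resulting piece: first page index and order. */
struct chunk {
	unsigned long start;
	unsigned int order;
};

/* floor(log2(v)) for v != 0, standing in for the kernel's ilog2(). */
static unsigned int floor_log2(unsigned long v)
{
	assert(v != 0);
	return (unsigned int)(8 * sizeof(unsigned long) - 1) -
	       (unsigned int)__builtin_clzl(v);
}

/*
 * Cover page range [start, end) with naturally aligned chunks.
 * Each chunk's order is limited by the alignment of its start index
 * and by how many pages are left. Returns the number of chunks written.
 */
static size_t decompose_range(unsigned long start, unsigned long end,
			      struct chunk *out, size_t max_out)
{
	size_t n = 0;

	while (start < end) {
		unsigned int order = floor_log2(end - start);

		/* Index 0 is aligned to every order; otherwise cap by ctz. */
		if (start != 0) {
			unsigned int align = (unsigned int)__builtin_ctzl(start);

			if (align < order)
				order = align;
		}

		assert(n < max_out);
		out[n].start = start;
		out[n].order = order;
		n++;
		start += 1UL << order;
	}
	return n;
}
```

With 4K pages, truncating a 2M folio at 1M+1 bytes keeps pages 0..256,
and decompose_range(0, 257, ...) covers them with one order-8 chunk plus
one order-0 chunk.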

Hugh

> +
> +	/* order-1 THP not supported, downgrade to order-0 */
> +	if (new_order == 1)
> +		new_order = 0;
> +
> +
>  	if (folio_has_private(folio))
>  		folio_invalidate(folio, offset, length);
>  	if (!folio_test_large(folio))
>  		return true;
> -	if (split_folio(folio) == 0)
> +	if (split_huge_page_to_list_to_order(&folio->page, NULL, new_order) == 0)
>  		return true;
>  	if (folio_test_dirty(folio))
>  		return false;
> -- 
> 2.39.2


