linux-mm.kvack.org archive mirror
From: Oscar Salvador <osalvador@suse.de>
To: Zi Yan <ziy@nvidia.com>
Cc: David Hildenbrand <david@redhat.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Michael Ellerman <mpe@ellerman.id.au>,
	Christoph Hellwig <hch@lst.de>,
	Marek Szyprowski <m.szyprowski@samsung.com>,
	Robin Murphy <robin.murphy@arm.com>,
	linuxppc-dev@lists.ozlabs.org,
	virtualization@lists.linux-foundation.org,
	iommu@lists.linux-foundation.org,
	Vlastimil Babka <vbabka@suse.cz>,
	Mel Gorman <mgorman@techsingularity.net>,
	Eric Ren <renzhengeek@gmail.com>
Subject: Re: [PATCH v4 4/7] mm: make alloc_contig_range work at pageblock granularity
Date: Fri, 4 Feb 2022 14:56:52 +0100	[thread overview]
Message-ID: <Yf0wpFmtckRRzLFg@localhost.localdomain> (raw)
In-Reply-To: <20220119190623.1029355-5-zi.yan@sent.com>

On Wed, Jan 19, 2022 at 02:06:20PM -0500, Zi Yan wrote:
> From: Zi Yan <ziy@nvidia.com>
> 
> alloc_contig_range() worked at MAX_ORDER-1 granularity to avoid merging
> pageblocks with different migratetypes. It might unnecessarily convert
> extra pageblocks at the beginning and at the end of the range. Change
> alloc_contig_range() to work at pageblock granularity.
> 
> It is done by restoring pageblock types and splitting >pageblock_order
> free pages after isolating at MAX_ORDER-1 granularity and migrating
> pages away at pageblock granularity. The reason for this process is
> that during isolation, some pages, either free or in-use, might have
> >pageblock sizes, and isolating only part of them can cause free page
> accounting issues. Restoring the migratetypes of the pageblocks outside
> the range of interest afterwards is much easier.

Hi Zi Yan,

Due to time constraints I only glanced over it, so here are some comments
about things that caught my eye:

> +static inline void split_free_page_into_pageblocks(struct page *free_page,
> +				int order, struct zone *zone)
> +{
> +	unsigned long pfn;
> +
> +	spin_lock(&zone->lock);
> +	del_page_from_free_list(free_page, zone, order);
> +	for (pfn = page_to_pfn(free_page);
> +	     pfn < page_to_pfn(free_page) + (1UL << order);

It might make sense to have an end_pfn variable so that the end of the
range does not have to be re-evaluated on every iteration. Or maybe the
compiler is clever enough to only evaluate it once.
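
Something like this fragment is what I mean (just a sketch, not
compile-tested; end_pfn would be the new local):

	/* Compute the end of the range once, up front. */
	unsigned long pfn, end_pfn = page_to_pfn(free_page) + (1UL << order);

	...
	for (pfn = page_to_pfn(free_page); pfn < end_pfn;
	     pfn += pageblock_nr_pages) {
		...
	}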

> +	     pfn += pageblock_nr_pages) {
> +		int mt = get_pfnblock_migratetype(pfn_to_page(pfn), pfn);
> +
> +		__free_one_page(pfn_to_page(pfn), pfn, zone, pageblock_order,
> +				mt, FPI_NONE);
> +	}
> +	spin_unlock(&zone->lock);

It is possible that free_page's order is already pageblock_order, so I
would add a one-liner upfront to catch that case and return early;
otherwise we do the delete-from-freelist-and-free-it-again dance for
nothing.
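
I.e. something along these lines at the very top of the function
(untested sketch):

	/* Already pageblock-sized, nothing to split. */
	if (order == pageblock_order)
		return;

Placed before taking zone->lock, it would also spare the lock round-trip
for that case.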

> +	/* Save the migratepages of the pageblocks before start and after end */
> +	num_pageblock_to_save = (alloc_start - isolate_start) / pageblock_nr_pages
> +				+ (isolate_end - alloc_end) / pageblock_nr_pages;
> +	saved_mt =
> +		kmalloc_array(num_pageblock_to_save,
> +			      sizeof(unsigned char), GFP_KERNEL);
> +	if (!saved_mt)
> +		return -ENOMEM;
> +
> +	num = save_migratetypes(saved_mt, isolate_start, alloc_start);
> +
> +	num = save_migratetypes(&saved_mt[num], alloc_end, isolate_end);

I really hope we can put all this magic within start_isolate_page_range,
and the counterparts in undo_isolate_page_range.
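
Ideally alloc_contig_range() would only have to do something like this
(very hand-wavy sketch, arguments elided on purpose):

	ret = start_isolate_page_range(isolate_start, isolate_end, ...);
	if (ret)
		return ret;
	...
	undo_isolate_page_range(isolate_start, isolate_end, ...);

with the saving and restoring of the migratetypes of the pageblocks
outside [alloc_start, alloc_end) kept internal to those two.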

Also, I kinda dislike the &saved_mt thing. I thought about some other
approaches but nothing that wasn't too specific for this case, and I
guess we want that function to be as generic as possible.

> +	/*
> +	 * Split free page spanning [alloc_end, isolate_end) and put the
> +	 * pageblocks in the right migratetype list
> +	 */
> +	for (outer_end = alloc_end; outer_end < isolate_end;) {
> +		unsigned long begin_pfn = outer_end;
> +
> +		order = 0;
> +		while (!PageBuddy(pfn_to_page(outer_end))) {
> +			if (++order >= MAX_ORDER) {
> +				outer_end = begin_pfn;
> +				break;
> +			}
> +			outer_end &= ~0UL << order;
> +		}
> +
> +		if (outer_end != begin_pfn) {
> +			order = buddy_order(pfn_to_page(outer_end));
> +
> +			/*
> +			 * split the free page has start page and put the pageblocks
> +			 * in the right migratetype list
> +			 */
> +			VM_BUG_ON(outer_end + (1UL << order) <= begin_pfn);

How could this possibly happen?

> +			{
> +				struct page *free_page = pfn_to_page(outer_end);
> +
> +				split_free_page_into_pageblocks(free_page, order, cc.zone);
> +			}
> +			outer_end += 1UL << order;
> +		} else
> +			outer_end = begin_pfn + 1;
>  	}

I think there are cases we could optimize for. If the page has already
been split into pageblocks by the outer_start loop, we could skip this
outer_end logic altogether.

E.g.: an order-10 page is split into two pageblocks. There's nothing else
to be done, right? We could skip this.
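
Very roughly, something like this is what I have in mind (untested;
outer_split_end would be a hypothetical local that the outer_start
handling sets to the end pfn of the free page it already split into
pageblocks, and 0 otherwise):

	/*
	 * If the free page split on the outer_start side already covered
	 * everything up to isolate_end, those pageblocks are back on the
	 * right free lists and there is nothing left to do here.
	 */
	if (outer_split_end < isolate_end) {
		/* ... the existing outer_end loop ... */
	}

Not sure the extra bookkeeping is worth it, though.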


-- 
Oscar Salvador
SUSE Labs



Thread overview: 23+ messages
2022-01-19 19:06 [PATCH v4 0/7] Use pageblock_order for cma and alloc_contig_range alignment Zi Yan
2022-01-19 19:06 ` [PATCH v4 1/7] mm: page_alloc: avoid merging non-fallbackable pageblocks with others Zi Yan
2022-01-24 14:02   ` Mel Gorman
2022-01-24 16:12     ` Zi Yan
2022-01-24 16:43       ` Mel Gorman
2022-01-19 19:06 ` [PATCH v4 2/7] mm: page_isolation: move has_unmovable_pages() to mm/page_isolation.c Zi Yan
2022-01-25  6:23   ` Oscar Salvador
2022-01-19 19:06 ` [PATCH v4 3/7] mm: page_isolation: check specified range for unmovable pages Zi Yan
2022-01-24  9:55   ` Oscar Salvador
2022-01-24 17:17     ` Zi Yan
2022-01-25 13:19       ` Oscar Salvador
2022-01-25 13:21         ` Oscar Salvador
2022-01-25 16:31           ` Zi Yan
2022-02-02 12:18   ` Oscar Salvador
2022-02-02 12:25     ` David Hildenbrand
2022-02-02 16:25       ` Zi Yan
2022-02-02 16:35       ` Oscar Salvador
2022-01-19 19:06 ` [PATCH v4 4/7] mm: make alloc_contig_range work at pageblock granularity Zi Yan
2022-02-04 13:56   ` Oscar Salvador [this message]
2022-02-04 15:19     ` Zi Yan
2022-01-19 19:06 ` [PATCH v4 5/7] mm: cma: use pageblock_order as the single alignment Zi Yan
2022-01-19 19:06 ` [PATCH v4 6/7] drivers: virtio_mem: use pageblock size as the minimum virtio_mem size Zi Yan
2022-01-19 19:06 ` [PATCH v4 7/7] arch: powerpc: adjust fadump alignment to be pageblock aligned Zi Yan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Yf0wpFmtckRRzLFg@localhost.localdomain \
    --to=osalvador@suse.de \
    --cc=david@redhat.com \
    --cc=hch@lst.de \
    --cc=iommu@lists.linux-foundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=m.szyprowski@samsung.com \
    --cc=mgorman@techsingularity.net \
    --cc=mpe@ellerman.id.au \
    --cc=renzhengeek@gmail.com \
    --cc=robin.murphy@arm.com \
    --cc=vbabka@suse.cz \
    --cc=virtualization@lists.linux-foundation.org \
    --cc=ziy@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

Be sure your reply has a Subject: header at the top and a blank line before the message body.