From: Vlastimil Babka <vbabka@suse.cz>
To: Johannes Weiner <hannes@cmpxchg.org>,
	Andrew Morton <akpm@linux-foundation.org>
Cc: Mel Gorman <mgorman@techsingularity.net>, Zi Yan <ziy@nvidia.com>,
	"Huang, Ying" <ying.huang@intel.com>,
	David Hildenbrand <david@redhat.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 06/10] mm: page_alloc: fix freelist movement during block conversion
Date: Tue, 26 Mar 2024 12:28:37 +0100
Message-ID: <a0879316-31de-4fec-ad1f-caabbfff2e48@suse.cz>
In-Reply-To: <20240320180429.678181-7-hannes@cmpxchg.org>

On 3/20/24 7:02 PM, Johannes Weiner wrote:
> Currently, page block type conversion during fallbacks, atomic
> reservations and isolation can strand various amounts of free pages on
> incorrect freelists.
> 
> For example, fallback stealing moves free pages in the block to the
> new type's freelists, but then may not actually claim the block for
> that type if there aren't enough compatible pages already allocated.
> 
> In all cases, free page moving might fail if the block straddles more
> than one zone, in which case no free pages are moved at all, but the
> block type is changed anyway.
> 
> This is detrimental to type hygiene on the freelists. It encourages
> incompatible page mixing down the line (ask for one type, get another)
> and thus contributes to long-term fragmentation.
> 
> Split the process into a proper transaction: check first if conversion
> will happen, then try to move the free pages, and only if that was
> successful convert the block to the new type.
> 
> Tested-by: "Huang, Ying" <ying.huang@intel.com>
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
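
To illustrate the ordering described in the changelog, here is a minimal
userspace sketch. It is not kernel code: every toy_* name, the struct layout
and the "half the block" rule are assumptions made up for this example. The
only point it models is that the block type is committed last, and only after
the free page move has succeeded.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model, not kernel code. */
enum toy_type { TOY_UNMOVABLE, TOY_MOVABLE, TOY_NR_TYPES };

struct toy_block {
	enum toy_type type;		/* type recorded for the block */
	int free_on_list[TOY_NR_TYPES];	/* free pages per freelist */
	int compatible_pages;		/* allocated pages compatible with the new type */
	int nr_pages;			/* pages in the block */
	bool straddles_zones;		/* if set, moving free pages fails */
};

/* Step 1: decide whether conversion should happen; no side effects. */
static bool toy_should_convert(const struct toy_block *b)
{
	int free_pages = 0;

	for (int t = 0; t < TOY_NR_TYPES; t++)
		free_pages += b->free_on_list[t];
	return free_pages + b->compatible_pages >= b->nr_pages / 2;
}

/* Step 2: move every free page to the new type's list, or move nothing. */
static bool toy_move_free_pages(struct toy_block *b, enum toy_type new_type)
{
	if (b->straddles_zones)
		return false;
	for (int t = 0; t < TOY_NR_TYPES; t++) {
		if (t == (int)new_type)
			continue;
		b->free_on_list[new_type] += b->free_on_list[t];
		b->free_on_list[t] = 0;
	}
	return true;
}

/* Steps 1-3 as one transaction: the type changes only if the move worked. */
static bool toy_convert_block(struct toy_block *b, enum toy_type new_type)
{
	assert(new_type < TOY_NR_TYPES);
	if (!toy_should_convert(b))
		return false;
	if (!toy_move_free_pages(b, new_type))
		return false;
	b->type = new_type;	/* step 3: commit */
	return true;
}
```

In this model a block with straddles_zones set keeps both its type and its
freelist contents when toy_convert_block() fails, which is the property the
changelog asks for.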

Reviewed-by: Vlastimil Babka <vbabka@suse.cz>

Nit below:

> @@ -1743,33 +1770,37 @@ static inline bool boost_watermark(struct zone *zone)
>  }
>  
>  /*
> - * This function implements actual steal behaviour. If order is large enough,
> - * we can steal whole pageblock. If not, we first move freepages in this
> - * pageblock to our migratetype and determine how many already-allocated pages
> - * are there in the pageblock with a compatible migratetype. If at least half
> - * of pages are free or compatible, we can change migratetype of the pageblock
> - * itself, so pages freed in the future will be put on the correct free list.
> + * This function implements actual steal behaviour. If order is large enough, we
> + * can claim the whole pageblock for the requested migratetype. If not, we check
> + * the pageblock for constituent pages; if at least half of the pages are free
> + * or compatible, we can still claim the whole block, so pages freed in the
> + * future will be put on the correct free list. Otherwise, we isolate exactly
> + * the order we need from the fallback block and leave its migratetype alone.
>   */
> -static void steal_suitable_fallback(struct zone *zone, struct page *page,
> -		unsigned int alloc_flags, int start_type, bool whole_block)
> +static struct page *
> +steal_suitable_fallback(struct zone *zone, struct page *page,
> +			int current_order, int order, int start_type,
> +			unsigned int alloc_flags, bool whole_block)
>  {
> -	unsigned int current_order = buddy_order(page);
>  	int free_pages, movable_pages, alike_pages;
> -	int old_block_type;
> +	unsigned long start_pfn, end_pfn;
> +	int block_type;
>  
> -	old_block_type = get_pageblock_migratetype(page);
> +	block_type = get_pageblock_migratetype(page);
>  
>  	/*
>  	 * This can happen due to races and we want to prevent broken
>  	 * highatomic accounting.
>  	 */
> -	if (is_migrate_highatomic(old_block_type))
> +	if (is_migrate_highatomic(block_type))
>  		goto single_page;
>  
>  	/* Take ownership for orders >= pageblock_order */
>  	if (current_order >= pageblock_order) {
> +		del_page_from_free_list(page, zone, current_order);
>  		change_pageblock_range(page, current_order, start_type);
> -		goto single_page;
> +		expand(zone, page, order, current_order, start_type);
> +		return page;

Is the exact order here important (AFAIK it shouldn't be)? Or could we just do
change_pageblock_range(); block_type = start_type; goto single_page?
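
For illustration only, here is a toy userspace model of the two orderings. The
toy_* helpers are hypothetical stand-ins for del_page_from_free_list(),
change_pageblock_range() and expand(); they record state and nothing else.
Whether the reordering is safe in the real allocator is exactly the open
question, so this only shows that the two sequences end in the same state
under the model's assumptions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy state, not kernel code. */
struct toy_state {
	int block_type;		/* migratetype recorded for the pageblock */
	bool on_free_list;	/* is the page still on a freelist? */
	int remainder_type;	/* freelist type that received the split-off remainder */
};

/* Stand-in for del_page_from_free_list(). */
static void toy_del_from_free_list(struct toy_state *s)
{
	assert(s->on_free_list);
	s->on_free_list = false;
}

/* Stand-in for change_pageblock_range(). */
static void toy_change_block_type(struct toy_state *s, int type)
{
	s->block_type = type;
}

/* Stand-in for expand(): the page must already be off the freelist. */
static void toy_expand(struct toy_state *s, int type)
{
	assert(!s->on_free_list);
	s->remainder_type = type;
}

/* Order used in the patch: delete, retype, expand, return. */
static void toy_claim_as_posted(struct toy_state *s, int start_type)
{
	toy_del_from_free_list(s);
	toy_change_block_type(s, start_type);
	toy_expand(s, start_type);
}

/* Suggested order: retype, update the local, then reuse the single_page tail. */
static void toy_claim_via_single_page(struct toy_state *s, int start_type)
{
	int block_type = s->block_type;

	toy_change_block_type(s, start_type);
	block_type = start_type;
	/* single_page: */
	toy_del_from_free_list(s);
	toy_expand(s, block_type);
}
```

Both functions leave the block typed as start_type, the page off the freelist,
and the remainder on the start_type list; the second one just avoids
duplicating the tail.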

>  	}
>  
>  	/*
> @@ -1784,10 +1815,9 @@ static void steal_suitable_fallback(struct zone *zone, struct page *page,
>  	if (!whole_block)
>  		goto single_page;
>  
> -	free_pages = move_freepages_block(zone, page, start_type,
> -						&movable_pages);
>  	/* moving whole block can fail due to zone boundary conditions */
> -	if (!free_pages)
> +	if (!prep_move_freepages_block(zone, page, &start_pfn, &end_pfn,
> +				       &free_pages, &movable_pages))
>  		goto single_page;
>  
>  	/*
> @@ -1805,7 +1835,7 @@ static void steal_suitable_fallback(struct zone *zone, struct page *page,
>  		 * vice versa, be conservative since we can't distinguish the
>  		 * exact migratetype of non-movable pages.
>  		 */
> -		if (old_block_type == MIGRATE_MOVABLE)
> +		if (block_type == MIGRATE_MOVABLE)
>  			alike_pages = pageblock_nr_pages
>  						- (free_pages + movable_pages);
>  		else
> @@ -1816,13 +1846,16 @@ static void steal_suitable_fallback(struct zone *zone, struct page *page,
>  	 * compatible migratability as our allocation, claim the whole block.
>  	 */
>  	if (free_pages + alike_pages >= (1 << (pageblock_order-1)) ||
> -			page_group_by_mobility_disabled)
> +			page_group_by_mobility_disabled) {
> +		move_freepages(zone, start_pfn, end_pfn, start_type);
>  		set_pageblock_migratetype(page, start_type);
> -
> -	return;
> +		return __rmqueue_smallest(zone, order, start_type);
> +	}
>  
>  single_page:
> -	move_to_free_list(page, zone, current_order, start_type);
> +	del_page_from_free_list(page, zone, current_order);
> +	expand(zone, page, order, current_order, block_type);
> +	return page;
>  }
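
As a worked example of the claim threshold in the hunks above, here is a small
standalone sketch. It assumes pageblock_order is 9 (512 pages per block),
which is configuration dependent. The two alike_pages branches that fall
outside the quoted hunk (start type movable, and the final else) are filled in
as an assumption about the surrounding code. The toy_* names are hypothetical,
and the page_group_by_mobility_disabled override is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: 512-page pageblocks; the real value depends on the config. */
enum { TOY_PAGEBLOCK_ORDER = 9 };
enum { TOY_PAGEBLOCK_NR_PAGES = 1 << TOY_PAGEBLOCK_ORDER };

/*
 * Count allocated pages treated as compatible with the requested type.
 * Only the block_is_movable branch is visible in the quoted diff; the
 * other two branches are assumptions for this sketch.
 */
static int toy_alike_pages(bool start_is_movable, bool block_is_movable,
			   int free_pages, int movable_pages)
{
	if (start_is_movable)
		return movable_pages;
	if (block_is_movable)
		return TOY_PAGEBLOCK_NR_PAGES - (free_pages + movable_pages);
	return 0;	/* conservative: cannot tell non-movable types apart */
}

/* Claim the whole block if at least half of it is free or compatible. */
static bool toy_claims_whole_block(int free_pages, int alike_pages)
{
	return free_pages + alike_pages >= (1 << (TOY_PAGEBLOCK_ORDER - 1));
}
```

With these assumptions the threshold is 256 pages. An unmovable request
falling back to a movable block with 100 free and 200 movable pages counts
512 - 300 = 212 alike pages, and 100 + 212 = 312 clears the threshold. The
same counts in a non-movable block give 0 alike pages, so only the requested
order is taken and the block keeps its type.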


