* [PATCH v4 0/7] Use pageblock_order for cma and alloc_contig_range alignment.
@ 2022-01-19 19:06 ` Zi Yan
  0 siblings, 0 replies; 73+ messages in thread
From: Zi Yan @ 2022-01-19 19:06 UTC (permalink / raw)
  To: David Hildenbrand, linux-mm
  Cc: linux-kernel, Michael Ellerman, Christoph Hellwig,
	Marek Szyprowski, Robin Murphy, linuxppc-dev, virtualization,
	iommu, Vlastimil Babka, Mel Gorman, Eric Ren, Zi Yan

From: Zi Yan <ziy@nvidia.com>

Hi all,

This patchset tries to remove the MAX_ORDER-1 alignment requirement for CMA
and alloc_contig_range(). It prepares for my upcoming changes to make
MAX_ORDER adjustable at boot time[1]. It is on top of mmotm-2021-12-29-20-07.

Changelog from RFC
===
1. Dropped two irrelevant patches on non-lru compound page handling, as
   it is not supported upstream.
2. Renamed migratetype_has_fallback() to migratetype_is_mergeable().
3. Always check whether two pageblocks can be merged in
   __free_one_page() when the order is >= pageblock_order, as the
   non-mergeable cases (isolated, CMA, and HIGHATOMIC pageblocks) become
   more common.
4. Moving has_unmovable_pages() is now a separate patch.
5. Removed the MAX_ORDER-1 alignment requirement in the comment in the
   virtio_mem code.

Description
===

The MAX_ORDER - 1 alignment requirement comes from the fact that
alloc_contig_range() isolates pageblocks to remove free memory from the buddy
allocator, but isolating only a subset of the pageblocks within a free page
spanning multiple pageblocks causes free page accounting issues: an isolated
page might not be put on the right free list, since the code assumes the
migratetype of the first pageblock is the migratetype of the whole free page.
This is based on the discussion at [2].

To remove the requirement, this patchset:
1. still isolates pageblocks at MAX_ORDER - 1 granularity;
2. but saves the pageblock migratetypes outside the specified range of
   alloc_contig_range() and restores them after all pages within the range
   become free after __alloc_contig_migrate_range();
3. only checks unmovable pages within the range instead of the MAX_ORDER - 1
   aligned range during isolation, to avoid alloc_contig_range() failure when
   pageblocks within a MAX_ORDER - 1 aligned range are allocated separately;
4. splits free pages spanning multiple pageblocks at the beginning and the end
   of the range and puts the split pages on the right migratetype free lists
   based on the pageblock migratetypes;
5. returns pages not in the range as it did before.

Isolation needs to be done at MAX_ORDER - 1 granularity, because otherwise
either 1) the to-be-isolated page size (free, PageHuge, THP, or other
PageCompound) would need to be detected to make sure all pageblocks belonging
to a single page are isolated together, with the pageblock migratetypes outside
the range restored later, or 2) assuming isolation happens at pageblock
granularity, a free page with multi-migratetype pageblocks could be seen in the
free page path and would need to be split and freed at pageblock granularity.

One optimization might come later:
1. make MIGRATE_ISOLATE a separate bit to avoid saving and restoring existing
   migratetypes before and after isolation respectively.

Feel free to give comments and suggestions. Thanks.

[1] https://lore.kernel.org/linux-mm/20210805190253.2795604-1-zi.yan@sent.com/
[2] https://lore.kernel.org/linux-mm/d19fb078-cb9b-f60f-e310-fdeea1b947d2@redhat.com/


Zi Yan (7):
  mm: page_alloc: avoid merging non-fallbackable pageblocks with others.
  mm: page_isolation: move has_unmovable_pages() to mm/page_isolation.c
  mm: page_isolation: check specified range for unmovable pages
  mm: make alloc_contig_range work at pageblock granularity
  mm: cma: use pageblock_order as the single alignment
  drivers: virtio_mem: use pageblock size as the minimum virtio_mem
    size.
  arch: powerpc: adjust fadump alignment to be pageblock aligned.

 arch/powerpc/include/asm/fadump-internal.h |   4 +-
 drivers/virtio/virtio_mem.c                |   7 +-
 include/linux/mmzone.h                     |  16 +-
 include/linux/page-isolation.h             |   3 +-
 kernel/dma/contiguous.c                    |   2 +-
 mm/cma.c                                   |   6 +-
 mm/memory_hotplug.c                        |  12 +-
 mm/page_alloc.c                            | 337 +++++++++++----------
 mm/page_isolation.c                        | 154 +++++++++-
 9 files changed, 352 insertions(+), 189 deletions(-)

-- 
2.34.1


* Re: [PATCH v4 3/7] mm: page_isolation: check specified range for unmovable pages
@ 2022-01-22  8:32 kernel test robot
  0 siblings, 0 replies; 73+ messages in thread
From: kernel test robot @ 2022-01-22  8:32 UTC (permalink / raw)
  To: kbuild


CC: llvm@lists.linux.dev
CC: kbuild-all@lists.01.org
In-Reply-To: <20220119190623.1029355-4-zi.yan@sent.com>
References: <20220119190623.1029355-4-zi.yan@sent.com>
TO: Zi Yan <zi.yan@sent.com>

Hi Zi,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on linux/master]
[also build test WARNING on linus/master hnaz-mm/master v5.16 next-20220121]
[cannot apply to powerpc/next]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Zi-Yan/Use-pageblock_order-for-cma-and-alloc_contig_range-alignment/20220120-032623
base:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 99613159ad749543621da8238acf1a122880144e
:::::: branch date: 3 days ago
:::::: commit date: 3 days ago
config: x86_64-randconfig-c007 (https://download.01.org/0day-ci/archive/20220122/202201221659.XbJv2v5A-lkp(a)intel.com/config)
compiler: clang version 14.0.0 (https://github.com/llvm/llvm-project f7b7138a62648f4019c55e4671682af1f851f295)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/34f568f50dcc72fc76e85b137621abc7dedb1cc5
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Zi-Yan/Use-pageblock_order-for-cma-and-alloc_contig_range-alignment/20220120-032623
        git checkout 34f568f50dcc72fc76e85b137621abc7dedb1cc5
        # save the config file to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=x86_64 clang-analyzer 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>


clang-analyzer warnings: (new ones prefixed by >>)
   fs/xfs/libxfs/xfs_iext_tree.c:655:6: note: Assuming 'new' is equal to field 'leaf'
           if (cur->leaf != new && cur->pos == 0 && nr_entries > 0) {
               ^~~~~~~~~~~~~~~~
   fs/xfs/libxfs/xfs_iext_tree.c:655:23: note: Left side of '&&' is false
           if (cur->leaf != new && cur->pos == 0 && nr_entries > 0) {
                                ^
   fs/xfs/libxfs/xfs_iext_tree.c:660:23: note: Assuming 'i' is <= field 'pos'
           for (i = nr_entries; i > cur->pos; i--)
                                ^~~~~~~~~~~~
   fs/xfs/libxfs/xfs_iext_tree.c:660:2: note: Loop condition is false. Execution continues on line 662
           for (i = nr_entries; i > cur->pos; i--)
           ^
   fs/xfs/libxfs/xfs_iext_tree.c:667:6: note: Assuming 'new' is non-null
           if (new)
               ^~~
   fs/xfs/libxfs/xfs_iext_tree.c:667:2: note: Taking true branch
           if (new)
           ^
   fs/xfs/libxfs/xfs_iext_tree.c:668:3: note: Calling 'xfs_iext_insert_node'
                   xfs_iext_insert_node(ifp, xfs_iext_leaf_key(new, 0), new, 2);
                   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   fs/xfs/libxfs/xfs_iext_tree.c:503:6: note: Assuming 'level' is <= field 'if_height'
           if (ifp->if_height < level)
               ^~~~~~~~~~~~~~~~~~~~~~
   fs/xfs/libxfs/xfs_iext_tree.c:503:2: note: Taking false branch
           if (ifp->if_height < level)
           ^
   fs/xfs/libxfs/xfs_iext_tree.c:507:2: note: Value assigned to 'node'
           node = xfs_iext_find_level(ifp, offset, level);
           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   fs/xfs/libxfs/xfs_iext_tree.c:511:9: note: Assuming 'pos' is >= 'nr_entries'
           ASSERT(pos >= nr_entries || xfs_iext_key_cmp(node, pos, offset) != 0);
                  ^
   fs/xfs/xfs_linux.h:208:10: note: expanded from macro 'ASSERT'
           (likely(expr) ? (void)0 : assfail(NULL, #expr, __FILE__, __LINE__))
                   ^~~~
   include/linux/compiler.h:77:40: note: expanded from macro 'likely'
   # define likely(x)      __builtin_expect(!!(x), 1)
                                               ^
   fs/xfs/libxfs/xfs_iext_tree.c:511:27: note: Left side of '||' is true
           ASSERT(pos >= nr_entries || xfs_iext_key_cmp(node, pos, offset) != 0);
                                    ^
   fs/xfs/libxfs/xfs_iext_tree.c:511:2: note: '?' condition is true
           ASSERT(pos >= nr_entries || xfs_iext_key_cmp(node, pos, offset) != 0);
           ^
   fs/xfs/xfs_linux.h:208:3: note: expanded from macro 'ASSERT'
           (likely(expr) ? (void)0 : assfail(NULL, #expr, __FILE__, __LINE__))
            ^
   include/linux/compiler.h:77:20: note: expanded from macro 'likely'
   # define likely(x)      __builtin_expect(!!(x), 1)
                           ^
   fs/xfs/libxfs/xfs_iext_tree.c:512:9: note: Assuming 'nr_entries' is <= KEYS_PER_NODE
           ASSERT(nr_entries <= KEYS_PER_NODE);
                  ^
   fs/xfs/xfs_linux.h:208:10: note: expanded from macro 'ASSERT'
           (likely(expr) ? (void)0 : assfail(NULL, #expr, __FILE__, __LINE__))
                   ^~~~
   include/linux/compiler.h:77:40: note: expanded from macro 'likely'
   # define likely(x)      __builtin_expect(!!(x), 1)
                                               ^
   fs/xfs/libxfs/xfs_iext_tree.c:512:2: note: '?' condition is true
           ASSERT(nr_entries <= KEYS_PER_NODE);
           ^
   fs/xfs/xfs_linux.h:208:3: note: expanded from macro 'ASSERT'
           (likely(expr) ? (void)0 : assfail(NULL, #expr, __FILE__, __LINE__))
            ^
   include/linux/compiler.h:77:20: note: expanded from macro 'likely'
   # define likely(x)      __builtin_expect(!!(x), 1)
                           ^
   fs/xfs/libxfs/xfs_iext_tree.c:514:6: note: Assuming 'nr_entries' is not equal to KEYS_PER_NODE
           if (nr_entries == KEYS_PER_NODE)
               ^~~~~~~~~~~~~~~~~~~~~~~~~~~
   fs/xfs/libxfs/xfs_iext_tree.c:514:2: note: Taking false branch
           if (nr_entries == KEYS_PER_NODE)
           ^
   fs/xfs/libxfs/xfs_iext_tree.c:521:6: note: Assuming 'node' is equal to 'new'
           if (node != new && pos == 0 && nr_entries > 0)
               ^~~~~~~~~~~
   fs/xfs/libxfs/xfs_iext_tree.c:521:18: note: Left side of '&&' is false
           if (node != new && pos == 0 && nr_entries > 0)
                           ^
   fs/xfs/libxfs/xfs_iext_tree.c:524:23: note: 'i' is <= 'pos'
           for (i = nr_entries; i > pos; i--) {
                                ^
   fs/xfs/libxfs/xfs_iext_tree.c:524:2: note: Loop condition is false. Execution continues on line 528
           for (i = nr_entries; i > pos; i--) {
           ^
   fs/xfs/libxfs/xfs_iext_tree.c:528:18: note: Array access (via field 'keys') results in a null pointer dereference
           node->keys[pos] = offset;
                 ~~~~      ^
   Suppressed 6 warnings (6 in non-user code).
   Use -header-filter=.* to display errors from all non-system headers. Use -system-headers to display errors from system headers as well.
   4 warnings generated.
   Suppressed 4 warnings (4 in non-user code).
   Use -header-filter=.* to display errors from all non-system headers. Use -system-headers to display errors from system headers as well.
   5 warnings generated.
   Suppressed 5 warnings (5 in non-user code).
   Use -header-filter=.* to display errors from all non-system headers. Use -system-headers to display errors from system headers as well.
   6 warnings generated.
>> mm/page_isolation.c:40:2: warning: Value stored to 'page' is never read [clang-analyzer-deadcode.DeadStores]
           page = pfn_to_page(pfn);
           ^
   mm/page_isolation.c:40:2: note: Value stored to 'page' is never read
   Suppressed 5 warnings (5 in non-user code).
   Use -header-filter=.* to display errors from all non-system headers. Use -system-headers to display errors from system headers as well.
   5 warnings generated.
   Suppressed 5 warnings (5 in non-user code).
   Use -header-filter=.* to display errors from all non-system headers. Use -system-headers to display errors from system headers as well.
   5 warnings generated.
   Suppressed 5 warnings (5 in non-user code).
   Use -header-filter=.* to display errors from all non-system headers. Use -system-headers to display errors from system headers as well.
   8 warnings generated.
   include/linux/list.h:72:13: warning: Access to field 'prev' results in a dereference of a null pointer (loaded from variable 'next') [clang-analyzer-core.NullDereference]
           next->prev = new;
                      ^
   mm/zsmalloc.c:2302:16: note: Calling 'zs_compact'
           pages_freed = zs_compact(pool);
                         ^~~~~~~~~~~~~~~~
   mm/zsmalloc.c:2270:11: note: '?' condition is false
           for (i = ZS_SIZE_CLASSES - 1; i >= 0; i--) {
                    ^
   mm/zsmalloc.c:150:59: note: expanded from macro 'ZS_SIZE_CLASSES'
   #define ZS_SIZE_CLASSES (DIV_ROUND_UP(ZS_MAX_ALLOC_SIZE - ZS_MIN_ALLOC_SIZE, \
                                                             ^
   mm/zsmalloc.c:132:52: note: expanded from macro 'ZS_MIN_ALLOC_SIZE'
           MAX(32, (ZS_MAX_PAGES_PER_ZSPAGE << PAGE_SHIFT >> OBJ_INDEX_BITS))
                                                             ^
   mm/zsmalloc.c:121:41: note: expanded from macro 'OBJ_INDEX_BITS'
   #define OBJ_INDEX_BITS  (BITS_PER_LONG - _PFN_BITS - OBJ_TAG_BITS)
                                            ^
   note: (skipping 1 expansions in backtrace; use -fmacro-backtrace-limit=0 to see all)
   mm/zsmalloc.c:91:35: note: expanded from macro 'MAX_POSSIBLE_PHYSMEM_BITS'
   #define MAX_POSSIBLE_PHYSMEM_BITS MAX_PHYSMEM_BITS
                                     ^
   arch/x86/include/asm/sparsemem.h:27:28: note: expanded from macro 'MAX_PHYSMEM_BITS'
   # define MAX_PHYSMEM_BITS       (pgtable_l5_enabled() ? 52 : 46)
                                    ^
   arch/x86/include/asm/pgtable_64_types.h:40:30: note: expanded from macro 'pgtable_l5_enabled'
   #define pgtable_l5_enabled() 0
                                ^
   mm/zsmalloc.c:2270:11: note: '?' condition is true
           for (i = ZS_SIZE_CLASSES - 1; i >= 0; i--) {
                    ^
   mm/zsmalloc.c:150:59: note: expanded from macro 'ZS_SIZE_CLASSES'
   #define ZS_SIZE_CLASSES (DIV_ROUND_UP(ZS_MAX_ALLOC_SIZE - ZS_MIN_ALLOC_SIZE, \
                                                             ^
   mm/zsmalloc.c:132:2: note: expanded from macro 'ZS_MIN_ALLOC_SIZE'
           MAX(32, (ZS_MAX_PAGES_PER_ZSPAGE << PAGE_SHIFT >> OBJ_INDEX_BITS))
           ^
   mm/zsmalloc.c:129:20: note: expanded from macro 'MAX'
   #define MAX(a, b) ((a) >= (b) ? (a) : (b))
                      ^
   mm/zsmalloc.c:2270:2: note: Loop condition is true.  Entering loop body
           for (i = ZS_SIZE_CLASSES - 1; i >= 0; i--) {
           ^
   mm/zsmalloc.c:2272:7: note: Assuming 'class' is non-null
                   if (!class)
                       ^~~~~~
   mm/zsmalloc.c:2272:3: note: Taking false branch
                   if (!class)
                   ^
   mm/zsmalloc.c:2274:7: note: Assuming 'i' is equal to field 'index'
                   if (class->index != i)
                       ^~~~~~~~~~~~~~~~~
   mm/zsmalloc.c:2274:3: note: Taking false branch
                   if (class->index != i)
                   ^
   mm/zsmalloc.c:2276:18: note: Calling '__zs_compact'
                   pages_freed += __zs_compact(pool, class);
                                  ^~~~~~~~~~~~~~~~~~~~~~~~~
   mm/zsmalloc.c:2221:2: note: Calling 'spin_lock'
           spin_lock(&class->lock);
           ^~~~~~~~~~~~~~~~~~~~~~~
   include/linux/spinlock.h:349:2: note: Value assigned to field 'next'
           raw_spin_lock(&lock->rlock);
           ^
   include/linux/spinlock.h:215:29: note: expanded from macro 'raw_spin_lock'
   #define raw_spin_lock(lock)     _raw_spin_lock(lock)
                                   ^~~~~~~~~~~~~~~~~~~~
   mm/zsmalloc.c:2221:2: note: Returning from 'spin_lock'
           spin_lock(&class->lock);
           ^~~~~~~~~~~~~~~~~~~~~~~
   mm/zsmalloc.c:2222:2: note: Loop condition is true.  Entering loop body
           while ((src_zspage = isolate_zspage(class, true))) {
           ^
   mm/zsmalloc.c:2224:7: note: Assuming the condition is true
                   if (!zs_can_compact(class))
                       ^~~~~~~~~~~~~~~~~~~~~~
   mm/zsmalloc.c:2224:3: note: Taking true branch
                   if (!zs_can_compact(class))
                   ^
   mm/zsmalloc.c:2225:4: note:  Execution continues on line 2256
                           break;
                           ^
   mm/zsmalloc.c:2256:6: note: 'src_zspage' is non-null
           if (src_zspage)
               ^~~~~~~~~~
   mm/zsmalloc.c:2256:2: note: Taking true branch
           if (src_zspage)
           ^

vim +/page +40 mm/page_isolation.c

0f0848e5118a4c Joonsoo Kim 2016-01-14   17  
1edc0e02d51603 Zi Yan      2022-01-19   18  /*
34f568f50dcc72 Zi Yan      2022-01-19   19   * This function checks whether pageblock within [start_pfn, end_pfn) includes
34f568f50dcc72 Zi Yan      2022-01-19   20   * unmovable pages or not.
1edc0e02d51603 Zi Yan      2022-01-19   21   *
1edc0e02d51603 Zi Yan      2022-01-19   22   * PageLRU check without isolation or lru_lock could race so that
1edc0e02d51603 Zi Yan      2022-01-19   23   * MIGRATE_MOVABLE block might include unmovable pages. And __PageMovable
1edc0e02d51603 Zi Yan      2022-01-19   24   * check without lock_page also may miss some movable non-lru pages at
1edc0e02d51603 Zi Yan      2022-01-19   25   * race condition. So you can't expect this function should be exact.
1edc0e02d51603 Zi Yan      2022-01-19   26   *
1edc0e02d51603 Zi Yan      2022-01-19   27   * Returns a page without holding a reference. If the caller wants to
1edc0e02d51603 Zi Yan      2022-01-19   28   * dereference that page (e.g., dumping), it has to make sure that it
1edc0e02d51603 Zi Yan      2022-01-19   29   * cannot get removed (e.g., via memory unplug) concurrently.
1edc0e02d51603 Zi Yan      2022-01-19   30   *
1edc0e02d51603 Zi Yan      2022-01-19   31   */
1edc0e02d51603 Zi Yan      2022-01-19   32  static struct page *has_unmovable_pages(struct zone *zone, struct page *page,
34f568f50dcc72 Zi Yan      2022-01-19   33  				 int migratetype, int flags,
34f568f50dcc72 Zi Yan      2022-01-19   34  				 unsigned long start_pfn, unsigned long end_pfn)
1edc0e02d51603 Zi Yan      2022-01-19   35  {
34f568f50dcc72 Zi Yan      2022-01-19   36  	unsigned long first_pfn = max(page_to_pfn(page), start_pfn);
34f568f50dcc72 Zi Yan      2022-01-19   37  	unsigned long pfn = first_pfn;
34f568f50dcc72 Zi Yan      2022-01-19   38  	unsigned long last_pfn = min(ALIGN(pfn + 1, pageblock_nr_pages), end_pfn);
34f568f50dcc72 Zi Yan      2022-01-19   39  
34f568f50dcc72 Zi Yan      2022-01-19  @40  	page = pfn_to_page(pfn);
1edc0e02d51603 Zi Yan      2022-01-19   41  
1edc0e02d51603 Zi Yan      2022-01-19   42  	if (is_migrate_cma_page(page)) {
1edc0e02d51603 Zi Yan      2022-01-19   43  		/*
1edc0e02d51603 Zi Yan      2022-01-19   44  		 * CMA allocations (alloc_contig_range) really need to mark
1edc0e02d51603 Zi Yan      2022-01-19   45  		 * isolate CMA pageblocks even when they are not movable in fact
1edc0e02d51603 Zi Yan      2022-01-19   46  		 * so consider them movable here.
1edc0e02d51603 Zi Yan      2022-01-19   47  		 */
1edc0e02d51603 Zi Yan      2022-01-19   48  		if (is_migrate_cma(migratetype))
1edc0e02d51603 Zi Yan      2022-01-19   49  			return NULL;
1edc0e02d51603 Zi Yan      2022-01-19   50  
1edc0e02d51603 Zi Yan      2022-01-19   51  		return page;
1edc0e02d51603 Zi Yan      2022-01-19   52  	}
1edc0e02d51603 Zi Yan      2022-01-19   53  
34f568f50dcc72 Zi Yan      2022-01-19   54  	for (pfn = first_pfn; pfn < last_pfn; pfn++) {
34f568f50dcc72 Zi Yan      2022-01-19   55  		page = pfn_to_page(pfn);
1edc0e02d51603 Zi Yan      2022-01-19   56  
1edc0e02d51603 Zi Yan      2022-01-19   57  		/*
1edc0e02d51603 Zi Yan      2022-01-19   58  		 * Both, bootmem allocations and memory holes are marked
1edc0e02d51603 Zi Yan      2022-01-19   59  		 * PG_reserved and are unmovable. We can even have unmovable
1edc0e02d51603 Zi Yan      2022-01-19   60  		 * allocations inside ZONE_MOVABLE, for example when
1edc0e02d51603 Zi Yan      2022-01-19   61  		 * specifying "movablecore".
1edc0e02d51603 Zi Yan      2022-01-19   62  		 */
1edc0e02d51603 Zi Yan      2022-01-19   63  		if (PageReserved(page))
1edc0e02d51603 Zi Yan      2022-01-19   64  			return page;
1edc0e02d51603 Zi Yan      2022-01-19   65  
1edc0e02d51603 Zi Yan      2022-01-19   66  		/*
1edc0e02d51603 Zi Yan      2022-01-19   67  		 * If the zone is movable and we have ruled out all reserved
1edc0e02d51603 Zi Yan      2022-01-19   68  		 * pages then it should be reasonably safe to assume the rest
1edc0e02d51603 Zi Yan      2022-01-19   69  		 * is movable.
1edc0e02d51603 Zi Yan      2022-01-19   70  		 */
1edc0e02d51603 Zi Yan      2022-01-19   71  		if (zone_idx(zone) == ZONE_MOVABLE)
1edc0e02d51603 Zi Yan      2022-01-19   72  			continue;
1edc0e02d51603 Zi Yan      2022-01-19   73  
1edc0e02d51603 Zi Yan      2022-01-19   74  		/*
1edc0e02d51603 Zi Yan      2022-01-19   75  		 * Hugepages are not in LRU lists, but they're movable.
1edc0e02d51603 Zi Yan      2022-01-19   76  		 * THPs are on the LRU, but need to be counted as #small pages.
1edc0e02d51603 Zi Yan      2022-01-19   77  		 * We need not scan over tail pages because we don't
1edc0e02d51603 Zi Yan      2022-01-19   78  		 * handle each tail page individually in migration.
1edc0e02d51603 Zi Yan      2022-01-19   79  		 */
1edc0e02d51603 Zi Yan      2022-01-19   80  		if (PageHuge(page) || PageTransCompound(page)) {
1edc0e02d51603 Zi Yan      2022-01-19   81  			struct page *head = compound_head(page);
1edc0e02d51603 Zi Yan      2022-01-19   82  			unsigned int skip_pages;
1edc0e02d51603 Zi Yan      2022-01-19   83  
1edc0e02d51603 Zi Yan      2022-01-19   84  			if (PageHuge(page)) {
1edc0e02d51603 Zi Yan      2022-01-19   85  				if (!hugepage_migration_supported(page_hstate(head)))
1edc0e02d51603 Zi Yan      2022-01-19   86  					return page;
1edc0e02d51603 Zi Yan      2022-01-19   87  			} else if (!PageLRU(head) && !__PageMovable(head)) {
1edc0e02d51603 Zi Yan      2022-01-19   88  				return page;
1edc0e02d51603 Zi Yan      2022-01-19   89  			}
1edc0e02d51603 Zi Yan      2022-01-19   90  
1edc0e02d51603 Zi Yan      2022-01-19   91  			skip_pages = compound_nr(head) - (page - head);
34f568f50dcc72 Zi Yan      2022-01-19   92  			pfn += skip_pages - 1;
1edc0e02d51603 Zi Yan      2022-01-19   93  			continue;
1edc0e02d51603 Zi Yan      2022-01-19   94  		}
1edc0e02d51603 Zi Yan      2022-01-19   95  
1edc0e02d51603 Zi Yan      2022-01-19   96  		/*
1edc0e02d51603 Zi Yan      2022-01-19   97  		 * We can't use page_count without pin a page
1edc0e02d51603 Zi Yan      2022-01-19   98  		 * because another CPU can free compound page.
1edc0e02d51603 Zi Yan      2022-01-19   99  		 * This check already skips compound tails of THP
1edc0e02d51603 Zi Yan      2022-01-19  100  		 * because their page->_refcount is zero at all time.
1edc0e02d51603 Zi Yan      2022-01-19  101  		 */
1edc0e02d51603 Zi Yan      2022-01-19  102  		if (!page_ref_count(page)) {
1edc0e02d51603 Zi Yan      2022-01-19  103  			if (PageBuddy(page))
34f568f50dcc72 Zi Yan      2022-01-19  104  				pfn += (1 << buddy_order(page)) - 1;
1edc0e02d51603 Zi Yan      2022-01-19  105  			continue;
1edc0e02d51603 Zi Yan      2022-01-19  106  		}
1edc0e02d51603 Zi Yan      2022-01-19  107  
1edc0e02d51603 Zi Yan      2022-01-19  108  		/*
1edc0e02d51603 Zi Yan      2022-01-19  109  		 * The HWPoisoned page may be not in buddy system, and
1edc0e02d51603 Zi Yan      2022-01-19  110  		 * page_count() is not 0.
1edc0e02d51603 Zi Yan      2022-01-19  111  		 */
1edc0e02d51603 Zi Yan      2022-01-19  112  		if ((flags & MEMORY_OFFLINE) && PageHWPoison(page))
1edc0e02d51603 Zi Yan      2022-01-19  113  			continue;
1edc0e02d51603 Zi Yan      2022-01-19  114  
1edc0e02d51603 Zi Yan      2022-01-19  115  		/*
1edc0e02d51603 Zi Yan      2022-01-19  116  		 * We treat all PageOffline() pages as movable when offlining
1edc0e02d51603 Zi Yan      2022-01-19  117  		 * to give drivers a chance to decrement their reference count
1edc0e02d51603 Zi Yan      2022-01-19  118  		 * in MEM_GOING_OFFLINE in order to indicate that these pages
1edc0e02d51603 Zi Yan      2022-01-19  119  		 * can be offlined as there are no direct references anymore.
1edc0e02d51603 Zi Yan      2022-01-19  120  		 * For actually unmovable PageOffline() where the driver does
1edc0e02d51603 Zi Yan      2022-01-19  121  		 * not support this, we will fail later when trying to actually
1edc0e02d51603 Zi Yan      2022-01-19  122  		 * move these pages that still have a reference count > 0.
1edc0e02d51603 Zi Yan      2022-01-19  123  		 * (false negatives in this function only)
1edc0e02d51603 Zi Yan      2022-01-19  124  		 */
1edc0e02d51603 Zi Yan      2022-01-19  125  		if ((flags & MEMORY_OFFLINE) && PageOffline(page))
1edc0e02d51603 Zi Yan      2022-01-19  126  			continue;
1edc0e02d51603 Zi Yan      2022-01-19  127  
1edc0e02d51603 Zi Yan      2022-01-19  128  		if (__PageMovable(page) || PageLRU(page))
1edc0e02d51603 Zi Yan      2022-01-19  129  			continue;
1edc0e02d51603 Zi Yan      2022-01-19  130  
1edc0e02d51603 Zi Yan      2022-01-19  131  		/*
1edc0e02d51603 Zi Yan      2022-01-19  132  		 * If there are RECLAIMABLE pages, we need to check
1edc0e02d51603 Zi Yan      2022-01-19  133  		 * it.  But now, memory offline itself doesn't call
1edc0e02d51603 Zi Yan      2022-01-19  134  		 * shrink_node_slabs() and it still to be fixed.
1edc0e02d51603 Zi Yan      2022-01-19  135  		 */
1edc0e02d51603 Zi Yan      2022-01-19  136  		return page;
1edc0e02d51603 Zi Yan      2022-01-19  137  	}
1edc0e02d51603 Zi Yan      2022-01-19  138  	return NULL;
1edc0e02d51603 Zi Yan      2022-01-19  139  }
1edc0e02d51603 Zi Yan      2022-01-19  140  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org


end of thread, other threads:[~2022-02-04 15:21 UTC | newest]

Thread overview: 73+ messages
2022-01-19 19:06 [PATCH v4 0/7] Use pageblock_order for cma and alloc_contig_range alignment Zi Yan
2022-01-19 19:06 ` [PATCH v4 1/7] mm: page_alloc: avoid merging non-fallbackable pageblocks with others Zi Yan
2022-01-24 14:02   ` Mel Gorman
2022-01-24 16:12     ` Zi Yan
2022-01-24 16:43       ` Mel Gorman
2022-01-19 19:06 ` [PATCH v4 2/7] mm: page_isolation: move has_unmovable_pages() to mm/page_isolation.c Zi Yan
2022-01-25  6:23   ` Oscar Salvador
2022-01-19 19:06 ` [PATCH v4 3/7] mm: page_isolation: check specified range for unmovable pages Zi Yan
2022-01-24  9:55   ` Oscar Salvador
2022-01-24 17:17     ` Zi Yan
2022-01-25 13:19       ` Oscar Salvador
2022-01-25 13:21         ` Oscar Salvador
2022-01-25 16:31           ` Zi Yan
2022-02-02 12:18   ` Oscar Salvador
2022-02-02 12:25     ` David Hildenbrand
2022-02-02 16:25       ` Zi Yan
2022-02-02 16:35       ` Oscar Salvador
2022-01-19 19:06 ` [PATCH v4 4/7] mm: make alloc_contig_range work at pageblock granularity Zi Yan
2022-02-04 13:56   ` Oscar Salvador
2022-02-04 15:19     ` Zi Yan
2022-01-19 19:06 ` [PATCH v4 5/7] mm: cma: use pageblock_order as the single alignment Zi Yan
2022-01-19 19:06 ` [PATCH v4 6/7] drivers: virtio_mem: use pageblock size as the minimum virtio_mem size Zi Yan
2022-01-19 19:06 ` [PATCH v4 7/7] arch: powerpc: adjust fadump alignment to be pageblock aligned Zi Yan
2022-01-22  8:32 [PATCH v4 3/7] mm: page_isolation: check specified range for unmovable pages kernel test robot
