linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* Re: [PATCH 1/2] mm: disable LRU pagevec during the migration temporarily
       [not found] <20210302210949.2440120-1-minchan@kernel.org>
@ 2021-03-03 12:49 ` Michal Hocko
  2021-03-03 20:23   ` Minchan Kim
  2021-03-03 13:38 ` kernel test robot
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 9+ messages in thread
From: Michal Hocko @ 2021-03-03 12:49 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Andrew Morton, linux-mm, LKML, joaodias, surenb, cgoldswo, willy,
	david, vbabka, linux-fsdevel

On Tue 02-03-21 13:09:48, Minchan Kim wrote:
> LRU pagevec holds refcount of pages until the pagevec are drained.
> It could prevent migration since the refcount of the page is greater
> than the expection in migration logic. To mitigate the issue,
> callers of migrate_pages drains LRU pagevec via migrate_prep or
> lru_add_drain_all before migrate_pages call.
> 
> However, it's not enough because pages coming into pagevec after the
> draining call still could stay at the pagevec so it could keep
> preventing page migration. Since some callers of migrate_pages have
> retrial logic with LRU draining, the page would migrate at next trail
> but it is still fragile in that it doesn't close the fundamental race
> between upcoming LRU pages into pagvec and migration so the migration
> failure could cause contiguous memory allocation failure in the end.
> 
> To close the race, this patch disables lru caches(i.e, pagevec)
> during ongoing migration until migrate is done.
> 
> Since it's really hard to reproduce, I measured how many times
> migrate_pages retried with force mode below debug code.
> 
> int migrate_pages(struct list_head *from, new_page_t get_new_page,
> 			..
> 			..
> 
> if (rc && reason == MR_CONTIG_RANGE && pass > 2) {
>        printk(KERN_ERR, "pfn 0x%lx reason %d\n", page_to_pfn(page), rc);
>        dump_page(page, "fail to migrate");
> }
> 
> The test was repeating android apps launching with cma allocation
> in background every five seconds. Total cma allocation count was
> about 500 during the testing. With this patch, the dump_page count
> was reduced from 400 to 30.

Have you seen any improvement on the CMA allocation success rate?

> Signed-off-by: Minchan Kim <minchan@kernel.org>
> ---
> * from RFC - http://lore.kernel.org/linux-mm/20210216170348.1513483-1-minchan@kernel.org
>   * use atomic and lru_add_drain_all for strict ordering - mhocko
>   * lru_cache_disable/enable - mhocko
> 
>  fs/block_dev.c          |  2 +-
>  include/linux/migrate.h |  6 +++--
>  include/linux/swap.h    |  4 ++-
>  mm/compaction.c         |  4 +--
>  mm/fadvise.c            |  2 +-
>  mm/gup.c                |  2 +-
>  mm/khugepaged.c         |  2 +-
>  mm/ksm.c                |  2 +-
>  mm/memcontrol.c         |  4 +--
>  mm/memfd.c              |  2 +-
>  mm/memory-failure.c     |  2 +-
>  mm/memory_hotplug.c     |  2 +-
>  mm/mempolicy.c          |  6 +++++
>  mm/migrate.c            | 15 ++++++-----
>  mm/page_alloc.c         |  5 +++-
>  mm/swap.c               | 55 +++++++++++++++++++++++++++++++++++------
>  16 files changed, 85 insertions(+), 30 deletions(-)

The churn seems to be quite big for something that should have been a
very small change. Have you considered not changing lru_add_drain_all
but rather introduce __lru_add_dain_all that would implement the
enforced flushing?

[...]
> +static atomic_t lru_disable_count = ATOMIC_INIT(0);
> +
> +bool lru_cache_disabled(void)
> +{
> +	return atomic_read(&lru_disable_count);
> +}
> +
> +void lru_cache_disable(void)
> +{
> +	/*
> +	 * lru_add_drain_all's IPI will make sure no new pages are added
> +	 * to the pcp lists and drain them all.
> +	 */
> +	atomic_inc(&lru_disable_count);

As already mentioned in the last review. The IPI reference is more
cryptic than useful. I would go with something like this instead

	/*
	 * lru_add_drain_all in the force mode will schedule draining on
	 * all online CPUs so any calls of lru_cache_disabled wrapped by
	 * local_lock or preemption disabled would be  ordered by that.
	 * The atomic operation doesn't need to have stronger ordering
	 * requirements because that is enforece by the scheduling
	 * guarantees.
	 */
> +
> +	/*
> +	 * Clear the LRU lists so pages can be isolated.
> +	 */
> +	lru_add_drain_all(true);
> +}
-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/2] mm: disable LRU pagevec during the migration temporarily
       [not found] <20210302210949.2440120-1-minchan@kernel.org>
  2021-03-03 12:49 ` [PATCH 1/2] mm: disable LRU pagevec during the migration temporarily Michal Hocko
@ 2021-03-03 13:38 ` kernel test robot
  2021-03-03 15:11 ` kernel test robot
  2021-03-03 18:12 ` kernel test robot
  3 siblings, 0 replies; 9+ messages in thread
From: kernel test robot @ 2021-03-03 13:38 UTC (permalink / raw)
  To: Minchan Kim, Andrew Morton
  Cc: kbuild-all, clang-built-linux, Linux Memory Management List,
	LKML, joaodias, surenb, cgoldswo, willy, mhocko, david, vbabka

[-- Attachment #1: Type: text/plain, Size: 15696 bytes --]

Hi Minchan,

I love your patch! Yet something to improve:

[auto build test ERROR on block/for-next]
[also build test ERROR on linux/master linus/master v5.12-rc1 next-20210303]
[cannot apply to hnaz-linux-mm/master]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Minchan-Kim/mm-disable-LRU-pagevec-during-the-migration-temporarily/20210303-191809
base:   https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next
config: s390-randconfig-r034-20210303 (attached as .config)
compiler: clang version 13.0.0 (https://github.com/llvm/llvm-project a7cad6680b4087eff8994f1f99ac40c661a6621f)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install s390 cross compiling tool for clang build
        # apt-get install binutils-s390x-linux-gnu
        # https://github.com/0day-ci/linux/commit/6e669beb75caae92c613a012734b1a2dc9485524
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Minchan-Kim/mm-disable-LRU-pagevec-during-the-migration-temporarily/20210303-191809
        git checkout 6e669beb75caae92c613a012734b1a2dc9485524
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=s390 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   #define __le32_to_cpu(x) __swab32((__force __u32)(__le32)(x))
                                                             ^
   include/uapi/linux/swab.h:119:21: note: expanded from macro '__swab32'
           ___constant_swab32(x) :                 \
                              ^
   include/uapi/linux/swab.h:19:12: note: expanded from macro '___constant_swab32'
           (((__u32)(x) & (__u32)0x000000ffUL) << 24) |            \
                     ^
   In file included from arch/s390/kernel/uv.c:14:
   In file included from include/linux/memblock.h:14:
   In file included from arch/s390/include/asm/dma.h:5:
   In file included from arch/s390/include/asm/io.h:80:
   include/asm-generic/io.h:490:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           val = __le32_to_cpu((__le32 __force)__raw_readl(PCI_IOBASE + addr));
                                                           ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/big_endian.h:34:59: note: expanded from macro '__le32_to_cpu'
   #define __le32_to_cpu(x) __swab32((__force __u32)(__le32)(x))
                                                             ^
   include/uapi/linux/swab.h:119:21: note: expanded from macro '__swab32'
           ___constant_swab32(x) :                 \
                              ^
   include/uapi/linux/swab.h:20:12: note: expanded from macro '___constant_swab32'
           (((__u32)(x) & (__u32)0x0000ff00UL) <<  8) |            \
                     ^
   In file included from arch/s390/kernel/uv.c:14:
   In file included from include/linux/memblock.h:14:
   In file included from arch/s390/include/asm/dma.h:5:
   In file included from arch/s390/include/asm/io.h:80:
   include/asm-generic/io.h:490:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           val = __le32_to_cpu((__le32 __force)__raw_readl(PCI_IOBASE + addr));
                                                           ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/big_endian.h:34:59: note: expanded from macro '__le32_to_cpu'
   #define __le32_to_cpu(x) __swab32((__force __u32)(__le32)(x))
                                                             ^
   include/uapi/linux/swab.h:119:21: note: expanded from macro '__swab32'
           ___constant_swab32(x) :                 \
                              ^
   include/uapi/linux/swab.h:21:12: note: expanded from macro '___constant_swab32'
           (((__u32)(x) & (__u32)0x00ff0000UL) >>  8) |            \
                     ^
   In file included from arch/s390/kernel/uv.c:14:
   In file included from include/linux/memblock.h:14:
   In file included from arch/s390/include/asm/dma.h:5:
   In file included from arch/s390/include/asm/io.h:80:
   include/asm-generic/io.h:490:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           val = __le32_to_cpu((__le32 __force)__raw_readl(PCI_IOBASE + addr));
                                                           ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/big_endian.h:34:59: note: expanded from macro '__le32_to_cpu'
   #define __le32_to_cpu(x) __swab32((__force __u32)(__le32)(x))
                                                             ^
   include/uapi/linux/swab.h:119:21: note: expanded from macro '__swab32'
           ___constant_swab32(x) :                 \
                              ^
   include/uapi/linux/swab.h:22:12: note: expanded from macro '___constant_swab32'
           (((__u32)(x) & (__u32)0xff000000UL) >> 24)))
                     ^
   In file included from arch/s390/kernel/uv.c:14:
   In file included from include/linux/memblock.h:14:
   In file included from arch/s390/include/asm/dma.h:5:
   In file included from arch/s390/include/asm/io.h:80:
   include/asm-generic/io.h:490:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           val = __le32_to_cpu((__le32 __force)__raw_readl(PCI_IOBASE + addr));
                                                           ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/big_endian.h:34:59: note: expanded from macro '__le32_to_cpu'
   #define __le32_to_cpu(x) __swab32((__force __u32)(__le32)(x))
                                                             ^
   include/uapi/linux/swab.h:120:12: note: expanded from macro '__swab32'
           __fswab32(x))
                     ^
   In file included from arch/s390/kernel/uv.c:14:
   In file included from include/linux/memblock.h:14:
   In file included from arch/s390/include/asm/dma.h:5:
   In file included from arch/s390/include/asm/io.h:80:
   include/asm-generic/io.h:501:33: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           __raw_writeb(value, PCI_IOBASE + addr);
                               ~~~~~~~~~~ ^
   include/asm-generic/io.h:511:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           __raw_writew((u16 __force)cpu_to_le16(value), PCI_IOBASE + addr);
                                                         ~~~~~~~~~~ ^
   include/asm-generic/io.h:521:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           __raw_writel((u32 __force)cpu_to_le32(value), PCI_IOBASE + addr);
                                                         ~~~~~~~~~~ ^
   include/asm-generic/io.h:609:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           readsb(PCI_IOBASE + addr, buffer, count);
                  ~~~~~~~~~~ ^
   include/asm-generic/io.h:617:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           readsw(PCI_IOBASE + addr, buffer, count);
                  ~~~~~~~~~~ ^
   include/asm-generic/io.h:625:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           readsl(PCI_IOBASE + addr, buffer, count);
                  ~~~~~~~~~~ ^
   include/asm-generic/io.h:634:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           writesb(PCI_IOBASE + addr, buffer, count);
                   ~~~~~~~~~~ ^
   include/asm-generic/io.h:643:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           writesw(PCI_IOBASE + addr, buffer, count);
                   ~~~~~~~~~~ ^
   include/asm-generic/io.h:652:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           writesl(PCI_IOBASE + addr, buffer, count);
                   ~~~~~~~~~~ ^
>> arch/s390/kernel/uv.c:272:22: error: too few arguments to function call, single argument 'force_all_cpus' was not specified
                           lru_add_drain_all();
                           ~~~~~~~~~~~~~~~~~ ^
   include/linux/swap.h:347:13: note: 'lru_add_drain_all' declared here
   extern void lru_add_drain_all(bool force_all_cpus);
               ^
   20 warnings and 1 error generated.


vim +/force_all_cpus +272 arch/s390/kernel/uv.c

214d9bbcd3a672 Claudio Imbrenda  2020-01-21  212  
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  213  /*
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  214   * Requests the Ultravisor to make a page accessible to a guest.
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  215   * If it's brought in the first time, it will be cleared. If
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  216   * it has been exported before, it will be decrypted and integrity
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  217   * checked.
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  218   */
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  219  int gmap_make_secure(struct gmap *gmap, unsigned long gaddr, void *uvcb)
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  220  {
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  221  	struct vm_area_struct *vma;
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  222  	bool local_drain = false;
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  223  	spinlock_t *ptelock;
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  224  	unsigned long uaddr;
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  225  	struct page *page;
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  226  	pte_t *ptep;
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  227  	int rc;
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  228  
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  229  again:
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  230  	rc = -EFAULT;
d8ed45c5dcd455 Michel Lespinasse 2020-06-08  231  	mmap_read_lock(gmap->mm);
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  232  
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  233  	uaddr = __gmap_translate(gmap, gaddr);
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  234  	if (IS_ERR_VALUE(uaddr))
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  235  		goto out;
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  236  	vma = find_vma(gmap->mm, uaddr);
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  237  	if (!vma)
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  238  		goto out;
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  239  	/*
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  240  	 * Secure pages cannot be huge and userspace should not combine both.
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  241  	 * In case userspace does it anyway this will result in an -EFAULT for
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  242  	 * the unpack. The guest is thus never reaching secure mode. If
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  243  	 * userspace is playing dirty tricky with mapping huge pages later
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  244  	 * on this will result in a segmentation fault.
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  245  	 */
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  246  	if (is_vm_hugetlb_page(vma))
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  247  		goto out;
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  248  
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  249  	rc = -ENXIO;
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  250  	page = follow_page(vma, uaddr, FOLL_WRITE);
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  251  	if (IS_ERR_OR_NULL(page))
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  252  		goto out;
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  253  
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  254  	lock_page(page);
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  255  	ptep = get_locked_pte(gmap->mm, uaddr, &ptelock);
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  256  	rc = make_secure_pte(ptep, uaddr, page, uvcb);
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  257  	pte_unmap_unlock(ptep, ptelock);
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  258  	unlock_page(page);
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  259  out:
d8ed45c5dcd455 Michel Lespinasse 2020-06-08  260  	mmap_read_unlock(gmap->mm);
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  261  
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  262  	if (rc == -EAGAIN) {
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  263  		wait_on_page_writeback(page);
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  264  	} else if (rc == -EBUSY) {
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  265  		/*
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  266  		 * If we have tried a local drain and the page refcount
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  267  		 * still does not match our expected safe value, try with a
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  268  		 * system wide drain. This is needed if the pagevecs holding
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  269  		 * the page are on a different CPU.
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  270  		 */
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  271  		if (local_drain) {
214d9bbcd3a672 Claudio Imbrenda  2020-01-21 @272  			lru_add_drain_all();
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  273  			/* We give up here, and let the caller try again */
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  274  			return -EAGAIN;
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  275  		}
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  276  		/*
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  277  		 * We are here if the page refcount does not match the
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  278  		 * expected safe value. The main culprits are usually
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  279  		 * pagevecs. With lru_add_drain() we drain the pagevecs
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  280  		 * on the local CPU so that hopefully the refcount will
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  281  		 * reach the expected safe value.
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  282  		 */
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  283  		lru_add_drain();
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  284  		local_drain = true;
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  285  		/* And now we try again immediately after draining */
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  286  		goto again;
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  287  	} else if (rc == -ENXIO) {
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  288  		if (gmap_fault(gmap, gaddr, FAULT_FLAG_WRITE))
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  289  			return -EFAULT;
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  290  		return -EAGAIN;
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  291  	}
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  292  	return rc;
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  293  }
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  294  EXPORT_SYMBOL_GPL(gmap_make_secure);
214d9bbcd3a672 Claudio Imbrenda  2020-01-21  295  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 24383 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/2] mm: disable LRU pagevec during the migration temporarily
       [not found] <20210302210949.2440120-1-minchan@kernel.org>
  2021-03-03 12:49 ` [PATCH 1/2] mm: disable LRU pagevec during the migration temporarily Michal Hocko
  2021-03-03 13:38 ` kernel test robot
@ 2021-03-03 15:11 ` kernel test robot
  2021-03-03 18:12 ` kernel test robot
  3 siblings, 0 replies; 9+ messages in thread
From: kernel test robot @ 2021-03-03 15:11 UTC (permalink / raw)
  To: Minchan Kim, Andrew Morton
  Cc: kbuild-all, Linux Memory Management List, LKML, joaodias, surenb,
	cgoldswo, willy, mhocko, david, vbabka

[-- Attachment #1: Type: text/plain, Size: 11653 bytes --]

Hi Minchan,

I love your patch! Yet something to improve:

[auto build test ERROR on block/for-next]
[also build test ERROR on linux/master linus/master v5.12-rc1 next-20210303]
[cannot apply to hnaz-linux-mm/master]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Minchan-Kim/mm-disable-LRU-pagevec-during-the-migration-temporarily/20210303-191809
base:   https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next
config: nios2-randconfig-r032-20210303 (attached as .config)
compiler: nios2-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/6e669beb75caae92c613a012734b1a2dc9485524
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Minchan-Kim/mm-disable-LRU-pagevec-during-the-migration-temporarily/20210303-191809
        git checkout 6e669beb75caae92c613a012734b1a2dc9485524
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=nios2 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   mm/swap.c:59:6: warning: no previous prototype for 'lru_cache_disabled' [-Wmissing-prototypes]
      59 | bool lru_cache_disabled(void)
         |      ^~~~~~~~~~~~~~~~~~
>> mm/swap.c:871:6: error: conflicting types for 'lru_add_drain_all'
     871 | void lru_add_drain_all(void)
         |      ^~~~~~~~~~~~~~~~~
   In file included from mm/swap.c:20:
   include/linux/swap.h:347:13: note: previous declaration of 'lru_add_drain_all' was here
     347 | extern void lru_add_drain_all(bool force_all_cpus);
         |             ^~~~~~~~~~~~~~~~~

Kconfig warnings: (for reference only)
   WARNING: unmet direct dependencies detected for SERIAL_CORE_CONSOLE
   Depends on TTY && HAS_IOMEM
   Selected by
   - EARLY_PRINTK


vim +/lru_add_drain_all +871 mm/swap.c

053837fce7aa79 Nick Piggin           2006-01-18  768  
9852a7212324fd Michal Hocko          2018-01-31  769  /*
9852a7212324fd Michal Hocko          2018-01-31  770   * Doesn't need any cpu hotplug locking because we do rely on per-cpu
9852a7212324fd Michal Hocko          2018-01-31  771   * kworkers being shut down before our page_alloc_cpu_dead callback is
9852a7212324fd Michal Hocko          2018-01-31  772   * executed on the offlined cpu.
9852a7212324fd Michal Hocko          2018-01-31  773   * Calling this function with cpu hotplug locks held can actually lead
9852a7212324fd Michal Hocko          2018-01-31  774   * to obscure indirect dependencies via WQ context.
9852a7212324fd Michal Hocko          2018-01-31  775   */
6e669beb75caae Minchan Kim           2021-03-02  776  void lru_add_drain_all(bool force_all_cpus)
053837fce7aa79 Nick Piggin           2006-01-18  777  {
6446a5131e24a8 Ahmed S. Darwish      2020-08-27  778  	/*
6446a5131e24a8 Ahmed S. Darwish      2020-08-27  779  	 * lru_drain_gen - Global pages generation number
6446a5131e24a8 Ahmed S. Darwish      2020-08-27  780  	 *
6446a5131e24a8 Ahmed S. Darwish      2020-08-27  781  	 * (A) Definition: global lru_drain_gen = x implies that all generations
6446a5131e24a8 Ahmed S. Darwish      2020-08-27  782  	 *     0 < n <= x are already *scheduled* for draining.
6446a5131e24a8 Ahmed S. Darwish      2020-08-27  783  	 *
6446a5131e24a8 Ahmed S. Darwish      2020-08-27  784  	 * This is an optimization for the highly-contended use case where a
6446a5131e24a8 Ahmed S. Darwish      2020-08-27  785  	 * user space workload keeps constantly generating a flow of pages for
6446a5131e24a8 Ahmed S. Darwish      2020-08-27  786  	 * each CPU.
6446a5131e24a8 Ahmed S. Darwish      2020-08-27  787  	 */
6446a5131e24a8 Ahmed S. Darwish      2020-08-27  788  	static unsigned int lru_drain_gen;
5fbc461636c32e Chris Metcalf         2013-09-12  789  	static struct cpumask has_work;
6446a5131e24a8 Ahmed S. Darwish      2020-08-27  790  	static DEFINE_MUTEX(lock);
6446a5131e24a8 Ahmed S. Darwish      2020-08-27  791  	unsigned cpu, this_gen;
5fbc461636c32e Chris Metcalf         2013-09-12  792  
ce612879ddc78e Michal Hocko          2017-04-07  793  	/*
ce612879ddc78e Michal Hocko          2017-04-07  794  	 * Make sure nobody triggers this path before mm_percpu_wq is fully
ce612879ddc78e Michal Hocko          2017-04-07  795  	 * initialized.
ce612879ddc78e Michal Hocko          2017-04-07  796  	 */
ce612879ddc78e Michal Hocko          2017-04-07  797  	if (WARN_ON(!mm_percpu_wq))
ce612879ddc78e Michal Hocko          2017-04-07  798  		return;
ce612879ddc78e Michal Hocko          2017-04-07  799  
6446a5131e24a8 Ahmed S. Darwish      2020-08-27  800  	/*
6446a5131e24a8 Ahmed S. Darwish      2020-08-27  801  	 * Guarantee pagevec counter stores visible by this CPU are visible to
6446a5131e24a8 Ahmed S. Darwish      2020-08-27  802  	 * other CPUs before loading the current drain generation.
6446a5131e24a8 Ahmed S. Darwish      2020-08-27  803  	 */
6446a5131e24a8 Ahmed S. Darwish      2020-08-27  804  	smp_mb();
6446a5131e24a8 Ahmed S. Darwish      2020-08-27  805  
6446a5131e24a8 Ahmed S. Darwish      2020-08-27  806  	/*
6446a5131e24a8 Ahmed S. Darwish      2020-08-27  807  	 * (B) Locally cache global LRU draining generation number
6446a5131e24a8 Ahmed S. Darwish      2020-08-27  808  	 *
6446a5131e24a8 Ahmed S. Darwish      2020-08-27  809  	 * The read barrier ensures that the counter is loaded before the mutex
6446a5131e24a8 Ahmed S. Darwish      2020-08-27  810  	 * is taken. It pairs with smp_mb() inside the mutex critical section
6446a5131e24a8 Ahmed S. Darwish      2020-08-27  811  	 * at (D).
6446a5131e24a8 Ahmed S. Darwish      2020-08-27  812  	 */
6446a5131e24a8 Ahmed S. Darwish      2020-08-27  813  	this_gen = smp_load_acquire(&lru_drain_gen);
eef1a429f234f8 Konstantin Khlebnikov 2019-11-30  814  
5fbc461636c32e Chris Metcalf         2013-09-12  815  	mutex_lock(&lock);
eef1a429f234f8 Konstantin Khlebnikov 2019-11-30  816  
eef1a429f234f8 Konstantin Khlebnikov 2019-11-30  817  	/*
6446a5131e24a8 Ahmed S. Darwish      2020-08-27  818  	 * (C) Exit the draining operation if a newer generation, from another
6446a5131e24a8 Ahmed S. Darwish      2020-08-27  819  	 * lru_add_drain_all(), was already scheduled for draining. Check (A).
eef1a429f234f8 Konstantin Khlebnikov 2019-11-30  820  	 */
6446a5131e24a8 Ahmed S. Darwish      2020-08-27  821  	if (unlikely(this_gen != lru_drain_gen))
eef1a429f234f8 Konstantin Khlebnikov 2019-11-30  822  		goto done;
eef1a429f234f8 Konstantin Khlebnikov 2019-11-30  823  
6446a5131e24a8 Ahmed S. Darwish      2020-08-27  824  	/*
6446a5131e24a8 Ahmed S. Darwish      2020-08-27  825  	 * (D) Increment global generation number
6446a5131e24a8 Ahmed S. Darwish      2020-08-27  826  	 *
6446a5131e24a8 Ahmed S. Darwish      2020-08-27  827  	 * Pairs with smp_load_acquire() at (B), outside of the critical
6446a5131e24a8 Ahmed S. Darwish      2020-08-27  828  	 * section. Use a full memory barrier to guarantee that the new global
6446a5131e24a8 Ahmed S. Darwish      2020-08-27  829  	 * drain generation number is stored before loading pagevec counters.
6446a5131e24a8 Ahmed S. Darwish      2020-08-27  830  	 *
6446a5131e24a8 Ahmed S. Darwish      2020-08-27  831  	 * This pairing must be done here, before the for_each_online_cpu loop
6446a5131e24a8 Ahmed S. Darwish      2020-08-27  832  	 * below which drains the page vectors.
6446a5131e24a8 Ahmed S. Darwish      2020-08-27  833  	 *
6446a5131e24a8 Ahmed S. Darwish      2020-08-27  834  	 * Let x, y, and z represent some system CPU numbers, where x < y < z.
6446a5131e24a8 Ahmed S. Darwish      2020-08-27  835  	 * Assume CPU #z is is in the middle of the for_each_online_cpu loop
6446a5131e24a8 Ahmed S. Darwish      2020-08-27  836  	 * below and has already reached CPU #y's per-cpu data. CPU #x comes
6446a5131e24a8 Ahmed S. Darwish      2020-08-27  837  	 * along, adds some pages to its per-cpu vectors, then calls
6446a5131e24a8 Ahmed S. Darwish      2020-08-27  838  	 * lru_add_drain_all().
6446a5131e24a8 Ahmed S. Darwish      2020-08-27  839  	 *
6446a5131e24a8 Ahmed S. Darwish      2020-08-27  840  	 * If the paired barrier is done at any later step, e.g. after the
6446a5131e24a8 Ahmed S. Darwish      2020-08-27  841  	 * loop, CPU #x will just exit at (C) and miss flushing out all of its
6446a5131e24a8 Ahmed S. Darwish      2020-08-27  842  	 * added pages.
6446a5131e24a8 Ahmed S. Darwish      2020-08-27  843  	 */
6446a5131e24a8 Ahmed S. Darwish      2020-08-27  844  	WRITE_ONCE(lru_drain_gen, lru_drain_gen + 1);
6446a5131e24a8 Ahmed S. Darwish      2020-08-27  845  	smp_mb();
eef1a429f234f8 Konstantin Khlebnikov 2019-11-30  846  
5fbc461636c32e Chris Metcalf         2013-09-12  847  	cpumask_clear(&has_work);
5fbc461636c32e Chris Metcalf         2013-09-12  848  	for_each_online_cpu(cpu) {
5fbc461636c32e Chris Metcalf         2013-09-12  849  		struct work_struct *work = &per_cpu(lru_add_drain_work, cpu);
5fbc461636c32e Chris Metcalf         2013-09-12  850  
6e669beb75caae Minchan Kim           2021-03-02  851  		if (force_all_cpus ||
6e669beb75caae Minchan Kim           2021-03-02  852  		    pagevec_count(&per_cpu(lru_pvecs.lru_add, cpu)) ||
7e0cc01ea181c6 Qian Cai              2020-08-14  853  		    data_race(pagevec_count(&per_cpu(lru_rotate.pvec, cpu))) ||
b01b2141999936 Ingo Molnar           2020-05-27  854  		    pagevec_count(&per_cpu(lru_pvecs.lru_deactivate_file, cpu)) ||
b01b2141999936 Ingo Molnar           2020-05-27  855  		    pagevec_count(&per_cpu(lru_pvecs.lru_deactivate, cpu)) ||
b01b2141999936 Ingo Molnar           2020-05-27  856  		    pagevec_count(&per_cpu(lru_pvecs.lru_lazyfree, cpu)) ||
5fbc461636c32e Chris Metcalf         2013-09-12  857  		    need_activate_page_drain(cpu)) {
5fbc461636c32e Chris Metcalf         2013-09-12  858  			INIT_WORK(work, lru_add_drain_per_cpu);
ce612879ddc78e Michal Hocko          2017-04-07  859  			queue_work_on(cpu, mm_percpu_wq, work);
6446a5131e24a8 Ahmed S. Darwish      2020-08-27  860  			__cpumask_set_cpu(cpu, &has_work);
5fbc461636c32e Chris Metcalf         2013-09-12  861  		}
5fbc461636c32e Chris Metcalf         2013-09-12  862  	}
5fbc461636c32e Chris Metcalf         2013-09-12  863  
5fbc461636c32e Chris Metcalf         2013-09-12  864  	for_each_cpu(cpu, &has_work)
5fbc461636c32e Chris Metcalf         2013-09-12  865  		flush_work(&per_cpu(lru_add_drain_work, cpu));
5fbc461636c32e Chris Metcalf         2013-09-12  866  
eef1a429f234f8 Konstantin Khlebnikov 2019-11-30  867  done:
5fbc461636c32e Chris Metcalf         2013-09-12  868  	mutex_unlock(&lock);
053837fce7aa79 Nick Piggin           2006-01-18  869  }
6ea183d60c4695 Michal Hocko          2019-02-20  870  #else
6ea183d60c4695 Michal Hocko          2019-02-20 @871  void lru_add_drain_all(void)
6ea183d60c4695 Michal Hocko          2019-02-20  872  {
6ea183d60c4695 Michal Hocko          2019-02-20  873  	lru_add_drain();
6ea183d60c4695 Michal Hocko          2019-02-20  874  }
6446a5131e24a8 Ahmed S. Darwish      2020-08-27  875  #endif /* CONFIG_SMP */
053837fce7aa79 Nick Piggin           2006-01-18  876  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 24755 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/2] mm: disable LRU pagevec during the migration temporarily
       [not found] <20210302210949.2440120-1-minchan@kernel.org>
                   ` (2 preceding siblings ...)
  2021-03-03 15:11 ` kernel test robot
@ 2021-03-03 18:12 ` kernel test robot
  3 siblings, 0 replies; 9+ messages in thread
From: kernel test robot @ 2021-03-03 18:12 UTC (permalink / raw)
  To: Minchan Kim, Andrew Morton
  Cc: kbuild-all, clang-built-linux, Linux Memory Management List,
	LKML, joaodias, surenb, cgoldswo, willy, mhocko, david, vbabka

[-- Attachment #1: Type: text/plain, Size: 10538 bytes --]

Hi Minchan,

I love your patch! Yet something to improve:

[auto build test ERROR on block/for-next]
[also build test ERROR on linux/master linus/master v5.12-rc1 next-20210303]
[cannot apply to hnaz-linux-mm/master]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Minchan-Kim/mm-disable-LRU-pagevec-during-the-migration-temporarily/20210303-191809
base:   https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next
config: arm-randconfig-r021-20210303 (attached as .config)
compiler: clang version 13.0.0 (https://github.com/llvm/llvm-project a7cad6680b4087eff8994f1f99ac40c661a6621f)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install arm cross compiling tool for clang build
        # apt-get install binutils-arm-linux-gnueabi
        # https://github.com/0day-ci/linux/commit/6e669beb75caae92c613a012734b1a2dc9485524
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Minchan-Kim/mm-disable-LRU-pagevec-during-the-migration-temporarily/20210303-191809
        git checkout 6e669beb75caae92c613a012734b1a2dc9485524
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=arm 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   In file included from fs/iomap/buffered-io.c:19:
>> include/linux/migrate.h:70:41: error: void function 'migrate_prep' should not return a value [-Wreturn-type]
   static inline void migrate_prep(void) { return -ENOSYS; }
                                           ^      ~~~~~~~
>> include/linux/migrate.h:71:47: error: void function 'migrate_prep_local' should not return a value [-Wreturn-type]
   static inline void migrate_prep_local(void) { return -ENOSYS; }
                                                 ^      ~~~~~~~
   2 errors generated.
--
   In file included from drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:40:
   In file included from drivers/gpu/drm/amd/amdgpu/../amdgpu/amdgpu.h:86:
   In file included from drivers/gpu/drm/amd/amdgpu/../amdgpu/amdgpu_mn.h:28:
   In file included from include/linux/hmm.h:16:
>> include/linux/migrate.h:70:41: error: void function 'migrate_prep' should not return a value [-Wreturn-type]
   static inline void migrate_prep(void) { return -ENOSYS; }
                                           ^      ~~~~~~~
>> include/linux/migrate.h:71:47: error: void function 'migrate_prep_local' should not return a value [-Wreturn-type]
   static inline void migrate_prep_local(void) { return -ENOSYS; }
                                                 ^      ~~~~~~~
   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1143:18: warning: result of comparison of constant 4294967296 with expression of type 'resource_size_t' (aka 'unsigned int') is always false [-Wtautological-constant-out-of-range-compare]
                       res->start > 0x100000000ull)
                       ~~~~~~~~~~ ^ ~~~~~~~~~~~~~~
   include/linux/compiler.h:56:47: note: expanded from macro 'if'
   #define if(cond, ...) if ( __trace_if_var( !!(cond , ## __VA_ARGS__) ) )
                                                 ^~~~
   include/linux/compiler.h:58:86: note: expanded from macro '__trace_if_var'
   #define __trace_if_var(cond) (__builtin_constant_p(cond) ? (cond) : __trace_if_value(cond))
                                                                                        ^~~~
   include/linux/compiler.h:69:3: note: expanded from macro '__trace_if_value'
           (cond) ?                                        \
            ^~~~
   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1143:18: warning: result of comparison of constant 4294967296 with expression of type 'resource_size_t' (aka 'unsigned int') is always false [-Wtautological-constant-out-of-range-compare]
                       res->start > 0x100000000ull)
                       ~~~~~~~~~~ ^ ~~~~~~~~~~~~~~
   include/linux/compiler.h:56:47: note: expanded from macro 'if'
   #define if(cond, ...) if ( __trace_if_var( !!(cond , ## __VA_ARGS__) ) )
                                                 ^~~~
   include/linux/compiler.h:58:52: note: expanded from macro '__trace_if_var'
   #define __trace_if_var(cond) (__builtin_constant_p(cond) ? (cond) : __trace_if_value(cond))
                                                      ^~~~
   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1143:18: warning: result of comparison of constant 4294967296 with expression of type 'resource_size_t' (aka 'unsigned int') is always false [-Wtautological-constant-out-of-range-compare]
                       res->start > 0x100000000ull)
                       ~~~~~~~~~~ ^ ~~~~~~~~~~~~~~
   include/linux/compiler.h:56:47: note: expanded from macro 'if'
   #define if(cond, ...) if ( __trace_if_var( !!(cond , ## __VA_ARGS__) ) )
                                                 ^~~~
   include/linux/compiler.h:58:61: note: expanded from macro '__trace_if_var'
   #define __trace_if_var(cond) (__builtin_constant_p(cond) ? (cond) : __trace_if_value(cond))
                                                               ^~~~
   3 warnings and 2 errors generated.
--
   In file included from drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c:36:
   In file included from drivers/gpu/drm/amd/amdgpu/../amdgpu/amdgpu.h:86:
   In file included from drivers/gpu/drm/amd/amdgpu/../amdgpu/amdgpu_mn.h:28:
   In file included from include/linux/hmm.h:16:
>> include/linux/migrate.h:70:41: error: void function 'migrate_prep' should not return a value [-Wreturn-type]
   static inline void migrate_prep(void) { return -ENOSYS; }
                                           ^      ~~~~~~~
>> include/linux/migrate.h:71:47: error: void function 'migrate_prep_local' should not return a value [-Wreturn-type]
   static inline void migrate_prep_local(void) { return -ENOSYS; }
                                                 ^      ~~~~~~~
   drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c:263:7: warning: format specifies type 'unsigned short' but the argument has type 'unsigned int' [-Wformat]
                                     version_major, version_minor);
                                     ^~~~~~~~~~~~~
   include/drm/drm_print.h:498:19: note: expanded from macro 'DRM_ERROR'
           __drm_err(fmt, ##__VA_ARGS__)
                     ~~~    ^~~~~~~~~~~
   drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c:263:22: warning: format specifies type 'unsigned short' but the argument has type 'unsigned int' [-Wformat]
                                     version_major, version_minor);
                                                    ^~~~~~~~~~~~~
   include/drm/drm_print.h:498:19: note: expanded from macro 'DRM_ERROR'
           __drm_err(fmt, ##__VA_ARGS__)
                     ~~~    ^~~~~~~~~~~
   2 warnings and 2 errors generated.
--
   In file included from mm/page_alloc.c:61:
>> include/linux/migrate.h:70:41: error: void function 'migrate_prep' should not return a value [-Wreturn-type]
   static inline void migrate_prep(void) { return -ENOSYS; }
                                           ^      ~~~~~~~
>> include/linux/migrate.h:71:47: error: void function 'migrate_prep_local' should not return a value [-Wreturn-type]
   static inline void migrate_prep_local(void) { return -ENOSYS; }
                                                 ^      ~~~~~~~
   mm/page_alloc.c:2621:5: warning: no previous prototype for function 'find_suitable_fallback' [-Wmissing-prototypes]
   int find_suitable_fallback(struct free_area *area, unsigned int order,
       ^
   mm/page_alloc.c:2621:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
   int find_suitable_fallback(struct free_area *area, unsigned int order,
   ^
   static 
   mm/page_alloc.c:3600:15: warning: no previous prototype for function 'should_fail_alloc_page' [-Wmissing-prototypes]
   noinline bool should_fail_alloc_page(gfp_t gfp_mask, unsigned int order)
                 ^
   mm/page_alloc.c:3600:10: note: declare 'static' if the function is not intended to be used outside of this translation unit
   noinline bool should_fail_alloc_page(gfp_t gfp_mask, unsigned int order)
            ^
            static 
   2 warnings and 2 errors generated.
--
   In file included from kernel/sched/rt.c:6:
   In file included from kernel/sched/sched.h:53:
>> include/linux/migrate.h:70:41: error: void function 'migrate_prep' should not return a value [-Wreturn-type]
   static inline void migrate_prep(void) { return -ENOSYS; }
                                           ^      ~~~~~~~
>> include/linux/migrate.h:71:47: error: void function 'migrate_prep_local' should not return a value [-Wreturn-type]
   static inline void migrate_prep_local(void) { return -ENOSYS; }
                                                 ^      ~~~~~~~
   kernel/sched/rt.c:669:6: warning: no previous prototype for function 'sched_rt_bandwidth_account' [-Wmissing-prototypes]
   bool sched_rt_bandwidth_account(struct rt_rq *rt_rq)
        ^
   kernel/sched/rt.c:669:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
   bool sched_rt_bandwidth_account(struct rt_rq *rt_rq)
   ^
   static 
   1 warning and 2 errors generated.


vim +/migrate_prep +70 include/linux/migrate.h

    58	
    59	static inline void putback_movable_pages(struct list_head *l) {}
    60	static inline int migrate_pages(struct list_head *l, new_page_t new,
    61			free_page_t free, unsigned long private, enum migrate_mode mode,
    62			int reason)
    63		{ return -ENOSYS; }
    64	static inline struct page *alloc_migration_target(struct page *page,
    65			unsigned long private)
    66		{ return NULL; }
    67	static inline int isolate_movable_page(struct page *page, isolate_mode_t mode)
    68		{ return -EBUSY; }
    69	
  > 70	static inline void migrate_prep(void) { return -ENOSYS; }
  > 71	static inline void migrate_prep_local(void) { return -ENOSYS; }
    72	static inline void migrate_done(void) {}
    73	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 32862 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/2] mm: disable LRU pagevec during the migration temporarily
  2021-03-03 12:49 ` [PATCH 1/2] mm: disable LRU pagevec during the migration temporarily Michal Hocko
@ 2021-03-03 20:23   ` Minchan Kim
  2021-03-04  8:07     ` David Hildenbrand
  2021-03-05 16:06     ` Michal Hocko
  0 siblings, 2 replies; 9+ messages in thread
From: Minchan Kim @ 2021-03-03 20:23 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Andrew Morton, linux-mm, LKML, joaodias, surenb, cgoldswo, willy,
	david, vbabka, linux-fsdevel

On Wed, Mar 03, 2021 at 01:49:36PM +0100, Michal Hocko wrote:
> On Tue 02-03-21 13:09:48, Minchan Kim wrote:
> > LRU pagevec holds refcount of pages until the pagevec are drained.
> > It could prevent migration since the refcount of the page is greater
> > than the expection in migration logic. To mitigate the issue,
> > callers of migrate_pages drains LRU pagevec via migrate_prep or
> > lru_add_drain_all before migrate_pages call.
> > 
> > However, it's not enough because pages coming into pagevec after the
> > draining call still could stay at the pagevec so it could keep
> > preventing page migration. Since some callers of migrate_pages have
> > retrial logic with LRU draining, the page would migrate at next trail
> > but it is still fragile in that it doesn't close the fundamental race
> > between upcoming LRU pages into pagvec and migration so the migration
> > failure could cause contiguous memory allocation failure in the end.
> > 
> > To close the race, this patch disables lru caches(i.e, pagevec)
> > during ongoing migration until migrate is done.
> > 
> > Since it's really hard to reproduce, I measured how many times
> > migrate_pages retried with force mode below debug code.
> > 
> > int migrate_pages(struct list_head *from, new_page_t get_new_page,
> > 			..
> > 			..
> > 
> > if (rc && reason == MR_CONTIG_RANGE && pass > 2) {
> >        printk(KERN_ERR, "pfn 0x%lx reason %d\n", page_to_pfn(page), rc);
> >        dump_page(page, "fail to migrate");
> > }
> > 
> > The test was repeating android apps launching with cma allocation
> > in background every five seconds. Total cma allocation count was
> > about 500 during the testing. With this patch, the dump_page count
> > was reduced from 400 to 30.
> 
> Have you seen any improvement on the CMA allocation success rate?

Unfortunately, the cma alloc failure rate with reasonable margin
of error is really hard to reproduce under real workload.
That's why I measured the soft metric instead of direct cma fail
under real workload(I don't want to make some adhoc artificial
benchmark and keep tunes system knobs until it could show 
extremly exaggerated result to convice patch effect).

Please say if you belive this work is pointless unless there is
stable data under reproducible scenario. I am happy to drop it.

> 
> > Signed-off-by: Minchan Kim <minchan@kernel.org>
> > ---
> > * from RFC - http://lore.kernel.org/linux-mm/20210216170348.1513483-1-minchan@kernel.org
> >   * use atomic and lru_add_drain_all for strict ordering - mhocko
> >   * lru_cache_disable/enable - mhocko
> > 
> >  fs/block_dev.c          |  2 +-
> >  include/linux/migrate.h |  6 +++--
> >  include/linux/swap.h    |  4 ++-
> >  mm/compaction.c         |  4 +--
> >  mm/fadvise.c            |  2 +-
> >  mm/gup.c                |  2 +-
> >  mm/khugepaged.c         |  2 +-
> >  mm/ksm.c                |  2 +-
> >  mm/memcontrol.c         |  4 +--
> >  mm/memfd.c              |  2 +-
> >  mm/memory-failure.c     |  2 +-
> >  mm/memory_hotplug.c     |  2 +-
> >  mm/mempolicy.c          |  6 +++++
> >  mm/migrate.c            | 15 ++++++-----
> >  mm/page_alloc.c         |  5 +++-
> >  mm/swap.c               | 55 +++++++++++++++++++++++++++++++++++------
> >  16 files changed, 85 insertions(+), 30 deletions(-)
> 
> The churn seems to be quite big for something that should have been a
> very small change. Have you considered not changing lru_add_drain_all
> but rather introduce __lru_add_dain_all that would implement the
> enforced flushing?

Good idea.

> 
> [...]
> > +static atomic_t lru_disable_count = ATOMIC_INIT(0);
> > +
> > +bool lru_cache_disabled(void)
> > +{
> > +	return atomic_read(&lru_disable_count);
> > +}
> > +
> > +void lru_cache_disable(void)
> > +{
> > +	/*
> > +	 * lru_add_drain_all's IPI will make sure no new pages are added
> > +	 * to the pcp lists and drain them all.
> > +	 */
> > +	atomic_inc(&lru_disable_count);
> 
> As already mentioned in the last review. The IPI reference is more
> cryptic than useful. I would go with something like this instead
> 
> 	/*
> 	 * lru_add_drain_all in the force mode will schedule draining on
> 	 * all online CPUs so any calls of lru_cache_disabled wrapped by
> 	 * local_lock or preemption disabled would be  ordered by that.
> 	 * The atomic operation doesn't need to have stronger ordering
> 	 * requirements because that is enforece by the scheduling
> 	 * guarantees.
> 	 */

Thanks for the nice description.
I will use it in next revision if you believe this work is useful.


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/2] mm: disable LRU pagevec during the migration temporarily
  2021-03-03 20:23   ` Minchan Kim
@ 2021-03-04  8:07     ` David Hildenbrand
  2021-03-04 15:55       ` Minchan Kim
  2021-03-05 16:06     ` Michal Hocko
  1 sibling, 1 reply; 9+ messages in thread
From: David Hildenbrand @ 2021-03-04  8:07 UTC (permalink / raw)
  To: Minchan Kim, Michal Hocko
  Cc: Andrew Morton, linux-mm, LKML, joaodias, surenb, cgoldswo, willy,
	vbabka, linux-fsdevel

On 03.03.21 21:23, Minchan Kim wrote:
> On Wed, Mar 03, 2021 at 01:49:36PM +0100, Michal Hocko wrote:
>> On Tue 02-03-21 13:09:48, Minchan Kim wrote:
>>> LRU pagevec holds refcount of pages until the pagevec are drained.
>>> It could prevent migration since the refcount of the page is greater
>>> than the expection in migration logic. To mitigate the issue,
>>> callers of migrate_pages drains LRU pagevec via migrate_prep or
>>> lru_add_drain_all before migrate_pages call.
>>>
>>> However, it's not enough because pages coming into pagevec after the
>>> draining call still could stay at the pagevec so it could keep
>>> preventing page migration. Since some callers of migrate_pages have
>>> retrial logic with LRU draining, the page would migrate at next trail
>>> but it is still fragile in that it doesn't close the fundamental race
>>> between upcoming LRU pages into pagvec and migration so the migration
>>> failure could cause contiguous memory allocation failure in the end.
>>>
>>> To close the race, this patch disables lru caches(i.e, pagevec)
>>> during ongoing migration until migrate is done.
>>>
>>> Since it's really hard to reproduce, I measured how many times
>>> migrate_pages retried with force mode below debug code.
>>>
>>> int migrate_pages(struct list_head *from, new_page_t get_new_page,
>>> 			..
>>> 			..
>>>
>>> if (rc && reason == MR_CONTIG_RANGE && pass > 2) {
>>>         printk(KERN_ERR, "pfn 0x%lx reason %d\n", page_to_pfn(page), rc);
>>>         dump_page(page, "fail to migrate");
>>> }
>>>
>>> The test was repeating android apps launching with cma allocation
>>> in background every five seconds. Total cma allocation count was
>>> about 500 during the testing. With this patch, the dump_page count
>>> was reduced from 400 to 30.
>>
>> Have you seen any improvement on the CMA allocation success rate?
> 
> Unfortunately, the cma alloc failure rate with reasonable margin
> of error is really hard to reproduce under real workload.
> That's why I measured the soft metric instead of direct cma fail
> under real workload(I don't want to make some adhoc artificial
> benchmark and keep tunes system knobs until it could show
> extremly exaggerated result to convice patch effect).
> 
> Please say if you belive this work is pointless unless there is
> stable data under reproducible scenario. I am happy to drop it.

Do you have *some* application that triggers such a high retry count?

I'd love to run it along with virtio-mem and report the actual 
allocation success rate / necessary retries. That could give an 
indication of how helpful your work would be.

Anything that improves the reliability of alloc_contig_range() is of 
high interest to me. If it doesn't increase the reliability but merely 
does some internal improvements (less retries), it might still be 
valuable, but not that important.

-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/2] mm: disable LRU pagevec during the migration temporarily
  2021-03-04  8:07     ` David Hildenbrand
@ 2021-03-04 15:55       ` Minchan Kim
  0 siblings, 0 replies; 9+ messages in thread
From: Minchan Kim @ 2021-03-04 15:55 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Michal Hocko, Andrew Morton, linux-mm, LKML, joaodias, surenb,
	cgoldswo, willy, vbabka, linux-fsdevel

On Thu, Mar 04, 2021 at 09:07:28AM +0100, David Hildenbrand wrote:
> On 03.03.21 21:23, Minchan Kim wrote:
> > On Wed, Mar 03, 2021 at 01:49:36PM +0100, Michal Hocko wrote:
> > > On Tue 02-03-21 13:09:48, Minchan Kim wrote:
> > > > LRU pagevec holds refcount of pages until the pagevec are drained.
> > > > It could prevent migration since the refcount of the page is greater
> > > > than the expection in migration logic. To mitigate the issue,
> > > > callers of migrate_pages drains LRU pagevec via migrate_prep or
> > > > lru_add_drain_all before migrate_pages call.
> > > > 
> > > > However, it's not enough because pages coming into pagevec after the
> > > > draining call still could stay at the pagevec so it could keep
> > > > preventing page migration. Since some callers of migrate_pages have
> > > > retrial logic with LRU draining, the page would migrate at next trail
> > > > but it is still fragile in that it doesn't close the fundamental race
> > > > between upcoming LRU pages into pagvec and migration so the migration
> > > > failure could cause contiguous memory allocation failure in the end.
> > > > 
> > > > To close the race, this patch disables lru caches(i.e, pagevec)
> > > > during ongoing migration until migrate is done.
> > > > 
> > > > Since it's really hard to reproduce, I measured how many times
> > > > migrate_pages retried with force mode below debug code.
> > > > 
> > > > int migrate_pages(struct list_head *from, new_page_t get_new_page,
> > > > 			..
> > > > 			..
> > > > 
> > > > if (rc && reason == MR_CONTIG_RANGE && pass > 2) {
> > > >         printk(KERN_ERR, "pfn 0x%lx reason %d\n", page_to_pfn(page), rc);
> > > >         dump_page(page, "fail to migrate");
> > > > }
> > > > 
> > > > The test was repeating android apps launching with cma allocation
> > > > in background every five seconds. Total cma allocation count was
> > > > about 500 during the testing. With this patch, the dump_page count
> > > > was reduced from 400 to 30.
> > > 
> > > Have you seen any improvement on the CMA allocation success rate?
> > 
> > Unfortunately, the cma alloc failure rate with reasonable margin
> > of error is really hard to reproduce under real workload.
> > That's why I measured the soft metric instead of direct cma fail
> > under real workload(I don't want to make some adhoc artificial
> > benchmark and keep tunes system knobs until it could show
> > extremly exaggerated result to convice patch effect).
> > 
> > Please say if you belive this work is pointless unless there is
> > stable data under reproducible scenario. I am happy to drop it.
> 
> Do you have *some* application that triggers such a high retry count?

I have no idea what the specific appliction could trigger the high
retry count since the LRUs(the VM LRU and buffer_head LRU) are
common place everybody could use and every process could trigger.

> 
> I'd love to run it along with virtio-mem and report the actual allocation
> success rate / necessary retries. That could give an indication of how
> helpful your work would be.

If it could give stable report, that would be very helpful.

> 
> Anything that improves the reliability of alloc_contig_range() is of high
> interest to me. If it doesn't increase the reliability but merely does some
> internal improvements (less retries), it might still be valuable, but not
> that important.

less retrial is good but I'd like to put more effort to close the race
I mentioned completely since the cma allocation failure for our usecases
are critical for user experience.


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/2] mm: disable LRU pagevec during the migration temporarily
  2021-03-03 20:23   ` Minchan Kim
  2021-03-04  8:07     ` David Hildenbrand
@ 2021-03-05 16:06     ` Michal Hocko
  2021-03-05 20:26       ` Minchan Kim
  1 sibling, 1 reply; 9+ messages in thread
From: Michal Hocko @ 2021-03-05 16:06 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Andrew Morton, linux-mm, LKML, joaodias, surenb, cgoldswo, willy,
	david, vbabka, linux-fsdevel

On Wed 03-03-21 12:23:22, Minchan Kim wrote:
> On Wed, Mar 03, 2021 at 01:49:36PM +0100, Michal Hocko wrote:
> > On Tue 02-03-21 13:09:48, Minchan Kim wrote:
> > > LRU pagevec holds refcount of pages until the pagevec are drained.
> > > It could prevent migration since the refcount of the page is greater
> > > than the expection in migration logic. To mitigate the issue,
> > > callers of migrate_pages drains LRU pagevec via migrate_prep or
> > > lru_add_drain_all before migrate_pages call.
> > > 
> > > However, it's not enough because pages coming into pagevec after the
> > > draining call still could stay at the pagevec so it could keep
> > > preventing page migration. Since some callers of migrate_pages have
> > > retrial logic with LRU draining, the page would migrate at next trail
> > > but it is still fragile in that it doesn't close the fundamental race
> > > between upcoming LRU pages into pagvec and migration so the migration
> > > failure could cause contiguous memory allocation failure in the end.
> > > 
> > > To close the race, this patch disables lru caches(i.e, pagevec)
> > > during ongoing migration until migrate is done.
> > > 
> > > Since it's really hard to reproduce, I measured how many times
> > > migrate_pages retried with force mode below debug code.
> > > 
> > > int migrate_pages(struct list_head *from, new_page_t get_new_page,
> > > 			..
> > > 			..
> > > 
> > > if (rc && reason == MR_CONTIG_RANGE && pass > 2) {
> > >        printk(KERN_ERR, "pfn 0x%lx reason %d\n", page_to_pfn(page), rc);
> > >        dump_page(page, "fail to migrate");
> > > }
> > > 
> > > The test was repeating android apps launching with cma allocation
> > > in background every five seconds. Total cma allocation count was
> > > about 500 during the testing. With this patch, the dump_page count
> > > was reduced from 400 to 30.
> > 
> > Have you seen any improvement on the CMA allocation success rate?
> 
> Unfortunately, the cma alloc failure rate with reasonable margin
> of error is really hard to reproduce under real workload.
> That's why I measured the soft metric instead of direct cma fail
> under real workload(I don't want to make some adhoc artificial
> benchmark and keep tunes system knobs until it could show 
> extremly exaggerated result to convice patch effect).
> 
> Please say if you belive this work is pointless unless there is
> stable data under reproducible scenario. I am happy to drop it.

Well, I am not saying that this is pointless. In the end the resulting
change is relatively small and it provides a useful functionality for
other users (e.g. hotplug). That should be a sufficient justification.

I was asking about CMA allocation success rate because that is a much
more reasonable metric than how many times something has retried because
retries can help to increase success rate and the patch doesn't really
remove those. If you want to use number of retries as a metric then the
average allocation latency would be more meaningful.
-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/2] mm: disable LRU pagevec during the migration temporarily
  2021-03-05 16:06     ` Michal Hocko
@ 2021-03-05 20:26       ` Minchan Kim
  0 siblings, 0 replies; 9+ messages in thread
From: Minchan Kim @ 2021-03-05 20:26 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Andrew Morton, linux-mm, LKML, joaodias, surenb, cgoldswo, willy,
	david, vbabka, linux-fsdevel

On Fri, Mar 05, 2021 at 05:06:17PM +0100, Michal Hocko wrote:
> On Wed 03-03-21 12:23:22, Minchan Kim wrote:
> > On Wed, Mar 03, 2021 at 01:49:36PM +0100, Michal Hocko wrote:
> > > On Tue 02-03-21 13:09:48, Minchan Kim wrote:
> > > > LRU pagevec holds refcount of pages until the pagevec are drained.
> > > > It could prevent migration since the refcount of the page is greater
> > > > than the expection in migration logic. To mitigate the issue,
> > > > callers of migrate_pages drains LRU pagevec via migrate_prep or
> > > > lru_add_drain_all before migrate_pages call.
> > > > 
> > > > However, it's not enough because pages coming into pagevec after the
> > > > draining call still could stay at the pagevec so it could keep
> > > > preventing page migration. Since some callers of migrate_pages have
> > > > retrial logic with LRU draining, the page would migrate at next trail
> > > > but it is still fragile in that it doesn't close the fundamental race
> > > > between upcoming LRU pages into pagvec and migration so the migration
> > > > failure could cause contiguous memory allocation failure in the end.
> > > > 
> > > > To close the race, this patch disables lru caches(i.e, pagevec)
> > > > during ongoing migration until migrate is done.
> > > > 
> > > > Since it's really hard to reproduce, I measured how many times
> > > > migrate_pages retried with force mode below debug code.
> > > > 
> > > > int migrate_pages(struct list_head *from, new_page_t get_new_page,
> > > > 			..
> > > > 			..
> > > > 
> > > > if (rc && reason == MR_CONTIG_RANGE && pass > 2) {
> > > >        printk(KERN_ERR, "pfn 0x%lx reason %d\n", page_to_pfn(page), rc);
> > > >        dump_page(page, "fail to migrate");
> > > > }
> > > > 
> > > > The test was repeating android apps launching with cma allocation
> > > > in background every five seconds. Total cma allocation count was
> > > > about 500 during the testing. With this patch, the dump_page count
> > > > was reduced from 400 to 30.
> > > 
> > > Have you seen any improvement on the CMA allocation success rate?
> > 
> > Unfortunately, the cma alloc failure rate with reasonable margin
> > of error is really hard to reproduce under real workload.
> > That's why I measured the soft metric instead of direct cma fail
> > under real workload(I don't want to make some adhoc artificial
> > benchmark and keep tunes system knobs until it could show 
> > extremly exaggerated result to convice patch effect).
> > 
> > Please say if you belive this work is pointless unless there is
> > stable data under reproducible scenario. I am happy to drop it.
> 
> Well, I am not saying that this is pointless. In the end the resulting
> change is relatively small and it provides a useful functionality for
> other users (e.g. hotplug). That should be a sufficient justification.

Yub, that was my impression to worth upstreaming rather than keeping
downstream tree so made divergent.

> 
> I was asking about CMA allocation success rate because that is a much
> more reasonable metric than how many times something has retried because
> retries can help to increase success rate and the patch doesn't really
> remove those. If you want to use number of retries as a metric then the
> average allocation latency would be more meaningful.

I believe the allocation latency would be pretty big and retrial part
would be marginal so doubt it's meaningful.

Let me send next revision with as-is descripion once I fix places
you pointed out.

Thanks.


^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2021-03-05 20:26 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20210302210949.2440120-1-minchan@kernel.org>
2021-03-03 12:49 ` [PATCH 1/2] mm: disable LRU pagevec during the migration temporarily Michal Hocko
2021-03-03 20:23   ` Minchan Kim
2021-03-04  8:07     ` David Hildenbrand
2021-03-04 15:55       ` Minchan Kim
2021-03-05 16:06     ` Michal Hocko
2021-03-05 20:26       ` Minchan Kim
2021-03-03 13:38 ` kernel test robot
2021-03-03 15:11 ` kernel test robot
2021-03-03 18:12 ` kernel test robot

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).