* [RFC PATCH 00/14] Accelerating page migrations
@ 2017-02-17 15:05 Zi Yan
  2017-02-17 15:05 ` [RFC PATCH 01/14] mm/migrate: Add new mode parameter to migrate_page_copy() function Zi Yan
                   ` (13 more replies)
  0 siblings, 14 replies; 25+ messages in thread
From: Zi Yan @ 2017-02-17 15:05 UTC (permalink / raw)
  To: linux-mm; +Cc: dnellans, apopple, paulmck, khandual, zi.yan

From: Zi Yan <zi.yan@cs.rutgers.edu>

Hi all,

I was asked to post my complete patchset for comments, so this patchset includes
the parallel page migration patches I sent before. It is rebased on
mmotm-2017-02-07-15-20.

This RFC aims to accelerate huge page migration, so it relies on the THP
migration patchset from Naoya (https://lwn.net/Articles/713667/). There are a
lot of rough edges in the code, but I would like to get comments on how these
ideas look to you.

You can find kernel code with this patchset and THP migration patchset here:
https://github.com/x-y-z/linux-thp-migration/tree/page_migration_opt_linux-mm

General description:
=======================================================

There are four parts:
1. Parallel page migration (Patch 1-6). It uses multiple threads instead of the
   existing single-threaded copy to transfer a huge page, where each thread
   copies a chunk of the page. This makes better use of the available memory
   bandwidth.

2. Concurrent page migration (Patch 7,8). Linux currently transfers a list of
   pages sequentially. This is good for error handling, but bad for utilizing
   memory bandwidth. This part batches page migration: it first unmaps all the
   pages, then copies them all together, and finally reinstates the PTEs after
   the data copy is done. Only anonymous pages are supported.

3. Exchange page migration (Patch 9-11). This avoids new page allocations when
   two-way page migrations are performed between two memory nodes. Instead of
   allocating new pages and then migrating, it simply exchanges the contents of
   the two peer pages. Only anonymous pages are supported.

4. DMA page migration (Patch 12-14). This frees CPUs from the data copy. It uses
   the DMA engine in the system to copy page data instead of CPU threads.
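
As a quick usage sketch of the new syscall flags (illustrative only, not part of
the patchset): the flag values below are assumptions mirroring the ones added to
include/uapi/linux/mempolicy.h in Patches 5 and 8, and move_pages() is the
libnuma wrapper from <numaif.h> (link with -lnuma). It moves one page to node 1
using the multi-threaded, concurrent path:

	/* Hypothetical example; the MPOL_MF_MOVE_MT/MPOL_MF_MOVE_CONCUR values
	 * are taken from this series and are not in distribution headers. */
	#include <numaif.h>		/* move_pages(), MPOL_MF_MOVE */
	#include <stdio.h>
	#include <stdlib.h>

	#ifndef MPOL_MF_MOVE_MT
	#define MPOL_MF_MOVE_MT     (1 << 6)	/* Patch 5: multi-threaded copy */
	#endif
	#ifndef MPOL_MF_MOVE_CONCUR
	#define MPOL_MF_MOVE_CONCUR (1 << 7)	/* Patch 8: concurrent migration */
	#endif

	int main(void)
	{
		void *page;
		int node = 1, status = -1;

		/* 2MB-aligned, 2MB-sized buffer so it may be backed by a THP. */
		if (posix_memalign(&page, 2UL << 20, 2UL << 20))
			return 1;
		((char *)page)[0] = 1;		/* fault the page in */

		if (move_pages(0, 1, &page, &node, &status,
			       MPOL_MF_MOVE | MPOL_MF_MOVE_MT | MPOL_MF_MOVE_CONCUR))
			perror("move_pages");
		printf("page now on node %d\n", status);
		return 0;
	}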

Experiment results:
=======================================================

I ran a page migration micro-benchmark with these changes on a two-socket Intel
E5-2640v3 with DDR4 at 1866MHz and 32.0GB/s cross-socket bandwidth. This machine
also has a 16-channel IOAT DMA engine, which provides ~11GB/s data copy throughput.

1. Parallel page migration: it increases 2MB page migration throughput from
   3.0GB/s (1 thread) to 8.6GB/s (8 threads).
2. Concurrent page migration: it increases the throughput of migrating 16 2MB
   pages from 3.3GB/s (1 thread) to 14.4GB/s (8 threads).
3. Exchange page migration: it increases the throughput of 16 two-way 2MB page
   migrations from 3.3GB/s (1 thread) to 17.8GB/s (8 threads).
4. DMA page migration: it can saturate the Intel DMA engine's 11GB/s data copy
   throughput.

The improvements (except for DMA) were also tested on IBM Power8 and NVIDIA
ARM64 systems, and the range of improvements is very similar.

Best Regards,
Yan Zi


Zi Yan (14):
  mm/migrate: Add new mode parameter to migrate_page_copy() function
  mm/migrate: Make migrate_mode types non-exclusive
  mm/migrate: Add copy_pages_mthread function
  mm/migrate: Add new migrate mode MIGRATE_MT
  mm/migrate: Add new migration flag MPOL_MF_MOVE_MT for syscalls
  sysctl: Add global tunable mt_page_copy
  migrate: Add copy_page_lists_mthread() function.
  mm: migrate: Add concurrent page migration into move_pages syscall.
  mm: migrate: Add exchange_page_mthread() and
    exchange_page_lists_mthread() to exchange two pages or two page
    lists.
  mm: Add exchange_pages and exchange_pages_concur functions to exchange
    two lists of pages instead of two migrate_pages().
  mm: migrate: Add exchange_pages syscall to exchange two page lists.
  migrate: Add copy_page_dma to use DMA Engine to copy pages.
  mm: migrate: Add copy_page_dma into migrate_page_copy.
  mm: Add copy_page_lists_dma_always to support copy a list of pages.

 arch/x86/entry/syscalls/syscall_64.tbl |    2 +
 fs/aio.c                               |    2 +-
 fs/f2fs/data.c                         |    2 +-
 fs/hugetlbfs/inode.c                   |    2 +-
 fs/ubifs/file.c                        |    2 +-
 include/linux/highmem.h                |    3 +
 include/linux/ksm.h                    |    5 +
 include/linux/migrate.h                |    6 +-
 include/linux/migrate_mode.h           |   10 +-
 include/linux/sched/sysctl.h           |    4 +
 include/linux/syscalls.h               |    5 +
 include/uapi/linux/mempolicy.h         |    6 +-
 kernel/sysctl.c                        |   32 +
 mm/Makefile                            |    3 +
 mm/compaction.c                        |   20 +-
 mm/copy_pages.c                        |  720 ++++++++++++++++++
 mm/exchange.c                          | 1257 ++++++++++++++++++++++++++++++++
 mm/internal.h                          |   11 +
 mm/ksm.c                               |   35 +
 mm/mempolicy.c                         |    7 +-
 mm/migrate.c                           |  573 ++++++++++++++-
 mm/shmem.c                             |    2 +-
 22 files changed, 2662 insertions(+), 47 deletions(-)
 create mode 100644 mm/copy_pages.c
 create mode 100644 mm/exchange.c

-- 
2.11.0


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [RFC PATCH 01/14] mm/migrate: Add new mode parameter to migrate_page_copy() function
  2017-02-17 15:05 [RFC PATCH 00/14] Accelerating page migrations Zi Yan
@ 2017-02-17 15:05 ` Zi Yan
  2017-02-17 15:05 ` [RFC PATCH 02/14] mm/migrate: Make migrate_mode types non-exclusive Zi Yan
                   ` (12 subsequent siblings)
  13 siblings, 0 replies; 25+ messages in thread
From: Zi Yan @ 2017-02-17 15:05 UTC (permalink / raw)
  To: linux-mm; +Cc: dnellans, apopple, paulmck, khandual, zi.yan

From: Zi Yan <zi.yan@cs.rutgers.edu>

This is a prerequisite change required to make the page migration framework
copy pages in different modes, such as the default single-threaded mode or the
new multi-threaded one introduced in follow-up patches. This does not change
any existing functionality. Only the signatures of the migrate_page_copy()
and copy_huge_page() functions are affected.

Signed-off-by: Zi Yan <zi.yan@cs.rutgers.edu>
Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
---
 fs/aio.c                     |  2 +-
 fs/f2fs/data.c               |  2 +-
 fs/hugetlbfs/inode.c         |  2 +-
 fs/ubifs/file.c              |  2 +-
 include/linux/migrate.h      |  6 ++++--
 include/linux/migrate_mode.h |  1 +
 mm/migrate.c                 | 14 ++++++++------
 mm/shmem.c                   |  2 +-
 8 files changed, 18 insertions(+), 13 deletions(-)

diff --git a/fs/aio.c b/fs/aio.c
index 5517794add2d..6e46c2887aa8 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -418,7 +418,7 @@ static int aio_migratepage(struct address_space *mapping, struct page *new,
 	 * events from being lost.
 	 */
 	spin_lock_irqsave(&ctx->completion_lock, flags);
-	migrate_page_copy(new, old);
+	migrate_page_copy(new, old, MIGRATE_ST);
 	BUG_ON(ctx->ring_pages[idx] != old);
 	ctx->ring_pages[idx] = new;
 	spin_unlock_irqrestore(&ctx->completion_lock, flags);
diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 9f0ba90b92e4..4ac32954b59c 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -1937,7 +1937,7 @@ int f2fs_migrate_page(struct address_space *mapping,
 		SetPagePrivate(newpage);
 	set_page_private(newpage, page_private(page));
 
-	migrate_page_copy(newpage, page);
+	migrate_page_copy(newpage, page, MIGRATE_ST);
 
 	return MIGRATEPAGE_SUCCESS;
 }
diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 4fb7b10f3a05..62b740524d54 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -850,7 +850,7 @@ static int hugetlbfs_migrate_page(struct address_space *mapping,
 	rc = migrate_huge_page_move_mapping(mapping, newpage, page);
 	if (rc != MIGRATEPAGE_SUCCESS)
 		return rc;
-	migrate_page_copy(newpage, page);
+	migrate_page_copy(newpage, page, MIGRATE_ST);
 
 	return MIGRATEPAGE_SUCCESS;
 }
diff --git a/fs/ubifs/file.c b/fs/ubifs/file.c
index 957ec7c90b43..8a8980ceb9cc 100644
--- a/fs/ubifs/file.c
+++ b/fs/ubifs/file.c
@@ -1468,7 +1468,7 @@ static int ubifs_migrate_page(struct address_space *mapping,
 		SetPagePrivate(newpage);
 	}
 
-	migrate_page_copy(newpage, page);
+	migrate_page_copy(newpage, page, MIGRATE_ST);
 	return MIGRATEPAGE_SUCCESS;
 }
 #endif
diff --git a/include/linux/migrate.h b/include/linux/migrate.h
index fa76b516fa47..63cfcb7ddbb5 100644
--- a/include/linux/migrate.h
+++ b/include/linux/migrate.h
@@ -42,7 +42,8 @@ extern void putback_movable_page(struct page *page);
 
 extern int migrate_prep(void);
 extern int migrate_prep_local(void);
-extern void migrate_page_copy(struct page *newpage, struct page *page);
+extern void migrate_page_copy(struct page *newpage, struct page *page,
+			enum migrate_mode mode);
 extern int migrate_huge_page_move_mapping(struct address_space *mapping,
 				  struct page *newpage, struct page *page);
 extern int migrate_page_move_mapping(struct address_space *mapping,
@@ -63,7 +64,8 @@ static inline int migrate_prep(void) { return -ENOSYS; }
 static inline int migrate_prep_local(void) { return -ENOSYS; }
 
 static inline void migrate_page_copy(struct page *newpage,
-				     struct page *page) {}
+				     struct page *page,
+				     enum migrate_mode mode) {}
 
 static inline int migrate_huge_page_move_mapping(struct address_space *mapping,
 				  struct page *newpage, struct page *page)
diff --git a/include/linux/migrate_mode.h b/include/linux/migrate_mode.h
index ebf3d89a3919..b3b9acbff444 100644
--- a/include/linux/migrate_mode.h
+++ b/include/linux/migrate_mode.h
@@ -11,6 +11,7 @@ enum migrate_mode {
 	MIGRATE_ASYNC,
 	MIGRATE_SYNC_LIGHT,
 	MIGRATE_SYNC,
+	MIGRATE_ST
 };
 
 #endif		/* MIGRATE_MODE_H_INCLUDED */
diff --git a/mm/migrate.c b/mm/migrate.c
index e9de317820f8..5913f5b54832 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -596,7 +596,8 @@ static void __copy_gigantic_page(struct page *dst, struct page *src,
 	}
 }
 
-static void copy_huge_page(struct page *dst, struct page *src)
+static void copy_huge_page(struct page *dst, struct page *src,
+				enum migrate_mode mode)
 {
 	int i;
 	int nr_pages;
@@ -625,12 +626,13 @@ static void copy_huge_page(struct page *dst, struct page *src)
 /*
  * Copy the page to its new location
  */
-void migrate_page_copy(struct page *newpage, struct page *page)
+void migrate_page_copy(struct page *newpage, struct page *page,
+					   enum migrate_mode mode)
 {
 	int cpupid;
 
 	if (PageHuge(page) || PageTransHuge(page))
-		copy_huge_page(newpage, page);
+		copy_huge_page(newpage, page, mode);
 	else
 		copy_highpage(newpage, page);
 
@@ -712,7 +714,7 @@ int migrate_page(struct address_space *mapping,
 	if (rc != MIGRATEPAGE_SUCCESS)
 		return rc;
 
-	migrate_page_copy(newpage, page);
+	migrate_page_copy(newpage, page, mode);
 	return MIGRATEPAGE_SUCCESS;
 }
 EXPORT_SYMBOL(migrate_page);
@@ -762,7 +764,7 @@ int buffer_migrate_page(struct address_space *mapping,
 
 	SetPagePrivate(newpage);
 
-	migrate_page_copy(newpage, page);
+	migrate_page_copy(newpage, page, MIGRATE_ST);
 
 	bh = head;
 	do {
@@ -1994,7 +1996,7 @@ int migrate_misplaced_transhuge_page(struct mm_struct *mm,
 	/* anon mapping, we can simply copy page->mapping to the new page: */
 	new_page->mapping = page->mapping;
 	new_page->index = page->index;
-	migrate_page_copy(new_page, page);
+	migrate_page_copy(new_page, page, MIGRATE_ST);
 	WARN_ON(PageLRU(new_page));
 
 	/* Recheck the target PMD */
diff --git a/mm/shmem.c b/mm/shmem.c
index c6ea40d78028..dd04932dc97d 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2522,7 +2522,7 @@ static int shmem_migrate_page(struct address_space *mapping,
 	rc = shmem_migrate_page_move_mapping(mapping, newpage, page, NULL, mode, 0);
 	if (rc != MIGRATEPAGE_SUCCESS)
 		return rc;
-	migrate_page_copy(newpage, page);
+	migrate_page_copy(newpage, page, MIGRATE_ST);
 
 	return MIGRATEPAGE_SUCCESS;
 }
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [RFC PATCH 02/14] mm/migrate: Make migrate_mode types non-exclusive
  2017-02-17 15:05 [RFC PATCH 00/14] Accelerating page migrations Zi Yan
  2017-02-17 15:05 ` [RFC PATCH 01/14] mm/migrate: Add new mode parameter to migrate_page_copy() function Zi Yan
@ 2017-02-17 15:05 ` Zi Yan
  2017-02-17 15:05 ` [RFC PATCH 03/14] mm/migrate: Add copy_pages_mthread function Zi Yan
                   ` (11 subsequent siblings)
  13 siblings, 0 replies; 25+ messages in thread
From: Zi Yan @ 2017-02-17 15:05 UTC (permalink / raw)
  To: linux-mm; +Cc: dnellans, apopple, paulmck, khandual, zi.yan

From: Zi Yan <ziy@nvidia.com>

This changes the enum declaration from sequential values to bit positions so
that the modes can be used in combination, which was not possible earlier.
No functionality has been changed.
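
For illustration only, here is a standalone userspace sketch (not kernel code)
of what the bit-position values buy us: modes can be OR-ed together and tested
with '&' rather than compared with '==':

	#include <stdio.h>

	/* Mirrors include/linux/migrate_mode.h after this patch. */
	enum migrate_mode {
		MIGRATE_ASYNC		= 1 << 0,
		MIGRATE_SYNC_LIGHT	= 1 << 1,
		MIGRATE_SYNC		= 1 << 2,
		MIGRATE_ST		= 1 << 3,
	};

	int main(void)
	{
		/* A combined request, impossible with the old sequential values. */
		enum migrate_mode mode = MIGRATE_SYNC | MIGRATE_ST;

		/* Membership tests use '&'; 'mode == MIGRATE_SYNC' would now be false. */
		printf("sync: %d, async: %d\n",
		       !!(mode & MIGRATE_SYNC), !!(mode & MIGRATE_ASYNC));
		return 0;
	}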

Signed-off-by: Zi Yan <zi.yan@cs.rutgers.edu>
Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
---
 include/linux/migrate_mode.h |  8 ++++----
 mm/compaction.c              | 20 ++++++++++----------
 mm/migrate.c                 | 14 +++++++-------
 3 files changed, 21 insertions(+), 21 deletions(-)

diff --git a/include/linux/migrate_mode.h b/include/linux/migrate_mode.h
index b3b9acbff444..89c170060e5b 100644
--- a/include/linux/migrate_mode.h
+++ b/include/linux/migrate_mode.h
@@ -8,10 +8,10 @@
  * MIGRATE_SYNC will block when migrating pages
  */
 enum migrate_mode {
-	MIGRATE_ASYNC,
-	MIGRATE_SYNC_LIGHT,
-	MIGRATE_SYNC,
-	MIGRATE_ST
+	MIGRATE_ASYNC		= 1<<0,
+	MIGRATE_SYNC_LIGHT	= 1<<1,
+	MIGRATE_SYNC		= 1<<2,
+	MIGRATE_ST		= 1<<3,
 };
 
 #endif		/* MIGRATE_MODE_H_INCLUDED */
diff --git a/mm/compaction.c b/mm/compaction.c
index 5657a75ea6a8..de4634c60cca 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -296,7 +296,7 @@ static void update_pageblock_skip(struct compact_control *cc,
 	if (migrate_scanner) {
 		if (pfn > zone->compact_cached_migrate_pfn[0])
 			zone->compact_cached_migrate_pfn[0] = pfn;
-		if (cc->mode != MIGRATE_ASYNC &&
+		if (!(cc->mode & MIGRATE_ASYNC) &&
 		    pfn > zone->compact_cached_migrate_pfn[1])
 			zone->compact_cached_migrate_pfn[1] = pfn;
 	} else {
@@ -329,7 +329,7 @@ static void update_pageblock_skip(struct compact_control *cc,
 static bool compact_trylock_irqsave(spinlock_t *lock, unsigned long *flags,
 						struct compact_control *cc)
 {
-	if (cc->mode == MIGRATE_ASYNC) {
+	if (cc->mode & MIGRATE_ASYNC) {
 		if (!spin_trylock_irqsave(lock, *flags)) {
 			cc->contended = true;
 			return false;
@@ -370,7 +370,7 @@ static bool compact_unlock_should_abort(spinlock_t *lock,
 	}
 
 	if (need_resched()) {
-		if (cc->mode == MIGRATE_ASYNC) {
+		if (cc->mode & MIGRATE_ASYNC) {
 			cc->contended = true;
 			return true;
 		}
@@ -393,7 +393,7 @@ static inline bool compact_should_abort(struct compact_control *cc)
 {
 	/* async compaction aborts if contended */
 	if (need_resched()) {
-		if (cc->mode == MIGRATE_ASYNC) {
+		if (cc->mode & MIGRATE_ASYNC) {
 			cc->contended = true;
 			return true;
 		}
@@ -688,7 +688,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
 	 */
 	while (unlikely(too_many_isolated(zone))) {
 		/* async migration should just abort */
-		if (cc->mode == MIGRATE_ASYNC)
+		if (cc->mode & MIGRATE_ASYNC)
 			return 0;
 
 		congestion_wait(BLK_RW_ASYNC, HZ/10);
@@ -700,7 +700,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
 	if (compact_should_abort(cc))
 		return 0;
 
-	if (cc->direct_compaction && (cc->mode == MIGRATE_ASYNC)) {
+	if (cc->direct_compaction && (cc->mode & MIGRATE_ASYNC)) {
 		skip_on_failure = true;
 		next_skip_pfn = block_end_pfn(low_pfn, cc->order);
 	}
@@ -1195,7 +1195,7 @@ static isolate_migrate_t isolate_migratepages(struct zone *zone,
 	struct page *page;
 	const isolate_mode_t isolate_mode =
 		(sysctl_compact_unevictable_allowed ? ISOLATE_UNEVICTABLE : 0) |
-		(cc->mode != MIGRATE_SYNC ? ISOLATE_ASYNC_MIGRATE : 0);
+		(!(cc->mode & MIGRATE_SYNC) ? ISOLATE_ASYNC_MIGRATE : 0);
 
 	/*
 	 * Start at where we last stopped, or beginning of the zone as
@@ -1241,7 +1241,7 @@ static isolate_migrate_t isolate_migratepages(struct zone *zone,
 		 * Async compaction is optimistic to see if the minimum amount
 		 * of work satisfies the allocation.
 		 */
-		if (cc->mode == MIGRATE_ASYNC &&
+		if ((cc->mode & MIGRATE_ASYNC) &&
 		    !migrate_async_suitable(get_pageblock_migratetype(page)))
 			continue;
 
@@ -1481,7 +1481,7 @@ static enum compact_result compact_zone(struct zone *zone, struct compact_contro
 	unsigned long start_pfn = zone->zone_start_pfn;
 	unsigned long end_pfn = zone_end_pfn(zone);
 	const int migratetype = gfpflags_to_migratetype(cc->gfp_mask);
-	const bool sync = cc->mode != MIGRATE_ASYNC;
+	const bool sync = !(cc->mode & MIGRATE_ASYNC);
 
 	ret = compaction_suitable(zone, cc->order, cc->alloc_flags,
 							cc->classzone_idx);
@@ -1577,7 +1577,7 @@ static enum compact_result compact_zone(struct zone *zone, struct compact_contro
 			 * order-aligned block, so skip the rest of it.
 			 */
 			if (cc->direct_compaction &&
-						(cc->mode == MIGRATE_ASYNC)) {
+						(cc->mode & MIGRATE_ASYNC)) {
 				cc->migrate_pfn = block_end_pfn(
 						cc->migrate_pfn - 1, cc->order);
 				/* Draining pcplists is useless in this case */
diff --git a/mm/migrate.c b/mm/migrate.c
index 5913f5b54832..87253cb9b50a 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -359,7 +359,7 @@ static bool buffer_migrate_lock_buffers(struct buffer_head *head,
 	struct buffer_head *bh = head;
 
 	/* Simple case, sync compaction */
-	if (mode != MIGRATE_ASYNC) {
+	if (!(mode & MIGRATE_ASYNC)) {
 		do {
 			get_bh(bh);
 			lock_buffer(bh);
@@ -460,7 +460,7 @@ int migrate_page_move_mapping(struct address_space *mapping,
 	 * the mapping back due to an elevated page count, we would have to
 	 * block waiting on other references to be dropped.
 	 */
-	if (mode == MIGRATE_ASYNC && head &&
+	if ((mode & MIGRATE_ASYNC) && head &&
 			!buffer_migrate_lock_buffers(head, mode)) {
 		page_ref_unfreeze(page, expected_count);
 		spin_unlock_irq(&mapping->tree_lock);
@@ -746,7 +746,7 @@ int buffer_migrate_page(struct address_space *mapping,
 	 * with an IRQ-safe spinlock held. In the sync case, the buffers
 	 * need to be locked now
 	 */
-	if (mode != MIGRATE_ASYNC)
+	if (!(mode & MIGRATE_ASYNC))
 		BUG_ON(!buffer_migrate_lock_buffers(head, mode));
 
 	ClearPagePrivate(page);
@@ -828,7 +828,7 @@ static int fallback_migrate_page(struct address_space *mapping,
 {
 	if (PageDirty(page)) {
 		/* Only writeback pages in full synchronous migration */
-		if (mode != MIGRATE_SYNC)
+		if (!(mode & MIGRATE_SYNC))
 			return -EBUSY;
 		return writeout(mapping, page);
 	}
@@ -937,7 +937,7 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
 	bool is_lru = !__PageMovable(page);
 
 	if (!trylock_page(page)) {
-		if (!force || mode == MIGRATE_ASYNC)
+		if (!force || (mode & MIGRATE_ASYNC))
 			goto out;
 
 		/*
@@ -966,7 +966,7 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
 		 * the retry loop is too short and in the sync-light case,
 		 * the overhead of stalling is too much
 		 */
-		if (mode != MIGRATE_SYNC) {
+		if (!(mode & MIGRATE_SYNC)) {
 			rc = -EBUSY;
 			goto out_unlock;
 		}
@@ -1236,7 +1236,7 @@ static int unmap_and_move_huge_page(new_page_t get_new_page,
 		return -ENOMEM;
 
 	if (!trylock_page(hpage)) {
-		if (!force || mode != MIGRATE_SYNC)
+		if (!force || !(mode & MIGRATE_SYNC))
 			goto out;
 		lock_page(hpage);
 	}
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [RFC PATCH 03/14] mm/migrate: Add copy_pages_mthread function
  2017-02-17 15:05 [RFC PATCH 00/14] Accelerating page migrations Zi Yan
  2017-02-17 15:05 ` [RFC PATCH 01/14] mm/migrate: Add new mode parameter to migrate_page_copy() function Zi Yan
  2017-02-17 15:05 ` [RFC PATCH 02/14] mm/migrate: Make migrate_mode types non-exclusive Zi Yan
@ 2017-02-17 15:05 ` Zi Yan
  2017-02-23  6:06   ` Naoya Horiguchi
  2017-02-17 15:05 ` [RFC PATCH 04/14] mm/migrate: Add new migrate mode MIGRATE_MT Zi Yan
                   ` (10 subsequent siblings)
  13 siblings, 1 reply; 25+ messages in thread
From: Zi Yan @ 2017-02-17 15:05 UTC (permalink / raw)
  To: linux-mm; +Cc: dnellans, apopple, paulmck, khandual, zi.yan

From: Zi Yan <ziy@nvidia.com>

This change adds a new function, copy_pages_mthread(), to enable multi-threaded
page copy, which can be utilized during migration. The function splits the page
copy request into chunks, each handled by its own work item, and queues them on
the system_highpri_wq workqueue.

Signed-off-by: Zi Yan <zi.yan@cs.rutgers.edu>
Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
---
 include/linux/highmem.h |  2 ++
 mm/Makefile             |  2 ++
 mm/copy_pages.c         | 86 +++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 90 insertions(+)
 create mode 100644 mm/copy_pages.c

diff --git a/include/linux/highmem.h b/include/linux/highmem.h
index bb3f3297062a..e1f4f1b82812 100644
--- a/include/linux/highmem.h
+++ b/include/linux/highmem.h
@@ -236,6 +236,8 @@ static inline void copy_user_highpage(struct page *to, struct page *from,
 
 #endif
 
+int copy_pages_mthread(struct page *to, struct page *from, int nr_pages);
+
 static inline void copy_highpage(struct page *to, struct page *from)
 {
 	char *vfrom, *vto;
diff --git a/mm/Makefile b/mm/Makefile
index aa0aa17cb413..cdd4bab9cc66 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -43,6 +43,8 @@ obj-y			:= filemap.o mempool.o oom_kill.o \
 
 obj-y += init-mm.o
 
+obj-y += copy_pages.o
+
 ifdef CONFIG_NO_BOOTMEM
 	obj-y		+= nobootmem.o
 else
diff --git a/mm/copy_pages.c b/mm/copy_pages.c
new file mode 100644
index 000000000000..c357e7b01042
--- /dev/null
+++ b/mm/copy_pages.c
@@ -0,0 +1,86 @@
+/*
+ * This implements parallel page copy function through multi threaded
+ * work queues.
+ *
+ * Zi Yan <ziy@nvidia.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.
+ */
+#include <linux/highmem.h>
+#include <linux/workqueue.h>
+#include <linux/slab.h>
+#include <linux/freezer.h>
+
+/*
+ * nr_copythreads can be the highest number of threads for given node
+ * on any architecture. The actual number of copy threads will be
+ * limited by the cpumask weight of the target node.
+ */
+unsigned int nr_copythreads = 8;
+
+struct copy_info {
+	struct work_struct copy_work;
+	char *to;
+	char *from;
+	unsigned long chunk_size;
+};
+
+static void copy_pages(char *vto, char *vfrom, unsigned long size)
+{
+	memcpy(vto, vfrom, size);
+}
+
+static void copythread(struct work_struct *work)
+{
+	struct copy_info *info = (struct copy_info *) work;
+
+	copy_pages(info->to, info->from, info->chunk_size);
+}
+
+int copy_pages_mthread(struct page *to, struct page *from, int nr_pages)
+{
+	unsigned int node = page_to_nid(to);
+	const struct cpumask *cpumask = cpumask_of_node(node);
+	struct copy_info *work_items;
+	char *vto, *vfrom;
+	unsigned long i, cthreads, cpu, chunk_size;
+	int cpu_id_list[32] = {0};
+
+	cthreads = nr_copythreads;
+	cthreads = min_t(unsigned int, cthreads, cpumask_weight(cpumask));
+	cthreads = (cthreads / 2) * 2;
+	work_items = kcalloc(cthreads, sizeof(struct copy_info), GFP_KERNEL);
+	if (!work_items)
+		return -ENOMEM;
+
+	i = 0;
+	for_each_cpu(cpu, cpumask) {
+		if (i >= cthreads)
+			break;
+		cpu_id_list[i] = cpu;
+		++i;
+	}
+
+	vfrom = kmap(from);
+	vto = kmap(to);
+	chunk_size = PAGE_SIZE * nr_pages / cthreads;
+
+	for (i = 0; i < cthreads; ++i) {
+		INIT_WORK((struct work_struct *) &work_items[i], copythread);
+
+		work_items[i].to = vto + i * chunk_size;
+		work_items[i].from = vfrom + i * chunk_size;
+		work_items[i].chunk_size = chunk_size;
+
+		queue_work_on(cpu_id_list[i], system_highpri_wq,
+					  (struct work_struct *) &work_items[i]);
+	}
+
+	for (i = 0; i < cthreads; ++i)
+		flush_work((struct work_struct *) &work_items[i]);
+
+	kunmap(to);
+	kunmap(from);
+	kfree(work_items);
+	return 0;
+}
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [RFC PATCH 04/14] mm/migrate: Add new migrate mode MIGRATE_MT
  2017-02-17 15:05 [RFC PATCH 00/14] Accelerating page migrations Zi Yan
                   ` (2 preceding siblings ...)
  2017-02-17 15:05 ` [RFC PATCH 03/14] mm/migrate: Add copy_pages_mthread function Zi Yan
@ 2017-02-17 15:05 ` Zi Yan
  2017-02-23  6:54   ` Naoya Horiguchi
  2017-02-17 15:05 ` [RFC PATCH 05/14] mm/migrate: Add new migration flag MPOL_MF_MOVE_MT for syscalls Zi Yan
                   ` (9 subsequent siblings)
  13 siblings, 1 reply; 25+ messages in thread
From: Zi Yan @ 2017-02-17 15:05 UTC (permalink / raw)
  To: linux-mm; +Cc: dnellans, apopple, paulmck, khandual, zi.yan

From: Zi Yan <ziy@nvidia.com>

This change adds a new migration mode, MIGRATE_MT, to enable the multi-threaded
page copy implementation inside copy_huge_page() by selectively calling
copy_pages_mthread() when requested. It still falls back to the regular page
copy mechanism if the multi-threaded attempt fails. It also attempts
multi-threaded copy for regular pages.

Signed-off-by: Zi Yan <zi.yan@cs.rutgers.edu>
Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
---
 include/linux/migrate_mode.h |  1 +
 mm/migrate.c                 | 25 ++++++++++++++++++-------
 2 files changed, 19 insertions(+), 7 deletions(-)

diff --git a/include/linux/migrate_mode.h b/include/linux/migrate_mode.h
index 89c170060e5b..d344ad60f499 100644
--- a/include/linux/migrate_mode.h
+++ b/include/linux/migrate_mode.h
@@ -12,6 +12,7 @@ enum migrate_mode {
 	MIGRATE_SYNC_LIGHT	= 1<<1,
 	MIGRATE_SYNC		= 1<<2,
 	MIGRATE_ST		= 1<<3,
+	MIGRATE_MT		= 1<<4,
 };
 
 #endif		/* MIGRATE_MODE_H_INCLUDED */
diff --git a/mm/migrate.c b/mm/migrate.c
index 87253cb9b50a..21307219428d 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -601,6 +601,7 @@ static void copy_huge_page(struct page *dst, struct page *src,
 {
 	int i;
 	int nr_pages;
+	int rc = -EFAULT;
 
 	if (PageHuge(src)) {
 		/* hugetlbfs page */
@@ -617,10 +618,14 @@ static void copy_huge_page(struct page *dst, struct page *src,
 		nr_pages = hpage_nr_pages(src);
 	}
 
-	for (i = 0; i < nr_pages; i++) {
-		cond_resched();
-		copy_highpage(dst + i, src + i);
-	}
+	if (mode & MIGRATE_MT)
+		rc = copy_pages_mthread(dst, src, nr_pages);
+
+	if (rc)
+		for (i = 0; i < nr_pages; i++) {
+			cond_resched();
+			copy_highpage(dst + i, src + i);
+		}
 }
 
 /*
@@ -631,10 +636,16 @@ void migrate_page_copy(struct page *newpage, struct page *page,
 {
 	int cpupid;
 
-	if (PageHuge(page) || PageTransHuge(page))
+	if (PageHuge(page) || PageTransHuge(page)) {
 		copy_huge_page(newpage, page, mode);
-	else
-		copy_highpage(newpage, page);
+	} else {
+		if (mode & MIGRATE_MT) {
+			if (copy_pages_mthread(newpage, page, 1))
+				copy_highpage(newpage, page);
+		} else {
+			copy_highpage(newpage, page);
+		}
+	}
 
 	if (PageError(page))
 		SetPageError(newpage);
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [RFC PATCH 05/14] mm/migrate: Add new migration flag MPOL_MF_MOVE_MT for syscalls
  2017-02-17 15:05 [RFC PATCH 00/14] Accelerating page migrations Zi Yan
                   ` (3 preceding siblings ...)
  2017-02-17 15:05 ` [RFC PATCH 04/14] mm/migrate: Add new migrate mode MIGRATE_MT Zi Yan
@ 2017-02-17 15:05 ` Zi Yan
  2017-02-17 15:05 ` [RFC PATCH 06/14] sysctl: Add global tunable mt_page_copy Zi Yan
                   ` (8 subsequent siblings)
  13 siblings, 0 replies; 25+ messages in thread
From: Zi Yan @ 2017-02-17 15:05 UTC (permalink / raw)
  To: linux-mm; +Cc: dnellans, apopple, paulmck, khandual, zi.yan

From: Zi Yan <ziy@nvidia.com>

This change adds a new mode flag, MPOL_MF_MOVE_MT, for migration system calls
like move_pages() and mbind(), which requests the multi-threaded page copy
method.
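
As a hypothetical userspace sketch (not part of the patch), an mbind() caller
could combine the new flag with MPOL_MF_MOVE; the MPOL_MF_MOVE_MT value is
assumed from the hunk below, and mbind() is the libnuma wrapper from <numaif.h>
(link with -lnuma):

	#include <numaif.h>
	#include <stdlib.h>
	#include <string.h>

	#ifndef MPOL_MF_MOVE_MT
	#define MPOL_MF_MOVE_MT (1 << 6)	/* assumption: value from this patch */
	#endif

	int main(void)
	{
		size_t len = 4UL << 20;			/* 4MB region */
		unsigned long nodemask = 1UL << 1;	/* target node 1 */
		void *buf;

		if (posix_memalign(&buf, 4096, len))
			return 1;
		memset(buf, 0, len);			/* fault the pages in first */

		/* Bind to node 1 and move existing pages with the MT copy path. */
		return mbind(buf, len, MPOL_BIND, &nodemask,
			     8 * sizeof(nodemask),
			     MPOL_MF_MOVE | MPOL_MF_MOVE_MT) ? 1 : 0;
	}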

Signed-off-by: Zi Yan <zi.yan@cs.rutgers.edu>
Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
---
 include/uapi/linux/mempolicy.h |  4 +++-
 mm/mempolicy.c                 |  7 ++++++-
 mm/migrate.c                   | 14 ++++++++++----
 3 files changed, 19 insertions(+), 6 deletions(-)

diff --git a/include/uapi/linux/mempolicy.h b/include/uapi/linux/mempolicy.h
index 9cd8b21dddbe..8f1db2e2d677 100644
--- a/include/uapi/linux/mempolicy.h
+++ b/include/uapi/linux/mempolicy.h
@@ -53,10 +53,12 @@ enum mpol_rebind_step {
 #define MPOL_MF_MOVE_ALL (1<<2)	/* Move every page to conform to policy */
 #define MPOL_MF_LAZY	 (1<<3)	/* Modifies '_MOVE:  lazy migrate on fault */
 #define MPOL_MF_INTERNAL (1<<4)	/* Internal flags start here */
+#define MPOL_MF_MOVE_MT  (1<<6)	/* Use multi-threaded page copy routine */
 
 #define MPOL_MF_VALID	(MPOL_MF_STRICT   | 	\
 			 MPOL_MF_MOVE     | 	\
-			 MPOL_MF_MOVE_ALL)
+			 MPOL_MF_MOVE_ALL |	\
+			 MPOL_MF_MOVE_MT)
 
 /*
  * Internal flags that share the struct mempolicy flags word with
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 435bb7bec0a5..fc714840538e 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -1300,9 +1300,14 @@ static long do_mbind(unsigned long start, unsigned long len,
 		int nr_failed = 0;
 
 		if (!list_empty(&pagelist)) {
+			enum migrate_mode mode = MIGRATE_SYNC;
+
+			if (flags & MPOL_MF_MOVE_MT)
+				mode |= MIGRATE_MT;
+
 			WARN_ON_ONCE(flags & MPOL_MF_LAZY);
 			nr_failed = migrate_pages(&pagelist, new_page, NULL,
-				start, MIGRATE_SYNC, MR_MEMPOLICY_MBIND);
+					start, mode, MR_MEMPOLICY_MBIND);
 			if (nr_failed)
 				putback_movable_pages(&pagelist);
 		}
diff --git a/mm/migrate.c b/mm/migrate.c
index 21307219428d..2e58aad7c96f 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1446,11 +1446,16 @@ static struct page *new_page_node(struct page *p, unsigned long private,
  */
 static int do_move_page_to_node_array(struct mm_struct *mm,
 				      struct page_to_node *pm,
-				      int migrate_all)
+				      int migrate_all,
+					  int migrate_use_mt)
 {
 	int err;
 	struct page_to_node *pp;
 	LIST_HEAD(pagelist);
+	enum migrate_mode mode = MIGRATE_SYNC;
+
+	if (migrate_use_mt)
+		mode |= MIGRATE_MT;
 
 	down_read(&mm->mmap_sem);
 
@@ -1527,7 +1532,7 @@ static int do_move_page_to_node_array(struct mm_struct *mm,
 	err = 0;
 	if (!list_empty(&pagelist)) {
 		err = migrate_pages(&pagelist, new_page_node, NULL,
-				(unsigned long)pm, MIGRATE_SYNC, MR_SYSCALL);
+				(unsigned long)pm, mode, MR_SYSCALL);
 		if (err)
 			putback_movable_pages(&pagelist);
 	}
@@ -1604,7 +1609,8 @@ static int do_pages_move(struct mm_struct *mm, nodemask_t task_nodes,
 
 		/* Migrate this chunk */
 		err = do_move_page_to_node_array(mm, pm,
-						 flags & MPOL_MF_MOVE_ALL);
+						 flags & MPOL_MF_MOVE_ALL,
+						 flags & MPOL_MF_MOVE_MT);
 		if (err < 0)
 			goto out_pm;
 
@@ -1711,7 +1717,7 @@ SYSCALL_DEFINE6(move_pages, pid_t, pid, unsigned long, nr_pages,
 	nodemask_t task_nodes;
 
 	/* Check flags */
-	if (flags & ~(MPOL_MF_MOVE|MPOL_MF_MOVE_ALL))
+	if (flags & ~(MPOL_MF_MOVE|MPOL_MF_MOVE_ALL|MPOL_MF_MOVE_MT))
 		return -EINVAL;
 
 	if ((flags & MPOL_MF_MOVE_ALL) && !capable(CAP_SYS_NICE))
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [RFC PATCH 06/14] sysctl: Add global tunable mt_page_copy
  2017-02-17 15:05 [RFC PATCH 00/14] Accelerating page migrations Zi Yan
                   ` (4 preceding siblings ...)
  2017-02-17 15:05 ` [RFC PATCH 05/14] mm/migrate: Add new migration flag MPOL_MF_MOVE_MT for syscalls Zi Yan
@ 2017-02-17 15:05 ` Zi Yan
  2017-02-17 15:05 ` [RFC PATCH 07/14] migrate: Add copy_page_lists_mthread() function Zi Yan
                   ` (7 subsequent siblings)
  13 siblings, 0 replies; 25+ messages in thread
From: Zi Yan @ 2017-02-17 15:05 UTC (permalink / raw)
  To: linux-mm; +Cc: dnellans, apopple, paulmck, khandual, zi.yan

From: Zi Yan <ziy@nvidia.com>

A new global sysctl tunable 'mt_page_copy' is added which overrides
syscall-specific requests and enables multi-threaded page copy for all
migrations on the system. The tunable is disabled by default.
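
A minimal sketch of enabling the tunable from userspace (assuming the entry
appears as /proc/sys/vm/mt_page_copy, since it is added to vm_table; requires
root):

	#include <stdio.h>

	int main(void)
	{
		FILE *f = fopen("/proc/sys/vm/mt_page_copy", "w");

		if (!f)
			return 1;
		fputs("1\n", f);	/* 1 = always use multi-threaded copy, 0 = default */
		return fclose(f) ? 1 : 0;
	}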

Signed-off-by: Zi Yan <zi.yan@cs.rutgers.edu>
Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
---
 kernel/sysctl.c | 11 +++++++++++
 mm/migrate.c    |  5 +++++
 2 files changed, 16 insertions(+)

diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 5b8c0fb3f0ea..70a654146519 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -97,6 +97,8 @@
 
 #if defined(CONFIG_SYSCTL)
 
+extern int mt_page_copy;
+
 /* External variables not in a header file. */
 extern int suid_dumpable;
 #ifdef CONFIG_COREDUMP
@@ -1360,6 +1362,15 @@ static struct ctl_table vm_table[] = {
 		.proc_handler   = &hugetlb_mempolicy_sysctl_handler,
 	},
 #endif
+	{
+		.procname	= "mt_page_copy",
+		.data		= &mt_page_copy,
+		.maxlen		= sizeof(mt_page_copy),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec,
+		.extra1		= &zero,
+		.extra2		= &one,
+	},
 	 {
 		.procname	= "hugetlb_shm_group",
 		.data		= &sysctl_hugetlb_shm_group,
diff --git a/mm/migrate.c b/mm/migrate.c
index 2e58aad7c96f..0e9b1f17cf8b 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -48,6 +48,8 @@
 
 #include "internal.h"
 
+int mt_page_copy = 0;
+
 /*
  * migrate_prep() needs to be called before we start compiling a list of pages
  * to be migrated using isolate_lru_page(). If scheduling work on other CPUs is
@@ -618,6 +620,9 @@ static void copy_huge_page(struct page *dst, struct page *src,
 		nr_pages = hpage_nr_pages(src);
 	}
 
+	if (mt_page_copy)
+		mode |= MIGRATE_MT;
+
 	if (mode & MIGRATE_MT)
 		rc = copy_pages_mthread(dst, src, nr_pages);
 
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [RFC PATCH 07/14] migrate: Add copy_page_lists_mthread() function.
  2017-02-17 15:05 [RFC PATCH 00/14] Accelerating page migrations Zi Yan
                   ` (5 preceding siblings ...)
  2017-02-17 15:05 ` [RFC PATCH 06/14] sysctl: Add global tunable mt_page_copy Zi Yan
@ 2017-02-17 15:05 ` Zi Yan
  2017-02-23  8:54   ` Naoya Horiguchi
  2017-02-17 15:05 ` [RFC PATCH 08/14] mm: migrate: Add concurrent page migration into move_pages syscall Zi Yan
                   ` (6 subsequent siblings)
  13 siblings, 1 reply; 25+ messages in thread
From: Zi Yan @ 2017-02-17 15:05 UTC (permalink / raw)
  To: linux-mm; +Cc: dnellans, apopple, paulmck, khandual, zi.yan

From: Zi Yan <ziy@nvidia.com>

This adds support for copying a list of pages using multiple threads. It evenly
distributes the pages across a group of worker threads and uses the same copy
subroutine as copy_pages_mthread().

Signed-off-by: Zi Yan <ziy@nvidia.com>
---
 mm/copy_pages.c | 62 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 mm/internal.h   |  3 +++
 2 files changed, 65 insertions(+)

diff --git a/mm/copy_pages.c b/mm/copy_pages.c
index c357e7b01042..516c0a1a57f3 100644
--- a/mm/copy_pages.c
+++ b/mm/copy_pages.c
@@ -84,3 +84,65 @@ int copy_pages_mthread(struct page *to, struct page *from, int nr_pages)
 	kfree(work_items);
 	return 0;
 }
+
+int copy_page_lists_mthread(struct page **to, struct page **from, int nr_pages) 
+{
+	int err = 0;
+	unsigned int cthreads, node = page_to_nid(*to);
+	int i;
+	struct copy_info *work_items;
+	int nr_pages_per_page = hpage_nr_pages(*from);
+	const struct cpumask *cpumask = cpumask_of_node(node);
+	int cpu_id_list[32] = {0};
+	int cpu;
+
+	cthreads = nr_copythreads;
+	cthreads = min_t(unsigned int, cthreads, cpumask_weight(cpumask));
+	cthreads = (cthreads / 2) * 2;
+	cthreads = min_t(unsigned int, nr_pages, cthreads);
+
+	work_items = kzalloc(sizeof(struct copy_info)*nr_pages,
+						 GFP_KERNEL);
+	if (!work_items)
+		return -ENOMEM;
+
+	i = 0;
+	for_each_cpu(cpu, cpumask) {
+		if (i >= cthreads)
+			break;
+		cpu_id_list[i] = cpu;
+		++i;
+	}
+
+	for (i = 0; i < nr_pages; ++i) {
+		int thread_idx = i % cthreads;
+
+		INIT_WORK((struct work_struct *)&work_items[i], 
+				  copythread);
+
+		work_items[i].to = kmap(to[i]);
+		work_items[i].from = kmap(from[i]);
+		work_items[i].chunk_size = PAGE_SIZE * hpage_nr_pages(from[i]);
+
+		BUG_ON(nr_pages_per_page != hpage_nr_pages(from[i]));
+		BUG_ON(nr_pages_per_page != hpage_nr_pages(to[i]));
+
+
+		queue_work_on(cpu_id_list[thread_idx], 
+					  system_highpri_wq, 
+					  (struct work_struct *)&work_items[i]);
+	}
+
+	/* Wait until it finishes  */
+	for (i = 0; i < cthreads; ++i)
+		flush_work((struct work_struct *) &work_items[i]);
+
+	for (i = 0; i < nr_pages; ++i) {
+			kunmap(to[i]);
+			kunmap(from[i]);
+	}
+
+	kfree(work_items);
+
+	return err;
+}
diff --git a/mm/internal.h b/mm/internal.h
index ccfc2a2969f4..175e08ed524a 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -498,4 +498,7 @@ extern const struct trace_print_flags pageflag_names[];
 extern const struct trace_print_flags vmaflag_names[];
 extern const struct trace_print_flags gfpflag_names[];
 
+extern int copy_page_lists_mthread(struct page **to,
+			struct page **from, int nr_pages);
+
 #endif	/* __MM_INTERNAL_H */
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [RFC PATCH 08/14] mm: migrate: Add concurrent page migration into move_pages syscall.
  2017-02-17 15:05 [RFC PATCH 00/14] Accelerating page migrations Zi Yan
                   ` (6 preceding siblings ...)
  2017-02-17 15:05 ` [RFC PATCH 07/14] migrate: Add copy_page_lists_mthread() function Zi Yan
@ 2017-02-17 15:05 ` Zi Yan
  2017-02-24  8:25   ` Naoya Horiguchi
  2017-02-17 15:05 ` [RFC PATCH 09/14] mm: migrate: Add exchange_page_mthread() and exchange_page_lists_mthread() to exchange two pages or two page lists Zi Yan
                   ` (5 subsequent siblings)
  13 siblings, 1 reply; 25+ messages in thread
From: Zi Yan @ 2017-02-17 15:05 UTC (permalink / raw)
  To: linux-mm; +Cc: dnellans, apopple, paulmck, khandual, zi.yan

From: Zi Yan <ziy@nvidia.com>

Concurrent page migration moves a list of pages all together using multiple
threads, unlike the existing page migration process, which migrates pages
sequentially. The current implementation only migrates anonymous pages.

Signed-off-by: Zi Yan <ziy@nvidia.com>
---
 include/linux/migrate_mode.h   |   1 +
 include/uapi/linux/mempolicy.h |   1 +
 mm/migrate.c                   | 495 ++++++++++++++++++++++++++++++++++++++++-
 3 files changed, 492 insertions(+), 5 deletions(-)

diff --git a/include/linux/migrate_mode.h b/include/linux/migrate_mode.h
index d344ad60f499..2bd849d89122 100644
--- a/include/linux/migrate_mode.h
+++ b/include/linux/migrate_mode.h
@@ -13,6 +13,7 @@ enum migrate_mode {
 	MIGRATE_SYNC		= 1<<2,
 	MIGRATE_ST		= 1<<3,
 	MIGRATE_MT		= 1<<4,
+	MIGRATE_CONCUR		= 1<<5,
 };
 
 #endif		/* MIGRATE_MODE_H_INCLUDED */
diff --git a/include/uapi/linux/mempolicy.h b/include/uapi/linux/mempolicy.h
index 8f1db2e2d677..6d9758a32053 100644
--- a/include/uapi/linux/mempolicy.h
+++ b/include/uapi/linux/mempolicy.h
@@ -54,6 +54,7 @@ enum mpol_rebind_step {
 #define MPOL_MF_LAZY	 (1<<3)	/* Modifies '_MOVE:  lazy migrate on fault */
 #define MPOL_MF_INTERNAL (1<<4)	/* Internal flags start here */
 #define MPOL_MF_MOVE_MT  (1<<6)	/* Use multi-threaded page copy routine */
+#define MPOL_MF_MOVE_CONCUR  (1<<7)	/* Migrate a list of pages concurrently */
 
 #define MPOL_MF_VALID	(MPOL_MF_STRICT   | 	\
 			 MPOL_MF_MOVE     | 	\
diff --git a/mm/migrate.c b/mm/migrate.c
index 0e9b1f17cf8b..a35e6fd43a50 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -50,6 +50,14 @@
 
 int mt_page_copy = 0;
 
+
+struct page_migration_work_item {
+	struct page *old_page;
+	struct page *new_page;
+	struct anon_vma *anon_vma;
+	struct list_head list;
+};
+
 /*
  * migrate_prep() needs to be called before we start compiling a list of pages
  * to be migrated using isolate_lru_page(). If scheduling work on other CPUs is
@@ -1312,6 +1320,471 @@ static int unmap_and_move_huge_page(new_page_t get_new_page,
 	return rc;
 }
 
+static int __unmap_page_concur(struct page *page, struct page *newpage,
+				struct anon_vma **anon_vma,
+				int force, enum migrate_mode mode)
+{
+	int rc = -EAGAIN;
+
+	if (!trylock_page(page)) {
+		if (!force || mode == MIGRATE_ASYNC)
+			goto out;
+
+		/*
+		 * It's not safe for direct compaction to call lock_page.
+		 * For example, during page readahead pages are added locked
+		 * to the LRU. Later, when the IO completes the pages are
+		 * marked uptodate and unlocked. However, the queueing
+		 * could be merging multiple pages for one bio (e.g.
+		 * mpage_readpages). If an allocation happens for the
+		 * second or third page, the process can end up locking
+		 * the same page twice and deadlocking. Rather than
+		 * trying to be clever about what pages can be locked,
+		 * avoid the use of lock_page for direct compaction
+		 * altogether.
+		 */
+		if (current->flags & PF_MEMALLOC)
+			goto out;
+
+		lock_page(page);
+	}
+
+	/* We are working on page_mapping(page) == NULL */
+	VM_BUG_ON_PAGE(PageWriteback(page), page);
+
+	/*
+	 * By try_to_unmap(), page->mapcount goes down to 0 here. In this case,
+	 * we cannot notice that anon_vma is freed while we migrates a page.
+	 * This get_anon_vma() delays freeing anon_vma pointer until the end
+	 * of migration. File cache pages are no problem because of page_lock()
+	 * File Caches may use write_page() or lock_page() in migration, then,
+	 * just care Anon page here.
+	 *
+	 * Only page_get_anon_vma() understands the subtleties of
+	 * getting a hold on an anon_vma from outside one of its mms.
+	 * But if we cannot get anon_vma, then we won't need it anyway,
+	 * because that implies that the anon page is no longer mapped
+	 * (and cannot be remapped so long as we hold the page lock).
+	 */
+	if (PageAnon(page) && !PageKsm(page))
+		*anon_vma = page_get_anon_vma(page);
+
+	/*
+	 * Block others from accessing the new page when we get around to
+	 * establishing additional references. We are usually the only one
+	 * holding a reference to newpage at this point. We used to have a BUG
+	 * here if trylock_page(newpage) fails, but would like to allow for
+	 * cases where there might be a race with the previous use of newpage.
+	 * This is much like races on refcount of oldpage: just don't BUG().
+	 */
+	if (unlikely(!trylock_page(newpage)))
+		goto out_unlock;
+
+	/*
+	 * Corner case handling:
+	 * 1. When a new swap-cache page is read into, it is added to the LRU
+	 * and treated as swapcache but it has no rmap yet.
+	 * Calling try_to_unmap() against a page->mapping==NULL page will
+	 * trigger a BUG.  So handle it here.
+	 * 2. An orphaned page (see truncate_complete_page) might have
+	 * fs-private metadata. The page can be picked up due to memory
+	 * offlining.  Everywhere else except page reclaim, the page is
+	 * invisible to the vm, so the page can not be migrated.  So try to
+	 * free the metadata, so the page can be freed.
+	 */
+	if (!page->mapping) {
+		VM_BUG_ON_PAGE(PageAnon(page), page);
+		if (page_has_private(page)) {
+			try_to_free_buffers(page);
+			goto out_unlock_both;
+		}
+	} else {
+		VM_BUG_ON_PAGE(!page_mapped(page), page);
+		/* Establish migration ptes */
+		VM_BUG_ON_PAGE(PageAnon(page) && !PageKsm(page) && !*anon_vma,
+				page);
+		rc = try_to_unmap(page,
+			TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS);
+	}
+
+	return rc;
+
+out_unlock_both:
+	unlock_page(newpage);
+out_unlock:
+	/* Drop an anon_vma reference if we took one */
+	if (*anon_vma)
+		put_anon_vma(*anon_vma);
+	unlock_page(page);
+out:
+	return rc;
+}
+
+static int unmap_pages_and_get_new_concur(new_page_t get_new_page,
+				free_page_t put_new_page, unsigned long private,
+				struct page_migration_work_item *item,
+				int force,
+				enum migrate_mode mode, int reason)
+{
+	int rc = MIGRATEPAGE_SUCCESS;
+	int *result = NULL;
+
+	
+	item->new_page = get_new_page(item->old_page, private, &result);
+
+	if (!item->new_page) {
+		rc = -ENOMEM;
+		return rc;
+	}
+
+	if (page_count(item->old_page) == 1) {
+		rc = -ECANCELED;
+		goto out;
+	}
+
+	if (unlikely(PageTransHuge(item->old_page) &&
+		!PageTransHuge(item->new_page))) {
+		lock_page(item->old_page);
+		rc = split_huge_page(item->old_page);
+		unlock_page(item->old_page);
+		if (rc)
+			goto out;
+	}
+
+	rc = __unmap_page_concur(item->old_page, item->new_page, &item->anon_vma,
+							force, mode);
+	if (rc == MIGRATEPAGE_SUCCESS) {
+		put_new_page = NULL;
+		return rc;
+	}
+
+out:
+	if (rc != -EAGAIN) {
+		list_del(&item->old_page->lru);
+		dec_zone_page_state(item->old_page, NR_ISOLATED_ANON +
+				page_is_file_cache(item->old_page));
+
+		putback_lru_page(item->old_page);
+	}
+
+	/*
+	 * If migration was not successful and there's a freeing callback, use
+	 * it.  Otherwise, putback_lru_page() will drop the reference grabbed
+	 * during isolation.
+	 */
+	if (put_new_page)
+		put_new_page(item->new_page, private);
+	else
+		putback_lru_page(item->new_page);
+
+	if (result) {
+		if (rc)
+			*result = rc;
+		else
+			*result = page_to_nid(item->new_page);
+	}
+
+	return rc;
+}
+
+static int move_mapping_concurr(struct list_head *unmapped_list_ptr,
+					   struct list_head *wip_list_ptr,
+					   enum migrate_mode mode)
+{
+	struct page_migration_work_item *iterator, *iterator2;
+	struct address_space *mapping;
+
+	list_for_each_entry_safe(iterator, iterator2, unmapped_list_ptr, list) {
+		VM_BUG_ON_PAGE(!PageLocked(iterator->old_page), iterator->old_page);
+		VM_BUG_ON_PAGE(!PageLocked(iterator->new_page), iterator->new_page);
+
+		mapping = page_mapping(iterator->old_page);
+
+		VM_BUG_ON(mapping);
+
+		VM_BUG_ON(PageWriteback(iterator->old_page));
+
+		if (page_count(iterator->old_page) != 1) {
+			list_move(&iterator->list, wip_list_ptr);
+			continue;
+		}
+
+		iterator->new_page->index = iterator->old_page->index;
+		iterator->new_page->mapping = iterator->old_page->mapping;
+		if (PageSwapBacked(iterator->old_page))
+			SetPageSwapBacked(iterator->new_page);
+	}
+
+	return 0;
+}
+
+static void migrate_page_copy_page_flags(struct page *newpage, struct page *page)
+{
+	int cpupid;
+
+	if (PageError(page))
+		SetPageError(newpage);
+	if (PageReferenced(page))
+		SetPageReferenced(newpage);
+	if (PageUptodate(page))
+		SetPageUptodate(newpage);
+	if (TestClearPageActive(page)) {
+		VM_BUG_ON_PAGE(PageUnevictable(page), page);
+		SetPageActive(newpage);
+	} else if (TestClearPageUnevictable(page))
+		SetPageUnevictable(newpage);
+	if (PageChecked(page))
+		SetPageChecked(newpage);
+	if (PageMappedToDisk(page))
+		SetPageMappedToDisk(newpage);
+
+	/* Move dirty on pages not done by migrate_page_move_mapping() */
+	if (PageDirty(page))
+		SetPageDirty(newpage);
+
+	if (page_is_young(page))
+		set_page_young(newpage);
+	if (page_is_idle(page))
+		set_page_idle(newpage);
+
+	/*
+	 * Copy NUMA information to the new page, to prevent over-eager
+	 * future migrations of this same page.
+	 */
+	cpupid = page_cpupid_xchg_last(page, -1);
+	page_cpupid_xchg_last(newpage, cpupid);
+
+	ksm_migrate_page(newpage, page);
+	/*
+	 * Please do not reorder this without considering how mm/ksm.c's
+	 * get_ksm_page() depends upon ksm_migrate_page() and PageSwapCache().
+	 */
+	if (PageSwapCache(page))
+		ClearPageSwapCache(page);
+	ClearPagePrivate(page);
+	set_page_private(page, 0);
+
+	/*
+	 * If any waiters have accumulated on the new page then
+	 * wake them up.
+	 */
+	if (PageWriteback(newpage))
+		end_page_writeback(newpage);
+
+	copy_page_owner(page, newpage);
+
+	mem_cgroup_migrate(page, newpage);
+}
+
+
+static int copy_to_new_pages_concur(struct list_head *unmapped_list_ptr,
+				enum migrate_mode mode)
+{
+	struct page_migration_work_item *iterator;
+	int num_pages = 0, idx = 0;
+	struct page **src_page_list = NULL, **dst_page_list = NULL;
+	unsigned long size = 0;
+	int rc = -EFAULT;
+
+	list_for_each_entry(iterator, unmapped_list_ptr, list) {
+		++num_pages;
+		size += PAGE_SIZE * hpage_nr_pages(iterator->old_page);
+	}
+
+	src_page_list = kzalloc(sizeof(struct page *)*num_pages, GFP_KERNEL);
+	if (!src_page_list)
+		return -ENOMEM;
+	dst_page_list = kzalloc(sizeof(struct page *)*num_pages, GFP_KERNEL);
+	if (!dst_page_list)
+		return -ENOMEM;
+
+	list_for_each_entry(iterator, unmapped_list_ptr, list) {
+		src_page_list[idx] = iterator->old_page;
+		dst_page_list[idx] = iterator->new_page;
+		++idx;
+	}
+
+	BUG_ON(idx != num_pages);
+	
+	if (mode & MIGRATE_MT)
+		rc = copy_page_lists_mthread(dst_page_list, src_page_list,
+							num_pages);
+
+	if (rc)
+		list_for_each_entry(iterator, unmapped_list_ptr, list) {
+			if (PageHuge(iterator->old_page) ||
+				PageTransHuge(iterator->old_page))
+				copy_huge_page(iterator->new_page, iterator->old_page, 0);
+			else
+				copy_highpage(iterator->new_page, iterator->old_page);
+		}
+
+	kfree(src_page_list);
+	kfree(dst_page_list);
+
+	list_for_each_entry(iterator, unmapped_list_ptr, list) {
+		migrate_page_copy_page_flags(iterator->new_page, iterator->old_page);
+	}
+
+	return 0;
+}
+
+static int remove_migration_ptes_concurr(struct list_head *unmapped_list_ptr)
+{
+	struct page_migration_work_item *iterator, *iterator2;
+
+	list_for_each_entry_safe(iterator, iterator2, unmapped_list_ptr, list) {
+		remove_migration_ptes(iterator->old_page, iterator->new_page, false);
+
+		unlock_page(iterator->new_page);
+
+		if (iterator->anon_vma)
+			put_anon_vma(iterator->anon_vma);
+
+		unlock_page(iterator->old_page);
+
+		list_del(&iterator->old_page->lru);
+		dec_zone_page_state(iterator->old_page, NR_ISOLATED_ANON +
+				page_is_file_cache(iterator->old_page));
+
+		putback_lru_page(iterator->old_page);
+		iterator->old_page = NULL;
+
+		putback_lru_page(iterator->new_page);
+		iterator->new_page = NULL;
+	}
+
+	return 0;
+}
+
+int migrate_pages_concur(struct list_head *from, new_page_t get_new_page,
+		free_page_t put_new_page, unsigned long private,
+		enum migrate_mode mode, int reason)
+{
+	int retry = 1;
+	int nr_failed = 0;
+	int nr_succeeded = 0;
+	int pass = 0;
+	struct page *page;
+	int swapwrite = current->flags & PF_SWAPWRITE;
+	int rc;
+	int total_num_pages = 0, idx;
+	struct page_migration_work_item *item_list;
+	struct page_migration_work_item *iterator, *iterator2;
+	int item_list_order = 0;
+
+	LIST_HEAD(wip_list);
+	LIST_HEAD(unmapped_list);
+	LIST_HEAD(serialized_list);
+	LIST_HEAD(failed_list);
+
+	if (!swapwrite)
+		current->flags |= PF_SWAPWRITE;
+
+	list_for_each_entry(page, from, lru)
+		++total_num_pages;
+
+	item_list_order = get_order(total_num_pages *
+		sizeof(struct page_migration_work_item));
+
+	if (item_list_order > MAX_ORDER) {
+		item_list = alloc_pages_exact(total_num_pages *
+			sizeof(struct page_migration_work_item), GFP_ATOMIC);
+		memset(item_list, 0, total_num_pages *
+			sizeof(struct page_migration_work_item));
+	} else {
+		item_list = (struct page_migration_work_item *)__get_free_pages(GFP_ATOMIC,
+						item_list_order);
+		memset(item_list, 0, PAGE_SIZE<<item_list_order);
+	}
+
+	idx = 0;
+	list_for_each_entry(page, from, lru) {
+		item_list[idx].old_page = page;
+		item_list[idx].new_page = NULL;
+		INIT_LIST_HEAD(&item_list[idx].list);
+		list_add_tail(&item_list[idx].list, &wip_list);
+		idx += 1;
+	}
+
+	for(pass = 0; pass < 1 && retry; pass++) {
+		retry = 0;
+
+		/* unmap and get new page for page_mapping(page) == NULL */
+		list_for_each_entry_safe(iterator, iterator2, &wip_list, list) {
+			cond_resched();
+
+			if (iterator->new_page)
+				continue;
+
+			/* We do not migrate huge pages, file-backed, or swapcached pages */
+			if (PageHuge(iterator->old_page))
+				rc = -ENODEV;
+			else if ((page_mapping(iterator->old_page) != NULL))
+				rc = -ENODEV;
+			else
+				rc = unmap_pages_and_get_new_concur(get_new_page, put_new_page,
+						private, iterator, pass > 2, mode,
+						reason);
+
+			switch(rc) {
+			case -ENODEV:
+				list_move(&iterator->list, &serialized_list);
+				break;
+			case -ENOMEM:
+				goto out;
+			case -EAGAIN:
+				retry++;
+				break;
+			case MIGRATEPAGE_SUCCESS:
+				list_move(&iterator->list, &unmapped_list);
+				nr_succeeded++;
+				break;
+			default:
+				/*
+				 * Permanent failure (-EBUSY, -ENOSYS, etc.):
+				 * unlike -EAGAIN case, the failed page is
+				 * removed from migration page list and not
+				 * retried in the next outer loop.
+				 */
+				list_move(&iterator->list, &failed_list);
+				nr_failed++;
+				break;
+			}
+		}
+		/* move page->mapping to new page, only -EAGAIN could happen  */
+		move_mapping_concurr(&unmapped_list, &wip_list, mode);
+		/* copy pages in unmapped_list */
+		copy_to_new_pages_concur(&unmapped_list, mode);
+		/* remove migration pte, if old_page is NULL?, unlock old and new
+		 * pages, put anon_vma, put old and new pages */
+		remove_migration_ptes_concurr(&unmapped_list);
+	}
+	nr_failed += retry;
+	rc = nr_failed;
+
+	if (!list_empty(&serialized_list))
+		rc = migrate_pages(from, get_new_page, put_new_page,
+				private, mode, reason);
+out:
+	if (nr_succeeded)
+		count_vm_events(PGMIGRATE_SUCCESS, nr_succeeded);
+	if (nr_failed)
+		count_vm_events(PGMIGRATE_FAIL, nr_failed);
+	trace_mm_migrate_pages(nr_succeeded, nr_failed, mode, reason);
+
+	if (item_list_order >= MAX_ORDER)
+		free_pages_exact(item_list, total_num_pages *
+			sizeof(struct page_migration_work_item));
+	else
+		free_pages((unsigned long)item_list, item_list_order);
+
+	if (!swapwrite)
+		current->flags &= ~PF_SWAPWRITE;
+
+	return rc;
+}
+
 /*
  * migrate_pages - migrate the pages specified in a list, to the free pages
  *		   supplied as the target for the page migration
@@ -1452,7 +1925,8 @@ static struct page *new_page_node(struct page *p, unsigned long private,
 static int do_move_page_to_node_array(struct mm_struct *mm,
 				      struct page_to_node *pm,
 				      int migrate_all,
-					  int migrate_use_mt)
+					  int migrate_use_mt,
+					  int migrate_concur)
 {
 	int err;
 	struct page_to_node *pp;
@@ -1536,8 +2010,16 @@ static int do_move_page_to_node_array(struct mm_struct *mm,
 
 	err = 0;
 	if (!list_empty(&pagelist)) {
-		err = migrate_pages(&pagelist, new_page_node, NULL,
-				(unsigned long)pm, mode, MR_SYSCALL);
+		if (migrate_concur)
+			err = migrate_pages_concur(&pagelist, new_page_node, NULL,
+					(unsigned long)pm,
+					mode,
+					MR_SYSCALL);
+		else
+			err = migrate_pages(&pagelist, new_page_node, NULL,
+					(unsigned long)pm,
+					mode,
+					MR_SYSCALL);
 		if (err)
 			putback_movable_pages(&pagelist);
 	}
@@ -1615,7 +2097,8 @@ static int do_pages_move(struct mm_struct *mm, nodemask_t task_nodes,
 		/* Migrate this chunk */
 		err = do_move_page_to_node_array(mm, pm,
 						 flags & MPOL_MF_MOVE_ALL,
-						 flags & MPOL_MF_MOVE_MT);
+						 flags & MPOL_MF_MOVE_MT,
+						 flags & MPOL_MF_MOVE_CONCUR);
 		if (err < 0)
 			goto out_pm;
 
@@ -1722,7 +2205,9 @@ SYSCALL_DEFINE6(move_pages, pid_t, pid, unsigned long, nr_pages,
 	nodemask_t task_nodes;
 
 	/* Check flags */
-	if (flags & ~(MPOL_MF_MOVE|MPOL_MF_MOVE_ALL|MPOL_MF_MOVE_MT))
+	if (flags & ~(MPOL_MF_MOVE|MPOL_MF_MOVE_ALL|
+				  MPOL_MF_MOVE_MT|
+				  MPOL_MF_MOVE_CONCUR))
 		return -EINVAL;
 
 	if ((flags & MPOL_MF_MOVE_ALL) && !capable(CAP_SYS_NICE))
-- 
2.11.0


* [RFC PATCH 09/14] mm: migrate: Add exchange_page_mthread() and exchange_page_lists_mthread() to exchange two pages or two page lists.
  2017-02-17 15:05 [RFC PATCH 00/14] Accelerating page migrations Zi Yan
                   ` (7 preceding siblings ...)
  2017-02-17 15:05 ` [RFC PATCH 08/14] mm: migrate: Add concurrent page migration into move_pages syscall Zi Yan
@ 2017-02-17 15:05 ` Zi Yan
  2017-02-17 15:05 ` [RFC PATCH 10/14] mm: Add exchange_pages and exchange_pages_concur functions to exchange two lists of pages instead of two migrate_pages() Zi Yan
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 25+ messages in thread
From: Zi Yan @ 2017-02-17 15:05 UTC (permalink / raw)
  To: linux-mm; +Cc: dnellans, apopple, paulmck, khandual, zi.yan

From: Zi Yan <ziy@nvidia.com>

When pages are going to be migrated into a full memory node, instead of
a two-step migrate_pages(), we use exchange_page_mthread() to exchange
the contents of two pages directly. This saves two unnecessary page
allocations.

The current implementation only supports anonymous pages.
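
The per-chunk work item boils down to an in-place, word-by-word swap, so
no third buffer is needed. A stand-alone user-space sketch of that idea
(illustrative only, not part of this patch):

#include <stdint.h>
#include <stddef.h>

/* Swap two equal-sized buffers in place; size must be a multiple of 8. */
static void exchange_chunk(void *a, void *b, size_t size)
{
	uint64_t *pa = a, *pb = b, tmp;
	size_t i;

	for (i = 0; i < size / sizeof(uint64_t); i++) {
		tmp = pa[i];
		pa[i] = pb[i];
		pb[i] = tmp;
	}
}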

Signed-off-by: Zi Yan <ziy@nvidia.com>
---
 mm/copy_pages.c | 133 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 mm/internal.h   |   5 +++
 2 files changed, 138 insertions(+)

diff --git a/mm/copy_pages.c b/mm/copy_pages.c
index 516c0a1a57f3..879e2d944ad0 100644
--- a/mm/copy_pages.c
+++ b/mm/copy_pages.c
@@ -146,3 +146,136 @@ int copy_page_lists_mthread(struct page **to, struct page **from, int nr_pages)
 
 	return err;
 }
+static void exchange_page_routine(char *to, char *from, unsigned long chunk_size)
+{
+	u64 tmp;
+	int i;
+
+	for (i = 0; i < chunk_size; i += sizeof(tmp)) {
+		tmp = *((u64*)(from + i));
+		*((u64*)(from + i)) = *((u64*)(to + i));
+		*((u64*)(to + i)) = tmp;
+	}
+}
+
+static void exchange_page_work_queue_thread(struct work_struct *work)
+{
+	struct copy_info *my_work = (struct copy_info*)work;
+
+	exchange_page_routine(my_work->to,
+					  my_work->from,
+					  my_work->chunk_size);
+}
+
+int exchange_page_mthread(struct page *to, struct page *from, int nr_pages)
+{
+	int total_mt_num = nr_copythreads;
+	int to_node = page_to_nid(to);
+	int i;
+	struct copy_info *work_items;
+	char *vto, *vfrom;
+	unsigned long chunk_size;
+	const struct cpumask *per_node_cpumask = cpumask_of_node(to_node);
+	int cpu_id_list[32] = {0};
+	int cpu;
+
+	work_items = kzalloc(sizeof(struct copy_info)*total_mt_num,
+						 GFP_KERNEL);
+	if (!work_items)
+		return -ENOMEM;
+
+	i = 0;
+	for_each_cpu(cpu, per_node_cpumask) {
+		if (i >= total_mt_num)
+			break;
+		cpu_id_list[i] = cpu;
+		++i;
+	}
+
+	vfrom = kmap(from);
+	vto = kmap(to);
+	chunk_size = PAGE_SIZE*nr_pages / total_mt_num;
+
+	for (i = 0; i < total_mt_num; ++i) {
+		INIT_WORK((struct work_struct *)&work_items[i],
+				exchange_page_work_queue_thread);
+
+		work_items[i].to = vto + i * chunk_size;
+		work_items[i].from = vfrom + i * chunk_size;
+		work_items[i].chunk_size = chunk_size;
+
+		queue_work_on(cpu_id_list[i],
+					  system_highpri_wq,
+					  (struct work_struct *)&work_items[i]);
+	}
+
+	/* Wait until it finishes  */
+	for (i = 0; i < total_mt_num; ++i)
+		flush_work((struct work_struct *) &work_items[i]);
+
+	kunmap(to);
+	kunmap(from);
+
+	kfree(work_items);
+
+	return 0;
+}
+
+int exchange_page_lists_mthread(struct page **to, struct page **from,
+		int nr_pages)
+{
+	int err = 0;
+	int total_mt_num = nr_copythreads;
+	int to_node = page_to_nid(*to);
+	int i;
+	struct copy_info *work_items;
+	int nr_pages_per_page = hpage_nr_pages(*from);
+	const struct cpumask *per_node_cpumask = cpumask_of_node(to_node);
+	int cpu_id_list[32] = {0};
+	int cpu;
+
+	work_items = kzalloc(sizeof(struct copy_info)*nr_pages,
+						 GFP_KERNEL);
+	if (!work_items)
+		return -ENOMEM;
+
+	total_mt_num = min_t(int, nr_pages, total_mt_num);
+
+	i = 0;
+	for_each_cpu(cpu, per_node_cpumask) {
+		if (i >= total_mt_num)
+			break;
+		cpu_id_list[i] = cpu;
+		++i;
+	}
+
+	for (i = 0; i < nr_pages; ++i) {
+		int thread_idx = i % total_mt_num;
+
+		INIT_WORK((struct work_struct *)&work_items[i],
+				exchange_page_work_queue_thread);
+
+		work_items[i].to = kmap(to[i]);
+		work_items[i].from = kmap(from[i]);
+		work_items[i].chunk_size = PAGE_SIZE * hpage_nr_pages(from[i]);
+
+		BUG_ON(nr_pages_per_page != hpage_nr_pages(from[i]));
+		BUG_ON(nr_pages_per_page != hpage_nr_pages(to[i]));
+
+
+		queue_work_on(cpu_id_list[thread_idx], system_highpri_wq, (struct work_struct *)&work_items[i]);
+	}
+
+	/* Wait until all queued work items finish */
+	for (i = 0; i < nr_pages; ++i)
+		flush_work((struct work_struct *) &work_items[i]);
+
+	for (i = 0; i < nr_pages; ++i) {
+			kunmap(to[i]);
+			kunmap(from[i]);
+	}
+
+	kfree(work_items);
+
+	return err;
+}
diff --git a/mm/internal.h b/mm/internal.h
index 175e08ed524a..b99a634b4d09 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -501,4 +501,9 @@ extern const struct trace_print_flags gfpflag_names[];
 extern int copy_page_lists_mthread(struct page **to,
 			struct page **from, int nr_pages);
 
+extern int exchange_page_mthread(struct page *to, struct page *from,
+			int nr_pages);
+extern int exchange_page_lists_mthread(struct page **to,
+						  struct page **from, 
+						  int nr_pages);
 #endif	/* __MM_INTERNAL_H */
-- 
2.11.0


* [RFC PATCH 10/14] mm: Add exchange_pages and exchange_pages_concur functions to exchange two lists of pages instead of two migrate_pages().
  2017-02-17 15:05 [RFC PATCH 00/14] Accelerating page migrations Zi Yan
                   ` (8 preceding siblings ...)
  2017-02-17 15:05 ` [RFC PATCH 09/14] mm: migrate: Add exchange_page_mthread() and exchange_page_lists_mthread() to exchange two pages or two page lists Zi Yan
@ 2017-02-17 15:05 ` Zi Yan
  2017-02-17 15:05 ` [RFC PATCH 11/14] mm: migrate: Add exchange_pages syscall to exchange two page lists Zi Yan
                   ` (3 subsequent siblings)
  13 siblings, 0 replies; 25+ messages in thread
From: Zi Yan @ 2017-02-17 15:05 UTC (permalink / raw)
  To: linux-mm; +Cc: dnellans, apopple, paulmck, khandual, zi.yan

From: Zi Yan <ziy@nvidia.com>

This aims to reduce the overhead of calling migrate_pages() twice when two
lists of pages are exchanged between memory nodes.
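
A hedged sketch of how a kernel-side caller might drive the new interface:
build a list of exchange_page_info pairs from already-isolated pages and
hand it to exchange_pages(). The helper exchange_two_isolated_pages() below
is illustrative only and not part of this patch:

static int exchange_two_isolated_pages(struct page *a, struct page *b)
{
	struct exchange_page_info pair = {
		.from_page = a,
		.to_page = b,
	};
	LIST_HEAD(pairs);
	int failed;

	list_add_tail(&pair.list, &pairs);

	/* exchange_pages() returns the number of pairs it failed to exchange */
	failed = exchange_pages(&pairs, MIGRATE_SYNC, MR_SYSCALL);
	return failed ? -EBUSY : 0;
}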

Signed-off-by: Zi Yan <ziy@nvidia.com>
---
 include/linux/ksm.h |   5 +
 mm/Makefile         |   1 +
 mm/exchange.c       | 888 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 mm/ksm.c            |  35 +++
 4 files changed, 929 insertions(+)
 create mode 100644 mm/exchange.c

diff --git a/include/linux/ksm.h b/include/linux/ksm.h
index 481c8c4627ca..f2659be1984e 100644
--- a/include/linux/ksm.h
+++ b/include/linux/ksm.h
@@ -62,6 +62,7 @@ struct page *ksm_might_need_to_copy(struct page *page,
 
 int rmap_walk_ksm(struct page *page, struct rmap_walk_control *rwc);
 void ksm_migrate_page(struct page *newpage, struct page *oldpage);
+void ksm_exchange_page(struct page *to_page, struct page *from_page);
 
 #else  /* !CONFIG_KSM */
 
@@ -102,6 +103,10 @@ static inline int rmap_walk_ksm(struct page *page,
 static inline void ksm_migrate_page(struct page *newpage, struct page *oldpage)
 {
 }
+static inline void ksm_exchange_page(struct page *to_page,
+				struct page *from_page)
+{
+}
 #endif /* CONFIG_MMU */
 #endif /* !CONFIG_KSM */
 
diff --git a/mm/Makefile b/mm/Makefile
index cdd4bab9cc66..56afe2210746 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -44,6 +44,7 @@ obj-y			:= filemap.o mempool.o oom_kill.o \
 obj-y += init-mm.o
 
 obj-y += copy_pages.o
+obj-y += exchange.o
 
 ifdef CONFIG_NO_BOOTMEM
 	obj-y		+= nobootmem.o
diff --git a/mm/exchange.c b/mm/exchange.c
new file mode 100644
index 000000000000..dfed26ebff47
--- /dev/null
+++ b/mm/exchange.c
@@ -0,0 +1,888 @@
+/*
+ * Exchange two in-use pages. Page flags and page->mapping are exchanged
+ * as well. Only anonymous pages are supported.
+ *
+ * Copyright (C) 2016 NVIDIA, Zi Yan <ziy@nvidia.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.
+ */
+
+#include <linux/syscalls.h>
+#include <linux/migrate.h>
+#include <linux/security.h>
+#include <linux/cpuset.h>
+#include <linux/hugetlb.h>
+#include <linux/mm_inline.h>
+#include <linux/page_idle.h>
+#include <linux/page-flags.h>
+#include <linux/ksm.h>
+#include <linux/memcontrol.h>
+#include <linux/balloon_compaction.h>
+#include <linux/buffer_head.h>
+
+
+#include "internal.h"
+
+/*
+ * A pair of user addresses whose backing pages will be exchanged,
+ * together with the per-side status returned to user space.
+ */
+struct pages_to_node {
+	unsigned long from_addr;
+	int from_status;
+
+	unsigned long to_addr;
+	int to_status;
+};
+
+struct exchange_page_info {
+	struct page *from_page;
+	struct page *to_page;
+
+	struct anon_vma *from_anon_vma;
+	struct anon_vma *to_anon_vma;
+
+	struct list_head list;
+};
+
+struct page_flags {
+	unsigned int page_error :1;
+	unsigned int page_referenced:1;
+	unsigned int page_uptodate:1;
+	unsigned int page_active:1;
+	unsigned int page_unevictable:1;
+	unsigned int page_checked:1;
+	unsigned int page_mappedtodisk:1;
+	unsigned int page_dirty:1;
+	unsigned int page_is_young:1;
+	unsigned int page_is_idle:1;
+	unsigned int page_swapcache:1;
+	unsigned int page_writeback:1;
+	unsigned int page_private:1;
+	unsigned int __pad:3;
+};
+
+
+static void exchange_page(char *to, char *from)
+{
+	u64 tmp;
+	int i;
+
+	for (i = 0; i < PAGE_SIZE; i += sizeof(tmp)) {
+		tmp = *((u64*)(from + i));
+		*((u64*)(from + i)) = *((u64*)(to + i));
+		*((u64*)(to + i)) = tmp;
+	}
+}
+
+static inline void exchange_highpage(struct page *to, struct page *from)
+{
+	char *vfrom, *vto;
+
+	vfrom = kmap_atomic(from);
+	vto = kmap_atomic(to);
+	exchange_page(vto, vfrom);
+	kunmap_atomic(vto);
+	kunmap_atomic(vfrom);
+}
+
+static void __exchange_gigantic_page(struct page *dst, struct page *src,
+				int nr_pages)
+{
+	int i;
+	struct page *dst_base = dst;
+	struct page *src_base = src;
+
+	for (i = 0; i < nr_pages; ) {
+		cond_resched();
+		exchange_highpage(dst, src);
+
+		i++;
+		dst = mem_map_next(dst, dst_base, i);
+		src = mem_map_next(src, src_base, i);
+	}
+}
+
+static void exchange_huge_page(struct page *dst, struct page *src)
+{
+	int i;
+	int nr_pages;
+
+	if (PageHuge(src)) {
+		/* hugetlbfs page */
+		struct hstate *h = page_hstate(src);
+		nr_pages = pages_per_huge_page(h);
+
+		if (unlikely(nr_pages > MAX_ORDER_NR_PAGES)) {
+			__exchange_gigantic_page(dst, src, nr_pages);
+			return;
+		}
+	} else {
+		/* thp page */
+		BUG_ON(!PageTransHuge(src));
+		nr_pages = hpage_nr_pages(src);
+	}
+
+	for (i = 0; i < nr_pages; i++) {
+		cond_resched();
+		exchange_highpage(dst + i, src + i);
+	}
+}
+
+/*
+ * Exchange page flags and related metadata between the two pages.
+ */
+static void exchange_page_flags(struct page *to_page, struct page *from_page)
+{
+	int from_cpupid, to_cpupid;
+	struct page_flags from_page_flags, to_page_flags;
+	struct mem_cgroup *to_memcg = page_memcg(to_page),
+					  *from_memcg = page_memcg(from_page);
+
+	from_cpupid = page_cpupid_xchg_last(from_page, -1);
+
+	from_page_flags.page_error = TestClearPageError(from_page);
+	from_page_flags.page_referenced = TestClearPageReferenced(from_page);
+	from_page_flags.page_uptodate = PageUptodate(from_page);
+	ClearPageUptodate(from_page);
+	from_page_flags.page_active = TestClearPageActive(from_page);
+	from_page_flags.page_unevictable = TestClearPageUnevictable(from_page);
+	from_page_flags.page_checked = PageChecked(from_page);
+	ClearPageChecked(from_page);
+	from_page_flags.page_mappedtodisk = PageMappedToDisk(from_page);
+	ClearPageMappedToDisk(from_page);
+	from_page_flags.page_dirty = PageDirty(from_page);
+	ClearPageDirty(from_page);
+	from_page_flags.page_is_young = test_and_clear_page_young(from_page);
+	from_page_flags.page_is_idle = page_is_idle(from_page);
+	clear_page_idle(from_page);
+	from_page_flags.page_swapcache = PageSwapCache(from_page);
+	from_page_flags.page_private = PagePrivate(from_page);
+	ClearPagePrivate(from_page);
+	from_page_flags.page_writeback = test_clear_page_writeback(from_page);
+
+
+	to_cpupid = page_cpupid_xchg_last(to_page, -1);
+
+	to_page_flags.page_error = TestClearPageError(to_page);
+	to_page_flags.page_referenced = TestClearPageReferenced(to_page);
+	to_page_flags.page_uptodate = PageUptodate(to_page);
+	ClearPageUptodate(to_page);
+	to_page_flags.page_active = TestClearPageActive(to_page);
+	to_page_flags.page_unevictable = TestClearPageUnevictable(to_page);
+	to_page_flags.page_checked = PageChecked(to_page);
+	ClearPageChecked(to_page);
+	to_page_flags.page_mappedtodisk = PageMappedToDisk(to_page);
+	ClearPageMappedToDisk(to_page);
+	to_page_flags.page_dirty = PageDirty(to_page);
+	ClearPageDirty(to_page);
+	to_page_flags.page_is_young = test_and_clear_page_young(to_page);
+	to_page_flags.page_is_idle = page_is_idle(to_page);
+	clear_page_idle(to_page);
+	to_page_flags.page_swapcache = PageSwapCache(to_page);
+	to_page_flags.page_private = PagePrivate(to_page);
+	ClearPagePrivate(to_page);
+	to_page_flags.page_writeback = test_clear_page_writeback(to_page);
+
+	/* set to_page */
+	if (from_page_flags.page_error)
+		SetPageError(to_page);
+	if (from_page_flags.page_referenced)
+		SetPageReferenced(to_page);
+	if (from_page_flags.page_uptodate)
+		SetPageUptodate(to_page);
+	if (from_page_flags.page_active) {
+		VM_BUG_ON_PAGE(from_page_flags.page_unevictable, from_page);
+		SetPageActive(to_page);
+	} else if (from_page_flags.page_unevictable)
+		SetPageUnevictable(to_page);
+	if (from_page_flags.page_checked)
+		SetPageChecked(to_page);
+	if (from_page_flags.page_mappedtodisk)
+		SetPageMappedToDisk(to_page);
+
+	/* Move dirty on pages not done by migrate_page_move_mapping() */
+	if (from_page_flags.page_dirty)
+		SetPageDirty(to_page);
+
+	if (from_page_flags.page_is_young)
+		set_page_young(to_page);
+	if (from_page_flags.page_is_idle)
+		set_page_idle(to_page);
+
+	/* set from_page */
+	if (to_page_flags.page_error)
+		SetPageError(from_page);
+	if (to_page_flags.page_referenced)
+		SetPageReferenced(from_page);
+	if (to_page_flags.page_uptodate)
+		SetPageUptodate(from_page);
+	if (to_page_flags.page_active) {
+		VM_BUG_ON_PAGE(to_page_flags.page_unevictable, from_page);
+		SetPageActive(from_page);
+	} else if (to_page_flags.page_unevictable)
+		SetPageUnevictable(from_page);
+	if (to_page_flags.page_checked)
+		SetPageChecked(from_page);
+	if (to_page_flags.page_mappedtodisk)
+		SetPageMappedToDisk(from_page);
+
+	/* Move dirty on pages not done by migrate_page_move_mapping() */
+	if (to_page_flags.page_dirty)
+		SetPageDirty(from_page);
+
+	if (to_page_flags.page_is_young)
+		set_page_young(from_page);
+	if (to_page_flags.page_is_idle)
+		set_page_idle(from_page);
+
+	/*
+	 * Copy NUMA information to the new page, to prevent over-eager
+	 * future migrations of this same page.
+	 */
+	page_cpupid_xchg_last(to_page, from_cpupid);
+	page_cpupid_xchg_last(from_page, to_cpupid);
+
+	ksm_exchange_page(to_page, from_page);
+	/*
+	 * Please do not reorder this without considering how mm/ksm.c's
+	 * get_ksm_page() depends upon ksm_migrate_page() and PageSwapCache().
+	 */
+	ClearPageSwapCache(to_page);
+	ClearPageSwapCache(from_page);
+	if (from_page_flags.page_swapcache)
+		SetPageSwapCache(to_page);
+	if (to_page_flags.page_swapcache)
+		SetPageSwapCache(from_page);
+
+
+#ifdef CONFIG_PAGE_OWNER
+	/* exchange page owner  */
+	BUG();
+#endif
+	/* exchange mem cgroup  */
+	to_page->mem_cgroup = from_memcg;
+	from_page->mem_cgroup = to_memcg;
+
+}
+
+/*
+ * Replace the page in the mapping.
+ *
+ * The number of remaining references must be:
+ * 1 for anonymous pages without a mapping
+ * 2 for pages with a mapping
+ * 3 for pages with a mapping and PagePrivate/PagePrivate2 set.
+ */
+
+static int exchange_page_move_mapping(struct address_space *to_mapping,
+			struct address_space *from_mapping,
+			struct page *to_page, struct page *from_page,
+			enum migrate_mode mode,
+			int to_extra_count, int from_extra_count)
+{
+	int to_expected_count = 1 + to_extra_count,
+		from_expected_count = 1 + from_extra_count;
+	unsigned long from_page_index = page_index(from_page),
+				  to_page_index = page_index(to_page);
+	int to_swapbacked = PageSwapBacked(to_page),
+		from_swapbacked = PageSwapBacked(from_page);
+	struct address_space *to_mapping_value = to_page->mapping,
+						 *from_mapping_value = from_page->mapping;
+
+
+	if (!to_mapping) {
+		/* Anonymous page without mapping */
+		if (page_count(to_page) != to_expected_count)
+			return -EAGAIN;
+	}
+
+	if (!from_mapping) {
+		/* Anonymous page without mapping */
+		if (page_count(from_page) != from_expected_count)
+			return -EAGAIN;
+	}
+
+	/*
+	 * Now we know that no one else is looking at the page:
+	 * no turning back from here.
+	 */
+	/* from_page  */
+	from_page->index = to_page_index;
+	from_page->mapping = to_mapping_value;
+
+	ClearPageSwapBacked(from_page);
+	if (to_swapbacked)
+		SetPageSwapBacked(from_page);
+
+
+	/* to_page  */
+	to_page->index = from_page_index;
+	to_page->mapping = from_mapping_value;
+
+	ClearPageSwapBacked(to_page);
+	if (from_swapbacked)
+		SetPageSwapBacked(to_page);
+
+	return MIGRATEPAGE_SUCCESS;
+}
+
+static int exchange_from_to_pages(struct page *to_page, struct page *from_page,
+				enum migrate_mode mode)
+{
+	int rc = -EBUSY;
+	struct address_space *to_page_mapping, *from_page_mapping;
+
+	VM_BUG_ON_PAGE(!PageLocked(from_page), from_page);
+	VM_BUG_ON_PAGE(!PageLocked(to_page), to_page);
+
+	/* both pages must be plain anonymous pages: page_mapping() must be NULL */
+	to_page_mapping = page_mapping(to_page);
+	from_page_mapping = page_mapping(from_page);
+
+	BUG_ON(from_page_mapping);
+	BUG_ON(to_page_mapping);
+
+	BUG_ON(PageWriteback(from_page));
+	BUG_ON(PageWriteback(to_page));
+
+	/* actual page mapping exchange */
+	rc = exchange_page_move_mapping(to_page_mapping, from_page_mapping,
+						to_page, from_page, mode, 0, 0);
+	/* actual page data exchange  */
+	if (rc != MIGRATEPAGE_SUCCESS)
+		return rc;
+
+	rc = -EFAULT;
+
+	if (mode & MIGRATE_MT)
+		rc = exchange_page_mthread(to_page, from_page,
+				hpage_nr_pages(from_page));
+	if (rc) {
+		if (PageHuge(from_page) || PageTransHuge(from_page))
+			exchange_huge_page(to_page, from_page);
+		else
+			exchange_highpage(to_page, from_page);
+		rc = 0;
+	}
+
+	exchange_page_flags(to_page, from_page);
+
+	return rc;
+}
+
+static int unmap_and_exchange_anon(struct page *from_page, struct page *to_page,
+				enum migrate_mode mode)
+{
+	int rc = -EAGAIN;
+	int from_page_was_mapped = 0, to_page_was_mapped = 0;
+	struct anon_vma *anon_vma_from_page = NULL, *anon_vma_to_page = NULL;
+
+	/* from_page lock down  */
+	if (!trylock_page(from_page)) {
+		if (mode & MIGRATE_ASYNC)
+			goto out;
+
+		lock_page(from_page);
+	}
+
+	BUG_ON(PageWriteback(from_page));
+
+	/*
+	 * By try_to_unmap(), page->mapcount goes down to 0 here. In this case,
+	 * we cannot notice that anon_vma is freed while we migrates a page.
+	 * This get_anon_vma() delays freeing anon_vma pointer until the end
+	 * of migration. File cache pages are no problem because of page_lock()
+	 * File Caches may use write_page() or lock_page() in migration, then,
+	 * just care Anon page here.
+	 *
+	 * Only page_get_anon_vma() understands the subtleties of
+	 * getting a hold on an anon_vma from outside one of its mms.
+	 * But if we cannot get anon_vma, then we won't need it anyway,
+	 * because that implies that the anon page is no longer mapped
+	 * (and cannot be remapped so long as we hold the page lock).
+	 */
+	if (PageAnon(from_page) && !PageKsm(from_page))
+		anon_vma_from_page = page_get_anon_vma(from_page);
+
+	/* to_page lock down  */
+	if (!trylock_page(to_page)) {
+		if (mode & MIGRATE_ASYNC)
+			goto out_unlock;
+
+		lock_page(to_page);
+	}
+
+	BUG_ON(PageWriteback(to_page));
+
+	/*
+	 * By try_to_unmap(), page->mapcount goes down to 0 here. In this case,
+	 * we cannot notice that anon_vma is freed while we migrates a page.
+	 * This get_anon_vma() delays freeing anon_vma pointer until the end
+	 * of migration. File cache pages are no problem because of page_lock()
+	 * File Caches may use write_page() or lock_page() in migration, then,
+	 * just care Anon page here.
+	 *
+	 * Only page_get_anon_vma() understands the subtleties of
+	 * getting a hold on an anon_vma from outside one of its mms.
+	 * But if we cannot get anon_vma, then we won't need it anyway,
+	 * because that implies that the anon page is no longer mapped
+	 * (and cannot be remapped so long as we hold the page lock).
+	 */
+	if (PageAnon(to_page) && !PageKsm(to_page))
+		anon_vma_to_page = page_get_anon_vma(to_page);
+
+	/*
+	 * Corner case handling:
+	 * 1. When a new swap-cache page is read into, it is added to the LRU
+	 * and treated as swapcache but it has no rmap yet.
+	 * Calling try_to_unmap() against a page->mapping==NULL page will
+	 * trigger a BUG.  So handle it here.
+	 * 2. An orphaned page (see truncate_complete_page) might have
+	 * fs-private metadata. The page can be picked up due to memory
+	 * offlining.  Everywhere else except page reclaim, the page is
+	 * invisible to the vm, so the page can not be migrated.  So try to
+	 * free the metadata, so the page can be freed.
+	 */
+	if (!from_page->mapping) {
+		VM_BUG_ON_PAGE(PageAnon(from_page), from_page);
+		if (page_has_private(from_page)) {
+			try_to_free_buffers(from_page);
+			goto out_unlock_both;
+		}
+	} else if (page_mapped(from_page)) {
+		/* Establish migration ptes */
+		VM_BUG_ON_PAGE(PageAnon(from_page) && !PageKsm(from_page) &&
+					   !anon_vma_from_page, from_page);
+		try_to_unmap(from_page,
+			TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS);
+		from_page_was_mapped = 1;
+	}
+
+	if (!to_page->mapping) {
+		VM_BUG_ON_PAGE(PageAnon(to_page), to_page);
+		if (page_has_private(to_page)) {
+			try_to_free_buffers(to_page);
+			goto out_unlock_both;
+		}
+	} else if (page_mapped(to_page)) {
+		/* Establish migration ptes */
+		VM_BUG_ON_PAGE(PageAnon(to_page) && !PageKsm(to_page) &&
+					   !anon_vma_to_page, to_page);
+		try_to_unmap(to_page,
+			TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS);
+		to_page_was_mapped = 1;
+	}
+
+	if (!page_mapped(from_page) && !page_mapped(to_page))
+		rc = exchange_from_to_pages(to_page, from_page, mode);
+
+	if (from_page_was_mapped)
+		remove_migration_ptes(from_page,
+			rc == MIGRATEPAGE_SUCCESS ? to_page : from_page, false);
+
+	if (to_page_was_mapped)
+		remove_migration_ptes(to_page,
+			rc == MIGRATEPAGE_SUCCESS ? from_page : to_page, false);
+
+
+out_unlock_both:
+	if (anon_vma_to_page)
+		put_anon_vma(anon_vma_to_page);
+	unlock_page(to_page);
+out_unlock:
+	/* Drop an anon_vma reference if we took one */
+	if (anon_vma_from_page)
+		put_anon_vma(anon_vma_from_page);
+	unlock_page(from_page);
+out:
+
+	return rc;
+}
+
+/*
+ * Exchange pages in the exchange_list
+ *
+ * Caller should release the exchange_list resource.
+ *
+ */
+static int exchange_pages(struct list_head *exchange_list,
+			enum migrate_mode mode,
+			int reason)
+{
+	struct exchange_page_info *one_pair, *one_pair2;
+	int failed = 0;
+
+	list_for_each_entry_safe(one_pair, one_pair2, exchange_list, list) {
+		struct page *from_page = one_pair->from_page;
+		struct page *to_page = one_pair->to_page;
+		int rc;
+
+		if ((page_mapping(from_page) != NULL) ||
+			(page_mapping(to_page) != NULL)) {
+			++failed;
+			goto putback;
+		}
+
+		
+		rc = unmap_and_exchange_anon(from_page, to_page, mode);
+
+		if (rc != MIGRATEPAGE_SUCCESS)
+			++failed;
+
+putback:
+		putback_lru_page(from_page);
+		putback_lru_page(to_page);
+
+	}
+	return failed;
+}
+
+
+static int unmap_pair_pages_concur(struct exchange_page_info *one_pair,
+				int force, enum migrate_mode mode)
+{
+	int rc = -EAGAIN;
+	struct anon_vma *anon_vma_from_page = NULL, *anon_vma_to_page = NULL;
+	struct page *from_page = one_pair->from_page;
+	struct page *to_page = one_pair->to_page;
+
+	/* from_page lock down  */
+	if (!trylock_page(from_page)) {
+		if (!force || (mode & MIGRATE_ASYNC))
+			goto out;
+
+		lock_page(from_page);
+	}
+
+	BUG_ON(PageWriteback(from_page));
+
+	/*
+	 * By try_to_unmap(), page->mapcount goes down to 0 here. In this case,
+	 * we cannot notice that anon_vma is freed while we migrates a page.
+	 * This get_anon_vma() delays freeing anon_vma pointer until the end
+	 * of migration. File cache pages are no problem because of page_lock()
+	 * File Caches may use write_page() or lock_page() in migration, then,
+	 * just care Anon page here.
+	 *
+	 * Only page_get_anon_vma() understands the subtleties of
+	 * getting a hold on an anon_vma from outside one of its mms.
+	 * But if we cannot get anon_vma, then we won't need it anyway,
+	 * because that implies that the anon page is no longer mapped
+	 * (and cannot be remapped so long as we hold the page lock).
+	 */
+	if (PageAnon(from_page) && !PageKsm(from_page))
+		one_pair->from_anon_vma = anon_vma_from_page
+					= page_get_anon_vma(from_page);
+
+	/* to_page lock down  */
+	if (!trylock_page(to_page)) {
+		if (!force || (mode & MIGRATE_ASYNC))
+			goto out_unlock;
+
+		lock_page(to_page);
+	}
+
+	BUG_ON(PageWriteback(to_page));
+
+	/*
+	 * By try_to_unmap(), page->mapcount goes down to 0 here. In this case,
+	 * we cannot notice that anon_vma is freed while we migrates a page.
+	 * This get_anon_vma() delays freeing anon_vma pointer until the end
+	 * of migration. File cache pages are no problem because of page_lock()
+	 * File Caches may use write_page() or lock_page() in migration, then,
+	 * just care Anon page here.
+	 *
+	 * Only page_get_anon_vma() understands the subtleties of
+	 * getting a hold on an anon_vma from outside one of its mms.
+	 * But if we cannot get anon_vma, then we won't need it anyway,
+	 * because that implies that the anon page is no longer mapped
+	 * (and cannot be remapped so long as we hold the page lock).
+	 */
+	if (PageAnon(to_page) && !PageKsm(to_page))
+		one_pair->to_anon_vma = anon_vma_to_page = page_get_anon_vma(to_page);
+
+	/*
+	 * Corner case handling:
+	 * 1. When a new swap-cache page is read into, it is added to the LRU
+	 * and treated as swapcache but it has no rmap yet.
+	 * Calling try_to_unmap() against a page->mapping==NULL page will
+	 * trigger a BUG.  So handle it here.
+	 * 2. An orphaned page (see truncate_complete_page) might have
+	 * fs-private metadata. The page can be picked up due to memory
+	 * offlining.  Everywhere else except page reclaim, the page is
+	 * invisible to the vm, so the page can not be migrated.  So try to
+	 * free the metadata, so the page can be freed.
+	 */
+	if (!from_page->mapping) {
+		VM_BUG_ON_PAGE(PageAnon(from_page), from_page);
+		if (page_has_private(from_page)) {
+			try_to_free_buffers(from_page);
+			goto out_unlock_both;
+		}
+	} else {
+		VM_BUG_ON_PAGE(!page_mapped(from_page), from_page);
+		/* Establish migration ptes */
+		VM_BUG_ON_PAGE(PageAnon(from_page) && !PageKsm(from_page) &&
+					   !anon_vma_from_page, from_page);
+		rc = try_to_unmap(from_page,
+			TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS);
+	}
+
+	if (!to_page->mapping) {
+		VM_BUG_ON_PAGE(PageAnon(to_page), to_page);
+		if (page_has_private(to_page)) {
+			try_to_free_buffers(to_page);
+			goto out_unlock_both;
+		}
+	} else {
+		VM_BUG_ON_PAGE(!page_mapped(to_page), to_page);
+		/* Establish migration ptes */
+		VM_BUG_ON_PAGE(PageAnon(to_page) && !PageKsm(to_page) &&
+					   !anon_vma_to_page, to_page);
+		rc = try_to_unmap(to_page,
+			TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS);
+	}
+
+	return rc;
+
+out_unlock_both:
+	if (anon_vma_to_page)
+		put_anon_vma(anon_vma_to_page);
+	unlock_page(to_page);
+out_unlock:
+	/* Drop an anon_vma reference if we took one */
+	if (anon_vma_from_page)
+		put_anon_vma(anon_vma_from_page);
+	unlock_page(from_page);
+out:
+
+	return rc;
+}
+
+static int exchange_page_mapping_concur(struct list_head *unmapped_list_ptr,
+					   struct list_head *exchange_list_ptr,
+						enum migrate_mode mode)
+{
+	int rc = -EBUSY;
+	int nr_failed = 0;
+	struct address_space *to_page_mapping, *from_page_mapping;
+	struct exchange_page_info *one_pair, *one_pair2;
+
+	list_for_each_entry_safe(one_pair, one_pair2, unmapped_list_ptr, list) {
+		struct page *from_page = one_pair->from_page;
+		struct page *to_page = one_pair->to_page;
+
+		VM_BUG_ON_PAGE(!PageLocked(from_page), from_page);
+		VM_BUG_ON_PAGE(!PageLocked(to_page), to_page);
+
+		/* both pages must be plain anonymous pages: page_mapping() must be NULL */
+		to_page_mapping = page_mapping(to_page);
+		from_page_mapping = page_mapping(from_page);
+
+		BUG_ON(from_page_mapping);
+		BUG_ON(to_page_mapping);
+
+		BUG_ON(PageWriteback(from_page));
+		BUG_ON(PageWriteback(to_page));
+
+		/* actual page mapping exchange */
+		rc = exchange_page_move_mapping(to_page_mapping, from_page_mapping,
+							to_page, from_page, mode, 0, 0);
+
+		if (rc) {
+			list_move(&one_pair->list, exchange_list_ptr);
+			++nr_failed;
+		}
+	}
+
+	return nr_failed;
+}
+
+static int exchange_page_data_concur(struct list_head *unmapped_list_ptr,
+									enum migrate_mode mode)
+{
+	struct exchange_page_info *one_pair;
+	int num_pages = 0, idx = 0;
+	struct page **src_page_list = NULL, **dst_page_list = NULL;
+	unsigned long size = 0;
+	int rc = -EFAULT;
+
+	/* form page list  */
+	list_for_each_entry(one_pair, unmapped_list_ptr, list) {
+		++num_pages;
+		size += PAGE_SIZE * hpage_nr_pages(one_pair->from_page);
+	}
+
+	src_page_list = kzalloc(sizeof(struct page *)*num_pages, GFP_KERNEL);
+	if (!src_page_list)
+		return -ENOMEM;
+	dst_page_list = kzalloc(sizeof(struct page *)*num_pages, GFP_KERNEL);
+	if (!dst_page_list) {
+		kfree(src_page_list);
+		return -ENOMEM;
+	}
+
+	list_for_each_entry(one_pair, unmapped_list_ptr, list) {
+		src_page_list[idx] = one_pair->from_page;
+		dst_page_list[idx] = one_pair->to_page;
+		++idx;
+	}
+
+	BUG_ON(idx != num_pages);
+
+
+	if (mode & MIGRATE_MT)
+		rc = exchange_page_lists_mthread(dst_page_list, src_page_list,
+				num_pages);
+
+	if (rc) {
+		list_for_each_entry(one_pair, unmapped_list_ptr, list) {
+			if (PageHuge(one_pair->from_page) ||
+				PageTransHuge(one_pair->from_page)) {
+				exchange_huge_page(one_pair->to_page, one_pair->from_page);
+			} else {
+				exchange_highpage(one_pair->to_page, one_pair->from_page);
+			}
+		}
+	}
+
+	kfree(src_page_list);
+	kfree(dst_page_list);
+
+	list_for_each_entry(one_pair, unmapped_list_ptr, list) {
+		exchange_page_flags(one_pair->to_page, one_pair->from_page);
+	}
+	
+	return rc;
+}
+
+static int remove_migration_ptes_concur(struct list_head *unmapped_list_ptr)
+{
+	struct exchange_page_info *iterator;
+
+	list_for_each_entry(iterator, unmapped_list_ptr, list) {
+		remove_migration_ptes(iterator->from_page, iterator->to_page, false);
+		remove_migration_ptes(iterator->to_page, iterator->from_page, false);
+
+		unlock_page(iterator->from_page);
+
+		if (iterator->from_anon_vma)
+			put_anon_vma(iterator->from_anon_vma);
+
+		unlock_page(iterator->to_page);
+
+		if (iterator->to_anon_vma)
+			put_anon_vma(iterator->to_anon_vma);
+
+
+		putback_lru_page(iterator->from_page);
+		iterator->from_page = NULL;
+
+		putback_lru_page(iterator->to_page);
+		iterator->to_page = NULL;
+	}
+
+	return 0;
+}
+
+static int exchange_pages_concur(struct list_head *exchange_list,
+		enum migrate_mode mode, int reason)
+{
+	struct exchange_page_info *one_pair, *one_pair2;
+	int pass = 0;
+	int retry = 1;
+	int nr_failed = 0;
+	int nr_succeeded = 0;
+	int rc = 0;
+	LIST_HEAD(serialized_list);
+	LIST_HEAD(unmapped_list);
+
+	for(pass = 0; pass < 10 && retry; pass++) {
+		retry = 0;
+
+		/* unmap and get new page for page_mapping(page) == NULL */
+		list_for_each_entry_safe(one_pair, one_pair2, exchange_list, list) {
+			cond_resched();
+
+			/* We do not exchange huge pages or file-backed pages concurrently */
+			if (PageHuge(one_pair->from_page) || PageHuge(one_pair->to_page))
+				rc = -ENODEV;
+			else if ((page_mapping(one_pair->from_page) != NULL) ||
+					 (page_mapping(one_pair->to_page) != NULL))
+				rc = -ENODEV;
+			else
+				rc = unmap_pair_pages_concur(one_pair,
+											pass > 2,
+											mode);
+
+			switch(rc) {
+			case -ENODEV:
+				list_move(&one_pair->list, &serialized_list);
+				break;
+			case -ENOMEM:
+				goto out;
+			case -EAGAIN:
+				retry++;
+				break;
+			case MIGRATEPAGE_SUCCESS:
+				list_move(&one_pair->list, &unmapped_list);
+				nr_succeeded++;
+				break;
+			default:
+				/*
+				 * Permanent failure (-EBUSY, -ENOSYS, etc.):
+				 * unlike -EAGAIN case, the failed page is
+				 * removed from migration page list and not
+				 * retried in the next outer loop.
+				 */
+				list_move(&one_pair->list, &serialized_list);
+				nr_failed++;
+				break;
+			}
+		}
+
+		/* move page->mapping to new page, only -EAGAIN could happen  */
+		exchange_page_mapping_concur(&unmapped_list, exchange_list, mode);
+
+
+		/* copy pages in unmapped_list */
+		exchange_page_data_concur(&unmapped_list, mode);
+
+
+		/* remove migration ptes, unlock old and new pages,
+		 * put anon_vma, and put back old and new pages */
+		remove_migration_ptes_concur(&unmapped_list);
+	}
+
+	nr_failed += retry;
+	rc = nr_failed;
+
+	list_for_each_entry_safe(one_pair, one_pair2, &serialized_list, list) {
+		struct page *from_page = one_pair->from_page;
+		struct page *to_page = one_pair->to_page;
+		int rc;
+
+		if ((page_mapping(from_page) != NULL) ||
+			(page_mapping(to_page) != NULL)) {
+			++nr_failed;
+			goto putback;
+		}
+
+		
+		rc = unmap_and_exchange_anon(from_page, to_page, mode);
+
+		if (rc != MIGRATEPAGE_SUCCESS)
+			++nr_failed;
+
+putback:
+
+		putback_lru_page(from_page);
+		putback_lru_page(to_page);
+
+	}
+out:
+	list_splice(&unmapped_list, exchange_list);
+	list_splice(&serialized_list, exchange_list);
+
+	return nr_failed?-EFAULT:0;
+}
diff --git a/mm/ksm.c b/mm/ksm.c
index 2e129f0e1919..5dc47f630d57 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -2013,6 +2013,41 @@ void ksm_migrate_page(struct page *newpage, struct page *oldpage)
 		set_page_stable_node(oldpage, NULL);
 	}
 }
+
+void ksm_exchange_page(struct page *to_page, struct page *from_page)
+{
+	struct stable_node *to_stable_node, *from_stable_node;
+
+	VM_BUG_ON_PAGE(!PageLocked(to_page), to_page);
+	VM_BUG_ON_PAGE(!PageLocked(from_page), from_page);
+
+	to_stable_node = page_stable_node(to_page);
+	from_stable_node = page_stable_node(from_page);
+	if (to_stable_node) {
+		VM_BUG_ON_PAGE(to_stable_node->kpfn != page_to_pfn(from_page),
+					from_page);
+		to_stable_node->kpfn = page_to_pfn(from_page);
+		/*
+		 * newpage->mapping was set in advance; now we need smp_wmb()
+		 * to make sure that the new stable_node->kpfn is visible
+		 * to get_ksm_page() before it can see that oldpage->mapping
+		 * has gone stale (or that PageSwapCache has been cleared).
+		 */
+		smp_wmb();
+	}
+	if (from_stable_node) {
+		VM_BUG_ON_PAGE(from_stable_node->kpfn != page_to_pfn(to_page),
+					to_page);
+		from_stable_node->kpfn = page_to_pfn(to_page);
+		/*
+		 * newpage->mapping was set in advance; now we need smp_wmb()
+		 * to make sure that the new stable_node->kpfn is visible
+		 * to get_ksm_page() before it can see that oldpage->mapping
+		 * has gone stale (or that PageSwapCache has been cleared).
+		 */
+		smp_wmb();
+	}
+}
 #endif /* CONFIG_MIGRATION */
 
 #ifdef CONFIG_MEMORY_HOTREMOVE
-- 
2.11.0


* [RFC PATCH 11/14] mm: migrate: Add exchange_pages syscall to exchange two page lists.
  2017-02-17 15:05 [RFC PATCH 00/14] Accelerating page migrations Zi Yan
                   ` (9 preceding siblings ...)
  2017-02-17 15:05 ` [RFC PATCH 10/14] mm: Add exchange_pages and exchange_pages_concur functions to exchange two lists of pages instead of two migrate_pages() Zi Yan
@ 2017-02-17 15:05 ` Zi Yan
  2017-02-17 15:05 ` [RFC PATCH 12/14] migrate: Add copy_page_dma to use DMA Engine to copy pages Zi Yan
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 25+ messages in thread
From: Zi Yan @ 2017-02-17 15:05 UTC (permalink / raw)
  To: linux-mm; +Cc: dnellans, apopple, paulmck, khandual, zi.yan

From: Zi Yan <ziy@nvidia.com>

This saves calling move_pages() twice to exchange two lists of pages.
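
A hedged user-space sketch of how the new syscall could be invoked. The
syscall number (332 below) follows this RFC's x86-64 table and is
provisional; the wrapper and addresses are illustrative only:

#include <sys/syscall.h>
#include <unistd.h>
#include <numaif.h>

#define __NR_exchange_pages 332	/* provisional, from this RFC */

static long exchange_pages(pid_t pid, unsigned long nr_pages,
			   void **from_pages, void **to_pages,
			   int *status, int flags)
{
	return syscall(__NR_exchange_pages, pid, nr_pages,
		       from_pages, to_pages, status, flags);
}

/*
 * Example: exchange one page of the calling process (pid 0 means current).
 * addr_a and addr_b are assumed to be page-aligned anonymous memory.
 */
static int exchange_one_page(void *addr_a, void *addr_b)
{
	void *from[1] = { addr_a };
	void *to[1] = { addr_b };
	int status[1];

	return exchange_pages(0, 1, from, to, status, MPOL_MF_MOVE);
}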

Signed-off-by: Zi Yan <ziy@nvidia.com>
---
 arch/x86/entry/syscalls/syscall_64.tbl |   2 +
 include/linux/syscalls.h               |   5 +
 mm/exchange.c                          | 369 +++++++++++++++++++++++++++++++++
 3 files changed, 376 insertions(+)

diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl
index e93ef0b38db8..944f94781f18 100644
--- a/arch/x86/entry/syscalls/syscall_64.tbl
+++ b/arch/x86/entry/syscalls/syscall_64.tbl
@@ -339,6 +339,8 @@
 330	common	pkey_alloc		sys_pkey_alloc
 331	common	pkey_free		sys_pkey_free
 
+332	64	exchange_pages		sys_exchange_pages
+
 #
 # x32-specific system call numbers start at 512 to avoid cache impact
 # for native 64-bit operation.
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 91a740f6b884..c87310440228 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -736,6 +736,11 @@ asmlinkage long sys_move_pages(pid_t pid, unsigned long nr_pages,
 				const int __user *nodes,
 				int __user *status,
 				int flags);
+asmlinkage long sys_exchange_pages(pid_t pid, unsigned long nr_pages,
+				const void __user * __user *from_pages,
+				const void __user * __user *to_pages,
+				int __user *status,
+				int flags);
 asmlinkage long sys_mbind(unsigned long start, unsigned long len,
 				unsigned long mode,
 				const unsigned long __user *nmask,
diff --git a/mm/exchange.c b/mm/exchange.c
index dfed26ebff47..c513fb502725 100644
--- a/mm/exchange.c
+++ b/mm/exchange.c
@@ -886,3 +886,372 @@ static int exchange_pages_concur(struct list_head *exchange_list,
 
 	return nr_failed?-EFAULT:0;
 }
+
+/*
+ * Exchange the page pairs given in the pm array. The from_addr and to_addr
+ * fields must be set to the virtual addresses of the pages to be exchanged.
+ * The pm array ends with from_addr == to_addr == 0.
+ */
+static int do_exchange_page_array(struct mm_struct *mm,
+				      struct pages_to_node *pm,
+					  int migrate_all,
+					  int migrate_use_mt,
+					  int migrate_batch)
+{
+	int err;
+	struct pages_to_node *pp;
+	LIST_HEAD(err_page_list);
+	LIST_HEAD(exchange_page_list);
+	enum migrate_mode mode = MIGRATE_SYNC;
+
+	if (migrate_use_mt)
+		mode |= MIGRATE_MT;
+
+
+	down_read(&mm->mmap_sem);
+
+	/*
+	 * Build a list of page pairs to exchange
+	 */
+	for (pp = pm; pp->from_addr != 0 && pp->to_addr != 0; pp++) {
+		struct vm_area_struct *from_vma, *to_vma;
+		struct page *from_page, *to_page;
+		unsigned int follflags;
+		bool isolated = false;
+
+		err = -EFAULT;
+		from_vma = find_vma(mm, pp->from_addr);
+		if (!from_vma || 
+			pp->from_addr < from_vma->vm_start || 
+			!vma_migratable(from_vma))
+			goto set_from_status;
+
+		/* FOLL_DUMP to ignore special (like zero) pages */
+		follflags = FOLL_GET | FOLL_SPLIT | FOLL_DUMP;
+		if (thp_migration_supported())
+			follflags &= ~FOLL_SPLIT;
+		from_page = follow_page(from_vma, pp->from_addr, follflags);
+
+		err = PTR_ERR(from_page);
+		if (IS_ERR(from_page))
+			goto set_from_status;
+
+		err = -ENOENT;
+		if (!from_page)
+			goto set_from_status;
+
+		err = -EACCES;
+		if (page_mapcount(from_page) > 1 &&
+				!migrate_all)
+			goto put_and_set_from_page;
+
+		if (PageHuge(from_page)) {
+			if (PageHead(from_page)) 
+				if (isolate_huge_page(from_page, &err_page_list)) {
+					err = 0;
+					isolated = true;
+				}
+			goto put_and_set_from_page;
+		} else if (PageTransCompound(from_page)) {
+			if (PageTail(from_page)) {
+				err = -EACCES;
+				goto put_and_set_from_page;
+			}
+		}
+
+		err = isolate_lru_page(from_page);
+		if (!err) {
+			list_add_tail(&from_page->lru, &err_page_list);
+			inc_zone_page_state(from_page, NR_ISOLATED_ANON +
+					    page_is_file_cache(from_page));
+			isolated = true;
+		}
+put_and_set_from_page:
+		/*
+		 * Either remove the duplicate refcount from
+		 * isolate_lru_page() or drop the page ref if it was
+		 * not isolated.
+		 *
+		 * Since FOLL_GET calls get_page(), and isolate_lru_page()
+		 * also calls get_page()
+		 */
+		put_page(from_page);
+set_from_status:
+		pp->from_status = err;
+
+		if (err)
+			continue;
+
+		/* to pages  */
+		isolated = false;
+		err = -EFAULT;
+		to_vma = find_vma(mm, pp->to_addr);
+		if (!to_vma || 
+			pp->to_addr < to_vma->vm_start || 
+			!vma_migratable(to_vma))
+			goto set_to_status;
+
+		/* FOLL_DUMP to ignore special (like zero) pages */
+		follflags = FOLL_GET | FOLL_SPLIT | FOLL_DUMP;
+		if (thp_migration_supported())
+			follflags &= ~FOLL_SPLIT;
+		to_page = follow_page(to_vma, pp->to_addr, follflags);
+
+		err = PTR_ERR(to_page);
+		if (IS_ERR(to_page))
+			goto set_to_status;
+
+		err = -ENOENT;
+		if (!to_page)
+			goto set_to_status;
+
+		err = -EACCES;
+		if (page_mapcount(to_page) > 1 &&
+				!migrate_all)
+			goto put_and_set_to_page;
+
+		if (PageHuge(to_page)) {
+			if (PageHead(to_page)) 
+				if (isolate_huge_page(to_page, &err_page_list)) {
+					err = 0;
+					isolated = true;
+				}
+			goto put_and_set_to_page;
+		} else if (PageTransCompound(to_page)) {
+			if (PageTail(to_page)) {
+				err = -EACCES;
+				goto put_and_set_to_page;
+			}
+		}
+
+		err = isolate_lru_page(to_page);
+		if (!err) {
+			list_add_tail(&to_page->lru, &err_page_list);
+			inc_zone_page_state(to_page, NR_ISOLATED_ANON +
+					    page_is_file_cache(to_page));
+			isolated = true;
+		}
+put_and_set_to_page:
+		/*
+		 * Either remove the duplicate refcount from
+		 * isolate_lru_page() or drop the page ref if it was
+		 * not isolated.
+		 *
+		 * Since FOLL_GET calls get_page(), and isolate_lru_page()
+		 * also calls get_page()
+		 */
+		put_page(to_page);
+set_to_status:
+		pp->to_status = err;
+
+
+		if (!err) {
+			if ((PageHuge(from_page) != PageHuge(to_page)) ||
+				(PageTransHuge(from_page) != PageTransHuge(to_page))) {
+				pp->to_status = -EFAULT;
+				continue;
+			} else {
+				struct exchange_page_info *one_pair = 
+					kzalloc(sizeof(struct exchange_page_info), GFP_ATOMIC);
+				if (!one_pair) {
+					err = -ENOMEM;
+					break;
+				}
+
+
+				list_del(&from_page->lru);
+				list_del(&to_page->lru);
+
+				one_pair->from_page = from_page;
+				one_pair->to_page = to_page;
+
+				list_add_tail(&one_pair->list, &exchange_page_list);
+			}
+		}
+
+	}
+
+	/*
+	 * Put previously isolated pages back on the LRU.
+	 *
+	 * For those not isolated, put_page() should take care of them.
+	 */
+	if (!list_empty(&err_page_list)) {
+		putback_movable_pages(&err_page_list);
+	}
+
+	err = 0;
+	if (!list_empty(&exchange_page_list)) {
+		if (migrate_batch) 
+			err = exchange_pages_concur(&exchange_page_list, mode, MR_SYSCALL);
+		else
+			err = exchange_pages(&exchange_page_list, mode, MR_SYSCALL);
+	}
+
+	while (!list_empty(&exchange_page_list)) {
+		struct exchange_page_info *one_pair = 
+			list_first_entry(&exchange_page_list, 
+							 struct exchange_page_info, list);
+
+		list_del(&one_pair->list);
+		kfree(one_pair);
+	}
+
+	up_read(&mm->mmap_sem);
+
+	return err;
+}
+/*
+ * Exchange an array of page addresses with another array of page addresses
+ * and fill the corresponding array of status.
+ */
+static int do_pages_exchange(struct mm_struct *mm, nodemask_t task_nodes,
+			 unsigned long nr_pages,
+			 const void __user * __user *from_pages,
+			 const void __user * __user *to_pages,
+			 int __user *status, int flags)
+{
+	struct pages_to_node *pm;
+	unsigned long chunk_nr_pages;
+	unsigned long chunk_start;
+	int err;
+
+	err = -ENOMEM;
+	pm = (struct pages_to_node *)__get_free_page(GFP_KERNEL);
+	if (!pm)
+		goto out;
+
+	migrate_prep();
+
+	/*
+	 * Store a chunk of pages_to_node array in a page,
+	 * but keep the last one as a marker
+	 */
+	chunk_nr_pages = (PAGE_SIZE / sizeof(struct pages_to_node)) - 1;
+
+	for (chunk_start = 0;
+	     chunk_start < nr_pages;
+	     chunk_start += chunk_nr_pages) {
+		int j;
+
+		if (chunk_start + chunk_nr_pages > nr_pages)
+			chunk_nr_pages = nr_pages - chunk_start;
+
+		/* fill the chunk pm with from/to addrs from user-space */
+		for (j = 0; j < chunk_nr_pages; j++) {
+			const void __user *p;
+
+			err = -EFAULT;
+			if (get_user(p, from_pages + j + chunk_start))
+				goto out_pm;
+			pm[j].from_addr = (unsigned long) p;
+
+			if (get_user(p, to_pages + j + chunk_start))
+				goto out_pm;
+			pm[j].to_addr = (unsigned long) p;
+
+		}
+
+
+		/* End marker for this chunk */
+		pm[chunk_nr_pages].from_addr = pm[chunk_nr_pages].to_addr = 0;
+
+		/* Exchange this chunk */
+		err = do_exchange_page_array(mm, pm,
+						 flags & MPOL_MF_MOVE_ALL,
+						 flags & MPOL_MF_MOVE_MT,
+						 flags & MPOL_MF_MOVE_CONCUR);
+		if (err < 0)
+			goto out_pm;
+
+		/* Return status information */
+		for (j = 0; j < chunk_nr_pages; j++)
+			if (put_user(pm[j].to_status, status + j + chunk_start)) {
+				err = -EFAULT;
+				goto out_pm;
+			}
+	}
+	err = 0;
+
+out_pm:
+	free_page((unsigned long)pm);
+
+out:
+	return err;
+}
+
+
+
+
+SYSCALL_DEFINE6(exchange_pages, pid_t, pid, unsigned long, nr_pages,
+		const void __user * __user *, from_pages,
+		const void __user * __user *, to_pages,
+		int __user *, status, int, flags)
+{
+	const struct cred *cred = current_cred(), *tcred;
+	struct task_struct *task;
+	struct mm_struct *mm;
+	int err;
+	nodemask_t task_nodes;
+
+	/* Check flags */
+	if (flags & ~(MPOL_MF_MOVE|
+				  MPOL_MF_MOVE_ALL|
+				  MPOL_MF_MOVE_MT|
+				  MPOL_MF_MOVE_CONCUR))
+		return -EINVAL;
+
+	if ((flags & MPOL_MF_MOVE_ALL) && !capable(CAP_SYS_NICE))
+		return -EPERM;
+
+	/* Find the mm_struct */
+	rcu_read_lock();
+	task = pid ? find_task_by_vpid(pid) : current;
+	if (!task) {
+		rcu_read_unlock();
+		return -ESRCH;
+	}
+	get_task_struct(task);
+
+	/*
+	 * Check if this process has the right to modify the specified
+	 * process. The right exists if the process has administrative
+	 * capabilities, superuser privileges or the same
+	 * userid as the target process.
+	 */
+	tcred = __task_cred(task);
+	if (!uid_eq(cred->euid, tcred->suid) && !uid_eq(cred->euid, tcred->uid) &&
+	    !uid_eq(cred->uid,  tcred->suid) && !uid_eq(cred->uid,  tcred->uid) &&
+	    !capable(CAP_SYS_NICE)) {
+		rcu_read_unlock();
+		err = -EPERM;
+		goto out;
+	}
+	rcu_read_unlock();
+
+	err = security_task_movememory(task);
+	if (err)
+		goto out;
+
+	task_nodes = cpuset_mems_allowed(task);
+	mm = get_task_mm(task);
+	put_task_struct(task);
+
+	if (!mm)
+		return -EINVAL;
+
+
+	err = do_pages_exchange(mm, task_nodes, nr_pages, from_pages,
+				    to_pages, status, flags);
+
+	mmput(mm);
+
+	return err;
+
+out:
+	put_task_struct(task);
+
+	return err;
+}
-- 
2.11.0


* [RFC PATCH 12/14] migrate: Add copy_page_dma to use DMA Engine to copy pages.
  2017-02-17 15:05 [RFC PATCH 00/14] Accelerating page migrations Zi Yan
                   ` (10 preceding siblings ...)
  2017-02-17 15:05 ` [RFC PATCH 11/14] mm: migrate: Add exchange_pages syscall to exchange two page lists Zi Yan
@ 2017-02-17 15:05 ` Zi Yan
  2017-02-17 15:05 ` [RFC PATCH 13/14] mm: migrate: Add copy_page_dma into migrate_page_copy Zi Yan
  2017-02-17 15:05 ` [RFC PATCH 14/14] mm: Add copy_page_lists_dma_always to support copy a list of pages Zi Yan
  13 siblings, 0 replies; 25+ messages in thread
From: Zi Yan @ 2017-02-17 15:05 UTC (permalink / raw)
  To: linux-mm; +Cc: dnellans, apopple, paulmck, khandual, zi.yan

From: Zi Yan <ziy@nvidia.com>

Setting vm.use_all_dma_chans grabs all usable DMA channels.
vm.limit_dma_chans limits how many DMA channels are used.
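
A hedged sketch of how these sysctls could be set from user space (the
helper below is illustrative; the paths follow the vm_table entries added
by this patch):

#include <stdio.h>

/* Write a value to a sysctl file; returns 0 on success, -1 on error. */
static int write_sysctl(const char *path, const char *val)
{
	FILE *f = fopen(path, "w");

	if (!f)
		return -1;
	fputs(val, f);
	fclose(f);
	return 0;
}

int main(void)
{
	/* grab all usable DMA channels, but copy with at most 4 of them */
	write_sysctl("/proc/sys/vm/use_all_dma_chans", "1");
	write_sysctl("/proc/sys/vm/limit_dma_chans", "4");
	return 0;
}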

Signed-off-by: Zi Yan <ziy@nvidia.com>
---
 include/linux/highmem.h      |   1 +
 include/linux/sched/sysctl.h |   4 +
 kernel/sysctl.c              |  21 ++++
 mm/copy_pages.c              | 281 +++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 307 insertions(+)

diff --git a/include/linux/highmem.h b/include/linux/highmem.h
index e1f4f1b82812..1388ff5d0e53 100644
--- a/include/linux/highmem.h
+++ b/include/linux/highmem.h
@@ -237,6 +237,7 @@ static inline void copy_user_highpage(struct page *to, struct page *from,
 #endif
 
 int copy_pages_mthread(struct page *to, struct page *from, int nr_pages);
+int copy_page_dma(struct page *to, struct page *from, int nr_pages);
 
 static inline void copy_highpage(struct page *to, struct page *from)
 {
diff --git a/include/linux/sched/sysctl.h b/include/linux/sched/sysctl.h
index 22db1e63707e..d5efb4093386 100644
--- a/include/linux/sched/sysctl.h
+++ b/include/linux/sched/sysctl.h
@@ -78,4 +78,8 @@ extern int sysctl_schedstats(struct ctl_table *table, int write,
 				 void __user *buffer, size_t *lenp,
 				 loff_t *ppos);
 
+extern int sysctl_dma_page_migration(struct ctl_table *table, int write,
+				 void __user *buffer, size_t *lenp,
+				 loff_t *ppos);
+
 #endif /* _SCHED_SYSCTL_H */
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 70a654146519..55c812c313b8 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -99,6 +99,10 @@
 
 extern int mt_page_copy;
 
+extern int use_all_dma_chans;
+extern int limit_dma_chans;
+
+
 /* External variables not in a header file. */
 extern int suid_dumpable;
 #ifdef CONFIG_COREDUMP
@@ -1372,6 +1376,23 @@ static struct ctl_table vm_table[] = {
 		.extra2		= &one,
 	},
 	 {
+		.procname	= "use_all_dma_chans",
+		.data		= &use_all_dma_chans,
+		.maxlen		= sizeof(use_all_dma_chans),
+		.mode		= 0644,
+		.proc_handler	= sysctl_dma_page_migration,
+		.extra1		= &zero,
+		.extra2		= &one,
+	 },
+	 {
+		.procname	= "limit_dma_chans",
+		.data		= &limit_dma_chans,
+		.maxlen		= sizeof(limit_dma_chans),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec,
+		.extra1		= &zero,
+	 },
+	 {
 		.procname	= "hugetlb_shm_group",
 		.data		= &sysctl_hugetlb_shm_group,
 		.maxlen		= sizeof(gid_t),
diff --git a/mm/copy_pages.c b/mm/copy_pages.c
index 879e2d944ad0..f135bf505183 100644
--- a/mm/copy_pages.c
+++ b/mm/copy_pages.c
@@ -10,7 +10,16 @@
 #include <linux/workqueue.h>
 #include <linux/slab.h>
 #include <linux/freezer.h>
+#include <linux/dmaengine.h>
+#include <linux/dma-mapping.h>
 
+#define NUM_AVAIL_DMA_CHAN 16
+
+int use_all_dma_chans = 0;
+int limit_dma_chans = NUM_AVAIL_DMA_CHAN;
+
+struct dma_chan *copy_chan[NUM_AVAIL_DMA_CHAN] = {0};
+struct dma_device *copy_dev[NUM_AVAIL_DMA_CHAN] = {0};
 /*
  * nr_copythreads can be the highest number of threads for given node
  * on any architecture. The actual number of copy threads will be
@@ -279,3 +288,275 @@ int exchange_page_lists_mthread(struct page **to, struct page **from,
 
 	return err;
 }
+
+#ifdef CONFIG_PROC_SYSCTL
+int sysctl_dma_page_migration(struct ctl_table *table, int write,
+				 void __user *buffer, size_t *lenp,
+				 loff_t *ppos)
+{
+	int err = 0;
+	int use_all_dma_chans_prior_val = use_all_dma_chans;
+	dma_cap_mask_t copy_mask;
+
+	if (write && !capable(CAP_SYS_ADMIN))
+		return -EPERM;
+
+	err = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
+
+	if (err < 0)
+		return err;
+	if (write) {
+		/* Grab all DMA channels  */
+		if (use_all_dma_chans_prior_val == 0 && use_all_dma_chans == 1) {
+			int i;
+
+			dma_cap_zero(copy_mask);
+			dma_cap_set(DMA_MEMCPY, copy_mask);
+
+			dmaengine_get();
+			for (i = 0; i < NUM_AVAIL_DMA_CHAN; ++i) {
+				if (!copy_chan[i])
+					copy_chan[i] = dma_request_channel(copy_mask, NULL, NULL);
+				if (!copy_chan[i]) {
+					pr_err("%s: cannot grab channel: %d\n", __func__, i);
+					continue;
+				}
+
+				copy_dev[i] = copy_chan[i]->device;
+
+				if (!copy_dev[i]) {
+					pr_err("%s: no device: %d\n", __func__, i);
+					continue;
+				}
+			}
+
+		} 
+		/* Release all DMA channels  */
+		else if (use_all_dma_chans_prior_val == 1 && use_all_dma_chans == 0) {
+			int i;
+
+			for (i = 0; i < NUM_AVAIL_DMA_CHAN; ++i) {
+				if (copy_chan[i]) {
+					dma_release_channel(copy_chan[i]);
+					copy_chan[i] = NULL;
+					copy_dev[i] = NULL;
+				}
+			}
+
+			dmaengine_put();
+		}
+
+		if (err)
+			use_all_dma_chans = use_all_dma_chans_prior_val;
+	}
+	return err;
+}
+
+#endif
+
+static int copy_page_dma_once(struct page *to, struct page *from, int nr_pages)
+{
+	static struct dma_chan *copy_chan = NULL;
+	struct dma_device *device = NULL;
+	struct dma_async_tx_descriptor *tx = NULL;
+	dma_cookie_t cookie;
+	enum dma_ctrl_flags flags = 0;
+	struct dmaengine_unmap_data *unmap = NULL;
+	dma_cap_mask_t mask;
+	int ret_val = 0;
+
+	
+	dma_cap_zero(mask);
+	dma_cap_set(DMA_MEMCPY, mask);
+
+	dmaengine_get();
+
+	copy_chan = dma_request_channel(mask, NULL, NULL);
+
+	if (!copy_chan) {
+		pr_err("%s: cannot get a channel\n", __func__);
+		ret_val = -1;
+		goto no_chan;
+	}
+
+	device = copy_chan->device;
+
+	if (!device) {
+		pr_err("%s: cannot get a device\n", __func__);
+		ret_val = -2;
+		goto release;
+	}
+		
+	unmap = dmaengine_get_unmap_data(device->dev, 2, GFP_NOWAIT);
+
+	if (!unmap) {
+		pr_err("%s: cannot get unmap data\n", __func__);
+		ret_val = -3;
+		goto release;
+	}
+
+	unmap->to_cnt = 1;
+	unmap->addr[0] = dma_map_page(device->dev, from, 0, PAGE_SIZE*nr_pages,
+					  DMA_TO_DEVICE);
+	unmap->from_cnt = 1;
+	unmap->addr[1] = dma_map_page(device->dev, to, 0, PAGE_SIZE*nr_pages,
+					  DMA_FROM_DEVICE);
+	unmap->len = PAGE_SIZE*nr_pages;
+
+	tx = device->device_prep_dma_memcpy(copy_chan, 
+						unmap->addr[1],
+						unmap->addr[0], unmap->len,
+						flags);
+
+	if (!tx) {
+		pr_err("%s: null tx descriptor\n", __func__);
+		ret_val = -4;
+		goto unmap_dma;
+	}
+
+	cookie = tx->tx_submit(tx);
+
+	if (dma_submit_error(cookie)) {
+		pr_err("%s: submission error\n", __func__);
+		ret_val = -5;
+		goto unmap_dma;
+	}
+
+	if (dma_sync_wait(copy_chan, cookie) != DMA_COMPLETE) {
+		pr_err("%s: dma does not complete properly\n", __func__);
+		ret_val = -6;
+	}
+
+unmap_dma:
+	dmaengine_unmap_put(unmap);
+release:
+	if (copy_chan) {
+		dma_release_channel(copy_chan);
+	}
+no_chan:
+	dmaengine_put();
+
+	return ret_val;
+}
+
+static int copy_page_dma_always(struct page *to, struct page *from, int nr_pages)
+{
+	struct dma_async_tx_descriptor *tx[NUM_AVAIL_DMA_CHAN] = {0};
+	dma_cookie_t cookie[NUM_AVAIL_DMA_CHAN];
+	enum dma_ctrl_flags flags[NUM_AVAIL_DMA_CHAN] = {0};
+	struct dmaengine_unmap_data *unmap[NUM_AVAIL_DMA_CHAN] = {0};
+	int ret_val = 0;
+	int total_available_chans = NUM_AVAIL_DMA_CHAN;
+	int i;
+	size_t page_offset;
+
+	for (i = 0; i < NUM_AVAIL_DMA_CHAN; ++i) {
+		if (!copy_chan[i]) {
+			total_available_chans = i;
+			break;
+		}
+	}
+	if (total_available_chans != NUM_AVAIL_DMA_CHAN) {
+		pr_err("%d channels are missing\n", NUM_AVAIL_DMA_CHAN - total_available_chans);
+	}
+
+	total_available_chans = min_t(int, total_available_chans, limit_dma_chans);
+
+	/* round down to closest 2^x value  */
+	total_available_chans = 1<<ilog2(total_available_chans);
+
+	if ((nr_pages != 1) && (nr_pages % total_available_chans != 0))
+		return -EFAULT;
+	
+	for (i = 0; i < total_available_chans; ++i) {
+		unmap[i] = dmaengine_get_unmap_data(copy_dev[i]->dev, 2, GFP_NOWAIT);
+		if (!unmap[i]) {
+			pr_err("%s: no unmap data at chan %d\n", __func__, i);
+			ret_val = -EFAULT;
+			goto unmap_dma;
+		}
+	}
+
+	for (i = 0; i < total_available_chans; ++i) {
+		if (nr_pages == 1) {
+			page_offset = PAGE_SIZE / total_available_chans;
+
+			unmap[i]->to_cnt = 1;
+			unmap[i]->addr[0] = dma_map_page(copy_dev[i]->dev, from, page_offset*i,
+							  page_offset,
+							  DMA_TO_DEVICE);
+			unmap[i]->from_cnt = 1;
+			unmap[i]->addr[1] = dma_map_page(copy_dev[i]->dev, to, page_offset*i,
+							  page_offset,
+							  DMA_FROM_DEVICE);
+			unmap[i]->len = page_offset;
+		} else {
+			page_offset = nr_pages / total_available_chans;
+
+			unmap[i]->to_cnt = 1;
+			unmap[i]->addr[0] = dma_map_page(copy_dev[i]->dev, 
+								from + page_offset*i, 
+								0,
+								PAGE_SIZE*page_offset,
+								DMA_TO_DEVICE);
+			unmap[i]->from_cnt = 1;
+			unmap[i]->addr[1] = dma_map_page(copy_dev[i]->dev, 
+								to + page_offset*i, 
+								0,
+								PAGE_SIZE*page_offset,
+								DMA_FROM_DEVICE);
+			unmap[i]->len = PAGE_SIZE*page_offset;
+		}
+	}
+
+	for (i = 0; i < total_available_chans; ++i) {
+		tx[i] = copy_dev[i]->device_prep_dma_memcpy(copy_chan[i], 
+							unmap[i]->addr[1],
+							unmap[i]->addr[0], 
+							unmap[i]->len,
+							flags[i]);
+		if (!tx[i]) {
+			pr_err("%s: no tx descriptor at chan %d\n", __func__, i);
+			ret_val = -EFAULT;
+			goto unmap_dma;
+		}
+	}
+
+	for (i = 0; i < total_available_chans; ++i) {
+		cookie[i] = tx[i]->tx_submit(tx[i]);
+
+		if (dma_submit_error(cookie[i])) {
+			pr_err("%s: submission error at chan %d\n", __func__, i);
+			ret_val = -EFAULT;
+			goto unmap_dma;
+		}
+					
+		dma_async_issue_pending(copy_chan[i]);
+	}
+
+	for (i = 0; i < total_available_chans; ++i) {
+		if (dma_sync_wait(copy_chan[i], cookie[i]) != DMA_COMPLETE) {
+			ret_val = -EFAULT;
+			pr_err("%s: dma does not complete at chan %d\n", __func__, i);
+		}
+	}
+
+unmap_dma:
+
+	for (i = 0; i < total_available_chans; ++i) {
+		if (unmap[i])
+			dmaengine_unmap_put(unmap[i]);
+	}
+
+	return ret_val;
+}
+
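+/*
+ * Entry point for DMA-based page copy: use a single on-demand channel when
+ * use_all_dma_chans is clear, otherwise spread the copy across the channels
+ * in copy_chan[].
+ */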
+int copy_page_dma(struct page *to, struct page *from, int nr_pages)
+{
+	BUG_ON(hpage_nr_pages(from) != nr_pages);
+	BUG_ON(hpage_nr_pages(to) != nr_pages);
+
+	if (!use_all_dma_chans) {
+		return copy_page_dma_once(to, from, nr_pages);
+	} 
+
+	return copy_page_dma_always(to, from, nr_pages);
+}
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [RFC PATCH 13/14] mm: migrate: Add copy_page_dma into migrate_page_copy.
  2017-02-17 15:05 [RFC PATCH 00/14] Accelerating page migrations Zi Yan
                   ` (11 preceding siblings ...)
  2017-02-17 15:05 ` [RFC PATCH 12/14] migrate: Add copy_page_dma to use DMA Engine to copy pages Zi Yan
@ 2017-02-17 15:05 ` Zi Yan
  2017-02-17 15:05 ` [RFC PATCH 14/14] mm: Add copy_page_lists_dma_always to support copying a list of pages Zi Yan
  13 siblings, 0 replies; 25+ messages in thread
From: Zi Yan @ 2017-02-17 15:05 UTC (permalink / raw)
  To: linux-mm; +Cc: dnellans, apopple, paulmck, khandual, zi.yan

From: Zi Yan <ziy@nvidia.com>

Fall back to copy_highpage() when the DMA copy fails.

Signed-off-by: Zi Yan <ziy@nvidia.com>
---
 include/linux/migrate_mode.h   |  1 +
 include/uapi/linux/mempolicy.h |  1 +
 mm/migrate.c                   | 27 +++++++++++++++++++--------
 3 files changed, 21 insertions(+), 8 deletions(-)

diff --git a/include/linux/migrate_mode.h b/include/linux/migrate_mode.h
index 2bd849d89122..798737d0a0bc 100644
--- a/include/linux/migrate_mode.h
+++ b/include/linux/migrate_mode.h
@@ -14,6 +14,7 @@ enum migrate_mode {
 	MIGRATE_ST		= 1<<3,
 	MIGRATE_MT		= 1<<4,
 	MIGRATE_CONCUR		= 1<<5,
+	MIGRATE_DMA			= 1<<6,
 };
 
 #endif		/* MIGRATE_MODE_H_INCLUDED */
diff --git a/include/uapi/linux/mempolicy.h b/include/uapi/linux/mempolicy.h
index 6d9758a32053..bf40534cc93a 100644
--- a/include/uapi/linux/mempolicy.h
+++ b/include/uapi/linux/mempolicy.h
@@ -55,6 +55,7 @@ enum mpol_rebind_step {
 #define MPOL_MF_INTERNAL (1<<4)	/* Internal flags start here */
 #define MPOL_MF_MOVE_MT  (1<<6)	/* Use multi-threaded page copy routine */
 #define MPOL_MF_MOVE_CONCUR  (1<<7)	/* Migrate a list of pages concurrently */
+#define MPOL_MF_MOVE_DMA (1<<8)	/* Use DMA based page copy routine */
 
 #define MPOL_MF_VALID	(MPOL_MF_STRICT   | 	\
 			 MPOL_MF_MOVE     | 	\
diff --git a/mm/migrate.c b/mm/migrate.c
index a35e6fd43a50..464bc9ba8083 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -634,6 +634,9 @@ static void copy_huge_page(struct page *dst, struct page *src,
 	if (mode & MIGRATE_MT)
 		rc = copy_pages_mthread(dst, src, nr_pages);
 
+	if (rc && (mode & MIGRATE_DMA))
+		rc = copy_page_dma(dst, src, nr_pages);
+
 	if (rc)
 		for (i = 0; i < nr_pages; i++) {
 			cond_resched();
@@ -648,16 +651,18 @@ void migrate_page_copy(struct page *newpage, struct page *page,
 					   enum migrate_mode mode)
 {
 	int cpupid;
+	int rc = -EFAULT;
 
 	if (PageHuge(page) || PageTransHuge(page)) {
 		copy_huge_page(newpage, page, mode);
 	} else {
-		if (mode & MIGRATE_MT) {
-			if (copy_pages_mthread(newpage, page, 1))
-				copy_highpage(newpage, page);
-		} else {
+		if (mode & MIGRATE_DMA)
+			rc = copy_page_dma(newpage, page, 1);
+		else if (mode & MIGRATE_MT)
+			rc = copy_pages_mthread(newpage, page, 1);
+
+		if (rc)
 			copy_highpage(newpage, page);
-		}
 	}
 
 	if (PageError(page))
@@ -1926,7 +1931,8 @@ static int do_move_page_to_node_array(struct mm_struct *mm,
 				      struct page_to_node *pm,
 				      int migrate_all,
 					  int migrate_use_mt,
-					  int migrate_concur)
+					  int migrate_concur,
+					  int migrate_use_dma)
 {
 	int err;
 	struct page_to_node *pp;
@@ -1936,6 +1942,9 @@ static int do_move_page_to_node_array(struct mm_struct *mm,
 	if (migrate_use_mt)
 		mode |= MIGRATE_MT;
 
+	if (migrate_use_dma)
+		mode |= MIGRATE_DMA;
+
 	down_read(&mm->mmap_sem);
 
 	/*
@@ -2098,7 +2107,8 @@ static int do_pages_move(struct mm_struct *mm, nodemask_t task_nodes,
 		err = do_move_page_to_node_array(mm, pm,
 						 flags & MPOL_MF_MOVE_ALL,
 						 flags & MPOL_MF_MOVE_MT,
-						 flags & MPOL_MF_MOVE_CONCUR);
+						 flags & MPOL_MF_MOVE_CONCUR,
+						 flags & MPOL_MF_MOVE_DMA);
 		if (err < 0)
 			goto out_pm;
 
@@ -2207,7 +2217,8 @@ SYSCALL_DEFINE6(move_pages, pid_t, pid, unsigned long, nr_pages,
 	/* Check flags */
 	if (flags & ~(MPOL_MF_MOVE|MPOL_MF_MOVE_ALL|
 				  MPOL_MF_MOVE_MT|
-				  MPOL_MF_MOVE_CONCUR))
+				  MPOL_MF_MOVE_CONCUR|
+				  MPOL_MF_MOVE_DMA))
 		return -EINVAL;
 
 	if ((flags & MPOL_MF_MOVE_ALL) && !capable(CAP_SYS_NICE))
-- 
2.11.0
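
[For context, a hedged sketch of how the new flag could be driven from user
space through move_pages(2). The MPOL_MF_MOVE_DMA value (1<<8) comes from this
RFC and is not part of mainline; the program below is only an illustration,
not part of the patch series.]

#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>

#define MPOL_MF_MOVE		(1 << 1)
#define MPOL_MF_MOVE_DMA	(1 << 8)	/* from this RFC, not mainline */

int main(void)
{
	long page_size = sysconf(_SC_PAGESIZE);
	void *buf = aligned_alloc(page_size, page_size);
	void *pages[1] = { buf };
	int nodes[1] = { 1 };			/* destination NUMA node */
	int status[1] = { -1 };

	memset(buf, 0, page_size);		/* fault the page in first */

	/* pid 0 == current process; request DMA-assisted migration */
	if (syscall(SYS_move_pages, 0, 1UL, pages, nodes, status,
		    MPOL_MF_MOVE | MPOL_MF_MOVE_DMA))
		perror("move_pages");
	else
		printf("page now on node %d\n", status[0]);

	return 0;
}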


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [RFC PATCH 14/14] mm: Add copy_page_lists_dma_always to support copying a list of pages.
  2017-02-17 15:05 [RFC PATCH 00/14] Accelerating page migrations Zi Yan
                   ` (12 preceding siblings ...)
  2017-02-17 15:05 ` [RFC PATCH 13/14] mm: migrate: Add copy_page_dma into migrate_page_copy Zi Yan
@ 2017-02-17 15:05 ` Zi Yan
  13 siblings, 0 replies; 25+ messages in thread
From: Zi Yan @ 2017-02-17 15:05 UTC (permalink / raw)
  To: linux-mm; +Cc: dnellans, apopple, paulmck, khandual, zi.yan

From: Zi Yan <ziy@nvidia.com>

The src and dst page lists must have matching page sizes at each index,
and both lists have the same length.

Signed-off-by: Zi Yan <ziy@nvidia.com>
---
 mm/copy_pages.c | 158 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 mm/internal.h   |   3 ++
 mm/migrate.c    |   5 +-
 3 files changed, 165 insertions(+), 1 deletion(-)

diff --git a/mm/copy_pages.c b/mm/copy_pages.c
index f135bf505183..cf674840a830 100644
--- a/mm/copy_pages.c
+++ b/mm/copy_pages.c
@@ -560,3 +560,161 @@ int copy_page_dma(struct page *to, struct page *from, int nr_pages)
 
 	return copy_page_dma_always(to, from, nr_pages);
 }
+
+/*
+ * Use DMA engines to copy a list of pages to their new locations.
+ *
+ * Pages are distributed round-robin across the available DMA channels.
+ */
+int copy_page_lists_dma_always(struct page **to, struct page **from, int nr_pages)
+{
+	struct dma_async_tx_descriptor **tx = NULL;
+	dma_cookie_t *cookie = NULL;
+	enum dma_ctrl_flags flags[NUM_AVAIL_DMA_CHAN] = {0};
+	struct dmaengine_unmap_data *unmap[NUM_AVAIL_DMA_CHAN] = {0};
+	int ret_val = 0;
+	int total_available_chans = NUM_AVAIL_DMA_CHAN;
+	int i;
+
+	for (i = 0; i < NUM_AVAIL_DMA_CHAN; ++i) {
+		if (!copy_chan[i])
+			total_available_chans = i;
+	}
+	if (total_available_chans != NUM_AVAIL_DMA_CHAN)
+		pr_err("%d channels are missing\n", NUM_AVAIL_DMA_CHAN - total_available_chans);
+
+	if (limit_dma_chans < total_available_chans)
+		total_available_chans = limit_dma_chans;
+
+	/* round down to closest 2^x value  */
+	total_available_chans = 1<<ilog2(total_available_chans);
+
+	total_available_chans = min_t(int, total_available_chans, nr_pages);
+
+
+	tx = kzalloc(sizeof(struct dma_async_tx_descriptor*)*nr_pages, GFP_KERNEL);
+	if (!tx) {
+		ret_val = -ENOMEM;
+		goto out;
+	}
+	cookie = kzalloc(sizeof(dma_cookie_t)*nr_pages, GFP_KERNEL);
+	if (!cookie) {
+		ret_val = -ENOMEM;
+		goto out_free_tx;
+	}
+
+	
+	for (i = 0; i < total_available_chans; ++i) {
+		int num_xfer_per_dev = nr_pages / total_available_chans;
+		
+		if (i < (nr_pages % total_available_chans))
+			num_xfer_per_dev += 1;
+
+		unmap[i] = dmaengine_get_unmap_data(copy_dev[i]->dev, 
+						2*num_xfer_per_dev, GFP_NOWAIT);
+		if (!unmap[i]) {
+			pr_err("%s: no unmap data at chan %d\n", __func__, i);
+			ret_val = -ENODEV;
+			goto unmap_dma;
+		}
+	}
+
+	for (i = 0; i < total_available_chans; ++i) {
+		int num_xfer_per_dev = nr_pages / total_available_chans;
+		int xfer_idx;
+		
+		if (i < (nr_pages % total_available_chans))
+			num_xfer_per_dev += 1;
+
+		unmap[i]->to_cnt = num_xfer_per_dev;
+		unmap[i]->from_cnt = num_xfer_per_dev;
+		unmap[i]->len = hpage_nr_pages(from[i]) * PAGE_SIZE; 
+
+		for (xfer_idx = 0; xfer_idx < num_xfer_per_dev; ++xfer_idx) {
+			int page_idx = i + xfer_idx * total_available_chans;
+			size_t page_len = hpage_nr_pages(from[page_idx]) * PAGE_SIZE;
+
+			BUG_ON(page_len != hpage_nr_pages(to[page_idx]) * PAGE_SIZE);
+			BUG_ON(unmap[i]->len != page_len);
+
+			unmap[i]->addr[xfer_idx] = 
+				 dma_map_page(copy_dev[i]->dev, from[page_idx], 
+							  0,
+							  page_len,
+							  DMA_TO_DEVICE);
+
+			unmap[i]->addr[xfer_idx+num_xfer_per_dev] = 
+				 dma_map_page(copy_dev[i]->dev, to[page_idx], 
+							  0,
+							  page_len,
+							  DMA_FROM_DEVICE);
+		}
+	}
+
+	for (i = 0; i < total_available_chans; ++i) {
+		int num_xfer_per_dev = nr_pages / total_available_chans;
+		int xfer_idx;
+		
+		if (i < (nr_pages % total_available_chans))
+			num_xfer_per_dev += 1;
+
+		for (xfer_idx = 0; xfer_idx < num_xfer_per_dev; ++xfer_idx) {
+			int page_idx = i + xfer_idx * total_available_chans;
+
+			tx[page_idx] = copy_dev[i]->device_prep_dma_memcpy(copy_chan[i], 
+								unmap[i]->addr[xfer_idx + num_xfer_per_dev],
+								unmap[i]->addr[xfer_idx], 
+								unmap[i]->len,
+								flags[i]);
+			if (!tx[page_idx]) {
+				pr_err("%s: no tx descriptor at chan %d xfer %d\n", 
+					   __func__, i, xfer_idx);
+				ret_val = -ENODEV;
+				goto unmap_dma;
+			}
+
+			cookie[page_idx] = tx[page_idx]->tx_submit(tx[page_idx]);
+
+			if (dma_submit_error(cookie[page_idx])) {
+				pr_err("%s: submission error at chan %d xfer %d\n",
+					   __func__, i, xfer_idx);
+				ret_val = -ENODEV;
+				goto unmap_dma;
+			}
+		}
+
+		dma_async_issue_pending(copy_chan[i]);
+	}
+
+	for (i = 0; i < total_available_chans; ++i) {
+		int num_xfer_per_dev = nr_pages / total_available_chans;
+		int xfer_idx;
+		
+		if (i < (nr_pages % total_available_chans))
+			num_xfer_per_dev += 1;
+
+		for (xfer_idx = 0; xfer_idx < num_xfer_per_dev; ++xfer_idx) {
+			int page_idx = i + xfer_idx * total_available_chans;
+
+			if (dma_sync_wait(copy_chan[i], cookie[page_idx]) != DMA_COMPLETE) {
+				ret_val = -EFAULT;
+				pr_err("%s: dma does not complete at chan %d, xfer %d\n",
+					   __func__, i, xfer_idx);
+			}
+		}
+	}
+
+unmap_dma:
+	for (i = 0; i < total_available_chans; ++i) {
+		if (unmap[i])
+			dmaengine_unmap_put(unmap[i]);
+	}
+
+	kfree(cookie);
+out_free_tx:
+	kfree(tx);
+out:
+
+	return ret_val;
+}
diff --git a/mm/internal.h b/mm/internal.h
index b99a634b4d09..32048e89bfda 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -500,10 +500,13 @@ extern const struct trace_print_flags gfpflag_names[];
 
 extern int copy_page_lists_mthread(struct page **to,
 			struct page **from, int nr_pages);
+extern int copy_page_lists_dma_always(struct page **to,
+			struct page **from, int nr_pages);
 
 extern int exchange_page_mthread(struct page *to, struct page *from,
 			int nr_pages);
 extern int exchange_page_lists_mthread(struct page **to,
 						  struct page **from, 
 						  int nr_pages);
+
 #endif	/* __MM_INTERNAL_H */
diff --git a/mm/migrate.c b/mm/migrate.c
index 464bc9ba8083..63e44ac65184 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1611,7 +1611,10 @@ static int copy_to_new_pages_concur(struct list_head *unmapped_list_ptr,
 
 	BUG_ON(idx != num_pages);
 	
-	if (mode & MIGRATE_MT)
+	if (mode & MIGRATE_DMA)
+		rc = copy_page_lists_dma_always(dst_page_list, src_page_list,
+							num_pages);
+	else if (mode & MIGRATE_MT)
 		rc = copy_page_lists_mthread(dst_page_list, src_page_list,
 							num_pages);
 
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [RFC PATCH 03/14] mm/migrate: Add copy_pages_mthread function
  2017-02-17 15:05 ` [RFC PATCH 03/14] mm/migrate: Add copy_pages_mthread function Zi Yan
@ 2017-02-23  6:06   ` Naoya Horiguchi
  2017-02-23  7:50     ` Anshuman Khandual
  0 siblings, 1 reply; 25+ messages in thread
From: Naoya Horiguchi @ 2017-02-23  6:06 UTC (permalink / raw)
  To: Zi Yan; +Cc: linux-mm, dnellans, apopple, paulmck, khandual, zi.yan

On Fri, Feb 17, 2017 at 10:05:40AM -0500, Zi Yan wrote:
> From: Zi Yan <ziy@nvidia.com>
> 
> This change adds a new function, copy_pages_mthread, to enable multi-threaded
> page copy which can be utilized during migration. This function splits the
> page copy request across multiple threads, each handling an individual chunk,
> and sends them as jobs to the system_highpri_wq work queue.
> 
> Signed-off-by: Zi Yan <zi.yan@cs.rutgers.edu>
> Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
> ---
>  include/linux/highmem.h |  2 ++
>  mm/Makefile             |  2 ++
>  mm/copy_pages.c         | 86 +++++++++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 90 insertions(+)
>  create mode 100644 mm/copy_pages.c
> 
> diff --git a/include/linux/highmem.h b/include/linux/highmem.h
> index bb3f3297062a..e1f4f1b82812 100644
> --- a/include/linux/highmem.h
> +++ b/include/linux/highmem.h
> @@ -236,6 +236,8 @@ static inline void copy_user_highpage(struct page *to, struct page *from,
>  
>  #endif
>  
> +int copy_pages_mthread(struct page *to, struct page *from, int nr_pages);
> +
>  static inline void copy_highpage(struct page *to, struct page *from)
>  {
>  	char *vfrom, *vto;
> diff --git a/mm/Makefile b/mm/Makefile
> index aa0aa17cb413..cdd4bab9cc66 100644
> --- a/mm/Makefile
> +++ b/mm/Makefile
> @@ -43,6 +43,8 @@ obj-y			:= filemap.o mempool.o oom_kill.o \
>  
>  obj-y += init-mm.o
>  
> +obj-y += copy_pages.o
> +
>  ifdef CONFIG_NO_BOOTMEM
>  	obj-y		+= nobootmem.o
>  else
> diff --git a/mm/copy_pages.c b/mm/copy_pages.c
> new file mode 100644
> index 000000000000..c357e7b01042
> --- /dev/null
> +++ b/mm/copy_pages.c
> @@ -0,0 +1,86 @@
> +/*
> + * This implements parallel page copy function through multi threaded
> + * work queues.
> + *
> + * Zi Yan <ziy@nvidia.com>
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2.
> + */
> +#include <linux/highmem.h>
> +#include <linux/workqueue.h>
> +#include <linux/slab.h>
> +#include <linux/freezer.h>
> +
> +/*
> + * nr_copythreads can be the highest number of threads for given node
> + * on any architecture. The actual number of copy threads will be
> + * limited by the cpumask weight of the target node.
> + */
> +unsigned int nr_copythreads = 8;

If this is given as a constant, how about defining it as a macro?
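
For example, something like the following (the macro name here is only a
suggestion):

	#define NR_COPYTHREADS	8	/* upper bound on copy worker threads */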

> +
> +struct copy_info {
> +	struct work_struct copy_work;
> +	char *to;
> +	char *from;
> +	unsigned long chunk_size;
> +};
> +
> +static void copy_pages(char *vto, char *vfrom, unsigned long size)
> +{
> +	memcpy(vto, vfrom, size);
> +}
> +
> +static void copythread(struct work_struct *work)
> +{
> +	struct copy_info *info = (struct copy_info *) work;
> +
> +	copy_pages(info->to, info->from, info->chunk_size);
> +}
> +
> +int copy_pages_mthread(struct page *to, struct page *from, int nr_pages)
> +{
> +	unsigned int node = page_to_nid(to);
> +	const struct cpumask *cpumask = cpumask_of_node(node);
> +	struct copy_info *work_items;
> +	char *vto, *vfrom;
> +	unsigned long i, cthreads, cpu, chunk_size;
> +	int cpu_id_list[32] = {0};

Why 32? Maybe you can set the array size with nr_copythreads (macro version.)

> +
> +	cthreads = nr_copythreads;
> +	cthreads = min_t(unsigned int, cthreads, cpumask_weight(cpumask));

nitpick, but looks a little wordy, can it be simply like below?

  cthreads = min_t(unsigned int, nr_copythreads, cpumask_weight(cpumask));

> +	cthreads = (cthreads / 2) * 2;

I'm not sure of the intention here. Should the # of threads be an even number?
If cpumask_weight() is 1, cthreads becomes 0, which could cause a division by zero.
So you had better make sure to prevent that.

Thanks,
Naoya Horiguchi

> +	work_items = kcalloc(cthreads, sizeof(struct copy_info), GFP_KERNEL);
> +	if (!work_items)
> +		return -ENOMEM;
> +
> +	i = 0;
> +	for_each_cpu(cpu, cpumask) {
> +		if (i >= cthreads)
> +			break;
> +		cpu_id_list[i] = cpu;
> +		++i;
> +	}
> +
> +	vfrom = kmap(from);
> +	vto = kmap(to);
> +	chunk_size = PAGE_SIZE * nr_pages / cthreads;
> +
> +	for (i = 0; i < cthreads; ++i) {
> +		INIT_WORK((struct work_struct *) &work_items[i], copythread);
> +
> +		work_items[i].to = vto + i * chunk_size;
> +		work_items[i].from = vfrom + i * chunk_size;
> +		work_items[i].chunk_size = chunk_size;
> +
> +		queue_work_on(cpu_id_list[i], system_highpri_wq,
> +					  (struct work_struct *) &work_items[i]);
> +	}
> +
> +	for (i = 0; i < cthreads; ++i)
> +		flush_work((struct work_struct *) &work_items[i]);
> +
> +	kunmap(to);
> +	kunmap(from);
> +	kfree(work_items);
> +	return 0;
> +}
> -- 
> 2.11.0
> 

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC PATCH 04/14] mm/migrate: Add new migrate mode MIGRATE_MT
  2017-02-17 15:05 ` [RFC PATCH 04/14] mm/migrate: Add new migrate mode MIGRATE_MT Zi Yan
@ 2017-02-23  6:54   ` Naoya Horiguchi
  2017-02-23  7:54     ` Anshuman Khandual
  0 siblings, 1 reply; 25+ messages in thread
From: Naoya Horiguchi @ 2017-02-23  6:54 UTC (permalink / raw)
  To: Zi Yan; +Cc: linux-mm, dnellans, apopple, paulmck, khandual, zi.yan

On Fri, Feb 17, 2017 at 10:05:41AM -0500, Zi Yan wrote:
> From: Zi Yan <ziy@nvidia.com>
> 
> This change adds a new migration mode called MIGRATE_MT to enable a
> multi-threaded page copy implementation inside the copy_huge_page() function
> by selectively calling copy_pages_mthread() when requested. It still falls
> back to the regular page copy mechanism in case the multi-threaded attempt
> fails. It also attempts multi-threaded copy for regular pages.
> 
> Signed-off-by: Zi Yan <zi.yan@cs.rutgers.edu>
> Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
> ---
>  include/linux/migrate_mode.h |  1 +
>  mm/migrate.c                 | 25 ++++++++++++++++++-------
>  2 files changed, 19 insertions(+), 7 deletions(-)
> 
> diff --git a/include/linux/migrate_mode.h b/include/linux/migrate_mode.h
> index 89c170060e5b..d344ad60f499 100644
> --- a/include/linux/migrate_mode.h
> +++ b/include/linux/migrate_mode.h
> @@ -12,6 +12,7 @@ enum migrate_mode {
>  	MIGRATE_SYNC_LIGHT	= 1<<1,
>  	MIGRATE_SYNC		= 1<<2,
>  	MIGRATE_ST		= 1<<3,
> +	MIGRATE_MT		= 1<<4,

Could you update the comment above this definition to cover the new flags.

Thanks,
Naoya Horiguchi

>  };
>  
>  #endif		/* MIGRATE_MODE_H_INCLUDED */
> diff --git a/mm/migrate.c b/mm/migrate.c
> index 87253cb9b50a..21307219428d 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -601,6 +601,7 @@ static void copy_huge_page(struct page *dst, struct page *src,
>  {
>  	int i;
>  	int nr_pages;
> +	int rc = -EFAULT;
>  
>  	if (PageHuge(src)) {
>  		/* hugetlbfs page */
> @@ -617,10 +618,14 @@ static void copy_huge_page(struct page *dst, struct page *src,
>  		nr_pages = hpage_nr_pages(src);
>  	}
>  
> -	for (i = 0; i < nr_pages; i++) {
> -		cond_resched();
> -		copy_highpage(dst + i, src + i);
> -	}
> +	if (mode & MIGRATE_MT)
> +		rc = copy_pages_mthread(dst, src, nr_pages);
> +
> +	if (rc)
> +		for (i = 0; i < nr_pages; i++) {
> +			cond_resched();
> +			copy_highpage(dst + i, src + i);
> +		}
>  }
>  
>  /*
> @@ -631,10 +636,16 @@ void migrate_page_copy(struct page *newpage, struct page *page,
>  {
>  	int cpupid;
>  
> -	if (PageHuge(page) || PageTransHuge(page))
> +	if (PageHuge(page) || PageTransHuge(page)) {
>  		copy_huge_page(newpage, page, mode);
> -	else
> -		copy_highpage(newpage, page);
> +	} else {
> +		if (mode & MIGRATE_MT) {
> +			if (copy_pages_mthread(newpage, page, 1))
> +				copy_highpage(newpage, page);
> +		} else {
> +			copy_highpage(newpage, page);
> +		}
> +	}
>  
>  	if (PageError(page))
>  		SetPageError(newpage);
> -- 
> 2.11.0
> 

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC PATCH 03/14] mm/migrate: Add copy_pages_mthread function
  2017-02-23  6:06   ` Naoya Horiguchi
@ 2017-02-23  7:50     ` Anshuman Khandual
  2017-02-23  8:02       ` Naoya Horiguchi
  0 siblings, 1 reply; 25+ messages in thread
From: Anshuman Khandual @ 2017-02-23  7:50 UTC (permalink / raw)
  To: Naoya Horiguchi, Zi Yan
  Cc: linux-mm, dnellans, apopple, paulmck, khandual, zi.yan

On 02/23/2017 11:36 AM, Naoya Horiguchi wrote:
> On Fri, Feb 17, 2017 at 10:05:40AM -0500, Zi Yan wrote:
>> From: Zi Yan <ziy@nvidia.com>
>>
>> This change adds a new function, copy_pages_mthread, to enable multi-threaded
>> page copy which can be utilized during migration. This function splits the
>> page copy request across multiple threads, each handling an individual chunk,
>> and sends them as jobs to the system_highpri_wq work queue.
>>
>> Signed-off-by: Zi Yan <zi.yan@cs.rutgers.edu>
>> Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
>> ---
>>  include/linux/highmem.h |  2 ++
>>  mm/Makefile             |  2 ++
>>  mm/copy_pages.c         | 86 +++++++++++++++++++++++++++++++++++++++++++++++++
>>  3 files changed, 90 insertions(+)
>>  create mode 100644 mm/copy_pages.c
>>
>> diff --git a/include/linux/highmem.h b/include/linux/highmem.h
>> index bb3f3297062a..e1f4f1b82812 100644
>> --- a/include/linux/highmem.h
>> +++ b/include/linux/highmem.h
>> @@ -236,6 +236,8 @@ static inline void copy_user_highpage(struct page *to, struct page *from,
>>  
>>  #endif
>>  
>> +int copy_pages_mthread(struct page *to, struct page *from, int nr_pages);
>> +
>>  static inline void copy_highpage(struct page *to, struct page *from)
>>  {
>>  	char *vfrom, *vto;
>> diff --git a/mm/Makefile b/mm/Makefile
>> index aa0aa17cb413..cdd4bab9cc66 100644
>> --- a/mm/Makefile
>> +++ b/mm/Makefile
>> @@ -43,6 +43,8 @@ obj-y			:= filemap.o mempool.o oom_kill.o \
>>  
>>  obj-y += init-mm.o
>>  
>> +obj-y += copy_pages.o
>> +
>>  ifdef CONFIG_NO_BOOTMEM
>>  	obj-y		+= nobootmem.o
>>  else
>> diff --git a/mm/copy_pages.c b/mm/copy_pages.c
>> new file mode 100644
>> index 000000000000..c357e7b01042
>> --- /dev/null
>> +++ b/mm/copy_pages.c
>> @@ -0,0 +1,86 @@
>> +/*
>> + * This implements parallel page copy function through multi threaded
>> + * work queues.
>> + *
>> + * Zi Yan <ziy@nvidia.com>
>> + *
>> + * This work is licensed under the terms of the GNU GPL, version 2.
>> + */
>> +#include <linux/highmem.h>
>> +#include <linux/workqueue.h>
>> +#include <linux/slab.h>
>> +#include <linux/freezer.h>
>> +
>> +/*
>> + * nr_copythreads can be the highest number of threads for given node
>> + * on any architecture. The actual number of copy threads will be
>> + * limited by the cpumask weight of the target node.
>> + */
>> +unsigned int nr_copythreads = 8;
> 
> If this is given as a constant, how about defining it as a macro?

Sure, will change it up next time around.

> 
>> +
>> +struct copy_info {
>> +	struct work_struct copy_work;
>> +	char *to;
>> +	char *from;
>> +	unsigned long chunk_size;
>> +};
>> +
>> +static void copy_pages(char *vto, char *vfrom, unsigned long size)
>> +{
>> +	memcpy(vto, vfrom, size);
>> +}
>> +
>> +static void copythread(struct work_struct *work)
>> +{
>> +	struct copy_info *info = (struct copy_info *) work;
>> +
>> +	copy_pages(info->to, info->from, info->chunk_size);
>> +}
>> +
>> +int copy_pages_mthread(struct page *to, struct page *from, int nr_pages)
>> +{
>> +	unsigned int node = page_to_nid(to);
>> +	const struct cpumask *cpumask = cpumask_of_node(node);
>> +	struct copy_info *work_items;
>> +	char *vto, *vfrom;
>> +	unsigned long i, cthreads, cpu, chunk_size;
>> +	int cpu_id_list[32] = {0};
> 
> Why 32? Maybe you can set the array size with nr_copythreads (macro version.)

Sure, will do.

> 
>> +
>> +	cthreads = nr_copythreads;
>> +	cthreads = min_t(unsigned int, cthreads, cpumask_weight(cpumask));
> 
> nitpick, but looks a little wordy, can it be simply like below?
> 
>   cthreads = min_t(unsigned int, nr_copythreads, cpumask_weight(cpumask));
> 
>> +	cthreads = (cthreads / 2) * 2;
> 
> I'm not sure of the intention here. Should the # of threads be an even number?

Yes.

> If cpumask_weight() is 1, cthreads becomes 0, which could cause a division by zero.
> So you had better make sure to prevent that.

If cpumask_weight() is 1, then min_t(unsigned int, 8, 1) should be
greater than or equal to 1. Then cthreads can end up as 0. That is
possible. But how is there a chance of a division by zero? Maybe it's
possible if we are trying to move into a CPU-less, memory-only node
where cpumask_weight() is 0?




^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC PATCH 04/14] mm/migrate: Add new migrate mode MIGRATE_MT
  2017-02-23  6:54   ` Naoya Horiguchi
@ 2017-02-23  7:54     ` Anshuman Khandual
  0 siblings, 0 replies; 25+ messages in thread
From: Anshuman Khandual @ 2017-02-23  7:54 UTC (permalink / raw)
  To: Naoya Horiguchi, Zi Yan
  Cc: linux-mm, dnellans, apopple, paulmck, khandual, zi.yan

On 02/23/2017 12:24 PM, Naoya Horiguchi wrote:
> On Fri, Feb 17, 2017 at 10:05:41AM -0500, Zi Yan wrote:
>> From: Zi Yan <ziy@nvidia.com>
>>
>> This change adds a new migration mode called MIGRATE_MT to enable a
>> multi-threaded page copy implementation inside the copy_huge_page() function
>> by selectively calling copy_pages_mthread() when requested. It still falls
>> back to the regular page copy mechanism in case the multi-threaded attempt
>> fails. It also attempts multi-threaded copy for regular pages.
>>
>> Signed-off-by: Zi Yan <zi.yan@cs.rutgers.edu>
>> Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
>> ---
>>  include/linux/migrate_mode.h |  1 +
>>  mm/migrate.c                 | 25 ++++++++++++++++++-------
>>  2 files changed, 19 insertions(+), 7 deletions(-)
>>
>> diff --git a/include/linux/migrate_mode.h b/include/linux/migrate_mode.h
>> index 89c170060e5b..d344ad60f499 100644
>> --- a/include/linux/migrate_mode.h
>> +++ b/include/linux/migrate_mode.h
>> @@ -12,6 +12,7 @@ enum migrate_mode {
>>  	MIGRATE_SYNC_LIGHT	= 1<<1,
>>  	MIGRATE_SYNC		= 1<<2,
>>  	MIGRATE_ST		= 1<<3,
>> +	MIGRATE_MT		= 1<<4,
> 
> Could you update the comment above this definition to cover the new flags.

Sure, will do.


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC PATCH 03/14] mm/migrate: Add copy_pages_mthread function
  2017-02-23  7:50     ` Anshuman Khandual
@ 2017-02-23  8:02       ` Naoya Horiguchi
  2017-03-09  5:35         ` Anshuman Khandual
  0 siblings, 1 reply; 25+ messages in thread
From: Naoya Horiguchi @ 2017-02-23  8:02 UTC (permalink / raw)
  To: Anshuman Khandual; +Cc: Zi Yan, linux-mm, dnellans, apopple, paulmck, zi.yan

On Thu, Feb 23, 2017 at 01:20:16PM +0530, Anshuman Khandual wrote:
...
> > 
> >> +
> >> +	cthreads = nr_copythreads;
> >> +	cthreads = min_t(unsigned int, cthreads, cpumask_weight(cpumask));
> > 
> > nitpick, but looks a little wordy, can it be simply like below?
> > 
> >   cthreads = min_t(unsigned int, nr_copythreads, cpumask_weight(cpumask));
> > 
> >> +	cthreads = (cthreads / 2) * 2;
> > 
> > I'm not sure of the intention here. Should the # of threads be an even number?
> 
> Yes.
> 
> > If cpumask_weight() is 1, cthreads becomes 0, which could cause a division by zero.
> > So you had better make sure to prevent that.
> 
> If cpumask_weight() is 1, then min_t(unsigned int, 8, 1) should be
> greater than or equal to 1. Then cthreads can end up as 0. That is
> possible. But how is there a chance of a division by zero?

Hi Anshuman,

I was just thinking of that possibility when reading the line your patch introduces:

       chunk_size = PAGE_SIZE * nr_pages / cthreads
                                           ~~~~~~~~
                                           (this can be 0?)
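
For illustration, one possible guard (a sketch only, not taken from the
posted series):

	cthreads = min_t(unsigned int, nr_copythreads, cpumask_weight(cpumask));
	/* keep the count even and non-zero so chunk_size never divides by zero */
	cthreads = max_t(unsigned int, (cthreads / 2) * 2, 2);
	chunk_size = PAGE_SIZE * nr_pages / cthreads;

(Whether queueing copy work near a CPU-less node makes sense at all is a
separate question.)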

- Naoya

> Maybe it's
> possible if we are trying to move into a CPU-less, memory-only node
> where cpumask_weight() is 0?

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC PATCH 07/14] migrate: Add copy_page_lists_mthread() function.
  2017-02-17 15:05 ` [RFC PATCH 07/14] migrate: Add copy_page_lists_mthread() function Zi Yan
@ 2017-02-23  8:54   ` Naoya Horiguchi
  2017-03-09 13:02     ` Anshuman Khandual
  0 siblings, 1 reply; 25+ messages in thread
From: Naoya Horiguchi @ 2017-02-23  8:54 UTC (permalink / raw)
  To: Zi Yan; +Cc: linux-mm, dnellans, apopple, paulmck, khandual, zi.yan

On Fri, Feb 17, 2017 at 10:05:44AM -0500, Zi Yan wrote:
> From: Zi Yan <ziy@nvidia.com>
> 
> It supports copying a list of pages via multiple threads.
> It evenly distributes the pages to a group of threads and
> uses the same subroutine as copy_pages_mthread().

The new function duplicates many lines of copy_pages_mthread(), so please
consider factoring them out into a common routine. That would make your code
more readable and maintainable.
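
As a rough illustration, the shared part could become a helper along these
lines (a sketch only; the helper name is made up here, not taken from the
series):

	/* pick the copy worker CPUs on @node; returns the number of threads */
	static unsigned int get_copy_cpus(int node, int *cpu_id_list,
					  unsigned int max_threads)
	{
		const struct cpumask *cpumask = cpumask_of_node(node);
		unsigned int cthreads, i = 0;
		int cpu;

		cthreads = min_t(unsigned int, max_threads, cpumask_weight(cpumask));
		cthreads = (cthreads / 2) * 2;

		for_each_cpu(cpu, cpumask) {
			if (i >= cthreads)
				break;
			cpu_id_list[i++] = cpu;
		}
		return cthreads;
	}

Both copy_pages_mthread() and copy_page_lists_mthread() could then call it
instead of open-coding the same loop.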

Thanks,
Naoya Horiguchi

> 
> Signed-off-by: Zi Yan <ziy@nvidia.com>
> ---
>  mm/copy_pages.c | 62 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  mm/internal.h   |  3 +++
>  2 files changed, 65 insertions(+)
> 
> diff --git a/mm/copy_pages.c b/mm/copy_pages.c
> index c357e7b01042..516c0a1a57f3 100644
> --- a/mm/copy_pages.c
> +++ b/mm/copy_pages.c
> @@ -84,3 +84,65 @@ int copy_pages_mthread(struct page *to, struct page *from, int nr_pages)
>  	kfree(work_items);
>  	return 0;
>  }
> +
> +int copy_page_lists_mthread(struct page **to, struct page **from, int nr_pages) 
> +{
> +	int err = 0;
> +	unsigned int cthreads, node = page_to_nid(*to);
> +	int i;
> +	struct copy_info *work_items;
> +	int nr_pages_per_page = hpage_nr_pages(*from);
> +	const struct cpumask *cpumask = cpumask_of_node(node);
> +	int cpu_id_list[32] = {0};
> +	int cpu;
> +
> +	cthreads = nr_copythreads;
> +	cthreads = min_t(unsigned int, cthreads, cpumask_weight(cpumask));
> +	cthreads = (cthreads / 2) * 2;
> +	cthreads = min_t(unsigned int, nr_pages, cthreads);
> +
> +	work_items = kzalloc(sizeof(struct copy_info)*nr_pages,
> +						 GFP_KERNEL);
> +	if (!work_items)
> +		return -ENOMEM;
> +
> +	i = 0;
> +	for_each_cpu(cpu, cpumask) {
> +		if (i >= cthreads)
> +			break;
> +		cpu_id_list[i] = cpu;
> +		++i;
> +	}
> +
> +	for (i = 0; i < nr_pages; ++i) {
> +		int thread_idx = i % cthreads;
> +
> +		INIT_WORK((struct work_struct *)&work_items[i], 
> +				  copythread);
> +
> +		work_items[i].to = kmap(to[i]);
> +		work_items[i].from = kmap(from[i]);
> +		work_items[i].chunk_size = PAGE_SIZE * hpage_nr_pages(from[i]);
> +
> +		BUG_ON(nr_pages_per_page != hpage_nr_pages(from[i]));
> +		BUG_ON(nr_pages_per_page != hpage_nr_pages(to[i]));
> +
> +
> +		queue_work_on(cpu_id_list[thread_idx], 
> +					  system_highpri_wq, 
> +					  (struct work_struct *)&work_items[i]);
> +	}
> +
> +	/* Wait until it finishes  */
> +	for (i = 0; i < cthreads; ++i)
> +		flush_work((struct work_struct *) &work_items[i]);
> +
> +	for (i = 0; i < nr_pages; ++i) {
> +			kunmap(to[i]);
> +			kunmap(from[i]);
> +	}
> +
> +	kfree(work_items);
> +
> +	return err;
> +}
> diff --git a/mm/internal.h b/mm/internal.h
> index ccfc2a2969f4..175e08ed524a 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -498,4 +498,7 @@ extern const struct trace_print_flags pageflag_names[];
>  extern const struct trace_print_flags vmaflag_names[];
>  extern const struct trace_print_flags gfpflag_names[];
>  
> +extern int copy_page_lists_mthread(struct page **to,
> +			struct page **from, int nr_pages);
> +
>  #endif	/* __MM_INTERNAL_H */
> -- 
> 2.11.0
> 

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC PATCH 08/14] mm: migrate: Add concurrent page migration into move_pages syscall.
  2017-02-17 15:05 ` [RFC PATCH 08/14] mm: migrate: Add concurrent page migration into move_pages syscall Zi Yan
@ 2017-02-24  8:25   ` Naoya Horiguchi
  2017-02-24 15:05     ` Zi Yan
  0 siblings, 1 reply; 25+ messages in thread
From: Naoya Horiguchi @ 2017-02-24  8:25 UTC (permalink / raw)
  To: Zi Yan; +Cc: linux-mm, dnellans, apopple, paulmck, khandual, zi.yan

On Fri, Feb 17, 2017 at 10:05:45AM -0500, Zi Yan wrote:
> From: Zi Yan <ziy@nvidia.com>
>
> Concurrent page migration moves a list of pages all together and
> concurrently, using multiple threads. This is different from the
> existing page migration process, which migrates pages sequentially.
> The current implementation only migrates anonymous pages.

Please explain more about your new migration scheme; in particular, the
difference from the original page migration code is very important for
reviewers and other developers to understand your work quickly.

>
> Signed-off-by: Zi Yan <ziy@nvidia.com>
> ---
>  include/linux/migrate_mode.h   |   1 +
>  include/uapi/linux/mempolicy.h |   1 +
>  mm/migrate.c                   | 495 ++++++++++++++++++++++++++++++++++++++++-
>  3 files changed, 492 insertions(+), 5 deletions(-)
>
> diff --git a/include/linux/migrate_mode.h b/include/linux/migrate_mode.h
> index d344ad60f499..2bd849d89122 100644
> --- a/include/linux/migrate_mode.h
> +++ b/include/linux/migrate_mode.h
> @@ -13,6 +13,7 @@ enum migrate_mode {
>  	MIGRATE_SYNC		= 1<<2,
>  	MIGRATE_ST		= 1<<3,
>  	MIGRATE_MT		= 1<<4,
> +	MIGRATE_CONCUR		= 1<<5,

This new flag MIGRATE_CONCUR seems unused by any other code, so is it unneeded
now, or is there a typo somewhere?

>  };
>
>  #endif		/* MIGRATE_MODE_H_INCLUDED */
> diff --git a/include/uapi/linux/mempolicy.h b/include/uapi/linux/mempolicy.h
> index 8f1db2e2d677..6d9758a32053 100644
> --- a/include/uapi/linux/mempolicy.h
> +++ b/include/uapi/linux/mempolicy.h
> @@ -54,6 +54,7 @@ enum mpol_rebind_step {
>  #define MPOL_MF_LAZY	 (1<<3)	/* Modifies '_MOVE:  lazy migrate on fault */
>  #define MPOL_MF_INTERNAL (1<<4)	/* Internal flags start here */
>  #define MPOL_MF_MOVE_MT  (1<<6)	/* Use multi-threaded page copy routine */
> +#define MPOL_MF_MOVE_CONCUR  (1<<7)	/* Migrate a list of pages concurrently */
>
>  #define MPOL_MF_VALID	(MPOL_MF_STRICT   | 	\
>  			 MPOL_MF_MOVE     | 	\
> diff --git a/mm/migrate.c b/mm/migrate.c
> index 0e9b1f17cf8b..a35e6fd43a50 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -50,6 +50,14 @@
>
>  int mt_page_copy = 0;
>
> +
> +struct page_migration_work_item {
> +	struct page *old_page;
> +	struct page *new_page;
> +	struct anon_vma *anon_vma;
> +	struct list_head list;
> +};
> +
>  /*
>   * migrate_prep() needs to be called before we start compiling a list of pages
>   * to be migrated using isolate_lru_page(). If scheduling work on other CPUs is
> @@ -1312,6 +1320,471 @@ static int unmap_and_move_huge_page(new_page_t get_new_page,
>  	return rc;
>  }
>
> +static int __unmap_page_concur(struct page *page, struct page *newpage,

Most of this function is a copy of __unmap_and_move(), so please define
a new subfunction and make both __unmap_page_concur() and __unmap_and_move()
call it.

> +				struct anon_vma **anon_vma,
> +				int force, enum migrate_mode mode)
> +{
> +	int rc = -EAGAIN;
> +
> +	if (!trylock_page(page)) {
> +		if (!force || mode == MIGRATE_ASYNC)
> +			goto out;
> +
> +		/*
> +		 * It's not safe for direct compaction to call lock_page.
> +		 * For example, during page readahead pages are added locked
> +		 * to the LRU. Later, when the IO completes the pages are
> +		 * marked uptodate and unlocked. However, the queueing
> +		 * could be merging multiple pages for one bio (e.g.
> +		 * mpage_readpages). If an allocation happens for the
> +		 * second or third page, the process can end up locking
> +		 * the same page twice and deadlocking. Rather than
> +		 * trying to be clever about what pages can be locked,
> +		 * avoid the use of lock_page for direct compaction
> +		 * altogether.
> +		 */
> +		if (current->flags & PF_MEMALLOC)
> +			goto out;
> +
> +		lock_page(page);
> +	}
> +
> +	/* We are working on page_mapping(page) == NULL */
> +	VM_BUG_ON_PAGE(PageWriteback(page), page);

Although an anonymous page shouldn't have PageWriteback set, the existing
migration code (below) doesn't call VM_BUG_ON_PAGE even in that case. Is there
any special reason to do this differently for concurrent migration?

        if (PageWriteback(page)) {
                /*
                 * Only in the case of a full synchronous migration is it
                 * necessary to wait for PageWriteback. In the async case,
                 * the retry loop is too short and in the sync-light case,
                 * the overhead of stalling is too much
                 */
                if (mode != MIGRATE_SYNC) {
                        rc = -EBUSY;
                        goto out_unlock;
                }
                if (!force)
                        goto out_unlock;
                wait_on_page_writeback(page);
        }

> +
> +	/*
> +	 * By try_to_unmap(), page->mapcount goes down to 0 here. In this case,
> +	 * we cannot notice that anon_vma is freed while we migrates a page.
> +	 * This get_anon_vma() delays freeing anon_vma pointer until the end
> +	 * of migration. File cache pages are no problem because of page_lock()
> +	 * File Caches may use write_page() or lock_page() in migration, then,
> +	 * just care Anon page here.
> +	 *
> +	 * Only page_get_anon_vma() understands the subtleties of
> +	 * getting a hold on an anon_vma from outside one of its mms.
> +	 * But if we cannot get anon_vma, then we won't need it anyway,
> +	 * because that implies that the anon page is no longer mapped
> +	 * (and cannot be remapped so long as we hold the page lock).
> +	 */
> +	if (PageAnon(page) && !PageKsm(page))
> +		*anon_vma = page_get_anon_vma(page);
> +
> +	/*
> +	 * Block others from accessing the new page when we get around to
> +	 * establishing additional references. We are usually the only one
> +	 * holding a reference to newpage at this point. We used to have a BUG
> +	 * here if trylock_page(newpage) fails, but would like to allow for
> +	 * cases where there might be a race with the previous use of newpage.
> +	 * This is much like races on refcount of oldpage: just don't BUG().
> +	 */
> +	if (unlikely(!trylock_page(newpage)))
> +		goto out_unlock;
> +
> +	/*
> +	 * Corner case handling:
> +	 * 1. When a new swap-cache page is read into, it is added to the LRU
> +	 * and treated as swapcache but it has no rmap yet.
> +	 * Calling try_to_unmap() against a page->mapping==NULL page will
> +	 * trigger a BUG.  So handle it here.
> +	 * 2. An orphaned page (see truncate_complete_page) might have
> +	 * fs-private metadata. The page can be picked up due to memory
> +	 * offlining.  Everywhere else except page reclaim, the page is
> +	 * invisible to the vm, so the page can not be migrated.  So try to
> +	 * free the metadata, so the page can be freed.
> +	 */
> +	if (!page->mapping) {
> +		VM_BUG_ON_PAGE(PageAnon(page), page);
> +		if (page_has_private(page)) {
> +			try_to_free_buffers(page);
> +			goto out_unlock_both;
> +		}
> +	} else {
> +		VM_BUG_ON_PAGE(!page_mapped(page), page);
> +		/* Establish migration ptes */
> +		VM_BUG_ON_PAGE(PageAnon(page) && !PageKsm(page) && !*anon_vma,
> +				page);
> +		rc = try_to_unmap(page,
> +			TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS);
> +	}
> +
> +	return rc;
> +
> +out_unlock_both:
> +	unlock_page(newpage);
> +out_unlock:
> +	/* Drop an anon_vma reference if we took one */
> +	if (*anon_vma)
> +		put_anon_vma(*anon_vma);
> +	unlock_page(page);
> +out:
> +	return rc;
> +}
> +
> +static int unmap_pages_and_get_new_concur(new_page_t get_new_page,
> +				free_page_t put_new_page, unsigned long private,
> +				struct page_migration_work_item *item,
> +				int force,
> +				enum migrate_mode mode, int reason)

There are many duplicates here too, but you pass struct page_migration_work_item
as an argument, so the duplication might be OK.

> +{
> +	int rc = MIGRATEPAGE_SUCCESS;
> +	int *result = NULL;
> +
> +
> +	item->new_page = get_new_page(item->old_page, private, &result);
> +
> +	if (!item->new_page) {
> +		rc = -ENOMEM;
> +		return rc;
> +	}
> +
> +	if (page_count(item->old_page) == 1) {
> +		rc = -ECANCELED;
> +		goto out;
> +	}
> +
> +	if (unlikely(PageTransHuge(item->old_page) &&
> +		!PageTransHuge(item->new_page))) {
> +		lock_page(item->old_page);
> +		rc = split_huge_page(item->old_page);
> +		unlock_page(item->old_page);
> +		if (rc)
> +			goto out;
> +	}
> +
> +	rc = __unmap_page_concur(item->old_page, item->new_page, &item->anon_vma,
> +							force, mode);
> +	if (rc == MIGRATEPAGE_SUCCESS) {
> +		put_new_page = NULL;
> +		return rc;
> +	}
> +
> +out:
> +	if (rc != -EAGAIN) {
> +		list_del(&item->old_page->lru);
> +		dec_zone_page_state(item->old_page, NR_ISOLATED_ANON +
> +				page_is_file_cache(item->old_page));
> +
> +		putback_lru_page(item->old_page);
> +	}
> +
> +	/*
> +	 * If migration was not successful and there's a freeing callback, use
> +	 * it.  Otherwise, putback_lru_page() will drop the reference grabbed
> +	 * during isolation.
> +	 */
> +	if (put_new_page)
> +		put_new_page(item->new_page, private);
> +	else
> +		putback_lru_page(item->new_page);
> +
> +	if (result) {
> +		if (rc)
> +			*result = rc;
> +		else
> +			*result = page_to_nid(item->new_page);
> +	}
> +
> +	return rc;
> +}
> +
> +static int move_mapping_concurr(struct list_head *unmapped_list_ptr,
> +					   struct list_head *wip_list_ptr,
> +					   enum migrate_mode mode)
> +{
> +	struct page_migration_work_item *iterator, *iterator2;
> +	struct address_space *mapping;
> +
> +	list_for_each_entry_safe(iterator, iterator2, unmapped_list_ptr, list) {
> +		VM_BUG_ON_PAGE(!PageLocked(iterator->old_page), iterator->old_page);
> +		VM_BUG_ON_PAGE(!PageLocked(iterator->new_page), iterator->new_page);
> +
> +		mapping = page_mapping(iterator->old_page);
> +
> +		VM_BUG_ON(mapping);
> +
> +		VM_BUG_ON(PageWriteback(iterator->old_page));
> +
> +		if (page_count(iterator->old_page) != 1) {
> +			list_move(&iterator->list, wip_list_ptr);
> +			continue;
> +		}
> +
> +		iterator->new_page->index = iterator->old_page->index;
> +		iterator->new_page->mapping = iterator->old_page->mapping;
> +		if (PageSwapBacked(iterator->old_page))
> +			SetPageSwapBacked(iterator->new_page);
> +	}
> +
> +	return 0;
> +}
> +
> +static void migrate_page_copy_page_flags(struct page *newpage, struct page *page)

This function is nearly identical to migrate_page_copy(), so please have
migrate_page_copy() call this function internally instead of duplicating the code.
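
That is, roughly (a sketch of the suggestion only):

	void migrate_page_copy(struct page *newpage, struct page *page,
			       enum migrate_mode mode)
	{
		/* ... copy the page data as before ... */
		migrate_page_copy_page_flags(newpage, page);
	}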

> +{
> +	int cpupid;
> +
> +	if (PageError(page))
> +		SetPageError(newpage);
> +	if (PageReferenced(page))
> +		SetPageReferenced(newpage);
> +	if (PageUptodate(page))
> +		SetPageUptodate(newpage);
> +	if (TestClearPageActive(page)) {
> +		VM_BUG_ON_PAGE(PageUnevictable(page), page);
> +		SetPageActive(newpage);
> +	} else if (TestClearPageUnevictable(page))
> +		SetPageUnevictable(newpage);
> +	if (PageChecked(page))
> +		SetPageChecked(newpage);
> +	if (PageMappedToDisk(page))
> +		SetPageMappedToDisk(newpage);
> +
> +	/* Move dirty on pages not done by migrate_page_move_mapping() */
> +	if (PageDirty(page))
> +		SetPageDirty(newpage);
> +
> +	if (page_is_young(page))
> +		set_page_young(newpage);
> +	if (page_is_idle(page))
> +		set_page_idle(newpage);
> +
> +	/*
> +	 * Copy NUMA information to the new page, to prevent over-eager
> +	 * future migrations of this same page.
> +	 */
> +	cpupid = page_cpupid_xchg_last(page, -1);
> +	page_cpupid_xchg_last(newpage, cpupid);
> +
> +	ksm_migrate_page(newpage, page);
> +	/*
> +	 * Please do not reorder this without considering how mm/ksm.c's
> +	 * get_ksm_page() depends upon ksm_migrate_page() and PageSwapCache().
> +	 */
> +	if (PageSwapCache(page))
> +		ClearPageSwapCache(page);
> +	ClearPagePrivate(page);
> +	set_page_private(page, 0);
> +
> +	/*
> +	 * If any waiters have accumulated on the new page then
> +	 * wake them up.
> +	 */
> +	if (PageWriteback(newpage))
> +		end_page_writeback(newpage);
> +
> +	copy_page_owner(page, newpage);
> +
> +	mem_cgroup_migrate(page, newpage);
> +}
> +
> +
> +static int copy_to_new_pages_concur(struct list_head *unmapped_list_ptr,
> +				enum migrate_mode mode)
> +{
> +	struct page_migration_work_item *iterator;
> +	int num_pages = 0, idx = 0;
> +	struct page **src_page_list = NULL, **dst_page_list = NULL;
> +	unsigned long size = 0;
> +	int rc = -EFAULT;
> +
> +	list_for_each_entry(iterator, unmapped_list_ptr, list) {
> +		++num_pages;
> +		size += PAGE_SIZE * hpage_nr_pages(iterator->old_page);
> +	}
> +
> +	src_page_list = kzalloc(sizeof(struct page *)*num_pages, GFP_KERNEL);
> +	if (!src_page_list)
> +		return -ENOMEM;
> +	dst_page_list = kzalloc(sizeof(struct page *)*num_pages, GFP_KERNEL);
> +	if (!dst_page_list)
> +		return -ENOMEM;
> +
> +	list_for_each_entry(iterator, unmapped_list_ptr, list) {
> +		src_page_list[idx] = iterator->old_page;
> +		dst_page_list[idx] = iterator->new_page;
> +		++idx;
> +	}
> +
> +	BUG_ON(idx != num_pages);
> +
> +	if (mode & MIGRATE_MT)

Just a guess, but do you mean MIGRATE_CONCUR here?

> +		rc = copy_page_lists_mthread(dst_page_list, src_page_list,
> +							num_pages);
> +
> +	if (rc)
> +		list_for_each_entry(iterator, unmapped_list_ptr, list) {
> +			if (PageHuge(iterator->old_page) ||
> +				PageTransHuge(iterator->old_page))
> +				copy_huge_page(iterator->new_page, iterator->old_page, 0);
> +			else
> +				copy_highpage(iterator->new_page, iterator->old_page);
> +		}
> +
> +	kfree(src_page_list);
> +	kfree(dst_page_list);
> +
> +	list_for_each_entry(iterator, unmapped_list_ptr, list) {
> +		migrate_page_copy_page_flags(iterator->new_page, iterator->old_page);
> +	}
> +
> +	return 0;
> +}
> +
> +static int remove_migration_ptes_concurr(struct list_head *unmapped_list_ptr)
> +{
> +	struct page_migration_work_item *iterator, *iterator2;
> +
> +	list_for_each_entry_safe(iterator, iterator2, unmapped_list_ptr, list) {
> +		remove_migration_ptes(iterator->old_page, iterator->new_page, false);
> +
> +		unlock_page(iterator->new_page);
> +
> +		if (iterator->anon_vma)
> +			put_anon_vma(iterator->anon_vma);
> +
> +		unlock_page(iterator->old_page);
> +
> +		list_del(&iterator->old_page->lru);
> +		dec_zone_page_state(iterator->old_page, NR_ISOLATED_ANON +
> +				page_is_file_cache(iterator->old_page));
> +
> +		putback_lru_page(iterator->old_page);
> +		iterator->old_page = NULL;
> +
> +		putback_lru_page(iterator->new_page);
> +		iterator->new_page = NULL;
> +	}
> +
> +	return 0;
> +}
> +
> +int migrate_pages_concur(struct list_head *from, new_page_t get_new_page,
> +		free_page_t put_new_page, unsigned long private,
> +		enum migrate_mode mode, int reason)
> +{
> +	int retry = 1;
> +	int nr_failed = 0;
> +	int nr_succeeded = 0;
> +	int pass = 0;
> +	struct page *page;
> +	int swapwrite = current->flags & PF_SWAPWRITE;
> +	int rc;
> +	int total_num_pages = 0, idx;
> +	struct page_migration_work_item *item_list;
> +	struct page_migration_work_item *iterator, *iterator2;
> +	int item_list_order = 0;
> +
> +	LIST_HEAD(wip_list);
> +	LIST_HEAD(unmapped_list);
> +	LIST_HEAD(serialized_list);
> +	LIST_HEAD(failed_list);
> +
> +	if (!swapwrite)
> +		current->flags |= PF_SWAPWRITE;
> +
> +	list_for_each_entry(page, from, lru)
> +		++total_num_pages;
> +
> +	item_list_order = get_order(total_num_pages *
> +		sizeof(struct page_migration_work_item));
> +
> +	if (item_list_order > MAX_ORDER) {
> +		item_list = alloc_pages_exact(total_num_pages *
> +			sizeof(struct page_migration_work_item), GFP_ATOMIC);
> +		memset(item_list, 0, total_num_pages *
> +			sizeof(struct page_migration_work_item));
> +	} else {
> +		item_list = (struct page_migration_work_item *)__get_free_pages(GFP_ATOMIC,
> +						item_list_order);

The allocation could fail, so error handling is needed here.

> +		memset(item_list, 0, PAGE_SIZE<<item_list_order);
> +	}
> +
> +	idx = 0;
> +	list_for_each_entry(page, from, lru) {
> +		item_list[idx].old_page = page;
> +		item_list[idx].new_page = NULL;
> +		INIT_LIST_HEAD(&item_list[idx].list);
> +		list_add_tail(&item_list[idx].list, &wip_list);
> +		idx += 1;
> +	}

At this point, all migration target pages have been moved to wip_list, so
the list 'from' (passed in from and returned back to the caller) becomes empty.
When all migration trials are done and pages still remain on wip_list
and/or serialized_list, the remaining pages should be moved back to 'from'.

> +
> +	for(pass = 0; pass < 1 && retry; pass++) {
> +		retry = 0;
> +
> +		/* unmap and get new page for page_mapping(page) == NULL */
> +		list_for_each_entry_safe(iterator, iterator2, &wip_list, list) {
> +			cond_resched();
> +
> +			if (iterator->new_page)
> +				continue;
> +
> +			/* We do not migrate huge pages, file-backed, or swapcached pages */

Saying just "huge pages" is confusing; maybe you mean hugetlb pages.

> +			if (PageHuge(iterator->old_page))
> +				rc = -ENODEV;
> +			else if ((page_mapping(iterator->old_page) != NULL))
> +				rc = -ENODEV;
> +			else
> +				rc = unmap_pages_and_get_new_concur(get_new_page, put_new_page,
> +						private, iterator, pass > 2, mode,
> +						reason);
> +
> +			switch(rc) {
> +			case -ENODEV:
> +				list_move(&iterator->list, &serialized_list);
> +				break;
> +			case -ENOMEM:
> +				goto out;
> +			case -EAGAIN:
> +				retry++;
> +				break;
> +			case MIGRATEPAGE_SUCCESS:
> +				list_move(&iterator->list, &unmapped_list);
> +				nr_succeeded++;
> +				break;
> +			default:
> +				/*
> +				 * Permanent failure (-EBUSY, -ENOSYS, etc.):
> +				 * unlike -EAGAIN case, the failed page is
> +				 * removed from migration page list and not
> +				 * retried in the next outer loop.
> +				 */
> +				list_move(&iterator->list, &failed_list);
> +				nr_failed++;
> +				break;
> +			}
> +		}
> +		/* move page->mapping to new page, only -EAGAIN could happen  */
> +		move_mapping_concurr(&unmapped_list, &wip_list, mode);
> +		/* copy pages in unmapped_list */
> +		copy_to_new_pages_concur(&unmapped_list, mode);
> +		/* remove migration pte, if old_page is NULL?, unlock old and new
> +		 * pages, put anon_vma, put old and new pages */
> +		remove_migration_ptes_concurr(&unmapped_list);
> +	}
> +	nr_failed += retry;
> +	rc = nr_failed;
> +
> +	if (!list_empty(&serialized_list))
> +		rc = migrate_pages(from, get_new_page, put_new_page,

You should pass &serialized_list instead of from here, right?

Thanks,
Naoya Horiguchi

> +				private, mode, reason);
> +out:
> +	if (nr_succeeded)
> +		count_vm_events(PGMIGRATE_SUCCESS, nr_succeeded);
> +	if (nr_failed)
> +		count_vm_events(PGMIGRATE_FAIL, nr_failed);
> +	trace_mm_migrate_pages(nr_succeeded, nr_failed, mode, reason);
> +
> +	if (item_list_order >= MAX_ORDER)
> +		free_pages_exact(item_list, total_num_pages *
> +			sizeof(struct page_migration_work_item));
> +	else
> +		free_pages((unsigned long)item_list, item_list_order);
> +
> +	if (!swapwrite)
> +		current->flags &= ~PF_SWAPWRITE;
> +
> +	return rc;
> +}
> +
>  /*
>   * migrate_pages - migrate the pages specified in a list, to the free pages
>   *		   supplied as the target for the page migration
> @@ -1452,7 +1925,8 @@ static struct page *new_page_node(struct page *p, unsigned long private,
>  static int do_move_page_to_node_array(struct mm_struct *mm,
>  				      struct page_to_node *pm,
>  				      int migrate_all,
> -					  int migrate_use_mt)
> +					  int migrate_use_mt,
> +					  int migrate_concur)
>  {
>  	int err;
>  	struct page_to_node *pp;
> @@ -1536,8 +2010,16 @@ static int do_move_page_to_node_array(struct mm_struct *mm,
>
>  	err = 0;
>  	if (!list_empty(&pagelist)) {
> -		err = migrate_pages(&pagelist, new_page_node, NULL,
> -				(unsigned long)pm, mode, MR_SYSCALL);
> +		if (migrate_concur)
> +			err = migrate_pages_concur(&pagelist, new_page_node, NULL,
> +					(unsigned long)pm,
> +					mode,
> +					MR_SYSCALL);
> +		else
> +			err = migrate_pages(&pagelist, new_page_node, NULL,
> +					(unsigned long)pm,
> +					mode,
> +					MR_SYSCALL);
>  		if (err)
>  			putback_movable_pages(&pagelist);
>  	}
> @@ -1615,7 +2097,8 @@ static int do_pages_move(struct mm_struct *mm, nodemask_t task_nodes,
>  		/* Migrate this chunk */
>  		err = do_move_page_to_node_array(mm, pm,
>  						 flags & MPOL_MF_MOVE_ALL,
> -						 flags & MPOL_MF_MOVE_MT);
> +						 flags & MPOL_MF_MOVE_MT,
> +						 flags & MPOL_MF_MOVE_CONCUR);
>  		if (err < 0)
>  			goto out_pm;
>
> @@ -1722,7 +2205,9 @@ SYSCALL_DEFINE6(move_pages, pid_t, pid, unsigned long, nr_pages,
>  	nodemask_t task_nodes;
>
>  	/* Check flags */
> -	if (flags & ~(MPOL_MF_MOVE|MPOL_MF_MOVE_ALL|MPOL_MF_MOVE_MT))
> +	if (flags & ~(MPOL_MF_MOVE|MPOL_MF_MOVE_ALL|
> +				  MPOL_MF_MOVE_MT|
> +				  MPOL_MF_MOVE_CONCUR))
>  		return -EINVAL;
>
>  	if ((flags & MPOL_MF_MOVE_ALL) && !capable(CAP_SYS_NICE))
> --
> 2.11.0
>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC PATCH 08/14] mm: migrate: Add concurrent page migration into move_pages syscall.
  2017-02-24  8:25   ` Naoya Horiguchi
@ 2017-02-24 15:05     ` Zi Yan
  0 siblings, 0 replies; 25+ messages in thread
From: Zi Yan @ 2017-02-24 15:05 UTC (permalink / raw)
  To: Naoya Horiguchi; +Cc: linux-mm, dnellans, apopple, paulmck, khandual


On 24 Feb 2017, at 2:25, Naoya Horiguchi wrote:

> On Fri, Feb 17, 2017 at 10:05:45AM -0500, Zi Yan wrote:
>> From: Zi Yan <ziy@nvidia.com>
>>
>> Concurrent page migration moves a list of pages all together,
>> concurrently via multiple threads. This is different from the
>> existing page migration process, which migrates pages sequentially.
>> The current implementation only migrates anonymous pages.
>
> Please explain more about your new migration scheme; in particular, the
> difference from the original page migration code is very important
> for reviewers and other developers to understand your work quickly.

Sure.

Current migrate_pages() accepts a list of pages and migrates them
one after another sequentially, looping through a process of
1) getting a new page, 2) unmapping the old page, 3) copying the content
of the old page to the new page, and 4) mapping the new page. Thus, in
step 3, at most 4KB of data (2MB if THP migration is enabled) is copied
at a time. Such a small amount of data per copy limits copy throughput.

This concurrent page migration patch aggregates all the data copy steps
while migrating a list of pages, to increase data copy throughput.
Combining this with parallel page migration, I am able to reach the peak
memory bandwidth while copying 16 2MB THPs on both an Intel two-socket
machine and an IBM Power8 two-socket machine. The data copy throughput is
~4x and ~2.6x that of single, sequential THP migration on the Intel
machine and the Power8 machine respectively.

This at least provides an option for people who want to migrate pages at
full speed.
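
In essence, each pass over the list in migrate_pages_concur() (quoted
below) boils down to three phases; this is only a simplified outline of
the code in this patch, not a verbatim excerpt:

	/* phase 1: allocate a target page and unmap each source page */
	list_for_each_entry_safe(iterator, iterator2, &wip_list, list)
		rc = unmap_pages_and_get_new_concur(get_new_page,
					put_new_page, private, iterator,
					pass > 2, mode, reason);

	/* phase 2: transfer page->mapping, then copy all unmapped pages
	 * in one batched (optionally multi-threaded) pass
	 */
	move_mapping_concurr(&unmapped_list, &wip_list, mode);
	copy_to_new_pages_concur(&unmapped_list, mode);

	/* phase 3: reinstate the PTEs, then unlock and put back the pages */
	remove_migration_ptes_concurr(&unmapped_list);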


>
>>
>> Signed-off-by: Zi Yan <ziy@nvidia.com>
>> ---
>>  include/linux/migrate_mode.h   |   1 +
>>  include/uapi/linux/mempolicy.h |   1 +
>>  mm/migrate.c                   | 495 ++++++++++++++++++++++++++++++++++++++++-
>>  3 files changed, 492 insertions(+), 5 deletions(-)
>>
>> diff --git a/include/linux/migrate_mode.h b/include/linux/migrate_mode.h
>> index d344ad60f499..2bd849d89122 100644
>> --- a/include/linux/migrate_mode.h
>> +++ b/include/linux/migrate_mode.h
>> @@ -13,6 +13,7 @@ enum migrate_mode {
>>  	MIGRATE_SYNC		= 1<<2,
>>  	MIGRATE_ST		= 1<<3,
>>  	MIGRATE_MT		= 1<<4,
>> +	MIGRATE_CONCUR		= 1<<5,
>
> This new flag MIGRATE_CONCUR seems to be unused by any other code, so is it
> unneeded now, or is there a typo somewhere?

It is not used at the moment. My original purpose was to rename the
existing migrate_pages() to migrate_pages_sequential(), make
migrate_pages() use migrate_pages_sequential() by default, and use
migrate_pages_concur() when MIGRATE_CONCUR is set.

I will make the changes in the next version.
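
Roughly like this (a sketch of the planned dispatch for the next version,
not code in this series; migrate_pages_sequential() is the to-be-renamed
existing routine):

	int migrate_pages(struct list_head *from, new_page_t get_new_page,
			free_page_t put_new_page, unsigned long private,
			enum migrate_mode mode, int reason)
	{
		if (mode & MIGRATE_CONCUR)
			return migrate_pages_concur(from, get_new_page,
					put_new_page, private, mode, reason);

		return migrate_pages_sequential(from, get_new_page,
					put_new_page, private, mode, reason);
	}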

>
>>  };
>>
>>  #endif		/* MIGRATE_MODE_H_INCLUDED */
>> diff --git a/include/uapi/linux/mempolicy.h b/include/uapi/linux/mempolicy.h
>> index 8f1db2e2d677..6d9758a32053 100644
>> --- a/include/uapi/linux/mempolicy.h
>> +++ b/include/uapi/linux/mempolicy.h
>> @@ -54,6 +54,7 @@ enum mpol_rebind_step {
>>  #define MPOL_MF_LAZY	 (1<<3)	/* Modifies '_MOVE:  lazy migrate on fault */
>>  #define MPOL_MF_INTERNAL (1<<4)	/* Internal flags start here */
>>  #define MPOL_MF_MOVE_MT  (1<<6)	/* Use multi-threaded page copy routine */
>> +#define MPOL_MF_MOVE_CONCUR  (1<<7)	/* Migrate a list of pages concurrently */
>>
>>  #define MPOL_MF_VALID	(MPOL_MF_STRICT   | 	\
>>  			 MPOL_MF_MOVE     | 	\
>> diff --git a/mm/migrate.c b/mm/migrate.c
>> index 0e9b1f17cf8b..a35e6fd43a50 100644
>> --- a/mm/migrate.c
>> +++ b/mm/migrate.c
>> @@ -50,6 +50,14 @@
>>
>>  int mt_page_copy = 0;
>>
>> +
>> +struct page_migration_work_item {
>> +	struct page *old_page;
>> +	struct page *new_page;
>> +	struct anon_vma *anon_vma;
>> +	struct list_head list;
>> +};
>> +
>>  /*
>>   * migrate_prep() needs to be called before we start compiling a list of pages
>>   * to be migrated using isolate_lru_page(). If scheduling work on other CPUs is
>> @@ -1312,6 +1320,471 @@ static int unmap_and_move_huge_page(new_page_t get_new_page,
>>  	return rc;
>>  }
>>
>> +static int __unmap_page_concur(struct page *page, struct page *newpage,
>
> Most of the code in this function is a copy of __unmap_and_move(), so please
> define a new subfunction, and make __unmap_page_concur() and
> __unmap_and_move() call it.
>

Sure.
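
Roughly, a shared helper along these lines, called from both
__unmap_and_move() and __unmap_page_concur() (the name below is only a
placeholder, not final):

	static int __unmap_page_common(struct page *page, struct page *newpage,
				struct anon_vma **anon_vma,
				int force, enum migrate_mode mode);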

>> +				struct anon_vma **anon_vma,
>> +				int force, enum migrate_mode mode)
>> +{
>> +	int rc = -EAGAIN;
>> +
>> +	if (!trylock_page(page)) {
>> +		if (!force || mode == MIGRATE_ASYNC)
>> +			goto out;
>> +
>> +		/*
>> +		 * It's not safe for direct compaction to call lock_page.
>> +		 * For example, during page readahead pages are added locked
>> +		 * to the LRU. Later, when the IO completes the pages are
>> +		 * marked uptodate and unlocked. However, the queueing
>> +		 * could be merging multiple pages for one bio (e.g.
>> +		 * mpage_readpages). If an allocation happens for the
>> +		 * second or third page, the process can end up locking
>> +		 * the same page twice and deadlocking. Rather than
>> +		 * trying to be clever about what pages can be locked,
>> +		 * avoid the use of lock_page for direct compaction
>> +		 * altogether.
>> +		 */
>> +		if (current->flags & PF_MEMALLOC)
>> +			goto out;
>> +
>> +		lock_page(page);
>> +	}
>> +
>> +	/* We are working on page_mapping(page) == NULL */
>> +	VM_BUG_ON_PAGE(PageWriteback(page), page);
>
> Although an anonymous page shouldn't have PageWriteback set, the existing
> migration code (below) doesn't call VM_BUG_ON_PAGE even in that case. Any
> special reason to do it differently for concurrent migration?
>
>         if (PageWriteback(page)) {
>                 /*
>                  * Only in the case of a full synchronous migration is it
>                  * necessary to wait for PageWriteback. In the async case,
>                  * the retry loop is too short and in the sync-light case,
>                  * the overhead of stalling is too much
>                  */
>                 if (mode != MIGRATE_SYNC) {
>                         rc = -EBUSY;
>                         goto out_unlock;
>                 }
>                 if (!force)
>                         goto out_unlock;
>                 wait_on_page_writeback(page);
>         }
>

Thanks for pointing it out. I must have forgotten to change this.
I will use VM_BUG_ON_PAGE(PageWriteback(page), page) here.


>> +
>> +	/*
>> +	 * By try_to_unmap(), page->mapcount goes down to 0 here. In this case,
>> +	 * we cannot notice that anon_vma is freed while we migrates a page.
>> +	 * This get_anon_vma() delays freeing anon_vma pointer until the end
>> +	 * of migration. File cache pages are no problem because of page_lock()
>> +	 * File Caches may use write_page() or lock_page() in migration, then,
>> +	 * just care Anon page here.
>> +	 *
>> +	 * Only page_get_anon_vma() understands the subtleties of
>> +	 * getting a hold on an anon_vma from outside one of its mms.
>> +	 * But if we cannot get anon_vma, then we won't need it anyway,
>> +	 * because that implies that the anon page is no longer mapped
>> +	 * (and cannot be remapped so long as we hold the page lock).
>> +	 */
>> +	if (PageAnon(page) && !PageKsm(page))
>> +		*anon_vma = page_get_anon_vma(page);
>> +
>> +	/*
>> +	 * Block others from accessing the new page when we get around to
>> +	 * establishing additional references. We are usually the only one
>> +	 * holding a reference to newpage at this point. We used to have a BUG
>> +	 * here if trylock_page(newpage) fails, but would like to allow for
>> +	 * cases where there might be a race with the previous use of newpage.
>> +	 * This is much like races on refcount of oldpage: just don't BUG().
>> +	 */
>> +	if (unlikely(!trylock_page(newpage)))
>> +		goto out_unlock;
>> +
>> +	/*
>> +	 * Corner case handling:
>> +	 * 1. When a new swap-cache page is read into, it is added to the LRU
>> +	 * and treated as swapcache but it has no rmap yet.
>> +	 * Calling try_to_unmap() against a page->mapping==NULL page will
>> +	 * trigger a BUG.  So handle it here.
>> +	 * 2. An orphaned page (see truncate_complete_page) might have
>> +	 * fs-private metadata. The page can be picked up due to memory
>> +	 * offlining.  Everywhere else except page reclaim, the page is
>> +	 * invisible to the vm, so the page can not be migrated.  So try to
>> +	 * free the metadata, so the page can be freed.
>> +	 */
>> +	if (!page->mapping) {
>> +		VM_BUG_ON_PAGE(PageAnon(page), page);
>> +		if (page_has_private(page)) {
>> +			try_to_free_buffers(page);
>> +			goto out_unlock_both;
>> +		}
>> +	} else {
>> +		VM_BUG_ON_PAGE(!page_mapped(page), page);
>> +		/* Establish migration ptes */
>> +		VM_BUG_ON_PAGE(PageAnon(page) && !PageKsm(page) && !*anon_vma,
>> +				page);
>> +		rc = try_to_unmap(page,
>> +			TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS);
>> +	}
>> +
>> +	return rc;
>> +
>> +out_unlock_both:
>> +	unlock_page(newpage);
>> +out_unlock:
>> +	/* Drop an anon_vma reference if we took one */
>> +	if (*anon_vma)
>> +		put_anon_vma(*anon_vma);
>> +	unlock_page(page);
>> +out:
>> +	return rc;
>> +}
>> +
>> +static int unmap_pages_and_get_new_concur(new_page_t get_new_page,
>> +				free_page_t put_new_page, unsigned long private,
>> +				struct page_migration_work_item *item,
>> +				int force,
>> +				enum migrate_mode mode, int reason)
>
> There is a lot of duplication here too, but you pass struct
> page_migration_work_item as an argument, so this duplication might be OK.
>
>> +{
>> +	int rc = MIGRATEPAGE_SUCCESS;
>> +	int *result = NULL;
>> +
>> +
>> +	item->new_page = get_new_page(item->old_page, private, &result);
>> +
>> +	if (!item->new_page) {
>> +		rc = -ENOMEM;
>> +		return rc;
>> +	}
>> +
>> +	if (page_count(item->old_page) == 1) {
>> +		rc = -ECANCELED;
>> +		goto out;
>> +	}
>> +
>> +	if (unlikely(PageTransHuge(item->old_page) &&
>> +		!PageTransHuge(item->new_page))) {
>> +		lock_page(item->old_page);
>> +		rc = split_huge_page(item->old_page);
>> +		unlock_page(item->old_page);
>> +		if (rc)
>> +			goto out;
>> +	}
>> +
>> +	rc = __unmap_page_concur(item->old_page, item->new_page, &item->anon_vma,
>> +							force, mode);
>> +	if (rc == MIGRATEPAGE_SUCCESS) {
>> +		put_new_page = NULL;
>> +		return rc;
>> +	}
>> +
>> +out:
>> +	if (rc != -EAGAIN) {
>> +		list_del(&item->old_page->lru);
>> +		dec_zone_page_state(item->old_page, NR_ISOLATED_ANON +
>> +				page_is_file_cache(item->old_page));
>> +
>> +		putback_lru_page(item->old_page);
>> +	}
>> +
>> +	/*
>> +	 * If migration was not successful and there's a freeing callback, use
>> +	 * it.  Otherwise, putback_lru_page() will drop the reference grabbed
>> +	 * during isolation.
>> +	 */
>> +	if (put_new_page)
>> +		put_new_page(item->new_page, private);
>> +	else
>> +		putback_lru_page(item->new_page);
>> +
>> +	if (result) {
>> +		if (rc)
>> +			*result = rc;
>> +		else
>> +			*result = page_to_nid(item->new_page);
>> +	}
>> +
>> +	return rc;
>> +}
>> +
>> +static int move_mapping_concurr(struct list_head *unmapped_list_ptr,
>> +					   struct list_head *wip_list_ptr,
>> +					   enum migrate_mode mode)
>> +{
>> +	struct page_migration_work_item *iterator, *iterator2;
>> +	struct address_space *mapping;
>> +
>> +	list_for_each_entry_safe(iterator, iterator2, unmapped_list_ptr, list) {
>> +		VM_BUG_ON_PAGE(!PageLocked(iterator->old_page), iterator->old_page);
>> +		VM_BUG_ON_PAGE(!PageLocked(iterator->new_page), iterator->new_page);
>> +
>> +		mapping = page_mapping(iterator->old_page);
>> +
>> +		VM_BUG_ON(mapping);
>> +
>> +		VM_BUG_ON(PageWriteback(iterator->old_page));
>> +
>> +		if (page_count(iterator->old_page) != 1) {
>> +			list_move(&iterator->list, wip_list_ptr);
>> +			continue;
>> +		}
>> +
>> +		iterator->new_page->index = iterator->old_page->index;
>> +		iterator->new_page->mapping = iterator->old_page->mapping;
>> +		if (PageSwapBacked(iterator->old_page))
>> +			SetPageSwapBacked(iterator->new_page);
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>> +static void migrate_page_copy_page_flags(struct page *newpage, struct page *page)
>
> This function is nearly identical with migrate_page_copy(), so please make
> it call this function inside it.

Sure.

>
>> +{
>> +	int cpupid;
>> +
>> +	if (PageError(page))
>> +		SetPageError(newpage);
>> +	if (PageReferenced(page))
>> +		SetPageReferenced(newpage);
>> +	if (PageUptodate(page))
>> +		SetPageUptodate(newpage);
>> +	if (TestClearPageActive(page)) {
>> +		VM_BUG_ON_PAGE(PageUnevictable(page), page);
>> +		SetPageActive(newpage);
>> +	} else if (TestClearPageUnevictable(page))
>> +		SetPageUnevictable(newpage);
>> +	if (PageChecked(page))
>> +		SetPageChecked(newpage);
>> +	if (PageMappedToDisk(page))
>> +		SetPageMappedToDisk(newpage);
>> +
>> +	/* Move dirty on pages not done by migrate_page_move_mapping() */
>> +	if (PageDirty(page))
>> +		SetPageDirty(newpage);
>> +
>> +	if (page_is_young(page))
>> +		set_page_young(newpage);
>> +	if (page_is_idle(page))
>> +		set_page_idle(newpage);
>> +
>> +	/*
>> +	 * Copy NUMA information to the new page, to prevent over-eager
>> +	 * future migrations of this same page.
>> +	 */
>> +	cpupid = page_cpupid_xchg_last(page, -1);
>> +	page_cpupid_xchg_last(newpage, cpupid);
>> +
>> +	ksm_migrate_page(newpage, page);
>> +	/*
>> +	 * Please do not reorder this without considering how mm/ksm.c's
>> +	 * get_ksm_page() depends upon ksm_migrate_page() and PageSwapCache().
>> +	 */
>> +	if (PageSwapCache(page))
>> +		ClearPageSwapCache(page);
>> +	ClearPagePrivate(page);
>> +	set_page_private(page, 0);
>> +
>> +	/*
>> +	 * If any waiters have accumulated on the new page then
>> +	 * wake them up.
>> +	 */
>> +	if (PageWriteback(newpage))
>> +		end_page_writeback(newpage);
>> +
>> +	copy_page_owner(page, newpage);
>> +
>> +	mem_cgroup_migrate(page, newpage);
>> +}
>> +
>> +
>> +static int copy_to_new_pages_concur(struct list_head *unmapped_list_ptr,
>> +				enum migrate_mode mode)
>> +{
>> +	struct page_migration_work_item *iterator;
>> +	int num_pages = 0, idx = 0;
>> +	struct page **src_page_list = NULL, **dst_page_list = NULL;
>> +	unsigned long size = 0;
>> +	int rc = -EFAULT;
>> +
>> +	list_for_each_entry(iterator, unmapped_list_ptr, list) {
>> +		++num_pages;
>> +		size += PAGE_SIZE * hpage_nr_pages(iterator->old_page);
>> +	}
>> +
>> +	src_page_list = kzalloc(sizeof(struct page *)*num_pages, GFP_KERNEL);
>> +	if (!src_page_list)
>> +		return -ENOMEM;
>> +	dst_page_list = kzalloc(sizeof(struct page *)*num_pages, GFP_KERNEL);
>> +	if (!dst_page_list)
>> +		return -ENOMEM;
>> +
>> +	list_for_each_entry(iterator, unmapped_list_ptr, list) {
>> +		src_page_list[idx] = iterator->old_page;
>> +		dst_page_list[idx] = iterator->new_page;
>> +		++idx;
>> +	}
>> +
>> +	BUG_ON(idx != num_pages);
>> +
>> +	if (mode & MIGRATE_MT)
>
> just my guessing, you mean MIGRATE_CONCUR?

No. This is for the multi-threaded case. As I mentioned above,
MIGRATE_CONCUR will be used to select between migrate_pages_sequential()
and migrate_pages_concur().
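
To spell out the distinction as it stands in the quoted code (the same code
as below, with comments added here only for illustration):

	if (mode & MIGRATE_MT)
		/* copy all unmapped pages at once with a pool of threads */
		rc = copy_page_lists_mthread(dst_page_list, src_page_list,
						num_pages);

	if (rc)
		/* MIGRATE_MT unset, or the threaded copy failed: fall back
		 * to the sequential per-page copy loop
		 */
		list_for_each_entry(iterator, unmapped_list_ptr, list) {
			if (PageHuge(iterator->old_page) ||
				PageTransHuge(iterator->old_page))
				copy_huge_page(iterator->new_page,
						iterator->old_page, 0);
			else
				copy_highpage(iterator->new_page,
						iterator->old_page);
		}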

>
>> +		rc = copy_page_lists_mthread(dst_page_list, src_page_list,
>> +							num_pages);
>> +
>> +	if (rc)
>> +		list_for_each_entry(iterator, unmapped_list_ptr, list) {
>> +			if (PageHuge(iterator->old_page) ||
>> +				PageTransHuge(iterator->old_page))
>> +				copy_huge_page(iterator->new_page, iterator->old_page, 0);
>> +			else
>> +				copy_highpage(iterator->new_page, iterator->old_page);
>> +		}
>> +
>> +	kfree(src_page_list);
>> +	kfree(dst_page_list);
>> +
>> +	list_for_each_entry(iterator, unmapped_list_ptr, list) {
>> +		migrate_page_copy_page_flags(iterator->new_page, iterator->old_page);
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>> +static int remove_migration_ptes_concurr(struct list_head *unmapped_list_ptr)
>> +{
>> +	struct page_migration_work_item *iterator, *iterator2;
>> +
>> +	list_for_each_entry_safe(iterator, iterator2, unmapped_list_ptr, list) {
>> +		remove_migration_ptes(iterator->old_page, iterator->new_page, false);
>> +
>> +		unlock_page(iterator->new_page);
>> +
>> +		if (iterator->anon_vma)
>> +			put_anon_vma(iterator->anon_vma);
>> +
>> +		unlock_page(iterator->old_page);
>> +
>> +		list_del(&iterator->old_page->lru);
>> +		dec_zone_page_state(iterator->old_page, NR_ISOLATED_ANON +
>> +				page_is_file_cache(iterator->old_page));
>> +
>> +		putback_lru_page(iterator->old_page);
>> +		iterator->old_page = NULL;
>> +
>> +		putback_lru_page(iterator->new_page);
>> +		iterator->new_page = NULL;
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>> +int migrate_pages_concur(struct list_head *from, new_page_t get_new_page,
>> +		free_page_t put_new_page, unsigned long private,
>> +		enum migrate_mode mode, int reason)
>> +{
>> +	int retry = 1;
>> +	int nr_failed = 0;
>> +	int nr_succeeded = 0;
>> +	int pass = 0;
>> +	struct page *page;
>> +	int swapwrite = current->flags & PF_SWAPWRITE;
>> +	int rc;
>> +	int total_num_pages = 0, idx;
>> +	struct page_migration_work_item *item_list;
>> +	struct page_migration_work_item *iterator, *iterator2;
>> +	int item_list_order = 0;
>> +
>> +	LIST_HEAD(wip_list);
>> +	LIST_HEAD(unmapped_list);
>> +	LIST_HEAD(serialized_list);
>> +	LIST_HEAD(failed_list);
>> +
>> +	if (!swapwrite)
>> +		current->flags |= PF_SWAPWRITE;
>> +
>> +	list_for_each_entry(page, from, lru)
>> +		++total_num_pages;
>> +
>> +	item_list_order = get_order(total_num_pages *
>> +		sizeof(struct page_migration_work_item));
>> +
>> +	if (item_list_order > MAX_ORDER) {
>> +		item_list = alloc_pages_exact(total_num_pages *
>> +			sizeof(struct page_migration_work_item), GFP_ATOMIC);
>> +		memset(item_list, 0, total_num_pages *
>> +			sizeof(struct page_migration_work_item));
>> +	} else {
>> +		item_list = (struct page_migration_work_item *)__get_free_pages(GFP_ATOMIC,
>> +						item_list_order);
>
> The allocation could fail, so error handling is needed here.
>

Got it.

>> +		memset(item_list, 0, PAGE_SIZE<<item_list_order);
>> +	}
>> +
>> +	idx = 0;
>> +	list_for_each_entry(page, from, lru) {
>> +		item_list[idx].old_page = page;
>> +		item_list[idx].new_page = NULL;
>> +		INIT_LIST_HEAD(&item_list[idx].list);
>> +		list_add_tail(&item_list[idx].list, &wip_list);
>> +		idx += 1;
>> +	}
>
> At this point, all migration target pages are moved to wip_list, so
> the list 'from' (passed from and returned back to the caller) becomes empty.
> When all migration trials are done and there still remain pages on wip_list
> and/or serialized_list, the remaining pages should be moved back to 'from'.

Right. Thanks for pointing this out.



>
>> +
>> +	for(pass = 0; pass < 1 && retry; pass++) {
>> +		retry = 0;
>> +
>> +		/* unmap and get new page for page_mapping(page) == NULL */
>> +		list_for_each_entry_safe(iterator, iterator2, &wip_list, list) {
>> +			cond_resched();
>> +
>> +			if (iterator->new_page)
>> +				continue;
>> +
>> +			/* We do not migrate huge pages, file-backed, or swapcached pages */
>
> Just "huge page" is confusing; maybe you mean hugetlb.

Right. Will change it to hugetlb.

>
>> +			if (PageHuge(iterator->old_page))
>> +				rc = -ENODEV;
>> +			else if ((page_mapping(iterator->old_page) != NULL))
>> +				rc = -ENODEV;
>> +			else
>> +				rc = unmap_pages_and_get_new_concur(get_new_page, put_new_page,
>> +						private, iterator, pass > 2, mode,
>> +						reason);
>> +
>> +			switch(rc) {
>> +			case -ENODEV:
>> +				list_move(&iterator->list, &serialized_list);
>> +				break;
>> +			case -ENOMEM:
>> +				goto out;
>> +			case -EAGAIN:
>> +				retry++;
>> +				break;
>> +			case MIGRATEPAGE_SUCCESS:
>> +				list_move(&iterator->list, &unmapped_list);
>> +				nr_succeeded++;
>> +				break;
>> +			default:
>> +				/*
>> +				 * Permanent failure (-EBUSY, -ENOSYS, etc.):
>> +				 * unlike -EAGAIN case, the failed page is
>> +				 * removed from migration page list and not
>> +				 * retried in the next outer loop.
>> +				 */
>> +				list_move(&iterator->list, &failed_list);
>> +				nr_failed++;
>> +				break;
>> +			}
>> +		}
>> +		/* move page->mapping to new page, only -EAGAIN could happen  */
>> +		move_mapping_concurr(&unmapped_list, &wip_list, mode);
>> +		/* copy pages in unmapped_list */
>> +		copy_to_new_pages_concur(&unmapped_list, mode);
>> +		/* remove migration pte, if old_page is NULL?, unlock old and new
>> +		 * pages, put anon_vma, put old and new pages */
>> +		remove_migration_ptes_concurr(&unmapped_list);
>> +	}
>> +	nr_failed += retry;
>> +	rc = nr_failed;
>> +
>> +	if (!list_empty(&serialized_list))
>> +		rc = migrate_pages(from, get_new_page, put_new_page,
>
> You should give &serialized_list instead of from, right?

Right. Thanks.


>
> Thanks,
> Naoya Horiguchi
>
>> +				private, mode, reason);
>> +out:
>> +	if (nr_succeeded)
>> +		count_vm_events(PGMIGRATE_SUCCESS, nr_succeeded);
>> +	if (nr_failed)
>> +		count_vm_events(PGMIGRATE_FAIL, nr_failed);
>> +	trace_mm_migrate_pages(nr_succeeded, nr_failed, mode, reason);
>> +
>> +	if (item_list_order >= MAX_ORDER)
>> +		free_pages_exact(item_list, total_num_pages *
>> +			sizeof(struct page_migration_work_item));
>> +	else
>> +		free_pages((unsigned long)item_list, item_list_order);
>> +
>> +	if (!swapwrite)
>> +		current->flags &= ~PF_SWAPWRITE;
>> +
>> +	return rc;
>> +}
>> +
>>  /*
>>   * migrate_pages - migrate the pages specified in a list, to the free pages
>>   *		   supplied as the target for the page migration
>> @@ -1452,7 +1925,8 @@ static struct page *new_page_node(struct page *p, unsigned long private,
>>  static int do_move_page_to_node_array(struct mm_struct *mm,
>>  				      struct page_to_node *pm,
>>  				      int migrate_all,
>> -					  int migrate_use_mt)
>> +					  int migrate_use_mt,
>> +					  int migrate_concur)
>>  {
>>  	int err;
>>  	struct page_to_node *pp;
>> @@ -1536,8 +2010,16 @@ static int do_move_page_to_node_array(struct mm_struct *mm,
>>
>>  	err = 0;
>>  	if (!list_empty(&pagelist)) {
>> -		err = migrate_pages(&pagelist, new_page_node, NULL,
>> -				(unsigned long)pm, mode, MR_SYSCALL);
>> +		if (migrate_concur)
>> +			err = migrate_pages_concur(&pagelist, new_page_node, NULL,
>> +					(unsigned long)pm,
>> +					mode,
>> +					MR_SYSCALL);
>> +		else
>> +			err = migrate_pages(&pagelist, new_page_node, NULL,
>> +					(unsigned long)pm,
>> +					mode,
>> +					MR_SYSCALL);
>>  		if (err)
>>  			putback_movable_pages(&pagelist);
>>  	}
>> @@ -1615,7 +2097,8 @@ static int do_pages_move(struct mm_struct *mm, nodemask_t task_nodes,
>>  		/* Migrate this chunk */
>>  		err = do_move_page_to_node_array(mm, pm,
>>  						 flags & MPOL_MF_MOVE_ALL,
>> -						 flags & MPOL_MF_MOVE_MT);
>> +						 flags & MPOL_MF_MOVE_MT,
>> +						 flags & MPOL_MF_MOVE_CONCUR);
>>  		if (err < 0)
>>  			goto out_pm;
>>
>> @@ -1722,7 +2205,9 @@ SYSCALL_DEFINE6(move_pages, pid_t, pid, unsigned long, nr_pages,
>>  	nodemask_t task_nodes;
>>
>>  	/* Check flags */
>> -	if (flags & ~(MPOL_MF_MOVE|MPOL_MF_MOVE_ALL|MPOL_MF_MOVE_MT))
>> +	if (flags & ~(MPOL_MF_MOVE|MPOL_MF_MOVE_ALL|
>> +				  MPOL_MF_MOVE_MT|
>> +				  MPOL_MF_MOVE_CONCUR))
>>  		return -EINVAL;
>>
>>  	if ((flags & MPOL_MF_MOVE_ALL) && !capable(CAP_SYS_NICE))
>> --
>> 2.11.0
>>


--
Best Regards
Yan Zi

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 496 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC PATCH 03/14] mm/migrate: Add copy_pages_mthread function
  2017-02-23  8:02       ` Naoya Horiguchi
@ 2017-03-09  5:35         ` Anshuman Khandual
  0 siblings, 0 replies; 25+ messages in thread
From: Anshuman Khandual @ 2017-03-09  5:35 UTC (permalink / raw)
  To: Naoya Horiguchi, Anshuman Khandual
  Cc: Zi Yan, linux-mm, dnellans, apopple, paulmck, zi.yan

On 02/23/2017 01:32 PM, Naoya Horiguchi wrote:
> On Thu, Feb 23, 2017 at 01:20:16PM +0530, Anshuman Khandual wrote:
> ...
>>>
>>>> +
>>>> +	cthreads = nr_copythreads;
>>>> +	cthreads = min_t(unsigned int, cthreads, cpumask_weight(cpumask));
>>>
>>> Nitpick, but this looks a little wordy; can it simply be like below?
>>>
>>>   cthreads = min_t(unsigned int, nr_copythreads, cpumask_weight(cpumask));
>>>
>>>> +	cthreads = (cthreads / 2) * 2;
>>>
>>> I'm not sure of the intention here. Should the # of threads be an even number?
>>
>> Yes.
>>
>>> If cpumask_weight() is 1, cthreads is 0, and that could cause a zero division.
>>> So you had better make sure to prevent it.
>>
>> If cpumask_weight() is 1, then min_t(unsigned int, 8, 1) should be
>> greater than or equal to 1. Then cthreads can end up as 0. That is
>> possible. But how is there a chance of a zero division?
> 
> Hi Anshuman,
> 
> I just had the above thought when reading the line your patch introduces:
> 
>        chunk_size = PAGE_SIZE * nr_pages / cthreads
>                                            ~~~~~~~~
>                                            (this can be 0?)

Right, cthreads can be 0. I am changing it like this:

cthreads = min_t(unsigned int, NR_COPYTHREADS, cpumask_weight(cpumask));
cthreads = (cthreads / 2) * 2;
if (!cthreads)
      cthreads = 1;

After the first two statements cthreads can be 0 if cpumask_weight() turns
out to be 1 or 0, in which case we force it to 1. Then with this

        i = 0;
        for_each_cpu(cpu, cpumask) {
                if (i >= cthreads)
                        break;
                cpu_id_list[i] = cpu;
                ++i;
        }

cpu_id_list[0] will hold the single cpu (in case the node's cpumask has
a cpu), or it will hold 0 in case it is a memory-only, CPU-less node. In
either case the page copy happens in a single-threaded manner. This also
removes the possibility of a divide-by-zero here.

chunk_size = PAGE_SIZE * nr_pages / cthreads;
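
For a concrete feel of the split (x86-64 numbers assumed):

	/* a 2MB THP: nr_pages = 512, PAGE_SIZE = 4096
	 *   cthreads = 8  ->  chunk_size = 4096 * 512 / 8 = 256KB per thread
	 *   cthreads = 1  ->  chunk_size = 2MB, i.e. the whole page is
	 *                     copied by the single thread
	 */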


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC PATCH 07/14] migrate: Add copy_page_lists_mthread() function.
  2017-02-23  8:54   ` Naoya Horiguchi
@ 2017-03-09 13:02     ` Anshuman Khandual
  0 siblings, 0 replies; 25+ messages in thread
From: Anshuman Khandual @ 2017-03-09 13:02 UTC (permalink / raw)
  To: Naoya Horiguchi, Zi Yan
  Cc: linux-mm, dnellans, apopple, paulmck, khandual, zi.yan

On 02/23/2017 02:24 PM, Naoya Horiguchi wrote:
> On Fri, Feb 17, 2017 at 10:05:44AM -0500, Zi Yan wrote:
>> From: Zi Yan <ziy@nvidia.com>
>>
>> It supports copying a list of pages via a multi-threaded process.
>> It evenly distributes a list of pages to a group of threads and
>> uses the same subroutine as copy_page_mthread()
> The new function has many lines duplicated from copy_page_mthread(),
> so please consider factoring them out into a common routine.
> That would make your code more readable/maintainable.

Though the two look very similar to each other, there are some
subtle differences which make it harder to factor them out
into common functions.

int copy_pages_mthread(struct page *to, struct page *from, int nr_pages)

* This takes a single source page and a single destination
  page and copies contiguous data between these two pages.
  The size of the copy can be a single page for a normal page,
  or multiple pages if it is a huge page.

* The work is split into chunks of PAGE_SIZE * nr_pages / threads
  and assigned to individual threads, whose count is decided based
  on the number of CPUs present on the target node. A single thread
  takes a single work queue job and executes it.

int copy_page_list_mt(struct page **to, struct page **from, int nr_pages)

* This takes multiple source pages and multiple destination
  pages and copies contiguous data between one page pair
  in a single work queue job. The size of each copy is decided
  based on the type of the page, normal or huge.

* Each job copies a single source page to its destination
  page, and we create as many jobs as there are pages, though
  they are spread over a number of threads based on the number
  of CPUs present on the destination node. So one CPU can
  get more than one page copy job scheduled (a rough sketch of
  this per-page dispatch follows below).
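
An illustrative sketch of that per-page dispatch, assuming the kernel
workqueue API; struct copy_item, copy_item_fn() and copy_page_list_sketch()
are made-up names for illustration, not the patch's actual identifiers:

	struct copy_item {
		struct work_struct work;
		struct page *to;
		struct page *from;
		int nr_pages;	/* 1 for a base page, HPAGE_PMD_NR for a THP */
	};

	static void copy_item_fn(struct work_struct *work)
	{
		struct copy_item *item = container_of(work, struct copy_item,
							work);
		int i;

		/* one job copies one (possibly huge) page */
		for (i = 0; i < item->nr_pages; i++)
			copy_highpage(item->to + i, item->from + i);
	}

	static int copy_page_list_sketch(struct page **to, struct page **from,
				int nr_items, int *cpu_id_list, int cthreads)
	{
		struct copy_item *items;
		int i;

		items = kcalloc(nr_items, sizeof(*items), GFP_KERNEL);
		if (!items)
			return -ENOMEM;

		/* one job per page pair, spread round-robin over the
		 * CPUs of the destination node
		 */
		for (i = 0; i < nr_items; i++) {
			items[i].to = to[i];
			items[i].from = from[i];
			items[i].nr_pages = hpage_nr_pages(from[i]);
			INIT_WORK(&items[i].work, copy_item_fn);
			queue_work_on(cpu_id_list[i % cthreads],
					system_highpri_wq, &items[i].work);
		}

		for (i = 0; i < nr_items; i++)
			flush_work(&items[i].work);

		kfree(items);
		return 0;
	}

So one CPU can indeed end up with several page copy jobs, unlike
copy_pages_mthread() where the per-thread chunks of a single page map 1:1
onto the selected CPUs.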

- Anshuman

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2017-03-09 13:02 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-02-17 15:05 [RFC PATCH 00/14] Accelerating page migrations Zi Yan
2017-02-17 15:05 ` [RFC PATCH 01/14] mm/migrate: Add new mode parameter to migrate_page_copy() function Zi Yan
2017-02-17 15:05 ` [RFC PATCH 02/14] mm/migrate: Make migrate_mode types non-exclussive Zi Yan
2017-02-17 15:05 ` [RFC PATCH 03/14] mm/migrate: Add copy_pages_mthread function Zi Yan
2017-02-23  6:06   ` Naoya Horiguchi
2017-02-23  7:50     ` Anshuman Khandual
2017-02-23  8:02       ` Naoya Horiguchi
2017-03-09  5:35         ` Anshuman Khandual
2017-02-17 15:05 ` [RFC PATCH 04/14] mm/migrate: Add new migrate mode MIGRATE_MT Zi Yan
2017-02-23  6:54   ` Naoya Horiguchi
2017-02-23  7:54     ` Anshuman Khandual
2017-02-17 15:05 ` [RFC PATCH 05/14] mm/migrate: Add new migration flag MPOL_MF_MOVE_MT for syscalls Zi Yan
2017-02-17 15:05 ` [RFC PATCH 06/14] sysctl: Add global tunable mt_page_copy Zi Yan
2017-02-17 15:05 ` [RFC PATCH 07/14] migrate: Add copy_page_lists_mthread() function Zi Yan
2017-02-23  8:54   ` Naoya Horiguchi
2017-03-09 13:02     ` Anshuman Khandual
2017-02-17 15:05 ` [RFC PATCH 08/14] mm: migrate: Add concurrent page migration into move_pages syscall Zi Yan
2017-02-24  8:25   ` Naoya Horiguchi
2017-02-24 15:05     ` Zi Yan
2017-02-17 15:05 ` [RFC PATCH 09/14] mm: migrate: Add exchange_page_mthread() and exchange_page_lists_mthread() to exchange two pages or two page lists Zi Yan
2017-02-17 15:05 ` [RFC PATCH 10/14] mm: Add exchange_pages and exchange_pages_concur functions to exchange two lists of pages instead of two migrate_pages() Zi Yan
2017-02-17 15:05 ` [RFC PATCH 11/14] mm: migrate: Add exchange_pages syscall to exchange two page lists Zi Yan
2017-02-17 15:05 ` [RFC PATCH 12/14] migrate: Add copy_page_dma to use DMA Engine to copy pages Zi Yan
2017-02-17 15:05 ` [RFC PATCH 13/14] mm: migrate: Add copy_page_dma into migrate_page_copy Zi Yan
2017-02-17 15:05 ` [RFC PATCH 14/14] mm: Add copy_page_lists_dma_always to support copy a list of pages Zi Yan
