Subject: [RFC PATCH v3 0/6] Removing limitations of merging anonymous VMAs
From: Jakub Matěna
Date: 2022-05-16 12:53 UTC
  To: linux-mm
  Cc: patches, linux-kernel, vbabka, mhocko, mgorman, willy,
	liam.howlett, hughd, kirill, riel, rostedt, peterz, david,
	Jakub Matěna

This is a series of patches that try to improve the merge success rate
when VMAs are moved, resized or otherwise modified.

Motivation
In the current kernel it is impossible to merge two anonymous VMAs
if one of them was moved. That is because a VMA's page offset is
set according to the virtual address where it was created, and in
order to merge two VMAs their page offsets need to be contiguous.
Another problem when merging two faulted VMAs is their anon_vma. In
the current kernel both VMAs have to use one and the same anon_vma;
otherwise the merge is again not allowed.
There are several places from which vma_merge() is called and therefore
several use cases that might profit from this upgrade. These include
mmap (that fills a hole between two VMAs), mremap (that moves a VMA next
to another one or again perfectly fills a hole), mprotect (that modifies
protection and allows merging with a neighbor) and brk (that expands a
VMA so that it is adjacent to a neighbor).
Missed merge opportunities increase the number of VMAs of a process
and in some cases can cause problems when the maximum count is reached.
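
The page offset condition described above can be illustrated with a small
userspace model. This is only a sketch under stated assumptions, not the
kernel code: struct model_vma, model_mmap_anon(), model_move() and
model_can_merge() are hypothetical names, a 4 KiB page size is assumed,
and only the adjacency and page offset checks are modelled (flags,
anon_vma and policy checks are left out).

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_PAGE_SHIFT 12 /* assumption: 4 KiB pages */
#define MODEL_PAGE_SIZE  (1UL << MODEL_PAGE_SHIFT)

/* Hypothetical, heavily simplified stand-in for an anonymous VMA. */
struct model_vma {
	unsigned long start; /* first byte of the mapping */
	unsigned long end;   /* one past the last byte */
	unsigned long pgoff; /* page offset, derived from the creation address */
};

static unsigned long model_vma_pages(const struct model_vma *v)
{
	return (v->end - v->start) >> MODEL_PAGE_SHIFT;
}

/* A new anonymous VMA takes its page offset from its virtual address. */
static struct model_vma model_mmap_anon(unsigned long start, unsigned long len)
{
	struct model_vma v = { start, start + len, start >> MODEL_PAGE_SHIFT };
	return v;
}

/*
 * Move the VMA to new_start. With update_pgoff == false the page offset
 * keeps reflecting the old address, which is the behaviour the cover
 * letter describes for the current kernel.
 */
static void model_move(struct model_vma *v, unsigned long new_start,
		       bool update_pgoff)
{
	unsigned long len = v->end - v->start;

	assert((new_start & (MODEL_PAGE_SIZE - 1)) == 0);
	v->start = new_start;
	v->end = new_start + len;
	if (update_pgoff)
		v->pgoff = new_start >> MODEL_PAGE_SHIFT;
}

/* prev and next merge only if addresses and page offsets are contiguous. */
static bool model_can_merge(const struct model_vma *prev,
			    const struct model_vma *next)
{
	return prev->end == next->start &&
	       prev->pgoff + model_vma_pages(prev) == next->pgoff;
}
```

In this model, a VMA created at 0x40000 and moved to sit directly behind
a VMA covering 0x10000-0x12000 is address-adjacent but still fails
model_can_merge() until its page offset is recomputed from the new
address.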

Solution
The series solves the first problem with page offsets by updating them
when the VMA is moved to a different virtual address (patch 4). As for
the second problem, merging of VMAs with different anon_vmas is allowed
under some conditions (patch 5). Another missed opportunity in the
current kernel is when mremap enlarges an already existing VMA and it is
possible to merge it with the following VMA (patch 2). Patch 1 refactors
the function vma_merge(), makes it easier to understand and also allows
relatively seamless tracing of successful merges, which is introduced by
patch 6. Patch 3 introduces migration waiting and rmap locking into the
pagewalk mechanism, which is necessary for patches 4 and 5.

Limitations
For both problems the solution works only for VMAs that do not share
physical pages with other processes (usually child or parent
processes). This is checked by looking at the anon_vma of the respective
VMA and also by looking at the mapcount of individual pages. The reason
why the shared case is not possible, or at least not easy, to handle is
that each physical page has a pointer to its anon_vma and a page offset.
When such a physical page is shared, we cannot simply change these
parameters without affecting all of the VMAs mapping this physical
page. The good news is that this case accounts for only about 1-3% of
all merges that fail in the current kernel (measured on jemalloc (0%),
redis (2.7%) and kcbench (1.2%) tests).
Measurements also show a slight increase in running time: jemalloc
(0.3%), redis (1%), kcbench (1%). More extensive data can be viewed at
https://home.alabanda.cz/share/results.png
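
The restriction to unshared pages can be sketched in the same spirit.
The following userspace model is an illustration only: struct
model_page, struct model_anon_vma and model_retarget_pages() are
hypothetical names, and the real series performs the check and the
update while walking page tables under the appropriate locks, which is
not modelled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an anon_vma; only its identity matters here. */
struct model_anon_vma {
	int id;
};

/* Hypothetical stand-in for the per-page fields the cover letter mentions. */
struct model_page {
	struct model_anon_vma *anon_vma; /* reverse-mapping owner */
	unsigned long index;             /* page offset */
	int mapcount;                    /* number of mappings of this page */
};

/*
 * Re-point all pages to new_av and renumber them from new_pgoff, but only
 * if every page is mapped exactly once. A page with another mapping is
 * also reachable from some other VMA, so its fields must stay untouched.
 * Returns true if the pages were updated, false if nothing was changed.
 */
static bool model_retarget_pages(struct model_page *pages, size_t n,
				 struct model_anon_vma *new_av,
				 unsigned long new_pgoff)
{
	assert(pages != NULL || n == 0);

	/* First pass: refuse as soon as one shared page is found. */
	for (size_t i = 0; i < n; i++)
		if (pages[i].mapcount != 1)
			return false;

	/* Second pass: all pages are exclusive, so the update is safe. */
	for (size_t i = 0; i < n; i++) {
		pages[i].anon_vma = new_av;
		pages[i].index = new_pgoff + i;
	}
	return true;
}
```

Checking everything before modifying anything keeps the model
all-or-nothing: a range containing a single shared page is left exactly
as it was.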

Changelog
Pagewalk - previously the page struct was accessed using follow_page(),
which goes through the whole pagewalk for each call. This version uses
walk_page_vma(), which goes through all the necessary pages at the pte
level (vm_normal_page() is used there).
The pgoff update was previously performed at the beginning of copy_vma()
for all the pages (page->index specifically) and also for the pgoff
variable used to construct the VMA copy. Now the update of individual
pages is done later in move_page_tables(). This makes more sense because
move_page_tables() moves all the pages to the new VMA anyway, and this
again spares some otherwise duplicate page walking.
The anon_vma update for mprotect cases is done in __vma_adjust(). For
mremap cases the update is done in move_page_tables() together with
the page offset update. Previously the anon_vma update was always
handled in __vma_adjust(), but it was not done in all necessary cases.
More details are mentioned in the respective patches.

Questions
Is it necessary to check the mapcount of individual pages of the VMA to
determine if they are shared with other processes? Is sharing even
possible when the VMA, or rather its anon_vma, is not shared? As far as
my knowledge of the kernel goes, it seems that checking individual pages
is not necessary and a check at the level of anon_vma is sufficient. KSM
would theoretically interfere with the page mapcount, but it is
temporarily disabled before move_vma() in the mremap syscall happens.
Does anyone know about something else that can change the mapcount
without the anon_vma knowing?

This series of patches and documentation of the related code will
be part of my master's thesis.
This patch series is based on tag v5.18-rc2. This is the third version.

Jakub Matěna (6):
  mm: refactor of vma_merge()
  mm: add merging after mremap resize
  mm: add migration waiting and rmap locking to pagewalk
  mm: adjust page offset in mremap
  mm: enable merging of VMAs with different anon_vmas
  mm: add tracing for VMA merges

 fs/exec.c                   |   2 +-
 fs/proc/task_mmu.c          |   4 +-
 include/linux/mm.h          |   4 +-
 include/linux/pagewalk.h    |  15 +-
 include/linux/rmap.h        |  19 ++-
 include/trace/events/mmap.h |  83 ++++++++++
 mm/internal.h               |  12 ++
 mm/mmap.c                   | 291 ++++++++++++++++++++++++++----------
 mm/mremap.c                 | 153 ++++++++++++++-----
 mm/pagewalk.c               |  75 +++++++++-
 mm/rmap.c                   | 144 ++++++++++++++++++
 11 files changed, 670 insertions(+), 132 deletions(-)

-- 
2.35.1


Subject: Re: [RFC PATCH v3 3/6] [PATCH 3/6] mm: add migration waiting and rmap locking to pagewalk
From: kernel test robot
Date: 2022-05-18  5:03 UTC
  To: kbuild

CC: kbuild-all(a)lists.01.org
BCC: lkp(a)intel.com
In-Reply-To: <20220516125405.1675-4-matenajakub@gmail.com>
References: <20220516125405.1675-4-matenajakub@gmail.com>
TO: "Jakub Matěna" <matenajakub@gmail.com>

Hi Jakub,

[FYI, it's a private test report for your RFC patch.]
[auto build test WARNING on kees/for-next/execve]
[also build test WARNING on linux/master linus/master v5.18-rc7]
[cannot apply to next-20220517]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/intel-lab-lkp/linux/commits/Jakub-Mat-na/Removing-limitations-of-merging-anonymous-VMAs/20220516-205637
base:   https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git for-next/execve
:::::: branch date: 2 days ago
:::::: commit date: 2 days ago
config: i386-randconfig-m021-20220516 (https://download.01.org/0day-ci/archive/20220518/202205181348.9FV48Mdu-lkp(a)intel.com/config)
compiler: gcc-11 (Debian 11.2.0-20) 11.2.0

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>

smatch warnings:
mm/pagewalk.c:104 walk_pte_range() error: uninitialized symbol 'ptl'.

vim +/ptl +104 mm/pagewalk.c

e6473092bd91165 Matt Mackall     2008-02-04   91  
fbf56346b855872 Steven Price     2020-02-03   92  static int walk_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
fbf56346b855872 Steven Price     2020-02-03   93  			  struct mm_walk *walk)
fbf56346b855872 Steven Price     2020-02-03   94  {
fbf56346b855872 Steven Price     2020-02-03   95  	pte_t *pte;
fbf56346b855872 Steven Price     2020-02-03   96  	int err = 0;
fbf56346b855872 Steven Price     2020-02-03   97  	spinlock_t *ptl;
fbf56346b855872 Steven Price     2020-02-03   98  
9ba7cddd9f98f45 Jakub Matěna     2022-05-16   99  	if (walk->flags & WALK_LOCK_RMAP)
9ba7cddd9f98f45 Jakub Matěna     2022-05-16  100  		take_rmap_locks(walk->vma);
9ba7cddd9f98f45 Jakub Matěna     2022-05-16  101  
fbf56346b855872 Steven Price     2020-02-03  102  	if (walk->no_vma) {
fbf56346b855872 Steven Price     2020-02-03  103  		pte = pte_offset_map(pmd, addr);
9ba7cddd9f98f45 Jakub Matěna     2022-05-16 @104  		err = walk_pte_range_inner(pte, addr, end, walk, ptl, pmd);
fbf56346b855872 Steven Price     2020-02-03  105  		pte_unmap(pte);
fbf56346b855872 Steven Price     2020-02-03  106  	} else {
fbf56346b855872 Steven Price     2020-02-03  107  		pte = pte_offset_map_lock(walk->mm, pmd, addr, &ptl);
9ba7cddd9f98f45 Jakub Matěna     2022-05-16  108  		err = walk_pte_range_inner(pte, addr, end, walk, ptl, pmd);
ace88f1018b8816 Thomas Hellstrom 2019-10-04  109  		pte_unmap_unlock(pte, ptl);
fbf56346b855872 Steven Price     2020-02-03  110  	}
fbf56346b855872 Steven Price     2020-02-03  111  
9ba7cddd9f98f45 Jakub Matěna     2022-05-16  112  	if (walk->flags & WALK_LOCK_RMAP)
9ba7cddd9f98f45 Jakub Matěna     2022-05-16  113  		drop_rmap_locks(walk->vma);
9ba7cddd9f98f45 Jakub Matěna     2022-05-16  114  
e6473092bd91165 Matt Mackall     2008-02-04  115  	return err;
e6473092bd91165 Matt Mackall     2008-02-04  116  }
e6473092bd91165 Matt Mackall     2008-02-04  117  
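
The pattern smatch complains about, and one common way to address it, can
be shown in a reduced userspace model. This is a sketch only and not a
proposed kernel patch: struct model_lock, struct model_walk,
model_walk_inner() and model_walk_pte_range() are hypothetical names, and
whether a NULL lock pointer is acceptable to walk_pte_range_inner() in
the no_vma path cannot be determined from this report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the page table spinlock. */
struct model_lock {
	int held;
};

/* Hypothetical stand-in for struct mm_walk. */
struct model_walk {
	bool no_vma;
	struct model_lock *seen_ptl; /* lock pointer the inner walker received */
};

/* Records the pointer it was given so that callers can inspect it. */
static int model_walk_inner(struct model_walk *walk, struct model_lock *ptl)
{
	walk->seen_ptl = ptl;
	return 0;
}

/*
 * Same shape as the reported function: only the else branch assigns ptl,
 * yet both branches pass it on. Initialising ptl to NULL gives the
 * no_vma branch a defined value instead of an indeterminate one.
 */
static int model_walk_pte_range(struct model_walk *walk,
				struct model_lock *table_lock)
{
	struct model_lock *ptl = NULL;
	int err;

	assert(walk != NULL && table_lock != NULL);

	if (walk->no_vma) {
		/* No lock is taken on this path, so ptl stays NULL. */
		err = model_walk_inner(walk, ptl);
	} else {
		ptl = table_lock;
		ptl->held = 1;
		err = model_walk_inner(walk, ptl);
		ptl->held = 0;
	}
	return err;
}
```

With the initialisation in place the callee can at least test for NULL;
without it, reading ptl in the no_vma branch is undefined behaviour in C.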

-- 
0-DAY CI Kernel Test Service
https://01.org/lkp

Thread overview: 22+ messages
2022-05-16 12:53 [RFC PATCH v3 0/6] Removing limitations of merging anonymous VMAs Jakub Matěna
2022-05-16 12:54 ` [RFC PATCH v3 1/6] [PATCH 1/6] mm: refactor of vma_merge() Jakub Matěna
2022-05-20 13:28   ` Kirill A. Shutemov
2022-05-20 15:52     ` Jakub Matěna
2022-05-16 12:54 ` [RFC PATCH v3 2/6] [PATCH 2/6] mm: add merging after mremap resize Jakub Matěna
2022-05-16 21:05   ` kernel test robot
2022-05-20 13:41   ` Kirill A. Shutemov
2022-05-20 14:48     ` Jakub Matěna
2022-05-16 12:54 ` [RFC PATCH v3 3/6] [PATCH 3/6] mm: add migration waiting and rmap locking to pagewalk Jakub Matěna
2022-05-16 21:46   ` kernel test robot
2022-05-16 12:54 ` [RFC PATCH v3 4/6] [PATCH 4/6] mm: adjust page offset in mremap Jakub Matěna
2022-05-19  8:39   ` [mm] df8ef36a21: kernel_BUG_at_lib/list_debug.c kernel test robot
2022-05-19  8:39     ` kernel test robot
2022-05-16 12:54 ` [RFC PATCH v3 5/6] [PATCH 5/6] mm: enable merging of VMAs with different anon_vmas Jakub Matěna
2022-05-19  8:01   ` [mm] d0a63efe2f: WARNING:at_mm/rmap.c:#reconnect_page_pte kernel test robot
2022-05-19  8:01     ` kernel test robot
2022-05-16 12:54 ` [RFC PATCH v3 6/6] [PATCH 6/6] mm: add tracing for VMA merges Jakub Matěna
2022-05-25 14:05   ` Steven Rostedt
2022-05-17 16:44 ` [RFC PATCH v3 0/6] Removing limitations of merging anonymous VMAs Kirill A. Shutemov
2022-05-20 12:22   ` Vlastimil Babka
2022-05-18  5:03 [RFC PATCH v3 3/6] [PATCH 3/6] mm: add migration waiting and rmap locking to pagewalk kernel test robot
2022-05-18  5:41 ` Dan Carpenter
