* [PATCHv2 0/4] Cleanup mm_populate() codepath
From: Kirill A. Shutemov @ 2015-02-16 11:27 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-mm, linux-kernel, Rik van Riel, Kirill A. Shutemov
While reading mlock()- and mm_populate()-related code, I've found several
things confusing. This patchset cleans up the codepath for future readers.
v2:
- Fix typos pointed out by David Rientjes;
- Apply acks;
Kirill A. Shutemov (4):
mm: rename FOLL_MLOCK to FOLL_POPULATE
mm: rename __mlock_vma_pages_range() to populate_vma_page_range()
mm: move gup() -> posix mlock() error conversion out of __mm_populate
mm: move mm_populate()-related code to mm/gup.c
Documentation/vm/unevictable-lru.txt | 26 +++----
include/linux/mm.h | 2 +-
mm/gup.c | 124 ++++++++++++++++++++++++++++++++-
mm/huge_memory.c | 2 +-
mm/internal.h | 2 +-
mm/mlock.c | 131 +++--------------------------------
mm/mmap.c | 4 +-
7 files changed, 142 insertions(+), 149 deletions(-)
--
2.1.4
* [PATCHv2 1/4] mm: rename FOLL_MLOCK to FOLL_POPULATE
From: Kirill A. Shutemov @ 2015-02-16 11:27 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, linux-kernel, Rik van Riel, Kirill A. Shutemov,
Michel Lespinasse
After commit a1fde08c74e9, FOLL_MLOCK has lost its original meaning: we
don't necessarily mlock the page if the flag is set -- we also take
VM_LOCKED into consideration.
Since we use the same codepath for __mm_populate(), let's rename
FOLL_MLOCK to FOLL_POPULATE.
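
For readers following along, here is a minimal userspace sketch of the
condition described above. It is not kernel code: should_mlock_page() is a
hypothetical helper, and the VM_LOCKED value is an assumption taken from the
kernel headers of that era. Only the FOLL_POPULATE value comes from the hunk
below.

```c
#include <stdbool.h>

/* From the include/linux/mm.h hunk in this patch. */
#define FOLL_POPULATE 0x40UL
/* Assumed value of the vma flag; illustrative only. */
#define VM_LOCKED     0x00002000UL

/*
 * Hypothetical helper modelling the check in mm/gup.c and
 * mm/huge_memory.c: the page is mlocked only when the caller asked
 * for population AND the vma itself is marked VM_LOCKED.
 */
static bool should_mlock_page(unsigned long gup_flags, unsigned long vm_flags)
{
	return (gup_flags & FOLL_POPULATE) && (vm_flags & VM_LOCKED);
}
```

With this model, a MAP_POPULATE mapping without VM_LOCKED sets the flag but
never mlocks, which is why the old name was misleading.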
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Michel Lespinasse <walken@google.com>
---
include/linux/mm.h | 2 +-
mm/gup.c | 6 +++---
mm/huge_memory.c | 2 +-
mm/mlock.c | 2 +-
4 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index a09837f3f4b7..2dbf6f0f6e5b 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2110,7 +2110,7 @@ static inline struct page *follow_page(struct vm_area_struct *vma,
#define FOLL_FORCE 0x10 /* get_user_pages read/write w/o permission */
#define FOLL_NOWAIT 0x20 /* if a disk transfer is needed, start the IO
* and return without waiting upon it */
-#define FOLL_MLOCK 0x40 /* mark page as mlocked */
+#define FOLL_POPULATE 0x40 /* fault in page */
#define FOLL_SPLIT 0x80 /* don't return transhuge pages, split them */
#define FOLL_HWPOISON 0x100 /* check page is hwpoisoned */
#define FOLL_NUMA 0x200 /* force NUMA hinting page fault */
diff --git a/mm/gup.c b/mm/gup.c
index a6e24e246f86..1b114ba9aebf 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -92,7 +92,7 @@ retry:
*/
mark_page_accessed(page);
}
- if ((flags & FOLL_MLOCK) && (vma->vm_flags & VM_LOCKED)) {
+ if ((flags & FOLL_POPULATE) && (vma->vm_flags & VM_LOCKED)) {
/*
* The preliminary mapping check is mainly to avoid the
* pointless overhead of lock_page on the ZERO_PAGE
@@ -265,8 +265,8 @@ static int faultin_page(struct task_struct *tsk, struct vm_area_struct *vma,
unsigned int fault_flags = 0;
int ret;
- /* For mlock, just skip the stack guard page. */
- if ((*flags & FOLL_MLOCK) &&
+ /* For mm_populate(), just skip the stack guard page. */
+ if ((*flags & FOLL_POPULATE) &&
(stack_guard_page_start(vma, address) ||
stack_guard_page_end(vma, address + PAGE_SIZE)))
return -ENOENT;
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index e08e37ad050e..bc8d351026a9 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1231,7 +1231,7 @@ struct page *follow_trans_huge_pmd(struct vm_area_struct *vma,
pmd, _pmd, 1))
update_mmu_cache_pmd(vma, addr, pmd);
}
- if ((flags & FOLL_MLOCK) && (vma->vm_flags & VM_LOCKED)) {
+ if ((flags & FOLL_POPULATE) && (vma->vm_flags & VM_LOCKED)) {
if (page->mapping && trylock_page(page)) {
lru_add_drain();
if (page->mapping)
diff --git a/mm/mlock.c b/mm/mlock.c
index 73cf0987088c..7712d8ab6fe5 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -237,7 +237,7 @@ long __mlock_vma_pages_range(struct vm_area_struct *vma,
VM_BUG_ON_VMA(end > vma->vm_end, vma);
VM_BUG_ON_MM(!rwsem_is_locked(&mm->mmap_sem), mm);
- gup_flags = FOLL_TOUCH | FOLL_MLOCK;
+ gup_flags = FOLL_TOUCH | FOLL_POPULATE;
/*
* We want to touch writable mappings with a write fault in order
* to break COW, except for shared mappings because these don't COW
--
2.1.4
* [PATCHv2 2/4] mm: rename __mlock_vma_pages_range() to populate_vma_page_range()
From: Kirill A. Shutemov @ 2015-02-16 11:27 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, linux-kernel, Rik van Riel, Kirill A. Shutemov,
Michel Lespinasse
__mlock_vma_pages_range() doesn't necessarily mlock pages. It depends on
vma flags. The same codepath is used for MAP_POPULATE.
Let's rename __mlock_vma_pages_range() to populate_vma_page_range().
This patch also drops mlock_vma_pages_range() references from the
documentation. That function was removed in commit cea10a19b797.
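
As an illustration of why "populate" is the better name, here is a userspace
sketch of how the function picks its gup flags from the vma flags. The
function name populate_gup_flags() is hypothetical, and the numeric flag
values are assumptions mirroring the kernel headers of that era; the logic
follows the function body as it appears in this series.

```c
/* Assumed flag values; FOLL_FORCE and FOLL_POPULATE match this series. */
#define FOLL_WRITE    0x01U
#define FOLL_TOUCH    0x02U
#define FOLL_FORCE    0x10U
#define FOLL_POPULATE 0x40U

#define VM_READ   0x1UL
#define VM_WRITE  0x2UL
#define VM_EXEC   0x4UL
#define VM_SHARED 0x8UL

/* Hypothetical helper: the flag selection, detached from the kernel. */
static unsigned int populate_gup_flags(unsigned long vm_flags)
{
	unsigned int gup_flags = FOLL_TOUCH | FOLL_POPULATE;

	/* Private writable mappings get a write fault to break COW. */
	if ((vm_flags & (VM_WRITE | VM_SHARED)) == VM_WRITE)
		gup_flags |= FOLL_WRITE;

	/* Anything other than PROT_NONE should still be populated. */
	if (vm_flags & (VM_READ | VM_WRITE | VM_EXEC))
		gup_flags |= FOLL_FORCE;

	return gup_flags;
}
```

Note that nothing here depends on VM_LOCKED: the mlock decision is made later,
per page, inside the gup code.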
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Michel Lespinasse <walken@google.com>
---
Documentation/vm/unevictable-lru.txt | 26 ++++++++------------------
mm/internal.h | 2 +-
mm/mlock.c | 12 ++++++------
mm/mmap.c | 4 ++--
4 files changed, 17 insertions(+), 27 deletions(-)
diff --git a/Documentation/vm/unevictable-lru.txt b/Documentation/vm/unevictable-lru.txt
index 744f82f86c58..86cb4624fc5a 100644
--- a/Documentation/vm/unevictable-lru.txt
+++ b/Documentation/vm/unevictable-lru.txt
@@ -317,7 +317,7 @@ If the VMA passes some filtering as described in "Filtering Special Vmas"
below, mlock_fixup() will attempt to merge the VMA with its neighbors or split
off a subset of the VMA if the range does not cover the entire VMA. Once the
VMA has been merged or split or neither, mlock_fixup() will call
-__mlock_vma_pages_range() to fault in the pages via get_user_pages() and to
+populate_vma_page_range() to fault in the pages via get_user_pages() and to
mark the pages as mlocked via mlock_vma_page().
Note that the VMA being mlocked might be mapped with PROT_NONE. In this case,
@@ -327,7 +327,7 @@ fault path or in vmscan.
Also note that a page returned by get_user_pages() could be truncated or
migrated out from under us, while we're trying to mlock it. To detect this,
-__mlock_vma_pages_range() checks page_mapping() after acquiring the page lock.
+populate_vma_page_range() checks page_mapping() after acquiring the page lock.
If the page is still associated with its mapping, we'll go ahead and call
mlock_vma_page(). If the mapping is gone, we just unlock the page and move on.
In the worst case, this will result in a page mapped in a VM_LOCKED VMA
@@ -392,7 +392,7 @@ ignored for munlock.
If the VMA is VM_LOCKED, mlock_fixup() again attempts to merge or split off the
specified range. The range is then munlocked via the function
-__mlock_vma_pages_range() - the same function used to mlock a VMA range -
+populate_vma_page_range() - the same function used to mlock a VMA range -
passing a flag to indicate that munlock() is being performed.
Because the VMA access protections could have been changed to PROT_NONE after
@@ -402,7 +402,7 @@ get_user_pages() was enhanced to accept a flag to ignore the permissions when
fetching the pages - all of which should be resident as a result of previous
mlocking.
-For munlock(), __mlock_vma_pages_range() unlocks individual pages by calling
+For munlock(), populate_vma_page_range() unlocks individual pages by calling
munlock_vma_page(). munlock_vma_page() unconditionally clears the PG_mlocked
flag using TestClearPageMlocked(). As with mlock_vma_page(),
munlock_vma_page() use the Test*PageMlocked() function to handle the case where
@@ -463,21 +463,11 @@ populate the page table.
To mlock a range of memory under the unevictable/mlock infrastructure, the
mmap() handler and task address space expansion functions call
-mlock_vma_pages_range() specifying the vma and the address range to mlock.
-mlock_vma_pages_range() filters VMAs like mlock_fixup(), as described above in
-"Filtering Special VMAs". It will clear the VM_LOCKED flag, which will have
-already been set by the caller, in filtered VMAs. Thus these VMA's need not be
-visited for munlock when the region is unmapped.
-
-For "normal" VMAs, mlock_vma_pages_range() calls __mlock_vma_pages_range() to
-fault/allocate the pages and mlock them. Again, like mlock_fixup(),
-mlock_vma_pages_range() downgrades the mmap semaphore to read mode before
-attempting to fault/allocate and mlock the pages and "upgrades" the semaphore
-back to write mode before returning.
-
-The callers of mlock_vma_pages_range() will have already added the memory range
+populate_vma_page_range() specifying the vma and the address range to mlock.
+
+The callers of populate_vma_page_range() will have already added the memory range
to be mlocked to the task's "locked_vm". To account for filtered VMAs,
-mlock_vma_pages_range() returns the number of pages NOT mlocked. All of the
+populate_vma_page_range() returns the number of pages NOT mlocked. All of the
callers then subtract a non-negative return value from the task's locked_vm. A
negative return value represent an error - for example, from get_user_pages()
attempting to fault in a VMA with PROT_NONE access. In this case, we leave the
diff --git a/mm/internal.h b/mm/internal.h
index c4d6c9b43491..5445860bfc08 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -240,7 +240,7 @@ void __vma_link_list(struct mm_struct *mm, struct vm_area_struct *vma,
struct vm_area_struct *prev, struct rb_node *rb_parent);
#ifdef CONFIG_MMU
-extern long __mlock_vma_pages_range(struct vm_area_struct *vma,
+extern long populate_vma_page_range(struct vm_area_struct *vma,
unsigned long start, unsigned long end, int *nonblocking);
extern void munlock_vma_pages_range(struct vm_area_struct *vma,
unsigned long start, unsigned long end);
diff --git a/mm/mlock.c b/mm/mlock.c
index 7712d8ab6fe5..c3ea18323034 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -206,13 +206,13 @@ out:
}
/**
- * __mlock_vma_pages_range() - mlock a range of pages in the vma.
+ * populate_vma_page_range() - populate a range of pages in the vma.
* @vma: target vma
* @start: start address
* @end: end address
* @nonblocking:
*
- * This takes care of making the pages present too.
+ * This takes care of mlocking the pages too if VM_LOCKED is set.
*
* return 0 on success, negative error code on error.
*
@@ -224,7 +224,7 @@ out:
* If @nonblocking is non-NULL, it must held for read only and may be
* released. If it's released, *@nonblocking will be set to 0.
*/
-long __mlock_vma_pages_range(struct vm_area_struct *vma,
+long populate_vma_page_range(struct vm_area_struct *vma,
unsigned long start, unsigned long end, int *nonblocking)
{
struct mm_struct *mm = vma->vm_mm;
@@ -596,7 +596,7 @@ success:
/*
* vm_flags is protected by the mmap_sem held in write mode.
* It's okay if try_to_unmap_one unmaps a page just after we
- * set VM_LOCKED, __mlock_vma_pages_range will bring it back.
+ * set VM_LOCKED, populate_vma_page_range will bring it back.
*/
if (lock)
@@ -702,11 +702,11 @@ int __mm_populate(unsigned long start, unsigned long len, int ignore_errors)
if (nstart < vma->vm_start)
nstart = vma->vm_start;
/*
- * Now fault in a range of pages. __mlock_vma_pages_range()
+ * Now fault in a range of pages. populate_vma_page_range()
* double checks the vma flags, so that it won't mlock pages
* if the vma was already munlocked.
*/
- ret = __mlock_vma_pages_range(vma, nstart, nend, &locked);
+ ret = populate_vma_page_range(vma, nstart, nend, &locked);
if (ret < 0) {
if (ignore_errors) {
ret = 0;
diff --git a/mm/mmap.c b/mm/mmap.c
index da9990acc08b..943c6ad18b1d 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -2318,7 +2318,7 @@ find_extend_vma(struct mm_struct *mm, unsigned long addr)
if (!prev || expand_stack(prev, addr))
return NULL;
if (prev->vm_flags & VM_LOCKED)
- __mlock_vma_pages_range(prev, addr, prev->vm_end, NULL);
+ populate_vma_page_range(prev, addr, prev->vm_end, NULL);
return prev;
}
#else
@@ -2353,7 +2353,7 @@ find_extend_vma(struct mm_struct *mm, unsigned long addr)
if (expand_stack(vma, addr))
return NULL;
if (vma->vm_flags & VM_LOCKED)
- __mlock_vma_pages_range(vma, addr, start, NULL);
+ populate_vma_page_range(vma, addr, start, NULL);
return vma;
}
#endif
--
2.1.4
* [PATCHv2 3/4] mm: move gup() -> posix mlock() error conversion out of __mm_populate
From: Kirill A. Shutemov @ 2015-02-16 11:27 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-mm, linux-kernel, Rik van Riel, Kirill A. Shutemov
This is preparation for moving mm_populate()-related code out of
mm/mlock.c.
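
A rough userspace sketch of the resulting control flow in the mlock()
syscall, for illustration only. Both function names are hypothetical
stand-ins. The errno mapping (EFAULT to ENOMEM, ENOMEM to EAGAIN) is an
assumption about what __mlock_posix_error_return() does; its body is not
part of this patch.

```c
#include <errno.h>

/* Hypothetical stand-in for __mlock_posix_error_return(). */
static long mlock_posix_error(long retval)
{
	if (retval == -EFAULT)
		retval = -ENOMEM;
	else if (retval == -ENOMEM)
		retval = -EAGAIN;
	return retval;
}

/*
 * Models the tail of the mlock() syscall after this patch.
 * lock_error stands for the do_mlock() result and populate_ret for
 * the raw __mm_populate() result, which is no longer converted
 * inside __mm_populate() itself.
 */
static long mlock_tail(long lock_error, long populate_ret)
{
	if (lock_error)
		return lock_error;
	if (populate_ret)
		return mlock_posix_error(populate_ret);
	return 0;
}
```

The point of the split is that __mm_populate() returns plain gup errors, so
it can live next to the gup code without knowing about mlock() semantics.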
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: David Rientjes <rientjes@google.com>
---
mm/mlock.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/mm/mlock.c b/mm/mlock.c
index c3ea18323034..9fbe611b1e93 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -712,7 +712,6 @@ int __mm_populate(unsigned long start, unsigned long len, int ignore_errors)
ret = 0;
continue; /* continue at next VMA */
}
- ret = __mlock_posix_error_return(ret);
break;
}
nend = nstart + ret * PAGE_SIZE;
@@ -750,9 +749,13 @@ SYSCALL_DEFINE2(mlock, unsigned long, start, size_t, len)
error = do_mlock(start, len, 1);
 up_write(&current->mm->mmap_sem);
- if (!error)
- error = __mm_populate(start, len, 0);
- return error;
+ if (error)
+ return error;
+
+ error = __mm_populate(start, len, 0);
+ if (error)
+ return __mlock_posix_error_return(error);
+ return 0;
}
SYSCALL_DEFINE2(munlock, unsigned long, start, size_t, len)
--
2.1.4
* [PATCHv2 4/4] mm: move mm_populate()-related code to mm/gup.c
From: Kirill A. Shutemov @ 2015-02-16 11:27 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-mm, linux-kernel, Rik van Riel, Kirill A. Shutemov
It's odd that we have populate_vma_page_range() and __mm_populate() in
mm/mlock.c. They implement generic memory population; mlocking is only
one possible side effect, if VM_LOCKED is set.
__get_user_pages() is the core of the implementation. Let's move the code
to mm/gup.c.
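
To show that the walk is generic and not mlock-specific, here is a
simplified userspace model of the range clipping done by __mm_populate().
Everything in it is hypothetical: struct toy_vma, toy_populate() and the
"special" field (standing in for VM_IO | VM_PFNMAP) are invented for the
sketch. It ignores locking, errors and partial population.

```c
#include <stddef.h>

/* Toy vma: [start, end), with "special" standing in for VM_IO/VM_PFNMAP. */
struct toy_vma {
	unsigned long start, end;
	int special;
};

/*
 * Returns how many bytes of [start, start + len) would be handed to
 * the populate step. vmas[] must be sorted and non-overlapping.
 */
static unsigned long toy_populate(const struct toy_vma *vmas, size_t n,
				  unsigned long start, unsigned long len)
{
	unsigned long end = start + len;
	unsigned long nstart = start, nend = start, total = 0;
	size_t i = 0;

	/* Like find_vma(): first vma that ends above nstart. */
	while (i < n && vmas[i].end <= nstart)
		i++;

	for (; i < n && nstart < end && vmas[i].start < end;
	     i++, nstart = nend) {
		/* Clip the request to this vma. */
		nend = vmas[i].end < end ? vmas[i].end : end;
		if (vmas[i].special)
			continue;	/* skip undesirable vma types */
		if (nstart < vmas[i].start)
			nstart = vmas[i].start;
		total += nend - nstart;
	}
	return total;
}
```

Holes between vmas and special vmas are skipped; everything else is faulted
in regardless of whether the vma is VM_LOCKED.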
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: David Rientjes <rientjes@google.com>
---
mm/gup.c | 118 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
mm/mlock.c | 118 -------------------------------------------------------------
2 files changed, 118 insertions(+), 118 deletions(-)
diff --git a/mm/gup.c b/mm/gup.c
index 1b114ba9aebf..ca7b607ab671 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -819,6 +819,124 @@ long get_user_pages(struct task_struct *tsk, struct mm_struct *mm,
EXPORT_SYMBOL(get_user_pages);
/**
+ * populate_vma_page_range() - populate a range of pages in the vma.
+ * @vma: target vma
+ * @start: start address
+ * @end: end address
+ * @nonblocking:
+ *
+ * This takes care of mlocking the pages too if VM_LOCKED is set.
+ *
+ * return 0 on success, negative error code on error.
+ *
+ * vma->vm_mm->mmap_sem must be held.
+ *
+ * If @nonblocking is NULL, it may be held for read or write and will
+ * be unperturbed.
+ *
+ * If @nonblocking is non-NULL, it must held for read only and may be
+ * released. If it's released, *@nonblocking will be set to 0.
+ */
+long populate_vma_page_range(struct vm_area_struct *vma,
+ unsigned long start, unsigned long end, int *nonblocking)
+{
+ struct mm_struct *mm = vma->vm_mm;
+ unsigned long nr_pages = (end - start) / PAGE_SIZE;
+ int gup_flags;
+
+ VM_BUG_ON(start & ~PAGE_MASK);
+ VM_BUG_ON(end & ~PAGE_MASK);
+ VM_BUG_ON_VMA(start < vma->vm_start, vma);
+ VM_BUG_ON_VMA(end > vma->vm_end, vma);
+ VM_BUG_ON_MM(!rwsem_is_locked(&mm->mmap_sem), mm);
+
+ gup_flags = FOLL_TOUCH | FOLL_POPULATE;
+ /*
+ * We want to touch writable mappings with a write fault in order
+ * to break COW, except for shared mappings because these don't COW
+ * and we would not want to dirty them for nothing.
+ */
+ if ((vma->vm_flags & (VM_WRITE | VM_SHARED)) == VM_WRITE)
+ gup_flags |= FOLL_WRITE;
+
+ /*
+ * We want mlock to succeed for regions that have any permissions
+ * other than PROT_NONE.
+ */
+ if (vma->vm_flags & (VM_READ | VM_WRITE | VM_EXEC))
+ gup_flags |= FOLL_FORCE;
+
+ /*
+ * We made sure addr is within a VMA, so the following will
+ * not result in a stack expansion that recurses back here.
+ */
+ return __get_user_pages(current, mm, start, nr_pages, gup_flags,
+ NULL, NULL, nonblocking);
+}
+
+/*
+ * __mm_populate - populate and/or mlock pages within a range of address space.
+ *
+ * This is used to implement mlock() and the MAP_POPULATE / MAP_LOCKED mmap
+ * flags. VMAs must be already marked with the desired vm_flags, and
+ * mmap_sem must not be held.
+ */
+int __mm_populate(unsigned long start, unsigned long len, int ignore_errors)
+{
+ struct mm_struct *mm = current->mm;
+ unsigned long end, nstart, nend;
+ struct vm_area_struct *vma = NULL;
+ int locked = 0;
+ long ret = 0;
+
+ VM_BUG_ON(start & ~PAGE_MASK);
+ VM_BUG_ON(len != PAGE_ALIGN(len));
+ end = start + len;
+
+ for (nstart = start; nstart < end; nstart = nend) {
+ /*
+ * We want to fault in pages for [nstart; end) address range.
+ * Find first corresponding VMA.
+ */
+ if (!locked) {
+ locked = 1;
+ down_read(&mm->mmap_sem);
+ vma = find_vma(mm, nstart);
+ } else if (nstart >= vma->vm_end)
+ vma = vma->vm_next;
+ if (!vma || vma->vm_start >= end)
+ break;
+ /*
+ * Set [nstart; nend) to intersection of desired address
+ * range with the first VMA. Also, skip undesirable VMA types.
+ */
+ nend = min(end, vma->vm_end);
+ if (vma->vm_flags & (VM_IO | VM_PFNMAP))
+ continue;
+ if (nstart < vma->vm_start)
+ nstart = vma->vm_start;
+ /*
+ * Now fault in a range of pages. populate_vma_page_range()
+ * double checks the vma flags, so that it won't mlock pages
+ * if the vma was already munlocked.
+ */
+ ret = populate_vma_page_range(vma, nstart, nend, &locked);
+ if (ret < 0) {
+ if (ignore_errors) {
+ ret = 0;
+ continue; /* continue at next VMA */
+ }
+ break;
+ }
+ nend = nstart + ret * PAGE_SIZE;
+ ret = 0;
+ }
+ if (locked)
+ up_read(&mm->mmap_sem);
+ return ret; /* 0 or negative error code */
+}
+
+/**
* get_dump_page() - pin user page in memory while writing it to core dump
* @addr: user address
*
diff --git a/mm/mlock.c b/mm/mlock.c
index 9fbe611b1e93..283ff972ea43 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -205,62 +205,6 @@ out:
return nr_pages - 1;
}
-/**
- * populate_vma_page_range() - populate a range of pages in the vma.
- * @vma: target vma
- * @start: start address
- * @end: end address
- * @nonblocking:
- *
- * This takes care of mlocking the pages too if VM_LOCKED is set.
- *
- * return 0 on success, negative error code on error.
- *
- * vma->vm_mm->mmap_sem must be held.
- *
- * If @nonblocking is NULL, it may be held for read or write and will
- * be unperturbed.
- *
- * If @nonblocking is non-NULL, it must held for read only and may be
- * released. If it's released, *@nonblocking will be set to 0.
- */
-long populate_vma_page_range(struct vm_area_struct *vma,
- unsigned long start, unsigned long end, int *nonblocking)
-{
- struct mm_struct *mm = vma->vm_mm;
- unsigned long nr_pages = (end - start) / PAGE_SIZE;
- int gup_flags;
-
- VM_BUG_ON(start & ~PAGE_MASK);
- VM_BUG_ON(end & ~PAGE_MASK);
- VM_BUG_ON_VMA(start < vma->vm_start, vma);
- VM_BUG_ON_VMA(end > vma->vm_end, vma);
- VM_BUG_ON_MM(!rwsem_is_locked(&mm->mmap_sem), mm);
-
- gup_flags = FOLL_TOUCH | FOLL_POPULATE;
- /*
- * We want to touch writable mappings with a write fault in order
- * to break COW, except for shared mappings because these don't COW
- * and we would not want to dirty them for nothing.
- */
- if ((vma->vm_flags & (VM_WRITE | VM_SHARED)) == VM_WRITE)
- gup_flags |= FOLL_WRITE;
-
- /*
- * We want mlock to succeed for regions that have any permissions
- * other than PROT_NONE.
- */
- if (vma->vm_flags & (VM_READ | VM_WRITE | VM_EXEC))
- gup_flags |= FOLL_FORCE;
-
- /*
- * We made sure addr is within a VMA, so the following will
- * not result in a stack expansion that recurses back here.
- */
- return __get_user_pages(current, mm, start, nr_pages, gup_flags,
- NULL, NULL, nonblocking);
-}
-
/*
* convert get_user_pages() return value to posix mlock() error
*/
@@ -660,68 +604,6 @@ static int do_mlock(unsigned long start, size_t len, int on)
return error;
}
-/*
- * __mm_populate - populate and/or mlock pages within a range of address space.
- *
- * This is used to implement mlock() and the MAP_POPULATE / MAP_LOCKED mmap
- * flags. VMAs must be already marked with the desired vm_flags, and
- * mmap_sem must not be held.
- */
-int __mm_populate(unsigned long start, unsigned long len, int ignore_errors)
-{
- struct mm_struct *mm = current->mm;
- unsigned long end, nstart, nend;
- struct vm_area_struct *vma = NULL;
- int locked = 0;
- long ret = 0;
-
- VM_BUG_ON(start & ~PAGE_MASK);
- VM_BUG_ON(len != PAGE_ALIGN(len));
- end = start + len;
-
- for (nstart = start; nstart < end; nstart = nend) {
- /*
- * We want to fault in pages for [nstart; end) address range.
- * Find first corresponding VMA.
- */
- if (!locked) {
- locked = 1;
- down_read(&mm->mmap_sem);
- vma = find_vma(mm, nstart);
- } else if (nstart >= vma->vm_end)
- vma = vma->vm_next;
- if (!vma || vma->vm_start >= end)
- break;
- /*
- * Set [nstart; nend) to intersection of desired address
- * range with the first VMA. Also, skip undesirable VMA types.
- */
- nend = min(end, vma->vm_end);
- if (vma->vm_flags & (VM_IO | VM_PFNMAP))
- continue;
- if (nstart < vma->vm_start)
- nstart = vma->vm_start;
- /*
- * Now fault in a range of pages. populate_vma_page_range()
- * double checks the vma flags, so that it won't mlock pages
- * if the vma was already munlocked.
- */
- ret = populate_vma_page_range(vma, nstart, nend, &locked);
- if (ret < 0) {
- if (ignore_errors) {
- ret = 0;
- continue; /* continue at next VMA */
- }
- break;
- }
- nend = nstart + ret * PAGE_SIZE;
- ret = 0;
- }
- if (locked)
- up_read(&mm->mmap_sem);
- return ret; /* 0 or negative error code */
-}
-
SYSCALL_DEFINE2(mlock, unsigned long, start, size_t, len)
{
unsigned long locked;
--
2.1.4