From: Jason Baron <jbaron@akamai.com>
To: akpm@linux-foundation.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org
Cc: Michal Hocko <mhocko@suse.com>, Vlastimil Babka <vbabka@suse.cz>,
	Joonsoo Kim <iamjoonsoo.kim@lge.com>,
	Mel Gorman <mgorman@suse.de>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Subject: [PATCH] mm/madvise: allow MADV_DONTNEED to free memory that is MLOCK_ONFAULT
Date: Fri,  8 Jun 2018 14:56:52 -0400
Message-ID: <1528484212-7199-1-git-send-email-jbaron@akamai.com>

To free memory that is marked MLOCK_ONFAULT, the region must first be
unlocked before MADV_DONTNEED can be called on it. If the region is then to
be reused as MLOCK_ONFAULT, another call to mlock2() with the MLOCK_ONFAULT
flag is required.

Let's simplify freeing memory that is set MLOCK_ONFAULT by allowing
MADV_DONTNEED to work directly on such memory. The locked memory limits,
tracked by mm->locked_vm, do not need to be adjusted in this case, since
they were charged to the entire region when MLOCK_ONFAULT was initially
set.

Further, I don't think allowing MADV_FREE for MLOCK_ONFAULT regions makes
sense, since the point of MLOCK_ONFAULT is for userspace to know when pages
are locked in memory and thus to know when page faults will occur.

Signed-off-by: Jason Baron <jbaron@akamai.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
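For reference, here is a minimal userspace sketch of the three-step sequence
described in the changelog, as it is needed without this patch. The helper
names (lock_onfault, unlock_and_drop, recycle_onfault_region) are
hypothetical and only for illustration. mlock2() is invoked through
syscall(2) on the assumption that the libc in use may not provide a wrapper.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef MLOCK_ONFAULT
#define MLOCK_ONFAULT 0x01	/* fallback for older libc headers */
#endif

/* Hypothetical helper: lock a range so pages are pinned as they fault in. */
static int lock_onfault(void *addr, size_t len)
{
	return (int)syscall(SYS_mlock2, addr, len, MLOCK_ONFAULT);
}

/* Hypothetical helper: steps 1 and 2, unlock the range and discard it. */
static int unlock_and_drop(void *addr, size_t len)
{
	if (munlock(addr, len) != 0)
		return -1;
	return madvise(addr, len, MADV_DONTNEED);
}

/*
 * Hypothetical helper: the full sequence needed today to free an
 * MLOCK_ONFAULT region and keep using it as MLOCK_ONFAULT afterwards.
 */
static int recycle_onfault_region(void *addr, size_t len)
{
	if (unlock_and_drop(addr, len) != 0)
		return -1;
	return lock_onfault(addr, len);	/* step 3: re-establish the lock */
}
```

With the patch applied, the intent is that the madvise(MADV_DONTNEED) call
alone is enough for such a region, so the munlock() and the second mlock2()
are no longer needed.
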
 mm/internal.h | 18 ++++++++++++++++++
 mm/madvise.c  |  4 ++--
 mm/oom_kill.c |  2 +-
 3 files changed, 21 insertions(+), 3 deletions(-)
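
To make the flag checks in mm/internal.h easier to review, here is a small
userspace model of the two predicates. The model_* names are hypothetical,
and the bit values are placeholders chosen for illustration; they should not
be taken as the kernel's definitions. The model relies on an MLOCK_ONFAULT
region carrying both VM_LOCKED and VM_LOCKONFAULT in vm_flags.

```c
#include <stdbool.h>

/* Placeholder bit values for illustration only (assumption). */
#define VM_PFNMAP	0x00000400UL
#define VM_LOCKED	0x00002000UL
#define VM_LOCKONFAULT	0x00080000UL
#define VM_HUGETLB	0x00400000UL

/* Mirrors the new can_madv_dontneed_vma(): plain mlock is refused,
 * lock-on-fault is allowed, hugetlb and PFN mappings are refused. */
static bool model_can_dontneed(unsigned long flags)
{
	return !(((flags & (VM_LOCKED | VM_LOCKONFAULT)) == VM_LOCKED) ||
		 (flags & (VM_HUGETLB | VM_PFNMAP)));
}

/* Mirrors can_madv_free_vma(): any locked region is refused. */
static bool model_can_free(unsigned long flags)
{
	return !(flags & (VM_LOCKED | VM_HUGETLB | VM_PFNMAP));
}
```

In this model, VM_LOCKED|VM_LOCKONFAULT passes the MADV_DONTNEED check but
not the MADV_FREE check, which is the only behavioral difference between the
two predicates.
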

diff --git a/mm/internal.h b/mm/internal.h
index 9e3654d..16c0041 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -15,6 +15,7 @@
 #include <linux/mm.h>
 #include <linux/pagemap.h>
 #include <linux/tracepoint-defs.h>
+#include <uapi/asm-generic/mman-common.h>
 
 /*
  * The set of flags that only affect watermark checking and reclaim
@@ -45,9 +46,26 @@ void free_pgtables(struct mmu_gather *tlb, struct vm_area_struct *start_vma,
 
 static inline bool can_madv_dontneed_vma(struct vm_area_struct *vma)
 {
+	return !(((vma->vm_flags & (VM_LOCKED|VM_LOCKONFAULT)) == VM_LOCKED) ||
+		 (vma->vm_flags & (VM_HUGETLB|VM_PFNMAP)));
+}
+
+static inline bool can_madv_free_vma(struct vm_area_struct *vma)
+{
 	return !(vma->vm_flags & (VM_LOCKED|VM_HUGETLB|VM_PFNMAP));
 }
 
+static inline bool can_madv_dontneed_or_free_vma(struct vm_area_struct *vma,
+						 int behavior)
+{
+	if (behavior == MADV_DONTNEED)
+		return can_madv_dontneed_vma(vma);
+	else if (behavior == MADV_FREE)
+		return can_madv_free_vma(vma);
+	else
+		return false;
+}
+
 void unmap_page_range(struct mmu_gather *tlb,
 			     struct vm_area_struct *vma,
 			     unsigned long addr, unsigned long end,
diff --git a/mm/madvise.c b/mm/madvise.c
index 4d3c922..61ff306 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -517,7 +517,7 @@ static long madvise_dontneed_free(struct vm_area_struct *vma,
 				  int behavior)
 {
 	*prev = vma;
-	if (!can_madv_dontneed_vma(vma))
+	if (!can_madv_dontneed_or_free_vma(vma, behavior))
 		return -EINVAL;
 
 	if (!userfaultfd_remove(vma, start, end)) {
@@ -539,7 +539,7 @@ static long madvise_dontneed_free(struct vm_area_struct *vma,
 			 */
 			return -ENOMEM;
 		}
-		if (!can_madv_dontneed_vma(vma))
+		if (!can_madv_dontneed_or_free_vma(vma, behavior))
 			return -EINVAL;
 		if (end > vma->vm_end) {
 			/*
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 8ba6cb8..9817d15 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -492,7 +492,7 @@ void __oom_reap_task_mm(struct mm_struct *mm)
 	set_bit(MMF_UNSTABLE, &mm->flags);
 
 	for (vma = mm->mmap ; vma; vma = vma->vm_next) {
-		if (!can_madv_dontneed_vma(vma))
+		if (!can_madv_free_vma(vma))
 			continue;
 
 		/*
-- 
2.7.4