linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] Fix another VM_BUG_ON_PAGE(PageTail(page)) on mbind(2)
@ 2016-02-01 13:26 Kirill A. Shutemov
  2016-02-01 13:26 ` [PATCH 1/2] mm: fix bogus VM_BUG_ON_PAGE() in isolate_lru_page() Kirill A. Shutemov
  2016-02-01 13:26 ` [PATCH 2/2] mempolicy: do not try to queue pages from !vma_migratable() Kirill A. Shutemov
  0 siblings, 2 replies; 7+ messages in thread
From: Kirill A. Shutemov @ 2016-02-01 13:26 UTC (permalink / raw)
  To: Andrew Morton, Dmitry Vyukov
  Cc: Vlastimil Babka, David Rientjes, Naoya Horiguchi, Michal Hocko,
	linux-mm, linux-kernel, Kirill A. Shutemov

Dmitry Vyukov reported yet another VM_BUG_ON_PAGE(PageTail(page)) bug from
isolate_lru_page().

The first patch relaxes VM_BUG_ON_PAGE(): no need to crash on tail pages
which are not on LRU. We are not going to isolate them anyway.

The second patch tries streamline logic within queue_pages_range(): no
need to scan non migratable VMAs. The patch requires more careful review.

Any of these patches should fix the issue. I think both should be applied.
The first one is subject for 4.5 as the bogus was introduced there.

Kirill A. Shutemov (2):
  mm: fix bogus VM_BUG_ON_PAGE() in isolate_lru_page()
  mempolicy: do not try to queue pages from !vma_migratable()

 mm/mempolicy.c | 14 +++++---------
 mm/vmscan.c    |  2 +-
 2 files changed, 6 insertions(+), 10 deletions(-)

-- 
2.7.0

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 1/2] mm: fix bogus VM_BUG_ON_PAGE() in isolate_lru_page()
  2016-02-01 13:26 [PATCH 0/2] Fix another VM_BUG_ON_PAGE(PageTail(page)) on mbind(2) Kirill A. Shutemov
@ 2016-02-01 13:26 ` Kirill A. Shutemov
  2016-02-01 14:24   ` Michal Hocko
  2016-02-01 13:26 ` [PATCH 2/2] mempolicy: do not try to queue pages from !vma_migratable() Kirill A. Shutemov
  1 sibling, 1 reply; 7+ messages in thread
From: Kirill A. Shutemov @ 2016-02-01 13:26 UTC (permalink / raw)
  To: Andrew Morton, Dmitry Vyukov
  Cc: Vlastimil Babka, David Rientjes, Naoya Horiguchi, Michal Hocko,
	linux-mm, linux-kernel, Kirill A. Shutemov

We don't care if there's a tail pages which is not on LRU. We are not
going to isolate them anyway.

Testcase:

	#include <fcntl.h>
	#include <unistd.h>
	#include <stdio.h>
	#include <sys/mman.h>
	#include <numaif.h>

	#define SIZE 0x2000

	int foo;

	int main()
	{
		int fd;
		char *p;
		unsigned long mask = 2;

		fd = open("/dev/sg0", O_RDWR);
		p = mmap(NULL, SIZE, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
		/* Faultin pages */
		foo = p[0] + p[0x1000];
		mbind(p, SIZE, MPOL_BIND, &mask, 4, MPOL_MF_MOVE | MPOL_MF_STRICT);
		return 0;
	}

MPOL_MF_STRICT makes queue_pages_test_walk() ignore !vma_megratable()
and we try to queue such pages for migration. It's good question why we
ignore !vma_megratable() for MPOL_MF_STRICT, but it's subject for a
separate patch.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Fixes: bb5b8589767a ("mm: make sure isolate_lru_page() is never called for tail page"
---
 mm/vmscan.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index eb3dd37ccd7c..492fbe73420b 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1443,7 +1443,7 @@ int isolate_lru_page(struct page *page)
 	int ret = -EBUSY;
 
 	VM_BUG_ON_PAGE(!page_count(page), page);
-	VM_BUG_ON_PAGE(PageTail(page), page);
+	VM_BUG_ON_PAGE(PageLRU(page) && PageTail(page), page);
 
 	if (PageLRU(page)) {
 		struct zone *zone = page_zone(page);
-- 
2.7.0

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 2/2] mempolicy: do not try to queue pages from !vma_migratable()
  2016-02-01 13:26 [PATCH 0/2] Fix another VM_BUG_ON_PAGE(PageTail(page)) on mbind(2) Kirill A. Shutemov
  2016-02-01 13:26 ` [PATCH 1/2] mm: fix bogus VM_BUG_ON_PAGE() in isolate_lru_page() Kirill A. Shutemov
@ 2016-02-01 13:26 ` Kirill A. Shutemov
  2016-02-01 22:28   ` Andrew Morton
  1 sibling, 1 reply; 7+ messages in thread
From: Kirill A. Shutemov @ 2016-02-01 13:26 UTC (permalink / raw)
  To: Andrew Morton, Dmitry Vyukov
  Cc: Vlastimil Babka, David Rientjes, Naoya Horiguchi, Michal Hocko,
	linux-mm, linux-kernel, Kirill A. Shutemov

Maybe I miss some point, but I don't see a reason why we try to queue
pages from non migratable VMAs.

The only case when we can queue pages from such VMA is MPOL_MF_STRICT
plus MPOL_MF_MOVE or MPOL_MF_MOVE_ALL for VMA which has pages on LRU,
but gfp mask is not sutable for migaration (see mapping_gfp_mask() check
in vma_migratable()). That's looks like a bug to me.

Let's filter out non-migratable vma at start of queue_pages_test_walk()
and go to queue_pages_pte_range() only if MPOL_MF_MOVE or
MPOL_MF_MOVE_ALL flag is set.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
 mm/mempolicy.c | 14 +++++---------
 1 file changed, 5 insertions(+), 9 deletions(-)

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 27d135408a22..4c4187c0e1de 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -548,8 +548,7 @@ retry:
 			goto retry;
 		}
 
-		if (flags & (MPOL_MF_MOVE | MPOL_MF_MOVE_ALL))
-			migrate_page_add(page, qp->pagelist, flags);
+		migrate_page_add(page, qp->pagelist, flags);
 	}
 	pte_unmap_unlock(pte - 1, ptl);
 	cond_resched();
@@ -625,7 +624,7 @@ static int queue_pages_test_walk(unsigned long start, unsigned long end,
 	unsigned long endvma = vma->vm_end;
 	unsigned long flags = qp->flags;
 
-	if (vma->vm_flags & VM_PFNMAP)
+	if (!vma_migratable(vma))
 		return 1;
 
 	if (endvma > end)
@@ -644,16 +643,13 @@ static int queue_pages_test_walk(unsigned long start, unsigned long end,
 
 	if (flags & MPOL_MF_LAZY) {
 		/* Similar to task_numa_work, skip inaccessible VMAs */
-		if (vma_migratable(vma) &&
-			vma->vm_flags & (VM_READ | VM_EXEC | VM_WRITE))
+		if (vma->vm_flags & (VM_READ | VM_EXEC | VM_WRITE))
 			change_prot_numa(vma, start, endvma);
 		return 1;
 	}
 
-	if ((flags & MPOL_MF_STRICT) ||
-	    ((flags & (MPOL_MF_MOVE | MPOL_MF_MOVE_ALL)) &&
-	     vma_migratable(vma)))
-		/* queue pages from current vma */
+	/* queue pages from current vma */
+	if (flags & (MPOL_MF_MOVE | MPOL_MF_MOVE_ALL))
 		return 0;
 	return 1;
 }
-- 
2.7.0

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/2] mm: fix bogus VM_BUG_ON_PAGE() in isolate_lru_page()
  2016-02-01 13:26 ` [PATCH 1/2] mm: fix bogus VM_BUG_ON_PAGE() in isolate_lru_page() Kirill A. Shutemov
@ 2016-02-01 14:24   ` Michal Hocko
  2016-02-01 14:38     ` Kirill A. Shutemov
  0 siblings, 1 reply; 7+ messages in thread
From: Michal Hocko @ 2016-02-01 14:24 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Andrew Morton, Dmitry Vyukov, Vlastimil Babka, David Rientjes,
	Naoya Horiguchi, linux-mm, linux-kernel

On Mon 01-02-16 16:26:08, Kirill A. Shutemov wrote:
> We don't care if there's a tail pages which is not on LRU. We are not
> going to isolate them anyway.

yes we are not going to isolate them but calling this function on a
tail page is wrong in principle, no? PageLRU check is racy outside of
lru_lock so what if we are racing here. I know, highly unlikely but not
impossible. So I am not really sure this is an improvement. When would
we hit this VM_BUG_ON and it wouldn't be a bug or at least suspicious
usage?
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/2] mm: fix bogus VM_BUG_ON_PAGE() in isolate_lru_page()
  2016-02-01 14:24   ` Michal Hocko
@ 2016-02-01 14:38     ` Kirill A. Shutemov
  2016-02-02 13:21       ` Michal Hocko
  0 siblings, 1 reply; 7+ messages in thread
From: Kirill A. Shutemov @ 2016-02-01 14:38 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Kirill A. Shutemov, Andrew Morton, Dmitry Vyukov,
	Vlastimil Babka, David Rientjes, Naoya Horiguchi, linux-mm,
	linux-kernel

On Mon, Feb 01, 2016 at 03:24:46PM +0100, Michal Hocko wrote:
> On Mon 01-02-16 16:26:08, Kirill A. Shutemov wrote:
> > We don't care if there's a tail pages which is not on LRU. We are not
> > going to isolate them anyway.
> 
> yes we are not going to isolate them but calling this function on a
> tail page is wrong in principle, no? PageLRU check is racy outside of
> lru_lock so what if we are racing here. I know, highly unlikely but not
> impossible. So I am not really sure this is an improvement. When would
> we hit this VM_BUG_ON and it wouldn't be a bug or at least suspicious
> usage?

Yes, there is no point in calling isolate_lru_page() for tail pages, but
we do this anyway -- see the second patch.

And we need to validate all drivers, that they don't forget to set VM_IO
or make vma_migratable() return false in other way.

Alternative approach would be to downgrate the VM_BUG_ON_PAGE() to
WARN_ONCE_ON(). This way we would have chance to catch bad callers.

-- 
 Kirill A. Shutemov

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 2/2] mempolicy: do not try to queue pages from !vma_migratable()
  2016-02-01 13:26 ` [PATCH 2/2] mempolicy: do not try to queue pages from !vma_migratable() Kirill A. Shutemov
@ 2016-02-01 22:28   ` Andrew Morton
  0 siblings, 0 replies; 7+ messages in thread
From: Andrew Morton @ 2016-02-01 22:28 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Dmitry Vyukov, Vlastimil Babka, David Rientjes, Naoya Horiguchi,
	Michal Hocko, linux-mm, linux-kernel

On Mon,  1 Feb 2016 16:26:09 +0300 "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> wrote:

> Maybe I miss some point, but I don't see a reason why we try to queue
> pages from non migratable VMAs.
> 
> The only case when we can queue pages from such VMA is MPOL_MF_STRICT
> plus MPOL_MF_MOVE or MPOL_MF_MOVE_ALL for VMA which has pages on LRU,
> but gfp mask is not sutable for migaration (see mapping_gfp_mask() check
> in vma_migratable()). That's looks like a bug to me.
> 
> Let's filter out non-migratable vma at start of queue_pages_test_walk()
> and go to queue_pages_pte_range() only if MPOL_MF_MOVE or
> MPOL_MF_MOVE_ALL flag is set.

Conflicts with
http://ozlabs.org/~akpm/mmots/broken-out/mm-mempolicy-skip-vm_hugetlb-and-vm_mixedmap-vma-for-lazy-mbind.patch.
I resolved it thusly, please review:

--- a/mm/mempolicy.c~mempolicy-do-not-try-to-queue-pages-from-vma_migratable
+++ a/mm/mempolicy.c
@@ -548,8 +548,7 @@ retry:
 			goto retry;
 		}
 
-		if (flags & (MPOL_MF_MOVE | MPOL_MF_MOVE_ALL))
-			migrate_page_add(page, qp->pagelist, flags);
+		migrate_page_add(page, qp->pagelist, flags);
 	}
 	pte_unmap_unlock(pte - 1, ptl);
 	cond_resched();
@@ -625,7 +624,7 @@ static int queue_pages_test_walk(unsigne
 	unsigned long endvma = vma->vm_end;
 	unsigned long flags = qp->flags;
 
-	if (vma->vm_flags & VM_PFNMAP)
+	if (!vma_migratable(vma))
 		return 1;
 
 	if (endvma > end)
@@ -644,17 +643,15 @@ static int queue_pages_test_walk(unsigne
 
 	if (flags & MPOL_MF_LAZY) {
 		/* Similar to task_numa_work, skip inaccessible VMAs */
-		if (vma_migratable(vma) && !is_vm_hugetlb_page(vma) &&
+		if (!is_vm_hugetlb_page(vma) &&
 			(vma->vm_flags & (VM_READ | VM_EXEC | VM_WRITE)) &&
 			!(vma->vm_flags & VM_MIXEDMAP))
 			change_prot_numa(vma, start, endvma);
 		return 1;
 	}
 
-	if ((flags & MPOL_MF_STRICT) ||
-	    ((flags & (MPOL_MF_MOVE | MPOL_MF_MOVE_ALL)) &&
-	     vma_migratable(vma)))
-		/* queue pages from current vma */
+	/* queue pages from current vma */
+	if (flags & (MPOL_MF_MOVE | MPOL_MF_MOVE_ALL))
 		return 0;
 	return 1;
 }
_

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/2] mm: fix bogus VM_BUG_ON_PAGE() in isolate_lru_page()
  2016-02-01 14:38     ` Kirill A. Shutemov
@ 2016-02-02 13:21       ` Michal Hocko
  0 siblings, 0 replies; 7+ messages in thread
From: Michal Hocko @ 2016-02-02 13:21 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Kirill A. Shutemov, Andrew Morton, Dmitry Vyukov,
	Vlastimil Babka, David Rientjes, Naoya Horiguchi, linux-mm,
	linux-kernel

On Mon 01-02-16 16:38:53, Kirill A. Shutemov wrote:
> On Mon, Feb 01, 2016 at 03:24:46PM +0100, Michal Hocko wrote:
> > On Mon 01-02-16 16:26:08, Kirill A. Shutemov wrote:
> > > We don't care if there's a tail pages which is not on LRU. We are not
> > > going to isolate them anyway.
> > 
> > yes we are not going to isolate them but calling this function on a
> > tail page is wrong in principle, no? PageLRU check is racy outside of
> > lru_lock so what if we are racing here. I know, highly unlikely but not
> > impossible. So I am not really sure this is an improvement. When would
> > we hit this VM_BUG_ON and it wouldn't be a bug or at least suspicious
> > usage?
> 
> Yes, there is no point in calling isolate_lru_page() for tail pages, but
> we do this anyway -- see the second patch.

yes, I have seen it and that is a bug as well AFAIU. So the VM_BUG_ON
triggered for the real bug.

> And we need to validate all drivers, that they don't forget to set VM_IO
> or make vma_migratable() return false in other way.

Yes, some drivers will do it incorrectly but this is VM_BUG_ON so it is
usually disabled no?

> Alternative approach would be to downgrate the VM_BUG_ON_PAGE() to
> WARN_ONCE_ON(). This way we would have chance to catch bad callers.

a ratelimitted WARN_ON would work as well.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2016-02-02 13:21 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-02-01 13:26 [PATCH 0/2] Fix another VM_BUG_ON_PAGE(PageTail(page)) on mbind(2) Kirill A. Shutemov
2016-02-01 13:26 ` [PATCH 1/2] mm: fix bogus VM_BUG_ON_PAGE() in isolate_lru_page() Kirill A. Shutemov
2016-02-01 14:24   ` Michal Hocko
2016-02-01 14:38     ` Kirill A. Shutemov
2016-02-02 13:21       ` Michal Hocko
2016-02-01 13:26 ` [PATCH 2/2] mempolicy: do not try to queue pages from !vma_migratable() Kirill A. Shutemov
2016-02-01 22:28   ` Andrew Morton

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).