* [PATCH] sched,numa,mm: revert to checking pmd/pte_write instead of VMA flags
From: Rik van Riel @ 2016-09-09  1:30 UTC
  To: linux-kernel; +Cc: linux-mm, torvalds, mgorman, peterz, mingo, aarcange

Commit 4d9424669946 ("mm: convert p[te|md]_mknonnuma and remaining
page table manipulations") changed NUMA balancing from _PAGE_NUMA
to using PROT_NONE, and was quickly found to introduce a regression
with NUMA grouping.

It was followed up by these changesets:

53da3bc2ba9e ("mm: fix up numa read-only thread grouping logic")
bea66fbd11af ("mm: numa: group related processes based on VMA flags instead of page table flags")
b191f9b106ea ("mm: numa: preserve PTE write permissions across a NUMA hinting fault")

The first two of those changesets try alternate approaches to NUMA
grouping, which apparently do not work as well as looking at the PTE
write permissions.

The last of those patches preserves the PTE write permissions across a NUMA
protection fault. However, it forgets to revert the condition for
whether or not to group tasks together back to what it was before
3.19, even though the information is now preserved in the page tables
once again.

This patch brings the NUMA grouping heuristic back to what it was
before changeset 4d9424669946, which the changelogs of subsequent
changesets suggest worked best.

We have all the information again. We should probably use it.

Signed-off-by: Rik van Riel <riel@redhat.com>
---
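Note (not part of the changelog): the difference between the two checks
can be modeled in userspace. This is only a minimal sketch under stated
assumptions. The model_vma and model_pte types, the MODEL_* constants and
both helper functions are hypothetical stand-ins invented for
illustration; they are not the kernel's definitions. The case of a
read-only PTE inside a writable VMA (for example a page that has not
been written through this mapping yet) is my illustration of where the
two checks disagree.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; values are arbitrary. */
#define MODEL_VM_WRITE     0x2u
#define MODEL_TNF_NO_GROUP 0x2u

struct model_vma { unsigned long vm_flags; };
struct model_pte { bool writable; };

/* Check removed by the patch: decide from the mapping-wide VMA flags. */
static unsigned int tnf_from_vma_flags(const struct model_vma *vma)
{
	return (vma->vm_flags & MODEL_VM_WRITE) ? 0 : MODEL_TNF_NO_GROUP;
}

/* Check restored by the patch: decide from the per-page write bit. */
static unsigned int tnf_from_pte_write(struct model_pte pte)
{
	return pte.writable ? 0 : MODEL_TNF_NO_GROUP;
}

/* Writable VMA, read-only PTE: only the per-page check skips grouping. */
static void model_example(void)
{
	struct model_vma vma = { .vm_flags = MODEL_VM_WRITE };
	struct model_pte pte = { .writable = false };

	assert(tnf_from_vma_flags(&vma) == 0);
	assert(tnf_from_pte_write(pte) == MODEL_TNF_NO_GROUP);
}
```

In this model the VMA-based check would still allow grouping on a fault
against a page that is mapped read-only, while the PTE-based check sets
the no-group flag for it.
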
 mm/huge_memory.c | 2 +-
 mm/memory.c      | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 2db2112aa31e..c8bde270f557 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1168,7 +1168,7 @@ int do_huge_pmd_numa_page(struct fault_env *fe, pmd_t pmd)
 	}
 
 	/* See similar comment in do_numa_page for explanation */
-	if (!(vma->vm_flags & VM_WRITE))
+	if (!pmd_write(pmd))
 		flags |= TNF_NO_GROUP;
 
 	/*
diff --git a/mm/memory.c b/mm/memory.c
index 83be99d9d8a1..558c85270ae2 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3398,7 +3398,7 @@ static int do_numa_page(struct fault_env *fe, pte_t pte)
 	 * pte_dirty has unpredictable behaviour between PTE scan updates,
 	 * background writeback, dirty balancing and application behaviour.
 	 */
-	if (!(vma->vm_flags & VM_WRITE))
+	if (!pte_write(pte))
 		flags |= TNF_NO_GROUP;
 
 	/*
