From: Mel Gorman <mgorman@suse.de>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Rik van Riel <riel@redhat.com>,
	LinuxPPC-dev <linuxppc-dev@lists.ozlabs.org>,
	Hugh Dickins <hughd@google.com>,
	Linux Kernel <linux-kernel@vger.kernel.org>,
	Linux-MM <linux-mm@kvack.org>, Ingo Molnar <mingo@redhat.com>,
	Paul Mackerras <paulus@samba.org>,
	Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com>,
	Sasha Levin <sasha.levin@oracle.com>,
	Dave Jones <davej@redhat.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Kirill Shutemov <kirill.shutemov@linux.intel.com>,
	Mel Gorman <mgorman@suse.de>
Subject: [PATCH 07/10] mm: numa: Do not trap faults on the huge zero page
Date: Thu,  4 Dec 2014 11:24:30 +0000
Message-ID: <1417692273-27170-8-git-send-email-mgorman@suse.de>
In-Reply-To: <1417692273-27170-1-git-send-email-mgorman@suse.de>

NUMA hinting faults on the huge zero page are pointless, and there is a
BUG_ON to catch them at fault time. This patch reintroduces a check
that avoids marking the zero page PAGE_NONE.

Signed-off-by: Mel Gorman <mgorman@suse.de>
---
 include/linux/huge_mm.h |  3 ++-
 mm/huge_memory.c        | 13 ++++++++++++-
 mm/memory.c             |  1 -
 mm/mprotect.c           | 15 ++++++++++++++-
 4 files changed, 28 insertions(+), 4 deletions(-)
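
For readers skimming the diff, the two new checks appear to reduce to one
rule: when the protection change comes from NUMA hinting (prot_numa is
set), entries that map the zero page, the huge zero page or a KSM page are
left alone. What follows is a minimal userspace sketch of that rule, not
kernel code. The model_* types and both function names are hypothetical
stand-ins; only the identifiers quoted in the comments come from the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins, not the kernel's struct page, pte_t or pmd_t. */
struct model_page {
	bool is_ksm;			/* models PageKsm(page) */
};

struct model_pte {
	bool present;			/* models pte_present(oldpte) */
	struct model_page *page;	/* models vm_normal_page(); NULL for the zero page */
};

struct model_pmd {
	bool is_huge_zero;		/* models is_huge_zero_pmd(*pmd) */
};

/*
 * Models the check added to change_pte_range(): returns true if the
 * entry would go on to receive the new protection bits.
 */
static bool pte_gets_new_prot(const struct model_pte *pte, bool prot_numa)
{
	/* Simplification: the model does nothing for non-present entries. */
	if (!pte->present)
		return false;

	/* NUMA hinting skips the zero page (no normal page) and KSM pages. */
	if (prot_numa && (!pte->page || pte->page->is_ksm))
		return false;

	return true;
}

/*
 * Models the check added to change_huge_pmd(): NUMA hinting leaves the
 * huge zero page untouched, any other caller proceeds as before.
 */
static bool huge_pmd_gets_new_prot(const struct model_pmd *pmd, bool prot_numa)
{
	return !(prot_numa && pmd->is_huge_zero);
}
```

In this model, as in the patch, a caller that passes prot_numa as 0 (an
ordinary mprotect() call, for example) still changes protections on all of
these pages. Only the NUMA hinting path skips them.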

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 554bbe3..ad9051b 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -31,7 +31,8 @@ extern int move_huge_pmd(struct vm_area_struct *vma,
 			 unsigned long new_addr, unsigned long old_end,
 			 pmd_t *old_pmd, pmd_t *new_pmd);
 extern int change_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,
-			unsigned long addr, pgprot_t newprot);
+			unsigned long addr, pgprot_t newprot,
+			int prot_numa);
 
 enum transparent_hugepage_flag {
 	TRANSPARENT_HUGEPAGE_FLAG,
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 5618e22..ad2a3ee 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1502,7 +1502,7 @@ out:
  *  - HPAGE_PMD_NR is protections changed and TLB flush necessary
  */
 int change_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,
-		unsigned long addr, pgprot_t newprot)
+		unsigned long addr, pgprot_t newprot, int prot_numa)
 {
 	struct mm_struct *mm = vma->vm_mm;
 	spinlock_t *ptl;
@@ -1510,6 +1510,17 @@ int change_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,
 
 	if (__pmd_trans_huge_lock(pmd, vma, &ptl) == 1) {
 		pmd_t entry;
+
+		/*
+		 * Avoid trapping faults against the zero page. The read-only
+		 * data is likely to be read-cached on the local CPU and
+		 * local/remote hits to the zero page are not interesting.
+		 */
+		if (prot_numa && is_huge_zero_pmd(*pmd)) {
+			spin_unlock(ptl);
+			return 0;
+		}
+
 		ret = 1;
 		entry = pmdp_get_and_clear_notify(mm, addr, pmd);
 		entry = pmd_modify(entry, newprot);
diff --git a/mm/memory.c b/mm/memory.c
index 2100e0f..2ec07a9 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3136,7 +3136,6 @@ static int do_numa_page(struct mm_struct *mm, struct vm_area_struct *vma,
 		pte_unmap_unlock(ptep, ptl);
 		return 0;
 	}
-	BUG_ON(is_zero_pfn(page_to_pfn(page)));
 
 	/*
 	 * Avoid grouping on DSO/COW pages in specific and RO pages
diff --git a/mm/mprotect.c b/mm/mprotect.c
index dc65c0f..33dfafb 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -75,6 +75,19 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
 		oldpte = *pte;
 		if (pte_present(oldpte)) {
 			pte_t ptent;
+
+			/*
+			 * Avoid trapping faults against the zero or KSM
+			 * pages. See similar comment in change_huge_pmd.
+			 */
+			if (prot_numa) {
+				struct page *page;
+
+				page = vm_normal_page(vma, addr, oldpte);
+				if (!page || PageKsm(page))
+					continue;
+			}
+
 			ptent = ptep_modify_prot_start(mm, addr, pte);
 			ptent = pte_modify(ptent, newprot);
 
@@ -141,7 +154,7 @@ static inline unsigned long change_pmd_range(struct vm_area_struct *vma,
 				split_huge_page_pmd(vma, addr, pmd);
 			else {
 				int nr_ptes = change_huge_pmd(vma, pmd, addr,
-						newprot);
+						newprot, prot_numa);
 
 				if (nr_ptes) {
 					if (nr_ptes == HPAGE_PMD_NR) {
-- 
2.1.2
