From: Joao Martins <joao.m.martins@oracle.com> To: linux-nvdimm@lists.01.org Cc: Alex Williamson <alex.williamson@redhat.com>, Cornelia Huck <cohuck@redhat.com>, kvm@vger.kernel.org, Andrew Morton <akpm@linux-foundation.org>, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Thomas Gleixner <tglx@linutronix.de>, Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>, "H . Peter Anvin" <hpa@zytor.com>, x86@kernel.org, Liran Alon <liran.alon@oracle.com>, Nikita Leshenko <nikita.leshchenko@oracle.com>, Barret Rhoden <brho@google.com>, Boris Ostrovsky <boris.ostrovsky@oracle.com>, Matthew Wilcox <willy@infradead.org>, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Subject: [PATCH RFC 01/10] mm: Add pmd support for _PAGE_SPECIAL Date: Fri, 10 Jan 2020 19:03:04 +0000 [thread overview] Message-ID: <20200110190313.17144-2-joao.m.martins@oracle.com> (raw) In-Reply-To: <20200110190313.17144-1-joao.m.martins@oracle.com> Currently vmf_insert_pfn_pmd only works with devmap and BUG_ON otherwise. Add support for handling page special when pfn_t has it marked with PFN_SPECIAL. Usage of page special aren't expected to do GUP hence return no pages on gup_huge_pmd() much like how it is done for ptes on gup_pte_range(). This allows a DAX driver to handle 2M hugepages without struct pages. Signed-off-by: Joao Martins <joao.m.martins@oracle.com> --- arch/x86/include/asm/pgtable.h | 16 +++++++++++++++- mm/gup.c | 3 +++ mm/huge_memory.c | 7 ++++--- mm/memory.c | 3 ++- 4 files changed, 24 insertions(+), 5 deletions(-) diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h index ad97dc155195..60351c0c15fe 100644 --- a/arch/x86/include/asm/pgtable.h +++ b/arch/x86/include/asm/pgtable.h @@ -255,7 +255,7 @@ static inline int pmd_large(pmd_t pte) #ifdef CONFIG_TRANSPARENT_HUGEPAGE static inline int pmd_trans_huge(pmd_t pmd) { - return (pmd_val(pmd) & (_PAGE_PSE|_PAGE_DEVMAP)) == _PAGE_PSE; + return (pmd_val(pmd) & (_PAGE_PSE|_PAGE_DEVMAP|_PAGE_SPECIAL)) == _PAGE_PSE; } #ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD @@ -293,6 +293,15 @@ static inline int pgd_devmap(pgd_t pgd) { return 0; } +#endif + +#ifdef CONFIG_ARCH_HAS_PTE_SPECIAL +static inline int pmd_special(pmd_t pmd) +{ + return !!(pmd_flags(pmd) & _PAGE_SPECIAL); +} +#endif + #endif #endif /* CONFIG_TRANSPARENT_HUGEPAGE */ @@ -414,6 +423,11 @@ static inline pmd_t pmd_mkdevmap(pmd_t pmd) return pmd_set_flags(pmd, _PAGE_DEVMAP); } +static inline pmd_t pmd_mkspecial(pmd_t pmd) +{ + return pmd_set_flags(pmd, _PAGE_SPECIAL); +} + static inline pmd_t pmd_mkhuge(pmd_t pmd) { return pmd_set_flags(pmd, _PAGE_PSE); diff --git a/mm/gup.c b/mm/gup.c index 7646bf993b25..ba5f10535392 100644 --- a/mm/gup.c +++ b/mm/gup.c @@ -2079,6 +2079,9 @@ static int gup_huge_pmd(pmd_t orig, pmd_t *pmdp, unsigned long addr, return __gup_device_huge_pmd(orig, pmdp, addr, end, pages, nr); } + if (pmd_special(orig)) + return 0; + refs = 0; page = pmd_page(orig) + ((addr & ~PMD_MASK) >> PAGE_SHIFT); do { diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 41a0fbddc96b..06ad4d6f7477 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -791,6 +791,8 @@ static void insert_pfn_pmd(struct vm_area_struct *vma, unsigned long addr, entry = pmd_mkhuge(pfn_t_pmd(pfn, prot)); if (pfn_t_devmap(pfn)) entry = pmd_mkdevmap(entry); + else if (pfn_t_special(pfn)) + entry = pmd_mkspecial(entry); if (write) { entry = pmd_mkyoung(pmd_mkdirty(entry)); entry = maybe_pmd_mkwrite(entry, vma); @@ -823,8 +825,7 @@ vm_fault_t vmf_insert_pfn_pmd(struct vm_fault *vmf, pfn_t pfn, bool write) * but we need to be consistent with PTEs and architectures that * can't support a 'special' bit. */ - BUG_ON(!(vma->vm_flags & (VM_PFNMAP|VM_MIXEDMAP)) && - !pfn_t_devmap(pfn)); + BUG_ON(!(vma->vm_flags & (VM_PFNMAP|VM_MIXEDMAP))); BUG_ON((vma->vm_flags & (VM_PFNMAP|VM_MIXEDMAP)) == (VM_PFNMAP|VM_MIXEDMAP)); BUG_ON((vma->vm_flags & VM_PFNMAP) && is_cow_mapping(vma->vm_flags)); @@ -2013,7 +2014,7 @@ spinlock_t *__pmd_trans_huge_lock(pmd_t *pmd, struct vm_area_struct *vma) spinlock_t *ptl; ptl = pmd_lock(vma->vm_mm, pmd); if (likely(is_swap_pmd(*pmd) || pmd_trans_huge(*pmd) || - pmd_devmap(*pmd))) + pmd_devmap(*pmd) || pmd_special(*pmd))) return ptl; spin_unlock(ptl); return NULL; diff --git a/mm/memory.c b/mm/memory.c index 45442d9a4f52..cfc3668bddeb 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -1165,7 +1165,8 @@ static inline unsigned long zap_pmd_range(struct mmu_gather *tlb, pmd = pmd_offset(pud, addr); do { next = pmd_addr_end(addr, end); - if (is_swap_pmd(*pmd) || pmd_trans_huge(*pmd) || pmd_devmap(*pmd)) { + if (is_swap_pmd(*pmd) || pmd_trans_huge(*pmd) || + pmd_devmap(*pmd) || pmd_special(*pmd)) { if (next - addr != HPAGE_PMD_SIZE) __split_huge_pmd(vma, pmd, addr, false, NULL); else if (zap_huge_pmd(tlb, vma, pmd, addr)) -- 2.17.1 _______________________________________________ Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org To unsubscribe send an email to linux-nvdimm-leave@lists.01.org
WARNING: multiple messages have this Message-ID (diff)
From: Joao Martins <joao.m.martins@oracle.com> To: linux-nvdimm@lists.01.org Cc: Dan Williams <dan.j.williams@intel.com>, Vishal Verma <vishal.l.verma@intel.com>, Dave Jiang <dave.jiang@intel.com>, Ira Weiny <ira.weiny@intel.com>, Alex Williamson <alex.williamson@redhat.com>, Cornelia Huck <cohuck@redhat.com>, kvm@vger.kernel.org, Andrew Morton <akpm@linux-foundation.org>, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Thomas Gleixner <tglx@linutronix.de>, Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>, "H . Peter Anvin" <hpa@zytor.com>, x86@kernel.org, Liran Alon <liran.alon@oracle.com>, Nikita Leshenko <nikita.leshchenko@oracle.com>, Barret Rhoden <brho@google.com>, Boris Ostrovsky <boris.ostrovsky@oracle.com>, Matthew Wilcox <willy@infradead.org>, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Subject: [PATCH RFC 01/10] mm: Add pmd support for _PAGE_SPECIAL Date: Fri, 10 Jan 2020 19:03:04 +0000 [thread overview] Message-ID: <20200110190313.17144-2-joao.m.martins@oracle.com> (raw) In-Reply-To: <20200110190313.17144-1-joao.m.martins@oracle.com> Currently vmf_insert_pfn_pmd only works with devmap and BUG_ON otherwise. Add support for handling page special when pfn_t has it marked with PFN_SPECIAL. Usage of page special aren't expected to do GUP hence return no pages on gup_huge_pmd() much like how it is done for ptes on gup_pte_range(). This allows a DAX driver to handle 2M hugepages without struct pages. Signed-off-by: Joao Martins <joao.m.martins@oracle.com> --- arch/x86/include/asm/pgtable.h | 16 +++++++++++++++- mm/gup.c | 3 +++ mm/huge_memory.c | 7 ++++--- mm/memory.c | 3 ++- 4 files changed, 24 insertions(+), 5 deletions(-) diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h index ad97dc155195..60351c0c15fe 100644 --- a/arch/x86/include/asm/pgtable.h +++ b/arch/x86/include/asm/pgtable.h @@ -255,7 +255,7 @@ static inline int pmd_large(pmd_t pte) #ifdef CONFIG_TRANSPARENT_HUGEPAGE static inline int pmd_trans_huge(pmd_t pmd) { - return (pmd_val(pmd) & (_PAGE_PSE|_PAGE_DEVMAP)) == _PAGE_PSE; + return (pmd_val(pmd) & (_PAGE_PSE|_PAGE_DEVMAP|_PAGE_SPECIAL)) == _PAGE_PSE; } #ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD @@ -293,6 +293,15 @@ static inline int pgd_devmap(pgd_t pgd) { return 0; } +#endif + +#ifdef CONFIG_ARCH_HAS_PTE_SPECIAL +static inline int pmd_special(pmd_t pmd) +{ + return !!(pmd_flags(pmd) & _PAGE_SPECIAL); +} +#endif + #endif #endif /* CONFIG_TRANSPARENT_HUGEPAGE */ @@ -414,6 +423,11 @@ static inline pmd_t pmd_mkdevmap(pmd_t pmd) return pmd_set_flags(pmd, _PAGE_DEVMAP); } +static inline pmd_t pmd_mkspecial(pmd_t pmd) +{ + return pmd_set_flags(pmd, _PAGE_SPECIAL); +} + static inline pmd_t pmd_mkhuge(pmd_t pmd) { return pmd_set_flags(pmd, _PAGE_PSE); diff --git a/mm/gup.c b/mm/gup.c index 7646bf993b25..ba5f10535392 100644 --- a/mm/gup.c +++ b/mm/gup.c @@ -2079,6 +2079,9 @@ static int gup_huge_pmd(pmd_t orig, pmd_t *pmdp, unsigned long addr, return __gup_device_huge_pmd(orig, pmdp, addr, end, pages, nr); } + if (pmd_special(orig)) + return 0; + refs = 0; page = pmd_page(orig) + ((addr & ~PMD_MASK) >> PAGE_SHIFT); do { diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 41a0fbddc96b..06ad4d6f7477 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -791,6 +791,8 @@ static void insert_pfn_pmd(struct vm_area_struct *vma, unsigned long addr, entry = pmd_mkhuge(pfn_t_pmd(pfn, prot)); if (pfn_t_devmap(pfn)) entry = pmd_mkdevmap(entry); + else if (pfn_t_special(pfn)) + entry = pmd_mkspecial(entry); if (write) { entry = pmd_mkyoung(pmd_mkdirty(entry)); entry = maybe_pmd_mkwrite(entry, vma); @@ -823,8 +825,7 @@ vm_fault_t vmf_insert_pfn_pmd(struct vm_fault *vmf, pfn_t pfn, bool write) * but we need to be consistent with PTEs and architectures that * can't support a 'special' bit. */ - BUG_ON(!(vma->vm_flags & (VM_PFNMAP|VM_MIXEDMAP)) && - !pfn_t_devmap(pfn)); + BUG_ON(!(vma->vm_flags & (VM_PFNMAP|VM_MIXEDMAP))); BUG_ON((vma->vm_flags & (VM_PFNMAP|VM_MIXEDMAP)) == (VM_PFNMAP|VM_MIXEDMAP)); BUG_ON((vma->vm_flags & VM_PFNMAP) && is_cow_mapping(vma->vm_flags)); @@ -2013,7 +2014,7 @@ spinlock_t *__pmd_trans_huge_lock(pmd_t *pmd, struct vm_area_struct *vma) spinlock_t *ptl; ptl = pmd_lock(vma->vm_mm, pmd); if (likely(is_swap_pmd(*pmd) || pmd_trans_huge(*pmd) || - pmd_devmap(*pmd))) + pmd_devmap(*pmd) || pmd_special(*pmd))) return ptl; spin_unlock(ptl); return NULL; diff --git a/mm/memory.c b/mm/memory.c index 45442d9a4f52..cfc3668bddeb 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -1165,7 +1165,8 @@ static inline unsigned long zap_pmd_range(struct mmu_gather *tlb, pmd = pmd_offset(pud, addr); do { next = pmd_addr_end(addr, end); - if (is_swap_pmd(*pmd) || pmd_trans_huge(*pmd) || pmd_devmap(*pmd)) { + if (is_swap_pmd(*pmd) || pmd_trans_huge(*pmd) || + pmd_devmap(*pmd) || pmd_special(*pmd)) { if (next - addr != HPAGE_PMD_SIZE) __split_huge_pmd(vma, pmd, addr, false, NULL); else if (zap_huge_pmd(tlb, vma, pmd, addr)) -- 2.17.1
next prev parent reply other threads:[~2020-01-10 19:06 UTC|newest] Thread overview: 53+ messages / expand[flat|nested] mbox.gz Atom feed top 2020-01-10 19:03 [PATCH RFC 00/10] device-dax: Support devices without PFN metadata Joao Martins 2020-01-10 19:03 ` Joao Martins 2020-01-10 19:03 ` Joao Martins [this message] 2020-01-10 19:03 ` [PATCH RFC 01/10] mm: Add pmd support for _PAGE_SPECIAL Joao Martins 2020-02-03 21:34 ` Matthew Wilcox 2020-02-03 21:34 ` Matthew Wilcox 2020-02-04 16:14 ` Joao Martins 2020-02-04 16:14 ` Joao Martins 2020-01-10 19:03 ` [PATCH RFC 02/10] mm: Handle pmd entries in follow_pfn() Joao Martins 2020-01-10 19:03 ` Joao Martins 2020-02-03 21:37 ` Matthew Wilcox 2020-02-03 21:37 ` Matthew Wilcox 2020-02-04 16:17 ` Joao Martins 2020-02-04 16:17 ` Joao Martins 2020-01-10 19:03 ` [PATCH RFC 03/10] mm: Add pud support for _PAGE_SPECIAL Joao Martins 2020-01-10 19:03 ` Joao Martins 2020-01-10 19:03 ` [PATCH RFC 04/10] mm: Handle pud entries in follow_pfn() Joao Martins 2020-01-10 19:03 ` Joao Martins 2020-01-10 19:03 ` [PATCH RFC 05/10] device-dax: Do not enforce MADV_DONTFORK on mmap() Joao Martins 2020-01-10 19:03 ` Joao Martins 2020-01-10 19:03 ` [PATCH RFC 06/10] device-dax: Introduce pfn_flags helper Joao Martins 2020-01-10 19:03 ` Joao Martins 2020-01-10 19:03 ` [PATCH RFC 07/10] device-dax: Add support for PFN_SPECIAL flags Joao Martins 2020-01-10 19:03 ` Joao Martins 2020-01-10 19:03 ` [PATCH RFC 08/10] dax/pmem: Add device-dax support for PFN_MODE_NONE Joao Martins 2020-01-10 19:03 ` Joao Martins 2020-01-10 19:03 ` [PATCH RFC 09/10] vfio/type1: Use follow_pfn for VM_FPNMAP VMAs Joao Martins 2020-01-10 19:03 ` Joao Martins 2020-02-07 21:08 ` Jason Gunthorpe 2020-02-11 16:23 ` Joao Martins 2020-02-11 16:23 ` Joao Martins 2020-02-11 16:50 ` Jason Gunthorpe 2020-01-10 19:03 ` [PATCH RFC 10/10] nvdimm/e820: add multiple namespaces support Joao Martins 2020-01-10 19:03 ` Joao Martins 2020-02-04 15:28 ` Barret Rhoden 2020-02-04 15:28 ` Barret Rhoden 2020-02-04 16:44 ` Dan Williams 2020-02-04 16:44 ` Dan Williams 2020-02-04 16:44 ` Dan Williams 2020-02-04 18:20 ` Barret Rhoden 2020-02-04 18:20 ` Barret Rhoden 2020-02-04 19:24 ` Joao Martins 2020-02-04 19:24 ` Joao Martins 2020-02-04 21:43 ` Dan Williams 2020-02-04 21:43 ` Dan Williams 2020-02-04 21:43 ` Dan Williams 2020-02-04 21:57 ` Barret Rhoden 2020-02-04 21:57 ` Barret Rhoden 2020-02-04 1:24 ` [PATCH RFC 00/10] device-dax: Support devices without PFN metadata Dan Williams 2020-02-04 1:24 ` Dan Williams 2020-02-04 1:24 ` Dan Williams 2020-02-04 19:07 ` Joao Martins 2020-02-04 19:07 ` Joao Martins
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=20200110190313.17144-2-joao.m.martins@oracle.com \ --to=joao.m.martins@oracle.com \ --cc=akpm@linux-foundation.org \ --cc=alex.williamson@redhat.com \ --cc=boris.ostrovsky@oracle.com \ --cc=bp@alien8.de \ --cc=brho@google.com \ --cc=cohuck@redhat.com \ --cc=hpa@zytor.com \ --cc=konrad.wilk@oracle.com \ --cc=kvm@vger.kernel.org \ --cc=linux-kernel@vger.kernel.org \ --cc=linux-mm@kvack.org \ --cc=linux-nvdimm@lists.01.org \ --cc=liran.alon@oracle.com \ --cc=mingo@redhat.com \ --cc=nikita.leshchenko@oracle.com \ --cc=tglx@linutronix.de \ --cc=willy@infradead.org \ --cc=x86@kernel.org \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.