From: Joao Martins <joao.m.martins@oracle.com>
To: linux-nvdimm@lists.01.org
Cc: Alex Williamson <alex.williamson@redhat.com>, Cornelia Huck <cohuck@redhat.com>, kvm@vger.kernel.org, Andrew Morton <akpm@linux-foundation.org>, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Thomas Gleixner <tglx@linutronix.de>, Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>, "H. Peter Anvin" <hpa@zytor.com>, x86@kernel.org, Liran Alon <liran.alon@oracle.com>, Nikita Leshenko <nikita.leshchenko@oracle.com>, Barret Rhoden <brho@google.com>, Boris Ostrovsky <boris.ostrovsky@oracle.com>, Matthew Wilcox <willy@infradead.org>, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Subject: [PATCH RFC 03/10] mm: Add pud support for _PAGE_SPECIAL
Date: Fri, 10 Jan 2020 19:03:06 +0000
Message-ID: <20200110190313.17144-4-joao.m.martins@oracle.com> (raw)
In-Reply-To: <20200110190313.17144-1-joao.m.martins@oracle.com>

Currently vmf_insert_pfn_pud() only works with devmap entries and hits a
BUG_ON() otherwise. Add support for special pages, i.e. for a pfn_t marked
with PFN_SPECIAL. Pages of this kind are not expected to be used with GUP,
so return no pages from gup_huge_pud(), much like gup_pte_range() does for
ptes and gup_huge_pmd() does for pmds. This allows device-dax to handle 1G
hugepages without struct pages.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
---
 arch/x86/include/asm/pgtable.h | 18 +++++++++++++++++-
 mm/gup.c                       |  3 +++
 mm/huge_memory.c               |  8 +++++---
 mm/memory.c                    |  3 ++-
 4 files changed, 27 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 60351c0c15fe..2027c063fa16 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -261,7 +261,7 @@ static inline int pmd_trans_huge(pmd_t pmd)
 #ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
 static inline int pud_trans_huge(pud_t pud)
 {
-	return (pud_val(pud) & (_PAGE_PSE|_PAGE_DEVMAP)) == _PAGE_PSE;
+	return (pud_val(pud) & (_PAGE_PSE|_PAGE_DEVMAP|_PAGE_SPECIAL)) == _PAGE_PSE;
 }
 #endif

@@ -300,6 +300,17 @@ static inline int pmd_special(pmd_t pmd)
 {
 	return !!(pmd_flags(pmd) & _PAGE_SPECIAL);
 }
+
+#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
+static inline int pud_special(pud_t pud)
+{
+	return !!(pud_flags(pud) & _PAGE_SPECIAL);
+}
+#else
+static inline int pud_special(pud_t pud)
+{
+	return 0;
+}
 #endif
 #endif

@@ -487,6 +498,11 @@ static inline pud_t pud_mkhuge(pud_t pud)
 	return pud_set_flags(pud, _PAGE_PSE);
 }

+static inline pud_t pud_mkspecial(pud_t pud)
+{
+	return pud_set_flags(pud, _PAGE_SPECIAL);
+}
+
 static inline pud_t pud_mkyoung(pud_t pud)
 {
 	return pud_set_flags(pud, _PAGE_ACCESSED);
diff --git a/mm/gup.c b/mm/gup.c
index ba5f10535392..ae4abe5878ad 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2123,6 +2123,9 @@ static int gup_huge_pud(pud_t orig, pud_t *pudp, unsigned long addr,
 		return __gup_device_huge_pud(orig, pudp, addr, end, pages, nr);
 	}

+	if (pud_special(orig))
+		return 0;
+
 	refs = 0;
 	page = pud_page(orig) + ((addr & ~PUD_MASK) >> PAGE_SHIFT);
 	do {
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 06ad4d6f7477..cff707163bc1 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -879,6 +879,8 @@ static void insert_pfn_pud(struct vm_area_struct *vma, unsigned long addr,
 	entry = pud_mkhuge(pfn_t_pud(pfn, prot));
 	if (pfn_t_devmap(pfn))
 		entry = pud_mkdevmap(entry);
+	else if (pfn_t_special(pfn))
+		entry = pud_mkspecial(entry);
 	if (write) {
 		entry = pud_mkyoung(pud_mkdirty(entry));
 		entry = maybe_pud_mkwrite(entry, vma);
@@ -901,8 +903,7 @@ vm_fault_t vmf_insert_pfn_pud(struct vm_fault *vmf, pfn_t pfn, bool write)
 	 * but we need to be consistent with PTEs and architectures that
 	 * can't support a 'special' bit.
 	 */
-	BUG_ON(!(vma->vm_flags & (VM_PFNMAP|VM_MIXEDMAP)) &&
-			!pfn_t_devmap(pfn));
+	BUG_ON(!(vma->vm_flags & (VM_PFNMAP|VM_MIXEDMAP)));
 	BUG_ON((vma->vm_flags & (VM_PFNMAP|VM_MIXEDMAP)) ==
 						(VM_PFNMAP|VM_MIXEDMAP));
 	BUG_ON((vma->vm_flags & VM_PFNMAP) && is_cow_mapping(vma->vm_flags));
@@ -2031,7 +2032,8 @@ spinlock_t *__pud_trans_huge_lock(pud_t *pud, struct vm_area_struct *vma)
 	spinlock_t *ptl;

 	ptl = pud_lock(vma->vm_mm, pud);
-	if (likely(pud_trans_huge(*pud) || pud_devmap(*pud)))
+	if (likely(pud_trans_huge(*pud) || pud_devmap(*pud)) ||
+	    pud_special(*pud))
 		return ptl;
 	spin_unlock(ptl);
 	return NULL;
diff --git a/mm/memory.c b/mm/memory.c
index db99684d2cb3..109643219e1b 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1201,7 +1201,8 @@ static inline unsigned long zap_pud_range(struct mmu_gather *tlb,
 	pud = pud_offset(p4d, addr);
 	do {
 		next = pud_addr_end(addr, end);
-		if (pud_trans_huge(*pud) || pud_devmap(*pud)) {
+		if (pud_trans_huge(*pud) || pud_devmap(*pud) ||
+		    pud_special(*pud)) {
 			if (next - addr != HPAGE_PUD_SIZE) {
 				VM_BUG_ON_VMA(!rwsem_is_locked(&tlb->mm->mmap_sem), vma);
 				split_huge_pud(vma, pud, addr);
-- 
2.17.1
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org