This patch series reaps some of the benefits of the vm_fault_t work that Souptick has been diligently plugging away at by converting insert_pfn() to return a vm_fault_t. Eventually, we'll be able to do the same thing to insert_page(), but there's more work to be done converting all current callers of vm_insert_page() to vmf_insert_page(), and this patch series provides a nice clean-up. Nicolas, I'd like your reviewed-by on patch 1 please. Matthew Wilcox (10): cramfs: Convert to use vmf_insert_mixed mm: Remove vm_insert_mixed mm: Introduce vmf_insert_pfn_prot x86: Convert vdso to use vm_fault_t mm: Make vm_insert_pfn_prot static mm: Remove references to vm_insert_pfn mm: Remove vm_insert_pfn mm: Inline vm_insert_pfn_prot into caller mm: Convert __vm_insert_mixed to vm_fault_t mm: Convert insert_pfn to vm_fault_t Documentation/x86/pat.txt | 4 +- arch/x86/entry/vdso/vma.c | 24 +++---- fs/cramfs/inode.c | 9 ++- include/asm-generic/pgtable.h | 4 +- include/linux/hmm.h | 2 +- include/linux/mm.h | 32 +-------- mm/memory.c | 122 +++++++++++++++++----------------- 7 files changed, 84 insertions(+), 113 deletions(-) -- 2.18.0
cramfs is the only remaining user of vm_insert_mixed; convert it. Signed-off-by: Matthew Wilcox <willy@infradead.org> --- fs/cramfs/inode.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/fs/cramfs/inode.c b/fs/cramfs/inode.c index f408994fc632..b72449c19cd1 100644 --- a/fs/cramfs/inode.c +++ b/fs/cramfs/inode.c @@ -417,10 +417,15 @@ static int cramfs_physmem_mmap(struct file *file, struct vm_area_struct *vma) */ int i; vma->vm_flags |= VM_MIXEDMAP; - for (i = 0; i < pages && !ret; i++) { + for (i = 0; i < pages; i++) { + vm_fault_t vmf; unsigned long off = i * PAGE_SIZE; pfn_t pfn = phys_to_pfn_t(address + off, PFN_DEV); - ret = vm_insert_mixed(vma, vma->vm_start + off, pfn); + vmf = vmf_insert_mixed(vma, vma->vm_start + off, pfn); + if (vmf & VM_FAULT_ERROR) { + pages = i; + break; + } } } -- 2.18.0
All callers are now converted to vmf_insert_mixed() so convert vmf_insert_mixed() from being a compatibility wrapper into the real function. Signed-off-by: Matthew Wilcox <willy@infradead.org> --- include/linux/mm.h | 15 +-------------- mm/memory.c | 14 ++++++++++---- 2 files changed, 11 insertions(+), 18 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index a61ebe8ad4ca..ce1b24376d72 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -2482,7 +2482,7 @@ int vm_insert_pfn(struct vm_area_struct *vma, unsigned long addr, unsigned long pfn); int vm_insert_pfn_prot(struct vm_area_struct *vma, unsigned long addr, unsigned long pfn, pgprot_t pgprot); -int vm_insert_mixed(struct vm_area_struct *vma, unsigned long addr, +vm_fault_t vmf_insert_mixed(struct vm_area_struct *vma, unsigned long addr, pfn_t pfn); vm_fault_t vmf_insert_mixed_mkwrite(struct vm_area_struct *vma, unsigned long addr, pfn_t pfn); @@ -2501,19 +2501,6 @@ static inline vm_fault_t vmf_insert_page(struct vm_area_struct *vma, return VM_FAULT_NOPAGE; } -static inline vm_fault_t vmf_insert_mixed(struct vm_area_struct *vma, - unsigned long addr, pfn_t pfn) -{ - int err = vm_insert_mixed(vma, addr, pfn); - - if (err == -ENOMEM) - return VM_FAULT_OOM; - if (err < 0 && err != -EBUSY) - return VM_FAULT_SIGBUS; - - return VM_FAULT_NOPAGE; -} - static inline vm_fault_t vmf_insert_pfn(struct vm_area_struct *vma, unsigned long addr, unsigned long pfn) { diff --git a/mm/memory.c b/mm/memory.c index c467102a5cbc..f0b123a94426 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -1940,13 +1940,19 @@ static int __vm_insert_mixed(struct vm_area_struct *vma, unsigned long addr, return insert_pfn(vma, addr, pfn, pgprot, mkwrite); } -int vm_insert_mixed(struct vm_area_struct *vma, unsigned long addr, - pfn_t pfn) +vm_fault_t vmf_insert_mixed(struct vm_area_struct *vma, unsigned long addr, + pfn_t pfn) { - return __vm_insert_mixed(vma, addr, pfn, false); + int err = __vm_insert_mixed(vma, addr, pfn, false); + if (err == -ENOMEM) + return VM_FAULT_OOM; + if (err < 0 && err != -EBUSY) + return VM_FAULT_SIGBUS; + + return VM_FAULT_NOPAGE; } -EXPORT_SYMBOL(vm_insert_mixed); +EXPORT_SYMBOL(vmf_insert_mixed); /* * If the insertion of PTE failed because someone else already added a -- 2.18.0
Like vm_insert_pfn_prot(), but returns a vm_fault_t instead of an errno. Also unexport vm_insert_pfn_prot as it has no modular users. Signed-off-by: Matthew Wilcox <willy@infradead.org> --- include/linux/mm.h | 2 ++ mm/memory.c | 47 ++++++++++++++++++++++++++++++---------------- 2 files changed, 33 insertions(+), 16 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index ce1b24376d72..e8bc1a16d44c 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -2482,6 +2482,8 @@ int vm_insert_pfn(struct vm_area_struct *vma, unsigned long addr, unsigned long pfn); int vm_insert_pfn_prot(struct vm_area_struct *vma, unsigned long addr, unsigned long pfn, pgprot_t pgprot); +vm_fault_t vmf_insert_pfn_prot(struct vm_area_struct *vma, unsigned long addr, + unsigned long pfn, pgprot_t pgprot); vm_fault_t vmf_insert_mixed(struct vm_area_struct *vma, unsigned long addr, pfn_t pfn); vm_fault_t vmf_insert_mixed_mkwrite(struct vm_area_struct *vma, diff --git a/mm/memory.c b/mm/memory.c index f0b123a94426..8c116c0f64d8 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -1843,21 +1843,6 @@ int vm_insert_pfn(struct vm_area_struct *vma, unsigned long addr, } EXPORT_SYMBOL(vm_insert_pfn); -/** - * vm_insert_pfn_prot - insert single pfn into user vma with specified pgprot - * @vma: user vma to map to - * @addr: target user address of this page - * @pfn: source kernel pfn - * @pgprot: pgprot flags for the inserted page - * - * This is exactly like vm_insert_pfn, except that it allows drivers to - * to override pgprot on a per-page basis. - * - * This only makes sense for IO mappings, and it makes no sense for - * cow mappings. In general, using multiple vmas is preferable; - * vm_insert_pfn_prot should only be used if using multiple VMAs is - * impractical. - */ int vm_insert_pfn_prot(struct vm_area_struct *vma, unsigned long addr, unsigned long pfn, pgprot_t pgprot) { @@ -1887,7 +1872,37 @@ int vm_insert_pfn_prot(struct vm_area_struct *vma, unsigned long addr, return ret; } -EXPORT_SYMBOL(vm_insert_pfn_prot); + +/** + * vmf_insert_pfn_prot - insert single pfn into user vma with specified pgprot + * @vma: user vma to map to + * @addr: target user address of this page + * @pfn: source kernel pfn + * @pgprot: pgprot flags for the inserted page + * + * This is exactly like vmf_insert_pfn(), except that it allows drivers to + * to override pgprot on a per-page basis. + * + * This only makes sense for IO mappings, and it makes no sense for + * COW mappings. In general, using multiple vmas is preferable; + * vm_insert_pfn_prot should only be used if using multiple VMAs is + * impractical. + * + * Return: vm_fault_t value. + */ +vm_fault_t vmf_insert_pfn_prot(struct vm_area_struct *vma, unsigned long addr, + unsigned long pfn, pgprot_t pgprot) +{ + int err = vm_insert_pfn_prot(vma, addr, pfn, pgprot); + + if (err == -ENOMEM) + return VM_FAULT_OOM; + if (err < 0 && err != -EBUSY) + return VM_FAULT_SIGBUS; + + return VM_FAULT_NOPAGE; +} +EXPORT_SYMBOL(vmf_insert_pfn_prot); static bool vm_mixed_ok(struct vm_area_struct *vma, pfn_t pfn) { -- 2.18.0
Return vm_fault_t codes directly from the appropriate mm routines instead of converting from errnos ourselves. Fixes a minor bug where we'd return SIGBUS instead of the correct OOM code if we ran out of memory allocating page tables. Signed-off-by: Matthew Wilcox <willy@infradead.org> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> --- arch/x86/entry/vdso/vma.c | 24 +++++++++--------------- 1 file changed, 9 insertions(+), 15 deletions(-) diff --git a/arch/x86/entry/vdso/vma.c b/arch/x86/entry/vdso/vma.c index 5b8b556dbb12..d1daa53215a4 100644 --- a/arch/x86/entry/vdso/vma.c +++ b/arch/x86/entry/vdso/vma.c @@ -39,7 +39,7 @@ void __init init_vdso_image(const struct vdso_image *image) struct linux_binprm; -static int vdso_fault(const struct vm_special_mapping *sm, +static vm_fault_t vdso_fault(const struct vm_special_mapping *sm, struct vm_area_struct *vma, struct vm_fault *vmf) { const struct vdso_image *image = vma->vm_mm->context.vdso_image; @@ -84,12 +84,11 @@ static int vdso_mremap(const struct vm_special_mapping *sm, return 0; } -static int vvar_fault(const struct vm_special_mapping *sm, +static vm_fault_t vvar_fault(const struct vm_special_mapping *sm, struct vm_area_struct *vma, struct vm_fault *vmf) { const struct vdso_image *image = vma->vm_mm->context.vdso_image; long sym_offset; - int ret = -EFAULT; if (!image) return VM_FAULT_SIGBUS; @@ -108,29 +107,24 @@ static int vvar_fault(const struct vm_special_mapping *sm, return VM_FAULT_SIGBUS; if (sym_offset == image->sym_vvar_page) { - ret = vm_insert_pfn(vma, vmf->address, - __pa_symbol(&__vvar_page) >> PAGE_SHIFT); + return vmf_insert_pfn(vma, vmf->address, + __pa_symbol(&__vvar_page) >> PAGE_SHIFT); } else if (sym_offset == image->sym_pvclock_page) { struct pvclock_vsyscall_time_info *pvti = pvclock_get_pvti_cpu0_va(); if (pvti && vclock_was_used(VCLOCK_PVCLOCK)) { - ret = vm_insert_pfn_prot( - vma, - vmf->address, - __pa(pvti) >> PAGE_SHIFT, - pgprot_decrypted(vma->vm_page_prot)); + return vmf_insert_pfn_prot(vma, vmf->address, + __pa(pvti) >> PAGE_SHIFT, + pgprot_decrypted(vma->vm_page_prot)); } } else if (sym_offset == image->sym_hvclock_page) { struct ms_hyperv_tsc_page *tsc_pg = hv_get_tsc_page(); if (tsc_pg && vclock_was_used(VCLOCK_HVCLOCK)) - ret = vm_insert_pfn(vma, vmf->address, - vmalloc_to_pfn(tsc_pg)); + return vmf_insert_pfn(vma, vmf->address, + vmalloc_to_pfn(tsc_pg)); } - if (ret == 0 || ret == -EBUSY) - return VM_FAULT_NOPAGE; - return VM_FAULT_SIGBUS; } -- 2.18.0
Now this is no longer used outside mm/memory.c, make it static. Signed-off-by: Matthew Wilcox <willy@infradead.org> --- include/linux/mm.h | 2 -- mm/memory.c | 50 +++++++++++++++++++++++----------------------- 2 files changed, 25 insertions(+), 27 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index e8bc1a16d44c..1552c67c835e 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -2480,8 +2480,6 @@ int remap_pfn_range(struct vm_area_struct *, unsigned long addr, int vm_insert_page(struct vm_area_struct *, unsigned long addr, struct page *); int vm_insert_pfn(struct vm_area_struct *vma, unsigned long addr, unsigned long pfn); -int vm_insert_pfn_prot(struct vm_area_struct *vma, unsigned long addr, - unsigned long pfn, pgprot_t pgprot); vm_fault_t vmf_insert_pfn_prot(struct vm_area_struct *vma, unsigned long addr, unsigned long pfn, pgprot_t pgprot); vm_fault_t vmf_insert_mixed(struct vm_area_struct *vma, unsigned long addr, diff --git a/mm/memory.c b/mm/memory.c index 8c116c0f64d8..8392a104a36d 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -1819,31 +1819,7 @@ static int insert_pfn(struct vm_area_struct *vma, unsigned long addr, return retval; } -/** - * vm_insert_pfn - insert single pfn into user vma - * @vma: user vma to map to - * @addr: target user address of this page - * @pfn: source kernel pfn - * - * Similar to vm_insert_page, this allows drivers to insert individual pages - * they've allocated into a user vma. Same comments apply. - * - * This function should only be called from a vm_ops->fault handler, and - * in that case the handler should return NULL. - * - * vma cannot be a COW mapping. - * - * As this is called only for pages that do not currently exist, we - * do not need to flush old virtual caches or the TLB. - */ -int vm_insert_pfn(struct vm_area_struct *vma, unsigned long addr, - unsigned long pfn) -{ - return vm_insert_pfn_prot(vma, addr, pfn, vma->vm_page_prot); -} -EXPORT_SYMBOL(vm_insert_pfn); - -int vm_insert_pfn_prot(struct vm_area_struct *vma, unsigned long addr, +static int vm_insert_pfn_prot(struct vm_area_struct *vma, unsigned long addr, unsigned long pfn, pgprot_t pgprot) { int ret; @@ -1873,6 +1849,30 @@ int vm_insert_pfn_prot(struct vm_area_struct *vma, unsigned long addr, return ret; } +/** + * vm_insert_pfn - insert single pfn into user vma + * @vma: user vma to map to + * @addr: target user address of this page + * @pfn: source kernel pfn + * + * Similar to vm_insert_page, this allows drivers to insert individual pages + * they've allocated into a user vma. Same comments apply. + * + * This function should only be called from a vm_ops->fault handler, and + * in that case the handler should return NULL. + * + * vma cannot be a COW mapping. + * + * As this is called only for pages that do not currently exist, we + * do not need to flush old virtual caches or the TLB. + */ +int vm_insert_pfn(struct vm_area_struct *vma, unsigned long addr, + unsigned long pfn) +{ + return vm_insert_pfn_prot(vma, addr, pfn, vma->vm_page_prot); +} +EXPORT_SYMBOL(vm_insert_pfn); + /** * vmf_insert_pfn_prot - insert single pfn into user vma with specified pgprot * @vma: user vma to map to -- 2.18.0
Documentation and comments. Signed-off-by: Matthew Wilcox <willy@infradead.org> --- Documentation/x86/pat.txt | 4 ++-- include/asm-generic/pgtable.h | 4 ++-- include/linux/hmm.h | 2 +- 3 files changed, 5 insertions(+), 5 deletions(-) diff --git a/Documentation/x86/pat.txt b/Documentation/x86/pat.txt index 2a4ee6302122..481d8d8536ac 100644 --- a/Documentation/x86/pat.txt +++ b/Documentation/x86/pat.txt @@ -90,12 +90,12 @@ pci proc | -- | -- | WC | Advanced APIs for drivers ------------------------- A. Exporting pages to users with remap_pfn_range, io_remap_pfn_range, -vm_insert_pfn +vmf_insert_pfn Drivers wanting to export some pages to userspace do it by using mmap interface and a combination of 1) pgprot_noncached() -2) io_remap_pfn_range() or remap_pfn_range() or vm_insert_pfn() +2) io_remap_pfn_range() or remap_pfn_range() or vmf_insert_pfn() With PAT support, a new API pgprot_writecombine is being added. So, drivers can continue to use the above sequence, with either pgprot_noncached() or diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h index 88ebc6102c7c..5657a20e0c59 100644 --- a/include/asm-generic/pgtable.h +++ b/include/asm-generic/pgtable.h @@ -757,7 +757,7 @@ static inline pmd_t pmd_swp_clear_soft_dirty(pmd_t pmd) /* * Interfaces that can be used by architecture code to keep track of * memory type of pfn mappings specified by the remap_pfn_range, - * vm_insert_pfn. + * vmf_insert_pfn. */ /* @@ -773,7 +773,7 @@ static inline int track_pfn_remap(struct vm_area_struct *vma, pgprot_t *prot, /* * track_pfn_insert is called when a _new_ single pfn is established - * by vm_insert_pfn(). + * by vmf_insert_pfn(). */ static inline void track_pfn_insert(struct vm_area_struct *vma, pgprot_t *prot, pfn_t pfn) diff --git a/include/linux/hmm.h b/include/linux/hmm.h index 4c92e3ba3e16..dde947083d4e 100644 --- a/include/linux/hmm.h +++ b/include/linux/hmm.h @@ -107,7 +107,7 @@ enum hmm_pfn_flag_e { * HMM_PFN_ERROR: corresponding CPU page table entry points to poisoned memory * HMM_PFN_NONE: corresponding CPU page table entry is pte_none() * HMM_PFN_SPECIAL: corresponding CPU page table entry is special; i.e., the - * result of vm_insert_pfn() or vm_insert_page(). Therefore, it should not + * result of vmf_insert_pfn() or vm_insert_page(). Therefore, it should not * be mirrored by a device, because the entry will never have HMM_PFN_VALID * set and the pfn value is undefined. * -- 2.18.0
All callers are now converted to vmf_insert_pfn so convert vmf_insert_pfn() from being a compatibility wrapper around vm_insert_pfn() to being a compatibility wrapper around vmf_insert_pfn_prot(). Signed-off-by: Matthew Wilcox <willy@infradead.org> --- include/linux/mm.h | 15 +------------ mm/memory.c | 54 +++++++++++++++++++++++++--------------------- 2 files changed, 30 insertions(+), 39 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index 1552c67c835e..bd5e2469b637 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -2478,7 +2478,7 @@ struct vm_area_struct *find_extend_vma(struct mm_struct *, unsigned long addr); int remap_pfn_range(struct vm_area_struct *, unsigned long addr, unsigned long pfn, unsigned long size, pgprot_t); int vm_insert_page(struct vm_area_struct *, unsigned long addr, struct page *); -int vm_insert_pfn(struct vm_area_struct *vma, unsigned long addr, +vm_fault_t vmf_insert_pfn(struct vm_area_struct *vma, unsigned long addr, unsigned long pfn); vm_fault_t vmf_insert_pfn_prot(struct vm_area_struct *vma, unsigned long addr, unsigned long pfn, pgprot_t pgprot); @@ -2501,19 +2501,6 @@ static inline vm_fault_t vmf_insert_page(struct vm_area_struct *vma, return VM_FAULT_NOPAGE; } -static inline vm_fault_t vmf_insert_pfn(struct vm_area_struct *vma, - unsigned long addr, unsigned long pfn) -{ - int err = vm_insert_pfn(vma, addr, pfn); - - if (err == -ENOMEM) - return VM_FAULT_OOM; - if (err < 0 && err != -EBUSY) - return VM_FAULT_SIGBUS; - - return VM_FAULT_NOPAGE; -} - static inline vm_fault_t vmf_error(int err) { if (err == -ENOMEM) diff --git a/mm/memory.c b/mm/memory.c index 8392a104a36d..d5ccbadd81c1 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -1849,30 +1849,6 @@ static int vm_insert_pfn_prot(struct vm_area_struct *vma, unsigned long addr, return ret; } -/** - * vm_insert_pfn - insert single pfn into user vma - * @vma: user vma to map to - * @addr: target user address of this page - * @pfn: source kernel pfn - * - * Similar to vm_insert_page, this allows drivers to insert individual pages - * they've allocated into a user vma. Same comments apply. - * - * This function should only be called from a vm_ops->fault handler, and - * in that case the handler should return NULL. - * - * vma cannot be a COW mapping. - * - * As this is called only for pages that do not currently exist, we - * do not need to flush old virtual caches or the TLB. - */ -int vm_insert_pfn(struct vm_area_struct *vma, unsigned long addr, - unsigned long pfn) -{ - return vm_insert_pfn_prot(vma, addr, pfn, vma->vm_page_prot); -} -EXPORT_SYMBOL(vm_insert_pfn); - /** * vmf_insert_pfn_prot - insert single pfn into user vma with specified pgprot * @vma: user vma to map to @@ -1885,9 +1861,10 @@ EXPORT_SYMBOL(vm_insert_pfn); * * This only makes sense for IO mappings, and it makes no sense for * COW mappings. In general, using multiple vmas is preferable; - * vm_insert_pfn_prot should only be used if using multiple VMAs is + * vmf_insert_pfn_prot should only be used if using multiple VMAs is * impractical. * + * Context: Process context. May allocate using %GFP_KERNEL. * Return: vm_fault_t value. */ vm_fault_t vmf_insert_pfn_prot(struct vm_area_struct *vma, unsigned long addr, @@ -1904,6 +1881,33 @@ vm_fault_t vmf_insert_pfn_prot(struct vm_area_struct *vma, unsigned long addr, } EXPORT_SYMBOL(vmf_insert_pfn_prot); +/** + * vmf_insert_pfn - insert single pfn into user vma + * @vma: user vma to map to + * @addr: target user address of this page + * @pfn: source kernel pfn + * + * Similar to vm_insert_page, this allows drivers to insert individual pages + * they've allocated into a user vma. Same comments apply. + * + * This function should only be called from a vm_ops->fault handler, and + * in that case the handler should return the result of this function. + * + * vma cannot be a COW mapping. + * + * As this is called only for pages that do not currently exist, we + * do not need to flush old virtual caches or the TLB. + * + * Context: Process context. May allocate using %GFP_KERNEL. + * Return: vm_fault_t value. + */ +vm_fault_t vmf_insert_pfn(struct vm_area_struct *vma, unsigned long addr, + unsigned long pfn) +{ + return vmf_insert_pfn_prot(vma, addr, pfn, vma->vm_page_prot); +} +EXPORT_SYMBOL(vmf_insert_pfn); + static bool vm_mixed_ok(struct vm_area_struct *vma, pfn_t pfn) { /* these checks mirror the abort conditions in vm_normal_page */ -- 2.18.0
vm_insert_pfn_prot() is only called from vmf_insert_pfn_prot(), so inline it and convert some of the errnos into vm_fault codes earlier. Signed-off-by: Matthew Wilcox <willy@infradead.org> --- mm/memory.c | 55 +++++++++++++++++++++++------------------------------ 1 file changed, 24 insertions(+), 31 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index d5ccbadd81c1..9e97926fee19 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -1819,36 +1819,6 @@ static int insert_pfn(struct vm_area_struct *vma, unsigned long addr, return retval; } -static int vm_insert_pfn_prot(struct vm_area_struct *vma, unsigned long addr, - unsigned long pfn, pgprot_t pgprot) -{ - int ret; - /* - * Technically, architectures with pte_special can avoid all these - * restrictions (same for remap_pfn_range). However we would like - * consistency in testing and feature parity among all, so we should - * try to keep these invariants in place for everybody. - */ - BUG_ON(!(vma->vm_flags & (VM_PFNMAP|VM_MIXEDMAP))); - BUG_ON((vma->vm_flags & (VM_PFNMAP|VM_MIXEDMAP)) == - (VM_PFNMAP|VM_MIXEDMAP)); - BUG_ON((vma->vm_flags & VM_PFNMAP) && is_cow_mapping(vma->vm_flags)); - BUG_ON((vma->vm_flags & VM_MIXEDMAP) && pfn_valid(pfn)); - - if (addr < vma->vm_start || addr >= vma->vm_end) - return -EFAULT; - - if (!pfn_modify_allowed(pfn, pgprot)) - return -EACCES; - - track_pfn_insert(vma, &pgprot, __pfn_to_pfn_t(pfn, PFN_DEV)); - - ret = insert_pfn(vma, addr, __pfn_to_pfn_t(pfn, PFN_DEV), pgprot, - false); - - return ret; -} - /** * vmf_insert_pfn_prot - insert single pfn into user vma with specified pgprot * @vma: user vma to map to @@ -1870,7 +1840,30 @@ static int vm_insert_pfn_prot(struct vm_area_struct *vma, unsigned long addr, vm_fault_t vmf_insert_pfn_prot(struct vm_area_struct *vma, unsigned long addr, unsigned long pfn, pgprot_t pgprot) { - int err = vm_insert_pfn_prot(vma, addr, pfn, pgprot); + int err; + + /* + * Technically, architectures with pte_special can avoid all these + * restrictions (same for remap_pfn_range). However we would like + * consistency in testing and feature parity among all, so we should + * try to keep these invariants in place for everybody. + */ + BUG_ON(!(vma->vm_flags & (VM_PFNMAP|VM_MIXEDMAP))); + BUG_ON((vma->vm_flags & (VM_PFNMAP|VM_MIXEDMAP)) == + (VM_PFNMAP|VM_MIXEDMAP)); + BUG_ON((vma->vm_flags & VM_PFNMAP) && is_cow_mapping(vma->vm_flags)); + BUG_ON((vma->vm_flags & VM_MIXEDMAP) && pfn_valid(pfn)); + + if (addr < vma->vm_start || addr >= vma->vm_end) + return VM_FAULT_SIGBUS; + + if (!pfn_modify_allowed(pfn, pgprot)) + return VM_FAULT_SIGBUS; + + track_pfn_insert(vma, &pgprot, __pfn_to_pfn_t(pfn, PFN_DEV)); + + err = insert_pfn(vma, addr, __pfn_to_pfn_t(pfn, PFN_DEV), pgprot, + false); if (err == -ENOMEM) return VM_FAULT_OOM; -- 2.18.0
Both of its callers currently convert its errno return into a vm_fault_t, so move the conversion into __vm_insert_mixed. Signed-off-by: Matthew Wilcox <willy@infradead.org> --- mm/memory.c | 36 +++++++++++++++--------------------- 1 file changed, 15 insertions(+), 21 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index 9e97926fee19..9fef202b4ea7 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -1915,20 +1915,21 @@ static bool vm_mixed_ok(struct vm_area_struct *vma, pfn_t pfn) return false; } -static int __vm_insert_mixed(struct vm_area_struct *vma, unsigned long addr, - pfn_t pfn, bool mkwrite) +static vm_fault_t __vm_insert_mixed(struct vm_area_struct *vma, + unsigned long addr, pfn_t pfn, bool mkwrite) { pgprot_t pgprot = vma->vm_page_prot; + int err; BUG_ON(!vm_mixed_ok(vma, pfn)); if (addr < vma->vm_start || addr >= vma->vm_end) - return -EFAULT; + return VM_FAULT_SIGBUS; track_pfn_insert(vma, &pgprot, pfn); if (!pfn_modify_allowed(pfn_t_to_pfn(pfn), pgprot)) - return -EACCES; + return VM_FAULT_SIGBUS; /* * If we don't have pte special, then we have to use the pfn_valid() @@ -1947,15 +1948,10 @@ static int __vm_insert_mixed(struct vm_area_struct *vma, unsigned long addr, * result in pfn_t_has_page() == false. */ page = pfn_to_page(pfn_t_to_pfn(pfn)); - return insert_page(vma, addr, page, pgprot); + err = insert_page(vma, addr, page, pgprot); + } else { + err = insert_pfn(vma, addr, pfn, pgprot, mkwrite); } - return insert_pfn(vma, addr, pfn, pgprot, mkwrite); -} - -vm_fault_t vmf_insert_mixed(struct vm_area_struct *vma, unsigned long addr, - pfn_t pfn) -{ - int err = __vm_insert_mixed(vma, addr, pfn, false); if (err == -ENOMEM) return VM_FAULT_OOM; @@ -1964,6 +1960,12 @@ vm_fault_t vmf_insert_mixed(struct vm_area_struct *vma, unsigned long addr, return VM_FAULT_NOPAGE; } + +vm_fault_t vmf_insert_mixed(struct vm_area_struct *vma, unsigned long addr, + pfn_t pfn) +{ + return __vm_insert_mixed(vma, addr, pfn, false); +} EXPORT_SYMBOL(vmf_insert_mixed); /* @@ -1971,18 +1973,10 @@ EXPORT_SYMBOL(vmf_insert_mixed); * different entry in the mean time, we treat that as success as we assume * the same entry was actually inserted. */ - vm_fault_t vmf_insert_mixed_mkwrite(struct vm_area_struct *vma, unsigned long addr, pfn_t pfn) { - int err; - - err = __vm_insert_mixed(vma, addr, pfn, true); - if (err == -ENOMEM) - return VM_FAULT_OOM; - if (err < 0 && err != -EBUSY) - return VM_FAULT_SIGBUS; - return VM_FAULT_NOPAGE; + return __vm_insert_mixed(vma, addr, pfn, true); } EXPORT_SYMBOL(vmf_insert_mixed_mkwrite); -- 2.18.0
All callers convert its errno into a vm_fault_t, so convert it to return a vm_fault_t directly. Signed-off-by: Matthew Wilcox <willy@infradead.org> --- mm/memory.c | 24 +++++------------------- 1 file changed, 5 insertions(+), 19 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index 9fef202b4ea7..368b65390762 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -1767,19 +1767,16 @@ int vm_insert_page(struct vm_area_struct *vma, unsigned long addr, } EXPORT_SYMBOL(vm_insert_page); -static int insert_pfn(struct vm_area_struct *vma, unsigned long addr, +static vm_fault_t insert_pfn(struct vm_area_struct *vma, unsigned long addr, pfn_t pfn, pgprot_t prot, bool mkwrite) { struct mm_struct *mm = vma->vm_mm; - int retval; pte_t *pte, entry; spinlock_t *ptl; - retval = -ENOMEM; pte = get_locked_pte(mm, addr, &ptl); if (!pte) - goto out; - retval = -EBUSY; + return VM_FAULT_OOM; if (!pte_none(*pte)) { if (mkwrite) { /* @@ -1812,11 +1809,9 @@ static int insert_pfn(struct vm_area_struct *vma, unsigned long addr, set_pte_at(mm, addr, pte, entry); update_mmu_cache(vma, addr, pte); /* XXX: why not for insert_page? */ - retval = 0; out_unlock: pte_unmap_unlock(pte, ptl); -out: - return retval; + return VM_FAULT_NOPAGE; } /** @@ -1840,8 +1835,6 @@ static int insert_pfn(struct vm_area_struct *vma, unsigned long addr, vm_fault_t vmf_insert_pfn_prot(struct vm_area_struct *vma, unsigned long addr, unsigned long pfn, pgprot_t pgprot) { - int err; - /* * Technically, architectures with pte_special can avoid all these * restrictions (same for remap_pfn_range). However we would like @@ -1862,15 +1855,8 @@ vm_fault_t vmf_insert_pfn_prot(struct vm_area_struct *vma, unsigned long addr, track_pfn_insert(vma, &pgprot, __pfn_to_pfn_t(pfn, PFN_DEV)); - err = insert_pfn(vma, addr, __pfn_to_pfn_t(pfn, PFN_DEV), pgprot, + return insert_pfn(vma, addr, __pfn_to_pfn_t(pfn, PFN_DEV), pgprot, false); - - if (err == -ENOMEM) - return VM_FAULT_OOM; - if (err < 0 && err != -EBUSY) - return VM_FAULT_SIGBUS; - - return VM_FAULT_NOPAGE; } EXPORT_SYMBOL(vmf_insert_pfn_prot); @@ -1950,7 +1936,7 @@ static vm_fault_t __vm_insert_mixed(struct vm_area_struct *vma, page = pfn_to_page(pfn_t_to_pfn(pfn)); err = insert_page(vma, addr, page, pgprot); } else { - err = insert_pfn(vma, addr, pfn, pgprot, mkwrite); + return insert_pfn(vma, addr, pfn, pgprot, mkwrite); } if (err == -ENOMEM) -- 2.18.0
On Tue, 28 Aug 2018, Matthew Wilcox wrote: > cramfs is the only remaining user of vm_insert_mixed; convert it. > > Signed-off-by: Matthew Wilcox <willy@infradead.org> > --- > fs/cramfs/inode.c | 9 +++++++-- > 1 file changed, 7 insertions(+), 2 deletions(-) > > diff --git a/fs/cramfs/inode.c b/fs/cramfs/inode.c > index f408994fc632..b72449c19cd1 100644 > --- a/fs/cramfs/inode.c > +++ b/fs/cramfs/inode.c > @@ -417,10 +417,15 @@ static int cramfs_physmem_mmap(struct file *file, struct vm_area_struct *vma) > */ > int i; > vma->vm_flags |= VM_MIXEDMAP; > - for (i = 0; i < pages && !ret; i++) { > + for (i = 0; i < pages; i++) { > + vm_fault_t vmf; > unsigned long off = i * PAGE_SIZE; > pfn_t pfn = phys_to_pfn_t(address + off, PFN_DEV); > - ret = vm_insert_mixed(vma, vma->vm_start + off, pfn); > + vmf = vmf_insert_mixed(vma, vma->vm_start + off, pfn); > + if (vmf & VM_FAULT_ERROR) { > + pages = i; > + break; > + } I'd suggest this to properly deal with errers instead: diff --git a/fs/cramfs/inode.c b/fs/cramfs/inode.c index f408994fc6..0c35e62f10 100644 --- a/fs/cramfs/inode.c +++ b/fs/cramfs/inode.c @@ -418,9 +418,12 @@ static int cramfs_physmem_mmap(struct file *file, struct vm_area_struct *vma) int i; vma->vm_flags |= VM_MIXEDMAP; for (i = 0; i < pages && !ret; i++) { + vm_fault_t vmf; unsigned long off = i * PAGE_SIZE; pfn_t pfn = phys_to_pfn_t(address + off, PFN_DEV); - ret = vm_insert_mixed(vma, vma->vm_start + off, pfn); + vmf = vmf_insert_mixed(vma, vma->vm_start + off, pfn); + if (vmf & VM_FAULT_ERROR) + ret = vm_fault_to_errno(vmf, 0); } } Nicolas
On Tue, Aug 28, 2018 at 01:49:25PM -0400, Nicolas Pitre wrote:
> On Tue, 28 Aug 2018, Matthew Wilcox wrote:
> > - ret = vm_insert_mixed(vma, vma->vm_start + off, pfn);
> > + vmf = vmf_insert_mixed(vma, vma->vm_start + off, pfn);
> > + if (vmf & VM_FAULT_ERROR) {
> > + pages = i;
> > + break;
> > + }
>
> I'd suggest this to properly deal with errers instead:
>
> - ret = vm_insert_mixed(vma, vma->vm_start + off, pfn);
> + vmf = vmf_insert_mixed(vma, vma->vm_start + off, pfn);
> + if (vmf & VM_FAULT_ERROR)
> + ret = vm_fault_to_errno(vmf, 0);
By my reading of this function, the intent is actually to return 0
here and allow demand paging to work. Of course, I've spent all of
twenty minutes staring at this function, so I defer to the maintainer.
I think you'd need to be running a make-memory-allocations-fail fuzzer
to hit this, so it's likely never been tested.
On Tue, 28 Aug 2018, Matthew Wilcox wrote: > On Tue, Aug 28, 2018 at 01:49:25PM -0400, Nicolas Pitre wrote: > > On Tue, 28 Aug 2018, Matthew Wilcox wrote: > > > - ret = vm_insert_mixed(vma, vma->vm_start + off, pfn); > > > + vmf = vmf_insert_mixed(vma, vma->vm_start + off, pfn); > > > + if (vmf & VM_FAULT_ERROR) { > > > + pages = i; > > > + break; > > > + } > > > > I'd suggest this to properly deal with errers instead: > > > > - ret = vm_insert_mixed(vma, vma->vm_start + off, pfn); > > + vmf = vmf_insert_mixed(vma, vma->vm_start + off, pfn); > > + if (vmf & VM_FAULT_ERROR) > > + ret = vm_fault_to_errno(vmf, 0); > > By my reading of this function, the intent is actually to return 0 > here and allow demand paging to work. Of course, I've spent all of > twenty minutes staring at this function, so I defer to the maintainer. Demand paging is used when the filesystem layout isn't amenable to a direct mapping. It is not a fallback for when we're OOM or some other internal errors which ought to be reported immediately. > I think you'd need to be running a make-memory-allocations-fail fuzzer > to hit this, so it's likely never been tested. Well, it has been tested sort of, e.g. when vm_insert_mixed() returned an error due to misaligned addresses during development. Normally, vm_insert_mixed() and vmf_insert_mixed() should always succeed, and if they don't we certainly don't want to ignore it. Nicolas