The plan for these patches is to introduce the typedef, initially just as documentation ("These functions should return a VM_FAULT_ status"). We'll trickle the patches to individual drivers/filesystems in through the maintainers, as far as possible. Then we'll change the typedef to an unsigned int and break the compilation of any unconverted drivers/filesystems. vmf_insert_page(), vmf_insert_mixed() and vmf_insert_pfn() are three newly added functions. The various drivers/filesystems where return value of fault(), huge_fault(), page_mkwrite() and pfn_mkwrite() get converted, will need them. These functions will return correct VM_FAULT_ code based on err value. We've had bugs before where drivers returned -EFOO. And we have this silly inefficiency where vm_insert_xxx() return an errno which (afaict) every driver then converts into a VM_FAULT code. In many cases drivers failed to return correct VM_FAULT code value despite of vm_insert_xxx() fails. We have indentified and clean up all those existing bugs and silly inefficiencies in driver/filesystems by adding these three new inline wrappers. As mentioned above, we will trickle those patches to individual drivers/filesystems in through maintainers after these three wrapper functions are merged. Eventually we can convert vm_insert_xxx() into vmf_insert_xxx() and remove these inline wrappers, but these are a good intermediate step. Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com> --- include/linux/mm.h | 47 +++++++++++++++++++++++++++++++++++++++++++---- include/linux/mm_types.h | 2 ++ 2 files changed, 45 insertions(+), 4 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index ad06d42..a4d8853 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -379,17 +379,18 @@ struct vm_operations_struct { void (*close)(struct vm_area_struct * area); int (*split)(struct vm_area_struct * area, unsigned long addr); int (*mremap)(struct vm_area_struct * area); - int (*fault)(struct vm_fault *vmf); - int (*huge_fault)(struct vm_fault *vmf, enum page_entry_size pe_size); + vm_fault_t (*fault)(struct vm_fault *vmf); + vm_fault_t (*huge_fault)(struct vm_fault *vmf, + enum page_entry_size pe_size); void (*map_pages)(struct vm_fault *vmf, pgoff_t start_pgoff, pgoff_t end_pgoff); /* notification that a previously read-only page is about to become * writable, if an error is returned it will cause a SIGBUS */ - int (*page_mkwrite)(struct vm_fault *vmf); + vm_fault_t (*page_mkwrite)(struct vm_fault *vmf); /* same as page_mkwrite when using VM_PFNMAP|VM_MIXEDMAP */ - int (*pfn_mkwrite)(struct vm_fault *vmf); + vm_fault_t (*pfn_mkwrite)(struct vm_fault *vmf); /* called by access_process_vm when get_user_pages() fails, typically * for use by special VMAs that can switch between memory and hardware @@ -2413,6 +2414,44 @@ int vm_insert_mixed_mkwrite(struct vm_area_struct *vma, unsigned long addr, pfn_t pfn); int vm_iomap_memory(struct vm_area_struct *vma, phys_addr_t start, unsigned long len); +static inline vm_fault_t vmf_insert_page(struct vm_area_struct *vma, + unsigned long addr, struct page *page) +{ + int err = vm_insert_page(vma, addr, page); + + if (err == -ENOMEM) + return VM_FAULT_OOM; + if (err < 0 && err != -EBUSY) + return VM_FAULT_SIGBUS; + + return VM_FAULT_NOPAGE; +} + +static inline vm_fault_t vmf_insert_mixed(struct vm_area_struct *vma, + unsigned long addr, pfn_t pfn) +{ + int err = vm_insert_mixed(vma, addr, pfn); + + if (err == -ENOMEM) + return VM_FAULT_OOM; + if (err < 0 && err != -EBUSY) + return VM_FAULT_SIGBUS; + + return VM_FAULT_NOPAGE; +} + +static inline vm_fault_t vmf_insert_pfn(struct vm_area_struct *vma, + unsigned long addr, unsigned long pfn) +{ + int err = vm_insert_pfn(vma, addr, pfn); + + if (err == -ENOMEM) + return VM_FAULT_OOM; + if (err < 0 && err != -EBUSY) + return VM_FAULT_SIGBUS; + + return VM_FAULT_NOPAGE; +} struct page *follow_page_mask(struct vm_area_struct *vma, unsigned long address, unsigned int foll_flags, diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index fd1af6b..2161234 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -22,6 +22,8 @@ #endif #define AT_VECTOR_SIZE (2*(AT_VECTOR_SIZE_ARCH + AT_VECTOR_SIZE_BASE + 1)) +typedef int vm_fault_t; + struct address_space; struct mem_cgroup; struct hmm; -- 1.9.1
On Sat 10-03-18 21:53:52, Souptick Joarder wrote: > The plan for these patches is to introduce the typedef, initially > just as documentation ("These functions should return a VM_FAULT_ status"). > We'll trickle the patches to individual drivers/filesystems in through > the maintainers, as far as possible. Then we'll change the typedef > to an unsigned int and break the compilation of any unconverted > drivers/filesystems. > > vmf_insert_page(), vmf_insert_mixed() and vmf_insert_pfn() are three > newly added functions. The various drivers/filesystems where return value > of fault(), huge_fault(), page_mkwrite() and pfn_mkwrite() get converted, > will need them. These functions will return correct VM_FAULT_ code based > on err value. > > We've had bugs before where drivers returned -EFOO. And we have this > silly inefficiency where vm_insert_xxx() return an errno which (afaict) > every driver then converts into a VM_FAULT code. In many cases drivers > failed to return correct VM_FAULT code value despite of vm_insert_xxx() > fails. We have indentified and clean up all those existing bugs and silly > inefficiencies in driver/filesystems by adding these three new inline > wrappers. As mentioned above, we will trickle those patches to individual > drivers/filesystems in through maintainers after these three wrapper > functions are merged. > > Eventually we can convert vm_insert_xxx() into vmf_insert_xxx() and > remove these inline wrappers, but these are a good intermediate step. Yes, this makes sense to me. 3 copies of error->vmfault conversion hurt eyes a bit, though > Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com> Anyway, feel free to add Acked-by: Michal Hocko <mhocko@suse.com> and good luck with conversions. > --- > include/linux/mm.h | 47 +++++++++++++++++++++++++++++++++++++++++++---- > include/linux/mm_types.h | 2 ++ > 2 files changed, 45 insertions(+), 4 deletions(-) > > diff --git a/include/linux/mm.h b/include/linux/mm.h > index ad06d42..a4d8853 100644 > --- a/include/linux/mm.h > +++ b/include/linux/mm.h > @@ -379,17 +379,18 @@ struct vm_operations_struct { > void (*close)(struct vm_area_struct * area); > int (*split)(struct vm_area_struct * area, unsigned long addr); > int (*mremap)(struct vm_area_struct * area); > - int (*fault)(struct vm_fault *vmf); > - int (*huge_fault)(struct vm_fault *vmf, enum page_entry_size pe_size); > + vm_fault_t (*fault)(struct vm_fault *vmf); > + vm_fault_t (*huge_fault)(struct vm_fault *vmf, > + enum page_entry_size pe_size); > void (*map_pages)(struct vm_fault *vmf, > pgoff_t start_pgoff, pgoff_t end_pgoff); > > /* notification that a previously read-only page is about to become > * writable, if an error is returned it will cause a SIGBUS */ > - int (*page_mkwrite)(struct vm_fault *vmf); > + vm_fault_t (*page_mkwrite)(struct vm_fault *vmf); > > /* same as page_mkwrite when using VM_PFNMAP|VM_MIXEDMAP */ > - int (*pfn_mkwrite)(struct vm_fault *vmf); > + vm_fault_t (*pfn_mkwrite)(struct vm_fault *vmf); > > /* called by access_process_vm when get_user_pages() fails, typically > * for use by special VMAs that can switch between memory and hardware > @@ -2413,6 +2414,44 @@ int vm_insert_mixed_mkwrite(struct vm_area_struct *vma, unsigned long addr, > pfn_t pfn); > int vm_iomap_memory(struct vm_area_struct *vma, phys_addr_t start, unsigned long len); > > +static inline vm_fault_t vmf_insert_page(struct vm_area_struct *vma, > + unsigned long addr, struct page *page) > +{ > + int err = vm_insert_page(vma, addr, page); > + > + if (err == -ENOMEM) > + return VM_FAULT_OOM; > + if (err < 0 && err != -EBUSY) > + return VM_FAULT_SIGBUS; > + > + return VM_FAULT_NOPAGE; > +} > + > +static inline vm_fault_t vmf_insert_mixed(struct vm_area_struct *vma, > + unsigned long addr, pfn_t pfn) > +{ > + int err = vm_insert_mixed(vma, addr, pfn); > + > + if (err == -ENOMEM) > + return VM_FAULT_OOM; > + if (err < 0 && err != -EBUSY) > + return VM_FAULT_SIGBUS; > + > + return VM_FAULT_NOPAGE; > +} > + > +static inline vm_fault_t vmf_insert_pfn(struct vm_area_struct *vma, > + unsigned long addr, unsigned long pfn) > +{ > + int err = vm_insert_pfn(vma, addr, pfn); > + > + if (err == -ENOMEM) > + return VM_FAULT_OOM; > + if (err < 0 && err != -EBUSY) > + return VM_FAULT_SIGBUS; > + > + return VM_FAULT_NOPAGE; > +} > > struct page *follow_page_mask(struct vm_area_struct *vma, > unsigned long address, unsigned int foll_flags, > diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h > index fd1af6b..2161234 100644 > --- a/include/linux/mm_types.h > +++ b/include/linux/mm_types.h > @@ -22,6 +22,8 @@ > #endif > #define AT_VECTOR_SIZE (2*(AT_VECTOR_SIZE_ARCH + AT_VECTOR_SIZE_BASE + 1)) > > +typedef int vm_fault_t; > + > struct address_space; > struct mem_cgroup; > struct hmm; > -- > 1.9.1 -- Michal Hocko SUSE Labs
Use new return type vm_fault_t for fault handler in struct vm_operations_struct. For now, this is just documenting that the function returns a VM_FAULT value rather than an errno. Once all instances are converted, vm_fault_t will become a distinct type. commit 1c8f422059ae ("mm: change return type to vm_fault_t") Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com> Reviewed-by: Matthew Wilcox <mawilcox@microsoft.com> --- v2: Updated the change log mm/hugetlb.c | 2 +- mm/mmap.c | 4 ++-- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 7c204e3..acb432a 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -3158,7 +3158,7 @@ static int hugetlb_vm_op_split(struct vm_area_struct *vma, unsigned long addr) * hugegpage VMA. do_page_fault() is supposed to trap this, so BUG is we get * this far. */ -static int hugetlb_vm_op_fault(struct vm_fault *vmf) +static vm_fault_t hugetlb_vm_op_fault(struct vm_fault *vmf) { BUG(); return 0; diff --git a/mm/mmap.c b/mm/mmap.c index 9efdc021..ac41b34 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -3208,7 +3208,7 @@ void vm_stat_account(struct mm_struct *mm, vm_flags_t flags, long npages) mm->data_vm += npages; } -static int special_mapping_fault(struct vm_fault *vmf); +static vm_fault_t special_mapping_fault(struct vm_fault *vmf); /* * Having a close hook prevents vma merging regardless of flags. @@ -3247,7 +3247,7 @@ static int special_mapping_mremap(struct vm_area_struct *new_vma) .fault = special_mapping_fault, }; -static int special_mapping_fault(struct vm_fault *vmf) +static vm_fault_t special_mapping_fault(struct vm_fault *vmf) { struct vm_area_struct *vma = vmf->vma; pgoff_t pgoff; -- 1.9.1
On Fri, May 11, 2018 at 11:36:39PM +0530, Souptick Joarder wrote:
> mm/hugetlb.c | 2 +-
> mm/mmap.c | 4 ++--
> 2 files changed, 3 insertions(+), 3 deletions(-)
Don't we also need to convert include/linux/mm_types.h:
@@ -621,7 +621,7 @@ struct vm_special_mapping {
* If non-NULL, then this is called to resolve page faults
* on the special mapping. If used, .pages is not checked.
*/
- int (*fault)(const struct vm_special_mapping *sm,
+ vm_fault_t (*fault)(const struct vm_special_mapping *sm,
struct vm_area_struct *vma,
struct vm_fault *vmf);
or are you leaving that for a later patch?
On Fri, May 11, 2018 at 11:45 PM, Matthew Wilcox <willy@infradead.org> wrote:
> On Fri, May 11, 2018 at 11:36:39PM +0530, Souptick Joarder wrote:
>> mm/hugetlb.c | 2 +-
>> mm/mmap.c | 4 ++--
>> 2 files changed, 3 insertions(+), 3 deletions(-)
>
> Don't we also need to convert include/linux/mm_types.h:
>
> @@ -621,7 +621,7 @@ struct vm_special_mapping {
> * If non-NULL, then this is called to resolve page faults
> * on the special mapping. If used, .pages is not checked.
> */
> - int (*fault)(const struct vm_special_mapping *sm,
> + vm_fault_t (*fault)(const struct vm_special_mapping *sm,
> struct vm_area_struct *vma,
> struct vm_fault *vmf);
>
> or are you leaving that for a later patch?
Ahh, I didn't realise. No I think, we can add it as part of this
patch. Will send v3.