Date: Fri, 4 Dec 2020 14:19:08 -0800
From: Ira Weiny
To: Jason Gunthorpe
Cc: Andrew Morton, linux-mm, Dan Williams, John Hubbard, Pavel Tatashin
Subject: Re: [PATCH] mm/gup: remove the vma allocation from gup_longterm_locked()
Message-ID: <20201204221908.GI1563847@iweiny-DESK2.sc.intel.com>
References: <0-v1-5551df3ed12e+b8-gup_dax_speedup_jgg@nvidia.com>
In-Reply-To: <0-v1-5551df3ed12e+b8-gup_dax_speedup_jgg@nvidia.com>
User-Agent: Mutt/1.11.1 (2018-12-01)
On Fri, Dec 04, 2020 at 03:39:52PM -0400, Jason Gunthorpe wrote:
> Long ago there wasn't a FOLL_LONGTERM flag so this DAX check was done by
> post-processing the VMA list.
> 
> These days it is trivial to just check each VMA to see if it is DAX before
> processing it inside __get_user_pages() and return failure if a DAX VMA is
> encountered with FOLL_LONGTERM.
> 
> Removing the allocation of the VMA list is a significant speed up for many
> call sites.
> 
> Add an IS_ENABLED to vma_is_fsdax so that code generation is unchanged
> when DAX is compiled out.
> 
> Remove the dummy version of __gup_longterm_locked() as !CONFIG_CMA already
> makes memalloc_nocma_save(), check_and_migrate_cma_pages(), and
> memalloc_nocma_restore() into a NOP.
> 
> Cc: Dan Williams
> Cc: Ira Weiny
> Cc: John Hubbard
> Cc: Pavel Tatashin
> Signed-off-by: Jason Gunthorpe
> ---
>  include/linux/fs.h |  2 +-
>  mm/gup.c           | 83 +++++++++-------------------------------------
>  2 files changed, 16 insertions(+), 69 deletions(-)
> 
> This was tested using the fake nvdimm stuff and RDMA's FOLL_LONGTERM pin
> continues to correctly reject DAX vmas and returns EOPNOTSUPP
> 
> Pavel, this accomplishes the same #ifdef clean up as your patch series for CMA
> by just deleting all the code that justified the ifdefs.
> 
> FWIW, this is probably going to be the start of a longer trickle of patches to
> make pin_user_pages()/unpin_user_pages() faster. This flow is offensively slow
> right now.
> 
> Ira, I investigated streamlining the callers from here, and you are right.
> The distinction that FOLL_LONGTERM means locked == NULL is no longer required
> now that the vma list isn't used, and with some adjusting of the CMA path we
> can purge out a lot of other complexity too.
> 
> I have some drafts, but I want to tackle this separately.
> 
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 8667d0cdc71e76..1fcc2b00582b22 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -3230,7 +3230,7 @@ static inline bool vma_is_fsdax(struct vm_area_struct *vma)
>  {
>  	struct inode *inode;
>  
> -	if (!vma->vm_file)
> +	if (!IS_ENABLED(CONFIG_FS_DAX) || !vma->vm_file)
>  		return false;

Just a minor nit on this change.  Is this still called in frame_vector.c?  I
thought I saw a patch from Daniel Vetter which removed that?

Ah yea...

https://lore.kernel.org/lkml/20201127164131.2244124-6-daniel.vetter@ffwll.ch/

Regardless, even if vma_is_fsdax() is still called there I don't think this
will change anything there, other than making that code faster when FS_DAX is
not configured.

So perfect...
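(To spell out the codegen point with a rough sketch, not part of the patch
itself: IS_ENABLED(CONFIG_FS_DAX) is a compile-time constant, so with
CONFIG_FS_DAX=n the helper effectively collapses to

	static inline bool vma_is_fsdax(struct vm_area_struct *vma)
	{
		return false;	/* everything past the IS_ENABLED() test is dead code */
	}

and the new FOLL_LONGTERM check in check_vma_flags() is optimized out, which
is how "code generation is unchanged when DAX is compiled out" holds.)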
Reviewed-by: Ira Weiny

>  	if (!vma_is_dax(vma))
>  		return false;
>  
> diff --git a/mm/gup.c b/mm/gup.c
> index 9c6a2f5001c5c2..311a44ff41ff42 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -923,6 +923,9 @@ static int check_vma_flags(struct vm_area_struct *vma, unsigned long gup_flags)
>  	if (gup_flags & FOLL_ANON && !vma_is_anonymous(vma))
>  		return -EFAULT;
>  
> +	if ((gup_flags & FOLL_LONGTERM) && vma_is_fsdax(vma))
> +		return -EOPNOTSUPP;
> +
>  	if (write) {
>  		if (!(vm_flags & VM_WRITE)) {
>  			if (!(gup_flags & FOLL_FORCE))
> @@ -1060,10 +1063,14 @@ static long __get_user_pages(struct mm_struct *mm,
>  			goto next_page;
>  		}
>  
> -		if (!vma || check_vma_flags(vma, gup_flags)) {
> +		if (!vma) {
>  			ret = -EFAULT;
>  			goto out;
>  		}
> +		ret = check_vma_flags(vma, gup_flags);
> +		if (ret)
> +			goto out;
> +
>  		if (is_vm_hugetlb_page(vma)) {
>  			i = follow_hugetlb_page(mm, vma, pages, vmas,
>  					&start, &nr_pages, i,
> @@ -1567,26 +1574,6 @@ struct page *get_dump_page(unsigned long addr)
>  }
>  #endif /* CONFIG_ELF_CORE */
>  
> -#if defined(CONFIG_FS_DAX) || defined (CONFIG_CMA)
> -static bool check_dax_vmas(struct vm_area_struct **vmas, long nr_pages)
> -{
> -	long i;
> -	struct vm_area_struct *vma_prev = NULL;
> -
> -	for (i = 0; i < nr_pages; i++) {
> -		struct vm_area_struct *vma = vmas[i];
> -
> -		if (vma == vma_prev)
> -			continue;
> -
> -		vma_prev = vma;
> -
> -		if (vma_is_fsdax(vma))
> -			return true;
> -	}
> -	return false;
> -}
> -
>  #ifdef CONFIG_CMA
>  static long check_and_migrate_cma_pages(struct mm_struct *mm,
>  					unsigned long start,
> @@ -1705,63 +1692,23 @@ static long __gup_longterm_locked(struct mm_struct *mm,
>  					  struct vm_area_struct **vmas,
>  					  unsigned int gup_flags)
>  {
> -	struct vm_area_struct **vmas_tmp = vmas;
>  	unsigned long flags = 0;
> -	long rc, i;
> +	long rc;
>  
> -	if (gup_flags & FOLL_LONGTERM) {
> -		if (!pages)
> -			return -EINVAL;
> -
> -		if (!vmas_tmp) {
> -			vmas_tmp = kcalloc(nr_pages,
> -					   sizeof(struct vm_area_struct *),
> -					   GFP_KERNEL);
> -			if (!vmas_tmp)
> -				return -ENOMEM;
> -		}
> +	if (gup_flags & FOLL_LONGTERM)
>  		flags = memalloc_nocma_save();
> -	}
>  
> -	rc = __get_user_pages_locked(mm, start, nr_pages, pages,
> -				     vmas_tmp, NULL, gup_flags);
> +	rc = __get_user_pages_locked(mm, start, nr_pages, pages, vmas, NULL,
> +				     gup_flags);
>  
>  	if (gup_flags & FOLL_LONGTERM) {
> -		if (rc < 0)
> -			goto out;
> -
> -		if (check_dax_vmas(vmas_tmp, rc)) {
> -			if (gup_flags & FOLL_PIN)
> -				unpin_user_pages(pages, rc);
> -			else
> -				for (i = 0; i < rc; i++)
> -					put_page(pages[i]);
> -			rc = -EOPNOTSUPP;
> -			goto out;
> -		}
> -
> -		rc = check_and_migrate_cma_pages(mm, start, rc, pages,
> -						 vmas_tmp, gup_flags);
> -out:
> +		if (rc > 0)
> +			rc = check_and_migrate_cma_pages(mm, start, rc, pages,
> +							 vmas, gup_flags);
>  		memalloc_nocma_restore(flags);
>  	}
> -
> -	if (vmas_tmp != vmas)
> -		kfree(vmas_tmp);
>  	return rc;
>  }
> -#else /* !CONFIG_FS_DAX && !CONFIG_CMA */
> -static __always_inline long __gup_longterm_locked(struct mm_struct *mm,
> -						  unsigned long start,
> -						  unsigned long nr_pages,
> -						  struct page **pages,
> -						  struct vm_area_struct **vmas,
> -						  unsigned int flags)
> -{
> -	return __get_user_pages_locked(mm, start, nr_pages, pages, vmas,
> -				       NULL, flags);
> -}
> -#endif /* CONFIG_FS_DAX || CONFIG_CMA */
>  
>  static bool is_valid_gup_flags(unsigned int gup_flags)
>  {
> -- 
> 2.29.2
> 
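(For completeness, a hypothetical caller-side sketch, not from this patch, of
what the early check means for a driver doing a long-term pin; addr, npages,
and pages are placeholders:

	/* With this change a FOLL_LONGTERM pin over an fsdax VMA is
	 * rejected up front by check_vma_flags() with -EOPNOTSUPP,
	 * instead of being unpinned after the fact by check_dax_vmas().
	 */
	ret = pin_user_pages(addr, npages, FOLL_WRITE | FOLL_LONGTERM,
			     pages, NULL);
	if (ret < 0)
		return ret;	/* EOPNOTSUPP when the range hits a DAX vma */

The observable behaviour matches what Jason reports above for RDMA's
FOLL_LONGTERM pin.)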