Date: Mon, 11 Feb 2019 16:08:10 -0800
From: Ira Weiny
To: Jason Gunthorpe
Cc: Dan Williams, John Hubbard, linux-rdma, Linux Kernel Mailing List,
	Linux MM, Daniel Borkmann, Davidlohr Bueso, Netdev, Mike Marciniszyn,
	Dennis Dalessandro, Doug Ledford, Andrew Morton, "Kirill A. Shutemov"
Subject: Re: [PATCH 2/3] mm/gup: Introduce get_user_pages_fast_longterm()
Message-ID: <20190212000810.GA24207@iweiny-DESK2.sc.intel.com>
References: <20190211201643.7599-1-ira.weiny@intel.com>
	<20190211201643.7599-3-ira.weiny@intel.com>
	<20190211203916.GA2771@ziepe.ca>
	<20190211212652.GA7790@iweiny-DESK2.sc.intel.com>
	<20190211215238.GA23825@iweiny-DESK2.sc.intel.com>
	<20190211220658.GH24692@ziepe.ca>
	<20190211232510.GP24692@ziepe.ca>
In-Reply-To: <20190211232510.GP24692@ziepe.ca>
User-Agent: Mutt/1.11.1 (2018-12-01)
X-Mailing-List: netdev@vger.kernel.org

On Mon, Feb 11, 2019 at 04:25:10PM -0700, Jason Gunthorpe wrote:
> On Mon, Feb 11, 2019 at 02:55:10PM -0800, Dan Williams wrote:
>
> > > I also wonder if someone should think about making fast into a flag
> > > too..
> > >
> > > But I'm not sure when fast should be used vs when it shouldn't :(
> >
> > Effectively fast should always be used just in case the user cares
> > about performance. It's just that it may fail and need to fall back to
> > requiring the vma.
>
> But the fall back / slow path is hidden inside the API, so when should
> the caller care?
>
> ie when should the caller care to use gup_fast vs gup_unlocked? (the
> comments say they are the same, but this seems to be a mistake)
>
> Based on some of the comments in the code it looks like this API is
> trying to convert itself into:
>
>   long get_user_pages_locked(struct task_struct *tsk, struct mm_struct *mm,
>                              unsigned long start, unsigned long nr_pages,
>                              unsigned int gup_flags, struct page **pages,
>                              struct vm_area_struct **vmas, bool *locked)
>
>   long get_user_pages_unlocked(struct task_struct *tsk, struct mm_struct *mm,
>                                unsigned long start, unsigned long nr_pages,
>                                unsigned int gup_flags, struct page **pages)
>
> (and maybe a FOLL_FAST if there is some reason we have _fast and
> _unlocked)
>
> The reason I ask, is that if there is no reason for fast vs unlocked
> then maybe Ira should convert HFI to use gup_unlocked and move the
> 'fast' code into unlocked?
>
> ie move incrementally closer to the desired end-state here.

If the pages are not in the page tables then fast is probably going to be
slightly slower, because it will have to fall back after walking the tables
and finding something missing.

For PSM2 (MPI) applications, our performance improvement was probably
because the memory in question was in the page tables and very much in use.

Ira

>
> Jason
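
To make the fallback cost concrete, here is a minimal userspace sketch of the "try fast, then fall back" pattern discussed above. It is a toy model under stated assumptions, not kernel code: struct toy_mm, toy_gup_fast(), toy_gup_slow() and toy_get_pages() are hypothetical names invented for illustration, and the counters only stand in for the real cost of a lockless page-table walk versus the locked path that can fault pages in.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_PAGES 8

/* Hypothetical toy model, NOT the kernel API: one "present" bit per page. */
struct toy_mm {
	bool present[TOY_NR_PAGES];	/* stand-in for page-table entries */
	unsigned long walks;		/* entries examined by the fast walk */
	unsigned long slow_pages;	/* pages handled by the slow path */
};

/* Fast path: walk entries without a lock, stop at the first missing one. */
static size_t toy_gup_fast(struct toy_mm *mm, size_t start, size_t nr)
{
	size_t got = 0;

	while (got < nr) {
		mm->walks++;
		if (!mm->present[start + got])
			break;
		got++;
	}
	return got;
}

/* Slow path: stand-in for the locked path that can fault pages in. */
static size_t toy_gup_slow(struct toy_mm *mm, size_t start, size_t nr)
{
	for (size_t i = 0; i < nr; i++) {
		mm->present[start + i] = true;
		mm->slow_pages++;
	}
	return nr;
}

/* Try the fast walk first; fall back for whatever it could not resolve. */
static size_t toy_get_pages(struct toy_mm *mm, size_t start, size_t nr)
{
	size_t got;

	assert(start + nr <= TOY_NR_PAGES);

	got = toy_gup_fast(mm, start, nr);
	if (got < nr)
		got += toy_gup_slow(mm, start + got, nr - got);
	return got;
}
```

In this model a "hot" range (all entries present) is served entirely by the fast walk, while a "cold" range pays for the failed walk and then for the slow path on top of it, which is the small extra cost described above.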