Subject: Re: [PATCH 05/19] mm/gup: introduce pin_user_pages*() and FOLL_PIN
To: Ira Weiny
CC: Andrew Morton, Al Viro, Alex Williamson, Benjamin Herrenschmidt,
 Björn Töpel, Christoph Hellwig, Dan Williams, Daniel Vetter, Dave Chinner,
 David Airlie, "David S . Miller", Jan Kara, Jason Gunthorpe, Jens Axboe,
 Jonathan Corbet, Jérôme Glisse, Magnus Karlsson, Mauro Carvalho Chehab,
 Michael Ellerman, Michal Hocko, Mike Kravetz, Paul Mackerras, Shuah Khan,
 Vlastimil Babka, LKML
References: <20191030224930.3990755-1-jhubbard@nvidia.com>
 <20191030224930.3990755-6-jhubbard@nvidia.com>
 <20191031231503.GF14771@iweiny-DESK2.sc.intel.com>
From: John Hubbard
Date: Thu, 31 Oct 2019 16:43:16 -0700
In-Reply-To: <20191031231503.GF14771@iweiny-DESK2.sc.intel.com>
X-Mailing-List: linux-block@vger.kernel.org

On 10/31/19 4:15 PM, Ira Weiny wrote:
> On Wed, Oct 30, 2019 at 03:49:16PM -0700, John Hubbard wrote:
...
>> + * FOLL_PIN indicates that a special kind of tracking (not just page->_refcount,
>> + * but an additional pin counting system) will be invoked. This is intended for
>> + * anything that gets a page reference and then touches page data (for example,
>> + * Direct IO). This lets the filesystem know that some non-file-system entity is
>> + * potentially changing the pages' data. In contrast to FOLL_GET (whose pages
>> + * are released via put_page()), FOLL_PIN pages must be released, ultimately, by
>> + * a call to put_user_page().
>> + *
>> + * FOLL_PIN is similar to FOLL_GET: both of these pin pages. They use different
>> + * and separate refcounting mechanisms, however, and that means that each has
>> + * its own acquire and release mechanisms:
>> + *
>> + * FOLL_GET: get_user_pages*() to acquire, and put_page() to release.
>> + *
>> + * FOLL_PIN: pin_user_pages*() or pin_longterm_pages*() to acquire, and
>> + * put_user_pages to release.
>> + *
>> + * FOLL_PIN and FOLL_GET are mutually exclusive.
>
> You mean the flags are mutually exclusive for any single call, correct?
> Because my first thought was that you meant that a page which was pin'ed can't
> be "got". Which I don't think is true or necessary...

Yes, you are correct. And yes you can absolutely mix get_user_pages() and
pin_user_pages() calls on the same page(s).

OK, I'll change the wording to "mutually exclusive for a given function call".

>
>> + *
>> + * Please see Documentation/vm/pin_user_pages.rst for more information.
>
> NIT: I think we should include this file as part of this patch...

heh. I kept hopping back and forth on this, because I've seen other patchsets
that often put Documentation/ into its own patch. But you're right, of course:
it's not right to refer to items that are not here until a later patch.

I'll merge patch 19 into this one, then.

...
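To make the "mutually exclusive for a given function call" point concrete,
here is a minimal userspace sketch. It is not the kernel implementation: the
model_* names, the MODEL_FOLL_* values, and struct model_page are all
hypothetical stand-ins, and the rejection of FOLL_PIN in the get path is an
assumption drawn from the "never directly by the caller" rule. It only
models the idea that a single call carries one flag or the other, while the
same page can still be both "got" and pinned through separate calls, each
with its own release.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical flag values for this model only; not the kernel's values. */
#define MODEL_FOLL_GET 0x1u
#define MODEL_FOLL_PIN 0x2u

/* Stand-in for struct page: two independent counters. */
struct model_page {
	int refcount;	/* models page->_refcount (FOLL_GET path) */
	int pincount;	/* models the additional FOLL_PIN tracking */
};

/* Common worker: one call may carry FOLL_GET or FOLL_PIN, never both. */
int model_gup_internal(struct model_page *page, unsigned int flags)
{
	if ((flags & MODEL_FOLL_GET) && (flags & MODEL_FOLL_PIN))
		return -EINVAL;

	if (flags & MODEL_FOLL_PIN)
		page->pincount++;
	else if (flags & MODEL_FOLL_GET)
		page->refcount++;

	return 0;
}

/* Models get_user_pages*(): callers must not pass FOLL_PIN (assumption). */
int model_get_user_page(struct model_page *page, unsigned int flags)
{
	if (flags & MODEL_FOLL_PIN)
		return -EINVAL;

	return model_gup_internal(page, flags | MODEL_FOLL_GET);
}

/* Models pin_user_pages*(): rejects FOLL_GET, sets FOLL_PIN internally. */
int model_pin_user_page(struct model_page *page, unsigned int flags)
{
	if (flags & MODEL_FOLL_GET)
		return -EINVAL;

	return model_gup_internal(page, flags | MODEL_FOLL_PIN);
}

/* Release for the FOLL_GET path; models put_page(). */
void model_put_page(struct model_page *page)
{
	assert(page->refcount > 0);
	page->refcount--;
}

/* Release for the FOLL_PIN path; models put_user_page(). */
void model_put_user_page(struct model_page *page)
{
	assert(page->pincount > 0);
	page->pincount--;
}
```

In this model, calling model_get_user_page() and then model_pin_user_page()
on the same page succeeds and bumps the two counters independently, while
passing MODEL_FOLL_GET to model_pin_user_page() fails with -EINVAL. Each
acquire has to be paired with its own release, which is the mismatch the
comment in the patch is trying to prevent.
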
>> @@ -1603,11 +1630,25 @@ static __always_inline long __gup_longterm_locked(struct task_struct *tsk,
>>   * and mm being operated on are the current task's and don't allow
>>   * passing of a locked parameter. We also obviously don't pass
>>   * FOLL_REMOTE in here.
>> + *
>> + * A note on gup_flags: FOLL_PIN should only be set internally by the
>> + * pin_user_page*() and pin_longterm_*() APIs, never directly by the caller.
>> + * That's in order to help avoid mismatches when releasing pages:
>> + * get_user_pages*() pages must be released via put_page(), while
>> + * pin_user_pages*() pages must be released via put_user_page().
>
> Rather than put this here should we put it next to the definition of FOLL_PIN?
> Because now we have this text 2x... :-/
>

OK, I'll move it up next to FOLL_PIN, and get rid of the 2x places in gup.c

...
>> +long pin_longterm_pages_remote(struct task_struct *tsk, struct mm_struct *mm,
>> + unsigned long start, unsigned long nr_pages,
>> + unsigned int gup_flags, struct page **pages,
>> + struct vm_area_struct **vmas, int *locked)
>> +{
>> + /* FOLL_GET and FOLL_PIN are mutually exclusive. */
>> + if (WARN_ON_ONCE(gup_flags & FOLL_GET))
>> + return -EINVAL;
>> +
>> + /*
>> + * FIXME: as noted in the get_user_pages_remote() implementation, it
>> + * is not yet possible to safely set FOLL_LONGTERM here. FOLL_LONGTERM
>> + * needs to be set, but for now the best we can do is a "TODO" item.
>> + */
>
> Wait? Why can't we set FOLL_LONGTERM here? pin_* are new calls which are not
> used yet right?

Nope, not quite! See patch #14 ("vfio, mm: pin_longterm_pages (FOLL_PIN) and
put_user_page() conversion"), in which I'm converting an existing
get_user_pages_remote() caller.

>
> You set it in the other new pin_* functions?
>

Yes I did. Because those work already in their gup() counterparts.

thanks,

John Hubbard
NVIDIA