Date: Fri, 18 Sep 2020 21:01:53 -0300
From: Jason Gunthorpe
To: John Hubbard
Cc: Peter Xu, Linus Torvalds, Leon Romanovsky, Linux-MM,
 Linux Kernel Mailing List, "Maya B . Gokhale", Yang Shi,
 Marty Mcfadden, Kirill Shutemov, Oleg Nesterov, Jann Horn, Jan Kara,
 Kirill Tkhai, Andrea Arcangeli, Christoph Hellwig, Andrew Morton
Subject: Re: [PATCH 1/4] mm: Trial do_wp_page() simplification
Message-ID: <20200919000153.GZ8409@ziepe.ca>
References: <20200916174804.GC8409@ziepe.ca>
 <20200916184619.GB40154@xz-x1>
 <20200917112538.GD8409@ziepe.ca>
 <20200917193824.GL8409@ziepe.ca>
 <20200918164032.GA5962@xz-x1>
 <20200918173240.GY8409@ziepe.ca>
 <20200918204048.GC5962@xz-x1>
 <0af8c77e-ff60-cada-7d22-c7cfcf859b19@nvidia.com>
In-Reply-To: <0af8c77e-ff60-cada-7d22-c7cfcf859b19@nvidia.com>

On Fri, Sep 18, 2020 at 02:06:23PM -0700, John Hubbard wrote:
> On 9/18/20 1:40 PM, Peter Xu wrote:
> > On Fri, Sep 18, 2020 at 02:32:40PM -0300, Jason Gunthorpe wrote:
> > > On Fri, Sep 18, 2020 at 12:40:32PM -0400, Peter Xu wrote:
> > > 
> > > > Firstly in the draft patch mm->has_pinned is introduced and it's written to 1
> > > > as long as FOLL_GUP is called once.  It's never reset after set.
> > > 
> > > Worth thinking about also adding FOLL_LONGTERM here, at least as long
> > > as it is not a counter. That further limits the impact.
> > 
> > But theoretically we should also trigger COW here for pages even with PIN &&
> > !LONGTERM, am I right?  Assuming that FOLL_PIN is already a corner case.
> > 
> 
> This note, plus Linus' comment about "I'm a normal process, I've never
> done any special rdma page pinning", has me a little worried. Because
> page_maybe_dma_pinned() is counting both short- and long-term pins,
> actually. And that includes O_DIRECT callers.
> 
> O_DIRECT pins are short-term, and RDMA systems are long-term (and should
> be setting FOLL_LONGTERM). But there's no way right now to discern
> between them, once the initial pin_user_pages*() call is complete. All
> we can do today is to count the number of FOLL_PIN calls, not the number
> of FOLL_PIN | FOLL_LONGTERM calls.

My thinking is that to hit this issue you have to already be doing
FOLL_LONGTERM, and if some driver hasn't been properly marked and
regresses, the fix is to mark it.

Remember, this use case requires the pin to extend after a system
call, past another fork() system call, and still have data-coherence.

IMHO that can only happen in the FOLL_LONGTERM case, as it inherently
means the lifetime of the pin is being controlled by userspace, not by
the kernel. Otherwise userspace could not cause new DMA touches after
fork.

Explaining it like that makes me pretty confident it is the right
thing to do, at least for a single bit.

Yes, if we figure out how to do a counter, then the counter can be
everything, but for now, as an rc regression fix, let us limit the
number of impacted cases. We don't need to worry about the unpin
problem because the bit is never undone.

Jason
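
For illustration only, the single-bit scheme argued for above can be
modeled in userspace C roughly as follows. This is a sketch under stated
assumptions, not the kernel patch: struct model_mm, the MODEL_FOLL_*
values, and the model_* helpers are all hypothetical names invented for
this example, and only the has_pinned field name comes from the thread.
The model sets the bit only when a pin request carries both the pin and
long-term flags, never clears it, and lets a fork-time check read it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag values for this model; NOT the kernel's definitions. */
#define MODEL_FOLL_PIN      0x1u
#define MODEL_FOLL_LONGTERM 0x2u

/* Hypothetical stand-in for the per-process memory descriptor. */
struct model_mm {
    bool has_pinned;    /* sticky bit: set once, never reset */
};

/*
 * Record a pin request. The bit is set only when the request is both a
 * pin and long-term, so short-term pins (the O_DIRECT case in the
 * thread) leave the process unaffected.
 */
static void model_pin(struct model_mm *mm, unsigned int gup_flags)
{
    const unsigned int want = MODEL_FOLL_PIN | MODEL_FOLL_LONGTERM;

    if ((gup_flags & want) == want)
        mm->has_pinned = true;
}

/*
 * Unpin deliberately leaves the bit alone. Because it is a single bit
 * and not a counter, there is nothing to decrement.
 */
static void model_unpin(struct model_mm *mm)
{
    (void)mm;
}

/* Fork-time check: has this process ever taken a long-term pin? */
static bool model_fork_needs_care(const struct model_mm *mm)
{
    return mm->has_pinned;
}
```

In this model a process that only ever issues short-term pins never
trips the fork-time check, while one long-term pin marks the process for
the rest of its lifetime, even after unpin. Replacing the bit with a
counter would require a matching decrement on unpin, which is the
problem the single bit avoids.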