Date: Tue, 22 Sep 2020 13:54:15 -0400
From: Peter Xu
To: Jason Gunthorpe
Cc: John Hubbard, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
    Andrew Morton, Jan Kara, Michal Hocko, Kirill Tkhai, Kirill Shutemov,
    Hugh Dickins, Christoph Hellwig, Andrea Arcangeli, Oleg Nesterov,
    Leon Romanovsky, Linus Torvalds, Jann Horn
Subject: Re: [PATCH 1/5] mm: Introduce mm_struct.has_pinned
Message-ID: <20200922175415.GI19098@xz-x1>
References: <20200921211744.24758-1-peterx@redhat.com>
    <20200921211744.24758-2-peterx@redhat.com>
    <224908c1-5d0f-8e01-baa9-94ec2374971f@nvidia.com>
    <20200922151736.GD19098@xz-x1>
    <20200922161046.GB731578@ziepe.ca>
In-Reply-To: <20200922161046.GB731578@ziepe.ca>

On Tue, Sep 22, 2020 at 01:10:46PM -0300, Jason Gunthorpe wrote:
> On Tue, Sep 22, 2020 at 11:17:36AM -0400, Peter Xu wrote:
>
> > > But it's admittedly a cosmetic point, combined with my perennial fear that
> > > I'm missing something when I look at a READ_ONCE()/WRITE_ONCE() pair. :)
> >
> > Yeah but I hope I'm using it right.. :) I used READ_ONCE/WRITE_ONCE explicitly
> > because I think they're cheaper than atomic operations, (which will, iiuc, lock
> > the bus).
>
> It is worth thinking a bit about racing fork with
> pin_user_pages(). The desired outcome is:
>
>   If fork wins the page is write protected, and pin_user_pages_fast()
>   will COW it.
>
>   If pin_user_pages_fast() wins then fork must see the READ_ONCE and
>   the pin.
>
> As get_user_pages_fast() is lockless it looks like the ordering has to
> be like this:
>
>   pin_user_pages_fast()                   fork()
>    atomic_set(has_pinned, 1);
>    [..]
>    atomic_add(page->_refcount)
>    ordered check write protect()
>                                           ordered set write protect()
>                                           atomic_read(page->_refcount)
>                                           atomic_read(has_pinned)
>
> Such that in all the degenerate racy cases the outcome is that both
> sides COW, never neither.
>
> Thus I think it does have to be atomics purely from an ordering
> perspective, observing an increased _refcount requires that has_pinned
> != 0 if we are pinning.
>
> So, to make this 100% this ordering will need to be touched up too.

Thanks for spotting this.  So something like below should work, right?

diff --git a/mm/memory.c b/mm/memory.c
index 8f3521be80ca..6591f3f33299 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -888,8 +888,8 @@ copy_one_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm,
 		 * Because we'll need to release the locks before doing cow,
 		 * pass this work to upper layer.
 		 */
-		if (READ_ONCE(src_mm->has_pinned) && wp &&
-		    page_maybe_dma_pinned(page)) {
+		if (wp && page_maybe_dma_pinned(page) &&
+		    READ_ONCE(src_mm->has_pinned)) {
 			/* We've got the page already; we're safe */
 			data->cow_old_page = page;
 			data->cow_oldpte = *src_pte;

I can also add some more comment to emphasize this.

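The ordering argument in the quoted diagram can be checked with a small toy model. The sketch below is plain userspace C, not kernel code. It assumes every step is fully ordered (sequentially consistent), which is what "ordered" in the diagram is taken to mean here, and it enumerates all interleavings of the three pin-side steps against the three fork-side steps. The struct fields and the function names (`pin_step`, `fork_step`, `explore`, `check_ordering`) are hypothetical and exist only for illustration.

```c
#include <stdbool.h>

/* Toy shared state; field names mirror the thread, nothing here is kernel API. */
struct state {
	int has_pinned;   /* stands in for mm->has_pinned */
	int refcount;     /* stands in for the pin bias on page->_refcount */
	int wp;           /* 1 once the fork side has write-protected the PTE */
	int pin_saw_wp;   /* pin side: result of the write-protect check */
	int fork_saw_ref; /* fork side: value read from refcount */
	int fork_saw_pin; /* fork side: value read from has_pinned */
};

struct result {
	int interleavings;   /* number of schedules explored */
	int neither_cows;    /* schedules where neither side would COW */
	int ref_without_pin; /* schedules where refcount > 0 was seen but has_pinned == 0 */
};

/* Pin side, in diagram order: set has_pinned, bump refcount, check wp. */
static void pin_step(struct state *s, int i)
{
	switch (i) {
	case 0: s->has_pinned = 1; break;
	case 1: s->refcount += 1; break;
	case 2: s->pin_saw_wp = s->wp; break;
	}
}

/* Fork side, in diagram order: set wp, read refcount, read has_pinned. */
static void fork_step(struct state *s, int i)
{
	switch (i) {
	case 0: s->wp = 1; break;
	case 1: s->fork_saw_ref = s->refcount; break;
	case 2: s->fork_saw_pin = s->has_pinned; break;
	}
}

/* Try every interleaving; p and f count the steps each side has already run. */
static void explore(struct state s, int p, int f, struct result *r)
{
	if (p == 3 && f == 3) {
		bool pin_cows = s.pin_saw_wp != 0;
		bool fork_cows = s.fork_saw_ref > 0 && s.fork_saw_pin != 0;

		r->interleavings++;
		if (!pin_cows && !fork_cows)
			r->neither_cows++;
		if (s.fork_saw_ref > 0 && !s.fork_saw_pin)
			r->ref_without_pin++;
		return;
	}
	if (p < 3) {
		struct state n = s;
		pin_step(&n, p);
		explore(n, p + 1, f, r);
	}
	if (f < 3) {
		struct state n = s;
		fork_step(&n, f);
		explore(n, p, f + 1, r);
	}
}

struct result check_ordering(void)
{
	struct state s = {0};
	struct result r = {0};

	explore(s, 0, 0, &r);
	return r;
}
```

Calling `check_ordering()` explores all 20 interleavings and reports no schedule where neither side COWs, and none where a raised refcount is seen with has_pinned still zero. The model says nothing about weaker orderings, where stores and loads may be reordered; that is exactly the case the choice between READ_ONCE/WRITE_ONCE and atomics has to cover.
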
I think the WRITE_ONCE/READ_ONCE can actually be kept, because atomic ops should
contain proper memory barriers already, so the memory access ordering should be
guaranteed (e.g., atomic_add() will have an implicit wmb(); rmb() for the other
side).  However, maybe it's even simpler to change has_pinned into an atomic, as
John suggested.

Thanks,

-- 
Peter Xu