Date: Tue, 18 May 2021 16:29:14 -0400
From: Peter Xu
To: Jason Gunthorpe
Cc: Alistair Popple, linux-mm@kvack.org, nouveau@lists.freedesktop.org, bskeggs@redhat.com, akpm@linux-foundation.org, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, dri-devel@lists.freedesktop.org, jhubbard@nvidia.com, rcampbell@nvidia.com, jglisse@redhat.com, hch@infradead.org, daniel@ffwll.ch, willy@infradead.org, bsingharora@gmail.com, Christoph Hellwig
Subject: Re: [PATCH v8 5/8] mm: Device exclusive memory access
References: <20210407084238.20443-1-apopple@nvidia.com> <20210407084238.20443-6-apopple@nvidia.com> <47694715.suB6H4Uo8R@nvdebian> <20210518173334.GE1002214@nvidia.com> <20210518194509.GF1002214@nvidia.com>
In-Reply-To: <20210518194509.GF1002214@nvidia.com>

On Tue, May 18, 2021 at 04:45:09PM -0300, Jason Gunthorpe wrote:
> On Tue, May 18, 2021 at 02:01:36PM -0400, Peter Xu wrote:
> > > > Indeed it'll be odd for a COW page since for COW page then it means after
> > > > parent/child writing to the page it'll clone into two, then it's a mystery on
> > > > which one will be the one that "exclusively owned" by the device..
> > >
> > > For COW pages it is like every other fork case.. We can't reliably
> > > write-protect the device_exclusive page during fork so we must copy it
> > > at fork time.
> > >
> > > Thus three reasonable choices:
> > >  - Copy to a new CPU page
> > >  - Migrate back to a CPU page and write protect it
> > >  - Copy to a new device exclusive page
> >
> > IMHO the ownership question would really help us to answer this one..
>
> I'm confused about what device ownership you are talking about

My question was more about the user scenario than anything related to the kernel code, and it is not related to struct page at all. Let me try to be a little more verbose...

Firstly, I think one simple way to handle fork() of device exclusive ptes is to treat them just like device private ptes: for a COW mapping we convert writable ptes into readable ptes. Then when a CPU access happens (in either parent or child), page restore triggers, which converts those readable ptes into read-only present ptes (with the original page backing them). Then do_wp_page() will take care of the page copy.

However... that also means the parent and child have an equal opportunity to reuse that original page: whoever accesses it first will do COW because refcount>1 for that page (note! it's possible that mapcount==1 here, as we drop mapcount when converting to device exclusive ptes; however, with the most recent do_wp_page() change from Linus, where we also check page_count(), we'll still do COW just like when this page was GUPed by someone else). That matters because the device is writing to that original page only, not the COWed one.
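
To make the scenario above concrete, here is a toy user-space model in Python. It is not kernel code: Page, Mapping, fork(), write_fault() and fork_early_cow() are hypothetical names invented for this sketch, and it assumes a single refcount that counts only the toy mappings (no mapcount, no extra device reference). It only illustrates that with lazy COW the process that writes first gets the copy, while copying at fork time (as in the quoted suggestion) always leaves the original page with the parent.

```python
from dataclasses import dataclass, field
from itertools import count

_pfns = count(1)  # hypothetical page frame numbers, for readability only


@dataclass
class Page:
    """Toy physical page; refcount counts the toy mappings pointing at it."""
    refcount: int = 0
    pfn: int = field(default_factory=lambda: next(_pfns))


@dataclass
class Mapping:
    """One process's view of a single virtual address."""
    owner: str
    page: Page
    writable: bool


def map_page(owner, page, writable):
    page.refcount += 1
    return Mapping(owner, page, writable)


def fork(parent):
    """Lazy COW: write-protect the parent and share the page with the child."""
    parent.writable = False
    return map_page("child", parent.page, writable=False)


def fork_early_cow(parent):
    """Early COW: the child gets its own page at fork time (contents not modelled)."""
    return map_page("child", Page(), writable=True)


def write_fault(mapping):
    """Toy stand-in for the write-protect fault: copy if shared, else reuse."""
    if mapping.page.refcount > 1:
        mapping.page.refcount -= 1
        mapping.page = Page(refcount=1)  # the writer moves to a fresh copy
    mapping.writable = True


def lazy_cow(first_writer):
    """Return (parent keeps original?, child keeps original?) after both write."""
    original = Page()
    parent = map_page("parent", original, writable=True)
    child = fork(parent)
    first, second = (parent, child) if first_writer == "parent" else (child, parent)
    write_fault(first)
    write_fault(second)
    return parent.page is original, child.page is original


def early_cow():
    """Same question, but with the copy made at fork time."""
    original = Page()
    parent = map_page("parent", original, writable=True)
    child = fork_early_cow(parent)
    return parent.page is original, child.page is original
```

In this model lazy_cow("parent") returns (False, True), i.e. the child ends up with the page the device was writing to, while lazy_cow("child") returns (True, False). early_cow() returns (True, False) regardless of who touches the page first.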

Then here comes the ownership question: if we still want the parent process to behave as it did before it fork()ed, IMHO we must make sure that the original page (the one that was once exclusively owned by the device) still belongs to the parent process, not the child. That's why I think, if that's the case, we'd do early COW in fork(), because it guarantees that.

I can't say I fully understand the whole picture, so sorry if I missed something important there.

Thanks,

-- 
Peter Xu