Date: Mon, 12 Apr 2021 19:17:36 -0400
From: Peter Xu
To: Axel Rasmussen
Cc: Alexander Viro, Andrea Arcangeli, Andrew Morton, Daniel Colascione,
    Hugh Dickins, Jerome Glisse, Joe Perches, Lokesh Gidra, Mike Kravetz,
    Mike Rapoport, Shaohua Li, Shuah Khan, Stephen Rothwell, Wang Qing,
    linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org,
    linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org,
    linux-mm@kvack.org, Brian Geffon, "Dr. David Alan Gilbert",
    Mina Almasry, Oliver Upton
Subject: Re: [PATCH 4/9] userfaultfd/shmem: support UFFDIO_CONTINUE for shmem
Message-ID: <20210412231736.GA1002612@xz-x1>
References: <20210408234327.624367-1-axelrasmussen@google.com>
 <20210408234327.624367-5-axelrasmussen@google.com>
In-Reply-To: <20210408234327.624367-5-axelrasmussen@google.com>

On Thu, Apr 08, 2021 at 04:43:22PM -0700, Axel Rasmussen wrote:
> +/*
> + * Install PTEs, to map dst_addr (within dst_vma) to page.
> + *
> + * This function handles MCOPY_ATOMIC_CONTINUE (which is always file-backed),
> + * whether or not dst_vma is VM_SHARED. It also handles the more general
> + * MCOPY_ATOMIC_NORMAL case, when dst_vma is *not* VM_SHARED (it may be file
> + * backed, or not).
> + *
> + * Note that MCOPY_ATOMIC_NORMAL for a VM_SHARED dst_vma is handled by
> + * shmem_mcopy_atomic_pte instead.
> + */
> +static int mcopy_atomic_install_ptes(struct mm_struct *dst_mm, pmd_t *dst_pmd,
> +                                     struct vm_area_struct *dst_vma,
> +                                     unsigned long dst_addr, struct page *page,
> +                                     bool newly_allocated, bool wp_copy)
> +{
> +        int ret;
> +        pte_t _dst_pte, *dst_pte;
> +        int writable;
> +        bool vm_shared = dst_vma->vm_flags & VM_SHARED;
> +        spinlock_t *ptl;
> +        struct inode *inode;
> +        pgoff_t offset, max_off;
> +
> +        _dst_pte = mk_pte(page, dst_vma->vm_page_prot);
> +        writable = dst_vma->vm_flags & VM_WRITE;
> +        /* For private, non-anon we need CoW (don't write to page cache!) */
> +        if (!vma_is_anonymous(dst_vma) && !vm_shared)
> +                writable = 0;
> +
> +        if (writable || vma_is_anonymous(dst_vma))
> +                _dst_pte = pte_mkdirty(_dst_pte);
> +        if (writable) {
> +                if (wp_copy)
> +                        _dst_pte = pte_mkuffd_wp(_dst_pte);
> +                else
> +                        _dst_pte = pte_mkwrite(_dst_pte);
> +        } else if (vm_shared) {
> +                /*
> +                 * Since we didn't pte_mkdirty(), mark the page dirty or it
> +                 * could be freed from under us. We could do this
> +                 * unconditionally, but doing it only if !writable is faster.
> +                 */
> +                set_page_dirty(page);
> +        }
> +
> +        dst_pte = pte_offset_map_lock(dst_mm, dst_pmd, dst_addr, &ptl);
> +
> +        if (vma_is_shmem(dst_vma)) {
> +                /* The shmem MAP_PRIVATE case requires checking the i_size */

When you start to use this function in the last patch, it'll be needed even
for MAP_SHARED, won't it?  How about directly stating the reason for the check
("serialize against truncate with the PT lock") instead of commenting on who
will need it?

> +                inode = dst_vma->vm_file->f_inode;
> +                offset = linear_page_index(dst_vma, dst_addr);
> +                max_off = DIV_ROUND_UP(i_size_read(inode), PAGE_SIZE);
> +                ret = -EFAULT;
> +                if (unlikely(offset >= max_off))
> +                        goto out_unlock;
> +        }

[...]

> +/* Handles UFFDIO_CONTINUE for all shmem VMAs (shared or private). */
> +static int mcontinue_atomic_pte(struct mm_struct *dst_mm,
> +                                pmd_t *dst_pmd,
> +                                struct vm_area_struct *dst_vma,
> +                                unsigned long dst_addr,
> +                                bool wp_copy)
> +{
> +        struct inode *inode = file_inode(dst_vma->vm_file);
> +        pgoff_t pgoff = linear_page_index(dst_vma, dst_addr);
> +        struct page *page;
> +        int ret;
> +
> +        ret = shmem_getpage(inode, pgoff, &page, SGP_READ);

SGP_READ looks right, as we don't want page allocation here.  However, I
noticed there's a very slight difference when the page was just fallocated:

        /* fallocated page? */
        if (page && !PageUptodate(page)) {
                if (sgp != SGP_READ)
                        goto clear;
                unlock_page(page);
                put_page(page);
                page = NULL;
                hindex = index;
        }

I think it won't happen in your case, since the page should already be
uptodate (the other thread should check and modify the page before issuing
CONTINUE), but I'm raising it anyway: if the page was already allocated, it
seems better to still install the fallocated page (do we need to clear the
page and SetPageUptodate for that?).
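To illustrate what I mean, a completely untested sketch, and it only makes
sense if shmem_getpage() were to hand the fallocated (!Uptodate) page back to
the caller instead of dropping it as the SGP_READ path above does, so please
don't read it as a concrete proposal:

        if (page && !PageUptodate(page)) {
                /*
                 * Hypothetical: accept the fallocated page by zero-filling
                 * it and marking it uptodate before installing the pte.
                 */
                clear_highpage(page);
                flush_dcache_page(page);
                SetPageUptodate(page);
        }

Anyway, just a thought; probably not worth it if the case can't happen in
practice.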
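And to make the earlier comment about the i_size check concrete, the kind of
wording I had in mind is roughly this (comment change only, everything else
as in your patch):

        if (vma_is_shmem(dst_vma)) {
                /* Serialize against shmem truncate with the page table lock */
                inode = dst_vma->vm_file->f_inode;
                offset = linear_page_index(dst_vma, dst_addr);
                max_off = DIV_ROUND_UP(i_size_read(inode), PAGE_SIZE);
                ret = -EFAULT;
                if (unlikely(offset >= max_off))
                        goto out_unlock;
        }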
-- 
Peter Xu