From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.8 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS, URIBL_BLOCKED autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3864BC001B3 for ; Thu, 1 Apr 2021 18:43:07 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 0FF0260FD9 for ; Thu, 1 Apr 2021 18:43:07 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241048AbhDASm1 (ORCPT ); Thu, 1 Apr 2021 14:42:27 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37862 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237342AbhDASTH (ORCPT ); Thu, 1 Apr 2021 14:19:07 -0400 Received: from mail-wr1-x42a.google.com (mail-wr1-x42a.google.com [IPv6:2a00:1450:4864:20::42a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B4DD0C0617A7 for ; Thu, 1 Apr 2021 04:24:08 -0700 (PDT) Received: by mail-wr1-x42a.google.com with SMTP id v11so1436373wro.7 for ; Thu, 01 Apr 2021 04:24:08 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=android.com; s=20161025; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=uzuER9XggA/Xm395Jcfm7hXPCvj0tcp4BQB1DPPsWqk=; b=SG6xbMYAFAMzLUnkbQbcb8vCNm82fY5YFkUb4bfvv+dH4XpKA7O8HNsRF0zHuG2jmN HN5LjG1d19j0K9wuLhLSp5RmZCUteEt5UsTA6zsqiZnPI/PY6x4fcRqMpepymL+3QIom +5gRd9Dk/LHS8jpMrJxuP2AKUbUE1XVgi+SNLbCffLFV2Th9NGHICd/+anUM42krPwUd IbuUtyhrxEvrsx6RSYejW6a1iiDsBMIC6joh32sDh+p+fHFe03vymhIvxRUGGY8Gf5qB IKxpUxQM4/H9mP8OsBsQpJiriwx5/tY/0EvMPCq7doxD58DBySc2hwk7vlenq4eCUY3h lS2Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=uzuER9XggA/Xm395Jcfm7hXPCvj0tcp4BQB1DPPsWqk=; b=nsI8JgZcoXV1yn65ZXIQWd8TzgS3QB+z2GO2Sfyk5685rv8o20dyOhLOGihXL4wOmx mPBsl+AJ/Iko96Uhg52icmIkg3sNzgGCCHOCrMtOK7Cd1FOkO+bzBmpqISLqDC+X5u+4 lO+/tvhZBnRpKHW4soId23nXEZ3+9/wuogBZ5XbM6bEQlUmy1bRv5lyfsaiPiGWQ0Ljc kQtf14+lbd9fN249lekLBT06DBzb01BuShYAGe6KfpS8aoG/4oFmKaV1A/HSjaZkQI3O Sp5s0R69Aq7pUeIhpwRAy7FcnwoGLMUizVTFBJob5UbO0syOzLUHizlEhFZZDFElf76s 5gpw== X-Gm-Message-State: AOAM533+Y3PDppFkxvqsSdQIJNuh3VimWdfUDHsklZGwGHAPkSZjsnUh l3VJLFa+BSfHGeoxWEbTTyTp1g0JmNW5LQ== X-Google-Smtp-Source: ABdhPJx1VqcNYrWZdJ0X35Kyua62s3EHzvXwUtXZBoVK0FTVUTVXO74tmMIR+k+MZLDo/2y6Hlfp4w== X-Received: by 2002:a05:6000:1acb:: with SMTP id i11mr9333662wry.68.1617276247503; Thu, 01 Apr 2021 04:24:07 -0700 (PDT) Received: from google.com ([2a00:79e0:d:210:3c81:8c37:b59c:f92e]) by smtp.gmail.com with ESMTPSA id y205sm10407745wmc.18.2021.04.01.04.24.06 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 01 Apr 2021 04:24:06 -0700 (PDT) Date: Thu, 1 Apr 2021 12:24:05 +0100 From: Alessio Balsini To: Miklos Szeredi Cc: Alessio Balsini , Akilesh Kailash , Amir Goldstein , Antonio SJ Musumeci , David Anderson , Giuseppe Scrivano , Jann Horn , Jens Axboe , Martijn Coenen , Palmer Dabbelt , Paul Lawrence , Peng Tao , Stefano Duo , Zimuzo Ezeozue , wuyan , fuse-devel , kernel-team , linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH RESEND V12 8/8] fuse: Introduce passthrough for mmap Message-ID: References: <20210125153057.3623715-1-balsini@android.com> <20210125153057.3623715-9-balsini@android.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Feb 17, 2021 at 03:05:07PM +0100, Miklos Szeredi wrote: > On Mon, Jan 25, 2021 at 4:31 PM Alessio Balsini wrote: > > > > Enabling FUSE passthrough for mmap-ed operations not only affects > > performance, but has also been shown as mandatory for the correct > > functioning of FUSE passthrough. > > yanwu noticed [1] that a FUSE file with passthrough enabled may suffer > > data inconsistencies if the same file is also accessed with mmap. What > > happens is that read/write operations are directly applied to the lower > > file system (and its cache), while mmap-ed operations are affecting the > > FUSE cache. > > > > Extend the FUSE passthrough implementation to also handle memory-mapped > > FUSE file, to both fix the cache inconsistencies and extend the > > passthrough performance benefits to mmap-ed operations. > > > > [1] https://lore.kernel.org/lkml/20210119110654.11817-1-wu-yan@tcl.com/ > > > > Signed-off-by: Alessio Balsini > > --- > > fs/fuse/file.c | 3 +++ > > fs/fuse/fuse_i.h | 1 + > > fs/fuse/passthrough.c | 41 +++++++++++++++++++++++++++++++++++++++++ > > 3 files changed, 45 insertions(+) > > > > diff --git a/fs/fuse/file.c b/fs/fuse/file.c > > index cddada1e8bd9..e3741a94c1f9 100644 > > --- a/fs/fuse/file.c > > +++ b/fs/fuse/file.c > > @@ -2370,6 +2370,9 @@ static int fuse_file_mmap(struct file *file, struct vm_area_struct *vma) > > if (FUSE_IS_DAX(file_inode(file))) > > return fuse_dax_mmap(file, vma); > > > > + if (ff->passthrough.filp) > > + return fuse_passthrough_mmap(file, vma); > > + > > if (ff->open_flags & FOPEN_DIRECT_IO) { > > /* Can't provide the coherency needed for MAP_SHARED */ > > if (vma->vm_flags & VM_MAYSHARE) > > diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h > > index 815af1845b16..7b0d65984608 100644 > > --- a/fs/fuse/fuse_i.h > > +++ b/fs/fuse/fuse_i.h > > @@ -1244,5 +1244,6 @@ int fuse_passthrough_setup(struct fuse_conn *fc, struct fuse_file *ff, > > void fuse_passthrough_release(struct fuse_passthrough *passthrough); > > ssize_t fuse_passthrough_read_iter(struct kiocb *iocb, struct iov_iter *to); > > ssize_t fuse_passthrough_write_iter(struct kiocb *iocb, struct iov_iter *from); > > +ssize_t fuse_passthrough_mmap(struct file *file, struct vm_area_struct *vma); > > > > #endif /* _FS_FUSE_I_H */ > > diff --git a/fs/fuse/passthrough.c b/fs/fuse/passthrough.c > > index 24866c5fe7e2..284979f87747 100644 > > --- a/fs/fuse/passthrough.c > > +++ b/fs/fuse/passthrough.c > > @@ -135,6 +135,47 @@ ssize_t fuse_passthrough_write_iter(struct kiocb *iocb_fuse, > > return ret; > > } > > > > +ssize_t fuse_passthrough_mmap(struct file *file, struct vm_area_struct *vma) > > +{ > > + int ret; > > + const struct cred *old_cred; > > + struct fuse_file *ff = file->private_data; > > + struct inode *fuse_inode = file_inode(file); > > + struct file *passthrough_filp = ff->passthrough.filp; > > + struct inode *passthrough_inode = file_inode(passthrough_filp); > > + > > + if (!passthrough_filp->f_op->mmap) > > + return -ENODEV; > > + > > + if (WARN_ON(file != vma->vm_file)) > > + return -EIO; > > + > > + vma->vm_file = get_file(passthrough_filp); > > + > > + old_cred = override_creds(ff->passthrough.cred); > > + ret = call_mmap(vma->vm_file, vma); > > + revert_creds(old_cred); > > + > > + if (ret) > > + fput(passthrough_filp); > > + else > > + fput(file); > > + > > + if (file->f_flags & O_NOATIME) > > + return ret; > > + > > + if ((!timespec64_equal(&fuse_inode->i_mtime, > > + &passthrough_inode->i_mtime) || > > + !timespec64_equal(&fuse_inode->i_ctime, > > + &passthrough_inode->i_ctime))) { > > + fuse_inode->i_mtime = passthrough_inode->i_mtime; > > + fuse_inode->i_ctime = passthrough_inode->i_ctime; > > Again, violation of rules. Not sure why this is needed, mmap(2) > isn't supposed to change mtime or ctime, AFAIK. > > Thanks, > Miklos Hi Miklos, I don't have a strong preference for this and will drop the ctime/atime updates in v13. For the records, here follows my reasoning for which I decided to update atime/ctime here. >From the stats(2) man it just says that it's not guaranteed that atime would be updated, as `Other routines, like mmap(2), may or may not update st_atime.` Something similar according to the inotify(7) man that warns not to trigger events after mmap(2), msync(2), and munmap(2) operations. The mmap(2) man mentions that st_ctime and st_mtime would be updated for file mappings with PROT_WRITE and MAP_SHARED, before a msync(2) with MS_SYNC or MS_ASYNC. This passthrough scenario is slightly different from the standard mmap, but it seems to me that we are kind of falling into a similar use case for the atime/ctime update. I would imagine this is why OverlayFS updates atime/ctime too in ovl_mmap(), through ovl_copyattr(). Thanks, Alessio