References: <20190128125044.GC27972@quack2.suse.cz> <20190128212642.GQ4205@dastard> <20190129001826.GV4205@dastard> <20190129230129.GD4205@dastard>
In-Reply-To: <20190129230129.GD4205@dastard>
From: Amir Goldstein
Date: Wed, 30 Jan 2019 15:30:39 +0200
Subject: Re: [LSF/MM TOPIC] Lazy file reflink
To: Dave Chinner
Cc: Jan Kara, lsf-pc@lists.linux-foundation.org, linux-fsdevel, linux-xfs, "Darrick J. Wong", Christoph Hellwig
X-Mailing-List: linux-fsdevel@vger.kernel.org

On Wed, Jan 30, 2019 at 1:01 AM Dave Chinner wrote:
>
> On Tue, Jan 29, 2019 at 09:18:57AM +0200, Amir Goldstein wrote:
> > > > I think it's a good idea to add file freeze semantics to the toolbox
> > > > of useful things that could be accomplished with reflink.
> > >
> > > reflink is already atomic w.r.t. other writes - in what way does a
> > > "file freeze" have any impact on a reflink operation? that is, apart
> > > from preventing it from being done, because reflink can modify the
> > > source inode on XFS, too....
> > >
> >
> > - create O_TMPFILE
> > - freeze source file
> > - read and calculate hash from source file
> > - likely unfreeze and skip reflink+backup
>
> IF you can read the file to determine if you need to back up the
> file while it's frozen, then why do you need to reflink it
> before you read the file to back it up? It's the /exact same
> read operation/ on the file.
>
> I'm not sure what the hell you actually want now, because you've
> contradicting yourself again by saying you want read the entire file
> while it's frozen and blocking writes, after you just said that's
> not acceptible and why reflink is required....
>

Mmm. I tried to stick to a simplified description and picked one
specific use case, a large file, which was not a good example.
A colleague has corrected me: we are more concerned about the cost of
reflinking many millions of files just to find out that they have not
changed, and even just to back them up. That ended up being confusing.
If I contradicted myself, it is because my description was often
missing a "sometimes".

Let's start over:

For small files that can fit in application buffers, freeze is
probably enough.

For very large files, your assertion that reflink is cheap compared to
reading the file is correct, so there is probably no justification for
lazy reflink.

In the middle, there are files that we can analyse fast enough to
determine which parts of the file have been modified, and then send
only the modified parts. The send can happen much later. If we analysed
the frozen file, we would not want the file to change before the
modified parts are sent out, hence we would want a reflink that is
consistent with the file that we analysed. For that case, I think we
need freeze+reflink, where the reflink is done under write protection.
If I am wrong, please show me how.

> > Bottom line: I completely agree with you that "file freeze" is sufficient
> > for the case I presented, as long as reflink is allowed while file is frozen.
> > IOW, break the existing compound API freeze+reflink+unfreeze to
> > individual operations to give more control over to user.
>
> I don't think that's a good idea. If we allow "metadata" to be
> unfrozen, but only freeze data, does that mean we allow modifying
> owner, perms, attributes, etc, but then don't allow truncate. What
> about preallocation over holes? That doesn't change data, and it's
> only a metadata modification. What about background dedupe? That
> sort of thing is a can of worms that I don't want to go anywhere
> near. Either the file is frozen (i.e. effectively immutable but
> blocks modifications rather than EPERMs) or it's a normal, writeable
> file - madness lies within any other boundary...
>

OK. The data/metadata dichotomy may be the wrong one to use here,
because it is not well defined. We all know that truncate and fallocate
need to be serialized with data modifications, and filesystems already
do that, so whatever you call it, we all know what "data changes"
means. I agree that data sometimes requires consistency with metadata.

The fact is that the backup application is interested in the file
content and metadata, but NOT in the file's disk layout.

generic_remap_file_range_prep() does not require that the source file
is not immutable. Does XFS?

I don't know if "immutable" has ever been defined w.r.t. file layout
on disk. Has it? I reckon btrfs re-balancing would not stop at
migrating the blocks of an "immutable" file, would it?

Still madness? Or sparks of sanity?

Thanks,
Amir.
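
For reference, the three size classes from the "Let's start over"
part of this message can be written down as a small policy function.
This is only an illustrative sketch in Python: the thresholds
BUFFER_LIMIT and LARGE_FILE_THRESHOLD and the names Strategy and
choose_strategy are hypothetical, the thread gives no numbers, and no
per-file freeze interface exists in the kernel today.

```python
from enum import Enum

# Hypothetical thresholds, chosen only for illustration.
# The thread does not name any concrete sizes.
BUFFER_LIMIT = 1 << 20          # files up to 1 MiB fit in application buffers
LARGE_FILE_THRESHOLD = 1 << 30  # from 1 GiB on, reading dwarfs the reflink cost


class Strategy(Enum):
    # Small file: freeze, read it into buffers, unfreeze. No reflink.
    FREEZE_ONLY = "freeze only"
    # Mid-size file: freeze, analyse, reflink under write protection,
    # unfreeze, and send the modified parts later from the clone.
    FREEZE_THEN_REFLINK = "freeze + reflink"
    # Very large file: reflink is cheap relative to the read, so take an
    # ordinary (non-lazy) reflink up front.
    REFLINK_ONLY = "reflink only"


def choose_strategy(size: int) -> Strategy:
    """Map a file size in bytes to one of the three cases."""
    if size < 0:
        raise ValueError("size must be non-negative")
    if size <= BUFFER_LIMIT:
        return Strategy.FREEZE_ONLY
    if size >= LARGE_FILE_THRESHOLD:
        return Strategy.REFLINK_ONLY
    return Strategy.FREEZE_THEN_REFLINK
```

A real backup application would more likely pick the class from the
measured cost of analysing a file than from its size alone; size is
used here only because it is what the description above is phrased in.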