Date: Wed, 20 Mar 2019 07:43:30 +1100
From: Dave Chinner
To: Amir Goldstein
Cc: Jayashree, fstests, linux-fsdevel, linux-doc@vger.kernel.org,
    Vijaychidambaram Velayudhan Pillai, Theodore Tso, chao@kernel.org,
    Filipe Manana, Jonathan Corbet, Josef Bacik, Anna Schumaker
Subject: Re: [PATCH v2] Documenting the crash-recovery guarantees of Linux file systems
Message-ID: <20190319204330.GY26298@dastard>
References: <1552418820-18102-1-git-send-email-jaya@cs.utexas.edu>
    <20190314011925.GG23020@dastard> <20190315030313.GP26298@dastard>
    <20190317221652.GQ26298@dastard> <20190319031312.GW26298@dastard>
X-Mailing-List: linux-fsdevel@vger.kernel.org

On Tue, Mar 19, 2019 at 09:35:19AM +0200, Amir Goldstein wrote:
> On Tue, Mar 19, 2019 at 5:13 AM Dave Chinner wrote:
> > That is, sync_file_range() is only safe to use for this specific
> > sort of ordered data integrity algorithm when flushing the entire
> > file.(*)
> >
> >   create
> >   setxattr
> >   write                         metadata volatile
> >     delayed allocation          data volatile
> >   ....
> >   sync_file_range(fd, 0, 0, SYNC_FILE_RANGE_WAIT_BEFORE |
> >           SYNC_FILE_RANGE_WRITE | SYNC_FILE_RANGE_WAIT_AFTER);
> >     Extent Allocation           metadata volatile
> >     ----> device -+
> >                                 data volatile
> >     <-- complete -+
> >   ....
> >   rename                        metadata volatile
> >
> > And so at this point, we only need a device cache flush to
> > make the data persistent and a journal flush to make the rename
> > persistent. And so it ends up the same case as non-AIO O_DIRECT.
> >
>
> Funny, I once told that story and one Dave Chinner told me
> "Nice story, but wrong.":
> https://patchwork.kernel.org/patch/10576303/#22190719
>
> You pointed to the minor detail that sync_file_range() uses
> WB_SYNC_NONE.

Ah, I forgot about that. That's what I get for not looking at the
code.

Did I mention that SFR is a complete crock of shit when it comes to
data integrity operations? :/

> So yes, I agree, it is a nice story and we need to make it right,
> by having an API (perhaps SYNC_FILE_RANGE_ALL).
> When you pointed out my mistake, I switched the application to
> use the FIEMAP_FLAG_SYNC API as a hack.

Yeah, that's a nasty hack :/

> Besides tests and documentation what could be useful is a portable
> user space library that just does the right thing for every filesystem.

*nod*

but before that, we need the model to be defined and documented.
And once we have a library, the fun part of convincing the world
that it should be the glibc default behaviour can begin....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
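
[Editorial sketch, not part of the original message.] Since the thread concludes that sync_file_range() uses WB_SYNC_NONE writeback and so cannot be relied on for data integrity, the conservative fallback for the create/write/rename sequence quoted above is a plain fsync() of the file before the rename, followed by an fsync() of the parent directory. The code below is a minimal sketch of that widely used pattern and is not taken from the thread. The helper name replace_file_durably(), the ".NAME.tmp" temporary naming and the 0644 mode are all hypothetical, and how much each filesystem actually guarantees here is the open question this thread is trying to document.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the thread): replace dir/name with the
 * len bytes at buf, aiming for "old contents or new contents" after a
 * crash. Returns 0 on success, -1 on failure.
 */
int replace_file_durably(const char *dir, const char *name,
                         const void *buf, size_t len)
{
    char tmp[4096], dst[4096];
    int n;

    assert(dir && name && buf);

    /* Build the temporary and final paths; fail on truncation. */
    n = snprintf(tmp, sizeof tmp, "%s/.%s.tmp", dir, name);
    if (n < 0 || (size_t)n >= sizeof tmp)
        return -1;
    n = snprintf(dst, sizeof dst, "%s/%s", dir, name);
    if (n < 0 || (size_t)n >= sizeof dst)
        return -1;

    int fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    /* Write everything, retrying on short writes and EINTR. */
    const char *p = buf;
    while (len > 0) {
        ssize_t w = write(fd, p, len);
        if (w < 0) {
            if (errno == EINTR)
                continue;
            close(fd);
            unlink(tmp);
            return -1;
        }
        p += w;
        len -= (size_t)w;
    }

    /*
     * fsync() instead of sync_file_range(): per the thread, SFR does
     * WB_SYNC_NONE writeback and is not a data integrity operation.
     */
    if (fsync(fd) < 0) {
        close(fd);
        unlink(tmp);
        return -1;
    }
    if (close(fd) < 0) {
        unlink(tmp);
        return -1;
    }

    /* Atomically switch the name over to the new file. */
    if (rename(tmp, dst) < 0) {
        unlink(tmp);
        return -1;
    }

    /* fsync the parent directory so the rename itself is persistent. */
    int dfd = open(dir, O_RDONLY | O_DIRECTORY);
    if (dfd < 0)
        return -1;
    int rc = fsync(dfd);
    close(dfd);
    return rc;
}
```

A caller would use it as replace_file_durably("/some/dir", "config", data, data_len) and treat any non-zero return as "the old file may still be in place". The cost is a full fsync() per update, which is exactly the overhead the sync_file_range() scheme discussed above was trying to avoid.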