Date: Thu, 16 Jan 2020 11:16:12 +0100
From: Christoph Hellwig
To: Andreas Dilger
Cc: David Howells, Christoph Hellwig, Qu Wenruo, linux-fsdevel, Al Viro,
	"Theodore Y. Ts'o", "Darrick J. Wong", Chris Mason, Josef Bacik,
	David Sterba, linux-ext4, linux-xfs, linux-btrfs,
	Linux Kernel Mailing List
Subject: Re: Problems with determining data presence by examining extents?
Message-ID: <20200116101612.GB16435@lst.de>
References: <4467.1579020509@warthog.procyon.org.uk>
	<00fc7691-77d5-5947-5493-5c97f262da81@gmx.com>
	<27181AE2-C63F-4932-A022-8B0563C72539@dilger.ca>
	<20200115133101.GA28583@lst.de>
X-Mailing-List: linux-ext4@vger.kernel.org

On Wed, Jan 15, 2020 at 12:48:44PM -0700, Andreas Dilger wrote:
> I don't think either of those will be any better than FIEMAP, if the reason
> is that the underlying filesystem is filling in holes with actual data
> blocks to optimize the IO pattern. SEEK_HOLE would not find a hole in
> the block allocation, and would happily return the block of zeroes to
> the caller. Also, it isn't clear if SEEK_HOLE considers an allocated but
> unwritten extent to be a hole or a block?

It is supposed to treat unwritten extents that are not dirty as holes.

Note that fiemap can't even track the dirty state, so it will always
give you the wrong answer in some cases. And that is by design given
that it is a debug tool to give you the file system extent layout and
can't be used for data integrity purposes.
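
For illustration only, here is a minimal sketch of how a caller might walk
a file with lseek(SEEK_DATA) and lseek(SEEK_HOLE) instead of reading the
extent layout through fiemap. It is not taken from this thread: the helper
name data_regions() and the demo file layout are assumptions, and it
relies on os.SEEK_DATA and os.SEEK_HOLE, which Python exposes only on
platforms that support them (Linux among them). As noted in the quoted
text, the result reflects what the filesystem reports as data, so blocks
of zeroes that the filesystem chose to allocate will still show up as
data.

```python
import errno
import os
import tempfile


def data_regions(fd):
    """Return [(start, end), ...] byte ranges that lseek() reports as data.

    Hypothetical helper: it only asks the filesystem where data and holes
    are, and never inspects the extent layout itself.
    """
    size = os.fstat(fd).st_size
    regions = []
    offset = 0
    while offset < size:
        try:
            # Next offset >= `offset` that the filesystem reports as data.
            start = os.lseek(fd, offset, os.SEEK_DATA)
        except OSError as exc:
            if exc.errno == errno.ENXIO:
                break  # nothing but a hole up to end of file
            raise
        # Next hole after `start`; end of file counts as an implicit hole.
        end = os.lseek(fd, start, os.SEEK_HOLE)
        regions.append((start, end))
        offset = end
    return regions


if __name__ == "__main__":
    # Demo layout (an assumption): 4 KiB of data, a gap, 4 KiB more at 1 MiB.
    with tempfile.TemporaryFile() as f:
        fd = f.fileno()
        os.pwrite(fd, b"x" * 4096, 0)
        os.pwrite(fd, b"y" * 4096, 1 << 20)
        # How many regions are printed depends on the filesystem.
        print(data_regions(fd))
```

A filesystem that does not track holes may report the whole file as a
single data region, so the output of the demo is not fixed. The only
property the sketch relies on is that every byte that was written falls
inside some reported data region.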