Linux-ext4 Archive on lore.kernel.org
 help / color / Atom feed
From: David Howells <dhowells@redhat.com>
To: Christoph Hellwig <hch@lst.de>
Cc: dhowells@redhat.com, Qu Wenruo <quwenruo.btrfs@gmx.com>,
	Andreas Dilger <adilger@dilger.ca>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Al Viro <viro@zeniv.linux.org.uk>,
	"Theodore Y. Ts'o" <tytso@mit.edu>,
	"Darrick J. Wong" <darrick.wong@oracle.com>,
	Chris Mason <clm@fb.com>, Josef Bacik <josef@toxicpanda.com>,
	David Sterba <dsterba@suse.com>,
	linux-ext4 <linux-ext4@vger.kernel.org>,
	linux-xfs <linux-xfs@vger.kernel.org>,
	linux-btrfs <linux-btrfs@vger.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: Problems with determining data presence by examining extents?
Date: Wed, 15 Jan 2020 14:35:22 +0000
Message-ID: <26093.1579098922@warthog.procyon.org.uk> (raw)
In-Reply-To: <20200115133101.GA28583@lst.de>

Christoph Hellwig <hch@lst.de> wrote:

> If we can't get that easily it can be emulated using lseek SEEK_DATA /
> SEEK_HOLE assuming no other thread could be writing to the file, or the
> raciness doesn't matter.

Another thread could be writing to the file, and the raciness matters if I
want to cache the result of calling SEEK_HOLE - though it might be possible
just to mask it off.

One problem I have with SEEK_HOLE is that there's no upper bound on it.  Say
I have a 1GiB cachefile that's completely populated and I want to find out if
the first byte is present or not.  I call:

	end = vfs_llseek(file, SEEK_HOLE, 0);

It will have to scan the metadata of the entire 1GiB file and will then
presumably return the EOF position.  Now this might only be a mild irritation
as I can cache this information for later use, but it does put potentially put
a performance hiccough in the case of someone only reading the first page or
so of the file (say the file program).  On the other hand, probably most of
the files in the cache are likely to be complete - in which case, it's
probably quite cheap.

However, SEEK_HOLE doesn't help with the issue of the filesystem 'altering'
the content of the file by adding or removing blocks of zeros.

David


  parent reply index

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-01-14 16:48 David Howells
2020-01-14 22:49 ` Theodore Y. Ts'o
2020-01-15  3:54 ` Qu Wenruo
2020-01-15 12:46   ` Andreas Dilger
2020-01-15 13:10     ` Qu Wenruo
2020-01-15 13:31       ` Christoph Hellwig
2020-01-15 19:48         ` Andreas Dilger
2020-01-16 10:16           ` Christoph Hellwig
2020-01-15 20:55         ` David Howells
2020-01-15 22:11           ` Andreas Dilger
2020-01-15 23:09           ` David Howells
2020-01-26 18:19             ` Zygo Blaxell
2020-01-15 14:35       ` David Howells [this message]
2020-01-15 14:48         ` Christoph Hellwig
2020-01-15 14:59         ` David Howells
2020-01-16 10:13           ` Christoph Hellwig
2020-01-17 16:43           ` David Howells
2020-01-15 14:20   ` David Howells
2020-01-15  8:38 ` Christoph Hellwig
2020-01-15 13:50 ` David Howells
2020-01-15 14:05 ` David Howells
2020-01-15 14:24   ` Qu Wenruo
2020-01-15 14:50   ` David Howells
2020-01-15 14:15 ` David Howells

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=26093.1579098922@warthog.procyon.org.uk \
    --to=dhowells@redhat.com \
    --cc=adilger@dilger.ca \
    --cc=clm@fb.com \
    --cc=darrick.wong@oracle.com \
    --cc=dsterba@suse.com \
    --cc=hch@lst.de \
    --cc=josef@toxicpanda.com \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=quwenruo.btrfs@gmx.com \
    --cc=tytso@mit.edu \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-ext4 Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-ext4/0 linux-ext4/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-ext4 linux-ext4/ https://lore.kernel.org/linux-ext4 \
		linux-ext4@vger.kernel.org
	public-inbox-index linux-ext4

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-ext4


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git