Linux-Fsdevel Archive on lore.kernel.org
 help / color / Atom feed
From: Dave Chinner <david@fromorbit.com>
To: Brian Foster <bfoster@redhat.com>
Cc: linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org
Subject: Re: [PATCH 1/2] iomap: use page dirty state to seek data over unwritten extents
Date: Wed, 14 Oct 2020 09:53:44 +1100
Message-ID: <20201013225344.GA7391@dread.disaster.area> (raw)
In-Reply-To: <20201012140350.950064-2-bfoster@redhat.com>

On Mon, Oct 12, 2020 at 10:03:49AM -0400, Brian Foster wrote:
> iomap seek hole/data currently uses page Uptodate state to track
> data over unwritten extents. This is odd and unpredictable in that
> the existence of clean pages changes behavior. For example:
> 
>   $ xfs_io -fc "falloc 0 32k" -c "seek -d 0" \
> 	    -c "pread 16k 4k" -c "seek -d 0" /mnt/file
>   Whence  Result
>   DATA    EOF
>   ...
>   Whence  Result
>   DATA    16384

I don't think there is any way around this, because the page cache
lookup done by the seek hole/data code is an
unlocked operation and can race with other IO and operations. That
is, seek does not take IO serialisation locks at all so
read/write/page faults/fallocate/etc all run concurrently with it...

i.e. we get an iomap that is current at the time the iomap_begin()
call is made, but we don't hold any locks to stabilise that extent
range while we do a page cache traversal looking for cached data.
That means any region of the unwritten iomap can change state while
we are running the page cache seek.

We cannot determine what the data contents without major overhead,
and if we are seeking over a large unwritten extent covered by clean
pages that then gets partially written synchronously by another
concurrent write IO then we might trip across clean uptodate pages
with real data in them by the time the page cache scan gets to it.

Hence the only thing we are looking at here is whether there is data
present in the cache or not. As such, I think assuming that only
dirty/writeback pages contain actual user data in a seek data/hole
operation is a fundametnally incorrect premise.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

  parent reply index

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-12 14:03 [PATCH 0/2] iomap: zero dirty pages " Brian Foster
2020-10-12 14:03 ` [PATCH 1/2] iomap: use page dirty state to seek data " Brian Foster
2020-10-13 12:30   ` Brian Foster
2020-10-13 22:53   ` Dave Chinner [this message]
2020-10-14 12:59     ` Brian Foster
2020-10-14 22:37       ` Dave Chinner
2020-10-15  9:47   ` Christoph Hellwig
2020-10-19 16:55     ` Brian Foster
2020-10-27 18:07       ` Christoph Hellwig
2020-10-28 11:31         ` Brian Foster
2020-10-12 14:03 ` [PATCH 2/2] iomap: zero cached pages over unwritten extents on zero range Brian Foster
2020-10-15  9:49   ` Christoph Hellwig
2020-10-19 16:55     ` Brian Foster
2020-10-19 18:01       ` Brian Foster
2020-10-20 16:21         ` Brian Foster
2020-10-27 18:15           ` Christoph Hellwig
2020-10-28 11:31             ` Brian Foster
2020-10-23  1:02   ` [iomap] 11b5156248: xfstests.xfs.310.fail kernel test robot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201013225344.GA7391@dread.disaster.area \
    --to=david@fromorbit.com \
    --cc=bfoster@redhat.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-Fsdevel Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-fsdevel/0 linux-fsdevel/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-fsdevel linux-fsdevel/ https://lore.kernel.org/linux-fsdevel \
		linux-fsdevel@vger.kernel.org
	public-inbox-index linux-fsdevel

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-fsdevel


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git