All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jan Kara <jack@suse.cz>
To: Andreas Gruenbacher <agruenba@redhat.com>
Cc: linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org,
	linux-xfs@vger.kernel.org, Jan Kara <jack@suse.cz>,
	Christoph Hellwig <hch@lst.de>
Subject: Re: [PATCH 4/4] ext4: Switch to iomap for SEEK_HOLE / SEEK_DATA
Date: Wed, 30 Aug 2017 16:59:21 +0200	[thread overview]
Message-ID: <20170830145921.GB5920@quack2.suse.cz> (raw)
In-Reply-To: <20170829142942.21594-5-agruenba@redhat.com>

[-- Attachment #1: Type: text/plain, Size: 3217 bytes --]

On Tue 29-08-17 16:29:42, Andreas Gruenbacher wrote:
> From: Christoph Hellwig <hch@lst.de>
> 
> Switch to the iomap_seek_hole and iomap_seek_data helpers for
> implementing lseek SEEK_HOLE / SEEK_DATA, and remove all the code that
> isn't needed any more.
> 
> Note that with this patch, ext4 will now always depend on the iomap code
> instead of only when CONFIG_DAX is enabled, and it requires adding a
> call into the extent status tree for iomap_begin as well to properly
> deal with delalloc extents.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> [Minor fixes and cleanups by Andreas]
> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>

...

> @@ -3425,6 +3425,29 @@ static int ext4_iomap_begin(struct inode *inode, loff_t offset, loff_t length,
>  
>  	if (!(flags & IOMAP_WRITE)) {
>  		ret = ext4_map_blocks(NULL, inode, &map, 0);
> +		if (ret < 0)
> +			return ret;

One general question about IOMAP_REPORT: When there is unwritten extent, we
use page_cache_seek_hole_data() to determine what is actually a hole and
what is data. We could use exactly the same function to determine what is a
hole and what is data in an IOMAP_HOLE extent, couldn't we? Now I
understand that a filesystem is supposed to return IOMAP_DELALLOC extent if
there is delayed allocation data covering queried offset so probably it's
not a great idea to use that however it still seems a bit like an
unnecessary duplication...

> +		if (!ret) {

Please go to this branch only for IOMAP_REPORT case to avoid unnecessary
overhead for DAX mappings...

> +			struct extent_status es = {};
> +
> +			ext4_es_find_delayed_extent_range(inode, map.m_lblk,
> +					map.m_lblk + map.m_len - 1, &es);
> +			/* Is delalloc data before next block in extent tree? */
> +			if (es.es_len && es.es_lblk < map.m_lblk + map.m_len) {

And this is still wrong - I have actually verified that with attached patch
that disables caching of extents (so it is equivalent to *very* aggresive
slab reclaim happening). With that patch applied you'll get:

kvm0:~ # rm /mnt/file; xfs_io -f -c "pwrite 4096 4096" -c "seek -a 0"
/mnt/file
wrote 4096/4096 bytes at offset 4096
4 KiB, 1 ops; 0.0000 sec (16.693 MiB/sec and 4273.5043 ops/sec)
Whence	Result
DATA	0
HOLE	8192

Which is obviously wrong - hole should be 0, data should be 4096.

And the reason why this is wrong is that when we are asked to map things at
first_block and there is a hole at that offset, we need to just truncate
the hole extent returned by ext4_map_blocks() by possibly following
delalloc allocation. But we still need to return that there *is a hole* at
first_block. But your patch seems to try to return the delalloc extent
instead which is wrong.

> +				ext4_lblk_t offs = 0;
> +
> +				if (es.es_lblk < map.m_lblk)
> +					offs = map.m_lblk - es.es_lblk;
> +				map.m_lblk = es.es_lblk + offs;
> +				map.m_pblk = ext4_es_pblock(&es) + offs;
> +				map.m_len = es.es_len - offs;
> +				if (ext4_es_is_unwritten(&es))
> +					map.m_flags |= EXT4_MAP_UNWRITTEN;
> +				if (ext4_es_is_delayed(&es))
> +					delalloc = true;
> +				ret = 1;
> +			}
> +		}
>  	} else {
>  		int dio_credits;
>  		handle_t *handle;

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

[-- Attachment #2: extent_status_disable.diff --]
[-- Type: text/x-patch, Size: 467 bytes --]

diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c
index e7f12a204cbc..32b55cdf9b82 100644
--- a/fs/ext4/extents_status.c
+++ b/fs/ext4/extents_status.c
@@ -698,7 +698,7 @@ int ext4_es_insert_extent(struct inode *inode, ext4_lblk_t lblk,
 	es_debug("add [%u/%u) %llu %x to extent status tree of inode %lu\n",
 		 lblk, len, pblk, status, inode->i_ino);
 
-	if (!len)
+	if (!len || !(status & EXTENT_STATUS_DELAYED))
 		return 0;
 
 	BUG_ON(end < lblk);

  reply	other threads:[~2017-08-30 14:59 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-08-29 14:29 [PATCH 0/4] ext4: SEEK_HOLE / SEEK_DATA via iomap Andreas Gruenbacher
2017-08-29 14:29 ` [PATCH 1/4] iomap: Switch from blkno to disk offset Andreas Gruenbacher
2017-08-29 16:11   ` Darrick J. Wong
2017-08-31 21:32     ` Darrick J. Wong
2017-08-31 21:37       ` Andreas Grünbacher
2017-08-30 15:00   ` Jan Kara
2017-08-29 14:29 ` [PATCH 2/4] iomap: Add IOMAP_F_DATA_INLINE flag Andreas Gruenbacher
2017-08-30 15:03   ` Jan Kara
2017-08-29 14:29 ` [PATCH 3/4] ext4: Add iomap support for inline data Andreas Gruenbacher
2017-08-30 12:47   ` Jan Kara
2017-08-30 12:54   ` Jan Kara
2017-08-29 14:29 ` [PATCH 4/4] ext4: Switch to iomap for SEEK_HOLE / SEEK_DATA Andreas Gruenbacher
2017-08-30 14:59   ` Jan Kara [this message]
2017-09-14  9:17     ` Andreas Gruenbacher

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170830145921.GB5920@quack2.suse.cz \
    --to=jack@suse.cz \
    --cc=agruenba@redhat.com \
    --cc=hch@lst.de \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.