All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Theodore Ts'o" <tytso@mit.edu>
To: Daniel Dawson <danielcdawson@gmail.com>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>,
	linux-ext4@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [inline_data] ext4: Stale flags before sync when convert to non-inline
Date: Wed, 29 Nov 2023 23:06:51 -0500	[thread overview]
Message-ID: <20231130040651.GC510020@mit.edu> (raw)
In-Reply-To: <5189fe60-c3e3-4bc6-89d4-1033cf4337c3@gmail.com>

On Tue, Nov 28, 2023 at 10:15:09PM -0800, Daniel Dawson wrote:
> When a file is converted from inline to non-inline, it has stale flags until
> sync.
> 
> If a file with inline data is written to such that it must become
> non-inline, it temporarily appears to have the inline data flag and not (if
> applicable) the extent flag. This is corrected on sync, but can cause
> problems in certain situations.

The issue here is delayed allocation.  When you write to a file with
delayed allocation enabled, the file system doesn't decide where the
data will be written on the storage device until the last possible
minute, when writeback occurs.  This can be triggered by a fsync(2) or
sync(2) system call,

> Why is this a problem? Because some code will fail under such a condition,
> for example, lseek(..., SEEK_HOLE) will result in ENOENT. I ran into this
> with Gentoo's Portage, which uses the call to handle sparse files when
> copying. Sometimes, an ebuild creates a temporary file that is quickly
> copied, and apparently the temporary is written in small increments,
> triggering this.

This is caused by missing logic in ext4_iomap_begin_report().  For the
non-inline case, this function does this:

	ret = ext4_map_blocks(NULL, inode, &map, 0);
	if (ret < 0)
		return ret;
	if (ret == 0)
		delalloc = ext4_iomap_is_delalloc(inode, &map);

For a non-inline file, if you write some data blocks that hasn't been
written back due to delayed allocation, ext4_map_blocks() won't be
able to map the logical block to a physical block.  This logic is
missing in the inline_data case:

	if (ext4_has_inline_data(inode)) {
		ret = ext4_inline_data_iomap(inode, iomap);
		if (ret != -EAGAIN) {
			if (ret == 0 && offset >= iomap->length)
				ret = -ENOENT;
			return ret;
		}
	}

What's missing is a call to ext4_iomap_is_delalloc() in the case where
ret == 0, and then setting the delayed allocation flag:

	if (delalloc && iomap->type == IOMAP_HOLE)
		iomap->type = IOMAP_DELALLOC;

This will deal with the combination of inline_data and delayed
allocation for SEEK_HOLE and for FIEMAP.

I'll need to turn this into an actual patch and then create a test to
validate the patch but I'm pretty sure this should deal with the
problem you've come across.

Cheers,

						- Ted

  parent reply	other threads:[~2023-11-30  4:07 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-11-29  6:15 [inline_data] ext4: Stale flags before sync when convert to non-inline Daniel Dawson
2023-11-29 20:05 ` Daniel Dawson
2023-11-30  4:06 ` Theodore Ts'o [this message]
2023-12-28  9:29   ` Daniel Dawson
2024-01-23  5:56 ` Daniel Dawson
2024-01-24 17:13   ` Luis Henriques
2024-01-28 12:06     ` Daniel Dawson
2024-01-29 11:17       ` Luis Henriques

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20231130040651.GC510020@mit.edu \
    --to=tytso@mit.edu \
    --cc=adilger.kernel@dilger.ca \
    --cc=danielcdawson@gmail.com \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.