linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Dave Chinner <david@fromorbit.com>
To: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
Cc: linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org,
	Ojaswin Mujoo <ojaswin@linux.ibm.com>, Jan Kara <jack@suse.cz>,
	Theodore Ts'o <tytso@mit.edu>,
	Matthew Wilcox <willy@infradead.org>,
	"Darrick J . Wong" <djwong@kernel.org>,
	Luis Chamberlain <mcgrof@kernel.org>,
	John Garry <john.g.garry@oracle.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [RFC 3/8] iomap: Add atomic write support for direct-io
Date: Mon, 4 Mar 2024 12:16:57 +1100	[thread overview]
Message-ID: <ZeUhCbT4sbucOT3L@dread.disaster.area> (raw)
In-Reply-To: <6a09654d152d3d1a07636174f5abcfce9948c20c.1709361537.git.ritesh.list@gmail.com>

On Sat, Mar 02, 2024 at 01:12:00PM +0530, Ritesh Harjani (IBM) wrote:
> This adds direct-io atomic writes support in iomap. This adds -
> 1. IOMAP_ATOMIC flag for iomap iter.
> 2. Sets REQ_ATOMIC to bio opflags.
> 3. Adds necessary checks in iomap_dio code to ensure a single bio is
>    submitted for an atomic write request. (since we only support ubuf
>    type iocb). Otherwise return an error EIO.
> 4. Adds a common helper routine iomap_dio_check_atomic(). It helps in
>    verifying mapped length and start/end physical offset against the hw
>    device constraints for supporting atomic writes.
> 
> This patch is based on a patch from John Garry <john.g.garry@oracle.com>
> which adds such support of DIO atomic writes to iomap.
> 
> Co-developed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> ---
>  fs/iomap/direct-io.c  | 75 +++++++++++++++++++++++++++++++++++++++++--
>  fs/iomap/trace.h      |  3 +-
>  include/linux/iomap.h |  1 +
>  3 files changed, 75 insertions(+), 4 deletions(-)

Ugh. Now we have two competing sets of changes to bring RWF_ATOMIC
support to iomap. One from John here:

https://lore.kernel.org/linux-fsdevel/20240124142645.9334-1-john.g.garry@oracle.com/

and now this one.

Can the two of you please co-ordinate your efforts and based your
filesysetm work off the same iomap infrastructure changes?

.....

> @@ -356,6 +360,11 @@ static loff_t iomap_dio_bio_iter(const struct iomap_iter *iter,
>  	if (need_zeroout) {
>  		/* zero out from the start of the block to the write offset */
>  		pad = pos & (fs_block_size - 1);
> +		if (unlikely(pad && atomic_write)) {
> +			WARN_ON_ONCE("pos not atomic write aligned\n");
> +			ret = -EINVAL;
> +			goto out;
> +		}

This atomic IO should have been rejected before it even got to
the layers where the bios are being built. If the IO alignment is
such that it does not align to filesystem allocation constraints, it
should be rejected at the filesystem ->write_iter() method and not
even get to the iomap layer.

.....

> @@ -516,6 +535,44 @@ static loff_t iomap_dio_iter(const struct iomap_iter *iter,
>  	}
>  }
>  
> +/*
> + * iomap_dio_check_atomic:	DIO Atomic checks before calling bio submission.
> + * @iter:			iomap iterator
> + * This function is called after filesystem block mapping and before bio
> + * formation/submission. This is the right place to verify hw device/block
> + * layer constraints to be followed for doing atomic writes. Hence do those
> + * common checks here.
> + */
> +static bool iomap_dio_check_atomic(struct iomap_iter *iter)
> +{
> +	struct block_device *bdev = iter->iomap.bdev;
> +	unsigned long long map_len = iomap_length(iter);
> +	unsigned long long start = iomap_sector(&iter->iomap, iter->pos)
> +						<< SECTOR_SHIFT;
> +	unsigned long long end = start + map_len - 1;
> +	unsigned int awu_min =
> +			queue_atomic_write_unit_min_bytes(bdev->bd_queue);
> +	unsigned int awu_max =
> +			queue_atomic_write_unit_max_bytes(bdev->bd_queue);
> +	unsigned long boundary =
> +			queue_atomic_write_boundary_bytes(bdev->bd_queue);
> +	unsigned long mask = ~(boundary - 1);
> +
> +
> +	/* map_len should be same as user specified iter->len */
> +	if (map_len < iter->len)
> +		return false;
> +	/* start should be aligned to block device min atomic unit alignment */
> +	if (!IS_ALIGNED(start, awu_min))
> +		return false;
> +	/* If top bits doesn't match, means atomic unit boundary is crossed */
> +	if (boundary && ((start | mask) != (end | mask)))
> +		return false;
> +
> +	return true;
> +}

I think you are re-implementing stuff that John has already done at
higher layers and in a generic manner. i.e.
generic_atomic_write_valid() in this patch:

https://lore.kernel.org/linux-fsdevel/20240226173612.1478858-4-john.g.garry@oracle.com/

We shouldn't be getting anywhere near the iomap layer if the IO is
not properly aligned to atomic IO constraints...

So, yeah, can you please co-ordinate the development of this
patchset with John and the work that has already been done to
support this functionality on block devices and XFS?

-Dave.
-- 
Dave Chinner
david@fromorbit.com

  reply	other threads:[~2024-03-04  1:17 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-03-02  7:41 [RFC 0/9] ext4: Add direct-io atomic write support using fsawu Ritesh Harjani (IBM)
2024-03-02  7:41 ` [RFC 1/8] fs: Add FS_XFLAG_ATOMICWRITES flag Ritesh Harjani (IBM)
2024-03-02  7:41   ` [RFC 2/8] fs: Reserve inode flag FS_ATOMICWRITES_FL for atomic writes Ritesh Harjani (IBM)
2024-03-04  0:59     ` Dave Chinner
2024-03-08  7:19       ` Ojaswin Mujoo
2024-03-02  7:42   ` [RFC 3/8] iomap: Add atomic write support for direct-io Ritesh Harjani (IBM)
2024-03-04  1:16     ` Dave Chinner [this message]
2024-03-04  5:33       ` Ritesh Harjani
2024-03-04  8:49         ` John Garry
2024-03-04 10:31           ` Ritesh Harjani
2024-03-04 20:56         ` Dave Chinner
2024-03-02  7:42   ` [RFC 4/8] ext4: Add statx and other atomic write helper routines Ritesh Harjani (IBM)
2024-03-06 11:14     ` John Garry
2024-03-08  8:10       ` Ritesh Harjani
2024-03-02  7:42   ` [RFC 5/8] ext4: Adds direct-io atomic writes checks Ritesh Harjani (IBM)
2024-03-02  7:42   ` [RFC 6/8] ext4: Add an inode flag for atomic writes Ritesh Harjani (IBM)
2024-03-04 20:34     ` Dave Chinner
2024-03-08  8:02       ` Ritesh Harjani
2024-03-02  7:42   ` [RFC 7/8] ext4: Enable FMODE_CAN_ATOMIC_WRITE in open for direct-io Ritesh Harjani (IBM)
2024-03-02  7:42   ` [RFC 8/8] ext4: Adds atomic writes using fsawu Ritesh Harjani (IBM)
2024-03-02  7:42 ` [RFC 9/9] e2fsprogs/chattr: Supports atomic writes attribute Ritesh Harjani (IBM)
2024-03-06 11:22 ` [RFC 0/9] ext4: Add direct-io atomic write support using fsawu John Garry
2024-03-06 13:13   ` Ritesh Harjani
2024-03-08 20:25   ` [RFC] ext4: Add support for ext4_map_blocks_atomic() Ritesh Harjani (IBM)
2024-03-09  2:37     ` Ritesh Harjani
2024-03-13 18:40     ` John Garry
2024-03-14 15:52       ` Ritesh Harjani
2024-03-18  8:22         ` John Garry

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZeUhCbT4sbucOT3L@dread.disaster.area \
    --to=david@fromorbit.com \
    --cc=djwong@kernel.org \
    --cc=jack@suse.cz \
    --cc=john.g.garry@oracle.com \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mcgrof@kernel.org \
    --cc=ojaswin@linux.ibm.com \
    --cc=ritesh.list@gmail.com \
    --cc=tytso@mit.edu \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).