* [RFC 0/9] ext4: Add direct-io atomic write support using fsawu
@ 2024-03-02  7:41 Ritesh Harjani (IBM)
  2024-03-02  7:41 ` [RFC 1/8] fs: Add FS_XFLAG_ATOMICWRITES flag Ritesh Harjani (IBM)
                   ` (2 more replies)
  0 siblings, 3 replies; 28+ messages in thread
From: Ritesh Harjani (IBM) @ 2024-03-02  7:41 UTC (permalink / raw)
  To: linux-fsdevel, linux-ext4
  Cc: Ojaswin Mujoo, Jan Kara, Theodore Ts'o, Matthew Wilcox,
	Darrick J . Wong, Luis Chamberlain, John Garry, linux-kernel,
	Ritesh Harjani (IBM)

Hello all,

This RFC series adds atomic write support to ext4 direct-io using a
filesystem atomic write unit (fsawu). It is built on top of John's "block atomic
write v5" series [1], which adds the RWF_ATOMIC flag interface to pwritev2() and
enables atomic write support in the underlying device driver and block layer.

This series uses the same RWF_ATOMIC interface to add atomic write support
to ext4's direct-io path. It can be used with either of the first two methods
explained below: (1) mkfs.ext4 -b <BS>, or (2) bigalloc.

Filesystem atomic write unit (fsawu):
============================================
Atomic writes within ext4 can be supported using the 3 methods below -
1. On a large pagesize system (e.g. Power with 64k pagesize or aarch64 with 64k pagesize),
   we can mkfs using different blocksizes, e.g. mkfs.ext4 -b <4k/8k/16k/32k/64k>.
   Now if the underlying HW device supports atomic writes, then a corresponding
   blocksize can be chosen as the filesystem atomic write unit (fsawu), which
   should be within the underlying hw defined [awu_min, awu_max] range.
   For such a filesystem, fsawu_[min|max] are both equal to the blocksize (e.g. 16k).

   On a smaller pagesize system this can be utilized when support for LBS is
   complete (on ext4).

2. EXT4 already supports a feature called bigalloc, with which ext4 can handle
   allocation in cluster size units. So, for example, we can create a filesystem with
   4k blocksize but with 64k clustersize. Such a configuration can also be used
   to support atomic writes if the underlying hw device supports it.
   In such a case fsawu_min will most likely be the filesystem blocksize and
   fsawu_max will most likely be the cluster size.

   So a user can do an atomic write of any size in the [fsawu_min, fsawu_max]
   range as long as it satisfies the other constraints laid out by the HW device
   (or by the software stack) to support atomic writes,
   e.g. len should be a power of 2, pos should be naturally aligned to len
   (pos % len == 0), and [start | end] (phys offsets) should not straddle
   an atomic write boundary.

3. EXT4 mballoc can be made aware of doing aligned block allocation, e.g. by
   utilizing the cr-0 allocation criteria. With this support, we won't need
   to format a new filesystem, and hopefully, when the support for this in mballoc
   is done, it can utilize the same interface/helper routines laid out in this
   patch series. There is work going on in this area in parallel too [2].
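
The pos/len constraints listed under method 2 can be sketched as a userspace
pre-check. This is only an illustration: atomic_write_args_ok() and
is_power_of_2() are hypothetical helper names and not part of this series. The
boundary check on physical offsets is left out here, since userspace cannot see
physical offsets.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if x is a non-zero power of two. */
static bool is_power_of_2(uint64_t x)
{
	return x != 0 && (x & (x - 1)) == 0;
}

/*
 * Hypothetical pre-check of an atomic write request against the
 * [fsawu_min, fsawu_max] range that the filesystem advertises:
 *  - len must lie within [fsawu_min, fsawu_max]
 *  - len must be a power of 2
 *  - pos must be naturally aligned to len (pos % len == 0)
 */
static bool atomic_write_args_ok(uint64_t pos, uint64_t len,
				 uint64_t fsawu_min, uint64_t fsawu_max)
{
	if (len < fsawu_min || len > fsawu_max)
		return false;
	if (!is_power_of_2(len))
		return false;
	/* len is a power of 2 here, so the mask test equals pos % len == 0 */
	return (pos & (len - 1)) == 0;
}
```

With a 4k blocksize and 64k clustersize, a 16k write at offset 16k would pass
this check, while an 8k write at offset 4k would not (pos is not aligned to len).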


Purpose of an early RFC:
========================
(Note: only minimal testing has been done on this.)

Other than getting early review comments on the design, this should hopefully also
help folks in their discussions at LSFMM, since there are various topic proposals
out there regarding atomic write support in xfs and ext4 [3][4].


How to utilize this support:
===========================
1. mkfs.ext4 -b 4096 -C 65536 /dev/<sdb> (scsi_debug or device with atomic write)
   or mkfs.ext4 -b <BS=16k> if your platform supports it.
2. mount /dev/sdb /mnt
3. touch /mnt/f1
4. chattr +W /mnt/f1
5. xfs_io -dc "pwrite <pos> <len>" /mnt/f1
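
For step 5, an application would issue the write itself with pwritev2() and
RWF_ATOMIC on a file descriptor opened with O_DIRECT. Below is a minimal
sketch under a few assumptions: atomic_pwrite() is a hypothetical helper name,
the fallback value for RWF_ATOMIC is taken from later mainline uapi headers
(it is not defined by this series), and the file is expected to already have
the atomic writes attribute set as in step 4.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* for pwritev2() */
#endif
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Assumption: value used by later mainline uapi headers. */
#ifndef RWF_ATOMIC
#define RWF_ATOMIC 0x00000040
#endif

/*
 * Hypothetical helper: write len bytes of a fill pattern at pos as a
 * single atomic write. fd is expected to be opened with O_DIRECT.
 * Returns the number of bytes written, or -1 with errno set.
 */
static ssize_t atomic_pwrite(int fd, off_t pos, size_t len)
{
	struct iovec iov;
	void *buf;
	ssize_t ret;
	int err;

	/* O_DIRECT needs an aligned buffer; aligning to len is conservative. */
	err = posix_memalign(&buf, len, len);
	if (err) {
		errno = err;
		return -1;
	}
	memset(buf, 0xab, len);

	/* A single iovec: the request must not be split into several parts. */
	iov.iov_base = buf;
	iov.iov_len = len;
	ret = pwritev2(fd, &iov, 1, pos, RWF_ATOMIC);

	/* Preserve errno from pwritev2() across free(). */
	err = errno;
	free(buf);
	errno = err;
	return ret;
}
```

A caller would open the file with O_WRONLY | O_DIRECT and pass a pos/len pair
that satisfies the constraints described above; on a kernel or filesystem
without this support the call is expected to fail rather than fall back to a
non-atomic write.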


References:
===========
[1]: https://lore.kernel.org/all/20240226173612.1478858-1-john.g.garry@oracle.com/
[2]: https://lore.kernel.org/linux-ext4/cover.1701339358.git.ojaswin@linux.ibm.com/
[3]: https://www.spinics.net/lists/linux-xfs/msg81086.html
[4]: https://www.spinics.net/lists/linux-fsdevel/msg265226.html

John Garry (1):
  fs: Add FS_XFLAG_ATOMICWRITES flag

Ritesh Harjani (IBM) (7):
  fs: Reserve inode flag FS_ATOMICWRITES_FL for atomic writes
  iomap: Add atomic write support for direct-io
  ext4: Add statx and other atomic write helper routines
  ext4: Adds direct-io atomic writes checks
  ext4: Add an inode flag for atomic writes
  ext4: Enable FMODE_CAN_ATOMIC_WRITE in open for direct-io
  ext4: Adds atomic writes using fsawu

Ritesh Harjani (IBM) (1):
  e2fsprogs/chattr: Supports atomic writes attribute

 fs/ext4/ext4.h           | 87 +++++++++++++++++++++++++++++++++++++++-
 fs/ext4/file.c           | 38 ++++++++++++++++--
 fs/ext4/inode.c          | 16 ++++++++
 fs/ext4/ioctl.c          | 11 +++++
 fs/ext4/super.c          |  1 +
 fs/ioctl.c               |  4 ++
 fs/iomap/direct-io.c     | 75 ++++++++++++++++++++++++++++++++--
 fs/iomap/trace.h         |  3 +-
 include/linux/fileattr.h |  4 +-
 include/linux/iomap.h    |  1 +
 include/uapi/linux/fs.h  |  2 +
 11 files changed, 232 insertions(+), 10 deletions(-)

--
2.39.2



* [RFC 1/8] fs: Add FS_XFLAG_ATOMICWRITES flag
  2024-03-02  7:41 [RFC 0/9] ext4: Add direct-io atomic write support using fsawu Ritesh Harjani (IBM)
@ 2024-03-02  7:41 ` Ritesh Harjani (IBM)
  2024-03-02  7:41   ` [RFC 2/8] fs: Reserve inode flag FS_ATOMICWRITES_FL for atomic writes Ritesh Harjani (IBM)
                     ` (6 more replies)
  2024-03-02  7:42 ` [RFC 9/9] e2fsprogs/chattr: Supports atomic writes attribute Ritesh Harjani (IBM)
  2024-03-06 11:22 ` [RFC 0/9] ext4: Add direct-io atomic write support using fsawu John Garry
  2 siblings, 7 replies; 28+ messages in thread
From: Ritesh Harjani (IBM) @ 2024-03-02  7:41 UTC (permalink / raw)
  To: linux-fsdevel, linux-ext4
  Cc: Ojaswin Mujoo, Jan Kara, Theodore Ts'o, Matthew Wilcox,
	Darrick J . Wong, Luis Chamberlain, John Garry, linux-kernel,
	Ritesh Harjani

From: John Garry <john.g.garry@oracle.com>

Add a flag indicating that a regular file is enabled for atomic writes.

Signed-off-by: John Garry <john.g.garry@oracle.com>
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
---
 include/uapi/linux/fs.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h
index a0975ae81e64..b5b4e1db9576 100644
--- a/include/uapi/linux/fs.h
+++ b/include/uapi/linux/fs.h
@@ -140,6 +140,7 @@ struct fsxattr {
 #define FS_XFLAG_FILESTREAM	0x00004000	/* use filestream allocator */
 #define FS_XFLAG_DAX		0x00008000	/* use DAX for IO */
 #define FS_XFLAG_COWEXTSIZE	0x00010000	/* CoW extent size allocator hint */
+#define FS_XFLAG_ATOMICWRITES	0x00020000	/* atomic writes enabled */
 #define FS_XFLAG_HASATTR	0x80000000	/* no DIFLAG for this	*/
 
 /* the read-only stuff doesn't really belong here, but any other place is
-- 
2.43.0



* [RFC 2/8] fs: Reserve inode flag FS_ATOMICWRITES_FL for atomic writes
  2024-03-02  7:41 ` [RFC 1/8] fs: Add FS_XFLAG_ATOMICWRITES flag Ritesh Harjani (IBM)
@ 2024-03-02  7:41   ` Ritesh Harjani (IBM)
  2024-03-04  0:59     ` Dave Chinner
  2024-03-02  7:42   ` [RFC 3/8] iomap: Add atomic write support for direct-io Ritesh Harjani (IBM)
                     ` (5 subsequent siblings)
  6 siblings, 1 reply; 28+ messages in thread
From: Ritesh Harjani (IBM) @ 2024-03-02  7:41 UTC (permalink / raw)
  To: linux-fsdevel, linux-ext4
  Cc: Ojaswin Mujoo, Jan Kara, Theodore Ts'o, Matthew Wilcox,
	Darrick J . Wong, Luis Chamberlain, John Garry, linux-kernel,
	Ritesh Harjani (IBM)

This reserves the inode flag FS_ATOMICWRITES_FL and teaches fileattr to
translate between this flag and the FS_XFLAG_ATOMICWRITES xflag, as needed
by ext4 and xfs.
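
The flag <-> xflag translation can be pictured with a small standalone sketch.
The two constants carry the values proposed in patches 1/8 and 2/8; the
function names are hypothetical and only mirror what the fileattr fill helpers
do for this one bit.

```c
#include <assert.h>
#include <stdint.h>

/* Values as proposed in patches 1/8 and 2/8 of this series. */
#define FS_XFLAG_ATOMICWRITES	0x00020000u
#define FS_ATOMICWRITES_FL	0x01000000u

/* Hypothetical: derive the FS_*_FL bit from a set of xflags. */
static uint32_t atomic_flag_from_xflags(uint32_t xflags)
{
	return (xflags & FS_XFLAG_ATOMICWRITES) ? FS_ATOMICWRITES_FL : 0;
}

/* Hypothetical: derive the FS_XFLAG_* bit from a set of flags. */
static uint32_t atomic_xflag_from_flags(uint32_t flags)
{
	return (flags & FS_ATOMICWRITES_FL) ? FS_XFLAG_ATOMICWRITES : 0;
}
```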

Co-developed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
---
 fs/ioctl.c               | 4 ++++
 include/linux/fileattr.h | 4 ++--
 include/uapi/linux/fs.h  | 1 +
 3 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/fs/ioctl.c b/fs/ioctl.c
index 76cf22ac97d7..e0f7fae4777e 100644
--- a/fs/ioctl.c
+++ b/fs/ioctl.c
@@ -481,6 +481,8 @@ void fileattr_fill_xflags(struct fileattr *fa, u32 xflags)
 		fa->flags |= FS_DAX_FL;
 	if (fa->fsx_xflags & FS_XFLAG_PROJINHERIT)
 		fa->flags |= FS_PROJINHERIT_FL;
+	if (fa->fsx_xflags & FS_XFLAG_ATOMICWRITES)
+		fa->flags |= FS_ATOMICWRITES_FL;
 }
 EXPORT_SYMBOL(fileattr_fill_xflags);
 
@@ -511,6 +513,8 @@ void fileattr_fill_flags(struct fileattr *fa, u32 flags)
 		fa->fsx_xflags |= FS_XFLAG_DAX;
 	if (fa->flags & FS_PROJINHERIT_FL)
 		fa->fsx_xflags |= FS_XFLAG_PROJINHERIT;
+	if (fa->flags & FS_ATOMICWRITES_FL)
+		fa->fsx_xflags |= FS_XFLAG_ATOMICWRITES;
 }
 EXPORT_SYMBOL(fileattr_fill_flags);
 
diff --git a/include/linux/fileattr.h b/include/linux/fileattr.h
index 47c05a9851d0..ae9329afa46b 100644
--- a/include/linux/fileattr.h
+++ b/include/linux/fileattr.h
@@ -7,12 +7,12 @@
 #define FS_COMMON_FL \
 	(FS_SYNC_FL | FS_IMMUTABLE_FL | FS_APPEND_FL | \
 	 FS_NODUMP_FL |	FS_NOATIME_FL | FS_DAX_FL | \
-	 FS_PROJINHERIT_FL)
+	 FS_PROJINHERIT_FL | FS_ATOMICWRITES_FL)
 
 #define FS_XFLAG_COMMON \
 	(FS_XFLAG_SYNC | FS_XFLAG_IMMUTABLE | FS_XFLAG_APPEND | \
 	 FS_XFLAG_NODUMP | FS_XFLAG_NOATIME | FS_XFLAG_DAX | \
-	 FS_XFLAG_PROJINHERIT)
+	 FS_XFLAG_PROJINHERIT | FS_XFLAG_ATOMICWRITES)
 
 /*
  * Merged interface for miscellaneous file attributes.  'flags' originates from
diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h
index b5b4e1db9576..17f52530f9c8 100644
--- a/include/uapi/linux/fs.h
+++ b/include/uapi/linux/fs.h
@@ -264,6 +264,7 @@ struct fsxattr {
 #define FS_EA_INODE_FL			0x00200000 /* Inode used for large EA */
 #define FS_EOFBLOCKS_FL			0x00400000 /* Reserved for ext4 */
 #define FS_NOCOW_FL			0x00800000 /* Do not cow file */
+#define FS_ATOMICWRITES_FL		0x01000000 /* Inode supports atomic writes */
 #define FS_DAX_FL			0x02000000 /* Inode is DAX */
 #define FS_INLINE_DATA_FL		0x10000000 /* Reserved for ext4 */
 #define FS_PROJINHERIT_FL		0x20000000 /* Create with parents projid */
-- 
2.43.0



* [RFC 3/8] iomap: Add atomic write support for direct-io
  2024-03-02  7:41 ` [RFC 1/8] fs: Add FS_XFLAG_ATOMICWRITES flag Ritesh Harjani (IBM)
  2024-03-02  7:41   ` [RFC 2/8] fs: Reserve inode flag FS_ATOMICWRITES_FL for atomic writes Ritesh Harjani (IBM)
@ 2024-03-02  7:42   ` Ritesh Harjani (IBM)
  2024-03-04  1:16     ` Dave Chinner
  2024-03-02  7:42   ` [RFC 4/8] ext4: Add statx and other atomic write helper routines Ritesh Harjani (IBM)
                     ` (4 subsequent siblings)
  6 siblings, 1 reply; 28+ messages in thread
From: Ritesh Harjani (IBM) @ 2024-03-02  7:42 UTC (permalink / raw)
  To: linux-fsdevel, linux-ext4
  Cc: Ojaswin Mujoo, Jan Kara, Theodore Ts'o, Matthew Wilcox,
	Darrick J . Wong, Luis Chamberlain, John Garry, linux-kernel,
	Ritesh Harjani (IBM)

This adds direct-io atomic write support to iomap. It -
1. Adds the IOMAP_ATOMIC flag for iomap iter.
2. Sets REQ_ATOMIC in the bio opflags.
3. Adds the necessary checks in the iomap_dio code to ensure that a single bio
   is submitted for an atomic write request (since we only support ubuf
   type iocb). Otherwise an error is returned (-EINVAL from these checks,
   -EIO if iomap_dio_check_atomic() fails).
4. Adds a common helper routine iomap_dio_check_atomic(). It helps in
   verifying the mapped length and the start/end physical offsets against the hw
   device constraints for supporting atomic writes.

This patch is based on a patch from John Garry <john.g.garry@oracle.com>
which adds such DIO atomic write support to iomap.
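
The checks described in item 4 can be sketched outside the kernel as below.
This is an illustration under assumptions, not the helper from the diff: the
function names are hypothetical, the device limits are passed in as plain
integers, and the boundary test is written with a bitwise AND so that it
compares the bits above the boundary ("top bits").

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical: true if the byte range [start, start + len - 1] crosses a
 * multiple of boundary. boundary is assumed to be 0 (no boundary) or a
 * power of 2.
 */
static bool straddles_boundary(uint64_t start, uint64_t len, uint64_t boundary)
{
	uint64_t end, mask;

	if (!boundary || !len)
		return false;
	end = start + len - 1;
	mask = ~(boundary - 1);
	/* Different top bits mean start and end are in different windows. */
	return (start & mask) != (end & mask);
}

/*
 * Hypothetical mirror of the three checks: the mapping must cover the
 * whole request, the physical start must be aligned to awu_min, and the
 * mapped range must not cross an atomic write boundary.
 */
static bool mapping_ok_for_atomic(uint64_t start, uint64_t map_len,
				  uint64_t req_len, uint64_t awu_min,
				  uint64_t boundary)
{
	if (map_len < req_len)
		return false;
	if (awu_min && (start % awu_min))
		return false;
	return !straddles_boundary(start, map_len, boundary);
}
```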

Co-developed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
---
 fs/iomap/direct-io.c  | 75 +++++++++++++++++++++++++++++++++++++++++--
 fs/iomap/trace.h      |  3 +-
 include/linux/iomap.h |  1 +
 3 files changed, 75 insertions(+), 4 deletions(-)

diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
index bcd3f8cf5ea4..b4548acb74e7 100644
--- a/fs/iomap/direct-io.c
+++ b/fs/iomap/direct-io.c
@@ -256,7 +256,7 @@ static void iomap_dio_zero(const struct iomap_iter *iter, struct iomap_dio *dio,
  * clearing the WRITE_THROUGH flag in the dio request.
  */
 static inline blk_opf_t iomap_dio_bio_opflags(struct iomap_dio *dio,
-		const struct iomap *iomap, bool use_fua)
+		const struct iomap *iomap, bool use_fua, bool atomic_write)
 {
 	blk_opf_t opflags = REQ_SYNC | REQ_IDLE;
 
@@ -269,6 +269,9 @@ static inline blk_opf_t iomap_dio_bio_opflags(struct iomap_dio *dio,
 	else
 		dio->flags &= ~IOMAP_DIO_WRITE_THROUGH;
 
+	if (atomic_write)
+		opflags |= REQ_ATOMIC;
+
 	return opflags;
 }
 
@@ -279,11 +282,12 @@ static loff_t iomap_dio_bio_iter(const struct iomap_iter *iter,
 	struct inode *inode = iter->inode;
 	unsigned int fs_block_size = i_blocksize(inode), pad;
 	loff_t length = iomap_length(iter);
+	const size_t orig_len = iter->len;
 	loff_t pos = iter->pos;
 	blk_opf_t bio_opf;
 	struct bio *bio;
 	bool need_zeroout = false;
-	bool use_fua = false;
+	bool use_fua = false, atomic_write = iter->flags & IOMAP_ATOMIC;
 	int nr_pages, ret = 0;
 	size_t copied = 0;
 	size_t orig_count;
@@ -356,6 +360,11 @@ static loff_t iomap_dio_bio_iter(const struct iomap_iter *iter,
 	if (need_zeroout) {
 		/* zero out from the start of the block to the write offset */
 		pad = pos & (fs_block_size - 1);
+		if (unlikely(pad && atomic_write)) {
+			WARN_ONCE(1, "pos not atomic write aligned\n");
+			ret = -EINVAL;
+			goto out;
+		}
 		if (pad)
 			iomap_dio_zero(iter, dio, pos - pad, pad);
 	}
@@ -365,7 +374,7 @@ static loff_t iomap_dio_bio_iter(const struct iomap_iter *iter,
 	 * can set up the page vector appropriately for a ZONE_APPEND
 	 * operation.
 	 */
-	bio_opf = iomap_dio_bio_opflags(dio, iomap, use_fua);
+	bio_opf = iomap_dio_bio_opflags(dio, iomap, use_fua, atomic_write);
 
 	nr_pages = bio_iov_vecs_to_alloc(dio->submit.iter, BIO_MAX_VECS);
 	do {
@@ -397,6 +406,14 @@ static loff_t iomap_dio_bio_iter(const struct iomap_iter *iter,
 		}
 
 		n = bio->bi_iter.bi_size;
+
+		/* This bio should have covered the complete length */
+		if (unlikely(atomic_write && n != orig_len)) {
+			WARN_ON_ONCE(1);
+			ret = -EINVAL;
+			bio_put(bio);
+			goto out;
+		}
 		if (dio->flags & IOMAP_DIO_WRITE) {
 			task_io_account_write(n);
 		} else {
@@ -429,6 +446,8 @@ static loff_t iomap_dio_bio_iter(const struct iomap_iter *iter,
 	    ((dio->flags & IOMAP_DIO_WRITE) && pos >= i_size_read(inode))) {
 		/* zero out from the end of the write to the end of the block */
 		pad = pos & (fs_block_size - 1);
+		/* This should never happen */
+		WARN_ON_ONCE(unlikely(pad && atomic_write));
 		if (pad)
 			iomap_dio_zero(iter, dio, pos, fs_block_size - pad);
 	}
@@ -516,6 +535,44 @@ static loff_t iomap_dio_iter(const struct iomap_iter *iter,
 	}
 }
 
+/*
+ * iomap_dio_check_atomic:	DIO Atomic checks before calling bio submission.
+ * @iter:			iomap iterator
+ * This function is called after filesystem block mapping and before bio
+ * formation/submission. This is the right place to verify hw device/block
+ * layer constraints to be followed for doing atomic writes. Hence do those
+ * common checks here.
+ */
+static bool iomap_dio_check_atomic(struct iomap_iter *iter)
+{
+	struct block_device *bdev = iter->iomap.bdev;
+	unsigned long long map_len = iomap_length(iter);
+	unsigned long long start = iomap_sector(&iter->iomap, iter->pos)
+						<< SECTOR_SHIFT;
+	unsigned long long end = start + map_len - 1;
+	unsigned int awu_min =
+			queue_atomic_write_unit_min_bytes(bdev->bd_queue);
+	unsigned int awu_max =
+			queue_atomic_write_unit_max_bytes(bdev->bd_queue);
+	unsigned long boundary =
+			queue_atomic_write_boundary_bytes(bdev->bd_queue);
+	unsigned long mask = ~(boundary - 1);
+
+
+	/* map_len should be same as user specified iter->len */
+	if (map_len < iter->len)
+		return false;
+	/* start should be aligned to block device min atomic unit alignment */
+	if (!IS_ALIGNED(start, awu_min))
+		return false;
+	/* If top bits don't match, an atomic unit boundary is crossed */
+	if (boundary && ((start & mask) != (end & mask)))
+		return false;
+
+	return true;
+}
+
+
 /*
  * iomap_dio_rw() always completes O_[D]SYNC writes regardless of whether the IO
  * is being issued as AIO or not.  This allows us to optimise pure data writes
@@ -554,12 +611,16 @@ __iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
 	struct blk_plug plug;
 	struct iomap_dio *dio;
 	loff_t ret = 0;
+	bool atomic_write = iocb->ki_flags & IOCB_ATOMIC;
 
 	trace_iomap_dio_rw_begin(iocb, iter, dio_flags, done_before);
 
 	if (!iomi.len)
 		return NULL;
 
+	if (atomic_write && !iter_is_ubuf(iter))
+		return ERR_PTR(-EINVAL);
+
 	dio = kmalloc(sizeof(*dio), GFP_KERNEL);
 	if (!dio)
 		return ERR_PTR(-ENOMEM);
@@ -605,6 +666,9 @@ __iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
 		if (iocb->ki_flags & IOCB_DIO_CALLER_COMP)
 			dio->flags |= IOMAP_DIO_CALLER_COMP;
 
+		if (atomic_write)
+			iomi.flags |= IOMAP_ATOMIC;
+
 		if (dio_flags & IOMAP_DIO_OVERWRITE_ONLY) {
 			ret = -EAGAIN;
 			if (iomi.pos >= dio->i_size ||
@@ -656,6 +720,11 @@ __iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
 
 	blk_start_plug(&plug);
 	while ((ret = iomap_iter(&iomi, ops)) > 0) {
+		if (atomic_write && !iomap_dio_check_atomic(&iomi)) {
+			ret = -EIO;
+			break;
+		}
+
 		iomi.processed = iomap_dio_iter(&iomi, dio);
 
 		/*
diff --git a/fs/iomap/trace.h b/fs/iomap/trace.h
index c16fd55f5595..c95576420bca 100644
--- a/fs/iomap/trace.h
+++ b/fs/iomap/trace.h
@@ -98,7 +98,8 @@ DEFINE_RANGE_EVENT(iomap_dio_rw_queued);
 	{ IOMAP_REPORT,		"REPORT" }, \
 	{ IOMAP_FAULT,		"FAULT" }, \
 	{ IOMAP_DIRECT,		"DIRECT" }, \
-	{ IOMAP_NOWAIT,		"NOWAIT" }
+	{ IOMAP_NOWAIT,		"NOWAIT" }, \
+	{ IOMAP_ATOMIC,		"ATOMIC" }
 
 #define IOMAP_F_FLAGS_STRINGS \
 	{ IOMAP_F_NEW,		"NEW" }, \
diff --git a/include/linux/iomap.h b/include/linux/iomap.h
index 96dd0acbba44..9eac704a0d6f 100644
--- a/include/linux/iomap.h
+++ b/include/linux/iomap.h
@@ -178,6 +178,7 @@ struct iomap_folio_ops {
 #else
 #define IOMAP_DAX		0
 #endif /* CONFIG_FS_DAX */
+#define IOMAP_ATOMIC		(1 << 9)
 
 struct iomap_ops {
 	/*
-- 
2.43.0



* [RFC 4/8] ext4: Add statx and other atomic write helper routines
  2024-03-02  7:41 ` [RFC 1/8] fs: Add FS_XFLAG_ATOMICWRITES flag Ritesh Harjani (IBM)
  2024-03-02  7:41   ` [RFC 2/8] fs: Reserve inode flag FS_ATOMICWRITES_FL for atomic writes Ritesh Harjani (IBM)
  2024-03-02  7:42   ` [RFC 3/8] iomap: Add atomic write support for direct-io Ritesh Harjani (IBM)
@ 2024-03-02  7:42   ` Ritesh Harjani (IBM)
  2024-03-06 11:14     ` John Garry
  2024-03-02  7:42   ` [RFC 5/8] ext4: Adds direct-io atomic writes checks Ritesh Harjani (IBM)
                     ` (3 subsequent siblings)
  6 siblings, 1 reply; 28+ messages in thread
From: Ritesh Harjani (IBM) @ 2024-03-02  7:42 UTC (permalink / raw)
  To: linux-fsdevel, linux-ext4
  Cc: Ojaswin Mujoo, Jan Kara, Theodore Ts'o, Matthew Wilcox,
	Darrick J . Wong, Luis Chamberlain, John Garry, linux-kernel,
	Ritesh Harjani (IBM)

This patch adds statx (STATX_WRITE_ATOMIC) support to ext4_getattr()
to query atomic_write_unit_min (awu_min), awu_max and other
attributes for atomic writes.
It also adds a new runtime mount flag (EXT4_MF_ATOMIC_WRITE_FSAWU)
for querying whether ext4 supports atomic writes using fsawu
(filesystem atomic write unit).
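
How fsawu_[min|max] relate to the blocksize, the cluster size and the device
limits can be sketched in plain C. The struct and function names below are
hypothetical; the logic follows the description of this series (blocksize for
!bigalloc, blocksize up to cluster size for bigalloc, clamped to the device's
awu_min/awu_max, and 0/0 when unsupported).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical container for the advertised range; 0/0 means unsupported. */
struct fsawu {
	unsigned int min;
	unsigned int max;
};

static struct fsawu compute_fsawu(unsigned int blocksize,
				  unsigned int clustersize, bool bigalloc,
				  unsigned int awu_min, unsigned int awu_max)
{
	struct fsawu r = { 0, 0 };

	/* Without bigalloc the allocation unit is a single block. */
	if (!bigalloc)
		clustersize = blocksize;

	/* The fs range must overlap the device's [awu_min, awu_max]. */
	if (awu_min > clustersize || awu_max < blocksize)
		return r;

	if (!bigalloc) {
		r.min = blocksize;
		r.max = blocksize;
		return r;
	}

	/* bigalloc: anything from one block up to one cluster, clamped. */
	r.min = blocksize > awu_min ? blocksize : awu_min;
	r.max = clustersize < awu_max ? clustersize : awu_max;
	return r;
}
```

For example, a 4k blocksize / 64k clustersize bigalloc filesystem on a device
with awu_min = 4k and awu_max = 16k would advertise [4k, 16k].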

Co-developed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
---
 fs/ext4/ext4.h  | 53 ++++++++++++++++++++++++++++++++++++++++++++++++-
 fs/ext4/inode.c | 16 +++++++++++++++
 2 files changed, 68 insertions(+), 1 deletion(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 023571f8dd1b..1d2bce26e616 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -1817,7 +1817,8 @@ static inline int ext4_valid_inum(struct super_block *sb, unsigned long ino)
  */
 enum {
 	EXT4_MF_MNTDIR_SAMPLED,
-	EXT4_MF_FC_INELIGIBLE	/* Fast commit ineligible */
+	EXT4_MF_FC_INELIGIBLE,		/* Fast commit ineligible */
+	EXT4_MF_ATOMIC_WRITE_FSAWU	/* Atomic write via FSAWU */
 };
 
 static inline void ext4_set_mount_flag(struct super_block *sb, int bit)
@@ -3839,6 +3840,56 @@ static inline int ext4_buffer_uptodate(struct buffer_head *bh)
 	return buffer_uptodate(bh);
 }
 
+#define ext4_can_atomic_write_fsawu(sb)				\
+	ext4_test_mount_flag(sb, EXT4_MF_ATOMIC_WRITE_FSAWU)
+
+/**
+ * ext4_atomic_write_fsawu	Returns EXT4 filesystem atomic write unit.
+ *  @sb				super_block
+ *  This returns the filesystem min|max atomic write units.
+ *  For !bigalloc it is filesystem blocksize (fsawu_min)
+ *  For bigalloc it should be either blocksize or multiple of blocksize
+ *  (fsawu_min)
+ */
+static inline void ext4_atomic_write_fsawu(struct super_block *sb,
+					   unsigned int *fsawu_min,
+					   unsigned int *fsawu_max)
+{
+	u8 blkbits = sb->s_blocksize_bits;
+	unsigned int blocksize = 1U << blkbits;
+	unsigned int clustersize = blocksize;
+	struct block_device *bdev = sb->s_bdev;
+	unsigned int awu_min =
+			queue_atomic_write_unit_min_bytes(bdev->bd_queue);
+	unsigned int awu_max =
+			queue_atomic_write_unit_max_bytes(bdev->bd_queue);
+
+	if (ext4_has_feature_bigalloc(sb))
+		clustersize = 1U << (EXT4_SB(sb)->s_cluster_bits + blkbits);
+
+	/* fs min|max should respect awu_[min|max] units */
+	if (unlikely(awu_min > clustersize || awu_max < blocksize))
+		goto not_supported;
+
+	/* in case of !bigalloc fsawu_[min|max] should be same as blocksize */
+	if (!ext4_has_feature_bigalloc(sb)) {
+		*fsawu_min = blocksize;
+		*fsawu_max = blocksize;
+		return;
+	}
+
+	/* bigalloc can support write in blocksize units. So advertize it */
+	*fsawu_min = max(blocksize, awu_min);
+	*fsawu_max = min(clustersize, awu_max);
+
+	/* This should never happen, but let's keep a WARN_ON_ONCE */
+	WARN_ON_ONCE(!IS_ALIGNED(clustersize, *fsawu_min));
+	return;
+not_supported:
+	*fsawu_min = 0;
+	*fsawu_max = 0;
+}
+
 #endif	/* __KERNEL__ */
 
 #define EFSBADCRC	EBADMSG		/* Bad CRC detected */
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 2ccf3b5e3a7c..ea009ca9085d 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -5536,6 +5536,22 @@ int ext4_getattr(struct mnt_idmap *idmap, const struct path *path,
 		}
 	}
 
+	if (request_mask & STATX_WRITE_ATOMIC) {
+		unsigned int fsawu_min = 0, fsawu_max = 0;
+
+		/*
+		 * Get fsawu_[min|max] value which we can advertise to userspace
+		 * in statx call, if we support atomic writes using
+		 * EXT4_MF_ATOMIC_WRITE_FSAWU.
+		 */
+		if (ext4_can_atomic_write_fsawu(inode->i_sb)) {
+			ext4_atomic_write_fsawu(inode->i_sb, &fsawu_min,
+						&fsawu_max);
+		}
+
+		generic_fill_statx_atomic_writes(stat, fsawu_min, fsawu_max);
+	}
+
 	flags = ei->i_flags & EXT4_FL_USER_VISIBLE;
 	if (flags & EXT4_APPEND_FL)
 		stat->attributes |= STATX_ATTR_APPEND;
-- 
2.43.0



* [RFC 5/8] ext4: Adds direct-io atomic writes checks
  2024-03-02  7:41 ` [RFC 1/8] fs: Add FS_XFLAG_ATOMICWRITES flag Ritesh Harjani (IBM)
                     ` (2 preceding siblings ...)
  2024-03-02  7:42   ` [RFC 4/8] ext4: Add statx and other atomic write helper routines Ritesh Harjani (IBM)
@ 2024-03-02  7:42   ` Ritesh Harjani (IBM)
  2024-03-02  7:42   ` [RFC 6/8] ext4: Add an inode flag for atomic writes Ritesh Harjani (IBM)
                     ` (2 subsequent siblings)
  6 siblings, 0 replies; 28+ messages in thread
From: Ritesh Harjani (IBM) @ 2024-03-02  7:42 UTC (permalink / raw)
  To: linux-fsdevel, linux-ext4
  Cc: Ojaswin Mujoo, Jan Kara, Theodore Ts'o, Matthew Wilcox,
	Darrick J . Wong, Luis Chamberlain, John Garry, linux-kernel,
	Ritesh Harjani (IBM)

This patch adds ext4 specific checks for supporting atomic writes
using fsawu (filesystem atomic write unit). This support can be enabled
with any of -
1. bigalloc on a 4k pagesize system, or
2. a bs < ps system with -b <BS>, or
3. filesystems with LBS (large block size) support (future)

Use the generic_atomic_write_valid() helper to check the alignment
restrictions.

Co-developed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
---
 fs/ext4/file.c | 34 +++++++++++++++++++++++++++++++---
 1 file changed, 31 insertions(+), 3 deletions(-)

diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index 54d6ff22585c..8e309a9a0bd6 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -400,6 +400,21 @@ static const struct iomap_dio_ops ext4_dio_write_ops = {
 	.end_io = ext4_dio_write_end_io,
 };
 
+static bool ext4_dio_atomic_write_checks(struct kiocb *iocb,
+					 struct iov_iter *from)
+{
+	struct super_block *sb = file_inode(iocb->ki_filp)->i_sb;
+	loff_t pos = iocb->ki_pos;
+	unsigned int fsawu_min, fsawu_max;
+
+	if (!ext4_can_atomic_write_fsawu(sb))
+		return false;
+
+	ext4_atomic_write_fsawu(sb, &fsawu_min, &fsawu_max);
+
+	return generic_atomic_write_valid(pos, from, fsawu_min, fsawu_max);
+}
+
 /*
  * The intention here is to start with shared lock acquired then see if any
  * condition requires an exclusive inode lock. If yes, then we restart the
@@ -427,13 +442,19 @@ static ssize_t ext4_dio_write_checks(struct kiocb *iocb, struct iov_iter *from,
 	loff_t offset;
 	size_t count;
 	ssize_t ret;
-	bool overwrite, unaligned_io;
+	bool overwrite, unaligned_io, atomic_write;
 
 restart:
 	ret = ext4_generic_write_checks(iocb, from);
 	if (ret <= 0)
 		goto out;
 
+	atomic_write = iocb->ki_flags & IOCB_ATOMIC;
+	if (atomic_write && !ext4_dio_atomic_write_checks(iocb, from)) {
+		ret = -EINVAL;
+		goto out;
+	}
+
 	offset = iocb->ki_pos;
 	count = ret;
 
@@ -576,8 +597,15 @@ static ssize_t ext4_dio_write_iter(struct kiocb *iocb, struct iov_iter *from)
 		iomap_ops = &ext4_iomap_overwrite_ops;
 	ret = iomap_dio_rw(iocb, from, iomap_ops, &ext4_dio_write_ops,
 			   dio_flags, NULL, 0);
-	if (ret == -ENOTBLK)
-		ret = 0;
+
+	/* Fallback to buffered-io for non-atomic DIO */
+	if (ret == -ENOTBLK) {
+		if (iocb->ki_flags & IOCB_ATOMIC)
+			ret = -EIO;
+		else
+			ret = 0;
+	}
+
 	if (extend) {
 		/*
 		 * We always perform extending DIO write synchronously so by
-- 
2.43.0



* [RFC 6/8] ext4: Add an inode flag for atomic writes
  2024-03-02  7:41 ` [RFC 1/8] fs: Add FS_XFLAG_ATOMICWRITES flag Ritesh Harjani (IBM)
                     ` (3 preceding siblings ...)
  2024-03-02  7:42   ` [RFC 5/8] ext4: Adds direct-io atomic writes checks Ritesh Harjani (IBM)
@ 2024-03-02  7:42   ` Ritesh Harjani (IBM)
  2024-03-04 20:34     ` Dave Chinner
  2024-03-02  7:42   ` [RFC 7/8] ext4: Enable FMODE_CAN_ATOMIC_WRITE in open for direct-io Ritesh Harjani (IBM)
  2024-03-02  7:42   ` [RFC 8/8] ext4: Adds atomic writes using fsawu Ritesh Harjani (IBM)
  6 siblings, 1 reply; 28+ messages in thread
From: Ritesh Harjani (IBM) @ 2024-03-02  7:42 UTC (permalink / raw)
  To: linux-fsdevel, linux-ext4
  Cc: Ojaswin Mujoo, Jan Kara, Theodore Ts'o, Matthew Wilcox,
	Darrick J . Wong, Luis Chamberlain, John Garry, linux-kernel,
	Ritesh Harjani (IBM)

This patch adds an inode flag for atomic writes to ext4
(EXT4_ATOMICWRITES_FL, which uses the FS_ATOMICWRITES_FL flag).
It also adds support for setting this flag via ioctl.
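
From userspace, setting the flag via ioctl could look roughly like the sketch
below, which uses the existing FS_IOC_GETFLAGS/FS_IOC_SETFLAGS interface.
enable_atomic_writes() is a hypothetical helper name, and the fallback
definition of FS_ATOMICWRITES_FL uses the value proposed in patch 2/8. Per the
checks in this patch, the call is expected to fail with EOPNOTSUPP unless the
mount supports fsawu and the inode is an empty regular file without inline data.

```c
#include <assert.h>
#include <errno.h>
#include <sys/ioctl.h>
#include <linux/fs.h>

/* Assumption: value proposed in patch 2/8 of this series. */
#ifndef FS_ATOMICWRITES_FL
#define FS_ATOMICWRITES_FL 0x01000000
#endif

/*
 * Hypothetical helper: set the atomic writes inode flag on an open file.
 * Returns 0 on success, or -1 with errno set by ioctl().
 */
static int enable_atomic_writes(int fd)
{
	int flags;

	/* Read the current inode flags so that other bits are preserved. */
	if (ioctl(fd, FS_IOC_GETFLAGS, &flags) < 0)
		return -1;

	flags |= FS_ATOMICWRITES_FL;
	return ioctl(fd, FS_IOC_SETFLAGS, &flags);
}
```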

Co-developed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
---
 fs/ext4/ext4.h  |  6 ++++++
 fs/ext4/ioctl.c | 11 +++++++++++
 2 files changed, 17 insertions(+)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 1d2bce26e616..aa7fff2d6f96 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -495,8 +495,12 @@ struct flex_groups {
 #define EXT4_EA_INODE_FL	        0x00200000 /* Inode used for large EA */
 /* 0x00400000 was formerly EXT4_EOFBLOCKS_FL */
 
+#define EXT4_ATOMICWRITES_FL		FS_ATOMICWRITES_FL /* Inode supports atomic writes */
 #define EXT4_DAX_FL			0x02000000 /* Inode is DAX */
 
+/* 0x04000000 unused for now */
+/* 0x08000000 unused for now */
+
 #define EXT4_INLINE_DATA_FL		0x10000000 /* Inode has inline data. */
 #define EXT4_PROJINHERIT_FL		0x20000000 /* Create with parents projid */
 #define EXT4_CASEFOLD_FL		0x40000000 /* Casefolded directory */
@@ -519,6 +523,7 @@ struct flex_groups {
 					 0x00400000 /* EXT4_EOFBLOCKS_FL */ | \
 					 EXT4_DAX_FL | \
 					 EXT4_PROJINHERIT_FL | \
+					 EXT4_ATOMICWRITES_FL | \
 					 EXT4_CASEFOLD_FL)
 
 /* User visible flags */
@@ -593,6 +598,7 @@ enum {
 	EXT4_INODE_VERITY	= 20,	/* Verity protected inode */
 	EXT4_INODE_EA_INODE	= 21,	/* Inode used for large EA */
 /* 22 was formerly EXT4_INODE_EOFBLOCKS */
+	EXT4_INODE_ATOMIC_WRITE	= 24,	/* file does ATOMIC WRITE */
 	EXT4_INODE_DAX		= 25,	/* Inode is DAX */
 	EXT4_INODE_INLINE_DATA	= 28,	/* Data in inode. */
 	EXT4_INODE_PROJINHERIT	= 29,	/* Create with parents projid */
diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c
index 7160a71044c8..03d0b501cbc8 100644
--- a/fs/ext4/ioctl.c
+++ b/fs/ext4/ioctl.c
@@ -632,6 +632,17 @@ static int ext4_ioctl_setflags(struct inode *inode,
 		}
 	}
 
+	if (flags & EXT4_ATOMICWRITES_FL) {
+		if (!ext4_can_atomic_write_fsawu(sb))
+			return -EOPNOTSUPP;
+
+		/* TODO: Do we need locks to check i_reserved_data_blocks */
+		if (!S_ISREG(inode->i_mode) || ext4_has_inline_data(inode) ||
+				READ_ONCE(ei->i_disksize) ||
+				EXT4_I(inode)->i_reserved_data_blocks)
+			return -EOPNOTSUPP;
+	}
+
 	/*
 	 * Wait for all pending directio and then flush all the dirty pages
 	 * for this file.  The flush marks all the pages readonly, so any
-- 
2.43.0



* [RFC 7/8] ext4: Enable FMODE_CAN_ATOMIC_WRITE in open for direct-io
  2024-03-02  7:41 ` [RFC 1/8] fs: Add FS_XFLAG_ATOMICWRITES flag Ritesh Harjani (IBM)
                     ` (4 preceding siblings ...)
  2024-03-02  7:42   ` [RFC 6/8] ext4: Add an inode flag for atomic writes Ritesh Harjani (IBM)
@ 2024-03-02  7:42   ` Ritesh Harjani (IBM)
  2024-03-02  7:42   ` [RFC 8/8] ext4: Adds atomic writes using fsawu Ritesh Harjani (IBM)
  6 siblings, 0 replies; 28+ messages in thread
From: Ritesh Harjani (IBM) @ 2024-03-02  7:42 UTC (permalink / raw)
  To: linux-fsdevel, linux-ext4
  Cc: Ojaswin Mujoo, Jan Kara, Theodore Ts'o, Matthew Wilcox,
	Darrick J . Wong, Luis Chamberlain, John Garry, linux-kernel,
	Ritesh Harjani (IBM)

For inodes which have the EXT4_INODE_ATOMIC_WRITE flag set, enable
FMODE_CAN_ATOMIC_WRITE mode in the ext4 file open method for files opened
with O_DIRECT.

Co-developed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
---
 fs/ext4/file.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index 8e309a9a0bd6..800fd79e2738 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -913,6 +913,10 @@ static int ext4_file_open(struct inode *inode, struct file *filp)
 			return ret;
 	}
 
+	if (ext4_test_inode_flag(inode, EXT4_INODE_ATOMIC_WRITE) &&
+			(filp->f_flags & O_DIRECT))
+		filp->f_mode |= FMODE_CAN_ATOMIC_WRITE;
+
 	filp->f_mode |= FMODE_NOWAIT | FMODE_BUF_RASYNC |
 			FMODE_DIO_PARALLEL_WRITE;
 	return dquot_file_open(inode, filp);
-- 
2.43.0



* [RFC 8/8] ext4: Adds atomic writes using fsawu
  2024-03-02  7:41 ` [RFC 1/8] fs: Add FS_XFLAG_ATOMICWRITES flag Ritesh Harjani (IBM)
                     ` (5 preceding siblings ...)
  2024-03-02  7:42   ` [RFC 7/8] ext4: Enable FMODE_CAN_ATOMIC_WRITE in open for direct-io Ritesh Harjani (IBM)
@ 2024-03-02  7:42   ` Ritesh Harjani (IBM)
  6 siblings, 0 replies; 28+ messages in thread
From: Ritesh Harjani (IBM) @ 2024-03-02  7:42 UTC (permalink / raw)
  To: linux-fsdevel, linux-ext4
  Cc: Ojaswin Mujoo, Jan Kara, Theodore Ts'o, Matthew Wilcox,
	Darrick J . Wong, Luis Chamberlain, John Garry, linux-kernel,
	Ritesh Harjani (IBM)

Atomic write using fsawu (filesystem atomic write unit) means that a
filesystem can support atomic writes as long as all of the
constraints below are satisfied -
1. The underlying block device HW supports atomic writes.
2. fsawu_[min|max] (fs blocksize or bigalloc cluster size) is
   within the HW boundary range of awu_min and awu_max.

If these constraints are satisfied, then the filesystem can do atomic
writes. No underlying filesystem layout changes are required to
enable this. This patch enables this support in ext4 at mount time
if the underlying HW supports it.
We set a runtime mount flag to enable this support.
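
To make the two constraints above concrete, here is a minimal user-space
sketch in C. It is not the code from this patch: struct hw_atomic_limits
and fsawu_fits_hw() are hypothetical names, and the device limits are
simply passed in rather than queried from the block layer.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the limits a block device would report. */
struct hw_atomic_limits {
	unsigned int awu_min;	/* smallest atomic write unit, in bytes */
	unsigned int awu_max;	/* largest atomic write unit, in bytes */
};

/*
 * Return true when a filesystem atomic write unit (block size, or
 * bigalloc cluster size) could be used for atomic writes on this device.
 */
static bool fsawu_fits_hw(const struct hw_atomic_limits *hw, unsigned int fsawu)
{
	/* Constraint 1: the device must advertise atomic write support. */
	if (hw->awu_min == 0 || hw->awu_max == 0)
		return false;

	/* Constraint 2: fsawu must lie within [awu_min, awu_max]. */
	return fsawu >= hw->awu_min && fsawu <= hw->awu_max;
}
```

For example, with awu_min = 4k and awu_max = 64k, a 16k blocksize would
qualify while a 128k cluster size would not.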

After this patch, ext4 can support atomic writes using pwritev2's
RWF_ATOMIC flag with direct-io on a filesystem created with -
1. mkfs.ext4 -b <BS=8k/16k/32k/64k> <dev_path>
   (for a large pagesize system)
2. mkfs.ext4 -b <BS> -C <CS> <dev_path> (with bigalloc)
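
From user space, such a write might be issued roughly as below. This is
only a sketch: atomic_pwrite() is a hypothetical helper, and the fallback
value for RWF_ATOMIC is an assumption taken from the mainline uapi header,
which may not match the series this RFC builds on.

```c
#define _GNU_SOURCE
#include <sys/types.h>
#include <sys/uio.h>

/*
 * Assumption: fall back to the mainline uapi value when the installed
 * headers do not define the flag. The RFC series may use another value.
 */
#ifndef RWF_ATOMIC
#define RWF_ATOMIC 0x00000040
#endif

/*
 * Issue one untorn write of len bytes at file offset off.
 * fd is expected to be opened with O_DIRECT on a file that has atomic
 * writes enabled. Returns the number of bytes written, or -1 with errno
 * set.
 */
static ssize_t atomic_pwrite(int fd, void *buf, size_t len, off_t off)
{
	struct iovec iov = { .iov_base = buf, .iov_len = len };

	/* A single iovec, so the request maps to one user buffer. */
	return pwritev2(fd, &iov, 1, off, RWF_ATOMIC);
}
```

The buffer, length and offset would still need to satisfy the alignment
rules (for instance a buffer obtained from posix_memalign()); a request
that does not is expected to be rejected rather than written torn.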

Co-developed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
---
 fs/ext4/ext4.h  | 28 ++++++++++++++++++++++++++++
 fs/ext4/super.c |  1 +
 2 files changed, 29 insertions(+)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index aa7fff2d6f96..529ca32b9813 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -3896,6 +3896,34 @@ static inline void ext4_atomic_write_fsawu(struct super_block *sb,
 	*fsawu_max = 0;
 }
 
+/**
+ * ext4_init_atomic_write	ext4 init atomic writes using fsawu
+ * @sb				super_block
+ *
+ * Function to initialize atomic/untorn write support using fsawu.
+ * TODO: In future, when mballoc will get aligned allocations support,
+ * then we can enable atomic write support for ext4 without fsawu restrictions.
+ */
+static inline void ext4_init_atomic_write(struct super_block *sb)
+{
+	struct block_device *bdev = sb->s_bdev;
+	unsigned int fsawu_min, fsawu_max;
+
+	if (!ext4_has_feature_extents(sb))
+		return;
+
+	if (!bdev_can_atomic_write(bdev))
+		return;
+
+	ext4_atomic_write_fsawu(sb, &fsawu_min, &fsawu_max);
+	if (fsawu_min && fsawu_max) {
+		ext4_set_mount_flag(sb, EXT4_MF_ATOMIC_WRITE_FSAWU);
+		ext4_msg(sb, KERN_NOTICE,
+			 "Supports atomic writes using EXT4_MF_ATOMIC_WRITE_FSAWU, fsawu_min %u fsawu_max: %u",
+			 fsawu_min, fsawu_max);
+	}
+}
+
 #endif	/* __KERNEL__ */
 
 #define EFSBADCRC	EBADMSG		/* Bad CRC detected */
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 0f931d0c227d..971bfd093997 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -5352,6 +5352,7 @@ static int __ext4_fill_super(struct fs_context *fc, struct super_block *sb)
 	mutex_init(&sbi->s_orphan_lock);
 
 	ext4_fast_commit_init(sb);
+	ext4_init_atomic_write(sb);
 
 	sb->s_root = NULL;
 
-- 
2.43.0



* [RFC 9/9] e2fsprogs/chattr: Supports atomic writes attribute
  2024-03-02  7:41 [RFC 0/9] ext4: Add direct-io atomic write support using fsawu Ritesh Harjani (IBM)
  2024-03-02  7:41 ` [RFC 1/8] fs: Add FS_XFLAG_ATOMICWRITES flag Ritesh Harjani (IBM)
@ 2024-03-02  7:42 ` Ritesh Harjani (IBM)
  2024-03-06 11:22 ` [RFC 0/9] ext4: Add direct-io atomic write support using fsawu John Garry
  2 siblings, 0 replies; 28+ messages in thread
From: Ritesh Harjani (IBM) @ 2024-03-02  7:42 UTC (permalink / raw)
  To: linux-fsdevel, linux-ext4
  Cc: Ojaswin Mujoo, Jan Kara, Theodore Ts'o, Matthew Wilcox,
	Darrick J . Wong, Luis Chamberlain, John Garry, linux-kernel,
	Ritesh Harjani (IBM)

This adds 'W', the atomic write attribute, to chattr.

Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
---
 lib/e2p/pf.c         |  1 +
 lib/ext2fs/ext2_fs.h |  2 +-
 misc/chattr.1.in     | 18 ++++++++++++++----
 misc/chattr.c        |  3 ++-
 4 files changed, 18 insertions(+), 6 deletions(-)

diff --git a/lib/e2p/pf.c b/lib/e2p/pf.c
index 81e3bb26..9b311477 100644
--- a/lib/e2p/pf.c
+++ b/lib/e2p/pf.c
@@ -45,6 +45,7 @@ static struct flags_name flags_array[] = {
 	{ EXT4_EXTENTS_FL, "e", "Extents" },
 	{ FS_NOCOW_FL, "C", "No_COW" },
 	{ FS_DAX_FL, "x", "DAX" },
+	{ FS_ATOMICWRITES_FL, "W", "ATOMIC_WRITES" },
 	{ EXT4_CASEFOLD_FL, "F", "Casefold" },
 	{ EXT4_INLINE_DATA_FL, "N", "Inline_Data" },
 	{ EXT4_PROJINHERIT_FL, "P", "Project_Hierarchy" },
diff --git a/lib/ext2fs/ext2_fs.h b/lib/ext2fs/ext2_fs.h
index 0fc9c09a..f9dcf71f 100644
--- a/lib/ext2fs/ext2_fs.h
+++ b/lib/ext2fs/ext2_fs.h
@@ -346,7 +346,7 @@ struct ext2_dx_tail {
 #define EXT4_EA_INODE_FL	        0x00200000 /* Inode used for large EA */
 /* EXT4_EOFBLOCKS_FL 0x00400000 was here */
 #define FS_NOCOW_FL			0x00800000 /* Do not cow file */
-#define EXT4_SNAPFILE_FL		0x01000000  /* Inode is a snapshot */
+#define FS_ATOMICWRITES_FL		0x01000000  /* Inode can do atomic writes */
 #define FS_DAX_FL			0x02000000 /* Inode is DAX */
 #define EXT4_SNAPFILE_DELETED_FL	0x04000000  /* Snapshot is being deleted */
 #define EXT4_SNAPFILE_SHRUNK_FL		0x08000000  /* Snapshot shrink has completed */
diff --git a/misc/chattr.1.in b/misc/chattr.1.in
index 50c54e7d..22757123 100644
--- a/misc/chattr.1.in
+++ b/misc/chattr.1.in
@@ -26,7 +26,7 @@ changes the file attributes on a Linux file system.
 The format of a symbolic
 .I mode
 is
-.BR +-= [ aAcCdDeFijmPsStTux ].
+.BR +-= [ aAcCdDeFijmPsStTuxW ].
 .PP
 The operator
 .RB ' + '
@@ -38,7 +38,7 @@ causes them to be removed; and
 causes them to be the only attributes that the files have.
 .PP
 The letters
-.RB ' aAcCdDeFijmPsStTux '
+.RB ' aAcCdDeFijmPsStTuxW '
 select the new attributes for the files:
 append only
 .RB ( a ),
@@ -74,8 +74,10 @@ top of directory hierarchy
 .RB ( T ),
 undeletable
 .RB ( u ),
-and direct access for files
-.RB ( x ).
+direct access for files
+.RB ( x ),
+and atomic writes for files
+.RB ( W ).
 .PP
 The following attributes are read-only, and may be listed by
 .BR lsattr (1)
@@ -263,6 +265,14 @@ directory.  If an existing directory has contained some files and
 subdirectories, modifying the attribute on the parent directory doesn't
 change the attributes on these files and subdirectories.
 .TP
+.B W
+The 'W' attribute can only be set on a regular file. A file which has this
+attribute set can do untorn writes, i.e. if an atomic write is requested by
+the user with proper alignment and atomic flags set (such as RWF_ATOMIC), then
+a subsequent read of those block(s) will return either the entire new data or
+the entire old data (in case of a power failure). The block(s) written can
+never contain a mix of both.
+.TP
 .B V
 A file with the 'V' attribute set has fs-verity enabled.  It cannot be
 written to, and the file system will automatically verify all data read
diff --git a/misc/chattr.c b/misc/chattr.c
index c7382a37..24db790e 100644
--- a/misc/chattr.c
+++ b/misc/chattr.c
@@ -86,7 +86,7 @@ static unsigned long sf;
 static void usage(void)
 {
 	fprintf(stderr,
-		_("Usage: %s [-RVf] [-+=aAcCdDeijPsStTuFx] [-p project] [-v version] files...\n"),
+		_("Usage: %s [-RVf] [-+=aAcCdDeijPsStTuFxW] [-p project] [-v version] files...\n"),
 		program_name);
 	exit(1);
 }
@@ -114,6 +114,7 @@ static const struct flags_char flags_array[] = {
 	{ EXT2_TOPDIR_FL, 'T' },
 	{ FS_NOCOW_FL, 'C' },
 	{ FS_DAX_FL, 'x' },
+	{ FS_ATOMICWRITES_FL, 'W' },
 	{ EXT4_CASEFOLD_FL, 'F' },
 	{ 0, 0 }
 };
--
2.39.2



* Re: [RFC 2/8] fs: Reserve inode flag FS_ATOMICWRITES_FL for atomic writes
  2024-03-02  7:41   ` [RFC 2/8] fs: Reserve inode flag FS_ATOMICWRITES_FL for atomic writes Ritesh Harjani (IBM)
@ 2024-03-04  0:59     ` Dave Chinner
  2024-03-08  7:19       ` Ojaswin Mujoo
  0 siblings, 1 reply; 28+ messages in thread
From: Dave Chinner @ 2024-03-04  0:59 UTC (permalink / raw)
  To: Ritesh Harjani (IBM)
  Cc: linux-fsdevel, linux-ext4, Ojaswin Mujoo, Jan Kara,
	Theodore Ts'o, Matthew Wilcox, Darrick J . Wong,
	Luis Chamberlain, John Garry, linux-kernel

On Sat, Mar 02, 2024 at 01:11:59PM +0530, Ritesh Harjani (IBM) wrote:
> This reserves FS_ATOMICWRITES_FL for flags and adds support in
> fileattr to support atomic writes flag & xflag needed for ext4
> and xfs.
> 
> Co-developed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> ---
>  fs/ioctl.c               | 4 ++++
>  include/linux/fileattr.h | 4 ++--
>  include/uapi/linux/fs.h  | 1 +
>  3 files changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/ioctl.c b/fs/ioctl.c
> index 76cf22ac97d7..e0f7fae4777e 100644
> --- a/fs/ioctl.c
> +++ b/fs/ioctl.c
> @@ -481,6 +481,8 @@ void fileattr_fill_xflags(struct fileattr *fa, u32 xflags)
>  		fa->flags |= FS_DAX_FL;
>  	if (fa->fsx_xflags & FS_XFLAG_PROJINHERIT)
>  		fa->flags |= FS_PROJINHERIT_FL;
> +	if (fa->fsx_xflags & FS_XFLAG_ATOMICWRITES)
> +		fa->flags |= FS_ATOMICWRITES_FL;
>  }
>  EXPORT_SYMBOL(fileattr_fill_xflags);
>  
> @@ -511,6 +513,8 @@ void fileattr_fill_flags(struct fileattr *fa, u32 flags)
>  		fa->fsx_xflags |= FS_XFLAG_DAX;
>  	if (fa->flags & FS_PROJINHERIT_FL)
>  		fa->fsx_xflags |= FS_XFLAG_PROJINHERIT;
> +	if (fa->flags & FS_ATOMICWRITES_FL)
> +		fa->fsx_xflags |= FS_XFLAG_ATOMICWRITES;
>  }
>  EXPORT_SYMBOL(fileattr_fill_flags);
>  
> diff --git a/include/linux/fileattr.h b/include/linux/fileattr.h
> index 47c05a9851d0..ae9329afa46b 100644
> --- a/include/linux/fileattr.h
> +++ b/include/linux/fileattr.h
> @@ -7,12 +7,12 @@
>  #define FS_COMMON_FL \
>  	(FS_SYNC_FL | FS_IMMUTABLE_FL | FS_APPEND_FL | \
>  	 FS_NODUMP_FL |	FS_NOATIME_FL | FS_DAX_FL | \
> -	 FS_PROJINHERIT_FL)
> +	 FS_PROJINHERIT_FL | FS_ATOMICWRITES_FL)
>  
>  #define FS_XFLAG_COMMON \
>  	(FS_XFLAG_SYNC | FS_XFLAG_IMMUTABLE | FS_XFLAG_APPEND | \
>  	 FS_XFLAG_NODUMP | FS_XFLAG_NOATIME | FS_XFLAG_DAX | \
> -	 FS_XFLAG_PROJINHERIT)
> +	 FS_XFLAG_PROJINHERIT | FS_XFLAG_ATOMICWRITES)

I'd much prefer that we only use a single user API to set/clear this
flag.

This functionality is going to be tied to using extent size hints on
XFS to indicate preferred atomic IO alignment/size, so applications
are going to have to use the FS_IOC_FS{G,S}ETXATTR APIs regardless
of whether it's added to the FS_IOC_{G,S}ETFLAGS API.
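
For reference, setting an xflag through that interface typically looks
like the sketch below. set_xflag() is a hypothetical helper; the flag is
passed in by the caller so that no value for FS_XFLAG_ATOMICWRITES is
assumed here.

```c
#include <linux/fs.h>
#include <sys/ioctl.h>

/*
 * Set one FS_XFLAG_* bit on an open file via the fsxattr interface.
 * Returns 0 on success, or -1 with errno set.
 */
static int set_xflag(int fd, unsigned int xflag)
{
	struct fsxattr fsx;

	/* Read the current attributes so unrelated fields are preserved. */
	if (ioctl(fd, FS_IOC_FSGETXATTR, &fsx) < 0)
		return -1;

	fsx.fsx_xflags |= xflag;

	/* Write the updated attribute set back. */
	return ioctl(fd, FS_IOC_FSSETXATTR, &fsx) < 0 ? -1 : 0;
}
```

The same get/modify/set round trip is where an extent size hint
(fsx_extsize) would be set, which is why one call path can cover both.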

Also, there are relatively few flags left in the SETFLAGS 32-bit
space, so this duplication seems like a waste of the few flags
that are remaining.

-Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: [RFC 3/8] iomap: Add atomic write support for direct-io
  2024-03-02  7:42   ` [RFC 3/8] iomap: Add atomic write support for direct-io Ritesh Harjani (IBM)
@ 2024-03-04  1:16     ` Dave Chinner
  2024-03-04  5:33       ` Ritesh Harjani
  0 siblings, 1 reply; 28+ messages in thread
From: Dave Chinner @ 2024-03-04  1:16 UTC (permalink / raw)
  To: Ritesh Harjani (IBM)
  Cc: linux-fsdevel, linux-ext4, Ojaswin Mujoo, Jan Kara,
	Theodore Ts'o, Matthew Wilcox, Darrick J . Wong,
	Luis Chamberlain, John Garry, linux-kernel

On Sat, Mar 02, 2024 at 01:12:00PM +0530, Ritesh Harjani (IBM) wrote:
> This adds direct-io atomic writes support in iomap. This adds -
> 1. IOMAP_ATOMIC flag for iomap iter.
> 2. Sets REQ_ATOMIC to bio opflags.
> 3. Adds necessary checks in iomap_dio code to ensure a single bio is
>    submitted for an atomic write request. (since we only support ubuf
>    type iocb). Otherwise return an error EIO.
> 4. Adds a common helper routine iomap_dio_check_atomic(). It helps in
>    verifying mapped length and start/end physical offset against the hw
>    device constraints for supporting atomic writes.
> 
> This patch is based on a patch from John Garry <john.g.garry@oracle.com>
> which adds such support of DIO atomic writes to iomap.
> 
> Co-developed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> ---
>  fs/iomap/direct-io.c  | 75 +++++++++++++++++++++++++++++++++++++++++--
>  fs/iomap/trace.h      |  3 +-
>  include/linux/iomap.h |  1 +
>  3 files changed, 75 insertions(+), 4 deletions(-)

Ugh. Now we have two competing sets of changes to bring RWF_ATOMIC
support to iomap. One from John here:

https://lore.kernel.org/linux-fsdevel/20240124142645.9334-1-john.g.garry@oracle.com/

and now this one.

Can the two of you please co-ordinate your efforts and base your
filesystem work off the same iomap infrastructure changes?

.....

> @@ -356,6 +360,11 @@ static loff_t iomap_dio_bio_iter(const struct iomap_iter *iter,
>  	if (need_zeroout) {
>  		/* zero out from the start of the block to the write offset */
>  		pad = pos & (fs_block_size - 1);
> +		if (unlikely(pad && atomic_write)) {
> +			WARN_ON_ONCE("pos not atomic write aligned\n");
> +			ret = -EINVAL;
> +			goto out;
> +		}

This atomic IO should have been rejected before it even got to
the layers where the bios are being built. If the IO alignment is
such that it does not align to filesystem allocation constraints, it
should be rejected at the filesystem ->write_iter() method and not
even get to the iomap layer.

.....

> @@ -516,6 +535,44 @@ static loff_t iomap_dio_iter(const struct iomap_iter *iter,
>  	}
>  }
>  
> +/*
> + * iomap_dio_check_atomic:	DIO Atomic checks before calling bio submission.
> + * @iter:			iomap iterator
> + * This function is called after filesystem block mapping and before bio
> + * formation/submission. This is the right place to verify hw device/block
> + * layer constraints to be followed for doing atomic writes. Hence do those
> + * common checks here.
> + */
> +static bool iomap_dio_check_atomic(struct iomap_iter *iter)
> +{
> +	struct block_device *bdev = iter->iomap.bdev;
> +	unsigned long long map_len = iomap_length(iter);
> +	unsigned long long start = iomap_sector(&iter->iomap, iter->pos)
> +						<< SECTOR_SHIFT;
> +	unsigned long long end = start + map_len - 1;
> +	unsigned int awu_min =
> +			queue_atomic_write_unit_min_bytes(bdev->bd_queue);
> +	unsigned int awu_max =
> +			queue_atomic_write_unit_max_bytes(bdev->bd_queue);
> +	unsigned long boundary =
> +			queue_atomic_write_boundary_bytes(bdev->bd_queue);
> +	unsigned long mask = ~(boundary - 1);
> +
> +
> +	/* map_len should be same as user specified iter->len */
> +	if (map_len < iter->len)
> +		return false;
> +	/* start should be aligned to block device min atomic unit alignment */
> +	if (!IS_ALIGNED(start, awu_min))
> +		return false;
> +	/* If top bits doesn't match, means atomic unit boundary is crossed */
> +	if (boundary && ((start | mask) != (end | mask)))
> +		return false;
> +
> +	return true;
> +}

I think you are re-implementing stuff that John has already done at
higher layers and in a generic manner. i.e.
generic_atomic_write_valid() in this patch:

https://lore.kernel.org/linux-fsdevel/20240226173612.1478858-4-john.g.garry@oracle.com/

We shouldn't be getting anywhere near the iomap layer if the IO is
not properly aligned to atomic IO constraints...
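
As a rough illustration of rejecting a request early, before any mapping
is done, a check on the logical offset and length could look like the
sketch below. The rules encoded here (power-of-two length within the unit
limits, naturally aligned offset) are assumptions for illustration, and
atomic_request_ok() is a hypothetical name, not the helper from John's
patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True when x is a non-zero power of two. */
static bool is_pow2(uint64_t x)
{
	return x != 0 && (x & (x - 1)) == 0;
}

/*
 * Hypothetical early check on the user-visible request.
 * pos and len describe the write in file (logical) terms; unit_min and
 * unit_max are the atomic write unit limits, in bytes.
 */
static bool atomic_request_ok(uint64_t pos, size_t len,
			      unsigned int unit_min, unsigned int unit_max)
{
	/* Assumed rule: the length is a power of two ... */
	if (!is_pow2(len))
		return false;
	/* ... that lies within [unit_min, unit_max] ... */
	if (len < unit_min || len > unit_max)
		return false;
	/* ... and the file offset is naturally aligned to that length. */
	return (pos & (len - 1)) == 0;
}
```

A filesystem ->write_iter() method could run a check of this kind and
return an error before the request ever reaches iomap.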

So, yeah, can you please co-ordinate the development of this
patchset with John and the work that has already been done to
support this functionality on block devices and XFS?

-Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: [RFC 3/8] iomap: Add atomic write support for direct-io
  2024-03-04  1:16     ` Dave Chinner
@ 2024-03-04  5:33       ` Ritesh Harjani
  2024-03-04  8:49         ` John Garry
  2024-03-04 20:56         ` Dave Chinner
  0 siblings, 2 replies; 28+ messages in thread
From: Ritesh Harjani @ 2024-03-04  5:33 UTC (permalink / raw)
  To: Dave Chinner
  Cc: linux-fsdevel, linux-ext4, Ojaswin Mujoo, Jan Kara,
	Theodore Ts'o, Matthew Wilcox, Darrick J . Wong,
	Luis Chamberlain, John Garry, linux-kernel

Dave Chinner <david@fromorbit.com> writes:

> On Sat, Mar 02, 2024 at 01:12:00PM +0530, Ritesh Harjani (IBM) wrote:
>> This adds direct-io atomic writes support in iomap. This adds -
>> 1. IOMAP_ATOMIC flag for iomap iter.
>> 2. Sets REQ_ATOMIC to bio opflags.
>> 3. Adds necessary checks in iomap_dio code to ensure a single bio is
>>    submitted for an atomic write request. (since we only support ubuf
>>    type iocb). Otherwise return an error EIO.
>> 4. Adds a common helper routine iomap_dio_check_atomic(). It helps in
>>    verifying mapped length and start/end physical offset against the hw
>>    device constraints for supporting atomic writes.
>> 
>> This patch is based on a patch from John Garry <john.g.garry@oracle.com>
>> which adds such support of DIO atomic writes to iomap.

Please note the comment above. I will refer to it in the comments below.

>> 
>> Co-developed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
>> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
>> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
>> ---
>>  fs/iomap/direct-io.c  | 75 +++++++++++++++++++++++++++++++++++++++++--
>>  fs/iomap/trace.h      |  3 +-
>>  include/linux/iomap.h |  1 +
>>  3 files changed, 75 insertions(+), 4 deletions(-)
>
> Ugh. Now we have two competing sets of changes to bring RWF_ATOMIC
> support to iomap. One from John here:

These are not competing changes (nor was that the intention). As you can see, I
commented above that this patch is based on a previous iomap patch
from John.

So why did I send this one?
1. John's latest patch series v5 was on "block atomic writes" [1], which
does not have these checks in iomap (as they were not required).

2. For the sake of completeness of ext4 atomic write support, I needed to
include this change in this series. I have also tried to address all
the review comments he got on [2] (along with an extra function, iomap_dio_check_atomic()).

[1]: https://lore.kernel.org/all/20240226173612.1478858-1-john.g.garry@oracle.com/
[2]: https://lore.kernel.org/linux-fsdevel/20240124142645.9334-1-john.g.garry@oracle.com/

>
> https://lore.kernel.org/linux-fsdevel/20240124142645.9334-1-john.g.garry@oracle.com/
>
> and now this one.
>
> Can the two of you please co-ordinate your efforts and base your
> filesystem work off the same iomap infrastructure changes?

Sure Dave, makes sense. But we are cc'ing each other in this effort
together so that we are aware of what is being worked upon.

And as I mentioned, this change is not competing with John's change. If
anything, it only complements his initial change, since this iomap change
addresses review comments from others on the previous one and adds one
extra check (on the mapped physical extent) on which I wanted people to provide feedback.

>
> .....
>
>> @@ -356,6 +360,11 @@ static loff_t iomap_dio_bio_iter(const struct iomap_iter *iter,
>>  	if (need_zeroout) {
>>  		/* zero out from the start of the block to the write offset */
>>  		pad = pos & (fs_block_size - 1);
>> +		if (unlikely(pad && atomic_write)) {
>> +			WARN_ON_ONCE("pos not atomic write aligned\n");
>> +			ret = -EINVAL;
>> +			goto out;
>> +		}
>
> This atomic IO should have been rejected before it even got to
> the layers where the bios are being built. If the IO alignment is
> such that it does not align to filesystem allocation constraints, it
> should be rejected at the filesystem ->write_iter() method and not
> even get to the iomap layer.

I had added this mainly from an iomap sanity-checking perspective.
We are offloading some checks to the filesystem, to be made before
it submits the I/O request to iomap.
These "common" checks in the iomap layer mainly provide sanity checking
to make sure the FS did its job, before iomap forms/processes the bios and
submits them to the block layer via submit_bio.



>
> .....
>
>> @@ -516,6 +535,44 @@ static loff_t iomap_dio_iter(const struct iomap_iter *iter,
>>  	}
>>  }
>>  
>> +/*
>> + * iomap_dio_check_atomic:	DIO Atomic checks before calling bio submission.
>> + * @iter:			iomap iterator
>> + * This function is called after filesystem block mapping and before bio
>> + * formation/submission. This is the right place to verify hw device/block
>> + * layer constraints to be followed for doing atomic writes. Hence do those
>> + * common checks here.
>> + */
>> +static bool iomap_dio_check_atomic(struct iomap_iter *iter)
>> +{
>> +	struct block_device *bdev = iter->iomap.bdev;
>> +	unsigned long long map_len = iomap_length(iter);
>> +	unsigned long long start = iomap_sector(&iter->iomap, iter->pos)
>> +						<< SECTOR_SHIFT;
>> +	unsigned long long end = start + map_len - 1;
>> +	unsigned int awu_min =
>> +			queue_atomic_write_unit_min_bytes(bdev->bd_queue);
>> +	unsigned int awu_max =
>> +			queue_atomic_write_unit_max_bytes(bdev->bd_queue);
>> +	unsigned long boundary =
>> +			queue_atomic_write_boundary_bytes(bdev->bd_queue);
>> +	unsigned long mask = ~(boundary - 1);
>> +
>> +
>> +	/* map_len should be same as user specified iter->len */
>> +	if (map_len < iter->len)
>> +		return false;
>> +	/* start should be aligned to block device min atomic unit alignment */
>> +	if (!IS_ALIGNED(start, awu_min))
>> +		return false;
>> +	/* If top bits doesn't match, means atomic unit boundary is crossed */
>> +	if (boundary && ((start | mask) != (end | mask)))
>> +		return false;
>> +
>> +	return true;
>> +}
>
> I think you are re-implementing stuff that John has already done at
> higher layers and in a generic manner. i.e.
> generic_atomic_write_valid() in this patch:
>
> https://lore.kernel.org/linux-fsdevel/20240226173612.1478858-4-john.g.garry@oracle.com/
>
> We shouldn't be getting anywhere near the iomap layer if the IO is
> not properly aligned to atomic IO constraints...

So the current generic_atomic_write_valid() function mainly checks alignment
w.r.t. the logical offset and iter->len.

What this function checks is the physical block offset and the
mapped extent length. Hence it is called after the iomap_iter() call,
i.e. ...

 +	/* map_len should be same as user specified iter->len */
 +	if (map_len < iter->len)
 +		return false;
 +	/* start should be aligned to block device min atomic unit alignment */
 +	if (!IS_ALIGNED(start, awu_min))
 +		return false;


But I agree that maybe we can improve generic_atomic_write_valid()
to be able to work on both the logical and physical offset, and on
iter->len + mapped len.

Let me think about it.

However, the point on which I would like feedback from others is -
1. After the filesystem has returned the mapped extent in the iomap_iter() call,
iomap will form a bio to be sent to the block layer.
So do we agree to add a check here in the iomap layer to verify that the
mapped physical start and len satisfy the requirements for doing
atomic writes?
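
The kind of post-mapping check being asked about can be sketched in
user-space C as below. This is an illustration under assumptions, not the
helper from the patch: mapped_extent_ok() and its parameters are
hypothetical, all values are in bytes, and awu_min and boundary are
assumed to be powers of two.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical check on a mapped extent.
 * start, map_len: physical start and length of the mapping.
 * req_len:        length the caller asked to write.
 * awu_min:        device minimum atomic write unit.
 * boundary:       device atomic write boundary, 0 if there is none.
 */
static bool mapped_extent_ok(uint64_t start, uint64_t map_len,
			     uint64_t req_len, uint64_t awu_min,
			     uint64_t boundary)
{
	uint64_t end;

	/* The whole request must fit in this single mapping. */
	if (req_len == 0 || map_len < req_len)
		return false;
	/* The physical start must be aligned to the minimum atomic unit. */
	if (awu_min == 0 || (start & (awu_min - 1)) != 0)
		return false;
	/* First and last byte must fall in the same boundary-sized window. */
	end = start + req_len - 1;
	if (boundary && (start & ~(boundary - 1)) != (end & ~(boundary - 1)))
		return false;
	return true;
}
```

Masking with & keeps only the bits that identify the boundary-sized
window, so two addresses compare equal exactly when they sit in the same
window; OR-ing with the same mask would instead set those bits and leave
only the offsets within the window to compare.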

>
> So, yeah, can you please co-ordinate the development of this
> patchset with John and the work that has already been done to
> support this functionality on block devices and XFS?

We actually are, in a way. As you can see, this ext4 series sits on top of
John's v5 series of "block atomic write". This patch [1] ([RFC 5/8], part
of this series) in ext4 does use the generic_atomic_write_valid() function
to check DIO atomic write validity.

[1]: https://lore.kernel.org/linux-ext4/e332979deb70913c2c476a059b09015904a5b007.1709361537.git.ritesh.list@gmail.com/T/#u


Thanks for your review!

-ritesh


* Re: [RFC 3/8] iomap: Add atomic write support for direct-io
  2024-03-04  5:33       ` Ritesh Harjani
@ 2024-03-04  8:49         ` John Garry
  2024-03-04 10:31           ` Ritesh Harjani
  2024-03-04 20:56         ` Dave Chinner
  1 sibling, 1 reply; 28+ messages in thread
From: John Garry @ 2024-03-04  8:49 UTC (permalink / raw)
  To: Ritesh Harjani (IBM), Dave Chinner
  Cc: linux-fsdevel, linux-ext4, Ojaswin Mujoo, Jan Kara,
	Theodore Ts'o, Matthew Wilcox, Darrick J . Wong,
	Luis Chamberlain, linux-kernel


>>
>> https://lore.kernel.org/linux-fsdevel/20240124142645.9334-1-john.g.garry@oracle.com/
>>
>> and now this one.
>>
>> Can the two of you please co-ordinate your efforts and base your
>> filesystem work off the same iomap infrastructure changes?
> 
> Sure Dave, makes sense. But we are cc'ing each other in this effort
> together so that we are aware of what is being worked upon.

Just cc'ing is not enough. I was going to send my v2 for XFS/iomap 
support today. I didn't announce that as I did not think that I had to. 
Admittedly it will be effectively an RFC, as the forcealign feature (now 
included) is not mature. But it's going to be a bit awkward to have 2x 
overlapping series sent to the list.

FWIW, I think that it's better to send series based on top of other 
series, rather than cherry-picking necessary parts of other series (when 
posting)

Thanks,
John


* Re: [RFC 3/8] iomap: Add atomic write support for direct-io
  2024-03-04  8:49         ` John Garry
@ 2024-03-04 10:31           ` Ritesh Harjani
  0 siblings, 0 replies; 28+ messages in thread
From: Ritesh Harjani @ 2024-03-04 10:31 UTC (permalink / raw)
  To: John Garry, Dave Chinner
  Cc: linux-fsdevel, linux-ext4, Ojaswin Mujoo, Jan Kara,
	Theodore Ts'o, Matthew Wilcox, Darrick J . Wong,
	Luis Chamberlain, linux-kernel

John Garry <john.g.garry@oracle.com> writes:

>>>
>>> https://lore.kernel.org/linux-fsdevel/20240124142645.9334-1-john.g.garry@oracle.com/
>>>
>>> and now this one.
>>>
>>> Can the two of you please co-ordinate your efforts and base your
>>> filesystem work off the same iomap infrastructure changes?
>> 
>> Sure Dave, makes sense. But we are cc'ing each other in this effort
>> together so that we are aware of what is being worked upon.
>
> Just cc'ing is not enough. I was going to send my v2 for XFS/iomap 
> support today. I didn't announce that as I did not think that I had to. 

OK. I will take care of this next time, so that overlapping changes do not
hit the mailing list and cause double reviews/competing changes from
2 people. Hopefully I can find you on the xfs IRC channel in case I would
like to post anything in a related/overlapping area. My handle is riteshh.

> Admittedly it will be effectively an RFC, as the forcealign feature (now 
> included) is not mature. But it's going to be a bit awkward to have 2x 
> overlapping series sent to the list.
>
> FWIW, I think that it's better to send series based on top of other 
> series, rather than cherry-picking necessary parts of other series (when 
> posting)
>

OK. Sure John, makes sense. Now that I understand what I am looking for
from the iomap side of the changes, I can provide my review comments on
your series whenever you post it.

Thanks for your feedback.

-ritesh


* Re: [RFC 6/8] ext4: Add an inode flag for atomic writes
  2024-03-02  7:42   ` [RFC 6/8] ext4: Add an inode flag for atomic writes Ritesh Harjani (IBM)
@ 2024-03-04 20:34     ` Dave Chinner
  2024-03-08  8:02       ` Ritesh Harjani
  0 siblings, 1 reply; 28+ messages in thread
From: Dave Chinner @ 2024-03-04 20:34 UTC (permalink / raw)
  To: Ritesh Harjani (IBM)
  Cc: linux-fsdevel, linux-ext4, Ojaswin Mujoo, Jan Kara,
	Theodore Ts'o, Matthew Wilcox, Darrick J . Wong,
	Luis Chamberlain, John Garry, linux-kernel

On Sat, Mar 02, 2024 at 01:12:03PM +0530, Ritesh Harjani (IBM) wrote:
> This patch adds an inode atomic writes flag to ext4
> (EXT4_ATOMICWRITES_FL which uses FS_ATOMICWRITES_FL flag).
> Also add support for setting of this flag via ioctl.
> 
> Co-developed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> ---
>  fs/ext4/ext4.h  |  6 ++++++
>  fs/ext4/ioctl.c | 11 +++++++++++
>  2 files changed, 17 insertions(+)
> 
> diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
> index 1d2bce26e616..aa7fff2d6f96 100644
> --- a/fs/ext4/ext4.h
> +++ b/fs/ext4/ext4.h
> @@ -495,8 +495,12 @@ struct flex_groups {
>  #define EXT4_EA_INODE_FL	        0x00200000 /* Inode used for large EA */
>  /* 0x00400000 was formerly EXT4_EOFBLOCKS_FL */
>  
> +#define EXT4_ATOMICWRITES_FL		FS_ATOMICWRITES_FL /* Inode supports atomic writes */
>  #define EXT4_DAX_FL			0x02000000 /* Inode is DAX */

Tying the on disk format to the kernel user API is a poor choice.
While the flag bits might have the same value, anything parsing the
on-disk format should not be required to include kernel syscall API
header files just to get all the on-disk format definitions it
needs.
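
One way to keep the two apart, sketched here under assumptions, is to give
the on-disk flag its own literal value and only cross-check it against the
uapi flag where both happen to be visible. EXT4_ATOMICWRITES_FL_ONDISK and
inode_has_atomic_writes() are hypothetical names; the value is the one the
e2fsprogs patch in this thread assigns to the flag.

```c
#include <stdint.h>

/*
 * On-disk inode flag, defined with its own literal value so that code
 * parsing the disk format needs no kernel syscall API headers.
 */
#define EXT4_ATOMICWRITES_FL_ONDISK	0x01000000U

/*
 * Where both definitions are visible, a compile-time check keeps them
 * from drifting apart. It is skipped when the uapi flag is not defined.
 */
#ifdef FS_ATOMICWRITES_FL
_Static_assert(EXT4_ATOMICWRITES_FL_ONDISK == FS_ATOMICWRITES_FL,
	       "on-disk flag and uapi flag must match");
#endif

/* Test the flag on an i_flags value already converted to CPU endianness. */
static int inode_has_atomic_writes(uint32_t i_flags)
{
	return (i_flags & EXT4_ATOMICWRITES_FL_ONDISK) != 0;
}
```

A tool that only reads the on-disk format can then use the literal
definition alone.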

-Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: [RFC 3/8] iomap: Add atomic write support for direct-io
  2024-03-04  5:33       ` Ritesh Harjani
  2024-03-04  8:49         ` John Garry
@ 2024-03-04 20:56         ` Dave Chinner
  1 sibling, 0 replies; 28+ messages in thread
From: Dave Chinner @ 2024-03-04 20:56 UTC (permalink / raw)
  To: Ritesh Harjani
  Cc: linux-fsdevel, linux-ext4, Ojaswin Mujoo, Jan Kara,
	Theodore Ts'o, Matthew Wilcox, Darrick J . Wong,
	Luis Chamberlain, John Garry, linux-kernel

On Mon, Mar 04, 2024 at 11:03:24AM +0530, Ritesh Harjani wrote:
> Dave Chinner <david@fromorbit.com> writes:
> 
> > On Sat, Mar 02, 2024 at 01:12:00PM +0530, Ritesh Harjani (IBM) wrote:
> >> This adds direct-io atomic writes support in iomap. This adds -
> >> 1. IOMAP_ATOMIC flag for iomap iter.
> >> 2. Sets REQ_ATOMIC to bio opflags.
> >> 3. Adds necessary checks in iomap_dio code to ensure a single bio is
> >>    submitted for an atomic write request. (since we only support ubuf
> >>    type iocb). Otherwise return an error EIO.
> >> 4. Adds a common helper routine iomap_dio_check_atomic(). It helps in
> >>    verifying mapped length and start/end physical offset against the hw
> >>    device constraints for supporting atomic writes.
> >> 
> >> This patch is based on a patch from John Garry <john.g.garry@oracle.com>
> >> which adds such support of DIO atomic writes to iomap.
> 
> Please note the comment above. I will refer to it in the comments below.
> 
> >> 
> >> Co-developed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> >> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> >> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> >> ---
> >>  fs/iomap/direct-io.c  | 75 +++++++++++++++++++++++++++++++++++++++++--
> >>  fs/iomap/trace.h      |  3 +-
> >>  include/linux/iomap.h |  1 +
> >>  3 files changed, 75 insertions(+), 4 deletions(-)
> >
> > Ugh. Now we have two competing sets of changes to bring RWF_ATOMIC
> > support to iomap. One from John here:
> 
> These are not competing changes (nor was that the intention). As you can see, I
> commented above that this patch is based on a previous iomap patch
> from John.

That's not the same as co-ordinating development or collaboration on
common aspects of the functionality required.

> So why did I send this one?
> 1. John's latest patch series v5 was on "block atomic writes" [1], which
> does not have these checks in iomap (as they were not required).
> 
> 2. For the sake of completeness of ext4 atomic write support, I needed to
> include this change in this series. I have also tried to address all
> the review comments he got on [2] (along with an extra function, iomap_dio_check_atomic()).
> 
> [1]: https://lore.kernel.org/all/20240226173612.1478858-1-john.g.garry@oracle.com/
> [2]: https://lore.kernel.org/linux-fsdevel/20240124142645.9334-1-john.g.garry@oracle.com/

Yes, but you've clearly not seen the feedback that John has been
given because otherwise you would not have implemented things the
way you did.

That's my point - you're operating in isolation, and forcing
reviewers now to deal with two separate patch sets with overlapping
functionality and similar problems.

> > https://lore.kernel.org/linux-fsdevel/20240124142645.9334-1-john.g.garry@oracle.com/
> >
> > and now this one.
> >
> > Can the two of you please co-ordinate your efforts and base your
> > filesystem work off the same iomap infrastructure changes?
> 
> Sure Dave, make sense. But we are cc'ing each other in this effort
> together so that we are aware of what is being worked upon. 

"ccing each other" is not the same as actively collaborating on
development.

> And as I mentioned, this change is not competing with John's change. If
> at all it is only complementing his initial change, since this iomap change
> addresses review comments from others on the previous one and added one
> extra check (on mapped physical extent) which I wanted people to provide feedback on.
> 
> >
> > .....
> >
> >> @@ -356,6 +360,11 @@ static loff_t iomap_dio_bio_iter(const struct iomap_iter *iter,
> >>  	if (need_zeroout) {
> >>  		/* zero out from the start of the block to the write offset */
> >>  		pad = pos & (fs_block_size - 1);
> >> +		if (unlikely(pad && atomic_write)) {
> >> +			WARN_ON_ONCE("pos not atomic write aligned\n");
> >> +			ret = -EINVAL;
> >> +			goto out;
> >> +		}
> >
> > This atomic IO should have been rejected before it even got to
> > the layers where the bios are being built. If the IO alignment is
> > such that it does not align to filesystem allocation constraints, it
> > should be rejected at the filesystem ->write_iter() method and not
> > even get to the iomap layer.
> 
> I had added this mainly from iomap sanity checking perspective. 
> We are offloading some checks to be made by the filesystem before
> submitting the I/O request to iomap. 
> These "common" checks in iomap layer are mainly to provide sanity checking
> to make sure FS did it's job, before iomap could form/process the bios and then
> do submit_bio to the block layer. 

If you read the feedback John had been given, you'd know that
alignment verification for atomic writes belongs in the filesystem
before it even calls into iomap. See these two patches in the XFS
series he just sent out:

https://lore.kernel.org/linux-xfs/20240304130428.13026-11-john.g.garry@oracle.com/T/#u
https://lore.kernel.org/linux-xfs/20240304130428.13026-14-john.g.garry@oracle.com/T/#u
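
As a rough illustration of the kind of up-front check meant here, a
minimal userspace sketch follows. It is not John's
generic_atomic_write_valid(); atomic_write_args_ok() and its parameters
are made-up names for this example, and it only models the logical
offset/length rules discussed in this thread (power-of-2 length, length
within the advertised unit range, natural alignment).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if @v is a non-zero power of two. */
static bool is_pow2_u64(uint64_t v)
{
	return v != 0 && (v & (v - 1)) == 0;
}

/*
 * Hypothetical pre-check a filesystem could run in its ->write_iter()
 * path before calling into iomap. Returns true if a write of @len bytes
 * at file offset @pos may be issued as one atomic write, given the
 * advertised unit range [@unit_min, @unit_max] (assumed powers of two).
 */
static bool atomic_write_args_ok(uint64_t pos, uint64_t len,
				 uint64_t unit_min, uint64_t unit_max)
{
	/* Length must be a power of two ... */
	if (!is_pow2_u64(len))
		return false;
	/* ... and lie within the advertised atomic write unit range. */
	if (len < unit_min || len > unit_max)
		return false;
	/* Natural alignment: pos must be a multiple of len. */
	if (pos & (len - 1))
		return false;
	return true;
}
```

With a [4k, 64k] range, a 16k write at offset 16k passes, while a 16k
write at offset 4k or a 12k write at offset 0 would be rejected before
any mapping is requested.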

> > .....
> >
> >> @@ -516,6 +535,44 @@ static loff_t iomap_dio_iter(const struct iomap_iter *iter,
> >>  	}
> >>  }
> >>  
> >> +/*
> >> + * iomap_dio_check_atomic:	DIO Atomic checks before calling bio submission.
> >> + * @iter:			iomap iterator
> >> + * This function is called after filesystem block mapping and before bio
> >> + * formation/submission. This is the right place to verify hw device/block
> >> + * layer constraints to be followed for doing atomic writes. Hence do those
> >> + * common checks here.
> >> + */
> >> +static bool iomap_dio_check_atomic(struct iomap_iter *iter)
> >> +{
> >> +	struct block_device *bdev = iter->iomap.bdev;
> >> +	unsigned long long map_len = iomap_length(iter);
> >> +	unsigned long long start = iomap_sector(&iter->iomap, iter->pos)
> >> +						<< SECTOR_SHIFT;
> >> +	unsigned long long end = start + map_len - 1;
> >> +	unsigned int awu_min =
> >> +			queue_atomic_write_unit_min_bytes(bdev->bd_queue);
> >> +	unsigned int awu_max =
> >> +			queue_atomic_write_unit_max_bytes(bdev->bd_queue);
> >> +	unsigned long boundary =
> >> +			queue_atomic_write_boundary_bytes(bdev->bd_queue);
> >> +	unsigned long mask = ~(boundary - 1);
> >> +
> >> +
> >> +	/* map_len should be same as user specified iter->len */
> >> +	if (map_len < iter->len)
> >> +		return false;
> >> +	/* start should be aligned to block device min atomic unit alignment */
> >> +	if (!IS_ALIGNED(start, awu_min))
> >> +		return false;
> >> +	/* If top bits doesn't match, means atomic unit boundary is crossed */
> >> +	if (boundary && ((start | mask) != (end | mask)))
> >> +		return false;
> >> +
> >> +	return true;
> >> +}
> >
> > I think you are re-implementing stuff that John has already done at
> > higher layers and in a generic manner. i.e.
> > generic_atomic_write_valid() in this patch:
> >
> > https://lore.kernel.org/linux-fsdevel/20240226173612.1478858-4-john.g.garry@oracle.com/
> >
> > We shouldn't be getting anywhere near the iomap layer if the IO is
> > not properly aligned to atomic IO constraints...
> 
> So current generic_atomic_write_valid() function mainly checks alignment
> w.r.t logical offset and iter->len. 
> 
> What this function was checking was on the physical block offset and
> mapped extent length. Hence it was made after iomap_iter() call.
> i.e. ...

The filesystem is supposed to guarantee the alignment of the iomap
returned for mapping requests on inodes configured for atomic
writes. IOWs, if the filesystem returns an unaligned or short extent
for an atomic write enabled inode, the filesystem mapping operation
is buggy. If it can't map aligned extents, then it should return an
error, not leave crap for the iomap infrastructure to have to clean
up.

> 
>  +	/* map_len should be same as user specified iter->len */
>  +	if (map_len < iter->len)
>  +		return false;
>  +	/* start should be aligned to block device min atomic unit alignment */
>  +	if (!IS_ALIGNED(start, awu_min))
>  +		return false;
> 
> 
> But I agree, that maybe we can improve generic_atomic_write_valid()
> to be able to work on both logical and physical offset and
> iter->len + mapped len. 
> Let me think about it. 
> 
> However, the point on which I would like a feedback from others is - 
> 1. After filesystem has returned the mapped extent in iomap_iter() call,
> iomap will be forming a bio to be sent to the block layer.
> So do we agree to add a check here in iomap layer to verify that the
> mapped physical start and len should satisfy the requirements for doing
> atomic writes?

That's entirely the problem about you working on this in isolation:
we've already had that discussion and the simplest solution is that
this is a filesystem problem, not an iomap problem. That is, if the
filesystem cannot return a correctly aligned and sized extent for an
atomic write enabled inode, it must return an error and not a
malformed iomap.

IOWs, it's not the job of the iomap IO routines to enforce mapping
alignment on these inodes - the extent alignment must always be
correct for atomic writes regardless of whether an atomic write IO
is being done or not. Failure to align any extent in the inode
correctly will result in future atomic writes to that offset being
impossible to issue.

Hence if the inode is configured for atomic writes, it *must* return
aligned and sized iomaps that atomic writes can be issued against.
It's a filesystem implementation bug if this invariant is violated,
so the filesystem implementation is where all the debug checks need
to be to ensure it never returns an invalid mapping to the iomap
infrastructure.
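
To sketch what such a filesystem-side debug check could look like, here
is a small userspace model. It is only an assumption of how the
invariant might be expressed: struct fs_mapping and
mapping_ok_for_atomic() are hypothetical names, not struct iomap or any
existing kernel helper. It tests the same three conditions as the
quoted iomap_dio_check_atomic(), but is meant to sit in the
filesystem's mapping code, which would return an error if it fails.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model of a mapping returned by the filesystem; the field
 * names are made up for this sketch.
 */
struct fs_mapping {
	uint64_t phys_start;	/* physical byte offset on the device */
	uint64_t len;		/* mapped length in bytes */
};

/*
 * Return true if @m can back an atomic write of @req_len bytes:
 *  - the mapping covers the whole request (no short extent),
 *  - the physical start is aligned to the device minimum unit @awu_min,
 *  - the written range does not cross an atomic write @boundary
 *    (0 means the device has no boundary restriction).
 */
static bool mapping_ok_for_atomic(const struct fs_mapping *m, uint64_t req_len,
				  uint64_t awu_min, uint64_t boundary)
{
	uint64_t end;

	if (req_len == 0 || m->len < req_len)
		return false;
	if (awu_min == 0 || m->phys_start % awu_min)
		return false;
	if (boundary) {
		end = m->phys_start + req_len - 1;
		/* First and last byte must fall in the same boundary window. */
		if (m->phys_start / boundary != end / boundary)
			return false;
	}
	return true;
}
```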

-Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: [RFC 4/8] ext4: Add statx and other atomic write helper routines
  2024-03-02  7:42   ` [RFC 4/8] ext4: Add statx and other atomic write helper routines Ritesh Harjani (IBM)
@ 2024-03-06 11:14     ` John Garry
  2024-03-08  8:10       ` Ritesh Harjani
  0 siblings, 1 reply; 28+ messages in thread
From: John Garry @ 2024-03-06 11:14 UTC (permalink / raw)
  To: Ritesh Harjani (IBM), linux-fsdevel, linux-ext4
  Cc: Ojaswin Mujoo, Jan Kara, Theodore Ts'o, Matthew Wilcox,
	Darrick J . Wong, Luis Chamberlain, linux-kernel

On 02/03/2024 07:42, Ritesh Harjani (IBM) wrote:
>   	}
>   
> +	if (request_mask & STATX_WRITE_ATOMIC) {
> +		unsigned int fsawu_min = 0, fsawu_max = 0;
> +
> +		/*
> +		 * Get fsawu_[min|max] value which we can advertise to userspace
> +		 * in statx call, if we support atomic writes using
> +		 * EXT4_MF_ATOMIC_WRITE_FSAWU.
> +		 */
> +		if (ext4_can_atomic_write_fsawu(inode->i_sb)) {

To me, it does not make sense to fill this in unless 
EXT4_INODE_ATOMIC_WRITE is also set for the inode.

> +			ext4_atomic_write_fsawu(inode->i_sb, &fsawu_min,
> +						&fsawu_max);
> +		}
> +
> +		generic_fill_statx_atomic_writes(stat, fsawu_min, fsawu_max);
> +	}
> +
>   	flags = ei->i_flags & EXT4_FL_USER_VISIBLE;



* Re: [RFC 0/9] ext4: Add direct-io atomic write support using fsawu
  2024-03-02  7:41 [RFC 0/9] ext4: Add direct-io atomic write support using fsawu Ritesh Harjani (IBM)
  2024-03-02  7:41 ` [RFC 1/8] fs: Add FS_XFLAG_ATOMICWRITES flag Ritesh Harjani (IBM)
  2024-03-02  7:42 ` [RFC 9/9] e2fsprogs/chattr: Supports atomic writes attribute Ritesh Harjani (IBM)
@ 2024-03-06 11:22 ` John Garry
  2024-03-06 13:13   ` Ritesh Harjani
  2024-03-08 20:25   ` [RFC] ext4: Add support for ext4_map_blocks_atomic() Ritesh Harjani (IBM)
  2 siblings, 2 replies; 28+ messages in thread
From: John Garry @ 2024-03-06 11:22 UTC (permalink / raw)
  To: Ritesh Harjani (IBM), linux-fsdevel, linux-ext4
  Cc: Ojaswin Mujoo, Jan Kara, Theodore Ts'o, Matthew Wilcox,
	Darrick J . Wong, Luis Chamberlain, linux-kernel

On 02/03/2024 07:41, Ritesh Harjani (IBM) wrote:
> Hello all,
> 
> This RFC series adds support for atomic writes to ext4 direct-io using
> filesystem atomic write unit. It's built on top of John's "block atomic
> write v5" series which adds RWF_ATOMIC flag interface to pwritev2() and enables
> atomic write support in underlying device driver and block layer.
> 
> This series uses the same RWF_ATOMIC interface for adding atomic write support
> to ext4's direct-io path. One can utilize it by 2 of the methods explained below.
> ((1)mkfs.ext4 -b <BS>, (2) with bigalloc).
> 
> Filesystem atomic write unit (fsawu):
> ============================================
> Atomic writes within ext4 can be supported using below 3 methods -
> 1. On a large pagesize system (e.g. Power with 64k pagesize or aarch64 with 64k pagesize),
>     we can mkfs using different blocksizes. e.g. mkfs.ext4 -b <4k/8k/16k/32k/64k>.
>     Now if the underlying HW device supports atomic writes, then a corresponding
>     blocksize can be chosen as a filesystem atomic write unit (fsawu) which
>     should be within the underlying hw defined [awu_min, awu_max] range.
>     For such filesystem, fsawu_[min|max] both are equal to blocksize (e.g. 16k)
> 
>     On a smaller pagesize system this can be utilized when support for LBS is
>     complete (on ext4).
> 
> 2. EXT4 already supports a feature called bigalloc. In that ext4 can handle
>     allocation in cluster size units. So for e.g. we can create a filesystem with
>     4k blocksize but with 64k clustersize. Such a configuration can also be used
>     to support atomic writes if the underlying hw device supports it.
>     In such case the fsawu_min will most likely be the filesystem blocksize and
>     fsawu_max will most likely be the cluster size.
> 
>     So a user can do an atomic write of any size between [fsawu_min, fsawu_max]
>     range as long as it satisfies other constraints being laid out by HW device
>     (or by software stack) to support atomic writes.
>     e.g. len should be a power of 2, pos % len should be naturally
>     aligned and [start | end] (phys offsets) should not straddle over
>     an atomic write boundary.

JFYI, I gave this a quick try, and it seems to work ok. Naturally it 
suffers from the same issue discussed at 
https://lore.kernel.org/linux-fsdevel/434c570e-39b2-4f1c-9b49-ac5241d310ca@oracle.com/ 
with regards to writing to partially written extents, which I have tried 
to address properly in my v2 for that same series.

Thanks,
John


* Re: [RFC 0/9] ext4: Add direct-io atomic write support using fsawu
  2024-03-06 11:22 ` [RFC 0/9] ext4: Add direct-io atomic write support using fsawu John Garry
@ 2024-03-06 13:13   ` Ritesh Harjani
  2024-03-08 20:25   ` [RFC] ext4: Add support for ext4_map_blocks_atomic() Ritesh Harjani (IBM)
  1 sibling, 0 replies; 28+ messages in thread
From: Ritesh Harjani @ 2024-03-06 13:13 UTC (permalink / raw)
  To: John Garry, linux-fsdevel, linux-ext4, Dave Chinner
  Cc: Ojaswin Mujoo, Jan Kara, Theodore Ts'o, Matthew Wilcox,
	Darrick J . Wong, Luis Chamberlain, linux-kernel

John Garry <john.g.garry@oracle.com> writes:

> On 02/03/2024 07:41, Ritesh Harjani (IBM) wrote:
>> Hello all,
>> 
>> This RFC series adds support for atomic writes to ext4 direct-io using
>> filesystem atomic write unit. It's built on top of John's "block atomic
>> write v5" series which adds RWF_ATOMIC flag interface to pwritev2() and enables
>> atomic write support in underlying device driver and block layer.
>> 
>> This series uses the same RWF_ATOMIC interface for adding atomic write support
>> to ext4's direct-io path. One can utilize it by 2 of the methods explained below.
>> ((1)mkfs.ext4 -b <BS>, (2) with bigalloc).
>> 
>> Filesystem atomic write unit (fsawu):
>> ============================================
>> Atomic writes within ext4 can be supported using below 3 methods -
>> 1. On a large pagesize system (e.g. Power with 64k pagesize or aarch64 with 64k pagesize),
>>     we can mkfs using different blocksizes. e.g. mkfs.ext4 -b <4k/8k/16k/32k/64k>.
>>     Now if the underlying HW device supports atomic writes, then a corresponding
>>     blocksize can be chosen as a filesystem atomic write unit (fsawu) which
>>     should be within the underlying hw defined [awu_min, awu_max] range.
>>     For such filesystem, fsawu_[min|max] both are equal to blocksize (e.g. 16k)
>> 
>>     On a smaller pagesize system this can be utilized when support for LBS is
>>     complete (on ext4).
>> 
>> 2. EXT4 already supports a feature called bigalloc. In that ext4 can handle
>>     allocation in cluster size units. So for e.g. we can create a filesystem with
>>     4k blocksize but with 64k clustersize. Such a configuration can also be used
>>     to support atomic writes if the underlying hw device supports it.
>>     In such case the fsawu_min will most likely be the filesystem blocksize and
>>     fsawu_max will most likely be the cluster size.
>> 
>>     So a user can do an atomic write of any size between [fsawu_min, fsawu_max]
>>     range as long as it satisfies other constraints being laid out by HW device
>>     (or by software stack) to support atomic writes.
>>     e.g. len should be a power of 2, pos % len should be naturally
>>     aligned and [start | end] (phys offsets) should not straddle over
>>     an atomic write boundary.
>
> JFYI, I gave this a quick try, and it seems to work ok. Naturally it 

Thanks John for giving this a try!

> suffers from the same issue discussed at 
> https://lore.kernel.org/linux-fsdevel/434c570e-39b2-4f1c-9b49-ac5241d310ca@oracle.com/ 
> with regards to writing to partially written extents, which I have tried 
> to address properly in my v2 for that same series.

I did go through other revisions, but I guess I missed going through this series.

Thanks Dave & John for your comments over the series.
Let me go through the revisions I have missed and John's latest revision.
I will update this series accordingly.

Appreciate your help!
-ritesh


* Re: [RFC 2/8] fs: Reserve inode flag FS_ATOMICWRITES_FL for atomic writes
  2024-03-04  0:59     ` Dave Chinner
@ 2024-03-08  7:19       ` Ojaswin Mujoo
  0 siblings, 0 replies; 28+ messages in thread
From: Ojaswin Mujoo @ 2024-03-08  7:19 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Ritesh Harjani (IBM),
	linux-fsdevel, linux-ext4, Jan Kara, Theodore Ts'o,
	Matthew Wilcox, Darrick J . Wong, Luis Chamberlain, John Garry,
	linux-kernel

On Mon, Mar 04, 2024 at 11:59:02AM +1100, Dave Chinner wrote:
> On Sat, Mar 02, 2024 at 01:11:59PM +0530, Ritesh Harjani (IBM) wrote:
> > This reserves FS_ATOMICWRITES_FL for flags and adds support in
> > fileattr to support atomic writes flag & xflag needed for ext4
> > and xfs.
> > 
> > Co-developed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> > Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> > Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> > ---
> >  fs/ioctl.c               | 4 ++++
> >  include/linux/fileattr.h | 4 ++--
> >  include/uapi/linux/fs.h  | 1 +
> >  3 files changed, 7 insertions(+), 2 deletions(-)
> > 
> > diff --git a/fs/ioctl.c b/fs/ioctl.c
> > index 76cf22ac97d7..e0f7fae4777e 100644
> > --- a/fs/ioctl.c
> > +++ b/fs/ioctl.c
> > @@ -481,6 +481,8 @@ void fileattr_fill_xflags(struct fileattr *fa, u32 xflags)
> >  		fa->flags |= FS_DAX_FL;
> >  	if (fa->fsx_xflags & FS_XFLAG_PROJINHERIT)
> >  		fa->flags |= FS_PROJINHERIT_FL;
> > +	if (fa->fsx_xflags & FS_XFLAG_ATOMICWRITES)
> > +		fa->flags |= FS_ATOMICWRITES_FL;
> >  }
> >  EXPORT_SYMBOL(fileattr_fill_xflags);
> >  
> > @@ -511,6 +513,8 @@ void fileattr_fill_flags(struct fileattr *fa, u32 flags)
> >  		fa->fsx_xflags |= FS_XFLAG_DAX;
> >  	if (fa->flags & FS_PROJINHERIT_FL)
> >  		fa->fsx_xflags |= FS_XFLAG_PROJINHERIT;
> > +	if (fa->flags & FS_ATOMICWRITES_FL)
> > +		fa->fsx_xflags |= FS_XFLAG_ATOMICWRITES;
> >  }
> >  EXPORT_SYMBOL(fileattr_fill_flags);
> >  
> > diff --git a/include/linux/fileattr.h b/include/linux/fileattr.h
> > index 47c05a9851d0..ae9329afa46b 100644
> > --- a/include/linux/fileattr.h
> > +++ b/include/linux/fileattr.h
> > @@ -7,12 +7,12 @@
> >  #define FS_COMMON_FL \
> >  	(FS_SYNC_FL | FS_IMMUTABLE_FL | FS_APPEND_FL | \
> >  	 FS_NODUMP_FL |	FS_NOATIME_FL | FS_DAX_FL | \
> > -	 FS_PROJINHERIT_FL)
> > +	 FS_PROJINHERIT_FL | FS_ATOMICWRITES_FL)
> >  
> >  #define FS_XFLAG_COMMON \
> >  	(FS_XFLAG_SYNC | FS_XFLAG_IMMUTABLE | FS_XFLAG_APPEND | \
> >  	 FS_XFLAG_NODUMP | FS_XFLAG_NOATIME | FS_XFLAG_DAX | \
> > -	 FS_XFLAG_PROJINHERIT)
> > +	 FS_XFLAG_PROJINHERIT | FS_XFLAG_ATOMICWRITES)
> 
> I'd much prefer that we only use a single user API to set/clear this
> flag.

Hi Dave,

So right now we have 2 ways to mark this flag in ext4:

1. SETFLAGS ioctl() w/ FS_ATOMICWRITES_FL -> set EXT4_ATOMICWRITES_FL on inode
2. SETXFLAGS ioctl() w/ FS_XFLAG_ATOMICWRITES -> translate to FS_ATOMICWRITES_FL -> set EXT4_ATOMICWRITES_FL on inode

IIUC you want to only keep 2. and not support 1. so the user space only
has a single ioctl to use, correct?

One thing I see is that ext4_fileattr_set() is not XFLAGS aware
at all; right now it expects the XFLAGS to already be translated to
their SETFLAGS equivalent before setting them in the inode. Maybe we'll need
to add that logic, although it'll be more of an exception than the usual
pattern.

> 
> This functionality is going to be tied to using extent size hints on
> XFS to indicate preferred atomic IO alignment/size, so applications
> are going to have to use the FS_IOC_FS{G,S}ETXATTR APIs regardless
> of whether it's added to the FS_IOC_{G,S}ETFLAGS API.

Hmm, that's right. I'm not sure how we'll handle it in ext4 yet, since we
don't have a per-file extent size hint. The closest we have is bigalloc,
which is more of an mkfs-time, FS-wide feature.

Regards,
ojasw
> 
> Also, there are relatively few flags left in the SETFLAGS 32-bit
> space, so this duplication seems like a waste of the few flags
> that are remaining.

> 
> -Dave.
> -- 
> Dave Chinner
> david@fromorbit.com


* Re: [RFC 6/8] ext4: Add an inode flag for atomic writes
  2024-03-04 20:34     ` Dave Chinner
@ 2024-03-08  8:02       ` Ritesh Harjani
  0 siblings, 0 replies; 28+ messages in thread
From: Ritesh Harjani @ 2024-03-08  8:02 UTC (permalink / raw)
  To: Dave Chinner
  Cc: linux-fsdevel, linux-ext4, Ojaswin Mujoo, Jan Kara,
	Theodore Ts'o, Matthew Wilcox, Darrick J . Wong,
	Luis Chamberlain, John Garry, linux-kernel

Dave Chinner <david@fromorbit.com> writes:

> On Sat, Mar 02, 2024 at 01:12:03PM +0530, Ritesh Harjani (IBM) wrote:
>> This patch adds an inode atomic writes flag to ext4
>> (EXT4_ATOMICWRITES_FL which uses FS_ATOMICWRITES_FL flag).
>> Also add support for setting of this flag via ioctl.
>> 
>> Co-developed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
>> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
>> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
>> ---
>>  fs/ext4/ext4.h  |  6 ++++++
>>  fs/ext4/ioctl.c | 11 +++++++++++
>>  2 files changed, 17 insertions(+)
>> 
>> diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
>> index 1d2bce26e616..aa7fff2d6f96 100644
>> --- a/fs/ext4/ext4.h
>> +++ b/fs/ext4/ext4.h
>> @@ -495,8 +495,12 @@ struct flex_groups {
>>  #define EXT4_EA_INODE_FL	        0x00200000 /* Inode used for large EA */
>>  /* 0x00400000 was formerly EXT4_EOFBLOCKS_FL */
>>  
>> +#define EXT4_ATOMICWRITES_FL		FS_ATOMICWRITES_FL /* Inode supports atomic writes */
>>  #define EXT4_DAX_FL			0x02000000 /* Inode is DAX */
>
> Tying the on disk format to the kernel user API is a poor choice.
> While the flag bits might have the same value, anything parsing the
> on-disk format should not be required to include kernel syscall API
> header files just to get all the on-disk format definitions it
> needs.

Sure, makes sense.
I will hardcode that value.

-ritesh

>
> -Dave.
> -- 
> Dave Chinner
> david@fromorbit.com


* Re: [RFC 4/8] ext4: Add statx and other atomic write helper routines
  2024-03-06 11:14     ` John Garry
@ 2024-03-08  8:10       ` Ritesh Harjani
  0 siblings, 0 replies; 28+ messages in thread
From: Ritesh Harjani @ 2024-03-08  8:10 UTC (permalink / raw)
  To: John Garry, linux-fsdevel, linux-ext4
  Cc: Ojaswin Mujoo, Jan Kara, Theodore Ts'o, Matthew Wilcox,
	Darrick J . Wong, Luis Chamberlain, linux-kernel

John Garry <john.g.garry@oracle.com> writes:

> On 02/03/2024 07:42, Ritesh Harjani (IBM) wrote:
>>   	}
>>   
>> +	if (request_mask & STATX_WRITE_ATOMIC) {
>> +		unsigned int fsawu_min = 0, fsawu_max = 0;
>> +
>> +		/*
>> +		 * Get fsawu_[min|max] value which we can advertise to userspace
>> +		 * in statx call, if we support atomic writes using
>> +		 * EXT4_MF_ATOMIC_WRITE_FSAWU.
>> +		 */
>> +		if (ext4_can_atomic_write_fsawu(inode->i_sb)) {
>
> To me, it does not make sense to fill this in unless 
> EXT4_INODE_ATOMIC_WRITE is also set for the inode.
>

I was thinking the filesystem atomic write unit could still be
advertised on any inode. But I don't have any strong objection either.
We can advertise these values only when the inode has the atomic write
attribute enabled. I think this makes more sense.
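
The gating agreed on above could look roughly like the following
userspace sketch. The names (struct awu_range, statx_awu()) are made up
for this illustration and are not the ext4 helpers from the patch; the
booleans stand in for ext4_can_atomic_write_fsawu() and the per-inode
atomic write flag.

```c
#include <stdbool.h>

/* Hypothetical pair of values reported to userspace via statx. */
struct awu_range {
	unsigned int min;
	unsigned int max;
};

/*
 * Report [fsawu_min, fsawu_max] only when both the filesystem supports
 * atomic writes and the inode has the atomic write attribute set.
 * Otherwise report 0/0, i.e. "atomic writes not available on this file".
 */
static struct awu_range statx_awu(bool fs_can_atomic, bool inode_atomic_flag,
				  unsigned int fsawu_min,
				  unsigned int fsawu_max)
{
	struct awu_range r = { 0, 0 };

	if (fs_can_atomic && inode_atomic_flag) {
		r.min = fsawu_min;
		r.max = fsawu_max;
	}
	return r;
}
```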

Thanks
-ritesh


>> +			ext4_atomic_write_fsawu(inode->i_sb, &fsawu_min,
>> +						&fsawu_max);
>> +		}
>> +
>> +		generic_fill_statx_atomic_writes(stat, fsawu_min, fsawu_max);
>> +	}
>> +
>>   	flags = ei->i_flags & EXT4_FL_USER_VISIBLE;


* [RFC] ext4: Add support for ext4_map_blocks_atomic()
  2024-03-06 11:22 ` [RFC 0/9] ext4: Add direct-io atomic write support using fsawu John Garry
  2024-03-06 13:13   ` Ritesh Harjani
@ 2024-03-08 20:25   ` Ritesh Harjani (IBM)
  2024-03-09  2:37     ` Ritesh Harjani
  2024-03-13 18:40     ` John Garry
  1 sibling, 2 replies; 28+ messages in thread
From: Ritesh Harjani (IBM) @ 2024-03-08 20:25 UTC (permalink / raw)
  To: John Garry, linux-fsdevel, linux-ext4
  Cc: Jan Kara, Theodore Ts'o, Ojaswin Mujoo, Matthew Wilcox,
	Darrick J . Wong, Luis Chamberlain, linux-kernel, Dave Chinner,
	Ritesh Harjani (IBM)

Currently ext4 exposes [fsawu_min, fsawu_max] size as
[blocksize, clustersize] (given the hw block device constraints are
larger than FS atomic write units).

That means a user should be allowed to -
1. pwrite 0 4k /mnt/test/f1
2. pwrite 0 16k /mnt/test/f1

Without this patch the second atomic write will fail, since the current
ext4_map_blocks() just returns the already allocated extent length
to iomap (which is less than the user requested write length).

So add ext4_map_blocks_atomic() function which can allocate full
requested length for doing an atomic write before returning to iomap.
With this we have - 

1. touch /mnt1/test/f2
2. chattr +W /mnt1/test/f2
3. xfs_io -dc "pwrite -b 4k -A -V 1 0 4k" /mnt1/test/f2
	wrote 4096/4096 bytes at offset 0
	4 KiB, 1 ops; 0.0320 sec (124.630 KiB/sec and 31.1575 ops/sec)
4. filefrag -v /mnt1/test/f2
	Filesystem type is: ef53
	File size of /mnt1/test/f2 is 4096 (1 block of 4096 bytes)
	 ext:     logical_offset:        physical_offset: length:   expected: flags:
	   0:        0..       0:       9728..      9728:      1:             last,eof
	/mnt1/test/f2: 1 extent found
5. xfs_io -dc "pwrite -b 16k -A -V 1 0 16k" /mnt1/test/f2
	wrote 16384/16384 bytes at offset 0
	16 KiB, 1 ops; 0.0337 sec (474.637 KiB/sec and 29.6648 ops/sec)
6. filefrag -v /mnt1/test/f2
	Filesystem type is: ef53
	File size of /mnt1/test/f2 is 16384 (4 blocks of 4096 bytes)
	 ext:     logical_offset:        physical_offset: length:   expected: flags:
	   0:        0..       3:       9728..      9731:      4:             last,eof
	/mnt1/test/f2: 1 extent found

Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
---

Please note that this is only minimally tested. But it serves as a PoC of what
can be done within ext4 to allow the usecase which John pointed out.

This also shows that every filesystem can have a different way of doing aligned
allocations to support atomic writes. So lifting extent size hints to iomap
might perhaps become very XFS centric? Although as long as other filesystems are
not forced to follow that, I don't think it should be a problem.


 fs/ext4/ext4.h  |  2 ++
 fs/ext4/inode.c | 40 +++++++++++++++++++++++++++++++++++++---
 2 files changed, 39 insertions(+), 3 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 529ca32b9813..1e9adc5d6569 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -3702,6 +3702,8 @@ extern int ext4_convert_unwritten_io_end_vec(handle_t *handle,
 					     ext4_io_end_t *io_end);
 extern int ext4_map_blocks(handle_t *handle, struct inode *inode,
 			   struct ext4_map_blocks *map, int flags);
+extern int ext4_map_blocks_atomic(handle_t *handle, struct inode *inode,
+				  struct ext4_map_blocks *map, int flags);
 extern int ext4_ext_calc_credits_for_single_extent(struct inode *inode,
 						   int num,
 						   struct ext4_ext_path *path);
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index ea009ca9085d..db273c7faf36 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -453,6 +453,29 @@ static void ext4_map_blocks_es_recheck(handle_t *handle,
 }
 #endif /* ES_AGGRESSIVE_TEST */
 
+int ext4_map_blocks_atomic(handle_t *handle, struct inode *inode,
+			   struct ext4_map_blocks *map, int flags)
+{
+	unsigned int mapped_len = 0, m_len = map->m_len;
+	ext4_lblk_t m_lblk = map->m_lblk;
+	int ret;
+
+	WARN_ON(!(flags & EXT4_GET_BLOCKS_CREATE));
+
+	do {
+		ret = ext4_map_blocks(handle, inode, map, flags);
+		if (ret < 0)
+			return ret;
+		mapped_len += map->m_len;
+		map->m_lblk += map->m_len;
+		map->m_len = m_len - mapped_len;
+	} while (mapped_len < m_len);
+
+	map->m_lblk = m_lblk;
+	map->m_len = mapped_len;
+	return mapped_len;
+}
+
 /*
  * The ext4_map_blocks() function tries to look up the requested blocks,
  * and returns if the blocks are already mapped.
@@ -3315,7 +3338,10 @@ static int ext4_iomap_alloc(struct inode *inode, struct ext4_map_blocks *map,
 	else if (ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS))
 		m_flags = EXT4_GET_BLOCKS_IO_CREATE_EXT;
 
-	ret = ext4_map_blocks(handle, inode, map, m_flags);
+	if (flags & IOMAP_ATOMIC)
+		ret = ext4_map_blocks_atomic(handle, inode, map, m_flags);
+	else
+		ret = ext4_map_blocks(handle, inode, map, m_flags);
 
 	/*
 	 * We cannot fill holes in indirect tree based inodes as that could
@@ -3339,6 +3365,7 @@ static int ext4_iomap_begin(struct inode *inode, loff_t offset, loff_t length,
 	int ret;
 	struct ext4_map_blocks map;
 	u8 blkbits = inode->i_blkbits;
+	unsigned int orig_len;
 
 	if ((offset >> blkbits) > EXT4_MAX_LOGICAL_BLOCK)
 		return -EINVAL;
@@ -3352,6 +3379,7 @@ static int ext4_iomap_begin(struct inode *inode, loff_t offset, loff_t length,
 	map.m_lblk = offset >> blkbits;
 	map.m_len = min_t(loff_t, (offset + length - 1) >> blkbits,
 			  EXT4_MAX_LOGICAL_BLOCK) - map.m_lblk + 1;
+	orig_len = map.m_len;
 
 	if (flags & IOMAP_WRITE) {
 		/*
@@ -3362,9 +3390,15 @@ static int ext4_iomap_begin(struct inode *inode, loff_t offset, loff_t length,
 		 */
 		if (offset + length <= i_size_read(inode)) {
 			ret = ext4_map_blocks(NULL, inode, &map, 0);
-			if (ret > 0 && (map.m_flags & EXT4_MAP_MAPPED))
-				goto out;
+			if (map.m_flags & EXT4_MAP_MAPPED) {
+				if ((flags & IOMAP_ATOMIC && ret >= orig_len) ||
+				   (!(flags & IOMAP_ATOMIC) && ret > 0))
+					goto out;
+
+			}
 		}
+		WARN_ON(map.m_lblk != offset >> blkbits);
+		map.m_len = orig_len;
 		ret = ext4_iomap_alloc(inode, &map, flags);
 	} else {
 		ret = ext4_map_blocks(NULL, inode, &map, 0);
-- 
2.39.2



* Re: [RFC] ext4: Add support for ext4_map_blocks_atomic()
  2024-03-08 20:25   ` [RFC] ext4: Add support for ext4_map_blocks_atomic() Ritesh Harjani (IBM)
@ 2024-03-09  2:37     ` Ritesh Harjani
  2024-03-13 18:40     ` John Garry
  1 sibling, 0 replies; 28+ messages in thread
From: Ritesh Harjani @ 2024-03-09  2:37 UTC (permalink / raw)
  To: John Garry, linux-fsdevel, linux-ext4
  Cc: Jan Kara, Theodore Ts'o, Ojaswin Mujoo, Matthew Wilcox,
	Darrick J . Wong, Luis Chamberlain, linux-kernel, Dave Chinner

"Ritesh Harjani (IBM)" <ritesh.list@gmail.com> writes:

> +int ext4_map_blocks_atomic(handle_t *handle, struct inode *inode,
> +			   struct ext4_map_blocks *map, int flags)
> +{
> +	unsigned int mapped_len = 0, m_len = map->m_len;
> +	ext4_lblk_t m_lblk = map->m_lblk;
> +	int ret;
> +
> +	WARN_ON(!(flags & EXT4_GET_BLOCKS_CREATE));
> +
> +	do {
> +		ret = ext4_map_blocks(handle, inode, map, flags);
> +		if (ret < 0)
> +			return ret;
> +		mapped_len += map->m_len;
> +		map->m_lblk += map->m_len;
> +		map->m_len = m_len - mapped_len;
> +	} while (mapped_len < m_len);
> +
> +	map->m_lblk = m_lblk;
> +	map->m_len = mapped_len;
> +	return mapped_len;

ouch! 
1. I need to make sure map.m_pblk is updated properly.
2. I need to make sure above call only happens with bigalloc.

Sorry about that. Generally not a good idea to send something that late
at night.
But I guess this can be fixed easily, so hopefully the algorithm should
still remain more or less the same for ext4_map_blocks_atomic().
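
To make point 1 concrete, here is a minimal userspace sketch of the same
loop with the physical start preserved. It is not ext4 code and does not
model allocation or bigalloc clusters: struct extent, map_one() and
map_range_contig() are hypothetical stand-ins, with map_one() playing
the role of a single ext4_map_blocks() call on an already populated
extent table.

```c
#include <stdint.h>

/* One contiguous run of blocks (hypothetical, for this sketch only). */
struct extent {
	uint32_t lblk;	/* first logical block */
	uint64_t pblk;	/* first physical block */
	uint32_t len;	/* number of blocks */
};

/*
 * Map up to @len blocks starting at @lblk from the extent table @tbl.
 * Returns the number of blocks covered by the extent containing @lblk
 * and stores the matching physical block in @pblk, or -1 on a hole.
 */
static int map_one(const struct extent *tbl, int n, uint32_t lblk,
		   uint32_t len, uint64_t *pblk)
{
	for (int i = 0; i < n; i++) {
		const struct extent *e = &tbl[i];

		if (lblk >= e->lblk && lblk < e->lblk + e->len) {
			uint32_t off = lblk - e->lblk;
			uint32_t avail = e->len - off;

			*pblk = e->pblk + off;
			return (int)(avail < len ? avail : len);
		}
	}
	return -1;
}

/*
 * Keep mapping until @len blocks are covered. The first physical block
 * is remembered, and every later chunk must continue exactly where the
 * previous one ended; otherwise the range cannot back one atomic write.
 * Returns @len on success, -1 on a hole or a physical discontinuity.
 */
static int map_range_contig(const struct extent *tbl, int n, uint32_t lblk,
			    uint32_t len, uint64_t *first_pblk)
{
	uint32_t done = 0;
	uint64_t pblk;

	while (done < len) {
		int ret = map_one(tbl, n, lblk + done, len - done, &pblk);

		if (ret <= 0)
			return -1;
		if (done == 0)
			*first_pblk = pblk;
		else if (pblk != *first_pblk + done)
			return -1;
		done += (uint32_t)ret;
	}
	return (int)done;
}
```

The caller would then report first_pblk together with the full length,
rather than whatever the last ext4_map_blocks() call happened to leave
in the map.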

-ritesh


* Re: [RFC] ext4: Add support for ext4_map_blocks_atomic()
  2024-03-08 20:25   ` [RFC] ext4: Add support for ext4_map_blocks_atomic() Ritesh Harjani (IBM)
  2024-03-09  2:37     ` Ritesh Harjani
@ 2024-03-13 18:40     ` John Garry
  2024-03-14 15:52       ` Ritesh Harjani
  1 sibling, 1 reply; 28+ messages in thread
From: John Garry @ 2024-03-13 18:40 UTC (permalink / raw)
  To: Ritesh Harjani (IBM), linux-fsdevel, linux-ext4
  Cc: Jan Kara, Theodore Ts'o, Ojaswin Mujoo, Matthew Wilcox,
	Darrick J . Wong, Luis Chamberlain, linux-kernel, Dave Chinner

On 08/03/2024 20:25, Ritesh Harjani (IBM) wrote:

Hi Ritesh,

> Currently ext4 exposes [fsawu_min, fsawu_max] size as
> [blocksize, clustersize] (given the hw block device constraints are
> larger than FS atomic write units).
> 
> That means a user should be allowed to -
> 1. pwrite 0 4k /mnt/test/f1
> 2. pwrite 0 16k /mnt/test/f1
> 

Previously you have mentioned 2 or 3 methods in which ext4 could support 
atomic writes. To avoid doubt, is this patch for the "Add intelligence 
in multi-block allocator of ext4 to provide aligned allocations (this 
option won't require any formatting)" method mentioned at 
https://lore.kernel.org/linux-fsdevel/8734tb0xx7.fsf@doe.com/

and same as method 3 at 
https://lore.kernel.org/linux-fsdevel/cover.1709356594.git.ritesh.list@gmail.com/? 


Thanks,
John


* Re: [RFC] ext4: Add support for ext4_map_blocks_atomic()
  2024-03-13 18:40     ` John Garry
@ 2024-03-14 15:52       ` Ritesh Harjani
  2024-03-18  8:22         ` John Garry
  0 siblings, 1 reply; 28+ messages in thread
From: Ritesh Harjani @ 2024-03-14 15:52 UTC (permalink / raw)
  To: John Garry, linux-fsdevel, linux-ext4
  Cc: Jan Kara, Theodore Ts'o, Ojaswin Mujoo, Matthew Wilcox,
	Darrick J . Wong, Luis Chamberlain, linux-kernel, Dave Chinner

John Garry <john.g.garry@oracle.com> writes:

> On 08/03/2024 20:25, Ritesh Harjani (IBM) wrote:
>
> Hi Ritesh,
>
>> Currently ext4 exposes [fsawu_min, fsawu_max] size as
>> [blocksize, clustersize] (given the hw block device constraints are
>> larger than FS atomic write units).
>> 
>> That means a user should be allowed to -
>> 1. pwrite 0 4k /mnt/test/f1
>> 2. pwrite 0 16k /mnt/test/f1
>> 
>
> Previously you have mentioned 2 or 3 methods in which ext4 could support 
> atomic writes. To avoid doubt, is this patch for the "Add intelligence 
> in multi-block allocator of ext4 to provide aligned allocations (this 
> option won't require any formatting)" method mentioned at 
> https://lore.kernel.org/linux-fsdevel/8734tb0xx7.fsf@doe.com/
>
> and same as method 3 at 
> https://lore.kernel.org/linux-fsdevel/cover.1709356594.git.ritesh.list@gmail.com/? 

Hi John,

No. This particular patch, which adds the ext4_map_blocks_atomic() method,
is only there to support the use case that you said should work for good
user behaviour. This is because with bigalloc we advertise
[fsawu_min, fsawu_max] as [blocksize, clustersize].

That means a user should be allowed to -
1. pwrite 0 4k /mnt/test/f1
followed by 
2. pwrite 0 16k /mnt/test/f1
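
A minimal sketch of how the advertised range constrains such writes,
assuming (as in the block atomic write interface this series builds on)
that an atomic write length must be a power of two and the offset
naturally aligned to it. fsawu_write_ok() is a hypothetical helper for
illustration only and is not part of the patch series:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of the patch series): decide whether a
 * write of len bytes at offset off fits the advertised atomic write unit
 * range. The power-of-two length and natural alignment rules are
 * assumptions made for this sketch.
 */
static bool fsawu_write_ok(uint64_t off, uint64_t len,
			   uint64_t fsawu_min, uint64_t fsawu_max)
{
	assert(fsawu_min > 0 && fsawu_min <= fsawu_max);

	if (len < fsawu_min || len > fsawu_max)
		return false;		/* outside [fsawu_min, fsawu_max] */
	if (len & (len - 1))
		return false;		/* not a power of two */
	return (off % len) == 0;	/* offset naturally aligned to len */
}
```

With a 4k blocksize and a 64k clustersize, both the 4k write and the 16k
write at offset 0 pass this check, while a 16k write at offset 4k would
be rejected by the alignment assumption.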


So earlier we were failing the second 16k write at an offset where there
is already an existing extent smaller than 16k (that was because of the
assumption that most users won't do such a thing).

But for a more general use case, it is not difficult to support the
second 16k write in such a way for atomic writes with bigalloc,
so this patch just adds that support to this series.
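
A rough userspace sketch of that idea, not the patch itself: keep calling
a block-mapping routine until the whole requested range is covered, then
report the range from its first block. Every name below (toy_map,
toy_extent, toy_map_blocks, toy_map_blocks_atomic) is hypothetical and
only mimics the shape of the kernel code. The lookup models the state
after allocation, and the physical contiguity check is an assumption of
this sketch (with bigalloc, blocks within one cluster are contiguous):

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for struct ext4_map_blocks; hypothetical. */
struct toy_map {
	uint32_t m_lblk;	/* first logical block of the request */
	uint32_t m_len;		/* number of blocks requested/mapped */
	uint64_t m_pblk;	/* first physical block of the mapping */
};

/* Toy extent: len blocks starting at lblk live at pblk; hypothetical. */
struct toy_extent {
	uint32_t lblk;
	uint32_t len;
	uint64_t pblk;
};

/*
 * Map as much of the request as a single extent covers, trimming
 * map->m_len. Returns the number of blocks mapped, or -1 on a hole.
 */
static int toy_map_blocks(const struct toy_extent *ext, int nr,
			  struct toy_map *map)
{
	for (int i = 0; i < nr; i++) {
		const struct toy_extent *e = &ext[i];

		if (map->m_lblk >= e->lblk && map->m_lblk < e->lblk + e->len) {
			uint32_t off = map->m_lblk - e->lblk;
			uint32_t avail = e->len - off;

			map->m_pblk = e->pblk + off;
			if (map->m_len > avail)
				map->m_len = avail;
			return (int)map->m_len;
		}
	}
	return -1;
}

/*
 * Loop until the whole range is mapped. Fails with -1 if any piece is
 * missing or not physically contiguous with the previous one. On
 * success, m_lblk and m_pblk describe the start of the full range.
 */
static int toy_map_blocks_atomic(const struct toy_extent *ext, int nr,
				 struct toy_map *map)
{
	const uint32_t m_lblk = map->m_lblk, m_len = map->m_len;
	uint32_t mapped_len = 0;
	uint64_t first_pblk = 0;

	assert(m_len > 0);
	do {
		int ret = toy_map_blocks(ext, nr, map);

		if (ret < 0)
			return ret;
		if (mapped_len == 0)
			first_pblk = map->m_pblk;
		else if (map->m_pblk != first_pblk + mapped_len)
			return -1;	/* physically discontiguous */
		mapped_len += map->m_len;
		map->m_lblk += map->m_len;
		map->m_len = m_len - mapped_len;
	} while (mapped_len < m_len);

	map->m_lblk = m_lblk;
	map->m_pblk = first_pblk;
	map->m_len = mapped_len;
	return (int)mapped_len;
}
```

In this model, a 1-block extent left by the 4k write followed by an
adjacent 3-block extent maps as a single 4-block range for the 16k write,
whereas a hole or a physically discontiguous second extent makes the
call fail.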

-ritesh 


>
>
> Thanks,
> John


* Re: [RFC] ext4: Add support for ext4_map_blocks_atomic()
  2024-03-14 15:52       ` Ritesh Harjani
@ 2024-03-18  8:22         ` John Garry
  0 siblings, 0 replies; 28+ messages in thread
From: John Garry @ 2024-03-18  8:22 UTC (permalink / raw)
  To: Ritesh Harjani (IBM), linux-fsdevel, linux-ext4
  Cc: Jan Kara, Theodore Ts'o, Ojaswin Mujoo, Matthew Wilcox,
	Darrick J . Wong, Luis Chamberlain, linux-kernel, Dave Chinner

On 14/03/2024 15:52, Ritesh Harjani (IBM) wrote:
>> and same as method 3 at
>> https://lore.kernel.org/linux-fsdevel/cover.1709356594.git.ritesh.list@gmail.com/?
> Hi John,
> 
> No. So this particular patch to add ext4_map_blocks_atomic() method is
> only to support the usecase which you listed should work for a good user
> behaviour. This is because, with bigalloc we advertizes fsawu_min and
> fsawu_max as [blocksize, clustersize]
> i.e.
> 
> That means a user should be allowed to -
> 1. pwrite 0 4k /mnt/test/f1
> followed by
> 2. pwrite 0 16k /mnt/test/f1
> 
> 
> So earlier we were failing the second 16k write at an offset where there
> is already an existing extent smaller that 16k (that was because of the
> assumption that the most of the users won't do such a thing).
> 
> But for a more general usecase, it is not difficult to support the
> second 16k write in such a way for atomic writes with bigalloc,
> so this patch just adds that support to this series.

Is there some reason why the generic iomap solution in
https://lore.kernel.org/linux-xfs/20240304130428.13026-1-john.g.garry@oracle.com/
won't work? That is, you would just need to set iomap->extent_shift
appropriately. I will note that on XFS we gate this feature on
forcealign being enabled for the inode - I am not sure if you would
always want this for bigalloc.

Thanks,
John

