linux-fsdevel.vger.kernel.org archive mirror
* [PATCH v3 0/15] Btrfs iomap
@ 2019-09-01 20:08 Goldwyn Rodrigues
  2019-09-01 20:08 ` [PATCH 01/15] iomap: Introduce CONFIG_FS_IOMAP_DEBUG Goldwyn Rodrigues
                   ` (15 more replies)
  0 siblings, 16 replies; 36+ messages in thread
From: Goldwyn Rodrigues @ 2019-09-01 20:08 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: linux-btrfs, darrick.wong, hch, david, riteshh

This is an effort to use iomap for btrfs. It moves most of the
responsibility for page handling during writes into the iomap code,
which reduces the amount of btrfs-specific code. For CoW support,
changes are needed in the iomap code to make sure we perform a copy
of the existing data before the write. This is in line with the
discussion we had while adding dax support to btrfs.
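
At a high level, btrfs ends up supplying only ->iomap_begin()/->iomap_end()
and letting the generic iomap code handle page locking, copy-in and
dirtying. A simplified sketch of the resulting write path (not code from
this series; the _sketch name is made up, the ops come from patch 05):

	static ssize_t btrfs_buffered_write_sketch(struct kiocb *iocb,
						   struct iov_iter *from)
	{
		/* all page handling happens inside the generic iomap code */
		return iomap_file_buffered_write(iocb, from,
						 &btrfs_buffered_iomap_ops);
	}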

[1] https://github.com/goldwynr/linux/tree/btrfs-iomap

-- 
Goldwyn

Changes since v1
- Added Direct I/O support
- Removed PagePrivate from btrfs pages for regular files

Changes since v2
- Added CONFIG_FS_IOMAP_DEBUG and some checks
- Fall back to buffered reads in case of short direct reads

 fs/Kconfig                  |    3 
 fs/btrfs/Kconfig            |    1 
 fs/btrfs/Makefile           |    2 
 fs/btrfs/compression.c      |    1 
 fs/btrfs/ctree.h            |   15 -
 fs/btrfs/extent_io.c        |   13 
 fs/btrfs/extent_io.h        |    2 
 fs/btrfs/file.c             |  523 +-------------------------------------
 fs/btrfs/free-space-cache.c |    1 
 fs/btrfs/inode.c            |  170 +++---------
 fs/btrfs/ioctl.c            |    4 
 fs/btrfs/iomap.c            |  598 +++++++++++++++++++++++++++++++++++++++++++-
 fs/btrfs/relocation.c       |    2 
 fs/dax.c                    |    8 
 fs/ext2/inode.c             |    2 
 fs/ext4/inode.c             |    2 
 fs/gfs2/bmap.c              |    3 
 fs/iomap/apply.c            |    5 
 fs/iomap/buffered-io.c      |   37 +-
 fs/iomap/direct-io.c        |   18 -
 fs/iomap/fiemap.c           |    4 
 fs/iomap/seek.c             |    4 
 fs/iomap/swapfile.c         |    3 
 fs/xfs/xfs_iomap.c          |    9 
 include/linux/fs.h          |    2 
 include/linux/iomap.h       |   18 +
 mm/filemap.c                |   13 
 27 files changed, 757 insertions(+), 706 deletions(-)


^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH 01/15] iomap: Introduce CONFIG_FS_IOMAP_DEBUG
  2019-09-01 20:08 [PATCH v3 0/15] Btrfs iomap Goldwyn Rodrigues
@ 2019-09-01 20:08 ` Goldwyn Rodrigues
  2019-09-01 20:50   ` Darrick J. Wong
  2019-09-02 16:29   ` Christoph Hellwig
  2019-09-01 20:08 ` [PATCH 02/15] iomap: Use a IOMAP_COW/srcmap for a read-modify-write I/O Goldwyn Rodrigues
                   ` (14 subsequent siblings)
  15 siblings, 2 replies; 36+ messages in thread
From: Goldwyn Rodrigues @ 2019-09-01 20:08 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: linux-btrfs, darrick.wong, hch, david, riteshh, Goldwyn Rodrigues

From: Goldwyn Rodrigues <rgoldwyn@suse.com>

Introduce CONFIG_FS_IOMAP_DEBUG and an iomap_assert() macro to improve
debugging, in particular to assert when the iomap code is handed an
invalid option or mapping.
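
As an example of the intended use (illustrative only, not part of this
patch), the sanity checks iomap_apply() currently performs with
WARN_ON() could be written with the new macro as:

	/* in iomap_apply(), after ->iomap_begin() has filled the iomap */
	iomap_assert(iomap.offset <= pos);
	iomap_assert(iomap.length != 0);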

Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
---
 fs/Kconfig            |  3 +++
 include/linux/iomap.h | 11 +++++++++++
 2 files changed, 14 insertions(+)

diff --git a/fs/Kconfig b/fs/Kconfig
index bfb1c6095c7a..4bed5df9b55f 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -19,6 +19,9 @@ if BLOCK
 
 config FS_IOMAP
 	bool
+config FS_IOMAP_DEBUG
+	bool "Debugging for the iomap code"
+	depends on FS_IOMAP
 
 source "fs/ext2/Kconfig"
 source "fs/ext4/Kconfig"
diff --git a/include/linux/iomap.h b/include/linux/iomap.h
index bc499ceae392..209b6c93674e 100644
--- a/include/linux/iomap.h
+++ b/include/linux/iomap.h
@@ -18,6 +18,17 @@ struct page;
 struct vm_area_struct;
 struct vm_fault;
 
+#ifdef CONFIG_FS_IOMAP_DEBUG
+#define iomap_assert(expr) do {					\
+	if (!(expr)) {							\
+		printk("Assertion failed! %s, %s:%s, line=%d\n",	\
+		       #expr, __FILE__, __func__, __LINE__);		\
+		BUG();							\
+	}								\
+} while (0)
+#else
+#define iomap_assert(expr)
+#endif
+
 /*
  * Types of block ranges for iomap mappings:
  */
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 02/15] iomap: Use a IOMAP_COW/srcmap for a read-modify-write I/O
  2019-09-01 20:08 [PATCH v3 0/15] Btrfs iomap Goldwyn Rodrigues
  2019-09-01 20:08 ` [PATCH 01/15] iomap: Introduce CONFIG_FS_IOMAP_DEBUG Goldwyn Rodrigues
@ 2019-09-01 20:08 ` Goldwyn Rodrigues
  2019-09-02 16:31   ` Christoph Hellwig
  2019-09-01 20:08 ` [PATCH 03/15] iomap: Read page from srcmap for IOMAP_COW Goldwyn Rodrigues
                   ` (13 subsequent siblings)
  15 siblings, 1 reply; 36+ messages in thread
From: Goldwyn Rodrigues @ 2019-09-01 20:08 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: linux-btrfs, darrick.wong, hch, david, riteshh, Goldwyn Rodrigues

From: Goldwyn Rodrigues <rgoldwyn@suse.com>

Introduce a new type, IOMAP_COW, which means the data at the given
offset must first be read from a srcmap and copied before performing
the write at that offset.

The srcmap identifies where the read should be performed from. It is
passed to the filesystem's ->iomap_begin(), which is expected to fill
in the details of where to read from before the copy for CoW is
performed.
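
As a rough sketch of how a filesystem would use this (hypothetical
code, not from this series; example_iomap_begin() and
example_lookup_old_block() are made-up names), ->iomap_begin() for a
copy-on-write range would fill both mappings along these lines:

	static int example_iomap_begin(struct inode *inode, loff_t pos,
			loff_t length, unsigned flags, struct iomap *iomap,
			struct iomap *srcmap)
	{
		/* where the new data will be written (not yet allocated) */
		iomap->type = IOMAP_COW;
		iomap->offset = pos;
		iomap->length = length;
		iomap->addr = IOMAP_NULL_ADDR;

		/* where the existing data must be read from for the copy */
		srcmap->type = IOMAP_MAPPED;
		srcmap->offset = pos;
		srcmap->length = length;
		srcmap->addr = example_lookup_old_block(inode, pos);

		return 0;
	}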

Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
---
 fs/dax.c               |  8 +++++---
 fs/ext2/inode.c        |  2 +-
 fs/ext4/inode.c        |  2 +-
 fs/gfs2/bmap.c         |  3 ++-
 fs/iomap/apply.c       |  5 +++--
 fs/iomap/buffered-io.c | 14 +++++++-------
 fs/iomap/direct-io.c   |  2 +-
 fs/iomap/fiemap.c      |  4 ++--
 fs/iomap/seek.c        |  4 ++--
 fs/iomap/swapfile.c    |  3 ++-
 fs/xfs/xfs_iomap.c     |  9 ++++++---
 include/linux/iomap.h  |  6 ++++--
 12 files changed, 36 insertions(+), 26 deletions(-)

diff --git a/fs/dax.c b/fs/dax.c
index 6bf81f931de3..e961d8dc23ef 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -1090,7 +1090,7 @@ EXPORT_SYMBOL_GPL(__dax_zero_page_range);
 
 static loff_t
 dax_iomap_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
-		struct iomap *iomap)
+		struct iomap *iomap, struct iomap *srcmap)
 {
 	struct block_device *bdev = iomap->bdev;
 	struct dax_device *dax_dev = iomap->dax_dev;
@@ -1248,6 +1248,7 @@ static vm_fault_t dax_iomap_pte_fault(struct vm_fault *vmf, pfn_t *pfnp,
 	unsigned long vaddr = vmf->address;
 	loff_t pos = (loff_t)vmf->pgoff << PAGE_SHIFT;
 	struct iomap iomap = { 0 };
+	struct iomap srcmap = { 0 };
 	unsigned flags = IOMAP_FAULT;
 	int error, major = 0;
 	bool write = vmf->flags & FAULT_FLAG_WRITE;
@@ -1292,7 +1293,7 @@ static vm_fault_t dax_iomap_pte_fault(struct vm_fault *vmf, pfn_t *pfnp,
 	 * the file system block size to be equal the page size, which means
 	 * that we never have to deal with more than a single extent here.
 	 */
-	error = ops->iomap_begin(inode, pos, PAGE_SIZE, flags, &iomap);
+	error = ops->iomap_begin(inode, pos, PAGE_SIZE, flags, &iomap, &srcmap);
 	if (iomap_errp)
 		*iomap_errp = error;
 	if (error) {
@@ -1472,6 +1473,7 @@ static vm_fault_t dax_iomap_pmd_fault(struct vm_fault *vmf, pfn_t *pfnp,
 	struct inode *inode = mapping->host;
 	vm_fault_t result = VM_FAULT_FALLBACK;
 	struct iomap iomap = { 0 };
+	struct iomap srcmap = { 0 };
 	pgoff_t max_pgoff;
 	void *entry;
 	loff_t pos;
@@ -1546,7 +1548,7 @@ static vm_fault_t dax_iomap_pmd_fault(struct vm_fault *vmf, pfn_t *pfnp,
 	 * to look up our filesystem block.
 	 */
 	pos = (loff_t)xas.xa_index << PAGE_SHIFT;
-	error = ops->iomap_begin(inode, pos, PMD_SIZE, iomap_flags, &iomap);
+	error = ops->iomap_begin(inode, pos, PMD_SIZE, iomap_flags, &iomap, &srcmap);
 	if (error)
 		goto unlock_entry;
 
diff --git a/fs/ext2/inode.c b/fs/ext2/inode.c
index 7004ce581a32..467c13ff6b40 100644
--- a/fs/ext2/inode.c
+++ b/fs/ext2/inode.c
@@ -801,7 +801,7 @@ int ext2_get_block(struct inode *inode, sector_t iblock,
 
 #ifdef CONFIG_FS_DAX
 static int ext2_iomap_begin(struct inode *inode, loff_t offset, loff_t length,
-		unsigned flags, struct iomap *iomap)
+		unsigned flags, struct iomap *iomap, struct iomap *srcmap)
 {
 	unsigned int blkbits = inode->i_blkbits;
 	unsigned long first_block = offset >> blkbits;
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 420fe3deed39..918f94eff799 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -3453,7 +3453,7 @@ static bool ext4_inode_datasync_dirty(struct inode *inode)
 }
 
 static int ext4_iomap_begin(struct inode *inode, loff_t offset, loff_t length,
-			    unsigned flags, struct iomap *iomap)
+			    unsigned flags, struct iomap *iomap, struct iomap *srcmap)
 {
 	struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
 	unsigned int blkbits = inode->i_blkbits;
diff --git a/fs/gfs2/bmap.c b/fs/gfs2/bmap.c
index 4f8b5fd6c81f..0189262989f2 100644
--- a/fs/gfs2/bmap.c
+++ b/fs/gfs2/bmap.c
@@ -1164,7 +1164,8 @@ static int gfs2_iomap_begin_write(struct inode *inode, loff_t pos,
 }
 
 static int gfs2_iomap_begin(struct inode *inode, loff_t pos, loff_t length,
-			    unsigned flags, struct iomap *iomap)
+			    unsigned flags, struct iomap *iomap,
+			    struct iomap *srcmap)
 {
 	struct gfs2_inode *ip = GFS2_I(inode);
 	struct metapath mp = { .mp_aheight = 1, };
diff --git a/fs/iomap/apply.c b/fs/iomap/apply.c
index 54c02aecf3cd..6cdb362fff36 100644
--- a/fs/iomap/apply.c
+++ b/fs/iomap/apply.c
@@ -24,6 +24,7 @@ iomap_apply(struct inode *inode, loff_t pos, loff_t length, unsigned flags,
 		const struct iomap_ops *ops, void *data, iomap_actor_t actor)
 {
 	struct iomap iomap = { 0 };
+	struct iomap srcmap = { 0 };
 	loff_t written = 0, ret;
 
 	/*
@@ -38,7 +39,7 @@ iomap_apply(struct inode *inode, loff_t pos, loff_t length, unsigned flags,
 	 * expose transient stale data. If the reserve fails, we can safely
 	 * back out at this point as there is nothing to undo.
 	 */
-	ret = ops->iomap_begin(inode, pos, length, flags, &iomap);
+	ret = ops->iomap_begin(inode, pos, length, flags, &iomap, &srcmap);
 	if (ret)
 		return ret;
 	if (WARN_ON(iomap.offset > pos))
@@ -58,7 +59,7 @@ iomap_apply(struct inode *inode, loff_t pos, loff_t length, unsigned flags,
 	 * we can do the copy-in page by page without having to worry about
 	 * failures exposing transient data.
 	 */
-	written = actor(inode, pos, length, data, &iomap);
+	written = actor(inode, pos, length, data, &iomap, &srcmap);
 
 	/*
 	 * Now the data has been copied, commit the range we've copied.  This
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index e25901ae3ff4..f27756c0b31c 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -205,7 +205,7 @@ iomap_read_inline_data(struct inode *inode, struct page *page,
 
 static loff_t
 iomap_readpage_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
-		struct iomap *iomap)
+		struct iomap *iomap, struct iomap *srcmap)
 {
 	struct iomap_readpage_ctx *ctx = data;
 	struct page *page = ctx->cur_page;
@@ -351,7 +351,7 @@ iomap_next_page(struct inode *inode, struct list_head *pages, loff_t pos,
 
 static loff_t
 iomap_readpages_actor(struct inode *inode, loff_t pos, loff_t length,
-		void *data, struct iomap *iomap)
+		void *data, struct iomap *iomap, struct iomap *srcmap)
 {
 	struct iomap_readpage_ctx *ctx = data;
 	loff_t done, ret;
@@ -371,7 +371,7 @@ iomap_readpages_actor(struct inode *inode, loff_t pos, loff_t length,
 			ctx->cur_page_in_bio = false;
 		}
 		ret = iomap_readpage_actor(inode, pos + done, length - done,
-				ctx, iomap);
+				ctx, iomap, srcmap);
 	}
 
 	return done;
@@ -736,7 +736,7 @@ iomap_write_end(struct inode *inode, loff_t pos, unsigned len,
 
 static loff_t
 iomap_write_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
-		struct iomap *iomap)
+		struct iomap *iomap, struct iomap *srcmap)
 {
 	struct iov_iter *i = data;
 	long status = 0;
@@ -853,7 +853,7 @@ __iomap_read_page(struct inode *inode, loff_t offset)
 
 static loff_t
 iomap_dirty_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
-		struct iomap *iomap)
+		struct iomap *iomap, struct iomap *srcmap)
 {
 	long status = 0;
 	ssize_t written = 0;
@@ -942,7 +942,7 @@ static int iomap_dax_zero(loff_t pos, unsigned offset, unsigned bytes,
 
 static loff_t
 iomap_zero_range_actor(struct inode *inode, loff_t pos, loff_t count,
-		void *data, struct iomap *iomap)
+		void *data, struct iomap *iomap, struct iomap *srcmap)
 {
 	bool *did_zero = data;
 	loff_t written = 0;
@@ -1011,7 +1011,7 @@ EXPORT_SYMBOL_GPL(iomap_truncate_page);
 
 static loff_t
 iomap_page_mkwrite_actor(struct inode *inode, loff_t pos, loff_t length,
-		void *data, struct iomap *iomap)
+		void *data, struct iomap *iomap, struct iomap *srcmap)
 {
 	struct page *page = data;
 	int ret;
diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
index 10517cea9682..5279029c7a3c 100644
--- a/fs/iomap/direct-io.c
+++ b/fs/iomap/direct-io.c
@@ -362,7 +362,7 @@ iomap_dio_inline_actor(struct inode *inode, loff_t pos, loff_t length,
 
 static loff_t
 iomap_dio_actor(struct inode *inode, loff_t pos, loff_t length,
-		void *data, struct iomap *iomap)
+		void *data, struct iomap *iomap, struct iomap *srcmap)
 {
 	struct iomap_dio *dio = data;
 
diff --git a/fs/iomap/fiemap.c b/fs/iomap/fiemap.c
index f26fdd36e383..690ef2d7c6c8 100644
--- a/fs/iomap/fiemap.c
+++ b/fs/iomap/fiemap.c
@@ -44,7 +44,7 @@ static int iomap_to_fiemap(struct fiemap_extent_info *fi,
 
 static loff_t
 iomap_fiemap_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
-		struct iomap *iomap)
+		struct iomap *iomap, struct iomap *srcmap)
 {
 	struct fiemap_ctx *ctx = data;
 	loff_t ret = length;
@@ -111,7 +111,7 @@ EXPORT_SYMBOL_GPL(iomap_fiemap);
 
 static loff_t
 iomap_bmap_actor(struct inode *inode, loff_t pos, loff_t length,
-		void *data, struct iomap *iomap)
+		void *data, struct iomap *iomap, struct iomap *srcmap)
 {
 	sector_t *bno = data, addr;
 
diff --git a/fs/iomap/seek.c b/fs/iomap/seek.c
index c04bad4b2b43..89f61d93c0bc 100644
--- a/fs/iomap/seek.c
+++ b/fs/iomap/seek.c
@@ -119,7 +119,7 @@ page_cache_seek_hole_data(struct inode *inode, loff_t offset, loff_t length,
 
 static loff_t
 iomap_seek_hole_actor(struct inode *inode, loff_t offset, loff_t length,
-		      void *data, struct iomap *iomap)
+		      void *data, struct iomap *iomap, struct iomap *srcmap)
 {
 	switch (iomap->type) {
 	case IOMAP_UNWRITTEN:
@@ -165,7 +165,7 @@ EXPORT_SYMBOL_GPL(iomap_seek_hole);
 
 static loff_t
 iomap_seek_data_actor(struct inode *inode, loff_t offset, loff_t length,
-		      void *data, struct iomap *iomap)
+		      void *data, struct iomap *iomap, struct iomap *srcmap)
 {
 	switch (iomap->type) {
 	case IOMAP_HOLE:
diff --git a/fs/iomap/swapfile.c b/fs/iomap/swapfile.c
index 152a230f668d..a648dbf6991e 100644
--- a/fs/iomap/swapfile.c
+++ b/fs/iomap/swapfile.c
@@ -76,7 +76,8 @@ static int iomap_swapfile_add_extent(struct iomap_swapfile_info *isi)
  * distinction between written and unwritten extents.
  */
 static loff_t iomap_swapfile_activate_actor(struct inode *inode, loff_t pos,
-		loff_t count, void *data, struct iomap *iomap)
+		loff_t count, void *data, struct iomap *iomap,
+		struct iomap *srcmap)
 {
 	struct iomap_swapfile_info *isi = data;
 	int error;
diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index 3a4310d7cb59..8321733c16c3 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -922,7 +922,8 @@ xfs_file_iomap_begin(
 	loff_t			offset,
 	loff_t			length,
 	unsigned		flags,
-	struct iomap		*iomap)
+	struct iomap		*iomap,
+	struct iomap		*srcmap)
 {
 	struct xfs_inode	*ip = XFS_I(inode);
 	struct xfs_mount	*mp = ip->i_mount;
@@ -1145,7 +1146,8 @@ xfs_seek_iomap_begin(
 	loff_t			offset,
 	loff_t			length,
 	unsigned		flags,
-	struct iomap		*iomap)
+	struct iomap		*iomap,
+	struct iomap		*srcmap)
 {
 	struct xfs_inode	*ip = XFS_I(inode);
 	struct xfs_mount	*mp = ip->i_mount;
@@ -1231,7 +1233,8 @@ xfs_xattr_iomap_begin(
 	loff_t			offset,
 	loff_t			length,
 	unsigned		flags,
-	struct iomap		*iomap)
+	struct iomap		*iomap,
+	struct iomap		*srcmap)
 {
 	struct xfs_inode	*ip = XFS_I(inode);
 	struct xfs_mount	*mp = ip->i_mount;
diff --git a/include/linux/iomap.h b/include/linux/iomap.h
index 209b6c93674e..9c082d36286a 100644
--- a/include/linux/iomap.h
+++ b/include/linux/iomap.h
@@ -37,6 +37,7 @@ struct vm_fault;
 #define IOMAP_MAPPED	0x03	/* blocks allocated at @addr */
 #define IOMAP_UNWRITTEN	0x04	/* blocks allocated at @addr in unwritten state */
 #define IOMAP_INLINE	0x05	/* data inline in the inode */
+#define IOMAP_COW	0x06	/* copy data from srcmap before writing */
 
 /*
  * Flags for all iomap mappings:
@@ -121,7 +122,8 @@ struct iomap_ops {
 	 * The actual length is returned in iomap->length.
 	 */
 	int (*iomap_begin)(struct inode *inode, loff_t pos, loff_t length,
-			unsigned flags, struct iomap *iomap);
+			unsigned flags, struct iomap *iomap,
+			struct iomap *srcmap);
 
 	/*
 	 * Commit and/or unreserve space previous allocated using iomap_begin.
@@ -137,7 +139,7 @@ struct iomap_ops {
  * Main iomap iterator function.
  */
 typedef loff_t (*iomap_actor_t)(struct inode *inode, loff_t pos, loff_t len,
-		void *data, struct iomap *iomap);
+		void *data, struct iomap *iomap, struct iomap *srcmap);
 
 loff_t iomap_apply(struct inode *inode, loff_t pos, loff_t length,
 		unsigned flags, const struct iomap_ops *ops, void *data,
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 03/15] iomap: Read page from srcmap for IOMAP_COW
  2019-09-01 20:08 [PATCH v3 0/15] Btrfs iomap Goldwyn Rodrigues
  2019-09-01 20:08 ` [PATCH 01/15] iomap: Introduce CONFIG_FS_IOMAP_DEBUG Goldwyn Rodrigues
  2019-09-01 20:08 ` [PATCH 02/15] iomap: Use a IOMAP_COW/srcmap for a read-modify-write I/O Goldwyn Rodrigues
@ 2019-09-01 20:08 ` Goldwyn Rodrigues
  2019-09-02 16:31   ` Christoph Hellwig
  2019-09-01 20:08 ` [PATCH 04/15] btrfs: Eliminate PagePrivate for btrfs data pages Goldwyn Rodrigues
                   ` (12 subsequent siblings)
  15 siblings, 1 reply; 36+ messages in thread
From: Goldwyn Rodrigues @ 2019-09-01 20:08 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: linux-btrfs, darrick.wong, hch, david, riteshh, Goldwyn Rodrigues

From: Goldwyn Rodrigues <rgoldwyn@suse.com>

In the case of an IOMAP_COW mapping, read the page from the srcmap
before performing a write on the page.

Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/iomap/buffered-io.c | 23 ++++++++++++++---------
 1 file changed, 14 insertions(+), 9 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index f27756c0b31c..99d63ba14d1b 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -581,7 +581,7 @@ __iomap_write_begin(struct inode *inode, loff_t pos, unsigned len,
 
 static int
 iomap_write_begin(struct inode *inode, loff_t pos, unsigned len, unsigned flags,
-		struct page **pagep, struct iomap *iomap)
+		struct page **pagep, struct iomap *iomap, struct iomap *srcmap)
 {
 	const struct iomap_page_ops *page_ops = iomap->page_ops;
 	pgoff_t index = pos >> PAGE_SHIFT;
@@ -605,12 +605,17 @@ iomap_write_begin(struct inode *inode, loff_t pos, unsigned len, unsigned flags,
 		goto out_no_page;
 	}
 
-	if (iomap->type == IOMAP_INLINE)
+	if (iomap->type == IOMAP_INLINE) {
 		iomap_read_inline_data(inode, page, iomap);
-	else if (iomap->flags & IOMAP_F_BUFFER_HEAD)
+	} else if (iomap->type == IOMAP_COW) {
+		iomap_assert(!(iomap->flags & IOMAP_F_BUFFER_HEAD));
+		iomap_assert(srcmap->type == IOMAP_HOLE || srcmap->addr > 0);
+		status = __iomap_write_begin(inode, pos, len, page, srcmap);
+	} else if (iomap->flags & IOMAP_F_BUFFER_HEAD) {
 		status = __block_write_begin_int(page, pos, len, NULL, iomap);
-	else
+	} else {
 		status = __iomap_write_begin(inode, pos, len, page, iomap);
+	}
 
 	if (unlikely(status))
 		goto out_unlock;
@@ -772,7 +777,7 @@ iomap_write_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
 		}
 
 		status = iomap_write_begin(inode, pos, bytes, flags, &page,
-				iomap);
+				iomap, srcmap);
 		if (unlikely(status))
 			break;
 
@@ -871,7 +876,7 @@ iomap_dirty_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
 			return PTR_ERR(rpage);
 
 		status = iomap_write_begin(inode, pos, bytes,
-					   AOP_FLAG_NOFS, &page, iomap);
+					   AOP_FLAG_NOFS, &page, iomap, srcmap);
 		put_page(rpage);
 		if (unlikely(status))
 			return status;
@@ -917,13 +922,13 @@ iomap_file_dirty(struct inode *inode, loff_t pos, loff_t len,
 EXPORT_SYMBOL_GPL(iomap_file_dirty);
 
 static int iomap_zero(struct inode *inode, loff_t pos, unsigned offset,
-		unsigned bytes, struct iomap *iomap)
+		unsigned bytes, struct iomap *iomap, struct iomap *srcmap)
 {
 	struct page *page;
 	int status;
 
 	status = iomap_write_begin(inode, pos, bytes, AOP_FLAG_NOFS, &page,
-				   iomap);
+				   iomap, srcmap);
 	if (status)
 		return status;
 
@@ -961,7 +966,7 @@ iomap_zero_range_actor(struct inode *inode, loff_t pos, loff_t count,
 		if (IS_DAX(inode))
 			status = iomap_dax_zero(pos, offset, bytes, iomap);
 		else
-			status = iomap_zero(inode, pos, offset, bytes, iomap);
+			status = iomap_zero(inode, pos, offset, bytes, iomap, srcmap);
 		if (status < 0)
 			return status;
 
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 04/15] btrfs: Eliminate PagePrivate for btrfs data pages
  2019-09-01 20:08 [PATCH v3 0/15] Btrfs iomap Goldwyn Rodrigues
                   ` (2 preceding siblings ...)
  2019-09-01 20:08 ` [PATCH 03/15] iomap: Read page from srcmap for IOMAP_COW Goldwyn Rodrigues
@ 2019-09-01 20:08 ` Goldwyn Rodrigues
  2019-09-01 20:08 ` [PATCH 05/15] btrfs: Add a simple buffered iomap write Goldwyn Rodrigues
                   ` (11 subsequent siblings)
  15 siblings, 0 replies; 36+ messages in thread
From: Goldwyn Rodrigues @ 2019-09-01 20:08 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: linux-btrfs, darrick.wong, hch, david, riteshh, Goldwyn Rodrigues

From: Goldwyn Rodrigues <rgoldwyn@suse.com>

Most of the code keeps working after simply eliminating the page's
private field and the related code, but there is a problem when
cloning: the extent is assumed to contain up-to-date data. Clear the
EXTENT_UPTODATE flag for the cloned range so that the next time the
file is read, it is forced to be read from disk rather than from the
page cache.

This patch is required to make sure we don't conflict with iomap's
usage of page->private.

Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
---
 fs/btrfs/compression.c      |  1 -
 fs/btrfs/extent_io.c        | 13 -------------
 fs/btrfs/extent_io.h        |  2 --
 fs/btrfs/file.c             |  1 -
 fs/btrfs/free-space-cache.c |  1 -
 fs/btrfs/inode.c            | 15 +--------------
 fs/btrfs/ioctl.c            |  4 ++--
 fs/btrfs/relocation.c       |  2 --
 8 files changed, 3 insertions(+), 36 deletions(-)

diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
index 60c47b417a4b..fe41fa3d2999 100644
--- a/fs/btrfs/compression.c
+++ b/fs/btrfs/compression.c
@@ -481,7 +481,6 @@ static noinline int add_ra_bio_pages(struct inode *inode,
 		 * for these bytes in the file.  But, we have to make
 		 * sure they map to this compressed extent on disk.
 		 */
-		set_page_extent_mapped(page);
 		lock_extent(tree, last_offset, end);
 		read_lock(&em_tree->lock);
 		em = lookup_extent_mapping(em_tree, last_offset,
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 1ff438fd5bc2..27233fb6660c 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -3005,15 +3005,6 @@ static void attach_extent_buffer_page(struct extent_buffer *eb,
 	}
 }
 
-void set_page_extent_mapped(struct page *page)
-{
-	if (!PagePrivate(page)) {
-		SetPagePrivate(page);
-		get_page(page);
-		set_page_private(page, EXTENT_PAGE_PRIVATE);
-	}
-}
-
 static struct extent_map *
 __get_extent_map(struct inode *inode, struct page *page, size_t pg_offset,
 		 u64 start, u64 len, get_extent_t *get_extent,
@@ -3074,8 +3065,6 @@ static int __do_readpage(struct extent_io_tree *tree,
 	size_t blocksize = inode->i_sb->s_blocksize;
 	unsigned long this_bio_flag = 0;
 
-	set_page_extent_mapped(page);
-
 	if (!PageUptodate(page)) {
 		if (cleancache_get_page(page) == 0) {
 			BUG_ON(blocksize != PAGE_SIZE);
@@ -3589,8 +3578,6 @@ static int __extent_writepage(struct page *page, struct writeback_control *wbc,
 
 	pg_offset = 0;
 
-	set_page_extent_mapped(page);
-
 	if (!epd->extent_locked) {
 		ret = writepage_delalloc(inode, page, wbc, start, &nr_written);
 		if (ret == 1)
diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h
index 401423b16976..8082774371b5 100644
--- a/fs/btrfs/extent_io.h
+++ b/fs/btrfs/extent_io.h
@@ -416,8 +416,6 @@ int extent_readpages(struct address_space *mapping, struct list_head *pages,
 		     unsigned nr_pages);
 int extent_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
 		__u64 start, __u64 len);
-void set_page_extent_mapped(struct page *page);
-
 struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info,
 					  u64 start);
 struct extent_buffer *__alloc_dummy_extent_buffer(struct btrfs_fs_info *fs_info,
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 58a18ed11546..4466a09f2d98 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -1539,7 +1539,6 @@ lock_and_cleanup_extent_if_need(struct btrfs_inode *inode, struct page **pages,
 	 * delalloc bits and dirty the pages as required.
 	 */
 	for (i = 0; i < num_pages; i++) {
-		set_page_extent_mapped(pages[i]);
 		WARN_ON(!PageLocked(pages[i]));
 	}
 
diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c
index 062be9dde4c6..9a0c519bd6d4 100644
--- a/fs/btrfs/free-space-cache.c
+++ b/fs/btrfs/free-space-cache.c
@@ -395,7 +395,6 @@ static int io_ctl_prepare_pages(struct btrfs_io_ctl *io_ctl, struct inode *inode
 
 	for (i = 0; i < io_ctl->num_pages; i++) {
 		clear_page_dirty_for_io(io_ctl->pages[i]);
-		set_page_extent_mapped(io_ctl->pages[i]);
 	}
 
 	return 0;
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index ee582a36653d..258bacefdf5f 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -4932,7 +4932,6 @@ int btrfs_truncate_block(struct inode *inode, loff_t from, loff_t len,
 	wait_on_page_writeback(page);
 
 	lock_extent_bits(io_tree, block_start, block_end, &cached_state);
-	set_page_extent_mapped(page);
 
 	ordered = btrfs_lookup_ordered_extent(inode, block_start);
 	if (ordered) {
@@ -8754,13 +8753,7 @@ btrfs_readpages(struct file *file, struct address_space *mapping,
 
 static int __btrfs_releasepage(struct page *page, gfp_t gfp_flags)
 {
-	int ret = try_release_extent_mapping(page, gfp_flags);
-	if (ret == 1) {
-		ClearPagePrivate(page);
-		set_page_private(page, 0);
-		put_page(page);
-	}
-	return ret;
+	return try_release_extent_mapping(page, gfp_flags);
 }
 
 static int btrfs_releasepage(struct page *page, gfp_t gfp_flags)
@@ -8878,11 +8871,6 @@ static void btrfs_invalidatepage(struct page *page, unsigned int offset,
 	}
 
 	ClearPageChecked(page);
-	if (PagePrivate(page)) {
-		ClearPagePrivate(page);
-		set_page_private(page, 0);
-		put_page(page);
-	}
 }
 
 /*
@@ -8961,7 +8949,6 @@ vm_fault_t btrfs_page_mkwrite(struct vm_fault *vmf)
 	wait_on_page_writeback(page);
 
 	lock_extent_bits(io_tree, page_start, page_end, &cached_state);
-	set_page_extent_mapped(page);
 
 	/*
 	 * we can't set the delalloc bits if there are pending ordered
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 818f7ec8bb0e..861617e3d0c9 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -1355,7 +1355,6 @@ static int cluster_pages_for_defrag(struct inode *inode,
 	for (i = 0; i < i_done; i++) {
 		clear_page_dirty_for_io(pages[i]);
 		ClearPageChecked(pages[i]);
-		set_page_extent_mapped(pages[i]);
 		set_page_dirty(pages[i]);
 		unlock_page(pages[i]);
 		put_page(pages[i]);
@@ -3550,6 +3549,7 @@ static int btrfs_clone(struct inode *src, struct inode *inode,
 	int ret;
 	const u64 len = olen_aligned;
 	u64 last_dest_end = destoff;
+	struct extent_io_tree *tree = &BTRFS_I(inode)->io_tree;
 
 	ret = -ENOMEM;
 	buf = kvmalloc(fs_info->nodesize, GFP_KERNEL);
@@ -3864,6 +3864,7 @@ static int btrfs_clone(struct inode *src, struct inode *inode,
 						destoff, olen, no_time_update);
 	}
 
+	clear_extent_uptodate(tree, destoff, destoff+olen, NULL);
 out:
 	btrfs_free_path(path);
 	kvfree(buf);
@@ -3935,7 +3936,6 @@ static noinline int btrfs_clone_files(struct file *file, struct file *file_src,
 	truncate_inode_pages_range(&inode->i_data,
 				round_down(destoff, PAGE_SIZE),
 				round_up(destoff + len, PAGE_SIZE) - 1);
-
 	return ret;
 }
 
diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index 7f219851fa23..612988b7eb27 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -3300,8 +3300,6 @@ static int relocate_file_extent_cluster(struct inode *inode,
 
 		lock_extent(&BTRFS_I(inode)->io_tree, page_start, page_end);
 
-		set_page_extent_mapped(page);
-
 		if (nr < cluster->nr &&
 		    page_start + offset == cluster->boundary[nr]) {
 			set_extent_bits(&BTRFS_I(inode)->io_tree,
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 05/15] btrfs: Add a simple buffered iomap write
  2019-09-01 20:08 [PATCH v3 0/15] Btrfs iomap Goldwyn Rodrigues
                   ` (3 preceding siblings ...)
  2019-09-01 20:08 ` [PATCH 04/15] btrfs: Eliminate PagePrivate for btrfs data pages Goldwyn Rodrigues
@ 2019-09-01 20:08 ` Goldwyn Rodrigues
  2019-09-01 20:08 ` [PATCH 06/15] btrfs: Add CoW in iomap based writes Goldwyn Rodrigues
                   ` (10 subsequent siblings)
  15 siblings, 0 replies; 36+ messages in thread
From: Goldwyn Rodrigues @ 2019-09-01 20:08 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: linux-btrfs, darrick.wong, hch, david, riteshh, Goldwyn Rodrigues

From: Goldwyn Rodrigues <rgoldwyn@suse.com>

This is a long patch. Most of the code is "inspired" by
fs/btrfs/file.c. To keep its size small, all removals are done in the
following patches.

Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
---
 fs/btrfs/Kconfig  |   1 +
 fs/btrfs/Makefile |   2 +-
 fs/btrfs/ctree.h  |   1 +
 fs/btrfs/file.c   |   4 +-
 fs/btrfs/iomap.c  | 383 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 388 insertions(+), 3 deletions(-)
 create mode 100644 fs/btrfs/iomap.c

diff --git a/fs/btrfs/Kconfig b/fs/btrfs/Kconfig
index 38651fae7f21..ba87d21885ca 100644
--- a/fs/btrfs/Kconfig
+++ b/fs/btrfs/Kconfig
@@ -14,6 +14,7 @@ config BTRFS_FS
 	select RAID6_PQ
 	select XOR_BLOCKS
 	select SRCU
+	select FS_IOMAP
 
 	help
 	  Btrfs is a general purpose copy-on-write filesystem with extents,
diff --git a/fs/btrfs/Makefile b/fs/btrfs/Makefile
index 76a843198bcb..f88e696b0698 100644
--- a/fs/btrfs/Makefile
+++ b/fs/btrfs/Makefile
@@ -11,7 +11,7 @@ btrfs-y += super.o ctree.o extent-tree.o print-tree.o root-tree.o dir-item.o \
 	   compression.o delayed-ref.o relocation.o delayed-inode.o scrub.o \
 	   reada.o backref.o ulist.o qgroup.o send.o dev-replace.o raid56.o \
 	   uuid-tree.o props.o free-space-tree.o tree-checker.o space-info.o \
-	   block-rsv.o delalloc-space.o
+	   block-rsv.o delalloc-space.o iomap.o
 
 btrfs-$(CONFIG_BTRFS_FS_POSIX_ACL) += acl.o
 btrfs-$(CONFIG_BTRFS_FS_CHECK_INTEGRITY) += check-integrity.o
diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 94660063a162..9c501c7826b4 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -3243,6 +3243,7 @@ int btrfs_fdatawrite_range(struct inode *inode, loff_t start, loff_t end);
 loff_t btrfs_remap_file_range(struct file *file_in, loff_t pos_in,
 			      struct file *file_out, loff_t pos_out,
 			      loff_t len, unsigned int remap_flags);
+ssize_t btrfs_buffered_iomap_write(struct kiocb *iocb, struct iov_iter *from);
 
 /* tree-defrag.c */
 int btrfs_defrag_leaves(struct btrfs_trans_handle *trans,
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 4466a09f2d98..0707db04d3cc 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -1829,7 +1829,7 @@ static ssize_t __btrfs_direct_write(struct kiocb *iocb, struct iov_iter *from)
 		return written;
 
 	pos = iocb->ki_pos;
-	written_buffered = btrfs_buffered_write(iocb, from);
+	written_buffered = btrfs_buffered_iomap_write(iocb, from);
 	if (written_buffered < 0) {
 		err = written_buffered;
 		goto out;
@@ -1966,7 +1966,7 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb,
 	if (iocb->ki_flags & IOCB_DIRECT) {
 		num_written = __btrfs_direct_write(iocb, from);
 	} else {
-		num_written = btrfs_buffered_write(iocb, from);
+		num_written = btrfs_buffered_iomap_write(iocb, from);
 		if (num_written > 0)
 			iocb->ki_pos = pos + num_written;
 		if (clean_page)
diff --git a/fs/btrfs/iomap.c b/fs/btrfs/iomap.c
new file mode 100644
index 000000000000..025ccbf471bf
--- /dev/null
+++ b/fs/btrfs/iomap.c
@@ -0,0 +1,383 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * iomap support for BTRFS
+ *
+ * Copyright (c) 2019  SUSE Linux
+ * Author: Goldwyn Rodrigues <rgoldwyn@suse.com>
+ */
+
+#include <linux/iomap.h>
+#include "ctree.h"
+#include "btrfs_inode.h"
+#include "volumes.h"
+#include "disk-io.h"
+#include "delalloc-space.h"
+
+struct btrfs_iomap {
+	u64 start;
+	u64 end;
+	bool nocow;
+	int extents_locked;
+	ssize_t reserved_bytes;
+	struct extent_changeset *data_reserved;
+	struct extent_state *cached_state;
+};
+
+
+/*
+ * This function locks the extent and properly waits for data=ordered extents
+ * to finish before allowing the pages to be modified if needed.
+ *
+ * The return value:
+ * 1 - the extent is locked
+ * 0 - the extent is not locked, and everything is OK
+ * -EAGAIN - the pages need to be re-prepared
+ * any other < 0 value - something went wrong
+ */
+static noinline int
+lock_and_cleanup_extent(struct btrfs_inode *inode, loff_t pos,
+			 size_t write_bytes,
+			 u64 *lockstart, u64 *lockend,
+			 struct extent_state **cached_state)
+{
+	struct btrfs_fs_info *fs_info = inode->root->fs_info;
+	u64 start_pos;
+	u64 last_pos;
+	int ret = 0;
+
+	start_pos = round_down(pos, fs_info->sectorsize);
+	last_pos = start_pos
+		+ round_up(pos + write_bytes - start_pos,
+			   fs_info->sectorsize) - 1;
+
+	if (start_pos < inode->vfs_inode.i_size) {
+		struct btrfs_ordered_extent *ordered;
+
+		lock_extent_bits(&inode->io_tree, start_pos, last_pos,
+				cached_state);
+		ordered = btrfs_lookup_ordered_range(inode, start_pos,
+						     last_pos - start_pos + 1);
+		if (ordered &&
+		    ordered->file_offset + ordered->len > start_pos &&
+		    ordered->file_offset <= last_pos) {
+			unlock_extent_cached(&inode->io_tree, start_pos,
+					last_pos, cached_state);
+			btrfs_start_ordered_extent(&inode->vfs_inode,
+					ordered, 1);
+			btrfs_put_ordered_extent(ordered);
+			return -EAGAIN;
+		}
+		if (ordered)
+			btrfs_put_ordered_extent(ordered);
+
+		*lockstart = start_pos;
+		*lockend = last_pos;
+		ret = 1;
+	}
+
+	return ret;
+}
+
+static noinline int check_can_nocow(struct btrfs_inode *inode, loff_t pos,
+				    size_t *write_bytes)
+{
+	struct btrfs_fs_info *fs_info = inode->root->fs_info;
+	struct btrfs_root *root = inode->root;
+	struct btrfs_ordered_extent *ordered;
+	u64 lockstart, lockend;
+	u64 num_bytes;
+	int ret;
+
+	ret = btrfs_start_write_no_snapshotting(root);
+	if (!ret)
+		return -ENOSPC;
+
+	lockstart = round_down(pos, fs_info->sectorsize);
+	lockend = round_up(pos + *write_bytes,
+			   fs_info->sectorsize) - 1;
+
+	while (1) {
+		lock_extent(&inode->io_tree, lockstart, lockend);
+		ordered = btrfs_lookup_ordered_range(inode, lockstart,
+						     lockend - lockstart + 1);
+		if (!ordered) {
+			break;
+		}
+		unlock_extent(&inode->io_tree, lockstart, lockend);
+		btrfs_start_ordered_extent(&inode->vfs_inode, ordered, 1);
+		btrfs_put_ordered_extent(ordered);
+	}
+
+	num_bytes = lockend - lockstart + 1;
+	ret = can_nocow_extent(&inode->vfs_inode, lockstart, &num_bytes,
+			NULL, NULL, NULL);
+	if (ret <= 0) {
+		ret = 0;
+		btrfs_end_write_no_snapshotting(root);
+	} else {
+		*write_bytes = min_t(size_t, *write_bytes,
+				     num_bytes - pos + lockstart);
+	}
+
+	unlock_extent(&inode->io_tree, lockstart, lockend);
+
+	return ret;
+}
+
+static int btrfs_find_new_delalloc_bytes(struct btrfs_inode *inode,
+					 const u64 start,
+					 const u64 len,
+					 struct extent_state **cached_state)
+{
+	u64 search_start = start;
+	const u64 end = start + len - 1;
+
+	while (search_start < end) {
+		const u64 search_len = end - search_start + 1;
+		struct extent_map *em;
+		u64 em_len;
+		int ret = 0;
+
+		em = btrfs_get_extent(inode, NULL, 0, search_start,
+				      search_len, 0);
+		if (IS_ERR(em))
+			return PTR_ERR(em);
+
+		if (em->block_start != EXTENT_MAP_HOLE)
+			goto next;
+
+		em_len = em->len;
+		if (em->start < search_start)
+			em_len -= search_start - em->start;
+		if (em_len > search_len)
+			em_len = search_len;
+
+		ret = set_extent_bit(&inode->io_tree, search_start,
+				     search_start + em_len - 1,
+				     EXTENT_DELALLOC_NEW,
+				     NULL, cached_state, GFP_NOFS);
+next:
+		search_start = extent_map_end(em);
+		free_extent_map(em);
+		if (ret)
+			return ret;
+	}
+	return 0;
+}
+
+static void btrfs_buffered_page_done(struct inode *inode, loff_t pos,
+		unsigned copied, struct page *page,
+		struct iomap *iomap)
+{
+	if (!page)
+		return;
+	SetPageUptodate(page);
+	ClearPageChecked(page);
+	set_page_dirty(page);
+	get_page(page);
+}
+
+
+static const struct iomap_page_ops btrfs_buffered_page_ops = {
+	.page_done = btrfs_buffered_page_done,
+};
+
+
+static int btrfs_buffered_iomap_begin(struct inode *inode, loff_t pos,
+		loff_t length, unsigned flags, struct iomap *iomap,
+		struct iomap *srcmap)
+{
+	int ret;
+	size_t write_bytes = length;
+	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
+	size_t sector_offset = pos & (fs_info->sectorsize - 1);
+	struct btrfs_iomap *bi;
+
+	bi = kzalloc(sizeof(struct btrfs_iomap), GFP_NOFS);
+	if (!bi)
+		return -ENOMEM;
+
+	bi->reserved_bytes = round_up(write_bytes + sector_offset,
+			fs_info->sectorsize);
+
+	/* Reserve data space */
+	ret = btrfs_check_data_free_space(inode, &bi->data_reserved, pos,
+			write_bytes);
+	if (ret < 0) {
+		/*
+		 * Space allocation failed. Let's check if we can
+		 * continue I/O without allocations
+		 */
+		if ((BTRFS_I(inode)->flags & (BTRFS_INODE_NODATACOW |
+						BTRFS_INODE_PREALLOC)) &&
+				check_can_nocow(BTRFS_I(inode), pos,
+					&write_bytes) > 0) {
+			bi->nocow = true;
+			/*
+			 * our prealloc extent may be smaller than
+			 * write_bytes, so scale down.
+			 */
+			bi->reserved_bytes = round_up(write_bytes +
+					sector_offset,
+					fs_info->sectorsize);
+		} else {
+			goto error;
+		}
+	}
+
+	WARN_ON(bi->reserved_bytes == 0);
+
+	/* We have the data space allocated, reserve the metadata now */
+	ret = btrfs_delalloc_reserve_metadata(BTRFS_I(inode),
+			bi->reserved_bytes);
+	if (ret) {
+		struct btrfs_root *root = BTRFS_I(inode)->root;
+		if (!bi->nocow)
+			btrfs_free_reserved_data_space(inode,
+					bi->data_reserved, pos,
+					write_bytes);
+		else
+			btrfs_end_write_no_snapshotting(root);
+		goto error;
+	}
+
+	do {
+		ret = lock_and_cleanup_extent(
+				BTRFS_I(inode), pos, write_bytes, &bi->start,
+				&bi->end, &bi->cached_state);
+	} while (ret == -EAGAIN);
+
+	if (ret < 0) {
+		btrfs_delalloc_release_extents(BTRFS_I(inode),
+					       bi->reserved_bytes, true);
+		goto release;
+	} else {
+		bi->extents_locked = ret;
+	}
+	iomap->private = bi;
+	iomap->length = round_up(write_bytes, fs_info->sectorsize);
+	iomap->offset = round_down(pos, fs_info->sectorsize);
+	iomap->addr = IOMAP_NULL_ADDR;
+	iomap->type = IOMAP_DELALLOC;
+	iomap->bdev = fs_info->fs_devices->latest_bdev;
+	iomap->page_ops = &btrfs_buffered_page_ops;
+	return 0;
+release:
+	if (bi->extents_locked)
+		unlock_extent_cached(&BTRFS_I(inode)->io_tree, bi->start,
+				bi->end, &bi->cached_state);
+	if (bi->nocow) {
+		struct btrfs_root *root = BTRFS_I(inode)->root;
+		btrfs_end_write_no_snapshotting(root);
+		btrfs_delalloc_release_metadata(BTRFS_I(inode),
+				bi->reserved_bytes, true);
+	} else {
+		btrfs_delalloc_release_space(inode, bi->data_reserved,
+				round_down(pos, fs_info->sectorsize),
+				bi->reserved_bytes, true);
+	}
+	extent_changeset_free(bi->data_reserved);
+
+error:
+	kfree(bi);
+	return ret;
+}
+
+static int btrfs_buffered_iomap_end(struct inode *inode, loff_t pos,
+		loff_t length, ssize_t written, unsigned flags,
+		struct iomap *iomap)
+{
+	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
+	struct btrfs_iomap *bi = iomap->private;
+	ssize_t release_bytes = round_down(bi->reserved_bytes - written,
+			1 << fs_info->sb->s_blocksize_bits);
+	unsigned int extra_bits = 0;
+	u64 start_pos = pos & ~((u64) fs_info->sectorsize - 1);
+	u64 num_bytes = round_up(written + pos - start_pos,
+			fs_info->sectorsize);
+	u64 end_of_last_block = start_pos + num_bytes - 1;
+	int ret = 0;
+
+	if (release_bytes > 0) {
+		if (bi->nocow) {
+			btrfs_delalloc_release_metadata(BTRFS_I(inode),
+					release_bytes, true);
+		} else {
+			u64 __pos = round_down(pos + written, fs_info->sectorsize);
+			btrfs_delalloc_release_space(inode, bi->data_reserved,
+					__pos, release_bytes, true);
+		}
+	}
+
+	/*
+	 * The pages may have already been dirty, clear out old accounting so
+	 * we can set things up properly
+	 */
+	clear_extent_bit(&BTRFS_I(inode)->io_tree, start_pos, end_of_last_block,
+			EXTENT_DIRTY | EXTENT_DELALLOC | EXTENT_DO_ACCOUNTING |
+			EXTENT_DEFRAG, 0, 0, &bi->cached_state);
+
+	if (!btrfs_is_free_space_inode(BTRFS_I(inode))) {
+		if (start_pos >= i_size_read(inode) &&
+		    !(BTRFS_I(inode)->flags & BTRFS_INODE_PREALLOC)) {
+			/*
+			 * There can't be any extents following eof in this case
+			 * so just set the delalloc new bit for the range
+			 * directly.
+			 */
+			extra_bits |= EXTENT_DELALLOC_NEW;
+		} else {
+			ret = btrfs_find_new_delalloc_bytes(BTRFS_I(inode),
+					start_pos, num_bytes,
+					&bi->cached_state);
+			if (ret)
+				goto unlock;
+		}
+	}
+
+	ret = btrfs_set_extent_delalloc(inode, start_pos, end_of_last_block,
+			extra_bits, &bi->cached_state, 0);
+unlock:
+	if (bi->extents_locked)
+		unlock_extent_cached(&BTRFS_I(inode)->io_tree,
+				bi->start, bi->end, &bi->cached_state);
+
+	if (bi->nocow) {
+		struct btrfs_root *root = BTRFS_I(inode)->root;
+		btrfs_end_write_no_snapshotting(root);
+		if (written > 0) {
+			u64 start = round_down(pos, fs_info->sectorsize);
+			u64 end = round_up(pos + written, fs_info->sectorsize) - 1;
+			set_extent_bit(&BTRFS_I(inode)->io_tree, start, end,
+					EXTENT_NORESERVE, NULL, NULL, GFP_NOFS);
+		}
+
+	}
+	btrfs_delalloc_release_extents(BTRFS_I(inode), bi->reserved_bytes,
+			true);
+
+	if (written < fs_info->nodesize)
+		btrfs_btree_balance_dirty(fs_info);
+
+	extent_changeset_free(bi->data_reserved);
+	kfree(bi);
+	return ret;
+}
+
+static const struct iomap_ops btrfs_buffered_iomap_ops = {
+	.iomap_begin            = btrfs_buffered_iomap_begin,
+	.iomap_end              = btrfs_buffered_iomap_end,
+};
+
+ssize_t btrfs_buffered_iomap_write(struct kiocb *iocb, struct iov_iter *from)
+{
+	ssize_t written;
+	struct inode *inode = file_inode(iocb->ki_filp);
+	written = iomap_file_buffered_write(iocb, from, &btrfs_buffered_iomap_ops);
+	if (written > 0)
+		iocb->ki_pos += written;
+	if (iocb->ki_pos > i_size_read(inode))
+		i_size_write(inode, iocb->ki_pos);
+	return written;
+}
+
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 06/15] btrfs: Add CoW in iomap based writes
  2019-09-01 20:08 [PATCH v3 0/15] Btrfs iomap Goldwyn Rodrigues
                   ` (4 preceding siblings ...)
  2019-09-01 20:08 ` [PATCH 05/15] btrfs: Add a simple buffered iomap write Goldwyn Rodrigues
@ 2019-09-01 20:08 ` Goldwyn Rodrigues
  2019-09-01 20:08 ` [PATCH 07/15] btrfs: remove buffered write code made unnecessary Goldwyn Rodrigues
                   ` (9 subsequent siblings)
  15 siblings, 0 replies; 36+ messages in thread
From: Goldwyn Rodrigues @ 2019-09-01 20:08 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: linux-btrfs, darrick.wong, hch, david, riteshh, Goldwyn Rodrigues

From: Goldwyn Rodrigues <rgoldwyn@suse.com>

Set iomap->type to IOMAP_COW and fill in the source map in case the
I/O is not page aligned. A write that covers whole pages can simply
overwrite them, but a partial-page write must first read the existing
data (described by the srcmap) so the untouched parts of the page are
preserved.

Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
---
 fs/btrfs/iomap.c | 42 +++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 41 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/iomap.c b/fs/btrfs/iomap.c
index 025ccbf471bf..8ed93d427962 100644
--- a/fs/btrfs/iomap.c
+++ b/fs/btrfs/iomap.c
@@ -165,6 +165,35 @@ static int btrfs_find_new_delalloc_bytes(struct btrfs_inode *inode,
 	return 0;
 }
 
+/*
+ * get_iomap: Get the block map and fill the iomap structure
+ * @pos: file position
+ * @length: I/O length
+ * @iomap: The iomap structure to fill
+ */
+
+static int get_iomap(struct inode *inode, loff_t pos, loff_t length,
+		struct iomap *iomap)
+{
+	struct extent_map *em;
+	iomap->addr = IOMAP_NULL_ADDR;
+	em = btrfs_get_extent(BTRFS_I(inode), NULL, 0, pos, length, 0);
+	if (IS_ERR(em))
+		return PTR_ERR(em);
+	/* XXX Do we need to check for em->flags here? */
+	if (em->block_start == EXTENT_MAP_HOLE) {
+		iomap->type = IOMAP_HOLE;
+	} else {
+		iomap->addr = em->block_start;
+		iomap->type = IOMAP_MAPPED;
+	}
+	iomap->offset = em->start;
+	iomap->bdev = em->bdev;
+	iomap->length = em->len;
+	free_extent_map(em);
+	return 0;
+}
+
 static void btrfs_buffered_page_done(struct inode *inode, loff_t pos,
 		unsigned copied, struct page *page,
 		struct iomap *iomap)
@@ -190,6 +219,7 @@ static int btrfs_buffered_iomap_begin(struct inode *inode, loff_t pos,
 	int ret;
 	size_t write_bytes = length;
 	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
+	size_t end;
 	size_t sector_offset = pos & (fs_info->sectorsize - 1);
 	struct btrfs_iomap *bi;
 
@@ -257,8 +287,18 @@ static int btrfs_buffered_iomap_begin(struct inode *inode, loff_t pos,
 	iomap->private = bi;
 	iomap->length = round_up(write_bytes, fs_info->sectorsize);
 	iomap->offset = round_down(pos, fs_info->sectorsize);
+	end = pos + write_bytes;
+	/* Set IOMAP_COW if start/end is not page aligned */
+	if (((pos & (PAGE_SIZE - 1)) || (end & (PAGE_SIZE - 1)))) {
+		iomap->type = IOMAP_COW;
+		ret = get_iomap(inode, pos, length, srcmap);
+		if (ret < 0)
+			goto release;
+	} else {
+		iomap->type = IOMAP_DELALLOC;
+	}
+
 	iomap->addr = IOMAP_NULL_ADDR;
-	iomap->type = IOMAP_DELALLOC;
 	iomap->bdev = fs_info->fs_devices->latest_bdev;
 	iomap->page_ops = &btrfs_buffered_page_ops;
 	return 0;
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 07/15] btrfs: remove buffered write code made unnecessary
  2019-09-01 20:08 [PATCH v3 0/15] Btrfs iomap Goldwyn Rodrigues
                   ` (5 preceding siblings ...)
  2019-09-01 20:08 ` [PATCH 06/15] btrfs: Add CoW in iomap based writes Goldwyn Rodrigues
@ 2019-09-01 20:08 ` Goldwyn Rodrigues
  2019-09-01 20:08 ` [PATCH 08/15] fs: Export generic_file_buffered_read() Goldwyn Rodrigues
                   ` (8 subsequent siblings)
  15 siblings, 0 replies; 36+ messages in thread
From: Goldwyn Rodrigues @ 2019-09-01 20:08 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: linux-btrfs, darrick.wong, hch, david, riteshh, Goldwyn Rodrigues

From: Goldwyn Rodrigues <rgoldwyn@suse.com>

Remove the buffered write code that is made unnecessary by the switch
to iomap. This is done in a separate patch to keep the main patch
short(er).

Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
---
 fs/btrfs/file.c | 463 --------------------------------------------------------
 1 file changed, 463 deletions(-)

diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 0707db04d3cc..f7087e28ac08 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -390,79 +390,6 @@ int btrfs_run_defrag_inodes(struct btrfs_fs_info *fs_info)
 	return 0;
 }
 
-/* simple helper to fault in pages and copy.  This should go away
- * and be replaced with calls into generic code.
- */
-static noinline int btrfs_copy_from_user(loff_t pos, size_t write_bytes,
-					 struct page **prepared_pages,
-					 struct iov_iter *i)
-{
-	size_t copied = 0;
-	size_t total_copied = 0;
-	int pg = 0;
-	int offset = offset_in_page(pos);
-
-	while (write_bytes > 0) {
-		size_t count = min_t(size_t,
-				     PAGE_SIZE - offset, write_bytes);
-		struct page *page = prepared_pages[pg];
-		/*
-		 * Copy data from userspace to the current page
-		 */
-		copied = iov_iter_copy_from_user_atomic(page, i, offset, count);
-
-		/* Flush processor's dcache for this page */
-		flush_dcache_page(page);
-
-		/*
-		 * if we get a partial write, we can end up with
-		 * partially up to date pages.  These add
-		 * a lot of complexity, so make sure they don't
-		 * happen by forcing this copy to be retried.
-		 *
-		 * The rest of the btrfs_file_write code will fall
-		 * back to page at a time copies after we return 0.
-		 */
-		if (!PageUptodate(page) && copied < count)
-			copied = 0;
-
-		iov_iter_advance(i, copied);
-		write_bytes -= copied;
-		total_copied += copied;
-
-		/* Return to btrfs_file_write_iter to fault page */
-		if (unlikely(copied == 0))
-			break;
-
-		if (copied < PAGE_SIZE - offset) {
-			offset += copied;
-		} else {
-			pg++;
-			offset = 0;
-		}
-	}
-	return total_copied;
-}
-
-/*
- * unlocks pages after btrfs_file_write is done with them
- */
-static void btrfs_drop_pages(struct page **pages, size_t num_pages)
-{
-	size_t i;
-	for (i = 0; i < num_pages; i++) {
-		/* page checked is some magic around finding pages that
-		 * have been modified without going through btrfs_set_page_dirty
-		 * clear it here. There should be no need to mark the pages
-		 * accessed as prepare_pages should have marked them accessed
-		 * in prepare_pages via find_or_create_page()
-		 */
-		ClearPageChecked(pages[i]);
-		unlock_page(pages[i]);
-		put_page(pages[i]);
-	}
-}
-
 static int btrfs_find_new_delalloc_bytes(struct btrfs_inode *inode,
 					 const u64 start,
 					 const u64 len,
@@ -1387,164 +1314,6 @@ int btrfs_mark_extent_written(struct btrfs_trans_handle *trans,
 	return 0;
 }
 
-/*
- * on error we return an unlocked page and the error value
- * on success we return a locked page and 0
- */
-static int prepare_uptodate_page(struct inode *inode,
-				 struct page *page, u64 pos,
-				 bool force_uptodate)
-{
-	int ret = 0;
-
-	if (((pos & (PAGE_SIZE - 1)) || force_uptodate) &&
-	    !PageUptodate(page)) {
-		ret = btrfs_readpage(NULL, page);
-		if (ret)
-			return ret;
-		lock_page(page);
-		if (!PageUptodate(page)) {
-			unlock_page(page);
-			return -EIO;
-		}
-		if (page->mapping != inode->i_mapping) {
-			unlock_page(page);
-			return -EAGAIN;
-		}
-	}
-	return 0;
-}
-
-/*
- * this just gets pages into the page cache and locks them down.
- */
-static noinline int prepare_pages(struct inode *inode, struct page **pages,
-				  size_t num_pages, loff_t pos,
-				  size_t write_bytes, bool force_uptodate)
-{
-	int i;
-	unsigned long index = pos >> PAGE_SHIFT;
-	gfp_t mask = btrfs_alloc_write_mask(inode->i_mapping);
-	int err = 0;
-	int faili;
-
-	for (i = 0; i < num_pages; i++) {
-again:
-		pages[i] = find_or_create_page(inode->i_mapping, index + i,
-					       mask | __GFP_WRITE);
-		if (!pages[i]) {
-			faili = i - 1;
-			err = -ENOMEM;
-			goto fail;
-		}
-
-		if (i == 0)
-			err = prepare_uptodate_page(inode, pages[i], pos,
-						    force_uptodate);
-		if (!err && i == num_pages - 1)
-			err = prepare_uptodate_page(inode, pages[i],
-						    pos + write_bytes, false);
-		if (err) {
-			put_page(pages[i]);
-			if (err == -EAGAIN) {
-				err = 0;
-				goto again;
-			}
-			faili = i - 1;
-			goto fail;
-		}
-		wait_on_page_writeback(pages[i]);
-	}
-
-	return 0;
-fail:
-	while (faili >= 0) {
-		unlock_page(pages[faili]);
-		put_page(pages[faili]);
-		faili--;
-	}
-	return err;
-
-}
-
-/*
- * This function locks the extent and properly waits for data=ordered extents
- * to finish before allowing the pages to be modified if need.
- *
- * The return value:
- * 1 - the extent is locked
- * 0 - the extent is not locked, and everything is OK
- * -EAGAIN - need re-prepare the pages
- * the other < 0 number - Something wrong happens
- */
-static noinline int
-lock_and_cleanup_extent_if_need(struct btrfs_inode *inode, struct page **pages,
-				size_t num_pages, loff_t pos,
-				size_t write_bytes,
-				u64 *lockstart, u64 *lockend,
-				struct extent_state **cached_state)
-{
-	struct btrfs_fs_info *fs_info = inode->root->fs_info;
-	u64 start_pos;
-	u64 last_pos;
-	int i;
-	int ret = 0;
-
-	start_pos = round_down(pos, fs_info->sectorsize);
-	last_pos = start_pos
-		+ round_up(pos + write_bytes - start_pos,
-			   fs_info->sectorsize) - 1;
-
-	if (start_pos < inode->vfs_inode.i_size) {
-		struct btrfs_ordered_extent *ordered;
-
-		lock_extent_bits(&inode->io_tree, start_pos, last_pos,
-				cached_state);
-		ordered = btrfs_lookup_ordered_range(inode, start_pos,
-						     last_pos - start_pos + 1);
-		if (ordered &&
-		    ordered->file_offset + ordered->len > start_pos &&
-		    ordered->file_offset <= last_pos) {
-			unlock_extent_cached(&inode->io_tree, start_pos,
-					last_pos, cached_state);
-			for (i = 0; i < num_pages; i++) {
-				unlock_page(pages[i]);
-				put_page(pages[i]);
-			}
-			btrfs_start_ordered_extent(&inode->vfs_inode,
-					ordered, 1);
-			btrfs_put_ordered_extent(ordered);
-			return -EAGAIN;
-		}
-		if (ordered)
-			btrfs_put_ordered_extent(ordered);
-
-		*lockstart = start_pos;
-		*lockend = last_pos;
-		ret = 1;
-	}
-
-	/*
-	 * It's possible the pages are dirty right now, but we don't want
-	 * to clean them yet because copy_from_user may catch a page fault
-	 * and we might have to fall back to one page at a time.  If that
-	 * happens, we'll unlock these pages and we'd have a window where
-	 * reclaim could sneak in and drop the once-dirty page on the floor
-	 * without writing it.
-	 *
-	 * We have the pages locked and the extent range locked, so there's
-	 * no way someone can start IO on any dirty pages in this range.
-	 *
-	 * We'll call btrfs_dirty_pages() later on, and that will flip around
-	 * delalloc bits and dirty the pages as required.
-	 */
-	for (i = 0; i < num_pages; i++) {
-		WARN_ON(!PageLocked(pages[i]));
-	}
-
-	return ret;
-}
-
 static noinline int check_can_nocow(struct btrfs_inode *inode, loff_t pos,
 				    size_t *write_bytes)
 {
@@ -1581,238 +1350,6 @@ static noinline int check_can_nocow(struct btrfs_inode *inode, loff_t pos,
 	return ret;
 }
 
-static noinline ssize_t btrfs_buffered_write(struct kiocb *iocb,
-					       struct iov_iter *i)
-{
-	struct file *file = iocb->ki_filp;
-	loff_t pos = iocb->ki_pos;
-	struct inode *inode = file_inode(file);
-	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
-	struct btrfs_root *root = BTRFS_I(inode)->root;
-	struct page **pages = NULL;
-	struct extent_state *cached_state = NULL;
-	struct extent_changeset *data_reserved = NULL;
-	u64 release_bytes = 0;
-	u64 lockstart;
-	u64 lockend;
-	size_t num_written = 0;
-	int nrptrs;
-	int ret = 0;
-	bool only_release_metadata = false;
-	bool force_page_uptodate = false;
-
-	nrptrs = min(DIV_ROUND_UP(iov_iter_count(i), PAGE_SIZE),
-			PAGE_SIZE / (sizeof(struct page *)));
-	nrptrs = min(nrptrs, current->nr_dirtied_pause - current->nr_dirtied);
-	nrptrs = max(nrptrs, 8);
-	pages = kmalloc_array(nrptrs, sizeof(struct page *), GFP_KERNEL);
-	if (!pages)
-		return -ENOMEM;
-
-	while (iov_iter_count(i) > 0) {
-		size_t offset = offset_in_page(pos);
-		size_t sector_offset;
-		size_t write_bytes = min(iov_iter_count(i),
-					 nrptrs * (size_t)PAGE_SIZE -
-					 offset);
-		size_t num_pages = DIV_ROUND_UP(write_bytes + offset,
-						PAGE_SIZE);
-		size_t reserve_bytes;
-		size_t dirty_pages;
-		size_t copied;
-		size_t dirty_sectors;
-		size_t num_sectors;
-		int extents_locked;
-
-		WARN_ON(num_pages > nrptrs);
-
-		/*
-		 * Fault pages before locking them in prepare_pages
-		 * to avoid recursive lock
-		 */
-		if (unlikely(iov_iter_fault_in_readable(i, write_bytes))) {
-			ret = -EFAULT;
-			break;
-		}
-
-		sector_offset = pos & (fs_info->sectorsize - 1);
-		reserve_bytes = round_up(write_bytes + sector_offset,
-				fs_info->sectorsize);
-
-		extent_changeset_release(data_reserved);
-		ret = btrfs_check_data_free_space(inode, &data_reserved, pos,
-						  write_bytes);
-		if (ret < 0) {
-			if ((BTRFS_I(inode)->flags & (BTRFS_INODE_NODATACOW |
-						      BTRFS_INODE_PREALLOC)) &&
-			    check_can_nocow(BTRFS_I(inode), pos,
-					&write_bytes) > 0) {
-				/*
-				 * For nodata cow case, no need to reserve
-				 * data space.
-				 */
-				only_release_metadata = true;
-				/*
-				 * our prealloc extent may be smaller than
-				 * write_bytes, so scale down.
-				 */
-				num_pages = DIV_ROUND_UP(write_bytes + offset,
-							 PAGE_SIZE);
-				reserve_bytes = round_up(write_bytes +
-							 sector_offset,
-							 fs_info->sectorsize);
-			} else {
-				break;
-			}
-		}
-
-		WARN_ON(reserve_bytes == 0);
-		ret = btrfs_delalloc_reserve_metadata(BTRFS_I(inode),
-				reserve_bytes);
-		if (ret) {
-			if (!only_release_metadata)
-				btrfs_free_reserved_data_space(inode,
-						data_reserved, pos,
-						write_bytes);
-			else
-				btrfs_end_write_no_snapshotting(root);
-			break;
-		}
-
-		release_bytes = reserve_bytes;
-again:
-		/*
-		 * This is going to setup the pages array with the number of
-		 * pages we want, so we don't really need to worry about the
-		 * contents of pages from loop to loop
-		 */
-		ret = prepare_pages(inode, pages, num_pages,
-				    pos, write_bytes,
-				    force_page_uptodate);
-		if (ret) {
-			btrfs_delalloc_release_extents(BTRFS_I(inode),
-						       reserve_bytes, true);
-			break;
-		}
-
-		extents_locked = lock_and_cleanup_extent_if_need(
-				BTRFS_I(inode), pages,
-				num_pages, pos, write_bytes, &lockstart,
-				&lockend, &cached_state);
-		if (extents_locked < 0) {
-			if (extents_locked == -EAGAIN)
-				goto again;
-			btrfs_delalloc_release_extents(BTRFS_I(inode),
-						       reserve_bytes, true);
-			ret = extents_locked;
-			break;
-		}
-
-		copied = btrfs_copy_from_user(pos, write_bytes, pages, i);
-
-		num_sectors = BTRFS_BYTES_TO_BLKS(fs_info, reserve_bytes);
-		dirty_sectors = round_up(copied + sector_offset,
-					fs_info->sectorsize);
-		dirty_sectors = BTRFS_BYTES_TO_BLKS(fs_info, dirty_sectors);
-
-		/*
-		 * if we have trouble faulting in the pages, fall
-		 * back to one page at a time
-		 */
-		if (copied < write_bytes)
-			nrptrs = 1;
-
-		if (copied == 0) {
-			force_page_uptodate = true;
-			dirty_sectors = 0;
-			dirty_pages = 0;
-		} else {
-			force_page_uptodate = false;
-			dirty_pages = DIV_ROUND_UP(copied + offset,
-						   PAGE_SIZE);
-		}
-
-		if (num_sectors > dirty_sectors) {
-			/* release everything except the sectors we dirtied */
-			release_bytes -= dirty_sectors <<
-						fs_info->sb->s_blocksize_bits;
-			if (only_release_metadata) {
-				btrfs_delalloc_release_metadata(BTRFS_I(inode),
-							release_bytes, true);
-			} else {
-				u64 __pos;
-
-				__pos = round_down(pos,
-						   fs_info->sectorsize) +
-					(dirty_pages << PAGE_SHIFT);
-				btrfs_delalloc_release_space(inode,
-						data_reserved, __pos,
-						release_bytes, true);
-			}
-		}
-
-		release_bytes = round_up(copied + sector_offset,
-					fs_info->sectorsize);
-
-		if (copied > 0)
-			ret = btrfs_dirty_pages(inode, pages, dirty_pages,
-						pos, copied, &cached_state);
-		if (extents_locked)
-			unlock_extent_cached(&BTRFS_I(inode)->io_tree,
-					     lockstart, lockend, &cached_state);
-		btrfs_delalloc_release_extents(BTRFS_I(inode), reserve_bytes,
-					       true);
-		if (ret) {
-			btrfs_drop_pages(pages, num_pages);
-			break;
-		}
-
-		release_bytes = 0;
-		if (only_release_metadata)
-			btrfs_end_write_no_snapshotting(root);
-
-		if (only_release_metadata && copied > 0) {
-			lockstart = round_down(pos,
-					       fs_info->sectorsize);
-			lockend = round_up(pos + copied,
-					   fs_info->sectorsize) - 1;
-
-			set_extent_bit(&BTRFS_I(inode)->io_tree, lockstart,
-				       lockend, EXTENT_NORESERVE, NULL,
-				       NULL, GFP_NOFS);
-			only_release_metadata = false;
-		}
-
-		btrfs_drop_pages(pages, num_pages);
-
-		cond_resched();
-
-		balance_dirty_pages_ratelimited(inode->i_mapping);
-		if (dirty_pages < (fs_info->nodesize >> PAGE_SHIFT) + 1)
-			btrfs_btree_balance_dirty(fs_info);
-
-		pos += copied;
-		num_written += copied;
-	}
-
-	kfree(pages);
-
-	if (release_bytes) {
-		if (only_release_metadata) {
-			btrfs_end_write_no_snapshotting(root);
-			btrfs_delalloc_release_metadata(BTRFS_I(inode),
-					release_bytes, true);
-		} else {
-			btrfs_delalloc_release_space(inode, data_reserved,
-					round_down(pos, fs_info->sectorsize),
-					release_bytes, true);
-		}
-	}
-
-	extent_changeset_free(data_reserved);
-	return num_written ? num_written : ret;
-}
-
 static ssize_t __btrfs_direct_write(struct kiocb *iocb, struct iov_iter *from)
 {
 	struct file *file = iocb->ki_filp;
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 08/15] fs: Export generic_file_buffered_read()
  2019-09-01 20:08 [PATCH v3 0/15] Btrfs iomap Goldwyn Rodrigues
                   ` (6 preceding siblings ...)
  2019-09-01 20:08 ` [PATCH 07/15] btrfs: remove buffered write code made unnecessary Goldwyn Rodrigues
@ 2019-09-01 20:08 ` Goldwyn Rodrigues
  2019-09-01 20:08 ` [PATCH 09/15] btrfs: basic direct read operation Goldwyn Rodrigues
                   ` (7 subsequent siblings)
  15 siblings, 0 replies; 36+ messages in thread
From: Goldwyn Rodrigues @ 2019-09-01 20:08 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: linux-btrfs, darrick.wong, hch, david, riteshh, Goldwyn Rodrigues

From: Goldwyn Rodrigues <rgoldwyn@suse.com>

Export generic_file_buffered_read() to be used to
supplement incomplete direct reads.

While we are at it, correct the comments and variable names.
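
The intended calling pattern is a direct read followed by a buffered
read of whatever the direct path did not cover. The following is an
illustrative sketch only (example_read_iter() and do_direct_read() are
made-up names; the real btrfs caller is added later in this series):

static ssize_t example_read_iter(struct kiocb *iocb, struct iov_iter *to)
{
	ssize_t ret = 0;

	if (iocb->ki_flags & IOCB_DIRECT)
		ret = do_direct_read(iocb, to);	/* hypothetical fs helper */
	if (ret < 0)
		return ret;

	/* short (or skipped) direct read: read the rest via the page cache */
	return generic_file_buffered_read(iocb, to, ret);
}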

Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
---
 include/linux/fs.h |  2 ++
 mm/filemap.c       | 13 +++++++------
 2 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index 997a530ff4e9..26d827434060 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -3041,6 +3041,8 @@ extern int generic_file_rw_checks(struct file *file_in, struct file *file_out);
 extern int generic_copy_file_checks(struct file *file_in, loff_t pos_in,
 				    struct file *file_out, loff_t pos_out,
 				    size_t *count, unsigned int flags);
+extern ssize_t generic_file_buffered_read(struct kiocb *iocb,
+		struct iov_iter *to, ssize_t already_read);
 extern ssize_t generic_file_read_iter(struct kiocb *, struct iov_iter *);
 extern ssize_t __generic_file_write_iter(struct kiocb *, struct iov_iter *);
 extern ssize_t generic_file_write_iter(struct kiocb *, struct iov_iter *);
diff --git a/mm/filemap.c b/mm/filemap.c
index d0cf700bf201..2121ae01eae8 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2014,7 +2014,7 @@ static void shrink_readahead_size_eio(struct file *filp,
  * generic_file_buffered_read - generic file read routine
  * @iocb:	the iocb to read
  * @iter:	data destination
- * @written:	already copied
+ * @copied:	already copied
  *
  * This is a generic file read routine, and uses the
  * mapping->a_ops->readpage() function for the actual low-level stuff.
@@ -2023,11 +2023,11 @@ static void shrink_readahead_size_eio(struct file *filp,
  * of the logic when it comes to error handling etc.
  *
  * Return:
- * * total number of bytes copied, including those the were already @written
+ * * total number of bytes copied, including those the were @copied
  * * negative error code if nothing was copied
  */
-static ssize_t generic_file_buffered_read(struct kiocb *iocb,
-		struct iov_iter *iter, ssize_t written)
+ssize_t generic_file_buffered_read(struct kiocb *iocb,
+		struct iov_iter *iter, ssize_t copied)
 {
 	struct file *filp = iocb->ki_filp;
 	struct address_space *mapping = filp->f_mapping;
@@ -2168,7 +2168,7 @@ static ssize_t generic_file_buffered_read(struct kiocb *iocb,
 		prev_offset = offset;
 
 		put_page(page);
-		written += ret;
+		copied += ret;
 		if (!iov_iter_count(iter))
 			goto out;
 		if (ret < nr) {
@@ -2276,8 +2276,9 @@ static ssize_t generic_file_buffered_read(struct kiocb *iocb,
 
 	*ppos = ((loff_t)index << PAGE_SHIFT) + offset;
 	file_accessed(filp);
-	return written ? written : error;
+	return copied ? copied : error;
 }
+EXPORT_SYMBOL_GPL(generic_file_buffered_read);
 
 /**
  * generic_file_read_iter - generic filesystem read routine
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 09/15] btrfs: basic direct read operation
  2019-09-01 20:08 [PATCH v3 0/15] Btrfs iomap Goldwyn Rodrigues
                   ` (7 preceding siblings ...)
  2019-09-01 20:08 ` [PATCH 08/15] fs: Export generic_file_buffered_read() Goldwyn Rodrigues
@ 2019-09-01 20:08 ` Goldwyn Rodrigues
  2019-09-01 20:08 ` [PATCH 10/15] btrfs: Carve out btrfs_get_extent_map_write() out of btrfs_get_blocks_write() Goldwyn Rodrigues
                   ` (6 subsequent siblings)
  15 siblings, 0 replies; 36+ messages in thread
From: Goldwyn Rodrigues @ 2019-09-01 20:08 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: linux-btrfs, darrick.wong, hch, david, riteshh, Goldwyn Rodrigues

From: Goldwyn Rodrigues <rgoldwyn@suse.com>

Add btrfs_dio_iomap_ops with an iomap_begin() function. In order to
accommodate dio reads, add a new function btrfs_file_read_iter()
which calls btrfs_dio_iomap_read() for DIO reads and falls back
to generic_file_buffered_read() otherwise.

Changed parameter written in generic_file_buffered_read() to
already_read.

Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
---
 fs/btrfs/ctree.h |  2 ++
 fs/btrfs/file.c  | 13 ++++++++++++-
 fs/btrfs/iomap.c | 20 ++++++++++++++++++++
 3 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 9c501c7826b4..5ca3a365e639 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -3243,7 +3243,9 @@ int btrfs_fdatawrite_range(struct inode *inode, loff_t start, loff_t end);
 loff_t btrfs_remap_file_range(struct file *file_in, loff_t pos_in,
 			      struct file *file_out, loff_t pos_out,
 			      loff_t len, unsigned int remap_flags);
+/* iomap.c */
 size_t btrfs_buffered_iomap_write(struct kiocb *iocb, struct iov_iter *from);
+ssize_t btrfs_dio_iomap_read(struct kiocb *iocb, struct iov_iter *to);
 
 /* tree-defrag.c */
 int btrfs_defrag_leaves(struct btrfs_trans_handle *trans,
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index f7087e28ac08..e7f67d514ba8 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -2839,9 +2839,20 @@ static int btrfs_file_open(struct inode *inode, struct file *filp)
 	return generic_file_open(inode, filp);
 }
 
+static ssize_t btrfs_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
+{
+	ssize_t ret = 0;
+	if (iocb->ki_flags & IOCB_DIRECT)
+		ret = btrfs_dio_iomap_read(iocb, to);
+	if (ret < 0)
+		return ret;
+
+	return generic_file_buffered_read(iocb, to, ret);
+}
+
 const struct file_operations btrfs_file_operations = {
 	.llseek		= btrfs_file_llseek,
-	.read_iter      = generic_file_read_iter,
+	.read_iter      = btrfs_file_read_iter,
 	.splice_read	= generic_file_splice_read,
 	.write_iter	= btrfs_file_write_iter,
 	.mmap		= btrfs_file_mmap,
diff --git a/fs/btrfs/iomap.c b/fs/btrfs/iomap.c
index 8ed93d427962..ee94c687db1a 100644
--- a/fs/btrfs/iomap.c
+++ b/fs/btrfs/iomap.c
@@ -421,3 +421,23 @@ size_t btrfs_buffered_iomap_write(struct kiocb *iocb, struct iov_iter *from)
 	return written;
 }
 
+static int btrfs_dio_iomap_begin(struct inode *inode, loff_t pos,
+		loff_t length, unsigned flags, struct iomap *iomap,
+		struct iomap *srcmap)
+{
+	return get_iomap(inode, pos, length, iomap);
+}
+
+static const struct iomap_ops btrfs_dio_iomap_ops = {
+	.iomap_begin            = btrfs_dio_iomap_begin,
+};
+
+ssize_t btrfs_dio_iomap_read(struct kiocb *iocb, struct iov_iter *to)
+{
+	struct inode *inode = file_inode(iocb->ki_filp);
+	ssize_t ret;
+	inode_lock_shared(inode);
+	ret = iomap_dio_rw(iocb, to, &btrfs_dio_iomap_ops, NULL);
+	inode_unlock_shared(inode);
+	return ret;
+}
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 10/15] btrfs: Carve out btrfs_get_extent_map_write() out of btrfs_get_blocks_write()
  2019-09-01 20:08 [PATCH v3 0/15] Btrfs iomap Goldwyn Rodrigues
                   ` (8 preceding siblings ...)
  2019-09-01 20:08 ` [PATCH 09/15] btrfs: basic direct read operation Goldwyn Rodrigues
@ 2019-09-01 20:08 ` Goldwyn Rodrigues
  2019-09-01 20:08 ` [PATCH 11/15] btrfs: Rename __endio_write_update_ordered() to btrfs_update_ordered_extent() Goldwyn Rodrigues
                   ` (5 subsequent siblings)
  15 siblings, 0 replies; 36+ messages in thread
From: Goldwyn Rodrigues @ 2019-09-01 20:08 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: linux-btrfs, darrick.wong, hch, david, riteshh, Goldwyn Rodrigues

From: Goldwyn Rodrigues <rgoldwyn@suse.com>

This makes btrfs_get_extent_map_write() independent of Direct
I/O code.

Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
---
 fs/btrfs/ctree.h |  2 ++
 fs/btrfs/inode.c | 40 +++++++++++++++++++++++++++-------------
 2 files changed, 29 insertions(+), 13 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 5ca3a365e639..c25dfd8e619b 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -3168,6 +3168,8 @@ struct inode *btrfs_iget_path(struct super_block *s, struct btrfs_key *location,
 			      struct btrfs_path *path);
 struct inode *btrfs_iget(struct super_block *s, struct btrfs_key *location,
 			 struct btrfs_root *root, int *was_new);
+int btrfs_get_extent_map_write(struct extent_map **map, struct buffer_head *bh,
+		struct inode *inode, u64 start, u64 len);
 struct extent_map *btrfs_get_extent(struct btrfs_inode *inode,
 				    struct page *page, size_t pg_offset,
 				    u64 start, u64 end, int create);
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 258bacefdf5f..24895793fd91 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -7592,11 +7592,10 @@ static int btrfs_get_blocks_direct_read(struct extent_map *em,
 	return 0;
 }
 
-static int btrfs_get_blocks_direct_write(struct extent_map **map,
-					 struct buffer_head *bh_result,
-					 struct inode *inode,
-					 struct btrfs_dio_data *dio_data,
-					 u64 start, u64 len)
+int btrfs_get_extent_map_write(struct extent_map **map,
+		struct buffer_head *bh,
+		struct inode *inode,
+		u64 start, u64 len)
 {
 	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
 	struct extent_map *em = *map;
@@ -7650,22 +7649,38 @@ static int btrfs_get_blocks_direct_write(struct extent_map **map,
 			 */
 			btrfs_free_reserved_data_space_noquota(inode, start,
 							       len);
-			goto skip_cow;
+			/* skip COW */
+			goto out;
 		}
 	}
 
 	/* this will cow the extent */
-	len = bh_result->b_size;
+	if (bh)
+		len = bh->b_size;
 	free_extent_map(em);
 	*map = em = btrfs_new_extent_direct(inode, start, len);
-	if (IS_ERR(em)) {
-		ret = PTR_ERR(em);
-		goto out;
-	}
+	if (IS_ERR(em))
+		return PTR_ERR(em);
+out:
+	return ret;
+}
 
+static int btrfs_get_blocks_direct_write(struct extent_map **map,
+					 struct buffer_head *bh_result,
+					 struct inode *inode,
+					 struct btrfs_dio_data *dio_data,
+					 u64 start, u64 len)
+{
+	int ret;
+	struct extent_map *em;
+
+	ret = btrfs_get_extent_map_write(map, bh_result, inode,
+			start, len);
+	if (ret < 0)
+		return ret;
+	em = *map;
 	len = min(len, em->len - (start - em->start));
 
-skip_cow:
 	bh_result->b_blocknr = (em->block_start + (start - em->start)) >>
 		inode->i_blkbits;
 	bh_result->b_size = len;
@@ -7686,7 +7701,6 @@ static int btrfs_get_blocks_direct_write(struct extent_map **map,
 	dio_data->reserve -= len;
 	dio_data->unsubmitted_oe_range_end = start + len;
 	current->journal_info = dio_data;
-out:
 	return ret;
 }
 
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 11/15] btrfs: Rename __endio_write_update_ordered() to btrfs_update_ordered_extent()
  2019-09-01 20:08 [PATCH v3 0/15] Btrfs iomap Goldwyn Rodrigues
                   ` (9 preceding siblings ...)
  2019-09-01 20:08 ` [PATCH 10/15] btrfs: Carve out btrfs_get_extent_map_write() out of btrfs_get_blocks_write() Goldwyn Rodrigues
@ 2019-09-01 20:08 ` Goldwyn Rodrigues
  2019-09-01 20:08 ` [PATCH 12/15] iomap: use a function pointer for dio submits Goldwyn Rodrigues
                   ` (4 subsequent siblings)
  15 siblings, 0 replies; 36+ messages in thread
From: Goldwyn Rodrigues @ 2019-09-01 20:08 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: linux-btrfs, darrick.wong, hch, david, riteshh, Goldwyn Rodrigues

From: Goldwyn Rodrigues <rgoldwyn@suse.com>

Since it will be used in another part of the code, give it a
better name and declare it non-static.

Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
---
 fs/btrfs/ctree.h |  7 +++++--
 fs/btrfs/inode.c | 14 +++++---------
 2 files changed, 10 insertions(+), 11 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index c25dfd8e619b..04c119ca229b 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -3171,8 +3171,11 @@ struct inode *btrfs_iget(struct super_block *s, struct btrfs_key *location,
 int btrfs_get_extent_map_write(struct extent_map **map, struct buffer_head *bh,
 		struct inode *inode, u64 start, u64 len);
 struct extent_map *btrfs_get_extent(struct btrfs_inode *inode,
-				    struct page *page, size_t pg_offset,
-				    u64 start, u64 end, int create);
+		struct page *page, size_t pg_offset,
+		u64 start, u64 end, int create);
+void btrfs_update_ordered_extent(struct inode *inode,
+		const u64 offset, const u64 bytes,
+		const bool uptodate);
 int btrfs_update_inode(struct btrfs_trans_handle *trans,
 			      struct btrfs_root *root,
 			      struct inode *inode);
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 24895793fd91..d415534ce733 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -89,10 +89,6 @@ static struct extent_map *create_io_em(struct inode *inode, u64 start, u64 len,
 				       u64 ram_bytes, int compress_type,
 				       int type);
 
-static void __endio_write_update_ordered(struct inode *inode,
-					 const u64 offset, const u64 bytes,
-					 const bool uptodate);
-
 /*
  * Cleanup all submitted ordered extents in specified range to handle errors
  * from the btrfs_run_delalloc_range() callback.
@@ -133,7 +129,7 @@ static inline void btrfs_cleanup_ordered_extents(struct inode *inode,
 		bytes -= PAGE_SIZE;
 	}
 
-	return __endio_write_update_ordered(inode, offset, bytes, false);
+	return btrfs_update_ordered_extent(inode, offset, bytes, false);
 }
 
 static int btrfs_dirty_inode(struct inode *inode);
@@ -8176,7 +8172,7 @@ static void btrfs_endio_direct_read(struct bio *bio)
 	bio_put(bio);
 }
 
-static void __endio_write_update_ordered(struct inode *inode,
+void btrfs_update_ordered_extent(struct inode *inode,
 					 const u64 offset, const u64 bytes,
 					 const bool uptodate)
 {
@@ -8229,7 +8225,7 @@ static void btrfs_endio_direct_write(struct bio *bio)
 	struct btrfs_dio_private *dip = bio->bi_private;
 	struct bio *dio_bio = dip->dio_bio;
 
-	__endio_write_update_ordered(dip->inode, dip->logical_offset,
+	btrfs_update_ordered_extent(dip->inode, dip->logical_offset,
 				     dip->bytes, !bio->bi_status);
 
 	kfree(dip);
@@ -8546,7 +8542,7 @@ static void btrfs_submit_direct(struct bio *dio_bio, struct inode *inode,
 		bio = NULL;
 	} else {
 		if (write)
-			__endio_write_update_ordered(inode,
+			btrfs_update_ordered_extent(inode,
 						file_offset,
 						dio_bio->bi_iter.bi_size,
 						false);
@@ -8686,7 +8682,7 @@ static ssize_t btrfs_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
 			 */
 			if (dio_data.unsubmitted_oe_range_start <
 			    dio_data.unsubmitted_oe_range_end)
-				__endio_write_update_ordered(inode,
+				btrfs_update_ordered_extent(inode,
 					dio_data.unsubmitted_oe_range_start,
 					dio_data.unsubmitted_oe_range_end -
 					dio_data.unsubmitted_oe_range_start,
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 12/15] iomap: use a function pointer for dio submits
  2019-09-01 20:08 [PATCH v3 0/15] Btrfs iomap Goldwyn Rodrigues
                   ` (10 preceding siblings ...)
  2019-09-01 20:08 ` [PATCH 11/15] btrfs: Rename __endio_write_update_ordered() to btrfs_update_ordered_extent() Goldwyn Rodrigues
@ 2019-09-01 20:08 ` Goldwyn Rodrigues
  2019-09-02 16:41   ` Christoph Hellwig
  2019-09-01 20:08 ` [PATCH 13/15] btrfs: Use iomap_dio_rw for performing direct I/O writes Goldwyn Rodrigues
                   ` (3 subsequent siblings)
  15 siblings, 1 reply; 36+ messages in thread
From: Goldwyn Rodrigues @ 2019-09-01 20:08 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: linux-btrfs, darrick.wong, hch, david, riteshh, Goldwyn Rodrigues

From: Goldwyn Rodrigues <rgoldwyn@suse.com>

This lets filesystems perform tasks on the bio while it is being
submitted for I/O. Since btrfs requires the file position being
worked on, pass pos to iomap_dio_submit_bio().

Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
---
 fs/iomap/direct-io.c  | 16 +++++++++++-----
 include/linux/iomap.h |  1 +
 2 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
index 5279029c7a3c..00b8af35cf97 100644
--- a/fs/iomap/direct-io.c
+++ b/fs/iomap/direct-io.c
@@ -59,7 +59,7 @@ int iomap_dio_iopoll(struct kiocb *kiocb, bool spin)
 EXPORT_SYMBOL_GPL(iomap_dio_iopoll);
 
 static void iomap_dio_submit_bio(struct iomap_dio *dio, struct iomap *iomap,
-		struct bio *bio)
+		struct bio *bio, loff_t pos)
 {
 	atomic_inc(&dio->ref);
 
@@ -67,7 +67,13 @@ static void iomap_dio_submit_bio(struct iomap_dio *dio, struct iomap *iomap,
 		bio_set_polled(bio, dio->iocb);
 
 	dio->submit.last_queue = bdev_get_queue(iomap->bdev);
-	dio->submit.cookie = submit_bio(bio);
+	if (iomap->submit_io) {
+		iomap->submit_io(bio, file_inode(dio->iocb->ki_filp),
+				pos);
+		dio->submit.cookie = BLK_QC_T_NONE;
+	} else {
+		dio->submit.cookie = submit_bio(bio);
+	}
 }
 
 static ssize_t iomap_dio_complete(struct iomap_dio *dio)
@@ -195,7 +201,7 @@ iomap_dio_zero(struct iomap_dio *dio, struct iomap *iomap, loff_t pos,
 	get_page(page);
 	__bio_add_page(bio, page, len, 0);
 	bio_set_op_attrs(bio, REQ_OP_WRITE, flags);
-	iomap_dio_submit_bio(dio, iomap, bio);
+	iomap_dio_submit_bio(dio, iomap, bio, pos);
 }
 
 static loff_t
@@ -301,11 +307,11 @@ iomap_dio_bio_actor(struct inode *inode, loff_t pos, loff_t length,
 		iov_iter_advance(dio->submit.iter, n);
 
 		dio->size += n;
-		pos += n;
 		copied += n;
 
 		nr_pages = iov_iter_npages(&iter, BIO_MAX_PAGES);
-		iomap_dio_submit_bio(dio, iomap, bio);
+		iomap_dio_submit_bio(dio, iomap, bio, pos);
+		pos += n;
 	} while (nr_pages);
 
 	/*
diff --git a/include/linux/iomap.h b/include/linux/iomap.h
index 9c082d36286a..465d736d74ac 100644
--- a/include/linux/iomap.h
+++ b/include/linux/iomap.h
@@ -80,6 +80,7 @@ struct iomap {
 	void			*inline_data;
 	void			*private; /* filesystem private */
 	const struct iomap_page_ops *page_ops;
+	dio_submit_t 		*submit_io;
 };
 
 static inline sector_t
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 13/15] btrfs: Use iomap_dio_rw for performing direct I/O writes
  2019-09-01 20:08 [PATCH v3 0/15] Btrfs iomap Goldwyn Rodrigues
                   ` (11 preceding siblings ...)
  2019-09-01 20:08 ` [PATCH 12/15] iomap: use a function pointer for dio submits Goldwyn Rodrigues
@ 2019-09-01 20:08 ` Goldwyn Rodrigues
  2019-09-01 20:08 ` [PATCH 14/15] btrfs: Remove btrfs_dio_data and __btrfs_direct_write Goldwyn Rodrigues
                   ` (2 subsequent siblings)
  15 siblings, 0 replies; 36+ messages in thread
From: Goldwyn Rodrigues @ 2019-09-01 20:08 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: linux-btrfs, darrick.wong, hch, david, riteshh, Goldwyn Rodrigues

From: Goldwyn Rodrigues <rgoldwyn@suse.com>

btrfs_iomap_init() initializes the btrfs_iomap structure, which is
used to pass information between iomap begin() and end(). All data
reservations and allocations must be performed in this function.
For reads, no btrfs_iomap allocation is required.

We perform space allocation and reservation before the
iomap_dio_rw() call, as opposed to in iomap_begin(). This matches
how the current direct I/O path behaves. The problem with putting
it in iomap_begin() is that the transaction would need to be
committed before each new allocation is performed, and performing
a transaction for every direct sub-write would reduce performance.
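
In other words, the dio write path in this patch follows this
ordering (a simplified sketch of the diff below, error handling
omitted):

	btrfs_delalloc_reserve_space(inode, &data_reserved, pos, count);
	written = iomap_dio_rw(iocb, from, &btrfs_dio_iomap_ops, NULL);
	if (written < count)	/* short or failed write: return the rest */
		btrfs_delalloc_release_space(inode, data_reserved, pos,
					     count - (written < 0 ? 0 : written),
					     true);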

Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
---
 fs/btrfs/ctree.h |   3 ++
 fs/btrfs/file.c  |   2 +-
 fs/btrfs/inode.c |  14 +++--
 fs/btrfs/iomap.c | 153 +++++++++++++++++++++++++++++++++++++++++++++++++++++--
 4 files changed, 160 insertions(+), 12 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 04c119ca229b..7f84b7e47c8a 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -3202,6 +3202,8 @@ int btrfs_writepage_cow_fixup(struct page *page, u64 start, u64 end);
 void btrfs_writepage_endio_finish_ordered(struct page *page, u64 start,
 					  u64 end, int uptodate);
 extern const struct dentry_operations btrfs_dentry_operations;
+void btrfs_submit_direct(struct bio *dio_bio, struct inode *inode,
+		loff_t file_offset);
 
 /* ioctl.c */
 long btrfs_ioctl(struct file *file, unsigned int cmd, unsigned long arg);
@@ -3251,6 +3253,7 @@ loff_t btrfs_remap_file_range(struct file *file_in, loff_t pos_in,
 /* iomap.c */
 size_t btrfs_buffered_iomap_write(struct kiocb *iocb, struct iov_iter *from);
 ssize_t btrfs_dio_iomap_read(struct kiocb *iocb, struct iov_iter *to);
+ssize_t btrfs_dio_iomap_write(struct kiocb *iocb, struct iov_iter *from);
 
 /* tree-defrag.c */
 int btrfs_defrag_leaves(struct btrfs_trans_handle *trans,
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index e7f67d514ba8..5d4347e12cdc 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -1501,7 +1501,7 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb,
 		atomic_inc(&BTRFS_I(inode)->sync_writers);
 
 	if (iocb->ki_flags & IOCB_DIRECT) {
-		num_written = __btrfs_direct_write(iocb, from);
+		num_written = btrfs_dio_iomap_write(iocb, from);
 	} else {
 		num_written = btrfs_buffered_iomap_write(iocb, from);
 		if (num_written > 0)
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index d415534ce733..323d72858c9c 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -8167,9 +8167,8 @@ static void btrfs_endio_direct_read(struct bio *bio)
 	kfree(dip);
 
 	dio_bio->bi_status = err;
-	dio_end_io(dio_bio);
+	bio_endio(dio_bio);
 	btrfs_io_bio_free_csum(io_bio);
-	bio_put(bio);
 }
 
 void btrfs_update_ordered_extent(struct inode *inode,
@@ -8231,8 +8230,7 @@ static void btrfs_endio_direct_write(struct bio *bio)
 	kfree(dip);
 
 	dio_bio->bi_status = bio->bi_status;
-	dio_end_io(dio_bio);
-	bio_put(bio);
+	bio_endio(dio_bio);
 }
 
 static blk_status_t btrfs_submit_bio_start_direct_io(void *private_data,
@@ -8464,8 +8462,8 @@ static int btrfs_submit_direct_hook(struct btrfs_dio_private *dip)
 	return 0;
 }
 
-static void btrfs_submit_direct(struct bio *dio_bio, struct inode *inode,
-				loff_t file_offset)
+void btrfs_submit_direct(struct bio *dio_bio, struct inode *inode,
+			 loff_t file_offset)
 {
 	struct btrfs_dio_private *dip = NULL;
 	struct bio *bio = NULL;
@@ -8536,7 +8534,7 @@ static void btrfs_submit_direct(struct bio *dio_bio, struct inode *inode,
 		/*
 		 * The end io callbacks free our dip, do the final put on bio
 		 * and all the cleanup and final put for dio_bio (through
-		 * dio_end_io()).
+		 * end_io()).
 		 */
 		dip = NULL;
 		bio = NULL;
@@ -8555,7 +8553,7 @@ static void btrfs_submit_direct(struct bio *dio_bio, struct inode *inode,
 		 * Releases and cleans up our dio_bio, no need to bio_put()
 		 * nor bio_endio()/bio_io_error() against dio_bio.
 		 */
-		dio_end_io(dio_bio);
+		bio_endio(dio_bio);
 	}
 	if (bio)
 		bio_put(bio);
diff --git a/fs/btrfs/iomap.c b/fs/btrfs/iomap.c
index ee94c687db1a..554f30d83d57 100644
--- a/fs/btrfs/iomap.c
+++ b/fs/btrfs/iomap.c
@@ -7,6 +7,7 @@
  */
 
 #include <linux/iomap.h>
+#include <linux/uio.h>
 #include "ctree.h"
 #include "btrfs_inode.h"
 #include "volumes.h"
@@ -421,15 +422,110 @@ size_t btrfs_buffered_iomap_write(struct kiocb *iocb, struct iov_iter *from)
 	return written;
 }
 
+static struct btrfs_iomap *btrfs_iomap_init(struct inode *inode,
+		struct extent_map **em,
+		loff_t pos, loff_t length)
+{
+	int ret = 0;
+	struct extent_map *map = *em;
+	struct btrfs_iomap *bi;
+	u64 num_bytes;
+
+	bi = kzalloc(sizeof(struct btrfs_iomap), GFP_NOFS);
+	if (!bi)
+		return ERR_PTR(-ENOMEM);
+
+	bi->start = round_down(pos, PAGE_SIZE);
+	bi->end = PAGE_ALIGN(pos + length) - 1;
+	num_bytes = bi->end - bi->start + 1;
+
+	/* Wait for existing ordered extents in range to finish */
+	btrfs_wait_ordered_range(inode, bi->start, num_bytes);
+
+	lock_extent_bits(&BTRFS_I(inode)->io_tree, bi->start, bi->end, &bi->cached_state);
+
+	if (ret) {
+		unlock_extent_cached(&BTRFS_I(inode)->io_tree, bi->start, bi->end,
+				&bi->cached_state);
+		kfree(bi);
+		return ERR_PTR(ret);
+	}
+
+	refcount_inc(&map->refs);
+	ret = btrfs_get_extent_map_write(em, NULL,
+			inode, bi->start, num_bytes);
+	if (ret) {
+		unlock_extent_cached(&BTRFS_I(inode)->io_tree, bi->start, bi->end,
+				&bi->cached_state);
+		kfree(bi);
+		return ERR_PTR(ret);
+	}
+	free_extent_map(map);
+	return bi;
+}
+
 static int btrfs_dio_iomap_begin(struct inode *inode, loff_t pos,
-		loff_t length, unsigned flags, struct iomap *iomap,
-		struct iomap *srcmap)
+                loff_t length, unsigned flags, struct iomap *iomap,
+                struct iomap *srcmap)
+{
+        struct extent_map *em;
+        struct btrfs_iomap *bi = NULL;
+
+        em = btrfs_get_extent(BTRFS_I(inode), NULL, 0, pos, length, 0);
+
+        if (flags & IOMAP_WRITE) {
+                srcmap->offset = em->start;
+                srcmap->length = em->len;
+                srcmap->bdev = em->bdev;
+                if (em->block_start == EXTENT_MAP_HOLE) {
+                        srcmap->type = IOMAP_HOLE;
+                } else {
+                        srcmap->type = IOMAP_MAPPED;
+                        srcmap->addr = em->block_start;
+                }
+                bi = btrfs_iomap_init(inode, &em, pos, length);
+                if (IS_ERR(bi))
+                        return PTR_ERR(bi);
+        }
+
+        iomap->offset = em->start;
+        iomap->length = em->len;
+        iomap->bdev = em->bdev;
+
+	if (em->block_start == EXTENT_MAP_HOLE) {
+		iomap->type = IOMAP_HOLE;
+	} else {
+		iomap->type = IOMAP_MAPPED;
+		iomap->addr = em->block_start;
+	}
+        iomap->private = bi;
+	iomap->submit_io = btrfs_submit_direct;
+        return 0;
+}
+
+static int btrfs_dio_iomap_end(struct inode *inode, loff_t pos,
+		loff_t length, ssize_t written, unsigned flags,
+		struct iomap *iomap)
 {
-	return get_iomap(inode, pos, length, iomap);
+	struct btrfs_iomap *bi = iomap->private;
+	u64 wend;
+	loff_t release_bytes;
+
+	if (!bi)
+		return 0;
+
+	unlock_extent_cached(&BTRFS_I(inode)->io_tree, bi->start, bi->end,
+			&bi->cached_state);
+
+	wend = PAGE_ALIGN(pos + written);
+	release_bytes = wend - bi->end - 1;
+	kfree(bi);
+	return 0;
 }
 
 static const struct iomap_ops btrfs_dio_iomap_ops = {
 	.iomap_begin            = btrfs_dio_iomap_begin,
+	.iomap_end              = btrfs_dio_iomap_end,
 };
 
 ssize_t btrfs_dio_iomap_read(struct kiocb *iocb, struct iov_iter *to)
@@ -441,3 +537,54 @@ ssize_t btrfs_dio_iomap_read(struct kiocb *iocb, struct iov_iter *to)
 	inode_unlock_shared(inode);
 	return ret;
 }
+
+ssize_t btrfs_dio_iomap_write(struct kiocb *iocb, struct iov_iter *from)
+{
+	struct file *file = iocb->ki_filp;
+	struct inode *inode = file_inode(file);
+	ssize_t written, written_buffered;
+	loff_t pos = iocb->ki_pos, endbyte;
+	size_t count = iov_iter_count(from);
+	struct extent_changeset *data_reserved = NULL;
+	int err;
+
+	btrfs_delalloc_reserve_space(inode, &data_reserved, pos, count);
+
+	written = iomap_dio_rw(iocb, from, &btrfs_dio_iomap_ops, NULL);
+	if (written < count) {
+		ssize_t done = (written < 0) ? 0 : written;
+		btrfs_delalloc_release_space(inode, data_reserved, pos, count - done,
+	                       true);
+	}
+	btrfs_delalloc_release_extents(BTRFS_I(inode), count, false);
+	extent_changeset_free(data_reserved);
+
+	if (written < 0 || !iov_iter_count(from))
+		return written;
+
+	pos = iocb->ki_pos;
+	written_buffered = btrfs_buffered_iomap_write(iocb, from);
+	if (written_buffered < 0) {
+		err = written_buffered;
+		goto out;
+	}
+	/*
+	 * Ensure all data is persisted. We want the next direct IO read to be
+	 * able to read what was just written.
+	 */
+	endbyte = pos + written_buffered - 1;
+	err = btrfs_fdatawrite_range(inode, pos, endbyte);
+	if (err)
+		goto out;
+	err = filemap_fdatawait_range(inode->i_mapping, pos, endbyte);
+	if (err)
+		goto out;
+	written += written_buffered;
+	iocb->ki_pos = pos + written_buffered;
+	invalidate_mapping_pages(file->f_mapping, pos >> PAGE_SHIFT,
+			endbyte >> PAGE_SHIFT);
+out:
+	if (written > 0 && iocb->ki_pos > i_size_read(inode))
+			i_size_write(inode, iocb->ki_pos);
+	return written ? written : err;
+}
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 14/15] btrfs: Remove btrfs_dio_data and __btrfs_direct_write
  2019-09-01 20:08 [PATCH v3 0/15] Btrfs iomap Goldwyn Rodrigues
                   ` (12 preceding siblings ...)
  2019-09-01 20:08 ` [PATCH 13/15] btrfs: Use iomap_dio_rw for performing direct I/O writes Goldwyn Rodrigues
@ 2019-09-01 20:08 ` Goldwyn Rodrigues
  2019-09-01 20:08 ` [PATCH 15/15] btrfs: update inode size during bio completion Goldwyn Rodrigues
  2019-09-02 16:43 ` [PATCH v3 0/15] Btrfs iomap Christoph Hellwig
  15 siblings, 0 replies; 36+ messages in thread
From: Goldwyn Rodrigues @ 2019-09-01 20:08 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: linux-btrfs, darrick.wong, hch, david, riteshh, Goldwyn Rodrigues

From: Goldwyn Rodrigues <rgoldwyn@suse.com>

btrfs_dio_data is unnecessary since we are now storing all
information in btrfs_iomap.

Advantage: We don't abuse current->journal_info anymore :)
---
 fs/btrfs/file.c  | 40 ----------------------------
 fs/btrfs/inode.c | 81 ++------------------------------------------------------
 2 files changed, 2 insertions(+), 119 deletions(-)

diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 5d4347e12cdc..e6c1ffd74660 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -1350,46 +1350,6 @@ static noinline int check_can_nocow(struct btrfs_inode *inode, loff_t pos,
 	return ret;
 }
 
-static ssize_t __btrfs_direct_write(struct kiocb *iocb, struct iov_iter *from)
-{
-	struct file *file = iocb->ki_filp;
-	struct inode *inode = file_inode(file);
-	loff_t pos;
-	ssize_t written;
-	ssize_t written_buffered;
-	loff_t endbyte;
-	int err;
-
-	written = generic_file_direct_write(iocb, from);
-
-	if (written < 0 || !iov_iter_count(from))
-		return written;
-
-	pos = iocb->ki_pos;
-	written_buffered = btrfs_buffered_iomap_write(iocb, from);
-	if (written_buffered < 0) {
-		err = written_buffered;
-		goto out;
-	}
-	/*
-	 * Ensure all data is persisted. We want the next direct IO read to be
-	 * able to read what was just written.
-	 */
-	endbyte = pos + written_buffered - 1;
-	err = btrfs_fdatawrite_range(inode, pos, endbyte);
-	if (err)
-		goto out;
-	err = filemap_fdatawait_range(inode->i_mapping, pos, endbyte);
-	if (err)
-		goto out;
-	written += written_buffered;
-	iocb->ki_pos = pos + written_buffered;
-	invalidate_mapping_pages(file->f_mapping, pos >> PAGE_SHIFT,
-				 endbyte >> PAGE_SHIFT);
-out:
-	return written ? written : err;
-}
-
 static void update_time_for_write(struct inode *inode)
 {
 	struct timespec64 now;
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 323d72858c9c..87fbe73ca2e4 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -54,13 +54,6 @@ struct btrfs_iget_args {
 	struct btrfs_root *root;
 };
 
-struct btrfs_dio_data {
-	u64 reserve;
-	u64 unsubmitted_oe_range_start;
-	u64 unsubmitted_oe_range_end;
-	int overwrite;
-};
-
 static const struct inode_operations btrfs_dir_inode_operations;
 static const struct inode_operations btrfs_symlink_inode_operations;
 static const struct inode_operations btrfs_dir_ro_inode_operations;
@@ -7664,7 +7657,6 @@ int btrfs_get_extent_map_write(struct extent_map **map,
 static int btrfs_get_blocks_direct_write(struct extent_map **map,
 					 struct buffer_head *bh_result,
 					 struct inode *inode,
-					 struct btrfs_dio_data *dio_data,
 					 u64 start, u64 len)
 {
 	int ret;
@@ -7686,17 +7678,6 @@ static int btrfs_get_blocks_direct_write(struct extent_map **map,
 	if (!test_bit(EXTENT_FLAG_PREALLOC, &em->flags))
 		set_buffer_new(bh_result);
 
-	/*
-	 * Need to update the i_size under the extent lock so buffered
-	 * readers will get the updated i_size when we unlock.
-	 */
-	if (!dio_data->overwrite && start + len > i_size_read(inode))
-		i_size_write(inode, start + len);
-
-	WARN_ON(dio_data->reserve < len);
-	dio_data->reserve -= len;
-	dio_data->unsubmitted_oe_range_end = start + len;
-	current->journal_info = dio_data;
 	return ret;
 }
 
@@ -7706,7 +7687,6 @@ static int btrfs_get_blocks_direct(struct inode *inode, sector_t iblock,
 	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
 	struct extent_map *em;
 	struct extent_state *cached_state = NULL;
-	struct btrfs_dio_data *dio_data = NULL;
 	u64 start = iblock << inode->i_blkbits;
 	u64 lockstart, lockend;
 	u64 len = bh_result->b_size;
@@ -7721,16 +7701,6 @@ static int btrfs_get_blocks_direct(struct inode *inode, sector_t iblock,
 	lockstart = start;
 	lockend = start + len - 1;
 
-	if (current->journal_info) {
-		/*
-		 * Need to pull our outstanding extents and set journal_info to NULL so
-		 * that anything that needs to check if there's a transaction doesn't get
-		 * confused.
-		 */
-		dio_data = current->journal_info;
-		current->journal_info = NULL;
-	}
-
 	/*
 	 * If this errors out it's because we couldn't invalidate pagecache for
 	 * this range and we need to fallback to buffered.
@@ -7770,7 +7740,7 @@ static int btrfs_get_blocks_direct(struct inode *inode, sector_t iblock,
 
 	if (create) {
 		ret = btrfs_get_blocks_direct_write(&em, bh_result, inode,
-						    dio_data, start, len);
+						    start, len);
 		if (ret < 0)
 			goto unlock_err;
 
@@ -7808,8 +7778,6 @@ static int btrfs_get_blocks_direct(struct inode *inode, sector_t iblock,
 	clear_extent_bit(&BTRFS_I(inode)->io_tree, lockstart, lockend,
 			 unlock_bits, 1, 0, &cached_state);
 err:
-	if (dio_data)
-		current->journal_info = dio_data;
 	return ret;
 }
 
@@ -8498,21 +8466,6 @@ void btrfs_submit_direct(struct bio *dio_bio, struct inode *inode,
 		dip->subio_endio = btrfs_subio_endio_read;
 	}
 
-	/*
-	 * Reset the range for unsubmitted ordered extents (to a 0 length range)
-	 * even if we fail to submit a bio, because in such case we do the
-	 * corresponding error handling below and it must not be done a second
-	 * time by btrfs_direct_IO().
-	 */
-	if (write) {
-		struct btrfs_dio_data *dio_data = current->journal_info;
-
-		dio_data->unsubmitted_oe_range_end = dip->logical_offset +
-			dip->bytes;
-		dio_data->unsubmitted_oe_range_start =
-			dio_data->unsubmitted_oe_range_end;
-	}
-
 	ret = btrfs_submit_direct_hook(dip);
 	if (!ret)
 		return;
@@ -8598,7 +8551,6 @@ static ssize_t btrfs_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
 	struct file *file = iocb->ki_filp;
 	struct inode *inode = file->f_mapping->host;
 	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
-	struct btrfs_dio_data dio_data = { 0 };
 	struct extent_changeset *data_reserved = NULL;
 	loff_t offset = iocb->ki_pos;
 	size_t count = 0;
@@ -8631,7 +8583,6 @@ static ssize_t btrfs_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
 		 * not unlock the i_mutex at this case.
 		 */
 		if (offset + count <= inode->i_size) {
-			dio_data.overwrite = 1;
 			inode_unlock(inode);
 			relock = true;
 		} else if (iocb->ki_flags & IOCB_NOWAIT) {
@@ -8643,16 +8594,6 @@ static ssize_t btrfs_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
 		if (ret)
 			goto out;
 
-		/*
-		 * We need to know how many extents we reserved so that we can
-		 * do the accounting properly if we go over the number we
-		 * originally calculated.  Abuse current->journal_info for this.
-		 */
-		dio_data.reserve = round_up(count,
-					    fs_info->sectorsize);
-		dio_data.unsubmitted_oe_range_start = (u64)offset;
-		dio_data.unsubmitted_oe_range_end = (u64)offset;
-		current->journal_info = &dio_data;
 		down_read(&BTRFS_I(inode)->dio_sem);
 	} else if (test_bit(BTRFS_INODE_READDIO_NEED_LOCK,
 				     &BTRFS_I(inode)->runtime_flags)) {
@@ -8667,25 +8608,7 @@ static ssize_t btrfs_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
 				   btrfs_submit_direct, flags);
 	if (iov_iter_rw(iter) == WRITE) {
 		up_read(&BTRFS_I(inode)->dio_sem);
-		current->journal_info = NULL;
-		if (ret < 0 && ret != -EIOCBQUEUED) {
-			if (dio_data.reserve)
-				btrfs_delalloc_release_space(inode, data_reserved,
-					offset, dio_data.reserve, true);
-			/*
-			 * On error we might have left some ordered extents
-			 * without submitting corresponding bios for them, so
-			 * cleanup them up to avoid other tasks getting them
-			 * and waiting for them to complete forever.
-			 */
-			if (dio_data.unsubmitted_oe_range_start <
-			    dio_data.unsubmitted_oe_range_end)
-				btrfs_update_ordered_extent(inode,
-					dio_data.unsubmitted_oe_range_start,
-					dio_data.unsubmitted_oe_range_end -
-					dio_data.unsubmitted_oe_range_start,
-					false);
-		} else if (ret >= 0 && (size_t)ret < count)
+		if (ret >= 0 && (size_t)ret < count)
 			btrfs_delalloc_release_space(inode, data_reserved,
 					offset, count - (size_t)ret, true);
 		btrfs_delalloc_release_extents(BTRFS_I(inode), count, false);
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 15/15] btrfs: update inode size during bio completion
  2019-09-01 20:08 [PATCH v3 0/15] Btrfs iomap Goldwyn Rodrigues
                   ` (13 preceding siblings ...)
  2019-09-01 20:08 ` [PATCH 14/15] btrfs: Remove btrfs_dio_data and __btrfs_direct_write Goldwyn Rodrigues
@ 2019-09-01 20:08 ` Goldwyn Rodrigues
  2019-09-02 16:43 ` [PATCH v3 0/15] Btrfs iomap Christoph Hellwig
  15 siblings, 0 replies; 36+ messages in thread
From: Goldwyn Rodrigues @ 2019-09-01 20:08 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: linux-btrfs, darrick.wong, hch, david, riteshh, Goldwyn Rodrigues

From: Goldwyn Rodrigues <rgoldwyn@suse.com>

Update the inode size for dio writes during bio completion.
This ties the decision to increase the inode size to the
success of the underlying block layer I/O, which matters
especially in AIO cases.

Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
---
 fs/btrfs/inode.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 87fbe73ca2e4..f87a9dd154a9 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -8191,9 +8191,13 @@ static void btrfs_endio_direct_write(struct bio *bio)
 {
 	struct btrfs_dio_private *dip = bio->bi_private;
 	struct bio *dio_bio = dip->dio_bio;
+	struct inode *inode = dip->inode;
 
-	btrfs_update_ordered_extent(dip->inode, dip->logical_offset,
+	btrfs_update_ordered_extent(inode, dip->logical_offset,
 				     dip->bytes, !bio->bi_status);
+	if (!bio->bi_status &&
+	    i_size_read(inode) < dip->logical_offset + dip->bytes)
+		i_size_write(inode, dip->logical_offset + dip->bytes);
 
 	kfree(dip);
 
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [PATCH 01/15] iomap: Introduce CONFIG_FS_IOMAP_DEBUG
  2019-09-01 20:08 ` [PATCH 01/15] iomap: Introduce CONFIG_FS_IOMAP_DEBUG Goldwyn Rodrigues
@ 2019-09-01 20:50   ` Darrick J. Wong
  2019-09-02 16:29   ` Christoph Hellwig
  1 sibling, 0 replies; 36+ messages in thread
From: Darrick J. Wong @ 2019-09-01 20:50 UTC (permalink / raw)
  To: Goldwyn Rodrigues
  Cc: linux-fsdevel, linux-btrfs, hch, david, riteshh, Goldwyn Rodrigues

On Sun, Sep 01, 2019 at 03:08:22PM -0500, Goldwyn Rodrigues wrote:
> From: Goldwyn Rodrigues <rgoldwyn@suse.com>
> 
> To improve debugging abilities, especially invalid option
> asserts.
> 
> Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
> ---
>  fs/Kconfig            |  3 +++
>  include/linux/iomap.h | 11 +++++++++++
>  2 files changed, 14 insertions(+)
> 
> diff --git a/fs/Kconfig b/fs/Kconfig
> index bfb1c6095c7a..4bed5df9b55f 100644
> --- a/fs/Kconfig
> +++ b/fs/Kconfig
> @@ -19,6 +19,9 @@ if BLOCK
>  
>  config FS_IOMAP
>  	bool
> +config FS_IOMAP_DEBUG
> +	bool "Debugging for the iomap code"
> +	depends on FS_IOMAP
>  
>  source "fs/ext2/Kconfig"
>  source "fs/ext4/Kconfig"
> diff --git a/include/linux/iomap.h b/include/linux/iomap.h
> index bc499ceae392..209b6c93674e 100644
> --- a/include/linux/iomap.h
> +++ b/include/linux/iomap.h
> @@ -18,6 +18,17 @@ struct page;
>  struct vm_area_struct;
>  struct vm_fault;
>  
> +#ifdef CONFIG_FS_IOMAP_DEBUG
> +#define iomap_assert(expr) \
> +	if(!(expr)) { \
> +		printk( "Assertion failed! %s,%s,%s,line=%d\n",\
> +#expr,__FILE__,__func__,__LINE__); \
> +		BUG(); \

Hmm, this log message ought to have a priority level, right?

#define IOMAP_ASSERT(expr) do { WARN_ON(!(expr)); } while(0)

(or crib ASSERT/ass{fail,warn} from XFS? :D)

> +	}
> +#else
> +#define iomap_assert(expr)

((void)0)

So we don't just have stray semicolons in the code stream?

--D

> +#endif
> +
>  /*
>   * Types of block ranges for iomap mappings:
>   */
> -- 
> 2.16.4
> 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 01/15] iomap: Introduce CONFIG_FS_IOMAP_DEBUG
  2019-09-01 20:08 ` [PATCH 01/15] iomap: Introduce CONFIG_FS_IOMAP_DEBUG Goldwyn Rodrigues
  2019-09-01 20:50   ` Darrick J. Wong
@ 2019-09-02 16:29   ` Christoph Hellwig
  2019-09-02 17:09     ` Darrick J. Wong
  1 sibling, 1 reply; 36+ messages in thread
From: Christoph Hellwig @ 2019-09-02 16:29 UTC (permalink / raw)
  To: Goldwyn Rodrigues
  Cc: linux-fsdevel, linux-btrfs, darrick.wong, hch, david, riteshh,
	Goldwyn Rodrigues

On Sun, Sep 01, 2019 at 03:08:22PM -0500, Goldwyn Rodrigues wrote:
> From: Goldwyn Rodrigues <rgoldwyn@suse.com>
> 
> To improve debugging abilities, especially invalid option
> asserts.

Looking at the code I'd much rather have unconditional WARN_ON_ONCE
statements in most places.  Including returning an error when we see
something invalid in most cases.
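
For instance (a sketch of that style only, not code from this series;
iomap_check_srcmap() is an assumed name), an invalid source mapping
could be warned about once and then rejected with an error:

static int iomap_check_srcmap(const struct iomap *srcmap)
{
	/* unconditional check: warn once, then fail the operation */
	if (WARN_ON_ONCE(srcmap->type != IOMAP_HOLE &&
			 srcmap->addr == IOMAP_NULL_ADDR))
		return -EIO;
	return 0;
}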

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 02/15] iomap: Use a IOMAP_COW/srcmap for a read-modify-write I/O
  2019-09-01 20:08 ` [PATCH 02/15] iomap: Use a IOMAP_COW/srcmap for a read-modify-write I/O Goldwyn Rodrigues
@ 2019-09-02 16:31   ` Christoph Hellwig
  2019-09-03  3:18     ` Darrick J. Wong
  2019-09-03 14:05     ` Goldwyn Rodrigues
  0 siblings, 2 replies; 36+ messages in thread
From: Christoph Hellwig @ 2019-09-02 16:31 UTC (permalink / raw)
  To: Goldwyn Rodrigues
  Cc: linux-fsdevel, linux-btrfs, darrick.wong, hch, david, riteshh,
	Goldwyn Rodrigues

On Sun, Sep 01, 2019 at 03:08:23PM -0500, Goldwyn Rodrigues wrote:
> --- a/include/linux/iomap.h
> +++ b/include/linux/iomap.h
> @@ -37,6 +37,7 @@ struct vm_fault;
>  #define IOMAP_MAPPED	0x03	/* blocks allocated at @addr */
>  #define IOMAP_UNWRITTEN	0x04	/* blocks allocated at @addr in unwritten state */
>  #define IOMAP_INLINE	0x05	/* data inline in the inode */
> +#define IOMAP_COW	0x06	/* copy data from srcmap before writing */

I don't think IOMAP_COW can be a type - it is a flag given that we
can do COW operations that allocate normal written extents (e.g. for
direct I/O or DAX) and for delayed allocations.
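
A rough sketch of the flag-based shape (names are not settled in this
thread; reusing the existing IOMAP_F_SHARED flag is an assumption, and
example_report_cow() is a made-up helper for illustration):

static void example_report_cow(struct iomap *iomap, struct iomap *srcmap,
			       u64 dest_addr, u64 shared_addr)
{
	/* destination extent, reported with its real type as usual */
	iomap->type   = IOMAP_MAPPED;	/* or IOMAP_UNWRITTEN, IOMAP_DELALLOC */
	iomap->addr   = dest_addr;
	iomap->flags |= IOMAP_F_SHARED;	/* data must be copied before the write */

	/* where the old data has to be read from */
	srcmap->type = IOMAP_MAPPED;
	srcmap->addr = shared_addr;
}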


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 03/15] iomap: Read page from srcmap for IOMAP_COW
  2019-09-01 20:08 ` [PATCH 03/15] iomap: Read page from srcmap for IOMAP_COW Goldwyn Rodrigues
@ 2019-09-02 16:31   ` Christoph Hellwig
  2019-09-02 17:01     ` Darrick J. Wong
  0 siblings, 1 reply; 36+ messages in thread
From: Christoph Hellwig @ 2019-09-02 16:31 UTC (permalink / raw)
  To: Goldwyn Rodrigues
  Cc: linux-fsdevel, linux-btrfs, darrick.wong, hch, david, riteshh,
	Goldwyn Rodrigues

On Sun, Sep 01, 2019 at 03:08:24PM -0500, Goldwyn Rodrigues wrote:

> +		iomap_assert(!(iomap->flags & IOMAP_F_BUFFER_HEAD));
> +		iomap_assert(srcmap->type == IOMAP_HOLE || srcmap->addr > 0);

0 can be a valid address in various file systems, so I don't think we
can just exclude it.  Then again COWing from a hole seems pointless,
doesn't it?

So just check for addr != IOMAP_NULL_ADDR here?
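
i.e. something along the lines of the following (a sketch of the
adjusted assert, not a quote from the patch):

		iomap_assert(srcmap->type == IOMAP_HOLE ||
			     srcmap->addr != IOMAP_NULL_ADDR);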

>  
> @@ -961,7 +966,7 @@ iomap_zero_range_actor(struct inode *inode, loff_t pos, loff_t count,
>  		if (IS_DAX(inode))
>  			status = iomap_dax_zero(pos, offset, bytes, iomap);
>  		else
> -			status = iomap_zero(inode, pos, offset, bytes, iomap);
> +			status = iomap_zero(inode, pos, offset, bytes, iomap, srcmap);

This introduces an > 80 character line.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 12/15] iomap: use a function pointer for dio submits
  2019-09-01 20:08 ` [PATCH 12/15] iomap: use a function pointer for dio submits Goldwyn Rodrigues
@ 2019-09-02 16:41   ` Christoph Hellwig
  0 siblings, 0 replies; 36+ messages in thread
From: Christoph Hellwig @ 2019-09-02 16:41 UTC (permalink / raw)
  To: Goldwyn Rodrigues
  Cc: linux-fsdevel, linux-btrfs, darrick.wong, hch, david, riteshh,
	Goldwyn Rodrigues

On Sun, Sep 01, 2019 at 03:08:33PM -0500, Goldwyn Rodrigues wrote:
> From: Goldwyn Rodrigues <rgoldwyn@suse.com>
> 
> This helps filesystems to perform tasks on the bio while
> submitting for I/O. Since btrfs requires the position
> we are working on, pass pos to iomap_dio_submit_bio()

Please avoid passing function pointers in mutable data structures.

I think the best idea here is to move the end_io handler into
a structure that can be marked const, and to which we can add the
submit handler later.  Something like this:

---
From d17419b6b4124e95088858d3dcea486db6be9ffa Mon Sep 17 00:00:00 2001
From: Christoph Hellwig <hch@lst.de>
Date: Mon, 2 Sep 2019 18:39:34 +0200
Subject: iomap: move the direct I/O ->end_io callback into a structure

Add a new iomap_dio_ops structure that for now just contains the end_io
handler.  This avoid storing the function pointer in a mutable structure,
which is a possible exploit vector for kernel code execution, and prepares
for adding a submit_io handler that btrfs needs.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/iomap/direct-io.c  | 17 +++++++++--------
 fs/xfs/xfs_file.c     |  6 +++++-
 include/linux/iomap.h |  9 ++++++---
 3 files changed, 20 insertions(+), 12 deletions(-)

diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
index 10517cea9682..cfe5f6a88945 100644
--- a/fs/iomap/direct-io.c
+++ b/fs/iomap/direct-io.c
@@ -24,7 +24,7 @@
 
 struct iomap_dio {
 	struct kiocb		*iocb;
-	iomap_dio_end_io_t	*end_io;
+	const struct iomap_dio_ops *dops;
 	loff_t			i_size;
 	loff_t			size;
 	atomic_t		ref;
@@ -72,13 +72,14 @@ static void iomap_dio_submit_bio(struct iomap_dio *dio, struct iomap *iomap,
 
 static ssize_t iomap_dio_complete(struct iomap_dio *dio)
 {
+	const struct iomap_dio_ops *dops = dio->dops;
 	struct kiocb *iocb = dio->iocb;
 	struct inode *inode = file_inode(iocb->ki_filp);
 	loff_t offset = iocb->ki_pos;
 	ssize_t ret;
 
-	if (dio->end_io) {
-		ret = dio->end_io(iocb,
+	if (dops && dops->end_io) {
+		ret = dops->end_io(iocb,
 				dio->error ? dio->error : dio->size,
 				dio->flags);
 	} else {
@@ -101,9 +102,9 @@ static ssize_t iomap_dio_complete(struct iomap_dio *dio)
 	 * one is a pretty crazy thing to do, so we don't support it 100%.  If
 	 * this invalidation fails, tough, the write still worked...
 	 *
-	 * And this page cache invalidation has to be after dio->end_io(), as
-	 * some filesystems convert unwritten extents to real allocations in
-	 * end_io() when necessary, otherwise a racing buffer read would cache
+	 * And this page cache invalidation has to be after ->end_io(), as some
+	 * filesystems convert unwritten extents to real allocations in
+	 * ->end_io() when necessary, otherwise a racing buffer read would cache
 	 * zeros from unwritten extents.
 	 */
 	if (!dio->error &&
@@ -396,7 +397,7 @@ iomap_dio_actor(struct inode *inode, loff_t pos, loff_t length,
  */
 ssize_t
 iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
-		const struct iomap_ops *ops, iomap_dio_end_io_t end_io)
+		const struct iomap_ops *ops, const struct iomap_dio_ops *dops)
 {
 	struct address_space *mapping = iocb->ki_filp->f_mapping;
 	struct inode *inode = file_inode(iocb->ki_filp);
@@ -421,7 +422,7 @@ iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
 	atomic_set(&dio->ref, 1);
 	dio->size = 0;
 	dio->i_size = i_size_read(inode);
-	dio->end_io = end_io;
+	dio->dops = dops;
 	dio->error = 0;
 	dio->flags = 0;
 
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 28101bbc0b78..0a2dc6b71c3c 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -440,6 +440,10 @@ xfs_dio_write_end_io(
 	return error;
 }
 
+static const struct iomap_dio_ops xfs_dio_write_ops = {
+	.end_io		= xfs_dio_write_end_io,
+};
+
 /*
  * xfs_file_dio_aio_write - handle direct IO writes
  *
@@ -540,7 +544,7 @@ xfs_file_dio_aio_write(
 	}
 
 	trace_xfs_file_direct_write(ip, count, iocb->ki_pos);
-	ret = iomap_dio_rw(iocb, from, &xfs_iomap_ops, xfs_dio_write_end_io);
+	ret = iomap_dio_rw(iocb, from, &xfs_iomap_ops, &xfs_dio_write_ops);
 
 	/*
 	 * If unaligned, this is the only IO in-flight. If it has not yet
diff --git a/include/linux/iomap.h b/include/linux/iomap.h
index bc499ceae392..d8c0769bb360 100644
--- a/include/linux/iomap.h
+++ b/include/linux/iomap.h
@@ -188,10 +188,13 @@ sector_t iomap_bmap(struct address_space *mapping, sector_t bno,
  */
 #define IOMAP_DIO_UNWRITTEN	(1 << 0)	/* covers unwritten extent(s) */
 #define IOMAP_DIO_COW		(1 << 1)	/* covers COW extent(s) */
-typedef int (iomap_dio_end_io_t)(struct kiocb *iocb, ssize_t ret,
-		unsigned flags);
+
+struct iomap_dio_ops {
+	int	(*end_io)(struct kiocb *iocb, ssize_t ret, unsigned flags);
+};
+
 ssize_t iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
-		const struct iomap_ops *ops, iomap_dio_end_io_t end_io);
+		const struct iomap_ops *ops, const struct iomap_dio_ops *dops);
 int iomap_dio_iopoll(struct kiocb *kiocb, bool spin);
 
 #ifdef CONFIG_SWAP
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [PATCH v3 0/15] Btrfs iomap
  2019-09-01 20:08 [PATCH v3 0/15] Btrfs iomap Goldwyn Rodrigues
                   ` (14 preceding siblings ...)
  2019-09-01 20:08 ` [PATCH 15/15] btrfs: update inode size during bio completion Goldwyn Rodrigues
@ 2019-09-02 16:43 ` Christoph Hellwig
  2019-09-03  3:51   ` Shiyang Ruan
  2019-09-03 13:44   ` Goldwyn Rodrigues
  15 siblings, 2 replies; 36+ messages in thread
From: Christoph Hellwig @ 2019-09-02 16:43 UTC (permalink / raw)
  To: Goldwyn Rodrigues
  Cc: linux-fsdevel, linux-btrfs, darrick.wong, hch, david, riteshh

On Sun, Sep 01, 2019 at 03:08:21PM -0500, Goldwyn Rodrigues wrote:
> This is an effort to use iomap for btrfs. This would keep most
> responsibility of page handling during writes in iomap code, hence
> code reduction. For CoW support, changes are needed in iomap code
> to make sure we perform a copy before the write.
> This is in line with the discussion we had during adding dax support in
> btrfs.

This looks pretty good modulo a few comments.

Can you please convert the XFS code to use your two iomaps for COW
approach as well to validate it?

Also the iomap_file_dirty helper would really benefit from using the
two iomaps, any chance you could look into improving it to use your
new infrastructure?

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 03/15] iomap: Read page from srcmap for IOMAP_COW
  2019-09-02 16:31   ` Christoph Hellwig
@ 2019-09-02 17:01     ` Darrick J. Wong
  2019-09-02 17:13       ` Christoph Hellwig
  0 siblings, 1 reply; 36+ messages in thread
From: Darrick J. Wong @ 2019-09-02 17:01 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Goldwyn Rodrigues, linux-fsdevel, linux-btrfs, david, riteshh,
	Goldwyn Rodrigues

On Mon, Sep 02, 2019 at 06:31:24PM +0200, Christoph Hellwig wrote:
> On Sun, Sep 01, 2019 at 03:08:24PM -0500, Goldwyn Rodrigues wrote:
> 
> > +		iomap_assert(!(iomap->flags & IOMAP_F_BUFFER_HEAD));
> > +		iomap_assert(srcmap->type == IOMAP_HOLE || srcmap->addr > 0);
> 
> 0 can be a valid address in various file systems, so I don't think we
> can just exclude it.  Then again COWing from a hole seems pointless,
> doesn't it?

XFS does that if you set a cowextsize hint and a speculative cow
preallocation ends up covering a hole.  Granted I don't think there's
much point in reading from a COW fork extent to fill in an unaligned
buffered write since it /should/ just end up zero-filling the pagecache
regardless of fork... but I don't see much harm in doing that.

--D

> So just check for addr != IOMAP_NULL_ADDR here?
> 
> >  
> > @@ -961,7 +966,7 @@ iomap_zero_range_actor(struct inode *inode, loff_t pos, loff_t count,
> >  		if (IS_DAX(inode))
> >  			status = iomap_dax_zero(pos, offset, bytes, iomap);
> >  		else
> > -			status = iomap_zero(inode, pos, offset, bytes, iomap);
> > +			status = iomap_zero(inode, pos, offset, bytes, iomap, srcmap);
> 
> This introduces an > 80 character line.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 01/15] iomap: Introduce CONFIG_FS_IOMAP_DEBUG
  2019-09-02 16:29   ` Christoph Hellwig
@ 2019-09-02 17:09     ` Darrick J. Wong
  2019-09-02 17:18       ` Christoph Hellwig
  0 siblings, 1 reply; 36+ messages in thread
From: Darrick J. Wong @ 2019-09-02 17:09 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Goldwyn Rodrigues, linux-fsdevel, linux-btrfs, david, riteshh,
	Goldwyn Rodrigues

On Mon, Sep 02, 2019 at 06:29:34PM +0200, Christoph Hellwig wrote:
> On Sun, Sep 01, 2019 at 03:08:22PM -0500, Goldwyn Rodrigues wrote:
> > From: Goldwyn Rodrigues <rgoldwyn@suse.com>
> > 
> > To improve debugging abilities, especially invalid option
> > asserts.
> 
> Looking at the code I'd much rather have unconditional WARN_ON_ONCE
> statements in most places.  Including returning an error when we see
> something invalid in most cases.

Yeah, I was thinking something like this, which has the advantage that
the report format is familiar to XFS developers and will get picked up
by the automated error collection stuff I put in xfstests to complain
about any XFS assertion failures:

iomap: Introduce CONFIG_FS_IOMAP_DEBUG

To improve debugging abilities, especially invalid option
asserts.

Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
[darrick: restructure it to follow what xfs does]
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/Kconfig            |    3 +++
 fs/iomap/apply.c      |   10 ++++++++++
 include/linux/iomap.h |   14 ++++++++++++++
 3 files changed, 27 insertions(+)

diff --git a/fs/Kconfig b/fs/Kconfig
index bfb1c6095c7a..4bed5df9b55f 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -19,6 +19,9 @@ if BLOCK
 
 config FS_IOMAP
 	bool
+config FS_IOMAP_DEBUG
+	bool "Debugging for the iomap code"
+	depends on FS_IOMAP
 
 source "fs/ext2/Kconfig"
 source "fs/ext4/Kconfig"
diff --git a/fs/iomap/apply.c b/fs/iomap/apply.c
index 54c02aecf3cd..95cc3a2cadd5 100644
--- a/fs/iomap/apply.c
+++ b/fs/iomap/apply.c
@@ -8,6 +8,16 @@
 #include <linux/fs.h>
 #include <linux/iomap.h>
 
+#ifdef CONFIG_FS_IOMAP_DEBUG
+void
+iomap_assertion_failed(const char *expr, const char *file, int line)
+{
+	printk(KERN_EMERG "IOMAP: Assertion failed: %s, file: %s, line: %d",
+	       expr, file, line);
+	WARN_ON_ONCE(1);
+}
+#endif
+
 /*
  * Execute a iomap write on a segment of the mapping that spans a
  * contiguous range of pages that have identical block mapping state.
diff --git a/include/linux/iomap.h b/include/linux/iomap.h
index 834d3923e2f2..b3d5d6f323cf 100644
--- a/include/linux/iomap.h
+++ b/include/linux/iomap.h
@@ -20,6 +20,20 @@ struct page;
 struct vm_area_struct;
 struct vm_fault;
 
+#ifdef CONFIG_FS_IOMAP_DEBUG
+
+extern void iomap_assertion_failed(const char *expr, const char *f, int l);
+
+#define IOMAP_ASSERT(expr) \
+	do { \
+		if (unlikely(!(expr))) \
+			iomap_assertion_failed(#expr, __FILE__, __LINE__); \
+	} while(0)
+
+#else
+#define IOMAP_ASSERT(expr)	((void)0)
+#endif
+
 /*
  * Types of block ranges for iomap mappings:
  */

^ permalink raw reply related	[flat|nested] 36+ messages in thread
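
A hypothetical call site for the IOMAP_ASSERT() macro above, borrowing the
condition from the srcmap discussion elsewhere in this thread (not part of
the posted patch):

	IOMAP_ASSERT(!(iomap->flags & IOMAP_F_BUFFER_HEAD));
	IOMAP_ASSERT(srcmap->type == IOMAP_HOLE ||
		     srcmap->addr != IOMAP_NULL_ADDR);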

* Re: [PATCH 03/15] iomap: Read page from srcmap for IOMAP_COW
  2019-09-02 17:01     ` Darrick J. Wong
@ 2019-09-02 17:13       ` Christoph Hellwig
  0 siblings, 0 replies; 36+ messages in thread
From: Christoph Hellwig @ 2019-09-02 17:13 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Christoph Hellwig, Goldwyn Rodrigues, linux-fsdevel, linux-btrfs,
	david, riteshh, Goldwyn Rodrigues

On Mon, Sep 02, 2019 at 10:01:09AM -0700, Darrick J. Wong wrote:
> On Mon, Sep 02, 2019 at 06:31:24PM +0200, Christoph Hellwig wrote:
> > On Sun, Sep 01, 2019 at 03:08:24PM -0500, Goldwyn Rodrigues wrote:
> > 
> > > +		iomap_assert(!(iomap->flags & IOMAP_F_BUFFER_HEAD));
> > > +		iomap_assert(srcmap->type == IOMAP_HOLE || srcmap->addr > 0);
> > 
> > 0 can be a valid address in various file systems, so I don't think we
> > can just exclude it.  Then again COWing from a hole seems pointless,
> > doesn't it?
> 
> XFS does that if you set a cowextsize hint and a speculative cow
> preallocation ends up covering a hole.  Granted I don't think there's
> much point in reading from a COW fork extent to fill in an unaligned
> buffered write since it /should/ just end up zero-filling the pagecache
> regardless of fork... but I don't see much harm in doing that.

That assumes you'd set the iomap-level COW flag for anything that
writes to the COW fork in XFS.  Which doesn't sound right to me - the
iomap-level flag indicates that we actually need to read some data, which
for a hole is rather pointless, as you said.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 01/15] iomap: Introduce CONFIG_FS_IOMAP_DEBUG
  2019-09-02 17:09     ` Darrick J. Wong
@ 2019-09-02 17:18       ` Christoph Hellwig
  2019-09-03 15:06         ` Darrick J. Wong
  0 siblings, 1 reply; 36+ messages in thread
From: Christoph Hellwig @ 2019-09-02 17:18 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Christoph Hellwig, Goldwyn Rodrigues, linux-fsdevel, linux-btrfs,
	david, riteshh, Goldwyn Rodrigues

On Mon, Sep 02, 2019 at 10:09:16AM -0700, Darrick J. Wong wrote:
> On Mon, Sep 02, 2019 at 06:29:34PM +0200, Christoph Hellwig wrote:
> > On Sun, Sep 01, 2019 at 03:08:22PM -0500, Goldwyn Rodrigues wrote:
> > > From: Goldwyn Rodrigues <rgoldwyn@suse.com>
> > > 
> > > To improve debugging abilities, especially invalid option
> > > asserts.
> > 
> > Looking at the code I'd much rather have unconditional WARN_ON_ONCE
> > statements in most places.  Including returning an error when we see
> > something invalid in most cases.
> 
> Yeah, I was thinking something like this, which has the advantage that
> the report format is familiar to XFS developers and will get picked up
> by the automated error collection stuff I put in xfstests to complain
> about any XFS assertion failures:
> 
> iomap: Introduce CONFIG_FS_IOMAP_DEBUG
> 
> To improve debugging abilities, especially invalid option
> asserts.

I'd actually just rather have more unconditional WARN_ON_ONCE calls,
including actually recovering from the situation by returning an actual
error code.  That is, more like:

	if (WARN_ON_ONCE(some_impossible_condition))
		return -EIO;

^ permalink raw reply	[flat|nested] 36+ messages in thread
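
Applied to the same srcmap checks discussed above, the style Christoph
prefers would look roughly like this (illustrative only, not a patch):

	if (WARN_ON_ONCE(iomap->flags & IOMAP_F_BUFFER_HEAD))
		return -EIO;
	if (WARN_ON_ONCE(srcmap->type != IOMAP_HOLE &&
			 srcmap->addr == IOMAP_NULL_ADDR))
		return -EIO;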

* Re: [PATCH 02/15] iomap: Use a IOMAP_COW/srcmap for a read-modify-write I/O
  2019-09-02 16:31   ` Christoph Hellwig
@ 2019-09-03  3:18     ` Darrick J. Wong
  2019-09-03 14:12       ` Goldwyn Rodrigues
  2019-09-03 14:05     ` Goldwyn Rodrigues
  1 sibling, 1 reply; 36+ messages in thread
From: Darrick J. Wong @ 2019-09-03  3:18 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Goldwyn Rodrigues, linux-fsdevel, linux-btrfs, david, riteshh,
	Goldwyn Rodrigues

On Mon, Sep 02, 2019 at 06:31:04PM +0200, Christoph Hellwig wrote:
> On Sun, Sep 01, 2019 at 03:08:23PM -0500, Goldwyn Rodrigues wrote:
> > --- a/include/linux/iomap.h
> > +++ b/include/linux/iomap.h
> > @@ -37,6 +37,7 @@ struct vm_fault;
> >  #define IOMAP_MAPPED	0x03	/* blocks allocated at @addr */
> >  #define IOMAP_UNWRITTEN	0x04	/* blocks allocated at @addr in unwritten state */
> >  #define IOMAP_INLINE	0x05	/* data inline in the inode */
> > +#define IOMAP_COW	0x06	/* copy data from srcmap before writing */
> 
> I don't think IOMAP_COW can be a type - it is a flag given that we
> can do COW operations that allocate normal written extents (e.g. for
> direct I/O or DAX) and for delayed allocations.

If iomap_apply always zeros out @srcmap before calling ->iomap_begin, do
we even need a flag/type code?  Or does it suffice to check that
srcmap.length > 0 and use it appropriately?

--D

^ permalink raw reply	[flat|nested] 36+ messages in thread
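
A rough sketch of Darrick's idea, assuming ->iomap_begin() gains the second
map argument proposed in this series (the names and calling convention here
are illustrative, not taken from the patches):

	struct iomap iomap = { 0 };
	struct iomap srcmap = { 0 };	/* zeroed by iomap_apply() before ->iomap_begin() */
	loff_t ret;

	ret = ops->iomap_begin(inode, pos, length, flags, &iomap, &srcmap);
	if (ret)
		return ret;

	if (srcmap.length > 0) {
		/*
		 * The filesystem supplied a source mapping: read the old
		 * data from srcmap before writing the new data via iomap.
		 */
	}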

* Re: [PATCH v3 0/15] Btrfs iomap
  2019-09-02 16:43 ` [PATCH v3 0/15] Btrfs iomap Christoph Hellwig
@ 2019-09-03  3:51   ` Shiyang Ruan
  2019-09-03  4:29     ` Darrick J. Wong
  2019-09-03  6:21     ` Christoph Hellwig
  2019-09-03 13:44   ` Goldwyn Rodrigues
  1 sibling, 2 replies; 36+ messages in thread
From: Shiyang Ruan @ 2019-09-03  3:51 UTC (permalink / raw)
  To: Christoph Hellwig, Goldwyn Rodrigues
  Cc: linux-fsdevel, linux-btrfs, darrick.wong, david, riteshh



On 9/3/19 12:43 AM, Christoph Hellwig wrote:
> On Sun, Sep 01, 2019 at 03:08:21PM -0500, Goldwyn Rodrigues wrote:
>> This is an effort to use iomap for btrfs. This would keep most
>> responsibility of page handling during writes in iomap code, hence
>> code reduction. For CoW support, changes are needed in iomap code
>> to make sure we perform a copy before the write.
>> This is in line with the discussion we had during adding dax support in
>> btrfs.
> 
> This looks pretty good modulo a few comments.
> 
> Can you please convert the XFS code to use your two iomaps for COW
> approach as well to validate it?

Hi,

The XFS part of dax CoW support has been implemented recently.  Please
review this[1] if necessary.  It's based on this iomap patchset (the 1st
version), and uses the new srcmap.

[1]: https://lkml.org/lkml/2019/7/31/449

-- 
Thanks,
Shiyang Ruan.
> 
> Also the iomap_file_dirty helper would really benefit from using the
> two iomaps, any chance you could look into improving it to use your
> new infrastructure?
> 
> 




^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v3 0/15] Btrfs iomap
  2019-09-03  3:51   ` Shiyang Ruan
@ 2019-09-03  4:29     ` Darrick J. Wong
  2019-09-03  5:00       ` Shiyang Ruan
  2019-09-03  6:21     ` Christoph Hellwig
  1 sibling, 1 reply; 36+ messages in thread
From: Darrick J. Wong @ 2019-09-03  4:29 UTC (permalink / raw)
  To: Shiyang Ruan
  Cc: Christoph Hellwig, Goldwyn Rodrigues, linux-fsdevel, linux-btrfs,
	david, riteshh

On Tue, Sep 03, 2019 at 11:51:24AM +0800, Shiyang Ruan wrote:
> 
> 
> On 9/3/19 12:43 AM, Christoph Hellwig wrote:
> > On Sun, Sep 01, 2019 at 03:08:21PM -0500, Goldwyn Rodrigues wrote:
> > > This is an effort to use iomap for btrfs. This would keep most
> > > responsibility of page handling during writes in iomap code, hence
> > > code reduction. For CoW support, changes are needed in iomap code
> > > to make sure we perform a copy before the write.
> > > This is in line with the discussion we had during adding dax support in
> > > btrfs.
> > 
> > This looks pretty good modulo a few comments.
> > 
> > Can you please convert the XFS code to use your two iomaps for COW
> > approach as well to validate it?
> 
> Hi,
> 
> The XFS part of dax CoW support has been implementing recently.  Please
> review this[1] if necessary.  It's based on this iomap patchset(the 1st
> version), and uses the new srcmap.
> 
> [1]: https://lkml.org/lkml/2019/7/31/449

It sure would be nice to have (a) this patchset of Goldwyn's cleaned up
a bit per the review comments and (b) the XFS DAX COW series rebased on
that (instead of a month-old submission). ;)

--D

> -- 
> Thanks,
> Shiyang Ruan.
> > 
> > Also the iomap_file_dirty helper would really benefit from using the
> > two iomaps, any chance you could look into improving it to use your
> > new infrastructure?
> > 
> > 
> 
> 
> 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v3 0/15] Btrfs iomap
  2019-09-03  4:29     ` Darrick J. Wong
@ 2019-09-03  5:00       ` Shiyang Ruan
  0 siblings, 0 replies; 36+ messages in thread
From: Shiyang Ruan @ 2019-09-03  5:00 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Christoph Hellwig, Goldwyn Rodrigues, linux-fsdevel, linux-btrfs,
	david, riteshh



On 9/3/19 12:29 PM, Darrick J. Wong wrote:
> On Tue, Sep 03, 2019 at 11:51:24AM +0800, Shiyang Ruan wrote:
>>
>>
>> On 9/3/19 12:43 AM, Christoph Hellwig wrote:
>>> On Sun, Sep 01, 2019 at 03:08:21PM -0500, Goldwyn Rodrigues wrote:
>>>> This is an effort to use iomap for btrfs. This would keep most
>>>> responsibility of page handling during writes in iomap code, hence
>>>> code reduction. For CoW support, changes are needed in iomap code
>>>> to make sure we perform a copy before the write.
>>>> This is in line with the discussion we had during adding dax support in
>>>> btrfs.
>>>
>>> This looks pretty good modulo a few comments.
>>>
>>> Can you please convert the XFS code to use your two iomaps for COW
>>> approach as well to validate it?
>>
>> Hi,
>>
>> The XFS part of dax CoW support has been implementing recently.  Please
>> review this[1] if necessary.  It's based on this iomap patchset(the 1st
>> version), and uses the new srcmap.
>>
>> [1]: https://lkml.org/lkml/2019/7/31/449
> 
> It sure would be nice to have (a) this patchset of Goldwyn's cleaned up
> a bit per the review comments and (b) the XFS DAX COW series rebased on
> that (instead of a month-old submission). ;)

Yes, I think so too.  I mentioned it just in case Christoph had not noticed it.

-- 
Thanks,
Shiyang Ruan.
> 
> --D
> 
>> -- 
>> Thanks,
>> Shiyang Ruan.
>>>
>>> Also the iomap_file_dirty helper would really benefit from using the
>>> two iomaps, any chance you could look into improving it to use your
>>> new infrastructure?
>>>
>>>
>>
>>
>>
> 
> 




^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v3 0/15] Btrfs iomap
  2019-09-03  3:51   ` Shiyang Ruan
  2019-09-03  4:29     ` Darrick J. Wong
@ 2019-09-03  6:21     ` Christoph Hellwig
  1 sibling, 0 replies; 36+ messages in thread
From: Christoph Hellwig @ 2019-09-03  6:21 UTC (permalink / raw)
  To: Shiyang Ruan
  Cc: Christoph Hellwig, Goldwyn Rodrigues, linux-fsdevel, linux-btrfs,
	darrick.wong, david, riteshh

On Tue, Sep 03, 2019 at 11:51:24AM +0800, Shiyang Ruan wrote:
> The XFS part of dax CoW support has been implementing recently.  Please 
> review this[1] if necessary.  It's based on this iomap patchset(the 1st 
> version), and uses the new srcmap.
>
> [1]: https://lkml.org/lkml/2019/7/31/449

For the initial stage I'm not interested in DAX reflink support yet
(although that is also very interesting); I just want to make sure the
two-iomap support is also used by the buffered write path for XFS,
removing the current hack where iomap_begin reports the data fork
mapping only and relies on ->writepage to handle that case.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v3 0/15] Btrfs iomap
  2019-09-02 16:43 ` [PATCH v3 0/15] Btrfs iomap Christoph Hellwig
  2019-09-03  3:51   ` Shiyang Ruan
@ 2019-09-03 13:44   ` Goldwyn Rodrigues
  1 sibling, 0 replies; 36+ messages in thread
From: Goldwyn Rodrigues @ 2019-09-03 13:44 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-fsdevel, linux-btrfs, darrick.wong, david, riteshh

On 18:43 02/09, Christoph Hellwig wrote:
> On Sun, Sep 01, 2019 at 03:08:21PM -0500, Goldwyn Rodrigues wrote:
> > This is an effort to use iomap for btrfs. This would keep most
> > responsibility of page handling during writes in iomap code, hence
> > code reduction. For CoW support, changes are needed in iomap code
> > to make sure we perform a copy before the write.
> > This is in line with the discussion we had during adding dax support in
> > btrfs.
> 
> This looks pretty good modulo a few comments.
> 
> Can you please convert the XFS code to use your two iomaps for COW
> approach as well to validate it?
> 
> Also the iomap_file_dirty helper would really benefit from using the
> two iomaps, any chance you could look into improving it to use your
> new infrastructure?

Yes. I haven't looked at much of the XFS code yet, but from an initial
look it seems simple to read.

-- 
Goldwyn

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 02/15] iomap: Use a IOMAP_COW/srcmap for a read-modify-write I/O
  2019-09-02 16:31   ` Christoph Hellwig
  2019-09-03  3:18     ` Darrick J. Wong
@ 2019-09-03 14:05     ` Goldwyn Rodrigues
  2019-09-03 14:56       ` Darrick J. Wong
  1 sibling, 1 reply; 36+ messages in thread
From: Goldwyn Rodrigues @ 2019-09-03 14:05 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-fsdevel, linux-btrfs, darrick.wong, david, riteshh,
	Goldwyn Rodrigues

On 18:31 02/09, Christoph Hellwig wrote:
> On Sun, Sep 01, 2019 at 03:08:23PM -0500, Goldwyn Rodrigues wrote:
> > --- a/include/linux/iomap.h
> > +++ b/include/linux/iomap.h
> > @@ -37,6 +37,7 @@ struct vm_fault;
> >  #define IOMAP_MAPPED	0x03	/* blocks allocated at @addr */
> >  #define IOMAP_UNWRITTEN	0x04	/* blocks allocated at @addr in unwritten state */
> >  #define IOMAP_INLINE	0x05	/* data inline in the inode */
> > +#define IOMAP_COW	0x06	/* copy data from srcmap before writing */
> 
> I don't think IOMAP_COW can be a type - it is a flag given that we
> can do COW operations that allocate normal written extents (e.g. for
> direct I/O or DAX) and for delayed allocations.
> 

Ah.. we have come full circle on this one: from a flag, to a type,
and now back to a flag. Personally, I like COW to be a flag, because we are
doing a write, just with extra steps, which is what a flag should express.
Given the previous objections, using two iomaps should help the cause, and
we need not worry about bloating.


-- 
Goldwyn

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 02/15] iomap: Use a IOMAP_COW/srcmap for a read-modify-write I/O
  2019-09-03  3:18     ` Darrick J. Wong
@ 2019-09-03 14:12       ` Goldwyn Rodrigues
  0 siblings, 0 replies; 36+ messages in thread
From: Goldwyn Rodrigues @ 2019-09-03 14:12 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Christoph Hellwig, linux-fsdevel, linux-btrfs, david, riteshh,
	Goldwyn Rodrigues

On 20:18 02/09, Darrick J. Wong wrote:
> On Mon, Sep 02, 2019 at 06:31:04PM +0200, Christoph Hellwig wrote:
> > On Sun, Sep 01, 2019 at 03:08:23PM -0500, Goldwyn Rodrigues wrote:
> > > --- a/include/linux/iomap.h
> > > +++ b/include/linux/iomap.h
> > > @@ -37,6 +37,7 @@ struct vm_fault;
> > >  #define IOMAP_MAPPED	0x03	/* blocks allocated at @addr */
> > >  #define IOMAP_UNWRITTEN	0x04	/* blocks allocated at @addr in unwritten state */
> > >  #define IOMAP_INLINE	0x05	/* data inline in the inode */
> > > +#define IOMAP_COW	0x06	/* copy data from srcmap before writing */
> > 
> > I don't think IOMAP_COW can be a type - it is a flag given that we
> > can do COW operations that allocate normal written extents (e.g. for
> > direct I/O or DAX) and for delayed allocations.
> 
> If iomap_apply always zeros out @srcmap before calling ->iomap_begin, do
> we even need a flag/type code?  Or does it suffice to check that
> srcmap.length > 0 and use it appropriately?
> 

While I understand your idea, it would be more robust to use the flag.
It keeps things clean and scalable for other uses of the two iomaps (like
extent comparisons).

-- 
Goldwyn

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 02/15] iomap: Use a IOMAP_COW/srcmap for a read-modify-write I/O
  2019-09-03 14:05     ` Goldwyn Rodrigues
@ 2019-09-03 14:56       ` Darrick J. Wong
  0 siblings, 0 replies; 36+ messages in thread
From: Darrick J. Wong @ 2019-09-03 14:56 UTC (permalink / raw)
  To: Goldwyn Rodrigues
  Cc: Christoph Hellwig, linux-fsdevel, linux-btrfs, david, riteshh,
	Goldwyn Rodrigues

On Tue, Sep 03, 2019 at 09:05:36AM -0500, Goldwyn Rodrigues wrote:
> On 18:31 02/09, Christoph Hellwig wrote:
> > On Sun, Sep 01, 2019 at 03:08:23PM -0500, Goldwyn Rodrigues wrote:
> > > --- a/include/linux/iomap.h
> > > +++ b/include/linux/iomap.h
> > > @@ -37,6 +37,7 @@ struct vm_fault;
> > >  #define IOMAP_MAPPED	0x03	/* blocks allocated at @addr */
> > >  #define IOMAP_UNWRITTEN	0x04	/* blocks allocated at @addr in unwritten state */
> > >  #define IOMAP_INLINE	0x05	/* data inline in the inode */
> > > +#define IOMAP_COW	0x06	/* copy data from srcmap before writing */
> > 
> > I don't think IOMAP_COW can be a type - it is a flag given that we
> > can do COW operations that allocate normal written extents (e.g. for
> > direct I/O or DAX) and for delayed allocations.
> > 
> 
> Ah.. we have come a full circle on this one. From going to a flag, to a type,
> and now back to flag. Personally, I like COW to be a flag, because we are
> doing a write, just doining extra steps which should be a flag.
> From previous objections, using two iomaps should help the cause and we
> can not worry about bloating.

Heh, ok, let's do a cow flag.  Thank you for driving the consensus. :)

(Sorry it took so long while we went around in circles.)

Also, I'm going on vacation starting Friday at noon PDT, so please have
the three patches that touch fs/iomap/ in before 7am Friday.  (Assuming
you're not making drastic changes to your iomap changes, they've tested
ok so far.)

--D

> 
> -- 
> Goldwyn

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 01/15] iomap: Introduce CONFIG_FS_IOMAP_DEBUG
  2019-09-02 17:18       ` Christoph Hellwig
@ 2019-09-03 15:06         ` Darrick J. Wong
  0 siblings, 0 replies; 36+ messages in thread
From: Darrick J. Wong @ 2019-09-03 15:06 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Goldwyn Rodrigues, linux-fsdevel, linux-btrfs, david, riteshh,
	Goldwyn Rodrigues

On Mon, Sep 02, 2019 at 07:18:00PM +0200, Christoph Hellwig wrote:
> On Mon, Sep 02, 2019 at 10:09:16AM -0700, Darrick J. Wong wrote:
> > On Mon, Sep 02, 2019 at 06:29:34PM +0200, Christoph Hellwig wrote:
> > > On Sun, Sep 01, 2019 at 03:08:22PM -0500, Goldwyn Rodrigues wrote:
> > > > From: Goldwyn Rodrigues <rgoldwyn@suse.com>
> > > > 
> > > > To improve debugging abilities, especially invalid option
> > > > asserts.
> > > 
> > > Looking at the code I'd much rather have unconditional WARN_ON_ONCE
> > > statements in most places.  Including returning an error when we see
> > > something invalid in most cases.
> > 
> > Yeah, I was thinking something like this, which has the advantage that
> > the report format is familiar to XFS developers and will get picked up
> > by the automated error collection stuff I put in xfstests to complain
> > about any XFS assertion failures:
> > 
> > iomap: Introduce CONFIG_FS_IOMAP_DEBUG
> > 
> > To improve debugging abilities, especially invalid option
> > asserts.
> 
> I'd actually just rather have more unconditional WARN_ON_ONCE calls,
> including actually recovering from the situation by return an actual
> error code.  That is more
> 
> 	if (WARN_ON_ONCE(some_impossible_condition))
> 		return -EIO;

Oh, right, WARNings actually do spit out the file and line number.
Let's do that. :)

--D

^ permalink raw reply	[flat|nested] 36+ messages in thread

end of thread, other threads:[~2019-09-03 15:06 UTC | newest]

Thread overview: 36+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-01 20:08 [PATCH v3 0/15] Btrfs iomap Goldwyn Rodrigues
2019-09-01 20:08 ` [PATCH 01/15] iomap: Introduce CONFIG_FS_IOMAP_DEBUG Goldwyn Rodrigues
2019-09-01 20:50   ` Darrick J. Wong
2019-09-02 16:29   ` Christoph Hellwig
2019-09-02 17:09     ` Darrick J. Wong
2019-09-02 17:18       ` Christoph Hellwig
2019-09-03 15:06         ` Darrick J. Wong
2019-09-01 20:08 ` [PATCH 02/15] iomap: Use a IOMAP_COW/srcmap for a read-modify-write I/O Goldwyn Rodrigues
2019-09-02 16:31   ` Christoph Hellwig
2019-09-03  3:18     ` Darrick J. Wong
2019-09-03 14:12       ` Goldwyn Rodrigues
2019-09-03 14:05     ` Goldwyn Rodrigues
2019-09-03 14:56       ` Darrick J. Wong
2019-09-01 20:08 ` [PATCH 03/15] iomap: Read page from srcmap for IOMAP_COW Goldwyn Rodrigues
2019-09-02 16:31   ` Christoph Hellwig
2019-09-02 17:01     ` Darrick J. Wong
2019-09-02 17:13       ` Christoph Hellwig
2019-09-01 20:08 ` [PATCH 04/15] btrfs: Eliminate PagePrivate for btrfs data pages Goldwyn Rodrigues
2019-09-01 20:08 ` [PATCH 05/15] btrfs: Add a simple buffered iomap write Goldwyn Rodrigues
2019-09-01 20:08 ` [PATCH 06/15] btrfs: Add CoW in iomap based writes Goldwyn Rodrigues
2019-09-01 20:08 ` [PATCH 07/15] btrfs: remove buffered write code made unnecessary Goldwyn Rodrigues
2019-09-01 20:08 ` [PATCH 08/15] fs: Export generic_file_buffered_read() Goldwyn Rodrigues
2019-09-01 20:08 ` [PATCH 09/15] btrfs: basic direct read operation Goldwyn Rodrigues
2019-09-01 20:08 ` [PATCH 10/15] btrfs: Carve out btrfs_get_extent_map_write() out of btrfs_get_blocks_write() Goldwyn Rodrigues
2019-09-01 20:08 ` [PATCH 11/15] btrfs: Rename __endio_write_update_ordered() to btrfs_update_ordered_extent() Goldwyn Rodrigues
2019-09-01 20:08 ` [PATCH 12/15] iomap: use a function pointer for dio submits Goldwyn Rodrigues
2019-09-02 16:41   ` Christoph Hellwig
2019-09-01 20:08 ` [PATCH 13/15] btrfs: Use iomap_dio_rw for performing direct I/O writes Goldwyn Rodrigues
2019-09-01 20:08 ` [PATCH 14/15] btrfs: Remove btrfs_dio_data and __btrfs_direct_write Goldwyn Rodrigues
2019-09-01 20:08 ` [PATCH 15/15] btrfs: update inode size during bio completion Goldwyn Rodrigues
2019-09-02 16:43 ` [PATCH v3 0/15] Btrfs iomap Christoph Hellwig
2019-09-03  3:51   ` Shiyang Ruan
2019-09-03  4:29     ` Darrick J. Wong
2019-09-03  5:00       ` Shiyang Ruan
2019-09-03  6:21     ` Christoph Hellwig
2019-09-03 13:44   ` Goldwyn Rodrigues

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).