All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andreas Gruenbacher <agruenba@redhat.com>
To: Gao Xiang <hsiangkao@linux.alibaba.com>
Cc: Andreas Gruenbacher <agruenba@redhat.com>,
	Christoph Hellwig <hch@lst.de>,
	"Darrick J . Wong" <djwong@kernel.org>,
	Matthew Wilcox <willy@infradead.org>,
	Huang Jianan <huangjianan@oppo.com>,
	linux-erofs@lists.ozlabs.org, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org,
	Andreas Gruenbacher <andreas.gruenbacher@gmail.com>
Subject: Re: [PATCH v7] iomap: make inline data support more flexible
Date: Mon, 26 Jul 2021 00:16:39 +0200	[thread overview]
Message-ID: <20210725221639.426565-1-agruenba@redhat.com> (raw)
In-Reply-To: <CAHpGcMJBhWcwteLDSBU3hgwq1tk_+LqogM1ZM=Fv8U0VtY5hMg@mail.gmail.com>

Here's a fixed and cleaned up version that passes fstests on gfs2.

I see no reason why the combination of tail packing + writing should
cause any issues, so in my opinion, the check that disables that
combination in iomap_write_begin_inline should still be removed.

It turns out that returning the number of bytes copied from
iomap_read_inline_data is a bit irritating: the function is really used
for filling the page, but that's not always the "progress" we're looking
for.  In the iomap_readpage case, we actually need to advance by an
antire page, but in the iomap_file_buffered_write case, we need to
advance by the length parameter of iomap_write_actor or less.  So I've
changed that back.

I've also renamed iomap_inline_buf to iomap_inline_data and I've turned
iomap_inline_data_size_valid into iomap_within_inline_data, which seems
more useful to me.

Thanks,
Andreas

--

Subject: [PATCH] iomap: Support tail packing

The existing inline data support only works for cases where the entire
file is stored as inline data.  For larger files, EROFS stores the
initial blocks separately and then can pack a small tail adjacent to the
inode.  Generalise inline data to allow for tail packing.  Tails may not
cross a page boundary in memory.

We currently have no filesystems that support tail packing and writing,
so that case is currently disabled (see iomap_write_begin_inline).  I'm
not aware of any reason why this code path shouldn't work, however.

Cc: Christoph Hellwig <hch@lst.de>
Cc: Darrick J. Wong <djwong@kernel.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Andreas Gruenbacher <andreas.gruenbacher@gmail.com>
Tested-by: Huang Jianan <huangjianan@oppo.com> # erofs
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
---
 fs/iomap/buffered-io.c | 34 +++++++++++++++++++++++-----------
 fs/iomap/direct-io.c   | 11 ++++++-----
 include/linux/iomap.h  | 22 +++++++++++++++++++++-
 3 files changed, 50 insertions(+), 17 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 87ccb3438bec..334bf98fdd4a 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -205,25 +205,29 @@ struct iomap_readpage_ctx {
 	struct readahead_control *rac;
 };
 
-static void
-iomap_read_inline_data(struct inode *inode, struct page *page,
+static int iomap_read_inline_data(struct inode *inode, struct page *page,
 		struct iomap *iomap)
 {
-	size_t size = i_size_read(inode);
+	size_t size = i_size_read(inode) - iomap->offset;
 	void *addr;
 
 	if (PageUptodate(page))
-		return;
+		return 0;
 
-	BUG_ON(page_has_private(page));
-	BUG_ON(page->index);
-	BUG_ON(size > PAGE_SIZE - offset_in_page(iomap->inline_data));
+	/* inline and tail-packed data must start page aligned in the file */
+	if (WARN_ON_ONCE(offset_in_page(iomap->offset)))
+		return -EIO;
+	if (WARN_ON_ONCE(size > PAGE_SIZE - offset_in_page(iomap->inline_data)))
+		return -EIO;
+	if (WARN_ON_ONCE(page_has_private(page)))
+		return -EIO;
 
 	addr = kmap_atomic(page);
 	memcpy(addr, iomap->inline_data, size);
 	memset(addr + size, 0, PAGE_SIZE - size);
 	kunmap_atomic(addr);
 	SetPageUptodate(page);
+	return 0;
 }
 
 static inline bool iomap_block_needs_zeroing(struct inode *inode,
@@ -247,7 +251,6 @@ iomap_readpage_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
 	sector_t sector;
 
 	if (iomap->type == IOMAP_INLINE) {
-		WARN_ON_ONCE(pos);
 		iomap_read_inline_data(inode, page, iomap);
 		return PAGE_SIZE;
 	}
@@ -589,6 +592,15 @@ __iomap_write_begin(struct inode *inode, loff_t pos, unsigned len, int flags,
 	return 0;
 }
 
+static int iomap_write_begin_inline(struct inode *inode,
+		struct page *page, struct iomap *srcmap)
+{
+	/* needs more work for the tailpacking case, disable for now */
+	if (WARN_ON_ONCE(srcmap->offset != 0))
+		return -EIO;
+	return iomap_read_inline_data(inode, page, srcmap);
+}
+
 static int
 iomap_write_begin(struct inode *inode, loff_t pos, unsigned len, unsigned flags,
 		struct page **pagep, struct iomap *iomap, struct iomap *srcmap)
@@ -618,7 +630,7 @@ iomap_write_begin(struct inode *inode, loff_t pos, unsigned len, unsigned flags,
 	}
 
 	if (srcmap->type == IOMAP_INLINE)
-		iomap_read_inline_data(inode, page, srcmap);
+		status = iomap_write_begin_inline(inode, page, srcmap);
 	else if (iomap->flags & IOMAP_F_BUFFER_HEAD)
 		status = __block_write_begin_int(page, pos, len, NULL, srcmap);
 	else
@@ -671,11 +683,11 @@ static size_t iomap_write_end_inline(struct inode *inode, struct page *page,
 	void *addr;
 
 	WARN_ON_ONCE(!PageUptodate(page));
-	BUG_ON(pos + copied > PAGE_SIZE - offset_in_page(iomap->inline_data));
+	BUG_ON(!iomap_within_inline_data(iomap, pos + copied - 1));
 
 	flush_dcache_page(page);
 	addr = kmap_atomic(page);
-	memcpy(iomap->inline_data + pos, addr + pos, copied);
+	memcpy(iomap_inline_data(iomap, pos), addr + pos, copied);
 	kunmap_atomic(addr);
 
 	mark_inode_dirty(inode);
diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
index 9398b8c31323..c9424e58f613 100644
--- a/fs/iomap/direct-io.c
+++ b/fs/iomap/direct-io.c
@@ -380,21 +380,22 @@ iomap_dio_inline_actor(struct inode *inode, loff_t pos, loff_t length,
 	struct iov_iter *iter = dio->submit.iter;
 	size_t copied;
 
-	BUG_ON(pos + length > PAGE_SIZE - offset_in_page(iomap->inline_data));
+	if (WARN_ON_ONCE(!iomap_within_inline_data(iomap, pos + length - 1)))
+		return -EIO;
 
 	if (dio->flags & IOMAP_DIO_WRITE) {
-		loff_t size = inode->i_size;
+		loff_t size = iomap->offset + iomap->length;
 
 		if (pos > size)
-			memset(iomap->inline_data + size, 0, pos - size);
-		copied = copy_from_iter(iomap->inline_data + pos, length, iter);
+			memset(iomap_inline_data(iomap, size), 0, pos - size);
+		copied = copy_from_iter(iomap_inline_data(iomap, pos), length, iter);
 		if (copied) {
 			if (pos + copied > size)
 				i_size_write(inode, pos + copied);
 			mark_inode_dirty(inode);
 		}
 	} else {
-		copied = copy_to_iter(iomap->inline_data + pos, length, iter);
+		copied = copy_to_iter(iomap_inline_data(iomap, pos), length, iter);
 	}
 	dio->size += copied;
 	return copied;
diff --git a/include/linux/iomap.h b/include/linux/iomap.h
index 479c1da3e221..c1b57d34cb76 100644
--- a/include/linux/iomap.h
+++ b/include/linux/iomap.h
@@ -28,7 +28,7 @@ struct vm_fault;
 #define IOMAP_DELALLOC	1	/* delayed allocation blocks */
 #define IOMAP_MAPPED	2	/* blocks allocated at @addr */
 #define IOMAP_UNWRITTEN	3	/* blocks allocated at @addr in unwritten state */
-#define IOMAP_INLINE	4	/* data inline in the inode */
+#define IOMAP_INLINE	4	/* inline or tail-packed data */
 
 /*
  * Flags reported by the file system from iomap_begin:
@@ -97,6 +97,26 @@ iomap_sector(struct iomap *iomap, loff_t pos)
 	return (iomap->addr + pos - iomap->offset) >> SECTOR_SHIFT;
 }
 
+/*
+ * Returns the inline data pointer for logical offset @pos.
+ */
+static void *iomap_inline_data(struct iomap *iomap, loff_t pos)
+{
+	return iomap->inline_data + pos - iomap->offset;
+}
+
+/*
+ * Check if logical offset @pos is within the valid range for inline data.
+ * This is used to guard against accessing data beyond the page inline_data
+ * points at.
+ */
+static bool iomap_within_inline_data(struct iomap *iomap, loff_t pos)
+{
+	unsigned int size = PAGE_SIZE - offset_in_page(iomap->inline_data);
+
+	return pos - iomap->offset < size;
+}
+
 /*
  * When a filesystem sets page_ops in an iomap mapping it returns, page_prepare
  * and page_done will be called for each page written to.  This only applies to
-- 
2.26.3


WARNING: multiple messages have this Message-ID (diff)
From: Andreas Gruenbacher <agruenba@redhat.com>
To: Gao Xiang <hsiangkao@linux.alibaba.com>
Cc: Andreas Gruenbacher <agruenba@redhat.com>,
	"Darrick J . Wong" <djwong@kernel.org>,
	Andreas Gruenbacher <andreas.gruenbacher@gmail.com>,
	linux-kernel@vger.kernel.org,
	Matthew Wilcox <willy@infradead.org>,
	linux-fsdevel@vger.kernel.org, linux-erofs@lists.ozlabs.org,
	Christoph Hellwig <hch@lst.de>
Subject: Re: [PATCH v7] iomap: make inline data support more flexible
Date: Mon, 26 Jul 2021 00:16:39 +0200	[thread overview]
Message-ID: <20210725221639.426565-1-agruenba@redhat.com> (raw)
In-Reply-To: <CAHpGcMJBhWcwteLDSBU3hgwq1tk_+LqogM1ZM=Fv8U0VtY5hMg@mail.gmail.com>

Here's a fixed and cleaned up version that passes fstests on gfs2.

I see no reason why the combination of tail packing + writing should
cause any issues, so in my opinion, the check that disables that
combination in iomap_write_begin_inline should still be removed.

It turns out that returning the number of bytes copied from
iomap_read_inline_data is a bit irritating: the function is really used
for filling the page, but that's not always the "progress" we're looking
for.  In the iomap_readpage case, we actually need to advance by an
antire page, but in the iomap_file_buffered_write case, we need to
advance by the length parameter of iomap_write_actor or less.  So I've
changed that back.

I've also renamed iomap_inline_buf to iomap_inline_data and I've turned
iomap_inline_data_size_valid into iomap_within_inline_data, which seems
more useful to me.

Thanks,
Andreas

--

Subject: [PATCH] iomap: Support tail packing

The existing inline data support only works for cases where the entire
file is stored as inline data.  For larger files, EROFS stores the
initial blocks separately and then can pack a small tail adjacent to the
inode.  Generalise inline data to allow for tail packing.  Tails may not
cross a page boundary in memory.

We currently have no filesystems that support tail packing and writing,
so that case is currently disabled (see iomap_write_begin_inline).  I'm
not aware of any reason why this code path shouldn't work, however.

Cc: Christoph Hellwig <hch@lst.de>
Cc: Darrick J. Wong <djwong@kernel.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Andreas Gruenbacher <andreas.gruenbacher@gmail.com>
Tested-by: Huang Jianan <huangjianan@oppo.com> # erofs
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
---
 fs/iomap/buffered-io.c | 34 +++++++++++++++++++++++-----------
 fs/iomap/direct-io.c   | 11 ++++++-----
 include/linux/iomap.h  | 22 +++++++++++++++++++++-
 3 files changed, 50 insertions(+), 17 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 87ccb3438bec..334bf98fdd4a 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -205,25 +205,29 @@ struct iomap_readpage_ctx {
 	struct readahead_control *rac;
 };
 
-static void
-iomap_read_inline_data(struct inode *inode, struct page *page,
+static int iomap_read_inline_data(struct inode *inode, struct page *page,
 		struct iomap *iomap)
 {
-	size_t size = i_size_read(inode);
+	size_t size = i_size_read(inode) - iomap->offset;
 	void *addr;
 
 	if (PageUptodate(page))
-		return;
+		return 0;
 
-	BUG_ON(page_has_private(page));
-	BUG_ON(page->index);
-	BUG_ON(size > PAGE_SIZE - offset_in_page(iomap->inline_data));
+	/* inline and tail-packed data must start page aligned in the file */
+	if (WARN_ON_ONCE(offset_in_page(iomap->offset)))
+		return -EIO;
+	if (WARN_ON_ONCE(size > PAGE_SIZE - offset_in_page(iomap->inline_data)))
+		return -EIO;
+	if (WARN_ON_ONCE(page_has_private(page)))
+		return -EIO;
 
 	addr = kmap_atomic(page);
 	memcpy(addr, iomap->inline_data, size);
 	memset(addr + size, 0, PAGE_SIZE - size);
 	kunmap_atomic(addr);
 	SetPageUptodate(page);
+	return 0;
 }
 
 static inline bool iomap_block_needs_zeroing(struct inode *inode,
@@ -247,7 +251,6 @@ iomap_readpage_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
 	sector_t sector;
 
 	if (iomap->type == IOMAP_INLINE) {
-		WARN_ON_ONCE(pos);
 		iomap_read_inline_data(inode, page, iomap);
 		return PAGE_SIZE;
 	}
@@ -589,6 +592,15 @@ __iomap_write_begin(struct inode *inode, loff_t pos, unsigned len, int flags,
 	return 0;
 }
 
+static int iomap_write_begin_inline(struct inode *inode,
+		struct page *page, struct iomap *srcmap)
+{
+	/* needs more work for the tailpacking case, disable for now */
+	if (WARN_ON_ONCE(srcmap->offset != 0))
+		return -EIO;
+	return iomap_read_inline_data(inode, page, srcmap);
+}
+
 static int
 iomap_write_begin(struct inode *inode, loff_t pos, unsigned len, unsigned flags,
 		struct page **pagep, struct iomap *iomap, struct iomap *srcmap)
@@ -618,7 +630,7 @@ iomap_write_begin(struct inode *inode, loff_t pos, unsigned len, unsigned flags,
 	}
 
 	if (srcmap->type == IOMAP_INLINE)
-		iomap_read_inline_data(inode, page, srcmap);
+		status = iomap_write_begin_inline(inode, page, srcmap);
 	else if (iomap->flags & IOMAP_F_BUFFER_HEAD)
 		status = __block_write_begin_int(page, pos, len, NULL, srcmap);
 	else
@@ -671,11 +683,11 @@ static size_t iomap_write_end_inline(struct inode *inode, struct page *page,
 	void *addr;
 
 	WARN_ON_ONCE(!PageUptodate(page));
-	BUG_ON(pos + copied > PAGE_SIZE - offset_in_page(iomap->inline_data));
+	BUG_ON(!iomap_within_inline_data(iomap, pos + copied - 1));
 
 	flush_dcache_page(page);
 	addr = kmap_atomic(page);
-	memcpy(iomap->inline_data + pos, addr + pos, copied);
+	memcpy(iomap_inline_data(iomap, pos), addr + pos, copied);
 	kunmap_atomic(addr);
 
 	mark_inode_dirty(inode);
diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
index 9398b8c31323..c9424e58f613 100644
--- a/fs/iomap/direct-io.c
+++ b/fs/iomap/direct-io.c
@@ -380,21 +380,22 @@ iomap_dio_inline_actor(struct inode *inode, loff_t pos, loff_t length,
 	struct iov_iter *iter = dio->submit.iter;
 	size_t copied;
 
-	BUG_ON(pos + length > PAGE_SIZE - offset_in_page(iomap->inline_data));
+	if (WARN_ON_ONCE(!iomap_within_inline_data(iomap, pos + length - 1)))
+		return -EIO;
 
 	if (dio->flags & IOMAP_DIO_WRITE) {
-		loff_t size = inode->i_size;
+		loff_t size = iomap->offset + iomap->length;
 
 		if (pos > size)
-			memset(iomap->inline_data + size, 0, pos - size);
-		copied = copy_from_iter(iomap->inline_data + pos, length, iter);
+			memset(iomap_inline_data(iomap, size), 0, pos - size);
+		copied = copy_from_iter(iomap_inline_data(iomap, pos), length, iter);
 		if (copied) {
 			if (pos + copied > size)
 				i_size_write(inode, pos + copied);
 			mark_inode_dirty(inode);
 		}
 	} else {
-		copied = copy_to_iter(iomap->inline_data + pos, length, iter);
+		copied = copy_to_iter(iomap_inline_data(iomap, pos), length, iter);
 	}
 	dio->size += copied;
 	return copied;
diff --git a/include/linux/iomap.h b/include/linux/iomap.h
index 479c1da3e221..c1b57d34cb76 100644
--- a/include/linux/iomap.h
+++ b/include/linux/iomap.h
@@ -28,7 +28,7 @@ struct vm_fault;
 #define IOMAP_DELALLOC	1	/* delayed allocation blocks */
 #define IOMAP_MAPPED	2	/* blocks allocated at @addr */
 #define IOMAP_UNWRITTEN	3	/* blocks allocated at @addr in unwritten state */
-#define IOMAP_INLINE	4	/* data inline in the inode */
+#define IOMAP_INLINE	4	/* inline or tail-packed data */
 
 /*
  * Flags reported by the file system from iomap_begin:
@@ -97,6 +97,26 @@ iomap_sector(struct iomap *iomap, loff_t pos)
 	return (iomap->addr + pos - iomap->offset) >> SECTOR_SHIFT;
 }
 
+/*
+ * Returns the inline data pointer for logical offset @pos.
+ */
+static void *iomap_inline_data(struct iomap *iomap, loff_t pos)
+{
+	return iomap->inline_data + pos - iomap->offset;
+}
+
+/*
+ * Check if logical offset @pos is within the valid range for inline data.
+ * This is used to guard against accessing data beyond the page inline_data
+ * points at.
+ */
+static bool iomap_within_inline_data(struct iomap *iomap, loff_t pos)
+{
+	unsigned int size = PAGE_SIZE - offset_in_page(iomap->inline_data);
+
+	return pos - iomap->offset < size;
+}
+
 /*
  * When a filesystem sets page_ops in an iomap mapping it returns, page_prepare
  * and page_done will be called for each page written to.  This only applies to
-- 
2.26.3


  parent reply	other threads:[~2021-07-25 22:16 UTC|newest]

Thread overview: 59+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-23 17:41 [PATCH v7] iomap: make inline data support more flexible Gao Xiang
2021-07-23 17:41 ` Gao Xiang
2021-07-23 19:40 ` Matthew Wilcox
2021-07-23 19:40   ` Matthew Wilcox
2021-07-24  0:54   ` Gao Xiang
2021-07-24  0:54     ` Gao Xiang
2021-07-25 21:39 ` Andreas Grünbacher
2021-07-25 21:39   ` Andreas Grünbacher
2021-07-25 22:16 ` Andreas Gruenbacher [this message]
2021-07-25 22:16   ` Andreas Gruenbacher
2021-07-26  2:36   ` Gao Xiang
2021-07-26  2:36     ` Gao Xiang
2021-07-26  7:22     ` Andreas Gruenbacher
2021-07-26  7:22       ` Andreas Gruenbacher
2021-07-26  7:38       ` Gao Xiang
2021-07-26  7:38         ` Gao Xiang
2021-07-26 21:36       ` Darrick J. Wong
2021-07-26 21:36         ` Darrick J. Wong
2021-07-26 22:20         ` Andreas Grünbacher
2021-07-26 22:20           ` Andreas Grünbacher
2021-07-26  3:06   ` Matthew Wilcox
2021-07-26  3:06     ` Matthew Wilcox
2021-07-26  6:56     ` Andreas Gruenbacher
2021-07-26  6:56       ` Andreas Gruenbacher
2021-07-26  4:00   ` Gao Xiang
2021-07-26  4:00     ` Gao Xiang
2021-07-26  8:08     ` Andreas Grünbacher
2021-07-26  8:08       ` Andreas Grünbacher
2021-07-26  8:17       ` Gao Xiang
2021-07-26  8:17         ` Gao Xiang
2021-07-26 11:06     ` Andreas Gruenbacher
2021-07-26 11:06       ` Andreas Gruenbacher
2021-07-26 12:17       ` Christoph Hellwig
2021-07-26 12:17         ` Christoph Hellwig
2021-07-26 12:27         ` Andreas Grünbacher
2021-07-26 12:27           ` Andreas Grünbacher
2021-07-26 12:50           ` Gao Xiang
2021-07-26 12:50             ` Gao Xiang
2021-07-26 13:10             ` Andreas Gruenbacher
2021-07-26 13:27           ` Christoph Hellwig
2021-07-26 13:27             ` Christoph Hellwig
2021-07-27  8:20         ` David Sterba
2021-07-27  8:20           ` David Sterba
2021-07-27 13:35           ` Matthew Wilcox
2021-07-27 15:04             ` Gao Xiang
2021-07-27 15:04               ` Gao Xiang
2021-07-27 16:53             ` David Sterba
2021-07-27 16:53               ` David Sterba
2021-07-26 12:32       ` Matthew Wilcox
2021-07-26 12:32         ` Matthew Wilcox
2021-07-26 13:03         ` Andreas Gruenbacher
2021-07-26 13:03           ` Andreas Gruenbacher
2021-07-26 13:12           ` Gao Xiang
2021-07-26 13:12             ` Gao Xiang
2021-07-26 13:30             ` Christoph Hellwig
2021-07-26  8:08 ` Joseph Qi
2021-07-26  8:08   ` Joseph Qi
2021-08-01 10:29 Andreas Gruenbacher
2021-08-01 10:29 ` Andreas Gruenbacher

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210725221639.426565-1-agruenba@redhat.com \
    --to=agruenba@redhat.com \
    --cc=andreas.gruenbacher@gmail.com \
    --cc=djwong@kernel.org \
    --cc=hch@lst.de \
    --cc=hsiangkao@linux.alibaba.com \
    --cc=huangjianan@oppo.com \
    --cc=linux-erofs@lists.ozlabs.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.