All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 5.10 v2 0/3] xfs stable patches for 5.10.y (from v5.15)
@ 2022-08-10 14:15 Amir Goldstein
  2022-08-10 14:15 ` [PATCH 5.10 v2 1/3] mm: Add kvrealloc() Amir Goldstein
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Amir Goldstein @ 2022-08-10 14:15 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Sasha Levin, Darrick J . Wong, Leah Rumancik, Chandan Babu R,
	Luis Chamberlain, Adam Manzanares, linux-xfs, stable

Hi Greg,

This backport series contains small fixes from v5.15 release.

From this point on, 5.10.y xfs can follow and pick changes
posted to 5.15.y.

I already have some debt of fixes from v5.17 already applied to
5.15.y, but not yet submitted to 5.10.y - those will be included
in my next batch.

Thanks,
Amir.

Changes from [v1]:
- Drop backport that disallows disabling of quota accounting
  on a mounted xfs (Darrick)
- Added Acked-by Darrick
- CC stable

[v1] https://lore.kernel.org/linux-xfs/20220809111708.92768-1-amir73il@gmail.com/

Darrick J. Wong (1):
  xfs: only set IOMAP_F_SHARED when providing a srcmap to a write

Dave Chinner (2):
  mm: Add kvrealloc()
  xfs: fix I_DONTCACHE

 fs/xfs/xfs_icache.c      |  3 ++-
 fs/xfs/xfs_iomap.c       |  8 ++++----
 fs/xfs/xfs_iops.c        |  2 +-
 fs/xfs/xfs_log_recover.c |  4 +++-
 include/linux/mm.h       |  2 ++
 mm/util.c                | 15 +++++++++++++++
 6 files changed, 27 insertions(+), 7 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH 5.10 v2 1/3] mm: Add kvrealloc()
  2022-08-10 14:15 [PATCH 5.10 v2 0/3] xfs stable patches for 5.10.y (from v5.15) Amir Goldstein
@ 2022-08-10 14:15 ` Amir Goldstein
  2022-08-10 14:15 ` [PATCH 5.10 v2 2/3] xfs: only set IOMAP_F_SHARED when providing a srcmap to a write Amir Goldstein
  2022-08-10 14:15 ` [PATCH 5.10 v2 3/3] xfs: fix I_DONTCACHE Amir Goldstein
  2 siblings, 0 replies; 4+ messages in thread
From: Amir Goldstein @ 2022-08-10 14:15 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Sasha Levin, Darrick J . Wong, Leah Rumancik, Chandan Babu R,
	Luis Chamberlain, Adam Manzanares, linux-xfs, stable,
	Dave Chinner, Mel Gorman

From: Dave Chinner <dchinner@redhat.com>

commit de2860f4636256836450c6543be744a50118fc66 upstream.

During log recovery of an XFS filesystem with 64kB directory
buffers, rebuilding a buffer split across two log records results
in a memory allocation warning from krealloc like this:

xfs filesystem being mounted at /mnt/scratch supports timestamps until 2038 (0x7fffffff)
XFS (dm-0): Unmounting Filesystem
XFS (dm-0): Mounting V5 Filesystem
XFS (dm-0): Starting recovery (logdev: internal)
------------[ cut here ]------------
WARNING: CPU: 5 PID: 3435170 at mm/page_alloc.c:3539 get_page_from_freelist+0xdee/0xe40
.....
RIP: 0010:get_page_from_freelist+0xdee/0xe40
Call Trace:
 ? complete+0x3f/0x50
 __alloc_pages+0x16f/0x300
 alloc_pages+0x87/0x110
 kmalloc_order+0x2c/0x90
 kmalloc_order_trace+0x1d/0x90
 __kmalloc_track_caller+0x215/0x270
 ? xlog_recover_add_to_cont_trans+0x63/0x1f0
 krealloc+0x54/0xb0
 xlog_recover_add_to_cont_trans+0x63/0x1f0
 xlog_recovery_process_trans+0xc1/0xd0
 xlog_recover_process_ophdr+0x86/0x130
 xlog_recover_process_data+0x9f/0x160
 xlog_recover_process+0xa2/0x120
 xlog_do_recovery_pass+0x40b/0x7d0
 ? __irq_work_queue_local+0x4f/0x60
 ? irq_work_queue+0x3a/0x50
 xlog_do_log_recovery+0x70/0x150
 xlog_do_recover+0x38/0x1d0
 xlog_recover+0xd8/0x170
 xfs_log_mount+0x181/0x300
 xfs_mountfs+0x4a1/0x9b0
 xfs_fs_fill_super+0x3c0/0x7b0
 get_tree_bdev+0x171/0x270
 ? suffix_kstrtoint.constprop.0+0xf0/0xf0
 xfs_fs_get_tree+0x15/0x20
 vfs_get_tree+0x24/0xc0
 path_mount+0x2f5/0xaf0
 __x64_sys_mount+0x108/0x140
 do_syscall_64+0x3a/0x70
 entry_SYSCALL_64_after_hwframe+0x44/0xae

Essentially, we are taking a multi-order allocation from kmem_alloc()
(which has an open coded no fail, no warn loop) and then
reallocating it out to 64kB using krealloc(__GFP_NOFAIL) and that is
then triggering the above warning.

This is a regression caused by converting this code from an open
coded no fail/no warn reallocation loop to using __GFP_NOFAIL.

What we actually need here is kvrealloc(), so that if contiguous
page allocation fails we fall back to vmalloc() and we don't
get nasty warnings happening in XFS.

Fixes: 771915c4f688 ("xfs: remove kmem_realloc()")
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Acked-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/xfs_log_recover.c |  4 +++-
 include/linux/mm.h       |  2 ++
 mm/util.c                | 15 +++++++++++++++
 3 files changed, 20 insertions(+), 1 deletion(-)

diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index 69408782019e..e61f28ce3e44 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -2061,7 +2061,9 @@ xlog_recover_add_to_cont_trans(
 	old_ptr = item->ri_buf[item->ri_cnt-1].i_addr;
 	old_len = item->ri_buf[item->ri_cnt-1].i_len;
 
-	ptr = krealloc(old_ptr, len + old_len, GFP_KERNEL | __GFP_NOFAIL);
+	ptr = kvrealloc(old_ptr, old_len, len + old_len, GFP_KERNEL);
+	if (!ptr)
+		return -ENOMEM;
 	memcpy(&ptr[old_len], dp, len);
 	item->ri_buf[item->ri_cnt-1].i_len += len;
 	item->ri_buf[item->ri_cnt-1].i_addr = ptr;
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 5b4d88faf114..b8b677f47a8d 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -788,6 +788,8 @@ static inline void *kvcalloc(size_t n, size_t size, gfp_t flags)
 	return kvmalloc_array(n, size, flags | __GFP_ZERO);
 }
 
+extern void *kvrealloc(const void *p, size_t oldsize, size_t newsize,
+		gfp_t flags);
 extern void kvfree(const void *addr);
 extern void kvfree_sensitive(const void *addr, size_t len);
 
diff --git a/mm/util.c b/mm/util.c
index ba9643de689e..25bfda774f6f 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -661,6 +661,21 @@ void kvfree_sensitive(const void *addr, size_t len)
 }
 EXPORT_SYMBOL(kvfree_sensitive);
 
+void *kvrealloc(const void *p, size_t oldsize, size_t newsize, gfp_t flags)
+{
+	void *newp;
+
+	if (oldsize >= newsize)
+		return (void *)p;
+	newp = kvmalloc(newsize, flags);
+	if (!newp)
+		return NULL;
+	memcpy(newp, p, oldsize);
+	kvfree(p);
+	return newp;
+}
+EXPORT_SYMBOL(kvrealloc);
+
 static inline void *__page_rmapping(struct page *page)
 {
 	unsigned long mapping;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH 5.10 v2 2/3] xfs: only set IOMAP_F_SHARED when providing a srcmap to a write
  2022-08-10 14:15 [PATCH 5.10 v2 0/3] xfs stable patches for 5.10.y (from v5.15) Amir Goldstein
  2022-08-10 14:15 ` [PATCH 5.10 v2 1/3] mm: Add kvrealloc() Amir Goldstein
@ 2022-08-10 14:15 ` Amir Goldstein
  2022-08-10 14:15 ` [PATCH 5.10 v2 3/3] xfs: fix I_DONTCACHE Amir Goldstein
  2 siblings, 0 replies; 4+ messages in thread
From: Amir Goldstein @ 2022-08-10 14:15 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Sasha Levin, Darrick J . Wong, Leah Rumancik, Chandan Babu R,
	Luis Chamberlain, Adam Manzanares, linux-xfs, stable,
	Christoph Hellwig, Chandan Babu R

From: "Darrick J. Wong" <djwong@kernel.org>

commit 72a048c1056a72e37ea2ee34cc73d8c6d6cb4290 upstream.

While prototyping a free space defragmentation tool, I observed an
unexpected IO error while running a sequence of commands that can be
recreated by the following sequence of commands:

$ xfs_io -f -c "pwrite -S 0x58 -b 10m 0 10m" file1
$ cp --reflink=always file1 file2
$ punch-alternating -o 1 file2
$ xfs_io -c "funshare 0 10m" file2
fallocate: Input/output error

I then scraped this (abbreviated) stack trace from dmesg:

WARNING: CPU: 0 PID: 30788 at fs/iomap/buffered-io.c:577 iomap_write_begin+0x376/0x450
CPU: 0 PID: 30788 Comm: xfs_io Not tainted 5.14.0-rc6-xfsx #rc6 5ef57b62a900814b3e4d885c755e9014541c8732
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-1ubuntu1.1 04/01/2014
RIP: 0010:iomap_write_begin+0x376/0x450
RSP: 0018:ffffc90000c0fc20 EFLAGS: 00010297
RAX: 0000000000000001 RBX: ffffc90000c0fd10 RCX: 0000000000001000
RDX: ffffc90000c0fc54 RSI: 000000000000000c RDI: 000000000000000c
RBP: ffff888005d5dbd8 R08: 0000000000102000 R09: ffffc90000c0fc50
R10: 0000000000b00000 R11: 0000000000101000 R12: ffffea0000336c40
R13: 0000000000001000 R14: ffffc90000c0fd10 R15: 0000000000101000
FS:  00007f4b8f62fe40(0000) GS:ffff88803ec00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000056361c554108 CR3: 000000000524e004 CR4: 00000000001706f0
Call Trace:
 iomap_unshare_actor+0x95/0x140
 iomap_apply+0xfa/0x300
 iomap_file_unshare+0x44/0x60
 xfs_reflink_unshare+0x50/0x140 [xfs 61947ea9b3a73e79d747dbc1b90205e7987e4195]
 xfs_file_fallocate+0x27c/0x610 [xfs 61947ea9b3a73e79d747dbc1b90205e7987e4195]
 vfs_fallocate+0x133/0x330
 __x64_sys_fallocate+0x3e/0x70
 do_syscall_64+0x35/0x80
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f4b8f79140a

Looking at the iomap tracepoints, I saw this:

iomap_iter:           dev 8:64 ino 0x100 pos 0 length 0 flags WRITE|0x80 (0x81) ops xfs_buffered_write_iomap_ops caller iomap_file_unshare
iomap_iter_dstmap:    dev 8:64 ino 0x100 bdev 8:64 addr -1 offset 0 length 131072 type DELALLOC flags SHARED
iomap_iter_srcmap:    dev 8:64 ino 0x100 bdev 8:64 addr 147456 offset 0 length 4096 type MAPPED flags
iomap_iter:           dev 8:64 ino 0x100 pos 0 length 4096 flags WRITE|0x80 (0x81) ops xfs_buffered_write_iomap_ops caller iomap_file_unshare
iomap_iter_dstmap:    dev 8:64 ino 0x100 bdev 8:64 addr -1 offset 4096 length 4096 type DELALLOC flags SHARED
console:              WARNING: CPU: 0 PID: 30788 at fs/iomap/buffered-io.c:577 iomap_write_begin+0x376/0x450

The first time funshare calls ->iomap_begin, xfs sees that the first
block is shared and creates a 128k delalloc reservation in the COW fork.
The delalloc reservation is returned as dstmap, and the shared block is
returned as srcmap.  So far so good.

funshare calls ->iomap_begin to try the second block.  This time there's
no srcmap (punch-alternating punched it out!) but we still have the
delalloc reservation in the COW fork.  Therefore, we again return the
reservation as dstmap and the hole as srcmap.  iomap_unshare_iter
incorrectly tries to unshare the hole, which __iomap_write_begin rejects
because shared regions must be fully written and therefore cannot
require zeroing.

Therefore, change the buffered write iomap_begin function not to set
IOMAP_F_SHARED when there isn't a source mapping to read from for the
unsharing.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Acked-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/xfs_iomap.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index 74bc2beadc23..bd5a25f4952d 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -1062,11 +1062,11 @@ xfs_buffered_write_iomap_begin(
 		error = xfs_bmbt_to_iomap(ip, srcmap, &imap, 0);
 		if (error)
 			return error;
-	} else {
-		xfs_trim_extent(&cmap, offset_fsb,
-				imap.br_startoff - offset_fsb);
+		return xfs_bmbt_to_iomap(ip, iomap, &cmap, IOMAP_F_SHARED);
 	}
-	return xfs_bmbt_to_iomap(ip, iomap, &cmap, IOMAP_F_SHARED);
+
+	xfs_trim_extent(&cmap, offset_fsb, imap.br_startoff - offset_fsb);
+	return xfs_bmbt_to_iomap(ip, iomap, &cmap, 0);
 
 out_unlock:
 	xfs_iunlock(ip, XFS_ILOCK_EXCL);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH 5.10 v2 3/3] xfs: fix I_DONTCACHE
  2022-08-10 14:15 [PATCH 5.10 v2 0/3] xfs stable patches for 5.10.y (from v5.15) Amir Goldstein
  2022-08-10 14:15 ` [PATCH 5.10 v2 1/3] mm: Add kvrealloc() Amir Goldstein
  2022-08-10 14:15 ` [PATCH 5.10 v2 2/3] xfs: only set IOMAP_F_SHARED when providing a srcmap to a write Amir Goldstein
@ 2022-08-10 14:15 ` Amir Goldstein
  2 siblings, 0 replies; 4+ messages in thread
From: Amir Goldstein @ 2022-08-10 14:15 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Sasha Levin, Darrick J . Wong, Leah Rumancik, Chandan Babu R,
	Luis Chamberlain, Adam Manzanares, linux-xfs, stable,
	Dave Chinner

From: Dave Chinner <dchinner@redhat.com>

commit f38a032b165d812b0ba8378a5cd237c0888ff65f upstream.

Yup, the VFS hoist broke it, and nobody noticed. Bulkstat workloads
make it clear that it doesn't work as it should.

Fixes: dae2f8ed7992 ("fs: Lift XFS_IDONTCACHE to the VFS layer")
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Acked-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/xfs_icache.c | 3 ++-
 fs/xfs/xfs_iops.c   | 2 +-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
index deb99300d171..e69a08ed7de4 100644
--- a/fs/xfs/xfs_icache.c
+++ b/fs/xfs/xfs_icache.c
@@ -47,8 +47,9 @@ xfs_inode_alloc(
 		return NULL;
 	}
 
-	/* VFS doesn't initialise i_mode! */
+	/* VFS doesn't initialise i_mode or i_state! */
 	VFS_I(ip)->i_mode = 0;
+	VFS_I(ip)->i_state = 0;
 
 	XFS_STATS_INC(mp, vn_active);
 	ASSERT(atomic_read(&ip->i_pincount) == 0);
diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
index b7f7b31a77d5..6a3026e78a9b 100644
--- a/fs/xfs/xfs_iops.c
+++ b/fs/xfs/xfs_iops.c
@@ -1328,7 +1328,7 @@ xfs_setup_inode(
 	gfp_t			gfp_mask;
 
 	inode->i_ino = ip->i_ino;
-	inode->i_state = I_NEW;
+	inode->i_state |= I_NEW;
 
 	inode_sb_list_add(inode);
 	/* make the inode look hashed for the writeback code */
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2022-08-10 14:16 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-08-10 14:15 [PATCH 5.10 v2 0/3] xfs stable patches for 5.10.y (from v5.15) Amir Goldstein
2022-08-10 14:15 ` [PATCH 5.10 v2 1/3] mm: Add kvrealloc() Amir Goldstein
2022-08-10 14:15 ` [PATCH 5.10 v2 2/3] xfs: only set IOMAP_F_SHARED when providing a srcmap to a write Amir Goldstein
2022-08-10 14:15 ` [PATCH 5.10 v2 3/3] xfs: fix I_DONTCACHE Amir Goldstein

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.