* [PATCH 5.10 v2 0/3] xfs stable patches for 5.10.y (from v5.15)
@ 2022-08-10 14:15 Amir Goldstein
2022-08-10 14:15 ` [PATCH 5.10 v2 1/3] mm: Add kvrealloc() Amir Goldstein
` (2 more replies)
0 siblings, 3 replies; 4+ messages in thread
From: Amir Goldstein @ 2022-08-10 14:15 UTC (permalink / raw)
To: Greg Kroah-Hartman
Cc: Sasha Levin, Darrick J . Wong, Leah Rumancik, Chandan Babu R,
Luis Chamberlain, Adam Manzanares, linux-xfs, stable
Hi Greg,
This backport series contains small fixes from v5.15 release.
From this point on, 5.10.y xfs can follow and pick changes
posted to 5.15.y.
I already have some debt of fixes from v5.17 already applied to
5.15.y, but not yet submitted to 5.10.y - those will be included
in my next batch.
Thanks,
Amir.
Changes from [v1]:
- Drop backport that disallows disabling of quota accounting
on a mounted xfs (Darrick)
- Added Acked-by Darrick
- CC stable
[v1] https://lore.kernel.org/linux-xfs/20220809111708.92768-1-amir73il@gmail.com/
Darrick J. Wong (1):
xfs: only set IOMAP_F_SHARED when providing a srcmap to a write
Dave Chinner (2):
mm: Add kvrealloc()
xfs: fix I_DONTCACHE
fs/xfs/xfs_icache.c | 3 ++-
fs/xfs/xfs_iomap.c | 8 ++++----
fs/xfs/xfs_iops.c | 2 +-
fs/xfs/xfs_log_recover.c | 4 +++-
include/linux/mm.h | 2 ++
mm/util.c | 15 +++++++++++++++
6 files changed, 27 insertions(+), 7 deletions(-)
--
2.25.1
^ permalink raw reply [flat|nested] 4+ messages in thread
* [PATCH 5.10 v2 1/3] mm: Add kvrealloc()
2022-08-10 14:15 [PATCH 5.10 v2 0/3] xfs stable patches for 5.10.y (from v5.15) Amir Goldstein
@ 2022-08-10 14:15 ` Amir Goldstein
2022-08-10 14:15 ` [PATCH 5.10 v2 2/3] xfs: only set IOMAP_F_SHARED when providing a srcmap to a write Amir Goldstein
2022-08-10 14:15 ` [PATCH 5.10 v2 3/3] xfs: fix I_DONTCACHE Amir Goldstein
2 siblings, 0 replies; 4+ messages in thread
From: Amir Goldstein @ 2022-08-10 14:15 UTC (permalink / raw)
To: Greg Kroah-Hartman
Cc: Sasha Levin, Darrick J . Wong, Leah Rumancik, Chandan Babu R,
Luis Chamberlain, Adam Manzanares, linux-xfs, stable,
Dave Chinner, Mel Gorman
From: Dave Chinner <dchinner@redhat.com>
commit de2860f4636256836450c6543be744a50118fc66 upstream.
During log recovery of an XFS filesystem with 64kB directory
buffers, rebuilding a buffer split across two log records results
in a memory allocation warning from krealloc like this:
xfs filesystem being mounted at /mnt/scratch supports timestamps until 2038 (0x7fffffff)
XFS (dm-0): Unmounting Filesystem
XFS (dm-0): Mounting V5 Filesystem
XFS (dm-0): Starting recovery (logdev: internal)
------------[ cut here ]------------
WARNING: CPU: 5 PID: 3435170 at mm/page_alloc.c:3539 get_page_from_freelist+0xdee/0xe40
.....
RIP: 0010:get_page_from_freelist+0xdee/0xe40
Call Trace:
? complete+0x3f/0x50
__alloc_pages+0x16f/0x300
alloc_pages+0x87/0x110
kmalloc_order+0x2c/0x90
kmalloc_order_trace+0x1d/0x90
__kmalloc_track_caller+0x215/0x270
? xlog_recover_add_to_cont_trans+0x63/0x1f0
krealloc+0x54/0xb0
xlog_recover_add_to_cont_trans+0x63/0x1f0
xlog_recovery_process_trans+0xc1/0xd0
xlog_recover_process_ophdr+0x86/0x130
xlog_recover_process_data+0x9f/0x160
xlog_recover_process+0xa2/0x120
xlog_do_recovery_pass+0x40b/0x7d0
? __irq_work_queue_local+0x4f/0x60
? irq_work_queue+0x3a/0x50
xlog_do_log_recovery+0x70/0x150
xlog_do_recover+0x38/0x1d0
xlog_recover+0xd8/0x170
xfs_log_mount+0x181/0x300
xfs_mountfs+0x4a1/0x9b0
xfs_fs_fill_super+0x3c0/0x7b0
get_tree_bdev+0x171/0x270
? suffix_kstrtoint.constprop.0+0xf0/0xf0
xfs_fs_get_tree+0x15/0x20
vfs_get_tree+0x24/0xc0
path_mount+0x2f5/0xaf0
__x64_sys_mount+0x108/0x140
do_syscall_64+0x3a/0x70
entry_SYSCALL_64_after_hwframe+0x44/0xae
Essentially, we are taking a multi-order allocation from kmem_alloc()
(which has an open coded no fail, no warn loop) and then
reallocating it out to 64kB using krealloc(__GFP_NOFAIL) and that is
then triggering the above warning.
This is a regression caused by converting this code from an open
coded no fail/no warn reallocation loop to using __GFP_NOFAIL.
What we actually need here is kvrealloc(), so that if contiguous
page allocation fails we fall back to vmalloc() and we don't
get nasty warnings happening in XFS.
Fixes: 771915c4f688 ("xfs: remove kmem_realloc()")
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Acked-by: Darrick J. Wong <djwong@kernel.org>
---
fs/xfs/xfs_log_recover.c | 4 +++-
include/linux/mm.h | 2 ++
mm/util.c | 15 +++++++++++++++
3 files changed, 20 insertions(+), 1 deletion(-)
diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index 69408782019e..e61f28ce3e44 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -2061,7 +2061,9 @@ xlog_recover_add_to_cont_trans(
old_ptr = item->ri_buf[item->ri_cnt-1].i_addr;
old_len = item->ri_buf[item->ri_cnt-1].i_len;
- ptr = krealloc(old_ptr, len + old_len, GFP_KERNEL | __GFP_NOFAIL);
+ ptr = kvrealloc(old_ptr, old_len, len + old_len, GFP_KERNEL);
+ if (!ptr)
+ return -ENOMEM;
memcpy(&ptr[old_len], dp, len);
item->ri_buf[item->ri_cnt-1].i_len += len;
item->ri_buf[item->ri_cnt-1].i_addr = ptr;
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 5b4d88faf114..b8b677f47a8d 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -788,6 +788,8 @@ static inline void *kvcalloc(size_t n, size_t size, gfp_t flags)
return kvmalloc_array(n, size, flags | __GFP_ZERO);
}
+extern void *kvrealloc(const void *p, size_t oldsize, size_t newsize,
+ gfp_t flags);
extern void kvfree(const void *addr);
extern void kvfree_sensitive(const void *addr, size_t len);
diff --git a/mm/util.c b/mm/util.c
index ba9643de689e..25bfda774f6f 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -661,6 +661,21 @@ void kvfree_sensitive(const void *addr, size_t len)
}
EXPORT_SYMBOL(kvfree_sensitive);
+void *kvrealloc(const void *p, size_t oldsize, size_t newsize, gfp_t flags)
+{
+ void *newp;
+
+ if (oldsize >= newsize)
+ return (void *)p;
+ newp = kvmalloc(newsize, flags);
+ if (!newp)
+ return NULL;
+ memcpy(newp, p, oldsize);
+ kvfree(p);
+ return newp;
+}
+EXPORT_SYMBOL(kvrealloc);
+
static inline void *__page_rmapping(struct page *page)
{
unsigned long mapping;
--
2.25.1
^ permalink raw reply related [flat|nested] 4+ messages in thread
* [PATCH 5.10 v2 2/3] xfs: only set IOMAP_F_SHARED when providing a srcmap to a write
2022-08-10 14:15 [PATCH 5.10 v2 0/3] xfs stable patches for 5.10.y (from v5.15) Amir Goldstein
2022-08-10 14:15 ` [PATCH 5.10 v2 1/3] mm: Add kvrealloc() Amir Goldstein
@ 2022-08-10 14:15 ` Amir Goldstein
2022-08-10 14:15 ` [PATCH 5.10 v2 3/3] xfs: fix I_DONTCACHE Amir Goldstein
2 siblings, 0 replies; 4+ messages in thread
From: Amir Goldstein @ 2022-08-10 14:15 UTC (permalink / raw)
To: Greg Kroah-Hartman
Cc: Sasha Levin, Darrick J . Wong, Leah Rumancik, Chandan Babu R,
Luis Chamberlain, Adam Manzanares, linux-xfs, stable,
Christoph Hellwig, Chandan Babu R
From: "Darrick J. Wong" <djwong@kernel.org>
commit 72a048c1056a72e37ea2ee34cc73d8c6d6cb4290 upstream.
While prototyping a free space defragmentation tool, I observed an
unexpected IO error while running a sequence of commands that can be
recreated by the following sequence of commands:
$ xfs_io -f -c "pwrite -S 0x58 -b 10m 0 10m" file1
$ cp --reflink=always file1 file2
$ punch-alternating -o 1 file2
$ xfs_io -c "funshare 0 10m" file2
fallocate: Input/output error
I then scraped this (abbreviated) stack trace from dmesg:
WARNING: CPU: 0 PID: 30788 at fs/iomap/buffered-io.c:577 iomap_write_begin+0x376/0x450
CPU: 0 PID: 30788 Comm: xfs_io Not tainted 5.14.0-rc6-xfsx #rc6 5ef57b62a900814b3e4d885c755e9014541c8732
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-1ubuntu1.1 04/01/2014
RIP: 0010:iomap_write_begin+0x376/0x450
RSP: 0018:ffffc90000c0fc20 EFLAGS: 00010297
RAX: 0000000000000001 RBX: ffffc90000c0fd10 RCX: 0000000000001000
RDX: ffffc90000c0fc54 RSI: 000000000000000c RDI: 000000000000000c
RBP: ffff888005d5dbd8 R08: 0000000000102000 R09: ffffc90000c0fc50
R10: 0000000000b00000 R11: 0000000000101000 R12: ffffea0000336c40
R13: 0000000000001000 R14: ffffc90000c0fd10 R15: 0000000000101000
FS: 00007f4b8f62fe40(0000) GS:ffff88803ec00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000056361c554108 CR3: 000000000524e004 CR4: 00000000001706f0
Call Trace:
iomap_unshare_actor+0x95/0x140
iomap_apply+0xfa/0x300
iomap_file_unshare+0x44/0x60
xfs_reflink_unshare+0x50/0x140 [xfs 61947ea9b3a73e79d747dbc1b90205e7987e4195]
xfs_file_fallocate+0x27c/0x610 [xfs 61947ea9b3a73e79d747dbc1b90205e7987e4195]
vfs_fallocate+0x133/0x330
__x64_sys_fallocate+0x3e/0x70
do_syscall_64+0x35/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f4b8f79140a
Looking at the iomap tracepoints, I saw this:
iomap_iter: dev 8:64 ino 0x100 pos 0 length 0 flags WRITE|0x80 (0x81) ops xfs_buffered_write_iomap_ops caller iomap_file_unshare
iomap_iter_dstmap: dev 8:64 ino 0x100 bdev 8:64 addr -1 offset 0 length 131072 type DELALLOC flags SHARED
iomap_iter_srcmap: dev 8:64 ino 0x100 bdev 8:64 addr 147456 offset 0 length 4096 type MAPPED flags
iomap_iter: dev 8:64 ino 0x100 pos 0 length 4096 flags WRITE|0x80 (0x81) ops xfs_buffered_write_iomap_ops caller iomap_file_unshare
iomap_iter_dstmap: dev 8:64 ino 0x100 bdev 8:64 addr -1 offset 4096 length 4096 type DELALLOC flags SHARED
console: WARNING: CPU: 0 PID: 30788 at fs/iomap/buffered-io.c:577 iomap_write_begin+0x376/0x450
The first time funshare calls ->iomap_begin, xfs sees that the first
block is shared and creates a 128k delalloc reservation in the COW fork.
The delalloc reservation is returned as dstmap, and the shared block is
returned as srcmap. So far so good.
funshare calls ->iomap_begin to try the second block. This time there's
no srcmap (punch-alternating punched it out!) but we still have the
delalloc reservation in the COW fork. Therefore, we again return the
reservation as dstmap and the hole as srcmap. iomap_unshare_iter
incorrectly tries to unshare the hole, which __iomap_write_begin rejects
because shared regions must be fully written and therefore cannot
require zeroing.
Therefore, change the buffered write iomap_begin function not to set
IOMAP_F_SHARED when there isn't a source mapping to read from for the
unsharing.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Acked-by: Darrick J. Wong <djwong@kernel.org>
---
fs/xfs/xfs_iomap.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index 74bc2beadc23..bd5a25f4952d 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -1062,11 +1062,11 @@ xfs_buffered_write_iomap_begin(
error = xfs_bmbt_to_iomap(ip, srcmap, &imap, 0);
if (error)
return error;
- } else {
- xfs_trim_extent(&cmap, offset_fsb,
- imap.br_startoff - offset_fsb);
+ return xfs_bmbt_to_iomap(ip, iomap, &cmap, IOMAP_F_SHARED);
}
- return xfs_bmbt_to_iomap(ip, iomap, &cmap, IOMAP_F_SHARED);
+
+ xfs_trim_extent(&cmap, offset_fsb, imap.br_startoff - offset_fsb);
+ return xfs_bmbt_to_iomap(ip, iomap, &cmap, 0);
out_unlock:
xfs_iunlock(ip, XFS_ILOCK_EXCL);
--
2.25.1
^ permalink raw reply related [flat|nested] 4+ messages in thread
* [PATCH 5.10 v2 3/3] xfs: fix I_DONTCACHE
2022-08-10 14:15 [PATCH 5.10 v2 0/3] xfs stable patches for 5.10.y (from v5.15) Amir Goldstein
2022-08-10 14:15 ` [PATCH 5.10 v2 1/3] mm: Add kvrealloc() Amir Goldstein
2022-08-10 14:15 ` [PATCH 5.10 v2 2/3] xfs: only set IOMAP_F_SHARED when providing a srcmap to a write Amir Goldstein
@ 2022-08-10 14:15 ` Amir Goldstein
2 siblings, 0 replies; 4+ messages in thread
From: Amir Goldstein @ 2022-08-10 14:15 UTC (permalink / raw)
To: Greg Kroah-Hartman
Cc: Sasha Levin, Darrick J . Wong, Leah Rumancik, Chandan Babu R,
Luis Chamberlain, Adam Manzanares, linux-xfs, stable,
Dave Chinner
From: Dave Chinner <dchinner@redhat.com>
commit f38a032b165d812b0ba8378a5cd237c0888ff65f upstream.
Yup, the VFS hoist broke it, and nobody noticed. Bulkstat workloads
make it clear that it doesn't work as it should.
Fixes: dae2f8ed7992 ("fs: Lift XFS_IDONTCACHE to the VFS layer")
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Acked-by: Darrick J. Wong <djwong@kernel.org>
---
fs/xfs/xfs_icache.c | 3 ++-
fs/xfs/xfs_iops.c | 2 +-
2 files changed, 3 insertions(+), 2 deletions(-)
diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
index deb99300d171..e69a08ed7de4 100644
--- a/fs/xfs/xfs_icache.c
+++ b/fs/xfs/xfs_icache.c
@@ -47,8 +47,9 @@ xfs_inode_alloc(
return NULL;
}
- /* VFS doesn't initialise i_mode! */
+ /* VFS doesn't initialise i_mode or i_state! */
VFS_I(ip)->i_mode = 0;
+ VFS_I(ip)->i_state = 0;
XFS_STATS_INC(mp, vn_active);
ASSERT(atomic_read(&ip->i_pincount) == 0);
diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
index b7f7b31a77d5..6a3026e78a9b 100644
--- a/fs/xfs/xfs_iops.c
+++ b/fs/xfs/xfs_iops.c
@@ -1328,7 +1328,7 @@ xfs_setup_inode(
gfp_t gfp_mask;
inode->i_ino = ip->i_ino;
- inode->i_state = I_NEW;
+ inode->i_state |= I_NEW;
inode_sb_list_add(inode);
/* make the inode look hashed for the writeback code */
--
2.25.1
^ permalink raw reply related [flat|nested] 4+ messages in thread
end of thread, other threads:[~2022-08-10 14:16 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-08-10 14:15 [PATCH 5.10 v2 0/3] xfs stable patches for 5.10.y (from v5.15) Amir Goldstein
2022-08-10 14:15 ` [PATCH 5.10 v2 1/3] mm: Add kvrealloc() Amir Goldstein
2022-08-10 14:15 ` [PATCH 5.10 v2 2/3] xfs: only set IOMAP_F_SHARED when providing a srcmap to a write Amir Goldstein
2022-08-10 14:15 ` [PATCH 5.10 v2 3/3] xfs: fix I_DONTCACHE Amir Goldstein
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.