* [PATCH v11 0/4] xfs: avoid transaction reservation recursion
@ 2020-12-08 12:28 Yafang Shao
2020-12-08 12:28 ` [PATCH v11 1/4] mm: Add become_kswapd and restore_kswapd Yafang Shao
` (3 more replies)
0 siblings, 4 replies; 11+ messages in thread
From: Yafang Shao @ 2020-12-08 12:28 UTC (permalink / raw)
To: darrick.wong, willy, david, hch, mhocko, akpm, dhowells, jlayton
Cc: linux-fsdevel, linux-cachefs, linux-xfs, linux-mm, Yafang Shao
PF_FSTRANS, which was used to avoid transaction reservation recursion, was
dropped by commit 9070733b4efa ("xfs: abstract PF_FSTRANS to
PF_MEMALLOC_NOFS") and commit 7dea19f9ee63 ("mm: introduce
memalloc_nofs_{save,restore} API"), and replaced by PF_MEMALLOC_NOFS, which
is meant to avoid filesystem reclaim recursion.
As these two flags have different meanings, we should reintroduce the
PF_FSTRANS semantics. To avoid consuming another PF_* flag bit in
task_struct, we can reuse current->journal_info for this, per Willy. As the
transaction reservation recursion check is used by XFS only, we can
move the check into xfs_vm_writepage(s), per Dave.
Patches #1 and #2 switch to the memalloc_nofs_{save,restore} API.
Patch #1 is picked from Willy's patchset "Overhaul memalloc_no*" [1].
Patch #3 refactors the xfs_trans context, which is activated when the
xfs_trans is allocated and deactivated when the xfs_trans is freed.
Patch #4 implements the reuse of current->journal_info to
avoid transaction reservation recursion.
No obvious error occurred after running xfstests.
[1]. https://lore.kernel.org/linux-mm/20200625113122.7540-1-willy@infradead.org
v11:
- add the warning at the callsite of xfs_trans_context_active()
- improve the commit log of patch #2
v10:
- refactor the code, per Dave.
v9:
- rebase it on xfs tree.
- Darrick fixed an error that occurred in xfs/141
- run xfstests, and no obvious error occurred.
v8:
- check xfs_trans_context_active() in xfs_vm_writepage(s), per Dave.
v7:
- check fstrans recursion for XFS only, by introducing a new member in
struct writeback_control.
v6:
- add Michal's ack and comment in patch #1.
v5:
- pick one of Willy's patches
- introduce four new helpers, per Dave
v4:
- retitle from "xfs: introduce task->in_fstrans for transaction reservation
recursion protection"
- reuse current->journal_info, per Willy
Matthew Wilcox (Oracle) (1):
mm: Add become_kswapd and restore_kswapd
Yafang Shao (3):
xfs: use memalloc_nofs_{save,restore} in xfs transaction
xfs: refactor the usage around xfs_trans_context_{set,clear}
xfs: use current->journal_info to avoid transaction reservation
recursion
fs/iomap/buffered-io.c | 7 -------
fs/xfs/libxfs/xfs_btree.c | 14 ++++++++------
fs/xfs/xfs_aops.c | 21 +++++++++++++++++++--
fs/xfs/xfs_linux.h | 4 ----
fs/xfs/xfs_trans.c | 24 +++++++++++-------------
fs/xfs/xfs_trans.h | 34 ++++++++++++++++++++++++++++++++++
include/linux/sched/mm.h | 23 +++++++++++++++++++++++
mm/vmscan.c | 16 +---------------
8 files changed, 96 insertions(+), 47 deletions(-)
--
2.18.4
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH v11 1/4] mm: Add become_kswapd and restore_kswapd
2020-12-08 12:28 [PATCH v11 0/4] xfs: avoid transaction reservation recursion Yafang Shao
@ 2020-12-08 12:28 ` Yafang Shao
2020-12-08 12:28 ` [PATCH v11 2/4] xfs: use memalloc_nofs_{save,restore} in xfs transaction Yafang Shao
` (2 subsequent siblings)
3 siblings, 0 replies; 11+ messages in thread
From: Yafang Shao @ 2020-12-08 12:28 UTC (permalink / raw)
To: darrick.wong, willy, david, hch, mhocko, akpm, dhowells, jlayton
Cc: linux-fsdevel, linux-cachefs, linux-xfs, linux-mm, Michal Hocko,
Christoph Hellwig, Yafang Shao
From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Since XFS needs to pretend to be kswapd in some of its worker threads,
create methods to save & restore kswapd state. Don't bother restoring
kswapd state in kswapd -- the only time we reach this code is when we're
exiting and the task_struct is about to be destroyed anyway.
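The save/restore arithmetic of these two helpers can be checked in userspace. The following is a minimal sketch, not kernel code: `task_flags` stands in for `current->flags`, the PF_* bit values are illustrative rather than the kernel's, and `PF_OTHER` is a hypothetical unrelated flag added only to show that other bits are left alone.

```c
#include <assert.h>

/* Illustrative bit values only; the real PF_* constants are defined in
 * include/linux/sched.h and differ from these. */
#define PF_MEMALLOC	0x1UL
#define PF_SWAPWRITE	0x2UL
#define PF_KSWAPD	0x4UL
#define PF_OTHER	0x8UL	/* hypothetical unrelated flag */

#define KSWAPD_PF_FLAGS	(PF_MEMALLOC | PF_SWAPWRITE | PF_KSWAPD)

/* Userspace stand-in for current->flags. */
static unsigned long task_flags;

/* Set all kswapd flags and return the subset that was already set. */
static unsigned long become_kswapd(void)
{
	unsigned long old = task_flags & KSWAPD_PF_FLAGS;

	task_flags |= KSWAPD_PF_FLAGS;
	return old;
}

/* Clear only the kswapd flags that were NOT set before become_kswapd(). */
static void restore_kswapd(unsigned long old)
{
	task_flags &= ~(old ^ KSWAPD_PF_FLAGS);
}
```

Since `old` is always a subset of KSWAPD_PF_FLAGS, `old ^ KSWAPD_PF_FLAGS` is exactly the set of bits that become_kswapd() newly turned on, so a flag that was already set (for example PF_MEMALLOC in a nested context) survives the restore.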
Cc: Dave Chinner <david@fromorbit.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
---
fs/xfs/libxfs/xfs_btree.c | 14 ++++++++------
include/linux/sched/mm.h | 23 +++++++++++++++++++++++
mm/vmscan.c | 16 +---------------
3 files changed, 32 insertions(+), 21 deletions(-)
diff --git a/fs/xfs/libxfs/xfs_btree.c b/fs/xfs/libxfs/xfs_btree.c
index 2d25bab68764..a04a44238aab 100644
--- a/fs/xfs/libxfs/xfs_btree.c
+++ b/fs/xfs/libxfs/xfs_btree.c
@@ -2813,8 +2813,9 @@ xfs_btree_split_worker(
{
struct xfs_btree_split_args *args = container_of(work,
struct xfs_btree_split_args, work);
+ bool is_kswapd = args->kswapd;
unsigned long pflags;
- unsigned long new_pflags = PF_MEMALLOC_NOFS;
+ int memalloc_nofs;
/*
* we are in a transaction context here, but may also be doing work
@@ -2822,16 +2823,17 @@ xfs_btree_split_worker(
* temporarily to ensure that we don't block waiting for memory reclaim
* in any way.
*/
- if (args->kswapd)
- new_pflags |= PF_MEMALLOC | PF_SWAPWRITE | PF_KSWAPD;
-
- current_set_flags_nested(&pflags, new_pflags);
+ if (is_kswapd)
+ pflags = become_kswapd();
+ memalloc_nofs = memalloc_nofs_save();
args->result = __xfs_btree_split(args->cur, args->level, args->ptrp,
args->key, args->curp, args->stat);
complete(args->done);
- current_restore_flags_nested(&pflags, new_pflags);
+ memalloc_nofs_restore(memalloc_nofs);
+ if (is_kswapd)
+ restore_kswapd(pflags);
}
/*
diff --git a/include/linux/sched/mm.h b/include/linux/sched/mm.h
index d5ece7a9a403..2faf03e79a1e 100644
--- a/include/linux/sched/mm.h
+++ b/include/linux/sched/mm.h
@@ -278,6 +278,29 @@ static inline void memalloc_nocma_restore(unsigned int flags)
}
#endif
+/*
+ * Tell the memory management code that this thread is working on behalf
+ * of background memory reclaim (like kswapd). That means that it will
+ * get access to memory reserves should it need to allocate memory in
+ * order to make forward progress. With this great power comes great
+ * responsibility to not exhaust those reserves.
+ */
+#define KSWAPD_PF_FLAGS (PF_MEMALLOC | PF_SWAPWRITE | PF_KSWAPD)
+
+static inline unsigned long become_kswapd(void)
+{
+ unsigned long flags = current->flags & KSWAPD_PF_FLAGS;
+
+ current->flags |= KSWAPD_PF_FLAGS;
+
+ return flags;
+}
+
+static inline void restore_kswapd(unsigned long flags)
+{
+ current->flags &= ~(flags ^ KSWAPD_PF_FLAGS);
+}
+
#ifdef CONFIG_MEMCG
DECLARE_PER_CPU(struct mem_cgroup *, int_active_memcg);
/**
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 1b8f0e059767..77bc1dda75bf 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -3869,19 +3869,7 @@ static int kswapd(void *p)
if (!cpumask_empty(cpumask))
set_cpus_allowed_ptr(tsk, cpumask);
- /*
- * Tell the memory management that we're a "memory allocator",
- * and that if we need more memory we should get access to it
- * regardless (see "__alloc_pages()"). "kswapd" should
- * never get caught in the normal page freeing logic.
- *
- * (Kswapd normally doesn't need memory anyway, but sometimes
- * you need a small amount of memory in order to be able to
- * page out something else, and this flag essentially protects
- * us from recursively trying to free more memory as we're
- * trying to free the first piece of memory in the first place).
- */
- tsk->flags |= PF_MEMALLOC | PF_SWAPWRITE | PF_KSWAPD;
+ become_kswapd();
set_freezable();
WRITE_ONCE(pgdat->kswapd_order, 0);
@@ -3931,8 +3919,6 @@ static int kswapd(void *p)
goto kswapd_try_sleep;
}
- tsk->flags &= ~(PF_MEMALLOC | PF_SWAPWRITE | PF_KSWAPD);
-
return 0;
}
--
2.18.4
* [PATCH v11 2/4] xfs: use memalloc_nofs_{save,restore} in xfs transaction
2020-12-08 12:28 [PATCH v11 0/4] xfs: avoid transaction reservation recursion Yafang Shao
2020-12-08 12:28 ` [PATCH v11 1/4] mm: Add become_kswapd and restore_kswapd Yafang Shao
@ 2020-12-08 12:28 ` Yafang Shao
2020-12-08 19:02 ` Darrick J. Wong
2020-12-08 12:28 ` [PATCH v11 3/4] xfs: refactor the usage around xfs_trans_context_{set,clear} Yafang Shao
2020-12-08 12:28 ` [PATCH v11 4/4] xfs: use current->journal_info to avoid transaction reservation recursion Yafang Shao
3 siblings, 1 reply; 11+ messages in thread
From: Yafang Shao @ 2020-12-08 12:28 UTC (permalink / raw)
To: darrick.wong, willy, david, hch, mhocko, akpm, dhowells, jlayton
Cc: linux-fsdevel, linux-cachefs, linux-xfs, linux-mm, Yafang Shao,
Christoph Hellwig
Introduce a new API to mark the start and end of XFS transactions.
For now, just save and restore the memalloc_nofs flags.
The new helpers are as follows:
- xfs_trans_context_set
Mark the start of XFS transactions
- xfs_trans_context_clear
Mark the end of XFS transactions
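As a rough illustration of why the save/restore pair nests safely, here is a userspace sketch of the two helpers at this stage of the series. It is a model under stated assumptions: `task_flags` stands in for `current->flags`, `struct xfs_trans_model` is a hypothetical stand-in for `struct xfs_trans`, and the PF_MEMALLOC_NOFS value is illustrative.

```c
#include <assert.h>

#define PF_MEMALLOC_NOFS 0x1U	/* illustrative value, not the kernel's */

/* Userspace stand-in for current->flags. */
static unsigned int task_flags;

/* Hypothetical stand-in for struct xfs_trans; only t_pflags is modelled. */
struct xfs_trans_model {
	unsigned int t_pflags;
};

/* Set the NOFS bit and return its previous state. */
static unsigned int memalloc_nofs_save(void)
{
	unsigned int old = task_flags & PF_MEMALLOC_NOFS;

	task_flags |= PF_MEMALLOC_NOFS;
	return old;
}

/* Put the NOFS bit back to the state returned by memalloc_nofs_save(). */
static void memalloc_nofs_restore(unsigned int old)
{
	task_flags = (task_flags & ~PF_MEMALLOC_NOFS) | old;
}

/* Mark the start of a transaction: remember the previous NOFS state. */
static void xfs_trans_context_set(struct xfs_trans_model *tp)
{
	tp->t_pflags = memalloc_nofs_save();
}

/* Mark the end of a transaction: restore the remembered NOFS state. */
static void xfs_trans_context_clear(struct xfs_trans_model *tp)
{
	memalloc_nofs_restore(tp->t_pflags);
}
```

If the caller was already inside a NOFS scope when the transaction started, clearing the transaction context leaves the bit set, because the saved state in t_pflags records that it was set on entry.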
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
---
fs/xfs/xfs_aops.c | 4 ++--
fs/xfs/xfs_linux.h | 4 ----
fs/xfs/xfs_trans.c | 13 +++++++------
fs/xfs/xfs_trans.h | 12 ++++++++++++
4 files changed, 21 insertions(+), 12 deletions(-)
diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index 4304c6416fbb..2371187b7615 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -62,7 +62,7 @@ xfs_setfilesize_trans_alloc(
* We hand off the transaction to the completion thread now, so
* clear the flag here.
*/
- current_restore_flags_nested(&tp->t_pflags, PF_MEMALLOC_NOFS);
+ xfs_trans_context_clear(tp);
return 0;
}
@@ -125,7 +125,7 @@ xfs_setfilesize_ioend(
* thus we need to mark ourselves as being in a transaction manually.
* Similarly for freeze protection.
*/
- current_set_flags_nested(&tp->t_pflags, PF_MEMALLOC_NOFS);
+ xfs_trans_context_set(tp);
__sb_writers_acquired(VFS_I(ip)->i_sb, SB_FREEZE_FS);
/* we abort the update if there was an IO error */
diff --git a/fs/xfs/xfs_linux.h b/fs/xfs/xfs_linux.h
index 5b7a1e201559..6ab0f8043c73 100644
--- a/fs/xfs/xfs_linux.h
+++ b/fs/xfs/xfs_linux.h
@@ -102,10 +102,6 @@ typedef __u32 xfs_nlink_t;
#define xfs_cowb_secs xfs_params.cowb_timer.val
#define current_cpu() (raw_smp_processor_id())
-#define current_set_flags_nested(sp, f) \
- (*(sp) = current->flags, current->flags |= (f))
-#define current_restore_flags_nested(sp, f) \
- (current->flags = ((current->flags & ~(f)) | (*(sp) & (f))))
#define NBBY 8 /* number of bits per byte */
diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c
index c94e71f741b6..11d390f0d3f2 100644
--- a/fs/xfs/xfs_trans.c
+++ b/fs/xfs/xfs_trans.c
@@ -154,7 +154,7 @@ xfs_trans_reserve(
bool rsvd = (tp->t_flags & XFS_TRANS_RESERVE) != 0;
/* Mark this thread as being in a transaction */
- current_set_flags_nested(&tp->t_pflags, PF_MEMALLOC_NOFS);
+ xfs_trans_context_set(tp);
/*
* Attempt to reserve the needed disk blocks by decrementing
@@ -164,7 +164,7 @@ xfs_trans_reserve(
if (blocks > 0) {
error = xfs_mod_fdblocks(mp, -((int64_t)blocks), rsvd);
if (error != 0) {
- current_restore_flags_nested(&tp->t_pflags, PF_MEMALLOC_NOFS);
+ xfs_trans_context_clear(tp);
return -ENOSPC;
}
tp->t_blk_res += blocks;
@@ -241,7 +241,7 @@ xfs_trans_reserve(
tp->t_blk_res = 0;
}
- current_restore_flags_nested(&tp->t_pflags, PF_MEMALLOC_NOFS);
+ xfs_trans_context_clear(tp);
return error;
}
@@ -878,7 +878,7 @@ __xfs_trans_commit(
xfs_log_commit_cil(mp, tp, &commit_lsn, regrant);
- current_restore_flags_nested(&tp->t_pflags, PF_MEMALLOC_NOFS);
+ xfs_trans_context_clear(tp);
xfs_trans_free(tp);
/*
@@ -910,7 +910,8 @@ __xfs_trans_commit(
xfs_log_ticket_ungrant(mp->m_log, tp->t_ticket);
tp->t_ticket = NULL;
}
- current_restore_flags_nested(&tp->t_pflags, PF_MEMALLOC_NOFS);
+
+ xfs_trans_context_clear(tp);
xfs_trans_free_items(tp, !!error);
xfs_trans_free(tp);
@@ -971,7 +972,7 @@ xfs_trans_cancel(
}
/* mark this thread as no longer being in a transaction */
- current_restore_flags_nested(&tp->t_pflags, PF_MEMALLOC_NOFS);
+ xfs_trans_context_clear(tp);
xfs_trans_free_items(tp, dirty);
xfs_trans_free(tp);
diff --git a/fs/xfs/xfs_trans.h b/fs/xfs/xfs_trans.h
index 084658946cc8..44b11c64a15e 100644
--- a/fs/xfs/xfs_trans.h
+++ b/fs/xfs/xfs_trans.h
@@ -268,4 +268,16 @@ xfs_trans_item_relog(
return lip->li_ops->iop_relog(lip, tp);
}
+static inline void
+xfs_trans_context_set(struct xfs_trans *tp)
+{
+ tp->t_pflags = memalloc_nofs_save();
+}
+
+static inline void
+xfs_trans_context_clear(struct xfs_trans *tp)
+{
+ memalloc_nofs_restore(tp->t_pflags);
+}
+
#endif /* __XFS_TRANS_H__ */
--
2.18.4
* [PATCH v11 3/4] xfs: refactor the usage around xfs_trans_context_{set,clear}
2020-12-08 12:28 [PATCH v11 0/4] xfs: avoid transaction reservation recursion Yafang Shao
2020-12-08 12:28 ` [PATCH v11 1/4] mm: Add become_kswapd and restore_kswapd Yafang Shao
2020-12-08 12:28 ` [PATCH v11 2/4] xfs: use memalloc_nofs_{save,restore} in xfs transaction Yafang Shao
@ 2020-12-08 12:28 ` Yafang Shao
2020-12-08 18:59 ` Darrick J. Wong
2020-12-08 12:28 ` [PATCH v11 4/4] xfs: use current->journal_info to avoid transaction reservation recursion Yafang Shao
3 siblings, 1 reply; 11+ messages in thread
From: Yafang Shao @ 2020-12-08 12:28 UTC (permalink / raw)
To: darrick.wong, willy, david, hch, mhocko, akpm, dhowells, jlayton
Cc: linux-fsdevel, linux-cachefs, linux-xfs, linux-mm, Yafang Shao,
Christoph Hellwig
The xfs_trans context should be active after the transaction is allocated,
and inactive once it is freed.
These two helpers are therefore used as follows:
- xfs_trans_context_set()
Used in xfs_trans_alloc()
- xfs_trans_context_clear()
Used in xfs_trans_free()
This patch is based on Darrick's work to fix the issue in xfs/141 in the
earlier version. [1]
1. https://lore.kernel.org/linux-xfs/20201104001649.GN7123@magnolia
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Chinner <david@fromorbit.com>
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
---
fs/xfs/xfs_trans.c | 20 +++++++-------------
1 file changed, 7 insertions(+), 13 deletions(-)
diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c
index 11d390f0d3f2..fe20398a214e 100644
--- a/fs/xfs/xfs_trans.c
+++ b/fs/xfs/xfs_trans.c
@@ -67,6 +67,9 @@ xfs_trans_free(
xfs_extent_busy_sort(&tp->t_busy);
xfs_extent_busy_clear(tp->t_mountp, &tp->t_busy, false);
+ /* Detach the transaction from this thread. */
+ xfs_trans_context_clear(tp);
+
trace_xfs_trans_free(tp, _RET_IP_);
if (!(tp->t_flags & XFS_TRANS_NO_WRITECOUNT))
sb_end_intwrite(tp->t_mountp->m_super);
@@ -153,9 +156,6 @@ xfs_trans_reserve(
int error = 0;
bool rsvd = (tp->t_flags & XFS_TRANS_RESERVE) != 0;
- /* Mark this thread as being in a transaction */
- xfs_trans_context_set(tp);
-
/*
* Attempt to reserve the needed disk blocks by decrementing
* the number needed from the number available. This will
@@ -163,10 +163,9 @@ xfs_trans_reserve(
*/
if (blocks > 0) {
error = xfs_mod_fdblocks(mp, -((int64_t)blocks), rsvd);
- if (error != 0) {
- xfs_trans_context_clear(tp);
+ if (error != 0)
return -ENOSPC;
- }
+
tp->t_blk_res += blocks;
}
@@ -241,8 +240,6 @@ xfs_trans_reserve(
tp->t_blk_res = 0;
}
- xfs_trans_context_clear(tp);
-
return error;
}
@@ -284,6 +281,8 @@ xfs_trans_alloc(
INIT_LIST_HEAD(&tp->t_dfops);
tp->t_firstblock = NULLFSBLOCK;
+ /* Mark this thread as being in a transaction */
+ xfs_trans_context_set(tp);
error = xfs_trans_reserve(tp, resp, blocks, rtextents);
if (error) {
xfs_trans_cancel(tp);
@@ -878,7 +877,6 @@ __xfs_trans_commit(
xfs_log_commit_cil(mp, tp, &commit_lsn, regrant);
- xfs_trans_context_clear(tp);
xfs_trans_free(tp);
/*
@@ -911,7 +909,6 @@ __xfs_trans_commit(
tp->t_ticket = NULL;
}
- xfs_trans_context_clear(tp);
xfs_trans_free_items(tp, !!error);
xfs_trans_free(tp);
@@ -971,9 +968,6 @@ xfs_trans_cancel(
tp->t_ticket = NULL;
}
- /* mark this thread as no longer being in a transaction */
- xfs_trans_context_clear(tp);
-
xfs_trans_free_items(tp, dirty);
xfs_trans_free(tp);
}
--
2.18.4
* [PATCH v11 4/4] xfs: use current->journal_info to avoid transaction reservation recursion
2020-12-08 12:28 [PATCH v11 0/4] xfs: avoid transaction reservation recursion Yafang Shao
` (2 preceding siblings ...)
2020-12-08 12:28 ` [PATCH v11 3/4] xfs: refactor the usage around xfs_trans_context_{set,clear} Yafang Shao
@ 2020-12-08 12:28 ` Yafang Shao
2020-12-08 18:59 ` Darrick J. Wong
3 siblings, 1 reply; 11+ messages in thread
From: Yafang Shao @ 2020-12-08 12:28 UTC (permalink / raw)
To: darrick.wong, willy, david, hch, mhocko, akpm, dhowells, jlayton
Cc: linux-fsdevel, linux-cachefs, linux-xfs, linux-mm, Yafang Shao,
Christoph Hellwig
PF_FSTRANS, which was used to avoid transaction reservation recursion, was
dropped by commit 9070733b4efa ("xfs: abstract PF_FSTRANS to
PF_MEMALLOC_NOFS") and commit 7dea19f9ee63 ("mm: introduce
memalloc_nofs_{save,restore} API"), and replaced by PF_MEMALLOC_NOFS, which
is meant to avoid filesystem reclaim recursion.
As these two flags have different meanings, we should reintroduce the
PF_FSTRANS semantics. To avoid consuming another PF_* flag bit in
task_struct, we can reuse current->journal_info for this, per Willy. As the
transaction reservation recursion check is used by XFS only, we can
move the check into xfs_vm_writepage(s), per Dave.
To better abstract that behavior, two new helpers are introduced, as
follows:
- xfs_trans_context_active
Check whether current is in an fs transaction or not
- xfs_trans_context_swap
Transfer the transaction context when rolling a permanent transaction
These two new helpers are introduced in xfs_trans.h.
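The ownership protocol around journal_info can be modelled in userspace as a small state machine. This is only a sketch: `journal_info` stands in for `current->journal_info`, `struct xfs_trans_model` is a hypothetical stand-in for `struct xfs_trans`, plain `assert()` replaces the XFS `ASSERT()` macro, the NOFS save/restore is left out, and `writeback_allowed()` is a hypothetical name for the condition tested by the writepage guards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct xfs_trans. */
struct xfs_trans_model {
	unsigned int t_pflags;
};

/* Userspace stand-in for current->journal_info. */
static void *journal_info;

/* True while the current task owns a transaction. */
static bool xfs_trans_context_active(void)
{
	return journal_info != NULL;
}

/* Attach tp to the task; nesting is not allowed. */
static void xfs_trans_context_set(struct xfs_trans_model *tp)
{
	assert(!journal_info);
	journal_info = tp;
}

/* Detach tp from the task; it must be the one currently attached. */
static void xfs_trans_context_clear(struct xfs_trans_model *tp)
{
	assert(journal_info == tp);
	journal_info = NULL;
}

/* Hand the context from tp to ntp when a transaction is rolled. */
static void xfs_trans_context_swap(struct xfs_trans_model *tp,
				   struct xfs_trans_model *ntp)
{
	assert(journal_info == tp);
	journal_info = ntp;
}

/* Hypothetical name for the writepage guard: writeback may only start
 * when the task is not inside a transaction. */
static bool writeback_allowed(void)
{
	return !xfs_trans_context_active();
}
```

Using a pointer rather than a flag bit means the clear and swap paths can verify that the transaction being detached is the one the task actually owns, which a single PF_* bit cannot express.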
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
---
fs/iomap/buffered-io.c | 7 -------
fs/xfs/xfs_aops.c | 17 +++++++++++++++++
fs/xfs/xfs_trans.c | 3 +++
fs/xfs/xfs_trans.h | 22 ++++++++++++++++++++++
4 files changed, 42 insertions(+), 7 deletions(-)
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 10cc7979ce38..3c53fa6ce64d 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -1458,13 +1458,6 @@ iomap_do_writepage(struct page *page, struct writeback_control *wbc, void *data)
PF_MEMALLOC))
goto redirty;
- /*
- * Given that we do not allow direct reclaim to call us, we should
- * never be called in a recursive filesystem reclaim context.
- */
- if (WARN_ON_ONCE(current->flags & PF_MEMALLOC_NOFS))
- goto redirty;
-
/*
* Is this page beyond the end of the file?
*
diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index 2371187b7615..0da0242d42c3 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -568,6 +568,16 @@ xfs_vm_writepage(
{
struct xfs_writepage_ctx wpc = { };
+ /*
+ * Given that we do not allow direct reclaim to call us, we should
+ * never be called while in a filesystem transaction.
+ */
+ if (WARN_ON_ONCE(xfs_trans_context_active())) {
+ redirty_page_for_writepage(wbc, page);
+ unlock_page(page);
+ return 0;
+ }
+
return iomap_writepage(page, wbc, &wpc.ctx, &xfs_writeback_ops);
}
@@ -579,6 +589,13 @@ xfs_vm_writepages(
struct xfs_writepage_ctx wpc = { };
xfs_iflags_clear(XFS_I(mapping->host), XFS_ITRUNCATED);
+ /*
+ * Given that we do not allow direct reclaim to call us, we should
+ * never be called while in a filesystem transaction.
+ */
+ if (WARN_ON_ONCE(xfs_trans_context_active()))
+ return 0;
+
return iomap_writepages(mapping, wbc, &wpc.ctx, &xfs_writeback_ops);
}
diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c
index fe20398a214e..08d4916ffb13 100644
--- a/fs/xfs/xfs_trans.c
+++ b/fs/xfs/xfs_trans.c
@@ -124,6 +124,9 @@ xfs_trans_dup(
tp->t_rtx_res = tp->t_rtx_res_used;
ntp->t_pflags = tp->t_pflags;
+ /* Associate the new transaction with this thread. */
+ xfs_trans_context_swap(tp, ntp);
+
/* move deferred ops over to the new tp */
xfs_defer_move(ntp, tp);
diff --git a/fs/xfs/xfs_trans.h b/fs/xfs/xfs_trans.h
index 44b11c64a15e..d596a375e3bf 100644
--- a/fs/xfs/xfs_trans.h
+++ b/fs/xfs/xfs_trans.h
@@ -268,16 +268,38 @@ xfs_trans_item_relog(
return lip->li_ops->iop_relog(lip, tp);
}
+static inline bool
+xfs_trans_context_active(void)
+{
+ /* Use journal_info to indicate current is in a transaction */
+ return current->journal_info != NULL;
+}
+
static inline void
xfs_trans_context_set(struct xfs_trans *tp)
{
+ ASSERT(!current->journal_info);
+ current->journal_info = tp;
tp->t_pflags = memalloc_nofs_save();
}
static inline void
xfs_trans_context_clear(struct xfs_trans *tp)
{
+ ASSERT(current->journal_info == tp);
+ current->journal_info = NULL;
memalloc_nofs_restore(tp->t_pflags);
}
+/*
+ * Transfer the transaction context when rolling a permanent
+ * transaction.
+ */
+static inline void
+xfs_trans_context_swap(struct xfs_trans *tp, struct xfs_trans *ntp)
+{
+ ASSERT(current->journal_info == tp);
+ current->journal_info = ntp;
+}
+
#endif /* __XFS_TRANS_H__ */
--
2.18.4
* Re: [PATCH v11 3/4] xfs: refactor the usage around xfs_trans_context_{set,clear}
2020-12-08 12:28 ` [PATCH v11 3/4] xfs: refactor the usage around xfs_trans_context_{set,clear} Yafang Shao
@ 2020-12-08 18:59 ` Darrick J. Wong
[not found] ` <CALOAHbB1uKmQ7ns08KW4zH1ikqD0GAY_Y7VySzmTY0=LTEPURA@mail.gmail.com>
0 siblings, 1 reply; 11+ messages in thread
From: Darrick J. Wong @ 2020-12-08 18:59 UTC (permalink / raw)
To: Yafang Shao
Cc: willy, david, hch, mhocko, akpm, dhowells, jlayton,
linux-fsdevel, linux-cachefs, linux-xfs, linux-mm,
Christoph Hellwig
On Tue, Dec 08, 2020 at 08:28:23PM +0800, Yafang Shao wrote:
> The xfs_trans context should be active after the transaction is allocated,
> and inactive once it is freed.
>
> These two helpers are therefore used as follows:
> - xfs_trans_context_set()
> Used in xfs_trans_alloc()
> - xfs_trans_context_clear()
> Used in xfs_trans_free()
>
> This patch is based on Darrick's work to fix the issue in xfs/141 in the
> earlier version. [1]
>
> 1. https://lore.kernel.org/linux-xfs/20201104001649.GN7123@magnolia
>
> Cc: Darrick J. Wong <darrick.wong@oracle.com>
> Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Dave Chinner <david@fromorbit.com>
> Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
> ---
> fs/xfs/xfs_trans.c | 20 +++++++-------------
> 1 file changed, 7 insertions(+), 13 deletions(-)
>
> diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c
> index 11d390f0d3f2..fe20398a214e 100644
> --- a/fs/xfs/xfs_trans.c
> +++ b/fs/xfs/xfs_trans.c
> @@ -67,6 +67,9 @@ xfs_trans_free(
> xfs_extent_busy_sort(&tp->t_busy);
> xfs_extent_busy_clear(tp->t_mountp, &tp->t_busy, false);
>
> + /* Detach the transaction from this thread. */
> + xfs_trans_context_clear(tp);
Don't you need to check if tp is still the current transaction before
you clear PF_MEMALLOC_NOFS, now that the NOFS is bound to the lifespan
of the transaction itself instead of the reservation?
--D
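One possible reading of this concern can be reproduced in a userspace model. The sketch below assumes that a roll duplicates the transaction (copying t_pflags, as xfs_trans_dup does) and then commits and frees the old one; `task_flags`, `struct xfs_trans_model` and the `model_*` functions are hypothetical stand-ins, and the flag value is illustrative.

```c
#include <assert.h>

#define PF_MEMALLOC_NOFS 0x1U	/* illustrative value, not the kernel's */

/* Userspace stand-in for current->flags. */
static unsigned int task_flags;

/* Hypothetical stand-in for struct xfs_trans. */
struct xfs_trans_model {
	unsigned int t_pflags;
};

static unsigned int memalloc_nofs_save(void)
{
	unsigned int old = task_flags & PF_MEMALLOC_NOFS;

	task_flags |= PF_MEMALLOC_NOFS;
	return old;
}

static void memalloc_nofs_restore(unsigned int old)
{
	task_flags = (task_flags & ~PF_MEMALLOC_NOFS) | old;
}

/* Allocation now sets the context (as in this patch). */
static void model_trans_alloc(struct xfs_trans_model *tp)
{
	tp->t_pflags = memalloc_nofs_save();
}

/* Duplication copies the saved flags to the new transaction. */
static void model_trans_dup(const struct xfs_trans_model *tp,
			    struct xfs_trans_model *ntp)
{
	ntp->t_pflags = tp->t_pflags;
}

/* Freeing now clears the context unconditionally (as in this patch). */
static void model_trans_free(struct xfs_trans_model *tp)
{
	memalloc_nofs_restore(tp->t_pflags);
}
```

In this model, calling model_trans_alloc(&tp), model_trans_dup(&tp, &ntp) and then model_trans_free(&tp) leaves PF_MEMALLOC_NOFS cleared even though ntp is still live, which is the situation a "is tp still the current transaction?" check would guard against.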
> +
> trace_xfs_trans_free(tp, _RET_IP_);
> if (!(tp->t_flags & XFS_TRANS_NO_WRITECOUNT))
> sb_end_intwrite(tp->t_mountp->m_super);
> @@ -153,9 +156,6 @@ xfs_trans_reserve(
> int error = 0;
> bool rsvd = (tp->t_flags & XFS_TRANS_RESERVE) != 0;
>
> - /* Mark this thread as being in a transaction */
> - xfs_trans_context_set(tp);
> -
> /*
> * Attempt to reserve the needed disk blocks by decrementing
> * the number needed from the number available. This will
> @@ -163,10 +163,9 @@ xfs_trans_reserve(
> */
> if (blocks > 0) {
> error = xfs_mod_fdblocks(mp, -((int64_t)blocks), rsvd);
> - if (error != 0) {
> - xfs_trans_context_clear(tp);
> + if (error != 0)
> return -ENOSPC;
> - }
> +
> tp->t_blk_res += blocks;
> }
>
> @@ -241,8 +240,6 @@ xfs_trans_reserve(
> tp->t_blk_res = 0;
> }
>
> - xfs_trans_context_clear(tp);
> -
> return error;
> }
>
> @@ -284,6 +281,8 @@ xfs_trans_alloc(
> INIT_LIST_HEAD(&tp->t_dfops);
> tp->t_firstblock = NULLFSBLOCK;
>
> + /* Mark this thread as being in a transaction */
> + xfs_trans_context_set(tp);
> error = xfs_trans_reserve(tp, resp, blocks, rtextents);
> if (error) {
> xfs_trans_cancel(tp);
> @@ -878,7 +877,6 @@ __xfs_trans_commit(
>
> xfs_log_commit_cil(mp, tp, &commit_lsn, regrant);
>
> - xfs_trans_context_clear(tp);
> xfs_trans_free(tp);
>
> /*
> @@ -911,7 +909,6 @@ __xfs_trans_commit(
> tp->t_ticket = NULL;
> }
>
> - xfs_trans_context_clear(tp);
> xfs_trans_free_items(tp, !!error);
> xfs_trans_free(tp);
>
> @@ -971,9 +968,6 @@ xfs_trans_cancel(
> tp->t_ticket = NULL;
> }
>
> - /* mark this thread as no longer being in a transaction */
> - xfs_trans_context_clear(tp);
> -
> xfs_trans_free_items(tp, dirty);
> xfs_trans_free(tp);
> }
> --
> 2.18.4
>
* Re: [PATCH v11 4/4] xfs: use current->journal_info to avoid transaction reservation recursion
2020-12-08 12:28 ` [PATCH v11 4/4] xfs: use current->journal_info to avoid transaction reservation recursion Yafang Shao
@ 2020-12-08 18:59 ` Darrick J. Wong
2020-12-09 1:40 ` Yafang Shao
0 siblings, 1 reply; 11+ messages in thread
From: Darrick J. Wong @ 2020-12-08 18:59 UTC (permalink / raw)
To: Yafang Shao
Cc: willy, david, hch, mhocko, akpm, dhowells, jlayton,
linux-fsdevel, linux-cachefs, linux-xfs, linux-mm,
Christoph Hellwig
On Tue, Dec 08, 2020 at 08:28:24PM +0800, Yafang Shao wrote:
> PF_FSTRANS, which was used to avoid transaction reservation recursion, was
> dropped by commit 9070733b4efa ("xfs: abstract PF_FSTRANS to
> PF_MEMALLOC_NOFS") and commit 7dea19f9ee63 ("mm: introduce
> memalloc_nofs_{save,restore} API"), and replaced by PF_MEMALLOC_NOFS, which
> is meant to avoid filesystem reclaim recursion.
>
> As these two flags have different meanings, we should reintroduce the
> PF_FSTRANS semantics. To avoid consuming another PF_* flag bit in
> task_struct, we can reuse current->journal_info for this, per Willy. As the
> transaction reservation recursion check is used by XFS only, we can
> move the check into xfs_vm_writepage(s), per Dave.
>
> To better abstract that behavior, two new helpers are introduced, as
> follows:
> - xfs_trans_context_active
> Check whether current is in an fs transaction or not
> - xfs_trans_context_swap
> Transfer the transaction context when rolling a permanent transaction
>
> These two new helpers are introduced in xfs_trans.h.
>
> Cc: Darrick J. Wong <darrick.wong@oracle.com>
> Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Dave Chinner <david@fromorbit.com>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: David Howells <dhowells@redhat.com>
> Cc: Jeff Layton <jlayton@redhat.com>
> Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
> ---
> fs/iomap/buffered-io.c | 7 -------
> fs/xfs/xfs_aops.c | 17 +++++++++++++++++
> fs/xfs/xfs_trans.c | 3 +++
> fs/xfs/xfs_trans.h | 22 ++++++++++++++++++++++
> 4 files changed, 42 insertions(+), 7 deletions(-)
>
> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> index 10cc7979ce38..3c53fa6ce64d 100644
> --- a/fs/iomap/buffered-io.c
> +++ b/fs/iomap/buffered-io.c
> @@ -1458,13 +1458,6 @@ iomap_do_writepage(struct page *page, struct writeback_control *wbc, void *data)
> PF_MEMALLOC))
> goto redirty;
>
> - /*
> - * Given that we do not allow direct reclaim to call us, we should
> - * never be called in a recursive filesystem reclaim context.
> - */
> - if (WARN_ON_ONCE(current->flags & PF_MEMALLOC_NOFS))
> - goto redirty;
> -
> /*
> * Is this page beyond the end of the file?
> *
> diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
> index 2371187b7615..0da0242d42c3 100644
> --- a/fs/xfs/xfs_aops.c
> +++ b/fs/xfs/xfs_aops.c
> @@ -568,6 +568,16 @@ xfs_vm_writepage(
> {
> struct xfs_writepage_ctx wpc = { };
>
> + /*
> + * Given that we do not allow direct reclaim to call us, we should
> + * never be called while in a filesystem transaction.
> + */
> + if (WARN_ON_ONCE(xfs_trans_context_active())) {
> + redirty_page_for_writepage(wbc, page);
> + unlock_page(page);
> + return 0;
> + }
> +
> return iomap_writepage(page, wbc, &wpc.ctx, &xfs_writeback_ops);
> }
>
> @@ -579,6 +589,13 @@ xfs_vm_writepages(
> struct xfs_writepage_ctx wpc = { };
>
> xfs_iflags_clear(XFS_I(mapping->host), XFS_ITRUNCATED);
> + /*
> + * Given that we do not allow direct reclaim to call us, we should
> + * never be called while in a filesystem transaction.
> + */
> + if (WARN_ON_ONCE(xfs_trans_context_active()))
> + return 0;
> +
> return iomap_writepages(mapping, wbc, &wpc.ctx, &xfs_writeback_ops);
> }
>
> diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c
> index fe20398a214e..08d4916ffb13 100644
> --- a/fs/xfs/xfs_trans.c
> +++ b/fs/xfs/xfs_trans.c
> @@ -124,6 +124,9 @@ xfs_trans_dup(
> tp->t_rtx_res = tp->t_rtx_res_used;
> ntp->t_pflags = tp->t_pflags;
This one line (ntp->t_pflags = tp->t_pflags) should move to
xfs_trans_context_swap.
--D
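A sketch of what that suggestion might look like, modelled in userspace. This is an assumption about the intended shape, not code from the series: `journal_info` stands in for `current->journal_info`, `struct xfs_trans_model` is a hypothetical stand-in for `struct xfs_trans`, and plain `assert()` replaces the XFS `ASSERT()` macro.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct xfs_trans. */
struct xfs_trans_model {
	unsigned int t_pflags;
};

/* Userspace stand-in for current->journal_info. */
static void *journal_info;

/*
 * Transfer the whole transaction context in one place: both the saved
 * allocation flags and the journal_info ownership move from tp to ntp.
 */
static void xfs_trans_context_swap(struct xfs_trans_model *tp,
				   struct xfs_trans_model *ntp)
{
	assert(journal_info == tp);
	ntp->t_pflags = tp->t_pflags;
	journal_info = ntp;
}
```

With the copy inside the helper, the duplication path would no longer need its own t_pflags assignment, and every piece of per-task transaction state would be handed over by a single call.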
>
> + /* Associate the new transaction with this thread. */
> + xfs_trans_context_swap(tp, ntp);
> +
> /* move deferred ops over to the new tp */
> xfs_defer_move(ntp, tp);
>
> diff --git a/fs/xfs/xfs_trans.h b/fs/xfs/xfs_trans.h
> index 44b11c64a15e..d596a375e3bf 100644
> --- a/fs/xfs/xfs_trans.h
> +++ b/fs/xfs/xfs_trans.h
> @@ -268,16 +268,38 @@ xfs_trans_item_relog(
> return lip->li_ops->iop_relog(lip, tp);
> }
>
> +static inline bool
> +xfs_trans_context_active(void)
> +{
> + /* Use journal_info to indicate current is in a transaction */
> + return current->journal_info != NULL;
> +}
> +
> static inline void
> xfs_trans_context_set(struct xfs_trans *tp)
> {
> + ASSERT(!current->journal_info);
> + current->journal_info = tp;
> tp->t_pflags = memalloc_nofs_save();
> }
>
> static inline void
> xfs_trans_context_clear(struct xfs_trans *tp)
> {
> + ASSERT(current->journal_info == tp);
> + current->journal_info = NULL;
> memalloc_nofs_restore(tp->t_pflags);
> }
>
> +/*
> + * Transfer the transaction context when rolling a permanent
> + * transaction.
> + */
> +static inline void
> +xfs_trans_context_swap(struct xfs_trans *tp, struct xfs_trans *ntp)
> +{
> + ASSERT(current->journal_info == tp);
> + current->journal_info = ntp;
> +}
> +
> #endif /* __XFS_TRANS_H__ */
> --
> 2.18.4
>
* Re: [PATCH v11 2/4] xfs: use memalloc_nofs_{save,restore} in xfs transaction
2020-12-08 12:28 ` [PATCH v11 2/4] xfs: use memalloc_nofs_{save,restore} in xfs transaction Yafang Shao
@ 2020-12-08 19:02 ` Darrick J. Wong
0 siblings, 0 replies; 11+ messages in thread
From: Darrick J. Wong @ 2020-12-08 19:02 UTC (permalink / raw)
To: Yafang Shao
Cc: willy, david, hch, mhocko, akpm, dhowells, jlayton,
linux-fsdevel, linux-cachefs, linux-xfs, linux-mm,
Christoph Hellwig
On Tue, Dec 08, 2020 at 08:28:22PM +0800, Yafang Shao wrote:
> Introduce a new API to mark the start and end of XFS transactions.
> For now, just save and restore the memalloc_nofs flags.
>
> The new helpers are as follows:
> - xfs_trans_context_set
> Mark the start of XFS transactions
> - xfs_trans_context_clear
> Mark the end of XFS transactions
>
> Cc: Darrick J. Wong <darrick.wong@oracle.com>
> Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Dave Chinner <david@fromorbit.com>
> Cc: Michal Hocko <mhocko@kernel.org>
> Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
> ---
> fs/xfs/xfs_aops.c | 4 ++--
> fs/xfs/xfs_linux.h | 4 ----
> fs/xfs/xfs_trans.c | 13 +++++++------
> fs/xfs/xfs_trans.h | 12 ++++++++++++
> 4 files changed, 21 insertions(+), 12 deletions(-)
>
> diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
> index 4304c6416fbb..2371187b7615 100644
> --- a/fs/xfs/xfs_aops.c
> +++ b/fs/xfs/xfs_aops.c
> @@ -62,7 +62,7 @@ xfs_setfilesize_trans_alloc(
> * We hand off the transaction to the completion thread now, so
> * clear the flag here.
> */
> - current_restore_flags_nested(&tp->t_pflags, PF_MEMALLOC_NOFS);
> + xfs_trans_context_clear(tp);
> return 0;
> }
>
> @@ -125,7 +125,7 @@ xfs_setfilesize_ioend(
> * thus we need to mark ourselves as being in a transaction manually.
> * Similarly for freeze protection.
> */
> - current_set_flags_nested(&tp->t_pflags, PF_MEMALLOC_NOFS);
> + xfs_trans_context_set(tp);
> __sb_writers_acquired(VFS_I(ip)->i_sb, SB_FREEZE_FS);
>
> /* we abort the update if there was an IO error */
> diff --git a/fs/xfs/xfs_linux.h b/fs/xfs/xfs_linux.h
> index 5b7a1e201559..6ab0f8043c73 100644
> --- a/fs/xfs/xfs_linux.h
> +++ b/fs/xfs/xfs_linux.h
> @@ -102,10 +102,6 @@ typedef __u32 xfs_nlink_t;
> #define xfs_cowb_secs xfs_params.cowb_timer.val
>
> #define current_cpu() (raw_smp_processor_id())
> -#define current_set_flags_nested(sp, f) \
> - (*(sp) = current->flags, current->flags |= (f))
> -#define current_restore_flags_nested(sp, f) \
> - (current->flags = ((current->flags & ~(f)) | (*(sp) & (f))))
>
> #define NBBY 8 /* number of bits per byte */
>
> diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c
> index c94e71f741b6..11d390f0d3f2 100644
> --- a/fs/xfs/xfs_trans.c
> +++ b/fs/xfs/xfs_trans.c
> @@ -154,7 +154,7 @@ xfs_trans_reserve(
> bool rsvd = (tp->t_flags & XFS_TRANS_RESERVE) != 0;
>
> /* Mark this thread as being in a transaction */
> - current_set_flags_nested(&tp->t_pflags, PF_MEMALLOC_NOFS);
> + xfs_trans_context_set(tp);
>
> /*
> * Attempt to reserve the needed disk blocks by decrementing
> @@ -164,7 +164,7 @@ xfs_trans_reserve(
> if (blocks > 0) {
> error = xfs_mod_fdblocks(mp, -((int64_t)blocks), rsvd);
> if (error != 0) {
> - current_restore_flags_nested(&tp->t_pflags, PF_MEMALLOC_NOFS);
> + xfs_trans_context_clear(tp);
> return -ENOSPC;
> }
> tp->t_blk_res += blocks;
> @@ -241,7 +241,7 @@ xfs_trans_reserve(
> tp->t_blk_res = 0;
> }
>
> - current_restore_flags_nested(&tp->t_pflags, PF_MEMALLOC_NOFS);
> + xfs_trans_context_clear(tp);
>
> return error;
> }
> @@ -878,7 +878,7 @@ __xfs_trans_commit(
>
> xfs_log_commit_cil(mp, tp, &commit_lsn, regrant);
>
> - current_restore_flags_nested(&tp->t_pflags, PF_MEMALLOC_NOFS);
> + xfs_trans_context_clear(tp);
> xfs_trans_free(tp);
>
> /*
> @@ -910,7 +910,8 @@ __xfs_trans_commit(
> xfs_log_ticket_ungrant(mp->m_log, tp->t_ticket);
> tp->t_ticket = NULL;
> }
> - current_restore_flags_nested(&tp->t_pflags, PF_MEMALLOC_NOFS);
> +
> + xfs_trans_context_clear(tp);
> xfs_trans_free_items(tp, !!error);
> xfs_trans_free(tp);
>
> @@ -971,7 +972,7 @@ xfs_trans_cancel(
> }
>
> /* mark this thread as no longer being in a transaction */
> - current_restore_flags_nested(&tp->t_pflags, PF_MEMALLOC_NOFS);
> + xfs_trans_context_clear(tp);
>
> xfs_trans_free_items(tp, dirty);
> xfs_trans_free(tp);
> diff --git a/fs/xfs/xfs_trans.h b/fs/xfs/xfs_trans.h
> index 084658946cc8..44b11c64a15e 100644
> --- a/fs/xfs/xfs_trans.h
> +++ b/fs/xfs/xfs_trans.h
> @@ -268,4 +268,16 @@ xfs_trans_item_relog(
> return lip->li_ops->iop_relog(lip, tp);
> }
>
> +static inline void
> +xfs_trans_context_set(struct xfs_trans *tp)
> +{
> + tp->t_pflags = memalloc_nofs_save();
> +}
> +
> +static inline void
> +xfs_trans_context_clear(struct xfs_trans *tp)
> +{
> + memalloc_nofs_restore(tp->t_pflags);
It's a little strange to add the wrappers and convert the current->flags
modification macros to the memalloc_nofs_* functions in one patch, but
whatever, I'm more concerned about the things I complained about in the
next two patches.
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
--D
> +}
> +
> #endif /* __XFS_TRANS_H__ */
> --
> 2.18.4
>
* Re: [PATCH v11 4/4] xfs: use current->journal_info to avoid transaction reservation recursion
2020-12-08 18:59 ` Darrick J. Wong
@ 2020-12-09 1:40 ` Yafang Shao
0 siblings, 0 replies; 11+ messages in thread
From: Yafang Shao @ 2020-12-09 1:40 UTC (permalink / raw)
To: Darrick J. Wong
Cc: Matthew Wilcox, Dave Chinner, Christoph Hellwig, Michal Hocko,
Andrew Morton, David Howells, jlayton, linux-fsdevel,
linux-cachefs, linux-xfs, Linux MM, Christoph Hellwig
On Wed, Dec 9, 2020 at 3:00 AM Darrick J. Wong <darrick.wong@oracle.com> wrote:
>
> On Tue, Dec 08, 2020 at 08:28:24PM +0800, Yafang Shao wrote:
> > PF_FSTRANS which is used to avoid transaction reservation recursion, is
> > dropped since commit 9070733b4efa ("xfs: abstract PF_FSTRANS to
> > PF_MEMALLOC_NOFS") and commit 7dea19f9ee63 ("mm: introduce
> > memalloc_nofs_{save,restore} API") and replaced by PF_MEMALLOC_NOFS which
> > means to avoid filesystem reclaim recursion.
> >
> > As these two flags have different meanings, we'd better reintroduce
> > PF_FSTRANS back. To avoid wasting the space of PF_* flags in task_struct,
> > we can reuse the current->journal_info to do that, per Willy. As the
> > check of transaction reservation recursion is used by XFS only, we can
> > move the check into xfs_vm_writepage(s), per Dave.
> >
> > > To better abstract that behavior, two new helpers are introduced, as
> > > follows,
> > > - xfs_trans_context_active
> > > To check whether current is in an fs transaction or not
> > > - xfs_trans_context_swap
> > > Transfer the transaction context when rolling a permanent transaction
> > >
> > > These two new helpers are introduced in xfs_trans.h.
> >
> > Cc: Darrick J. Wong <darrick.wong@oracle.com>
> > Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
> > Cc: Christoph Hellwig <hch@lst.de>
> > Cc: Dave Chinner <david@fromorbit.com>
> > Cc: Michal Hocko <mhocko@kernel.org>
> > Cc: David Howells <dhowells@redhat.com>
> > Cc: Jeff Layton <jlayton@redhat.com>
> > Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
> > ---
> > fs/iomap/buffered-io.c | 7 -------
> > fs/xfs/xfs_aops.c | 17 +++++++++++++++++
> > fs/xfs/xfs_trans.c | 3 +++
> > fs/xfs/xfs_trans.h | 22 ++++++++++++++++++++++
> > 4 files changed, 42 insertions(+), 7 deletions(-)
> >
> > diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> > index 10cc7979ce38..3c53fa6ce64d 100644
> > --- a/fs/iomap/buffered-io.c
> > +++ b/fs/iomap/buffered-io.c
> > @@ -1458,13 +1458,6 @@ iomap_do_writepage(struct page *page, struct writeback_control *wbc, void *data)
> > PF_MEMALLOC))
> > goto redirty;
> >
> > - /*
> > - * Given that we do not allow direct reclaim to call us, we should
> > - * never be called in a recursive filesystem reclaim context.
> > - */
> > - if (WARN_ON_ONCE(current->flags & PF_MEMALLOC_NOFS))
> > - goto redirty;
> > -
> > /*
> > * Is this page beyond the end of the file?
> > *
> > diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
> > index 2371187b7615..0da0242d42c3 100644
> > --- a/fs/xfs/xfs_aops.c
> > +++ b/fs/xfs/xfs_aops.c
> > @@ -568,6 +568,16 @@ xfs_vm_writepage(
> > {
> > struct xfs_writepage_ctx wpc = { };
> >
> > + /*
> > + * Given that we do not allow direct reclaim to call us, we should
> > + * never be called while in a filesystem transaction.
> > + */
> > + if (WARN_ON_ONCE(xfs_trans_context_active())) {
> > + redirty_page_for_writepage(wbc, page);
> > + unlock_page(page);
> > + return 0;
> > + }
> > +
> > return iomap_writepage(page, wbc, &wpc.ctx, &xfs_writeback_ops);
> > }
> >
> > @@ -579,6 +589,13 @@ xfs_vm_writepages(
> > struct xfs_writepage_ctx wpc = { };
> >
> > xfs_iflags_clear(XFS_I(mapping->host), XFS_ITRUNCATED);
> > + /*
> > + * Given that we do not allow direct reclaim to call us, we should
> > + * never be called while in a filesystem transaction.
> > + */
> > + if (WARN_ON_ONCE(xfs_trans_context_active()))
> > + return 0;
> > +
> > return iomap_writepages(mapping, wbc, &wpc.ctx, &xfs_writeback_ops);
> > }
> >
> > diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c
> > index fe20398a214e..08d4916ffb13 100644
> > --- a/fs/xfs/xfs_trans.c
> > +++ b/fs/xfs/xfs_trans.c
> > @@ -124,6 +124,9 @@ xfs_trans_dup(
> > tp->t_rtx_res = tp->t_rtx_res_used;
> > ntp->t_pflags = tp->t_pflags;
>
> This one line (ntp->t_pflags = tp->t_pflags) should move to
> xfs_trans_context_swap.
>
Makes sense to me.
I will update it.
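A minimal userspace sketch of what the suggested move could look like. This is only a model, not the code from any later revision: `struct model_trans`, `model_journal_info` and `model_trans_context_swap` are hypothetical stand-ins for `struct xfs_trans`, `current->journal_info` and `xfs_trans_context_swap`.

```c
#include <assert.h>

/* Hypothetical stand-in for struct xfs_trans; only t_pflags is modeled. */
struct model_trans {
	unsigned int t_pflags;	/* NOFS state saved when the context was set */
};

/* Hypothetical stand-in for current->journal_info. */
static struct model_trans *model_journal_info;

/*
 * Transfer the whole transaction context in one place: the saved NOFS
 * state and the journal_info pointer move from tp to ntp together, so
 * xfs_trans_dup() would no longer copy t_pflags by hand.
 */
static void
model_trans_context_swap(struct model_trans *tp, struct model_trans *ntp)
{
	assert(model_journal_info == tp);
	ntp->t_pflags = tp->t_pflags;	/* the line moved out of xfs_trans_dup() */
	model_journal_info = ntp;
}
```

With this shape, the caller in the dup path would only need the single swap call to hand both pieces of per-thread state to the new transaction.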
> --D
>
> >
> > + /* Associate the new transaction with this thread. */
> > + xfs_trans_context_swap(tp, ntp);
> > +
> > /* move deferred ops over to the new tp */
> > xfs_defer_move(ntp, tp);
> >
> > diff --git a/fs/xfs/xfs_trans.h b/fs/xfs/xfs_trans.h
> > index 44b11c64a15e..d596a375e3bf 100644
> > --- a/fs/xfs/xfs_trans.h
> > +++ b/fs/xfs/xfs_trans.h
> > @@ -268,16 +268,38 @@ xfs_trans_item_relog(
> > return lip->li_ops->iop_relog(lip, tp);
> > }
> >
> > +static inline bool
> > +xfs_trans_context_active(void)
> > +{
> > + /* Use journal_info to indicate current is in a transaction */
> > + return current->journal_info != NULL;
> > +}
> > +
> > static inline void
> > xfs_trans_context_set(struct xfs_trans *tp)
> > {
> > + ASSERT(!current->journal_info);
> > + current->journal_info = tp;
> > tp->t_pflags = memalloc_nofs_save();
> > }
> >
> > static inline void
> > xfs_trans_context_clear(struct xfs_trans *tp)
> > {
> > + ASSERT(current->journal_info == tp);
> > + current->journal_info = NULL;
> > memalloc_nofs_restore(tp->t_pflags);
> > }
> >
> > +/*
> > + * Transfer the transaction context when rolling a permanent
> > + * transaction.
> > + */
> > +static inline void
> > +xfs_trans_context_swap(struct xfs_trans *tp, struct xfs_trans *ntp)
> > +{
> > + ASSERT(current->journal_info == tp);
> > + current->journal_info = ntp;
> > +}
> > +
> > #endif /* __XFS_TRANS_H__ */
> > --
> > 2.18.4
> >
--
Thanks
Yafang
* Re: [PATCH v11 3/4] xfs: refactor the usage around xfs_trans_context_{set,clear}
[not found] ` <CALOAHbB1uKmQ7ns08KW4zH1ikqD0GAY_Y7VySzmTY0=LTEPURA@mail.gmail.com>
@ 2020-12-09 3:53 ` Darrick J. Wong
2020-12-09 10:43 ` Yafang Shao
0 siblings, 1 reply; 11+ messages in thread
From: Darrick J. Wong @ 2020-12-09 3:53 UTC (permalink / raw)
To: Yafang Shao
Cc: Matthew Wilcox, Dave Chinner, Christoph Hellwig, Michal Hocko,
Andrew Morton, David Howells, jlayton, linux-fsdevel,
linux-cachefs, linux-xfs, Linux MM, Christoph Hellwig
On Wed, Dec 09, 2020 at 09:47:38AM +0800, Yafang Shao wrote:
> On Wed, Dec 9, 2020 at 2:59 AM Darrick J. Wong <darrick.wong@oracle.com> wrote:
> >
> > On Tue, Dec 08, 2020 at 08:28:23PM +0800, Yafang Shao wrote:
> > > > The xfs_trans context should be active after it is allocated, and
> > > > inactive once it is freed.
> > >
> > > So these two helpers are refactored as,
> > > - xfs_trans_context_set()
> > > Used in xfs_trans_alloc()
> > > - xfs_trans_context_clear()
> > > Used in xfs_trans_free()
> > >
> > > This patch is based on Darrick's work to fix the issue in xfs/141 in the
> > > earlier version. [1]
> > >
> > > 1. https://lore.kernel.org/linux-xfs/20201104001649.GN7123@magnolia
> > >
> > > Cc: Darrick J. Wong <darrick.wong@oracle.com>
> > > Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
> > > Cc: Christoph Hellwig <hch@lst.de>
> > > Cc: Dave Chinner <david@fromorbit.com>
> > > Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
> > > ---
> > > fs/xfs/xfs_trans.c | 20 +++++++-------------
> > > 1 file changed, 7 insertions(+), 13 deletions(-)
> > >
> > > diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c
> > > index 11d390f0d3f2..fe20398a214e 100644
> > > --- a/fs/xfs/xfs_trans.c
> > > +++ b/fs/xfs/xfs_trans.c
> > > @@ -67,6 +67,9 @@ xfs_trans_free(
> > > xfs_extent_busy_sort(&tp->t_busy);
> > > xfs_extent_busy_clear(tp->t_mountp, &tp->t_busy, false);
> > >
> > > + /* Detach the transaction from this thread. */
> > > + xfs_trans_context_clear(tp);
> >
> > Don't you need to check if tp is still the current transaction before
> > you clear PF_MEMALLOC_NOFS, now that the NOFS is bound to the lifespan
> > of the transaction itself instead of the reservation?
> >
>
> In my testing, current->journal_info is always the same as tp here.
> I don't know in which case they would differ.
I don't know why you changed it from the previous version.
> It would be better if you could explain in detail. Anyway I can add
> the check with your comment in the next version.
1. xfs_trans_alloc is called to allocate a transaction. We set _NOFS and
   save the old flags (which don't contain _NOFS) to this transaction.
2. The thread logs some changes and calls xfs_trans_roll.
3. xfs_trans_roll calls xfs_trans_dup to duplicate the old transaction.
4. xfs_trans_dup allocates a new transaction, which sets PF_MEMALLOC_NOFS
   and saves the current context flags (in which _NOFS is set) in the new
   transaction.
5. xfs_trans_roll then commits the old transaction.
6. xfs_trans_commit frees the old transaction.
7. xfs_trans_free restores the old context (which didn't have _NOFS) and
   now we've dropped NOFS incorrectly.
8. Now we move on with the new transaction, but in the wrong NOFS mode.

Note that this becomes a lot more obvious once you start fiddling with
current->journal_info in the last patch.
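The sequence above can be replayed in a small userspace model. This is only a sketch under stated assumptions: `model_task_flags` stands in for `current->flags`, `MODEL_PF_MEMALLOC_NOFS` is an arbitrary stand-in bit, and `model_nofs_save`/`model_nofs_restore` imitate the save/restore semantics of `memalloc_nofs_save()`/`memalloc_nofs_restore()`. None of these names exist in the kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in flag bit; only its identity matters in this model. */
#define MODEL_PF_MEMALLOC_NOFS 0x1u

/* Hypothetical stand-in for current->flags. */
static unsigned int model_task_flags;

/* Hypothetical stand-in for struct xfs_trans; only t_pflags is modeled. */
struct model_trans {
	unsigned int t_pflags;
};

/* Return the current NOFS bit, then set it. */
static unsigned int model_nofs_save(void)
{
	unsigned int old = model_task_flags & MODEL_PF_MEMALLOC_NOFS;

	model_task_flags |= MODEL_PF_MEMALLOC_NOFS;
	return old;
}

/* Put a previously saved NOFS bit back. */
static void model_nofs_restore(unsigned int saved)
{
	model_task_flags = (model_task_flags & ~MODEL_PF_MEMALLOC_NOFS) | saved;
}

/*
 * Replay the roll sequence. Returns true if the thread is still in NOFS
 * mode after the old transaction has been freed.
 */
static bool model_nofs_survives_roll(void)
{
	struct model_trans old_tp, new_tp;

	model_task_flags = 0;

	/* xfs_trans_alloc: the saved flags do not contain NOFS. */
	old_tp.t_pflags = model_nofs_save();
	assert(old_tp.t_pflags == 0);

	/* xfs_trans_dup: the saved flags already contain NOFS. */
	new_tp.t_pflags = model_nofs_save();
	assert(new_tp.t_pflags == MODEL_PF_MEMALLOC_NOFS);

	/* xfs_trans_free(old): restores the pre-transaction state. */
	model_nofs_restore(old_tp.t_pflags);

	return (model_task_flags & MODEL_PF_MEMALLOC_NOFS) != 0;
}
```

In this model `model_nofs_survives_roll()` reports that NOFS has been dropped while the new transaction is still live, which is the failure described in the steps.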
--D
>
> >
> > > +
> > > trace_xfs_trans_free(tp, _RET_IP_);
> > > if (!(tp->t_flags & XFS_TRANS_NO_WRITECOUNT))
> > > sb_end_intwrite(tp->t_mountp->m_super);
> > > @@ -153,9 +156,6 @@ xfs_trans_reserve(
> > > int error = 0;
> > > bool rsvd = (tp->t_flags & XFS_TRANS_RESERVE) != 0;
> > >
> > > - /* Mark this thread as being in a transaction */
> > > - xfs_trans_context_set(tp);
> > > -
> > > /*
> > > * Attempt to reserve the needed disk blocks by decrementing
> > > * the number needed from the number available. This will
> > > @@ -163,10 +163,9 @@ xfs_trans_reserve(
> > > */
> > > if (blocks > 0) {
> > > error = xfs_mod_fdblocks(mp, -((int64_t)blocks), rsvd);
> > > - if (error != 0) {
> > > - xfs_trans_context_clear(tp);
> > > + if (error != 0)
> > > return -ENOSPC;
> > > - }
> > > +
> > > tp->t_blk_res += blocks;
> > > }
> > >
> > > @@ -241,8 +240,6 @@ xfs_trans_reserve(
> > > tp->t_blk_res = 0;
> > > }
> > >
> > > - xfs_trans_context_clear(tp);
> > > -
> > > return error;
> > > }
> > >
> > > @@ -284,6 +281,8 @@ xfs_trans_alloc(
> > > INIT_LIST_HEAD(&tp->t_dfops);
> > > tp->t_firstblock = NULLFSBLOCK;
> > >
> > > + /* Mark this thread as being in a transaction */
> > > + xfs_trans_context_set(tp);
> > > error = xfs_trans_reserve(tp, resp, blocks, rtextents);
> > > if (error) {
> > > xfs_trans_cancel(tp);
> > > @@ -878,7 +877,6 @@ __xfs_trans_commit(
> > >
> > > xfs_log_commit_cil(mp, tp, &commit_lsn, regrant);
> > >
> > > - xfs_trans_context_clear(tp);
> > > xfs_trans_free(tp);
> > >
> > > /*
> > > @@ -911,7 +909,6 @@ __xfs_trans_commit(
> > > tp->t_ticket = NULL;
> > > }
> > >
> > > - xfs_trans_context_clear(tp);
> > > xfs_trans_free_items(tp, !!error);
> > > xfs_trans_free(tp);
> > >
> > > @@ -971,9 +968,6 @@ xfs_trans_cancel(
> > > tp->t_ticket = NULL;
> > > }
> > >
> > > - /* mark this thread as no longer being in a transaction */
> > > - xfs_trans_context_clear(tp);
> > > -
> > > xfs_trans_free_items(tp, dirty);
> > > xfs_trans_free(tp);
> > > }
> > > --
> > > 2.18.4
> > >
>
>
>
> --
> Thanks
> Yafang
* Re: [PATCH v11 3/4] xfs: refactor the usage around xfs_trans_context_{set,clear}
2020-12-09 3:53 ` Darrick J. Wong
@ 2020-12-09 10:43 ` Yafang Shao
0 siblings, 0 replies; 11+ messages in thread
From: Yafang Shao @ 2020-12-09 10:43 UTC (permalink / raw)
To: Darrick J. Wong
Cc: Matthew Wilcox, Dave Chinner, Christoph Hellwig, Michal Hocko,
Andrew Morton, David Howells, jlayton, linux-fsdevel,
linux-cachefs, linux-xfs, Linux MM, Christoph Hellwig
On Wed, Dec 9, 2020 at 11:53 AM Darrick J. Wong <darrick.wong@oracle.com> wrote:
>
> On Wed, Dec 09, 2020 at 09:47:38AM +0800, Yafang Shao wrote:
> > On Wed, Dec 9, 2020 at 2:59 AM Darrick J. Wong <darrick.wong@oracle.com> wrote:
> > >
> > > On Tue, Dec 08, 2020 at 08:28:23PM +0800, Yafang Shao wrote:
> > > > > The xfs_trans context should be active after it is allocated, and
> > > > > inactive once it is freed.
> > > >
> > > > So these two helpers are refactored as,
> > > > - xfs_trans_context_set()
> > > > Used in xfs_trans_alloc()
> > > > - xfs_trans_context_clear()
> > > > Used in xfs_trans_free()
> > > >
> > > > This patch is based on Darrick's work to fix the issue in xfs/141 in the
> > > > earlier version. [1]
> > > >
> > > > 1. https://lore.kernel.org/linux-xfs/20201104001649.GN7123@magnolia
> > > >
> > > > Cc: Darrick J. Wong <darrick.wong@oracle.com>
> > > > Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
> > > > Cc: Christoph Hellwig <hch@lst.de>
> > > > Cc: Dave Chinner <david@fromorbit.com>
> > > > Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
> > > > ---
> > > > fs/xfs/xfs_trans.c | 20 +++++++-------------
> > > > 1 file changed, 7 insertions(+), 13 deletions(-)
> > > >
> > > > diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c
> > > > index 11d390f0d3f2..fe20398a214e 100644
> > > > --- a/fs/xfs/xfs_trans.c
> > > > +++ b/fs/xfs/xfs_trans.c
> > > > @@ -67,6 +67,9 @@ xfs_trans_free(
> > > > xfs_extent_busy_sort(&tp->t_busy);
> > > > xfs_extent_busy_clear(tp->t_mountp, &tp->t_busy, false);
> > > >
> > > > + /* Detach the transaction from this thread. */
> > > > + xfs_trans_context_clear(tp);
> > >
> > > Don't you need to check if tp is still the current transaction before
> > > you clear PF_MEMALLOC_NOFS, now that the NOFS is bound to the lifespan
> > > of the transaction itself instead of the reservation?
> > >
> >
> > > In my testing, current->journal_info is always the same as tp here.
> > > I don't know in which case they would differ.
>
> I don't know why you changed it from the previous version.
>
I should have explained it in the change log. Sorry about that.
> > It would be better if you could explain in detail. Anyway I can add
> > the check with your comment in the next version.
>
> xfs_trans_alloc is called to allocate a transaction. We set _NOFS and
> save the old flags (which don't contain _NOFS) to this transaction.
>
> thread logs some changes and calls xfs_trans_roll.
>
> xfs_trans_roll calls xfs_trans_dup to duplicate the old transaction.
>
> xfs_trans_dup allocates a new transaction, which sets PF_MEMALLOC_NOFS
> and saves the current context flags (in which _NOFS is set) in the new
> transaction.
>
> xfs_trans_roll then commits the old transaction
>
> xfs_trans_commit frees the old transaction
>
> xfs_trans_free restores the old context (which didn't have _NOFS) and
> now we've dropped NOFS incorrectly
>
> now we move on with the new transaction, but in the wrong NOFS mode.
>
> note that this becomes a lot more obvious once you start fiddling with
> current->journal_info in the last patch.
>
Many thanks for the detailed explanation. I missed the rolling transaction.
I will add this check in the next version.
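One possible shape for such a check, as a userspace sketch only; it is not the code from the next revision. It assumes the context is handed over on roll (as xfs_trans_context_swap does for journal_info) and that the saved flags travel with it. All `model_*` names are hypothetical stand-ins for `current->flags`, `current->journal_info` and the xfs_trans_context_* helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in flag bit; only its identity matters in this model. */
#define MODEL_PF_MEMALLOC_NOFS 0x1u

struct model_trans {
	unsigned int t_pflags;	/* NOFS state saved when the context was set */
};

static unsigned int model_task_flags;		/* models current->flags */
static struct model_trans *model_journal_info;	/* models current->journal_info */

/* Attach tp to the thread and enter NOFS mode, saving the old state. */
static void model_context_set(struct model_trans *tp)
{
	assert(model_journal_info == NULL);
	model_journal_info = tp;
	tp->t_pflags = model_task_flags & MODEL_PF_MEMALLOC_NOFS;
	model_task_flags |= MODEL_PF_MEMALLOC_NOFS;
}

/* Hand the context (pointer and saved flags) from tp to ntp on roll. */
static void model_context_swap(struct model_trans *tp, struct model_trans *ntp)
{
	assert(model_journal_info == tp);
	ntp->t_pflags = tp->t_pflags;
	model_journal_info = ntp;
}

/* Only the transaction that still owns the thread context may drop NOFS. */
static void model_context_clear(struct model_trans *tp)
{
	if (model_journal_info != tp)
		return;
	model_journal_info = NULL;
	model_task_flags = (model_task_flags & ~MODEL_PF_MEMALLOC_NOFS) |
			   tp->t_pflags;
}

/* True while the modeled thread is in NOFS mode. */
static bool model_in_nofs(void)
{
	return (model_task_flags & MODEL_PF_MEMALLOC_NOFS) != 0;
}
```

Replaying a roll against this model (set on the old transaction, swap to the new one, clear the old one, then clear the new one) keeps NOFS set until the new transaction is cleared.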
> --D
>
> >
> > >
> > > > +
> > > > trace_xfs_trans_free(tp, _RET_IP_);
> > > > if (!(tp->t_flags & XFS_TRANS_NO_WRITECOUNT))
> > > > sb_end_intwrite(tp->t_mountp->m_super);
> > > > @@ -153,9 +156,6 @@ xfs_trans_reserve(
> > > > int error = 0;
> > > > bool rsvd = (tp->t_flags & XFS_TRANS_RESERVE) != 0;
> > > >
> > > > - /* Mark this thread as being in a transaction */
> > > > - xfs_trans_context_set(tp);
> > > > -
> > > > /*
> > > > * Attempt to reserve the needed disk blocks by decrementing
> > > > * the number needed from the number available. This will
> > > > @@ -163,10 +163,9 @@ xfs_trans_reserve(
> > > > */
> > > > if (blocks > 0) {
> > > > error = xfs_mod_fdblocks(mp, -((int64_t)blocks), rsvd);
> > > > - if (error != 0) {
> > > > - xfs_trans_context_clear(tp);
> > > > + if (error != 0)
> > > > return -ENOSPC;
> > > > - }
> > > > +
> > > > tp->t_blk_res += blocks;
> > > > }
> > > >
> > > > @@ -241,8 +240,6 @@ xfs_trans_reserve(
> > > > tp->t_blk_res = 0;
> > > > }
> > > >
> > > > - xfs_trans_context_clear(tp);
> > > > -
> > > > return error;
> > > > }
> > > >
> > > > @@ -284,6 +281,8 @@ xfs_trans_alloc(
> > > > INIT_LIST_HEAD(&tp->t_dfops);
> > > > tp->t_firstblock = NULLFSBLOCK;
> > > >
> > > > + /* Mark this thread as being in a transaction */
> > > > + xfs_trans_context_set(tp);
> > > > error = xfs_trans_reserve(tp, resp, blocks, rtextents);
> > > > if (error) {
> > > > xfs_trans_cancel(tp);
> > > > @@ -878,7 +877,6 @@ __xfs_trans_commit(
> > > >
> > > > xfs_log_commit_cil(mp, tp, &commit_lsn, regrant);
> > > >
> > > > - xfs_trans_context_clear(tp);
> > > > xfs_trans_free(tp);
> > > >
> > > > /*
> > > > @@ -911,7 +909,6 @@ __xfs_trans_commit(
> > > > tp->t_ticket = NULL;
> > > > }
> > > >
> > > > - xfs_trans_context_clear(tp);
> > > > xfs_trans_free_items(tp, !!error);
> > > > xfs_trans_free(tp);
> > > >
> > > > @@ -971,9 +968,6 @@ xfs_trans_cancel(
> > > > tp->t_ticket = NULL;
> > > > }
> > > >
> > > > - /* mark this thread as no longer being in a transaction */
> > > > - xfs_trans_context_clear(tp);
> > > > -
> > > > xfs_trans_free_items(tp, dirty);
> > > > xfs_trans_free(tp);
> > > > }
> > > > --
> > > > 2.18.4
> > > >
> >
> >
> >
> > --
> > Thanks
> > Yafang
--
Thanks
Yafang
end of thread, other threads:[~2020-12-09 10:44 UTC | newest]
Thread overview: 11+ messages
2020-12-08 12:28 [PATCH v11 0/4] xfs: avoid transaction reservation recursion Yafang Shao
2020-12-08 12:28 ` [PATCH v11 1/4] mm: Add become_kswapd and restore_kswapd Yafang Shao
2020-12-08 12:28 ` [PATCH v11 2/4] xfs: use memalloc_nofs_{save,restore} in xfs transaction Yafang Shao
2020-12-08 19:02 ` Darrick J. Wong
2020-12-08 12:28 ` [PATCH v11 3/4] xfs: refactor the usage around xfs_trans_context_{set,clear} Yafang Shao
2020-12-08 18:59 ` Darrick J. Wong
[not found] ` <CALOAHbB1uKmQ7ns08KW4zH1ikqD0GAY_Y7VySzmTY0=LTEPURA@mail.gmail.com>
2020-12-09 3:53 ` Darrick J. Wong
2020-12-09 10:43 ` Yafang Shao
2020-12-08 12:28 ` [PATCH v11 4/4] xfs: use current->journal_info to avoid transaction reservation recursion Yafang Shao
2020-12-08 18:59 ` Darrick J. Wong
2020-12-09 1:40 ` Yafang Shao