All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC 0/9] ext4: Improve FC trace events and discuss one FC failure
@ 2022-02-22 20:34 Ritesh Harjani
  2022-02-22 20:34 ` [RFC 1/9] ext4: Remove unused enum EXT4_FC_COMMIT_FAILED Ritesh Harjani
                   ` (9 more replies)
  0 siblings, 10 replies; 35+ messages in thread
From: Ritesh Harjani @ 2022-02-22 20:34 UTC (permalink / raw)
  To: linux-ext4
  Cc: Harshad Shirwadkar, Theodore Ts'o, Jan Kara, linux-fsdevel,
	linux-kernel, Ritesh Harjani

Hello,

Sending this RFC out for getting some review comments/suggestions on the
problem listed in this cover letter. It would be helpful if someone has any
pointers around these.

1. Patch-2 fixes the problem reported by Steven about dereferencing pointers from
   the ring buffer [1].
	One of the problem in this patch which I might need some help is, how to
	expose EXT4_FC_REASON_MAX enum for
	+	__array(unsigned int, fc_ineligible_rc, EXT4_FC_REASON_MAX)

	My observation is that this trace_event is correctly printing the array
	values when called from cat /sys/kernel/debug/tracing/trace_pipe.

	But when I am using perf record -e ext4:ext4_fc_stats to record these trace
	events, I am seeing the array values to be all 0.

	Do you know what will be the right and easy way to fix above, so that this works
	properly for perf tools too? And we should be able to backport this too.

2. Patch-9 discusses one of the problem where FC might miss to track/commit
   inode's update. This also needs some discussion. I have added my observations
   on the commit log of patch-9 itself.

3. Remaining are FC trace event improvement patches, which I found useful while
   debugging some of the recent fast_commit issues.

Also a careful review of patches is always helpful :)


[1]: https://lore.kernel.org/linux-ext4/20220221160916.333e6491@rorschach.local.home/T/#u

Ritesh Harjani (9):
  ext4: Remove unused enum EXT4_FC_COMMIT_FAILED
  ext4: Fix ext4_fc_stats trace point
  ext4: Add couple of more fast_commit tracepoints
  ext4: Do not call FC trace event if FS does not support FC
  ext4: Add commit_tid info in jbd debug log
  ext4: Add commit tid info in ext4_fc_commit_start/stop trace events
  ext4: Fix remaining two trace events to use same printk convention
  ext4: Convert ext4_fc_track_dentry type events to use event class
  ext4: fast_commit missing tracking updates to a file

 fs/ext4/fast_commit.c       |  30 +++--
 fs/ext4/fast_commit.h       |   1 -
 include/trace/events/ext4.h | 229 +++++++++++++++++++++++++-----------
 3 files changed, 182 insertions(+), 78 deletions(-)

--
2.31.1


^ permalink raw reply	[flat|nested] 35+ messages in thread

* [RFC 1/9] ext4: Remove unused enum EXT4_FC_COMMIT_FAILED
  2022-02-22 20:34 [RFC 0/9] ext4: Improve FC trace events and discuss one FC failure Ritesh Harjani
@ 2022-02-22 20:34 ` Ritesh Harjani
  2022-02-23  9:37   ` Jan Kara
  2022-02-22 20:34 ` [RFC 2/9] ext4: Fix ext4_fc_stats trace point Ritesh Harjani
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 35+ messages in thread
From: Ritesh Harjani @ 2022-02-22 20:34 UTC (permalink / raw)
  To: linux-ext4
  Cc: Harshad Shirwadkar, Theodore Ts'o, Jan Kara, linux-fsdevel,
	linux-kernel, Ritesh Harjani

Below commit removed all references of EXT4_FC_COMMIT_FAILED.
commit 0915e464cb274 ("ext4: simplify updating of fast commit stats")

Just remove it since it is not used anymore.

Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
---
 fs/ext4/fast_commit.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/fs/ext4/fast_commit.h b/fs/ext4/fast_commit.h
index 02afa52e8e41..80414dcba6e1 100644
--- a/fs/ext4/fast_commit.h
+++ b/fs/ext4/fast_commit.h
@@ -93,7 +93,6 @@ enum {
 	EXT4_FC_REASON_RENAME_DIR,
 	EXT4_FC_REASON_FALLOC_RANGE,
 	EXT4_FC_REASON_INODE_JOURNAL_DATA,
-	EXT4_FC_COMMIT_FAILED,
 	EXT4_FC_REASON_MAX
 };
 
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [RFC 2/9] ext4: Fix ext4_fc_stats trace point
  2022-02-22 20:34 [RFC 0/9] ext4: Improve FC trace events and discuss one FC failure Ritesh Harjani
  2022-02-22 20:34 ` [RFC 1/9] ext4: Remove unused enum EXT4_FC_COMMIT_FAILED Ritesh Harjani
@ 2022-02-22 20:34 ` Ritesh Harjani
  2022-02-22 20:52   ` Steven Rostedt
  2022-02-23  9:54   ` Jan Kara
  2022-02-22 20:34 ` [RFC 3/9] ext4: Add couple of more fast_commit tracepoints Ritesh Harjani
                   ` (7 subsequent siblings)
  9 siblings, 2 replies; 35+ messages in thread
From: Ritesh Harjani @ 2022-02-22 20:34 UTC (permalink / raw)
  To: linux-ext4
  Cc: Harshad Shirwadkar, Theodore Ts'o, Jan Kara, linux-fsdevel,
	linux-kernel, Ritesh Harjani, Steven Rostedt

ftrace's __print_symbolic() requires that any enum values used in the
symbol to string translation table be wrapped in a TRACE_DEFINE_ENUM
so that the enum value can be encoded in the ftrace ring buffer.

This patch also fixes few other problems found in this trace point.
e.g. dereferencing structures in TP_printk which should not be done
at any cost.

Also to avoid checkpatch warnings, this patch removes those
whitespaces/tab stops issues.

Fixes: commit aa75f4d3daae ("ext4: main fast-commit commit path")
Reported-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
---
 include/trace/events/ext4.h | 76 +++++++++++++++++++++++--------------
 1 file changed, 47 insertions(+), 29 deletions(-)

diff --git a/include/trace/events/ext4.h b/include/trace/events/ext4.h
index 19e957b7f941..17fb9c506e8a 100644
--- a/include/trace/events/ext4.h
+++ b/include/trace/events/ext4.h
@@ -95,6 +95,16 @@ TRACE_DEFINE_ENUM(ES_REFERENCED_B);
 	{ FALLOC_FL_COLLAPSE_RANGE,	"COLLAPSE_RANGE"},	\
 	{ FALLOC_FL_ZERO_RANGE,		"ZERO_RANGE"})
 
+TRACE_DEFINE_ENUM(EXT4_FC_REASON_XATTR);
+TRACE_DEFINE_ENUM(EXT4_FC_REASON_CROSS_RENAME);
+TRACE_DEFINE_ENUM(EXT4_FC_REASON_JOURNAL_FLAG_CHANGE);
+TRACE_DEFINE_ENUM(EXT4_FC_REASON_NOMEM);
+TRACE_DEFINE_ENUM(EXT4_FC_REASON_SWAP_BOOT);
+TRACE_DEFINE_ENUM(EXT4_FC_REASON_RESIZE);
+TRACE_DEFINE_ENUM(EXT4_FC_REASON_RENAME_DIR);
+TRACE_DEFINE_ENUM(EXT4_FC_REASON_FALLOC_RANGE);
+TRACE_DEFINE_ENUM(EXT4_FC_REASON_INODE_JOURNAL_DATA);
+
 #define show_fc_reason(reason)						\
 	__print_symbolic(reason,					\
 		{ EXT4_FC_REASON_XATTR,		"XATTR"},		\
@@ -2723,41 +2733,49 @@ TRACE_EVENT(ext4_fc_commit_stop,
 
 #define FC_REASON_NAME_STAT(reason)					\
 	show_fc_reason(reason),						\
-	__entry->sbi->s_fc_stats.fc_ineligible_reason_count[reason]
+	__entry->fc_ineligible_rc[reason]
 
 TRACE_EVENT(ext4_fc_stats,
-	    TP_PROTO(struct super_block *sb),
-
-	    TP_ARGS(sb),
+	TP_PROTO(struct super_block *sb),
 
-	    TP_STRUCT__entry(
-		    __field(dev_t, dev)
-		    __field(struct ext4_sb_info *, sbi)
-		    __field(int, count)
-		    ),
+	TP_ARGS(sb),
 
-	    TP_fast_assign(
-		    __entry->dev = sb->s_dev;
-		    __entry->sbi = EXT4_SB(sb);
-		    ),
+	TP_STRUCT__entry(
+		__field(dev_t, dev)
+		__array(unsigned int, fc_ineligible_rc, EXT4_FC_REASON_MAX)
+		__field(unsigned long, fc_commits)
+		__field(unsigned long, fc_ineligible_commits)
+		__field(unsigned long, fc_numblks)
+	),
 
-	    TP_printk("dev %d:%d fc ineligible reasons:\n"
-		      "%s:%d, %s:%d, %s:%d, %s:%d, %s:%d, %s:%d, %s:%d, %s:%d, %s:%d; "
-		      "num_commits:%ld, ineligible: %ld, numblks: %ld",
-		      MAJOR(__entry->dev), MINOR(__entry->dev),
-		      FC_REASON_NAME_STAT(EXT4_FC_REASON_XATTR),
-		      FC_REASON_NAME_STAT(EXT4_FC_REASON_CROSS_RENAME),
-		      FC_REASON_NAME_STAT(EXT4_FC_REASON_JOURNAL_FLAG_CHANGE),
-		      FC_REASON_NAME_STAT(EXT4_FC_REASON_NOMEM),
-		      FC_REASON_NAME_STAT(EXT4_FC_REASON_SWAP_BOOT),
-		      FC_REASON_NAME_STAT(EXT4_FC_REASON_RESIZE),
-		      FC_REASON_NAME_STAT(EXT4_FC_REASON_RENAME_DIR),
-		      FC_REASON_NAME_STAT(EXT4_FC_REASON_FALLOC_RANGE),
-		      FC_REASON_NAME_STAT(EXT4_FC_REASON_INODE_JOURNAL_DATA),
-		      __entry->sbi->s_fc_stats.fc_num_commits,
-		      __entry->sbi->s_fc_stats.fc_ineligible_commits,
-		      __entry->sbi->s_fc_stats.fc_numblks)
+	TP_fast_assign(
+		int i;
 
+		__entry->dev = sb->s_dev;
+		for (i = 0; i < EXT4_FC_REASON_MAX; i++)
+			__entry->fc_ineligible_rc[i] =
+			EXT4_SB(sb)->s_fc_stats.fc_ineligible_reason_count[i];
+		__entry->fc_commits = EXT4_SB(sb)->s_fc_stats.fc_num_commits;
+		__entry->fc_ineligible_commits =
+			EXT4_SB(sb)->s_fc_stats.fc_ineligible_commits;
+		__entry->fc_numblks = EXT4_SB(sb)->s_fc_stats.fc_numblks;
+	),
+
+	TP_printk("dev %d,%d fc ineligible reasons:\n"
+		  "%s:%u, %s:%u, %s:%u, %s:%u, %s:%u, %s:%u, %s:%u, %s:%u, %s:%u "
+		  "num_commits:%lu, ineligible: %lu, numblks: %lu",
+		  MAJOR(__entry->dev), MINOR(__entry->dev),
+		  FC_REASON_NAME_STAT(EXT4_FC_REASON_XATTR),
+		  FC_REASON_NAME_STAT(EXT4_FC_REASON_CROSS_RENAME),
+		  FC_REASON_NAME_STAT(EXT4_FC_REASON_JOURNAL_FLAG_CHANGE),
+		  FC_REASON_NAME_STAT(EXT4_FC_REASON_NOMEM),
+		  FC_REASON_NAME_STAT(EXT4_FC_REASON_SWAP_BOOT),
+		  FC_REASON_NAME_STAT(EXT4_FC_REASON_RESIZE),
+		  FC_REASON_NAME_STAT(EXT4_FC_REASON_RENAME_DIR),
+		  FC_REASON_NAME_STAT(EXT4_FC_REASON_FALLOC_RANGE),
+		  FC_REASON_NAME_STAT(EXT4_FC_REASON_INODE_JOURNAL_DATA),
+		  __entry->fc_commits, __entry->fc_ineligible_commits,
+		  __entry->fc_numblks)
 );
 
 #define DEFINE_TRACE_DENTRY_EVENT(__type)				\
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [RFC 3/9] ext4: Add couple of more fast_commit tracepoints
  2022-02-22 20:34 [RFC 0/9] ext4: Improve FC trace events and discuss one FC failure Ritesh Harjani
  2022-02-22 20:34 ` [RFC 1/9] ext4: Remove unused enum EXT4_FC_COMMIT_FAILED Ritesh Harjani
  2022-02-22 20:34 ` [RFC 2/9] ext4: Fix ext4_fc_stats trace point Ritesh Harjani
@ 2022-02-22 20:34 ` Ritesh Harjani
  2022-02-23  9:40   ` Jan Kara
  2022-02-22 20:34 ` [RFC 4/9] ext4: Do not call FC trace event if FS does not support FC Ritesh Harjani
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 35+ messages in thread
From: Ritesh Harjani @ 2022-02-22 20:34 UTC (permalink / raw)
  To: linux-ext4
  Cc: Harshad Shirwadkar, Theodore Ts'o, Jan Kara, linux-fsdevel,
	linux-kernel, Ritesh Harjani

This adds two more tracepoints for ext4_fc_track_template() &
ext4_fc_cleanup() which are helpful in debugging some fast_commit issues.

Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
---
 fs/ext4/fast_commit.c       |  3 ++
 include/trace/events/ext4.h | 67 +++++++++++++++++++++++++++++++++++++
 2 files changed, 70 insertions(+)

diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c
index 5ac594e03402..bf70879bb4fe 100644
--- a/fs/ext4/fast_commit.c
+++ b/fs/ext4/fast_commit.c
@@ -386,6 +386,8 @@ static int ext4_fc_track_template(
 	if (ext4_test_mount_flag(inode->i_sb, EXT4_MF_FC_INELIGIBLE))
 		return -EINVAL;
 
+	trace_ext4_fc_track_template(handle, inode, __fc_track_fn, enqueue);
+
 	tid = handle->h_transaction->t_tid;
 	mutex_lock(&ei->i_fc_lock);
 	if (tid == ei->i_sync_tid) {
@@ -1241,6 +1243,7 @@ static void ext4_fc_cleanup(journal_t *journal, int full, tid_t tid)
 	if (full && sbi->s_fc_bh)
 		sbi->s_fc_bh = NULL;
 
+	trace_ext4_fc_cleanup(journal, full, tid);
 	jbd2_fc_release_bufs(journal);
 
 	spin_lock(&sbi->s_fc_lock);
diff --git a/include/trace/events/ext4.h b/include/trace/events/ext4.h
index 17fb9c506e8a..cd09dccea502 100644
--- a/include/trace/events/ext4.h
+++ b/include/trace/events/ext4.h
@@ -2855,6 +2855,73 @@ TRACE_EVENT(ext4_fc_track_range,
 		      __entry->end)
 	);
 
+TRACE_EVENT(ext4_fc_track_template,
+	TP_PROTO(handle_t *handle, struct inode *inode,
+		 void *__fc_track_fn, int enqueue),
+
+	TP_ARGS(handle, inode, __fc_track_fn, enqueue),
+
+	TP_STRUCT__entry(
+		__field(dev_t, dev)
+		__field(tid_t, t_tid)
+		__field(ino_t, i_ino)
+		__field(tid_t, i_sync_tid)
+		__field(void *, __fc_track_fn)
+		__field(int, enqueue)
+		__field(bool, jbd2_ongoing)
+		__field(bool, fc_ongoing)
+	),
+
+	TP_fast_assign(
+		struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
+		struct ext4_inode_info *ei = EXT4_I(inode);
+
+		__entry->dev = inode->i_sb->s_dev;
+		__entry->t_tid = handle->h_transaction->t_tid;
+		__entry->i_ino = inode->i_ino;
+		__entry->i_sync_tid = ei->i_sync_tid;
+		__entry->__fc_track_fn = __fc_track_fn;
+		__entry->enqueue = enqueue;
+		__entry->jbd2_ongoing =
+		    sbi->s_journal->j_flags & JBD2_FULL_COMMIT_ONGOING;
+		__entry->fc_ongoing =
+		    sbi->s_journal->j_flags & JBD2_FAST_COMMIT_ONGOING;
+	),
+
+	TP_printk("dev %d,%d, t_tid %u, ino %lu, i_sync_tid %u, "
+		  "track_fn %pS, enqueue %d, jbd2_ongoing %d, fc_ongoing %d",
+		  MAJOR(__entry->dev), MINOR(__entry->dev),
+		  __entry->t_tid, __entry->i_ino, __entry->i_sync_tid,
+		  (void *)__entry->__fc_track_fn, __entry->enqueue,
+		  __entry->jbd2_ongoing, __entry->fc_ongoing)
+	);
+
+TRACE_EVENT(ext4_fc_cleanup,
+	TP_PROTO(journal_t *journal, int full, tid_t tid),
+
+	TP_ARGS(journal, full, tid),
+
+	TP_STRUCT__entry(
+		__field(dev_t, dev)
+		__field(int, j_fc_off)
+		__field(int, full)
+		__field(tid_t, tid)
+	),
+
+	TP_fast_assign(
+		struct super_block *sb = journal->j_private;
+
+		__entry->dev = sb->s_dev;
+		__entry->j_fc_off = journal->j_fc_off;
+		__entry->full = full;
+		__entry->tid = tid;
+	),
+
+	TP_printk("dev %d,%d, j_fc_off %d, full %d, tid %u",
+		  MAJOR(__entry->dev), MINOR(__entry->dev),
+		  __entry->j_fc_off, __entry->full, __entry->tid)
+	);
+
 TRACE_EVENT(ext4_update_sb,
 	TP_PROTO(struct super_block *sb, ext4_fsblk_t fsblk,
 		 unsigned int flags),
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [RFC 4/9] ext4: Do not call FC trace event if FS does not support FC
  2022-02-22 20:34 [RFC 0/9] ext4: Improve FC trace events and discuss one FC failure Ritesh Harjani
                   ` (2 preceding siblings ...)
  2022-02-22 20:34 ` [RFC 3/9] ext4: Add couple of more fast_commit tracepoints Ritesh Harjani
@ 2022-02-22 20:34 ` Ritesh Harjani
  2022-02-23  9:41   ` Jan Kara
  2022-02-22 20:34 ` [RFC 5/9] ext4: Add commit_tid info in jbd debug log Ritesh Harjani
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 35+ messages in thread
From: Ritesh Harjani @ 2022-02-22 20:34 UTC (permalink / raw)
  To: linux-ext4
  Cc: Harshad Shirwadkar, Theodore Ts'o, Jan Kara, linux-fsdevel,
	linux-kernel, Ritesh Harjani

This just puts trace_ext4_fc_commit_start(sb) & ktime_get()
for measuring FC commit time, after the check of whether sb
supports JOURNAL_FAST_COMMIT or not.

Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
---
 fs/ext4/fast_commit.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c
index bf70879bb4fe..7fb1eceef30c 100644
--- a/fs/ext4/fast_commit.c
+++ b/fs/ext4/fast_commit.c
@@ -1167,13 +1167,13 @@ int ext4_fc_commit(journal_t *journal, tid_t commit_tid)
 	int status = EXT4_FC_STATUS_OK, fc_bufs_before = 0;
 	ktime_t start_time, commit_time;
 
+	if (!test_opt2(sb, JOURNAL_FAST_COMMIT))
+		return jbd2_complete_transaction(journal, commit_tid);
+
 	trace_ext4_fc_commit_start(sb);
 
 	start_time = ktime_get();
 
-	if (!test_opt2(sb, JOURNAL_FAST_COMMIT))
-		return jbd2_complete_transaction(journal, commit_tid);
-
 restart_fc:
 	ret = jbd2_fc_begin_commit(journal, commit_tid);
 	if (ret == -EALREADY) {
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [RFC 5/9] ext4: Add commit_tid info in jbd debug log
  2022-02-22 20:34 [RFC 0/9] ext4: Improve FC trace events and discuss one FC failure Ritesh Harjani
                   ` (3 preceding siblings ...)
  2022-02-22 20:34 ` [RFC 4/9] ext4: Do not call FC trace event if FS does not support FC Ritesh Harjani
@ 2022-02-22 20:34 ` Ritesh Harjani
  2022-02-23  9:42   ` Jan Kara
  2022-02-22 20:34 ` [RFC 6/9] ext4: Add commit tid info in ext4_fc_commit_start/stop trace events Ritesh Harjani
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 35+ messages in thread
From: Ritesh Harjani @ 2022-02-22 20:34 UTC (permalink / raw)
  To: linux-ext4
  Cc: Harshad Shirwadkar, Theodore Ts'o, Jan Kara, linux-fsdevel,
	linux-kernel, Ritesh Harjani

This adds commit_tid argument in ext4_fc_update_stats()
so that we can add this information too in jbd_debug logs.
This is also required in a later patch to pass the commit_tid info in
ext4_fc_commit_start/stop() trace events.

Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
---
 fs/ext4/fast_commit.c | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c
index 7fb1eceef30c..ee32aac0cbbf 100644
--- a/fs/ext4/fast_commit.c
+++ b/fs/ext4/fast_commit.c
@@ -1127,11 +1127,12 @@ static int ext4_fc_perform_commit(journal_t *journal)
 }
 
 static void ext4_fc_update_stats(struct super_block *sb, int status,
-				 u64 commit_time, int nblks)
+				 u64 commit_time, int nblks, tid_t commit_tid)
 {
 	struct ext4_fc_stats *stats = &EXT4_SB(sb)->s_fc_stats;
 
-	jbd_debug(1, "Fast commit ended with status = %d", status);
+	jbd_debug(1, "Fast commit ended with status = %d for tid %u",
+			status, commit_tid);
 	if (status == EXT4_FC_STATUS_OK) {
 		stats->fc_num_commits++;
 		stats->fc_numblks += nblks;
@@ -1181,14 +1182,16 @@ int ext4_fc_commit(journal_t *journal, tid_t commit_tid)
 		if (atomic_read(&sbi->s_fc_subtid) <= subtid &&
 			commit_tid > journal->j_commit_sequence)
 			goto restart_fc;
-		ext4_fc_update_stats(sb, EXT4_FC_STATUS_SKIPPED, 0, 0);
+		ext4_fc_update_stats(sb, EXT4_FC_STATUS_SKIPPED, 0, 0,
+				commit_tid);
 		return 0;
 	} else if (ret) {
 		/*
 		 * Commit couldn't start. Just update stats and perform a
 		 * full commit.
 		 */
-		ext4_fc_update_stats(sb, EXT4_FC_STATUS_FAILED, 0, 0);
+		ext4_fc_update_stats(sb, EXT4_FC_STATUS_FAILED, 0, 0,
+				commit_tid);
 		return jbd2_complete_transaction(journal, commit_tid);
 	}
 
@@ -1220,12 +1223,12 @@ int ext4_fc_commit(journal_t *journal, tid_t commit_tid)
 	 * don't react too strongly to vast changes in the commit time
 	 */
 	commit_time = ktime_to_ns(ktime_sub(ktime_get(), start_time));
-	ext4_fc_update_stats(sb, status, commit_time, nblks);
+	ext4_fc_update_stats(sb, status, commit_time, nblks, commit_tid);
 	return ret;
 
 fallback:
 	ret = jbd2_fc_end_commit_fallback(journal);
-	ext4_fc_update_stats(sb, status, 0, 0);
+	ext4_fc_update_stats(sb, status, 0, 0, commit_tid);
 	return ret;
 }
 
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [RFC 6/9] ext4: Add commit tid info in ext4_fc_commit_start/stop trace events
  2022-02-22 20:34 [RFC 0/9] ext4: Improve FC trace events and discuss one FC failure Ritesh Harjani
                   ` (4 preceding siblings ...)
  2022-02-22 20:34 ` [RFC 5/9] ext4: Add commit_tid info in jbd debug log Ritesh Harjani
@ 2022-02-22 20:34 ` Ritesh Harjani
  2022-02-23  9:44   ` Jan Kara
  2022-02-22 20:34 ` [RFC 7/9] ext4: Fix remaining two trace events to use same printk convention Ritesh Harjani
                   ` (3 subsequent siblings)
  9 siblings, 1 reply; 35+ messages in thread
From: Ritesh Harjani @ 2022-02-22 20:34 UTC (permalink / raw)
  To: linux-ext4
  Cc: Harshad Shirwadkar, Theodore Ts'o, Jan Kara, linux-fsdevel,
	linux-kernel, Ritesh Harjani

This adds commit_tid info in ext4_fc_commit_start/stop which is helpful
in debugging fast_commit issues.

For e.g. issues where due to jbd2 journal full commit, FC miss to commit
updates to a file.

Also improves TP_prink format string i.e. all ext4 and jbd2 trace events
starts with "dev MAjOR,MINOR". Let's follow the same convention while we
are still at it.

Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
---
 fs/ext4/fast_commit.c       |  4 ++--
 include/trace/events/ext4.h | 21 +++++++++++++--------
 2 files changed, 15 insertions(+), 10 deletions(-)

diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c
index ee32aac0cbbf..8803ba087b07 100644
--- a/fs/ext4/fast_commit.c
+++ b/fs/ext4/fast_commit.c
@@ -1150,7 +1150,7 @@ static void ext4_fc_update_stats(struct super_block *sb, int status,
 	} else {
 		stats->fc_skipped_commits++;
 	}
-	trace_ext4_fc_commit_stop(sb, nblks, status);
+	trace_ext4_fc_commit_stop(sb, nblks, status, commit_tid);
 }
 
 /*
@@ -1171,7 +1171,7 @@ int ext4_fc_commit(journal_t *journal, tid_t commit_tid)
 	if (!test_opt2(sb, JOURNAL_FAST_COMMIT))
 		return jbd2_complete_transaction(journal, commit_tid);
 
-	trace_ext4_fc_commit_start(sb);
+	trace_ext4_fc_commit_start(sb, commit_tid);
 
 	start_time = ktime_get();
 
diff --git a/include/trace/events/ext4.h b/include/trace/events/ext4.h
index cd09dccea502..6e66cb7ce624 100644
--- a/include/trace/events/ext4.h
+++ b/include/trace/events/ext4.h
@@ -2685,26 +2685,29 @@ TRACE_EVENT(ext4_fc_replay,
 );
 
 TRACE_EVENT(ext4_fc_commit_start,
-	TP_PROTO(struct super_block *sb),
+	TP_PROTO(struct super_block *sb, tid_t commit_tid),
 
-	TP_ARGS(sb),
+	TP_ARGS(sb, commit_tid),
 
 	TP_STRUCT__entry(
 		__field(dev_t, dev)
+		__field(tid_t, tid)
 	),
 
 	TP_fast_assign(
 		__entry->dev = sb->s_dev;
+		__entry->tid = commit_tid;
 	),
 
-	TP_printk("fast_commit started on dev %d,%d",
-		  MAJOR(__entry->dev), MINOR(__entry->dev))
+	TP_printk("dev %d,%d tid %u", MAJOR(__entry->dev), MINOR(__entry->dev),
+		  __entry->tid)
 );
 
 TRACE_EVENT(ext4_fc_commit_stop,
-	    TP_PROTO(struct super_block *sb, int nblks, int reason),
+	    TP_PROTO(struct super_block *sb, int nblks, int reason,
+		     tid_t commit_tid),
 
-	TP_ARGS(sb, nblks, reason),
+	TP_ARGS(sb, nblks, reason, commit_tid),
 
 	TP_STRUCT__entry(
 		__field(dev_t, dev)
@@ -2713,6 +2716,7 @@ TRACE_EVENT(ext4_fc_commit_stop,
 		__field(int, num_fc)
 		__field(int, num_fc_ineligible)
 		__field(int, nblks_agg)
+		__field(tid_t, tid)
 	),
 
 	TP_fast_assign(
@@ -2723,12 +2727,13 @@ TRACE_EVENT(ext4_fc_commit_stop,
 		__entry->num_fc_ineligible =
 			EXT4_SB(sb)->s_fc_stats.fc_ineligible_commits;
 		__entry->nblks_agg = EXT4_SB(sb)->s_fc_stats.fc_numblks;
+		__entry->tid = commit_tid;
 	),
 
-	TP_printk("fc on [%d,%d] nblks %d, reason %d, fc = %d, ineligible = %d, agg_nblks %d",
+	TP_printk("dev %d,%d nblks %d, reason %d, fc = %d, ineligible = %d, agg_nblks %d, tid %u",
 		  MAJOR(__entry->dev), MINOR(__entry->dev),
 		  __entry->nblks, __entry->reason, __entry->num_fc,
-		  __entry->num_fc_ineligible, __entry->nblks_agg)
+		  __entry->num_fc_ineligible, __entry->nblks_agg, __entry->tid)
 );
 
 #define FC_REASON_NAME_STAT(reason)					\
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [RFC 7/9] ext4: Fix remaining two trace events to use same printk convention
  2022-02-22 20:34 [RFC 0/9] ext4: Improve FC trace events and discuss one FC failure Ritesh Harjani
                   ` (5 preceding siblings ...)
  2022-02-22 20:34 ` [RFC 6/9] ext4: Add commit tid info in ext4_fc_commit_start/stop trace events Ritesh Harjani
@ 2022-02-22 20:34 ` Ritesh Harjani
  2022-02-23  9:45   ` Jan Kara
  2022-02-22 20:34 ` [RFC 8/9] ext4: Convert ext4_fc_track_dentry type events to use event class Ritesh Harjani
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 35+ messages in thread
From: Ritesh Harjani @ 2022-02-22 20:34 UTC (permalink / raw)
  To: linux-ext4
  Cc: Harshad Shirwadkar, Theodore Ts'o, Jan Kara, linux-fsdevel,
	linux-kernel, Ritesh Harjani

All ext4 & jbd2 trace events starts with "dev Major:Minor".
While we are still improving/adding the ftrace events for FC,
let's fix last two remaining trace events to follow the same
convention.

Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
---
 include/trace/events/ext4.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/trace/events/ext4.h b/include/trace/events/ext4.h
index 6e66cb7ce624..233dbffa5ceb 100644
--- a/include/trace/events/ext4.h
+++ b/include/trace/events/ext4.h
@@ -2653,7 +2653,7 @@ TRACE_EVENT(ext4_fc_replay_scan,
 		__entry->off = off;
 	),
 
-	TP_printk("FC scan pass on dev %d,%d: error %d, off %d",
+	TP_printk("dev %d,%d error %d, off %d",
 		  MAJOR(__entry->dev), MINOR(__entry->dev),
 		  __entry->error, __entry->off)
 );
@@ -2679,7 +2679,7 @@ TRACE_EVENT(ext4_fc_replay,
 		__entry->priv2 = priv2;
 	),
 
-	TP_printk("FC Replay %d,%d: tag %d, ino %d, data1 %d, data2 %d",
+	TP_printk("dev %d,%d: tag %d, ino %d, data1 %d, data2 %d",
 		  MAJOR(__entry->dev), MINOR(__entry->dev),
 		  __entry->tag, __entry->ino, __entry->priv1, __entry->priv2)
 );
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [RFC 8/9] ext4: Convert ext4_fc_track_dentry type events to use event class
  2022-02-22 20:34 [RFC 0/9] ext4: Improve FC trace events and discuss one FC failure Ritesh Harjani
                   ` (6 preceding siblings ...)
  2022-02-22 20:34 ` [RFC 7/9] ext4: Fix remaining two trace events to use same printk convention Ritesh Harjani
@ 2022-02-22 20:34 ` Ritesh Harjani
  2022-02-23  9:49   ` Jan Kara
  2022-02-22 20:34 ` [RFC 9/9] ext4: fast_commit missing tracking updates to a file Ritesh Harjani
  2022-03-09 17:48 ` [RFC 0/9] ext4: Improve FC trace events and discuss one FC failure Theodore Ts'o
  9 siblings, 1 reply; 35+ messages in thread
From: Ritesh Harjani @ 2022-02-22 20:34 UTC (permalink / raw)
  To: linux-ext4
  Cc: Harshad Shirwadkar, Theodore Ts'o, Jan Kara, linux-fsdevel,
	linux-kernel, Ritesh Harjani

One should use DECLARE_EVENT_CLASS for similar event types instead of
defining TRACE_EVENT for each event type. This is helpful in reducing
the text section footprint for e.g. [1]

[1]: https://lwn.net/Articles/381064/

Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
---
 include/trace/events/ext4.h | 57 +++++++++++++++++++++----------------
 1 file changed, 32 insertions(+), 25 deletions(-)

diff --git a/include/trace/events/ext4.h b/include/trace/events/ext4.h
index 233dbffa5ceb..33a059d845d6 100644
--- a/include/trace/events/ext4.h
+++ b/include/trace/events/ext4.h
@@ -2783,33 +2783,40 @@ TRACE_EVENT(ext4_fc_stats,
 		  __entry->fc_numblks)
 );
 
-#define DEFINE_TRACE_DENTRY_EVENT(__type)				\
-	TRACE_EVENT(ext4_fc_track_##__type,				\
-	    TP_PROTO(struct inode *inode, struct dentry *dentry, int ret), \
-									\
-	    TP_ARGS(inode, dentry, ret),				\
-									\
-	    TP_STRUCT__entry(						\
-		    __field(dev_t, dev)					\
-		    __field(int, ino)					\
-		    __field(int, error)					\
-		    ),							\
-									\
-	    TP_fast_assign(						\
-		    __entry->dev = inode->i_sb->s_dev;			\
-		    __entry->ino = inode->i_ino;			\
-		    __entry->error = ret;				\
-		    ),							\
-									\
-	    TP_printk("dev %d:%d, inode %d, error %d, fc_%s",		\
-		      MAJOR(__entry->dev), MINOR(__entry->dev),		\
-		      __entry->ino, __entry->error,			\
-		      #__type)						\
+DECLARE_EVENT_CLASS(ext4_fc_track_dentry,
+
+	TP_PROTO(struct inode *inode, struct dentry *dentry, int ret),
+
+	TP_ARGS(inode, dentry, ret),
+
+	TP_STRUCT__entry(
+		__field(dev_t, dev)
+		__field(int, ino)
+		__field(int, error)
+	),
+
+	TP_fast_assign(
+		__entry->dev = inode->i_sb->s_dev;
+		__entry->ino = inode->i_ino;
+		__entry->error = ret;
+	),
+
+	TP_printk("dev %d,%d, inode %d, error %d",
+		  MAJOR(__entry->dev), MINOR(__entry->dev),
+		  __entry->ino, __entry->error
 	)
+);
+
+#define DEFINE_EVENT_CLASS_TYPE(__type)					\
+DEFINE_EVENT(ext4_fc_track_dentry, ext4_fc_track_##__type,		\
+	TP_PROTO(struct inode *inode, struct dentry *dentry, int ret),	\
+	TP_ARGS(inode, dentry, ret)					\
+)
+
 
-DEFINE_TRACE_DENTRY_EVENT(create);
-DEFINE_TRACE_DENTRY_EVENT(link);
-DEFINE_TRACE_DENTRY_EVENT(unlink);
+DEFINE_EVENT_CLASS_TYPE(create);
+DEFINE_EVENT_CLASS_TYPE(link);
+DEFINE_EVENT_CLASS_TYPE(unlink);
 
 TRACE_EVENT(ext4_fc_track_inode,
 	    TP_PROTO(struct inode *inode, int ret),
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [RFC 9/9] ext4: fast_commit missing tracking updates to a file
  2022-02-22 20:34 [RFC 0/9] ext4: Improve FC trace events and discuss one FC failure Ritesh Harjani
                   ` (7 preceding siblings ...)
  2022-02-22 20:34 ` [RFC 8/9] ext4: Convert ext4_fc_track_dentry type events to use event class Ritesh Harjani
@ 2022-02-22 20:34 ` Ritesh Harjani
  2022-02-23  3:50   ` [External] " Xin Yin
  2022-03-09 17:48 ` [RFC 0/9] ext4: Improve FC trace events and discuss one FC failure Theodore Ts'o
  9 siblings, 1 reply; 35+ messages in thread
From: Ritesh Harjani @ 2022-02-22 20:34 UTC (permalink / raw)
  To: linux-ext4
  Cc: Harshad Shirwadkar, Theodore Ts'o, Jan Kara, linux-fsdevel,
	linux-kernel, Ritesh Harjani

<DO NOT MERGE THIS YET>

Testcase
==========
1. i=0; while [ $i -lt 1000 ]; do xfs_io -f -c "pwrite -S 0xaa -b 32k 0 32k" -c "fsync" /mnt/$i; i=$(($i+1)); done && sudo ./src/godown -v /mnt && sudo umount /mnt && sudo mount /dev/loop2 /mnt'
2. ls -alih /mnt/ -> In this you will observe one such file with 0 bytes (which ideally should not happen)

^^^ say if you don't see the issue because your underlying storage
device is very fast, then maybe try with commit=1 mount option.

Analysis
==========
It seems a file's updates can be a part of two transaction tid.
Below are the sequence of events which could cause this issue.

jbd2_handle_start -> (t_tid = 38)
__ext4_new_inode
ext4_fc_track_template -> __track_inode -> (i_sync_tid = 38, t_tid = 38)
<track more updates>
jbd2_start_commit -> (t_tid = 38)

jbd2_handle_start (tid = 39)
ext4_fc_track_template -> __track_inode -> (i_sync_tid = 38, t_tid 39)
    -> ext4_fc_reset_inode & ei->i_sync_tid = t_tid

ext4_fc_commit_start -> (will wait since jbd2 full commit is in progress)
jbd2_end_commit (t_tid = 38)
    -> jbd2_fc_cleanup() -> this will cleanup entries in sbi->s_fc_q[FC_Q_MAIN]
        -> And the above could result inode size as 0 as  after effect.
ext4_fc_commit_stop

You could find the logs for the above behavior for inode 979 at [1].

-> So what is happening here is since the ei->i_fc_list is not empty
(because it is already part of sb's MAIN queue), we don't add this inode
again into neither sb's MAIN or STAGING queue.
And after jbd2_fc_cleanup() is called from jbd2 full commit, we
just remove this inode from the main queue.

So as a simple fix, what I did below was to check if it is a jbd2 full commit
in ext4_fc_cleanup(), and if the ei->i_sync_tid > tid, that means we
need not remove that from MAIN queue. This is since neither jbd2 nor FC
has committed updates of those inodes for this new txn tid yet.

But below are some quick queries on this
=========================================

1. why do we call ext4_fc_reset_inode() when inode tid and
   running txn tid does not match?

2. Also is this an expected behavior from the design perspective of
   fast_commit. i.e.
   a. the inode can be part of two tids?
   b. And that while a full commit is in progress, the inode can still
   receive updates but using a new transaction tid.

Frankly speaking, since I was also working on other things, so I haven't
yet got the chance to completely analyze the situation yet.
Once I have those things sorted, I will spend more time on this, to
understand it more. Meanwhile if you already have some answers to above
queries/observations, please do share those here.

Links
=========
[1] https://raw.githubusercontent.com/riteshharjani/LinuxStudy/master/ext4/fast_commit/fc_inode_missing_updates_ino_979.txt

Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
---
 fs/ext4/fast_commit.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c
index 8803ba087b07..769b584c2552 100644
--- a/fs/ext4/fast_commit.c
+++ b/fs/ext4/fast_commit.c
@@ -1252,6 +1252,8 @@ static void ext4_fc_cleanup(journal_t *journal, int full, tid_t tid)
 	spin_lock(&sbi->s_fc_lock);
 	list_for_each_entry_safe(iter, iter_n, &sbi->s_fc_q[FC_Q_MAIN],
 				 i_fc_list) {
+		if (full && iter->i_sync_tid > tid)
+			continue;
 		list_del_init(&iter->i_fc_list);
 		ext4_clear_inode_state(&iter->vfs_inode,
 				       EXT4_STATE_FC_COMMITTING);
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [RFC 2/9] ext4: Fix ext4_fc_stats trace point
  2022-02-22 20:34 ` [RFC 2/9] ext4: Fix ext4_fc_stats trace point Ritesh Harjani
@ 2022-02-22 20:52   ` Steven Rostedt
  2022-02-23  9:54   ` Jan Kara
  1 sibling, 0 replies; 35+ messages in thread
From: Steven Rostedt @ 2022-02-22 20:52 UTC (permalink / raw)
  To: Ritesh Harjani
  Cc: linux-ext4, Harshad Shirwadkar, Theodore Ts'o, Jan Kara,
	linux-fsdevel, linux-kernel

On Wed, 23 Feb 2022 02:04:10 +0530
Ritesh Harjani <riteshh@linux.ibm.com> wrote:

> ftrace's __print_symbolic() requires that any enum values used in the
> symbol to string translation table be wrapped in a TRACE_DEFINE_ENUM
> so that the enum value can be encoded in the ftrace ring buffer.

Actually, you mean "decoded from the ftrace ring buffer by user space
tooling".

> 
> This patch also fixes few other problems found in this trace point.
> e.g. dereferencing structures in TP_printk which should not be done
> at any cost.
> 
> Also to avoid checkpatch warnings, this patch removes those
> whitespaces/tab stops issues.
> 
> Fixes: commit aa75f4d3daae ("ext4: main fast-commit commit path")

Should add:

Cc: stable@kernel.org

as the commit landed in 5.10.

> Reported-by: Steven Rostedt <rostedt@goodmis.org>
> Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>

Looks good with one small nit below.

> ---
>  include/trace/events/ext4.h | 76 +++++++++++++++++++++++--------------
>  1 file changed, 47 insertions(+), 29 deletions(-)
> 
> diff --git a/include/trace/events/ext4.h b/include/trace/events/ext4.h
> index 19e957b7f941..17fb9c506e8a 100644
> --- a/include/trace/events/ext4.h
> +++ b/include/trace/events/ext4.h
> @@ -95,6 +95,16 @@ TRACE_DEFINE_ENUM(ES_REFERENCED_B);
>  	{ FALLOC_FL_COLLAPSE_RANGE,	"COLLAPSE_RANGE"},	\
>  	{ FALLOC_FL_ZERO_RANGE,		"ZERO_RANGE"})
>  
> +TRACE_DEFINE_ENUM(EXT4_FC_REASON_XATTR);
> +TRACE_DEFINE_ENUM(EXT4_FC_REASON_CROSS_RENAME);
> +TRACE_DEFINE_ENUM(EXT4_FC_REASON_JOURNAL_FLAG_CHANGE);
> +TRACE_DEFINE_ENUM(EXT4_FC_REASON_NOMEM);
> +TRACE_DEFINE_ENUM(EXT4_FC_REASON_SWAP_BOOT);
> +TRACE_DEFINE_ENUM(EXT4_FC_REASON_RESIZE);
> +TRACE_DEFINE_ENUM(EXT4_FC_REASON_RENAME_DIR);
> +TRACE_DEFINE_ENUM(EXT4_FC_REASON_FALLOC_RANGE);
> +TRACE_DEFINE_ENUM(EXT4_FC_REASON_INODE_JOURNAL_DATA);
> +
>  #define show_fc_reason(reason)						\
>  	__print_symbolic(reason,					\
>  		{ EXT4_FC_REASON_XATTR,		"XATTR"},		\
> @@ -2723,41 +2733,49 @@ TRACE_EVENT(ext4_fc_commit_stop,
>  
>  #define FC_REASON_NAME_STAT(reason)					\
>  	show_fc_reason(reason),						\
> -	__entry->sbi->s_fc_stats.fc_ineligible_reason_count[reason]
> +	__entry->fc_ineligible_rc[reason]
>  
>  TRACE_EVENT(ext4_fc_stats,
> -	    TP_PROTO(struct super_block *sb),
> -
> -	    TP_ARGS(sb),
> +	TP_PROTO(struct super_block *sb),
>  
> -	    TP_STRUCT__entry(
> -		    __field(dev_t, dev)
> -		    __field(struct ext4_sb_info *, sbi)
> -		    __field(int, count)
> -		    ),
> +	TP_ARGS(sb),
>  
> -	    TP_fast_assign(
> -		    __entry->dev = sb->s_dev;
> -		    __entry->sbi = EXT4_SB(sb);
> -		    ),
> +	TP_STRUCT__entry(
> +		__field(dev_t, dev)
> +		__array(unsigned int, fc_ineligible_rc, EXT4_FC_REASON_MAX)
> +		__field(unsigned long, fc_commits)
> +		__field(unsigned long, fc_ineligible_commits)
> +		__field(unsigned long, fc_numblks)
> +	),
>  
> -	    TP_printk("dev %d:%d fc ineligible reasons:\n"
> -		      "%s:%d, %s:%d, %s:%d, %s:%d, %s:%d, %s:%d, %s:%d, %s:%d, %s:%d; "
> -		      "num_commits:%ld, ineligible: %ld, numblks: %ld",
> -		      MAJOR(__entry->dev), MINOR(__entry->dev),
> -		      FC_REASON_NAME_STAT(EXT4_FC_REASON_XATTR),
> -		      FC_REASON_NAME_STAT(EXT4_FC_REASON_CROSS_RENAME),
> -		      FC_REASON_NAME_STAT(EXT4_FC_REASON_JOURNAL_FLAG_CHANGE),
> -		      FC_REASON_NAME_STAT(EXT4_FC_REASON_NOMEM),
> -		      FC_REASON_NAME_STAT(EXT4_FC_REASON_SWAP_BOOT),
> -		      FC_REASON_NAME_STAT(EXT4_FC_REASON_RESIZE),
> -		      FC_REASON_NAME_STAT(EXT4_FC_REASON_RENAME_DIR),
> -		      FC_REASON_NAME_STAT(EXT4_FC_REASON_FALLOC_RANGE),
> -		      FC_REASON_NAME_STAT(EXT4_FC_REASON_INODE_JOURNAL_DATA),
> -		      __entry->sbi->s_fc_stats.fc_num_commits,
> -		      __entry->sbi->s_fc_stats.fc_ineligible_commits,
> -		      __entry->sbi->s_fc_stats.fc_numblks)
> +	TP_fast_assign(
> +		int i;
>  
> +		__entry->dev = sb->s_dev;
> +		for (i = 0; i < EXT4_FC_REASON_MAX; i++)
> +			__entry->fc_ineligible_rc[i] =
> +			EXT4_SB(sb)->s_fc_stats.fc_ineligible_reason_count[i];

As the above line is a continuation of the previous line, it should be
indented as such. And since you added a line break, it is recommend to have
any blocks use brackets. That is:

		for (i = 0; i < EXT4_FC_REASON_MAX; i++) {
			__entry->fc_ineligible_rc[i] =
				EXT4_SB(sb)->s_fc_stats.fc_ineligible_reason_count[i];
		}

Other than that:

Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>

-- Steve


> +		__entry->fc_commits = EXT4_SB(sb)->s_fc_stats.fc_num_commits;
> +		__entry->fc_ineligible_commits =
> +			EXT4_SB(sb)->s_fc_stats.fc_ineligible_commits;
> +		__entry->fc_numblks = EXT4_SB(sb)->s_fc_stats.fc_numblks;
> +	),
> +
> +	TP_printk("dev %d,%d fc ineligible reasons:\n"
> +		  "%s:%u, %s:%u, %s:%u, %s:%u, %s:%u, %s:%u, %s:%u, %s:%u, %s:%u "
> +		  "num_commits:%lu, ineligible: %lu, numblks: %lu",
> +		  MAJOR(__entry->dev), MINOR(__entry->dev),
> +		  FC_REASON_NAME_STAT(EXT4_FC_REASON_XATTR),
> +		  FC_REASON_NAME_STAT(EXT4_FC_REASON_CROSS_RENAME),
> +		  FC_REASON_NAME_STAT(EXT4_FC_REASON_JOURNAL_FLAG_CHANGE),
> +		  FC_REASON_NAME_STAT(EXT4_FC_REASON_NOMEM),
> +		  FC_REASON_NAME_STAT(EXT4_FC_REASON_SWAP_BOOT),
> +		  FC_REASON_NAME_STAT(EXT4_FC_REASON_RESIZE),
> +		  FC_REASON_NAME_STAT(EXT4_FC_REASON_RENAME_DIR),
> +		  FC_REASON_NAME_STAT(EXT4_FC_REASON_FALLOC_RANGE),
> +		  FC_REASON_NAME_STAT(EXT4_FC_REASON_INODE_JOURNAL_DATA),
> +		  __entry->fc_commits, __entry->fc_ineligible_commits,
> +		  __entry->fc_numblks)
>  );
>  
>  #define DEFINE_TRACE_DENTRY_EVENT(__type)				\


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [External] [RFC 9/9] ext4: fast_commit missing tracking updates to a file
  2022-02-22 20:34 ` [RFC 9/9] ext4: fast_commit missing tracking updates to a file Ritesh Harjani
@ 2022-02-23  3:50   ` Xin Yin
  2022-02-23 13:58     ` Ritesh Harjani
  0 siblings, 1 reply; 35+ messages in thread
From: Xin Yin @ 2022-02-23  3:50 UTC (permalink / raw)
  To: Ritesh Harjani
  Cc: linux-ext4, Harshad Shirwadkar, Theodore Ts'o, Jan Kara,
	linux-fsdevel, linux-kernel

On Wed, Feb 23, 2022 at 4:36 AM Ritesh Harjani <riteshh@linux.ibm.com> wrote:
>
> <DO NOT MERGE THIS YET>
>
> Testcase
> ==========
> 1. i=0; while [ $i -lt 1000 ]; do xfs_io -f -c "pwrite -S 0xaa -b 32k 0 32k" -c "fsync" /mnt/$i; i=$(($i+1)); done && sudo ./src/godown -v /mnt && sudo umount /mnt && sudo mount /dev/loop2 /mnt'
> 2. ls -alih /mnt/ -> In this you will observe one such file with 0 bytes (which ideally should not happen)
>
> ^^^ say if you don't see the issue because your underlying storage
> device is very fast, then maybe try with commit=1 mount option.
>
> Analysis
> ==========
> It seems a file's updates can be a part of two transaction tid.
> Below are the sequence of events which could cause this issue.
>
> jbd2_handle_start -> (t_tid = 38)
> __ext4_new_inode
> ext4_fc_track_template -> __track_inode -> (i_sync_tid = 38, t_tid = 38)
> <track more updates>
> jbd2_start_commit -> (t_tid = 38)
>
> jbd2_handle_start (tid = 39)
> ext4_fc_track_template -> __track_inode -> (i_sync_tid = 38, t_tid 39)
>     -> ext4_fc_reset_inode & ei->i_sync_tid = t_tid
>
> ext4_fc_commit_start -> (will wait since jbd2 full commit is in progress)
> jbd2_end_commit (t_tid = 38)
>     -> jbd2_fc_cleanup() -> this will cleanup entries in sbi->s_fc_q[FC_Q_MAIN]
>         -> And the above could result inode size as 0 as  after effect.
> ext4_fc_commit_stop
>
> You could find the logs for the above behavior for inode 979 at [1].
>
> -> So what is happening here is since the ei->i_fc_list is not empty
> (because it is already part of sb's MAIN queue), we don't add this inode
> again into neither sb's MAIN or STAGING queue.
> And after jbd2_fc_cleanup() is called from jbd2 full commit, we
> just remove this inode from the main queue.
>
> So as a simple fix, what I did below was to check if it is a jbd2 full commit
> in ext4_fc_cleanup(), and if the ei->i_sync_tid > tid, that means we
> need not remove that from MAIN queue. This is since neither jbd2 nor FC
> has committed updates of those inodes for this new txn tid yet.
>
> But below are some quick queries on this
> =========================================
>
> 1. why do we call ext4_fc_reset_inode() when inode tid and
>    running txn tid does not match?
This is part of a change in commit:bdc8a53a6f2f,  it fixes the issue
for fc tracking logic while jbd2 commit is ongoing.
If the inode tid is bigger than txn tid, that means this inode may be
in the STAGING queue, if we reset it then it will lose the tack range.
I think it's a similar issue, the difference is this inode is already
in the MAIN queue before the jbd2 commit starts.
And yes , I think in this case we can not remove it from the MAIN
queue, but still need to clear EXT4_STATE_FC_COMMITTING right? it may
block some task still waiting for it.

Thanks,
Xin Yin
>
> 2. Also is this an expected behavior from the design perspective of
>    fast_commit. i.e.
>    a. the inode can be part of two tids?
>    b. And that while a full commit is in progress, the inode can still
>    receive updates but using a new transaction tid.
>
> Frankly speaking, since I was also working on other things, so I haven't
> yet got the chance to completely analyze the situation yet.
> Once I have those things sorted, I will spend more time on this, to
> understand it more. Meanwhile if you already have some answers to above
> queries/observations, please do share those here.
>
> Links
> =========
> [1] https://raw.githubusercontent.com/riteshharjani/LinuxStudy/master/ext4/fast_commit/fc_inode_missing_updates_ino_979.txt
>
> Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
> ---
>  fs/ext4/fast_commit.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c
> index 8803ba087b07..769b584c2552 100644
> --- a/fs/ext4/fast_commit.c
> +++ b/fs/ext4/fast_commit.c
> @@ -1252,6 +1252,8 @@ static void ext4_fc_cleanup(journal_t *journal, int full, tid_t tid)
>         spin_lock(&sbi->s_fc_lock);
>         list_for_each_entry_safe(iter, iter_n, &sbi->s_fc_q[FC_Q_MAIN],
>                                  i_fc_list) {
> +               if (full && iter->i_sync_tid > tid)
> +                       continue;
>                 list_del_init(&iter->i_fc_list);
>                 ext4_clear_inode_state(&iter->vfs_inode,
>                                        EXT4_STATE_FC_COMMITTING);
> --
> 2.31.1
>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC 1/9] ext4: Remove unused enum EXT4_FC_COMMIT_FAILED
  2022-02-22 20:34 ` [RFC 1/9] ext4: Remove unused enum EXT4_FC_COMMIT_FAILED Ritesh Harjani
@ 2022-02-23  9:37   ` Jan Kara
  2022-02-27 18:27     ` harshad shirwadkar
  0 siblings, 1 reply; 35+ messages in thread
From: Jan Kara @ 2022-02-23  9:37 UTC (permalink / raw)
  To: Ritesh Harjani
  Cc: linux-ext4, Harshad Shirwadkar, Theodore Ts'o, Jan Kara,
	linux-fsdevel, linux-kernel

On Wed 23-02-22 02:04:09, Ritesh Harjani wrote:
> Below commit removed all references of EXT4_FC_COMMIT_FAILED.
> commit 0915e464cb274 ("ext4: simplify updating of fast commit stats")
> 
> Just remove it since it is not used anymore.
> 
> Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>

Sure. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

									Honza

> ---
>  fs/ext4/fast_commit.h | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/fs/ext4/fast_commit.h b/fs/ext4/fast_commit.h
> index 02afa52e8e41..80414dcba6e1 100644
> --- a/fs/ext4/fast_commit.h
> +++ b/fs/ext4/fast_commit.h
> @@ -93,7 +93,6 @@ enum {
>  	EXT4_FC_REASON_RENAME_DIR,
>  	EXT4_FC_REASON_FALLOC_RANGE,
>  	EXT4_FC_REASON_INODE_JOURNAL_DATA,
> -	EXT4_FC_COMMIT_FAILED,
>  	EXT4_FC_REASON_MAX
>  };
>  
> -- 
> 2.31.1
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC 3/9] ext4: Add couple of more fast_commit tracepoints
  2022-02-22 20:34 ` [RFC 3/9] ext4: Add couple of more fast_commit tracepoints Ritesh Harjani
@ 2022-02-23  9:40   ` Jan Kara
  2022-02-23 10:11     ` Ritesh Harjani
  0 siblings, 1 reply; 35+ messages in thread
From: Jan Kara @ 2022-02-23  9:40 UTC (permalink / raw)
  To: Ritesh Harjani
  Cc: linux-ext4, Harshad Shirwadkar, Theodore Ts'o, Jan Kara,
	linux-fsdevel, linux-kernel

On Wed 23-02-22 02:04:11, Ritesh Harjani wrote:
> This adds two more tracepoints for ext4_fc_track_template() &
> ext4_fc_cleanup() which are helpful in debugging some fast_commit issues.
> 
> Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>

So why is this more useful than trace_ext4_fc_track_range() and other
tracepoints? I don't think it provides any more information? What am I
missing?

								Honza

> ---
>  fs/ext4/fast_commit.c       |  3 ++
>  include/trace/events/ext4.h | 67 +++++++++++++++++++++++++++++++++++++
>  2 files changed, 70 insertions(+)
> 
> diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c
> index 5ac594e03402..bf70879bb4fe 100644
> --- a/fs/ext4/fast_commit.c
> +++ b/fs/ext4/fast_commit.c
> @@ -386,6 +386,8 @@ static int ext4_fc_track_template(
>  	if (ext4_test_mount_flag(inode->i_sb, EXT4_MF_FC_INELIGIBLE))
>  		return -EINVAL;
>  
> +	trace_ext4_fc_track_template(handle, inode, __fc_track_fn, enqueue);
> +
>  	tid = handle->h_transaction->t_tid;
>  	mutex_lock(&ei->i_fc_lock);
>  	if (tid == ei->i_sync_tid) {
> @@ -1241,6 +1243,7 @@ static void ext4_fc_cleanup(journal_t *journal, int full, tid_t tid)
>  	if (full && sbi->s_fc_bh)
>  		sbi->s_fc_bh = NULL;
>  
> +	trace_ext4_fc_cleanup(journal, full, tid);
>  	jbd2_fc_release_bufs(journal);
>  
>  	spin_lock(&sbi->s_fc_lock);
> diff --git a/include/trace/events/ext4.h b/include/trace/events/ext4.h
> index 17fb9c506e8a..cd09dccea502 100644
> --- a/include/trace/events/ext4.h
> +++ b/include/trace/events/ext4.h
> @@ -2855,6 +2855,73 @@ TRACE_EVENT(ext4_fc_track_range,
>  		      __entry->end)
>  	);
>  
> +TRACE_EVENT(ext4_fc_track_template,
> +	TP_PROTO(handle_t *handle, struct inode *inode,
> +		 void *__fc_track_fn, int enqueue),
> +
> +	TP_ARGS(handle, inode, __fc_track_fn, enqueue),
> +
> +	TP_STRUCT__entry(
> +		__field(dev_t, dev)
> +		__field(tid_t, t_tid)
> +		__field(ino_t, i_ino)
> +		__field(tid_t, i_sync_tid)
> +		__field(void *, __fc_track_fn)
> +		__field(int, enqueue)
> +		__field(bool, jbd2_ongoing)
> +		__field(bool, fc_ongoing)
> +	),
> +
> +	TP_fast_assign(
> +		struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
> +		struct ext4_inode_info *ei = EXT4_I(inode);
> +
> +		__entry->dev = inode->i_sb->s_dev;
> +		__entry->t_tid = handle->h_transaction->t_tid;
> +		__entry->i_ino = inode->i_ino;
> +		__entry->i_sync_tid = ei->i_sync_tid;
> +		__entry->__fc_track_fn = __fc_track_fn;
> +		__entry->enqueue = enqueue;
> +		__entry->jbd2_ongoing =
> +		    sbi->s_journal->j_flags & JBD2_FULL_COMMIT_ONGOING;
> +		__entry->fc_ongoing =
> +		    sbi->s_journal->j_flags & JBD2_FAST_COMMIT_ONGOING;
> +	),
> +
> +	TP_printk("dev %d,%d, t_tid %u, ino %lu, i_sync_tid %u, "
> +		  "track_fn %pS, enqueue %d, jbd2_ongoing %d, fc_ongoing %d",
> +		  MAJOR(__entry->dev), MINOR(__entry->dev),
> +		  __entry->t_tid, __entry->i_ino, __entry->i_sync_tid,
> +		  (void *)__entry->__fc_track_fn, __entry->enqueue,
> +		  __entry->jbd2_ongoing, __entry->fc_ongoing)
> +	);
> +
> +TRACE_EVENT(ext4_fc_cleanup,
> +	TP_PROTO(journal_t *journal, int full, tid_t tid),
> +
> +	TP_ARGS(journal, full, tid),
> +
> +	TP_STRUCT__entry(
> +		__field(dev_t, dev)
> +		__field(int, j_fc_off)
> +		__field(int, full)
> +		__field(tid_t, tid)
> +	),
> +
> +	TP_fast_assign(
> +		struct super_block *sb = journal->j_private;
> +
> +		__entry->dev = sb->s_dev;
> +		__entry->j_fc_off = journal->j_fc_off;
> +		__entry->full = full;
> +		__entry->tid = tid;
> +	),
> +
> +	TP_printk("dev %d,%d, j_fc_off %d, full %d, tid %u",
> +		  MAJOR(__entry->dev), MINOR(__entry->dev),
> +		  __entry->j_fc_off, __entry->full, __entry->tid)
> +	);
> +
>  TRACE_EVENT(ext4_update_sb,
>  	TP_PROTO(struct super_block *sb, ext4_fsblk_t fsblk,
>  		 unsigned int flags),
> -- 
> 2.31.1
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC 4/9] ext4: Do not call FC trace event if FS does not support FC
  2022-02-22 20:34 ` [RFC 4/9] ext4: Do not call FC trace event if FS does not support FC Ritesh Harjani
@ 2022-02-23  9:41   ` Jan Kara
  2022-02-27 18:30     ` harshad shirwadkar
  0 siblings, 1 reply; 35+ messages in thread
From: Jan Kara @ 2022-02-23  9:41 UTC (permalink / raw)
  To: Ritesh Harjani
  Cc: linux-ext4, Harshad Shirwadkar, Theodore Ts'o, Jan Kara,
	linux-fsdevel, linux-kernel

On Wed 23-02-22 02:04:12, Ritesh Harjani wrote:
> This just puts trace_ext4_fc_commit_start(sb) & ktime_get()
> for measuring FC commit time, after the check of whether sb
> supports JOURNAL_FAST_COMMIT or not.
> 
> Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>

Looks good. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/ext4/fast_commit.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c
> index bf70879bb4fe..7fb1eceef30c 100644
> --- a/fs/ext4/fast_commit.c
> +++ b/fs/ext4/fast_commit.c
> @@ -1167,13 +1167,13 @@ int ext4_fc_commit(journal_t *journal, tid_t commit_tid)
>  	int status = EXT4_FC_STATUS_OK, fc_bufs_before = 0;
>  	ktime_t start_time, commit_time;
>  
> +	if (!test_opt2(sb, JOURNAL_FAST_COMMIT))
> +		return jbd2_complete_transaction(journal, commit_tid);
> +
>  	trace_ext4_fc_commit_start(sb);
>  
>  	start_time = ktime_get();
>  
> -	if (!test_opt2(sb, JOURNAL_FAST_COMMIT))
> -		return jbd2_complete_transaction(journal, commit_tid);
> -
>  restart_fc:
>  	ret = jbd2_fc_begin_commit(journal, commit_tid);
>  	if (ret == -EALREADY) {
> -- 
> 2.31.1
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC 5/9] ext4: Add commit_tid info in jbd debug log
  2022-02-22 20:34 ` [RFC 5/9] ext4: Add commit_tid info in jbd debug log Ritesh Harjani
@ 2022-02-23  9:42   ` Jan Kara
  2022-02-27 18:31     ` harshad shirwadkar
  0 siblings, 1 reply; 35+ messages in thread
From: Jan Kara @ 2022-02-23  9:42 UTC (permalink / raw)
  To: Ritesh Harjani
  Cc: linux-ext4, Harshad Shirwadkar, Theodore Ts'o, Jan Kara,
	linux-fsdevel, linux-kernel

On Wed 23-02-22 02:04:13, Ritesh Harjani wrote:
> This adds commit_tid argument in ext4_fc_update_stats()
> so that we can add this information too in jbd_debug logs.
> This is also required in a later patch to pass the commit_tid info in
> ext4_fc_commit_start/stop() trace events.
> 
> Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>

Looks good. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/ext4/fast_commit.c | 15 +++++++++------
>  1 file changed, 9 insertions(+), 6 deletions(-)
> 
> diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c
> index 7fb1eceef30c..ee32aac0cbbf 100644
> --- a/fs/ext4/fast_commit.c
> +++ b/fs/ext4/fast_commit.c
> @@ -1127,11 +1127,12 @@ static int ext4_fc_perform_commit(journal_t *journal)
>  }
>  
>  static void ext4_fc_update_stats(struct super_block *sb, int status,
> -				 u64 commit_time, int nblks)
> +				 u64 commit_time, int nblks, tid_t commit_tid)
>  {
>  	struct ext4_fc_stats *stats = &EXT4_SB(sb)->s_fc_stats;
>  
> -	jbd_debug(1, "Fast commit ended with status = %d", status);
> +	jbd_debug(1, "Fast commit ended with status = %d for tid %u",
> +			status, commit_tid);
>  	if (status == EXT4_FC_STATUS_OK) {
>  		stats->fc_num_commits++;
>  		stats->fc_numblks += nblks;
> @@ -1181,14 +1182,16 @@ int ext4_fc_commit(journal_t *journal, tid_t commit_tid)
>  		if (atomic_read(&sbi->s_fc_subtid) <= subtid &&
>  			commit_tid > journal->j_commit_sequence)
>  			goto restart_fc;
> -		ext4_fc_update_stats(sb, EXT4_FC_STATUS_SKIPPED, 0, 0);
> +		ext4_fc_update_stats(sb, EXT4_FC_STATUS_SKIPPED, 0, 0,
> +				commit_tid);
>  		return 0;
>  	} else if (ret) {
>  		/*
>  		 * Commit couldn't start. Just update stats and perform a
>  		 * full commit.
>  		 */
> -		ext4_fc_update_stats(sb, EXT4_FC_STATUS_FAILED, 0, 0);
> +		ext4_fc_update_stats(sb, EXT4_FC_STATUS_FAILED, 0, 0,
> +				commit_tid);
>  		return jbd2_complete_transaction(journal, commit_tid);
>  	}
>  
> @@ -1220,12 +1223,12 @@ int ext4_fc_commit(journal_t *journal, tid_t commit_tid)
>  	 * don't react too strongly to vast changes in the commit time
>  	 */
>  	commit_time = ktime_to_ns(ktime_sub(ktime_get(), start_time));
> -	ext4_fc_update_stats(sb, status, commit_time, nblks);
> +	ext4_fc_update_stats(sb, status, commit_time, nblks, commit_tid);
>  	return ret;
>  
>  fallback:
>  	ret = jbd2_fc_end_commit_fallback(journal);
> -	ext4_fc_update_stats(sb, status, 0, 0);
> +	ext4_fc_update_stats(sb, status, 0, 0, commit_tid);
>  	return ret;
>  }
>  
> -- 
> 2.31.1
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC 6/9] ext4: Add commit tid info in ext4_fc_commit_start/stop trace events
  2022-02-22 20:34 ` [RFC 6/9] ext4: Add commit tid info in ext4_fc_commit_start/stop trace events Ritesh Harjani
@ 2022-02-23  9:44   ` Jan Kara
  2022-02-27 18:31     ` harshad shirwadkar
  0 siblings, 1 reply; 35+ messages in thread
From: Jan Kara @ 2022-02-23  9:44 UTC (permalink / raw)
  To: Ritesh Harjani
  Cc: linux-ext4, Harshad Shirwadkar, Theodore Ts'o, Jan Kara,
	linux-fsdevel, linux-kernel

On Wed 23-02-22 02:04:14, Ritesh Harjani wrote:
> This adds commit_tid info in ext4_fc_commit_start/stop which is helpful
> in debugging fast_commit issues.
> 
> For e.g. issues where due to jbd2 journal full commit, FC miss to commit
> updates to a file.
> 
> Also improves TP_prink format string i.e. all ext4 and jbd2 trace events
> starts with "dev MAjOR,MINOR". Let's follow the same convention while we
> are still at it.
> 
> Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>

Looks good. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/ext4/fast_commit.c       |  4 ++--
>  include/trace/events/ext4.h | 21 +++++++++++++--------
>  2 files changed, 15 insertions(+), 10 deletions(-)
> 
> diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c
> index ee32aac0cbbf..8803ba087b07 100644
> --- a/fs/ext4/fast_commit.c
> +++ b/fs/ext4/fast_commit.c
> @@ -1150,7 +1150,7 @@ static void ext4_fc_update_stats(struct super_block *sb, int status,
>  	} else {
>  		stats->fc_skipped_commits++;
>  	}
> -	trace_ext4_fc_commit_stop(sb, nblks, status);
> +	trace_ext4_fc_commit_stop(sb, nblks, status, commit_tid);
>  }
>  
>  /*
> @@ -1171,7 +1171,7 @@ int ext4_fc_commit(journal_t *journal, tid_t commit_tid)
>  	if (!test_opt2(sb, JOURNAL_FAST_COMMIT))
>  		return jbd2_complete_transaction(journal, commit_tid);
>  
> -	trace_ext4_fc_commit_start(sb);
> +	trace_ext4_fc_commit_start(sb, commit_tid);
>  
>  	start_time = ktime_get();
>  
> diff --git a/include/trace/events/ext4.h b/include/trace/events/ext4.h
> index cd09dccea502..6e66cb7ce624 100644
> --- a/include/trace/events/ext4.h
> +++ b/include/trace/events/ext4.h
> @@ -2685,26 +2685,29 @@ TRACE_EVENT(ext4_fc_replay,
>  );
>  
>  TRACE_EVENT(ext4_fc_commit_start,
> -	TP_PROTO(struct super_block *sb),
> +	TP_PROTO(struct super_block *sb, tid_t commit_tid),
>  
> -	TP_ARGS(sb),
> +	TP_ARGS(sb, commit_tid),
>  
>  	TP_STRUCT__entry(
>  		__field(dev_t, dev)
> +		__field(tid_t, tid)
>  	),
>  
>  	TP_fast_assign(
>  		__entry->dev = sb->s_dev;
> +		__entry->tid = commit_tid;
>  	),
>  
> -	TP_printk("fast_commit started on dev %d,%d",
> -		  MAJOR(__entry->dev), MINOR(__entry->dev))
> +	TP_printk("dev %d,%d tid %u", MAJOR(__entry->dev), MINOR(__entry->dev),
> +		  __entry->tid)
>  );
>  
>  TRACE_EVENT(ext4_fc_commit_stop,
> -	    TP_PROTO(struct super_block *sb, int nblks, int reason),
> +	    TP_PROTO(struct super_block *sb, int nblks, int reason,
> +		     tid_t commit_tid),
>  
> -	TP_ARGS(sb, nblks, reason),
> +	TP_ARGS(sb, nblks, reason, commit_tid),
>  
>  	TP_STRUCT__entry(
>  		__field(dev_t, dev)
> @@ -2713,6 +2716,7 @@ TRACE_EVENT(ext4_fc_commit_stop,
>  		__field(int, num_fc)
>  		__field(int, num_fc_ineligible)
>  		__field(int, nblks_agg)
> +		__field(tid_t, tid)
>  	),
>  
>  	TP_fast_assign(
> @@ -2723,12 +2727,13 @@ TRACE_EVENT(ext4_fc_commit_stop,
>  		__entry->num_fc_ineligible =
>  			EXT4_SB(sb)->s_fc_stats.fc_ineligible_commits;
>  		__entry->nblks_agg = EXT4_SB(sb)->s_fc_stats.fc_numblks;
> +		__entry->tid = commit_tid;
>  	),
>  
> -	TP_printk("fc on [%d,%d] nblks %d, reason %d, fc = %d, ineligible = %d, agg_nblks %d",
> +	TP_printk("dev %d,%d nblks %d, reason %d, fc = %d, ineligible = %d, agg_nblks %d, tid %u",
>  		  MAJOR(__entry->dev), MINOR(__entry->dev),
>  		  __entry->nblks, __entry->reason, __entry->num_fc,
> -		  __entry->num_fc_ineligible, __entry->nblks_agg)
> +		  __entry->num_fc_ineligible, __entry->nblks_agg, __entry->tid)
>  );
>  
>  #define FC_REASON_NAME_STAT(reason)					\
> -- 
> 2.31.1
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC 7/9] ext4: Fix remaining two trace events to use same printk convention
  2022-02-22 20:34 ` [RFC 7/9] ext4: Fix remaining two trace events to use same printk convention Ritesh Harjani
@ 2022-02-23  9:45   ` Jan Kara
  2022-02-27 18:32     ` harshad shirwadkar
  0 siblings, 1 reply; 35+ messages in thread
From: Jan Kara @ 2022-02-23  9:45 UTC (permalink / raw)
  To: Ritesh Harjani
  Cc: linux-ext4, Harshad Shirwadkar, Theodore Ts'o, Jan Kara,
	linux-fsdevel, linux-kernel

On Wed 23-02-22 02:04:15, Ritesh Harjani wrote:
> All ext4 & jbd2 trace events starts with "dev Major:Minor".
> While we are still improving/adding the ftrace events for FC,
> let's fix last two remaining trace events to follow the same
> convention.
> 
> Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>

OK. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  include/trace/events/ext4.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/include/trace/events/ext4.h b/include/trace/events/ext4.h
> index 6e66cb7ce624..233dbffa5ceb 100644
> --- a/include/trace/events/ext4.h
> +++ b/include/trace/events/ext4.h
> @@ -2653,7 +2653,7 @@ TRACE_EVENT(ext4_fc_replay_scan,
>  		__entry->off = off;
>  	),
>  
> -	TP_printk("FC scan pass on dev %d,%d: error %d, off %d",
> +	TP_printk("dev %d,%d error %d, off %d",
>  		  MAJOR(__entry->dev), MINOR(__entry->dev),
>  		  __entry->error, __entry->off)
>  );
> @@ -2679,7 +2679,7 @@ TRACE_EVENT(ext4_fc_replay,
>  		__entry->priv2 = priv2;
>  	),
>  
> -	TP_printk("FC Replay %d,%d: tag %d, ino %d, data1 %d, data2 %d",
> +	TP_printk("dev %d,%d: tag %d, ino %d, data1 %d, data2 %d",
>  		  MAJOR(__entry->dev), MINOR(__entry->dev),
>  		  __entry->tag, __entry->ino, __entry->priv1, __entry->priv2)
>  );
> -- 
> 2.31.1
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC 8/9] ext4: Convert ext4_fc_track_dentry type events to use event class
  2022-02-22 20:34 ` [RFC 8/9] ext4: Convert ext4_fc_track_dentry type events to use event class Ritesh Harjani
@ 2022-02-23  9:49   ` Jan Kara
  2022-02-27 18:35     ` harshad shirwadkar
  0 siblings, 1 reply; 35+ messages in thread
From: Jan Kara @ 2022-02-23  9:49 UTC (permalink / raw)
  To: Ritesh Harjani
  Cc: linux-ext4, Harshad Shirwadkar, Theodore Ts'o, Jan Kara,
	linux-fsdevel, linux-kernel

On Wed 23-02-22 02:04:16, Ritesh Harjani wrote:
> One should use DECLARE_EVENT_CLASS for similar event types instead of
> defining TRACE_EVENT for each event type. This is helpful in reducing
> the text section footprint for e.g. [1]
> 
> [1]: https://lwn.net/Articles/381064/
> 
> Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>

Looks good. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  include/trace/events/ext4.h | 57 +++++++++++++++++++++----------------
>  1 file changed, 32 insertions(+), 25 deletions(-)
> 
> diff --git a/include/trace/events/ext4.h b/include/trace/events/ext4.h
> index 233dbffa5ceb..33a059d845d6 100644
> --- a/include/trace/events/ext4.h
> +++ b/include/trace/events/ext4.h
> @@ -2783,33 +2783,40 @@ TRACE_EVENT(ext4_fc_stats,
>  		  __entry->fc_numblks)
>  );
>  
> -#define DEFINE_TRACE_DENTRY_EVENT(__type)				\
> -	TRACE_EVENT(ext4_fc_track_##__type,				\
> -	    TP_PROTO(struct inode *inode, struct dentry *dentry, int ret), \
> -									\
> -	    TP_ARGS(inode, dentry, ret),				\
> -									\
> -	    TP_STRUCT__entry(						\
> -		    __field(dev_t, dev)					\
> -		    __field(int, ino)					\
> -		    __field(int, error)					\
> -		    ),							\
> -									\
> -	    TP_fast_assign(						\
> -		    __entry->dev = inode->i_sb->s_dev;			\
> -		    __entry->ino = inode->i_ino;			\
> -		    __entry->error = ret;				\
> -		    ),							\
> -									\
> -	    TP_printk("dev %d:%d, inode %d, error %d, fc_%s",		\
> -		      MAJOR(__entry->dev), MINOR(__entry->dev),		\
> -		      __entry->ino, __entry->error,			\
> -		      #__type)						\
> +DECLARE_EVENT_CLASS(ext4_fc_track_dentry,
> +
> +	TP_PROTO(struct inode *inode, struct dentry *dentry, int ret),
> +
> +	TP_ARGS(inode, dentry, ret),
> +
> +	TP_STRUCT__entry(
> +		__field(dev_t, dev)
> +		__field(int, ino)
> +		__field(int, error)
> +	),
> +
> +	TP_fast_assign(
> +		__entry->dev = inode->i_sb->s_dev;
> +		__entry->ino = inode->i_ino;
> +		__entry->error = ret;
> +	),
> +
> +	TP_printk("dev %d,%d, inode %d, error %d",
> +		  MAJOR(__entry->dev), MINOR(__entry->dev),
> +		  __entry->ino, __entry->error
>  	)
> +);
> +
> +#define DEFINE_EVENT_CLASS_TYPE(__type)					\
> +DEFINE_EVENT(ext4_fc_track_dentry, ext4_fc_track_##__type,		\
> +	TP_PROTO(struct inode *inode, struct dentry *dentry, int ret),	\
> +	TP_ARGS(inode, dentry, ret)					\
> +)
> +
>  
> -DEFINE_TRACE_DENTRY_EVENT(create);
> -DEFINE_TRACE_DENTRY_EVENT(link);
> -DEFINE_TRACE_DENTRY_EVENT(unlink);
> +DEFINE_EVENT_CLASS_TYPE(create);
> +DEFINE_EVENT_CLASS_TYPE(link);
> +DEFINE_EVENT_CLASS_TYPE(unlink);
>  
>  TRACE_EVENT(ext4_fc_track_inode,
>  	    TP_PROTO(struct inode *inode, int ret),
> -- 
> 2.31.1
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC 2/9] ext4: Fix ext4_fc_stats trace point
  2022-02-22 20:34 ` [RFC 2/9] ext4: Fix ext4_fc_stats trace point Ritesh Harjani
  2022-02-22 20:52   ` Steven Rostedt
@ 2022-02-23  9:54   ` Jan Kara
  2022-02-27 18:29     ` harshad shirwadkar
  1 sibling, 1 reply; 35+ messages in thread
From: Jan Kara @ 2022-02-23  9:54 UTC (permalink / raw)
  To: Ritesh Harjani
  Cc: linux-ext4, Harshad Shirwadkar, Theodore Ts'o, Jan Kara,
	linux-fsdevel, linux-kernel, Steven Rostedt

On Wed 23-02-22 02:04:10, Ritesh Harjani wrote:
> ftrace's __print_symbolic() requires that any enum values used in the
> symbol to string translation table be wrapped in a TRACE_DEFINE_ENUM
> so that the enum value can be encoded in the ftrace ring buffer.
> 
> This patch also fixes few other problems found in this trace point.
> e.g. dereferencing structures in TP_printk which should not be done
> at any cost.
> 
> Also to avoid checkpatch warnings, this patch removes those
> whitespaces/tab stops issues.
> 
> Fixes: commit aa75f4d3daae ("ext4: main fast-commit commit path")
> Reported-by: Steven Rostedt <rostedt@goodmis.org>
> Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>

Looks good (modulo Steven's nit). Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza


> ---
>  include/trace/events/ext4.h | 76 +++++++++++++++++++++++--------------
>  1 file changed, 47 insertions(+), 29 deletions(-)
> 
> diff --git a/include/trace/events/ext4.h b/include/trace/events/ext4.h
> index 19e957b7f941..17fb9c506e8a 100644
> --- a/include/trace/events/ext4.h
> +++ b/include/trace/events/ext4.h
> @@ -95,6 +95,16 @@ TRACE_DEFINE_ENUM(ES_REFERENCED_B);
>  	{ FALLOC_FL_COLLAPSE_RANGE,	"COLLAPSE_RANGE"},	\
>  	{ FALLOC_FL_ZERO_RANGE,		"ZERO_RANGE"})
>  
> +TRACE_DEFINE_ENUM(EXT4_FC_REASON_XATTR);
> +TRACE_DEFINE_ENUM(EXT4_FC_REASON_CROSS_RENAME);
> +TRACE_DEFINE_ENUM(EXT4_FC_REASON_JOURNAL_FLAG_CHANGE);
> +TRACE_DEFINE_ENUM(EXT4_FC_REASON_NOMEM);
> +TRACE_DEFINE_ENUM(EXT4_FC_REASON_SWAP_BOOT);
> +TRACE_DEFINE_ENUM(EXT4_FC_REASON_RESIZE);
> +TRACE_DEFINE_ENUM(EXT4_FC_REASON_RENAME_DIR);
> +TRACE_DEFINE_ENUM(EXT4_FC_REASON_FALLOC_RANGE);
> +TRACE_DEFINE_ENUM(EXT4_FC_REASON_INODE_JOURNAL_DATA);
> +
>  #define show_fc_reason(reason)						\
>  	__print_symbolic(reason,					\
>  		{ EXT4_FC_REASON_XATTR,		"XATTR"},		\
> @@ -2723,41 +2733,49 @@ TRACE_EVENT(ext4_fc_commit_stop,
>  
>  #define FC_REASON_NAME_STAT(reason)					\
>  	show_fc_reason(reason),						\
> -	__entry->sbi->s_fc_stats.fc_ineligible_reason_count[reason]
> +	__entry->fc_ineligible_rc[reason]
>  
>  TRACE_EVENT(ext4_fc_stats,
> -	    TP_PROTO(struct super_block *sb),
> -
> -	    TP_ARGS(sb),
> +	TP_PROTO(struct super_block *sb),
>  
> -	    TP_STRUCT__entry(
> -		    __field(dev_t, dev)
> -		    __field(struct ext4_sb_info *, sbi)
> -		    __field(int, count)
> -		    ),
> +	TP_ARGS(sb),
>  
> -	    TP_fast_assign(
> -		    __entry->dev = sb->s_dev;
> -		    __entry->sbi = EXT4_SB(sb);
> -		    ),
> +	TP_STRUCT__entry(
> +		__field(dev_t, dev)
> +		__array(unsigned int, fc_ineligible_rc, EXT4_FC_REASON_MAX)
> +		__field(unsigned long, fc_commits)
> +		__field(unsigned long, fc_ineligible_commits)
> +		__field(unsigned long, fc_numblks)
> +	),
>  
> -	    TP_printk("dev %d:%d fc ineligible reasons:\n"
> -		      "%s:%d, %s:%d, %s:%d, %s:%d, %s:%d, %s:%d, %s:%d, %s:%d, %s:%d; "
> -		      "num_commits:%ld, ineligible: %ld, numblks: %ld",
> -		      MAJOR(__entry->dev), MINOR(__entry->dev),
> -		      FC_REASON_NAME_STAT(EXT4_FC_REASON_XATTR),
> -		      FC_REASON_NAME_STAT(EXT4_FC_REASON_CROSS_RENAME),
> -		      FC_REASON_NAME_STAT(EXT4_FC_REASON_JOURNAL_FLAG_CHANGE),
> -		      FC_REASON_NAME_STAT(EXT4_FC_REASON_NOMEM),
> -		      FC_REASON_NAME_STAT(EXT4_FC_REASON_SWAP_BOOT),
> -		      FC_REASON_NAME_STAT(EXT4_FC_REASON_RESIZE),
> -		      FC_REASON_NAME_STAT(EXT4_FC_REASON_RENAME_DIR),
> -		      FC_REASON_NAME_STAT(EXT4_FC_REASON_FALLOC_RANGE),
> -		      FC_REASON_NAME_STAT(EXT4_FC_REASON_INODE_JOURNAL_DATA),
> -		      __entry->sbi->s_fc_stats.fc_num_commits,
> -		      __entry->sbi->s_fc_stats.fc_ineligible_commits,
> -		      __entry->sbi->s_fc_stats.fc_numblks)
> +	TP_fast_assign(
> +		int i;
>  
> +		__entry->dev = sb->s_dev;
> +		for (i = 0; i < EXT4_FC_REASON_MAX; i++)
> +			__entry->fc_ineligible_rc[i] =
> +			EXT4_SB(sb)->s_fc_stats.fc_ineligible_reason_count[i];
> +		__entry->fc_commits = EXT4_SB(sb)->s_fc_stats.fc_num_commits;
> +		__entry->fc_ineligible_commits =
> +			EXT4_SB(sb)->s_fc_stats.fc_ineligible_commits;
> +		__entry->fc_numblks = EXT4_SB(sb)->s_fc_stats.fc_numblks;
> +	),
> +
> +	TP_printk("dev %d,%d fc ineligible reasons:\n"
> +		  "%s:%u, %s:%u, %s:%u, %s:%u, %s:%u, %s:%u, %s:%u, %s:%u, %s:%u "
> +		  "num_commits:%lu, ineligible: %lu, numblks: %lu",
> +		  MAJOR(__entry->dev), MINOR(__entry->dev),
> +		  FC_REASON_NAME_STAT(EXT4_FC_REASON_XATTR),
> +		  FC_REASON_NAME_STAT(EXT4_FC_REASON_CROSS_RENAME),
> +		  FC_REASON_NAME_STAT(EXT4_FC_REASON_JOURNAL_FLAG_CHANGE),
> +		  FC_REASON_NAME_STAT(EXT4_FC_REASON_NOMEM),
> +		  FC_REASON_NAME_STAT(EXT4_FC_REASON_SWAP_BOOT),
> +		  FC_REASON_NAME_STAT(EXT4_FC_REASON_RESIZE),
> +		  FC_REASON_NAME_STAT(EXT4_FC_REASON_RENAME_DIR),
> +		  FC_REASON_NAME_STAT(EXT4_FC_REASON_FALLOC_RANGE),
> +		  FC_REASON_NAME_STAT(EXT4_FC_REASON_INODE_JOURNAL_DATA),
> +		  __entry->fc_commits, __entry->fc_ineligible_commits,
> +		  __entry->fc_numblks)
>  );
>  
>  #define DEFINE_TRACE_DENTRY_EVENT(__type)				\
> -- 
> 2.31.1
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC 3/9] ext4: Add couple of more fast_commit tracepoints
  2022-02-23  9:40   ` Jan Kara
@ 2022-02-23 10:11     ` Ritesh Harjani
  2022-02-23 11:53       ` Jan Kara
  0 siblings, 1 reply; 35+ messages in thread
From: Ritesh Harjani @ 2022-02-23 10:11 UTC (permalink / raw)
  To: Jan Kara
  Cc: linux-ext4, Harshad Shirwadkar, Theodore Ts'o, linux-fsdevel,
	linux-kernel

On 22/02/23 10:40AM, Jan Kara wrote:
> On Wed 23-02-22 02:04:11, Ritesh Harjani wrote:
> > This adds two more tracepoints for ext4_fc_track_template() &
> > ext4_fc_cleanup() which are helpful in debugging some fast_commit issues.
> >
> > Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
>
> So why is this more useful than trace_ext4_fc_track_range() and other
> tracepoints? I don't think it provides any more information? What am I
> missing?

Thanks Jan for all the reviews.

So ext4_fc_track_template() adds almost all required information
(including the caller info) in this one trace point along with transaction tid
which is useful for tracking issue similar to what is mentioned in Patch-9.

(race with if inode is part of two transactions tid where jbd2 full commit
may begin for txn n-1 while inode is still in sbi->s_fc_q[MAIN])

And similarly ext4_fc_cleanup() helps with that information about which tid
completed and whether it was called from jbd2 full commit or from fast_commit.

-ritesh

>
> 								Honza
>
> > ---
> >  fs/ext4/fast_commit.c       |  3 ++
> >  include/trace/events/ext4.h | 67 +++++++++++++++++++++++++++++++++++++
> >  2 files changed, 70 insertions(+)
> >
> > diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c
> > index 5ac594e03402..bf70879bb4fe 100644
> > --- a/fs/ext4/fast_commit.c
> > +++ b/fs/ext4/fast_commit.c
> > @@ -386,6 +386,8 @@ static int ext4_fc_track_template(
> >  	if (ext4_test_mount_flag(inode->i_sb, EXT4_MF_FC_INELIGIBLE))
> >  		return -EINVAL;
> >
> > +	trace_ext4_fc_track_template(handle, inode, __fc_track_fn, enqueue);
> > +
> >  	tid = handle->h_transaction->t_tid;
> >  	mutex_lock(&ei->i_fc_lock);
> >  	if (tid == ei->i_sync_tid) {
> > @@ -1241,6 +1243,7 @@ static void ext4_fc_cleanup(journal_t *journal, int full, tid_t tid)
> >  	if (full && sbi->s_fc_bh)
> >  		sbi->s_fc_bh = NULL;
> >
> > +	trace_ext4_fc_cleanup(journal, full, tid);
> >  	jbd2_fc_release_bufs(journal);
> >
> >  	spin_lock(&sbi->s_fc_lock);
> > diff --git a/include/trace/events/ext4.h b/include/trace/events/ext4.h
> > index 17fb9c506e8a..cd09dccea502 100644
> > --- a/include/trace/events/ext4.h
> > +++ b/include/trace/events/ext4.h
> > @@ -2855,6 +2855,73 @@ TRACE_EVENT(ext4_fc_track_range,
> >  		      __entry->end)
> >  	);
> >
> > +TRACE_EVENT(ext4_fc_track_template,
> > +	TP_PROTO(handle_t *handle, struct inode *inode,
> > +		 void *__fc_track_fn, int enqueue),
> > +
> > +	TP_ARGS(handle, inode, __fc_track_fn, enqueue),
> > +
> > +	TP_STRUCT__entry(
> > +		__field(dev_t, dev)
> > +		__field(tid_t, t_tid)
> > +		__field(ino_t, i_ino)
> > +		__field(tid_t, i_sync_tid)
> > +		__field(void *, __fc_track_fn)
> > +		__field(int, enqueue)
> > +		__field(bool, jbd2_ongoing)
> > +		__field(bool, fc_ongoing)
> > +	),
> > +
> > +	TP_fast_assign(
> > +		struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
> > +		struct ext4_inode_info *ei = EXT4_I(inode);
> > +
> > +		__entry->dev = inode->i_sb->s_dev;
> > +		__entry->t_tid = handle->h_transaction->t_tid;
> > +		__entry->i_ino = inode->i_ino;
> > +		__entry->i_sync_tid = ei->i_sync_tid;
> > +		__entry->__fc_track_fn = __fc_track_fn;
> > +		__entry->enqueue = enqueue;
> > +		__entry->jbd2_ongoing =
> > +		    sbi->s_journal->j_flags & JBD2_FULL_COMMIT_ONGOING;
> > +		__entry->fc_ongoing =
> > +		    sbi->s_journal->j_flags & JBD2_FAST_COMMIT_ONGOING;
> > +	),
> > +
> > +	TP_printk("dev %d,%d, t_tid %u, ino %lu, i_sync_tid %u, "
> > +		  "track_fn %pS, enqueue %d, jbd2_ongoing %d, fc_ongoing %d",
> > +		  MAJOR(__entry->dev), MINOR(__entry->dev),
> > +		  __entry->t_tid, __entry->i_ino, __entry->i_sync_tid,
> > +		  (void *)__entry->__fc_track_fn, __entry->enqueue,
> > +		  __entry->jbd2_ongoing, __entry->fc_ongoing)
> > +	);
> > +
> > +TRACE_EVENT(ext4_fc_cleanup,
> > +	TP_PROTO(journal_t *journal, int full, tid_t tid),
> > +
> > +	TP_ARGS(journal, full, tid),
> > +
> > +	TP_STRUCT__entry(
> > +		__field(dev_t, dev)
> > +		__field(int, j_fc_off)
> > +		__field(int, full)
> > +		__field(tid_t, tid)
> > +	),
> > +
> > +	TP_fast_assign(
> > +		struct super_block *sb = journal->j_private;
> > +
> > +		__entry->dev = sb->s_dev;
> > +		__entry->j_fc_off = journal->j_fc_off;
> > +		__entry->full = full;
> > +		__entry->tid = tid;
> > +	),
> > +
> > +	TP_printk("dev %d,%d, j_fc_off %d, full %d, tid %u",
> > +		  MAJOR(__entry->dev), MINOR(__entry->dev),
> > +		  __entry->j_fc_off, __entry->full, __entry->tid)
> > +	);
> > +
> >  TRACE_EVENT(ext4_update_sb,
> >  	TP_PROTO(struct super_block *sb, ext4_fsblk_t fsblk,
> >  		 unsigned int flags),
> > --
> > 2.31.1
> >
> --
> Jan Kara <jack@suse.com>
> SUSE Labs, CR

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC 3/9] ext4: Add couple of more fast_commit tracepoints
  2022-02-23 10:11     ` Ritesh Harjani
@ 2022-02-23 11:53       ` Jan Kara
  2022-02-23 12:04         ` Ritesh Harjani
  0 siblings, 1 reply; 35+ messages in thread
From: Jan Kara @ 2022-02-23 11:53 UTC (permalink / raw)
  To: Ritesh Harjani
  Cc: Jan Kara, linux-ext4, Harshad Shirwadkar, Theodore Ts'o,
	linux-fsdevel, linux-kernel

On Wed 23-02-22 15:41:59, Ritesh Harjani wrote:
> On 22/02/23 10:40AM, Jan Kara wrote:
> > On Wed 23-02-22 02:04:11, Ritesh Harjani wrote:
> > > This adds two more tracepoints for ext4_fc_track_template() &
> > > ext4_fc_cleanup() which are helpful in debugging some fast_commit issues.
> > >
> > > Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
> >
> > So why is this more useful than trace_ext4_fc_track_range() and other
> > tracepoints? I don't think it provides any more information? What am I
> > missing?
> 
> Thanks Jan for all the reviews.
> 
> So ext4_fc_track_template() adds almost all required information
> (including the caller info) in this one trace point along with transaction tid
> which is useful for tracking issue similar to what is mentioned in Patch-9.
> 
> (race with if inode is part of two transactions tid where jbd2 full commit
> may begin for txn n-1 while inode is still in sbi->s_fc_q[MAIN])
 
I understand commit tid is interesting but cannot we just add it to
tracepoints like trace_ext4_fc_track_range() directly? It would seem useful
to have it there and when it is there, the need for
ext4_fc_track_template() is not really big. I don't care too much but
this tracepoint looked a bit superfluous to me.

> And similarly ext4_fc_cleanup() helps with that information about which tid
> completed and whether it was called from jbd2 full commit or from fast_commit.

Yeah, that one is clear.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC 3/9] ext4: Add couple of more fast_commit tracepoints
  2022-02-23 11:53       ` Jan Kara
@ 2022-02-23 12:04         ` Ritesh Harjani
  0 siblings, 0 replies; 35+ messages in thread
From: Ritesh Harjani @ 2022-02-23 12:04 UTC (permalink / raw)
  To: Jan Kara
  Cc: linux-ext4, Harshad Shirwadkar, Theodore Ts'o, linux-fsdevel,
	linux-kernel

On 22/02/23 12:53PM, Jan Kara wrote:
> On Wed 23-02-22 15:41:59, Ritesh Harjani wrote:
> > On 22/02/23 10:40AM, Jan Kara wrote:
> > > On Wed 23-02-22 02:04:11, Ritesh Harjani wrote:
> > > > This adds two more tracepoints for ext4_fc_track_template() &
> > > > ext4_fc_cleanup() which are helpful in debugging some fast_commit issues.
> > > >
> > > > Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
> > >
> > > So why is this more useful than trace_ext4_fc_track_range() and other
> > > tracepoints? I don't think it provides any more information? What am I
> > > missing?
> >
> > Thanks Jan for all the reviews.
> >
> > So ext4_fc_track_template() adds almost all required information
> > (including the caller info) in this one trace point along with transaction tid
> > which is useful for tracking issue similar to what is mentioned in Patch-9.
> >
> > (race with if inode is part of two transactions tid where jbd2 full commit
> > may begin for txn n-1 while inode is still in sbi->s_fc_q[MAIN])
>
> I understand commit tid is interesting but cannot we just add it to
> tracepoints like trace_ext4_fc_track_range() directly? It would seem useful
> to have it there and when it is there, the need for
> ext4_fc_track_template() is not really big. I don't care too much but

Yes make sense. Sure, I will look into adding this info to existing trace
points then. With that I think trace_ext4_fc_track_template() won't be required.

Will add those changes in V2.

> this tracepoint looked a bit superfluous to me.
>
> > And similarly ext4_fc_cleanup() helps with that information about which tid
> > completed and whether it was called from jbd2 full commit or from fast_commit.
>
> Yeah, that one is clear.

Will retain trace_ext4_fc_cleanup() then.


-ritesh
>
> 								Honza
> --
> Jan Kara <jack@suse.com>
> SUSE Labs, CR

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [External] [RFC 9/9] ext4: fast_commit missing tracking updates to a file
  2022-02-23  3:50   ` [External] " Xin Yin
@ 2022-02-23 13:58     ` Ritesh Harjani
  2022-02-24 11:43       ` Xin Yin
  0 siblings, 1 reply; 35+ messages in thread
From: Ritesh Harjani @ 2022-02-23 13:58 UTC (permalink / raw)
  To: Xin Yin
  Cc: linux-ext4, Harshad Shirwadkar, Theodore Ts'o, Jan Kara,
	linux-fsdevel, linux-kernel

On 22/02/23 11:50AM, Xin Yin wrote:
> On Wed, Feb 23, 2022 at 4:36 AM Ritesh Harjani <riteshh@linux.ibm.com> wrote:
> >
> > <DO NOT MERGE THIS YET>
> >
> > Testcase
> > ==========
> > 1. i=0; while [ $i -lt 1000 ]; do xfs_io -f -c "pwrite -S 0xaa -b 32k 0 32k" -c "fsync" /mnt/$i; i=$(($i+1)); done && sudo ./src/godown -v /mnt && sudo umount /mnt && sudo mount /dev/loop2 /mnt'
> > 2. ls -alih /mnt/ -> In this you will observe one such file with 0 bytes (which ideally should not happen)
> >
> > ^^^ say if you don't see the issue because your underlying storage
> > device is very fast, then maybe try with commit=1 mount option.
> >
> > Analysis
> > ==========
> > It seems a file's updates can be a part of two transaction tid.
> > Below are the sequence of events which could cause this issue.
> >
> > jbd2_handle_start -> (t_tid = 38)
> > __ext4_new_inode
> > ext4_fc_track_template -> __track_inode -> (i_sync_tid = 38, t_tid = 38)
> > <track more updates>
> > jbd2_start_commit -> (t_tid = 38)
> >
> > jbd2_handle_start (tid = 39)
> > ext4_fc_track_template -> __track_inode -> (i_sync_tid = 38, t_tid 39)
> >     -> ext4_fc_reset_inode & ei->i_sync_tid = t_tid
> >
> > ext4_fc_commit_start -> (will wait since jbd2 full commit is in progress)
> > jbd2_end_commit (t_tid = 38)
> >     -> jbd2_fc_cleanup() -> this will cleanup entries in sbi->s_fc_q[FC_Q_MAIN]
> >         -> And the above could result inode size as 0 as  after effect.
> > ext4_fc_commit_stop
> >
> > You could find the logs for the above behavior for inode 979 at [1].
> >
> > -> So what is happening here is since the ei->i_fc_list is not empty
> > (because it is already part of sb's MAIN queue), we don't add this inode
> > again into neither sb's MAIN or STAGING queue.
> > And after jbd2_fc_cleanup() is called from jbd2 full commit, we
> > just remove this inode from the main queue.
> >
> > So as a simple fix, what I did below was to check if it is a jbd2 full commit
> > in ext4_fc_cleanup(), and if the ei->i_sync_tid > tid, that means we
> > need not remove that from MAIN queue. This is since neither jbd2 nor FC
> > has committed updates of those inodes for this new txn tid yet.
> >
> > But below are some quick queries on this
> > =========================================
> >
> > 1. why do we call ext4_fc_reset_inode() when inode tid and
> >    running txn tid does not match?
> This is part of a change in commit:bdc8a53a6f2f,  it fixes the issue
> for fc tracking logic while jbd2 commit is ongoing.

Thanks Xin for pointing the other issue too.
But I think what I was mostly referring to was - calling ext4_fc_reset_inode()
in ext4_fc_track_template().

<..>
 391         tid = handle->h_transaction->t_tid;
 392         mutex_lock(&ei->i_fc_lock);
 393         if (tid == ei->i_sync_tid) {
 394                 update = true;
 395         } else {
 396                 ext4_fc_reset_inode(inode);
 397                 ei->i_sync_tid = tid;
 398         }
 399         ret = __fc_track_fn(inode, args, update);
 400         mutex_unlock(&ei->i_fc_lock);
 <..>

So, yes these are few corner cases which I want to take a deeper look at.
I vaugely understand that this reset inode is done since we anyway might have
done the full commit for previous tid, so we can reset the inode track range.

So, yes, we should carefully review this as well that if jbd2 commit happens for
an inode which is still part of MAIN_Q, then does it make sense to still
call ext4_fc_reset_inode() for that inode in ext4_fc_track_template()?

> If the inode tid is bigger than txn tid, that means this inode may be
> in the STAGING queue, if we reset it then it will lose the tack range.
> I think it's a similar issue, the difference is this inode is already

Do you have a test case which was failing for your issue?
I would like to test that one too.


> in the MAIN queue before the jbd2 commit starts.
> And yes , I think in this case we can not remove it from the MAIN

Yes. I too have a similar thought. But I still wanted to get few queries sorted
(like point 1 & 2).

> queue, but still need to clear EXT4_STATE_FC_COMMITTING right? it may
> block some task still waiting for it.

Sorry I didn't get you here. So I think we will end up in such situation
(where ext4_fc_cleanup() is getting called for an inode with i_sync_tid > tid)
only from full commit path right ?
And that won't set EXT4_FC_COMMITTING for this inode right anyways no?

Do you mean anything else, or am I missing something here?

-ritesh


>
> Thanks,
> Xin Yin
> >
> > 2. Also is this an expected behavior from the design perspective of
> >    fast_commit. i.e.
> >    a. the inode can be part of two tids?
> >    b. And that while a full commit is in progress, the inode can still
> >    receive updates but using a new transaction tid.
> >
> > Frankly speaking, since I was also working on other things, so I haven't
> > yet got the chance to completely analyze the situation yet.
> > Once I have those things sorted, I will spend more time on this, to
> > understand it more. Meanwhile if you already have some answers to above
> > queries/observations, please do share those here.
> >
> > Links
> > =========
> > [1] https://raw.githubusercontent.com/riteshharjani/LinuxStudy/master/ext4/fast_commit/fc_inode_missing_updates_ino_979.txt
> >
> > Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
> > ---
> >  fs/ext4/fast_commit.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c
> > index 8803ba087b07..769b584c2552 100644
> > --- a/fs/ext4/fast_commit.c
> > +++ b/fs/ext4/fast_commit.c
> > @@ -1252,6 +1252,8 @@ static void ext4_fc_cleanup(journal_t *journal, int full, tid_t tid)
> >         spin_lock(&sbi->s_fc_lock);
> >         list_for_each_entry_safe(iter, iter_n, &sbi->s_fc_q[FC_Q_MAIN],
> >                                  i_fc_list) {
> > +               if (full && iter->i_sync_tid > tid)
> > +                       continue;
> >                 list_del_init(&iter->i_fc_list);
> >                 ext4_clear_inode_state(&iter->vfs_inode,
> >                                        EXT4_STATE_FC_COMMITTING);
> > --
> > 2.31.1
> >

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [External] [RFC 9/9] ext4: fast_commit missing tracking updates to a file
  2022-02-23 13:58     ` Ritesh Harjani
@ 2022-02-24 11:43       ` Xin Yin
  2022-02-27 20:51         ` harshad shirwadkar
  0 siblings, 1 reply; 35+ messages in thread
From: Xin Yin @ 2022-02-24 11:43 UTC (permalink / raw)
  To: Ritesh Harjani
  Cc: linux-ext4, Harshad Shirwadkar, Theodore Ts'o, Jan Kara,
	linux-fsdevel, linux-kernel

On Wed, Feb 23, 2022 at 9:59 PM Ritesh Harjani <riteshh@linux.ibm.com> wrote:
>
> On 22/02/23 11:50AM, Xin Yin wrote:
> > On Wed, Feb 23, 2022 at 4:36 AM Ritesh Harjani <riteshh@linux.ibm.com> wrote:
> > >
> > > <DO NOT MERGE THIS YET>
> > >
> > > Testcase
> > > ==========
> > > 1. i=0; while [ $i -lt 1000 ]; do xfs_io -f -c "pwrite -S 0xaa -b 32k 0 32k" -c "fsync" /mnt/$i; i=$(($i+1)); done && sudo ./src/godown -v /mnt && sudo umount /mnt && sudo mount /dev/loop2 /mnt'
> > > 2. ls -alih /mnt/ -> In this you will observe one such file with 0 bytes (which ideally should not happen)
> > >
> > > ^^^ say if you don't see the issue because your underlying storage
> > > device is very fast, then maybe try with commit=1 mount option.
> > >
> > > Analysis
> > > ==========
> > > It seems a file's updates can be a part of two transaction tid.
> > > Below are the sequence of events which could cause this issue.
> > >
> > > jbd2_handle_start -> (t_tid = 38)
> > > __ext4_new_inode
> > > ext4_fc_track_template -> __track_inode -> (i_sync_tid = 38, t_tid = 38)
> > > <track more updates>
> > > jbd2_start_commit -> (t_tid = 38)
> > >
> > > jbd2_handle_start (tid = 39)
> > > ext4_fc_track_template -> __track_inode -> (i_sync_tid = 38, t_tid 39)
> > >     -> ext4_fc_reset_inode & ei->i_sync_tid = t_tid
> > >
> > > ext4_fc_commit_start -> (will wait since jbd2 full commit is in progress)
> > > jbd2_end_commit (t_tid = 38)
> > >     -> jbd2_fc_cleanup() -> this will cleanup entries in sbi->s_fc_q[FC_Q_MAIN]
> > >         -> And the above could result inode size as 0 as  after effect.
> > > ext4_fc_commit_stop
> > >
> > > You could find the logs for the above behavior for inode 979 at [1].
> > >
> > > -> So what is happening here is since the ei->i_fc_list is not empty
> > > (because it is already part of sb's MAIN queue), we don't add this inode
> > > again into neither sb's MAIN or STAGING queue.
> > > And after jbd2_fc_cleanup() is called from jbd2 full commit, we
> > > just remove this inode from the main queue.
> > >
> > > So as a simple fix, what I did below was to check if it is a jbd2 full commit
> > > in ext4_fc_cleanup(), and if the ei->i_sync_tid > tid, that means we
> > > need not remove that from MAIN queue. This is since neither jbd2 nor FC
> > > has committed updates of those inodes for this new txn tid yet.
> > >
> > > But below are some quick queries on this
> > > =========================================
> > >
> > > 1. why do we call ext4_fc_reset_inode() when inode tid and
> > >    running txn tid does not match?
> > This is part of a change in commit:bdc8a53a6f2f,  it fixes the issue
> > for fc tracking logic while jbd2 commit is ongoing.
>
> Thanks Xin for pointing the other issue too.
> But I think what I was mostly referring to was - calling ext4_fc_reset_inode()
> in ext4_fc_track_template().
>
Understood, I missed something here, then maybe Harshad can give some
directions for this part.

> <..>
>  391         tid = handle->h_transaction->t_tid;
>  392         mutex_lock(&ei->i_fc_lock);
>  393         if (tid == ei->i_sync_tid) {
>  394                 update = true;
>  395         } else {
>  396                 ext4_fc_reset_inode(inode);
>  397                 ei->i_sync_tid = tid;
>  398         }
>  399         ret = __fc_track_fn(inode, args, update);
>  400         mutex_unlock(&ei->i_fc_lock);
>  <..>
>
> So, yes these are few corner cases which I want to take a deeper look at.
> I vaugely understand that this reset inode is done since we anyway might have
> done the full commit for previous tid, so we can reset the inode track range.
>
> So, yes, we should carefully review this as well that if jbd2 commit happens for
> an inode which is still part of MAIN_Q, then does it make sense to still
> call ext4_fc_reset_inode() for that inode in ext4_fc_track_template()?
>
> > If the inode tid is bigger than txn tid, that means this inode may be
> > in the STAGING queue, if we reset it then it will lose the tack range.
> > I think it's a similar issue, the difference is this inode is already
>
> Do you have a test case which was failing for your issue?
> I would like to test that one too.

This issue can be triggered by generic/455 , but this is one failed
case for it. I also do not have a reproducer for this.

>
>
> > in the MAIN queue before the jbd2 commit starts.
> > And yes , I think in this case we can not remove it from the MAIN
>
> Yes. I too have a similar thought. But I still wanted to get few queries sorted
> (like point 1 & 2).
>
> > queue, but still need to clear EXT4_STATE_FC_COMMITTING right? it may
> > block some task still waiting for it.
>
> Sorry I didn't get you here. So I think we will end up in such situation
> (where ext4_fc_cleanup() is getting called for an inode with i_sync_tid > tid)
> only from full commit path right ?
> And that won't set EXT4_FC_COMMITTING for this inode right anyways no?

 I am not sure if there are any other cases, But for now we also clear
EXT4_STATE_FC_COMMITTING in the full commit path right? So maybe some
further tests are needed.

Thanks,
Xin Yin
>
> Do you mean anything else, or am I missing something here?
>
> -ritesh
>
>
> >
> > Thanks,
> > Xin Yin
> > >
> > > 2. Also is this an expected behavior from the design perspective of
> > >    fast_commit. i.e.
> > >    a. the inode can be part of two tids?
> > >    b. And that while a full commit is in progress, the inode can still
> > >    receive updates but using a new transaction tid.
> > >
> > > Frankly speaking, since I was also working on other things, so I haven't
> > > yet got the chance to completely analyze the situation yet.
> > > Once I have those things sorted, I will spend more time on this, to
> > > understand it more. Meanwhile if you already have some answers to above
> > > queries/observations, please do share those here.
> > >
> > > Links
> > > =========
> > > [1] https://raw.githubusercontent.com/riteshharjani/LinuxStudy/master/ext4/fast_commit/fc_inode_missing_updates_ino_979.txt
> > >
> > > Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
> > > ---
> > >  fs/ext4/fast_commit.c | 2 ++
> > >  1 file changed, 2 insertions(+)
> > >
> > > diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c
> > > index 8803ba087b07..769b584c2552 100644
> > > --- a/fs/ext4/fast_commit.c
> > > +++ b/fs/ext4/fast_commit.c
> > > @@ -1252,6 +1252,8 @@ static void ext4_fc_cleanup(journal_t *journal, int full, tid_t tid)
> > >         spin_lock(&sbi->s_fc_lock);
> > >         list_for_each_entry_safe(iter, iter_n, &sbi->s_fc_q[FC_Q_MAIN],
> > >                                  i_fc_list) {
> > > +               if (full && iter->i_sync_tid > tid)
> > > +                       continue;
> > >                 list_del_init(&iter->i_fc_list);
> > >                 ext4_clear_inode_state(&iter->vfs_inode,
> > >                                        EXT4_STATE_FC_COMMITTING);
> > > --
> > > 2.31.1
> > >

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC 1/9] ext4: Remove unused enum EXT4_FC_COMMIT_FAILED
  2022-02-23  9:37   ` Jan Kara
@ 2022-02-27 18:27     ` harshad shirwadkar
  0 siblings, 0 replies; 35+ messages in thread
From: harshad shirwadkar @ 2022-02-27 18:27 UTC (permalink / raw)
  To: Jan Kara
  Cc: Ritesh Harjani, Ext4 Developers List, Theodore Ts'o,
	linux-fsdevel, linux-kernel

Looks good.

Reviewed-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com>

- Harshad

On Wed, 23 Feb 2022 at 01:37, Jan Kara <jack@suse.cz> wrote:
>
> On Wed 23-02-22 02:04:09, Ritesh Harjani wrote:
> > Below commit removed all references of EXT4_FC_COMMIT_FAILED.
> > commit 0915e464cb274 ("ext4: simplify updating of fast commit stats")
> >
> > Just remove it since it is not used anymore.
> >
> > Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
>
> Sure. Feel free to add:
>
> Reviewed-by: Jan Kara <jack@suse.cz>
>
>                                                                         Honza
>
> > ---
> >  fs/ext4/fast_commit.h | 1 -
> >  1 file changed, 1 deletion(-)
> >
> > diff --git a/fs/ext4/fast_commit.h b/fs/ext4/fast_commit.h
> > index 02afa52e8e41..80414dcba6e1 100644
> > --- a/fs/ext4/fast_commit.h
> > +++ b/fs/ext4/fast_commit.h
> > @@ -93,7 +93,6 @@ enum {
> >       EXT4_FC_REASON_RENAME_DIR,
> >       EXT4_FC_REASON_FALLOC_RANGE,
> >       EXT4_FC_REASON_INODE_JOURNAL_DATA,
> > -     EXT4_FC_COMMIT_FAILED,
> >       EXT4_FC_REASON_MAX
> >  };
> >
> > --
> > 2.31.1
> >
> --
> Jan Kara <jack@suse.com>
> SUSE Labs, CR

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC 2/9] ext4: Fix ext4_fc_stats trace point
  2022-02-23  9:54   ` Jan Kara
@ 2022-02-27 18:29     ` harshad shirwadkar
  0 siblings, 0 replies; 35+ messages in thread
From: harshad shirwadkar @ 2022-02-27 18:29 UTC (permalink / raw)
  To: Jan Kara
  Cc: Ritesh Harjani, Ext4 Developers List, Theodore Ts'o,
	linux-fsdevel, linux-kernel, Steven Rostedt

Thanks for reporting this Steven and thanks Ritesh for fixing this.

Looks good.

Reviewed-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com>

- Harshad

On Wed, 23 Feb 2022 at 01:54, Jan Kara <jack@suse.cz> wrote:
>
> On Wed 23-02-22 02:04:10, Ritesh Harjani wrote:
> > ftrace's __print_symbolic() requires that any enum values used in the
> > symbol to string translation table be wrapped in a TRACE_DEFINE_ENUM
> > so that the enum value can be encoded in the ftrace ring buffer.
> >
> > This patch also fixes few other problems found in this trace point.
> > e.g. dereferencing structures in TP_printk which should not be done
> > at any cost.
> >
> > Also to avoid checkpatch warnings, this patch removes those
> > whitespaces/tab stops issues.
> >
> > Fixes: commit aa75f4d3daae ("ext4: main fast-commit commit path")
> > Reported-by: Steven Rostedt <rostedt@goodmis.org>
> > Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
>
> Looks good (modulo Steven's nit). Feel free to add:
>
> Reviewed-by: Jan Kara <jack@suse.cz>
>
>                                                                 Honza
>
>
> > ---
> >  include/trace/events/ext4.h | 76 +++++++++++++++++++++++--------------
> >  1 file changed, 47 insertions(+), 29 deletions(-)
> >
> > diff --git a/include/trace/events/ext4.h b/include/trace/events/ext4.h
> > index 19e957b7f941..17fb9c506e8a 100644
> > --- a/include/trace/events/ext4.h
> > +++ b/include/trace/events/ext4.h
> > @@ -95,6 +95,16 @@ TRACE_DEFINE_ENUM(ES_REFERENCED_B);
> >       { FALLOC_FL_COLLAPSE_RANGE,     "COLLAPSE_RANGE"},      \
> >       { FALLOC_FL_ZERO_RANGE,         "ZERO_RANGE"})
> >
> > +TRACE_DEFINE_ENUM(EXT4_FC_REASON_XATTR);
> > +TRACE_DEFINE_ENUM(EXT4_FC_REASON_CROSS_RENAME);
> > +TRACE_DEFINE_ENUM(EXT4_FC_REASON_JOURNAL_FLAG_CHANGE);
> > +TRACE_DEFINE_ENUM(EXT4_FC_REASON_NOMEM);
> > +TRACE_DEFINE_ENUM(EXT4_FC_REASON_SWAP_BOOT);
> > +TRACE_DEFINE_ENUM(EXT4_FC_REASON_RESIZE);
> > +TRACE_DEFINE_ENUM(EXT4_FC_REASON_RENAME_DIR);
> > +TRACE_DEFINE_ENUM(EXT4_FC_REASON_FALLOC_RANGE);
> > +TRACE_DEFINE_ENUM(EXT4_FC_REASON_INODE_JOURNAL_DATA);
> > +
> >  #define show_fc_reason(reason)                                               \
> >       __print_symbolic(reason,                                        \
> >               { EXT4_FC_REASON_XATTR,         "XATTR"},               \
> > @@ -2723,41 +2733,49 @@ TRACE_EVENT(ext4_fc_commit_stop,
> >
> >  #define FC_REASON_NAME_STAT(reason)                                  \
> >       show_fc_reason(reason),                                         \
> > -     __entry->sbi->s_fc_stats.fc_ineligible_reason_count[reason]
> > +     __entry->fc_ineligible_rc[reason]
> >
> >  TRACE_EVENT(ext4_fc_stats,
> > -         TP_PROTO(struct super_block *sb),
> > -
> > -         TP_ARGS(sb),
> > +     TP_PROTO(struct super_block *sb),
> >
> > -         TP_STRUCT__entry(
> > -                 __field(dev_t, dev)
> > -                 __field(struct ext4_sb_info *, sbi)
> > -                 __field(int, count)
> > -                 ),
> > +     TP_ARGS(sb),
> >
> > -         TP_fast_assign(
> > -                 __entry->dev = sb->s_dev;
> > -                 __entry->sbi = EXT4_SB(sb);
> > -                 ),
> > +     TP_STRUCT__entry(
> > +             __field(dev_t, dev)
> > +             __array(unsigned int, fc_ineligible_rc, EXT4_FC_REASON_MAX)
> > +             __field(unsigned long, fc_commits)
> > +             __field(unsigned long, fc_ineligible_commits)
> > +             __field(unsigned long, fc_numblks)
> > +     ),
> >
> > -         TP_printk("dev %d:%d fc ineligible reasons:\n"
> > -                   "%s:%d, %s:%d, %s:%d, %s:%d, %s:%d, %s:%d, %s:%d, %s:%d, %s:%d; "
> > -                   "num_commits:%ld, ineligible: %ld, numblks: %ld",
> > -                   MAJOR(__entry->dev), MINOR(__entry->dev),
> > -                   FC_REASON_NAME_STAT(EXT4_FC_REASON_XATTR),
> > -                   FC_REASON_NAME_STAT(EXT4_FC_REASON_CROSS_RENAME),
> > -                   FC_REASON_NAME_STAT(EXT4_FC_REASON_JOURNAL_FLAG_CHANGE),
> > -                   FC_REASON_NAME_STAT(EXT4_FC_REASON_NOMEM),
> > -                   FC_REASON_NAME_STAT(EXT4_FC_REASON_SWAP_BOOT),
> > -                   FC_REASON_NAME_STAT(EXT4_FC_REASON_RESIZE),
> > -                   FC_REASON_NAME_STAT(EXT4_FC_REASON_RENAME_DIR),
> > -                   FC_REASON_NAME_STAT(EXT4_FC_REASON_FALLOC_RANGE),
> > -                   FC_REASON_NAME_STAT(EXT4_FC_REASON_INODE_JOURNAL_DATA),
> > -                   __entry->sbi->s_fc_stats.fc_num_commits,
> > -                   __entry->sbi->s_fc_stats.fc_ineligible_commits,
> > -                   __entry->sbi->s_fc_stats.fc_numblks)
> > +     TP_fast_assign(
> > +             int i;
> >
> > +             __entry->dev = sb->s_dev;
> > +             for (i = 0; i < EXT4_FC_REASON_MAX; i++)
> > +                     __entry->fc_ineligible_rc[i] =
> > +                     EXT4_SB(sb)->s_fc_stats.fc_ineligible_reason_count[i];
> > +             __entry->fc_commits = EXT4_SB(sb)->s_fc_stats.fc_num_commits;
> > +             __entry->fc_ineligible_commits =
> > +                     EXT4_SB(sb)->s_fc_stats.fc_ineligible_commits;
> > +             __entry->fc_numblks = EXT4_SB(sb)->s_fc_stats.fc_numblks;
> > +     ),
> > +
> > +     TP_printk("dev %d,%d fc ineligible reasons:\n"
> > +               "%s:%u, %s:%u, %s:%u, %s:%u, %s:%u, %s:%u, %s:%u, %s:%u, %s:%u "
> > +               "num_commits:%lu, ineligible: %lu, numblks: %lu",
> > +               MAJOR(__entry->dev), MINOR(__entry->dev),
> > +               FC_REASON_NAME_STAT(EXT4_FC_REASON_XATTR),
> > +               FC_REASON_NAME_STAT(EXT4_FC_REASON_CROSS_RENAME),
> > +               FC_REASON_NAME_STAT(EXT4_FC_REASON_JOURNAL_FLAG_CHANGE),
> > +               FC_REASON_NAME_STAT(EXT4_FC_REASON_NOMEM),
> > +               FC_REASON_NAME_STAT(EXT4_FC_REASON_SWAP_BOOT),
> > +               FC_REASON_NAME_STAT(EXT4_FC_REASON_RESIZE),
> > +               FC_REASON_NAME_STAT(EXT4_FC_REASON_RENAME_DIR),
> > +               FC_REASON_NAME_STAT(EXT4_FC_REASON_FALLOC_RANGE),
> > +               FC_REASON_NAME_STAT(EXT4_FC_REASON_INODE_JOURNAL_DATA),
> > +               __entry->fc_commits, __entry->fc_ineligible_commits,
> > +               __entry->fc_numblks)
> >  );
> >
> >  #define DEFINE_TRACE_DENTRY_EVENT(__type)                            \
> > --
> > 2.31.1
> >
> --
> Jan Kara <jack@suse.com>
> SUSE Labs, CR

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC 4/9] ext4: Do not call FC trace event if FS does not support FC
  2022-02-23  9:41   ` Jan Kara
@ 2022-02-27 18:30     ` harshad shirwadkar
  0 siblings, 0 replies; 35+ messages in thread
From: harshad shirwadkar @ 2022-02-27 18:30 UTC (permalink / raw)
  To: Jan Kara
  Cc: Ritesh Harjani, Ext4 Developers List, Theodore Ts'o,
	linux-fsdevel, linux-kernel

Looks good.

Reviewed-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com>

- Harshad

On Wed, 23 Feb 2022 at 01:41, Jan Kara <jack@suse.cz> wrote:
>
> On Wed 23-02-22 02:04:12, Ritesh Harjani wrote:
> > This just puts trace_ext4_fc_commit_start(sb) & ktime_get()
> > for measuring FC commit time, after the check of whether sb
> > supports JOURNAL_FAST_COMMIT or not.
> >
> > Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
>
> Looks good. Feel free to add:
>
> Reviewed-by: Jan Kara <jack@suse.cz>
>
>                                                                 Honza
>
> > ---
> >  fs/ext4/fast_commit.c | 6 +++---
> >  1 file changed, 3 insertions(+), 3 deletions(-)
> >
> > diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c
> > index bf70879bb4fe..7fb1eceef30c 100644
> > --- a/fs/ext4/fast_commit.c
> > +++ b/fs/ext4/fast_commit.c
> > @@ -1167,13 +1167,13 @@ int ext4_fc_commit(journal_t *journal, tid_t commit_tid)
> >       int status = EXT4_FC_STATUS_OK, fc_bufs_before = 0;
> >       ktime_t start_time, commit_time;
> >
> > +     if (!test_opt2(sb, JOURNAL_FAST_COMMIT))
> > +             return jbd2_complete_transaction(journal, commit_tid);
> > +
> >       trace_ext4_fc_commit_start(sb);
> >
> >       start_time = ktime_get();
> >
> > -     if (!test_opt2(sb, JOURNAL_FAST_COMMIT))
> > -             return jbd2_complete_transaction(journal, commit_tid);
> > -
> >  restart_fc:
> >       ret = jbd2_fc_begin_commit(journal, commit_tid);
> >       if (ret == -EALREADY) {
> > --
> > 2.31.1
> >
> --
> Jan Kara <jack@suse.com>
> SUSE Labs, CR

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC 5/9] ext4: Add commit_tid info in jbd debug log
  2022-02-23  9:42   ` Jan Kara
@ 2022-02-27 18:31     ` harshad shirwadkar
  0 siblings, 0 replies; 35+ messages in thread
From: harshad shirwadkar @ 2022-02-27 18:31 UTC (permalink / raw)
  To: Jan Kara
  Cc: Ritesh Harjani, Ext4 Developers List, Theodore Ts'o,
	linux-fsdevel, linux-kernel

Looks good.

Reviewed-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com>

- Harshad

On Wed, 23 Feb 2022 at 01:42, Jan Kara <jack@suse.cz> wrote:
>
> On Wed 23-02-22 02:04:13, Ritesh Harjani wrote:
> > This adds commit_tid argument in ext4_fc_update_stats()
> > so that we can add this information too in jbd_debug logs.
> > This is also required in a later patch to pass the commit_tid info in
> > ext4_fc_commit_start/stop() trace events.
> >
> > Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
>
> Looks good. Feel free to add:
>
> Reviewed-by: Jan Kara <jack@suse.cz>
>
>                                                                 Honza
>
> > ---
> >  fs/ext4/fast_commit.c | 15 +++++++++------
> >  1 file changed, 9 insertions(+), 6 deletions(-)
> >
> > diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c
> > index 7fb1eceef30c..ee32aac0cbbf 100644
> > --- a/fs/ext4/fast_commit.c
> > +++ b/fs/ext4/fast_commit.c
> > @@ -1127,11 +1127,12 @@ static int ext4_fc_perform_commit(journal_t *journal)
> >  }
> >
> >  static void ext4_fc_update_stats(struct super_block *sb, int status,
> > -                              u64 commit_time, int nblks)
> > +                              u64 commit_time, int nblks, tid_t commit_tid)
> >  {
> >       struct ext4_fc_stats *stats = &EXT4_SB(sb)->s_fc_stats;
> >
> > -     jbd_debug(1, "Fast commit ended with status = %d", status);
> > +     jbd_debug(1, "Fast commit ended with status = %d for tid %u",
> > +                     status, commit_tid);
> >       if (status == EXT4_FC_STATUS_OK) {
> >               stats->fc_num_commits++;
> >               stats->fc_numblks += nblks;
> > @@ -1181,14 +1182,16 @@ int ext4_fc_commit(journal_t *journal, tid_t commit_tid)
> >               if (atomic_read(&sbi->s_fc_subtid) <= subtid &&
> >                       commit_tid > journal->j_commit_sequence)
> >                       goto restart_fc;
> > -             ext4_fc_update_stats(sb, EXT4_FC_STATUS_SKIPPED, 0, 0);
> > +             ext4_fc_update_stats(sb, EXT4_FC_STATUS_SKIPPED, 0, 0,
> > +                             commit_tid);
> >               return 0;
> >       } else if (ret) {
> >               /*
> >                * Commit couldn't start. Just update stats and perform a
> >                * full commit.
> >                */
> > -             ext4_fc_update_stats(sb, EXT4_FC_STATUS_FAILED, 0, 0);
> > +             ext4_fc_update_stats(sb, EXT4_FC_STATUS_FAILED, 0, 0,
> > +                             commit_tid);
> >               return jbd2_complete_transaction(journal, commit_tid);
> >       }
> >
> > @@ -1220,12 +1223,12 @@ int ext4_fc_commit(journal_t *journal, tid_t commit_tid)
> >        * don't react too strongly to vast changes in the commit time
> >        */
> >       commit_time = ktime_to_ns(ktime_sub(ktime_get(), start_time));
> > -     ext4_fc_update_stats(sb, status, commit_time, nblks);
> > +     ext4_fc_update_stats(sb, status, commit_time, nblks, commit_tid);
> >       return ret;
> >
> >  fallback:
> >       ret = jbd2_fc_end_commit_fallback(journal);
> > -     ext4_fc_update_stats(sb, status, 0, 0);
> > +     ext4_fc_update_stats(sb, status, 0, 0, commit_tid);
> >       return ret;
> >  }
> >
> > --
> > 2.31.1
> >
> --
> Jan Kara <jack@suse.com>
> SUSE Labs, CR

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC 6/9] ext4: Add commit tid info in ext4_fc_commit_start/stop trace events
  2022-02-23  9:44   ` Jan Kara
@ 2022-02-27 18:31     ` harshad shirwadkar
  0 siblings, 0 replies; 35+ messages in thread
From: harshad shirwadkar @ 2022-02-27 18:31 UTC (permalink / raw)
  To: Jan Kara
  Cc: Ritesh Harjani, Ext4 Developers List, Theodore Ts'o,
	linux-fsdevel, linux-kernel

Looks good.

Reviewed-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com>

- Harshad

On Wed, 23 Feb 2022 at 01:44, Jan Kara <jack@suse.cz> wrote:
>
> On Wed 23-02-22 02:04:14, Ritesh Harjani wrote:
> > This adds commit_tid info in ext4_fc_commit_start/stop which is helpful
> > in debugging fast_commit issues.
> >
> > For e.g. issues where due to jbd2 journal full commit, FC miss to commit
> > updates to a file.
> >
> > Also improves TP_prink format string i.e. all ext4 and jbd2 trace events
> > starts with "dev MAjOR,MINOR". Let's follow the same convention while we
> > are still at it.
> >
> > Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
>
> Looks good. Feel free to add:
>
> Reviewed-by: Jan Kara <jack@suse.cz>
>
>                                                                 Honza
>
> > ---
> >  fs/ext4/fast_commit.c       |  4 ++--
> >  include/trace/events/ext4.h | 21 +++++++++++++--------
> >  2 files changed, 15 insertions(+), 10 deletions(-)
> >
> > diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c
> > index ee32aac0cbbf..8803ba087b07 100644
> > --- a/fs/ext4/fast_commit.c
> > +++ b/fs/ext4/fast_commit.c
> > @@ -1150,7 +1150,7 @@ static void ext4_fc_update_stats(struct super_block *sb, int status,
> >       } else {
> >               stats->fc_skipped_commits++;
> >       }
> > -     trace_ext4_fc_commit_stop(sb, nblks, status);
> > +     trace_ext4_fc_commit_stop(sb, nblks, status, commit_tid);
> >  }
> >
> >  /*
> > @@ -1171,7 +1171,7 @@ int ext4_fc_commit(journal_t *journal, tid_t commit_tid)
> >       if (!test_opt2(sb, JOURNAL_FAST_COMMIT))
> >               return jbd2_complete_transaction(journal, commit_tid);
> >
> > -     trace_ext4_fc_commit_start(sb);
> > +     trace_ext4_fc_commit_start(sb, commit_tid);
> >
> >       start_time = ktime_get();
> >
> > diff --git a/include/trace/events/ext4.h b/include/trace/events/ext4.h
> > index cd09dccea502..6e66cb7ce624 100644
> > --- a/include/trace/events/ext4.h
> > +++ b/include/trace/events/ext4.h
> > @@ -2685,26 +2685,29 @@ TRACE_EVENT(ext4_fc_replay,
> >  );
> >
> >  TRACE_EVENT(ext4_fc_commit_start,
> > -     TP_PROTO(struct super_block *sb),
> > +     TP_PROTO(struct super_block *sb, tid_t commit_tid),
> >
> > -     TP_ARGS(sb),
> > +     TP_ARGS(sb, commit_tid),
> >
> >       TP_STRUCT__entry(
> >               __field(dev_t, dev)
> > +             __field(tid_t, tid)
> >       ),
> >
> >       TP_fast_assign(
> >               __entry->dev = sb->s_dev;
> > +             __entry->tid = commit_tid;
> >       ),
> >
> > -     TP_printk("fast_commit started on dev %d,%d",
> > -               MAJOR(__entry->dev), MINOR(__entry->dev))
> > +     TP_printk("dev %d,%d tid %u", MAJOR(__entry->dev), MINOR(__entry->dev),
> > +               __entry->tid)
> >  );
> >
> >  TRACE_EVENT(ext4_fc_commit_stop,
> > -         TP_PROTO(struct super_block *sb, int nblks, int reason),
> > +         TP_PROTO(struct super_block *sb, int nblks, int reason,
> > +                  tid_t commit_tid),
> >
> > -     TP_ARGS(sb, nblks, reason),
> > +     TP_ARGS(sb, nblks, reason, commit_tid),
> >
> >       TP_STRUCT__entry(
> >               __field(dev_t, dev)
> > @@ -2713,6 +2716,7 @@ TRACE_EVENT(ext4_fc_commit_stop,
> >               __field(int, num_fc)
> >               __field(int, num_fc_ineligible)
> >               __field(int, nblks_agg)
> > +             __field(tid_t, tid)
> >       ),
> >
> >       TP_fast_assign(
> > @@ -2723,12 +2727,13 @@ TRACE_EVENT(ext4_fc_commit_stop,
> >               __entry->num_fc_ineligible =
> >                       EXT4_SB(sb)->s_fc_stats.fc_ineligible_commits;
> >               __entry->nblks_agg = EXT4_SB(sb)->s_fc_stats.fc_numblks;
> > +             __entry->tid = commit_tid;
> >       ),
> >
> > -     TP_printk("fc on [%d,%d] nblks %d, reason %d, fc = %d, ineligible = %d, agg_nblks %d",
> > +     TP_printk("dev %d,%d nblks %d, reason %d, fc = %d, ineligible = %d, agg_nblks %d, tid %u",
> >                 MAJOR(__entry->dev), MINOR(__entry->dev),
> >                 __entry->nblks, __entry->reason, __entry->num_fc,
> > -               __entry->num_fc_ineligible, __entry->nblks_agg)
> > +               __entry->num_fc_ineligible, __entry->nblks_agg, __entry->tid)
> >  );
> >
> >  #define FC_REASON_NAME_STAT(reason)                                  \
> > --
> > 2.31.1
> >
> --
> Jan Kara <jack@suse.com>
> SUSE Labs, CR

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC 7/9] ext4: Fix remaining two trace events to use same printk convention
  2022-02-23  9:45   ` Jan Kara
@ 2022-02-27 18:32     ` harshad shirwadkar
  0 siblings, 0 replies; 35+ messages in thread
From: harshad shirwadkar @ 2022-02-27 18:32 UTC (permalink / raw)
  To: Jan Kara
  Cc: Ritesh Harjani, Ext4 Developers List, Theodore Ts'o,
	linux-fsdevel, linux-kernel

Looks good.

Reviewed-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com>

- Harshad

On Wed, 23 Feb 2022 at 01:45, Jan Kara <jack@suse.cz> wrote:
>
> On Wed 23-02-22 02:04:15, Ritesh Harjani wrote:
> > All ext4 & jbd2 trace events starts with "dev Major:Minor".
> > While we are still improving/adding the ftrace events for FC,
> > let's fix last two remaining trace events to follow the same
> > convention.
> >
> > Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
>
> OK. Feel free to add:
>
> Reviewed-by: Jan Kara <jack@suse.cz>
>
>                                                                 Honza
>
> > ---
> >  include/trace/events/ext4.h | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/include/trace/events/ext4.h b/include/trace/events/ext4.h
> > index 6e66cb7ce624..233dbffa5ceb 100644
> > --- a/include/trace/events/ext4.h
> > +++ b/include/trace/events/ext4.h
> > @@ -2653,7 +2653,7 @@ TRACE_EVENT(ext4_fc_replay_scan,
> >               __entry->off = off;
> >       ),
> >
> > -     TP_printk("FC scan pass on dev %d,%d: error %d, off %d",
> > +     TP_printk("dev %d,%d error %d, off %d",
> >                 MAJOR(__entry->dev), MINOR(__entry->dev),
> >                 __entry->error, __entry->off)
> >  );
> > @@ -2679,7 +2679,7 @@ TRACE_EVENT(ext4_fc_replay,
> >               __entry->priv2 = priv2;
> >       ),
> >
> > -     TP_printk("FC Replay %d,%d: tag %d, ino %d, data1 %d, data2 %d",
> > +     TP_printk("dev %d,%d: tag %d, ino %d, data1 %d, data2 %d",
> >                 MAJOR(__entry->dev), MINOR(__entry->dev),
> >                 __entry->tag, __entry->ino, __entry->priv1, __entry->priv2)
> >  );
> > --
> > 2.31.1
> >
> --
> Jan Kara <jack@suse.com>
> SUSE Labs, CR

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC 8/9] ext4: Convert ext4_fc_track_dentry type events to use event class
  2022-02-23  9:49   ` Jan Kara
@ 2022-02-27 18:35     ` harshad shirwadkar
  0 siblings, 0 replies; 35+ messages in thread
From: harshad shirwadkar @ 2022-02-27 18:35 UTC (permalink / raw)
  To: Jan Kara
  Cc: Ritesh Harjani, Ext4 Developers List, Theodore Ts'o,
	linux-fsdevel, linux-kernel

Nice! Thanks for fixing this.

Reviewed-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com>


- Harshad

On Wed, 23 Feb 2022 at 01:49, Jan Kara <jack@suse.cz> wrote:
>
> On Wed 23-02-22 02:04:16, Ritesh Harjani wrote:
> > One should use DECLARE_EVENT_CLASS for similar event types instead of
> > defining TRACE_EVENT for each event type. This is helpful in reducing
> > the text section footprint for e.g. [1]
> >
> > [1]: https://lwn.net/Articles/381064/
> >
> > Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
>
> Looks good. Feel free to add:
>
> Reviewed-by: Jan Kara <jack@suse.cz>
>
>                                                                 Honza
>
> > ---
> >  include/trace/events/ext4.h | 57 +++++++++++++++++++++----------------
> >  1 file changed, 32 insertions(+), 25 deletions(-)
> >
> > diff --git a/include/trace/events/ext4.h b/include/trace/events/ext4.h
> > index 233dbffa5ceb..33a059d845d6 100644
> > --- a/include/trace/events/ext4.h
> > +++ b/include/trace/events/ext4.h
> > @@ -2783,33 +2783,40 @@ TRACE_EVENT(ext4_fc_stats,
> >                 __entry->fc_numblks)
> >  );
> >
> > -#define DEFINE_TRACE_DENTRY_EVENT(__type)                            \
> > -     TRACE_EVENT(ext4_fc_track_##__type,                             \
> > -         TP_PROTO(struct inode *inode, struct dentry *dentry, int ret), \
> > -                                                                     \
> > -         TP_ARGS(inode, dentry, ret),                                \
> > -                                                                     \
> > -         TP_STRUCT__entry(                                           \
> > -                 __field(dev_t, dev)                                 \
> > -                 __field(int, ino)                                   \
> > -                 __field(int, error)                                 \
> > -                 ),                                                  \
> > -                                                                     \
> > -         TP_fast_assign(                                             \
> > -                 __entry->dev = inode->i_sb->s_dev;                  \
> > -                 __entry->ino = inode->i_ino;                        \
> > -                 __entry->error = ret;                               \
> > -                 ),                                                  \
> > -                                                                     \
> > -         TP_printk("dev %d:%d, inode %d, error %d, fc_%s",           \
> > -                   MAJOR(__entry->dev), MINOR(__entry->dev),         \
> > -                   __entry->ino, __entry->error,                     \
> > -                   #__type)                                          \
> > +DECLARE_EVENT_CLASS(ext4_fc_track_dentry,
> > +
> > +     TP_PROTO(struct inode *inode, struct dentry *dentry, int ret),
> > +
> > +     TP_ARGS(inode, dentry, ret),
> > +
> > +     TP_STRUCT__entry(
> > +             __field(dev_t, dev)
> > +             __field(int, ino)
> > +             __field(int, error)
> > +     ),
> > +
> > +     TP_fast_assign(
> > +             __entry->dev = inode->i_sb->s_dev;
> > +             __entry->ino = inode->i_ino;
> > +             __entry->error = ret;
> > +     ),
> > +
> > +     TP_printk("dev %d,%d, inode %d, error %d",
> > +               MAJOR(__entry->dev), MINOR(__entry->dev),
> > +               __entry->ino, __entry->error
> >       )
> > +);
> > +
> > +#define DEFINE_EVENT_CLASS_TYPE(__type)                                      \
> > +DEFINE_EVENT(ext4_fc_track_dentry, ext4_fc_track_##__type,           \
> > +     TP_PROTO(struct inode *inode, struct dentry *dentry, int ret),  \
> > +     TP_ARGS(inode, dentry, ret)                                     \
> > +)
> > +
> >
> > -DEFINE_TRACE_DENTRY_EVENT(create);
> > -DEFINE_TRACE_DENTRY_EVENT(link);
> > -DEFINE_TRACE_DENTRY_EVENT(unlink);
> > +DEFINE_EVENT_CLASS_TYPE(create);
> > +DEFINE_EVENT_CLASS_TYPE(link);
> > +DEFINE_EVENT_CLASS_TYPE(unlink);
> >
> >  TRACE_EVENT(ext4_fc_track_inode,
> >           TP_PROTO(struct inode *inode, int ret),
> > --
> > 2.31.1
> >
> --
> Jan Kara <jack@suse.com>
> SUSE Labs, CR

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [External] [RFC 9/9] ext4: fast_commit missing tracking updates to a file
  2022-02-24 11:43       ` Xin Yin
@ 2022-02-27 20:51         ` harshad shirwadkar
  0 siblings, 0 replies; 35+ messages in thread
From: harshad shirwadkar @ 2022-02-27 20:51 UTC (permalink / raw)
  To: Xin Yin
  Cc: Ritesh Harjani, Ext4 Developers List, Theodore Ts'o,
	Jan Kara, linux-fsdevel, linux-kernel

> > So, yes these are few corner cases which I want to take a deeper look at.
> > I vaugely understand that this reset inode is done since we anyway might have
> > done the full commit for previous tid, so we can reset the inode track range.
> >
> > So, yes, we should carefully review this as well that if jbd2 commit happens for
> > an inode which is still part of MAIN_Q, then does it make sense to still
> > call ext4_fc_reset_inode() for that inode in ext4_fc_track_template()?
What this code assumes here is that if tid != ei->i_sync_tid, then the
fast commit information present in the inode cannot be trusted. This
is useful when the inode is added to the fast commit queue for the
first ever or first time during the current transaction. Also note the
"update" parameter, if update is set to false, then the track
functions know that this is the first time the track function is
called since the last (fast / full) commit.

I think the simple invariant that we need to follow is that, once an
inode is committed (either by fast or full commit)
ext4_fc_reset_inode() should be called exactly once before any track
functions get called. So, if we can ensure that reset gets called on
all inodes that were committed by fast commit / full commit exactly
once before any track functions update the fc info, then we don't
really need this reset call here.

> > > > 2. Also is this an expected behavior from the design perspective of
> > > >    fast_commit. i.e.
> > > >    a. the inode can be part of two tids?
From a design perspective, in fast commits, inode updates are not
strictly tied to tids. We reuse i_sync_tid only to detect if the
updates to an inode are committed or not. But at a high level, inode
can be in 3 states from fc perspective - (1) inode is not modified
since last fast commit / full commit (2) inode is modified since last
fast commit / full commit (3) inode is being committed by fast commit.
Well, this makes me think that these states can also be represented by
inode states, and maybe if we do that we can get rid of reliance on
i_sync_tid altogether? This needs a bit more thought.
> > > >    b. And that while a full commit is in progress, the inode can still
> > > >    receive updates but using a new transaction tid.
Yeah, inode can still receive updates if a full commit is ongoing. If
a fast commit is ongoing on a particular update, then the inode will
not receive updates. EXT4_FC_STATE_COMMITTING will protect against it.
In the new patch that I'm working on (sorry for the delay on that),
what I'm doing is that handles that modify inode wait if the inode is
being committed.

- Harshad
> > > >
> > > > Frankly speaking, since I was also working on other things, so I haven't
> > > > yet got the chance to completely analyze the situation yet.
> > > > Once I have those things sorted, I will spend more time on this, to
> > > > understand it more. Meanwhile if you already have some answers to above
> > > > queries/observations, please do share those here.
> > > >
> > > > Links
> > > > =========
> > > > [1] https://raw.githubusercontent.com/riteshharjani/LinuxStudy/master/ext4/fast_commit/fc_inode_missing_updates_ino_979.txt
> > > >
> > > > Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
> > > > ---
> > > >  fs/ext4/fast_commit.c | 2 ++
> > > >  1 file changed, 2 insertions(+)
> > > >
> > > > diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c
> > > > index 8803ba087b07..769b584c2552 100644
> > > > --- a/fs/ext4/fast_commit.c
> > > > +++ b/fs/ext4/fast_commit.c
> > > > @@ -1252,6 +1252,8 @@ static void ext4_fc_cleanup(journal_t *journal, int full, tid_t tid)
> > > >         spin_lock(&sbi->s_fc_lock);
> > > >         list_for_each_entry_safe(iter, iter_n, &sbi->s_fc_q[FC_Q_MAIN],
> > > >                                  i_fc_list) {
> > > > +               if (full && iter->i_sync_tid > tid)
> > > > +                       continue;
> > > >                 list_del_init(&iter->i_fc_list);
> > > >                 ext4_clear_inode_state(&iter->vfs_inode,
> > > >                                        EXT4_STATE_FC_COMMITTING);
> > > > --
> > > > 2.31.1
> > > >

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC 0/9] ext4: Improve FC trace events and discuss one FC failure
  2022-02-22 20:34 [RFC 0/9] ext4: Improve FC trace events and discuss one FC failure Ritesh Harjani
                   ` (8 preceding siblings ...)
  2022-02-22 20:34 ` [RFC 9/9] ext4: fast_commit missing tracking updates to a file Ritesh Harjani
@ 2022-03-09 17:48 ` Theodore Ts'o
  2022-03-10  1:22   ` Ritesh Harjani
  9 siblings, 1 reply; 35+ messages in thread
From: Theodore Ts'o @ 2022-03-09 17:48 UTC (permalink / raw)
  To: Ritesh Harjani
  Cc: linux-ext4, Harshad Shirwadkar, Jan Kara, linux-fsdevel, linux-kernel

Ritesh,

Were you going to be sending a revised version of this patch series?

Thanks!

					- Ted

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [RFC 0/9] ext4: Improve FC trace events and discuss one FC failure
  2022-03-09 17:48 ` [RFC 0/9] ext4: Improve FC trace events and discuss one FC failure Theodore Ts'o
@ 2022-03-10  1:22   ` Ritesh Harjani
  0 siblings, 0 replies; 35+ messages in thread
From: Ritesh Harjani @ 2022-03-10  1:22 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: linux-ext4, Harshad Shirwadkar, Jan Kara, linux-fsdevel, linux-kernel

On 22/03/09 12:48PM, Theodore Ts'o wrote:
> Ritesh,
>
> Were you going to be sending a revised version of this patch series?

Hello Ted,

Due to some unexpected guests at home, I was on leave since last weekend.
I am starting to work from today. Let me work on the revised version of this
patch series. I will try to complete it before end of day i.e. before
our call.


-ritesh

^ permalink raw reply	[flat|nested] 35+ messages in thread

end of thread, other threads:[~2022-03-10  1:22 UTC | newest]

Thread overview: 35+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-02-22 20:34 [RFC 0/9] ext4: Improve FC trace events and discuss one FC failure Ritesh Harjani
2022-02-22 20:34 ` [RFC 1/9] ext4: Remove unused enum EXT4_FC_COMMIT_FAILED Ritesh Harjani
2022-02-23  9:37   ` Jan Kara
2022-02-27 18:27     ` harshad shirwadkar
2022-02-22 20:34 ` [RFC 2/9] ext4: Fix ext4_fc_stats trace point Ritesh Harjani
2022-02-22 20:52   ` Steven Rostedt
2022-02-23  9:54   ` Jan Kara
2022-02-27 18:29     ` harshad shirwadkar
2022-02-22 20:34 ` [RFC 3/9] ext4: Add couple of more fast_commit tracepoints Ritesh Harjani
2022-02-23  9:40   ` Jan Kara
2022-02-23 10:11     ` Ritesh Harjani
2022-02-23 11:53       ` Jan Kara
2022-02-23 12:04         ` Ritesh Harjani
2022-02-22 20:34 ` [RFC 4/9] ext4: Do not call FC trace event if FS does not support FC Ritesh Harjani
2022-02-23  9:41   ` Jan Kara
2022-02-27 18:30     ` harshad shirwadkar
2022-02-22 20:34 ` [RFC 5/9] ext4: Add commit_tid info in jbd debug log Ritesh Harjani
2022-02-23  9:42   ` Jan Kara
2022-02-27 18:31     ` harshad shirwadkar
2022-02-22 20:34 ` [RFC 6/9] ext4: Add commit tid info in ext4_fc_commit_start/stop trace events Ritesh Harjani
2022-02-23  9:44   ` Jan Kara
2022-02-27 18:31     ` harshad shirwadkar
2022-02-22 20:34 ` [RFC 7/9] ext4: Fix remaining two trace events to use same printk convention Ritesh Harjani
2022-02-23  9:45   ` Jan Kara
2022-02-27 18:32     ` harshad shirwadkar
2022-02-22 20:34 ` [RFC 8/9] ext4: Convert ext4_fc_track_dentry type events to use event class Ritesh Harjani
2022-02-23  9:49   ` Jan Kara
2022-02-27 18:35     ` harshad shirwadkar
2022-02-22 20:34 ` [RFC 9/9] ext4: fast_commit missing tracking updates to a file Ritesh Harjani
2022-02-23  3:50   ` [External] " Xin Yin
2022-02-23 13:58     ` Ritesh Harjani
2022-02-24 11:43       ` Xin Yin
2022-02-27 20:51         ` harshad shirwadkar
2022-03-09 17:48 ` [RFC 0/9] ext4: Improve FC trace events and discuss one FC failure Theodore Ts'o
2022-03-10  1:22   ` Ritesh Harjani

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.