From: Ritesh Harjani <riteshh@linux.ibm.com>
To: linux-ext4@vger.kernel.org
Cc: Harshad Shirwadkar <harshadshirwadkar@gmail.com>,
	"Theodore Ts'o" <tytso@mit.edu>, Jan Kara <jack@suse.cz>,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	Ritesh Harjani <riteshh@linux.ibm.com>
Subject: [RFC 9/9] ext4: fast_commit missing tracking updates to a file
Date: Wed, 23 Feb 2022 02:04:17 +0530
Message-ID: <e91b6872860df3ec520799a5d0b65e54ccf32407.1645558375.git.riteshh@linux.ibm.com>
In-Reply-To: <cover.1645558375.git.riteshh@linux.ibm.com>

<DO NOT MERGE THIS YET>

Testcase
==========
1. i=0; while [ $i -lt 1000 ]; do xfs_io -f -c "pwrite -S 0xaa -b 32k 0 32k" -c "fsync" /mnt/$i; i=$(($i+1)); done && sudo ./src/godown -v /mnt && sudo umount /mnt && sudo mount /dev/loop2 /mnt
2. ls -alih /mnt/ -> Here you will observe one of these files with a size of 0 bytes (which ideally should not happen, since every file was fsynced)

Note: if you don't see the issue because your underlying storage
device is very fast, try mounting with the commit=1 mount option.

Analysis
==========
It seems a file's updates can be part of two transaction tids.
Below is the sequence of events which could cause this issue.

jbd2_handle_start -> (t_tid = 38)
__ext4_new_inode
ext4_fc_track_template -> __track_inode -> (i_sync_tid = 38, t_tid = 38)
<track more updates>
jbd2_start_commit -> (t_tid = 38)

jbd2_handle_start -> (t_tid = 39)
ext4_fc_track_template -> __track_inode -> (i_sync_tid = 38, t_tid = 39)
    -> ext4_fc_reset_inode & ei->i_sync_tid = t_tid

ext4_fc_commit_start -> (will wait since jbd2 full commit is in progress)
jbd2_end_commit (t_tid = 38)
    -> jbd2_fc_cleanup() -> this will cleanup entries in sbi->s_fc_q[FC_Q_MAIN]
        -> And the above could result in the inode size being 0 as an after-effect.
ext4_fc_commit_stop

You could find the logs for the above behavior for inode 979 at [1].

-> So what is happening here is that, since ei->i_fc_list is not empty
(because the inode is already part of the sb's MAIN queue), we don't add
this inode again to either the sb's MAIN or STAGING queue.
Then, after jbd2_fc_cleanup() is called from the jbd2 full commit, we
simply remove this inode from the MAIN queue, even though it now carries
updates that belong to the newer tid.
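
The sequence above can be replayed with a small userspace model. This is
only a sketch of the queueing logic as described in this mail, not kernel
code: struct model_inode, model_track(), model_cleanup(), model_replay(),
the tracked_updates counter and the FC_Q_NONE value are all hypothetical
names made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int tid_t;

/* FC_Q_NONE is a hypothetical value modelling list_empty(&ei->i_fc_list). */
enum model_queue { FC_Q_NONE, FC_Q_MAIN, FC_Q_STAGING };

/* Hypothetical stand-in for the per-inode fast-commit state. */
struct model_inode {
	tid_t i_sync_tid;         /* tid that last tracked this inode */
	enum model_queue queue;   /* which sb queue the inode sits on */
	int tracked_updates;      /* updates tracked since the last reset */
};

/* Models the track path: reset on tid mismatch, enqueue only if not queued. */
void model_track(struct model_inode *ei, tid_t running_tid, bool commit_ongoing)
{
	assert(ei != NULL);
	if (ei->i_sync_tid != running_tid) {
		/* models ext4_fc_reset_inode() and ei->i_sync_tid = t_tid */
		ei->tracked_updates = 0;
		ei->i_sync_tid = running_tid;
	}
	ei->tracked_updates++;
	/* An inode that is already queued is not added to any queue again. */
	if (ei->queue == FC_Q_NONE)
		ei->queue = commit_ongoing ? FC_Q_STAGING : FC_Q_MAIN;
}

/* Models cleanup without the fix: MAIN is emptied, STAGING moves to MAIN. */
void model_cleanup(struct model_inode *ei)
{
	assert(ei != NULL);
	if (ei->queue == FC_Q_MAIN)
		ei->queue = FC_Q_NONE;
	else if (ei->queue == FC_Q_STAGING)
		ei->queue = FC_Q_MAIN;
}

/* Replays: track under tid 38, track under tid 39, cleanup for tid 38. */
struct model_inode model_replay(void)
{
	struct model_inode ei = { .i_sync_tid = 0, .queue = FC_Q_NONE,
				  .tracked_updates = 0 };

	model_track(&ei, 38, false);  /* queued on MAIN under tid 38 */
	model_track(&ei, 39, true);   /* tid 39 update, still on MAIN */
	model_cleanup(&ei);           /* full commit of tid 38 finishes */
	return ei;
}
```

In this model, model_replay() returns an inode with i_sync_tid == 39 and
one tracked update, but with queue == FC_Q_NONE, i.e. the tid 39 update
is no longer on any queue for a later fast commit to pick up.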

So as a simple fix, what I did below was to check in ext4_fc_cleanup()
whether this is a jbd2 full commit and whether ei->i_sync_tid > tid. If
so, we must not remove the inode from the MAIN queue, since neither jbd2
nor FC has yet committed that inode's updates for the new txn tid.
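
The check can be illustrated with a minimal userspace sketch. The names
struct model_inode, model_tid_gt() and model_cleanup_fixed() are
hypothetical. Note one deliberate difference from the diff in this mail:
the sketch compares tids with a signed difference, in the style of jbd2's
tid_gt() helper, so that it also behaves sensibly across tid wraparound,
whereas the diff uses a plain ">".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int tid_t;

/* Hypothetical stand-in for an inode sitting on the MAIN queue. */
struct model_inode {
	tid_t i_sync_tid;   /* tid that last tracked this inode */
	bool  queued;       /* models !list_empty(&ei->i_fc_list) */
};

/* Wraparound-safe "x is newer than y" (assumption: same idea as tid_gt()). */
bool model_tid_gt(tid_t x, tid_t y)
{
	int difference = (int)(x - y);

	return difference > 0;
}

/* Returns true if the inode is still queued after cleanup for @tid. */
bool model_cleanup_fixed(struct model_inode *ei, bool full, tid_t tid)
{
	assert(ei != NULL);
	/* models the added "continue": keep inodes tracked by a newer tid */
	if (full && model_tid_gt(ei->i_sync_tid, tid))
		return ei->queued;
	ei->queued = false;   /* models list_del_init(&iter->i_fc_list) */
	return ei->queued;
}
```

With this model, an inode with i_sync_tid = 39 survives a full-commit
cleanup for tid 38, while an inode with i_sync_tid = 38 is removed as
before. Cleanup after a fast commit (full == false) is unchanged.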

But below are some quick queries on this
=========================================

1. Why do we call ext4_fc_reset_inode() when the inode tid and the
   running txn tid do not match?

2. Also, is this expected behavior from the design perspective of
   fast_commit? i.e.
   a. Can the inode be part of two tids?
   b. While a full commit is in progress, can the inode still receive
   updates, but using a new transaction tid?

Frankly speaking, since I was also working on other things, I haven't
yet had the chance to completely analyze the situation.
Once I have those things sorted, I will spend more time on this to
understand it better. Meanwhile, if you already have answers to the above
queries/observations, please do share them here.

Links
=========
[1] https://raw.githubusercontent.com/riteshharjani/LinuxStudy/master/ext4/fast_commit/fc_inode_missing_updates_ino_979.txt

Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
---
 fs/ext4/fast_commit.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c
index 8803ba087b07..769b584c2552 100644
--- a/fs/ext4/fast_commit.c
+++ b/fs/ext4/fast_commit.c
@@ -1252,6 +1252,8 @@ static void ext4_fc_cleanup(journal_t *journal, int full, tid_t tid)
 	spin_lock(&sbi->s_fc_lock);
 	list_for_each_entry_safe(iter, iter_n, &sbi->s_fc_q[FC_Q_MAIN],
 				 i_fc_list) {
+		if (full && iter->i_sync_tid > tid)
+			continue;
 		list_del_init(&iter->i_fc_list);
 		ext4_clear_inode_state(&iter->vfs_inode,
 				       EXT4_STATE_FC_COMMITTING);
-- 
2.31.1

