All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Theodore Ts'o" <tytso@mit.edu>
To: Denis <denis@voxelsoft.com>
Cc: "Благодаренко Артём" <artem.blagodarenko@gmail.com>,
	linux-ext4@vger.kernel.org
Subject: Re: bug with large_dir in 5.12.17
Date: Sat, 31 Jul 2021 01:13:37 -0400	[thread overview]
Message-ID: <YQTcAfR3JX8BYj/f@mit.edu> (raw)
In-Reply-To: <7f781a3cd7114db0842dc3f291cd3f6cd826917f.camel@voxelsoft.com>

On Fri, Jul 30, 2021 at 02:15:12AM +0100, Denis wrote:
> 
> My emails to the list have been silently dropped and postmaster does
> not respond.

I'm not sure why your e-mails are getting dropped by vger.kernel.org.
I will note that voxelsoft.com appears to be on the a barracuda spam
blacklist:

https://mxtoolbox.com/SuperTool.aspx?action=blacklist%3avoxelsoft.com&run=toolpage

> http://voxelsoft.com/2021/ext4_large_dir_corruption.html

I've looked your analysis, and while I appreciate your goals:

   * a naïve approach would be to reinstate the missing goto, but
       1. such approach is likely to contribute to another mistake in the future
       2. goto considered harmful
   * instead refactor the outer scope to handle both restart conditions
     (goto agnostic)

I've looked at your proposed fix, and I think a simpler fix might be:

diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
index 5fd56f616cf0..f3bbcd4efb56 100644
--- a/fs/ext4/namei.c
+++ b/fs/ext4/namei.c
@@ -2517,7 +2517,7 @@ static int ext4_dx_add_entry(handle_t *handle, struct ext4_filename *fname,
 				goto journal_error;
 			err = ext4_handle_dirty_dx_node(handle, dir,
 							frame->bh);
-			if (err)
+			if (restart || err)
 				goto journal_error;
 		} else {
 			struct dx_root *dxroot;


It is similarly "goto agnostic" in that it doesn't actually add (or
remove) any gotos.

I don't think your proposed fix improves the maintainability or
understandability for the code, so my bias would be to go for the
simpler fix.

If we were going to clean up the code, the main issues that I'd want
to address are:

*) Goto statements aren't necessarilyy evil; using "goto errout" is
   pretty common idiom, and I don't have a problem with that.
   Occasionally having a "goto again" when we need to retry some
   operation can actually be the cleanest way to code something.  But
   the combination of the two --- when "goto cleanup" and "goto
   journal_error" is used for both purposes --- is super confusing,
   and that is caused the bug in the first place.

So for example, this diff increase the goto count by one, but the code
is much more readable, because each goto and goto label do exactly one
thing:

 fs/ext4/namei.c | 25 ++++++++++++-------------
 1 file changed, 12 insertions(+), 13 deletions(-)

diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
index 5fd56f616cf0..58fca88d7459 100644
--- a/fs/ext4/namei.c
+++ b/fs/ext4/namei.c
@@ -2398,14 +2398,15 @@ static int ext4_dx_add_entry(handle_t *handle, struct ext4_filename *fname,
 {
 	struct dx_frame frames[EXT4_HTREE_LEVEL], *frame;
 	struct dx_entry *entries, *at;
-	struct buffer_head *bh;
+	struct buffer_head *bh = NULL;
 	struct super_block *sb = dir->i_sb;
 	struct ext4_dir_entry_2 *de;
-	int restart;
 	int err;
 
-again:
-	restart = 0;
+	frames[0].bh = NULL;
+recalc_htree_path:
+	brelse(bh);
+	dx_release(frames);
 	frame = dx_probe(fname, dir, NULL, frames);
 	if (IS_ERR(frame))
 		return PTR_ERR(frame);
@@ -2436,7 +2437,7 @@ static int ext4_dx_add_entry(handle_t *handle, struct ext4_filename *fname,
 		ext4_lblk_t newblock;
 		int levels = frame - frames + 1;
 		unsigned int icount;
-		int add_level = 1;
+		int add_level = 1, restart = 0;
 		struct dx_entry *entries2;
 		struct dx_node *node2;
 		struct buffer_head *bh2;
@@ -2508,9 +2509,9 @@ static int ext4_dx_add_entry(handle_t *handle, struct ext4_filename *fname,
 			dxtrace(dx_show_index("node",
 			       ((struct dx_node *) bh2->b_data)->entries));
 			err = ext4_handle_dirty_dx_node(handle, dir, bh2);
+			brelse (bh2);
 			if (err)
 				goto journal_error;
-			brelse (bh2);
 			err = ext4_handle_dirty_dx_node(handle, dir,
 						   (frame - 1)->bh);
 			if (err)
@@ -2519,6 +2520,8 @@ static int ext4_dx_add_entry(handle_t *handle, struct ext4_filename *fname,
 							frame->bh);
 			if (err)
 				goto journal_error;
+			if (restart)
+				goto recalc_htree_path;
 		} else {
 			struct dx_root *dxroot;
 			memcpy((char *) entries2, (char *) entries,
@@ -2538,8 +2541,9 @@ static int ext4_dx_add_entry(handle_t *handle, struct ext4_filename *fname,
 				goto journal_error;
 			err = ext4_handle_dirty_dx_node(handle, dir, bh2);
 			brelse(bh2);
-			restart = 1;
-			goto journal_error;
+			if (err)
+				goto journal_error;
+			goto recalc_htree_path;
 		}
 	}
 	de = do_split(handle, dir, &bh, frame, &fname->hinfo);
@@ -2555,11 +2559,6 @@ static int ext4_dx_add_entry(handle_t *handle, struct ext4_filename *fname,
 cleanup:
 	brelse(bh);
 	dx_release(frames);
-	/* @restart is true means htree-path has been changed, we need to
-	 * repeat dx_probe() to find out valid htree-path
-	 */
-	if (restart && err == 0)
-		goto again;
 	return err;
 }
 

*) The ext4_dx_add_entry() function is just too big.  It was
   borderline unwieldy before we added the large_dir feature, and that
   large_dir feature just tip it over the line.  We probably should to
   refactor separate functions for "do_split_index()" and
   "add_htree_level()".

*) The control avariables "add_level" and "restart" are an additional
   cause of complexity.  It may be that refactoring some of the
   separate functions, and simply forcing that the hree path get
   recalculated if an index node is split or the tree depth increases,
   would allow us to get rid of these two variables.

Denis, Artem, what do you think?

					- Ted
					

  parent reply	other threads:[~2021-07-31  5:13 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-22 14:23 bug with large_dir in 5.12.17 Carlos Carvalho
2021-07-27  6:22 ` Andreas Dilger
2021-07-27 19:07   ` Carlos Carvalho
2021-07-28  0:38     ` Carlos Carvalho
2021-07-28 16:07   ` Благодаренко Артём
2021-07-28 13:56 ` Благодаренко Артём
2021-07-29 19:23 ` Благодаренко Артём
     [not found]   ` <7f781a3cd7114db0842dc3f291cd3f6cd826917f.camel@voxelsoft.com>
2021-07-31  5:13     ` Theodore Ts'o [this message]
2021-08-04 19:25   ` Theodore Ts'o
2021-08-04 21:15     ` Благодаренко Артём
2021-08-05 14:39       ` Theodore Ts'o

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YQTcAfR3JX8BYj/f@mit.edu \
    --to=tytso@mit.edu \
    --cc=artem.blagodarenko@gmail.com \
    --cc=denis@voxelsoft.com \
    --cc=linux-ext4@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.