All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 3/7] ext4: shutdown should not prevent get_write_access
       [not found] <20180220023038.19883-1-tytso@mit.edu>
@ 2018-02-20  2:30 ` Theodore Ts'o
  2018-03-07  9:42   ` Jan Kara
  2018-02-20  2:30 ` [PATCH 4/7] ext4: eliminate sleep from shutdown ioctl Theodore Ts'o
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 14+ messages in thread
From: Theodore Ts'o @ 2018-02-20  2:30 UTC (permalink / raw)
  To: Ext4 Developers List; +Cc: Theodore Ts'o, stable

The ext4 forced shutdown flag needs to prevent new handles from being
started, but it needs to allow existing handles to complete.  So the
forced shutdown flag should not force ext4_journal_get_write_access to
fail.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
---
 fs/ext4/ext4_jbd2.c | 7 -------
 1 file changed, 7 deletions(-)

diff --git a/fs/ext4/ext4_jbd2.c b/fs/ext4/ext4_jbd2.c
index 2d593201cf7a..7c70b08d104c 100644
--- a/fs/ext4/ext4_jbd2.c
+++ b/fs/ext4/ext4_jbd2.c
@@ -166,13 +166,6 @@ int __ext4_journal_get_write_access(const char *where, unsigned int line,
 	might_sleep();
 
 	if (ext4_handle_valid(handle)) {
-		struct super_block *sb;
-
-		sb = handle->h_transaction->t_journal->j_private;
-		if (unlikely(ext4_forced_shutdown(EXT4_SB(sb)))) {
-			jbd2_journal_abort_handle(handle);
-			return -EIO;
-		}
 		err = jbd2_journal_get_write_access(handle, bh);
 		if (err)
 			ext4_journal_abort_handle(where, line, __func__, bh,
-- 
2.16.1.72.g5be1f00a9a

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 4/7] ext4: eliminate sleep from shutdown ioctl
       [not found] <20180220023038.19883-1-tytso@mit.edu>
  2018-02-20  2:30 ` [PATCH 3/7] ext4: shutdown should not prevent get_write_access Theodore Ts'o
@ 2018-02-20  2:30 ` Theodore Ts'o
  2018-03-07  9:07   ` Jan Kara
  2018-02-20  2:30 ` [PATCH 5/7] ext4: pass -ESHUTDOWN code to jbd2 layer Theodore Ts'o
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 14+ messages in thread
From: Theodore Ts'o @ 2018-02-20  2:30 UTC (permalink / raw)
  To: Ext4 Developers List; +Cc: Theodore Ts'o, stable

The msleep() when processing EXT4_GOING_FLAGS_NOLOGFLUSH was a hack to
avoid some races (that are now fixed), but in fact it introduced its
own race.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
---
 fs/ext4/ioctl.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c
index 4d1b1575f8ac..16d3d1325f5b 100644
--- a/fs/ext4/ioctl.c
+++ b/fs/ext4/ioctl.c
@@ -498,10 +498,8 @@ static int ext4_shutdown(struct super_block *sb, unsigned long arg)
 		break;
 	case EXT4_GOING_FLAGS_NOLOGFLUSH:
 		set_bit(EXT4_FLAGS_SHUTDOWN, &sbi->s_ext4_flags);
-		if (sbi->s_journal && !is_journal_aborted(sbi->s_journal)) {
-			msleep(100);
+		if (sbi->s_journal && !is_journal_aborted(sbi->s_journal))
 			jbd2_journal_abort(sbi->s_journal, 0);
-		}
 		break;
 	default:
 		return -EINVAL;
-- 
2.16.1.72.g5be1f00a9a

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 5/7] ext4: pass -ESHUTDOWN code to jbd2 layer
       [not found] <20180220023038.19883-1-tytso@mit.edu>
  2018-02-20  2:30 ` [PATCH 3/7] ext4: shutdown should not prevent get_write_access Theodore Ts'o
  2018-02-20  2:30 ` [PATCH 4/7] ext4: eliminate sleep from shutdown ioctl Theodore Ts'o
@ 2018-02-20  2:30 ` Theodore Ts'o
  2018-03-06 17:10   ` Jan Kara
  2018-02-20  2:30 ` [PATCH 6/7] jbd2: if the journal is aborted then don't allow update of the log tail Theodore Ts'o
  2018-02-20  2:30 ` [PATCH 7/7] ext4: don't update checksum of new initialized bitmaps Theodore Ts'o
  4 siblings, 1 reply; 14+ messages in thread
From: Theodore Ts'o @ 2018-02-20  2:30 UTC (permalink / raw)
  To: Ext4 Developers List; +Cc: Theodore Ts'o, stable

Previously the jbd2 layer assumed that a file system check would be
required after a journal abort.  In the case of the deliberate file
system shutdown, this should not be necessary.  Allow the jbd2 layer
to distinguish between these two cases by using the ESHUTDOWN errno.

Also add proper locking to __journal_abort_soft().

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
---
 fs/ext4/ioctl.c   |  4 ++--
 fs/jbd2/journal.c | 25 +++++++++++++++++++------
 2 files changed, 21 insertions(+), 8 deletions(-)

diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c
index 16d3d1325f5b..9ac33a7cbd32 100644
--- a/fs/ext4/ioctl.c
+++ b/fs/ext4/ioctl.c
@@ -493,13 +493,13 @@ static int ext4_shutdown(struct super_block *sb, unsigned long arg)
 		set_bit(EXT4_FLAGS_SHUTDOWN, &sbi->s_ext4_flags);
 		if (sbi->s_journal && !is_journal_aborted(sbi->s_journal)) {
 			(void) ext4_force_commit(sb);
-			jbd2_journal_abort(sbi->s_journal, 0);
+			jbd2_journal_abort(sbi->s_journal, -ESHUTDOWN);
 		}
 		break;
 	case EXT4_GOING_FLAGS_NOLOGFLUSH:
 		set_bit(EXT4_FLAGS_SHUTDOWN, &sbi->s_ext4_flags);
 		if (sbi->s_journal && !is_journal_aborted(sbi->s_journal))
-			jbd2_journal_abort(sbi->s_journal, 0);
+			jbd2_journal_abort(sbi->s_journal, -ESHUTDOWN);
 		break;
 	default:
 		return -EINVAL;
diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
index 3fbf48ec2188..efa0c72a0b9f 100644
--- a/fs/jbd2/journal.c
+++ b/fs/jbd2/journal.c
@@ -1483,12 +1483,15 @@ static void jbd2_mark_journal_empty(journal_t *journal, int write_op)
 void jbd2_journal_update_sb_errno(journal_t *journal)
 {
 	journal_superblock_t *sb = journal->j_superblock;
+	int errcode;
 
 	read_lock(&journal->j_state_lock);
-	jbd_debug(1, "JBD2: updating superblock error (errno %d)\n",
-		  journal->j_errno);
-	sb->s_errno    = cpu_to_be32(journal->j_errno);
+	errcode = journal->j_errno;
 	read_unlock(&journal->j_state_lock);
+	if (errcode == -ESHUTDOWN)
+		errcode = 0;
+	jbd_debug(1, "JBD2: updating superblock error (errno %d)\n", errcode);
+	sb->s_errno    = cpu_to_be32(errcode);
 
 	jbd2_write_superblock(journal, REQ_SYNC | REQ_FUA);
 }
@@ -2105,12 +2108,22 @@ void __jbd2_journal_abort_hard(journal_t *journal)
  * but don't do any other IO. */
 static void __journal_abort_soft (journal_t *journal, int errno)
 {
-	if (journal->j_flags & JBD2_ABORT)
-		return;
+	int old_errno;
 
-	if (!journal->j_errno)
+	write_lock(&journal->j_state_lock);
+	old_errno = journal->j_errno;
+	if (!journal->j_errno || errno == -ESHUTDOWN)
 		journal->j_errno = errno;
 
+	if (journal->j_flags & JBD2_ABORT) {
+		write_unlock(&journal->j_state_lock);
+		if (!old_errno && old_errno != -ESHUTDOWN &&
+		    errno == -ESHUTDOWN)
+			jbd2_journal_update_sb_errno(journal);
+		return;
+	}
+	write_unlock(&journal->j_state_lock);
+
 	__jbd2_journal_abort_hard(journal);
 
 	if (errno) {
-- 
2.16.1.72.g5be1f00a9a

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 6/7] jbd2: if the journal is aborted then don't allow update of the log tail
       [not found] <20180220023038.19883-1-tytso@mit.edu>
                   ` (2 preceding siblings ...)
  2018-02-20  2:30 ` [PATCH 5/7] ext4: pass -ESHUTDOWN code to jbd2 layer Theodore Ts'o
@ 2018-02-20  2:30 ` Theodore Ts'o
  2018-03-07  8:55   ` Jan Kara
  2018-02-20  2:30 ` [PATCH 7/7] ext4: don't update checksum of new initialized bitmaps Theodore Ts'o
  4 siblings, 1 reply; 14+ messages in thread
From: Theodore Ts'o @ 2018-02-20  2:30 UTC (permalink / raw)
  To: Ext4 Developers List; +Cc: Theodore Ts'o, stable

This updates the jbd2 superblock unnecessarily, and on an abort we
shouldn't truncate the log.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
---
 fs/jbd2/journal.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
index efa0c72a0b9f..dfb057900e79 100644
--- a/fs/jbd2/journal.c
+++ b/fs/jbd2/journal.c
@@ -974,7 +974,7 @@ int __jbd2_update_log_tail(journal_t *journal, tid_t tid, unsigned long block)
 }
 
 /*
- * This is a variaon of __jbd2_update_log_tail which checks for validity of
+ * This is a variation of __jbd2_update_log_tail which checks for validity of
  * provided log tail and locks j_checkpoint_mutex. So it is safe against races
  * with other threads updating log tail.
  */
@@ -1417,6 +1417,9 @@ int jbd2_journal_update_sb_log_tail(journal_t *journal, tid_t tail_tid,
 	journal_superblock_t *sb = journal->j_superblock;
 	int ret;
 
+	if (is_journal_aborted(journal))
+		return -EIO;
+
 	BUG_ON(!mutex_is_locked(&journal->j_checkpoint_mutex));
 	jbd_debug(1, "JBD2: updating superblock (start %lu, seq %u)\n",
 		  tail_block, tail_tid);
-- 
2.16.1.72.g5be1f00a9a

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 7/7] ext4: don't update checksum of new initialized bitmaps
       [not found] <20180220023038.19883-1-tytso@mit.edu>
                   ` (3 preceding siblings ...)
  2018-02-20  2:30 ` [PATCH 6/7] jbd2: if the journal is aborted then don't allow update of the log tail Theodore Ts'o
@ 2018-02-20  2:30 ` Theodore Ts'o
  2018-03-07  9:06   ` Jan Kara
  4 siblings, 1 reply; 14+ messages in thread
From: Theodore Ts'o @ 2018-02-20  2:30 UTC (permalink / raw)
  To: Ext4 Developers List; +Cc: Theodore Ts'o, stable

When reading the inode or block allocation bitmap, if the bitmap needs
to be initialized, do not update the checksum in the block group
descriptor.  That's because we're not set up to journal those changes.
Instead, just set the verified bit on the bitmap block, so that it's
not necessary to validate the checksum.

When a block or inode allocation actually happens, at that point the
checksum will be calculated, and update of the bg descriptor block
will be properly journalled.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
---
 fs/ext4/balloc.c |  3 +--
 fs/ext4/ialloc.c | 47 +++--------------------------------------------
 2 files changed, 4 insertions(+), 46 deletions(-)

diff --git a/fs/ext4/balloc.c b/fs/ext4/balloc.c
index f9b3e0a83526..f82c4966f4ce 100644
--- a/fs/ext4/balloc.c
+++ b/fs/ext4/balloc.c
@@ -243,8 +243,6 @@ static int ext4_init_block_bitmap(struct super_block *sb,
 	 */
 	ext4_mark_bitmap_end(num_clusters_in_group(sb, block_group),
 			     sb->s_blocksize * 8, bh->b_data);
-	ext4_block_bitmap_csum_set(sb, block_group, gdp, bh);
-	ext4_group_desc_csum_set(sb, block_group, gdp);
 	return 0;
 }
 
@@ -448,6 +446,7 @@ ext4_read_block_bitmap_nowait(struct super_block *sb, ext4_group_t block_group)
 		err = ext4_init_block_bitmap(sb, bh, block_group, desc);
 		set_bitmap_uptodate(bh);
 		set_buffer_uptodate(bh);
+		set_buffer_verified(bh);
 		ext4_unlock_group(sb, block_group);
 		unlock_buffer(bh);
 		if (err) {
diff --git a/fs/ext4/ialloc.c b/fs/ext4/ialloc.c
index 7830d28df331..3fa93665b4a3 100644
--- a/fs/ext4/ialloc.c
+++ b/fs/ext4/ialloc.c
@@ -66,44 +66,6 @@ void ext4_mark_bitmap_end(int start_bit, int end_bit, char *bitmap)
 		memset(bitmap + (i >> 3), 0xff, (end_bit - i) >> 3);
 }
 
-/* Initializes an uninitialized inode bitmap */
-static int ext4_init_inode_bitmap(struct super_block *sb,
-				       struct buffer_head *bh,
-				       ext4_group_t block_group,
-				       struct ext4_group_desc *gdp)
-{
-	struct ext4_group_info *grp;
-	struct ext4_sb_info *sbi = EXT4_SB(sb);
-	J_ASSERT_BH(bh, buffer_locked(bh));
-
-	/* If checksum is bad mark all blocks and inodes use to prevent
-	 * allocation, essentially implementing a per-group read-only flag. */
-	if (!ext4_group_desc_csum_verify(sb, block_group, gdp)) {
-		grp = ext4_get_group_info(sb, block_group);
-		if (!EXT4_MB_GRP_BBITMAP_CORRUPT(grp))
-			percpu_counter_sub(&sbi->s_freeclusters_counter,
-					   grp->bb_free);
-		set_bit(EXT4_GROUP_INFO_BBITMAP_CORRUPT_BIT, &grp->bb_state);
-		if (!EXT4_MB_GRP_IBITMAP_CORRUPT(grp)) {
-			int count;
-			count = ext4_free_inodes_count(sb, gdp);
-			percpu_counter_sub(&sbi->s_freeinodes_counter,
-					   count);
-		}
-		set_bit(EXT4_GROUP_INFO_IBITMAP_CORRUPT_BIT, &grp->bb_state);
-		return -EFSBADCRC;
-	}
-
-	memset(bh->b_data, 0, (EXT4_INODES_PER_GROUP(sb) + 7) / 8);
-	ext4_mark_bitmap_end(EXT4_INODES_PER_GROUP(sb), sb->s_blocksize * 8,
-			bh->b_data);
-	ext4_inode_bitmap_csum_set(sb, block_group, gdp, bh,
-				   EXT4_INODES_PER_GROUP(sb) / 8);
-	ext4_group_desc_csum_set(sb, block_group, gdp);
-
-	return 0;
-}
-
 void ext4_end_bitmap_read(struct buffer_head *bh, int uptodate)
 {
 	if (uptodate) {
@@ -187,17 +149,14 @@ ext4_read_inode_bitmap(struct super_block *sb, ext4_group_t block_group)
 
 	ext4_lock_group(sb, block_group);
 	if (desc->bg_flags & cpu_to_le16(EXT4_BG_INODE_UNINIT)) {
-		err = ext4_init_inode_bitmap(sb, bh, block_group, desc);
+		memset(bh->b_data, 0, (EXT4_INODES_PER_GROUP(sb) + 7) / 8);
+		ext4_mark_bitmap_end(EXT4_INODES_PER_GROUP(sb),
+				     sb->s_blocksize * 8, bh->b_data);
 		set_bitmap_uptodate(bh);
 		set_buffer_uptodate(bh);
 		set_buffer_verified(bh);
 		ext4_unlock_group(sb, block_group);
 		unlock_buffer(bh);
-		if (err) {
-			ext4_error(sb, "Failed to init inode bitmap for group "
-				   "%u: %d", block_group, err);
-			goto out;
-		}
 		return bh;
 	}
 	ext4_unlock_group(sb, block_group);
-- 
2.16.1.72.g5be1f00a9a

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 5/7] ext4: pass -ESHUTDOWN code to jbd2 layer
  2018-02-20  2:30 ` [PATCH 5/7] ext4: pass -ESHUTDOWN code to jbd2 layer Theodore Ts'o
@ 2018-03-06 17:10   ` Jan Kara
  2018-03-22 15:26     ` Theodore Y. Ts'o
  0 siblings, 1 reply; 14+ messages in thread
From: Jan Kara @ 2018-03-06 17:10 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: Ext4 Developers List, stable

On Mon 19-02-18 21:30:36, Theodore Ts'o wrote:
> Previously the jbd2 layer assumed that a file system check would be
> required after a journal abort.  In the case of the deliberate file
> system shutdown, this should not be necessary.  Allow the jbd2 layer
> to distinguish between these two cases by using the ESHUTDOWN errno.
> 
> Also add proper locking to __journal_abort_soft().
> 
> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
> Cc: stable@vger.kernel.org
> ---

Maybe a cleaner api would be to have a separate JBD2 function like
jbd2_journal_abort_clean() that would just abort the journal without
setting an error to it.

> diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
> index 3fbf48ec2188..efa0c72a0b9f 100644
> --- a/fs/jbd2/journal.c
> +++ b/fs/jbd2/journal.c
> @@ -2105,12 +2108,22 @@ void __jbd2_journal_abort_hard(journal_t *journal)
>   * but don't do any other IO. */
>  static void __journal_abort_soft (journal_t *journal, int errno)
>  {
> -	if (journal->j_flags & JBD2_ABORT)
> -		return;
> +	int old_errno;
>  
> -	if (!journal->j_errno)
> +	write_lock(&journal->j_state_lock);
> +	old_errno = journal->j_errno;
> +	if (!journal->j_errno || errno == -ESHUTDOWN)
>  		journal->j_errno = errno;
>  
> +	if (journal->j_flags & JBD2_ABORT) {
> +		write_unlock(&journal->j_state_lock);
> +		if (!old_errno && old_errno != -ESHUTDOWN &&
> +		    errno == -ESHUTDOWN)
> +			jbd2_journal_update_sb_errno(journal);
> +		return;
> +	}
> +	write_unlock(&journal->j_state_lock);
> +

Is it really correct that once the filesystem gets shutdown you clear the
previous error from the journal? Because if we hit some real fs corruption,
the journal gets aborted, and then someone calls ext4_shutdown(), we'd
clear that error which looks like a bug to me because that shutdown hardly
fixes the fs corruption...

								Honza

-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 6/7] jbd2: if the journal is aborted then don't allow update of the log tail
  2018-02-20  2:30 ` [PATCH 6/7] jbd2: if the journal is aborted then don't allow update of the log tail Theodore Ts'o
@ 2018-03-07  8:55   ` Jan Kara
  0 siblings, 0 replies; 14+ messages in thread
From: Jan Kara @ 2018-03-07  8:55 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: Ext4 Developers List, stable

On Mon 19-02-18 21:30:37, Theodore Ts'o wrote:
> This updates the jbd2 superblock unnecessarily, and on an abort we
> shouldn't truncate the log.
> 
> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
> Cc: stable@vger.kernel.org

Looks good. You can add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/jbd2/journal.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
> index efa0c72a0b9f..dfb057900e79 100644
> --- a/fs/jbd2/journal.c
> +++ b/fs/jbd2/journal.c
> @@ -974,7 +974,7 @@ int __jbd2_update_log_tail(journal_t *journal, tid_t tid, unsigned long block)
>  }
>  
>  /*
> - * This is a variaon of __jbd2_update_log_tail which checks for validity of
> + * This is a variation of __jbd2_update_log_tail which checks for validity of
>   * provided log tail and locks j_checkpoint_mutex. So it is safe against races
>   * with other threads updating log tail.
>   */
> @@ -1417,6 +1417,9 @@ int jbd2_journal_update_sb_log_tail(journal_t *journal, tid_t tail_tid,
>  	journal_superblock_t *sb = journal->j_superblock;
>  	int ret;
>  
> +	if (is_journal_aborted(journal))
> +		return -EIO;
> +
>  	BUG_ON(!mutex_is_locked(&journal->j_checkpoint_mutex));
>  	jbd_debug(1, "JBD2: updating superblock (start %lu, seq %u)\n",
>  		  tail_block, tail_tid);
> -- 
> 2.16.1.72.g5be1f00a9a
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 7/7] ext4: don't update checksum of new initialized bitmaps
  2018-02-20  2:30 ` [PATCH 7/7] ext4: don't update checksum of new initialized bitmaps Theodore Ts'o
@ 2018-03-07  9:06   ` Jan Kara
  2018-03-22 15:29     ` Theodore Y. Ts'o
  0 siblings, 1 reply; 14+ messages in thread
From: Jan Kara @ 2018-03-07  9:06 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: Ext4 Developers List, stable

On Mon 19-02-18 21:30:38, Theodore Ts'o wrote:
> When reading the inode or block allocation bitmap, if the bitmap needs
> to be initialized, do not update the checksum in the block group
> descriptor.  That's because we're not set up to journal those changes.
> Instead, just set the verified bit on the bitmap block, so that it's
> not necessary to validate the checksum.
> 
> When a block or inode allocation actually happens, at that point the
> checksum will be calculated, and update of the bg descriptor block
> will be properly journalled.
> 
> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
> Cc: stable@vger.kernel.org
> ---
>  fs/ext4/balloc.c |  3 +--
>  fs/ext4/ialloc.c | 47 +++--------------------------------------------
>  2 files changed, 4 insertions(+), 46 deletions(-)
> 
> diff --git a/fs/ext4/balloc.c b/fs/ext4/balloc.c
> index f9b3e0a83526..f82c4966f4ce 100644
> --- a/fs/ext4/balloc.c
> +++ b/fs/ext4/balloc.c
> @@ -243,8 +243,6 @@ static int ext4_init_block_bitmap(struct super_block *sb,
>  	 */
>  	ext4_mark_bitmap_end(num_clusters_in_group(sb, block_group),
>  			     sb->s_blocksize * 8, bh->b_data);
> -	ext4_block_bitmap_csum_set(sb, block_group, gdp, bh);
> -	ext4_group_desc_csum_set(sb, block_group, gdp);
>  	return 0;
>  }

Probably you should remove the bad checksum handling in
ext4_init_block_bitmap() the same way as you did in
ext4_init_inode_bitmap()? Otherwise the patch looks good.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 4/7] ext4: eliminate sleep from shutdown ioctl
  2018-02-20  2:30 ` [PATCH 4/7] ext4: eliminate sleep from shutdown ioctl Theodore Ts'o
@ 2018-03-07  9:07   ` Jan Kara
  0 siblings, 0 replies; 14+ messages in thread
From: Jan Kara @ 2018-03-07  9:07 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: Ext4 Developers List, stable

On Mon 19-02-18 21:30:35, Theodore Ts'o wrote:
> The msleep() when processing EXT4_GOING_FLAGS_NOLOGFLUSH was a hack to
> avoid some races (that are now fixed), but in fact it introduced its
> own race.
> 
> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
> Cc: stable@vger.kernel.org

Looks good. You can add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/ext4/ioctl.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c
> index 4d1b1575f8ac..16d3d1325f5b 100644
> --- a/fs/ext4/ioctl.c
> +++ b/fs/ext4/ioctl.c
> @@ -498,10 +498,8 @@ static int ext4_shutdown(struct super_block *sb, unsigned long arg)
>  		break;
>  	case EXT4_GOING_FLAGS_NOLOGFLUSH:
>  		set_bit(EXT4_FLAGS_SHUTDOWN, &sbi->s_ext4_flags);
> -		if (sbi->s_journal && !is_journal_aborted(sbi->s_journal)) {
> -			msleep(100);
> +		if (sbi->s_journal && !is_journal_aborted(sbi->s_journal))
>  			jbd2_journal_abort(sbi->s_journal, 0);
> -		}
>  		break;
>  	default:
>  		return -EINVAL;
> -- 
> 2.16.1.72.g5be1f00a9a
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 3/7] ext4: shutdown should not prevent get_write_access
  2018-02-20  2:30 ` [PATCH 3/7] ext4: shutdown should not prevent get_write_access Theodore Ts'o
@ 2018-03-07  9:42   ` Jan Kara
  2018-03-22 15:38     ` Theodore Y. Ts'o
  0 siblings, 1 reply; 14+ messages in thread
From: Jan Kara @ 2018-03-07  9:42 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: Ext4 Developers List, stable

On Mon 19-02-18 21:30:34, Theodore Ts'o wrote:
> The ext4 forced shutdown flag needs to prevent new handles from being
> started, but it needs to allow existing handles to complete.  So the
> forced shutdown flag should not force ext4_journal_get_write_access to
> fail.
> 
> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
> Cc: stable@vger.kernel.org

OK, if you want the semantics of ext4 shutdown to be that running
handles should be allowed to complete, I see where you are going with this
patch. However there are more problems with this semantics than just
__ext4_journal_get_write_access(). Just for example
ext4_reserve_inode_write() will bail in case the fs got shutdown and thus
inode changes won't be properly added to the running handle. Also places
that rely on nested transactions being possible will not work because
ext4_journal_start_sb() will refuse to get refcount of a running handle
(ext4_journal_check_start() fails) in case fs got shutdown. And I may have
missed other cases.

The above problems are a reason why I though the semantics of ext4 shutdown
was terminate the fs *now* - effectively a software equivalent of power
off. That is much easier to implement since we just have to make sure no
running handle makes it to the journal... Since I've said Google is using
ext4 shutdown - is there any reason why you need the "running handles are
allowed to finish" semantics? After all it seems it's just a race whether
some handle makes it before the cut off or not...

								Honza

> ---
>  fs/ext4/ext4_jbd2.c | 7 -------
>  1 file changed, 7 deletions(-)
> 
> diff --git a/fs/ext4/ext4_jbd2.c b/fs/ext4/ext4_jbd2.c
> index 2d593201cf7a..7c70b08d104c 100644
> --- a/fs/ext4/ext4_jbd2.c
> +++ b/fs/ext4/ext4_jbd2.c
> @@ -166,13 +166,6 @@ int __ext4_journal_get_write_access(const char *where, unsigned int line,
>  	might_sleep();
>  
>  	if (ext4_handle_valid(handle)) {
> -		struct super_block *sb;
> -
> -		sb = handle->h_transaction->t_journal->j_private;
> -		if (unlikely(ext4_forced_shutdown(EXT4_SB(sb)))) {
> -			jbd2_journal_abort_handle(handle);
> -			return -EIO;
> -		}
>  		err = jbd2_journal_get_write_access(handle, bh);
>  		if (err)
>  			ext4_journal_abort_handle(where, line, __func__, bh,
> -- 
> 2.16.1.72.g5be1f00a9a
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 5/7] ext4: pass -ESHUTDOWN code to jbd2 layer
  2018-03-06 17:10   ` Jan Kara
@ 2018-03-22 15:26     ` Theodore Y. Ts'o
  2018-05-17 15:18       ` Jan Kara
  0 siblings, 1 reply; 14+ messages in thread
From: Theodore Y. Ts'o @ 2018-03-22 15:26 UTC (permalink / raw)
  To: Jan Kara; +Cc: Ext4 Developers List, stable

On Tue, Mar 06, 2018 at 06:10:41PM +0100, Jan Kara wrote:
> Is it really correct that once the filesystem gets shutdown you clear the
> previous error from the journal? Because if we hit some real fs corruption,
> the journal gets aborted, and then someone calls ext4_shutdown(), we'd
> clear that error which looks like a bug to me because that shutdown hardly
> fixes the fs corruption...

That's not what the code does.  If journal->j_errno is set, then we
won't clear it, for precisely what concern you've articulated.

      	    	    	      	   	   - Ted

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 7/7] ext4: don't update checksum of new initialized bitmaps
  2018-03-07  9:06   ` Jan Kara
@ 2018-03-22 15:29     ` Theodore Y. Ts'o
  0 siblings, 0 replies; 14+ messages in thread
From: Theodore Y. Ts'o @ 2018-03-22 15:29 UTC (permalink / raw)
  To: Jan Kara; +Cc: Ext4 Developers List, stable

On Wed, Mar 07, 2018 at 10:06:07AM +0100, Jan Kara wrote:
> 
> Probably you should remove the bad checksum handling in
> ext4_init_block_bitmap() the same way as you did in
> ext4_init_inode_bitmap()? Otherwise the patch looks good.

There isn't an ext4_init_inode_bitmap() --- the functionality is
inlined ext4_read_inode_bitmap(), and it's already doing it that way
(it doesn't set the checksum; it just sets the verified bit).

    	    	    	      	      - Ted
    

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 3/7] ext4: shutdown should not prevent get_write_access
  2018-03-07  9:42   ` Jan Kara
@ 2018-03-22 15:38     ` Theodore Y. Ts'o
  0 siblings, 0 replies; 14+ messages in thread
From: Theodore Y. Ts'o @ 2018-03-22 15:38 UTC (permalink / raw)
  To: Jan Kara; +Cc: Ext4 Developers List, stable

On Wed, Mar 07, 2018 at 10:42:01AM +0100, Jan Kara wrote:
> On Mon 19-02-18 21:30:34, Theodore Ts'o wrote:
> > The ext4 forced shutdown flag needs to prevent new handles from being
> > started, but it needs to allow existing handles to complete.  So the
> > forced shutdown flag should not force ext4_journal_get_write_access to
> > fail.
> > 
> > Signed-off-by: Theodore Ts'o <tytso@mit.edu>
> > Cc: stable@vger.kernel.org
> 
> OK, if you want the semantics of ext4 shutdown to be that running
> handles should be allowed to complete, I see where you are going with this
> patch. However there are more problems with this semantics than just
> __ext4_journal_get_write_access(). Just for example
> ext4_reserve_inode_write() will bail in case the fs got shutdown and thus
> inode changes won't be properly added to the running handle. Also places
> that rely on nested transactions being possible will not work because
> ext4_journal_start_sb() will refuse to get refcount of a running handle
> (ext4_journal_check_start() fails) in case fs got shutdown. And I may have
> missed other cases.

We have three cases in ext4_shutdown:

EXT4_GOING_FLAGS_DEFUALT:
    Freezes the block device, sets the EXT4_FLAGS_SHUTDOWN flag, and then
    unfreezes the block device:

EXT4_GOING_FLAGS_LOGFLUSH:
    Sets the EXT4_FLAGS_SHUTDOWN flag, forces a commit, and then aborts
    the journal.

EXT4_GOING_FLAGS_NOLOGFLUSH
    Sets the EXT4_FLAGS_SHUTDOWN flag, aborts the journal.

> The above problems are a reason why I though the semantics of ext4 shutdown
> was terminate the fs *now* - effectively a software equivalent of power
> off. That is much easier to implement since we just have to make sure no
> running handle makes it to the journal... Since I've said Google is using
> ext4 shutdown - is there any reason why you need the "running handles are
> allowed to finish" semantics? After all it seems it's just a race whether
> some handle makes it before the cut off or not...

Good points.  I'll have to look at what happens if we just drop the
EXT4_FLAGS_SHUTDOWN flag altogether.

							- Ted

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 5/7] ext4: pass -ESHUTDOWN code to jbd2 layer
  2018-03-22 15:26     ` Theodore Y. Ts'o
@ 2018-05-17 15:18       ` Jan Kara
  0 siblings, 0 replies; 14+ messages in thread
From: Jan Kara @ 2018-05-17 15:18 UTC (permalink / raw)
  To: Theodore Y. Ts'o; +Cc: Jan Kara, Ext4 Developers List, stable

On Thu 22-03-18 11:26:45, Theodore Y. Ts'o wrote:
> On Tue, Mar 06, 2018 at 06:10:41PM +0100, Jan Kara wrote:
> > Is it really correct that once the filesystem gets shutdown you clear the
> > previous error from the journal? Because if we hit some real fs corruption,
> > the journal gets aborted, and then someone calls ext4_shutdown(), we'd
> > clear that error which looks like a bug to me because that shutdown hardly
> > fixes the fs corruption...
> 
> That's not what the code does.  If journal->j_errno is set, then we
> won't clear it, for precisely what concern you've articulated.

Sorry for not following up on this earlier but now I've returned to this
and I still cannot wrap my head around checks in __journal_abort_soft().
There's:

       if (!journal->j_errno || errno == -ESHUTDOWN)
               journal->j_errno = errno;

Due to this ESHUTDOWN will override anything in journal->j_errno. Why is
that?

And then:

       if (journal->j_flags & JBD2_ABORT) {
                write_unlock(&journal->j_state_lock);
                if (!old_errno && old_errno != -ESHUTDOWN &&
                    errno == -ESHUTDOWN)
                        jbd2_journal_update_sb_errno(journal);

And here can the test ever be true? Probably only if we hard-aborted the
journal. The test !old_errno && old_errno != -ESHUTDOWN definitely looks
weird and is equivalent to old_errno == 0. Furthermore the errno ==
-ESHUTDOWN part looks strange as well as in that case we'd only clear the
sb->s_errno field... So what was really the intention here?

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2018-05-17 15:18 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20180220023038.19883-1-tytso@mit.edu>
2018-02-20  2:30 ` [PATCH 3/7] ext4: shutdown should not prevent get_write_access Theodore Ts'o
2018-03-07  9:42   ` Jan Kara
2018-03-22 15:38     ` Theodore Y. Ts'o
2018-02-20  2:30 ` [PATCH 4/7] ext4: eliminate sleep from shutdown ioctl Theodore Ts'o
2018-03-07  9:07   ` Jan Kara
2018-02-20  2:30 ` [PATCH 5/7] ext4: pass -ESHUTDOWN code to jbd2 layer Theodore Ts'o
2018-03-06 17:10   ` Jan Kara
2018-03-22 15:26     ` Theodore Y. Ts'o
2018-05-17 15:18       ` Jan Kara
2018-02-20  2:30 ` [PATCH 6/7] jbd2: if the journal is aborted then don't allow update of the log tail Theodore Ts'o
2018-03-07  8:55   ` Jan Kara
2018-02-20  2:30 ` [PATCH 7/7] ext4: don't update checksum of new initialized bitmaps Theodore Ts'o
2018-03-07  9:06   ` Jan Kara
2018-03-22 15:29     ` Theodore Y. Ts'o

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.