Linux-ext4 Archive on lore.kernel.org
* [PATCH] ext4: move buffer_mapped() to proper position
@ 2020-07-31 16:10 Xianting Tian
  2020-08-07 20:02 ` tytso
  0 siblings, 1 reply; 3+ messages in thread
From: Xianting Tian @ 2020-07-31 16:10 UTC
  To: viro, tytso, adilger.kernel; +Cc: linux-fsdevel, linux-kernel, linux-ext4

As you know, commit a17712c8 added the code below to avoid a
crash ('BUG_ON(!buffer_mapped(bh))' in submit_bh_wbc()) when a
device is hot-removed (a physical device is unplugged from a PCIe slot
or an nbd device's network is shut down).
static int ext4_commit_super():
 	if (!sbh || block_device_ejected(sb))
 		return error;
+
+	/*
+	 * The superblock bh should be mapped, but it might not be if the
+	 * device was hot-removed. Not much we can do but fail the I/O.
+	 */
+	if (!buffer_mapped(sbh))
+		return error;

The call trace that leads to the crash is as follows:
ext4_commit_super()
  __sync_dirty_buffer()
    submit_bh()
      submit_bh_wbc()
        BUG_ON(!buffer_mapped(bh));

Recently, however, we hit the same crash (with very low probability)
on device hot-removal even though the kernel already contained the
protection code above. The crash is still caused by
'BUG_ON(!buffer_mapped(bh))' in submit_bh_wbc(), with the same
call trace (the full log is included below).

As far as I understand, and as the code path below shows, a fair amount
of code still runs between the 'buffer_mapped(sbh)' check (added
by commit a17712c8) and 'BUG_ON(!buffer_mapped(bh))' in
submit_bh_wbc(). In particular, lock_buffer() is called twice, and
acquiring the lock can sometimes take a while. So during a device
hot-remove test there is a small chance that sbh is mapped when the
'buffer_mapped(sbh)' check (added by commit a17712c8) runs, but is
no longer mapped when 'BUG_ON(!buffer_mapped(bh))' executes
in submit_bh_wbc().
Code path:
ext4_commit_super
    if 'buffer_mapped(sbh)' is false, return <== commit a17712c8
          lock_buffer(sbh)
          ...
          unlock_buffer(sbh)
               __sync_dirty_buffer(sbh,...
                    lock_buffer(sbh)
                        if 'buffer_mapped(sbh)' is false, return <== added by this patch
                            submit_bh(...,sbh)
                                submit_bh_wbc(...,sbh,...)

This patch moves the 'buffer_mapped(sbh)' check into
__sync_dirty_buffer(), right before the submit_bh() call that reaches
'BUG_ON(!buffer_mapped(bh))' in submit_bh_wbc().
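
The window described above can be illustrated with a small user-space
model. This is only a sketch, not kernel code: struct fake_bh,
fake_hot_remove(), commit_early_check() and commit_late_check() are
hypothetical names invented for this illustration, and the hot-removal
is injected deterministically instead of racing on real hardware. The
model places the removal before the late check; it does not claim that
the real window is closed completely.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct buffer_head: only the mapped bit. */
struct fake_bh {
	bool mapped;
};

/* What happened to the write in this model. */
enum outcome {
	SUBMITTED,	/* buffer was mapped when it reached submission */
	REJECTED,	/* a buffer_mapped()-style check refused the write */
	WOULD_BUG,	/* BUG_ON(!buffer_mapped(bh)) would have fired */
};

/* Models hot-removal: the mapping disappears underneath the caller. */
static void fake_hot_remove(struct fake_bh *bh)
{
	bh->mapped = false;
}

/*
 * Old scheme: check once, early, in the caller. Anything that happens
 * between the check and the submission goes unnoticed.
 */
static enum outcome commit_early_check(struct fake_bh *bh, bool remove_in_window)
{
	if (!bh->mapped)
		return REJECTED;	/* early check, as in commit a17712c8 */

	/* ... lock, update, unlock, lock again: the window ... */
	if (remove_in_window)
		fake_hot_remove(bh);

	return bh->mapped ? SUBMITTED : WOULD_BUG;
}

/*
 * New scheme: the check sits right before submission, so a removal
 * that happened earlier is caught and turned into an error.
 */
static enum outcome commit_late_check(struct fake_bh *bh, bool remove_in_window)
{
	/* ... same work as above happens first ... */
	if (remove_in_window)
		fake_hot_remove(bh);

	if (!bh->mapped)
		return REJECTED;	/* late check, as in this patch */

	return SUBMITTED;
}
```

With a removal injected into the window, the early-check variant
reaches the point where the BUG_ON would fire, while the late-check
variant rejects the write instead.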

[100722.966497] kernel BUG at fs/buffer.c:3095! <== 'BUG_ON(!buffer_mapped(bh))' in submit_bh_wbc()
[100722.966503] invalid opcode: 0000 [#1] SMP
[100722.966566] task: ffff8817e15a9e40 task.stack: ffffc90024744000
[100722.966574] RIP: 0010:submit_bh_wbc+0x180/0x190
[100722.966575] RSP: 0018:ffffc90024747a90 EFLAGS: 00010246
[100722.966576] RAX: 0000000000620005 RBX: ffff8818a80603a8 RCX: 0000000000000000
[100722.966576] RDX: ffff8818a80603a8 RSI: 0000000000020800 RDI: 0000000000000001
[100722.966577] RBP: ffffc90024747ac0 R08: 0000000000000000 R09: ffff88207f94170d
[100722.966578] R10: 00000000000437c8 R11: 0000000000000001 R12: 0000000000020800
[100722.966578] R13: 0000000000000001 R14: 000000000bf9a438 R15: ffff88195f333000
[100722.966580] FS:  00007fa2eee27700(0000) GS:ffff88203d840000(0000) knlGS:0000000000000000
[100722.966580] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[100722.966581] CR2: 0000000000f0b008 CR3: 000000201a622003 CR4: 00000000007606e0
[100722.966582] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[100722.966583] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[100722.966583] PKRU: 55555554
[100722.966583] Call Trace:
[100722.966588]  __sync_dirty_buffer+0x6e/0xd0
[100722.966614]  ext4_commit_super+0x1d8/0x290 [ext4]
[100722.966626]  __ext4_std_error+0x78/0x100 [ext4]
[100722.966635]  ? __ext4_journal_get_write_access+0xca/0x120 [ext4]
[100722.966646]  ext4_reserve_inode_write+0x58/0xb0 [ext4]
[100722.966655]  ? ext4_dirty_inode+0x48/0x70 [ext4]
[100722.966663]  ext4_mark_inode_dirty+0x53/0x1e0 [ext4]
[100722.966671]  ? __ext4_journal_start_sb+0x6d/0xf0 [ext4]
[100722.966679]  ext4_dirty_inode+0x48/0x70 [ext4]
[100722.966682]  __mark_inode_dirty+0x17f/0x350
[100722.966686]  generic_update_time+0x87/0xd0
[100722.966687]  touch_atime+0xa9/0xd0
[100722.966690]  generic_file_read_iter+0xa09/0xcd0
[100722.966694]  ? page_cache_tree_insert+0xb0/0xb0
[100722.966704]  ext4_file_read_iter+0x4a/0x100 [ext4]
[100722.966707]  ? __inode_security_revalidate+0x4f/0x60
[100722.966709]  __vfs_read+0xec/0x160
[100722.966711]  vfs_read+0x8c/0x130
[100722.966712]  SyS_pread64+0x87/0xb0
[100722.966716]  do_syscall_64+0x67/0x1b0
[100722.966719]  entry_SYSCALL64_slow_path+0x25/0x25

Signed-off-by: Xianting Tian <xianting_tian@126.com>
---
 fs/buffer.c     | 9 +++++++++
 fs/ext4/super.c | 7 -------
 2 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/fs/buffer.c b/fs/buffer.c
index 64fe82e..75a8849 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -3160,6 +3160,15 @@ int __sync_dirty_buffer(struct buffer_head *bh, int op_flags)
 	WARN_ON(atomic_read(&bh->b_count) < 1);
 	lock_buffer(bh);
 	if (test_clear_buffer_dirty(bh)) {
+		/*
+		 * The bh should be mapped, but it might not be if the
+		 * device was hot-removed. Not much we can do but fail the I/O.
+		 */
+		if (!buffer_mapped(bh)) {
+			unlock_buffer(bh);
+			return -EIO;
+		}
+
 		get_bh(bh);
 		bh->b_end_io = end_buffer_write_sync;
 		ret = submit_bh(REQ_OP_WRITE, op_flags, bh);
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 330957e..1c22044 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -5171,13 +5171,6 @@ static int ext4_commit_super(struct super_block *sb, int sync)
 		return error;
 
 	/*
-	 * The superblock bh should be mapped, but it might not be if the
-	 * device was hot-removed. Not much we can do but fail the I/O.
-	 */
-	if (!buffer_mapped(sbh))
-		return error;
-
-	/*
 	 * If the file system is mounted read-only, don't update the
 	 * superblock write time.  This avoids updating the superblock
 	 * write time when we are mounting the root file system
-- 
1.8.3.1



* Re: [PATCH] ext4: move buffer_mapped() to proper position
  2020-07-31 16:10 [PATCH] ext4: move buffer_mapped() to proper position Xianting Tian
@ 2020-08-07 20:02 ` tytso
  2020-08-07 22:01   ` Andreas Dilger
  0 siblings, 1 reply; 3+ messages in thread
From: tytso @ 2020-08-07 20:02 UTC
  To: Xianting Tian
  Cc: viro, adilger.kernel, linux-fsdevel, linux-kernel, linux-ext4

Thanks, applied, although I rewrote the commit description to make it
a bit clearer:

    fs: prevent BUG_ON in submit_bh_wbc()
    
    If a device is hot-removed --- for example, when a physical device is
    unplugged from a PCIe slot or an nbd device's network is shut down ---
    this can result in a BUG_ON() crash in submit_bh_wbc().  This is
    because when the block device dies, the buffer heads will have
    their Buffer_Mapped flag cleared, leading to the crash in
    submit_bh_wbc().
    
    We had attempted to work around this problem in commit a17712c8
    ("ext4: check superblock mapped prior to committing").  Unfortunately,
    it's still possible to hit the BUG_ON(!buffer_mapped(bh)) if the
    device dies between the work-around check in ext4_commit_super()
    and the point where submit_bh_wbc() is finally called:
    
    Code path:
    ext4_commit_super
        if 'buffer_mapped(sbh)' is false, return <== commit a17712c8
              lock_buffer(sbh)
              ...
              unlock_buffer(sbh)
                   __sync_dirty_buffer(sbh,...
                        lock_buffer(sbh)
                            if 'buffer_mapped(sbh)' is false, return <== added by this patch
                                submit_bh(...,sbh)
                                    submit_bh_wbc(...,sbh,...)
    
    [100722.966497] kernel BUG at fs/buffer.c:3095! <== 'BUG_ON(!buffer_mapped(bh))' in submit_bh_wbc()
    [100722.966503] invalid opcode: 0000 [#1] SMP
    [100722.966566] task: ffff8817e15a9e40 task.stack: ffffc90024744000
    [100722.966574] RIP: 0010:submit_bh_wbc+0x180/0x190
    [100722.966575] RSP: 0018:ffffc90024747a90 EFLAGS: 00010246
    [100722.966576] RAX: 0000000000620005 RBX: ffff8818a80603a8 RCX: 0000000000000000
    [100722.966576] RDX: ffff8818a80603a8 RSI: 0000000000020800 RDI: 0000000000000001
    [100722.966577] RBP: ffffc90024747ac0 R08: 0000000000000000 R09: ffff88207f94170d
    [100722.966578] R10: 00000000000437c8 R11: 0000000000000001 R12: 0000000000020800
    [100722.966578] R13: 0000000000000001 R14: 000000000bf9a438 R15: ffff88195f333000
    [100722.966580] FS:  00007fa2eee27700(0000) GS:ffff88203d840000(0000) knlGS:0000000000000000
    [100722.966580] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [100722.966581] CR2: 0000000000f0b008 CR3: 000000201a622003 CR4: 00000000007606e0
    [100722.966582] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [100722.966583] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    [100722.966583] PKRU: 55555554
    [100722.966583] Call Trace:
    [100722.966588]  __sync_dirty_buffer+0x6e/0xd0
    [100722.966614]  ext4_commit_super+0x1d8/0x290 [ext4]
    [100722.966626]  __ext4_std_error+0x78/0x100 [ext4]
    [100722.966635]  ? __ext4_journal_get_write_access+0xca/0x120 [ext4]
    [100722.966646]  ext4_reserve_inode_write+0x58/0xb0 [ext4]
    [100722.966655]  ? ext4_dirty_inode+0x48/0x70 [ext4]
    [100722.966663]  ext4_mark_inode_dirty+0x53/0x1e0 [ext4]
    [100722.966671]  ? __ext4_journal_start_sb+0x6d/0xf0 [ext4]
    [100722.966679]  ext4_dirty_inode+0x48/0x70 [ext4]
    [100722.966682]  __mark_inode_dirty+0x17f/0x350
    [100722.966686]  generic_update_time+0x87/0xd0
    [100722.966687]  touch_atime+0xa9/0xd0
    [100722.966690]  generic_file_read_iter+0xa09/0xcd0
    [100722.966694]  ? page_cache_tree_insert+0xb0/0xb0
    [100722.966704]  ext4_file_read_iter+0x4a/0x100 [ext4]
    [100722.966707]  ? __inode_security_revalidate+0x4f/0x60
    [100722.966709]  __vfs_read+0xec/0x160
    [100722.966711]  vfs_read+0x8c/0x130
    [100722.966712]  SyS_pread64+0x87/0xb0
    [100722.966716]  do_syscall_64+0x67/0x1b0
    [100722.966719]  entry_SYSCALL64_slow_path+0x25/0x25
    
    To address this, add the check of 'buffer_mapped(bh)' to
    __sync_dirty_buffer().  This also has the benefit of fixing this for
    other file systems.
    
    With this addition, we can drop the workaround in ext4_commit_super().
    
    [ Commit description rewritten by tytso. ]
    
    Signed-off-by: Xianting Tian <xianting_tian@126.com>
    Link: https://lore.kernel.org/r/1596211825-8750-1-git-send-email-xianting_tian@126.com
    Signed-off-by: Theodore Ts'o <tytso@mit.edu>

							- Ted
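
For readers following the thread, the control flow that the patch gives
__sync_dirty_buffer() can be modelled in user space roughly as below.
This is a sketch under stated assumptions, not the kernel
implementation: struct model_bh and every model_*() helper are
hypothetical names made up for this illustration, the "lock" is a plain
flag, and I/O completion (which unlocks the buffer in the kernel) is
folded inline. It shows why placing the check in the generic helper
covers every caller: the error comes from the helper itself.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical user-space model of the buffer state the patch looks at. */
struct model_bh {
	bool locked;
	bool dirty;
	bool mapped;
	int submitted;	/* number of writes that reached "submission" */
};

static void model_lock(struct model_bh *bh)   { bh->locked = true; }
static void model_unlock(struct model_bh *bh) { bh->locked = false; }

/* Models test_clear_buffer_dirty(): return old value, clear the bit. */
static bool model_test_clear_dirty(struct model_bh *bh)
{
	bool was_dirty = bh->dirty;

	bh->dirty = false;
	return was_dirty;
}

/* Models the patched helper: fail with -EIO if the mapping is gone. */
static int model_sync_dirty_buffer(struct model_bh *bh)
{
	model_lock(bh);
	if (model_test_clear_dirty(bh)) {
		if (!bh->mapped) {
			/* Device went away: fail the I/O, skip submission. */
			model_unlock(bh);
			return -EIO;
		}
		bh->submitted++;
		/* Completion would unlock the buffer; done inline here. */
		model_unlock(bh);
	} else {
		model_unlock(bh);
	}
	return 0;
}
```

Any caller of the modelled helper, regardless of file system, gets
-EIO for an unmapped dirty buffer without needing a check of its own.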


* Re: [PATCH] ext4: move buffer_mapped() to proper position
  2020-08-07 20:02 ` tytso
@ 2020-08-07 22:01   ` Andreas Dilger
  0 siblings, 0 replies; 3+ messages in thread
From: Andreas Dilger @ 2020-08-07 22:01 UTC
  To: Theodore Ts'o
  Cc: Xianting Tian, Al Viro, linux-fsdevel, Linux Kernel Mailing List,
	Ext4 Developers List




> On Aug 7, 2020, at 2:02 PM, tytso@mit.edu wrote:
> 
> Thanks, applied, although I rewrote the commit description to make it
> a bit clearer:
> 
>    fs: prevent BUG_ON in submit_bh_wbc()
> 
>    If a device is hot-removed --- for example, when a physical device is
>    unplugged from a PCIe slot or an nbd device's network is shut down ---
>    this can result in a BUG_ON() crash in submit_bh_wbc().  This is
>    because when the block device dies, the buffer heads will have
>    their Buffer_Mapped flag cleared, leading to the crash in
>    submit_bh_wbc().
> 
>    We had attempted to work around this problem in commit a17712c8

Should this get a "Fixes:" label with this info, rather than embedding
it in the commit message, so that it could be picked up by stable?

Cheers, Andreas

