All of lore.kernel.org
 help / color / mirror / Atom feed
From: yangerkun <yangerkun@huawei.com>
To: <tytso@mit.edu>, <adilger.kernel@dilger.ca>, <jack@suse.cz>
Cc: <linux-ext4@vger.kernel.org>, "zhangyi (F)" <yi.zhang@huawei.com>,
	Hou Tao <houtao1@huawei.com>, <zhangxiaoxu5@huawei.com>,
	Ye Bin <yebin10@huawei.com>, <hejie3@huawei.com>
Subject: [Bug report] journal data mode trigger panic in jbd2_journal_commit_transaction
Date: Sat, 14 Nov 2020 16:18:12 +0800	[thread overview]
Message-ID: <68b9650e-bef2-69e2-ab5e-8aaddaf46cfe@huawei.com> (raw)

Hi,

While using ext4 with data=journal(3.10 kernel), we meet a problem that 
we think may never happend...

[421306.834334] JBD2: Spotted dirty metadata buffer (dev = vda2, blocknr 
= 5092931). There's a risk of filesystem corruption in case of system crash.
[421306.834375] JBD2: Spotted dirty metadata buffer (dev = vda2, blocknr 
= 5092931). There's a risk of filesystem corruption in case of system crash.
[421306.841728] JBD2: Spotted dirty metadata buffer (dev = vda2, blocknr 
= 5092931). There's a risk of filesystem corruption in case of system crash.
[421306.859799] ------------[ cut here ]------------
[421306.860616] kernel BUG at fs/jbd2/commit.c:1030!
[421306.861285] invalid opcode: 0000 [#1] SMP
[421306.861996] CPU: 3 PID: 1594 Comm: jbd2/vda2-8 Kdump: loaded
...
[421306.877080] Call Trace:
[421306.877406]  [<ffffffffc045d069>] kjournald2+0xc9/0x260 [jbd2]
[421306.878133]  [<ffffffff914c16c0>] ? wake_up_atomic_t+0x30/0x30
[421306.878851]  [<ffffffffc045cfa0>] ? commit_timeout+0x10/0x10 [jbd2]
[421306.879609]  [<ffffffff914c06a1>] kthread+0xd1/0xe0
[421306.880200]  [<ffffffff914c05d0>] ? insert_kthread_work+0x40/0x40
[421306.880949]  [<ffffffff91b3965d>] ret_from_fork_nospec_begin+0x7/0x21
[421306.881737]  [<ffffffff914c05d0>] ? insert_kthread_work+0x40/0x40

Crash code in jbd2_journal_commit_transaction:

jbd2_journal_commit_transaction(...)
{
     ...
     while (commit_transaction->t_forget) {
     ...
     if (buffer_jbddirty(bh)) {
         ...
     } else {
         J_ASSERT_BH(bh, !buffer_dirty(bh));
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
         ...
     }
}

1. the warning and the panic show that someone can dirty buffer directly;
2. the state in buffer and page show that we may call ext4_punch_hole or 
zero_range just before now;

crash> buffer_head ffff971220f3caf8
struct buffer_head {
   b_state = 5308419, BH_State|BH_RevokeValid|BH_JBD|BH_Uptodate|BH_Dirty
   b_this_page = 0xffff971220f3caf8,
   b_page = 0xffffdb4e8e897cc0,
   b_blocknr = 5092931,
   b_size = 4096,
   b_data = 0xffff9711a25f3000 ...
   b_bdev = 0x0,
   b_end_io = 0x0,
   b_private = 0xffff97114c04faf0,
   b_assoc_buffers = {
     next = 0xffff971220f3cb40,
     prev = 0xffff971220f3cb40
   },
   b_assoc_map = 0x0,
   b_count = {
     counter = 2
   }
}

crash> page 0xffffdb4e8e897cc0
struct page {
   flags = 31525193096628284,
   mapping = 0x0,
   {
     {
       index = 766,
     ...
     private = 0xffff971220f3caf8,
     ...
}

3. the b_blocknr in buffer_head and index in page show that the buffer 
wont be a metadata block.

For now, what I have seen that can dirty buffer directly is 
ext4_page_mkwrite(64a9f1449950 ("ext4: data=journal: fixes for 
ext4_page_mkwrite()")), and runing ext4_punch_hole with keep_size 
/ext4_page_mkwrite parallel can trigger above warning easily.

a. first, file with 4K size punch hole to 0 with keep size

mmap1:                     mmap2:                  commit:
ext4_page_fault
  create new page
  and lock page
...
unlock page
                            ext4_page_fault
                             find and lock the
                             page mmap1 create
                            ...
                            unlock_page

ext4_page_mkwrite
  lock page
  (has buffer&&unmap)
  or goto out
  unlock page
                            ext4_page_mkwrite
                             lock_page
                             (has buffer&&unmap)
                             or goto out
                             unlock page
  start handle(trans 1)
  __block_page_mkwrite
   lock page
   (page->mapping==
   inode->mapping) or
   goto out
   block_commit_write
    set_buffer_dirty
  ext4_walk_page_buffers
   do_journal_get_write_access
    clear_buffer_dirty
...
unlock_page
                              start_handle(trans 2)
                               __block_page_mkwrite
                                lock page
                                ...(same as mmap1)
                                 set_buffer_dirty  trans1 1 commit:
                                ...                bh moving from one
                                                   list to other list
                                                   (like shadow), and
                                                   warn_dirty_buffer!
                                unlock page



However, the same testcase won't trigger the panic. We can seen that 
ext4_punch_hole and ext4_page_mkwrite all will try to lock page. So, if 
punch_hole first, we won't set buffer dirty since page->mapping has been 
set to NULL. And if ext4_page_mkwrite first, we won't seen buffer dirty 
since do_journal_get_write_access will clear it.

Besides, the panic code was protected by jbd_lock_bh_state, and the 
information of bh show that we has call journal_unmap_buffer for it. So, 
the panic code may never be trigger...

punch hole:
ext4_punch_hole
   ...
   lock_page
   truncate_inode_page
     truncate_complete_page
       do_invalidatepage
         ...
         journal_unmap_buffer
       delete_from_page_cache
         remove page from radix tree, and set page->mapping = NULL,
         so we won't find this page
   unlock_page


mmap:
ext4_page_fault
   find and create new page(without bh)
...
unlock_page

ext4_page_mkwrite
   lock_page
   (has buffer && unmap) or will go out
   unlock_page
   start_handle
   __block_page_mkwrite
     lock_page
     (page->mapping != inode->i_mapping) or go out
     block_commit_write
       set_buffer_dirty
   ext4_walk_page_buffers
     do_journal_get_write_access
       clear_buffer_dirty =========> after unlock page, wont seen dirty
...
unlock_page



The above assumption was based on we can only dirty buffer directly by 
ext4_page_mkwrite. Maybe there exists other way too? Or, the analysis 
above exists some bug...


Thanks,
Kun.




             reply	other threads:[~2020-11-14  8:18 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-14  8:18 yangerkun [this message]
2020-11-16 13:50 ` [Bug report] journal data mode trigger panic in jbd2_journal_commit_transaction Mauricio Oliveira
2020-11-19  4:25   ` yangerkun
2020-11-19 13:12     ` Mauricio Oliveira
2020-11-20  2:54       ` yangerkun
2020-11-20  3:03         ` yangerkun
2020-11-20 13:14         ` Mauricio Oliveira

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=68b9650e-bef2-69e2-ab5e-8aaddaf46cfe@huawei.com \
    --to=yangerkun@huawei.com \
    --cc=adilger.kernel@dilger.ca \
    --cc=hejie3@huawei.com \
    --cc=houtao1@huawei.com \
    --cc=jack@suse.cz \
    --cc=linux-ext4@vger.kernel.org \
    --cc=tytso@mit.edu \
    --cc=yebin10@huawei.com \
    --cc=yi.zhang@huawei.com \
    --cc=zhangxiaoxu5@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.