All of lore.kernel.org
 help / color / mirror / Atom feed
* [Bug report] journal data mode trigger panic in jbd2_journal_commit_transaction
@ 2020-11-14  8:18 yangerkun
  2020-11-16 13:50 ` Mauricio Oliveira
  0 siblings, 1 reply; 7+ messages in thread
From: yangerkun @ 2020-11-14  8:18 UTC (permalink / raw)
  To: tytso, adilger.kernel, jack
  Cc: linux-ext4, zhangyi (F), Hou Tao, zhangxiaoxu5, Ye Bin, hejie3

Hi,

While using ext4 with data=journal(3.10 kernel), we meet a problem that 
we think may never happend...

[421306.834334] JBD2: Spotted dirty metadata buffer (dev = vda2, blocknr 
= 5092931). There's a risk of filesystem corruption in case of system crash.
[421306.834375] JBD2: Spotted dirty metadata buffer (dev = vda2, blocknr 
= 5092931). There's a risk of filesystem corruption in case of system crash.
[421306.841728] JBD2: Spotted dirty metadata buffer (dev = vda2, blocknr 
= 5092931). There's a risk of filesystem corruption in case of system crash.
[421306.859799] ------------[ cut here ]------------
[421306.860616] kernel BUG at fs/jbd2/commit.c:1030!
[421306.861285] invalid opcode: 0000 [#1] SMP
[421306.861996] CPU: 3 PID: 1594 Comm: jbd2/vda2-8 Kdump: loaded
...
[421306.877080] Call Trace:
[421306.877406]  [<ffffffffc045d069>] kjournald2+0xc9/0x260 [jbd2]
[421306.878133]  [<ffffffff914c16c0>] ? wake_up_atomic_t+0x30/0x30
[421306.878851]  [<ffffffffc045cfa0>] ? commit_timeout+0x10/0x10 [jbd2]
[421306.879609]  [<ffffffff914c06a1>] kthread+0xd1/0xe0
[421306.880200]  [<ffffffff914c05d0>] ? insert_kthread_work+0x40/0x40
[421306.880949]  [<ffffffff91b3965d>] ret_from_fork_nospec_begin+0x7/0x21
[421306.881737]  [<ffffffff914c05d0>] ? insert_kthread_work+0x40/0x40

Crash code in jbd2_journal_commit_transaction:

jbd2_journal_commit_transaction(...)
{
     ...
     while (commit_transaction->t_forget) {
     ...
     if (buffer_jbddirty(bh)) {
         ...
     } else {
         J_ASSERT_BH(bh, !buffer_dirty(bh));
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
         ...
     }
}

1. the warning and the panic show that someone can dirty buffer directly;
2. the state in buffer and page show that we may call ext4_punch_hole or 
zero_range just before now;

crash> buffer_head ffff971220f3caf8
struct buffer_head {
   b_state = 5308419, BH_State|BH_RevokeValid|BH_JBD|BH_Uptodate|BH_Dirty
   b_this_page = 0xffff971220f3caf8,
   b_page = 0xffffdb4e8e897cc0,
   b_blocknr = 5092931,
   b_size = 4096,
   b_data = 0xffff9711a25f3000 ...
   b_bdev = 0x0,
   b_end_io = 0x0,
   b_private = 0xffff97114c04faf0,
   b_assoc_buffers = {
     next = 0xffff971220f3cb40,
     prev = 0xffff971220f3cb40
   },
   b_assoc_map = 0x0,
   b_count = {
     counter = 2
   }
}

crash> page 0xffffdb4e8e897cc0
struct page {
   flags = 31525193096628284,
   mapping = 0x0,
   {
     {
       index = 766,
     ...
     private = 0xffff971220f3caf8,
     ...
}

3. the b_blocknr in buffer_head and index in page show that the buffer 
wont be a metadata block.

For now, what I have seen that can dirty buffer directly is 
ext4_page_mkwrite(64a9f1449950 ("ext4: data=journal: fixes for 
ext4_page_mkwrite()")), and runing ext4_punch_hole with keep_size 
/ext4_page_mkwrite parallel can trigger above warning easily.

a. first, file with 4K size punch hole to 0 with keep size

mmap1:                     mmap2:                  commit:
ext4_page_fault
  create new page
  and lock page
...
unlock page
                            ext4_page_fault
                             find and lock the
                             page mmap1 create
                            ...
                            unlock_page

ext4_page_mkwrite
  lock page
  (has buffer&&unmap)
  or goto out
  unlock page
                            ext4_page_mkwrite
                             lock_page
                             (has buffer&&unmap)
                             or goto out
                             unlock page
  start handle(trans 1)
  __block_page_mkwrite
   lock page
   (page->mapping==
   inode->mapping) or
   goto out
   block_commit_write
    set_buffer_dirty
  ext4_walk_page_buffers
   do_journal_get_write_access
    clear_buffer_dirty
...
unlock_page
                              start_handle(trans 2)
                               __block_page_mkwrite
                                lock page
                                ...(same as mmap1)
                                 set_buffer_dirty  trans1 1 commit:
                                ...                bh moving from one
                                                   list to other list
                                                   (like shadow), and
                                                   warn_dirty_buffer!
                                unlock page



However, the same testcase won't trigger the panic. We can seen that 
ext4_punch_hole and ext4_page_mkwrite all will try to lock page. So, if 
punch_hole first, we won't set buffer dirty since page->mapping has been 
set to NULL. And if ext4_page_mkwrite first, we won't seen buffer dirty 
since do_journal_get_write_access will clear it.

Besides, the panic code was protected by jbd_lock_bh_state, and the 
information of bh show that we has call journal_unmap_buffer for it. So, 
the panic code may never be trigger...

punch hole:
ext4_punch_hole
   ...
   lock_page
   truncate_inode_page
     truncate_complete_page
       do_invalidatepage
         ...
         journal_unmap_buffer
       delete_from_page_cache
         remove page from radix tree, and set page->mapping = NULL,
         so we won't find this page
   unlock_page


mmap:
ext4_page_fault
   find and create new page(without bh)
...
unlock_page

ext4_page_mkwrite
   lock_page
   (has buffer && unmap) or will go out
   unlock_page
   start_handle
   __block_page_mkwrite
     lock_page
     (page->mapping != inode->i_mapping) or go out
     block_commit_write
       set_buffer_dirty
   ext4_walk_page_buffers
     do_journal_get_write_access
       clear_buffer_dirty =========> after unlock page, wont seen dirty
...
unlock_page



The above assumption was based on we can only dirty buffer directly by 
ext4_page_mkwrite. Maybe there exists other way too? Or, the analysis 
above exists some bug...


Thanks,
Kun.




^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Bug report] journal data mode trigger panic in jbd2_journal_commit_transaction
  2020-11-14  8:18 [Bug report] journal data mode trigger panic in jbd2_journal_commit_transaction yangerkun
@ 2020-11-16 13:50 ` Mauricio Oliveira
  2020-11-19  4:25   ` yangerkun
  0 siblings, 1 reply; 7+ messages in thread
From: Mauricio Oliveira @ 2020-11-16 13:50 UTC (permalink / raw)
  To: yangerkun
  Cc: Theodore Y . Ts'o, adilger.kernel, Jan Kara, linux-ext4,
	zhangyi (F),
	Hou Tao, zhangxiaoxu5, Ye Bin, hejie3

Hi Kun,

On Sat, Nov 14, 2020 at 5:18 AM yangerkun <yangerkun@huawei.com> wrote:
> While using ext4 with data=journal(3.10 kernel), we meet a problem that
> we think may never happend...
[...]

Could you please confirm you mean 5.10-rc* kernel instead of 3.10?
(It seems so as you mention a recent commit below.)  Thanks!

> For now, what I have seen that can dirty buffer directly is
> ext4_page_mkwrite(64a9f1449950 ("ext4: data=journal: fixes for
> ext4_page_mkwrite()")), and runing ext4_punch_hole with keep_size
> /ext4_page_mkwrite parallel can trigger above warning easily.
[...]


-- 
Mauricio Faria de Oliveira

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Bug report] journal data mode trigger panic in jbd2_journal_commit_transaction
  2020-11-16 13:50 ` Mauricio Oliveira
@ 2020-11-19  4:25   ` yangerkun
  2020-11-19 13:12     ` Mauricio Oliveira
  0 siblings, 1 reply; 7+ messages in thread
From: yangerkun @ 2020-11-19  4:25 UTC (permalink / raw)
  To: Mauricio Oliveira
  Cc: Theodore Y . Ts'o, adilger.kernel, Jan Kara, linux-ext4,
	zhangyi (F),
	Hou Tao, zhangxiaoxu5, Ye Bin, hejie3



在 2020/11/16 21:50, Mauricio Oliveira 写道:
> Hi Kun,
> 
> On Sat, Nov 14, 2020 at 5:18 AM yangerkun <yangerkun@huawei.com> wrote:
>> While using ext4 with data=journal(3.10 kernel), we meet a problem that
>> we think may never happend...
> [...]
> 
> Could you please confirm you mean 5.10-rc* kernel instead of 3.10?
> (It seems so as you mention a recent commit below.)  Thanks!
> 
>> For now, what I have seen that can dirty buffer directly is
>> ext4_page_mkwrite(64a9f1449950 ("ext4: data=journal: fixes for
>> ext4_page_mkwrite()")), and runing ext4_punch_hole with keep_size
>> /ext4_page_mkwrite parallel can trigger above warning easily.
> [...]
> 
> 

Hi,

Sorry for the long delay reply... And thanks a lot for your advise! The 
bug trigger with a very low probability. So won't trigger with 5.10 can 
not prove no bug exist in 5.10.

Google a lot and notice that someone before has report the same bug[1]. 
'3b136499e906 ("ext4: fix data corruption in data=journal mode")' seems 
fix the problem. I will try to understand this, and give a analysis 
about how to reproduce it!

Thanks,
Kun.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Bug report] journal data mode trigger panic in jbd2_journal_commit_transaction
  2020-11-19  4:25   ` yangerkun
@ 2020-11-19 13:12     ` Mauricio Oliveira
  2020-11-20  2:54       ` yangerkun
  0 siblings, 1 reply; 7+ messages in thread
From: Mauricio Oliveira @ 2020-11-19 13:12 UTC (permalink / raw)
  To: yangerkun
  Cc: Theodore Y . Ts'o, adilger.kernel, Jan Kara, linux-ext4,
	zhangyi (F),
	Hou Tao, zhangxiaoxu5, Ye Bin, hejie3

On Thu, Nov 19, 2020 at 1:25 AM yangerkun <yangerkun@huawei.com> wrote:
>
>
>
> 在 2020/11/16 21:50, Mauricio Oliveira 写道:
> > Hi Kun,
> >
> > On Sat, Nov 14, 2020 at 5:18 AM yangerkun <yangerkun@huawei.com> wrote:
> >> While using ext4 with data=journal(3.10 kernel), we meet a problem that
> >> we think may never happend...
> > [...]
> >
> > Could you please confirm you mean 5.10-rc* kernel instead of 3.10?
> > (It seems so as you mention a recent commit below.)  Thanks!
> >
> >> For now, what I have seen that can dirty buffer directly is
> >> ext4_page_mkwrite(64a9f1449950 ("ext4: data=journal: fixes for
> >> ext4_page_mkwrite()")), and runing ext4_punch_hole with keep_size
> >> /ext4_page_mkwrite parallel can trigger above warning easily.
> > [...]
> >
> >
>
> Hi,
>
> Sorry for the long delay reply... And thanks a lot for your advise! The
> bug trigger with a very low probability. So won't trigger with 5.10 can
> not prove no bug exist in 5.10.
>

No worries, and thanks for following up.
So I understand that the bug report was indeed on 3.10, and 5.10-rcN
is not yet confirmed.

> Google a lot and notice that someone before has report the same bug[1].
> '3b136499e906 ("ext4: fix data corruption in data=journal mode")' seems
> fix the problem. I will try to understand this, and give a analysis
> about how to reproduce it!

Cool, thanks!

> Thanks,
> Kun.



-- 
Mauricio Faria de Oliveira

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Bug report] journal data mode trigger panic in jbd2_journal_commit_transaction
  2020-11-19 13:12     ` Mauricio Oliveira
@ 2020-11-20  2:54       ` yangerkun
  2020-11-20  3:03         ` yangerkun
  2020-11-20 13:14         ` Mauricio Oliveira
  0 siblings, 2 replies; 7+ messages in thread
From: yangerkun @ 2020-11-20  2:54 UTC (permalink / raw)
  To: Mauricio Oliveira
  Cc: Theodore Y . Ts'o, adilger.kernel, Jan Kara, linux-ext4,
	zhangyi (F),
	Hou Tao, zhangxiaoxu5, Ye Bin, hejie3



在 2020/11/19 21:12, Mauricio Oliveira 写道:
> On Thu, Nov 19, 2020 at 1:25 AM yangerkun <yangerkun@huawei.com> wrote:
>>
>>
>>
>> 在 2020/11/16 21:50, Mauricio Oliveira 写道:
>>> Hi Kun,
>>>
>>> On Sat, Nov 14, 2020 at 5:18 AM yangerkun <yangerkun@huawei.com> wrote:
>>>> While using ext4 with data=journal(3.10 kernel), we meet a problem that
>>>> we think may never happend...
>>> [...]
>>>
>>> Could you please confirm you mean 5.10-rc* kernel instead of 3.10?
>>> (It seems so as you mention a recent commit below.)  Thanks!
>>>
>>>> For now, what I have seen that can dirty buffer directly is
>>>> ext4_page_mkwrite(64a9f1449950 ("ext4: data=journal: fixes for
>>>> ext4_page_mkwrite()")), and runing ext4_punch_hole with keep_size
>>>> /ext4_page_mkwrite parallel can trigger above warning easily.
>>> [...]
>>>
>>>
>>
>> Hi,
>>
>> Sorry for the long delay reply... And thanks a lot for your advise! The
>> bug trigger with a very low probability. So won't trigger with 5.10 can
>> not prove no bug exist in 5.10.
>>
> 
> No worries, and thanks for following up.
> So I understand that the bug report was indeed on 3.10, and 5.10-rcN
> is not yet confirmed.
> 
>> Google a lot and notice that someone before has report the same bug[1].
>> '3b136499e906 ("ext4: fix data corruption in data=journal mode")' seems
>> fix the problem. I will try to understand this, and give a analysis
>> about how to reproduce it!
> 
> Cool, thanks!
> 
>> Thanks,
>> Kun.
> 
> 
> 
Hi,

The follow step can reproduce the bug[1] reported before easily. And the 
bug we meet seems same. Following patch will fix the bug.

3b136499e906 ext4: fix data corruption in data=journal mode
b90197b65518 ext4: use private version of page_zero_new_buffers() for 
data=journal mode


1. mkfs.ext4
2. touch $tofile(ino == 12)
3. touch $fromfile(ino == 13) and write 4k to fromfile and sync

mmap $fromfile 4k
and write 4k
to $tofile

...
generic_perform_write
  ext4_write_begin
   ext4_journal_start
   (trans 1)
  if (ino == 12) sleep for 30s
  ...                           truncate $fromfile
                                to 0
  copied=0,bytes=4k
  ext4_journalled_write_end
   page_zero_new_buffers
    mark_buffer_dirty
   write_end_fn
    ...
    __jbd2_journal_file_buffer
     test_clear_buffer_dirty
     __jbd2_journal_temp_unlink_buffer
   ext4_journal_stop
   (trans 1)
                                                  trans1 commit
                                                  ...
   ext4_truncate_failed_write
    ...
    journal_unmap_buffer
     set_buffer_freed
                                                  forget list
                                                   ...
                                                   clear_buffer_jbddirty
                                                   ...
                                                   J_ASSERT_BH(bh,
                                                   !buffer_dirty(bh))
                                                   ^^^^^^^^^^^^^^^^^
                                                   trigger the bug...



[1]. https://www.spinics.net/lists/linux-ext4/msg56447.html

Thanks,
Kun.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Bug report] journal data mode trigger panic in jbd2_journal_commit_transaction
  2020-11-20  2:54       ` yangerkun
@ 2020-11-20  3:03         ` yangerkun
  2020-11-20 13:14         ` Mauricio Oliveira
  1 sibling, 0 replies; 7+ messages in thread
From: yangerkun @ 2020-11-20  3:03 UTC (permalink / raw)
  To: Mauricio Oliveira
  Cc: Theodore Y . Ts'o, adilger.kernel, Jan Kara, linux-ext4,
	zhangyi (F),
	Hou Tao, zhangxiaoxu5, Ye Bin, hejie3



在 2020/11/20 10:54, yangerkun 写道:
> 
> 
> 在 2020/11/19 21:12, Mauricio Oliveira 写道:
>> On Thu, Nov 19, 2020 at 1:25 AM yangerkun <yangerkun@huawei.com> wrote:
>>>
>>>
>>>
>>> 在 2020/11/16 21:50, Mauricio Oliveira 写道:
>>>> Hi Kun,
>>>>
>>>> On Sat, Nov 14, 2020 at 5:18 AM yangerkun <yangerkun@huawei.com> wrote:
>>>>> While using ext4 with data=journal(3.10 kernel), we meet a problem 
>>>>> that
>>>>> we think may never happend...
>>>> [...]
>>>>
>>>> Could you please confirm you mean 5.10-rc* kernel instead of 3.10?
>>>> (It seems so as you mention a recent commit below.)  Thanks!
>>>>
>>>>> For now, what I have seen that can dirty buffer directly is
>>>>> ext4_page_mkwrite(64a9f1449950 ("ext4: data=journal: fixes for
>>>>> ext4_page_mkwrite()")), and runing ext4_punch_hole with keep_size
>>>>> /ext4_page_mkwrite parallel can trigger above warning easily.
>>>> [...]
>>>>
>>>>
>>>
>>> Hi,
>>>
>>> Sorry for the long delay reply... And thanks a lot for your advise! The
>>> bug trigger with a very low probability. So won't trigger with 5.10 can
>>> not prove no bug exist in 5.10.
>>>
>>
>> No worries, and thanks for following up.
>> So I understand that the bug report was indeed on 3.10, and 5.10-rcN
>> is not yet confirmed.
>>
>>> Google a lot and notice that someone before has report the same bug[1].
>>> '3b136499e906 ("ext4: fix data corruption in data=journal mode")' seems
>>> fix the problem. I will try to understand this, and give a analysis
>>> about how to reproduce it!
>>
>> Cool, thanks!
>>
>>> Thanks,
>>> Kun.
>>
>>
>>
> Hi,
> 
> The follow step can reproduce the bug[1] reported before easily. And the 
> bug we meet seems same. Following patch will fix the bug.
> 
> 3b136499e906 ext4: fix data corruption in data=journal mode
> b90197b65518 ext4: use private version of page_zero_new_buffers() for 
> data=journal mode
> 
> 
> 1. mkfs.ext4
> 2. touch $tofile(ino == 12)
> 3. touch $fromfile(ino == 13) and write 4k to fromfile and sync
> 
> mmap $fromfile 4k
> and write 4k
> to $tofile
> 
> ...
> generic_perform_write
>   ext4_write_begin
>    ext4_journal_start
>    (trans 1)
>   if (ino == 12) sleep for 30s
>   ...                           truncate $fromfile
>                                 to 0
>   copied=0,bytes=4k
>   ext4_journalled_write_end
>    page_zero_new_buffers
>     mark_buffer_dirty
>    write_end_fn
>     ...
>     __jbd2_journal_file_buffer
>      test_clear_buffer_dirty
>      __jbd2_journal_temp_unlink_buffer
            this will mark buffer dirty again!
>    ext4_journal_stop
>    (trans 1)
>                                                   trans1 commit
>                                                   ...
>    ext4_truncate_failed_write
>     ...
>     journal_unmap_buffer
>      set_buffer_freed
>                                                   forget list
>                                                    ...
>                                                    clear_buffer_jbddirty
>                                                    ...
>                                                    J_ASSERT_BH(bh,
>                                                    !buffer_dirty(bh))
>                                                    ^^^^^^^^^^^^^^^^^
>                                                    trigger the bug...
> 
> 
> 
> [1]. https://www.spinics.net/lists/linux-ext4/msg56447.html
> 
> Thanks,
> Kun.
> .

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Bug report] journal data mode trigger panic in jbd2_journal_commit_transaction
  2020-11-20  2:54       ` yangerkun
  2020-11-20  3:03         ` yangerkun
@ 2020-11-20 13:14         ` Mauricio Oliveira
  1 sibling, 0 replies; 7+ messages in thread
From: Mauricio Oliveira @ 2020-11-20 13:14 UTC (permalink / raw)
  To: yangerkun
  Cc: Theodore Y . Ts'o, adilger.kernel, Jan Kara, linux-ext4,
	zhangyi (F),
	Hou Tao, zhangxiaoxu5, Ye Bin, hejie3

Hi Kun,

On Thu, Nov 19, 2020 at 11:54 PM yangerkun <yangerkun@huawei.com> wrote:
> Hi,
>
> The follow step can reproduce the bug[1] reported before easily. And the
> bug we meet seems same. Following patch will fix the bug.
>
> 3b136499e906 ext4: fix data corruption in data=journal mode
> b90197b65518 ext4: use private version of page_zero_new_buffers() for
> data=journal mode

Since this issue is apparently on a 3.10 kernel, which is no longer
maintained upstream [see 2],
I guess you have to talk to the distro vendor about including such
patches in their kernel.

[2] https://www.kernel.org/

cheers,

-- 
Mauricio Faria de Oliveira

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2020-11-20 13:14 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-11-14  8:18 [Bug report] journal data mode trigger panic in jbd2_journal_commit_transaction yangerkun
2020-11-16 13:50 ` Mauricio Oliveira
2020-11-19  4:25   ` yangerkun
2020-11-19 13:12     ` Mauricio Oliveira
2020-11-20  2:54       ` yangerkun
2020-11-20  3:03         ` yangerkun
2020-11-20 13:14         ` Mauricio Oliveira

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.