linux-fsdevel.vger.kernel.org archive mirror
* Crash in jbd2_chksum due to null journal->j_chksum_driver
@ 2015-09-30 13:35 Nikolay Borisov
  2015-09-30 17:12 ` Darrick J. Wong
  0 siblings, 1 reply; 4+ messages in thread
From: Nikolay Borisov @ 2015-09-30 13:35 UTC (permalink / raw)
  To: linux-fsdevel
  Cc: linux-kernel@vger.kernel.org, Jan Kara, darrick.wong,
	SiteGround Operations

Hello, 

Today a colleague was testing something and while doing so he observed 
the following crash: 

jbd2_journal_bmap: journal block not found at offset 67 on dm-26-8
Aborting journal on device dm-26-8.
BUG: unable to handle kernel NULL pointer dereference at           (null)
IP: [<ffffffff812b12eb>] jbd2_superblock_csum+0x2b/0x80
PGD 3fcef54067 PUD 3fce84e067 PMD 0 
Oops: 0000 [#1] SMP 
Modules linked in: act_police cls_basic sch_ingress veth dm_snapshot openvswitch gre vxlan ip_tunnel xt_owner xt_conntrack iptable_mangle xt_nat iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat xt_CT nf_conntrack iptable_raw ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr ipv6 ext2 dm_thin_pool dm_bio_prison dm_persistent_data dm_bufio dm_mirror dm_region_hash dm_log ses enclosure igb i2c_algo_bit x86_pkg_temp_thermal crc32_pclmul i2c_i801 lpc_ich mfd_core ioapic ioatdma dca shpchp ipmi_devintf ipmi_si ipmi_msghandler
CPU: 0 PID: 12059 Comm: jbd2/dm-26-8 Not tainted 3.12.47-clouder1 #1
Hardware name: Supermicro X10DRi/X10DRi, BIOS 1.1 04/14/2015
task: ffff883f904958b0 ti: ffff883fce4d8000 task.ti: ffff883fce4d8000
RIP: 0010:[<ffffffff812b12eb>]  [<ffffffff812b12eb>] jbd2_superblock_csum+0x2b/0x80
RSP: 0018:ffff883fce4d9a58  EFLAGS: 00010282
RAX: 0000000000000000 RBX: ffff883f8dd77000 RCX: 0000000000000006
RDX: 0000000000000000 RSI: ffff883f8dd77000 RDI: ffff883fa0fc6800
RBP: ffff883fce4d9a88 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000000 R12: 00000000f0459c0b
R13: 0000000000000411 R14: ffff883f8dd77000 R15: 00000000560bb55d
FS:  0000000000000000(0000) GS:ffff881fffa00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 0000003fd145d000 CR4: 00000000001407f0
Stack:
 ffffffff81e07402 ffff883fa0fc6800 00000000fffffffb ffff883fce4d9b90
 ffff883f8dd77000 ffff883fa0fc6800 ffff883fce4d9aa8 ffffffff812b1369
 0000000000000010 ffff883f90c772d8 ffff883fce4d9ae8 ffffffff812b1455
Call Trace:
 [<ffffffff812b1369>] jbd2_superblock_csum_set+0x29/0x40
 [<ffffffff812b1455>] jbd2_write_superblock+0x85/0x1b0
 [<ffffffff812b1b70>] jbd2_journal_update_sb_errno+0x50/0x60
 [<ffffffff812b1bd0>] __journal_abort_soft+0x50/0x60
 [<ffffffff812b1c80>] jbd2_journal_bmap+0x90/0xa0
 [<ffffffff812b1ec7>] jbd2_journal_next_log_block+0x77/0x80
 [<ffffffff812b1ef3>] jbd2_journal_get_descriptor_buffer+0x23/0xb0
 [<ffffffff812aa02c>] journal_submit_commit_record+0x7c/0x1e0
 [<ffffffff812abade>] jbd2_journal_commit_transaction+0x194e/0x1d20
 [<ffffffff812b062f>] kjournald2+0xef/0x2b0
 [<ffffffff810aef00>] ? wake_up_bit+0x40/0x40
 [<ffffffff812b0540>] ? commit_timeout+0x10/0x10
 [<ffffffff810ae48e>] kthread+0xce/0xe0
 [<ffffffff810ae3c0>] ? kthread_freezable_should_stop+0x80/0x80
 [<ffffffff816571c8>] ret_from_fork+0x58/0x90
 [<ffffffff810ae3c0>] ? kthread_freezable_should_stop+0x80/0x80
Code: 55 48 89 e5 41 54 53 48 83 ec 20 0f 1f 44 00 00 44 8b a6 fc 00 00 00 48 89 f3 c7 86 fc 00 00 00 00 00 00 00 48 8b 87 d0 04 00 00 <83> 38 04 77 39 48 89 45 d0 c7 45 d8 00 00 00 00 48 8d 7d d0 c7 
RIP  [<ffffffff812b12eb>] jbd2_superblock_csum+0x2b/0x80
 RSP <ffff883fce4d9a58>
CR2: 0000000000000000
---[ end trace e1bd94031f410b71 ]---

The address ffffffff812b12eb is actually in jbd2_chksum(), and the
instruction where the dereference happens is in
crypto_shash_descsize(); essentially, journal->j_chksum_driver is NULL.
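
One way to cross-check this reading against the oops: the bytes
"48 8b 87 d0 04 00 00" in the Code: line appear to decode to
"mov 0x4d0(%rdi),%rax", and the marked bytes "<83> 38 04" to
"cmpl $0x4,(%rax)". With RAX = 0 and CR2 = 0, the fault is a read of
the first bytes of whatever the loaded pointer refers to. The
user-space sketch below models that arithmetic only. The struct
model_crypto_shash and both helpers are hypothetical stand-ins, and
the assumption that descsize is the first field of the real
struct crypto_shash in this kernel version is mine, not something
the report confirms.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Simplified stand-in for the kernel's struct crypto_shash. The field
 * order (descsize first) is an assumption about this kernel version.
 */
struct model_crypto_shash {
	unsigned int descsize;	/* the field crypto_shash_descsize() reads */
	void *base;		/* placeholder for the rest of the struct */
};

/* The model only makes sense if descsize really sits at offset 0. */
static_assert(offsetof(struct model_crypto_shash, descsize) == 0,
	      "model assumes descsize is the first field");

/*
 * Address that a load of tfm->descsize would touch, computed from the
 * pointer value alone, without dereferencing anything.
 */
static uintptr_t model_descsize_addr(uintptr_t tfm)
{
	return tfm + offsetof(struct model_crypto_shash, descsize);
}

/* True when a NULL tfm would fault at exactly the reported CR2 value. */
static int model_matches_oops(uintptr_t cr2)
{
	return model_descsize_addr((uintptr_t)0) == cr2;
}
```

Under that assumption, model_matches_oops(0) holds, which is
consistent with a NULL j_chksum_driver rather than a corrupted
non-NULL pointer.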

Here is how we got into this situation: we have an LVM thin volume
with an ext4 filesystem and a container started from it. While the
container is running, we invoke the following command to scrub the
volume's contents:

openssl enc -aes-256-ctr -pass pass:"$(dd if=/dev/urandom bs=128 count=1 2>/dev/null | base64)" -nosalt </dev/zero | dd bs=64K of=/dev/volumegroupname/volumename


Then, when we try to umount the volume, we get the aforementioned
crash. Because we overwrote the on-disk contents, jbd2_journal_bmap()
fails, which triggers the journal abort. The abort path wants to update
the on-disk errno, which in turn triggers a superblock checksum
regeneration, and this goes BOOM.

I looked around the code but couldn't figure out a code path
which allows the checksum driver to become null at runtime.


* Re: Crash in jbd2_chksum due to null journal->j_chksum_driver
  2015-09-30 13:35 Crash in jbd2_chksum due to null journal->j_chksum_driver Nikolay Borisov
@ 2015-09-30 17:12 ` Darrick J. Wong
  2015-09-30 18:13   ` Nikolay Borisov
  0 siblings, 1 reply; 4+ messages in thread
From: Darrick J. Wong @ 2015-09-30 17:12 UTC (permalink / raw)
  To: Nikolay Borisov
  Cc: linux-fsdevel, linux-kernel@vger.kernel.org, Jan Kara,
	SiteGround Operations

On Wed, Sep 30, 2015 at 04:35:49PM +0300, Nikolay Borisov wrote:
> I looked around the code but couldn't figure out a code path
> which allows the checksum driver to become null at runtime.

Most likely the journal wasn't started with the checksum driver
turned on, and then your randomizing of the journal sb *while it was running*
flipped the feature bit on, causing jbd2 to think checksumming was turned on.

I guess the "proper" fix is to set j_chksum_driver at journal load time if
the superblock flags are set properly and then gate all other accesses on
the status of j_chksum_driver just in case someone obliterates the journal sb.
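
A minimal user-space sketch of that idea follows. It is a model only,
not the actual patch: every model_* name, the MODEL_FEATURE_CSUM bit,
and the toy byte-sum "driver" are made up for illustration. It shows
the flag being trusted once at load time, and every later decision
being gated on whether a driver pointer exists.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical feature bit; not the real JBD2 on-disk value. */
#define MODEL_FEATURE_CSUM 0x1u

/*
 * Stand-in for the journal superblock buffer, which anyone writing to
 * the block device can change underneath a running journal.
 */
struct model_sb {
	uint32_t feature_incompat;
	uint32_t checksum;
};

/* Stand-in for the crypto driver handle. */
struct model_driver {
	uint32_t (*update)(uint32_t seed, const void *buf, size_t len);
};

struct model_journal {
	struct model_sb *sb;
	const struct model_driver *chksum_driver; /* NULL: no checksums */
};

/* Toy byte sum, used only so the model has something to call. */
static uint32_t model_sum(uint32_t seed, const void *buf, size_t len)
{
	const unsigned char *p = buf;

	while (len--)
		seed += *p++;
	return seed;
}

static const struct model_driver model_sum_driver = { .update = model_sum };

/* Load time: the only place where the superblock flag is trusted. */
static void model_journal_load(struct model_journal *j, struct model_sb *sb)
{
	j->sb = sb;
	j->chksum_driver = (sb->feature_incompat & MODEL_FEATURE_CSUM) ?
			   &model_sum_driver : NULL;
}

/* Fragile gate: re-reads a flag that may have changed since load. */
static bool model_has_csum_by_flag(const struct model_journal *j)
{
	return (j->sb->feature_incompat & MODEL_FEATURE_CSUM) != 0;
}

/* Robust gate: checksumming is on only if a driver was set up. */
static bool model_has_csum(const struct model_journal *j)
{
	return j->chksum_driver != NULL;
}

/* Returns false and leaves the superblock alone when there is no driver. */
static bool model_sb_csum_set(struct model_journal *j)
{
	if (!model_has_csum(j))
		return false;
	assert(j->chksum_driver->update != NULL);
	j->sb->checksum = 0;
	j->sb->checksum = j->chksum_driver->update(~0u, j->sb, sizeof(*j->sb));
	return true;
}
```

In this model, loading a journal whose flag is clear and then setting
MODEL_FEATURE_CSUM in the buffer makes model_has_csum_by_flag() return
true while model_has_csum() stays false, so model_sb_csum_set() backs
out instead of calling through a NULL driver.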

OTOH, why can't you unmount the FS and /then/ randomize the disk?

--D


* Re: Crash in jbd2_chksum due to null journal->j_chksum_driver
  2015-09-30 17:12 ` Darrick J. Wong
@ 2015-09-30 18:13   ` Nikolay Borisov
  2015-09-30 18:43     ` Darrick J. Wong
  0 siblings, 1 reply; 4+ messages in thread
From: Nikolay Borisov @ 2015-09-30 18:13 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Nikolay Borisov, linux-fsdevel, linux-kernel@vger.kernel.org,
	Jan Kara, SiteGround Operations

Hello,


Well, I guess I can do that; the thing is, the current scenario was
like that. Anyway, I thought something like what you describe could be
happening. I saw your patch and I'm going to test it tomorrow. But I
think the patch needs to be tagged for stable, since there is going to
be an effort to make filesystems mountable in non-init user namespaces,
and an arbitrary user could potentially cause instability on the system?

Regards,
Nikolay


* Re: Crash in jbd2_chksum due to null journal->j_chksum_driver
  2015-09-30 18:13   ` Nikolay Borisov
@ 2015-09-30 18:43     ` Darrick J. Wong
  0 siblings, 0 replies; 4+ messages in thread
From: Darrick J. Wong @ 2015-09-30 18:43 UTC (permalink / raw)
  To: Nikolay Borisov
  Cc: Nikolay Borisov, linux-fsdevel, linux-kernel@vger.kernel.org,
	Jan Kara, SiteGround Operations

On Wed, Sep 30, 2015 at 09:13:38PM +0300, Nikolay Borisov wrote:
> Hello,
> 
> 
> Well, I guess I can do that; the thing is, the current scenario was
> like that. Anyway, I thought something like what you describe could be
> happening. I saw your patch and I'm going to test it tomorrow. But I
> think the patch needs to be tagged for stable, since there is going to
> be an effort to make filesystems mountable in non-init user namespaces,
> and an arbitrary user could potentially cause instability on the system?

<shrug> If non-root users can write arbitrarily to block devices, I'm
sure a /lot/ more bad things can happen.  But you're right, we could
at least avoid crashing.

--D

