linux-ext4.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [BUG] invalid superblock checksum possibly due to race
@ 2020-06-24 23:56 Costa Sapuntzakis
  2020-06-30 11:48 ` Jan Kara
  0 siblings, 1 reply; 4+ messages in thread
From: Costa Sapuntzakis @ 2020-06-24 23:56 UTC (permalink / raw)
  To: linux-ext4

Our workload: taking snapshots repeatedly of an active ext4 filesystem
(vdbench fwiw). e2fsck discovered a snapshot that had a corrupted
superblock after journal replay. Diffing the corrupted superblock to
the superblock before journal replay revealed that only s_last_orphan
and the checksum had changed.

The following race could explain it:

Thread 1 (T1): ext4_orphan_del -> update s_last_orphan to value A ->
ext4_handle_dirty_super -> ext4_superblock_csum_set -- PAUSE right
before setting es->s_checksum

T2: ext4_orphan_del -> update s_last_orphan to value B ->
ext4_handle_dirty_super -> ext4_superblock_csum_set runs to completion

T1: Resume and assign es->s_checksum

Is there higher level synchronization going on that makes this race benign?

If not, a spinlock around the calculation and assignment should fix it.

The spinlock still has the race where s_last_orphan is being updated
while the checksum is calculated. But the last thread to set
s_last_orphan will also eventually try to recalculate the checksum and
set it right (though it's possible some other thread will do it for
it). And I'm guessing/hoping jbd2 won't flush the superblock to the
journal and close a transaction until the references from
journal_get_write_access drain. The checksum is recalculated before
the get_write_access reference is dropped.

-Costa

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [BUG] invalid superblock checksum possibly due to race
  2020-06-24 23:56 [BUG] invalid superblock checksum possibly due to race Costa Sapuntzakis
@ 2020-06-30 11:48 ` Jan Kara
  2020-06-30 18:34   ` Costa Sapuntzakis
  0 siblings, 1 reply; 4+ messages in thread
From: Jan Kara @ 2020-06-30 11:48 UTC (permalink / raw)
  To: Costa Sapuntzakis; +Cc: linux-ext4

Hello!

On Wed 24-06-20 16:56:18, Costa Sapuntzakis wrote:
> Our workload: taking snapshots repeatedly of an active ext4 filesystem
> (vdbench fwiw). e2fsck discovered a snapshot that had a corrupted
> superblock after journal replay. Diffing the corrupted superblock to
> the superblock before journal replay revealed that only s_last_orphan
> and the checksum had changed.
> 
> The following race could explain it:
> 
> Thread 1 (T1): ext4_orphan_del -> update s_last_orphan to value A ->
> ext4_handle_dirty_super -> ext4_superblock_csum_set -- PAUSE right
> before setting es->s_checksum
> 
> T2: ext4_orphan_del -> update s_last_orphan to value B ->
> ext4_handle_dirty_super -> ext4_superblock_csum_set runs to completion
> 
> T1: Resume and assign es->s_checksum
> 
> Is there higher level synchronization going on that makes this race benign?

Thanks for report and the analysis. What you describe indeed seems
possible.

> If not, a spinlock around the calculation and assignment should fix it.

Yes, probably ext4_superblock_csum_set() should use

lock_buffer(EXT4_SB(sb)->s_sbh)

to synchronize updating of superblock checksum. Will you send a patch?

> The spinlock still has the race where s_last_orphan is being updated
> while the checksum is calculated. But the last thread to set
> s_last_orphan will also eventually try to recalculate the checksum and
> set it right (though it's possible some other thread will do it for
> it). And I'm guessing/hoping jbd2 won't flush the superblock to the
> journal and close a transaction until the references from
> journal_get_write_access drain. The checksum is recalculated before
> the get_write_access reference is dropped.

Yes, jbd2 layer will make sure that inconsistent block contents will not
make it to disk in this case.

								Honza

-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [BUG] invalid superblock checksum possibly due to race
  2020-06-30 11:48 ` Jan Kara
@ 2020-06-30 18:34   ` Costa Sapuntzakis
  2020-07-01 16:22     ` Jan Kara
  0 siblings, 1 reply; 4+ messages in thread
From: Costa Sapuntzakis @ 2020-06-30 18:34 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-ext4

Hello and thank you!

> Yes, probably ext4_superblock_csum_set() should use
>
> lock_buffer(EXT4_SB(sb)->s_sbh)
>
> to synchronize updating of superblock checksum. Will you send a patch?

Yes. I will send a patch.

I noticed lock_buffer can sleep. That would seem to imply to me that
lock_buffer can be held across I/Os.
I worry that this will occasionally significantly slow down this code
path compared to what it used to be.  Are there any things
about the way ext4 uses buffers that makes this less of a concern?

-Costa

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [BUG] invalid superblock checksum possibly due to race
  2020-06-30 18:34   ` Costa Sapuntzakis
@ 2020-07-01 16:22     ` Jan Kara
  0 siblings, 0 replies; 4+ messages in thread
From: Jan Kara @ 2020-07-01 16:22 UTC (permalink / raw)
  To: Costa Sapuntzakis; +Cc: Jan Kara, linux-ext4

Hello!

On Tue 30-06-20 11:34:49, Costa Sapuntzakis wrote:
> > Yes, probably ext4_superblock_csum_set() should use
> >
> > lock_buffer(EXT4_SB(sb)->s_sbh)
> >
> > to synchronize updating of superblock checksum. Will you send a patch?
> 
> Yes. I will send a patch.

Thanks!

> I noticed lock_buffer can sleep. That would seem to imply to me that
> lock_buffer can be held across I/Os.
> I worry that this will occasionally significantly slow down this code
> path compared to what it used to be.  Are there any things
> about the way ext4 uses buffers that makes this less of a concern?

Yes, buffer lock is a sleeping lock but that's the lock we usually use to
protect consistency of buffer contents. So I prefer to use that lock unless
we have definitive performance data showing we need something more
clever...

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2020-07-01 16:22 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-24 23:56 [BUG] invalid superblock checksum possibly due to race Costa Sapuntzakis
2020-06-30 11:48 ` Jan Kara
2020-06-30 18:34   ` Costa Sapuntzakis
2020-07-01 16:22     ` Jan Kara

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).