linux-kernel.vger.kernel.org archive mirror
* Possible UDF locking error?
From: Steve Magnani @ 2019-03-23 20:14 UTC
  To: Jan Kara, linux-kernel, linux-fsdevel

Hi,

I have been hunting a UDF bug that occasionally results in generation
of an Allocation Extent Descriptor with an incorrect tagLocation. So
far I haven't been able to see a path through the code that could
cause that. But, I noticed some inconsistency in locking during
AED generation and wonder if it could result in random corruption.

The function udf_update_inode() has this general pattern:

   bh = udf_tgetblk(...);   // calls sb_getblk()
   lock_buffer(bh);
   memset(bh->b_data, 0, inode->i_sb->s_blocksize);
   
   // <snip>other code to populate FE/EFE data in the block</snip>
   
   set_buffer_uptodate(bh);
   unlock_buffer(bh);
   mark_buffer_dirty(bh);

This I can understand - the lock is held for as long as the buffer
contents are being assembled.

In contrast, udf_setup_indirect_aext(), which constructs an AED,
has this sequence:

   bh = udf_tgetblk(...);   // calls sb_getblk()
   lock_buffer(bh);
   memset(bh->b_data, 0, inode->i_sb->s_blocksize);

   set_buffer_uptodate(bh);
   unlock_buffer(bh);
   mark_buffer_dirty_inode(bh, inode);

   // <snip>other code to populate AED data in the block</snip>

In this case the population of the block occurs without
the protection of the lock.

Because the block has been marked dirty, does this mean that
writeback could occur at any point during population?

There is one path through udf_setup_indirect_aext() where
mark_buffer_dirty_inode() gets called again after population is
complete, which I suppose could heal a partial writeout, but there is
also another path in which the buffer does not get marked dirty again.
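
To make the concern concrete, here is a minimal userspace sketch of the two orderings. It is not kernel code: struct model_bh and the model_* helpers are hypothetical stand-ins, and the model assumes (as a simplification) that writeback only flushes a buffer that is dirty and not locked, and clears the dirty bit when it does.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define BLKSZ 8

/* Hypothetical userspace stand-in for a buffer_head; not the kernel struct. */
struct model_bh {
	bool locked;
	bool dirty;
	unsigned char data[BLKSZ];	/* in-memory contents */
	unsigned char disk[BLKSZ];	/* what has reached the medium */
};

static void model_lock(struct model_bh *bh)
{
	assert(!bh->locked);
	bh->locked = true;
}

static void model_unlock(struct model_bh *bh)
{
	assert(bh->locked);
	bh->locked = false;
}

static void model_mark_dirty(struct model_bh *bh)
{
	bh->dirty = true;
}

/* Assumption: writeback skips locked or clean buffers and clears dirty. */
static bool model_writeback(struct model_bh *bh)
{
	if (!bh->dirty || bh->locked)
		return false;
	memcpy(bh->disk, bh->data, BLKSZ);
	bh->dirty = false;
	return true;
}

/* Ordering questioned above: dirty first, populate afterwards. */
static bool dirty_before_populate(void)
{
	struct model_bh bh = {0};

	memset(bh.disk, 0xEE, BLKSZ);	/* stale content on the medium */
	model_lock(&bh);
	memset(bh.data, 0, BLKSZ);
	model_unlock(&bh);
	model_mark_dirty(&bh);
	model_writeback(&bh);		/* writeback wins the race */
	memset(bh.data, 0xAA, BLKSZ);	/* population, never re-dirtied */
	model_writeback(&bh);		/* no-op: buffer is clean */
	return memcmp(bh.disk, bh.data, BLKSZ) == 0;
}

/* Ordering of the FE/EFE pattern: populate under the lock, then dirty. */
static bool populate_before_dirty(void)
{
	struct model_bh bh = {0};

	memset(bh.disk, 0xEE, BLKSZ);
	model_lock(&bh);
	memset(bh.data, 0, BLKSZ);
	memset(bh.data, 0xAA, BLKSZ);	/* population happens under the lock */
	model_writeback(&bh);		/* refused: locked and still clean */
	model_unlock(&bh);
	model_mark_dirty(&bh);
	model_writeback(&bh);		/* flushes the finished block */
	return memcmp(bh.disk, bh.data, BLKSZ) == 0;
}
```

Each function returns true when the modelled disk copy ends up matching memory. In this model only the second ordering does; the first leaves the zeroed block on the medium unless something dirties the buffer again.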

Regards,
  ------------------------------------------------------------------------
  Steven J. Magnani               "I claim this network for MARS!
  www.digidescorp.com              Earthling, return my space modulator!"

  #include <standard.disclaimer>



* Re: Possible UDF locking error?
From: Jan Kara @ 2019-03-25 16:42 UTC
  To: Steve Magnani; +Cc: Jan Kara, linux-kernel, linux-fsdevel

Hi!

On Sat 23-03-19 15:14:05, Steve Magnani wrote:
> I have been hunting a UDF bug that occasionally results in generation
> of an Allocation Extent Descriptor with an incorrect tagLocation. So
> far I haven't been able to see a path through the code that could
> cause that. But, I noticed some inconsistency in locking during
> AED generation and wonder if it could result in random corruption.
> 
> The function udf_update_inode() has this general pattern:
> 
>   bh = udf_tgetblk(...);   // calls sb_getblk()
>   lock_buffer(bh);
>   memset(bh->b_data, 0, inode->i_sb->s_blocksize);
>   // <snip>other code to populate FE/EFE data in the block</snip>
>   set_buffer_uptodate(bh);
>   unlock_buffer(bh);
>   mark_buffer_dirty(bh);
> 
> This I can understand - the lock is held for as long as the buffer
> contents are being assembled.
> 
> In contrast, udf_setup_indirect_aext(), which constructs an AED,
> has this sequence:
> 
>   bh = udf_tgetblk(...);   // calls sb_getblk()
>   lock_buffer(bh);
>   memset(bh->b_data, 0, inode->i_sb->s_blocksize);
> 
>   set_buffer_uptodate(bh);
>   unlock_buffer(bh);
>   mark_buffer_dirty_inode(bh, inode);
> 
>   // <snip>other code to populate AED data in the block</snip>
> 
> In this case the population of the block occurs without
> the protection of the lock.
> 
> Because the block has been marked dirty, does this mean that
> writeback could occur at any point during population?

Yes. Thanks for noticing this!

> There is one path through udf_setup_indirect_aext() where
> mark_buffer_dirty_inode() gets called again after population is
> complete, which I suppose could heal a partial writeout, but there is
> also another path in which the buffer does not get marked dirty again.

Generally, we add new extents to the created indirect extent which dirties
the buffer and that should fix the problem. But you are definitely right
that the code is suspicious and should be fixed. Will you send a patch?
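
The healing effect described here can be sketched with a small userspace model. This is an illustration only: struct toy_buf and the toy_* functions are hypothetical, and the model assumes writeback simply copies a dirty buffer to disk and clears the dirty bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define BLKSZ 8

/* Hypothetical stand-in for a buffer and its on-disk copy; not kernel code. */
struct toy_buf {
	bool dirty;
	unsigned char mem[BLKSZ];
	unsigned char disk[BLKSZ];
};

/* Flush the buffer if it is dirty, then mark it clean. */
static void toy_writeback(struct toy_buf *b)
{
	if (!b->dirty)
		return;
	memcpy(b->disk, b->mem, BLKSZ);
	b->dirty = false;
}

/* Models a helper that appends an extent and dirties the buffer itself. */
static void toy_add_extent(struct toy_buf *b, unsigned char val)
{
	b->mem[BLKSZ - 1] = val;
	b->dirty = true;
}

static bool toy_in_sync(const struct toy_buf *b)
{
	return memcmp(b->mem, b->disk, BLKSZ) == 0;
}

/*
 * Returns a bitmask: bit 0 is set if disk matched memory right after the
 * descriptor was populated, bit 1 is set if it matched after the later
 * extent insertion and a second writeback.
 */
static int toy_heal_sequence(void)
{
	struct toy_buf b = {0};
	int result = 0;

	b.dirty = true;			/* zeroed block marked dirty early */
	toy_writeback(&b);		/* zeros reach the disk */
	memset(b.mem, 0x5A, 4);		/* descriptor filled in afterwards */
	if (toy_in_sync(&b))
		result |= 1;		/* not taken: disk still holds zeros */

	toy_add_extent(&b, 0x01);	/* later modification re-dirties */
	toy_writeback(&b);
	if (toy_in_sync(&b))
		result |= 2;		/* healed by the second writeback */
	return result;
}
```

In the model the disk copy is incomplete between the first and second writeback, so the fix-up only helps if a later modification actually happens and nothing reads or depends on the on-disk block in between.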

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


* Re: Possible UDF locking error?
From: Steve Magnani @ 2019-03-25 18:23 UTC
  To: Jan Kara; +Cc: linux-kernel, linux-fsdevel

On 3/25/19 11:42 AM, Jan Kara wrote:
> Hi!
>
> On Sat 23-03-19 15:14:05, Steve Magnani wrote:
>> ...
>>
>> In contrast, udf_setup_indirect_aext(), which constructs an AED,
>> has this sequence:
>>
>>    bh = udf_tgetblk(...);   // calls sb_getblk()
>>    lock_buffer(bh);
>>    memset(bh->b_data, 0, inode->i_sb->s_blocksize);
>>
>>    set_buffer_uptodate(bh);
>>    unlock_buffer(bh);
>>    mark_buffer_dirty_inode(bh, inode);
>>
>>    // <snip>other code to populate AED data in the block</snip>
>>
>> In this case the population of the block occurs without
>> the protection of the lock.
>>
>> Because the block has been marked dirty, does this mean that
>> writeback could occur at any point during population?
> Yes. Thanks for noticing this!
>
>> There is one path through udf_setup_indirect_aext() where
>> mark_buffer_dirty_inode() gets called again after population is
>> complete, which I suppose could heal a partial writeout, but there is
>> also another path in which the buffer does not get marked dirty again.
> Generally, we add new extents to the created indirect extent which dirties
> the buffer and that should fix the problem. But you are definitely right
> that the code is suspicious and should be fixed. Will you send a patch?
>
> 								Honza


Sure. There's at least one other place where it looked like there might 
be a similar issue.

Steve




* Re: Possible UDF locking error?
From: Steve Magnani @ 2019-03-30 19:49 UTC
  To: Jan Kara; +Cc: linux-kernel, linux-fsdevel

Jan -

On 3/25/19 11:42 AM, Jan Kara wrote:
> Hi!
>
> On Sat 23-03-19 15:14:05, Steve Magnani wrote:
>> I have been hunting a UDF bug that occasionally results in generation
>> of an Allocation Extent Descriptor with an incorrect tagLocation. So
>> far I haven't been able to see a path through the code that could
>> cause that. But, I noticed some inconsistency in locking during
>> AED generation and wonder if it could result in random corruption.
>>
>> The function udf_update_inode() has this general pattern:
>>
>>    bh = udf_tgetblk(...);   // calls sb_getblk()
>>    lock_buffer(bh);
>>    memset(bh->b_data, 0, inode->i_sb->s_blocksize);
>>    // <snip>other code to populate FE/EFE data in the block</snip>
>>    set_buffer_uptodate(bh);
>>    unlock_buffer(bh);
>>    mark_buffer_dirty(bh);
>>
>> This I can understand - the lock is held for as long as the buffer
>> contents are being assembled.
>>
>> In contrast, udf_setup_indirect_aext(), which constructs an AED,
>> has this sequence:
>>
>>    bh = udf_tgetblk(...);   // calls sb_getblk()
>>    lock_buffer(bh);
>>    memset(bh->b_data, 0, inode->i_sb->s_blocksize);
>>
>>    set_buffer_uptodate(bh);
>>    unlock_buffer(bh);
>>    mark_buffer_dirty_inode(bh, inode);
>>
>>    // <snip>other code to populate AED data in the block</snip>
>>
>> In this case the population of the block occurs without
>> the protection of the lock.
>>
>> Because the block has been marked dirty, does this mean that
>> writeback could occur at any point during population?
> Yes. Thanks for noticing this!
>
>> There is one path through udf_setup_indirect_aext() where
>> mark_buffer_dirty_inode() gets called again after population is
>> complete, which I suppose could heal a partial writeout, but there is
>> also another path in which the buffer does not get marked dirty again.
> Generally, we add new extents to the created indirect extent which dirties
> the buffer and that should fix the problem. But you are definitely right
> that the code is suspicious and should be fixed. Will you send a patch?

I did a little archaeology to see how the code evolved to this point. 
It's been like this a long time.

I also did some research to understand why filesystems use lock_buffer() 
sometimes but not others. For example, the FAT driver never calls it. I 
ran across this thread from 2011:

        https://lkml.org/lkml/2011/5/16/402

...from which I conclude that while it is correct in a strict sense to 
hold a lock on a buffer any time its contents are being modified, 
performance considerations make it preferable (or at least reasonable) 
to make some modifications without a lock provided it's known that a 
subsequent write-out will "fix" any potential partial write out before 
anyone else tries to read the block. I doubt that UDF sees common use 
with DIF/DIX block devices, which might make a decision in favor of 
performance a little easier. Since the FAT driver doesn't contain 
Darrick's proposed changes I assume a decision was made that performance 
was more important there.

Certainly the call to udf_setup_indirect_aext() from udf_add_aext() 
meets those criteria. But udf_table_free_blocks() may not dirty the AED 
block.

So if this looks reasonable I will resend as a formal patch:

--- a/fs/udf/inode.c	2019-03-30 11:28:38.637759458 -0500
+++ b/fs/udf/inode.c	2019-03-30 11:33:00.357761250 -0500
@@ -1873,9 +1873,6 @@ int udf_setup_indirect_aext(struct inode
  		return -EIO;
  	lock_buffer(bh);
  	memset(bh->b_data, 0x00, sb->s_blocksize);
-	set_buffer_uptodate(bh);
-	unlock_buffer(bh);
-	mark_buffer_dirty_inode(bh, inode);
  
  	aed = (struct allocExtDesc *)(bh->b_data);
  	if (!UDF_QUERY_FLAG(sb, UDF_FLAG_STRICT)) {
@@ -1890,6 +1887,9 @@ int udf_setup_indirect_aext(struct inode
  	udf_new_tag(bh->b_data, TAG_IDENT_AED, ver, 1, block,
  		    sizeof(struct tag));
  
+	set_buffer_uptodate(bh);
+	unlock_buffer(bh);
+
  	nepos.block = neloc;
  	nepos.offset = sizeof(struct allocExtDesc);
  	nepos.bh = bh;
@@ -1913,6 +1913,8 @@ int udf_setup_indirect_aext(struct inode
  	} else {
  		__udf_add_aext(inode, epos, &nepos.block,
  			       sb->s_blocksize | EXT_NEXT_EXTENT_ALLOCDECS, 0);
+		/* Make sure completed AED gets written out */
+		mark_buffer_dirty_inode(nepos.bh, inode);
  	}
  
  	brelse(epos->bh);

------------------------------------------------------------------------
  Steven J. Magnani               "I claim this network for MARS!
  www.digidescorp.com               Earthling, return my space modulator!"

  #include <standard.disclaimer>



* Re: Possible UDF locking error?
From: Jan Kara @ 2019-04-03  8:07 UTC
  To: Steve Magnani; +Cc: Jan Kara, linux-kernel, linux-fsdevel

Hi,

On Sat 30-03-19 14:49:46, Steve Magnani wrote:
> On 3/25/19 11:42 AM, Jan Kara wrote:
> > Hi!
> > 
> > On Sat 23-03-19 15:14:05, Steve Magnani wrote:
> > > I have been hunting a UDF bug that occasionally results in generation
> > > of an Allocation Extent Descriptor with an incorrect tagLocation. So
> > > far I haven't been able to see a path through the code that could
> > > cause that. But, I noticed some inconsistency in locking during
> > > AED generation and wonder if it could result in random corruption.
> > > 
> > > The function udf_update_inode() has this general pattern:
> > > 
> > >    bh = udf_tgetblk(...);   // calls sb_getblk()
> > >    lock_buffer(bh);
> > >    memset(bh->b_data, 0, inode->i_sb->s_blocksize);
> > >    // <snip>other code to populate FE/EFE data in the block</snip>
> > >    set_buffer_uptodate(bh);
> > >    unlock_buffer(bh);
> > >    mark_buffer_dirty(bh);
> > > 
> > > This I can understand - the lock is held for as long as the buffer
> > > contents are being assembled.
> > > 
> > > In contrast, udf_setup_indirect_aext(), which constructs an AED,
> > > has this sequence:
> > > 
> > >    bh = udf_tgetblk(...);   // calls sb_getblk()
> > >    lock_buffer(bh);
> > >    memset(bh->b_data, 0, inode->i_sb->s_blocksize);
> > > 
> > >    set_buffer_uptodate(bh);
> > >    unlock_buffer(bh);
> > >    mark_buffer_dirty_inode(bh, inode);
> > > 
> > >    // <snip>other code to populate AED data in the block</snip>
> > > 
> > > In this case the population of the block occurs without
> > > the protection of the lock.
> > > 
> > > Because the block has been marked dirty, does this mean that
> > > writeback could occur at any point during population?
> > Yes. Thanks for noticing this!
> > 
> > > There is one path through udf_setup_indirect_aext() where
> > > mark_buffer_dirty_inode() gets called again after population is
> > > complete, which I suppose could heal a partial writeout, but there is
> > > also another path in which the buffer does not get marked dirty again.
> > Generally, we add new extents to the created indirect extent which dirties
> > the buffer and that should fix the problem. But you are definitely right
> > that the code is suspicious and should be fixed. Will you send a patch?
> 
> I did a little archaeology to see how the code evolved to this point. It's
> been like this a long time.
> 
> I also did some research to understand why filesystems use lock_buffer()
> sometimes but not others. For example, the FAT driver never calls it. I ran
> across this thread from 2011:
> 
>        https://lkml.org/lkml/2011/5/16/402
> 
> ...from which I conclude that while it is correct in a strict sense to hold
> a lock on a buffer any time its contents are being modified, performance
> considerations make it preferable (or at least reasonable) to make some
> modifications without a lock provided it's known that a subsequent write-out
> will "fix" any potential partial write out before anyone else tries to read
> the block.

Understood, but neither UDF nor FAT is really that performance critical.
If you are looking for performance, you'd certainly pick a different
filesystem. UDF is mainly for data interchange, so it should work reasonably
for copy-in, copy-out style workloads; the rest isn't that important. So
there, correctness and simplicity are preferred over performance.

> I doubt that UDF sees common use with DIF/DIX block devices,
> which might make a decision in favor of performance a little easier. Since
> the FAT driver doesn't contain Darrick's proposed changes I assume a
> decision was made that performance was more important there.
> 
> Certainly the call to udf_setup_indirect_aext() from udf_add_aext() meets
> those criteria. But udf_table_free_blocks() may not dirty the AED block.
> 
> So if this looks reasonable I will resend as a formal patch:
> 
> --- a/fs/udf/inode.c	2019-03-30 11:28:38.637759458 -0500
> +++ b/fs/udf/inode.c	2019-03-30 11:33:00.357761250 -0500
> @@ -1873,9 +1873,6 @@ int udf_setup_indirect_aext(struct inode
>  		return -EIO;
>  	lock_buffer(bh);
>  	memset(bh->b_data, 0x00, sb->s_blocksize);
> -	set_buffer_uptodate(bh);
> -	unlock_buffer(bh);
> -	mark_buffer_dirty_inode(bh, inode);
>  	aed = (struct allocExtDesc *)(bh->b_data);
>  	if (!UDF_QUERY_FLAG(sb, UDF_FLAG_STRICT)) {
> @@ -1890,6 +1887,9 @@ int udf_setup_indirect_aext(struct inode
>  	udf_new_tag(bh->b_data, TAG_IDENT_AED, ver, 1, block,
>  		    sizeof(struct tag));
> +	set_buffer_uptodate(bh);
> +	unlock_buffer(bh);
> +
>  	nepos.block = neloc;
>  	nepos.offset = sizeof(struct allocExtDesc);
>  	nepos.bh = bh;
> @@ -1913,6 +1913,8 @@ int udf_setup_indirect_aext(struct inode
>  	} else {
>  		__udf_add_aext(inode, epos, &nepos.block,
>  			       sb->s_blocksize | EXT_NEXT_EXTENT_ALLOCDECS, 0);
> +		/* Make sure completed AED gets written out */
> +		mark_buffer_dirty_inode(nepos.bh, inode);

Why do you mark the buffer as dirty only here? I'd just mark it dirty after
unlocking. If __udf_add_aext() or udf_write_aext() modify the buffer, they
will mark it as dirty as well... Thanks!
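
A rough userspace sketch of the ordering suggested here (populate under the lock, unlock, then mark dirty unconditionally) follows. It is only a model under stated assumptions: struct sim_buf and the sim_* functions are hypothetical, writeback is assumed to skip locked or clean buffers, and the later helper is assumed to dirty whatever it modifies.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define BLKSZ 8

/* Hypothetical stand-in for a buffer and its on-disk copy; not kernel code. */
struct sim_buf {
	bool locked;
	bool dirty;
	unsigned char mem[BLKSZ];
	unsigned char disk[BLKSZ];
};

/* Assumption: writeback skips locked or clean buffers and clears dirty. */
static void sim_writeback(struct sim_buf *b)
{
	if (b->locked || !b->dirty)
		return;
	memcpy(b->disk, b->mem, BLKSZ);
	b->dirty = false;
}

/*
 * Populate under the lock, unlock, then dirty immediately. If
 * later_modification is true, a helper changes the block afterwards and
 * dirties it again. Returns true if the disk copy matches memory once
 * writeback has run.
 */
static bool sim_dirty_after_unlock(bool later_modification)
{
	struct sim_buf b = {0};

	memset(b.disk, 0xEE, BLKSZ);	/* stale content on the medium */

	b.locked = true;
	memset(b.mem, 0, BLKSZ);
	memset(b.mem, 0xA5, 4);		/* descriptor fully assembled */
	b.locked = false;
	b.dirty = true;			/* unconditional, right after unlock */

	sim_writeback(&b);		/* may run at any time from here on */

	if (later_modification) {
		b.mem[BLKSZ - 1] = 0x01;
		b.dirty = true;		/* the helper dirties what it touches */
	}
	sim_writeback(&b);
	return memcmp(b.mem, b.disk, BLKSZ) == 0;
}
```

In this model both callers, with and without a later modification, end with the disk copy matching memory, which is why a single unconditional dirtying after the unlock is enough here.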

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR
