All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] ext4: stop return ENOSPC from ext4_issue_zeroout
@ 2021-08-04 12:50 yangerkun
  2021-08-04 13:35 ` Jan Kara
  0 siblings, 1 reply; 14+ messages in thread
From: yangerkun @ 2021-08-04 12:50 UTC (permalink / raw)
  To: tytso, jack, adilger.kernel; +Cc: linux-ext4, yangerkun, yukuai3

Our testcase(briefly described as fsstress on dm thin-provisioning which
ext4 see volume size with 100G but actual size 10G) trigger a hungtask
bug since ext4_writepages fall into a infinite loop:

static int ext4_writepages(xxx)
{
    ...
   while (!done && mpd.first_page <= mpd.last_page) {
       ...
       ret = mpage_prepare_extent_to_map(&mpd);
       if (!ret) {
           ...
           ret = mpage_map_and_submit_extent(handle,
&mpd,&give_up_on_write);
           <----- will return -ENOSPC
           ...
       }
       ...
       if (ret == -ENOSPC && sbi->s_journal) {
           <------ we cannot break since we will get ENOSPC forever
           jbd2_journal_force_commit_nested(sbi->s_journal);
           ret = 0;
           continue;
       }
       ...
   }
}

Got ENOSPC with follow stack:
...
ext4_ext_map_blocks
  ext4_ext_convert_to_initialized
    ext4_ext_zeroout
      ext4_issue_zeroout
        ...
        submit_bio_wait <-- bio to thinpool will return ENOSPC

'df22291ff0fd ("ext4: Retry block allocation if we have free blocks
left")' add the logic to retry block allcate since we may get free block
after we commit a transaction. But the ENOSPC from thin-provisioning
will confuse ext4, and lead the upper infinite loop. Fix it by convert
the err to EIO.

Signed-off-by: yangerkun <yangerkun@huawei.com>
---
 fs/ext4/inode.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 038aebd7eb2f..d9ded699a88c 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -428,6 +428,9 @@ int ext4_issue_zeroout(struct inode *inode, ext4_lblk_t lblk, ext4_fsblk_t pblk,
 	if (ret > 0)
 		ret = 0;
 
+	if (ret == -ENOSPC)
+		ret = -EIO;
+
 	return ret;
 }
 
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH] ext4: stop return ENOSPC from ext4_issue_zeroout
  2021-08-04 12:50 [PATCH] ext4: stop return ENOSPC from ext4_issue_zeroout yangerkun
@ 2021-08-04 13:35 ` Jan Kara
  2021-08-13 15:18   ` Theodore Ts'o
  0 siblings, 1 reply; 14+ messages in thread
From: Jan Kara @ 2021-08-04 13:35 UTC (permalink / raw)
  To: yangerkun; +Cc: tytso, jack, adilger.kernel, linux-ext4, yukuai3

On Wed 04-08-21 20:50:44, yangerkun wrote:
> Our testcase(briefly described as fsstress on dm thin-provisioning which
> ext4 see volume size with 100G but actual size 10G) trigger a hungtask
> bug since ext4_writepages fall into a infinite loop:
> 
> static int ext4_writepages(xxx)
> {
>     ...
>    while (!done && mpd.first_page <= mpd.last_page) {
>        ...
>        ret = mpage_prepare_extent_to_map(&mpd);
>        if (!ret) {
>            ...
>            ret = mpage_map_and_submit_extent(handle,
> &mpd,&give_up_on_write);
>            <----- will return -ENOSPC
>            ...
>        }
>        ...
>        if (ret == -ENOSPC && sbi->s_journal) {
>            <------ we cannot break since we will get ENOSPC forever
>            jbd2_journal_force_commit_nested(sbi->s_journal);
>            ret = 0;
>            continue;
>        }
>        ...
>    }
> }
> 
> Got ENOSPC with follow stack:
> ...
> ext4_ext_map_blocks
>   ext4_ext_convert_to_initialized
>     ext4_ext_zeroout
>       ext4_issue_zeroout
>         ...
>         submit_bio_wait <-- bio to thinpool will return ENOSPC
> 
> 'df22291ff0fd ("ext4: Retry block allocation if we have free blocks
> left")' add the logic to retry block allcate since we may get free block
> after we commit a transaction. But the ENOSPC from thin-provisioning
> will confuse ext4, and lead the upper infinite loop. Fix it by convert
> the err to EIO.
> 
> Signed-off-by: yangerkun <yangerkun@huawei.com>

Thanks for the patch. As a quick fix for the problem this is probably fine.
But longer term we might need to implement a configurable behavior for this
because just dropping data on the floor (which is what would happen here)
need not be what sysadmin wants and blocking until space is provisioned may be
actually a preferable behavior. Anyway for now feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

									Honza

> ---
>  fs/ext4/inode.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 038aebd7eb2f..d9ded699a88c 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -428,6 +428,9 @@ int ext4_issue_zeroout(struct inode *inode, ext4_lblk_t lblk, ext4_fsblk_t pblk,
>  	if (ret > 0)
>  		ret = 0;
>  
> +	if (ret == -ENOSPC)
> +		ret = -EIO;
> +
>  	return ret;
>  }
>  
> -- 
> 2.31.1
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] ext4: stop return ENOSPC from ext4_issue_zeroout
  2021-08-04 13:35 ` Jan Kara
@ 2021-08-13 15:18   ` Theodore Ts'o
  2021-08-13 21:27     ` [PATCH] ext4: if zeroout fails fall back to splitting the extent node Theodore Ts'o
  2021-08-16 10:05     ` [PATCH] ext4: stop return ENOSPC from ext4_issue_zeroout Jan Kara
  0 siblings, 2 replies; 14+ messages in thread
From: Theodore Ts'o @ 2021-08-13 15:18 UTC (permalink / raw)
  To: Jan Kara; +Cc: yangerkun, adilger.kernel, linux-ext4, yukuai3

On Wed, Aug 04, 2021 at 03:35:29PM +0200, Jan Kara wrote:
> On Wed 04-08-21 20:50:44, yangerkun wrote:
> > Our testcase(briefly described as fsstress on dm thin-provisioning which
> > ext4 see volume size with 100G but actual size 10G) trigger a hungtask
> > bug since ext4_writepages fall into a infinite loop:
> > 
> > Got ENOSPC with follow stack:
> > ...
> > ext4_ext_map_blocks
> >   ext4_ext_convert_to_initialized
> >     ext4_ext_zeroout
> >       ext4_issue_zeroout
> >         ...
> >         submit_bio_wait <-- bio to thinpool will return ENOSPC
> > 
> 
> Thanks for the patch. As a quick fix for the problem this is probably fine.
> But longer term we might need to implement a configurable behavior for this
> because just dropping data on the floor (which is what would happen here)
> need not be what sysadmin wants and blocking until space is provisioned may be
> actually a preferable behavior. Anyway for now feel free to add:
> 
> Reviewed-by: Jan Kara <jack@suse.cz>

Hmm, I wonder if this would be a better fix.  (Not yet tested, may fry
your file system, etc....)   What do folks think?

						- Ted

diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 92ad64b89d9b..501516cadc1b 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -3569,7 +3569,7 @@ static int ext4_ext_convert_to_initialized(handle_t *handle,
 				split_map.m_len - ee_block);
 			err = ext4_ext_zeroout(inode, &zero_ex1);
 			if (err)
-				goto out;
+				goto fallback;
 			split_map.m_len = allocated;
 		}
 		if (split_map.m_lblk - ee_block + split_map.m_len <
@@ -3583,7 +3583,7 @@ static int ext4_ext_convert_to_initialized(handle_t *handle,
 						      ext4_ext_pblock(ex));
 				err = ext4_ext_zeroout(inode, &zero_ex2);
 				if (err)
-					goto out;
+					goto fallback;
 			}
 
 			split_map.m_len += split_map.m_lblk - ee_block;
@@ -3592,6 +3592,7 @@ static int ext4_ext_convert_to_initialized(handle_t *handle,
 		}
 	}
 
+fallback:
 	err = ext4_split_extent(handle, inode, ppath, &split_map, split_flag,
 				flags);
 	if (err > 0)

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH] ext4: if zeroout fails fall back to splitting the extent node
  2021-08-13 15:18   ` Theodore Ts'o
@ 2021-08-13 21:27     ` Theodore Ts'o
  2021-08-14  2:15       ` yangerkun
  2021-09-26 11:35       ` yangerkun
  2021-08-16 10:05     ` [PATCH] ext4: stop return ENOSPC from ext4_issue_zeroout Jan Kara
  1 sibling, 2 replies; 14+ messages in thread
From: Theodore Ts'o @ 2021-08-13 21:27 UTC (permalink / raw)
  To: Ext4 Developers List; +Cc: Theodore Ts'o, yangerkun, yukuai3

If the underlying storage device is using thin-provisioning, it's
possible for a zeroout operation to return ENOSPC.

Commit df22291ff0fd ("ext4: Retry block allocation if we have free blocks
left") added logic to retry block allocation since we might get free block
after we commit a transaction. But the ENOSPC from thin-provisioning
will confuse ext4, and lead to an infinite loop.

Since using zeroout instead of splitting the extent node is an
optimization, if it fails, we might as well fall back to splitting the
extent node.

Reported-by: yangerkun <yangerkun@huawei.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
---

I've run this through my battery of tests, and it doesn't cause any
regressions.  Yangerkun, can you test this and see if this works for
you?

 fs/ext4/extents.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 92ad64b89d9b..501516cadc1b 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -3569,7 +3569,7 @@ static int ext4_ext_convert_to_initialized(handle_t *handle,
 				split_map.m_len - ee_block);
 			err = ext4_ext_zeroout(inode, &zero_ex1);
 			if (err)
-				goto out;
+				goto fallback;
 			split_map.m_len = allocated;
 		}
 		if (split_map.m_lblk - ee_block + split_map.m_len <
@@ -3583,7 +3583,7 @@ static int ext4_ext_convert_to_initialized(handle_t *handle,
 						      ext4_ext_pblock(ex));
 				err = ext4_ext_zeroout(inode, &zero_ex2);
 				if (err)
-					goto out;
+					goto fallback;
 			}
 
 			split_map.m_len += split_map.m_lblk - ee_block;
@@ -3592,6 +3592,7 @@ static int ext4_ext_convert_to_initialized(handle_t *handle,
 		}
 	}
 
+fallback:
 	err = ext4_split_extent(handle, inode, ppath, &split_map, split_flag,
 				flags);
 	if (err > 0)
-- 
2.31.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH] ext4: if zeroout fails fall back to splitting the extent node
  2021-08-13 21:27     ` [PATCH] ext4: if zeroout fails fall back to splitting the extent node Theodore Ts'o
@ 2021-08-14  2:15       ` yangerkun
  2021-08-16  6:57         ` yangerkun
  2021-09-26 11:35       ` yangerkun
  1 sibling, 1 reply; 14+ messages in thread
From: yangerkun @ 2021-08-14  2:15 UTC (permalink / raw)
  To: Theodore Ts'o, Ext4 Developers List; +Cc: yukuai3



在 2021/8/14 5:27, Theodore Ts'o 写道:
> If the underlying storage device is using thin-provisioning, it's
> possible for a zeroout operation to return ENOSPC.
> 
> Commit df22291ff0fd ("ext4: Retry block allocation if we have free blocks
> left") added logic to retry block allocation since we might get free block
> after we commit a transaction. But the ENOSPC from thin-provisioning
> will confuse ext4, and lead to an infinite loop.
> 
> Since using zeroout instead of splitting the extent node is an
> optimization, if it fails, we might as well fall back to splitting the
> extent node.
> 
> Reported-by: yangerkun <yangerkun@huawei.com>
> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
> ---
> 
> I've run this through my battery of tests, and it doesn't cause any
> regressions.  Yangerkun, can you test this and see if this works for
> you?

Will do it.

> 
>   fs/ext4/extents.c | 5 +++--
>   1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
> index 92ad64b89d9b..501516cadc1b 100644
> --- a/fs/ext4/extents.c
> +++ b/fs/ext4/extents.c
> @@ -3569,7 +3569,7 @@ static int ext4_ext_convert_to_initialized(handle_t *handle,
>   				split_map.m_len - ee_block);
>   			err = ext4_ext_zeroout(inode, &zero_ex1);
>   			if (err)
> -				goto out;
> +				goto fallback;
>   			split_map.m_len = allocated;
>   		}
>   		if (split_map.m_lblk - ee_block + split_map.m_len <
> @@ -3583,7 +3583,7 @@ static int ext4_ext_convert_to_initialized(handle_t *handle,
>   						      ext4_ext_pblock(ex));
>   				err = ext4_ext_zeroout(inode, &zero_ex2);
>   				if (err)
> -					goto out;
> +					goto fallback;
>   			}
>   
>   			split_map.m_len += split_map.m_lblk - ee_block;
> @@ -3592,6 +3592,7 @@ static int ext4_ext_convert_to_initialized(handle_t *handle,
>   		}
>   	}
>   
> +fallback:
>   	err = ext4_split_extent(handle, inode, ppath, &split_map, split_flag,
>   				flags);
>   	if (err > 0)
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] ext4: if zeroout fails fall back to splitting the extent node
  2021-08-14  2:15       ` yangerkun
@ 2021-08-16  6:57         ` yangerkun
  0 siblings, 0 replies; 14+ messages in thread
From: yangerkun @ 2021-08-16  6:57 UTC (permalink / raw)
  To: Theodore Ts'o, Ext4 Developers List; +Cc: yukuai3



在 2021/8/14 10:15, yangerkun 写道:
> 
> 
> 在 2021/8/14 5:27, Theodore Ts'o 写道:
>> If the underlying storage device is using thin-provisioning, it's
>> possible for a zeroout operation to return ENOSPC.
>>
>> Commit df22291ff0fd ("ext4: Retry block allocation if we have free blocks
>> left") added logic to retry block allocation since we might get free 
>> block
>> after we commit a transaction. But the ENOSPC from thin-provisioning
>> will confuse ext4, and lead to an infinite loop.
>>
>> Since using zeroout instead of splitting the extent node is an
>> optimization, if it fails, we might as well fall back to splitting the
>> extent node.
>>
>> Reported-by: yangerkun <yangerkun@huawei.com>
>> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
>> ---
>>
>> I've run this through my battery of tests, and it doesn't cause any
>> regressions.  Yangerkun, can you test this and see if this works for
>> you?
> 
> Will do it.

Thanks for the patch, it can help us to pass the testcase. And after 
some review, it's really a better fix for me.

Thanks,
Kun.

> 
>>
>>   fs/ext4/extents.c | 5 +++--
>>   1 file changed, 3 insertions(+), 2 deletions(-)
>>
>> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
>> index 92ad64b89d9b..501516cadc1b 100644
>> --- a/fs/ext4/extents.c
>> +++ b/fs/ext4/extents.c
>> @@ -3569,7 +3569,7 @@ static int 
>> ext4_ext_convert_to_initialized(handle_t *handle,
>>                   split_map.m_len - ee_block);
>>               err = ext4_ext_zeroout(inode, &zero_ex1);
>>               if (err)
>> -                goto out;
>> +                goto fallback;
>>               split_map.m_len = allocated;
>>           }
>>           if (split_map.m_lblk - ee_block + split_map.m_len <
>> @@ -3583,7 +3583,7 @@ static int 
>> ext4_ext_convert_to_initialized(handle_t *handle,
>>                                 ext4_ext_pblock(ex));
>>                   err = ext4_ext_zeroout(inode, &zero_ex2);
>>                   if (err)
>> -                    goto out;
>> +                    goto fallback;
>>               }
>>               split_map.m_len += split_map.m_lblk - ee_block;
>> @@ -3592,6 +3592,7 @@ static int 
>> ext4_ext_convert_to_initialized(handle_t *handle,
>>           }
>>       }
>> +fallback:
>>       err = ext4_split_extent(handle, inode, ppath, &split_map, 
>> split_flag,
>>                   flags);
>>       if (err > 0)
>>
> .

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] ext4: stop return ENOSPC from ext4_issue_zeroout
  2021-08-13 15:18   ` Theodore Ts'o
  2021-08-13 21:27     ` [PATCH] ext4: if zeroout fails fall back to splitting the extent node Theodore Ts'o
@ 2021-08-16 10:05     ` Jan Kara
  2021-08-16 14:16       ` Theodore Ts'o
  1 sibling, 1 reply; 14+ messages in thread
From: Jan Kara @ 2021-08-16 10:05 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Jan Kara, yangerkun, adilger.kernel, linux-ext4, yukuai3

On Fri 13-08-21 11:18:01, Theodore Ts'o wrote:
> On Wed, Aug 04, 2021 at 03:35:29PM +0200, Jan Kara wrote:
> > On Wed 04-08-21 20:50:44, yangerkun wrote:
> > > Our testcase(briefly described as fsstress on dm thin-provisioning which
> > > ext4 see volume size with 100G but actual size 10G) trigger a hungtask
> > > bug since ext4_writepages fall into a infinite loop:
> > > 
> > > Got ENOSPC with follow stack:
> > > ...
> > > ext4_ext_map_blocks
> > >   ext4_ext_convert_to_initialized
> > >     ext4_ext_zeroout
> > >       ext4_issue_zeroout
> > >         ...
> > >         submit_bio_wait <-- bio to thinpool will return ENOSPC
> > > 
> > 
> > Thanks for the patch. As a quick fix for the problem this is probably fine.
> > But longer term we might need to implement a configurable behavior for this
> > because just dropping data on the floor (which is what would happen here)
> > need not be what sysadmin wants and blocking until space is provisioned may be
> > actually a preferable behavior. Anyway for now feel free to add:
> > 
> > Reviewed-by: Jan Kara <jack@suse.cz>
> 
> Hmm, I wonder if this would be a better fix.  (Not yet tested, may fry
> your file system, etc....)   What do folks think?

Yes, that looks indeed better. I'd note that even splitting extent may fail
due to ENOSPC on thin-provisioned storage but the chances are *much* lower.

								Honza

> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
> index 92ad64b89d9b..501516cadc1b 100644
> --- a/fs/ext4/extents.c
> +++ b/fs/ext4/extents.c
> @@ -3569,7 +3569,7 @@ static int ext4_ext_convert_to_initialized(handle_t *handle,
>  				split_map.m_len - ee_block);
>  			err = ext4_ext_zeroout(inode, &zero_ex1);
>  			if (err)
> -				goto out;
> +				goto fallback;
>  			split_map.m_len = allocated;
>  		}
>  		if (split_map.m_lblk - ee_block + split_map.m_len <
> @@ -3583,7 +3583,7 @@ static int ext4_ext_convert_to_initialized(handle_t *handle,
>  						      ext4_ext_pblock(ex));
>  				err = ext4_ext_zeroout(inode, &zero_ex2);
>  				if (err)
> -					goto out;
> +					goto fallback;
>  			}
>  
>  			split_map.m_len += split_map.m_lblk - ee_block;
> @@ -3592,6 +3592,7 @@ static int ext4_ext_convert_to_initialized(handle_t *handle,
>  		}
>  	}
>  
> +fallback:
>  	err = ext4_split_extent(handle, inode, ppath, &split_map, split_flag,
>  				flags);
>  	if (err > 0)
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] ext4: stop return ENOSPC from ext4_issue_zeroout
  2021-08-16 10:05     ` [PATCH] ext4: stop return ENOSPC from ext4_issue_zeroout Jan Kara
@ 2021-08-16 14:16       ` Theodore Ts'o
  0 siblings, 0 replies; 14+ messages in thread
From: Theodore Ts'o @ 2021-08-16 14:16 UTC (permalink / raw)
  To: Jan Kara; +Cc: yangerkun, adilger.kernel, linux-ext4, yukuai3

On Mon, Aug 16, 2021 at 12:05:45PM +0200, Jan Kara wrote:
> 
> Yes, that looks indeed better. I'd note that even splitting extent may fail
> due to ENOSPC on thin-provisioned storage but the chances are *much* lower.

Indeed, any kind of metadata update (updating an inode atime, creating
a new inode, deleting a directory entry, etc.) can fail due to ENOSPC
on thin-provision storage, leading to a potentially corrupted file
system since some writes will succeed, while others won't --- and
that's not a scenario that's super well tested, nor is there much we
can do other than remounting the file system read/only or forcing a
reboot.

But at least we won't throw the kernel into an infinite loop, which is
what yangerkun was reporting...

					- Ted

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] ext4: if zeroout fails fall back to splitting the extent node
  2021-08-13 21:27     ` [PATCH] ext4: if zeroout fails fall back to splitting the extent node Theodore Ts'o
  2021-08-14  2:15       ` yangerkun
@ 2021-09-26 11:35       ` yangerkun
  2021-11-23  9:27         ` Jan Kara
  1 sibling, 1 reply; 14+ messages in thread
From: yangerkun @ 2021-09-26 11:35 UTC (permalink / raw)
  To: Theodore Ts'o, Ext4 Developers List, Jan Kara; +Cc: yukuai3

Hi,

Rethink about this problem. Should we consider other place which call 
ext4_issue_zeroout? Maybe it can trigger the problem too(in theory, not 
really happened)...

How about include follow patch which not only transfer ENOSPC to EIO. 
But also stop to overwrite the error return by ext4_ext_insert_extent in 
ext4_split_extent_at.

Besides, 308c57ccf431 ("ext4: if zeroout fails fall back to splitting 
the extent node") can work together with this patch.



diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index c0de30f25185..66767ede235f 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -3218,16 +3218,18 @@ static int ext4_split_extent_at(handle_t *handle,
                 goto out;

         if (EXT4_EXT_MAY_ZEROOUT & split_flag) {
+               int ret = 0;
+
                 if (split_flag & 
(EXT4_EXT_DATA_VALID1|EXT4_EXT_DATA_VALID2)) {
                         if (split_flag & EXT4_EXT_DATA_VALID1) {
-                               err = ext4_ext_zeroout(inode, ex2);
+                               ret = ext4_ext_zeroout(inode, ex2);
                                 zero_ex.ee_block = ex2->ee_block;
                                 zero_ex.ee_len = cpu_to_le16(
 
ext4_ext_get_actual_len(ex2));
                                 ext4_ext_store_pblock(&zero_ex,
 
ext4_ext_pblock(ex2));
                         } else {
-                               err = ext4_ext_zeroout(inode, ex);
+                               ret = ext4_ext_zeroout(inode, ex);
                                 zero_ex.ee_block = ex->ee_block;
                                 zero_ex.ee_len = cpu_to_le16(
 
ext4_ext_get_actual_len(ex));
@@ -3235,7 +3237,7 @@ static int ext4_split_extent_at(handle_t *handle,
                                                       ext4_ext_pblock(ex));
                         }
                 } else {
-                       err = ext4_ext_zeroout(inode, &orig_ex);
+                       ret = ext4_ext_zeroout(inode, &orig_ex);
                         zero_ex.ee_block = orig_ex.ee_block;
                         zero_ex.ee_len = cpu_to_le16(
 
ext4_ext_get_actual_len(&orig_ex));
@@ -3243,7 +3245,7 @@ static int ext4_split_extent_at(handle_t *handle,
                                               ext4_ext_pblock(&orig_ex));
                 }

-               if (!err) {
+               if (!ret) {
                         /* update the extent length and mark as 
initialized */
                         ex->ee_len = cpu_to_le16(ee_len);
                         ext4_ext_try_to_merge(handle, inode, path, ex);
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index d18852d6029c..95b970581864 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -427,6 +427,9 @@ int ext4_issue_zeroout(struct inode *inode, 
ext4_lblk_t lblk, ext4_fsblk_t pblk,
         if (ret > 0)
                 ret = 0;

+       if (ret == -ENOSPC)
+               ret = -EIO;
+
         return ret;
  }



在 2021/8/14 5:27, Theodore Ts'o 写道:
> If the underlying storage device is using thin-provisioning, it's
> possible for a zeroout operation to return ENOSPC.
> 
> Commit df22291ff0fd ("ext4: Retry block allocation if we have free blocks
> left") added logic to retry block allocation since we might get free block
> after we commit a transaction. But the ENOSPC from thin-provisioning
> will confuse ext4, and lead to an infinite loop.
> 
> Since using zeroout instead of splitting the extent node is an
> optimization, if it fails, we might as well fall back to splitting the
> extent node.
> 
> Reported-by: yangerkun <yangerkun@huawei.com>
> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
> ---
> 
> I've run this through my battery of tests, and it doesn't cause any
> regressions.  Yangerkun, can you test this and see if this works for
> you?
> 
>   fs/ext4/extents.c | 5 +++--
>   1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
> index 92ad64b89d9b..501516cadc1b 100644
> --- a/fs/ext4/extents.c
> +++ b/fs/ext4/extents.c
> @@ -3569,7 +3569,7 @@ static int ext4_ext_convert_to_initialized(handle_t *handle,
>   				split_map.m_len - ee_block);
>   			err = ext4_ext_zeroout(inode, &zero_ex1);
>   			if (err)
> -				goto out;
> +				goto fallback;
>   			split_map.m_len = allocated;
>   		}
>   		if (split_map.m_lblk - ee_block + split_map.m_len <
> @@ -3583,7 +3583,7 @@ static int ext4_ext_convert_to_initialized(handle_t *handle,
>   						      ext4_ext_pblock(ex));
>   				err = ext4_ext_zeroout(inode, &zero_ex2);
>   				if (err)
> -					goto out;
> +					goto fallback;
>   			}
>   
>   			split_map.m_len += split_map.m_lblk - ee_block;
> @@ -3592,6 +3592,7 @@ static int ext4_ext_convert_to_initialized(handle_t *handle,
>   		}
>   	}
>   
> +fallback:
>   	err = ext4_split_extent(handle, inode, ppath, &split_map, split_flag,
>   				flags);
>   	if (err > 0)
> 

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH] ext4: if zeroout fails fall back to splitting the extent node
  2021-09-26 11:35       ` yangerkun
@ 2021-11-23  9:27         ` Jan Kara
  2021-11-24  9:01           ` yangerkun
  0 siblings, 1 reply; 14+ messages in thread
From: Jan Kara @ 2021-11-23  9:27 UTC (permalink / raw)
  To: yangerkun; +Cc: Theodore Ts'o, Ext4 Developers List, Jan Kara, yukuai3

Hello,

On Sun 26-09-21 19:35:01, yangerkun wrote:
> Rethink about this problem. Should we consider other place which call
> ext4_issue_zeroout? Maybe it can trigger the problem too(in theory, not
> really happened)...
> 
> How about include follow patch which not only transfer ENOSPC to EIO. But
> also stop to overwrite the error return by ext4_ext_insert_extent in
> ext4_split_extent_at.
> 
> Besides, 308c57ccf431 ("ext4: if zeroout fails fall back to splitting the
> extent node") can work together with this patch.

I've got back to this. The ext4_ext_zeroout() calls in
ext4_split_extent_at() seem to be there as fallback when insertion of a new
extent fails due to ENOSPC / EDQUOT. If even ext4_ext_zeroout(), then I
think returning an error as the code does now is correct and we don't have
much other option. Also we are really running out of disk space so I think
returning ENOSPC is fine. What exact scenario are you afraid of?

								Honza

> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
> index c0de30f25185..66767ede235f 100644
> --- a/fs/ext4/extents.c
> +++ b/fs/ext4/extents.c
> @@ -3218,16 +3218,18 @@ static int ext4_split_extent_at(handle_t *handle,
>                 goto out;
> 
>         if (EXT4_EXT_MAY_ZEROOUT & split_flag) {
> +               int ret = 0;
> +
>                 if (split_flag &
> (EXT4_EXT_DATA_VALID1|EXT4_EXT_DATA_VALID2)) {
>                         if (split_flag & EXT4_EXT_DATA_VALID1) {
> -                               err = ext4_ext_zeroout(inode, ex2);
> +                               ret = ext4_ext_zeroout(inode, ex2);
>                                 zero_ex.ee_block = ex2->ee_block;
>                                 zero_ex.ee_len = cpu_to_le16(
> 
> ext4_ext_get_actual_len(ex2));
>                                 ext4_ext_store_pblock(&zero_ex,
> 
> ext4_ext_pblock(ex2));
>                         } else {
> -                               err = ext4_ext_zeroout(inode, ex);
> +                               ret = ext4_ext_zeroout(inode, ex);
>                                 zero_ex.ee_block = ex->ee_block;
>                                 zero_ex.ee_len = cpu_to_le16(
> 
> ext4_ext_get_actual_len(ex));
> @@ -3235,7 +3237,7 @@ static int ext4_split_extent_at(handle_t *handle,
>                                                       ext4_ext_pblock(ex));
>                         }
>                 } else {
> -                       err = ext4_ext_zeroout(inode, &orig_ex);
> +                       ret = ext4_ext_zeroout(inode, &orig_ex);
>                         zero_ex.ee_block = orig_ex.ee_block;
>                         zero_ex.ee_len = cpu_to_le16(
> 
> ext4_ext_get_actual_len(&orig_ex));
> @@ -3243,7 +3245,7 @@ static int ext4_split_extent_at(handle_t *handle,
>                                               ext4_ext_pblock(&orig_ex));
>                 }
> 
> -               if (!err) {
> +               if (!ret) {
>                         /* update the extent length and mark as initialized
> */
>                         ex->ee_len = cpu_to_le16(ee_len);
>                         ext4_ext_try_to_merge(handle, inode, path, ex);
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index d18852d6029c..95b970581864 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -427,6 +427,9 @@ int ext4_issue_zeroout(struct inode *inode, ext4_lblk_t
> lblk, ext4_fsblk_t pblk,
>         if (ret > 0)
>                 ret = 0;
> 
> +       if (ret == -ENOSPC)
> +               ret = -EIO;
> +
>         return ret;
>  }
> 
> 
> 
> 在 2021/8/14 5:27, Theodore Ts'o 写道:
> > If the underlying storage device is using thin-provisioning, it's
> > possible for a zeroout operation to return ENOSPC.
> > 
> > Commit df22291ff0fd ("ext4: Retry block allocation if we have free blocks
> > left") added logic to retry block allocation since we might get free block
> > after we commit a transaction. But the ENOSPC from thin-provisioning
> > will confuse ext4, and lead to an infinite loop.
> > 
> > Since using zeroout instead of splitting the extent node is an
> > optimization, if it fails, we might as well fall back to splitting the
> > extent node.
> > 
> > Reported-by: yangerkun <yangerkun@huawei.com>
> > Signed-off-by: Theodore Ts'o <tytso@mit.edu>
> > ---
> > 
> > I've run this through my battery of tests, and it doesn't cause any
> > regressions.  Yangerkun, can you test this and see if this works for
> > you?
> > 
> >   fs/ext4/extents.c | 5 +++--
> >   1 file changed, 3 insertions(+), 2 deletions(-)
> > 
> > diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
> > index 92ad64b89d9b..501516cadc1b 100644
> > --- a/fs/ext4/extents.c
> > +++ b/fs/ext4/extents.c
> > @@ -3569,7 +3569,7 @@ static int ext4_ext_convert_to_initialized(handle_t *handle,
> >   				split_map.m_len - ee_block);
> >   			err = ext4_ext_zeroout(inode, &zero_ex1);
> >   			if (err)
> > -				goto out;
> > +				goto fallback;
> >   			split_map.m_len = allocated;
> >   		}
> >   		if (split_map.m_lblk - ee_block + split_map.m_len <
> > @@ -3583,7 +3583,7 @@ static int ext4_ext_convert_to_initialized(handle_t *handle,
> >   						      ext4_ext_pblock(ex));
> >   				err = ext4_ext_zeroout(inode, &zero_ex2);
> >   				if (err)
> > -					goto out;
> > +					goto fallback;
> >   			}
> >   			split_map.m_len += split_map.m_lblk - ee_block;
> > @@ -3592,6 +3592,7 @@ static int ext4_ext_convert_to_initialized(handle_t *handle,
> >   		}
> >   	}
> > +fallback:
> >   	err = ext4_split_extent(handle, inode, ppath, &split_map, split_flag,
> >   				flags);
> >   	if (err > 0)
> > 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] ext4: if zeroout fails fall back to splitting the extent node
  2021-11-23  9:27         ` Jan Kara
@ 2021-11-24  9:01           ` yangerkun
  2021-11-24 10:37             ` Jan Kara
  0 siblings, 1 reply; 14+ messages in thread
From: yangerkun @ 2021-11-24  9:01 UTC (permalink / raw)
  To: Jan Kara; +Cc: Theodore Ts'o, Ext4 Developers List, yukuai3



On 2021/11/23 17:27, Jan Kara wrote:
> Hello,
> 
> On Sun 26-09-21 19:35:01, yangerkun wrote:
>> Rethink about this problem. Should we consider other place which call
>> ext4_issue_zeroout? Maybe it can trigger the problem too(in theory, not
>> really happened)...
>>
>> How about include follow patch which not only transfer ENOSPC to EIO. But
>> also stop to overwrite the error return by ext4_ext_insert_extent in
>> ext4_split_extent_at.
>>
>> Besides, 308c57ccf431 ("ext4: if zeroout fails fall back to splitting the
>> extent node") can work together with this patch.
> 
> I've got back to this. The ext4_ext_zeroout() calls in
> ext4_split_extent_at() seem to be there as fallback when insertion of a new
> extent fails due to ENOSPC / EDQUOT. If even ext4_ext_zeroout(), then I
> think returning an error as the code does now is correct and we don't have
> much other option. Also we are really running out of disk space so I think
> returning ENOSPC is fine. What exact scenario are you afraid of?

I am afraid about the EDQUOT from ext4_ext_insert_extent may be 
overwrite by ext4_ext_zeroout with ENOSPC. And this may lead to dead 
loop since ext4_writepages will retry once get ENOSPC? Maybe I am wrong...

> 
> 								Honza
> 
>> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
>> index c0de30f25185..66767ede235f 100644
>> --- a/fs/ext4/extents.c
>> +++ b/fs/ext4/extents.c
>> @@ -3218,16 +3218,18 @@ static int ext4_split_extent_at(handle_t *handle,
>>                  goto out;
>>
>>          if (EXT4_EXT_MAY_ZEROOUT & split_flag) {
>> +               int ret = 0;
>> +
>>                  if (split_flag &
>> (EXT4_EXT_DATA_VALID1|EXT4_EXT_DATA_VALID2)) {
>>                          if (split_flag & EXT4_EXT_DATA_VALID1) {
>> -                               err = ext4_ext_zeroout(inode, ex2);
>> +                               ret = ext4_ext_zeroout(inode, ex2);
>>                                  zero_ex.ee_block = ex2->ee_block;
>>                                  zero_ex.ee_len = cpu_to_le16(
>>
>> ext4_ext_get_actual_len(ex2));
>>                                  ext4_ext_store_pblock(&zero_ex,
>>
>> ext4_ext_pblock(ex2));
>>                          } else {
>> -                               err = ext4_ext_zeroout(inode, ex);
>> +                               ret = ext4_ext_zeroout(inode, ex);
>>                                  zero_ex.ee_block = ex->ee_block;
>>                                  zero_ex.ee_len = cpu_to_le16(
>>
>> ext4_ext_get_actual_len(ex));
>> @@ -3235,7 +3237,7 @@ static int ext4_split_extent_at(handle_t *handle,
>>                                                        ext4_ext_pblock(ex));
>>                          }
>>                  } else {
>> -                       err = ext4_ext_zeroout(inode, &orig_ex);
>> +                       ret = ext4_ext_zeroout(inode, &orig_ex);
>>                          zero_ex.ee_block = orig_ex.ee_block;
>>                          zero_ex.ee_len = cpu_to_le16(
>>
>> ext4_ext_get_actual_len(&orig_ex));
>> @@ -3243,7 +3245,7 @@ static int ext4_split_extent_at(handle_t *handle,
>>                                                ext4_ext_pblock(&orig_ex));
>>                  }
>>
>> -               if (!err) {
>> +               if (!ret) {
>>                          /* update the extent length and mark as initialized
>> */
>>                          ex->ee_len = cpu_to_le16(ee_len);
>>                          ext4_ext_try_to_merge(handle, inode, path, ex);
>> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
>> index d18852d6029c..95b970581864 100644
>> --- a/fs/ext4/inode.c
>> +++ b/fs/ext4/inode.c
>> @@ -427,6 +427,9 @@ int ext4_issue_zeroout(struct inode *inode, ext4_lblk_t
>> lblk, ext4_fsblk_t pblk,
>>          if (ret > 0)
>>                  ret = 0;
>>
>> +       if (ret == -ENOSPC)
>> +               ret = -EIO;
>> +
>>          return ret;
>>   }
>>
>>
>>
>> 在 2021/8/14 5:27, Theodore Ts'o 写道:
>>> If the underlying storage device is using thin-provisioning, it's
>>> possible for a zeroout operation to return ENOSPC.
>>>
>>> Commit df22291ff0fd ("ext4: Retry block allocation if we have free blocks
>>> left") added logic to retry block allocation since we might get free block
>>> after we commit a transaction. But the ENOSPC from thin-provisioning
>>> will confuse ext4, and lead to an infinite loop.
>>>
>>> Since using zeroout instead of splitting the extent node is an
>>> optimization, if it fails, we might as well fall back to splitting the
>>> extent node.
>>>
>>> Reported-by: yangerkun <yangerkun@huawei.com>
>>> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
>>> ---
>>>
>>> I've run this through my battery of tests, and it doesn't cause any
>>> regressions.  Yangerkun, can you test this and see if this works for
>>> you?
>>>
>>>    fs/ext4/extents.c | 5 +++--
>>>    1 file changed, 3 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
>>> index 92ad64b89d9b..501516cadc1b 100644
>>> --- a/fs/ext4/extents.c
>>> +++ b/fs/ext4/extents.c
>>> @@ -3569,7 +3569,7 @@ static int ext4_ext_convert_to_initialized(handle_t *handle,
>>>    				split_map.m_len - ee_block);
>>>    			err = ext4_ext_zeroout(inode, &zero_ex1);
>>>    			if (err)
>>> -				goto out;
>>> +				goto fallback;
>>>    			split_map.m_len = allocated;
>>>    		}
>>>    		if (split_map.m_lblk - ee_block + split_map.m_len <
>>> @@ -3583,7 +3583,7 @@ static int ext4_ext_convert_to_initialized(handle_t *handle,
>>>    						      ext4_ext_pblock(ex));
>>>    				err = ext4_ext_zeroout(inode, &zero_ex2);
>>>    				if (err)
>>> -					goto out;
>>> +					goto fallback;
>>>    			}
>>>    			split_map.m_len += split_map.m_lblk - ee_block;
>>> @@ -3592,6 +3592,7 @@ static int ext4_ext_convert_to_initialized(handle_t *handle,
>>>    		}
>>>    	}
>>> +fallback:
>>>    	err = ext4_split_extent(handle, inode, ppath, &split_map, split_flag,
>>>    				flags);
>>>    	if (err > 0)
>>>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] ext4: if zeroout fails fall back to splitting the extent node
  2021-11-24  9:01           ` yangerkun
@ 2021-11-24 10:37             ` Jan Kara
  2021-11-24 12:11               ` yangerkun
  0 siblings, 1 reply; 14+ messages in thread
From: Jan Kara @ 2021-11-24 10:37 UTC (permalink / raw)
  To: yangerkun; +Cc: Jan Kara, Theodore Ts'o, Ext4 Developers List, yukuai3

On Wed 24-11-21 17:01:12, yangerkun wrote:
> On 2021/11/23 17:27, Jan Kara wrote:
> > Hello,
> > 
> > On Sun 26-09-21 19:35:01, yangerkun wrote:
> > > Rethink about this problem. Should we consider other place which call
> > > ext4_issue_zeroout? Maybe it can trigger the problem too(in theory, not
> > > really happened)...
> > > 
> > > How about include follow patch which not only transfer ENOSPC to EIO. But
> > > also stop to overwrite the error return by ext4_ext_insert_extent in
> > > ext4_split_extent_at.
> > > 
> > > Besides, 308c57ccf431 ("ext4: if zeroout fails fall back to splitting the
> > > extent node") can work together with this patch.
> > 
> > I've got back to this. The ext4_ext_zeroout() calls in
> > ext4_split_extent_at() seem to be there as fallback when insertion of a new
> > extent fails due to ENOSPC / EDQUOT. If even ext4_ext_zeroout(), then I
> > think returning an error as the code does now is correct and we don't have
> > much other option. Also we are really running out of disk space so I think
> > returning ENOSPC is fine. What exact scenario are you afraid of?
> 
> I am afraid about the EDQUOT from ext4_ext_insert_extent may be overwrite by
> ext4_ext_zeroout with ENOSPC. And this may lead to dead loop since
> ext4_writepages will retry once get ENOSPC? Maybe I am wrong...

OK, so passing back original error instead of the error from
ext4_ext_zeroout() makes sense. But I don't think doing much more is needed
- firstly, ENOSPC or EDQUOT should not happen in ext4_split_extent_at()
called from ext4_writepages() because we should have reserved enough
space for extent splits when writing data. So hitting that is already
unexpected. Committing transaction holding blocks that are expected to be
free is the most likely reason for us seeing ENOSPC and returning EIO in
that case would be bug. Secondly, returning EIO instead of ENOSPC is IMO a
bit confusing for upper layers and makes it harder to analyze where the
real problem is...

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] ext4: if zeroout fails fall back to splitting the extent node
  2021-11-24 10:37             ` Jan Kara
@ 2021-11-24 12:11               ` yangerkun
  2021-11-24 17:15                 ` Jan Kara
  0 siblings, 1 reply; 14+ messages in thread
From: yangerkun @ 2021-11-24 12:11 UTC (permalink / raw)
  To: Jan Kara; +Cc: Theodore Ts'o, Ext4 Developers List, yukuai3



On 2021/11/24 18:37, Jan Kara wrote:
> On Wed 24-11-21 17:01:12, yangerkun wrote:
>> On 2021/11/23 17:27, Jan Kara wrote:
>>> Hello,
>>>
>>> On Sun 26-09-21 19:35:01, yangerkun wrote:
>>>> Rethink about this problem. Should we consider other place which call
>>>> ext4_issue_zeroout? Maybe it can trigger the problem too(in theory, not
>>>> really happened)...
>>>>
>>>> How about include follow patch which not only transfer ENOSPC to EIO. But
>>>> also stop to overwrite the error return by ext4_ext_insert_extent in
>>>> ext4_split_extent_at.
>>>>
>>>> Besides, 308c57ccf431 ("ext4: if zeroout fails fall back to splitting the
>>>> extent node") can work together with this patch.
>>>
>>> I've got back to this. The ext4_ext_zeroout() calls in
>>> ext4_split_extent_at() seem to be there as fallback when insertion of a new
>>> extent fails due to ENOSPC / EDQUOT. If even ext4_ext_zeroout(), then I
>>> think returning an error as the code does now is correct and we don't have
>>> much other option. Also we are really running out of disk space so I think
>>> returning ENOSPC is fine. What exact scenario are you afraid of?
>>
>> I am afraid about the EDQUOT from ext4_ext_insert_extent may be overwrite by
>> ext4_ext_zeroout with ENOSPC. And this may lead to dead loop since
>> ext4_writepages will retry once get ENOSPC? Maybe I am wrong...
> 
> OK, so passing back original error instead of the error from
> ext4_ext_zeroout() makes sense. But I don't think doing much more is needed
> - firstly, ENOSPC or EDQUOT should not happen in ext4_split_extent_at()
> called from ext4_writepages() because we should have reserved enough
> space for extent splits when writing data. So hitting that is already

ext4_da_write_begin
   ext4_da_get_block_prep
     ext4_insert_delayed_block
       ext4_da_reserve_space

It seems we will only reserve space for data, no for metadata...


> unexpected. Committing transaction holding blocks that are expected to be
> free is the most likely reason for us seeing ENOSPC and returning EIO in
> that case would be bug. 

Agree. EIO from ext4_ext_zeroout that overwrite the ENOSPC from
ext4_ext_insert_extent seems buggy too. Maybe we should ignore the error
from ext4_ext_zeroout and return the error from ext4_ext_insert_extent
once ext4_ext_zeroout in ext4_split_extent_at got a error. Something
like this:

diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 0ecf819bf189..56cc00ee42a1 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -3185,6 +3185,7 @@ static int ext4_split_extent_at(handle_t *handle,
         struct ext4_extent *ex2 = NULL;
         unsigned int ee_len, depth;
         int err = 0;
+       int err1;

         BUG_ON((split_flag & (EXT4_EXT_DATA_VALID1 | 
EXT4_EXT_DATA_VALID2)) ==
                (EXT4_EXT_DATA_VALID1 | EXT4_EXT_DATA_VALID2));
@@ -3255,7 +3256,7 @@ static int ext4_split_extent_at(handle_t *handle,
         if (EXT4_EXT_MAY_ZEROOUT & split_flag) {
                 if (split_flag & 
(EXT4_EXT_DATA_VALID1|EXT4_EXT_DATA_VALID2)) {
                         if (split_flag & EXT4_EXT_DATA_VALID1) {
-                               err = ext4_ext_zeroout(inode, ex2);
+                               err1 = ext4_ext_zeroout(inode, ex2);
                                 zero_ex.ee_block = ex2->ee_block;
                                 zero_ex.ee_len = cpu_to_le16(
 
ext4_ext_get_actual_len(ex2));
@@ -3270,7 +3271,7 @@ static int ext4_split_extent_at(handle_t *handle,
                                                       ext4_ext_pblock(ex));
                         }
                 } else {
-                       err = ext4_ext_zeroout(inode, &orig_ex);
+                       err1 = ext4_ext_zeroout(inode, &orig_ex);
                         zero_ex.ee_block = orig_ex.ee_block;
                         zero_ex.ee_len = cpu_to_le16(
 
ext4_ext_get_actual_len(&orig_ex));
@@ -3278,7 +3279,7 @@ static int ext4_split_extent_at(handle_t *handle,
                                               ext4_ext_pblock(&orig_ex));
                 }

-               if (!err) {
+               if (!err1) {
                         /* update the extent length and mark as 
initialized */
                         ex->ee_len = cpu_to_le16(ee_len);
                         ext4_ext_try_to_merge(handle, inode, path, ex);



> Secondly, returning EIO instead of ENOSPC is IMO a
> bit confusing for upper layers and makes it harder to analyze where the
> real problem is...
> 
> 								Honza
> 

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH] ext4: if zeroout fails fall back to splitting the extent node
  2021-11-24 12:11               ` yangerkun
@ 2021-11-24 17:15                 ` Jan Kara
  0 siblings, 0 replies; 14+ messages in thread
From: Jan Kara @ 2021-11-24 17:15 UTC (permalink / raw)
  To: yangerkun; +Cc: Jan Kara, Theodore Ts'o, Ext4 Developers List, yukuai3

On Wed 24-11-21 20:11:43, yangerkun wrote:
> 
> 
> On 2021/11/24 18:37, Jan Kara wrote:
> > On Wed 24-11-21 17:01:12, yangerkun wrote:
> > > On 2021/11/23 17:27, Jan Kara wrote:
> > > > Hello,
> > > > 
> > > > On Sun 26-09-21 19:35:01, yangerkun wrote:
> > > > > Rethink about this problem. Should we consider other place which call
> > > > > ext4_issue_zeroout? Maybe it can trigger the problem too(in theory, not
> > > > > really happened)...
> > > > > 
> > > > > How about include follow patch which not only transfer ENOSPC to EIO. But
> > > > > also stop to overwrite the error return by ext4_ext_insert_extent in
> > > > > ext4_split_extent_at.
> > > > > 
> > > > > Besides, 308c57ccf431 ("ext4: if zeroout fails fall back to splitting the
> > > > > extent node") can work together with this patch.
> > > > 
> > > > I've got back to this. The ext4_ext_zeroout() calls in
> > > > ext4_split_extent_at() seem to be there as fallback when insertion of a new
> > > > extent fails due to ENOSPC / EDQUOT. If even ext4_ext_zeroout(), then I
> > > > think returning an error as the code does now is correct and we don't have
> > > > much other option. Also we are really running out of disk space so I think
> > > > returning ENOSPC is fine. What exact scenario are you afraid of?
> > > 
> > > I am afraid about the EDQUOT from ext4_ext_insert_extent may be overwrite by
> > > ext4_ext_zeroout with ENOSPC. And this may lead to dead loop since
> > > ext4_writepages will retry once get ENOSPC? Maybe I am wrong...
> > 
> > OK, so passing back original error instead of the error from
> > ext4_ext_zeroout() makes sense. But I don't think doing much more is needed
> > - firstly, ENOSPC or EDQUOT should not happen in ext4_split_extent_at()
> > called from ext4_writepages() because we should have reserved enough
> > space for extent splits when writing data. So hitting that is already
> 
> ext4_da_write_begin
>   ext4_da_get_block_prep
>     ext4_insert_delayed_block
>       ext4_da_reserve_space
> 
> It seems we will only reserve space for data, no for metadata...
> 
> 
> > unexpected. Committing transaction holding blocks that are expected to be
> > free is the most likely reason for us seeing ENOSPC and returning EIO in
> > that case would be bug.
> 
> Agree. EIO from ext4_ext_zeroout that overwrite the ENOSPC from
> ext4_ext_insert_extent seems buggy too. Maybe we should ignore the error
> from ext4_ext_zeroout and return the error from ext4_ext_insert_extent
> once ext4_ext_zeroout in ext4_split_extent_at got a error. Something
> like this:

Yep, something like that looks good to me.

								Honza

> 
> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
> index 0ecf819bf189..56cc00ee42a1 100644
> --- a/fs/ext4/extents.c
> +++ b/fs/ext4/extents.c
> @@ -3185,6 +3185,7 @@ static int ext4_split_extent_at(handle_t *handle,
>         struct ext4_extent *ex2 = NULL;
>         unsigned int ee_len, depth;
>         int err = 0;
> +       int err1;
> 
>         BUG_ON((split_flag & (EXT4_EXT_DATA_VALID1 | EXT4_EXT_DATA_VALID2))
> ==
>                (EXT4_EXT_DATA_VALID1 | EXT4_EXT_DATA_VALID2));
> @@ -3255,7 +3256,7 @@ static int ext4_split_extent_at(handle_t *handle,
>         if (EXT4_EXT_MAY_ZEROOUT & split_flag) {
>                 if (split_flag &
> (EXT4_EXT_DATA_VALID1|EXT4_EXT_DATA_VALID2)) {
>                         if (split_flag & EXT4_EXT_DATA_VALID1) {
> -                               err = ext4_ext_zeroout(inode, ex2);
> +                               err1 = ext4_ext_zeroout(inode, ex2);
>                                 zero_ex.ee_block = ex2->ee_block;
>                                 zero_ex.ee_len = cpu_to_le16(
> 
> ext4_ext_get_actual_len(ex2));
> @@ -3270,7 +3271,7 @@ static int ext4_split_extent_at(handle_t *handle,
>                                                       ext4_ext_pblock(ex));
>                         }
>                 } else {
> -                       err = ext4_ext_zeroout(inode, &orig_ex);
> +                       err1 = ext4_ext_zeroout(inode, &orig_ex);
>                         zero_ex.ee_block = orig_ex.ee_block;
>                         zero_ex.ee_len = cpu_to_le16(
> 
> ext4_ext_get_actual_len(&orig_ex));
> @@ -3278,7 +3279,7 @@ static int ext4_split_extent_at(handle_t *handle,
>                                               ext4_ext_pblock(&orig_ex));
>                 }
> 
> -               if (!err) {
> +               if (!err1) {
>                         /* update the extent length and mark as initialized
> */
>                         ex->ee_len = cpu_to_le16(ee_len);
>                         ext4_ext_try_to_merge(handle, inode, path, ex);
> 
> 
> 
> > Secondly, returning EIO instead of ENOSPC is IMO a
> > bit confusing for upper layers and makes it harder to analyze where the
> > real problem is...
> > 
> > 								Honza
> > 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2021-11-24 17:15 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-04 12:50 [PATCH] ext4: stop return ENOSPC from ext4_issue_zeroout yangerkun
2021-08-04 13:35 ` Jan Kara
2021-08-13 15:18   ` Theodore Ts'o
2021-08-13 21:27     ` [PATCH] ext4: if zeroout fails fall back to splitting the extent node Theodore Ts'o
2021-08-14  2:15       ` yangerkun
2021-08-16  6:57         ` yangerkun
2021-09-26 11:35       ` yangerkun
2021-11-23  9:27         ` Jan Kara
2021-11-24  9:01           ` yangerkun
2021-11-24 10:37             ` Jan Kara
2021-11-24 12:11               ` yangerkun
2021-11-24 17:15                 ` Jan Kara
2021-08-16 10:05     ` [PATCH] ext4: stop return ENOSPC from ext4_issue_zeroout Jan Kara
2021-08-16 14:16       ` Theodore Ts'o

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.