linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: [PATCH, RFC] Ext4: Mount partition as read only if during orphan cleanup truncate fails to obtain journal handle.
       [not found] <1351075240-2725-1-git-send-email-ashish.sangwan2@gmail.com>
@ 2012-10-26 11:41 ` Namjae Jeon
  2012-12-06 11:29   ` Ashish Sangwan
  0 siblings, 1 reply; 4+ messages in thread
From: Namjae Jeon @ 2012-10-26 11:41 UTC (permalink / raw)
  To: Ashish Sangwan
  Cc: tytso, linux-ext4, Ashish Sangwan, Lukáš Czerner,
	Eric Sandeen, linux-kernel

Add Cc in mail loop.

Thanks.

2012/10/24, Ashish Sangwan <ashishsangwan2@gmail.com>:
> During orphan cleanup while doing truncate, if we fail to obtain journal
> handle, the inode for which truncate was called would not be removed from
> both the on-disk and in-memory orphan lists as the call to ext4_orphan_del
> would not be executed.
>
> This would have following consequences:
> a) As the inode is not removed from the on-disk list, truncate would be
> called again for the same inode. Each call would add the inode to the
> in-memory list. This operation would continue endlessly or until truncate
> is succeed.
>
> b) If somehow, after some iterations, truncate is succeed, ext4_orphan_del
> will only remove the inode from in-memory list just 1 time. This will
> trigger
> j_assert during put super.
>
> This patch handles both the problems. If truncate fails, first in-memory
> list is cleared and than the partition is mounted as read only.
> Failure to obtain journal handle during mount may suggest that journal
> device is corrupted.
>
> Signed-off-by: Ashish Sangwan <ashish.sangwan2@gmail.com>
> Signed-off-by: Namjae Jeon <linkinjeon@gmail.com>
> ---
>  fs/ext4/super.c | 16 +++++++++++++++-
>  1 file changed, 15 insertions(+), 1 deletion(-)
>
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index 4e2aacb..859eccb 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -2201,7 +2201,21 @@ static void ext4_orphan_cleanup(struct super_block
> *sb,
>  			jbd_debug(2, "truncating inode %lu to %lld bytes\n",
>  				  inode->i_ino, inode->i_size);
>  			ext4_truncate(inode);
> -			nr_truncates++;
> +			if (list_empty(&EXT4_I(inode)->i_orphan)) {
> +				nr_truncates++;
> +			} else {
> +				/* Remove inode from in-memory orphan list */
> +				list_del_init(&EXT4_I(inode)->i_orphan);
> +				ext4_msg(sb, KERN_ERR, "Truncate failed for "
> +					 "orphan  inode = %lu. Running e2fsck"
> +					 " is recommended", inode->i_ino);
> +				if (!(s_flags & MS_RDONLY)) {
> +					ext4_msg(sb, KERN_INFO, "FS would be"
> +						" mounted as readonly");
> +					s_flags |= MS_RDONLY;
> +				}
> +				break;
> +			}
>  		} else {
>  			ext4_msg(sb, KERN_DEBUG,
>  				"%s: deleting unreferenced inode %lu",
> --
> 1.7.11.4
>
>

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH, RFC] Ext4: Mount partition as read only if during orphan cleanup truncate fails to obtain journal handle.
  2012-10-26 11:41 ` [PATCH, RFC] Ext4: Mount partition as read only if during orphan cleanup truncate fails to obtain journal handle Namjae Jeon
@ 2012-12-06 11:29   ` Ashish Sangwan
  2012-12-06 17:09     ` Theodore Ts'o
  0 siblings, 1 reply; 4+ messages in thread
From: Ashish Sangwan @ 2012-12-06 11:29 UTC (permalink / raw)
  To: Namjae Jeon
  Cc: Theodore Ts'o, ext4 development, Ashish Sangwan,
	Lukáš Czerner, Eric Sandeen, linux-kernel

Hi Ted,

Did you get any time to look into this patch?
This problem is with ext4 only as ext4_truncate does not clean the
orphan list unlike that of ext3_truncate.
Instead, in case of failure to obtain handle, orphan list cleanup is
done in ext4_setattr.
But during mount, ext4_truncate is not called via ext4_setattr and
hence the problem.
What do you think?

On Fri, Oct 26, 2012 at 5:11 PM, Namjae Jeon <linkinjeon@gmail.com> wrote:
> Add Cc in mail loop.
>
> Thanks.
>
> 2012/10/24, Ashish Sangwan <ashishsangwan2@gmail.com>:
>> During orphan cleanup while doing truncate, if we fail to obtain journal
>> handle, the inode for which truncate was called would not be removed from
>> both the on-disk and in-memory orphan lists as the call to ext4_orphan_del
>> would not be executed.
>>
>> This would have following consequences:
>> a) As the inode is not removed from the on-disk list, truncate would be
>> called again for the same inode. Each call would add the inode to the
>> in-memory list. This operation would continue endlessly or until truncate
>> is succeed.
>>
>> b) If somehow, after some iterations, truncate is succeed, ext4_orphan_del
>> will only remove the inode from in-memory list just 1 time. This will
>> trigger
>> j_assert during put super.
>>
>> This patch handles both the problems. If truncate fails, first in-memory
>> list is cleared and than the partition is mounted as read only.
>> Failure to obtain journal handle during mount may suggest that journal
>> device is corrupted.
>>
>> Signed-off-by: Ashish Sangwan <ashish.sangwan2@gmail.com>
>> Signed-off-by: Namjae Jeon <linkinjeon@gmail.com>
>> ---
>>  fs/ext4/super.c | 16 +++++++++++++++-
>>  1 file changed, 15 insertions(+), 1 deletion(-)
>>
>> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
>> index 4e2aacb..859eccb 100644
>> --- a/fs/ext4/super.c
>> +++ b/fs/ext4/super.c
>> @@ -2201,7 +2201,21 @@ static void ext4_orphan_cleanup(struct super_block
>> *sb,
>>                       jbd_debug(2, "truncating inode %lu to %lld bytes\n",
>>                                 inode->i_ino, inode->i_size);
>>                       ext4_truncate(inode);
>> -                     nr_truncates++;
>> +                     if (list_empty(&EXT4_I(inode)->i_orphan)) {
>> +                             nr_truncates++;
>> +                     } else {
>> +                             /* Remove inode from in-memory orphan list */
>> +                             list_del_init(&EXT4_I(inode)->i_orphan);
>> +                             ext4_msg(sb, KERN_ERR, "Truncate failed for "
>> +                                      "orphan  inode = %lu. Running e2fsck"
>> +                                      " is recommended", inode->i_ino);
>> +                             if (!(s_flags & MS_RDONLY)) {
>> +                                     ext4_msg(sb, KERN_INFO, "FS would be"
>> +                                             " mounted as readonly");
>> +                                     s_flags |= MS_RDONLY;
>> +                             }
>> +                             break;
>> +                     }
>>               } else {
>>                       ext4_msg(sb, KERN_DEBUG,
>>                               "%s: deleting unreferenced inode %lu",
>> --
>> 1.7.11.4
>>
>>

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH, RFC] Ext4: Mount partition as read only if during orphan cleanup truncate fails to obtain journal handle.
  2012-12-06 11:29   ` Ashish Sangwan
@ 2012-12-06 17:09     ` Theodore Ts'o
  2012-12-07  9:22       ` Ashish Sangwan
  0 siblings, 1 reply; 4+ messages in thread
From: Theodore Ts'o @ 2012-12-06 17:09 UTC (permalink / raw)
  To: Ashish Sangwan
  Cc: Namjae Jeon, ext4 development, Ashish Sangwan,
	Lukáš Czerner, Eric Sandeen, linux-kernel

On Thu, Dec 06, 2012 at 04:59:43PM +0530, Ashish Sangwan wrote:
> 
> Did you get any time to look into this patch?
> This problem is with ext4 only as ext4_truncate does not clean the
> orphan list unlike that of ext3_truncate.
> Instead, in case of failure to obtain handle, orphan list cleanup is
> done in ext4_setattr.
> But during mount, ext4_truncate is not called via ext4_setattr and
> hence the problem.
> What do you think?

In the patch description, you mentioned that this occurs when the
there is a failure to obtain a journal handle.  Is this a hypothetical
thing that you exposed via some kind of tester which checks to see
what happens if kmalloc() randomly fails some number of allocation
requests?  Or was it happening in real life?  And if it is happening
in real life, do we understand why it's happening, and is there
something we should be doing to mitigate against the root cause of the
failure?

The alternative to your patch is to do something similar to what ext3
does.  That is, if there are any inodes left on the orphan list, to
iterate through them all and then clean up the orphan list.  Perhaps
we should then also call ext4_error() since technically the file
system may very well be inconsistent (there may be allocated inodes
holding blocks which are no longer connected the directory hierarchy,
which e2fsck would be able to clean up).  But that could potentially
cause the system to panic or remount the file system read-only,
depending on what the errors= behavior is set to.  Which is why I go
back to the original question; do we understand why ext4_truncate()
was failing during orphan cleanup in the first place?

    	    	   	  	     	       - Ted

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH, RFC] Ext4: Mount partition as read only if during orphan cleanup truncate fails to obtain journal handle.
  2012-12-06 17:09     ` Theodore Ts'o
@ 2012-12-07  9:22       ` Ashish Sangwan
  0 siblings, 0 replies; 4+ messages in thread
From: Ashish Sangwan @ 2012-12-07  9:22 UTC (permalink / raw)
  To: Theodore Ts'o, Namjae Jeon, ext4 development,
	Lukáš Czerner, Eric Sandeen, linux-kernel

On Thu, Dec 6, 2012 at 10:39 PM, Theodore Ts'o <tytso@mit.edu> wrote:
> On Thu, Dec 06, 2012 at 04:59:43PM +0530, Ashish Sangwan wrote:
>>
>> Did you get any time to look into this patch?
>> This problem is with ext4 only as ext4_truncate does not clean the
>> orphan list unlike that of ext3_truncate.
>> Instead, in case of failure to obtain handle, orphan list cleanup is
>> done in ext4_setattr.
>> But during mount, ext4_truncate is not called via ext4_setattr and
>> hence the problem.
>> What do you think?
>
> In the patch description, you mentioned that this occurs when the
> there is a failure to obtain a journal handle.  Is this a hypothetical
> thing that you exposed via some kind of tester which checks to see
> what happens if kmalloc() randomly fails some number of allocation
> requests?  Or was it happening in real life?  And if it is happening
> in real life, do we understand why it's happening, and is there
> something we should be doing to mitigate against the root cause of the
> failure?
>
Thanks for looking into the patch.
Yes, you are right. The situation is hypothetical.
We were checking about Ext4's behavior when it returns error at
different points in truncate/hole punch path and that's when we
stumbled on to this.
In real, we cannot re-produce this scenario without modifying Ext4's
code and hence an extra RFC added with the patch.
If you think that obtaining journal handle could never fail at mount
time, than please ignore this patch.

> The alternative to your patch is to do something similar to what ext3
> does.  That is, if there are any inodes left on the orphan list, to
> iterate through them all and then clean up the orphan list.  Perhaps
> we should then also call ext4_error() since technically the file
> system may very well be inconsistent (there may be allocated inodes
> holding blocks which are no longer connected the directory hierarchy,
> which e2fsck would be able to clean up).  But that could potentially
> cause the system to panic or remount the file system read-only,
> depending on what the errors= behavior is set to.  Which is why I go
> back to the original question; do we understand why ext4_truncate()
> was failing during orphan cleanup in the first place?
>
>                                                - Ted

Right, using ext4_error() would have been a better choice than
explicitly setting RO flag.
I wonder why ext4_truncate does not clear the orphan_list in case of
failure to obtain journal handle unlike that of ext3_truncate?
We would not have to bother about explicitly clearing the orphan list
in the first place if we could just add a lable like out_notrans (as
in ext3)=>
 if (IS_ERR(handle))
                goto out_notrans;

 out_notrans:
        /*
         * Delete the inode from orphan list so that it doesn't stay there
         * forever and trigger assertion on umount.
         */
        if (inode->i_nlink)
                ext3_orphan_del(NULL, inode);

Regards,
Ashish

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2012-12-07  9:22 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <1351075240-2725-1-git-send-email-ashish.sangwan2@gmail.com>
2012-10-26 11:41 ` [PATCH, RFC] Ext4: Mount partition as read only if during orphan cleanup truncate fails to obtain journal handle Namjae Jeon
2012-12-06 11:29   ` Ashish Sangwan
2012-12-06 17:09     ` Theodore Ts'o
2012-12-07  9:22       ` Ashish Sangwan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).