* [Ocfs2-devel] [RFC] Online File(system) check
@ 2015-04-27 21:32 Goldwyn Rodrigues
  2015-04-28  3:00 ` Gang He
  2015-04-29  7:59 ` Junxiao Bi
  0 siblings, 2 replies; 12+ messages in thread
From: Goldwyn Rodrigues @ 2015-04-27 21:32 UTC (permalink / raw)
  To: ocfs2-devel

By popular demand, here is an RFC. If you think there is a better
way to communicate with the kernel module for the check, please
let me know.


Intro
-----
OCFS2 is often used in high-availability systems. However, ocfs2
converts the filesystem to read-only at the drop of a hat. This
may not be necessary, since turning the filesystem read-only would
affect other running processes as well, decreasing availability.

This proposal adds errors=continue, which would return EIO
to the calling process and terminate further processing so that
the filesystem is not corrupted further. However, the filesystem
is not converted to read-only.
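
A minimal user-space sketch of the difference between the two behaviours,
under stated assumptions: the names errors_policy, toy_fs and
toy_fs_handle_error are hypothetical and are not ocfs2 identifiers. It only
models the rule described above, namely that the failing operation gets EIO
in both cases but the filesystem stays writable under errors=continue.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical policy names for this toy model; not ocfs2 identifiers. */
enum errors_policy { ERRORS_REMOUNT_RO, ERRORS_CONTINUE };

struct toy_fs {
    enum errors_policy policy;
    int read_only;   /* non-zero once the filesystem has gone read-only */
};

/*
 * Called when an operation trips over corrupted metadata. The operation is
 * always aborted with -EIO; only the remount-ro policy also flips the whole
 * filesystem to read-only.
 */
static int toy_fs_handle_error(struct toy_fs *fs)
{
    assert(fs != NULL);

    if (fs->policy == ERRORS_REMOUNT_RO)
        fs->read_only = 1;

    return -EIO;
}
```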

Scope
-----
This effort is to fix small issues which may hinder day-to-day operations
of a cluster filesystem by turning the filesystem read-only. The scope of
fixing is at the file level, initially for regular files and eventually
for all files (including system files) of the filesystem.

If a directory-to-file link is incorrect, the directory inode
is reported as erroneous.

This feature is not suited for extensive checks which depend on other
components of the filesystem, such as, but not limited to, checking whether
the bits for file blocks in the allocation bitmap have been set. In case of
such an error, the offline fsck would be recommended.

Finally, such an operation/feature should not be automated, lest the
filesystem end up with more damage than before the repair attempt. So, this
has to be performed with user interaction and consent.


Communication
-------------
When there are errors in the ocfs2 filesystem, they are usually accompanied
by the inode number which caused the error. This inode number would be the
input to fixing the file.

One of these options could be considered:

A file in the sysfs filesystem which would accept inode numbers. This
could be used to communicate back what has to be fixed or is fixed.
You could write:
  # echo "CHECK <inode>" > /sys/fs/ocfs2/filecheck
  or
  # echo "FIX <inode>" > /sys/fs/ocfs2/filecheck
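
A minimal sketch of how the two commands above could be parsed, written as
plain user-space C so it can be tried outside the kernel. The names
filecheck_op, filecheck_request and filecheck_parse are hypothetical; the RFC
only specifies the CHECK and FIX keywords followed by an inode number.
Rejecting malformed input with EINVAL is an assumption of this sketch.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical request types; the RFC only names the CHECK and FIX keywords. */
enum filecheck_op { FILECHECK_OP_CHECK, FILECHECK_OP_FIX };

struct filecheck_request {
    enum filecheck_op op;
    unsigned long long ino;
};

/*
 * Parse "CHECK <inode>" or "FIX <inode>". One trailing newline, as written
 * by echo, is accepted. Returns 0 on success or -EINVAL on malformed input;
 * *req is only written on success.
 */
static int filecheck_parse(const char *buf, struct filecheck_request *req)
{
    enum filecheck_op op;
    const char *p;
    char *end;
    unsigned long long ino;

    assert(buf != NULL && req != NULL);

    if (strncmp(buf, "CHECK ", 6) == 0) {
        op = FILECHECK_OP_CHECK;
        p = buf + 6;
    } else if (strncmp(buf, "FIX ", 4) == 0) {
        op = FILECHECK_OP_FIX;
        p = buf + 4;
    } else {
        return -EINVAL;
    }

    /* strtoull() would accept a sign or leading spaces; reject them here. */
    if (!isdigit((unsigned char)*p))
        return -EINVAL;

    errno = 0;
    ino = strtoull(p, &end, 10);
    if (errno == ERANGE)
        return -EINVAL;

    if (*end == '\n')
        end++;
    if (*end != '\0')
        return -EINVAL;

    req->op = op;
    req->ino = ino;
    return 0;
}
```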


Fixing stuff
------------

On receiving the inode number, the filesystem would read the inode and the
file metadata. In case of errors, the filesystem would fix the errors
and report the problems it fixed. As a precautionary measure, the
inode must first be checked for errors before performing a final fix.

The inode and the fix history will be maintained temporarily in a
small linked list buffer which would contain the last (N) inodes
fixed/checked, along with the logs of what errors were reported/fixed.
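
A rough sketch of such a bounded history, assuming a fixed-size ring array
in place of the linked list to keep the example short. All names here
(FILECHECK_HISTORY_MAX, filecheck_entry, filecheck_history and the two
helpers) are hypothetical, N is arbitrarily set to 4, and a single integer
error code stands in for the per-inode log.

```c
#include <assert.h>
#include <stddef.h>

#define FILECHECK_HISTORY_MAX 4   /* hypothetical N; the RFC leaves N open */

struct filecheck_entry {
    unsigned long long ino;
    int err;                      /* stand-in for the log: 0 means clean */
};

struct filecheck_history {
    struct filecheck_entry entries[FILECHECK_HISTORY_MAX];
    size_t head;                  /* index of the oldest entry */
    size_t count;                 /* number of valid entries */
};

/* Return the entry for ino, or NULL if it has dropped out of the history. */
static struct filecheck_entry *
filecheck_history_find(struct filecheck_history *h, unsigned long long ino)
{
    for (size_t i = 0; i < h->count; i++) {
        struct filecheck_entry *e =
            &h->entries[(h->head + i) % FILECHECK_HISTORY_MAX];
        if (e->ino == ino)
            return e;
    }
    return NULL;
}

/* Record a result, overwriting the oldest entry once the buffer is full. */
static void filecheck_history_record(struct filecheck_history *h,
                                     unsigned long long ino, int err)
{
    struct filecheck_entry *e = filecheck_history_find(h, ino);

    assert(h->count <= FILECHECK_HISTORY_MAX);

    if (e == NULL) {
        size_t slot;

        if (h->count < FILECHECK_HISTORY_MAX) {
            slot = (h->head + h->count) % FILECHECK_HISTORY_MAX;
            h->count++;
        } else {
            slot = h->head;       /* evict the oldest entry */
            h->head = (h->head + 1) % FILECHECK_HISTORY_MAX;
        }
        e = &h->entries[slot];
        e->ino = ino;
    }
    e->err = err;
}
```

Re-recording an inode that is already present updates it in place without
refreshing its age; whether that is the desired eviction order is left open.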


Comments/Criticism welcome.


* [Ocfs2-devel] [RFC] Online File(system) check
  2015-04-27 21:32 [Ocfs2-devel] [RFC] Online File(system) check Goldwyn Rodrigues
@ 2015-04-28  3:00 ` Gang He
  2015-04-28 12:21   ` Goldwyn Rodrigues
  2015-04-29  7:59 ` Junxiao Bi
  1 sibling, 1 reply; 12+ messages in thread
From: Gang He @ 2015-04-28  3:00 UTC (permalink / raw)
  To: ocfs2-devel

Hi Goldwyn,

Very nice proposal.
Here are some comments from me so far.
1) Which tasks will we perform when checking/fixing a file? We need to define the detailed requirements further. Since we only do a light-weight file check/fix by inode number, we need to know which items can be handled by the online check and which must be left to the offline fsck.
2) Can we keep check and fix as two separate options? The check option would only report whether a file is good or bad without modifying anything; the fix option would check the file and fix it if it is corrupted.
3) When a user executes "echo CHECK <inode> > /sys/fs/ocfs2/filecheck" to check a file, how do we give feedback besides printing messages to syslog?
4) We should support a list that accepts check/fix requests from user space, queues them, and handles them one by one, right? What is the behavior for the user who executes "echo check ..." from user space? After the user posts a request to kernel space, does the command return immediately or wait for the file check to finish?

Thanks
Gang




* [Ocfs2-devel] [RFC] Online File(system) check
  2015-04-28  3:00 ` Gang He
@ 2015-04-28 12:21   ` Goldwyn Rodrigues
  2015-04-28 13:20     ` Joseph Qi
  0 siblings, 1 reply; 12+ messages in thread
From: Goldwyn Rodrigues @ 2015-04-28 12:21 UTC (permalink / raw)
  To: ocfs2-devel

Hi Gang,

On 04/27/2015 10:00 PM, Gang He wrote:
> Hi Glodwyn,
>
> Very nice proposal.
> So far, there are some comments from me.
> 1) which task will we do in check/fix a file, we need to define the detailed requirements further, since we just do a light-level file check/fix according to inode number, we need to know which items can be done by online check, which items can be done by offline fsck.

For the first phase (regular files), the checks cover all the reasons the 
on-disk validate functions would fail, for example 
ocfs2_validate_inode_block, ocfs2_validate_extent_block, etc.
As we take up system inodes (phase 2), we will add more functionality.

> 2) can we keep check and fix two option, check option is to check if a file is good or bad, but not modify anything, fix option is to check and fix a file if the file is corrupted.

Yes, there are two options: CHECK only checks, whereas FIX fixes the 
errors. As a precautionary measure, a CHECK command should be provided 
before a FIX is issued. IOW, a file should be checked for errors before 
actually fixing it.

> 3) when users execute the command "echo CHECK <inode> > /sys/fs/ocfs2/filecheck" to check a file, how to give the feedback information besides printing the messages to syslog?

The output should be available when you cat /sys/fs/ocfs2/filecheck. It 
would provide the results of the last (N) files checked. I don't want to 
flood the kernel log with this. Thanks for bringing this up, I will put it 
in the doc. Something like:

Inode  Status    Description
1234   ERROR     Metadata incorrect
2352   FIXED     Valid flag not set
9382   CHECKING  -
8926   GOOD      -
7230   CANT-FIX  Please execute fsck.ocfs2 after taking filesystem offline.

So, for the current scenario, only 1234 can be fixed. The write should fail 
with EINVAL if any other inode number is provided with FIX.
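
A small sketch of that rule, assuming the status column is kept per inode
as in the sample output. The enum values, struct filecheck_result,
filecheck_sample and filecheck_may_fix are hypothetical names; treating an
inode that was never checked the same as one that is not in ERROR state is
an assumption of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical status codes mirroring the Status column of the sample. */
enum filecheck_status {
    FC_ERROR, FC_FIXED, FC_CHECKING, FC_GOOD, FC_CANT_FIX
};

struct filecheck_result {
    unsigned long long ino;
    enum filecheck_status status;
};

/* The sample table from the message, encoded as data. */
static const struct filecheck_result filecheck_sample[] = {
    { 1234, FC_ERROR },
    { 2352, FC_FIXED },
    { 9382, FC_CHECKING },
    { 8926, FC_GOOD },
    { 7230, FC_CANT_FIX },
};

/*
 * Return 0 if a FIX request for ino may proceed, -EINVAL otherwise.
 * Only an inode whose last CHECK reported ERROR qualifies.
 */
static int filecheck_may_fix(const struct filecheck_result *tbl, size_t n,
                             unsigned long long ino)
{
    assert(tbl != NULL);

    for (size_t i = 0; i < n; i++) {
        if (tbl[i].ino == ino)
            return tbl[i].status == FC_ERROR ? 0 : -EINVAL;
    }
    return -EINVAL;   /* never checked, so FIX is refused */
}
```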


> 4) we should support a list to accept the "check/fix" requests from user-space and queue them, then handle them one by one, right? what is the behavior for the request user which execute "echo check ..." from the user space? the user post a request to the kernel space, then the command will end or wait for the file check end?
>

I would not suggest that, at least for now. This is to improve 
availability; if the filesystem is very bad, we should suggest 
an offline check. The user can, however, provide multiple CHECK requests.

-- 
Goldwyn


* [Ocfs2-devel] [RFC] Online File(system) check
  2015-04-28 12:21   ` Goldwyn Rodrigues
@ 2015-04-28 13:20     ` Joseph Qi
  2015-04-29  2:37       ` Gang He
  2015-05-02 13:08       ` Goldwyn Rodrigues
  0 siblings, 2 replies; 12+ messages in thread
From: Joseph Qi @ 2015-04-28 13:20 UTC (permalink / raw)
  To: ocfs2-devel

Hi Goldwyn,

Thanks for the good proposal.

On 2015/4/28 20:21, Goldwyn Rodrigues wrote:
> Hi Gang,
> 
> On 04/27/2015 10:00 PM, Gang He wrote:
>> Hi Glodwyn,
>>
>> Very nice proposal.
>> So far, there are some comments from me.
>> 1) which task will we do in check/fix a file, we need to define the detailed requirements further, since we just do a light-level file check/fix according to inode number, we need to know which items can be done by online check, which items can be done by offline fsck.
> 
> For the first phase (regular files), these are all the reasons the disk validate function would fail. Some examples are ocfs2_validate_inode_block, ocfs2_validate_extent_block etc.
> As we take up system inodes (phase 2), we will add more functionality.
> 
Can we classify all corruption cases and their corresponding fixes? Maybe we can get some hints from fsck.
And I don't think errors=continue fits all cases.
In some cases we shouldn't let it continue with errors, to prevent more damage.

>> 2) can we keep check and fix two option, check option is to check if a file is good or bad, but not modify anything, fix option is to check and fix a file if the file is corrupted.
> 
> Yes, there are two options, CHECKS only checks wheras FIX fixes the errors. As a precautionary measure, a CHECK command should be provided before a FIX is issued. IOW, a file should be checked for errors before actually fixing it.
> 
A convenient way to know which are to be checked should also be taken into consideration.

>> 3) when users execute the command "echo CHECK <inode> > /sys/fs/ocfs2/filecheck" to check a file, how to give the feedback information besides printing the messages to syslog?
> 
> The output should be when you cat /sys/fs/ocfs2/filecheck. It would provide the results of the last (N) files checked. I don't want to flood the kernel log with this. Thanks for bringing this up, I will put it on the doc. Something like:
> 
> Inode Status Description
> 1234   ERROR Metadata incorrect
> 2352   FIXED Valid flag not set
> 9382   CHECKING -
> 8926   GOOD -
> 7230   CANT-FIX Please execute fsck.ocfs2 after taking filesystem offline.
> 
> So, for the current scenario, only 1234 can be fixed. An echo should err with EINVAL if any other inode number is provided with FIX.
> 
> 
>> 4) we should support a list to accept the "check/fix" requests from user-space and queue them, then handle them one by one, right? what is the behavior for the request user which execute "echo check ..." from the user space? the user post a request to the kernel space, then the command will end or wait for the file check end?
>>
> 
> I would not suggest that, atleast for now. This is to improve availability. However, if the filesystem is very bad, we should suggest an offline check. However, the user can provide multiple CHECK requests.
> 


* [Ocfs2-devel] [RFC] Online File(system) check
  2015-04-28 13:20     ` Joseph Qi
@ 2015-04-29  2:37       ` Gang He
  2015-04-30  2:29         ` Joseph Qi
  2015-05-02 12:52         ` Goldwyn Rodrigues
  2015-05-02 13:08       ` Goldwyn Rodrigues
  1 sibling, 2 replies; 12+ messages in thread
From: Gang He @ 2015-04-29  2:37 UTC (permalink / raw)
  To: ocfs2-devel

Hi Joseph,

Thanks for your detailed description.
See my question inline.


>>> 
> Hi Goldwyn,
> 
> Thanks for the good proposal.
> 
> On 2015/4/28 20:21, Goldwyn Rodrigues wrote:
>> Hi Gang,
>> 
>> On 04/27/2015 10:00 PM, Gang He wrote:
>>> Hi Glodwyn,
>>>
>>> Very nice proposal.
>>> So far, there are some comments from me.
>>> 1) which task will we do in check/fix a file, we need to define the detailed 
> requirements further, since we just do a light-level file check/fix according 
> to inode number, we need to know which items can be done by online check, 
> which items can be done by offline fsck.
>> 
>> For the first phase (regular files), these are all the reasons the disk 
> validate function would fail. Some examples are ocfs2_validate_inode_block, 
> ocfs2_validate_extent_block etc.
>> As we take up system inodes (phase 2), we will add more functionality.
>> 
> Can we classify all corrupted cases and their corresponding fix ways? Maybe 
> we can get some hints from fsck.
> And I don't think errors=continue can fit for all cases.
> For some cases we shouldn't let it continue with errors to prevent more 
> damages.
> 
>>> 2) can we keep check and fix two option, check option is to check if a file 
> is good or bad, but not modify anything, fix option is to check and fix a 
> file if the file is corrupted.
>> 
>> Yes, there are two options, CHECKS only checks wheras FIX fixes the errors. 
> As a precautionary measure, a CHECK command should be provided before a FIX 
> is issued. IOW, a file should be checked for errors before actually fixing 
> it.
>> 
> A convenient way to know which to be checked should also be taken into 
> consideration.
> 
>>> 3) when users execute the command "echo CHECK <inode> > 
> /sys/fs/ocfs2/filecheck" to check a file, how to give the feedback 
> information besides printing the messages to syslog?
>> 
>> The output should be when you cat /sys/fs/ocfs2/filecheck. It would provide 
> the results of the last (N) files checked. I don't want to flood the kernel 
> log with this. Thanks for bringing this up, I will put it on the doc. 
> Something like:
>> 
>> Inode Status Description
>> 1234   ERROR Metadata incorrect
>> 2352   FIXED Valid flag not set
>> 9382   CHECKING -
>> 8926   GOOD -
>> 7230   CANT-FIX Please execute fsck.ocfs2 after taking filesystem offline.
>> 
>> So, for the current scenario, only 1234 can be fixed. An echo should err 
> with EINVAL if any other inode number is provided with FIX.
>> 
>> 
>>> 4) we should support a list to accept the "check/fix" requests from 
> user-space and queue them, then handle them one by one, right? what is the 
> behavior for the request user which execute "echo check ..." from the user 
> space? the user post a request to the kernel space, then the command will end 
> or wait for the file check end?
>>>
>> 
>> I would not suggest that, atleast for now. This is to improve availability. 
> However, if the filesystem is very bad, we should suggest an offline check. 
> However, the user can provide multiple CHECK requests.
My question is whether users can execute "echo check > .." to check/fix files simultaneously, since users can trigger this command from different terminals.
Second, when users send a command to kernel space, the kernel has to cache these commands in a list/array, since it cannot finish a check request immediately. Otherwise, how does the kernel accept a new request while it is handling the current one?

Thanks
Gang

>> 


* [Ocfs2-devel] [RFC] Online File(system) check
  2015-04-27 21:32 [Ocfs2-devel] [RFC] Online File(system) check Goldwyn Rodrigues
  2015-04-28  3:00 ` Gang He
@ 2015-04-29  7:59 ` Junxiao Bi
  2015-05-02 12:45   ` Goldwyn Rodrigues
  1 sibling, 1 reply; 12+ messages in thread
From: Junxiao Bi @ 2015-04-29  7:59 UTC (permalink / raw)
  To: ocfs2-devel

On 04/28/2015 05:32 AM, Goldwyn Rodrigues wrote:
> On popular demand, here is an RFC. If you think there is a better
> way to communicate with the kernel module for the check, please
> let me know.
> 
> 
> Intro
> -----
> OCFS2 is often used in high-availaibility systems. However, ocfs2
> converts the filesystem to read-only at the drop of the hat. This
> may not be necessary, since turning the filesystem read-only would
> affect other running processes as well, decreasing availability.
> 
> This attempt is to add errors=continue, which would return the EIO
> to the calling process and terminate furhter processing so that
> the filesystem is not corrupted further. However, the filesystem
> is not converted to read-only.
Is this safe? If an error is detected when accessing an inode, how do you
know it is only an inode-internal error? If there is corruption in other
places, the fs will be corrupted further.

Thanks,
Junxiao.



* [Ocfs2-devel] [RFC] Online File(system) check
  2015-04-29  2:37       ` Gang He
@ 2015-04-30  2:29         ` Joseph Qi
  2015-05-02 12:52         ` Goldwyn Rodrigues
  1 sibling, 0 replies; 12+ messages in thread
From: Joseph Qi @ 2015-04-30  2:29 UTC (permalink / raw)
  To: ocfs2-devel

On 2015/4/29 10:37, Gang He wrote:
> Hi Joseph,
> 
> Thanks for your detailed description.
> See my question inline.
> 
> 
>>>>
>> Hi Goldwyn,
>>
>> Thanks for the good proposal.
>>
>> On 2015/4/28 20:21, Goldwyn Rodrigues wrote:
>>> Hi Gang,
>>>
>>> On 04/27/2015 10:00 PM, Gang He wrote:
>>>> Hi Glodwyn,
>>>>
>>>> Very nice proposal.
>>>> So far, there are some comments from me.
>>>> 1) which task will we do in check/fix a file, we need to define the detailed 
>> requirements further, since we just do a light-level file check/fix according 
>> to inode number, we need to know which items can be done by online check, 
>> which items can be done by offline fsck.
>>>
>>> For the first phase (regular files), these are all the reasons the disk 
>> validate function would fail. Some examples are ocfs2_validate_inode_block, 
>> ocfs2_validate_extent_block etc.
>>> As we take up system inodes (phase 2), we will add more functionality.
>>>
>> Can we classify all corrupted cases and their corresponding fix ways? Maybe 
>> we can get some hints from fsck.
>> And I don't think errors=continue can fit for all cases.
>> For some cases we shouldn't let it continue with errors to prevent more 
>> damages.
>>
>>>> 2) can we keep check and fix two option, check option is to check if a file 
>> is good or bad, but not modify anything, fix option is to check and fix a 
>> file if the file is corrupted.
>>>
>>> Yes, there are two options, CHECKS only checks wheras FIX fixes the errors. 
>> As a precautionary measure, a CHECK command should be provided before a FIX 
>> is issued. IOW, a file should be checked for errors before actually fixing 
>> it.
>>>
>> A convenient way to know which to be checked should also be taken into 
>> consideration.
>>
>>>> 3) when users execute the command "echo CHECK <inode> > 
>> /sys/fs/ocfs2/filecheck" to check a file, how to give the feedback 
>> information besides printing the messages to syslog?
>>>
>>> The output should be when you cat /sys/fs/ocfs2/filecheck. It would provide 
>> the results of the last (N) files checked. I don't want to flood the kernel 
>> log with this. Thanks for bringing this up, I will put it on the doc. 
>> Something like:
>>>
>>> Inode Status Description
>>> 1234   ERROR Metadata incorrect
>>> 2352   FIXED Valid flag not set
>>> 9382   CHECKING -
>>> 8926   GOOD -
>>> 7230   CANT-FIX Please execute fsck.ocfs2 after taking filesystem offline.
>>>
>>> So, for the current scenario, only 1234 can be fixed. An echo should err 
>> with EINVAL if any other inode number is provided with FIX.
>>>
>>>
>>>> 4) we should support a list to accept the "check/fix" requests from 
>> user-space and queue them, then handle them one by one, right? what is the 
>> behavior for the request user which execute "echo check ..." from the user 
>> space? the user post a request to the kernel space, then the command will end 
>> or wait for the file check end?
>>>>
>>>
>>> I would not suggest that, atleast for now. This is to improve availability. 
>> However, if the filesystem is very bad, we should suggest an offline check. 
>> However, the user can provide multiple CHECK requests.
> My question is, if users can execute "echo check > .." to check/fix files simultaneously? since users can trigger this command from different terminates.
I think we have to restrict it, since offline fsck is not supposed
to allow such a case either.
If we have to, maybe user dlm can take care of this.

> Second, users send a command to kernel space, the kernel space have to cache these commands in a list/array, since kernel can not finish a check request immediately, otherwise, how does the kernel accept a new request during the kernel are handing the current request.  
I think the operations should be done one by one.
IMO, the kernel finds the corruption and reports it to user space.
In user space we maintain a list of corruptions.
Then the user checks/fixes them one by one.



* [Ocfs2-devel] [RFC] Online File(system) check
  2015-04-29  7:59 ` Junxiao Bi
@ 2015-05-02 12:45   ` Goldwyn Rodrigues
  2015-05-04  2:55     ` Junxiao Bi
  0 siblings, 1 reply; 12+ messages in thread
From: Goldwyn Rodrigues @ 2015-05-02 12:45 UTC (permalink / raw)
  To: ocfs2-devel



On 04/29/2015 02:59 AM, Junxiao Bi wrote:
> On 04/28/2015 05:32 AM, Goldwyn Rodrigues wrote:
>> On popular demand, here is an RFC. If you think there is a better
>> way to communicate with the kernel module for the check, please
>> let me know.
>>
>>
>> Intro
>> -----
>> OCFS2 is often used in high-availaibility systems. However, ocfs2
>> converts the filesystem to read-only at the drop of the hat. This
>> may not be necessary, since turning the filesystem read-only would
>> affect other running processes as well, decreasing availability.
>>
>> This attempt is to add errors=continue, which would return the EIO
>> to the calling process and terminate furhter processing so that
>> the filesystem is not corrupted further. However, the filesystem
>> is not converted to read-only.
> Is this safe, if detected an error when accessing an inode, how do you
> know this is only inode internal error?


Thanks for your comments. The error message would need to be modified to 
specify the inode(s) which need to be checked. It could be a regular 
file or the system inode.

> If there is corruptions in other
> place, the fs will be corrupted further.
>
If there is a corruption in another place, the process will err at that 
location.

Could you provide a sample case to explain this situation, and how it is 
different from what is already present in the code?

-- 
Goldwyn


* [Ocfs2-devel] [RFC] Online File(system) check
  2015-04-29  2:37       ` Gang He
  2015-04-30  2:29         ` Joseph Qi
@ 2015-05-02 12:52         ` Goldwyn Rodrigues
  1 sibling, 0 replies; 12+ messages in thread
From: Goldwyn Rodrigues @ 2015-05-02 12:52 UTC (permalink / raw)
  To: ocfs2-devel



On 04/28/2015 09:37 PM, Gang He wrote:
> Hi Joseph,
>
> Thanks for your detailed description.
> See my question inline.
>
>
>>>>
>> Hi Goldwyn,
>>
>> Thanks for the good proposal.
>>
>> On 2015/4/28 20:21, Goldwyn Rodrigues wrote:
>>> Hi Gang,
>>>
>>> On 04/27/2015 10:00 PM, Gang He wrote:
>>>> Hi Glodwyn,
>>>>
>>>> Very nice proposal.
>>>> So far, there are some comments from me.
>>>> 1) which task will we do in check/fix a file, we need to define the detailed
>> requirements further, since we just do a light-level file check/fix according
>> to inode number, we need to know which items can be done by online check,
>> which items can be done by offline fsck.
>>>
>>> For the first phase (regular files), these are all the reasons the disk
>> validate function would fail. Some examples are ocfs2_validate_inode_block,
>> ocfs2_validate_extent_block etc.
>>> As we take up system inodes (phase 2), we will add more functionality.
>>>
>> Can we classify all corrupted cases and their corresponding fix ways? Maybe
>> we can get some hints from fsck.
>> And I don't think errors=continue can fit for all cases.
>> For some cases we shouldn't let it continue with errors to prevent more
>> damages.
>>
>>>> 2) can we keep check and fix two option, check option is to check if a file
>> is good or bad, but not modify anything, fix option is to check and fix a
>> file if the file is corrupted.
>>>
>>> Yes, there are two options, CHECKS only checks wheras FIX fixes the errors.
>> As a precautionary measure, a CHECK command should be provided before a FIX
>> is issued. IOW, a file should be checked for errors before actually fixing
>> it.
>>>
>> A convenient way to know which to be checked should also be taken into
>> consideration.
>>
>>>> 3) when users execute the command "echo CHECK <inode> >
>> /sys/fs/ocfs2/filecheck" to check a file, how to give the feedback
>> information besides printing the messages to syslog?
>>>
>>> The output should be when you cat /sys/fs/ocfs2/filecheck. It would provide
>> the results of the last (N) files checked. I don't want to flood the kernel
>> log with this. Thanks for bringing this up, I will put it on the doc.
>> Something like:
>>>
>>> Inode Status Description
>>> 1234   ERROR Metadata incorrect
>>> 2352   FIXED Valid flag not set
>>> 9382   CHECKING -
>>> 8926   GOOD -
>>> 7230   CANT-FIX Please execute fsck.ocfs2 after taking filesystem offline.
>>>
>>> So, for the current scenario, only 1234 can be fixed. An echo should err
>> with EINVAL if any other inode number is provided with FIX.
>>>
>>>
>>>> 4) we should support a list to accept the "check/fix" requests from
>> user-space and queue them, then handle them one by one, right? what is the
>> behavior for the request user which execute "echo check ..." from the user
>> space? the user post a request to the kernel space, then the command will end
>> or wait for the file check end?
>>>>
>>>
>>> I would not suggest that, atleast for now. This is to improve availability.
>> However, if the filesystem is very bad, we should suggest an offline check.
>> However, the user can provide multiple CHECK requests.
> My question is, if users can execute "echo check > .." to check/fix files simultaneously? since users can trigger this command from different terminates.

This would be like a general file access with all the DLM procedures 
attached. You would need the DLM locks to access and write to the inode.
For that matter, checks for the same file can be triggered from 
different nodes as well, in which case they would be executed 
individually, just like any other file access.


> Second, users send a command to kernel space, the kernel space have to cache these commands in a list/array, since kernel can not finish a check request immediately, otherwise, how does the kernel accept a new request during the kernel are handing the current request.

No, no caching. Just one at a time.

-- 
Goldwyn


* [Ocfs2-devel] [RFC] Online File(system) check
  2015-04-28 13:20     ` Joseph Qi
  2015-04-29  2:37       ` Gang He
@ 2015-05-02 13:08       ` Goldwyn Rodrigues
  2015-05-04  1:46         ` Joseph Qi
  1 sibling, 1 reply; 12+ messages in thread
From: Goldwyn Rodrigues @ 2015-05-02 13:08 UTC (permalink / raw)
  To: ocfs2-devel



On 04/28/2015 08:20 AM, Joseph Qi wrote:
> Hi Goldwyn,
>
> Thanks for the good proposal.
>
> On 2015/4/28 20:21, Goldwyn Rodrigues wrote:
>> Hi Gang,
>>
>> On 04/27/2015 10:00 PM, Gang He wrote:
>>> Hi Glodwyn,
>>>
>>> Very nice proposal.
>>> So far, there are some comments from me.
>>> 1) which task will we do in check/fix a file, we need to define the detailed requirements further, since we just do a light-level file check/fix according to inode number, we need to know which items can be done by online check, which items can be done by offline fsck.
>>
>> For the first phase (regular files), these are all the reasons the disk validate function would fail. Some examples are ocfs2_validate_inode_block, ocfs2_validate_extent_block etc.
>> As we take up system inodes (phase 2), we will add more functionality.
>>
> Can we classify all corruption cases and their corresponding fixes? Maybe we can get some hints from fsck.

That is a pretty big list. I would first like to know of cases which
would not work with this scenario.

> And I don't think errors=continue can fit all cases.
> For some cases we shouldn't let it continue with errors, to prevent further damage.

Could you provide an example which would not fit into such a case to 
strengthen your argument?

>
>>> 2) Can we keep check and fix as two options? The check option only checks whether a file is good or bad without modifying anything; the fix option checks the file and fixes it if it is corrupted.
>>
>> Yes, there are two options: CHECK only checks, whereas FIX fixes the errors. As a precautionary measure, a CHECK command should be issued before a FIX. IOW, a file should be checked for errors before actually fixing it.
>>
> A convenient way to know which to be checked should also be taken into consideration.

What do you mean by "which"? Is the inode number not enough? Of course we
would have to go through the errors reported to make sure the right
inode number is listed.

>
>>> 3) When users execute the command "echo CHECK <inode> > /sys/fs/ocfs2/filecheck" to check a file, how do we give feedback besides printing messages to syslog?
>>
>> The output should appear when you cat /sys/fs/ocfs2/filecheck. It would provide the results of the last (N) files checked. I don't want to flood the kernel log with this. Thanks for bringing this up, I will put it in the doc. Something like:
>>
>> Inode  Status    Description
>> 1234   ERROR     Metadata incorrect
>> 2352   FIXED     Valid flag not set
>> 9382   CHECKING  -
>> 8926   GOOD      -
>> 7230   CANT-FIX  Please execute fsck.ocfs2 after taking filesystem offline.
>>
>> So, for the current scenario, only 1234 can be fixed. An echo should err with EINVAL if any other inode number is provided with FIX.
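
The rule above (FIX is accepted only for an inode whose last CHECK reported ERROR, anything else gets EINVAL) can be sketched in userspace Python. This is only an illustration under assumptions: the column layout is taken from the sample output above, and the helper names parse_results and request_fix are hypothetical, not part of any existing interface.

```python
import errno

# Sample text in the shape proposed above for `cat /sys/fs/ocfs2/filecheck`.
SAMPLE = """\
Inode Status Description
1234   ERROR Metadata incorrect
2352   FIXED Valid flag not set
9382   CHECKING -
8926   GOOD -
7230   CANT-FIX Please execute fsck.ocfs2 after taking filesystem offline.
"""


def parse_results(text):
    """Map inode number -> (status, description), skipping the header row."""
    results = {}
    for line in text.splitlines()[1:]:
        if not line.strip():
            continue
        ino, status, desc = line.split(None, 2)
        results[int(ino)] = (status, desc)
    return results


def request_fix(results, ino):
    """Return 0 if FIX would be accepted, -EINVAL otherwise."""
    status, _ = results.get(ino, (None, None))
    if status != "ERROR":
        # Never checked, already fixed, still checking, good, or unfixable.
        return -errno.EINVAL
    return 0
```

With the sample table, request_fix accepts only inode 1234; an inode that was never checked is rejected the same way as one reported GOOD.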
>>
>>
>>> 4) We should support a list that accepts "check/fix" requests from user space and queues them, then handles them one by one, right? What is the behavior for the user who executes "echo check ..." from user space? After the user posts a request to kernel space, does the command return immediately or wait for the file check to finish?
>>>
>>
>> I would not suggest that, at least for now. This is to improve availability; if the filesystem is very bad, we should suggest an offline check. The user can, however, provide multiple CHECK requests.
>>
>
>
>
>

-- 
Goldwyn


* [Ocfs2-devel] [RFC] Online File(system) check
  2015-05-02 13:08       ` Goldwyn Rodrigues
@ 2015-05-04  1:46         ` Joseph Qi
  0 siblings, 0 replies; 12+ messages in thread
From: Joseph Qi @ 2015-05-04  1:46 UTC (permalink / raw)
  To: ocfs2-devel

On 2015/5/2 21:08, Goldwyn Rodrigues wrote:
> 
> 
> On 04/28/2015 08:20 AM, Joseph Qi wrote:
>> Hi Goldwyn,
>>
>> Thanks for the good proposal.
>>
>> On 2015/4/28 20:21, Goldwyn Rodrigues wrote:
>>> Hi Gang,
>>>
>>> On 04/27/2015 10:00 PM, Gang He wrote:
>>>> Hi Goldwyn,
>>>>
>>>> Very nice proposal.
>>>> So far, there are some comments from me.
>>>> 1) Which tasks will we perform when checking/fixing a file? We need to define the detailed requirements further. Since we only do a light-weight file check/fix by inode number, we need to know which items can be handled by the online check and which must be left to offline fsck.
>>>
>>> For the first phase (regular files), these are all the reasons the disk validate function would fail. Some examples are ocfs2_validate_inode_block, ocfs2_validate_extent_block etc.
>>> As we take up system inodes (phase 2), we will add more functionality.
>>>
>> Can we classify all corruption cases and their corresponding fixes? Maybe we can get some hints from fsck.
> 
> That is a pretty big list. I would first like to know of cases which would not work with this scenario.
> 
>> And I don't think errors=continue can fit all cases.
>> For some cases we shouldn't let it continue with errors, to prevent further damage.
> 
> Could you provide an example which would not fit into such a case to strengthen your argument?
> 
IMO, most system inodes would not fit. For example, group descriptor corruption.

>>
>>>> 2) Can we keep check and fix as two options? The check option only checks whether a file is good or bad without modifying anything; the fix option checks the file and fixes it if it is corrupted.
>>>
>>> Yes, there are two options: CHECK only checks, whereas FIX fixes the errors. As a precautionary measure, a CHECK command should be issued before a FIX. IOW, a file should be checked for errors before actually fixing it.
>>>
>> A convenient way to know which to be checked should also be taken into consideration.
> 
> What do you mean by "which"? Is the inode number not enough? Of course we would have to go through the errors reported to make sure the right inode number is listed.
>
The inode number is the basic information. But it may not be enough,
because the corruption may be a cleared valid flag or an empty extent
record. So I think we have to know the corruption type.
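
As a rough sketch of what I mean, each result could carry a corruption type next to the inode number. The Python below is only a model under assumptions: the enum holds just the two cases mentioned above, and the names CorruptionType, CheckResult and format_row are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class CorruptionType(Enum):
    # Only the two cases named in this mail; a real list would be longer.
    VALID_FLAG_CLEARED = "valid flag cleared"
    EMPTY_EXTENT_RECORD = "empty extent record"


@dataclass(frozen=True)
class CheckResult:
    inode: int
    status: str
    corruption: Optional[CorruptionType] = None


def format_row(result):
    """Render one result row, using '-' when no corruption was found."""
    detail = result.corruption.value if result.corruption else "-"
    return f"{result.inode} {result.status} {detail}"
```

A FIX request could then be matched against the recorded type, so the fixer knows which repair to attempt for that inode.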

>>
>>>> 3) When users execute the command "echo CHECK <inode> > /sys/fs/ocfs2/filecheck" to check a file, how do we give feedback besides printing messages to syslog?
>>>
>>> The output should appear when you cat /sys/fs/ocfs2/filecheck. It would provide the results of the last (N) files checked. I don't want to flood the kernel log with this. Thanks for bringing this up, I will put it in the doc. Something like:
>>>
>>> Inode  Status    Description
>>> 1234   ERROR     Metadata incorrect
>>> 2352   FIXED     Valid flag not set
>>> 9382   CHECKING  -
>>> 8926   GOOD      -
>>> 7230   CANT-FIX  Please execute fsck.ocfs2 after taking filesystem offline.
>>>
>>> So, for the current scenario, only 1234 can be fixed. An echo should err with EINVAL if any other inode number is provided with FIX.
>>>
>>>
>>>> 4) We should support a list that accepts "check/fix" requests from user space and queues them, then handles them one by one, right? What is the behavior for the user who executes "echo check ..." from user space? After the user posts a request to kernel space, does the command return immediately or wait for the file check to finish?
>>>>
>>>
>>> I would not suggest that, at least for now. This is to improve availability; if the filesystem is very bad, we should suggest an offline check. The user can, however, provide multiple CHECK requests.
>>>
>>
>>
>>
>>
> 


* [Ocfs2-devel] [RFC] Online File(system) check
  2015-05-02 12:45   ` Goldwyn Rodrigues
@ 2015-05-04  2:55     ` Junxiao Bi
  0 siblings, 0 replies; 12+ messages in thread
From: Junxiao Bi @ 2015-05-04  2:55 UTC (permalink / raw)
  To: ocfs2-devel

On 05/02/2015 08:45 PM, Goldwyn Rodrigues wrote:
> 
> 
> On 04/29/2015 02:59 AM, Junxiao Bi wrote:
>> On 04/28/2015 05:32 AM, Goldwyn Rodrigues wrote:
>>> On popular demand, here is an RFC. If you think there is a better
>>> way to communicate with the kernel module for the check, please
>>> let me know.
>>>
>>>
>>> Intro
>>> -----
>>> OCFS2 is often used in high-availability systems. However, ocfs2
>>> converts the filesystem to read-only at the drop of a hat. This
>>> may not be necessary, since turning the filesystem read-only would
>>> affect other running processes as well, decreasing availability.
>>>
>>> This attempt is to add errors=continue, which would return the EIO
>>> to the calling process and terminate further processing so that
>>> the filesystem is not corrupted further. However, the filesystem
>>> is not converted to read-only.
>> Is this safe? If an error is detected when accessing an inode, how do you
>> know it is only an inode-internal error?
> 
> 
> Thanks for your comments. The error message would need to be modified to
> specify the inode(s) which need to be checked. It could be a regular
> file or the system inode.
> 
>> If there is corruption in another
>> place, the fs will be corrupted further.
>>
> If there is a corruption in another place, the process will err at that
> location.
> 
> Could you provide a sample case to explain this situation? and how is it
> different from what is already present in the code?

For example, suppose a disk had some bit-flip errors, so that some used
bits in the local alloc are marked free and an inode X's inline flag is
also cleared. Setting the fs read-only the first time the inode error is
detected will stop further data corruption.

I think if we want to continue on some inconsistency, we need to prove
it's safe; if we can't, it is better to stop the first time.
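
To make the policy I am suggesting concrete, here is a toy Python model, not kernel code. It assumes some way exists to prove that an error is confined to one inode (the proven_inode_local flag), which is exactly the hard part. ToyFs, on_metadata_error and the EROFS return value are hypothetical choices for the sketch.

```python
import errno
from dataclasses import dataclass


@dataclass
class ToyFs:
    # "continue" is the proposed mode; any other value falls back to read-only.
    errors: str = "remount-ro"
    read_only: bool = False


def on_metadata_error(fs, proven_inode_local):
    """Continue only when the error is proven to be confined to one inode."""
    if fs.errors == "continue" and proven_inode_local:
        # Fail this caller only; the filesystem stays writable.
        return -errno.EIO
    # Cannot prove it is safe: stop at the first error.
    fs.read_only = True
    return -errno.EROFS
```

In the bit-flip case above, the cleared inline flag alone would not prove the error is inode-local, so this model would still turn the fs read-only.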

Thanks,
Junxiao.

> 


Thread overview: 12+ messages
2015-04-27 21:32 [Ocfs2-devel] [RFC] Online File(system) check Goldwyn Rodrigues
2015-04-28  3:00 ` Gang He
2015-04-28 12:21   ` Goldwyn Rodrigues
2015-04-28 13:20     ` Joseph Qi
2015-04-29  2:37       ` Gang He
2015-04-30  2:29         ` Joseph Qi
2015-05-02 12:52         ` Goldwyn Rodrigues
2015-05-02 13:08       ` Goldwyn Rodrigues
2015-05-04  1:46         ` Joseph Qi
2015-04-29  7:59 ` Junxiao Bi
2015-05-02 12:45   ` Goldwyn Rodrigues
2015-05-04  2:55     ` Junxiao Bi
