From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1755357AbbKXWQL (ORCPT ); Tue, 24 Nov 2015 17:16:11 -0500
Received: from mx2.suse.de ([195.135.220.15]:46570 "EHLO mx2.suse.de"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1755305AbbKXWQF (ORCPT ); Tue, 24 Nov 2015 17:16:05 -0500
Date: Tue, 24 Nov 2015 14:16:04 -0800
From: Mark Fasheh
To: Junxiao Bi
Cc: Gang He , rgoldwyn@suse.de, linux-kernel@vger.kernel.org,
	ocfs2-devel@oss.oracle.com
Subject: Re: [Ocfs2-devel] [PATCH v2 4/4] ocfs2: check/fix inode block for
	online file check
Message-ID: <20151124221604.GX15575@wotan.suse.de>
Reply-To: Mark Fasheh
References: <1446013561-22121-1-git-send-email-ghe@suse.com>
	<1446013561-22121-5-git-send-email-ghe@suse.com>
	<56385E63.80808@oracle.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <56385E63.80808@oracle.com>
Organization: SUSE Labs
User-Agent: Mutt/1.5.21 (2010-09-15)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Junxiao,

On Tue, Nov 03, 2015 at 03:12:35PM +0800, Junxiao Bi wrote:
> Hi Gang,
>
> This is not like a right patch.
> First, online file check only checks inode's block number, valid flag,
> fs generation value, and meta ecc. I never see a real corruption
> happened only on this field, if these fields are corrupted, that means
> something bad may happen on other place. So fix this field may not help
> and even cause corruption more hard.

I agree that these are rather uncommon; we might even consider removing the
VALID_FL fixup. I definitely don't think we're ready for anything more
complicated than this, though. We kind of have to start somewhere too.

> Second, the repair way is wrong. In
> ocfs2_filecheck_repair_inode_block(), if these fields in disk don't
> match the ones in memory, the ones in memory are used to update the disk
> fields.
> The question is how do you know these field in memory are
> right(they may be the real corrupted ones)?

Your second point (and the last part of your 1st point) makes a good
argument for why this shouldn't happen automatically. Some of these
corruptions might require a human to look at the log and decide what to do.
Especially as you point out, where we might not know where the source of the
corruption is. And if the human can't figure it out, then it's probably time
to unmount and fsck.

Thanks,
	--Mark

--
Mark Fasheh