Re: [PATCH 2/2] f2fs: support {d,id,did,x}node checksum

From: Chao Yu <chao@kernel.org>
To: Chao Yu <yuchao0@huawei.com>, Jaegeuk Kim <jaegeuk@kernel.org>
Cc: linux-f2fs-devel@lists.sourceforge.net, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 2/2] f2fs: support {d,id,did,x}node checksum
Date: Tue, 27 Feb 2018 22:16:58 +0800	[thread overview]
Message-ID: <3782ed9e-741d-d711-3937-bc8127566982@kernel.org> (raw)
In-Reply-To: <558ea6b1-e57d-5910-d931-a47f143cca25@huawei.com>

Ping,

On 2018/2/13 15:34, Chao Yu wrote:
> Hi Jaegeuk,
> 
> On 2018/2/10 10:52, Chao Yu wrote:
>> On 2018/2/10 9:41, Jaegeuk Kim wrote:
>>> On 02/01, Chao Yu wrote:
>>>>
>>>>
>>>> On 2018/2/1 6:15, Jaegeuk Kim wrote:
>>>>> On 01/31, Chao Yu wrote:
>>>>>> On 2018/1/31 10:02, Jaegeuk Kim wrote:
>>>>>>> What if we want to add more entries in addition to node_checksum? Do we have
>>>>>>> to add a new feature flag at every time? How about adding a layout value instead
>>>>>>
>>>>>> Hmm.. for previous implementation, IMO, we'd better add a new feature flag at
>>>>>> every time, otherwise, w/ extra_nsize only, in current image, we can know a
>>>>>> valid range of extended area in node block, but we don't know which
>>>>>> fields/features are valid/enabled or not.
>>>>>>
>>>>>> One more thing is that if we can add one feature flag for each field, we got one
>>>>>> more chance to disable it dynamically.
>>>>>>
>>>>>>> of extra_nsize? For example, layout #1 means node_checksum with extra_nsize=X?
>>>>>>>
>>>>>>>
>>>>>>> What does 1017 mean? We need to make this structure more flexibly for new
>>>>>>
>>>>>> Yes, using raw 1017 is not appropriate here.
>>>>>>
>>>>>>> entries. Like this?
>>>>>>> 		union {
>>>>>>> 			struct node_v1;
>>>>>>> 			struct node_v2;
>>>>>>> 			struct node_v3;
>>>>>>> 			...
>>>>>>> 			struct direct_node dn;
>>>>>>> 			struct indirect_node in;
>>>>>>> 		};
>>>>>>> 	};
>>>>>>>
>>>>>>> 	struct node_v1 {
>>>>>>> 		__le32 data[DEF_ADDRS_PER_BLOCK - V1_NSIZE=1];
>>>>>>> 		__le32 node_checksum;
>>>>>>> 	}
>>>>>>>
>>>>>>> 	struct node_v2 {
>>>>>>> 		__le32 data[DEF_ADDRS_PER_BLOCK - V2_NSIZE=500];
>>>>>>
>>>>>> Hmm.. If we only need to add one more 4 bytes field in struct node_v2, but
>>>>>> V2_NSIZE is defined as fixed 500, there must be 492 bytes wasted.
>>>>>>
>>>>>> Or we can define V2_NSIZE as 8, but if there comes more and more extended
>>>>>> fields, node version count can be a large number, it results in complicated
>>>>>> version recognization and handling.
>>>>>>
>>>>>> One more question is how can we control which fields are valid or not in
>>>>>> comp[Vx_NSIZE]?
>>>>>>
>>>>>>
>>>>>> Anyway, what I'm thinking is maybe we can restructure layout of node block like
>>>>>> the one used by f2fs_inode:
>>>>>>
>>>>>> struct f2fs_node {
>>>>>> 	union {
>>>>>> 		struct f2fs_inode i;
>>>>>> 		union {
>>>>>> 			struct {
>>>>>> 				__le32 node_checksum;
>>>>>> 				__le32 feature_field_1;
>>>>>> 				__le32 feature_field_2;
>>>>>> 				....
>>>>>> 				__le32 addr[];
>>>>>> 				
>>>>>> 			};
>>>>>> 			struct direct_node dn;
>>>>>> 			struct indirect_node in;
>>>>>> 		};
>>>>>> 	};
>>>>>> 	struct node_footer footer;
>>>>>> } __packed;
>>>>>>
>>>>>> Moving all extended fields to the head of f2fs_node, so we don't have to use
>>>>>> macro to indicate actual size of addr.
>>>>>
>>>>> Thinking what'd be the best way. My concern is, once getting more entries, we
>>>>
>>>> OK, I think we need more discussion.. ;)
>>>>
>>>>> can't set each of features individually. Like the second entry should have
>>>>
>>>> Oh, that will be hard. If we have to avoid that, we have to tag in somewhere
>>>> e.g. f2fs_inode::i_flags2 to indicate which new field in f2fs_node is valid, for
>>>> example:
>>>>
>>>> #define F2FS_NODE_CHECKSUM	0x0001
>>>> #define F2FS_NODE_FIELD1	0x0002
>>>> #define F2FS_NODE_FIELD2	0x0004
>>>>
>>>> 	union {
>>>> 		struct {
>>>> 			__le32 node_checksum;
>>>> 			__le32 field_1;
>>>> 			__le32 field_2;
>>>> 			....
>>>> 			__le32 addr[];
>>>> 		};
>>>> 		struct direct_node dn;
>>>> 		struct indirect_node in;
>>>> 	};
>>>>
>>>> f2fs_inode::i_flags2 = F2FS_NODE_CHECKSUM | F2FS_NODE_FIELD1
>>>> indicates that f2fs_node::node_checksum and f2fs_node::field_1 are valid;
>>>>
>>>> f2fs_inode::i_flags2 = F2FS_NODE_FIELD1 | F2FS_NODE_FIELD2
>>>> indicates that f2fs_node::field_1 and f2fs_node::field_2 are valid.
>>>
>>> So, that's why I thought we may need a sort of each formats.
>>
>> Hmm.. if we have two new added fields, there are (2 << 2) combinations
>> of all formats, as:
>>
>> struct original {
>> 	__le32 data[DEF_ADDRS_PER_BLOCK];
>> }
>>
>> struct node_v1 {
>> 	__le32 data[DEF_ADDRS_PER_BLOCK - V1_NSIZE=1];
>> 	__le32 field_1;
>> }
>>
>> struct node_v2 {
>> 	__le32 data[DEF_ADDRS_PER_BLOCK - V2_NSIZE=1];
>> 	__le32 field_2;
>> }
>>
>> struct node_v2 {
>> 	__le32 data[DEF_ADDRS_PER_BLOCK - V3_NSIZE=2];
>> 	__le32 field_1;
>> 	__le32 field_2;
>> }
>>
>> If we add more new fields, the node version will increase sharply due
>> to there is (n << 2) combination with n fields. Right? Any thoughts to
>> reduce maintaining overhead on those node versions structures?
> 
> Do you have time to explain more about the design of multiple version structure
> for node block, I'm still be confused about two things:
> 1. what will we do if we want to add one new field in node structure.
> 2. how can we recognize which fields are valid and which ones are invalid.
> 
> Thanks,
> 
>>
>> Thanks,
>>
>>>
>>>>
>>>> Any thoughts?
>>>>
>>>> Thanks,
>>>>
>>>>> enabled node_checksum, which we may not want to do.
>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>>> 		__le32 comp[V2_NSIZE];
>>>>>>> 	}
>>>>>>> 	...
>>>>>>>
>>>>>>>> +			};
>>>>>>>> +			struct direct_node dn;
>>>>>>>> +			struct indirect_node in;
>>>>>>>> +		};
>>>>>>>>  	};
>>>>>>>>  	struct node_footer footer;
>>>>>>>>  } __packed;
>>>>>>>> -- 
>>>>>>>> 2.15.0.55.gc2ece9dc4de6
>>>>>>>
>>>>>>> .
>>>>>>>
>>
>> .
>>
>