On 2020/9/17 下午8:37, David Sterba wrote:
> On Thu, Sep 17, 2020 at 04:02:37PM +0800, Qu Wenruo wrote:
>> On 2020/9/17 上午12:01, David Sterba wrote:
>>> On Tue, Sep 15, 2020 at 01:35:17PM +0800, Qu Wenruo wrote:
>>>> generic_bin_search() distinguishes between reading a key which doesn't
>>>> cross a page and one which does. However this distinction is not
>>>> necessary since read_extent_buffer handles both cases transparently.
>>>>
>>>> Just use read_extent_buffer to streamline the code.
>>>>
>>>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>>>> ---
>>>>  fs/btrfs/ctree.c | 13 ++-----------
>>>>  1 file changed, 2 insertions(+), 11 deletions(-)
>>>>
>>>> diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
>>>> index cd1cd673bc0b..e204e1320745 100644
>>>> --- a/fs/btrfs/ctree.c
>>>> +++ b/fs/btrfs/ctree.c
>>>> @@ -1697,7 +1697,6 @@ static noinline int generic_bin_search(struct extent_buffer *eb,
>>>>  	}
>>>>  
>>>>  	while (low < high) {
>>>> -		unsigned long oip;
>>>>  		unsigned long offset;
>>>>  		struct btrfs_disk_key *tmp;
>>>>  		struct btrfs_disk_key unaligned;
>>>> @@ -1705,17 +1704,9 @@ static noinline int generic_bin_search(struct extent_buffer *eb,
>>>>  
>>>>  		mid = (low + high) / 2;
>>>>  		offset = p + mid * item_size;
>>>> -		oip = offset_in_page(offset);
>>>>  
>>>> -		if (oip + key_size <= PAGE_SIZE) {
>>>> -			const unsigned long idx = offset >> PAGE_SHIFT;
>>>> -			char *kaddr = page_address(eb->pages[idx]);
>>>> -
>>>> -			tmp = (struct btrfs_disk_key *)(kaddr + oip);
>>>> -		} else {
>>>> -			read_extent_buffer(eb, &unaligned, offset, key_size);
>>>> -			tmp = &unaligned;
>>>> -		}
>>>> +		read_extent_buffer(eb, &unaligned, offset, key_size);
>>>> +		tmp = &unaligned;
>>>
>>> Reading from the first page is a performance optimization on systems
>>> with 4K pages, ie. the majority. I'm not in favor removing it just to
>>> make the code look nicer.
>>
>> For 4K system, with the optimization it only saves one
>> read_extent_buffer() call cost.
> 
> This evaluation is wrong, you missed several things that
> generic_bin_search and read_extent_buffer do.
> 
> generic_bin_search is called very often, each search slot so
> optimization is worth here
> 
> read_extent_buffer is used _only_ for keys that cross page boundary, so
> we need to read the bytes in two steps and this is wrapped into a
> function that we call in a limited number of cases

Then to me, the better solution is to make read_extent_buffer() to be
split into two part.

Part 1 to handle the same page read, which should be made inline.
The part 1 should be small enough, as it only involves the in-page
offset calculation, which is also already done in current
generic_bin_search.

Then part 2 to handle the cross page case, and that part can be a
function call.

Personally speaking, even generic_bin_search() is a hot-path, I still
don't believe it's worthy, as read_extent_buffer() itself is also
frequently called in other locations, and I never see a special handling
for it in any other location.

Anyway, I will use the get_eb_page_offset()/get_eb_page_index() macros
here first, or subpage will be completely screwed.

And then try to use that two-part solution for read_extent_buffer().

Thanks,
Qu

> 
> In all other cases, when the whole key is contained in the page the call
> is inline in generic_bin_search, ie. no function call overhead
> 
>> Or we will need to manually call get_eb_page_offset() here to make it
>> work for subpage.
> 
> For nodesize that is smaller than PAGE_SIZE there's no page crossing at
> all so using read_extent_buffer would be making things worse.
>