From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1756167Ab2I0HKo (ORCPT ); Thu, 27 Sep 2012 03:10:44 -0400
Received: from mail-bk0-f46.google.com ([209.85.214.46]:34463
	"EHLO mail-bk0-f46.google.com" rhost-flags-OK-OK-OK-OK)
	by vger.kernel.org with ESMTP id S1754815Ab2I0HKl (ORCPT );
	Thu, 27 Sep 2012 03:10:41 -0400
MIME-Version: 1.0
In-Reply-To: <20120927065726.GP15236@dastard>
References: <1348404995-14372-1-git-send-email-zwu.kernel@gmail.com>
	<1348404995-14372-6-git-send-email-zwu.kernel@gmail.com>
	<20120927034310.GM15236@dastard>
	<20120927065726.GP15236@dastard>
Date: Thu, 27 Sep 2012 15:10:38 +0800
Message-ID:
Subject: Re: [RFC v2 05/10] vfs: introduce one hash table
From: Zhi Yong Wu
To: Dave Chinner
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-btrfs@vger.kernel.org, linux-ext4@vger.kernel.org,
	linuxram@linux.vnet.ibm.com, viro@zeniv.linux.org.uk,
	cmm@us.ibm.com, tytso@mit.edu, marco.stornelli@gmail.com,
	stroetmann@ontolinux.com, diegocg@gmail.com, chris@csamuel.org,
	Zhi Yong Wu
Content-Type: text/plain; charset=ISO-8859-1
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Sep 27, 2012 at 2:57 PM, Dave Chinner wrote:
> On Thu, Sep 27, 2012 at 02:23:16PM +0800, Zhi Yong Wu wrote:
>> On Thu, Sep 27, 2012 at 11:43 AM, Dave Chinner wrote:
>> > On Sun, Sep 23, 2012 at 08:56:30PM +0800, zwu.kernel@gmail.com wrote:
>> >> From: Zhi Yong Wu
>> >>
>> >> Adds a hash table structure which contains
>> >> a lot of hash list and is used to efficiently
>> >> look up the data temperature of a file or its
>> >> ranges.
>> >> In each hash list of hash table, the hash node
>> >> will keep track of temperature info.
>> >
>> > So, let me see if I've got the relationship straight:
>> >
>> > - sb->s_hot_info.hot_inode_tree indexes hot_inode_items, one per inode
>> >
>> > - hot_inode_item contains access frequency data for that inode
>> >
>> > - hot_inode_item holds a heat hash node to index the access
>> >   frequency data for that inode
>> >
>> > - hot_inode_item.hot_range_tree indexes hot_range_items for that inode
>> >
>> > - hot_range_item contains access frequency data for that range
>> >
>> > - hot_range_item holds a heat hash node to index the access
>> >   frequency data for that range
>> >
>> > - sb->s_hot_info.heat_inode_hl indexes per-inode heat hash nodes
>> >
>> > - sb->s_hot_info.heat_range_hl indexes per-range heat hash nodes
>> Correct.
>> >
>> > How about some ascii art? :) Just looking at the hot inode item case
>> > (the range item case is the same pattern, though), we have:
>> >
>> >
>> > heat_inode_hl                  hot_inode_tree
>> >     |                                |
>> >     |                                V
>> >     |                  +-------hot_inode_item-------+
>> >     +---+              |       frequency data       |
>> >         |              V             ^              V
>> >         | ...<--hot_inode_item-->... | ...<--hot_inode_item-->....
>> >         |       frequency data       |       frequency data
>> >         |             ^              |             ^
>> >         |             |              |             |
>> >         |             |              |             |
>> >         +------>hot_hash_node-->hot_hash_node-->hot_hash_node-->....
>> Great, can we put them in hot_tracking.txt in Documentation?
>> >
>> >
>> > There's no actual data stored in the hot_hash_node, just pointer
>> > back to the frequency data, a hlist_node and a pointer to the
>> > hashlist head. IOWs, I agree with Ram that this does not need to
>> > exist and just embedding a hlist_node inside the hot_inode_item is
>> > all that is needed. i.e:
>> >
>> > heat_inode_hl                  hot_inode_tree
>> >     |                                |
>> >     |                                V
>> >     |                  +-------hot_inode_item-------+
>> >     |                  |       frequency data       |
>> >     +---+              |         hlist_node         |
>> >         |              V             ^ |            V
>> >         | ...<--hot_inode_item-->... | | ...<--hot_inode_item-->....
>> >         |       frequency data       | |       frequency data
>> >         +------>hlist_node-----------+ +------->hlist_node--->.....
>> >
>> > There's no need for separate allocations, initialisations, locks and
>> > reference counting - all that is already in the hot_inode_item. The
>> > items have the same lifecycle limitations - a hot_hash_node must be
>> > torn down before the frequency data it points to is freed. Finally,
>> > there's no difference in how you move it between lists.
>> How will you know if one hot_inode_item should be moved between lists
>> when its freq data is changed?
>
> Record the current temperature in the frequency data, and if it
> changes, change the list it is on.
I know how to do it, thanks.
>
>> > Indeed, calling it a hash is wrong - there's no hashing at all
>> > - it is keeping an array of lists where each entry corresponds to a
>> > specific temperature. It is a *heat map*, not a hash list. i.e.
>> > inode_heat_map, not heat_inode_hl. HEAT_MAP_SIZE, not HASH_SIZE.
>> OK.
>> >
>> > As it is, there aren't any users of the heat maps that are generated
>> > in this patch set - it's not even exported to userspace or to
>> > debugfs, so I'm not sure how it will be used yet. How are these heat
>> > maps going to be used by filesystems, Zhi?
>> In hot_hash_calc_temperature(), you can see that one hot_inode or
>> hot_range's freq data will be distilled into one temperature value,
>> then it will be inserted into the heat map based on its temperature.
>> When the file corresponding to the inode or range gets hotter or colder,
>> its location will be changed in the heat map based on its new
>> temperature in hot_hash_update_hash_table().
>
> Yes, but a hot_inode_item or hot_range_item can only have one
> location in the heat map, right?
Yes.
> So it doesn't need external structure to point to the frequency
> data to track this....
OK.
>
>> And the user will retrieve those freq data and temperature info via
>> debugfs or ioctl interfaces.
>
> Right - but that data is only extracted after an initial
> hot_inode_tree lookup - The heat map itself is never directly used
> for lookups.
> If it's not used for lookups based on temperature, why
> is it needed?
You mean we don't need hot_inode_tree? After the hook functions collect
the freq data for an inode, they store that raw info in hot_inode_tree.
A private kthread then iterates this tree to distill the raw freq data
into one temperature value in the range [0, 255], and links the
corresponding hot_inode_item into the heat map.
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com

--
Regards,

Zhi Yong Wu
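
For reference, the layout Dave describes (list linkage embedded in the
item, the current temperature recorded next to the frequency data, and
an array of lists indexed by temperature) could look roughly like the
user-space sketch below. This is not code from the patch set. The
list_node helpers are a stand-in for the kernel's hlist API, and the
field names nr_reads/nr_writes, the heat_map_* functions and the
calc_temperature() formula are all assumptions made for illustration.
Only hot_inode_item and HEAT_MAP_SIZE are names taken from the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HEAT_MAP_SIZE 256	/* one list per temperature value, 0..255 */

/* Minimal circular doubly linked list; stand-in for the kernel hlist. */
struct list_node {
	struct list_node *prev, *next;
};

static void list_init(struct list_node *n)
{
	n->prev = n->next = n;
}

static int list_empty(const struct list_node *head)
{
	return head->next == head;
}

static void list_add_tail(struct list_node *n, struct list_node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

static void list_del(struct list_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	list_init(n);
}

/* Frequency data, cached temperature and linkage in one allocation. */
struct hot_inode_item {
	uint64_t nr_reads;		/* hypothetical frequency data */
	uint64_t nr_writes;		/* hypothetical frequency data */
	uint8_t temperature;		/* list the item currently sits on */
	struct list_node heat_node;	/* embedded, no separate hash node */
};

struct heat_map {
	struct list_node bucket[HEAT_MAP_SIZE];
};

static void heat_map_init(struct heat_map *map)
{
	for (size_t i = 0; i < HEAT_MAP_SIZE; i++)
		list_init(&map->bucket[i]);
}

/* Placeholder formula only: total accesses, clamped to 255. */
static uint8_t calc_temperature(const struct hot_inode_item *item)
{
	uint64_t total = item->nr_reads + item->nr_writes;

	return total > 255 ? 255 : (uint8_t)total;
}

static void heat_map_insert(struct heat_map *map, struct hot_inode_item *item)
{
	assert(map && item);
	item->temperature = calc_temperature(item);
	list_add_tail(&item->heat_node, &map->bucket[item->temperature]);
}

/* Move the item only if its temperature changed; returns 1 if moved. */
static int heat_map_update(struct heat_map *map, struct hot_inode_item *item)
{
	uint8_t t = calc_temperature(item);

	assert(map && item);
	if (t == item->temperature)
		return 0;
	list_del(&item->heat_node);
	item->temperature = t;
	list_add_tail(&item->heat_node, &map->bucket[t]);
	return 1;
}
```

With this shape, an item is always on exactly one list, and the list is
found from the temperature stored in the item itself, so nothing outside
the item needs to point back at the frequency data.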