From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1758083AbdLRH2d (ORCPT ); Mon, 18 Dec 2017 02:28:33 -0500
Received: from LGEAMRELO11.lge.com ([156.147.23.51]:53360 "EHLO
	lgeamrelo11.lge.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1757884AbdLRH2a (ORCPT );
	Mon, 18 Dec 2017 02:28:30 -0500
X-Original-SENDERIP: 156.147.1.127
X-Original-MAILFROM: hyc.lee@gmail.com
X-Original-SENDERIP: 10.177.225.35
X-Original-MAILFROM: hyc.lee@gmail.com
Message-ID: <5A376E1B.6040201@gmail.com>
Date: Mon, 18 Dec 2017 16:28:27 +0900
From: Hyunchul Lee
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.3.0
MIME-Version: 1.0
To: Jaegeuk Kim, Chao Yu, Chao Yu
CC: Jens Axboe, linux-kernel@vger.kernel.org,
	linux-f2fs-devel@lists.sourceforge.net, kernel-team@lge.com,
	linux-fsdevel@vger.kernel.org, Hyunchul Lee
Subject: Re: [f2fs-dev] [PATCH 1/2] f2fs: pass down write hints to block layer for bufferd write
References: <1511828607-624-1-git-send-email-hyc.lee@gmail.com>
	<5A2112A7.2070208@gmail.com>
	<1fa09755-7322-a886-c582-02e3d93d8f87@kernel.org>
	<5A2F3BC5.90803@gmail.com>
	<85f7fc1b-5286-c66f-6833-af1a44c5130f@huawei.com>
	<5A31D507.70304@gmail.com>
	<20171215020612.GF35234@jaegeuk-macbookpro.roam.corp.google.com>
In-Reply-To: <20171215020612.GF35234@jaegeuk-macbookpro.roam.corp.google.com>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Jaegeuk,

Agreed. If Chao agrees with this policy, I will implement it.

Thanks for the comment.

On 12/15/2017 11:06 AM, Jaegeuk Kim wrote:
> On 12/14, Hyunchul Lee wrote:
>> Hi Jaegeuk,
>>
>> I need your comment about the fs_iohint mount option.
>>
>> a) w/o fs_iohint, propagate user hints to low layer.
>> b) w/ fs_iohint, ignore user hints, and use hints which is generated
>>    with F2FS.
>>
>> Chao suggests this option because user hints are more accurate than
>> the file system's.
>>
>> This is reasonable, but I have some concerns about this option.
>> The first is that blocks of a segment can have different hints. This
>> could make GC less effective.
>> The second is whether the separation between LIFE_MEDIUM and LIFE_LONG is
>> really needed. I think that the difference between them is a little ambiguous
>> for users, and LIFE_SHORT and LIFE_EXTREME are converted to different
>> hints by F2FS.
>
> I think what we really can do is assign the many user hints to our 3 DATA
> logs, like rw_hint_to_seg_type(), since they are just hints for user data.
> Then, we can decide how to keep that as much as possible, since we have
> other filesystem metadata such as meta and nodes. In addition, I don't
> think we have to keep the original user hints, which would mess up the
> F2FS logs.
>
> With that in mind, I can think of the below cases. Especially, if users want
> to keep their io_hints, we'd better recommend using direct_io w/o fs_iohints.
> In order to keep this policy, I think fs_iohints would be better as a
> feature set by mkfs.f2fs and detected via sysfs entries by users.
>
> 1) w/ fs_iohints
>
> User                      F2FS            Block
> -------------------------------------------------------------------
>                           Meta            WRITE_LIFE_MEDIUM
>                           HOT_NODE        WRITE_LIFE_NOT_SET
>                           WARM_NODE       -'
>                           COLD_NODE       WRITE_LIFE_NONE
> ioctl(cold)               COLD_DATA       WRITE_LIFE_EXTREME
> extension list            -'              -'
> WRITE_LIFE_EXTREME        -'              -'
> WRITE_LIFE_SHORT          HOT_DATA        WRITE_LIFE_SHORT
>
> -- buffered_io
> WRITE_LIFE_NOT_SET        WARM_DATA       WRITE_LIFE_LONG
> WRITE_LIFE_NONE           -'              -'
> WRITE_LIFE_MEDIUM         -'              -'
> WRITE_LIFE_LONG           -'              -'
>
> -- direct_io (Not recommendable)
> WRITE_LIFE_NOT_SET        WARM_DATA       WRITE_LIFE_NOT_SET
> WRITE_LIFE_NONE           -'              WRITE_LIFE_NONE
> WRITE_LIFE_MEDIUM         -'              WRITE_LIFE_MEDIUM
> WRITE_LIFE_LONG           -'              WRITE_LIFE_LONG
>
> 2) w/o fs_iohints
>
> User                      F2FS            Block
> -------------------------------------------------------------------
>                           Meta            -
>                           HOT_NODE        -
>                           WARM_NODE       -
>                           COLD_NODE       -
> ioctl(cold)               COLD_DATA       -
> extension list            -'              -
>
> -- buffered_io
> WRITE_LIFE_EXTREME        COLD_DATA       -
> WRITE_LIFE_SHORT          HOT_DATA        -
> WRITE_LIFE_NOT_SET        WARM_DATA       -
> WRITE_LIFE_NONE           -'              -
> WRITE_LIFE_MEDIUM         -'              -
> WRITE_LIFE_LONG           -'              -
>
> -- direct_io
> WRITE_LIFE_EXTREME        COLD_DATA       WRITE_LIFE_EXTREME
> WRITE_LIFE_SHORT          HOT_DATA        WRITE_LIFE_SHORT
> WRITE_LIFE_NOT_SET        WARM_DATA       WRITE_LIFE_NOT_SET
> WRITE_LIFE_NONE           -'              WRITE_LIFE_NONE
> WRITE_LIFE_MEDIUM         -'              WRITE_LIFE_MEDIUM
> WRITE_LIFE_LONG           -'              WRITE_LIFE_LONG
>
>
> Note that I don't much care about how to manipulate streamid in the nvme driver
> in terms of LIFE_NONE or LIFE_NOTSET, since other drivers can handle them
> in different ways. Taking a look at the definition, at least, we don't need
> to assume that those are the same at all. For example, if we can exploit this in
> the UFS driver, we can pass all the stream ids to the device as context ids.
>
> Thanks,
>
>>
>> Thanks.
>>
>> On 12/12/2017 11:45 AM, Chao Yu wrote:
>>> Hi Hyunchul,
>>>
>>> On 2017/12/12 10:15, Hyunchul Lee wrote:
>>>> Hi Chao,
>>>>
>>>> On 12/11/2017 10:15 PM, Chao Yu wrote:
>>>>> Hi Hyunchul,
>>>>>
>>>>> On 2017/12/1 16:28, Hyunchul Lee wrote:
>>>>>> Hi Chao,
>>>>>>
>>>>>> On 11/30/2017 04:06 PM, Chao Yu wrote:
>>>>>>> Hi Hyunchul,
>>>>>>>
>>>>>>> On 2017/11/28 8:23, Hyunchul Lee wrote:
>>>>>>>> From: Hyunchul Lee
>>>>>>>>
>>>>>>>> This implements which hint is passed down to block layer
>>>>>>>> for data from the specific segment type.
>>>>>>>>
>>>>>>>> segment type                hints
>>>>>>>> ------------                -----
>>>>>>>> COLD_NODE & COLD_DATA       WRITE_LIFE_EXTREME
>>>>>>>> WARM_DATA                   WRITE_LIFE_NONE
>>>>>>>> HOT_NODE & WARM_NODE        WRITE_LIFE_LONG
>>>>>>>> HOT_DATA                    WRITE_LIFE_MEDIUM
>>>>>>>> META_DATA                   WRITE_LIFE_SHORT
>>>>>>>
>>>>>>> Just noticed, if our user does not give a hint via ioctl, f2fs can
>>>>>>> provide hints to the lower layer according to its hot/cold separation ability;
>>>>>>> that will be okay. But once the user gives a hint, which may be more accurate
>>>>>>> than the filesystem's, the hint converted by f2fs may be wrong.
>>>>>>>
>>>>>>> So what do you think of adding an option to control whether the filesystem
>>>>>>> can convert the hint given by the user?
>>>>>>>
>>>>>>
>>>>>> I think it is okay for LIFE_SHORT and LIFE_EXTREME, because they are
>>>>>> converted to different hints.
>>>>>
>>>>> What I mean is introducing a mount option, e.g. fs_iohint,
>>>>> a) w/o fs_iohint, propagate file/inode io_hint to low layer.
>>>>> b) w/ fs_iohint, ignore file/inode io_hint, use io_hint which is generated
>>>>>    with filesystem's private rule.
>>>>>
>>>>
>>>> Okay, I will implement this option and send this patch again.
>>>
>>> Let's wait for Jaegeuk's comments first?
>>>
>>>>
>>>> Without fs_iohint, even if data blocks are moved due to GC,
>>>> we should keep user hints. And if user hints are not given,
>>>> no hints are passed down to the block layer, right?
>>>
>>> Hmm.. that will be a problem. IMO, we can store the last user io_hint in the inode
>>> layout, so later when we trigger GC, we can use the last io_hint in the inode rather
>>> than giving no hint or the fs' hint.
>>>
>>> I think we need to discuss with the original author of IO hint what the IO hint
>>> policy is when the filesystem moves a block by itself after the inode has been
>>> released in the system.
>>>
>>> Thanks,
>>>
>>>>
>>>> Thank you for comments.
>>>>
>>>>> Thanks,
>>>>>
>>>>>>
>>>>>> file hint      segment type    io hint
>>>>>> ---------      ------------    -------
>>>>>> LIFE_SHORT     HOT_DATA        LIFE_MEDIUM
>>>>>> LIFE_MEDIUM    WARM_DATA       LIFE_NONE
>>>>>> LIFE_LONG      WARM_DATA       LIFE_NONE
>>>>>> LIFE_EXTREME   COLD_DATA       LIFE_EXTREME
>>>>>>
>>>>>> The problem is that LIFE_MEDIUM and LIFE_LONG are converted to
>>>>>> the same hint, LIFE_NONE. I am not sure that the separation between
>>>>>> LIFE_MEDIUM and LIFE_LONG is really needed, because I guess that the
>>>>>> difference between them is a little ambiguous for users, and if a WARM_DATA
>>>>>> segment has two different hints, it can make GC inefficient.
>>>>>>
>>>>>> I wonder what your thoughts are about this.
>>>>>>
>>>>>> Thanks.
>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> ------------------------------------------------------------------------------
>>>>>> Check out the vibrant tech community on one of the world's most
>>>>>> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
>>>>>> _______________________________________________
>>>>>> Linux-f2fs-devel mailing list
>>>>>> Linux-f2fs-devel@lists.sourceforge.net
>>>>>> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
>>>>>>
>>>>>
>>>>
>>>> .
>>>>
>>>
>>>
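
The two policy tables in Jaegeuk's message (with and without fs_iohints) can be sketched in C as below. This is only one reading of the tables, not the patch that was eventually written: the enum and function names are local stand-ins made up for illustration (they are not the kernel's enum rw_hint or F2FS's segment constants), "-" is taken to mean that no hint is passed down, and the ioctl(cold), extension list, meta and node rows are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the user write-lifetime hints in the thread. */
enum write_life {
	LIFE_NOT_SET,
	LIFE_NONE,
	LIFE_SHORT,
	LIFE_MEDIUM,
	LIFE_LONG,
	LIFE_EXTREME
};

/* The three F2FS DATA logs named in the tables. */
enum data_log { HOT_DATA, WARM_DATA, COLD_DATA };

/*
 * User hint -> DATA log. Both tables agree on this column:
 * SHORT goes to HOT_DATA, EXTREME to COLD_DATA, everything else to WARM_DATA.
 */
static enum data_log user_hint_to_data_log(enum write_life hint)
{
	switch (hint) {
	case LIFE_SHORT:
		return HOT_DATA;
	case LIFE_EXTREME:
		return COLD_DATA;
	default:
		return WARM_DATA;
	}
}

/*
 * Hint handed to the block layer for user data.
 * Returns false when the table shows "-" (assumed: no hint passed down);
 * otherwise stores the hint in *out and returns true.
 */
static bool block_hint_for_user_data(bool fs_iohints, bool direct_io,
				     enum write_life user,
				     enum write_life *out)
{
	assert(out != NULL);

	/* direct_io rows: the user hint is kept as-is in both tables. */
	if (direct_io) {
		*out = user;
		return true;
	}

	/* buffered_io w/o fs_iohints: every row shows "-". */
	if (!fs_iohints)
		return false;

	/* buffered_io w/ fs_iohints: SHORT and EXTREME are kept, the rest become LONG. */
	if (user == LIFE_SHORT || user == LIFE_EXTREME)
		*out = user;
	else
		*out = LIFE_LONG;
	return true;
}
```

For example, under this reading a buffered write with LIFE_MEDIUM on a filesystem with fs_iohints lands in WARM_DATA and reaches the block layer as LIFE_LONG, while the same write through direct_io keeps LIFE_MEDIUM.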