linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH 0/2] apply write hints to select the type of segments
@ 2017-11-09  5:51 Hyunchul Lee
  2017-11-09  5:51 ` [RFC PATHC 1/2] f2fs: apply write hints to select the type of segments for buffered write Hyunchul Lee
                   ` (3 more replies)
  0 siblings, 4 replies; 30+ messages in thread
From: Hyunchul Lee @ 2017-11-09  5:51 UTC (permalink / raw)
  To: Jaegeuk Kim, Chao Yu
  Cc: linux-f2fs-devel, linux-kernel, linux-fsdevel, kernel-team, Hyunchul Lee

From: Hyunchul Lee <cheol.lee@lge.com>

Using write hints[1], applications can inform the life time of the data
written to devices. and this[2] reported that the write hints patch
decreased writes in NAND by 25%.

This hints help F2FS to determine the followings.
  1) the segment types where the data will be written.
  2) the hints that will be passed down to devices with the data of segments.

This patch set implements the first mapping from write hints to segment types
as shown below.

  hints                     segment type
  -----                     ------------
  WRITE_LIFE_SHORT          CURSEG_COLD_DATA
  WRITE_LIFE_EXTREME        CURSEG_HOT_DATA
  others                    CURSEG_WARM_DATA

The F2FS poliy for hot/cold seperation has precedence over this hints, And
hints are not applied in in-place update.

Before the second mapping is implemented, write hints are not passed down
to devices. Because it is better that the data of a segment have the same 
hint.

[1]: c75b1d9421f80f4143e389d2d50ddfc8a28c8c35
[2]: https://lwn.net/Articles/726477/

Hyunchul Lee (2):
  f2fs: apply write hints to select the type of segments for buffered
    write
  f2fs: apply write hints to select the type of segment for direct write

 fs/f2fs/data.c    | 101 ++++++++++++++++++++++++++++++++----------------------
 fs/f2fs/f2fs.h    |   1 +
 fs/f2fs/segment.c |  14 +++++++-
 3 files changed, 74 insertions(+), 42 deletions(-)

-- 
1.9.1

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [RFC PATHC 1/2] f2fs: apply write hints to select the type of segments for buffered write
  2017-11-09  5:51 [RFC PATCH 0/2] apply write hints to select the type of segments Hyunchul Lee
@ 2017-11-09  5:51 ` Hyunchul Lee
  2017-11-09  8:21   ` Chao Yu
  2017-11-09  5:51 ` [RFC PATHC 2/2] f2fs: apply write hints to select the type of segment for direct write Hyunchul Lee
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 30+ messages in thread
From: Hyunchul Lee @ 2017-11-09  5:51 UTC (permalink / raw)
  To: Jaegeuk Kim, Chao Yu
  Cc: linux-f2fs-devel, linux-kernel, linux-fsdevel, kernel-team, Hyunchul Lee

From: Hyunchul Lee <cheol.lee@lge.com>

Write hints helps F2FS to determine which type of segments would be
selected for buffered write.

This patch implements the mapping from write hints to segment types
as shown below.

  hints               segment type
  -----               ------------
  WRITE_LIFE_SHORT    CURSEG_COLD_DATA
  WRITE_LIFE_EXTREME  CURSEG_HOT_DATA
  others              CURSEG_WARM_DATA

the F2FS poliy for hot/cold seperation has precedence over this hints.
And hints are not applied in in-place update.

Signed-off-by: Hyunchul Lee <cheol.lee@lge.com>
---
 fs/f2fs/segment.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index c695ff4..45aef53 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -2258,6 +2258,18 @@ static bool __has_curseg_space(struct f2fs_sb_info *sbi, int type)
 	return false;
 }
 
+int rw_hint_to_seg_type(enum rw_hint hint)
+{
+	switch (hint) {
+	case WRITE_LIFE_SHORT:
+		return CURSEG_HOT_DATA;
+	case WRITE_LIFE_EXTREME:
+		return CURSEG_COLD_DATA;
+	default:
+		return CURSEG_WARM_DATA;
+	}
+}
+
 static int __get_segment_type_2(struct f2fs_io_info *fio)
 {
 	if (fio->type == DATA)
@@ -2292,7 +2304,7 @@ static int __get_segment_type_6(struct f2fs_io_info *fio)
 			return CURSEG_COLD_DATA;
 		if (is_inode_flag_set(inode, FI_HOT_DATA))
 			return CURSEG_HOT_DATA;
-		return CURSEG_WARM_DATA;
+		return rw_hint_to_seg_type(inode->i_write_hint);
 	} else {
 		if (IS_DNODE(fio->page))
 			return is_cold_node(fio->page) ? CURSEG_WARM_NODE :
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [RFC PATHC 2/2] f2fs: apply write hints to select the type of segment for direct write
  2017-11-09  5:51 [RFC PATCH 0/2] apply write hints to select the type of segments Hyunchul Lee
  2017-11-09  5:51 ` [RFC PATHC 1/2] f2fs: apply write hints to select the type of segments for buffered write Hyunchul Lee
@ 2017-11-09  5:51 ` Hyunchul Lee
  2017-11-11  0:38   ` [f2fs-dev] " Chao Yu
  2017-11-09  9:12 ` [RFC PATCH 0/2] apply write hints to select the type of segments Chao Yu
  2017-11-17 17:23 ` Christoph Hellwig
  3 siblings, 1 reply; 30+ messages in thread
From: Hyunchul Lee @ 2017-11-09  5:51 UTC (permalink / raw)
  To: Jaegeuk Kim, Chao Yu
  Cc: linux-f2fs-devel, linux-kernel, linux-fsdevel, kernel-team, Hyunchul Lee

From: Hyunchul Lee <cheol.lee@lge.com>

Select the type of the segment using write hints, when blocks are
allocated for direct write.

There are unhandled corner cases. Hints are not applied in
in-place update.  And if the blocks of a file is not pre-allocated
because of the invalid user buffer, CURSEG_WARM_DATA segment will
be selected.

Signed-off-by: Hyunchul Lee <cheol.lee@lge.com>
---
 fs/f2fs/data.c | 101 ++++++++++++++++++++++++++++++++++-----------------------
 fs/f2fs/f2fs.h |   1 +
 2 files changed, 61 insertions(+), 41 deletions(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 36b5352..d06048a 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -783,7 +783,7 @@ struct page *get_new_data_page(struct inode *inode,
 	return page;
 }
 
-static int __allocate_data_block(struct dnode_of_data *dn)
+static int __allocate_data_block(struct dnode_of_data *dn, int seg_type)
 {
 	struct f2fs_sb_info *sbi = F2FS_I_SB(dn->inode);
 	struct f2fs_summary sum;
@@ -808,7 +808,7 @@ static int __allocate_data_block(struct dnode_of_data *dn)
 	set_summary(&sum, dn->nid, dn->ofs_in_node, ni.version);
 
 	allocate_data_block(sbi, NULL, dn->data_blkaddr, &dn->data_blkaddr,
-					&sum, CURSEG_WARM_DATA, NULL, false);
+					&sum, seg_type, NULL, false);
 	set_data_blkaddr(dn);
 
 	/* update i_size */
@@ -827,42 +827,6 @@ static inline bool __force_buffered_io(struct inode *inode, int rw)
 			F2FS_I_SB(inode)->s_ndevs);
 }
 
-int f2fs_preallocate_blocks(struct kiocb *iocb, struct iov_iter *from)
-{
-	struct inode *inode = file_inode(iocb->ki_filp);
-	struct f2fs_map_blocks map;
-	int err = 0;
-
-	if (is_inode_flag_set(inode, FI_NO_PREALLOC))
-		return 0;
-
-	map.m_lblk = F2FS_BLK_ALIGN(iocb->ki_pos);
-	map.m_len = F2FS_BYTES_TO_BLK(iocb->ki_pos + iov_iter_count(from));
-	if (map.m_len > map.m_lblk)
-		map.m_len -= map.m_lblk;
-	else
-		map.m_len = 0;
-
-	map.m_next_pgofs = NULL;
-
-	if (iocb->ki_flags & IOCB_DIRECT) {
-		err = f2fs_convert_inline_inode(inode);
-		if (err)
-			return err;
-		return f2fs_map_blocks(inode, &map, 1,
-			__force_buffered_io(inode, WRITE) ?
-				F2FS_GET_BLOCK_PRE_AIO :
-				F2FS_GET_BLOCK_PRE_DIO);
-	}
-	if (iocb->ki_pos + iov_iter_count(from) > MAX_INLINE_DATA(inode)) {
-		err = f2fs_convert_inline_inode(inode);
-		if (err)
-			return err;
-	}
-	if (!f2fs_has_inline_data(inode))
-		return f2fs_map_blocks(inode, &map, 1, F2FS_GET_BLOCK_PRE_AIO);
-	return err;
-}
 
 static inline void __do_map_lock(struct f2fs_sb_info *sbi, int flag, bool lock)
 {
@@ -888,8 +852,8 @@ static inline void __do_map_lock(struct f2fs_sb_info *sbi, int flag, bool lock)
  *     b. do not use extent cache for better performance
  *     c. give the block addresses to blockdev
  */
-int f2fs_map_blocks(struct inode *inode, struct f2fs_map_blocks *map,
-						int create, int flag)
+static int __f2fs_map_blocks(struct inode *inode, struct f2fs_map_blocks *map,
+				int create, int flag, int seg_type)
 {
 	unsigned int maxblocks = map->m_len;
 	struct dnode_of_data dn;
@@ -957,7 +921,12 @@ int f2fs_map_blocks(struct inode *inode, struct f2fs_map_blocks *map,
 					last_ofs_in_node = dn.ofs_in_node;
 				}
 			} else {
-				err = __allocate_data_block(&dn);
+				/* if this inode is marked with FI_NO_PREALLOC,
+				 * @seg_type is NO_CHECK_TYPE
+				 */
+				if (seg_type == NO_CHECK_TYPE)
+					seg_type = CURSEG_WARM_DATA;
+				err = __allocate_data_block(&dn, seg_type);
 				if (!err)
 					set_inode_flag(inode, FI_APPEND_WRITE);
 			}
@@ -1048,6 +1017,51 @@ int f2fs_map_blocks(struct inode *inode, struct f2fs_map_blocks *map,
 	return err;
 }
 
+int f2fs_map_blocks(struct inode *inode, struct f2fs_map_blocks *map,
+						int create, int flag)
+{
+	return __f2fs_map_blocks(inode, map, create, flag, NO_CHECK_TYPE);
+}
+
+int f2fs_preallocate_blocks(struct kiocb *iocb, struct iov_iter *from)
+{
+	struct inode *inode = file_inode(iocb->ki_filp);
+	struct f2fs_map_blocks map;
+	int err = 0;
+
+	if (is_inode_flag_set(inode, FI_NO_PREALLOC))
+		return 0;
+
+	map.m_lblk = F2FS_BLK_ALIGN(iocb->ki_pos);
+	map.m_len = F2FS_BYTES_TO_BLK(iocb->ki_pos + iov_iter_count(from));
+	if (map.m_len > map.m_lblk)
+		map.m_len -= map.m_lblk;
+	else
+		map.m_len = 0;
+
+	map.m_next_pgofs = NULL;
+
+	if (iocb->ki_flags & IOCB_DIRECT) {
+		err = f2fs_convert_inline_inode(inode);
+		if (err)
+			return err;
+		return __f2fs_map_blocks(inode, &map, 1,
+			__force_buffered_io(inode, WRITE) ?
+				F2FS_GET_BLOCK_PRE_AIO :
+				F2FS_GET_BLOCK_PRE_DIO,
+				rw_hint_to_seg_type(iocb->ki_hint));
+	}
+	if (iocb->ki_pos + iov_iter_count(from) > MAX_INLINE_DATA(inode)) {
+		err = f2fs_convert_inline_inode(inode);
+		if (err)
+			return err;
+	}
+	if (!f2fs_has_inline_data(inode))
+		return f2fs_map_blocks(inode, &map, 1, F2FS_GET_BLOCK_PRE_AIO);
+
+	return err;
+}
+
 static int __get_data_block(struct inode *inode, sector_t iblock,
 			struct buffer_head *bh, int create, int flag,
 			pgoff_t *next_pgofs)
@@ -2082,6 +2096,11 @@ static ssize_t f2fs_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
 
 	trace_f2fs_direct_IO_enter(inode, offset, count, rw);
 
+	/* This is for avoiding the situation that the data of a segment is
+	 * passed down to devices with different hints
+	 */
+	iocb->ki_hint = WRITE_LIFE_NOT_SET;
+
 	down_read(&F2FS_I(inode)->dio_rwsem[rw]);
 	err = blockdev_direct_IO(iocb, inode, iter, get_data_block_dio);
 	up_read(&F2FS_I(inode)->dio_rwsem[rw]);
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 4b4a72f..9be5658 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -2562,6 +2562,7 @@ int lookup_journal_in_cursum(struct f2fs_journal *journal, int type,
 void destroy_segment_manager(struct f2fs_sb_info *sbi);
 int __init create_segment_manager_caches(void);
 void destroy_segment_manager_caches(void);
+int rw_hint_to_seg_type(enum rw_hint hint);
 
 /*
  * checkpoint.c
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [RFC PATHC 1/2] f2fs: apply write hints to select the type of segments for buffered write
  2017-11-09  5:51 ` [RFC PATHC 1/2] f2fs: apply write hints to select the type of segments for buffered write Hyunchul Lee
@ 2017-11-09  8:21   ` Chao Yu
  2017-11-09 18:19     ` Jaegeuk Kim
  0 siblings, 1 reply; 30+ messages in thread
From: Chao Yu @ 2017-11-09  8:21 UTC (permalink / raw)
  To: Hyunchul Lee, Jaegeuk Kim
  Cc: linux-f2fs-devel, linux-kernel, linux-fsdevel, kernel-team, Hyunchul Lee

On 2017/11/9 13:51, Hyunchul Lee wrote:
> From: Hyunchul Lee <cheol.lee@lge.com>
> 
> Write hints helps F2FS to determine which type of segments would be
> selected for buffered write.
> 
> This patch implements the mapping from write hints to segment types
> as shown below.
> 
>   hints               segment type
>   -----               ------------
>   WRITE_LIFE_SHORT    CURSEG_COLD_DATA
>   WRITE_LIFE_EXTREME  CURSEG_HOT_DATA

Should keep consistent with code implementation.

	WRITE_LIFE_SHORT    CURSEG_HOT_DATA
	WRITE_LIFE_EXTREME  CURSEG_COLD_DATA

>   others              CURSEG_WARM_DATA
> 
> the F2FS poliy for hot/cold seperation has precedence over this hints.
> And hints are not applied in in-place update.
> 
> Signed-off-by: Hyunchul Lee <cheol.lee@lge.com>

Reviewed-by: Chao Yu <yuchao0@huawei.com>

Thanks,

> ---
>  fs/f2fs/segment.c | 14 +++++++++++++-
>  1 file changed, 13 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
> index c695ff4..45aef53 100644
> --- a/fs/f2fs/segment.c
> +++ b/fs/f2fs/segment.c
> @@ -2258,6 +2258,18 @@ static bool __has_curseg_space(struct f2fs_sb_info *sbi, int type)
>  	return false;
>  }
>  
> +int rw_hint_to_seg_type(enum rw_hint hint)
> +{
> +	switch (hint) {
> +	case WRITE_LIFE_SHORT:
> +		return CURSEG_HOT_DATA;
> +	case WRITE_LIFE_EXTREME:
> +		return CURSEG_COLD_DATA;
> +	default:
> +		return CURSEG_WARM_DATA;
> +	}
> +}
> +
>  static int __get_segment_type_2(struct f2fs_io_info *fio)
>  {
>  	if (fio->type == DATA)
> @@ -2292,7 +2304,7 @@ static int __get_segment_type_6(struct f2fs_io_info *fio)
>  			return CURSEG_COLD_DATA;
>  		if (is_inode_flag_set(inode, FI_HOT_DATA))
>  			return CURSEG_HOT_DATA;
> -		return CURSEG_WARM_DATA;
> +		return rw_hint_to_seg_type(inode->i_write_hint);
>  	} else {
>  		if (IS_DNODE(fio->page))
>  			return is_cold_node(fio->page) ? CURSEG_WARM_NODE :
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH 0/2] apply write hints to select the type of segments
  2017-11-09  5:51 [RFC PATCH 0/2] apply write hints to select the type of segments Hyunchul Lee
  2017-11-09  5:51 ` [RFC PATHC 1/2] f2fs: apply write hints to select the type of segments for buffered write Hyunchul Lee
  2017-11-09  5:51 ` [RFC PATHC 2/2] f2fs: apply write hints to select the type of segment for direct write Hyunchul Lee
@ 2017-11-09  9:12 ` Chao Yu
  2017-11-09 18:16   ` Jaegeuk Kim
  2017-11-10  0:23   ` Hyunchul Lee
  2017-11-17 17:23 ` Christoph Hellwig
  3 siblings, 2 replies; 30+ messages in thread
From: Chao Yu @ 2017-11-09  9:12 UTC (permalink / raw)
  To: Hyunchul Lee, Jaegeuk Kim
  Cc: linux-f2fs-devel, linux-kernel, kernel-team, Hyunchul Lee, Chao Yu

On 2017/11/9 13:51, Hyunchul Lee wrote:
> From: Hyunchul Lee <cheol.lee@lge.com>
> 
> Using write hints[1], applications can inform the life time of the data
> written to devices. and this[2] reported that the write hints patch
> decreased writes in NAND by 25%.
> 
> This hints help F2FS to determine the followings.
>   1) the segment types where the data will be written.
>   2) the hints that will be passed down to devices with the data of segments.
> 
> This patch set implements the first mapping from write hints to segment types
> as shown below.
> 
>   hints                     segment type
>   -----                     ------------
>   WRITE_LIFE_SHORT          CURSEG_COLD_DATA
>   WRITE_LIFE_EXTREME        CURSEG_HOT_DATA
>   others                    CURSEG_WARM_DATA
> 
> The F2FS poliy for hot/cold seperation has precedence over this hints, And
> hints are not applied in in-place update.

Could we change to disable IPU if file/inode write hint is existing?

> 
> Before the second mapping is implemented, write hints are not passed down
> to devices. Because it is better that the data of a segment have the same 
> hint.
> 
> [1]: c75b1d9421f80f4143e389d2d50ddfc8a28c8c35
> [2]: https://lwn.net/Articles/726477/

Could you write a patch to support passing write hint to block layer for
buffered writes as below commit:
0127251c45ae ("ext4: add support for passing in write hints for buffered writes")

Thanks,

> 
> Hyunchul Lee (2):
>   f2fs: apply write hints to select the type of segments for buffered
>     write
>   f2fs: apply write hints to select the type of segment for direct write
> 
>  fs/f2fs/data.c    | 101 ++++++++++++++++++++++++++++++++----------------------
>  fs/f2fs/f2fs.h    |   1 +
>  fs/f2fs/segment.c |  14 +++++++-
>  3 files changed, 74 insertions(+), 42 deletions(-)
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH 0/2] apply write hints to select the type of segments
  2017-11-09  9:12 ` [RFC PATCH 0/2] apply write hints to select the type of segments Chao Yu
@ 2017-11-09 18:16   ` Jaegeuk Kim
  2017-11-10  2:31     ` Chao Yu
  2017-11-10  0:23   ` Hyunchul Lee
  1 sibling, 1 reply; 30+ messages in thread
From: Jaegeuk Kim @ 2017-11-09 18:16 UTC (permalink / raw)
  To: Chao Yu
  Cc: Hyunchul Lee, linux-f2fs-devel, linux-kernel, kernel-team, Hyunchul Lee

On 11/09, Chao Yu wrote:
> On 2017/11/9 13:51, Hyunchul Lee wrote:
> > From: Hyunchul Lee <cheol.lee@lge.com>
> > 
> > Using write hints[1], applications can inform the life time of the data
> > written to devices. and this[2] reported that the write hints patch
> > decreased writes in NAND by 25%.
> > 
> > This hints help F2FS to determine the followings.
> >   1) the segment types where the data will be written.
> >   2) the hints that will be passed down to devices with the data of segments.
> > 
> > This patch set implements the first mapping from write hints to segment types
> > as shown below.
> > 
> >   hints                     segment type
> >   -----                     ------------
> >   WRITE_LIFE_SHORT          CURSEG_COLD_DATA
> >   WRITE_LIFE_EXTREME        CURSEG_HOT_DATA
> >   others                    CURSEG_WARM_DATA
> > 
> > The F2FS poliy for hot/cold seperation has precedence over this hints, And
> > hints are not applied in in-place update.
> 
> Could we change to disable IPU if file/inode write hint is existing?

Well, this is a hint which is not related to IPU. I think user should be aware
of controlling IPU.

> > 
> > Before the second mapping is implemented, write hints are not passed down
> > to devices. Because it is better that the data of a segment have the same 
> > hint.
> > 
> > [1]: c75b1d9421f80f4143e389d2d50ddfc8a28c8c35
> > [2]: https://lwn.net/Articles/726477/
> 
> Could you write a patch to support passing write hint to block layer for
> buffered writes as below commit:
> 0127251c45ae ("ext4: add support for passing in write hints for buffered writes")
> 
> Thanks,
> 
> > 
> > Hyunchul Lee (2):
> >   f2fs: apply write hints to select the type of segments for buffered
> >     write
> >   f2fs: apply write hints to select the type of segment for direct write
> > 
> >  fs/f2fs/data.c    | 101 ++++++++++++++++++++++++++++++++----------------------
> >  fs/f2fs/f2fs.h    |   1 +
> >  fs/f2fs/segment.c |  14 +++++++-
> >  3 files changed, 74 insertions(+), 42 deletions(-)
> > 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATHC 1/2] f2fs: apply write hints to select the type of segments for buffered write
  2017-11-09  8:21   ` Chao Yu
@ 2017-11-09 18:19     ` Jaegeuk Kim
  0 siblings, 0 replies; 30+ messages in thread
From: Jaegeuk Kim @ 2017-11-09 18:19 UTC (permalink / raw)
  To: Chao Yu
  Cc: Hyunchul Lee, linux-f2fs-devel, linux-kernel, linux-fsdevel,
	kernel-team, Hyunchul Lee

On 11/09, Chao Yu wrote:
> On 2017/11/9 13:51, Hyunchul Lee wrote:
> > From: Hyunchul Lee <cheol.lee@lge.com>
> > 
> > Write hints helps F2FS to determine which type of segments would be
> > selected for buffered write.
> > 
> > This patch implements the mapping from write hints to segment types
> > as shown below.
> > 
> >   hints               segment type
> >   -----               ------------
> >   WRITE_LIFE_SHORT    CURSEG_COLD_DATA
> >   WRITE_LIFE_EXTREME  CURSEG_HOT_DATA
> 
> Should keep consistent with code implementation.

Merged with this fix. ;)

> 
> 	WRITE_LIFE_SHORT    CURSEG_HOT_DATA
> 	WRITE_LIFE_EXTREME  CURSEG_COLD_DATA
> 
> >   others              CURSEG_WARM_DATA
> > 
> > the F2FS poliy for hot/cold seperation has precedence over this hints.
> > And hints are not applied in in-place update.
> > 
> > Signed-off-by: Hyunchul Lee <cheol.lee@lge.com>
> 
> Reviewed-by: Chao Yu <yuchao0@huawei.com>
> 
> Thanks,
> 
> > ---
> >  fs/f2fs/segment.c | 14 +++++++++++++-
> >  1 file changed, 13 insertions(+), 1 deletion(-)
> > 
> > diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
> > index c695ff4..45aef53 100644
> > --- a/fs/f2fs/segment.c
> > +++ b/fs/f2fs/segment.c
> > @@ -2258,6 +2258,18 @@ static bool __has_curseg_space(struct f2fs_sb_info *sbi, int type)
> >  	return false;
> >  }
> >  
> > +int rw_hint_to_seg_type(enum rw_hint hint)
> > +{
> > +	switch (hint) {
> > +	case WRITE_LIFE_SHORT:
> > +		return CURSEG_HOT_DATA;
> > +	case WRITE_LIFE_EXTREME:
> > +		return CURSEG_COLD_DATA;
> > +	default:
> > +		return CURSEG_WARM_DATA;
> > +	}
> > +}
> > +
> >  static int __get_segment_type_2(struct f2fs_io_info *fio)
> >  {
> >  	if (fio->type == DATA)
> > @@ -2292,7 +2304,7 @@ static int __get_segment_type_6(struct f2fs_io_info *fio)
> >  			return CURSEG_COLD_DATA;
> >  		if (is_inode_flag_set(inode, FI_HOT_DATA))
> >  			return CURSEG_HOT_DATA;
> > -		return CURSEG_WARM_DATA;
> > +		return rw_hint_to_seg_type(inode->i_write_hint);
> >  	} else {
> >  		if (IS_DNODE(fio->page))
> >  			return is_cold_node(fio->page) ? CURSEG_WARM_NODE :
> > 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH 0/2] apply write hints to select the type of segments
  2017-11-09  9:12 ` [RFC PATCH 0/2] apply write hints to select the type of segments Chao Yu
  2017-11-09 18:16   ` Jaegeuk Kim
@ 2017-11-10  0:23   ` Hyunchul Lee
  2017-11-10  6:42     ` Chao Yu
  1 sibling, 1 reply; 30+ messages in thread
From: Hyunchul Lee @ 2017-11-10  0:23 UTC (permalink / raw)
  To: Chao Yu
  Cc: Jaegeuk Kim, linux-f2fs-devel, linux-kernel, kernel-team, Hyunchul Lee

Hello, Chao

On 11/09/2017 06:12 PM, Chao Yu wrote:
> On 2017/11/9 13:51, Hyunchul Lee wrote:
>> From: Hyunchul Lee <cheol.lee@lge.com>
>>
>> Using write hints[1], applications can inform the life time of the data
>> written to devices. and this[2] reported that the write hints patch
>> decreased writes in NAND by 25%.
>>
>> This hints help F2FS to determine the followings.
>>   1) the segment types where the data will be written.
>>   2) the hints that will be passed down to devices with the data of segments.
>>
>> This patch set implements the first mapping from write hints to segment types
>> as shown below.
>>
>>   hints                     segment type
>>   -----                     ------------
>>   WRITE_LIFE_SHORT          CURSEG_COLD_DATA
>>   WRITE_LIFE_EXTREME        CURSEG_HOT_DATA
>>   others                    CURSEG_WARM_DATA
>>
>> The F2FS poliy for hot/cold seperation has precedence over this hints, And
>> hints are not applied in in-place update.
> 
> Could we change to disable IPU if file/inode write hint is existing?
> 

I am afraid that this makes side effects. for example, this could cause
out-of-place updates even when there are not enough free segments. 
I can write the patch that handles these situations. But I wonder 
that this is required, and I am not sure which IPU polices can be disabled.

>>
>> Before the second mapping is implemented, write hints are not passed down
>> to devices. Because it is better that the data of a segment have the same 
>> hint.
>>
>> [1]: c75b1d9421f80f4143e389d2d50ddfc8a28c8c35
>> [2]: https://lwn.net/Articles/726477/
> 
> Could you write a patch to support passing write hint to block layer for
> buffered writes as below commit:
> 0127251c45ae ("ext4: add support for passing in write hints for buffered writes")
> 

Sure I will. I wrote it already ;)
I think that datas from the same segment should be passed down with the same
hint, and the following mapping is reasonable. I wonder what is your opinion
about it.

  segment type               hints
  ------------               -----
  CURSEG_COLD_DATA           WRITE_LIFE_EXTREME
  CURSEG_HOT_DATA            WRITE_LIFE_SHORT
  CURSEG_COLD_NODE           WRITE_LIFE_NORMAL
  CURSEG_HOT_NODE            WRITE_LIFE_MEDIUM
  others                     WRITE_LIFE_NONE
 
> Thanks,
> 
>>
>> Hyunchul Lee (2):
>>   f2fs: apply write hints to select the type of segments for buffered
>>     write
>>   f2fs: apply write hints to select the type of segment for direct write
>>
>>  fs/f2fs/data.c    | 101 ++++++++++++++++++++++++++++++++----------------------
>>  fs/f2fs/f2fs.h    |   1 +
>>  fs/f2fs/segment.c |  14 +++++++-
>>  3 files changed, 74 insertions(+), 42 deletions(-)
>>
> 
> 

Thanks

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH 0/2] apply write hints to select the type of segments
  2017-11-09 18:16   ` Jaegeuk Kim
@ 2017-11-10  2:31     ` Chao Yu
  0 siblings, 0 replies; 30+ messages in thread
From: Chao Yu @ 2017-11-10  2:31 UTC (permalink / raw)
  To: Jaegeuk Kim
  Cc: Hyunchul Lee, linux-f2fs-devel, linux-kernel, kernel-team, Hyunchul Lee

On 2017/11/10 2:16, Jaegeuk Kim wrote:
> On 11/09, Chao Yu wrote:
>> On 2017/11/9 13:51, Hyunchul Lee wrote:
>>> From: Hyunchul Lee <cheol.lee@lge.com>
>>>
>>> Using write hints[1], applications can inform the life time of the data
>>> written to devices. and this[2] reported that the write hints patch
>>> decreased writes in NAND by 25%.
>>>
>>> This hints help F2FS to determine the followings.
>>>   1) the segment types where the data will be written.
>>>   2) the hints that will be passed down to devices with the data of segments.
>>>
>>> This patch set implements the first mapping from write hints to segment types
>>> as shown below.
>>>
>>>   hints                     segment type
>>>   -----                     ------------
>>>   WRITE_LIFE_SHORT          CURSEG_COLD_DATA
>>>   WRITE_LIFE_EXTREME        CURSEG_HOT_DATA
>>>   others                    CURSEG_WARM_DATA
>>>
>>> The F2FS poliy for hot/cold seperation has precedence over this hints, And
>>> hints are not applied in in-place update.
>>
>> Could we change to disable IPU if file/inode write hint is existing?
> 
> Well, this is a hint which is not related to IPU. I think user should be aware
> of controlling IPU.

Yes, no matter filesystem trigger IPU or OPU, in FTL, old PPA will always become
invalid (life is end), FTL only cares about how long PPA will become invalid
instead of how LPA/PPA mapps. So from FTL aspect, it's no problem. :)

But from f2fs space management aspect, IPU method can make effect of hot/cold
separating strategy becoming worse, e.g. if we trigger IPU for some hot datas,
data migration in hot segment can be postponed, result in high usage of hot
segment, leaving more work to GC.

So there is a trade-off in between reducing node writes in fsync and reduce GC
work with better hot/cold separating...

Thanks,

> 
>>>
>>> Before the second mapping is implemented, write hints are not passed down
>>> to devices. Because it is better that the data of a segment have the same 
>>> hint.
>>>
>>> [1]: c75b1d9421f80f4143e389d2d50ddfc8a28c8c35
>>> [2]: https://lwn.net/Articles/726477/
>>
>> Could you write a patch to support passing write hint to block layer for
>> buffered writes as below commit:
>> 0127251c45ae ("ext4: add support for passing in write hints for buffered writes")
>>
>> Thanks,
>>
>>>
>>> Hyunchul Lee (2):
>>>   f2fs: apply write hints to select the type of segments for buffered
>>>     write
>>>   f2fs: apply write hints to select the type of segment for direct write
>>>
>>>  fs/f2fs/data.c    | 101 ++++++++++++++++++++++++++++++++----------------------
>>>  fs/f2fs/f2fs.h    |   1 +
>>>  fs/f2fs/segment.c |  14 +++++++-
>>>  3 files changed, 74 insertions(+), 42 deletions(-)
>>>
> 
> .
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH 0/2] apply write hints to select the type of segments
  2017-11-10  0:23   ` Hyunchul Lee
@ 2017-11-10  6:42     ` Chao Yu
  2017-11-13  0:24       ` Hyunchul Lee
  0 siblings, 1 reply; 30+ messages in thread
From: Chao Yu @ 2017-11-10  6:42 UTC (permalink / raw)
  To: Hyunchul Lee
  Cc: Jaegeuk Kim, linux-f2fs-devel, linux-kernel, kernel-team, Hyunchul Lee

On 2017/11/10 8:23, Hyunchul Lee wrote:
> Hello, Chao
> 
> On 11/09/2017 06:12 PM, Chao Yu wrote:
>> On 2017/11/9 13:51, Hyunchul Lee wrote:
>>> From: Hyunchul Lee <cheol.lee@lge.com>
>>>
>>> Using write hints[1], applications can inform the life time of the data
>>> written to devices. and this[2] reported that the write hints patch
>>> decreased writes in NAND by 25%.
>>>
>>> This hints help F2FS to determine the followings.
>>>   1) the segment types where the data will be written.
>>>   2) the hints that will be passed down to devices with the data of segments.
>>>
>>> This patch set implements the first mapping from write hints to segment types
>>> as shown below.
>>>
>>>   hints                     segment type
>>>   -----                     ------------
>>>   WRITE_LIFE_SHORT          CURSEG_COLD_DATA
>>>   WRITE_LIFE_EXTREME        CURSEG_HOT_DATA
>>>   others                    CURSEG_WARM_DATA
>>>
>>> The F2FS poliy for hot/cold seperation has precedence over this hints, And
>>> hints are not applied in in-place update.
>>
>> Could we change to disable IPU if file/inode write hint is existing?
>>
> 
> I am afraid that this makes side effects. for example, this could cause
> out-of-place updates even when there are not enough free segments. 
> I can write the patch that handles these situations. But I wonder 
> that this is required, and I am not sure which IPU polices can be disabled.

Oh, As I replied in another thread, I think IPU just affects filesystem
hot/cold separating, rather than this feature. So I think it will be okay
to not consider it.

> 
>>>
>>> Before the second mapping is implemented, write hints are not passed down
>>> to devices. Because it is better that the data of a segment have the same 
>>> hint.
>>>
>>> [1]: c75b1d9421f80f4143e389d2d50ddfc8a28c8c35
>>> [2]: https://lwn.net/Articles/726477/
>>
>> Could you write a patch to support passing write hint to block layer for
>> buffered writes as below commit:
>> 0127251c45ae ("ext4: add support for passing in write hints for buffered writes")
>>
> 
> Sure I will. I wrote it already ;)

Cool, ;)

> I think that datas from the same segment should be passed down with the same
> hint, and the following mapping is reasonable. I wonder what is your opinion
> about it.
> 
>   segment type               hints
>   ------------               -----
>   CURSEG_COLD_DATA           WRITE_LIFE_EXTREME
>   CURSEG_HOT_DATA            WRITE_LIFE_SHORT
>   CURSEG_COLD_NODE           WRITE_LIFE_NORMAL

We have WRITE_LIFE_LONG defined rather than WRITE_LIFE_NORMAL in fs.h?

>   CURSEG_HOT_NODE            WRITE_LIFE_MEDIUM

As I know, in scenario of cell phone, data of meta_inode is hottest, then hot
data, warm node, and cold node should be coldest. So I suggested we can define
as below:

META_DATA			WRITE_LIFE_SHORT
HOT_DATA & WARM_NODE		WRITE_LIFE_MEDIUM
HOT_NODE & WARM_DATA		WRITE_LIFE_LONG
COLD_NODE & COLD_DATA		WRITE_LIFE_EXTREME

Thanks,

>   others                     WRITE_LIFE_NONE
>  
>> Thanks,
>>
>>>
>>> Hyunchul Lee (2):
>>>   f2fs: apply write hints to select the type of segments for buffered
>>>     write
>>>   f2fs: apply write hints to select the type of segment for direct write
>>>
>>>  fs/f2fs/data.c    | 101 ++++++++++++++++++++++++++++++++----------------------
>>>  fs/f2fs/f2fs.h    |   1 +
>>>  fs/f2fs/segment.c |  14 +++++++-
>>>  3 files changed, 74 insertions(+), 42 deletions(-)
>>>
>>
>>
> 
> Thanks
> 
> .
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [f2fs-dev] [RFC PATHC 2/2] f2fs: apply write hints to select the type of segment for direct write
  2017-11-09  5:51 ` [RFC PATHC 2/2] f2fs: apply write hints to select the type of segment for direct write Hyunchul Lee
@ 2017-11-11  0:38   ` Chao Yu
  2017-11-13  0:07     ` Hyunchul Lee
  0 siblings, 1 reply; 30+ messages in thread
From: Chao Yu @ 2017-11-11  0:38 UTC (permalink / raw)
  To: Hyunchul Lee, Jaegeuk Kim, Chao Yu
  Cc: kernel-team, Hyunchul Lee, linux-kernel, linux-f2fs-devel, chao

On 2017/11/9 13:51, Hyunchul Lee wrote:
> From: Hyunchul Lee <cheol.lee@lge.com>
> 
> Select the type of the segment using write hints, when blocks are
> allocated for direct write.
> 
> There are unhandled corner cases. Hints are not applied in
> in-place update.  And if the blocks of a file is not pre-allocated
> because of the invalid user buffer, CURSEG_WARM_DATA segment will
> be selected.
> 
> Signed-off-by: Hyunchul Lee <cheol.lee@lge.com>
> ---
>  fs/f2fs/data.c | 101 ++++++++++++++++++++++++++++++++++-----------------------
>  fs/f2fs/f2fs.h |   1 +
>  2 files changed, 61 insertions(+), 41 deletions(-)
> 
> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
> index 36b5352..d06048a 100644
> --- a/fs/f2fs/data.c
> +++ b/fs/f2fs/data.c
> @@ -783,7 +783,7 @@ struct page *get_new_data_page(struct inode *inode,
>  	return page;
>  }
>  
> -static int __allocate_data_block(struct dnode_of_data *dn)
> +static int __allocate_data_block(struct dnode_of_data *dn, int seg_type)
>  {
>  	struct f2fs_sb_info *sbi = F2FS_I_SB(dn->inode);
>  	struct f2fs_summary sum;
> @@ -808,7 +808,7 @@ static int __allocate_data_block(struct dnode_of_data *dn)
>  	set_summary(&sum, dn->nid, dn->ofs_in_node, ni.version);
>  
>  	allocate_data_block(sbi, NULL, dn->data_blkaddr, &dn->data_blkaddr,
> -					&sum, CURSEG_WARM_DATA, NULL, false);
> +					&sum, seg_type, NULL, false);
>  	set_data_blkaddr(dn);
>  
>  	/* update i_size */
> @@ -827,42 +827,6 @@ static inline bool __force_buffered_io(struct inode *inode, int rw)
>  			F2FS_I_SB(inode)->s_ndevs);
>  }
>  
> -int f2fs_preallocate_blocks(struct kiocb *iocb, struct iov_iter *from)
> -{
> -	struct inode *inode = file_inode(iocb->ki_filp);
> -	struct f2fs_map_blocks map;
> -	int err = 0;
> -
> -	if (is_inode_flag_set(inode, FI_NO_PREALLOC))
> -		return 0;
> -
> -	map.m_lblk = F2FS_BLK_ALIGN(iocb->ki_pos);
> -	map.m_len = F2FS_BYTES_TO_BLK(iocb->ki_pos + iov_iter_count(from));
> -	if (map.m_len > map.m_lblk)
> -		map.m_len -= map.m_lblk;
> -	else
> -		map.m_len = 0;
> -
> -	map.m_next_pgofs = NULL;
> -
> -	if (iocb->ki_flags & IOCB_DIRECT) {
> -		err = f2fs_convert_inline_inode(inode);
> -		if (err)
> -			return err;
> -		return f2fs_map_blocks(inode, &map, 1,
> -			__force_buffered_io(inode, WRITE) ?
> -				F2FS_GET_BLOCK_PRE_AIO :
> -				F2FS_GET_BLOCK_PRE_DIO);
> -	}
> -	if (iocb->ki_pos + iov_iter_count(from) > MAX_INLINE_DATA(inode)) {
> -		err = f2fs_convert_inline_inode(inode);
> -		if (err)
> -			return err;
> -	}
> -	if (!f2fs_has_inline_data(inode))
> -		return f2fs_map_blocks(inode, &map, 1, F2FS_GET_BLOCK_PRE_AIO);
> -	return err;
> -}
>  
>  static inline void __do_map_lock(struct f2fs_sb_info *sbi, int flag, bool lock)
>  {
> @@ -888,8 +852,8 @@ static inline void __do_map_lock(struct f2fs_sb_info *sbi, int flag, bool lock)
>   *     b. do not use extent cache for better performance
>   *     c. give the block addresses to blockdev
>   */
> -int f2fs_map_blocks(struct inode *inode, struct f2fs_map_blocks *map,
> -						int create, int flag)
> +static int __f2fs_map_blocks(struct inode *inode, struct f2fs_map_blocks *map,
> +				int create, int flag, int seg_type)
>  {
>  	unsigned int maxblocks = map->m_len;
>  	struct dnode_of_data dn;
> @@ -957,7 +921,12 @@ int f2fs_map_blocks(struct inode *inode, struct f2fs_map_blocks *map,
>  					last_ofs_in_node = dn.ofs_in_node;
>  				}
>  			} else {
> -				err = __allocate_data_block(&dn);
> +				/* if this inode is marked with FI_NO_PREALLOC,
> +				 * @seg_type is NO_CHECK_TYPE
> +				 */
> +				if (seg_type == NO_CHECK_TYPE)
> +					seg_type = CURSEG_WARM_DATA;
> +				err = __allocate_data_block(&dn, seg_type);

We need to use inode.i_write_hint instead of ki_hint passed from file.f_write_hint?

Thanks,

>  				if (!err)
>  					set_inode_flag(inode, FI_APPEND_WRITE);
>  			}
> @@ -1048,6 +1017,51 @@ int f2fs_map_blocks(struct inode *inode, struct f2fs_map_blocks *map,
>  	return err;
>  }
>  
> +int f2fs_map_blocks(struct inode *inode, struct f2fs_map_blocks *map,
> +						int create, int flag)
> +{
> +	return __f2fs_map_blocks(inode, map, create, flag, NO_CHECK_TYPE);
> +}
> +
> +int f2fs_preallocate_blocks(struct kiocb *iocb, struct iov_iter *from)
> +{
> +	struct inode *inode = file_inode(iocb->ki_filp);
> +	struct f2fs_map_blocks map;
> +	int err = 0;
> +
> +	if (is_inode_flag_set(inode, FI_NO_PREALLOC))
> +		return 0;
> +
> +	map.m_lblk = F2FS_BLK_ALIGN(iocb->ki_pos);
> +	map.m_len = F2FS_BYTES_TO_BLK(iocb->ki_pos + iov_iter_count(from));
> +	if (map.m_len > map.m_lblk)
> +		map.m_len -= map.m_lblk;
> +	else
> +		map.m_len = 0;
> +
> +	map.m_next_pgofs = NULL;
> +
> +	if (iocb->ki_flags & IOCB_DIRECT) {
> +		err = f2fs_convert_inline_inode(inode);
> +		if (err)
> +			return err;
> +		return __f2fs_map_blocks(inode, &map, 1,
> +			__force_buffered_io(inode, WRITE) ?
> +				F2FS_GET_BLOCK_PRE_AIO :
> +				F2FS_GET_BLOCK_PRE_DIO,
> +				rw_hint_to_seg_type(iocb->ki_hint));
> +	}
> +	if (iocb->ki_pos + iov_iter_count(from) > MAX_INLINE_DATA(inode)) {
> +		err = f2fs_convert_inline_inode(inode);
> +		if (err)
> +			return err;
> +	}
> +	if (!f2fs_has_inline_data(inode))
> +		return f2fs_map_blocks(inode, &map, 1, F2FS_GET_BLOCK_PRE_AIO);
> +
> +	return err;
> +}
> +
>  static int __get_data_block(struct inode *inode, sector_t iblock,
>  			struct buffer_head *bh, int create, int flag,
>  			pgoff_t *next_pgofs)
> @@ -2082,6 +2096,11 @@ static ssize_t f2fs_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
>  
>  	trace_f2fs_direct_IO_enter(inode, offset, count, rw);
>  
> +	/* This is for avoiding the situation that the data of a segment is
> +	 * passed down to devices with different hints
> +	 */
> +	iocb->ki_hint = WRITE_LIFE_NOT_SET;
> +
>  	down_read(&F2FS_I(inode)->dio_rwsem[rw]);
>  	err = blockdev_direct_IO(iocb, inode, iter, get_data_block_dio);
>  	up_read(&F2FS_I(inode)->dio_rwsem[rw]);
> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> index 4b4a72f..9be5658 100644
> --- a/fs/f2fs/f2fs.h
> +++ b/fs/f2fs/f2fs.h
> @@ -2562,6 +2562,7 @@ int lookup_journal_in_cursum(struct f2fs_journal *journal, int type,
>  void destroy_segment_manager(struct f2fs_sb_info *sbi);
>  int __init create_segment_manager_caches(void);
>  void destroy_segment_manager_caches(void);
> +int rw_hint_to_seg_type(enum rw_hint hint);
>  
>  /*
>   * checkpoint.c
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [f2fs-dev] [RFC PATHC 2/2] f2fs: apply write hints to select the type of segment for direct write
  2017-11-11  0:38   ` [f2fs-dev] " Chao Yu
@ 2017-11-13  0:07     ` Hyunchul Lee
  2017-11-13  1:24       ` Chao Yu
  0 siblings, 1 reply; 30+ messages in thread
From: Hyunchul Lee @ 2017-11-13  0:07 UTC (permalink / raw)
  To: Chao Yu, Jaegeuk Kim, Chao Yu
  Cc: kernel-team, Hyunchul Lee, linux-kernel, linux-f2fs-devel, kernel-team

On 11/11/2017 09:38 AM, Chao Yu wrote:
> On 2017/11/9 13:51, Hyunchul Lee wrote:
>> From: Hyunchul Lee <cheol.lee@lge.com>
>>
>> Select the type of the segment using write hints, when blocks are
>> allocated for direct write.
>>
>> There are unhandled corner cases. Hints are not applied in
>> in-place update.  And if the blocks of a file is not pre-allocated
>> because of the invalid user buffer, CURSEG_WARM_DATA segment will
>> be selected.
>>
>> Signed-off-by: Hyunchul Lee <cheol.lee@lge.com>
>> ---
>>  fs/f2fs/data.c | 101 ++++++++++++++++++++++++++++++++++-----------------------
>>  fs/f2fs/f2fs.h |   1 +
>>  2 files changed, 61 insertions(+), 41 deletions(-)
>>
>> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
>> index 36b5352..d06048a 100644
>> --- a/fs/f2fs/data.c
>> +++ b/fs/f2fs/data.c
>> @@ -783,7 +783,7 @@ struct page *get_new_data_page(struct inode *inode,
>>  	return page;
>>  }
>>  
>> -static int __allocate_data_block(struct dnode_of_data *dn)
>> +static int __allocate_data_block(struct dnode_of_data *dn, int seg_type)
>>  {
>>  	struct f2fs_sb_info *sbi = F2FS_I_SB(dn->inode);
>>  	struct f2fs_summary sum;
>> @@ -808,7 +808,7 @@ static int __allocate_data_block(struct dnode_of_data *dn)
>>  	set_summary(&sum, dn->nid, dn->ofs_in_node, ni.version);
>>  
>>  	allocate_data_block(sbi, NULL, dn->data_blkaddr, &dn->data_blkaddr,
>> -					&sum, CURSEG_WARM_DATA, NULL, false);
>> +					&sum, seg_type, NULL, false);
>>  	set_data_blkaddr(dn);
>>  
>>  	/* update i_size */
>> @@ -827,42 +827,6 @@ static inline bool __force_buffered_io(struct inode *inode, int rw)
>>  			F2FS_I_SB(inode)->s_ndevs);
>>  }
>>  
>> -int f2fs_preallocate_blocks(struct kiocb *iocb, struct iov_iter *from)
>> -{
>> -	struct inode *inode = file_inode(iocb->ki_filp);
>> -	struct f2fs_map_blocks map;
>> -	int err = 0;
>> -
>> -	if (is_inode_flag_set(inode, FI_NO_PREALLOC))
>> -		return 0;
>> -
>> -	map.m_lblk = F2FS_BLK_ALIGN(iocb->ki_pos);
>> -	map.m_len = F2FS_BYTES_TO_BLK(iocb->ki_pos + iov_iter_count(from));
>> -	if (map.m_len > map.m_lblk)
>> -		map.m_len -= map.m_lblk;
>> -	else
>> -		map.m_len = 0;
>> -
>> -	map.m_next_pgofs = NULL;
>> -
>> -	if (iocb->ki_flags & IOCB_DIRECT) {
>> -		err = f2fs_convert_inline_inode(inode);
>> -		if (err)
>> -			return err;
>> -		return f2fs_map_blocks(inode, &map, 1,
>> -			__force_buffered_io(inode, WRITE) ?
>> -				F2FS_GET_BLOCK_PRE_AIO :
>> -				F2FS_GET_BLOCK_PRE_DIO);
>> -	}
>> -	if (iocb->ki_pos + iov_iter_count(from) > MAX_INLINE_DATA(inode)) {
>> -		err = f2fs_convert_inline_inode(inode);
>> -		if (err)
>> -			return err;
>> -	}
>> -	if (!f2fs_has_inline_data(inode))
>> -		return f2fs_map_blocks(inode, &map, 1, F2FS_GET_BLOCK_PRE_AIO);
>> -	return err;
>> -}
>>  
>>  static inline void __do_map_lock(struct f2fs_sb_info *sbi, int flag, bool lock)
>>  {
>> @@ -888,8 +852,8 @@ static inline void __do_map_lock(struct f2fs_sb_info *sbi, int flag, bool lock)
>>   *     b. do not use extent cache for better performance
>>   *     c. give the block addresses to blockdev
>>   */
>> -int f2fs_map_blocks(struct inode *inode, struct f2fs_map_blocks *map,
>> -						int create, int flag)
>> +static int __f2fs_map_blocks(struct inode *inode, struct f2fs_map_blocks *map,
>> +				int create, int flag, int seg_type)
>>  {
>>  	unsigned int maxblocks = map->m_len;
>>  	struct dnode_of_data dn;
>> @@ -957,7 +921,12 @@ int f2fs_map_blocks(struct inode *inode, struct f2fs_map_blocks *map,
>>  					last_ofs_in_node = dn.ofs_in_node;
>>  				}
>>  			} else {
>> -				err = __allocate_data_block(&dn);
>> +				/* if this inode is marked with FI_NO_PREALLOC,
>> +				 * @seg_type is NO_CHECK_TYPE
>> +				 */
>> +				if (seg_type == NO_CHECK_TYPE)
>> +					seg_type = CURSEG_WARM_DATA;
>> +				err = __allocate_data_block(&dn, seg_type);
> 
> We need to use inode.i_write_hint instead of ki_hint passed from file.f_write_hint?
> 

The following commit says to use file.f_write_hint if it is available. 
"c75b1d9 fs: add fcntl() interface for setting/getting write life time hints"

And ki_hint is assiged to file.f_write_hint or inode.i_write_hint 
by file_write_hint() in init_sync_kiocb().
So, I think we need to use ki_hint instead of inode.i_write_hint.

Thanks

> Thanks,
> 
>>  				if (!err)
>>  					set_inode_flag(inode, FI_APPEND_WRITE);
>>  			}
>> @@ -1048,6 +1017,51 @@ int f2fs_map_blocks(struct inode *inode, struct f2fs_map_blocks *map,
>>  	return err;
>>  }
>>  
>> +int f2fs_map_blocks(struct inode *inode, struct f2fs_map_blocks *map,
>> +						int create, int flag)
>> +{
>> +	return __f2fs_map_blocks(inode, map, create, flag, NO_CHECK_TYPE);
>> +}
>> +
>> +int f2fs_preallocate_blocks(struct kiocb *iocb, struct iov_iter *from)
>> +{
>> +	struct inode *inode = file_inode(iocb->ki_filp);
>> +	struct f2fs_map_blocks map;
>> +	int err = 0;
>> +
>> +	if (is_inode_flag_set(inode, FI_NO_PREALLOC))
>> +		return 0;
>> +
>> +	map.m_lblk = F2FS_BLK_ALIGN(iocb->ki_pos);
>> +	map.m_len = F2FS_BYTES_TO_BLK(iocb->ki_pos + iov_iter_count(from));
>> +	if (map.m_len > map.m_lblk)
>> +		map.m_len -= map.m_lblk;
>> +	else
>> +		map.m_len = 0;
>> +
>> +	map.m_next_pgofs = NULL;
>> +
>> +	if (iocb->ki_flags & IOCB_DIRECT) {
>> +		err = f2fs_convert_inline_inode(inode);
>> +		if (err)
>> +			return err;
>> +		return __f2fs_map_blocks(inode, &map, 1,
>> +			__force_buffered_io(inode, WRITE) ?
>> +				F2FS_GET_BLOCK_PRE_AIO :
>> +				F2FS_GET_BLOCK_PRE_DIO,
>> +				rw_hint_to_seg_type(iocb->ki_hint));
>> +	}
>> +	if (iocb->ki_pos + iov_iter_count(from) > MAX_INLINE_DATA(inode)) {
>> +		err = f2fs_convert_inline_inode(inode);
>> +		if (err)
>> +			return err;
>> +	}
>> +	if (!f2fs_has_inline_data(inode))
>> +		return f2fs_map_blocks(inode, &map, 1, F2FS_GET_BLOCK_PRE_AIO);
>> +
>> +	return err;
>> +}
>> +
>>  static int __get_data_block(struct inode *inode, sector_t iblock,
>>  			struct buffer_head *bh, int create, int flag,
>>  			pgoff_t *next_pgofs)
>> @@ -2082,6 +2096,11 @@ static ssize_t f2fs_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
>>  
>>  	trace_f2fs_direct_IO_enter(inode, offset, count, rw);
>>  
>> +	/* This is for avoiding the situation that the data of a segment is
>> +	 * passed down to devices with different hints
>> +	 */
>> +	iocb->ki_hint = WRITE_LIFE_NOT_SET;
>> +
>>  	down_read(&F2FS_I(inode)->dio_rwsem[rw]);
>>  	err = blockdev_direct_IO(iocb, inode, iter, get_data_block_dio);
>>  	up_read(&F2FS_I(inode)->dio_rwsem[rw]);
>> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
>> index 4b4a72f..9be5658 100644
>> --- a/fs/f2fs/f2fs.h
>> +++ b/fs/f2fs/f2fs.h
>> @@ -2562,6 +2562,7 @@ int lookup_journal_in_cursum(struct f2fs_journal *journal, int type,
>>  void destroy_segment_manager(struct f2fs_sb_info *sbi);
>>  int __init create_segment_manager_caches(void);
>>  void destroy_segment_manager_caches(void);
>> +int rw_hint_to_seg_type(enum rw_hint hint);
>>  
>>  /*
>>   * checkpoint.c
>>
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH 0/2] apply write hints to select the type of segments
  2017-11-10  6:42     ` Chao Yu
@ 2017-11-13  0:24       ` Hyunchul Lee
  2017-11-13  1:26         ` Chao Yu
  0 siblings, 1 reply; 30+ messages in thread
From: Hyunchul Lee @ 2017-11-13  0:24 UTC (permalink / raw)
  To: Chao Yu
  Cc: Jaegeuk Kim, linux-f2fs-devel, linux-kernel, kernel-team, Hyunchul Lee

On 11/10/2017 03:42 PM, Chao Yu wrote:
> On 2017/11/10 8:23, Hyunchul Lee wrote:
>> Hello, Chao
>>
>> On 11/09/2017 06:12 PM, Chao Yu wrote:
>>> On 2017/11/9 13:51, Hyunchul Lee wrote:
>>>> From: Hyunchul Lee <cheol.lee@lge.com>
>>>>
>>>> Using write hints[1], applications can inform the life time of the data
>>>> written to devices. and this[2] reported that the write hints patch
>>>> decreased writes in NAND by 25%.
>>>>
>>>> This hints help F2FS to determine the followings.
>>>>   1) the segment types where the data will be written.
>>>>   2) the hints that will be passed down to devices with the data of segments.
>>>>
>>>> This patch set implements the first mapping from write hints to segment types
>>>> as shown below.
>>>>
>>>>   hints                     segment type
>>>>   -----                     ------------
>>>>   WRITE_LIFE_SHORT          CURSEG_COLD_DATA
>>>>   WRITE_LIFE_EXTREME        CURSEG_HOT_DATA
>>>>   others                    CURSEG_WARM_DATA
>>>>
>>>> The F2FS poliy for hot/cold seperation has precedence over this hints, And
>>>> hints are not applied in in-place update.
>>>
>>> Could we change to disable IPU if file/inode write hint is existing?
>>>
>>
>> I am afraid that this makes side effects. for example, this could cause
>> out-of-place updates even when there are not enough free segments. 
>> I can write the patch that handles these situations. But I wonder 
>> that this is required, and I am not sure which IPU polices can be disabled.
> 
> Oh, As I replied in another thread, I think IPU just affects filesystem
> hot/cold separating, rather than this feature. So I think it will be okay
> to not consider it.
> 
>>
>>>>
>>>> Before the second mapping is implemented, write hints are not passed down
>>>> to devices. Because it is better that the data of a segment have the same 
>>>> hint.
>>>>
>>>> [1]: c75b1d9421f80f4143e389d2d50ddfc8a28c8c35
>>>> [2]: https://lwn.net/Articles/726477/
>>>
>>> Could you write a patch to support passing write hint to block layer for
>>> buffered writes as below commit:
>>> 0127251c45ae ("ext4: add support for passing in write hints for buffered writes")
>>>
>>
>> Sure I will. I wrote it already ;)
> 
> Cool, ;)
> 
>> I think that datas from the same segment should be passed down with the same
>> hint, and the following mapping is reasonable. I wonder what is your opinion
>> about it.
>>
>>   segment type               hints
>>   ------------               -----
>>   CURSEG_COLD_DATA           WRITE_LIFE_EXTREME
>>   CURSEG_HOT_DATA            WRITE_LIFE_SHORT
>>   CURSEG_COLD_NODE           WRITE_LIFE_NORMAL
> 
> We have WRITE_LIFE_LONG defined rather than WRITE_LIFE_NORMAL in fs.h?
> 
>>   CURSEG_HOT_NODE            WRITE_LIFE_MEDIUM
> 
> As I know, in scenario of cell phone, data of meta_inode is hottest, then hot
> data, warm node, and cold node should be coldest. So I suggested we can define
> as below:
> 
> META_DATA			WRITE_LIFE_SHORT
> HOT_DATA & WARM_NODE		WRITE_LIFE_MEDIUM
> HOT_NODE & WARM_DATA		WRITE_LIFE_LONG
> COLD_NODE & COLD_DATA		WRITE_LIFE_EXTREME
> 

I agree, But I am not sure that assigning the same hint to a node and data
segment is good. Because NVMe is likely to write them in the same erase 
block if they have the same hint.

Thanks.

> Thanks,
> 
>>   others                     WRITE_LIFE_NONE
>>  
>>> Thanks,
>>>
>>>>
>>>> Hyunchul Lee (2):
>>>>   f2fs: apply write hints to select the type of segments for buffered
>>>>     write
>>>>   f2fs: apply write hints to select the type of segment for direct write
>>>>
>>>>  fs/f2fs/data.c    | 101 ++++++++++++++++++++++++++++++++----------------------
>>>>  fs/f2fs/f2fs.h    |   1 +
>>>>  fs/f2fs/segment.c |  14 +++++++-
>>>>  3 files changed, 74 insertions(+), 42 deletions(-)
>>>>
>>>
>>>
>>
>> Thanks
>>
>> .
>>
> 
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [f2fs-dev] [RFC PATHC 2/2] f2fs: apply write hints to select the type of segment for direct write
  2017-11-13  0:07     ` Hyunchul Lee
@ 2017-11-13  1:24       ` Chao Yu
  2017-11-13  1:48         ` Hyunchul Lee
  0 siblings, 1 reply; 30+ messages in thread
From: Chao Yu @ 2017-11-13  1:24 UTC (permalink / raw)
  To: Hyunchul Lee, Chao Yu, Jaegeuk Kim
  Cc: kernel-team, Hyunchul Lee, linux-kernel, linux-f2fs-devel

On 2017/11/13 8:07, Hyunchul Lee wrote:
> On 11/11/2017 09:38 AM, Chao Yu wrote:
>> On 2017/11/9 13:51, Hyunchul Lee wrote:
>>> From: Hyunchul Lee <cheol.lee@lge.com>
>>>
>>> Select the type of the segment using write hints, when blocks are
>>> allocated for direct write.
>>>
>>> There are unhandled corner cases. Hints are not applied in
>>> in-place update.  And if the blocks of a file is not pre-allocated
>>> because of the invalid user buffer, CURSEG_WARM_DATA segment will
>>> be selected.
>>>
>>> Signed-off-by: Hyunchul Lee <cheol.lee@lge.com>
>>> ---
>>>  fs/f2fs/data.c | 101 ++++++++++++++++++++++++++++++++++-----------------------
>>>  fs/f2fs/f2fs.h |   1 +
>>>  2 files changed, 61 insertions(+), 41 deletions(-)
>>>
>>> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
>>> index 36b5352..d06048a 100644
>>> --- a/fs/f2fs/data.c
>>> +++ b/fs/f2fs/data.c
>>> @@ -783,7 +783,7 @@ struct page *get_new_data_page(struct inode *inode,
>>>  	return page;
>>>  }
>>>  
>>> -static int __allocate_data_block(struct dnode_of_data *dn)
>>> +static int __allocate_data_block(struct dnode_of_data *dn, int seg_type)
>>>  {
>>>  	struct f2fs_sb_info *sbi = F2FS_I_SB(dn->inode);
>>>  	struct f2fs_summary sum;
>>> @@ -808,7 +808,7 @@ static int __allocate_data_block(struct dnode_of_data *dn)
>>>  	set_summary(&sum, dn->nid, dn->ofs_in_node, ni.version);
>>>  
>>>  	allocate_data_block(sbi, NULL, dn->data_blkaddr, &dn->data_blkaddr,
>>> -					&sum, CURSEG_WARM_DATA, NULL, false);
>>> +					&sum, seg_type, NULL, false);
>>>  	set_data_blkaddr(dn);
>>>  
>>>  	/* update i_size */
>>> @@ -827,42 +827,6 @@ static inline bool __force_buffered_io(struct inode *inode, int rw)
>>>  			F2FS_I_SB(inode)->s_ndevs);
>>>  }
>>>  
>>> -int f2fs_preallocate_blocks(struct kiocb *iocb, struct iov_iter *from)
>>> -{
>>> -	struct inode *inode = file_inode(iocb->ki_filp);
>>> -	struct f2fs_map_blocks map;
>>> -	int err = 0;
>>> -
>>> -	if (is_inode_flag_set(inode, FI_NO_PREALLOC))
>>> -		return 0;
>>> -
>>> -	map.m_lblk = F2FS_BLK_ALIGN(iocb->ki_pos);
>>> -	map.m_len = F2FS_BYTES_TO_BLK(iocb->ki_pos + iov_iter_count(from));
>>> -	if (map.m_len > map.m_lblk)
>>> -		map.m_len -= map.m_lblk;
>>> -	else
>>> -		map.m_len = 0;
>>> -
>>> -	map.m_next_pgofs = NULL;
>>> -
>>> -	if (iocb->ki_flags & IOCB_DIRECT) {
>>> -		err = f2fs_convert_inline_inode(inode);
>>> -		if (err)
>>> -			return err;
>>> -		return f2fs_map_blocks(inode, &map, 1,
>>> -			__force_buffered_io(inode, WRITE) ?
>>> -				F2FS_GET_BLOCK_PRE_AIO :
>>> -				F2FS_GET_BLOCK_PRE_DIO);
>>> -	}
>>> -	if (iocb->ki_pos + iov_iter_count(from) > MAX_INLINE_DATA(inode)) {
>>> -		err = f2fs_convert_inline_inode(inode);
>>> -		if (err)
>>> -			return err;
>>> -	}
>>> -	if (!f2fs_has_inline_data(inode))
>>> -		return f2fs_map_blocks(inode, &map, 1, F2FS_GET_BLOCK_PRE_AIO);
>>> -	return err;
>>> -}
>>>  
>>>  static inline void __do_map_lock(struct f2fs_sb_info *sbi, int flag, bool lock)
>>>  {
>>> @@ -888,8 +852,8 @@ static inline void __do_map_lock(struct f2fs_sb_info *sbi, int flag, bool lock)
>>>   *     b. do not use extent cache for better performance
>>>   *     c. give the block addresses to blockdev
>>>   */
>>> -int f2fs_map_blocks(struct inode *inode, struct f2fs_map_blocks *map,
>>> -						int create, int flag)
>>> +static int __f2fs_map_blocks(struct inode *inode, struct f2fs_map_blocks *map,
>>> +				int create, int flag, int seg_type)
>>>  {
>>>  	unsigned int maxblocks = map->m_len;
>>>  	struct dnode_of_data dn;
>>> @@ -957,7 +921,12 @@ int f2fs_map_blocks(struct inode *inode, struct f2fs_map_blocks *map,
>>>  					last_ofs_in_node = dn.ofs_in_node;
>>>  				}
>>>  			} else {
>>> -				err = __allocate_data_block(&dn);
>>> +				/* if this inode is marked with FI_NO_PREALLOC,
>>> +				 * @seg_type is NO_CHECK_TYPE
>>> +				 */
>>> +				if (seg_type == NO_CHECK_TYPE)
>>> +					seg_type = CURSEG_WARM_DATA;
>>> +				err = __allocate_data_block(&dn, seg_type);
>>
>> We need to use inode.i_write_hint instead of ki_hint passed from file.f_write_hint?
>>
> 
> The following commit says to use file.f_write_hint if it is available. 
> "c75b1d9 fs: add fcntl() interface for setting/getting write life time hints"

Oh, yes, f_write_hint is recommended. So, I'm OK with this.

One left question as below.

> 
> And ki_hint is assiged to file.f_write_hint or inode.i_write_hint 
> by file_write_hint() in init_sync_kiocb().
> So, I think we need to use ki_hint instead of inode.i_write_hint.
> 
> Thanks
> 
>> Thanks,
>>
>>>  				if (!err)
>>>  					set_inode_flag(inode, FI_APPEND_WRITE);
>>>  			}
>>> @@ -1048,6 +1017,51 @@ int f2fs_map_blocks(struct inode *inode, struct f2fs_map_blocks *map,
>>>  	return err;
>>>  }
>>>  
>>> +int f2fs_map_blocks(struct inode *inode, struct f2fs_map_blocks *map,
>>> +						int create, int flag)
>>> +{
>>> +	return __f2fs_map_blocks(inode, map, create, flag, NO_CHECK_TYPE);
>>> +}
>>> +
>>> +int f2fs_preallocate_blocks(struct kiocb *iocb, struct iov_iter *from)
>>> +{
>>> +	struct inode *inode = file_inode(iocb->ki_filp);
>>> +	struct f2fs_map_blocks map;
>>> +	int err = 0;
>>> +
>>> +	if (is_inode_flag_set(inode, FI_NO_PREALLOC))
>>> +		return 0;
>>> +
>>> +	map.m_lblk = F2FS_BLK_ALIGN(iocb->ki_pos);
>>> +	map.m_len = F2FS_BYTES_TO_BLK(iocb->ki_pos + iov_iter_count(from));
>>> +	if (map.m_len > map.m_lblk)
>>> +		map.m_len -= map.m_lblk;
>>> +	else
>>> +		map.m_len = 0;
>>> +
>>> +	map.m_next_pgofs = NULL;
>>> +
>>> +	if (iocb->ki_flags & IOCB_DIRECT) {
>>> +		err = f2fs_convert_inline_inode(inode);
>>> +		if (err)
>>> +			return err;
>>> +		return __f2fs_map_blocks(inode, &map, 1,
>>> +			__force_buffered_io(inode, WRITE) ?
>>> +				F2FS_GET_BLOCK_PRE_AIO :
>>> +				F2FS_GET_BLOCK_PRE_DIO,
>>> +				rw_hint_to_seg_type(iocb->ki_hint));
>>> +	}
>>> +	if (iocb->ki_pos + iov_iter_count(from) > MAX_INLINE_DATA(inode)) {
>>> +		err = f2fs_convert_inline_inode(inode);
>>> +		if (err)
>>> +			return err;
>>> +	}
>>> +	if (!f2fs_has_inline_data(inode))
>>> +		return f2fs_map_blocks(inode, &map, 1, F2FS_GET_BLOCK_PRE_AIO);
>>> +
>>> +	return err;
>>> +}
>>> +
>>>  static int __get_data_block(struct inode *inode, sector_t iblock,
>>>  			struct buffer_head *bh, int create, int flag,
>>>  			pgoff_t *next_pgofs)
>>> @@ -2082,6 +2096,11 @@ static ssize_t f2fs_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
>>>  
>>>  	trace_f2fs_direct_IO_enter(inode, offset, count, rw);
>>>  
>>> +	/* This is for avoiding the situation that the data of a segment is
>>> +	 * passed down to devices with different hints
>>> +	 */
>>> +	iocb->ki_hint = WRITE_LIFE_NOT_SET;

Why we need to change this? I'm not sure this will be used later in somewhere,
if it's not necessary, how about keeping it as it is?

Thanks,

>>> +
>>>  	down_read(&F2FS_I(inode)->dio_rwsem[rw]);
>>>  	err = blockdev_direct_IO(iocb, inode, iter, get_data_block_dio);
>>>  	up_read(&F2FS_I(inode)->dio_rwsem[rw]);
>>> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
>>> index 4b4a72f..9be5658 100644
>>> --- a/fs/f2fs/f2fs.h
>>> +++ b/fs/f2fs/f2fs.h
>>> @@ -2562,6 +2562,7 @@ int lookup_journal_in_cursum(struct f2fs_journal *journal, int type,
>>>  void destroy_segment_manager(struct f2fs_sb_info *sbi);
>>>  int __init create_segment_manager_caches(void);
>>>  void destroy_segment_manager_caches(void);
>>> +int rw_hint_to_seg_type(enum rw_hint hint);
>>>  
>>>  /*
>>>   * checkpoint.c
>>>
>>
> 
> .
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH 0/2] apply write hints to select the type of segments
  2017-11-13  0:24       ` Hyunchul Lee
@ 2017-11-13  1:26         ` Chao Yu
  2017-11-13  1:35           ` Hyunchul Lee
  0 siblings, 1 reply; 30+ messages in thread
From: Chao Yu @ 2017-11-13  1:26 UTC (permalink / raw)
  To: Hyunchul Lee
  Cc: Jaegeuk Kim, linux-f2fs-devel, linux-kernel, kernel-team, Hyunchul Lee

On 2017/11/13 8:24, Hyunchul Lee wrote:
> On 11/10/2017 03:42 PM, Chao Yu wrote:
>> On 2017/11/10 8:23, Hyunchul Lee wrote:
>>> Hello, Chao
>>>
>>> On 11/09/2017 06:12 PM, Chao Yu wrote:
>>>> On 2017/11/9 13:51, Hyunchul Lee wrote:
>>>>> From: Hyunchul Lee <cheol.lee@lge.com>
>>>>>
>>>>> Using write hints[1], applications can inform the life time of the data
>>>>> written to devices. and this[2] reported that the write hints patch
>>>>> decreased writes in NAND by 25%.
>>>>>
>>>>> This hints help F2FS to determine the followings.
>>>>>   1) the segment types where the data will be written.
>>>>>   2) the hints that will be passed down to devices with the data of segments.
>>>>>
>>>>> This patch set implements the first mapping from write hints to segment types
>>>>> as shown below.
>>>>>
>>>>>   hints                     segment type
>>>>>   -----                     ------------
>>>>>   WRITE_LIFE_SHORT          CURSEG_COLD_DATA
>>>>>   WRITE_LIFE_EXTREME        CURSEG_HOT_DATA
>>>>>   others                    CURSEG_WARM_DATA
>>>>>
>>>>> The F2FS poliy for hot/cold seperation has precedence over this hints, And
>>>>> hints are not applied in in-place update.
>>>>
>>>> Could we change to disable IPU if file/inode write hint is existing?
>>>>
>>>
>>> I am afraid that this makes side effects. for example, this could cause
>>> out-of-place updates even when there are not enough free segments. 
>>> I can write the patch that handles these situations. But I wonder 
>>> that this is required, and I am not sure which IPU polices can be disabled.
>>
>> Oh, As I replied in another thread, I think IPU just affects filesystem
>> hot/cold separating, rather than this feature. So I think it will be okay
>> to not consider it.
>>
>>>
>>>>>
>>>>> Before the second mapping is implemented, write hints are not passed down
>>>>> to devices. Because it is better that the data of a segment have the same 
>>>>> hint.
>>>>>
>>>>> [1]: c75b1d9421f80f4143e389d2d50ddfc8a28c8c35
>>>>> [2]: https://lwn.net/Articles/726477/
>>>>
>>>> Could you write a patch to support passing write hint to block layer for
>>>> buffered writes as below commit:
>>>> 0127251c45ae ("ext4: add support for passing in write hints for buffered writes")
>>>>
>>>
>>> Sure I will. I wrote it already ;)
>>
>> Cool, ;)
>>
>>> I think that datas from the same segment should be passed down with the same
>>> hint, and the following mapping is reasonable. I wonder what is your opinion
>>> about it.
>>>
>>>   segment type               hints
>>>   ------------               -----
>>>   CURSEG_COLD_DATA           WRITE_LIFE_EXTREME
>>>   CURSEG_HOT_DATA            WRITE_LIFE_SHORT
>>>   CURSEG_COLD_NODE           WRITE_LIFE_NORMAL
>>
>> We have WRITE_LIFE_LONG defined rather than WRITE_LIFE_NORMAL in fs.h?
>>
>>>   CURSEG_HOT_NODE            WRITE_LIFE_MEDIUM
>>
>> As I know, in scenario of cell phone, data of meta_inode is hottest, then hot
>> data, warm node, and cold node should be coldest. So I suggested we can define
>> as below:
>>
>> META_DATA			WRITE_LIFE_SHORT
>> HOT_DATA & WARM_NODE		WRITE_LIFE_MEDIUM
>> HOT_NODE & WARM_DATA		WRITE_LIFE_LONG
>> COLD_NODE & COLD_DATA		WRITE_LIFE_EXTREME
>>
> 
> I agree, But I am not sure that assigning the same hint to a node and data
> segment is good. Because NVMe is likely to write them in the same erase 
> block if they have the same hint.

If we do not give the hint, they can still be written to the same erase block,
right? it will not be worse?

Thanks,

> 
> Thanks.
> 
>> Thanks,
>>
>>>   others                     WRITE_LIFE_NONE
>>>  
>>>> Thanks,
>>>>
>>>>>
>>>>> Hyunchul Lee (2):
>>>>>   f2fs: apply write hints to select the type of segments for buffered
>>>>>     write
>>>>>   f2fs: apply write hints to select the type of segment for direct write
>>>>>
>>>>>  fs/f2fs/data.c    | 101 ++++++++++++++++++++++++++++++++----------------------
>>>>>  fs/f2fs/f2fs.h    |   1 +
>>>>>  fs/f2fs/segment.c |  14 +++++++-
>>>>>  3 files changed, 74 insertions(+), 42 deletions(-)
>>>>>
>>>>
>>>>
>>>
>>> Thanks
>>>
>>> .
>>>
>>
>>
> 
> .
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH 0/2] apply write hints to select the type of segments
  2017-11-13  1:26         ` Chao Yu
@ 2017-11-13  1:35           ` Hyunchul Lee
  2017-11-13  1:59             ` Chao Yu
  0 siblings, 1 reply; 30+ messages in thread
From: Hyunchul Lee @ 2017-11-13  1:35 UTC (permalink / raw)
  To: Chao Yu
  Cc: Jaegeuk Kim, linux-f2fs-devel, linux-kernel, kernel-team, Hyunchul Lee

On 11/13/2017 10:26 AM, Chao Yu wrote:
> On 2017/11/13 8:24, Hyunchul Lee wrote:
>> On 11/10/2017 03:42 PM, Chao Yu wrote:
>>> On 2017/11/10 8:23, Hyunchul Lee wrote:
>>>> Hello, Chao
>>>>
>>>> On 11/09/2017 06:12 PM, Chao Yu wrote:
>>>>> On 2017/11/9 13:51, Hyunchul Lee wrote:
>>>>>> From: Hyunchul Lee <cheol.lee@lge.com>
>>>>>>
>>>>>> Using write hints[1], applications can inform the life time of the data
>>>>>> written to devices. and this[2] reported that the write hints patch
>>>>>> decreased writes in NAND by 25%.
>>>>>>
>>>>>> This hints help F2FS to determine the followings.
>>>>>>   1) the segment types where the data will be written.
>>>>>>   2) the hints that will be passed down to devices with the data of segments.
>>>>>>
>>>>>> This patch set implements the first mapping from write hints to segment types
>>>>>> as shown below.
>>>>>>
>>>>>>   hints                     segment type
>>>>>>   -----                     ------------
>>>>>>   WRITE_LIFE_SHORT          CURSEG_COLD_DATA
>>>>>>   WRITE_LIFE_EXTREME        CURSEG_HOT_DATA
>>>>>>   others                    CURSEG_WARM_DATA
>>>>>>
>>>>>> The F2FS poliy for hot/cold seperation has precedence over this hints, And
>>>>>> hints are not applied in in-place update.
>>>>>
>>>>> Could we change to disable IPU if file/inode write hint is existing?
>>>>>
>>>>
>>>> I am afraid that this makes side effects. for example, this could cause
>>>> out-of-place updates even when there are not enough free segments. 
>>>> I can write the patch that handles these situations. But I wonder 
>>>> that this is required, and I am not sure which IPU polices can be disabled.
>>>
>>> Oh, As I replied in another thread, I think IPU just affects filesystem
>>> hot/cold separating, rather than this feature. So I think it will be okay
>>> to not consider it.
>>>
>>>>
>>>>>>
>>>>>> Before the second mapping is implemented, write hints are not passed down
>>>>>> to devices. Because it is better that the data of a segment have the same 
>>>>>> hint.
>>>>>>
>>>>>> [1]: c75b1d9421f80f4143e389d2d50ddfc8a28c8c35
>>>>>> [2]: https://lwn.net/Articles/726477/
>>>>>
>>>>> Could you write a patch to support passing write hint to block layer for
>>>>> buffered writes as below commit:
>>>>> 0127251c45ae ("ext4: add support for passing in write hints for buffered writes")
>>>>>
>>>>
>>>> Sure I will. I wrote it already ;)
>>>
>>> Cool, ;)
>>>
>>>> I think that datas from the same segment should be passed down with the same
>>>> hint, and the following mapping is reasonable. I wonder what is your opinion
>>>> about it.
>>>>
>>>>   segment type               hints
>>>>   ------------               -----
>>>>   CURSEG_COLD_DATA           WRITE_LIFE_EXTREME
>>>>   CURSEG_HOT_DATA            WRITE_LIFE_SHORT
>>>>   CURSEG_COLD_NODE           WRITE_LIFE_NORMAL
>>>
>>> We have WRITE_LIFE_LONG defined rather than WRITE_LIFE_NORMAL in fs.h?
>>>
>>>>   CURSEG_HOT_NODE            WRITE_LIFE_MEDIUM
>>>
>>> As I know, in scenario of cell phone, data of meta_inode is hottest, then hot
>>> data, warm node, and cold node should be coldest. So I suggested we can define
>>> as below:
>>>
>>> META_DATA			WRITE_LIFE_SHORT
>>> HOT_DATA & WARM_NODE		WRITE_LIFE_MEDIUM
>>> HOT_NODE & WARM_DATA		WRITE_LIFE_LONG
>>> COLD_NODE & COLD_DATA		WRITE_LIFE_EXTREME
>>>
>>
>> I agree, But I am not sure that assigning the same hint to a node and data
>> segment is good. Because NVMe is likely to write them in the same erase 
>> block if they have the same hint.
> 
> If we do not give the hint, they can still be written to the same erase block,
> right? it will not be worse?
> 

If the hint is not given, I think that they could be written to 
the same erase block, or not. But if we give the same hint, they are written
to the same block.
I am not sure ;)

> Thanks,
> 
>>
>> Thanks.
>>
>>> Thanks,
>>>
>>>>   others                     WRITE_LIFE_NONE
>>>>  
>>>>> Thanks,
>>>>>
>>>>>>
>>>>>> Hyunchul Lee (2):
>>>>>>   f2fs: apply write hints to select the type of segments for buffered
>>>>>>     write
>>>>>>   f2fs: apply write hints to select the type of segment for direct write
>>>>>>
>>>>>>  fs/f2fs/data.c    | 101 ++++++++++++++++++++++++++++++++----------------------
>>>>>>  fs/f2fs/f2fs.h    |   1 +
>>>>>>  fs/f2fs/segment.c |  14 +++++++-
>>>>>>  3 files changed, 74 insertions(+), 42 deletions(-)
>>>>>>
>>>>>
>>>>>
>>>>
>>>> Thanks
>>>>
>>>> .
>>>>
>>>
>>>
>>
>> .
>>
> 
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [f2fs-dev] [RFC PATHC 2/2] f2fs: apply write hints to select the type of segment for direct write
  2017-11-13  1:24       ` Chao Yu
@ 2017-11-13  1:48         ` Hyunchul Lee
  0 siblings, 0 replies; 30+ messages in thread
From: Hyunchul Lee @ 2017-11-13  1:48 UTC (permalink / raw)
  To: Chao Yu, Chao Yu, Jaegeuk Kim
  Cc: kernel-team, Hyunchul Lee, linux-kernel, linux-f2fs-devel

On 11/13/2017 10:24 AM, Chao Yu wrote:
> On 2017/11/13 8:07, Hyunchul Lee wrote:
>> On 11/11/2017 09:38 AM, Chao Yu wrote:
>>> On 2017/11/9 13:51, Hyunchul Lee wrote:
>>>> From: Hyunchul Lee <cheol.lee@lge.com>
>>>>
>>>> Select the type of the segment using write hints, when blocks are
>>>> allocated for direct write.
>>>>
>>>> There are unhandled corner cases. Hints are not applied in
>>>> in-place update.  And if the blocks of a file is not pre-allocated
>>>> because of the invalid user buffer, CURSEG_WARM_DATA segment will
>>>> be selected.
>>>>
>>>> Signed-off-by: Hyunchul Lee <cheol.lee@lge.com>
>>>> ---
>>>>  fs/f2fs/data.c | 101 ++++++++++++++++++++++++++++++++++-----------------------
>>>>  fs/f2fs/f2fs.h |   1 +
>>>>  2 files changed, 61 insertions(+), 41 deletions(-)
>>>>
>>>> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
>>>> index 36b5352..d06048a 100644
>>>> --- a/fs/f2fs/data.c
>>>> +++ b/fs/f2fs/data.c
>>>> @@ -783,7 +783,7 @@ struct page *get_new_data_page(struct inode *inode,
>>>>  	return page;
>>>>  }
>>>>  
>>>> -static int __allocate_data_block(struct dnode_of_data *dn)
>>>> +static int __allocate_data_block(struct dnode_of_data *dn, int seg_type)
>>>>  {
>>>>  	struct f2fs_sb_info *sbi = F2FS_I_SB(dn->inode);
>>>>  	struct f2fs_summary sum;
>>>> @@ -808,7 +808,7 @@ static int __allocate_data_block(struct dnode_of_data *dn)
>>>>  	set_summary(&sum, dn->nid, dn->ofs_in_node, ni.version);
>>>>  
>>>>  	allocate_data_block(sbi, NULL, dn->data_blkaddr, &dn->data_blkaddr,
>>>> -					&sum, CURSEG_WARM_DATA, NULL, false);
>>>> +					&sum, seg_type, NULL, false);
>>>>  	set_data_blkaddr(dn);
>>>>  
>>>>  	/* update i_size */
>>>> @@ -827,42 +827,6 @@ static inline bool __force_buffered_io(struct inode *inode, int rw)
>>>>  			F2FS_I_SB(inode)->s_ndevs);
>>>>  }
>>>>  
>>>> -int f2fs_preallocate_blocks(struct kiocb *iocb, struct iov_iter *from)
>>>> -{
>>>> -	struct inode *inode = file_inode(iocb->ki_filp);
>>>> -	struct f2fs_map_blocks map;
>>>> -	int err = 0;
>>>> -
>>>> -	if (is_inode_flag_set(inode, FI_NO_PREALLOC))
>>>> -		return 0;
>>>> -
>>>> -	map.m_lblk = F2FS_BLK_ALIGN(iocb->ki_pos);
>>>> -	map.m_len = F2FS_BYTES_TO_BLK(iocb->ki_pos + iov_iter_count(from));
>>>> -	if (map.m_len > map.m_lblk)
>>>> -		map.m_len -= map.m_lblk;
>>>> -	else
>>>> -		map.m_len = 0;
>>>> -
>>>> -	map.m_next_pgofs = NULL;
>>>> -
>>>> -	if (iocb->ki_flags & IOCB_DIRECT) {
>>>> -		err = f2fs_convert_inline_inode(inode);
>>>> -		if (err)
>>>> -			return err;
>>>> -		return f2fs_map_blocks(inode, &map, 1,
>>>> -			__force_buffered_io(inode, WRITE) ?
>>>> -				F2FS_GET_BLOCK_PRE_AIO :
>>>> -				F2FS_GET_BLOCK_PRE_DIO);
>>>> -	}
>>>> -	if (iocb->ki_pos + iov_iter_count(from) > MAX_INLINE_DATA(inode)) {
>>>> -		err = f2fs_convert_inline_inode(inode);
>>>> -		if (err)
>>>> -			return err;
>>>> -	}
>>>> -	if (!f2fs_has_inline_data(inode))
>>>> -		return f2fs_map_blocks(inode, &map, 1, F2FS_GET_BLOCK_PRE_AIO);
>>>> -	return err;
>>>> -}
>>>>  
>>>>  static inline void __do_map_lock(struct f2fs_sb_info *sbi, int flag, bool lock)
>>>>  {
>>>> @@ -888,8 +852,8 @@ static inline void __do_map_lock(struct f2fs_sb_info *sbi, int flag, bool lock)
>>>>   *     b. do not use extent cache for better performance
>>>>   *     c. give the block addresses to blockdev
>>>>   */
>>>> -int f2fs_map_blocks(struct inode *inode, struct f2fs_map_blocks *map,
>>>> -						int create, int flag)
>>>> +static int __f2fs_map_blocks(struct inode *inode, struct f2fs_map_blocks *map,
>>>> +				int create, int flag, int seg_type)
>>>>  {
>>>>  	unsigned int maxblocks = map->m_len;
>>>>  	struct dnode_of_data dn;
>>>> @@ -957,7 +921,12 @@ int f2fs_map_blocks(struct inode *inode, struct f2fs_map_blocks *map,
>>>>  					last_ofs_in_node = dn.ofs_in_node;
>>>>  				}
>>>>  			} else {
>>>> -				err = __allocate_data_block(&dn);
>>>> +				/* if this inode is marked with FI_NO_PREALLOC,
>>>> +				 * @seg_type is NO_CHECK_TYPE
>>>> +				 */
>>>> +				if (seg_type == NO_CHECK_TYPE)
>>>> +					seg_type = CURSEG_WARM_DATA;
>>>> +				err = __allocate_data_block(&dn, seg_type);
>>>
>>> We need to use inode.i_write_hint instead of ki_hint passed from file.f_write_hint?
>>>
>>
>> The following commit says to use file.f_write_hint if it is available. 
>> "c75b1d9 fs: add fcntl() interface for setting/getting write life time hints"
> 
> Oh, yes, f_write_hint is recommended. So, I'm OK with this.
> 
> One left question as below.
> 
>>
>> And ki_hint is assiged to file.f_write_hint or inode.i_write_hint 
>> by file_write_hint() in init_sync_kiocb().
>> So, I think we need to use ki_hint instead of inode.i_write_hint.
>>
>> Thanks
>>
>>> Thanks,
>>>
>>>>  				if (!err)
>>>>  					set_inode_flag(inode, FI_APPEND_WRITE);
>>>>  			}
>>>> @@ -1048,6 +1017,51 @@ int f2fs_map_blocks(struct inode *inode, struct f2fs_map_blocks *map,
>>>>  	return err;
>>>>  }
>>>>  
>>>> +int f2fs_map_blocks(struct inode *inode, struct f2fs_map_blocks *map,
>>>> +						int create, int flag)
>>>> +{
>>>> +	return __f2fs_map_blocks(inode, map, create, flag, NO_CHECK_TYPE);
>>>> +}
>>>> +
>>>> +int f2fs_preallocate_blocks(struct kiocb *iocb, struct iov_iter *from)
>>>> +{
>>>> +	struct inode *inode = file_inode(iocb->ki_filp);
>>>> +	struct f2fs_map_blocks map;
>>>> +	int err = 0;
>>>> +
>>>> +	if (is_inode_flag_set(inode, FI_NO_PREALLOC))
>>>> +		return 0;
>>>> +
>>>> +	map.m_lblk = F2FS_BLK_ALIGN(iocb->ki_pos);
>>>> +	map.m_len = F2FS_BYTES_TO_BLK(iocb->ki_pos + iov_iter_count(from));
>>>> +	if (map.m_len > map.m_lblk)
>>>> +		map.m_len -= map.m_lblk;
>>>> +	else
>>>> +		map.m_len = 0;
>>>> +
>>>> +	map.m_next_pgofs = NULL;
>>>> +
>>>> +	if (iocb->ki_flags & IOCB_DIRECT) {
>>>> +		err = f2fs_convert_inline_inode(inode);
>>>> +		if (err)
>>>> +			return err;
>>>> +		return __f2fs_map_blocks(inode, &map, 1,
>>>> +			__force_buffered_io(inode, WRITE) ?
>>>> +				F2FS_GET_BLOCK_PRE_AIO :
>>>> +				F2FS_GET_BLOCK_PRE_DIO,
>>>> +				rw_hint_to_seg_type(iocb->ki_hint));
>>>> +	}
>>>> +	if (iocb->ki_pos + iov_iter_count(from) > MAX_INLINE_DATA(inode)) {
>>>> +		err = f2fs_convert_inline_inode(inode);
>>>> +		if (err)
>>>> +			return err;
>>>> +	}
>>>> +	if (!f2fs_has_inline_data(inode))
>>>> +		return f2fs_map_blocks(inode, &map, 1, F2FS_GET_BLOCK_PRE_AIO);
>>>> +
>>>> +	return err;
>>>> +}
>>>> +
>>>>  static int __get_data_block(struct inode *inode, sector_t iblock,
>>>>  			struct buffer_head *bh, int create, int flag,
>>>>  			pgoff_t *next_pgofs)
>>>> @@ -2082,6 +2096,11 @@ static ssize_t f2fs_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
>>>>  
>>>>  	trace_f2fs_direct_IO_enter(inode, offset, count, rw);
>>>>  
>>>> +	/* This is for avoiding the situation that the data of a segment is
>>>> +	 * passed down to devices with different hints
>>>> +	 */
>>>> +	iocb->ki_hint = WRITE_LIFE_NOT_SET;
> 
> Why we need to change this? I'm not sure this will be used later in somewhere,
> if it's not necessary, how about keeping it as it is?
> 

blkdev_direct_IO will pass down this hint to block layer. But we need to control
the hint before this.
So I think that after we implement this control, passing down the hint is better. 

iocb->ki_hint needs to be re-assigned to the orignal hint after the write.
I will handle this.

Thanks

> Thanks,
> 
>>>> +
>>>>  	down_read(&F2FS_I(inode)->dio_rwsem[rw]);
>>>>  	err = blockdev_direct_IO(iocb, inode, iter, get_data_block_dio);
>>>>  	up_read(&F2FS_I(inode)->dio_rwsem[rw]);
>>>> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
>>>> index 4b4a72f..9be5658 100644
>>>> --- a/fs/f2fs/f2fs.h
>>>> +++ b/fs/f2fs/f2fs.h
>>>> @@ -2562,6 +2562,7 @@ int lookup_journal_in_cursum(struct f2fs_journal *journal, int type,
>>>>  void destroy_segment_manager(struct f2fs_sb_info *sbi);
>>>>  int __init create_segment_manager_caches(void);
>>>>  void destroy_segment_manager_caches(void);
>>>> +int rw_hint_to_seg_type(enum rw_hint hint);
>>>>  
>>>>  /*
>>>>   * checkpoint.c
>>>>
>>>
>>
>> .
>>
> 
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH 0/2] apply write hints to select the type of segments
  2017-11-13  1:35           ` Hyunchul Lee
@ 2017-11-13  1:59             ` Chao Yu
  2017-11-13  2:25               ` Hyunchul Lee
  0 siblings, 1 reply; 30+ messages in thread
From: Chao Yu @ 2017-11-13  1:59 UTC (permalink / raw)
  To: Hyunchul Lee
  Cc: Jaegeuk Kim, linux-f2fs-devel, linux-kernel, kernel-team, Hyunchul Lee

On 2017/11/13 9:35, Hyunchul Lee wrote:
> On 11/13/2017 10:26 AM, Chao Yu wrote:
>> On 2017/11/13 8:24, Hyunchul Lee wrote:
>>> On 11/10/2017 03:42 PM, Chao Yu wrote:
>>>> On 2017/11/10 8:23, Hyunchul Lee wrote:
>>>>> Hello, Chao
>>>>>
>>>>> On 11/09/2017 06:12 PM, Chao Yu wrote:
>>>>>> On 2017/11/9 13:51, Hyunchul Lee wrote:
>>>>>>> From: Hyunchul Lee <cheol.lee@lge.com>
>>>>>>>
>>>>>>> Using write hints[1], applications can inform the life time of the data
>>>>>>> written to devices. and this[2] reported that the write hints patch
>>>>>>> decreased writes in NAND by 25%.
>>>>>>>
>>>>>>> This hints help F2FS to determine the followings.
>>>>>>>   1) the segment types where the data will be written.
>>>>>>>   2) the hints that will be passed down to devices with the data of segments.
>>>>>>>
>>>>>>> This patch set implements the first mapping from write hints to segment types
>>>>>>> as shown below.
>>>>>>>
>>>>>>>   hints                     segment type
>>>>>>>   -----                     ------------
>>>>>>>   WRITE_LIFE_SHORT          CURSEG_COLD_DATA
>>>>>>>   WRITE_LIFE_EXTREME        CURSEG_HOT_DATA
>>>>>>>   others                    CURSEG_WARM_DATA
>>>>>>>
>>>>>>> The F2FS poliy for hot/cold seperation has precedence over this hints, And
>>>>>>> hints are not applied in in-place update.
>>>>>>
>>>>>> Could we change to disable IPU if file/inode write hint is existing?
>>>>>>
>>>>>
>>>>> I am afraid that this makes side effects. for example, this could cause
>>>>> out-of-place updates even when there are not enough free segments. 
>>>>> I can write the patch that handles these situations. But I wonder 
>>>>> that this is required, and I am not sure which IPU polices can be disabled.
>>>>
>>>> Oh, As I replied in another thread, I think IPU just affects filesystem
>>>> hot/cold separating, rather than this feature. So I think it will be okay
>>>> to not consider it.
>>>>
>>>>>
>>>>>>>
>>>>>>> Before the second mapping is implemented, write hints are not passed down
>>>>>>> to devices. Because it is better that the data of a segment have the same 
>>>>>>> hint.
>>>>>>>
>>>>>>> [1]: c75b1d9421f80f4143e389d2d50ddfc8a28c8c35
>>>>>>> [2]: https://lwn.net/Articles/726477/
>>>>>>
>>>>>> Could you write a patch to support passing write hint to block layer for
>>>>>> buffered writes as below commit:
>>>>>> 0127251c45ae ("ext4: add support for passing in write hints for buffered writes")
>>>>>>
>>>>>
>>>>> Sure I will. I wrote it already ;)
>>>>
>>>> Cool, ;)
>>>>
>>>>> I think that datas from the same segment should be passed down with the same
>>>>> hint, and the following mapping is reasonable. I wonder what is your opinion
>>>>> about it.
>>>>>
>>>>>   segment type               hints
>>>>>   ------------               -----
>>>>>   CURSEG_COLD_DATA           WRITE_LIFE_EXTREME
>>>>>   CURSEG_HOT_DATA            WRITE_LIFE_SHORT
>>>>>   CURSEG_COLD_NODE           WRITE_LIFE_NORMAL
>>>>
>>>> We have WRITE_LIFE_LONG defined rather than WRITE_LIFE_NORMAL in fs.h?
>>>>
>>>>>   CURSEG_HOT_NODE            WRITE_LIFE_MEDIUM
>>>>
>>>> As I know, in scenario of cell phone, data of meta_inode is hottest, then hot
>>>> data, warm node, and cold node should be coldest. So I suggested we can define
>>>> as below:
>>>>
>>>> META_DATA			WRITE_LIFE_SHORT
>>>> HOT_DATA & WARM_NODE		WRITE_LIFE_MEDIUM
>>>> HOT_NODE & WARM_DATA		WRITE_LIFE_LONG
>>>> COLD_NODE & COLD_DATA		WRITE_LIFE_EXTREME
>>>>
>>>
>>> I agree, But I am not sure that assigning the same hint to a node and data
>>> segment is good. Because NVMe is likely to write them in the same erase 
>>> block if they have the same hint.
>>
>> If we do not give the hint, they can still be written to the same erase block,

I mean it's possible to write them to the same erase block. :)

>> right? it will not be worse?
>>
> 
> If the hint is not given, I think that they could be written to 
> the same erase block, or not. But if we give the same hint, they are written
> to the same block.

IMO, Only if underlying device can support more hint type or opened channels,
and actual temperature of data segment and node segment is quite different, we
can separate them.

Thanks,

> I am not sure ;)
> 
>> Thanks,
>>
>>>
>>> Thanks.
>>>
>>>> Thanks,
>>>>
>>>>>   others                     WRITE_LIFE_NONE
>>>>>  
>>>>>> Thanks,
>>>>>>
>>>>>>>
>>>>>>> Hyunchul Lee (2):
>>>>>>>   f2fs: apply write hints to select the type of segments for buffered
>>>>>>>     write
>>>>>>>   f2fs: apply write hints to select the type of segment for direct write
>>>>>>>
>>>>>>>  fs/f2fs/data.c    | 101 ++++++++++++++++++++++++++++++++----------------------
>>>>>>>  fs/f2fs/f2fs.h    |   1 +
>>>>>>>  fs/f2fs/segment.c |  14 +++++++-
>>>>>>>  3 files changed, 74 insertions(+), 42 deletions(-)
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>> Thanks
>>>>>
>>>>> .
>>>>>
>>>>
>>>>
>>>
>>> .
>>>
>>
>>
> 
> .
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH 0/2] apply write hints to select the type of segments
  2017-11-13  1:59             ` Chao Yu
@ 2017-11-13  2:25               ` Hyunchul Lee
  2017-11-14  4:20                 ` Jaegeuk Kim
  0 siblings, 1 reply; 30+ messages in thread
From: Hyunchul Lee @ 2017-11-13  2:25 UTC (permalink / raw)
  To: Chao Yu, Jaegeuk Kim
  Cc: linux-f2fs-devel, linux-kernel, kernel-team, Hyunchul Lee

On 11/13/2017 10:59 AM, Chao Yu wrote:
> On 2017/11/13 9:35, Hyunchul Lee wrote:
>> On 11/13/2017 10:26 AM, Chao Yu wrote:
>>> On 2017/11/13 8:24, Hyunchul Lee wrote:
>>>> On 11/10/2017 03:42 PM, Chao Yu wrote:
>>>>> On 2017/11/10 8:23, Hyunchul Lee wrote:
>>>>>> Hello, Chao
>>>>>>
>>>>>> On 11/09/2017 06:12 PM, Chao Yu wrote:
>>>>>>> On 2017/11/9 13:51, Hyunchul Lee wrote:
>>>>>>>> From: Hyunchul Lee <cheol.lee@lge.com>
>>>>>>>>
>>>>>>>> Using write hints[1], applications can inform the life time of the data
>>>>>>>> written to devices. and this[2] reported that the write hints patch
>>>>>>>> decreased writes in NAND by 25%.
>>>>>>>>
>>>>>>>> This hints help F2FS to determine the followings.
>>>>>>>>   1) the segment types where the data will be written.
>>>>>>>>   2) the hints that will be passed down to devices with the data of segments.
>>>>>>>>
>>>>>>>> This patch set implements the first mapping from write hints to segment types
>>>>>>>> as shown below.
>>>>>>>>
>>>>>>>>   hints                     segment type
>>>>>>>>   -----                     ------------
>>>>>>>>   WRITE_LIFE_SHORT          CURSEG_COLD_DATA
>>>>>>>>   WRITE_LIFE_EXTREME        CURSEG_HOT_DATA
>>>>>>>>   others                    CURSEG_WARM_DATA
>>>>>>>>
>>>>>>>> The F2FS poliy for hot/cold seperation has precedence over this hints, And
>>>>>>>> hints are not applied in in-place update.
>>>>>>>
>>>>>>> Could we change to disable IPU if file/inode write hint is existing?
>>>>>>>
>>>>>>
>>>>>> I am afraid that this makes side effects. for example, this could cause
>>>>>> out-of-place updates even when there are not enough free segments. 
>>>>>> I can write the patch that handles these situations. But I wonder 
>>>>>> that this is required, and I am not sure which IPU polices can be disabled.
>>>>>
>>>>> Oh, As I replied in another thread, I think IPU just affects filesystem
>>>>> hot/cold separating, rather than this feature. So I think it will be okay
>>>>> to not consider it.
>>>>>
>>>>>>
>>>>>>>>
>>>>>>>> Before the second mapping is implemented, write hints are not passed down
>>>>>>>> to devices. Because it is better that the data of a segment have the same 
>>>>>>>> hint.
>>>>>>>>
>>>>>>>> [1]: c75b1d9421f80f4143e389d2d50ddfc8a28c8c35
>>>>>>>> [2]: https://lwn.net/Articles/726477/
>>>>>>>
>>>>>>> Could you write a patch to support passing write hint to block layer for
>>>>>>> buffered writes as below commit:
>>>>>>> 0127251c45ae ("ext4: add support for passing in write hints for buffered writes")
>>>>>>>
>>>>>>
>>>>>> Sure I will. I wrote it already ;)
>>>>>
>>>>> Cool, ;)
>>>>>
>>>>>> I think that datas from the same segment should be passed down with the same
>>>>>> hint, and the following mapping is reasonable. I wonder what is your opinion
>>>>>> about it.
>>>>>>
>>>>>>   segment type               hints
>>>>>>   ------------               -----
>>>>>>   CURSEG_COLD_DATA           WRITE_LIFE_EXTREME
>>>>>>   CURSEG_HOT_DATA            WRITE_LIFE_SHORT
>>>>>>   CURSEG_COLD_NODE           WRITE_LIFE_NORMAL
>>>>>
>>>>> We have WRITE_LIFE_LONG defined rather than WRITE_LIFE_NORMAL in fs.h?
>>>>>
>>>>>>   CURSEG_HOT_NODE            WRITE_LIFE_MEDIUM
>>>>>
>>>>> As I know, in scenario of cell phone, data of meta_inode is hottest, then hot
>>>>> data, warm node, and cold node should be coldest. So I suggested we can define
>>>>> as below:
>>>>>
>>>>> META_DATA			WRITE_LIFE_SHORT
>>>>> HOT_DATA & WARM_NODE		WRITE_LIFE_MEDIUM
>>>>> HOT_NODE & WARM_DATA		WRITE_LIFE_LONG
>>>>> COLD_NODE & COLD_DATA		WRITE_LIFE_EXTREME
>>>>>
>>>>
>>>> I agree, But I am not sure that assigning the same hint to a node and data
>>>> segment is good. Because NVMe is likely to write them in the same erase 
>>>> block if they have the same hint.
>>>
>>> If we do not give the hint, they can still be written to the same erase block,
> 
> I mean it's possible to write them to the same erase block. :)
> 
>>> right? it will not be worse?
>>>
>>
>> If the hint is not given, I think that they could be written to 
>> the same erase block, or not. But if we give the same hint, they are written
>> to the same block.
> 
> IMO, Only if underlying device can support more hint type or opened channels,
> and actual temperature of data segment and node segment is quite different, we
> can separate them.
> 

Okay, If Jaegeuk Kim agrees with this, I will submit the patch that 
implements your proposed mapping.

Thank you for comments ;)

> Thanks,
> 
>> I am not sure ;)
>>
>>> Thanks,
>>>
>>>>
>>>> Thanks.
>>>>
>>>>> Thanks,
>>>>>
>>>>>>   others                     WRITE_LIFE_NONE
>>>>>>  
>>>>>>> Thanks,
>>>>>>>
>>>>>>>>
>>>>>>>> Hyunchul Lee (2):
>>>>>>>>   f2fs: apply write hints to select the type of segments for buffered
>>>>>>>>     write
>>>>>>>>   f2fs: apply write hints to select the type of segment for direct write
>>>>>>>>
>>>>>>>>  fs/f2fs/data.c    | 101 ++++++++++++++++++++++++++++++++----------------------
>>>>>>>>  fs/f2fs/f2fs.h    |   1 +
>>>>>>>>  fs/f2fs/segment.c |  14 +++++++-
>>>>>>>>  3 files changed, 74 insertions(+), 42 deletions(-)
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>> .
>>>>>>
>>>>>
>>>>>
>>>>
>>>> .
>>>>
>>>
>>>
>>
>> .
>>
> 
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH 0/2] apply write hints to select the type of segments
  2017-11-13  2:25               ` Hyunchul Lee
@ 2017-11-14  4:20                 ` Jaegeuk Kim
  2017-11-14  6:22                   ` Chao Yu
  0 siblings, 1 reply; 30+ messages in thread
From: Jaegeuk Kim @ 2017-11-14  4:20 UTC (permalink / raw)
  To: Hyunchul Lee
  Cc: Chao Yu, linux-f2fs-devel, linux-kernel, kernel-team, Hyunchul Lee

On 11/13, Hyunchul Lee wrote:
> On 11/13/2017 10:59 AM, Chao Yu wrote:
> > On 2017/11/13 9:35, Hyunchul Lee wrote:
> >> On 11/13/2017 10:26 AM, Chao Yu wrote:
> >>> On 2017/11/13 8:24, Hyunchul Lee wrote:
> >>>> On 11/10/2017 03:42 PM, Chao Yu wrote:
> >>>>> On 2017/11/10 8:23, Hyunchul Lee wrote:
> >>>>>> Hello, Chao
> >>>>>>
> >>>>>> On 11/09/2017 06:12 PM, Chao Yu wrote:
> >>>>>>> On 2017/11/9 13:51, Hyunchul Lee wrote:
> >>>>>>>> From: Hyunchul Lee <cheol.lee@lge.com>
> >>>>>>>>
> >>>>>>>> Using write hints[1], applications can inform the life time of the data
> >>>>>>>> written to devices. and this[2] reported that the write hints patch
> >>>>>>>> decreased writes in NAND by 25%.
> >>>>>>>>
> >>>>>>>> This hints help F2FS to determine the followings.
> >>>>>>>>   1) the segment types where the data will be written.
> >>>>>>>>   2) the hints that will be passed down to devices with the data of segments.
> >>>>>>>>
> >>>>>>>> This patch set implements the first mapping from write hints to segment types
> >>>>>>>> as shown below.
> >>>>>>>>
> >>>>>>>>   hints                     segment type
> >>>>>>>>   -----                     ------------
> >>>>>>>>   WRITE_LIFE_SHORT          CURSEG_COLD_DATA
> >>>>>>>>   WRITE_LIFE_EXTREME        CURSEG_HOT_DATA
> >>>>>>>>   others                    CURSEG_WARM_DATA
> >>>>>>>>
> >>>>>>>> The F2FS poliy for hot/cold seperation has precedence over this hints, And
> >>>>>>>> hints are not applied in in-place update.
> >>>>>>>
> >>>>>>> Could we change to disable IPU if file/inode write hint is existing?
> >>>>>>>
> >>>>>>
> >>>>>> I am afraid that this makes side effects. for example, this could cause
> >>>>>> out-of-place updates even when there are not enough free segments. 
> >>>>>> I can write the patch that handles these situations. But I wonder 
> >>>>>> that this is required, and I am not sure which IPU polices can be disabled.
> >>>>>
> >>>>> Oh, As I replied in another thread, I think IPU just affects filesystem
> >>>>> hot/cold separating, rather than this feature. So I think it will be okay
> >>>>> to not consider it.
> >>>>>
> >>>>>>
> >>>>>>>>
> >>>>>>>> Before the second mapping is implemented, write hints are not passed down
> >>>>>>>> to devices. Because it is better that the data of a segment have the same 
> >>>>>>>> hint.
> >>>>>>>>
> >>>>>>>> [1]: c75b1d9421f80f4143e389d2d50ddfc8a28c8c35
> >>>>>>>> [2]: https://lwn.net/Articles/726477/
> >>>>>>>
> >>>>>>> Could you write a patch to support passing write hint to block layer for
> >>>>>>> buffered writes as below commit:
> >>>>>>> 0127251c45ae ("ext4: add support for passing in write hints for buffered writes")
> >>>>>>>
> >>>>>>
> >>>>>> Sure I will. I wrote it already ;)
> >>>>>
> >>>>> Cool, ;)
> >>>>>
> >>>>>> I think that datas from the same segment should be passed down with the same
> >>>>>> hint, and the following mapping is reasonable. I wonder what is your opinion
> >>>>>> about it.
> >>>>>>
> >>>>>>   segment type               hints
> >>>>>>   ------------               -----
> >>>>>>   CURSEG_COLD_DATA           WRITE_LIFE_EXTREME
> >>>>>>   CURSEG_HOT_DATA            WRITE_LIFE_SHORT
> >>>>>>   CURSEG_COLD_NODE           WRITE_LIFE_NORMAL
> >>>>>
> >>>>> We have WRITE_LIFE_LONG defined rather than WRITE_LIFE_NORMAL in fs.h?
> >>>>>
> >>>>>>   CURSEG_HOT_NODE            WRITE_LIFE_MEDIUM
> >>>>>
> >>>>> As I know, in scenario of cell phone, data of meta_inode is hottest, then hot
> >>>>> data, warm node, and cold node should be coldest. So I suggested we can define
> >>>>> as below:
> >>>>>
> >>>>> META_DATA			WRITE_LIFE_SHORT
> >>>>> HOT_DATA & WARM_NODE		WRITE_LIFE_MEDIUM
> >>>>> HOT_NODE & WARM_DATA		WRITE_LIFE_LONG
> >>>>> COLD_NODE & COLD_DATA		WRITE_LIFE_EXTREME
> >>>>>
> >>>>
> >>>> I agree, But I am not sure that assigning the same hint to a node and data
> >>>> segment is good. Because NVMe is likely to write them in the same erase 
> >>>> block if they have the same hint.
> >>>
> >>> If we do not give the hint, they can still be written to the same erase block,
> > 
> > I mean it's possible to write them to the same erase block. :)
> > 
> >>> right? it will not be worse?
> >>>
> >>
> >> If the hint is not given, I think that they could be written to 
> >> the same erase block, or not. But if we give the same hint, they are written
> >> to the same block.
> > 
> > IMO, Only if underlying device can support more hint type or opened channels,
> > and actual temperature of data segment and node segment is quite different, we
> > can separate them.
> > 
> 
> Okay, If Jaegeuk Kim agrees with this, I will submit the patch that 
> implements your proposed mapping.

How about this? We'd better to split data and node blocks as much as possible.

segment type                    hints
------------                    -----
COLD_NODE & COLD_DATA		WRITE_LIFE_NONE
WARM_DATA			WRITE_LIFE_EXTERME
HOT_NODE & WARM_NODE		WRITE_LIFE_LONG
HOT_DATA			WRITE_LIFE_MEDIUM
META_DATA			WRITE_LIFE_SHORT

> 
> Thank you for comments ;)
> 
> > Thanks,
> > 
> >> I am not sure ;)
> >>
> >>> Thanks,
> >>>
> >>>>
> >>>> Thanks.
> >>>>
> >>>>> Thanks,
> >>>>>
> >>>>>>   others                     WRITE_LIFE_NONE
> >>>>>>  
> >>>>>>> Thanks,
> >>>>>>>
> >>>>>>>>
> >>>>>>>> Hyunchul Lee (2):
> >>>>>>>>   f2fs: apply write hints to select the type of segments for buffered
> >>>>>>>>     write
> >>>>>>>>   f2fs: apply write hints to select the type of segment for direct write
> >>>>>>>>
> >>>>>>>>  fs/f2fs/data.c    | 101 ++++++++++++++++++++++++++++++++----------------------
> >>>>>>>>  fs/f2fs/f2fs.h    |   1 +
> >>>>>>>>  fs/f2fs/segment.c |  14 +++++++-
> >>>>>>>>  3 files changed, 74 insertions(+), 42 deletions(-)
> >>>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>> Thanks
> >>>>>>
> >>>>>> .
> >>>>>>
> >>>>>
> >>>>>
> >>>>
> >>>> .
> >>>>
> >>>
> >>>
> >>
> >> .
> >>
> > 
> > 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH 0/2] apply write hints to select the type of segments
  2017-11-14  4:20                 ` Jaegeuk Kim
@ 2017-11-14  6:22                   ` Chao Yu
  2017-11-15 16:27                     ` Jaegeuk Kim
  0 siblings, 1 reply; 30+ messages in thread
From: Chao Yu @ 2017-11-14  6:22 UTC (permalink / raw)
  To: Jaegeuk Kim, Hyunchul Lee
  Cc: linux-f2fs-devel, linux-kernel, kernel-team, Hyunchul Lee,
	Chao Yu, Chao Yu

On 2017/11/14 12:20, Jaegeuk Kim wrote:
> On 11/13, Hyunchul Lee wrote:
>> On 11/13/2017 10:59 AM, Chao Yu wrote:
>>> On 2017/11/13 9:35, Hyunchul Lee wrote:
>>>> On 11/13/2017 10:26 AM, Chao Yu wrote:
>>>>> On 2017/11/13 8:24, Hyunchul Lee wrote:
>>>>>> On 11/10/2017 03:42 PM, Chao Yu wrote:
>>>>>>> On 2017/11/10 8:23, Hyunchul Lee wrote:
>>>>>>>> Hello, Chao
>>>>>>>>
>>>>>>>> On 11/09/2017 06:12 PM, Chao Yu wrote:
>>>>>>>>> On 2017/11/9 13:51, Hyunchul Lee wrote:
>>>>>>>>>> From: Hyunchul Lee <cheol.lee@lge.com>
>>>>>>>>>>
>>>>>>>>>> Using write hints[1], applications can inform the life time of the data
>>>>>>>>>> written to devices. and this[2] reported that the write hints patch
>>>>>>>>>> decreased writes in NAND by 25%.
>>>>>>>>>>
>>>>>>>>>> This hints help F2FS to determine the followings.
>>>>>>>>>>   1) the segment types where the data will be written.
>>>>>>>>>>   2) the hints that will be passed down to devices with the data of segments.
>>>>>>>>>>
>>>>>>>>>> This patch set implements the first mapping from write hints to segment types
>>>>>>>>>> as shown below.
>>>>>>>>>>
>>>>>>>>>>   hints                     segment type
>>>>>>>>>>   -----                     ------------
>>>>>>>>>>   WRITE_LIFE_SHORT          CURSEG_COLD_DATA
>>>>>>>>>>   WRITE_LIFE_EXTREME        CURSEG_HOT_DATA
>>>>>>>>>>   others                    CURSEG_WARM_DATA
>>>>>>>>>>
>>>>>>>>>> The F2FS poliy for hot/cold seperation has precedence over this hints, And
>>>>>>>>>> hints are not applied in in-place update.
>>>>>>>>>
>>>>>>>>> Could we change to disable IPU if file/inode write hint is existing?
>>>>>>>>>
>>>>>>>>
>>>>>>>> I am afraid that this makes side effects. for example, this could cause
>>>>>>>> out-of-place updates even when there are not enough free segments. 
>>>>>>>> I can write the patch that handles these situations. But I wonder 
>>>>>>>> that this is required, and I am not sure which IPU polices can be disabled.
>>>>>>>
>>>>>>> Oh, As I replied in another thread, I think IPU just affects filesystem
>>>>>>> hot/cold separating, rather than this feature. So I think it will be okay
>>>>>>> to not consider it.
>>>>>>>
>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Before the second mapping is implemented, write hints are not passed down
>>>>>>>>>> to devices. Because it is better that the data of a segment have the same 
>>>>>>>>>> hint.
>>>>>>>>>>
>>>>>>>>>> [1]: c75b1d9421f80f4143e389d2d50ddfc8a28c8c35
>>>>>>>>>> [2]: https://lwn.net/Articles/726477/
>>>>>>>>>
>>>>>>>>> Could you write a patch to support passing write hint to block layer for
>>>>>>>>> buffered writes as below commit:
>>>>>>>>> 0127251c45ae ("ext4: add support for passing in write hints for buffered writes")
>>>>>>>>>
>>>>>>>>
>>>>>>>> Sure I will. I wrote it already ;)
>>>>>>>
>>>>>>> Cool, ;)
>>>>>>>
>>>>>>>> I think that datas from the same segment should be passed down with the same
>>>>>>>> hint, and the following mapping is reasonable. I wonder what is your opinion
>>>>>>>> about it.
>>>>>>>>
>>>>>>>>   segment type               hints
>>>>>>>>   ------------               -----
>>>>>>>>   CURSEG_COLD_DATA           WRITE_LIFE_EXTREME
>>>>>>>>   CURSEG_HOT_DATA            WRITE_LIFE_SHORT
>>>>>>>>   CURSEG_COLD_NODE           WRITE_LIFE_NORMAL
>>>>>>>
>>>>>>> We have WRITE_LIFE_LONG defined rather than WRITE_LIFE_NORMAL in fs.h?
>>>>>>>
>>>>>>>>   CURSEG_HOT_NODE            WRITE_LIFE_MEDIUM
>>>>>>>
>>>>>>> As I know, in scenario of cell phone, data of meta_inode is hottest, then hot
>>>>>>> data, warm node, and cold node should be coldest. So I suggested we can define
>>>>>>> as below:
>>>>>>>
>>>>>>> META_DATA			WRITE_LIFE_SHORT
>>>>>>> HOT_DATA & WARM_NODE		WRITE_LIFE_MEDIUM
>>>>>>> HOT_NODE & WARM_DATA		WRITE_LIFE_LONG
>>>>>>> COLD_NODE & COLD_DATA		WRITE_LIFE_EXTREME
>>>>>>>
>>>>>>
>>>>>> I agree, But I am not sure that assigning the same hint to a node and data
>>>>>> segment is good. Because NVMe is likely to write them in the same erase 
>>>>>> block if they have the same hint.
>>>>>
>>>>> If we do not give the hint, they can still be written to the same erase block,
>>>
>>> I mean it's possible to write them to the same erase block. :)
>>>
>>>>> right? it will not be worse?
>>>>>
>>>>
>>>> If the hint is not given, I think that they could be written to 
>>>> the same erase block, or not. But if we give the same hint, they are written
>>>> to the same block.
>>>
>>> IMO, Only if underlying device can support more hint type or opened channels,
>>> and actual temperature of data segment and node segment is quite different, we
>>> can separate them.
>>>
>>
>> Okay, If Jaegeuk Kim agrees with this, I will submit the patch that 
>> implements your proposed mapping.
> 
> How about this? We'd better to split data and node blocks as much as possible.
> 
> segment type                    hints
> ------------                    -----
> COLD_NODE & COLD_DATA		WRITE_LIFE_NONE

WRITE_LIFE_NONE means there is no hints about write life time.

Shouldn't we define COLD_NODE & COLD_DATA as WRITE_LIFE_EXTERME?

Thanks,

> WARM_DATA			WRITE_LIFE_EXTERME
> HOT_NODE & WARM_NODE		WRITE_LIFE_LONG
> HOT_DATA			WRITE_LIFE_MEDIUM
> META_DATA			WRITE_LIFE_SHORT
> 
>>
>> Thank you for comments ;)
>>
>>> Thanks,
>>>
>>>> I am not sure ;)
>>>>
>>>>> Thanks,
>>>>>
>>>>>>
>>>>>> Thanks.
>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>>>   others                     WRITE_LIFE_NONE
>>>>>>>>  
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Hyunchul Lee (2):
>>>>>>>>>>   f2fs: apply write hints to select the type of segments for buffered
>>>>>>>>>>     write
>>>>>>>>>>   f2fs: apply write hints to select the type of segment for direct write
>>>>>>>>>>
>>>>>>>>>>  fs/f2fs/data.c    | 101 ++++++++++++++++++++++++++++++++----------------------
>>>>>>>>>>  fs/f2fs/f2fs.h    |   1 +
>>>>>>>>>>  fs/f2fs/segment.c |  14 +++++++-
>>>>>>>>>>  3 files changed, 74 insertions(+), 42 deletions(-)
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>>
>>>>>>>> .
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> .
>>>>>>
>>>>>
>>>>>
>>>>
>>>> .
>>>>
>>>
>>>
> 
> .
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH 0/2] apply write hints to select the type of segments
  2017-11-14  6:22                   ` Chao Yu
@ 2017-11-15 16:27                     ` Jaegeuk Kim
  2017-11-16  0:56                       ` Hyunchul Lee
  0 siblings, 1 reply; 30+ messages in thread
From: Jaegeuk Kim @ 2017-11-15 16:27 UTC (permalink / raw)
  To: Chao Yu
  Cc: Hyunchul Lee, linux-f2fs-devel, linux-kernel, kernel-team,
	Hyunchul Lee, Chao Yu

On 11/14, Chao Yu wrote:
> On 2017/11/14 12:20, Jaegeuk Kim wrote:
> > On 11/13, Hyunchul Lee wrote:
> >> On 11/13/2017 10:59 AM, Chao Yu wrote:
> >>> On 2017/11/13 9:35, Hyunchul Lee wrote:
> >>>> On 11/13/2017 10:26 AM, Chao Yu wrote:
> >>>>> On 2017/11/13 8:24, Hyunchul Lee wrote:
> >>>>>> On 11/10/2017 03:42 PM, Chao Yu wrote:
> >>>>>>> On 2017/11/10 8:23, Hyunchul Lee wrote:
> >>>>>>>> Hello, Chao
> >>>>>>>>
> >>>>>>>> On 11/09/2017 06:12 PM, Chao Yu wrote:
> >>>>>>>>> On 2017/11/9 13:51, Hyunchul Lee wrote:
> >>>>>>>>>> From: Hyunchul Lee <cheol.lee@lge.com>
> >>>>>>>>>>
> >>>>>>>>>> Using write hints[1], applications can inform the life time of the data
> >>>>>>>>>> written to devices. and this[2] reported that the write hints patch
> >>>>>>>>>> decreased writes in NAND by 25%.
> >>>>>>>>>>
> >>>>>>>>>> This hints help F2FS to determine the followings.
> >>>>>>>>>>   1) the segment types where the data will be written.
> >>>>>>>>>>   2) the hints that will be passed down to devices with the data of segments.
> >>>>>>>>>>
> >>>>>>>>>> This patch set implements the first mapping from write hints to segment types
> >>>>>>>>>> as shown below.
> >>>>>>>>>>
> >>>>>>>>>>   hints                     segment type
> >>>>>>>>>>   -----                     ------------
> >>>>>>>>>>   WRITE_LIFE_SHORT          CURSEG_COLD_DATA
> >>>>>>>>>>   WRITE_LIFE_EXTREME        CURSEG_HOT_DATA
> >>>>>>>>>>   others                    CURSEG_WARM_DATA
> >>>>>>>>>>
> >>>>>>>>>> The F2FS poliy for hot/cold seperation has precedence over this hints, And
> >>>>>>>>>> hints are not applied in in-place update.
> >>>>>>>>>
> >>>>>>>>> Could we change to disable IPU if file/inode write hint is existing?
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>> I am afraid that this makes side effects. for example, this could cause
> >>>>>>>> out-of-place updates even when there are not enough free segments. 
> >>>>>>>> I can write the patch that handles these situations. But I wonder 
> >>>>>>>> that this is required, and I am not sure which IPU polices can be disabled.
> >>>>>>>
> >>>>>>> Oh, As I replied in another thread, I think IPU just affects filesystem
> >>>>>>> hot/cold separating, rather than this feature. So I think it will be okay
> >>>>>>> to not consider it.
> >>>>>>>
> >>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> Before the second mapping is implemented, write hints are not passed down
> >>>>>>>>>> to devices. Because it is better that the data of a segment have the same 
> >>>>>>>>>> hint.
> >>>>>>>>>>
> >>>>>>>>>> [1]: c75b1d9421f80f4143e389d2d50ddfc8a28c8c35
> >>>>>>>>>> [2]: https://lwn.net/Articles/726477/
> >>>>>>>>>
> >>>>>>>>> Could you write a patch to support passing write hint to block layer for
> >>>>>>>>> buffered writes as below commit:
> >>>>>>>>> 0127251c45ae ("ext4: add support for passing in write hints for buffered writes")
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>> Sure I will. I wrote it already ;)
> >>>>>>>
> >>>>>>> Cool, ;)
> >>>>>>>
> >>>>>>>> I think that datas from the same segment should be passed down with the same
> >>>>>>>> hint, and the following mapping is reasonable. I wonder what is your opinion
> >>>>>>>> about it.
> >>>>>>>>
> >>>>>>>>   segment type               hints
> >>>>>>>>   ------------               -----
> >>>>>>>>   CURSEG_COLD_DATA           WRITE_LIFE_EXTREME
> >>>>>>>>   CURSEG_HOT_DATA            WRITE_LIFE_SHORT
> >>>>>>>>   CURSEG_COLD_NODE           WRITE_LIFE_NORMAL
> >>>>>>>
> >>>>>>> We have WRITE_LIFE_LONG defined rather than WRITE_LIFE_NORMAL in fs.h?
> >>>>>>>
> >>>>>>>>   CURSEG_HOT_NODE            WRITE_LIFE_MEDIUM
> >>>>>>>
> >>>>>>> As I know, in scenario of cell phone, data of meta_inode is hottest, then hot
> >>>>>>> data, warm node, and cold node should be coldest. So I suggested we can define
> >>>>>>> as below:
> >>>>>>>
> >>>>>>> META_DATA			WRITE_LIFE_SHORT
> >>>>>>> HOT_DATA & WARM_NODE		WRITE_LIFE_MEDIUM
> >>>>>>> HOT_NODE & WARM_DATA		WRITE_LIFE_LONG
> >>>>>>> COLD_NODE & COLD_DATA		WRITE_LIFE_EXTREME
> >>>>>>>
> >>>>>>
> >>>>>> I agree, But I am not sure that assigning the same hint to a node and data
> >>>>>> segment is good. Because NVMe is likely to write them in the same erase 
> >>>>>> block if they have the same hint.
> >>>>>
> >>>>> If we do not give the hint, they can still be written to the same erase block,
> >>>
> >>> I mean it's possible to write them to the same erase block. :)
> >>>
> >>>>> right? it will not be worse?
> >>>>>
> >>>>
> >>>> If the hint is not given, I think that they could be written to 
> >>>> the same erase block, or not. But if we give the same hint, they are written
> >>>> to the same block.
> >>>
> >>> IMO, Only if underlying device can support more hint type or opened channels,
> >>> and actual temperature of data segment and node segment is quite different, we
> >>> can separate them.
> >>>
> >>
> >> Okay, If Jaegeuk Kim agrees with this, I will submit the patch that 
> >> implements your proposed mapping.
> > 
> > How about this? We'd better to split data and node blocks as much as possible.
> > 
> > segment type                    hints
> > ------------                    -----
> > COLD_NODE & COLD_DATA		WRITE_LIFE_NONE
> 
> WRITE_LIFE_NONE means there is no hints about write life time.
> 
> Shouldn't we define COLD_NODE & COLD_DATA as WRITE_LIFE_EXTERME?

The assumption would be to split different types of blocks by flash firmware,
so I think we can use WRITE_LIFE_NONE as a type as well.

Thanks,

> 
> Thanks,
> 
> > WARM_DATA			WRITE_LIFE_EXTERME
> > HOT_NODE & WARM_NODE		WRITE_LIFE_LONG
> > HOT_DATA			WRITE_LIFE_MEDIUM
> > META_DATA			WRITE_LIFE_SHORT
> > 
> >>
> >> Thank you for comments ;)
> >>
> >>> Thanks,
> >>>
> >>>> I am not sure ;)
> >>>>
> >>>>> Thanks,
> >>>>>
> >>>>>>
> >>>>>> Thanks.
> >>>>>>
> >>>>>>> Thanks,
> >>>>>>>
> >>>>>>>>   others                     WRITE_LIFE_NONE
> >>>>>>>>  
> >>>>>>>>> Thanks,
> >>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> Hyunchul Lee (2):
> >>>>>>>>>>   f2fs: apply write hints to select the type of segments for buffered
> >>>>>>>>>>     write
> >>>>>>>>>>   f2fs: apply write hints to select the type of segment for direct write
> >>>>>>>>>>
> >>>>>>>>>>  fs/f2fs/data.c    | 101 ++++++++++++++++++++++++++++++++----------------------
> >>>>>>>>>>  fs/f2fs/f2fs.h    |   1 +
> >>>>>>>>>>  fs/f2fs/segment.c |  14 +++++++-
> >>>>>>>>>>  3 files changed, 74 insertions(+), 42 deletions(-)
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>> Thanks
> >>>>>>>>
> >>>>>>>> .
> >>>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>> .
> >>>>>>
> >>>>>
> >>>>>
> >>>>
> >>>> .
> >>>>
> >>>
> >>>
> > 
> > .
> > 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH 0/2] apply write hints to select the type of segments
  2017-11-15 16:27                     ` Jaegeuk Kim
@ 2017-11-16  0:56                       ` Hyunchul Lee
  2017-11-16  2:52                         ` Chao Yu
  0 siblings, 1 reply; 30+ messages in thread
From: Hyunchul Lee @ 2017-11-16  0:56 UTC (permalink / raw)
  To: Jaegeuk Kim, Chao Yu
  Cc: linux-f2fs-devel, linux-kernel, kernel-team, Hyunchul Lee, Chao Yu


On 11/16/2017 01:27 AM, Jaegeuk Kim wrote:
> On 11/14, Chao Yu wrote:
>> On 2017/11/14 12:20, Jaegeuk Kim wrote:
>>> On 11/13, Hyunchul Lee wrote:
>>>> On 11/13/2017 10:59 AM, Chao Yu wrote:
>>>>> On 2017/11/13 9:35, Hyunchul Lee wrote:
>>>>>> On 11/13/2017 10:26 AM, Chao Yu wrote:
>>>>>>> On 2017/11/13 8:24, Hyunchul Lee wrote:
>>>>>>>> On 11/10/2017 03:42 PM, Chao Yu wrote:
>>>>>>>>> On 2017/11/10 8:23, Hyunchul Lee wrote:
>>>>>>>>>> Hello, Chao
>>>>>>>>>>
>>>>>>>>>> On 11/09/2017 06:12 PM, Chao Yu wrote:
>>>>>>>>>>> On 2017/11/9 13:51, Hyunchul Lee wrote:
>>>>>>>>>>>> From: Hyunchul Lee <cheol.lee@lge.com>
>>>>>>>>>>>>
>>>>>>>>>>>> Using write hints[1], applications can inform the life time of the data
>>>>>>>>>>>> written to devices. and this[2] reported that the write hints patch
>>>>>>>>>>>> decreased writes in NAND by 25%.
>>>>>>>>>>>>
>>>>>>>>>>>> This hints help F2FS to determine the followings.
>>>>>>>>>>>>   1) the segment types where the data will be written.
>>>>>>>>>>>>   2) the hints that will be passed down to devices with the data of segments.
>>>>>>>>>>>>
>>>>>>>>>>>> This patch set implements the first mapping from write hints to segment types
>>>>>>>>>>>> as shown below.
>>>>>>>>>>>>
>>>>>>>>>>>>   hints                     segment type
>>>>>>>>>>>>   -----                     ------------
>>>>>>>>>>>>   WRITE_LIFE_SHORT          CURSEG_COLD_DATA
>>>>>>>>>>>>   WRITE_LIFE_EXTREME        CURSEG_HOT_DATA
>>>>>>>>>>>>   others                    CURSEG_WARM_DATA
>>>>>>>>>>>>
>>>>>>>>>>>> The F2FS poliy for hot/cold seperation has precedence over this hints, And
>>>>>>>>>>>> hints are not applied in in-place update.
>>>>>>>>>>>
>>>>>>>>>>> Could we change to disable IPU if file/inode write hint is existing?
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I am afraid that this makes side effects. for example, this could cause
>>>>>>>>>> out-of-place updates even when there are not enough free segments. 
>>>>>>>>>> I can write the patch that handles these situations. But I wonder 
>>>>>>>>>> that this is required, and I am not sure which IPU polices can be disabled.
>>>>>>>>>
>>>>>>>>> Oh, As I replied in another thread, I think IPU just affects filesystem
>>>>>>>>> hot/cold separating, rather than this feature. So I think it will be okay
>>>>>>>>> to not consider it.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Before the second mapping is implemented, write hints are not passed down
>>>>>>>>>>>> to devices. Because it is better that the data of a segment have the same 
>>>>>>>>>>>> hint.
>>>>>>>>>>>>
>>>>>>>>>>>> [1]: c75b1d9421f80f4143e389d2d50ddfc8a28c8c35
>>>>>>>>>>>> [2]: https://lwn.net/Articles/726477/
>>>>>>>>>>>
>>>>>>>>>>> Could you write a patch to support passing write hint to block layer for
>>>>>>>>>>> buffered writes as below commit:
>>>>>>>>>>> 0127251c45ae ("ext4: add support for passing in write hints for buffered writes")
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Sure I will. I wrote it already ;)
>>>>>>>>>
>>>>>>>>> Cool, ;)
>>>>>>>>>
>>>>>>>>>> I think that datas from the same segment should be passed down with the same
>>>>>>>>>> hint, and the following mapping is reasonable. I wonder what is your opinion
>>>>>>>>>> about it.
>>>>>>>>>>
>>>>>>>>>>   segment type               hints
>>>>>>>>>>   ------------               -----
>>>>>>>>>>   CURSEG_COLD_DATA           WRITE_LIFE_EXTREME
>>>>>>>>>>   CURSEG_HOT_DATA            WRITE_LIFE_SHORT
>>>>>>>>>>   CURSEG_COLD_NODE           WRITE_LIFE_NORMAL
>>>>>>>>>
>>>>>>>>> We have WRITE_LIFE_LONG defined rather than WRITE_LIFE_NORMAL in fs.h?
>>>>>>>>>
>>>>>>>>>>   CURSEG_HOT_NODE            WRITE_LIFE_MEDIUM
>>>>>>>>>
>>>>>>>>> As I know, in scenario of cell phone, data of meta_inode is hottest, then hot
>>>>>>>>> data, warm node, and cold node should be coldest. So I suggested we can define
>>>>>>>>> as below:
>>>>>>>>>
>>>>>>>>> META_DATA			WRITE_LIFE_SHORT
>>>>>>>>> HOT_DATA & WARM_NODE		WRITE_LIFE_MEDIUM
>>>>>>>>> HOT_NODE & WARM_DATA		WRITE_LIFE_LONG
>>>>>>>>> COLD_NODE & COLD_DATA		WRITE_LIFE_EXTREME
>>>>>>>>>
>>>>>>>>
>>>>>>>> I agree, But I am not sure that assigning the same hint to a node and data
>>>>>>>> segment is good. Because NVMe is likely to write them in the same erase 
>>>>>>>> block if they have the same hint.
>>>>>>>
>>>>>>> If we do not give the hint, they can still be written to the same erase block,
>>>>>
>>>>> I mean it's possible to write them to the same erase block. :)
>>>>>
>>>>>>> right? it will not be worse?
>>>>>>>
>>>>>>
>>>>>> If the hint is not given, I think that they could be written to 
>>>>>> the same erase block, or not. But if we give the same hint, they are written
>>>>>> to the same block.
>>>>>
>>>>> IMO, Only if underlying device can support more hint type or opened channels,
>>>>> and actual temperature of data segment and node segment is quite different, we
>>>>> can separate them.
>>>>>
>>>>
>>>> Okay, If Jaegeuk Kim agrees with this, I will submit the patch that 
>>>> implements your proposed mapping.
>>>
>>> How about this? We'd better to split data and node blocks as much as possible.
>>>
>>> segment type                    hints
>>> ------------                    -----
>>> COLD_NODE & COLD_DATA		WRITE_LIFE_NONE
>>
>> WRITE_LIFE_NONE means there is no hints about write life time.
>>
>> Shouldn't we define COLD_NODE & COLD_DATA as WRITE_LIFE_EXTERME?
> 
> The assumption would be to split different types of blocks by flash firmware,
> so I think we can use WRITE_LIFE_NONE as a type as well.
> 

WRITE_LIFE_NONE means that no stream id is specified. It equals WRITE_LIFE_NOT_SET.
So I think that we can define WARM_DATA as WRITE_LIFE_NONE, and
COLD_NODE & COLD_DATA as WRITE_LIFE_EXTREME.

Thanks.

> Thanks,
> 
>>
>> Thanks,
>>
>>> WARM_DATA			WRITE_LIFE_EXTERME
>>> HOT_NODE & WARM_NODE		WRITE_LIFE_LONG
>>> HOT_DATA			WRITE_LIFE_MEDIUM
>>> META_DATA			WRITE_LIFE_SHORT
>>>
>>>>
>>>> Thank you for comments ;)
>>>>
>>>>> Thanks,
>>>>>
>>>>>> I am not sure ;)
>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>>>
>>>>>>>> Thanks.
>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>>>   others                     WRITE_LIFE_NONE
>>>>>>>>>>  
>>>>>>>>>>> Thanks,
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Hyunchul Lee (2):
>>>>>>>>>>>>   f2fs: apply write hints to select the type of segments for buffered
>>>>>>>>>>>>     write
>>>>>>>>>>>>   f2fs: apply write hints to select the type of segment for direct write
>>>>>>>>>>>>
>>>>>>>>>>>>  fs/f2fs/data.c    | 101 ++++++++++++++++++++++++++++++++----------------------
>>>>>>>>>>>>  fs/f2fs/f2fs.h    |   1 +
>>>>>>>>>>>>  fs/f2fs/segment.c |  14 +++++++-
>>>>>>>>>>>>  3 files changed, 74 insertions(+), 42 deletions(-)
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks
>>>>>>>>>>
>>>>>>>>>> .
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>> .
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> .
>>>>>>
>>>>>
>>>>>
>>>
>>> .
>>>
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH 0/2] apply write hints to select the type of segments
  2017-11-16  0:56                       ` Hyunchul Lee
@ 2017-11-16  2:52                         ` Chao Yu
  2017-11-16  3:58                           ` Jaegeuk Kim
  0 siblings, 1 reply; 30+ messages in thread
From: Chao Yu @ 2017-11-16  2:52 UTC (permalink / raw)
  To: Hyunchul Lee, Jaegeuk Kim
  Cc: linux-f2fs-devel, linux-kernel, kernel-team, Hyunchul Lee, Chao Yu

On 2017/11/16 8:56, Hyunchul Lee wrote:
> 
> On 11/16/2017 01:27 AM, Jaegeuk Kim wrote:
>> On 11/14, Chao Yu wrote:
>>> On 2017/11/14 12:20, Jaegeuk Kim wrote:
>>>> On 11/13, Hyunchul Lee wrote:
>>>>> On 11/13/2017 10:59 AM, Chao Yu wrote:
>>>>>> On 2017/11/13 9:35, Hyunchul Lee wrote:
>>>>>>> On 11/13/2017 10:26 AM, Chao Yu wrote:
>>>>>>>> On 2017/11/13 8:24, Hyunchul Lee wrote:
>>>>>>>>> On 11/10/2017 03:42 PM, Chao Yu wrote:
>>>>>>>>>> On 2017/11/10 8:23, Hyunchul Lee wrote:
>>>>>>>>>>> Hello, Chao
>>>>>>>>>>>
>>>>>>>>>>> On 11/09/2017 06:12 PM, Chao Yu wrote:
>>>>>>>>>>>> On 2017/11/9 13:51, Hyunchul Lee wrote:
>>>>>>>>>>>>> From: Hyunchul Lee <cheol.lee@lge.com>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Using write hints[1], applications can inform the life time of the data
>>>>>>>>>>>>> written to devices. and this[2] reported that the write hints patch
>>>>>>>>>>>>> decreased writes in NAND by 25%.
>>>>>>>>>>>>>
>>>>>>>>>>>>> This hints help F2FS to determine the followings.
>>>>>>>>>>>>>   1) the segment types where the data will be written.
>>>>>>>>>>>>>   2) the hints that will be passed down to devices with the data of segments.
>>>>>>>>>>>>>
>>>>>>>>>>>>> This patch set implements the first mapping from write hints to segment types
>>>>>>>>>>>>> as shown below.
>>>>>>>>>>>>>
>>>>>>>>>>>>>   hints                     segment type
>>>>>>>>>>>>>   -----                     ------------
>>>>>>>>>>>>>   WRITE_LIFE_SHORT          CURSEG_COLD_DATA
>>>>>>>>>>>>>   WRITE_LIFE_EXTREME        CURSEG_HOT_DATA
>>>>>>>>>>>>>   others                    CURSEG_WARM_DATA
>>>>>>>>>>>>>
>>>>>>>>>>>>> The F2FS poliy for hot/cold seperation has precedence over this hints, And
>>>>>>>>>>>>> hints are not applied in in-place update.
>>>>>>>>>>>>
>>>>>>>>>>>> Could we change to disable IPU if file/inode write hint is existing?
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> I am afraid that this makes side effects. for example, this could cause
>>>>>>>>>>> out-of-place updates even when there are not enough free segments. 
>>>>>>>>>>> I can write the patch that handles these situations. But I wonder 
>>>>>>>>>>> that this is required, and I am not sure which IPU polices can be disabled.
>>>>>>>>>>
>>>>>>>>>> Oh, As I replied in another thread, I think IPU just affects filesystem
>>>>>>>>>> hot/cold separating, rather than this feature. So I think it will be okay
>>>>>>>>>> to not consider it.
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Before the second mapping is implemented, write hints are not passed down
>>>>>>>>>>>>> to devices. Because it is better that the data of a segment have the same 
>>>>>>>>>>>>> hint.
>>>>>>>>>>>>>
>>>>>>>>>>>>> [1]: c75b1d9421f80f4143e389d2d50ddfc8a28c8c35
>>>>>>>>>>>>> [2]: https://lwn.net/Articles/726477/
>>>>>>>>>>>>
>>>>>>>>>>>> Could you write a patch to support passing write hint to block layer for
>>>>>>>>>>>> buffered writes as below commit:
>>>>>>>>>>>> 0127251c45ae ("ext4: add support for passing in write hints for buffered writes")
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Sure I will. I wrote it already ;)
>>>>>>>>>>
>>>>>>>>>> Cool, ;)
>>>>>>>>>>
>>>>>>>>>>> I think that datas from the same segment should be passed down with the same
>>>>>>>>>>> hint, and the following mapping is reasonable. I wonder what is your opinion
>>>>>>>>>>> about it.
>>>>>>>>>>>
>>>>>>>>>>>   segment type               hints
>>>>>>>>>>>   ------------               -----
>>>>>>>>>>>   CURSEG_COLD_DATA           WRITE_LIFE_EXTREME
>>>>>>>>>>>   CURSEG_HOT_DATA            WRITE_LIFE_SHORT
>>>>>>>>>>>   CURSEG_COLD_NODE           WRITE_LIFE_NORMAL
>>>>>>>>>>
>>>>>>>>>> We have WRITE_LIFE_LONG defined rather than WRITE_LIFE_NORMAL in fs.h?
>>>>>>>>>>
>>>>>>>>>>>   CURSEG_HOT_NODE            WRITE_LIFE_MEDIUM
>>>>>>>>>>
>>>>>>>>>> As I know, in scenario of cell phone, data of meta_inode is hottest, then hot
>>>>>>>>>> data, warm node, and cold node should be coldest. So I suggested we can define
>>>>>>>>>> as below:
>>>>>>>>>>
>>>>>>>>>> META_DATA			WRITE_LIFE_SHORT
>>>>>>>>>> HOT_DATA & WARM_NODE		WRITE_LIFE_MEDIUM
>>>>>>>>>> HOT_NODE & WARM_DATA		WRITE_LIFE_LONG
>>>>>>>>>> COLD_NODE & COLD_DATA		WRITE_LIFE_EXTREME
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I agree, But I am not sure that assigning the same hint to a node and data
>>>>>>>>> segment is good. Because NVMe is likely to write them in the same erase 
>>>>>>>>> block if they have the same hint.
>>>>>>>>
>>>>>>>> If we do not give the hint, they can still be written to the same erase block,
>>>>>>
>>>>>> I mean it's possible to write them to the same erase block. :)
>>>>>>
>>>>>>>> right? it will not be worse?
>>>>>>>>
>>>>>>>
>>>>>>> If the hint is not given, I think that they could be written to 
>>>>>>> the same erase block, or not. But if we give the same hint, they are written
>>>>>>> to the same block.
>>>>>>
>>>>>> IMO, Only if underlying device can support more hint type or opened channels,
>>>>>> and actual temperature of data segment and node segment is quite different, we
>>>>>> can separate them.
>>>>>>
>>>>>
>>>>> Okay, If Jaegeuk Kim agrees with this, I will submit the patch that 
>>>>> implements your proposed mapping.
>>>>
>>>> How about this? We'd better to split data and node blocks as much as possible.
>>>>
>>>> segment type                    hints
>>>> ------------                    -----
>>>> COLD_NODE & COLD_DATA		WRITE_LIFE_NONE
>>>
>>> WRITE_LIFE_NONE means there is no hints about write life time.
>>>
>>> Shouldn't we define COLD_NODE & COLD_DATA as WRITE_LIFE_EXTERME?
>>
>> The assumption would be to split different types of blocks by flash firmware,
>> so I think we can use WRITE_LIFE_NONE as a type as well.
>>
> 
> WRITE_LIFE_NONE means that no stream id is specified. It equals WRITE_LIFE_NOT_SET.

Rgith, I just saw nvme implementation:

nvme_assign_write_stream

	enum rw_hint streamid = req->write_hint;

	if (streamid == WRITE_LIFE_NOT_SET || streamid == WRITE_LIFE_NONE)
		streamid = 0;
	else {
		streamid--;
...

> So I think that we can define WARM_DATA as WRITE_LIFE_NONE, and
> COLD_NODE & COLD_DATA as WRITE_LIFE_EXTREME.

I think that would be better.

Thanks,

> 
> Thanks.
> 
>> Thanks,
>>
>>>
>>> Thanks,
>>>
>>>> WARM_DATA			WRITE_LIFE_EXTERME
>>>> HOT_NODE & WARM_NODE		WRITE_LIFE_LONG
>>>> HOT_DATA			WRITE_LIFE_MEDIUM
>>>> META_DATA			WRITE_LIFE_SHORT
>>>>
>>>>>
>>>>> Thank you for comments ;)
>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>>> I am not sure ;)
>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks.
>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>>
>>>>>>>>>>>   others                     WRITE_LIFE_NONE
>>>>>>>>>>>  
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hyunchul Lee (2):
>>>>>>>>>>>>>   f2fs: apply write hints to select the type of segments for buffered
>>>>>>>>>>>>>     write
>>>>>>>>>>>>>   f2fs: apply write hints to select the type of segment for direct write
>>>>>>>>>>>>>
>>>>>>>>>>>>>  fs/f2fs/data.c    | 101 ++++++++++++++++++++++++++++++++----------------------
>>>>>>>>>>>>>  fs/f2fs/f2fs.h    |   1 +
>>>>>>>>>>>>>  fs/f2fs/segment.c |  14 +++++++-
>>>>>>>>>>>>>  3 files changed, 74 insertions(+), 42 deletions(-)
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Thanks
>>>>>>>>>>>
>>>>>>>>>>> .
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> .
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>> .
>>>>>>>
>>>>>>
>>>>>>
>>>>
>>>> .
>>>>
>>
> 
> .
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH 0/2] apply write hints to select the type of segments
  2017-11-16  2:52                         ` Chao Yu
@ 2017-11-16  3:58                           ` Jaegeuk Kim
  2017-11-16  4:35                             ` Hyunchul Lee
  0 siblings, 1 reply; 30+ messages in thread
From: Jaegeuk Kim @ 2017-11-16  3:58 UTC (permalink / raw)
  To: Chao Yu
  Cc: Hyunchul Lee, linux-f2fs-devel, linux-kernel, kernel-team,
	Hyunchul Lee, Chao Yu

On 11/16, Chao Yu wrote:
> On 2017/11/16 8:56, Hyunchul Lee wrote:
> > 
> > On 11/16/2017 01:27 AM, Jaegeuk Kim wrote:
> >> On 11/14, Chao Yu wrote:
> >>> On 2017/11/14 12:20, Jaegeuk Kim wrote:
> >>>> On 11/13, Hyunchul Lee wrote:
> >>>>> On 11/13/2017 10:59 AM, Chao Yu wrote:
> >>>>>> On 2017/11/13 9:35, Hyunchul Lee wrote:
> >>>>>>> On 11/13/2017 10:26 AM, Chao Yu wrote:
> >>>>>>>> On 2017/11/13 8:24, Hyunchul Lee wrote:
> >>>>>>>>> On 11/10/2017 03:42 PM, Chao Yu wrote:
> >>>>>>>>>> On 2017/11/10 8:23, Hyunchul Lee wrote:
> >>>>>>>>>>> Hello, Chao
> >>>>>>>>>>>
> >>>>>>>>>>> On 11/09/2017 06:12 PM, Chao Yu wrote:
> >>>>>>>>>>>> On 2017/11/9 13:51, Hyunchul Lee wrote:
> >>>>>>>>>>>>> From: Hyunchul Lee <cheol.lee@lge.com>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Using write hints[1], applications can inform the life time of the data
> >>>>>>>>>>>>> written to devices. and this[2] reported that the write hints patch
> >>>>>>>>>>>>> decreased writes in NAND by 25%.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> This hints help F2FS to determine the followings.
> >>>>>>>>>>>>>   1) the segment types where the data will be written.
> >>>>>>>>>>>>>   2) the hints that will be passed down to devices with the data of segments.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> This patch set implements the first mapping from write hints to segment types
> >>>>>>>>>>>>> as shown below.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>   hints                     segment type
> >>>>>>>>>>>>>   -----                     ------------
> >>>>>>>>>>>>>   WRITE_LIFE_SHORT          CURSEG_COLD_DATA
> >>>>>>>>>>>>>   WRITE_LIFE_EXTREME        CURSEG_HOT_DATA
> >>>>>>>>>>>>>   others                    CURSEG_WARM_DATA
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> The F2FS poliy for hot/cold seperation has precedence over this hints, And
> >>>>>>>>>>>>> hints are not applied in in-place update.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Could we change to disable IPU if file/inode write hint is existing?
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> I am afraid that this makes side effects. for example, this could cause
> >>>>>>>>>>> out-of-place updates even when there are not enough free segments. 
> >>>>>>>>>>> I can write the patch that handles these situations. But I wonder 
> >>>>>>>>>>> that this is required, and I am not sure which IPU polices can be disabled.
> >>>>>>>>>>
> >>>>>>>>>> Oh, As I replied in another thread, I think IPU just affects filesystem
> >>>>>>>>>> hot/cold separating, rather than this feature. So I think it will be okay
> >>>>>>>>>> to not consider it.
> >>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Before the second mapping is implemented, write hints are not passed down
> >>>>>>>>>>>>> to devices. Because it is better that the data of a segment have the same 
> >>>>>>>>>>>>> hint.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> [1]: c75b1d9421f80f4143e389d2d50ddfc8a28c8c35
> >>>>>>>>>>>>> [2]: https://lwn.net/Articles/726477/
> >>>>>>>>>>>>
> >>>>>>>>>>>> Could you write a patch to support passing write hint to block layer for
> >>>>>>>>>>>> buffered writes as below commit:
> >>>>>>>>>>>> 0127251c45ae ("ext4: add support for passing in write hints for buffered writes")
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> Sure I will. I wrote it already ;)
> >>>>>>>>>>
> >>>>>>>>>> Cool, ;)
> >>>>>>>>>>
> >>>>>>>>>>> I think that datas from the same segment should be passed down with the same
> >>>>>>>>>>> hint, and the following mapping is reasonable. I wonder what is your opinion
> >>>>>>>>>>> about it.
> >>>>>>>>>>>
> >>>>>>>>>>>   segment type               hints
> >>>>>>>>>>>   ------------               -----
> >>>>>>>>>>>   CURSEG_COLD_DATA           WRITE_LIFE_EXTREME
> >>>>>>>>>>>   CURSEG_HOT_DATA            WRITE_LIFE_SHORT
> >>>>>>>>>>>   CURSEG_COLD_NODE           WRITE_LIFE_NORMAL
> >>>>>>>>>>
> >>>>>>>>>> We have WRITE_LIFE_LONG defined rather than WRITE_LIFE_NORMAL in fs.h?
> >>>>>>>>>>
> >>>>>>>>>>>   CURSEG_HOT_NODE            WRITE_LIFE_MEDIUM
> >>>>>>>>>>
> >>>>>>>>>> As I know, in scenario of cell phone, data of meta_inode is hottest, then hot
> >>>>>>>>>> data, warm node, and cold node should be coldest. So I suggested we can define
> >>>>>>>>>> as below:
> >>>>>>>>>>
> >>>>>>>>>> META_DATA			WRITE_LIFE_SHORT
> >>>>>>>>>> HOT_DATA & WARM_NODE		WRITE_LIFE_MEDIUM
> >>>>>>>>>> HOT_NODE & WARM_DATA		WRITE_LIFE_LONG
> >>>>>>>>>> COLD_NODE & COLD_DATA		WRITE_LIFE_EXTREME
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> I agree, But I am not sure that assigning the same hint to a node and data
> >>>>>>>>> segment is good. Because NVMe is likely to write them in the same erase 
> >>>>>>>>> block if they have the same hint.
> >>>>>>>>
> >>>>>>>> If we do not give the hint, they can still be written to the same erase block,
> >>>>>>
> >>>>>> I mean it's possible to write them to the same erase block. :)
> >>>>>>
> >>>>>>>> right? it will not be worse?
> >>>>>>>>
> >>>>>>>
> >>>>>>> If the hint is not given, I think that they could be written to 
> >>>>>>> the same erase block, or not. But if we give the same hint, they are written
> >>>>>>> to the same block.
> >>>>>>
> >>>>>> IMO, Only if underlying device can support more hint type or opened channels,
> >>>>>> and actual temperature of data segment and node segment is quite different, we
> >>>>>> can separate them.
> >>>>>>
> >>>>>
> >>>>> Okay, If Jaegeuk Kim agrees with this, I will submit the patch that 
> >>>>> implements your proposed mapping.
> >>>>
> >>>> How about this? We'd better to split data and node blocks as much as possible.
> >>>>
> >>>> segment type                    hints
> >>>> ------------                    -----
> >>>> COLD_NODE & COLD_DATA		WRITE_LIFE_NONE
> >>>
> >>> WRITE_LIFE_NONE means there is no hints about write life time.
> >>>
> >>> Shouldn't we define COLD_NODE & COLD_DATA as WRITE_LIFE_EXTERME?
> >>
> >> The assumption would be to split different types of blocks by flash firmware,
> >> so I think we can use WRITE_LIFE_NONE as a type as well.
> >>
> > 
> > WRITE_LIFE_NONE means that no stream id is specified. It equals WRITE_LIFE_NOT_SET.
> 
> Rgith, I just saw nvme implementation:
> 
> nvme_assign_write_stream
> 
> 	enum rw_hint streamid = req->write_hint;
> 
> 	if (streamid == WRITE_LIFE_NOT_SET || streamid == WRITE_LIFE_NONE)
> 		streamid = 0;
> 	else {
> 		streamid--;
> ...
> 
> > So I think that we can define WARM_DATA as WRITE_LIFE_NONE, and
> > COLD_NODE & COLD_DATA as WRITE_LIFE_EXTREME.

What's the point?

segment type                 hints                streamid
-------------                -----                -------
COLD_NODE & COLD_DATA        WRITE_LIFE_NONE      0
WARM_DATA                    WRITE_LIFE_EXTERME   4
HOT_NODE & WARM_NODE         WRITE_LIFE_LONG      3
HOT_DATA                     WRITE_LIFE_MEDIUM    2
META_DATA                    WRITE_LIFE_SHORT     1

So, I don't think something is wrong. Again, I don't care about its hotness
given to the naming, but do care how to split different types of blocks with
different stream ids. Exceptions would be giving _SHORT or _MEDIUM which are
likely to be latency-critical, since I guess firmware may be able to store them
into SLC buffer.

Am I missing that _NONE has another meaning?

Thanks,

> 
> I think that would be better.
> 
> Thanks,
> 
> > 
> > Thanks.
> > 
> >> Thanks,
> >>
> >>>
> >>> Thanks,
> >>>
> >>>> WARM_DATA			WRITE_LIFE_EXTERME
> >>>> HOT_NODE & WARM_NODE		WRITE_LIFE_LONG
> >>>> HOT_DATA			WRITE_LIFE_MEDIUM
> >>>> META_DATA			WRITE_LIFE_SHORT
> >>>>
> >>>>>
> >>>>> Thank you for comments ;)
> >>>>>
> >>>>>> Thanks,
> >>>>>>
> >>>>>>> I am not sure ;)
> >>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Thanks.
> >>>>>>>>>
> >>>>>>>>>> Thanks,
> >>>>>>>>>>
> >>>>>>>>>>>   others                     WRITE_LIFE_NONE
> >>>>>>>>>>>  
> >>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Hyunchul Lee (2):
> >>>>>>>>>>>>>   f2fs: apply write hints to select the type of segments for buffered
> >>>>>>>>>>>>>     write
> >>>>>>>>>>>>>   f2fs: apply write hints to select the type of segment for direct write
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>  fs/f2fs/data.c    | 101 ++++++++++++++++++++++++++++++++----------------------
> >>>>>>>>>>>>>  fs/f2fs/f2fs.h    |   1 +
> >>>>>>>>>>>>>  fs/f2fs/segment.c |  14 +++++++-
> >>>>>>>>>>>>>  3 files changed, 74 insertions(+), 42 deletions(-)
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> Thanks
> >>>>>>>>>>>
> >>>>>>>>>>> .
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> .
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>> .
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>>
> >>>> .
> >>>>
> >>
> > 
> > .
> > 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH 0/2] apply write hints to select the type of segments
  2017-11-16  3:58                           ` Jaegeuk Kim
@ 2017-11-16  4:35                             ` Hyunchul Lee
  2017-11-17 18:53                               ` Jaegeuk Kim
  0 siblings, 1 reply; 30+ messages in thread
From: Hyunchul Lee @ 2017-11-16  4:35 UTC (permalink / raw)
  To: Jaegeuk Kim, Chao Yu
  Cc: linux-f2fs-devel, linux-kernel, kernel-team, Hyunchul Lee, Chao Yu

On 11/16/2017 12:58 PM, Jaegeuk Kim wrote:
> On 11/16, Chao Yu wrote:
>> On 2017/11/16 8:56, Hyunchul Lee wrote:
>>>
>>> On 11/16/2017 01:27 AM, Jaegeuk Kim wrote:
>>>> On 11/14, Chao Yu wrote:
>>>>> On 2017/11/14 12:20, Jaegeuk Kim wrote:
>>>>>> On 11/13, Hyunchul Lee wrote:
>>>>>>> On 11/13/2017 10:59 AM, Chao Yu wrote:
>>>>>>>> On 2017/11/13 9:35, Hyunchul Lee wrote:
>>>>>>>>> On 11/13/2017 10:26 AM, Chao Yu wrote:
>>>>>>>>>> On 2017/11/13 8:24, Hyunchul Lee wrote:
>>>>>>>>>>> On 11/10/2017 03:42 PM, Chao Yu wrote:
>>>>>>>>>>>> On 2017/11/10 8:23, Hyunchul Lee wrote:
>>>>>>>>>>>>> Hello, Chao
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 11/09/2017 06:12 PM, Chao Yu wrote:
>>>>>>>>>>>>>> On 2017/11/9 13:51, Hyunchul Lee wrote:
>>>>>>>>>>>>>>> From: Hyunchul Lee <cheol.lee@lge.com>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Using write hints[1], applications can inform the life time of the data
>>>>>>>>>>>>>>> written to devices. and this[2] reported that the write hints patch
>>>>>>>>>>>>>>> decreased writes in NAND by 25%.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> This hints help F2FS to determine the followings.
>>>>>>>>>>>>>>>   1) the segment types where the data will be written.
>>>>>>>>>>>>>>>   2) the hints that will be passed down to devices with the data of segments.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> This patch set implements the first mapping from write hints to segment types
>>>>>>>>>>>>>>> as shown below.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>   hints                     segment type
>>>>>>>>>>>>>>>   -----                     ------------
>>>>>>>>>>>>>>>   WRITE_LIFE_SHORT          CURSEG_COLD_DATA
>>>>>>>>>>>>>>>   WRITE_LIFE_EXTREME        CURSEG_HOT_DATA
>>>>>>>>>>>>>>>   others                    CURSEG_WARM_DATA
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The F2FS poliy for hot/cold seperation has precedence over this hints, And
>>>>>>>>>>>>>>> hints are not applied in in-place update.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Could we change to disable IPU if file/inode write hint is existing?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> I am afraid that this makes side effects. for example, this could cause
>>>>>>>>>>>>> out-of-place updates even when there are not enough free segments. 
>>>>>>>>>>>>> I can write the patch that handles these situations. But I wonder 
>>>>>>>>>>>>> that this is required, and I am not sure which IPU polices can be disabled.
>>>>>>>>>>>>
>>>>>>>>>>>> Oh, As I replied in another thread, I think IPU just affects filesystem
>>>>>>>>>>>> hot/cold separating, rather than this feature. So I think it will be okay
>>>>>>>>>>>> to not consider it.
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Before the second mapping is implemented, write hints are not passed down
>>>>>>>>>>>>>>> to devices. Because it is better that the data of a segment have the same 
>>>>>>>>>>>>>>> hint.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> [1]: c75b1d9421f80f4143e389d2d50ddfc8a28c8c35
>>>>>>>>>>>>>>> [2]: https://lwn.net/Articles/726477/
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Could you write a patch to support passing write hint to block layer for
>>>>>>>>>>>>>> buffered writes as below commit:
>>>>>>>>>>>>>> 0127251c45ae ("ext4: add support for passing in write hints for buffered writes")
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Sure I will. I wrote it already ;)
>>>>>>>>>>>>
>>>>>>>>>>>> Cool, ;)
>>>>>>>>>>>>
>>>>>>>>>>>>> I think that datas from the same segment should be passed down with the same
>>>>>>>>>>>>> hint, and the following mapping is reasonable. I wonder what is your opinion
>>>>>>>>>>>>> about it.
>>>>>>>>>>>>>
>>>>>>>>>>>>>   segment type               hints
>>>>>>>>>>>>>   ------------               -----
>>>>>>>>>>>>>   CURSEG_COLD_DATA           WRITE_LIFE_EXTREME
>>>>>>>>>>>>>   CURSEG_HOT_DATA            WRITE_LIFE_SHORT
>>>>>>>>>>>>>   CURSEG_COLD_NODE           WRITE_LIFE_NORMAL
>>>>>>>>>>>>
>>>>>>>>>>>> We have WRITE_LIFE_LONG defined rather than WRITE_LIFE_NORMAL in fs.h?
>>>>>>>>>>>>
>>>>>>>>>>>>>   CURSEG_HOT_NODE            WRITE_LIFE_MEDIUM
>>>>>>>>>>>>
>>>>>>>>>>>> As I know, in scenario of cell phone, data of meta_inode is hottest, then hot
>>>>>>>>>>>> data, warm node, and cold node should be coldest. So I suggested we can define
>>>>>>>>>>>> as below:
>>>>>>>>>>>>
>>>>>>>>>>>> META_DATA			WRITE_LIFE_SHORT
>>>>>>>>>>>> HOT_DATA & WARM_NODE		WRITE_LIFE_MEDIUM
>>>>>>>>>>>> HOT_NODE & WARM_DATA		WRITE_LIFE_LONG
>>>>>>>>>>>> COLD_NODE & COLD_DATA		WRITE_LIFE_EXTREME
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> I agree, But I am not sure that assigning the same hint to a node and data
>>>>>>>>>>> segment is good. Because NVMe is likely to write them in the same erase 
>>>>>>>>>>> block if they have the same hint.
>>>>>>>>>>
>>>>>>>>>> If we do not give the hint, they can still be written to the same erase block,
>>>>>>>>
>>>>>>>> I mean it's possible to write them to the same erase block. :)
>>>>>>>>
>>>>>>>>>> right? it will not be worse?
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> If the hint is not given, I think that they could be written to 
>>>>>>>>> the same erase block, or not. But if we give the same hint, they are written
>>>>>>>>> to the same block.
>>>>>>>>
>>>>>>>> IMO, Only if underlying device can support more hint type or opened channels,
>>>>>>>> and actual temperature of data segment and node segment is quite different, we
>>>>>>>> can separate them.
>>>>>>>>
>>>>>>>
>>>>>>> Okay, If Jaegeuk Kim agrees with this, I will submit the patch that 
>>>>>>> implements your proposed mapping.
>>>>>>
>>>>>> How about this? We'd better to split data and node blocks as much as possible.
>>>>>>
>>>>>> segment type                    hints
>>>>>> ------------                    -----
>>>>>> COLD_NODE & COLD_DATA		WRITE_LIFE_NONE
>>>>>
>>>>> WRITE_LIFE_NONE means there is no hints about write life time.
>>>>>
>>>>> Shouldn't we define COLD_NODE & COLD_DATA as WRITE_LIFE_EXTERME?
>>>>
>>>> The assumption would be to split different types of blocks by flash firmware,
>>>> so I think we can use WRITE_LIFE_NONE as a type as well.
>>>>
>>>
>>> WRITE_LIFE_NONE means that no stream id is specified. It equals WRITE_LIFE_NOT_SET.
>>
>> Rgith, I just saw nvme implementation:
>>
>> nvme_assign_write_stream
>>
>> 	enum rw_hint streamid = req->write_hint;
>>
>> 	if (streamid == WRITE_LIFE_NOT_SET || streamid == WRITE_LIFE_NONE)
>> 		streamid = 0;
>> 	else {
>> 		streamid--;
>> ...
>>
>>> So I think that we can define WARM_DATA as WRITE_LIFE_NONE, and
>>> COLD_NODE & COLD_DATA as WRITE_LIFE_EXTREME.
> 
> What's the point?
> 
> segment type                 hints                streamid
> -------------                -----                -------
> COLD_NODE & COLD_DATA        WRITE_LIFE_NONE      0
> WARM_DATA                    WRITE_LIFE_EXTERME   4
> HOT_NODE & WARM_NODE         WRITE_LIFE_LONG      3
> HOT_DATA                     WRITE_LIFE_MEDIUM    2
> META_DATA                    WRITE_LIFE_SHORT     1
> 
> So, I don't think something is wrong. Again, I don't care about its hotness
> given to the naming, but do care how to split different types of blocks with
> different stream ids. Exceptions would be giving _SHORT or _MEDIUM which are
> likely to be latency-critical, since I guess firmware may be able to store them
> into SLC buffer.
> 
> Am I missing that _NONE has another meaning?
> 

What I am worried about is that datas with no hint have WRITE_LIFE_NOT_SET(id 0).
If block devices have swap partitions and anothor file systems, cold datas could
be mixed with datas from that. Does this seems way too much?

And I think that stream id 0 means disabling stream directives. 
Becasue NVME_RW_DTYPE_STREAMS is clear.

Thanks.

> Thanks,
> 
>>
>> I think that would be better.
>>
>> Thanks,
>>
>>>
>>> Thanks.
>>>
>>>> Thanks,
>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>>> WARM_DATA			WRITE_LIFE_EXTERME
>>>>>> HOT_NODE & WARM_NODE		WRITE_LIFE_LONG
>>>>>> HOT_DATA			WRITE_LIFE_MEDIUM
>>>>>> META_DATA			WRITE_LIFE_SHORT
>>>>>>
>>>>>>>
>>>>>>> Thank you for comments ;)
>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>>> I am not sure ;)
>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Thanks.
>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>
>>>>>>>>>>>>>   others                     WRITE_LIFE_NONE
>>>>>>>>>>>>>  
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hyunchul Lee (2):
>>>>>>>>>>>>>>>   f2fs: apply write hints to select the type of segments for buffered
>>>>>>>>>>>>>>>     write
>>>>>>>>>>>>>>>   f2fs: apply write hints to select the type of segment for direct write
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>  fs/f2fs/data.c    | 101 ++++++++++++++++++++++++++++++++----------------------
>>>>>>>>>>>>>>>  fs/f2fs/f2fs.h    |   1 +
>>>>>>>>>>>>>>>  fs/f2fs/segment.c |  14 +++++++-
>>>>>>>>>>>>>>>  3 files changed, 74 insertions(+), 42 deletions(-)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>
>>>>>>>>>>>>> .
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> .
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> .
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>> .
>>>>>>
>>>>
>>>
>>> .
>>>
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH 0/2] apply write hints to select the type of segments
  2017-11-09  5:51 [RFC PATCH 0/2] apply write hints to select the type of segments Hyunchul Lee
                   ` (2 preceding siblings ...)
  2017-11-09  9:12 ` [RFC PATCH 0/2] apply write hints to select the type of segments Chao Yu
@ 2017-11-17 17:23 ` Christoph Hellwig
  2017-11-17 18:36   ` Jaegeuk Kim
  3 siblings, 1 reply; 30+ messages in thread
From: Christoph Hellwig @ 2017-11-17 17:23 UTC (permalink / raw)
  To: Hyunchul Lee
  Cc: Jaegeuk Kim, Chao Yu, linux-f2fs-devel, linux-kernel,
	linux-fsdevel, kernel-team, Hyunchul Lee, Jens Axboe,
	linux-block


Next time please coordinate this with the block list and Jens, who
actually wrote the patch.

>   hints                     segment type
>   -----                     ------------
>   WRITE_LIFE_SHORT          CURSEG_COLD_DATA
>   WRITE_LIFE_EXTREME        CURSEG_HOT_DATA
>   others                    CURSEG_WARM_DATA

Normally cold data is data with a long lifetime, and extreme is colder
than cold, so there seems to be some mismatch here.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH 0/2] apply write hints to select the type of segments
  2017-11-17 17:23 ` Christoph Hellwig
@ 2017-11-17 18:36   ` Jaegeuk Kim
  0 siblings, 0 replies; 30+ messages in thread
From: Jaegeuk Kim @ 2017-11-17 18:36 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Hyunchul Lee, Chao Yu, linux-f2fs-devel, linux-kernel,
	linux-fsdevel, kernel-team, Hyunchul Lee, Jens Axboe,
	linux-block

On 11/17, Christoph Hellwig wrote:
> 
> Next time please coordinate this with the block list and Jens, who
> actually wrote the patch.

Got it.

> 
> >   hints                     segment type
> >   -----                     ------------
> >   WRITE_LIFE_SHORT          CURSEG_COLD_DATA
> >   WRITE_LIFE_EXTREME        CURSEG_HOT_DATA
> >   others                    CURSEG_WARM_DATA
> 
> Normally cold data is data with a long lifetime, and extreme is colder
> than cold, so there seems to be some mismatch here.

It was wrong description and I fixed it which matches to implementation.

The below description was merged:

WRITE_LIFE_SHORT          CURSEG_HOT_DATA
WRITE_LIFE_EXTREME        CURSEG_COLD_DATA
others                    CURSEG_WARM_DATA

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH 0/2] apply write hints to select the type of segments
  2017-11-16  4:35                             ` Hyunchul Lee
@ 2017-11-17 18:53                               ` Jaegeuk Kim
  2017-11-20  2:12                                 ` Hyunchul Lee
  0 siblings, 1 reply; 30+ messages in thread
From: Jaegeuk Kim @ 2017-11-17 18:53 UTC (permalink / raw)
  To: Hyunchul Lee
  Cc: Chao Yu, linux-f2fs-devel, linux-kernel, kernel-team,
	Hyunchul Lee, Chao Yu, linux-block, axboe, hch

...
> >>>>>>>>>>>>>>> From: Hyunchul Lee <cheol.lee@lge.com>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Using write hints[1], applications can inform the life time of the data
> >>>>>>>>>>>>>>> written to devices. and this[2] reported that the write hints patch
> >>>>>>>>>>>>>>> decreased writes in NAND by 25%.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> This hints help F2FS to determine the followings.
> >>>>>>>>>>>>>>>   1) the segment types where the data will be written.
> >>>>>>>>>>>>>>>   2) the hints that will be passed down to devices with the data of segments.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> This patch set implements the first mapping from write hints to segment types
> >>>>>>>>>>>>>>> as shown below.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>   hints                     segment type
> >>>>>>>>>>>>>>>   -----                     ------------
> >>>>>>>>>>>>>>>   WRITE_LIFE_SHORT          CURSEG_COLD_DATA
> >>>>>>>>>>>>>>>   WRITE_LIFE_EXTREME        CURSEG_HOT_DATA
> >>>>>>>>>>>>>>>   others                    CURSEG_WARM_DATA
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> The F2FS poliy for hot/cold seperation has precedence over this hints, And
> >>>>>>>>>>>>>>> hints are not applied in in-place update.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Could we change to disable IPU if file/inode write hint is existing?
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I am afraid that this makes side effects. for example, this could cause
> >>>>>>>>>>>>> out-of-place updates even when there are not enough free segments. 
> >>>>>>>>>>>>> I can write the patch that handles these situations. But I wonder 
> >>>>>>>>>>>>> that this is required, and I am not sure which IPU polices can be disabled.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Oh, As I replied in another thread, I think IPU just affects filesystem
> >>>>>>>>>>>> hot/cold separating, rather than this feature. So I think it will be okay
> >>>>>>>>>>>> to not consider it.
> >>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Before the second mapping is implemented, write hints are not passed down
> >>>>>>>>>>>>>>> to devices. Because it is better that the data of a segment have the same 
> >>>>>>>>>>>>>>> hint.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> [1]: c75b1d9421f80f4143e389d2d50ddfc8a28c8c35
> >>>>>>>>>>>>>>> [2]: https://lwn.net/Articles/726477/
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Could you write a patch to support passing write hint to block layer for
> >>>>>>>>>>>>>> buffered writes as below commit:
> >>>>>>>>>>>>>> 0127251c45ae ("ext4: add support for passing in write hints for buffered writes")
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Sure I will. I wrote it already ;)
> >>>>>>>>>>>>
> >>>>>>>>>>>> Cool, ;)
> >>>>>>>>>>>>
> >>>>>>>>>>>>> I think that datas from the same segment should be passed down with the same
> >>>>>>>>>>>>> hint, and the following mapping is reasonable. I wonder what is your opinion
> >>>>>>>>>>>>> about it.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>   segment type               hints
> >>>>>>>>>>>>>   ------------               -----
> >>>>>>>>>>>>>   CURSEG_COLD_DATA           WRITE_LIFE_EXTREME
> >>>>>>>>>>>>>   CURSEG_HOT_DATA            WRITE_LIFE_SHORT
> >>>>>>>>>>>>>   CURSEG_COLD_NODE           WRITE_LIFE_NORMAL
> >>>>>>>>>>>>
> >>>>>>>>>>>> We have WRITE_LIFE_LONG defined rather than WRITE_LIFE_NORMAL in fs.h?
> >>>>>>>>>>>>
> >>>>>>>>>>>>>   CURSEG_HOT_NODE            WRITE_LIFE_MEDIUM
> >>>>>>>>>>>>
> >>>>>>>>>>>> As I know, in scenario of cell phone, data of meta_inode is hottest, then hot
> >>>>>>>>>>>> data, warm node, and cold node should be coldest. So I suggested we can define
> >>>>>>>>>>>> as below:
> >>>>>>>>>>>>
> >>>>>>>>>>>> META_DATA			WRITE_LIFE_SHORT
> >>>>>>>>>>>> HOT_DATA & WARM_NODE		WRITE_LIFE_MEDIUM
> >>>>>>>>>>>> HOT_NODE & WARM_DATA		WRITE_LIFE_LONG
> >>>>>>>>>>>> COLD_NODE & COLD_DATA		WRITE_LIFE_EXTREME
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> I agree, But I am not sure that assigning the same hint to a node and data
> >>>>>>>>>>> segment is good. Because NVMe is likely to write them in the same erase 
> >>>>>>>>>>> block if they have the same hint.
> >>>>>>>>>>
> >>>>>>>>>> If we do not give the hint, they can still be written to the same erase block,
> >>>>>>>>
> >>>>>>>> I mean it's possible to write them to the same erase block. :)
> >>>>>>>>
> >>>>>>>>>> right? it will not be worse?
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> If the hint is not given, I think that they could be written to 
> >>>>>>>>> the same erase block, or not. But if we give the same hint, they are written
> >>>>>>>>> to the same block.
> >>>>>>>>
> >>>>>>>> IMO, Only if underlying device can support more hint type or opened channels,
> >>>>>>>> and actual temperature of data segment and node segment is quite different, we
> >>>>>>>> can separate them.
> >>>>>>>>
> >>>>>>>
> >>>>>>> Okay, If Jaegeuk Kim agrees with this, I will submit the patch that 
> >>>>>>> implements your proposed mapping.
> >>>>>>
> >>>>>> How about this? We'd better to split data and node blocks as much as possible.
> >>>>>>
> >>>>>> segment type                    hints
> >>>>>> ------------                    -----
> >>>>>> COLD_NODE & COLD_DATA		WRITE_LIFE_NONE
> >>>>>
> >>>>> WRITE_LIFE_NONE means there is no hints about write life time.
> >>>>>
> >>>>> Shouldn't we define COLD_NODE & COLD_DATA as WRITE_LIFE_EXTERME?
> >>>>
> >>>> The assumption would be to split different types of blocks by flash firmware,
> >>>> so I think we can use WRITE_LIFE_NONE as a type as well.
> >>>>
> >>>
> >>> WRITE_LIFE_NONE means that no stream id is specified. It equals WRITE_LIFE_NOT_SET.
> >>
> >> Rgith, I just saw nvme implementation:
> >>
> >> nvme_assign_write_stream
> >>
> >> 	enum rw_hint streamid = req->write_hint;
> >>
> >> 	if (streamid == WRITE_LIFE_NOT_SET || streamid == WRITE_LIFE_NONE)
> >> 		streamid = 0;
> >> 	else {
> >> 		streamid--;
> >> ...
> >>
> >>> So I think that we can define WARM_DATA as WRITE_LIFE_NONE, and
> >>> COLD_NODE & COLD_DATA as WRITE_LIFE_EXTREME.
> > 
> > What's the point?
> > 
> > segment type                 hints                streamid
> > -------------                -----                -------
> > COLD_NODE & COLD_DATA        WRITE_LIFE_NONE      0
> > WARM_DATA                    WRITE_LIFE_EXTERME   4
> > HOT_NODE & WARM_NODE         WRITE_LIFE_LONG      3
> > HOT_DATA                     WRITE_LIFE_MEDIUM    2
> > META_DATA                    WRITE_LIFE_SHORT     1
> > 
> > So, I don't think something is wrong. Again, I don't care about its hotness
> > given to the naming, but do care how to split different types of blocks with
> > different stream ids. Exceptions would be giving _SHORT or _MEDIUM which are
> > likely to be latency-critical, since I guess firmware may be able to store them
> > into SLC buffer.
> > 
> > Am I missing that _NONE has another meaning?
> > 
> 
> What I am worried about is that datas with no hint have WRITE_LIFE_NOT_SET(id 0).
> If block devices have swap partitions and anothor file systems, cold datas could
> be mixed with datas from that. Does this seems way too much?

That seems like how to distinguish write_hints across multiple partitions?

> And I think that stream id 0 means disabling stream directives. 
> Becasue NVME_RW_DTYPE_STREAMS is clear.

Then, I guess SSD FW will just handle 5 stream IDs including disabled 0.

Thanks,

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH 0/2] apply write hints to select the type of segments
  2017-11-17 18:53                               ` Jaegeuk Kim
@ 2017-11-20  2:12                                 ` Hyunchul Lee
  0 siblings, 0 replies; 30+ messages in thread
From: Hyunchul Lee @ 2017-11-20  2:12 UTC (permalink / raw)
  To: Jaegeuk Kim
  Cc: Chao Yu, linux-f2fs-devel, linux-kernel, kernel-team,
	Hyunchul Lee, Chao Yu, linux-block, axboe, hch

On 11/18/2017 03:53 AM, Jaegeuk Kim wrote:
> ...
>>>>>>>>>>>>>>>>> From: Hyunchul Lee <cheol.lee@lge.com>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Using write hints[1], applications can inform the life time of the data
>>>>>>>>>>>>>>>>> written to devices. and this[2] reported that the write hints patch
>>>>>>>>>>>>>>>>> decreased writes in NAND by 25%.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> This hints help F2FS to determine the followings.
>>>>>>>>>>>>>>>>>   1) the segment types where the data will be written.
>>>>>>>>>>>>>>>>>   2) the hints that will be passed down to devices with the data of segments.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> This patch set implements the first mapping from write hints to segment types
>>>>>>>>>>>>>>>>> as shown below.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>   hints                     segment type
>>>>>>>>>>>>>>>>>   -----                     ------------
>>>>>>>>>>>>>>>>>   WRITE_LIFE_SHORT          CURSEG_COLD_DATA
>>>>>>>>>>>>>>>>>   WRITE_LIFE_EXTREME        CURSEG_HOT_DATA
>>>>>>>>>>>>>>>>>   others                    CURSEG_WARM_DATA
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> The F2FS poliy for hot/cold seperation has precedence over this hints, And
>>>>>>>>>>>>>>>>> hints are not applied in in-place update.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Could we change to disable IPU if file/inode write hint is existing?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I am afraid that this makes side effects. for example, this could cause
>>>>>>>>>>>>>>> out-of-place updates even when there are not enough free segments. 
>>>>>>>>>>>>>>> I can write the patch that handles these situations. But I wonder 
>>>>>>>>>>>>>>> that this is required, and I am not sure which IPU polices can be disabled.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Oh, As I replied in another thread, I think IPU just affects filesystem
>>>>>>>>>>>>>> hot/cold separating, rather than this feature. So I think it will be okay
>>>>>>>>>>>>>> to not consider it.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Before the second mapping is implemented, write hints are not passed down
>>>>>>>>>>>>>>>>> to devices. Because it is better that the data of a segment have the same 
>>>>>>>>>>>>>>>>> hint.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> [1]: c75b1d9421f80f4143e389d2d50ddfc8a28c8c35
>>>>>>>>>>>>>>>>> [2]: https://lwn.net/Articles/726477/
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Could you write a patch to support passing write hint to block layer for
>>>>>>>>>>>>>>>> buffered writes as below commit:
>>>>>>>>>>>>>>>> 0127251c45ae ("ext4: add support for passing in write hints for buffered writes")
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Sure I will. I wrote it already ;)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Cool, ;)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I think that datas from the same segment should be passed down with the same
>>>>>>>>>>>>>>> hint, and the following mapping is reasonable. I wonder what is your opinion
>>>>>>>>>>>>>>> about it.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>   segment type               hints
>>>>>>>>>>>>>>>   ------------               -----
>>>>>>>>>>>>>>>   CURSEG_COLD_DATA           WRITE_LIFE_EXTREME
>>>>>>>>>>>>>>>   CURSEG_HOT_DATA            WRITE_LIFE_SHORT
>>>>>>>>>>>>>>>   CURSEG_COLD_NODE           WRITE_LIFE_NORMAL
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> We have WRITE_LIFE_LONG defined rather than WRITE_LIFE_NORMAL in fs.h?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>   CURSEG_HOT_NODE            WRITE_LIFE_MEDIUM
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> As I know, in scenario of cell phone, data of meta_inode is hottest, then hot
>>>>>>>>>>>>>> data, warm node, and cold node should be coldest. So I suggested we can define
>>>>>>>>>>>>>> as below:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> META_DATA			WRITE_LIFE_SHORT
>>>>>>>>>>>>>> HOT_DATA & WARM_NODE		WRITE_LIFE_MEDIUM
>>>>>>>>>>>>>> HOT_NODE & WARM_DATA		WRITE_LIFE_LONG
>>>>>>>>>>>>>> COLD_NODE & COLD_DATA		WRITE_LIFE_EXTREME
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> I agree, But I am not sure that assigning the same hint to a node and data
>>>>>>>>>>>>> segment is good. Because NVMe is likely to write them in the same erase 
>>>>>>>>>>>>> block if they have the same hint.
>>>>>>>>>>>>
>>>>>>>>>>>> If we do not give the hint, they can still be written to the same erase block,
>>>>>>>>>>
>>>>>>>>>> I mean it's possible to write them to the same erase block. :)
>>>>>>>>>>
>>>>>>>>>>>> right? it will not be worse?
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> If the hint is not given, I think that they could be written to 
>>>>>>>>>>> the same erase block, or not. But if we give the same hint, they are written
>>>>>>>>>>> to the same block.
>>>>>>>>>>
>>>>>>>>>> IMO, Only if underlying device can support more hint type or opened channels,
>>>>>>>>>> and actual temperature of data segment and node segment is quite different, we
>>>>>>>>>> can separate them.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Okay, If Jaegeuk Kim agrees with this, I will submit the patch that 
>>>>>>>>> implements your proposed mapping.
>>>>>>>>
>>>>>>>> How about this? We'd better to split data and node blocks as much as possible.
>>>>>>>>
>>>>>>>> segment type                    hints
>>>>>>>> ------------                    -----
>>>>>>>> COLD_NODE & COLD_DATA		WRITE_LIFE_NONE
>>>>>>>
>>>>>>> WRITE_LIFE_NONE means there is no hints about write life time.
>>>>>>>
>>>>>>> Shouldn't we define COLD_NODE & COLD_DATA as WRITE_LIFE_EXTERME?
>>>>>>
>>>>>> The assumption would be to split different types of blocks by flash firmware,
>>>>>> so I think we can use WRITE_LIFE_NONE as a type as well.
>>>>>>
>>>>>
>>>>> WRITE_LIFE_NONE means that no stream id is specified. It equals WRITE_LIFE_NOT_SET.
>>>>
>>>> Rgith, I just saw nvme implementation:
>>>>
>>>> nvme_assign_write_stream
>>>>
>>>> 	enum rw_hint streamid = req->write_hint;
>>>>
>>>> 	if (streamid == WRITE_LIFE_NOT_SET || streamid == WRITE_LIFE_NONE)
>>>> 		streamid = 0;
>>>> 	else {
>>>> 		streamid--;
>>>> ...
>>>>
>>>>> So I think that we can define WARM_DATA as WRITE_LIFE_NONE, and
>>>>> COLD_NODE & COLD_DATA as WRITE_LIFE_EXTREME.
>>>
>>> What's the point?
>>>
>>> segment type                 hints                streamid
>>> -------------                -----                -------
>>> COLD_NODE & COLD_DATA        WRITE_LIFE_NONE      0
>>> WARM_DATA                    WRITE_LIFE_EXTERME   4
>>> HOT_NODE & WARM_NODE         WRITE_LIFE_LONG      3
>>> HOT_DATA                     WRITE_LIFE_MEDIUM    2
>>> META_DATA                    WRITE_LIFE_SHORT     1
>>>
>>> So, I don't think something is wrong. Again, I don't care about its hotness
>>> given to the naming, but do care how to split different types of blocks with
>>> different stream ids. Exceptions would be giving _SHORT or _MEDIUM which are
>>> likely to be latency-critical, since I guess firmware may be able to store them
>>> into SLC buffer.
>>>
>>> Am I missing that _NONE has another meaning?
>>>
>>
>> What I am worried about is that datas with no hint have WRITE_LIFE_NOT_SET(id 0).
>> If block devices have swap partitions and anothor file systems, cold datas could
>> be mixed with datas from that. Does this seems way too much?
> 
> That seems like how to distinguish write_hints across multiple partitions?
> 

What I intend is that because there could be another partitions and 
the default stream ID is 0, WRITE_LIFE_EXTREAM could be better than 
WRITE_LIFE_NONE for cold datas.

Thanks.

>> And I think that stream id 0 means disabling stream directives. 
>> Becasue NVME_RW_DTYPE_STREAMS is clear.
> 
> Then, I guess SSD FW will just handle 5 stream IDs including disabled 0.
> 
> Thanks,
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2017-11-20  2:12 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-11-09  5:51 [RFC PATCH 0/2] apply write hints to select the type of segments Hyunchul Lee
2017-11-09  5:51 ` [RFC PATHC 1/2] f2fs: apply write hints to select the type of segments for buffered write Hyunchul Lee
2017-11-09  8:21   ` Chao Yu
2017-11-09 18:19     ` Jaegeuk Kim
2017-11-09  5:51 ` [RFC PATHC 2/2] f2fs: apply write hints to select the type of segment for direct write Hyunchul Lee
2017-11-11  0:38   ` [f2fs-dev] " Chao Yu
2017-11-13  0:07     ` Hyunchul Lee
2017-11-13  1:24       ` Chao Yu
2017-11-13  1:48         ` Hyunchul Lee
2017-11-09  9:12 ` [RFC PATCH 0/2] apply write hints to select the type of segments Chao Yu
2017-11-09 18:16   ` Jaegeuk Kim
2017-11-10  2:31     ` Chao Yu
2017-11-10  0:23   ` Hyunchul Lee
2017-11-10  6:42     ` Chao Yu
2017-11-13  0:24       ` Hyunchul Lee
2017-11-13  1:26         ` Chao Yu
2017-11-13  1:35           ` Hyunchul Lee
2017-11-13  1:59             ` Chao Yu
2017-11-13  2:25               ` Hyunchul Lee
2017-11-14  4:20                 ` Jaegeuk Kim
2017-11-14  6:22                   ` Chao Yu
2017-11-15 16:27                     ` Jaegeuk Kim
2017-11-16  0:56                       ` Hyunchul Lee
2017-11-16  2:52                         ` Chao Yu
2017-11-16  3:58                           ` Jaegeuk Kim
2017-11-16  4:35                             ` Hyunchul Lee
2017-11-17 18:53                               ` Jaegeuk Kim
2017-11-20  2:12                                 ` Hyunchul Lee
2017-11-17 17:23 ` Christoph Hellwig
2017-11-17 18:36   ` Jaegeuk Kim

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).