linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 1/2] f2fs: pass down write hints to block layer for bufferd write
@ 2017-11-28  0:23 Hyunchul Lee
  2017-11-28  0:23 ` [PATCH 2/2] f2fs: pass down write hints to block layer for direct write Hyunchul Lee
                   ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: Hyunchul Lee @ 2017-11-28  0:23 UTC (permalink / raw)
  To: Jaegeuk Kim, Chao Yu
  Cc: linux-f2fs-devel, linux-kernel, linux-fsdevel, kernel-team,
	Jens Axboe, Hyunchul Lee

From: Hyunchul Lee <cheol.lee@lge.com>

This implements which hint is passed down to block layer
for datas from the specific segment type.

segment type                     hints
------------                     -----
COLD_NODE & COLD_DATA            WRITE_LIFE_EXTREME
WARM_DATA                        WRITE_LIFE_NONE
HOT_NODE & WARM_NODE             WRITE_LIFE_LONG
HOT_DATA                         WRITE_LIFE_MEDIUM
META_DATA                        WRITE_LIFE_SHORT

Many thanks to Chao Yu and Jaegeuk Kim for comments to
implement this patch.

Signed-off-by: Hyunchul Lee <cheol.lee@lge.com>
---
 fs/f2fs/data.c    |  1 +
 fs/f2fs/f2fs.h    |  2 ++
 fs/f2fs/segment.c | 29 +++++++++++++++++++++++++++++
 fs/f2fs/super.c   |  2 ++
 4 files changed, 34 insertions(+)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index a7ae418..0919a43 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -437,6 +437,7 @@ int f2fs_submit_page_write(struct f2fs_io_info *fio)
 		}
 		io->bio = __bio_alloc(sbi, fio->new_blkaddr,
 						BIO_MAX_PAGES, false);
+		io->bio->bi_write_hint = io->write_hint;
 		io->fio = *fio;
 	}
 
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 7bcd148..be3cb0c 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -969,6 +969,7 @@ struct f2fs_bio_info {
 	struct rw_semaphore io_rwsem;	/* blocking op for bio */
 	spinlock_t io_lock;		/* serialize DATA/NODE IOs */
 	struct list_head io_list;	/* track fios */
+	enum rw_hint write_hint;
 };
 
 #define FDEV(i)				(sbi->devs[i])
@@ -2674,6 +2675,7 @@ int lookup_journal_in_cursum(struct f2fs_journal *journal, int type,
 int __init create_segment_manager_caches(void);
 void destroy_segment_manager_caches(void);
 int rw_hint_to_seg_type(enum rw_hint hint);
+enum rw_hint io_type_to_rw_hint(enum page_type page, enum temp_type temp);
 
 /*
  * checkpoint.c
diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index c117e09..0570db7 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -2446,6 +2446,35 @@ int rw_hint_to_seg_type(enum rw_hint hint)
 	}
 }
 
+enum rw_hint io_type_to_rw_hint(enum page_type page, enum temp_type temp)
+{
+	if (page == DATA) {
+		switch (temp) {
+		case WARM:
+			return WRITE_LIFE_NONE;
+		case COLD:
+			return WRITE_LIFE_EXTREME;
+		case HOT:
+			return WRITE_LIFE_MEDIUM;
+		default:
+			return WRITE_LIFE_NOT_SET;
+		}
+	} else if (page == NODE) {
+		switch (temp) {
+		case WARM:
+		case HOT:
+			return WRITE_LIFE_LONG;
+		case COLD:
+			return WRITE_LIFE_EXTREME;
+		default:
+			return WRITE_LIFE_NOT_SET;
+		}
+
+	} else {
+		return WRITE_LIFE_SHORT;
+	}
+}
+
 static int __get_segment_type_2(struct f2fs_io_info *fio)
 {
 	if (fio->type == DATA)
diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index a6c5dd4..51c19a0 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -2508,6 +2508,8 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
 			sbi->write_io[i][j].bio = NULL;
 			spin_lock_init(&sbi->write_io[i][j].io_lock);
 			INIT_LIST_HEAD(&sbi->write_io[i][j].io_list);
+			sbi->write_io[i][j].write_hint =
+						io_type_to_rw_hint(i, j);
 		}
 	}
 
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 2/2] f2fs: pass down write hints to block layer for direct write
  2017-11-28  0:23 [PATCH 1/2] f2fs: pass down write hints to block layer for bufferd write Hyunchul Lee
@ 2017-11-28  0:23 ` Hyunchul Lee
  2017-11-30  6:32 ` [PATCH 1/2] f2fs: pass down write hints to block layer for bufferd write Chao Yu
  2017-11-30  7:06 ` Chao Yu
  2 siblings, 0 replies; 18+ messages in thread
From: Hyunchul Lee @ 2017-11-28  0:23 UTC (permalink / raw)
  To: Jaegeuk Kim, Chao Yu
  Cc: linux-f2fs-devel, linux-kernel, linux-fsdevel, kernel-team,
	Jens Axboe, Hyunchul Lee

From: Hyunchul Lee <cheol.lee@lge.com>

This implements which hint is passed down to block layer
for direct write.

                     (allocated
file hint            segment type)   io hint
----------           ------------    --------
WRITE_LIFE_SHORT     HOT_DATA        WRITE_LIFE_MEDIUM
WRITE_LIFE_EXTREME   COLD_DATA       WRITE_LIFE_EXTREME
others               WARM_DATA       WRITE_LIFE_NONE

Signed-off-by: Hyunchul Lee <cheol.lee@lge.com>
---
 fs/f2fs/data.c    |  6 ++++++
 fs/f2fs/f2fs.h    |  1 +
 fs/f2fs/segment.c | 12 ++++++++++++
 3 files changed, 19 insertions(+)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 0919a43..eabd569 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -2093,6 +2093,7 @@ static ssize_t f2fs_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
 	loff_t offset = iocb->ki_pos;
 	int rw = iov_iter_rw(iter);
 	int err;
+	enum rw_hint orig_hint;
 
 	err = check_direct_IO(inode, iter, offset);
 	if (err)
@@ -2103,10 +2104,15 @@ static ssize_t f2fs_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
 
 	trace_f2fs_direct_IO_enter(inode, offset, count, rw);
 
+	orig_hint = iocb->ki_hint;
+	iocb->ki_hint = file_rwhint_to_io_rwhint(iocb->ki_hint);
+
 	down_read(&F2FS_I(inode)->dio_rwsem[rw]);
 	err = blockdev_direct_IO(iocb, inode, iter, get_data_block_dio);
 	up_read(&F2FS_I(inode)->dio_rwsem[rw]);
 
+	iocb->ki_hint = orig_hint;
+
 	if (rw == WRITE) {
 		if (err > 0) {
 			f2fs_update_iostat(F2FS_I_SB(inode), APP_DIRECT_IO,
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index be3cb0c..426625a 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -2676,6 +2676,7 @@ int lookup_journal_in_cursum(struct f2fs_journal *journal, int type,
 void destroy_segment_manager_caches(void);
 int rw_hint_to_seg_type(enum rw_hint hint);
 enum rw_hint io_type_to_rw_hint(enum page_type page, enum temp_type temp);
+enum rw_hint file_rwhint_to_io_rwhint(enum rw_hint hint);
 
 /*
  * checkpoint.c
diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index 0570db7..b4496a5 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -2475,6 +2475,18 @@ enum rw_hint io_type_to_rw_hint(enum page_type page, enum temp_type temp)
 	}
 }
 
+enum rw_hint file_rwhint_to_io_rwhint(enum rw_hint hint)
+{
+	switch(hint) {
+	case WRITE_LIFE_SHORT:
+		return WRITE_LIFE_MEDIUM;
+	case WRITE_LIFE_EXTREME:
+		return WRITE_LIFE_EXTREME;
+	default:
+		return WRITE_LIFE_NONE;
+	}
+}
+
 static int __get_segment_type_2(struct f2fs_io_info *fio)
 {
 	if (fio->type == DATA)
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/2] f2fs: pass down write hints to block layer for bufferd write
  2017-11-28  0:23 [PATCH 1/2] f2fs: pass down write hints to block layer for bufferd write Hyunchul Lee
  2017-11-28  0:23 ` [PATCH 2/2] f2fs: pass down write hints to block layer for direct write Hyunchul Lee
@ 2017-11-30  6:32 ` Chao Yu
  2017-12-01  7:28   ` Jaegeuk Kim
  2017-11-30  7:06 ` Chao Yu
  2 siblings, 1 reply; 18+ messages in thread
From: Chao Yu @ 2017-11-30  6:32 UTC (permalink / raw)
  To: Hyunchul Lee, Jaegeuk Kim
  Cc: linux-f2fs-devel, linux-kernel, linux-fsdevel, kernel-team,
	Jens Axboe, Hyunchul Lee

On 2017/11/28 8:23, Hyunchul Lee wrote:
> From: Hyunchul Lee <cheol.lee@lge.com>
> 
> This implements which hint is passed down to block layer
> for datas from the specific segment type.
> 
> segment type                     hints
> ------------                     -----
> COLD_NODE & COLD_DATA            WRITE_LIFE_EXTREME
> WARM_DATA                        WRITE_LIFE_NONE
> HOT_NODE & WARM_NODE             WRITE_LIFE_LONG
> HOT_DATA                         WRITE_LIFE_MEDIUM
> META_DATA                        WRITE_LIFE_SHORT

The correspondence looks good to me.

> 
> Many thanks to Chao Yu and Jaegeuk Kim for comments to
> implement this patch.
> 
> Signed-off-by: Hyunchul Lee <cheol.lee@lge.com>
> ---
>  fs/f2fs/data.c    |  1 +
>  fs/f2fs/f2fs.h    |  2 ++
>  fs/f2fs/segment.c | 29 +++++++++++++++++++++++++++++
>  fs/f2fs/super.c   |  2 ++
>  4 files changed, 34 insertions(+)
> 
> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
> index a7ae418..0919a43 100644
> --- a/fs/f2fs/data.c
> +++ b/fs/f2fs/data.c
> @@ -437,6 +437,7 @@ int f2fs_submit_page_write(struct f2fs_io_info *fio)
>  		}
>  		io->bio = __bio_alloc(sbi, fio->new_blkaddr,
>  						BIO_MAX_PAGES, false);
> +		io->bio->bi_write_hint = io->write_hint;

Need to assign bi_write_hint for IPU path?

- rewrite_data_page
 - f2fs_submit_page_bio

>  		io->fio = *fio;
>  	}
>  
> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> index 7bcd148..be3cb0c 100644
> --- a/fs/f2fs/f2fs.h
> +++ b/fs/f2fs/f2fs.h
> @@ -969,6 +969,7 @@ struct f2fs_bio_info {
>  	struct rw_semaphore io_rwsem;	/* blocking op for bio */
>  	spinlock_t io_lock;		/* serialize DATA/NODE IOs */
>  	struct list_head io_list;	/* track fios */
> +	enum rw_hint write_hint;

Add missing comment?

>  };
>  
>  #define FDEV(i)				(sbi->devs[i])
> @@ -2674,6 +2675,7 @@ int lookup_journal_in_cursum(struct f2fs_journal *journal, int type,
>  int __init create_segment_manager_caches(void);
>  void destroy_segment_manager_caches(void);
>  int rw_hint_to_seg_type(enum rw_hint hint);
> +enum rw_hint io_type_to_rw_hint(enum page_type page, enum temp_type temp);
>  
>  /*
>   * checkpoint.c
> diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
> index c117e09..0570db7 100644
> --- a/fs/f2fs/segment.c
> +++ b/fs/f2fs/segment.c
> @@ -2446,6 +2446,35 @@ int rw_hint_to_seg_type(enum rw_hint hint)
>  	}
>  }
>  

It will be better to copy commit log here to declare correspondence
more clearly.

Thanks,

> +enum rw_hint io_type_to_rw_hint(enum page_type page, enum temp_type temp)
> +{
> +	if (page == DATA) {
> +		switch (temp) {
> +		case WARM:
> +			return WRITE_LIFE_NONE;
> +		case COLD:
> +			return WRITE_LIFE_EXTREME;
> +		case HOT:
> +			return WRITE_LIFE_MEDIUM;
> +		default:
> +			return WRITE_LIFE_NOT_SET;
> +		}
> +	} else if (page == NODE) {
> +		switch (temp) {
> +		case WARM:
> +		case HOT:
> +			return WRITE_LIFE_LONG;
> +		case COLD:
> +			return WRITE_LIFE_EXTREME;
> +		default:
> +			return WRITE_LIFE_NOT_SET;
> +		}
> +
> +	} else {
> +		return WRITE_LIFE_SHORT;
> +	}
> +}
> +
>  static int __get_segment_type_2(struct f2fs_io_info *fio)
>  {
>  	if (fio->type == DATA)
> diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
> index a6c5dd4..51c19a0 100644
> --- a/fs/f2fs/super.c
> +++ b/fs/f2fs/super.c
> @@ -2508,6 +2508,8 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
>  			sbi->write_io[i][j].bio = NULL;
>  			spin_lock_init(&sbi->write_io[i][j].io_lock);
>  			INIT_LIST_HEAD(&sbi->write_io[i][j].io_list);
> +			sbi->write_io[i][j].write_hint =
> +						io_type_to_rw_hint(i, j);
>  		}
>  	}
>  
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/2] f2fs: pass down write hints to block layer for bufferd write
  2017-11-28  0:23 [PATCH 1/2] f2fs: pass down write hints to block layer for bufferd write Hyunchul Lee
  2017-11-28  0:23 ` [PATCH 2/2] f2fs: pass down write hints to block layer for direct write Hyunchul Lee
  2017-11-30  6:32 ` [PATCH 1/2] f2fs: pass down write hints to block layer for bufferd write Chao Yu
@ 2017-11-30  7:06 ` Chao Yu
  2017-12-01  8:28   ` Hyunchul Lee
  2 siblings, 1 reply; 18+ messages in thread
From: Chao Yu @ 2017-11-30  7:06 UTC (permalink / raw)
  To: Hyunchul Lee, Jaegeuk Kim
  Cc: linux-f2fs-devel, linux-kernel, linux-fsdevel, kernel-team,
	Jens Axboe, Hyunchul Lee

Hi Hyunchul,

On 2017/11/28 8:23, Hyunchul Lee wrote:
> From: Hyunchul Lee <cheol.lee@lge.com>
> 
> This implements which hint is passed down to block layer
> for datas from the specific segment type.
> 
> segment type                     hints
> ------------                     -----
> COLD_NODE & COLD_DATA            WRITE_LIFE_EXTREME
> WARM_DATA                        WRITE_LIFE_NONE
> HOT_NODE & WARM_NODE             WRITE_LIFE_LONG
> HOT_DATA                         WRITE_LIFE_MEDIUM
> META_DATA                        WRITE_LIFE_SHORT

Just noticed, if our user do not give the hint via ioctl, f2fs can
provider hint to lower layer according to hot/cold separation ability,
it will be okay. But once user give his hint which may be more accurate
than filesystem, hint converted by f2fs may be wrong.

So what do you think of adding an option to control whether filesystem
can convert hint user given?

Thanks,

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/2] f2fs: pass down write hints to block layer for bufferd write
  2017-11-30  6:32 ` [PATCH 1/2] f2fs: pass down write hints to block layer for bufferd write Chao Yu
@ 2017-12-01  7:28   ` Jaegeuk Kim
  2017-12-01  8:50     ` Hyunchul Lee
  0 siblings, 1 reply; 18+ messages in thread
From: Jaegeuk Kim @ 2017-12-01  7:28 UTC (permalink / raw)
  To: Chao Yu
  Cc: Hyunchul Lee, linux-f2fs-devel, linux-kernel, linux-fsdevel,
	kernel-team, Jens Axboe, Hyunchul Lee

On 11/30, Chao Yu wrote:
> On 2017/11/28 8:23, Hyunchul Lee wrote:
> > From: Hyunchul Lee <cheol.lee@lge.com>
> > 
> > This implements which hint is passed down to block layer
> > for datas from the specific segment type.
> > 
> > segment type                     hints
> > ------------                     -----
> > COLD_NODE & COLD_DATA            WRITE_LIFE_EXTREME
> > WARM_DATA                        WRITE_LIFE_NONE
> > HOT_NODE & WARM_NODE             WRITE_LIFE_LONG
> > HOT_DATA                         WRITE_LIFE_MEDIUM
> > META_DATA                        WRITE_LIFE_SHORT
> 
> The correspondence looks good to me.

Still, I don't fully get the point. Why do we have to assign WRITE_LIFE_NONE
to WARM_DATA? Why not using WRITE_LIFE_NOT_SET?

> > 
> > Many thanks to Chao Yu and Jaegeuk Kim for comments to
> > implement this patch.
> > 
> > Signed-off-by: Hyunchul Lee <cheol.lee@lge.com>
> > ---
> >  fs/f2fs/data.c    |  1 +
> >  fs/f2fs/f2fs.h    |  2 ++
> >  fs/f2fs/segment.c | 29 +++++++++++++++++++++++++++++
> >  fs/f2fs/super.c   |  2 ++
> >  4 files changed, 34 insertions(+)
> > 
> > diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
> > index a7ae418..0919a43 100644
> > --- a/fs/f2fs/data.c
> > +++ b/fs/f2fs/data.c
> > @@ -437,6 +437,7 @@ int f2fs_submit_page_write(struct f2fs_io_info *fio)
> >  		}
> >  		io->bio = __bio_alloc(sbi, fio->new_blkaddr,
> >  						BIO_MAX_PAGES, false);
> > +		io->bio->bi_write_hint = io->write_hint;
> 
> Need to assign bi_write_hint for IPU path?
> 
> - rewrite_data_page
>  - f2fs_submit_page_bio
> 
> >  		io->fio = *fio;
> >  	}
> >  
> > diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> > index 7bcd148..be3cb0c 100644
> > --- a/fs/f2fs/f2fs.h
> > +++ b/fs/f2fs/f2fs.h
> > @@ -969,6 +969,7 @@ struct f2fs_bio_info {
> >  	struct rw_semaphore io_rwsem;	/* blocking op for bio */
> >  	spinlock_t io_lock;		/* serialize DATA/NODE IOs */
> >  	struct list_head io_list;	/* track fios */
> > +	enum rw_hint write_hint;
> 
> Add missing comment?
> 
> >  };
> >  
> >  #define FDEV(i)				(sbi->devs[i])
> > @@ -2674,6 +2675,7 @@ int lookup_journal_in_cursum(struct f2fs_journal *journal, int type,
> >  int __init create_segment_manager_caches(void);
> >  void destroy_segment_manager_caches(void);
> >  int rw_hint_to_seg_type(enum rw_hint hint);
> > +enum rw_hint io_type_to_rw_hint(enum page_type page, enum temp_type temp);
> >  
> >  /*
> >   * checkpoint.c
> > diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
> > index c117e09..0570db7 100644
> > --- a/fs/f2fs/segment.c
> > +++ b/fs/f2fs/segment.c
> > @@ -2446,6 +2446,35 @@ int rw_hint_to_seg_type(enum rw_hint hint)
> >  	}
> >  }
> >  
> 
> It will be better to copy commit log here to declare correspondence
> more clearly.
> 
> Thanks,
> 
> > +enum rw_hint io_type_to_rw_hint(enum page_type page, enum temp_type temp)
> > +{
> > +	if (page == DATA) {
> > +		switch (temp) {
> > +		case WARM:
> > +			return WRITE_LIFE_NONE;
> > +		case COLD:
> > +			return WRITE_LIFE_EXTREME;
> > +		case HOT:
> > +			return WRITE_LIFE_MEDIUM;
> > +		default:
> > +			return WRITE_LIFE_NOT_SET;
> > +		}
> > +	} else if (page == NODE) {
> > +		switch (temp) {
> > +		case WARM:
> > +		case HOT:
> > +			return WRITE_LIFE_LONG;
> > +		case COLD:
> > +			return WRITE_LIFE_EXTREME;
> > +		default:
> > +			return WRITE_LIFE_NOT_SET;
> > +		}
> > +
> > +	} else {
> > +		return WRITE_LIFE_SHORT;
> > +	}
> > +}
> > +
> >  static int __get_segment_type_2(struct f2fs_io_info *fio)
> >  {
> >  	if (fio->type == DATA)
> > diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
> > index a6c5dd4..51c19a0 100644
> > --- a/fs/f2fs/super.c
> > +++ b/fs/f2fs/super.c
> > @@ -2508,6 +2508,8 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
> >  			sbi->write_io[i][j].bio = NULL;
> >  			spin_lock_init(&sbi->write_io[i][j].io_lock);
> >  			INIT_LIST_HEAD(&sbi->write_io[i][j].io_list);
> > +			sbi->write_io[i][j].write_hint =
> > +						io_type_to_rw_hint(i, j);
> >  		}
> >  	}
> >  
> > 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/2] f2fs: pass down write hints to block layer for bufferd write
  2017-11-30  7:06 ` Chao Yu
@ 2017-12-01  8:28   ` Hyunchul Lee
  2017-12-11 13:15     ` [f2fs-dev] " Chao Yu
  0 siblings, 1 reply; 18+ messages in thread
From: Hyunchul Lee @ 2017-12-01  8:28 UTC (permalink / raw)
  To: Chao Yu, Jaegeuk Kim
  Cc: linux-f2fs-devel, linux-kernel, linux-fsdevel, kernel-team,
	Jens Axboe, Hyunchul Lee

Hi Chao,

On 11/30/2017 04:06 PM, Chao Yu wrote:
> Hi Hyunchul,
> 
> On 2017/11/28 8:23, Hyunchul Lee wrote:
>> From: Hyunchul Lee <cheol.lee@lge.com>
>>
>> This implements which hint is passed down to block layer
>> for datas from the specific segment type.
>>
>> segment type                     hints
>> ------------                     -----
>> COLD_NODE & COLD_DATA            WRITE_LIFE_EXTREME
>> WARM_DATA                        WRITE_LIFE_NONE
>> HOT_NODE & WARM_NODE             WRITE_LIFE_LONG
>> HOT_DATA                         WRITE_LIFE_MEDIUM
>> META_DATA                        WRITE_LIFE_SHORT
> 
> Just noticed, if our user do not give the hint via ioctl, f2fs can
> provider hint to lower layer according to hot/cold separation ability,
> it will be okay. But once user give his hint which may be more accurate
> than filesystem, hint converted by f2fs may be wrong.
> 
> So what do you think of adding an option to control whether filesystem
> can convert hint user given?
> 

I think it is okay for LIFE_SHORT and LIFE_EXTREME. because they are 
converted to different hints.

file hint      segment type        io hint
---------      ------------        -------
LIFE_SHORT     HOT_DATA            LIFE_MEDIUM
LIFE_MEDIUM    WARM_DATA           LIFE_NONE
LIFE_LONG      WARM_DATA           LIFE_NONE
LIFE_EXTREME   COLD_DATA           LIFE_EXTREME

the problem is that LIFE_MEDIUM and LIFE_LONG are converted to 
the same hint, LIFE_NONE. I am not sure that the seperation between 
LIFE_MEDIUM and LIFE_LONG is really needed. Because I guess that the 
difference between them is a little ambigous for users, and if WARM_DATA 
segment has two different hints, it can makes GC non-efficient.

I wonder your thought about this.

Thanks.

> Thanks,
> 
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/2] f2fs: pass down write hints to block layer for bufferd write
  2017-12-01  7:28   ` Jaegeuk Kim
@ 2017-12-01  8:50     ` Hyunchul Lee
  2017-12-14 21:18       ` Jaegeuk Kim
  0 siblings, 1 reply; 18+ messages in thread
From: Hyunchul Lee @ 2017-12-01  8:50 UTC (permalink / raw)
  To: Jaegeuk Kim, Chao Yu
  Cc: linux-f2fs-devel, linux-kernel, linux-fsdevel, kernel-team,
	Jens Axboe, Hyunchul Lee

Hi Jaegeuk,

On 12/01/2017 04:28 PM, Jaegeuk Kim wrote:
> On 11/30, Chao Yu wrote:
>> On 2017/11/28 8:23, Hyunchul Lee wrote:
>>> From: Hyunchul Lee <cheol.lee@lge.com>
>>>
>>> This implements which hint is passed down to block layer
>>> for datas from the specific segment type.
>>>
>>> segment type                     hints
>>> ------------                     -----
>>> COLD_NODE & COLD_DATA            WRITE_LIFE_EXTREME
>>> WARM_DATA                        WRITE_LIFE_NONE
>>> HOT_NODE & WARM_NODE             WRITE_LIFE_LONG
>>> HOT_DATA                         WRITE_LIFE_MEDIUM
>>> META_DATA                        WRITE_LIFE_SHORT
>>
>> The correspondence looks good to me.
> 
> Still, I don't fully get the point. Why do we have to assign WRITE_LIFE_NONE
> to WARM_DATA? Why not using WRITE_LIFE_NOT_SET?
> 

I think LIFE_NONE and LIFE_NOT_SET are the same. So I chose LIFE_NONE simply.

Thanks.

>>>
>>> Many thanks to Chao Yu and Jaegeuk Kim for comments to
>>> implement this patch.
>>>
>>> Signed-off-by: Hyunchul Lee <cheol.lee@lge.com>
>>> ---
>>>  fs/f2fs/data.c    |  1 +
>>>  fs/f2fs/f2fs.h    |  2 ++
>>>  fs/f2fs/segment.c | 29 +++++++++++++++++++++++++++++
>>>  fs/f2fs/super.c   |  2 ++
>>>  4 files changed, 34 insertions(+)
>>>
>>> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
>>> index a7ae418..0919a43 100644
>>> --- a/fs/f2fs/data.c
>>> +++ b/fs/f2fs/data.c
>>> @@ -437,6 +437,7 @@ int f2fs_submit_page_write(struct f2fs_io_info *fio)
>>>  		}
>>>  		io->bio = __bio_alloc(sbi, fio->new_blkaddr,
>>>  						BIO_MAX_PAGES, false);
>>> +		io->bio->bi_write_hint = io->write_hint;
>>
>> Need to assign bi_write_hint for IPU path?
>>
>> - rewrite_data_page
>>  - f2fs_submit_page_bio
>>
>>>  		io->fio = *fio;
>>>  	}
>>>  
>>> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
>>> index 7bcd148..be3cb0c 100644
>>> --- a/fs/f2fs/f2fs.h
>>> +++ b/fs/f2fs/f2fs.h
>>> @@ -969,6 +969,7 @@ struct f2fs_bio_info {
>>>  	struct rw_semaphore io_rwsem;	/* blocking op for bio */
>>>  	spinlock_t io_lock;		/* serialize DATA/NODE IOs */
>>>  	struct list_head io_list;	/* track fios */
>>> +	enum rw_hint write_hint;
>>
>> Add missing comment?
>>
>>>  };
>>>  
>>>  #define FDEV(i)				(sbi->devs[i])
>>> @@ -2674,6 +2675,7 @@ int lookup_journal_in_cursum(struct f2fs_journal *journal, int type,
>>>  int __init create_segment_manager_caches(void);
>>>  void destroy_segment_manager_caches(void);
>>>  int rw_hint_to_seg_type(enum rw_hint hint);
>>> +enum rw_hint io_type_to_rw_hint(enum page_type page, enum temp_type temp);
>>>  
>>>  /*
>>>   * checkpoint.c
>>> diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
>>> index c117e09..0570db7 100644
>>> --- a/fs/f2fs/segment.c
>>> +++ b/fs/f2fs/segment.c
>>> @@ -2446,6 +2446,35 @@ int rw_hint_to_seg_type(enum rw_hint hint)
>>>  	}
>>>  }
>>>  
>>
>> It will be better to copy commit log here to declare correspondence
>> more clearly.
>>
>> Thanks,
>>
>>> +enum rw_hint io_type_to_rw_hint(enum page_type page, enum temp_type temp)
>>> +{
>>> +	if (page == DATA) {
>>> +		switch (temp) {
>>> +		case WARM:
>>> +			return WRITE_LIFE_NONE;
>>> +		case COLD:
>>> +			return WRITE_LIFE_EXTREME;
>>> +		case HOT:
>>> +			return WRITE_LIFE_MEDIUM;
>>> +		default:
>>> +			return WRITE_LIFE_NOT_SET;
>>> +		}
>>> +	} else if (page == NODE) {
>>> +		switch (temp) {
>>> +		case WARM:
>>> +		case HOT:
>>> +			return WRITE_LIFE_LONG;
>>> +		case COLD:
>>> +			return WRITE_LIFE_EXTREME;
>>> +		default:
>>> +			return WRITE_LIFE_NOT_SET;
>>> +		}
>>> +
>>> +	} else {
>>> +		return WRITE_LIFE_SHORT;
>>> +	}
>>> +}
>>> +
>>>  static int __get_segment_type_2(struct f2fs_io_info *fio)
>>>  {
>>>  	if (fio->type == DATA)
>>> diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
>>> index a6c5dd4..51c19a0 100644
>>> --- a/fs/f2fs/super.c
>>> +++ b/fs/f2fs/super.c
>>> @@ -2508,6 +2508,8 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
>>>  			sbi->write_io[i][j].bio = NULL;
>>>  			spin_lock_init(&sbi->write_io[i][j].io_lock);
>>>  			INIT_LIST_HEAD(&sbi->write_io[i][j].io_list);
>>> +			sbi->write_io[i][j].write_hint =
>>> +						io_type_to_rw_hint(i, j);
>>>  		}
>>>  	}
>>>  
>>>
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [f2fs-dev] [PATCH 1/2] f2fs: pass down write hints to block layer for bufferd write
  2017-12-01  8:28   ` Hyunchul Lee
@ 2017-12-11 13:15     ` Chao Yu
  2017-12-12  2:15       ` Hyunchul Lee
  0 siblings, 1 reply; 18+ messages in thread
From: Chao Yu @ 2017-12-11 13:15 UTC (permalink / raw)
  To: Hyunchul Lee, Chao Yu, Jaegeuk Kim
  Cc: Jens Axboe, linux-kernel, linux-f2fs-devel, kernel-team,
	linux-fsdevel, Hyunchul Lee

Hi Hyunchul,

On 2017/12/1 16:28, Hyunchul Lee wrote:
> Hi Chao,
> 
> On 11/30/2017 04:06 PM, Chao Yu wrote:
>> Hi Hyunchul,
>>
>> On 2017/11/28 8:23, Hyunchul Lee wrote:
>>> From: Hyunchul Lee <cheol.lee@lge.com>
>>>
>>> This implements which hint is passed down to block layer
>>> for datas from the specific segment type.
>>>
>>> segment type                     hints
>>> ------------                     -----
>>> COLD_NODE & COLD_DATA            WRITE_LIFE_EXTREME
>>> WARM_DATA                        WRITE_LIFE_NONE
>>> HOT_NODE & WARM_NODE             WRITE_LIFE_LONG
>>> HOT_DATA                         WRITE_LIFE_MEDIUM
>>> META_DATA                        WRITE_LIFE_SHORT
>>
>> Just noticed, if our user do not give the hint via ioctl, f2fs can
>> provider hint to lower layer according to hot/cold separation ability,
>> it will be okay. But once user give his hint which may be more accurate
>> than filesystem, hint converted by f2fs may be wrong.
>>
>> So what do you think of adding an option to control whether filesystem
>> can convert hint user given?
>>
> 
> I think it is okay for LIFE_SHORT and LIFE_EXTREME. because they are 
> converted to different hints.

What I mean is introducing a mount option, e.g. fs_iohint,
a) w/o fs_iohint, propagate file/inode io_hint to low layer.
b) w/ fs_iohint, ignore file/inode io_hint, use io_hint which is generated
with filesystem's private rule.

Thanks,

> 
> file hint      segment type        io hint
> ---------      ------------        -------
> LIFE_SHORT     HOT_DATA            LIFE_MEDIUM
> LIFE_MEDIUM    WARM_DATA           LIFE_NONE
> LIFE_LONG      WARM_DATA           LIFE_NONE
> LIFE_EXTREME   COLD_DATA           LIFE_EXTREME
> 
> the problem is that LIFE_MEDIUM and LIFE_LONG are converted to 
> the same hint, LIFE_NONE. I am not sure that the seperation between 
> LIFE_MEDIUM and LIFE_LONG is really needed. Because I guess that the 
> difference between them is a little ambigous for users, and if WARM_DATA 
> segment has two different hints, it can makes GC non-efficient.
> 
> I wonder your thought about this.
> 
> Thanks.
> 
>> Thanks,
>>
>>
> 
> ------------------------------------------------------------------------------
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> _______________________________________________
> Linux-f2fs-devel mailing list
> Linux-f2fs-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [f2fs-dev] [PATCH 1/2] f2fs: pass down write hints to block layer for bufferd write
  2017-12-11 13:15     ` [f2fs-dev] " Chao Yu
@ 2017-12-12  2:15       ` Hyunchul Lee
  2017-12-12  2:45         ` Chao Yu
  0 siblings, 1 reply; 18+ messages in thread
From: Hyunchul Lee @ 2017-12-12  2:15 UTC (permalink / raw)
  To: Chao Yu, Chao Yu, Jaegeuk Kim
  Cc: Jens Axboe, linux-kernel, linux-f2fs-devel, kernel-team,
	linux-fsdevel, Hyunchul Lee

Hi Chao,

On 12/11/2017 10:15 PM, Chao Yu wrote:
> Hi Hyunchul,
> 
> On 2017/12/1 16:28, Hyunchul Lee wrote:
>> Hi Chao,
>>
>> On 11/30/2017 04:06 PM, Chao Yu wrote:
>>> Hi Hyunchul,
>>>
>>> On 2017/11/28 8:23, Hyunchul Lee wrote:
>>>> From: Hyunchul Lee <cheol.lee@lge.com>
>>>>
>>>> This implements which hint is passed down to block layer
>>>> for datas from the specific segment type.
>>>>
>>>> segment type                     hints
>>>> ------------                     -----
>>>> COLD_NODE & COLD_DATA            WRITE_LIFE_EXTREME
>>>> WARM_DATA                        WRITE_LIFE_NONE
>>>> HOT_NODE & WARM_NODE             WRITE_LIFE_LONG
>>>> HOT_DATA                         WRITE_LIFE_MEDIUM
>>>> META_DATA                        WRITE_LIFE_SHORT
>>>
>>> Just noticed, if our user do not give the hint via ioctl, f2fs can
>>> provider hint to lower layer according to hot/cold separation ability,
>>> it will be okay. But once user give his hint which may be more accurate
>>> than filesystem, hint converted by f2fs may be wrong.
>>>
>>> So what do you think of adding an option to control whether filesystem
>>> can convert hint user given?
>>>
>>
>> I think it is okay for LIFE_SHORT and LIFE_EXTREME. because they are 
>> converted to different hints.
> 
> What I mean is introducing a mount option, e.g. fs_iohint,
> a) w/o fs_iohint, propagate file/inode io_hint to low layer.
> b) w/ fs_iohint, ignore file/inode io_hint, use io_hint which is generated
> with filesystem's private rule.
> 

Okay, I will implement this option and send this patch again.

Without fs_iohint, Even if data blocks are moved due to GC, 
we should keep user hints. And if user hints are not given, 
any hints are not passed down to block layer, right?

Thank you for comments.

> Thanks,
> 
>>
>> file hint      segment type        io hint
>> ---------      ------------        -------
>> LIFE_SHORT     HOT_DATA            LIFE_MEDIUM
>> LIFE_MEDIUM    WARM_DATA           LIFE_NONE
>> LIFE_LONG      WARM_DATA           LIFE_NONE
>> LIFE_EXTREME   COLD_DATA           LIFE_EXTREME
>>
>> the problem is that LIFE_MEDIUM and LIFE_LONG are converted to 
>> the same hint, LIFE_NONE. I am not sure that the seperation between 
>> LIFE_MEDIUM and LIFE_LONG is really needed. Because I guess that the 
>> difference between them is a little ambigous for users, and if WARM_DATA 
>> segment has two different hints, it can makes GC non-efficient.
>>
>> I wonder your thought about this.
>>
>> Thanks.
>>
>>> Thanks,
>>>
>>>
>>
>> ------------------------------------------------------------------------------
>> Check out the vibrant tech community on one of the world's most
>> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
>> _______________________________________________
>> Linux-f2fs-devel mailing list
>> Linux-f2fs-devel@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
>>
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [f2fs-dev] [PATCH 1/2] f2fs: pass down write hints to block layer for bufferd write
  2017-12-12  2:15       ` Hyunchul Lee
@ 2017-12-12  2:45         ` Chao Yu
  2017-12-14  1:33           ` Hyunchul Lee
  0 siblings, 1 reply; 18+ messages in thread
From: Chao Yu @ 2017-12-12  2:45 UTC (permalink / raw)
  To: Hyunchul Lee, Chao Yu, Jaegeuk Kim
  Cc: Jens Axboe, linux-kernel, linux-f2fs-devel, kernel-team,
	linux-fsdevel, Hyunchul Lee

Hi Hyunchul,

On 2017/12/12 10:15, Hyunchul Lee wrote:
> Hi Chao,
> 
> On 12/11/2017 10:15 PM, Chao Yu wrote:
>> Hi Hyunchul,
>>
>> On 2017/12/1 16:28, Hyunchul Lee wrote:
>>> Hi Chao,
>>>
>>> On 11/30/2017 04:06 PM, Chao Yu wrote:
>>>> Hi Hyunchul,
>>>>
>>>> On 2017/11/28 8:23, Hyunchul Lee wrote:
>>>>> From: Hyunchul Lee <cheol.lee@lge.com>
>>>>>
>>>>> This implements which hint is passed down to block layer
>>>>> for datas from the specific segment type.
>>>>>
>>>>> segment type                     hints
>>>>> ------------                     -----
>>>>> COLD_NODE & COLD_DATA            WRITE_LIFE_EXTREME
>>>>> WARM_DATA                        WRITE_LIFE_NONE
>>>>> HOT_NODE & WARM_NODE             WRITE_LIFE_LONG
>>>>> HOT_DATA                         WRITE_LIFE_MEDIUM
>>>>> META_DATA                        WRITE_LIFE_SHORT
>>>>
>>>> Just noticed, if our user do not give the hint via ioctl, f2fs can
>>>> provider hint to lower layer according to hot/cold separation ability,
>>>> it will be okay. But once user give his hint which may be more accurate
>>>> than filesystem, hint converted by f2fs may be wrong.
>>>>
>>>> So what do you think of adding an option to control whether filesystem
>>>> can convert hint user given?
>>>>
>>>
>>> I think it is okay for LIFE_SHORT and LIFE_EXTREME. because they are 
>>> converted to different hints.
>>
>> What I mean is introducing a mount option, e.g. fs_iohint,
>> a) w/o fs_iohint, propagate file/inode io_hint to low layer.
>> b) w/ fs_iohint, ignore file/inode io_hint, use io_hint which is generated
>> with filesystem's private rule.
>>
> 
> Okay, I will implement this option and send this patch again.

Let's wait for Jaegeuk's comments first?

> 
> Without fs_iohint, Even if data blocks are moved due to GC, 
> we should keep user hints. And if user hints are not given, 
> any hints are not passed down to block layer, right?

Hmm.. that will be a problem, IMO, we can store last user's io_hint into inode
layout, so later when we trigger GC, we can use the last io_hint in inode rather
than giving no hint or fs' hint.

I think it needs to discuss with original author of IO hint, what is the IO hint
policy when filesystem move block by itself after inode has been released in system.

Thanks,

> 
> Thank you for comments.
> 
>> Thanks,
>>
>>>
>>> file hint      segment type        io hint
>>> ---------      ------------        -------
>>> LIFE_SHORT     HOT_DATA            LIFE_MEDIUM
>>> LIFE_MEDIUM    WARM_DATA           LIFE_NONE
>>> LIFE_LONG      WARM_DATA           LIFE_NONE
>>> LIFE_EXTREME   COLD_DATA           LIFE_EXTREME
>>>
>>> the problem is that LIFE_MEDIUM and LIFE_LONG are converted to 
>>> the same hint, LIFE_NONE. I am not sure that the seperation between 
>>> LIFE_MEDIUM and LIFE_LONG is really needed. Because I guess that the 
>>> difference between them is a little ambigous for users, and if WARM_DATA 
>>> segment has two different hints, it can makes GC non-efficient.
>>>
>>> I wonder your thought about this.
>>>
>>> Thanks.
>>>
>>>> Thanks,
>>>>
>>>>
>>>
>>> ------------------------------------------------------------------------------
>>> Check out the vibrant tech community on one of the world's most
>>> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
>>> _______________________________________________
>>> Linux-f2fs-devel mailing list
>>> Linux-f2fs-devel@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
>>>
>>
> 
> .
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [f2fs-dev] [PATCH 1/2] f2fs: pass down write hints to block layer for bufferd write
  2017-12-12  2:45         ` Chao Yu
@ 2017-12-14  1:33           ` Hyunchul Lee
  2017-12-15  2:06             ` Jaegeuk Kim
  0 siblings, 1 reply; 18+ messages in thread
From: Hyunchul Lee @ 2017-12-14  1:33 UTC (permalink / raw)
  To: Chao Yu, Chao Yu, Jaegeuk Kim
  Cc: Jens Axboe, linux-kernel, linux-f2fs-devel, kernel-team,
	linux-fsdevel, Hyunchul Lee

Hi Jaegeuk,

I need your comment about the fs_iohint mount option.

a) w/o fs_iohint, propagate user hints to low layer.
b) w/ fs_iohint, ignore user hints, and use hints which is generated
with F2FS.

Chao suggests this option. because user hints are more accurate than
file system.

This is resonable, But I have some concerns about this option. 
The first thing is that blocks of a segments have different hints. This
could make GC less effective. 
The second is that the separation between LIFE_MEDIUM and LIFE_LONG is 
really needed. I think that difference between them is a little ambigous 
for users, and LIFE_SHORT and LIFE_EXTREME is converted to different 
hints by F2FS.

Thanks.

On 12/12/2017 11:45 AM, Chao Yu wrote:
> Hi Hyunchul,
> 
> On 2017/12/12 10:15, Hyunchul Lee wrote:
>> Hi Chao,
>>
>> On 12/11/2017 10:15 PM, Chao Yu wrote:
>>> Hi Hyunchul,
>>>
>>> On 2017/12/1 16:28, Hyunchul Lee wrote:
>>>> Hi Chao,
>>>>
>>>> On 11/30/2017 04:06 PM, Chao Yu wrote:
>>>>> Hi Hyunchul,
>>>>>
>>>>> On 2017/11/28 8:23, Hyunchul Lee wrote:
>>>>>> From: Hyunchul Lee <cheol.lee@lge.com>
>>>>>>
>>>>>> This implements which hint is passed down to block layer
>>>>>> for datas from the specific segment type.
>>>>>>
>>>>>> segment type                     hints
>>>>>> ------------                     -----
>>>>>> COLD_NODE & COLD_DATA            WRITE_LIFE_EXTREME
>>>>>> WARM_DATA                        WRITE_LIFE_NONE
>>>>>> HOT_NODE & WARM_NODE             WRITE_LIFE_LONG
>>>>>> HOT_DATA                         WRITE_LIFE_MEDIUM
>>>>>> META_DATA                        WRITE_LIFE_SHORT
>>>>>
>>>>> Just noticed, if our user do not give the hint via ioctl, f2fs can
>>>>> provider hint to lower layer according to hot/cold separation ability,
>>>>> it will be okay. But once user give his hint which may be more accurate
>>>>> than filesystem, hint converted by f2fs may be wrong.
>>>>>
>>>>> So what do you think of adding an option to control whether filesystem
>>>>> can convert hint user given?
>>>>>
>>>>
>>>> I think it is okay for LIFE_SHORT and LIFE_EXTREME. because they are 
>>>> converted to different hints.
>>>
>>> What I mean is introducing a mount option, e.g. fs_iohint,
>>> a) w/o fs_iohint, propagate file/inode io_hint to low layer.
>>> b) w/ fs_iohint, ignore file/inode io_hint, use io_hint which is generated
>>> with filesystem's private rule.
>>>
>>
>> Okay, I will implement this option and send this patch again.
> 
> Let's wait for Jaegeuk's comments first?
> 
>>
>> Without fs_iohint, Even if data blocks are moved due to GC, 
>> we should keep user hints. And if user hints are not given, 
>> any hints are not passed down to block layer, right?
> 
> Hmm.. that will be a problem, IMO, we can store last user's io_hint into inode
> layout, so later when we trigger GC, we can use the last io_hint in inode rather
> than giving no hint or fs' hint.
> 
> I think it needs to discuss with original author of IO hint, what is the IO hint
> policy when filesystem move block by itself after inode has been released in system.
> 
> Thanks,
> 
>>
>> Thank you for comments.
>>
>>> Thanks,
>>>
>>>>
>>>> file hint      segment type        io hint
>>>> ---------      ------------        -------
>>>> LIFE_SHORT     HOT_DATA            LIFE_MEDIUM
>>>> LIFE_MEDIUM    WARM_DATA           LIFE_NONE
>>>> LIFE_LONG      WARM_DATA           LIFE_NONE
>>>> LIFE_EXTREME   COLD_DATA           LIFE_EXTREME
>>>>
>>>> the problem is that LIFE_MEDIUM and LIFE_LONG are converted to 
>>>> the same hint, LIFE_NONE. I am not sure that the seperation between 
>>>> LIFE_MEDIUM and LIFE_LONG is really needed. Because I guess that the 
>>>> difference between them is a little ambigous for users, and if WARM_DATA 
>>>> segment has two different hints, it can makes GC non-efficient.
>>>>
>>>> I wonder your thought about this.
>>>>
>>>> Thanks.
>>>>
>>>>> Thanks,
>>>>>
>>>>>
>>>>
>>>> ------------------------------------------------------------------------------
>>>> Check out the vibrant tech community on one of the world's most
>>>> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
>>>> _______________________________________________
>>>> Linux-f2fs-devel mailing list
>>>> Linux-f2fs-devel@lists.sourceforge.net
>>>> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
>>>>
>>>
>>
>> .
>>
> 
> 
> ------------------------------------------------------------------------------
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> _______________________________________________
> Linux-f2fs-devel mailing list
> Linux-f2fs-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/2] f2fs: pass down write hints to block layer for bufferd write
  2017-12-01  8:50     ` Hyunchul Lee
@ 2017-12-14 21:18       ` Jaegeuk Kim
  0 siblings, 0 replies; 18+ messages in thread
From: Jaegeuk Kim @ 2017-12-14 21:18 UTC (permalink / raw)
  To: Hyunchul Lee
  Cc: Chao Yu, linux-f2fs-devel, linux-kernel, linux-fsdevel,
	kernel-team, Jens Axboe, Hyunchul Lee

Hello,

On 12/01, Hyunchul Lee wrote:
> Hi Jaegeuk,
> 
> On 12/01/2017 04:28 PM, Jaegeuk Kim wrote:
> > On 11/30, Chao Yu wrote:
> >> On 2017/11/28 8:23, Hyunchul Lee wrote:
> >>> From: Hyunchul Lee <cheol.lee@lge.com>
> >>>
> >>> This implements which hint is passed down to block layer
> >>> for datas from the specific segment type.
> >>>
> >>> segment type                     hints
> >>> ------------                     -----
> >>> COLD_NODE & COLD_DATA            WRITE_LIFE_EXTREME
> >>> WARM_DATA                        WRITE_LIFE_NONE
> >>> HOT_NODE & WARM_NODE             WRITE_LIFE_LONG
> >>> HOT_DATA                         WRITE_LIFE_MEDIUM
> >>> META_DATA                        WRITE_LIFE_SHORT
> >>
> >> The correspondence looks good to me.
> > 
> > Still, I don't fully get the point. Why do we have to assign WRITE_LIFE_NONE
> > to WARM_DATA? Why not using WRITE_LIFE_NOT_SET?
> > 
> 
> I think LIFE_NONE and LIFE_NOT_SET are the same. So I chose LIFE_NONE simply.

Same in what way?

WRITE_LIFE_NOT_SET = 0,
WRITE_LIFE_NONE = 1,

Still, you didn't answer why we have to assign LIFE_NONE to WARM_DATA?

Thanks,

> 
> Thanks.
> 
> >>>
> >>> Many thanks to Chao Yu and Jaegeuk Kim for comments to
> >>> implement this patch.
> >>>
> >>> Signed-off-by: Hyunchul Lee <cheol.lee@lge.com>
> >>> ---
> >>>  fs/f2fs/data.c    |  1 +
> >>>  fs/f2fs/f2fs.h    |  2 ++
> >>>  fs/f2fs/segment.c | 29 +++++++++++++++++++++++++++++
> >>>  fs/f2fs/super.c   |  2 ++
> >>>  4 files changed, 34 insertions(+)
> >>>
> >>> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
> >>> index a7ae418..0919a43 100644
> >>> --- a/fs/f2fs/data.c
> >>> +++ b/fs/f2fs/data.c
> >>> @@ -437,6 +437,7 @@ int f2fs_submit_page_write(struct f2fs_io_info *fio)
> >>>  		}
> >>>  		io->bio = __bio_alloc(sbi, fio->new_blkaddr,
> >>>  						BIO_MAX_PAGES, false);
> >>> +		io->bio->bi_write_hint = io->write_hint;
> >>
> >> Need to assign bi_write_hint for IPU path?
> >>
> >> - rewrite_data_page
> >>  - f2fs_submit_page_bio
> >>
> >>>  		io->fio = *fio;
> >>>  	}
> >>>  
> >>> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> >>> index 7bcd148..be3cb0c 100644
> >>> --- a/fs/f2fs/f2fs.h
> >>> +++ b/fs/f2fs/f2fs.h
> >>> @@ -969,6 +969,7 @@ struct f2fs_bio_info {
> >>>  	struct rw_semaphore io_rwsem;	/* blocking op for bio */
> >>>  	spinlock_t io_lock;		/* serialize DATA/NODE IOs */
> >>>  	struct list_head io_list;	/* track fios */
> >>> +	enum rw_hint write_hint;
> >>
> >> Add missing comment?
> >>
> >>>  };
> >>>  
> >>>  #define FDEV(i)				(sbi->devs[i])
> >>> @@ -2674,6 +2675,7 @@ int lookup_journal_in_cursum(struct f2fs_journal *journal, int type,
> >>>  int __init create_segment_manager_caches(void);
> >>>  void destroy_segment_manager_caches(void);
> >>>  int rw_hint_to_seg_type(enum rw_hint hint);
> >>> +enum rw_hint io_type_to_rw_hint(enum page_type page, enum temp_type temp);
> >>>  
> >>>  /*
> >>>   * checkpoint.c
> >>> diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
> >>> index c117e09..0570db7 100644
> >>> --- a/fs/f2fs/segment.c
> >>> +++ b/fs/f2fs/segment.c
> >>> @@ -2446,6 +2446,35 @@ int rw_hint_to_seg_type(enum rw_hint hint)
> >>>  	}
> >>>  }
> >>>  
> >>
> >> It will be better to copy commit log here to declare correspondence
> >> more clearly.
> >>
> >> Thanks,
> >>
> >>> +enum rw_hint io_type_to_rw_hint(enum page_type page, enum temp_type temp)
> >>> +{
> >>> +	if (page == DATA) {
> >>> +		switch (temp) {
> >>> +		case WARM:
> >>> +			return WRITE_LIFE_NONE;
> >>> +		case COLD:
> >>> +			return WRITE_LIFE_EXTREME;
> >>> +		case HOT:
> >>> +			return WRITE_LIFE_MEDIUM;
> >>> +		default:
> >>> +			return WRITE_LIFE_NOT_SET;
> >>> +		}
> >>> +	} else if (page == NODE) {
> >>> +		switch (temp) {
> >>> +		case WARM:
> >>> +		case HOT:
> >>> +			return WRITE_LIFE_LONG;
> >>> +		case COLD:
> >>> +			return WRITE_LIFE_EXTREME;
> >>> +		default:
> >>> +			return WRITE_LIFE_NOT_SET;
> >>> +		}
> >>> +
> >>> +	} else {
> >>> +		return WRITE_LIFE_SHORT;
> >>> +	}
> >>> +}
> >>> +
> >>>  static int __get_segment_type_2(struct f2fs_io_info *fio)
> >>>  {
> >>>  	if (fio->type == DATA)
> >>> diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
> >>> index a6c5dd4..51c19a0 100644
> >>> --- a/fs/f2fs/super.c
> >>> +++ b/fs/f2fs/super.c
> >>> @@ -2508,6 +2508,8 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
> >>>  			sbi->write_io[i][j].bio = NULL;
> >>>  			spin_lock_init(&sbi->write_io[i][j].io_lock);
> >>>  			INIT_LIST_HEAD(&sbi->write_io[i][j].io_list);
> >>> +			sbi->write_io[i][j].write_hint =
> >>> +						io_type_to_rw_hint(i, j);
> >>>  		}
> >>>  	}
> >>>  
> >>>
> > 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [f2fs-dev] [PATCH 1/2] f2fs: pass down write hints to block layer for bufferd write
  2017-12-14  1:33           ` Hyunchul Lee
@ 2017-12-15  2:06             ` Jaegeuk Kim
  2017-12-18  7:28               ` Hyunchul Lee
  2017-12-23  9:44               ` Chao Yu
  0 siblings, 2 replies; 18+ messages in thread
From: Jaegeuk Kim @ 2017-12-15  2:06 UTC (permalink / raw)
  To: Hyunchul Lee
  Cc: Chao Yu, Chao Yu, Jens Axboe, linux-kernel, linux-f2fs-devel,
	kernel-team, linux-fsdevel, Hyunchul Lee

On 12/14, Hyunchul Lee wrote:
> Hi Jaegeuk,
> 
> I need your comment about the fs_iohint mount option.
> 
> a) w/o fs_iohint, propagate user hints to low layer.
> b) w/ fs_iohint, ignore user hints, and use hints which is generated
> with F2FS.
> 
> Chao suggests this option. because user hints are more accurate than
> file system.
> 
> This is resonable, But I have some concerns about this option. 
> The first thing is that blocks of a segments have different hints. This
> could make GC less effective. 
> The second is that the separation between LIFE_MEDIUM and LIFE_LONG is 
> really needed. I think that difference between them is a little ambigous 
> for users, and LIFE_SHORT and LIFE_EXTREME is converted to different 
> hints by F2FS.

I think what we really can do would assign many user hints to our 3 DATA
logs likewise rw_hint_to_seg_type(), since it's just hints for user data.
Then, we can decide how to keep that as much as possible, since we have
another filesystem metadata such as meta and nodes. In addition, I don't
think we have to keep the original user-hints which makes F2FS logs be
messed up.

With that mind, I can think of the below cases. Especially, if user wants
to keep their io_hints, we'd better recommend to use direct_io w/o fs_iohints.
In order to keep this policy, I think fs_iohints would be better to be a
feature set by mkfs.f2fs and detected by sysfs entries for users.

1) w/ fs_iohints

User                        F2FS               Block
-------------------------------------------------------------------
                            Meta               WRITE_LIFE_MEDIUM
                            HOT_NODE           WRITE_LIFE_NOTSET
                            WARM_NODE          -'
                            COLD_NODE          WRITE_LIFE_NONE
ioctl(cold)                 COLD_DATA          WRITE_LIFE_EXTREME
extention list              -'                 -'
WRITE_LIFE_EXTREME          -'                 -'
WRITE_LIFE_SHORT            HOT_DATA           WRITE_LIFE_SHORT

-- buffered_io
WRITE_LIFE_NOT_SET          WARM_DATA          WRITE_LIFE_LONG
WRITE_LIFE_NONE             -'                 -'
WRITE_LIFE_MEDIUM           -'                 -'
WRITE_LIFE_LONG             -'                 -'

-- direct_io (Not recommendable)
WRITE_LIFE_NOT_SET          WARM_DATA          WRITE_LIFE_NOT_SET
WRITE_LIFE_NONE             -'                 WRITE_LIFE_NONE
WRITE_LIFE_MEDIUM           -'                 WRITE_LIFE_MEDIUM
WRITE_LIFE_LONG             -'                 WRITE_LIFE_LONG

2) w/o fs_iohints

User                        F2FS               Block
-------------------------------------------------------------------
                            Meta               -
                            HOT_NODE           -
                            WARM_NODE          -
                            COLD_NODE          -
ioctl(cold)                 COLD_DATA          -
extention list              -'                 -

-- buffered_io
WRITE_LIFE_EXTREME          COLD_DATA          -
WRITE_LIFE_SHORT            HOT_DATA           -
WRITE_LIFE_NOT_SET          WARM_DATA          -
WRITE_LIFE_NONE             -'                 -
WRITE_LIFE_MEDIUM           -'                 -
WRITE_LIFE_LONG             -'                 -

-- direct_io
WRITE_LIFE_EXTREME          COLD_DATA          WRITE_LIFE_EXTREME
WRITE_LIFE_SHORT            HOT_DATA           WRITE_LIFE_SHORT
WRITE_LIFE_NOT_SET          WARM_DATA          WRITE_LIFE_NOT_SET
WRITE_LIFE_NONE             -'                 WRITE_LIFE_NONE
WRITE_LIFE_MEDIUM           -'                 WRITE_LIFE_MEDIUM
WRITE_LIFE_LONG             -'                 WRITE_LIFE_LONG


Note that, I don't much care about how to manipulate streamid in nvme driver
in terms of LIFE_NONE or LIFE_NOTSET, since other drivers can handle them
in different ways. Taking a look at the definition, at least, we don't need
to assume that those are same at all. For example, if we can expolit this in
UFS driver, we can pass all the stream ids to the device as context ids.

Thanks,

> 
> Thanks.
> 
> On 12/12/2017 11:45 AM, Chao Yu wrote:
> > Hi Hyunchul,
> > 
> > On 2017/12/12 10:15, Hyunchul Lee wrote:
> >> Hi Chao,
> >>
> >> On 12/11/2017 10:15 PM, Chao Yu wrote:
> >>> Hi Hyunchul,
> >>>
> >>> On 2017/12/1 16:28, Hyunchul Lee wrote:
> >>>> Hi Chao,
> >>>>
> >>>> On 11/30/2017 04:06 PM, Chao Yu wrote:
> >>>>> Hi Hyunchul,
> >>>>>
> >>>>> On 2017/11/28 8:23, Hyunchul Lee wrote:
> >>>>>> From: Hyunchul Lee <cheol.lee@lge.com>
> >>>>>>
> >>>>>> This implements which hint is passed down to block layer
> >>>>>> for datas from the specific segment type.
> >>>>>>
> >>>>>> segment type                     hints
> >>>>>> ------------                     -----
> >>>>>> COLD_NODE & COLD_DATA            WRITE_LIFE_EXTREME
> >>>>>> WARM_DATA                        WRITE_LIFE_NONE
> >>>>>> HOT_NODE & WARM_NODE             WRITE_LIFE_LONG
> >>>>>> HOT_DATA                         WRITE_LIFE_MEDIUM
> >>>>>> META_DATA                        WRITE_LIFE_SHORT
> >>>>>
> >>>>> Just noticed, if our user do not give the hint via ioctl, f2fs can
> >>>>> provider hint to lower layer according to hot/cold separation ability,
> >>>>> it will be okay. But once user give his hint which may be more accurate
> >>>>> than filesystem, hint converted by f2fs may be wrong.
> >>>>>
> >>>>> So what do you think of adding an option to control whether filesystem
> >>>>> can convert hint user given?
> >>>>>
> >>>>
> >>>> I think it is okay for LIFE_SHORT and LIFE_EXTREME. because they are 
> >>>> converted to different hints.
> >>>
> >>> What I mean is introducing a mount option, e.g. fs_iohint,
> >>> a) w/o fs_iohint, propagate file/inode io_hint to low layer.
> >>> b) w/ fs_iohint, ignore file/inode io_hint, use io_hint which is generated
> >>> with filesystem's private rule.
> >>>
> >>
> >> Okay, I will implement this option and send this patch again.
> > 
> > Let's wait for Jaegeuk's comments first?
> > 
> >>
> >> Without fs_iohint, Even if data blocks are moved due to GC, 
> >> we should keep user hints. And if user hints are not given, 
> >> any hints are not passed down to block layer, right?
> > 
> > Hmm.. that will be a problem, IMO, we can store last user's io_hint into inode
> > layout, so later when we trigger GC, we can use the last io_hint in inode rather
> > than giving no hint or fs' hint.
> > 
> > I think it needs to discuss with original author of IO hint, what is the IO hint
> > policy when filesystem move block by itself after inode has been released in system.
> > 
> > Thanks,
> > 
> >>
> >> Thank you for comments.
> >>
> >>> Thanks,
> >>>
> >>>>
> >>>> file hint      segment type        io hint
> >>>> ---------      ------------        -------
> >>>> LIFE_SHORT     HOT_DATA            LIFE_MEDIUM
> >>>> LIFE_MEDIUM    WARM_DATA           LIFE_NONE
> >>>> LIFE_LONG      WARM_DATA           LIFE_NONE
> >>>> LIFE_EXTREME   COLD_DATA           LIFE_EXTREME
> >>>>
> >>>> the problem is that LIFE_MEDIUM and LIFE_LONG are converted to 
> >>>> the same hint, LIFE_NONE. I am not sure that the seperation between 
> >>>> LIFE_MEDIUM and LIFE_LONG is really needed. Because I guess that the 
> >>>> difference between them is a little ambigous for users, and if WARM_DATA 
> >>>> segment has two different hints, it can makes GC non-efficient.
> >>>>
> >>>> I wonder your thought about this.
> >>>>
> >>>> Thanks.
> >>>>
> >>>>> Thanks,
> >>>>>
> >>>>>
> >>>>
> >>>> ------------------------------------------------------------------------------
> >>>> Check out the vibrant tech community on one of the world's most
> >>>> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> >>>> _______________________________________________
> >>>> Linux-f2fs-devel mailing list
> >>>> Linux-f2fs-devel@lists.sourceforge.net
> >>>> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
> >>>>
> >>>
> >>
> >> .
> >>
> > 
> > 
> > ------------------------------------------------------------------------------
> > Check out the vibrant tech community on one of the world's most
> > engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> > _______________________________________________
> > Linux-f2fs-devel mailing list
> > Linux-f2fs-devel@lists.sourceforge.net
> > https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
> > 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [f2fs-dev] [PATCH 1/2] f2fs: pass down write hints to block layer for bufferd write
  2017-12-15  2:06             ` Jaegeuk Kim
@ 2017-12-18  7:28               ` Hyunchul Lee
  2017-12-23  9:44               ` Chao Yu
  1 sibling, 0 replies; 18+ messages in thread
From: Hyunchul Lee @ 2017-12-18  7:28 UTC (permalink / raw)
  To: Jaegeuk Kim, Chao Yu, Chao Yu
  Cc: Jens Axboe, linux-kernel, linux-f2fs-devel, kernel-team,
	linux-fsdevel, Hyunchul Lee

Hi Jaegeuk,

Agreed. If Chao agrees with this policy, I will implement it.

Thanks for the comment.

On 12/15/2017 11:06 AM, Jaegeuk Kim wrote:
> On 12/14, Hyunchul Lee wrote:
>> Hi Jaegeuk,
>>
>> I need your comment about the fs_iohint mount option.
>>
>> a) w/o fs_iohint, propagate user hints to low layer.
>> b) w/ fs_iohint, ignore user hints, and use hints which is generated
>> with F2FS.
>>
>> Chao suggests this option. because user hints are more accurate than
>> file system.
>>
>> This is resonable, But I have some concerns about this option. 
>> The first thing is that blocks of a segments have different hints. This
>> could make GC less effective. 
>> The second is that the separation between LIFE_MEDIUM and LIFE_LONG is 
>> really needed. I think that difference between them is a little ambigous 
>> for users, and LIFE_SHORT and LIFE_EXTREME is converted to different 
>> hints by F2FS.
> 
> I think what we really can do would assign many user hints to our 3 DATA
> logs likewise rw_hint_to_seg_type(), since it's just hints for user data.
> Then, we can decide how to keep that as much as possible, since we have
> another filesystem metadata such as meta and nodes. In addition, I don't
> think we have to keep the original user-hints which makes F2FS logs be
> messed up.
> 
> With that mind, I can think of the below cases. Especially, if user wants
> to keep their io_hints, we'd better recommend to use direct_io w/o fs_iohints.
> In order to keep this policy, I think fs_iohints would be better to be a
> feature set by mkfs.f2fs and detected by sysfs entries for users.
> 
> 1) w/ fs_iohints
> 
> User                        F2FS               Block
> -------------------------------------------------------------------
>                             Meta               WRITE_LIFE_MEDIUM
>                             HOT_NODE           WRITE_LIFE_NOTSET
>                             WARM_NODE          -'
>                             COLD_NODE          WRITE_LIFE_NONE
> ioctl(cold)                 COLD_DATA          WRITE_LIFE_EXTREME
> extention list              -'                 -'
> WRITE_LIFE_EXTREME          -'                 -'
> WRITE_LIFE_SHORT            HOT_DATA           WRITE_LIFE_SHORT
> 
> -- buffered_io
> WRITE_LIFE_NOT_SET          WARM_DATA          WRITE_LIFE_LONG
> WRITE_LIFE_NONE             -'                 -'
> WRITE_LIFE_MEDIUM           -'                 -'
> WRITE_LIFE_LONG             -'                 -'
> 
> -- direct_io (Not recommendable)
> WRITE_LIFE_NOT_SET          WARM_DATA          WRITE_LIFE_NOT_SET
> WRITE_LIFE_NONE             -'                 WRITE_LIFE_NONE
> WRITE_LIFE_MEDIUM           -'                 WRITE_LIFE_MEDIUM
> WRITE_LIFE_LONG             -'                 WRITE_LIFE_LONG
> 
> 2) w/o fs_iohints
> 
> User                        F2FS               Block
> -------------------------------------------------------------------
>                             Meta               -
>                             HOT_NODE           -
>                             WARM_NODE          -
>                             COLD_NODE          -
> ioctl(cold)                 COLD_DATA          -
> extention list              -'                 -
> 
> -- buffered_io
> WRITE_LIFE_EXTREME          COLD_DATA          -
> WRITE_LIFE_SHORT            HOT_DATA           -
> WRITE_LIFE_NOT_SET          WARM_DATA          -
> WRITE_LIFE_NONE             -'                 -
> WRITE_LIFE_MEDIUM           -'                 -
> WRITE_LIFE_LONG             -'                 -
> 
> -- direct_io
> WRITE_LIFE_EXTREME          COLD_DATA          WRITE_LIFE_EXTREME
> WRITE_LIFE_SHORT            HOT_DATA           WRITE_LIFE_SHORT
> WRITE_LIFE_NOT_SET          WARM_DATA          WRITE_LIFE_NOT_SET
> WRITE_LIFE_NONE             -'                 WRITE_LIFE_NONE
> WRITE_LIFE_MEDIUM           -'                 WRITE_LIFE_MEDIUM
> WRITE_LIFE_LONG             -'                 WRITE_LIFE_LONG
> 
> 
> Note that, I don't much care about how to manipulate streamid in nvme driver
> in terms of LIFE_NONE or LIFE_NOTSET, since other drivers can handle them
> in different ways. Taking a look at the definition, at least, we don't need
> to assume that those are same at all. For example, if we can expolit this in
> UFS driver, we can pass all the stream ids to the device as context ids.
> 
> Thanks,
> 
>>
>> Thanks.
>>
>> On 12/12/2017 11:45 AM, Chao Yu wrote:
>>> Hi Hyunchul,
>>>
>>> On 2017/12/12 10:15, Hyunchul Lee wrote:
>>>> Hi Chao,
>>>>
>>>> On 12/11/2017 10:15 PM, Chao Yu wrote:
>>>>> Hi Hyunchul,
>>>>>
>>>>> On 2017/12/1 16:28, Hyunchul Lee wrote:
>>>>>> Hi Chao,
>>>>>>
>>>>>> On 11/30/2017 04:06 PM, Chao Yu wrote:
>>>>>>> Hi Hyunchul,
>>>>>>>
>>>>>>> On 2017/11/28 8:23, Hyunchul Lee wrote:
>>>>>>>> From: Hyunchul Lee <cheol.lee@lge.com>
>>>>>>>>
>>>>>>>> This implements which hint is passed down to block layer
>>>>>>>> for datas from the specific segment type.
>>>>>>>>
>>>>>>>> segment type                     hints
>>>>>>>> ------------                     -----
>>>>>>>> COLD_NODE & COLD_DATA            WRITE_LIFE_EXTREME
>>>>>>>> WARM_DATA                        WRITE_LIFE_NONE
>>>>>>>> HOT_NODE & WARM_NODE             WRITE_LIFE_LONG
>>>>>>>> HOT_DATA                         WRITE_LIFE_MEDIUM
>>>>>>>> META_DATA                        WRITE_LIFE_SHORT
>>>>>>>
>>>>>>> Just noticed, if our user do not give the hint via ioctl, f2fs can
>>>>>>> provider hint to lower layer according to hot/cold separation ability,
>>>>>>> it will be okay. But once user give his hint which may be more accurate
>>>>>>> than filesystem, hint converted by f2fs may be wrong.
>>>>>>>
>>>>>>> So what do you think of adding an option to control whether filesystem
>>>>>>> can convert hint user given?
>>>>>>>
>>>>>>
>>>>>> I think it is okay for LIFE_SHORT and LIFE_EXTREME. because they are 
>>>>>> converted to different hints.
>>>>>
>>>>> What I mean is introducing a mount option, e.g. fs_iohint,
>>>>> a) w/o fs_iohint, propagate file/inode io_hint to low layer.
>>>>> b) w/ fs_iohint, ignore file/inode io_hint, use io_hint which is generated
>>>>> with filesystem's private rule.
>>>>>
>>>>
>>>> Okay, I will implement this option and send this patch again.
>>>
>>> Let's wait for Jaegeuk's comments first?
>>>
>>>>
>>>> Without fs_iohint, Even if data blocks are moved due to GC, 
>>>> we should keep user hints. And if user hints are not given, 
>>>> any hints are not passed down to block layer, right?
>>>
>>> Hmm.. that will be a problem, IMO, we can store last user's io_hint into inode
>>> layout, so later when we trigger GC, we can use the last io_hint in inode rather
>>> than giving no hint or fs' hint.
>>>
>>> I think it needs to discuss with original author of IO hint, what is the IO hint
>>> policy when filesystem move block by itself after inode has been released in system.
>>>
>>> Thanks,
>>>
>>>>
>>>> Thank you for comments.
>>>>
>>>>> Thanks,
>>>>>
>>>>>>
>>>>>> file hint      segment type        io hint
>>>>>> ---------      ------------        -------
>>>>>> LIFE_SHORT     HOT_DATA            LIFE_MEDIUM
>>>>>> LIFE_MEDIUM    WARM_DATA           LIFE_NONE
>>>>>> LIFE_LONG      WARM_DATA           LIFE_NONE
>>>>>> LIFE_EXTREME   COLD_DATA           LIFE_EXTREME
>>>>>>
>>>>>> the problem is that LIFE_MEDIUM and LIFE_LONG are converted to 
>>>>>> the same hint, LIFE_NONE. I am not sure that the seperation between 
>>>>>> LIFE_MEDIUM and LIFE_LONG is really needed. Because I guess that the 
>>>>>> difference between them is a little ambigous for users, and if WARM_DATA 
>>>>>> segment has two different hints, it can makes GC non-efficient.
>>>>>>
>>>>>> I wonder your thought about this.
>>>>>>
>>>>>> Thanks.
>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> ------------------------------------------------------------------------------
>>>>>> Check out the vibrant tech community on one of the world's most
>>>>>> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
>>>>>> _______________________________________________
>>>>>> Linux-f2fs-devel mailing list
>>>>>> Linux-f2fs-devel@lists.sourceforge.net
>>>>>> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
>>>>>>
>>>>>
>>>>
>>>> .
>>>>
>>>
>>>
>>> ------------------------------------------------------------------------------
>>> Check out the vibrant tech community on one of the world's most
>>> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
>>> _______________________________________________
>>> Linux-f2fs-devel mailing list
>>> Linux-f2fs-devel@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
>>>
> 
> ------------------------------------------------------------------------------
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> _______________________________________________
> Linux-f2fs-devel mailing list
> Linux-f2fs-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [f2fs-dev] [PATCH 1/2] f2fs: pass down write hints to block layer for bufferd write
  2017-12-15  2:06             ` Jaegeuk Kim
  2017-12-18  7:28               ` Hyunchul Lee
@ 2017-12-23  9:44               ` Chao Yu
  2017-12-28  3:26                 ` Jaegeuk Kim
  1 sibling, 1 reply; 18+ messages in thread
From: Chao Yu @ 2017-12-23  9:44 UTC (permalink / raw)
  To: Jaegeuk Kim, Hyunchul Lee
  Cc: Chao Yu, Jens Axboe, linux-kernel, linux-f2fs-devel, kernel-team,
	linux-fsdevel, Hyunchul Lee

On 2017/12/15 10:06, Jaegeuk Kim wrote:
> On 12/14, Hyunchul Lee wrote:
>> Hi Jaegeuk,
>>
>> I need your comment about the fs_iohint mount option.
>>
>> a) w/o fs_iohint, propagate user hints to low layer.
>> b) w/ fs_iohint, ignore user hints, and use hints which is generated
>> with F2FS.
>>
>> Chao suggests this option. because user hints are more accurate than
>> file system.
>>
>> This is resonable, But I have some concerns about this option. 
>> The first thing is that blocks of a segments have different hints. This
>> could make GC less effective. 
>> The second is that the separation between LIFE_MEDIUM and LIFE_LONG is 
>> really needed. I think that difference between them is a little ambigous 
>> for users, and LIFE_SHORT and LIFE_EXTREME is converted to different 
>> hints by F2FS.
> 
> I think what we really can do would assign many user hints to our 3 DATA
> logs likewise rw_hint_to_seg_type(), since it's just hints for user data.
> Then, we can decide how to keep that as much as possible, since we have
> another filesystem metadata such as meta and nodes. In addition, I don't
> think we have to keep the original user-hints which makes F2FS logs be
> messed up.
> 
> With that mind, I can think of the below cases. Especially, if user wants
> to keep their io_hints, we'd better recommend to use direct_io w/o fs_iohints.



> In order to keep this policy, I think fs_iohints would be better to be a
> feature set by mkfs.f2fs and detected by sysfs entries for users.
> 
> 1) w/ fs_iohints
> 
> User                        F2FS               Block
> -------------------------------------------------------------------
>                             Meta               WRITE_LIFE_MEDIUM
>                             HOT_NODE           WRITE_LIFE_NOTSET
>                             WARM_NODE          -'
>                             COLD_NODE          WRITE_LIFE_NONE
> ioctl(cold)                 COLD_DATA          WRITE_LIFE_EXTREME
> extention list              -'                 -'
> WRITE_LIFE_EXTREME          -'                 -'
> WRITE_LIFE_SHORT            HOT_DATA           WRITE_LIFE_SHORT
> 
> -- buffered_io
> WRITE_LIFE_NOT_SET          WARM_DATA          WRITE_LIFE_LONG
> WRITE_LIFE_NONE             -'                 -'
> WRITE_LIFE_MEDIUM           -'                 -'
> WRITE_LIFE_LONG             -'                 -'
> 
> -- direct_io (Not recommendable)
> WRITE_LIFE_NOT_SET          WARM_DATA          WRITE_LIFE_NOT_SET
> WRITE_LIFE_NONE             -'                 WRITE_LIFE_NONE
> WRITE_LIFE_MEDIUM           -'                 WRITE_LIFE_MEDIUM
> WRITE_LIFE_LONG             -'                 WRITE_LIFE_LONG

Agreed with above IO hint mapping rule.

> 
> 2) w/o fs_iohints
> 
> User                        F2FS               Block
> -------------------------------------------------------------------
>                             Meta               -
>                             HOT_NODE           -
>                             WARM_NODE          -
>                             COLD_NODE          -
> ioctl(cold)                 COLD_DATA          -
> extention list              -'                 -
> 
> -- buffered_io
> WRITE_LIFE_EXTREME          COLD_DATA          -
> WRITE_LIFE_SHORT            HOT_DATA           -
> WRITE_LIFE_NOT_SET          WARM_DATA          -
> WRITE_LIFE_NONE             -'                 -
> WRITE_LIFE_MEDIUM           -'                 -
> WRITE_LIFE_LONG             -'                 -

Now we recommend direct_io if user wants to give IO hint for storage, I suspect
that user would suffer performance regression issue w/o buffered IO.

Another problem is that, now, in Android, it will be very hard to prompt
application to migrate their IO pattern from buffered IO to direct IO, one
possible way is distinguishing user data lifetime from FWK, e.g. set
WRITE_LIFE_SHORT for cache file or tmp file, set WRITE_LIFE_EXTREME for media file.

In order to support buffered_io, would it be better to change mapping as below?

-- buffered_io
WRITE_LIFE_EXTREME          COLD_DATA          WRITE_LIFE_EXTREME
WRITE_LIFE_SHORT            HOT_DATA           WRITE_LIFE_SHORT
WRITE_LIFE_NOT_SET          WARM_DATA          WRITE_LIFE_NOT_SET
WRITE_LIFE_NONE             -'                 -'
WRITE_LIFE_MEDIUM           -'                 -'
WRITE_LIFE_LONG             -'                 -'

Thanks,

> 
> -- direct_io
> WRITE_LIFE_EXTREME          COLD_DATA          WRITE_LIFE_EXTREME
> WRITE_LIFE_SHORT            HOT_DATA           WRITE_LIFE_SHORT
> WRITE_LIFE_NOT_SET          WARM_DATA          WRITE_LIFE_NOT_SET
> WRITE_LIFE_NONE             -'                 WRITE_LIFE_NONE
> WRITE_LIFE_MEDIUM           -'                 WRITE_LIFE_MEDIUM
> WRITE_LIFE_LONG             -'                 WRITE_LIFE_LONG
> 
> 
> Note that, I don't much care about how to manipulate streamid in nvme driver
> in terms of LIFE_NONE or LIFE_NOTSET, since other drivers can handle them
> in different ways. Taking a look at the definition, at least, we don't need
> to assume that those are same at all. For example, if we can expolit this in
> UFS driver, we can pass all the stream ids to the device as context ids.
> 
> Thanks,
> 
>>
>> Thanks.
>>
>> On 12/12/2017 11:45 AM, Chao Yu wrote:
>>> Hi Hyunchul,
>>>
>>> On 2017/12/12 10:15, Hyunchul Lee wrote:
>>>> Hi Chao,
>>>>
>>>> On 12/11/2017 10:15 PM, Chao Yu wrote:
>>>>> Hi Hyunchul,
>>>>>
>>>>> On 2017/12/1 16:28, Hyunchul Lee wrote:
>>>>>> Hi Chao,
>>>>>>
>>>>>> On 11/30/2017 04:06 PM, Chao Yu wrote:
>>>>>>> Hi Hyunchul,
>>>>>>>
>>>>>>> On 2017/11/28 8:23, Hyunchul Lee wrote:
>>>>>>>> From: Hyunchul Lee <cheol.lee@lge.com>
>>>>>>>>
>>>>>>>> This implements which hint is passed down to block layer
>>>>>>>> for datas from the specific segment type.
>>>>>>>>
>>>>>>>> segment type                     hints
>>>>>>>> ------------                     -----
>>>>>>>> COLD_NODE & COLD_DATA            WRITE_LIFE_EXTREME
>>>>>>>> WARM_DATA                        WRITE_LIFE_NONE
>>>>>>>> HOT_NODE & WARM_NODE             WRITE_LIFE_LONG
>>>>>>>> HOT_DATA                         WRITE_LIFE_MEDIUM
>>>>>>>> META_DATA                        WRITE_LIFE_SHORT
>>>>>>>
>>>>>>> Just noticed, if our user do not give the hint via ioctl, f2fs can
>>>>>>> provider hint to lower layer according to hot/cold separation ability,
>>>>>>> it will be okay. But once user give his hint which may be more accurate
>>>>>>> than filesystem, hint converted by f2fs may be wrong.
>>>>>>>
>>>>>>> So what do you think of adding an option to control whether filesystem
>>>>>>> can convert hint user given?
>>>>>>>
>>>>>>
>>>>>> I think it is okay for LIFE_SHORT and LIFE_EXTREME. because they are 
>>>>>> converted to different hints.
>>>>>
>>>>> What I mean is introducing a mount option, e.g. fs_iohint,
>>>>> a) w/o fs_iohint, propagate file/inode io_hint to low layer.
>>>>> b) w/ fs_iohint, ignore file/inode io_hint, use io_hint which is generated
>>>>> with filesystem's private rule.
>>>>>
>>>>
>>>> Okay, I will implement this option and send this patch again.
>>>
>>> Let's wait for Jaegeuk's comments first?
>>>
>>>>
>>>> Without fs_iohint, Even if data blocks are moved due to GC, 
>>>> we should keep user hints. And if user hints are not given, 
>>>> any hints are not passed down to block layer, right?
>>>
>>> Hmm.. that will be a problem, IMO, we can store last user's io_hint into inode
>>> layout, so later when we trigger GC, we can use the last io_hint in inode rather
>>> than giving no hint or fs' hint.
>>>
>>> I think it needs to discuss with original author of IO hint, what is the IO hint
>>> policy when filesystem move block by itself after inode has been released in system.
>>>
>>> Thanks,
>>>
>>>>
>>>> Thank you for comments.
>>>>
>>>>> Thanks,
>>>>>
>>>>>>
>>>>>> file hint      segment type        io hint
>>>>>> ---------      ------------        -------
>>>>>> LIFE_SHORT     HOT_DATA            LIFE_MEDIUM
>>>>>> LIFE_MEDIUM    WARM_DATA           LIFE_NONE
>>>>>> LIFE_LONG      WARM_DATA           LIFE_NONE
>>>>>> LIFE_EXTREME   COLD_DATA           LIFE_EXTREME
>>>>>>
>>>>>> the problem is that LIFE_MEDIUM and LIFE_LONG are converted to 
>>>>>> the same hint, LIFE_NONE. I am not sure that the seperation between 
>>>>>> LIFE_MEDIUM and LIFE_LONG is really needed. Because I guess that the 
>>>>>> difference between them is a little ambigous for users, and if WARM_DATA 
>>>>>> segment has two different hints, it can makes GC non-efficient.
>>>>>>
>>>>>> I wonder your thought about this.
>>>>>>
>>>>>> Thanks.
>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> ------------------------------------------------------------------------------
>>>>>> Check out the vibrant tech community on one of the world's most
>>>>>> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
>>>>>> _______________________________________________
>>>>>> Linux-f2fs-devel mailing list
>>>>>> Linux-f2fs-devel@lists.sourceforge.net
>>>>>> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
>>>>>>
>>>>>
>>>>
>>>> .
>>>>
>>>
>>>
>>> ------------------------------------------------------------------------------
>>> Check out the vibrant tech community on one of the world's most
>>> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
>>> _______________________________________________
>>> Linux-f2fs-devel mailing list
>>> Linux-f2fs-devel@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
>>>
> 
> .
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [f2fs-dev] [PATCH 1/2] f2fs: pass down write hints to block layer for bufferd write
  2017-12-23  9:44               ` Chao Yu
@ 2017-12-28  3:26                 ` Jaegeuk Kim
  2017-12-28  5:05                   ` Hyunchul Lee
  0 siblings, 1 reply; 18+ messages in thread
From: Jaegeuk Kim @ 2017-12-28  3:26 UTC (permalink / raw)
  To: Chao Yu
  Cc: Hyunchul Lee, Chao Yu, Jens Axboe, linux-kernel,
	linux-f2fs-devel, kernel-team, linux-fsdevel, Hyunchul Lee

On 12/23, Chao Yu wrote:
> On 2017/12/15 10:06, Jaegeuk Kim wrote:
> > On 12/14, Hyunchul Lee wrote:
> >> Hi Jaegeuk,
> >>
> >> I need your comment about the fs_iohint mount option.
> >>
> >> a) w/o fs_iohint, propagate user hints to low layer.
> >> b) w/ fs_iohint, ignore user hints, and use hints which is generated
> >> with F2FS.
> >>
> >> Chao suggests this option. because user hints are more accurate than
> >> file system.
> >>
> >> This is resonable, But I have some concerns about this option. 
> >> The first thing is that blocks of a segments have different hints. This
> >> could make GC less effective. 
> >> The second is that the separation between LIFE_MEDIUM and LIFE_LONG is 
> >> really needed. I think that difference between them is a little ambigous 
> >> for users, and LIFE_SHORT and LIFE_EXTREME is converted to different 
> >> hints by F2FS.
> > 
> > I think what we really can do would assign many user hints to our 3 DATA
> > logs likewise rw_hint_to_seg_type(), since it's just hints for user data.
> > Then, we can decide how to keep that as much as possible, since we have
> > another filesystem metadata such as meta and nodes. In addition, I don't
> > think we have to keep the original user-hints which makes F2FS logs be
> > messed up.
> > 
> > With that mind, I can think of the below cases. Especially, if user wants
> > to keep their io_hints, we'd better recommend to use direct_io w/o fs_iohints.
> 
> 
> 
> > In order to keep this policy, I think fs_iohints would be better to be a
> > feature set by mkfs.f2fs and detected by sysfs entries for users.
> > 
> > 1) w/ fs_iohints
> > 
> > User                        F2FS               Block
> > -------------------------------------------------------------------
> >                             Meta               WRITE_LIFE_MEDIUM
> >                             HOT_NODE           WRITE_LIFE_NOTSET
> >                             WARM_NODE          -'
> >                             COLD_NODE          WRITE_LIFE_NONE
> > ioctl(cold)                 COLD_DATA          WRITE_LIFE_EXTREME
> > extention list              -'                 -'
> > WRITE_LIFE_EXTREME          -'                 -'
> > WRITE_LIFE_SHORT            HOT_DATA           WRITE_LIFE_SHORT
> > 
> > -- buffered_io
> > WRITE_LIFE_NOT_SET          WARM_DATA          WRITE_LIFE_LONG
> > WRITE_LIFE_NONE             -'                 -'
> > WRITE_LIFE_MEDIUM           -'                 -'
> > WRITE_LIFE_LONG             -'                 -'
> > 
> > -- direct_io (Not recommendable)
> > WRITE_LIFE_NOT_SET          WARM_DATA          WRITE_LIFE_NOT_SET
> > WRITE_LIFE_NONE             -'                 WRITE_LIFE_NONE
> > WRITE_LIFE_MEDIUM           -'                 WRITE_LIFE_MEDIUM
> > WRITE_LIFE_LONG             -'                 WRITE_LIFE_LONG
> 
> Agreed with above IO hint mapping rule.
> 
> > 
> > 2) w/o fs_iohints
> > 
> > User                        F2FS               Block
> > -------------------------------------------------------------------
> >                             Meta               -
> >                             HOT_NODE           -
> >                             WARM_NODE          -
> >                             COLD_NODE          -
> > ioctl(cold)                 COLD_DATA          -
> > extention list              -'                 -
> > 
> > -- buffered_io
> > WRITE_LIFE_EXTREME          COLD_DATA          -
> > WRITE_LIFE_SHORT            HOT_DATA           -
> > WRITE_LIFE_NOT_SET          WARM_DATA          -
> > WRITE_LIFE_NONE             -'                 -
> > WRITE_LIFE_MEDIUM           -'                 -
> > WRITE_LIFE_LONG             -'                 -
> 
> Now we recommend direct_io if user wants to give IO hint for storage, I suspect
> that user would suffer performance regression issue w/o buffered IO.
> 
> Another problem is that, now, in Android, it will be very hard to prompt
> application to migrate their IO pattern from buffered IO to direct IO, one
> possible way is distinguishing user data lifetime from FWK, e.g. set
> WRITE_LIFE_SHORT for cache file or tmp file, set WRITE_LIFE_EXTREME for media file.
> 
> In order to support buffered_io, would it be better to change mapping as below?
> 
> -- buffered_io
> WRITE_LIFE_EXTREME          COLD_DATA          WRITE_LIFE_EXTREME
> WRITE_LIFE_SHORT            HOT_DATA           WRITE_LIFE_SHORT
> WRITE_LIFE_NOT_SET          WARM_DATA          WRITE_LIFE_NOT_SET
> WRITE_LIFE_NONE             -'                 -'
> WRITE_LIFE_MEDIUM           -'                 -'
> WRITE_LIFE_LONG             -'                 -'

Agreed, and it makes more sense that we'd better keep the write hints on
userdata given by applications.

BTW, since we couldn't get any performance numbers with these, how about
adding a mount option like "-o iohints=MODE" where MODE may be one of
"fs-based", "user-based", and "off"?

Thanks,

> 
> Thanks,
> 
> > 
> > -- direct_io
> > WRITE_LIFE_EXTREME          COLD_DATA          WRITE_LIFE_EXTREME
> > WRITE_LIFE_SHORT            HOT_DATA           WRITE_LIFE_SHORT
> > WRITE_LIFE_NOT_SET          WARM_DATA          WRITE_LIFE_NOT_SET
> > WRITE_LIFE_NONE             -'                 WRITE_LIFE_NONE
> > WRITE_LIFE_MEDIUM           -'                 WRITE_LIFE_MEDIUM
> > WRITE_LIFE_LONG             -'                 WRITE_LIFE_LONG
> > 
> > 
> > Note that, I don't much care about how to manipulate streamid in nvme driver
> > in terms of LIFE_NONE or LIFE_NOTSET, since other drivers can handle them
> > in different ways. Taking a look at the definition, at least, we don't need
> > to assume that those are same at all. For example, if we can expolit this in
> > UFS driver, we can pass all the stream ids to the device as context ids.
> > 
> > Thanks,
> > 
> >>
> >> Thanks.
> >>
> >> On 12/12/2017 11:45 AM, Chao Yu wrote:
> >>> Hi Hyunchul,
> >>>
> >>> On 2017/12/12 10:15, Hyunchul Lee wrote:
> >>>> Hi Chao,
> >>>>
> >>>> On 12/11/2017 10:15 PM, Chao Yu wrote:
> >>>>> Hi Hyunchul,
> >>>>>
> >>>>> On 2017/12/1 16:28, Hyunchul Lee wrote:
> >>>>>> Hi Chao,
> >>>>>>
> >>>>>> On 11/30/2017 04:06 PM, Chao Yu wrote:
> >>>>>>> Hi Hyunchul,
> >>>>>>>
> >>>>>>> On 2017/11/28 8:23, Hyunchul Lee wrote:
> >>>>>>>> From: Hyunchul Lee <cheol.lee@lge.com>
> >>>>>>>>
> >>>>>>>> This implements which hint is passed down to block layer
> >>>>>>>> for datas from the specific segment type.
> >>>>>>>>
> >>>>>>>> segment type                     hints
> >>>>>>>> ------------                     -----
> >>>>>>>> COLD_NODE & COLD_DATA            WRITE_LIFE_EXTREME
> >>>>>>>> WARM_DATA                        WRITE_LIFE_NONE
> >>>>>>>> HOT_NODE & WARM_NODE             WRITE_LIFE_LONG
> >>>>>>>> HOT_DATA                         WRITE_LIFE_MEDIUM
> >>>>>>>> META_DATA                        WRITE_LIFE_SHORT
> >>>>>>>
> >>>>>>> Just noticed, if our user do not give the hint via ioctl, f2fs can
> >>>>>>> provider hint to lower layer according to hot/cold separation ability,
> >>>>>>> it will be okay. But once user give his hint which may be more accurate
> >>>>>>> than filesystem, hint converted by f2fs may be wrong.
> >>>>>>>
> >>>>>>> So what do you think of adding an option to control whether filesystem
> >>>>>>> can convert hint user given?
> >>>>>>>
> >>>>>>
> >>>>>> I think it is okay for LIFE_SHORT and LIFE_EXTREME. because they are 
> >>>>>> converted to different hints.
> >>>>>
> >>>>> What I mean is introducing a mount option, e.g. fs_iohint,
> >>>>> a) w/o fs_iohint, propagate file/inode io_hint to low layer.
> >>>>> b) w/ fs_iohint, ignore file/inode io_hint, use io_hint which is generated
> >>>>> with filesystem's private rule.
> >>>>>
> >>>>
> >>>> Okay, I will implement this option and send this patch again.
> >>>
> >>> Let's wait for Jaegeuk's comments first?
> >>>
> >>>>
> >>>> Without fs_iohint, Even if data blocks are moved due to GC, 
> >>>> we should keep user hints. And if user hints are not given, 
> >>>> any hints are not passed down to block layer, right?
> >>>
> >>> Hmm.. that will be a problem, IMO, we can store last user's io_hint into inode
> >>> layout, so later when we trigger GC, we can use the last io_hint in inode rather
> >>> than giving no hint or fs' hint.
> >>>
> >>> I think it needs to discuss with original author of IO hint, what is the IO hint
> >>> policy when filesystem move block by itself after inode has been released in system.
> >>>
> >>> Thanks,
> >>>
> >>>>
> >>>> Thank you for comments.
> >>>>
> >>>>> Thanks,
> >>>>>
> >>>>>>
> >>>>>> file hint      segment type        io hint
> >>>>>> ---------      ------------        -------
> >>>>>> LIFE_SHORT     HOT_DATA            LIFE_MEDIUM
> >>>>>> LIFE_MEDIUM    WARM_DATA           LIFE_NONE
> >>>>>> LIFE_LONG      WARM_DATA           LIFE_NONE
> >>>>>> LIFE_EXTREME   COLD_DATA           LIFE_EXTREME
> >>>>>>
> >>>>>> the problem is that LIFE_MEDIUM and LIFE_LONG are converted to 
> >>>>>> the same hint, LIFE_NONE. I am not sure that the seperation between 
> >>>>>> LIFE_MEDIUM and LIFE_LONG is really needed. Because I guess that the 
> >>>>>> difference between them is a little ambigous for users, and if WARM_DATA 
> >>>>>> segment has two different hints, it can makes GC non-efficient.
> >>>>>>
> >>>>>> I wonder your thought about this.
> >>>>>>
> >>>>>> Thanks.
> >>>>>>
> >>>>>>> Thanks,
> >>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>> ------------------------------------------------------------------------------
> >>>>>> Check out the vibrant tech community on one of the world's most
> >>>>>> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> >>>>>> _______________________________________________
> >>>>>> Linux-f2fs-devel mailing list
> >>>>>> Linux-f2fs-devel@lists.sourceforge.net
> >>>>>> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
> >>>>>>
> >>>>>
> >>>>
> >>>> .
> >>>>
> >>>
> >>>
> >>> ------------------------------------------------------------------------------
> >>> Check out the vibrant tech community on one of the world's most
> >>> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> >>> _______________________________________________
> >>> Linux-f2fs-devel mailing list
> >>> Linux-f2fs-devel@lists.sourceforge.net
> >>> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
> >>>
> > 
> > .
> > 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [f2fs-dev] [PATCH 1/2] f2fs: pass down write hints to block layer for bufferd write
  2017-12-28  3:26                 ` Jaegeuk Kim
@ 2017-12-28  5:05                   ` Hyunchul Lee
  2017-12-28 16:32                     ` Jaegeuk Kim
  0 siblings, 1 reply; 18+ messages in thread
From: Hyunchul Lee @ 2017-12-28  5:05 UTC (permalink / raw)
  To: Jaegeuk Kim, Chao Yu
  Cc: Chao Yu, Jens Axboe, linux-kernel, linux-f2fs-devel, kernel-team,
	linux-fsdevel, Hyunchul Lee

Hi Jaegeuk,

On 12/28/2017 12:26 PM, Jaegeuk Kim wrote:
> On 12/23, Chao Yu wrote:
>> On 2017/12/15 10:06, Jaegeuk Kim wrote:
>>> On 12/14, Hyunchul Lee wrote:
>>>> Hi Jaegeuk,
>>>>
>>>> I need your comment about the fs_iohint mount option.
>>>>
>>>> a) w/o fs_iohint, propagate user hints to low layer.
>>>> b) w/ fs_iohint, ignore user hints, and use hints which is generated
>>>> with F2FS.
>>>>
>>>> Chao suggests this option. because user hints are more accurate than
>>>> file system.
>>>>
>>>> This is resonable, But I have some concerns about this option. 
>>>> The first thing is that blocks of a segments have different hints. This
>>>> could make GC less effective. 
>>>> The second is that the separation between LIFE_MEDIUM and LIFE_LONG is 
>>>> really needed. I think that difference between them is a little ambigous 
>>>> for users, and LIFE_SHORT and LIFE_EXTREME is converted to different 
>>>> hints by F2FS.
>>>
>>> I think what we really can do would assign many user hints to our 3 DATA
>>> logs likewise rw_hint_to_seg_type(), since it's just hints for user data.
>>> Then, we can decide how to keep that as much as possible, since we have
>>> another filesystem metadata such as meta and nodes. In addition, I don't
>>> think we have to keep the original user-hints which makes F2FS logs be
>>> messed up.
>>>
>>> With that mind, I can think of the below cases. Especially, if user wants
>>> to keep their io_hints, we'd better recommend to use direct_io w/o fs_iohints.
>>
>>
>>
>>> In order to keep this policy, I think fs_iohints would be better to be a
>>> feature set by mkfs.f2fs and detected by sysfs entries for users.
>>>
>>> 1) w/ fs_iohints
>>>
>>> User                        F2FS               Block
>>> -------------------------------------------------------------------
>>>                             Meta               WRITE_LIFE_MEDIUM
>>>                             HOT_NODE           WRITE_LIFE_NOTSET
>>>                             WARM_NODE          -'
>>>                             COLD_NODE          WRITE_LIFE_NONE
>>> ioctl(cold)                 COLD_DATA          WRITE_LIFE_EXTREME
>>> extention list              -'                 -'
>>> WRITE_LIFE_EXTREME          -'                 -'
>>> WRITE_LIFE_SHORT            HOT_DATA           WRITE_LIFE_SHORT
>>>
>>> -- buffered_io
>>> WRITE_LIFE_NOT_SET          WARM_DATA          WRITE_LIFE_LONG
>>> WRITE_LIFE_NONE             -'                 -'
>>> WRITE_LIFE_MEDIUM           -'                 -'
>>> WRITE_LIFE_LONG             -'                 -'
>>>
>>> -- direct_io (Not recommendable)
>>> WRITE_LIFE_NOT_SET          WARM_DATA          WRITE_LIFE_NOT_SET
>>> WRITE_LIFE_NONE             -'                 WRITE_LIFE_NONE
>>> WRITE_LIFE_MEDIUM           -'                 WRITE_LIFE_MEDIUM
>>> WRITE_LIFE_LONG             -'                 WRITE_LIFE_LONG
>>
>> Agreed with above IO hint mapping rule.
>>
>>>
>>> 2) w/o fs_iohints
>>>
>>> User                        F2FS               Block
>>> -------------------------------------------------------------------
>>>                             Meta               -
>>>                             HOT_NODE           -
>>>                             WARM_NODE          -
>>>                             COLD_NODE          -
>>> ioctl(cold)                 COLD_DATA          -
>>> extention list              -'                 -
>>>
>>> -- buffered_io
>>> WRITE_LIFE_EXTREME          COLD_DATA          -
>>> WRITE_LIFE_SHORT            HOT_DATA           -
>>> WRITE_LIFE_NOT_SET          WARM_DATA          -
>>> WRITE_LIFE_NONE             -'                 -
>>> WRITE_LIFE_MEDIUM           -'                 -
>>> WRITE_LIFE_LONG             -'                 -
>>
>> Now we recommend direct_io if user wants to give IO hint for storage, I suspect
>> that user would suffer performance regression issue w/o buffered IO.
>>
>> Another problem is that, now, in Android, it will be very hard to prompt
>> application to migrate their IO pattern from buffered IO to direct IO, one
>> possible way is distinguishing user data lifetime from FWK, e.g. set
>> WRITE_LIFE_SHORT for cache file or tmp file, set WRITE_LIFE_EXTREME for media file.
>>
>> In order to support buffered_io, would it be better to change mapping as below?
>>
>> -- buffered_io
>> WRITE_LIFE_EXTREME          COLD_DATA          WRITE_LIFE_EXTREME
>> WRITE_LIFE_SHORT            HOT_DATA           WRITE_LIFE_SHORT
>> WRITE_LIFE_NOT_SET          WARM_DATA          WRITE_LIFE_NOT_SET
>> WRITE_LIFE_NONE             -'                 -'
>> WRITE_LIFE_MEDIUM           -'                 -'
>> WRITE_LIFE_LONG             -'                 -'
> 
> Agreed, and it makes more sense that we'd better keep the write hints on
> userdata given by applications.
> 
> BTW, since we couldn't get any performance numbers with these, how about
> adding a mount option like "-o iohints=MODE" where MODE may be one of
> "fs-based", "user-based", and "off"?
> 

"fs-based" equals "with fs_iohints", "user-based" equals "without fs_iohints"
+ Chao's suggest, and "off" means not passing down hints to block layer. right?

Thanks.

> Thanks,
> 
>>
>> Thanks,
>>
>>>
>>> -- direct_io
>>> WRITE_LIFE_EXTREME          COLD_DATA          WRITE_LIFE_EXTREME
>>> WRITE_LIFE_SHORT            HOT_DATA           WRITE_LIFE_SHORT
>>> WRITE_LIFE_NOT_SET          WARM_DATA          WRITE_LIFE_NOT_SET
>>> WRITE_LIFE_NONE             -'                 WRITE_LIFE_NONE
>>> WRITE_LIFE_MEDIUM           -'                 WRITE_LIFE_MEDIUM
>>> WRITE_LIFE_LONG             -'                 WRITE_LIFE_LONG
>>>
>>>
>>> Note that, I don't much care about how to manipulate streamid in nvme driver
>>> in terms of LIFE_NONE or LIFE_NOTSET, since other drivers can handle them
>>> in different ways. Taking a look at the definition, at least, we don't need
>>> to assume that those are same at all. For example, if we can expolit this in
>>> UFS driver, we can pass all the stream ids to the device as context ids.
>>>
>>> Thanks,
>>>
>>>>
>>>> Thanks.
>>>>
>>>> On 12/12/2017 11:45 AM, Chao Yu wrote:
>>>>> Hi Hyunchul,
>>>>>
>>>>> On 2017/12/12 10:15, Hyunchul Lee wrote:
>>>>>> Hi Chao,
>>>>>>
>>>>>> On 12/11/2017 10:15 PM, Chao Yu wrote:
>>>>>>> Hi Hyunchul,
>>>>>>>
>>>>>>> On 2017/12/1 16:28, Hyunchul Lee wrote:
>>>>>>>> Hi Chao,
>>>>>>>>
>>>>>>>> On 11/30/2017 04:06 PM, Chao Yu wrote:
>>>>>>>>> Hi Hyunchul,
>>>>>>>>>
>>>>>>>>> On 2017/11/28 8:23, Hyunchul Lee wrote:
>>>>>>>>>> From: Hyunchul Lee <cheol.lee@lge.com>
>>>>>>>>>>
>>>>>>>>>> This implements which hint is passed down to block layer
>>>>>>>>>> for datas from the specific segment type.
>>>>>>>>>>
>>>>>>>>>> segment type                     hints
>>>>>>>>>> ------------                     -----
>>>>>>>>>> COLD_NODE & COLD_DATA            WRITE_LIFE_EXTREME
>>>>>>>>>> WARM_DATA                        WRITE_LIFE_NONE
>>>>>>>>>> HOT_NODE & WARM_NODE             WRITE_LIFE_LONG
>>>>>>>>>> HOT_DATA                         WRITE_LIFE_MEDIUM
>>>>>>>>>> META_DATA                        WRITE_LIFE_SHORT
>>>>>>>>>
>>>>>>>>> Just noticed, if our user do not give the hint via ioctl, f2fs can
>>>>>>>>> provider hint to lower layer according to hot/cold separation ability,
>>>>>>>>> it will be okay. But once user give his hint which may be more accurate
>>>>>>>>> than filesystem, hint converted by f2fs may be wrong.
>>>>>>>>>
>>>>>>>>> So what do you think of adding an option to control whether filesystem
>>>>>>>>> can convert hint user given?
>>>>>>>>>
>>>>>>>>
>>>>>>>> I think it is okay for LIFE_SHORT and LIFE_EXTREME. because they are 
>>>>>>>> converted to different hints.
>>>>>>>
>>>>>>> What I mean is introducing a mount option, e.g. fs_iohint,
>>>>>>> a) w/o fs_iohint, propagate file/inode io_hint to low layer.
>>>>>>> b) w/ fs_iohint, ignore file/inode io_hint, use io_hint which is generated
>>>>>>> with filesystem's private rule.
>>>>>>>
>>>>>>
>>>>>> Okay, I will implement this option and send this patch again.
>>>>>
>>>>> Let's wait for Jaegeuk's comments first?
>>>>>
>>>>>>
>>>>>> Without fs_iohint, Even if data blocks are moved due to GC, 
>>>>>> we should keep user hints. And if user hints are not given, 
>>>>>> any hints are not passed down to block layer, right?
>>>>>
>>>>> Hmm.. that will be a problem, IMO, we can store last user's io_hint into inode
>>>>> layout, so later when we trigger GC, we can use the last io_hint in inode rather
>>>>> than giving no hint or fs' hint.
>>>>>
>>>>> I think it needs to discuss with original author of IO hint, what is the IO hint
>>>>> policy when filesystem move block by itself after inode has been released in system.
>>>>>
>>>>> Thanks,
>>>>>
>>>>>>
>>>>>> Thank you for comments.
>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>>>
>>>>>>>> file hint      segment type        io hint
>>>>>>>> ---------      ------------        -------
>>>>>>>> LIFE_SHORT     HOT_DATA            LIFE_MEDIUM
>>>>>>>> LIFE_MEDIUM    WARM_DATA           LIFE_NONE
>>>>>>>> LIFE_LONG      WARM_DATA           LIFE_NONE
>>>>>>>> LIFE_EXTREME   COLD_DATA           LIFE_EXTREME
>>>>>>>>
>>>>>>>> the problem is that LIFE_MEDIUM and LIFE_LONG are converted to 
>>>>>>>> the same hint, LIFE_NONE. I am not sure that the seperation between 
>>>>>>>> LIFE_MEDIUM and LIFE_LONG is really needed. Because I guess that the 
>>>>>>>> difference between them is a little ambigous for users, and if WARM_DATA 
>>>>>>>> segment has two different hints, it can makes GC non-efficient.
>>>>>>>>
>>>>>>>> I wonder your thought about this.
>>>>>>>>
>>>>>>>> Thanks.
>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>> ------------------------------------------------------------------------------
>>>>>>>> Check out the vibrant tech community on one of the world's most
>>>>>>>> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
>>>>>>>> _______________________________________________
>>>>>>>> Linux-f2fs-devel mailing list
>>>>>>>> Linux-f2fs-devel@lists.sourceforge.net
>>>>>>>> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>> .
>>>>>>
>>>>>
>>>>>
>>>>> ------------------------------------------------------------------------------
>>>>> Check out the vibrant tech community on one of the world's most
>>>>> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
>>>>> _______________________________________________
>>>>> Linux-f2fs-devel mailing list
>>>>> Linux-f2fs-devel@lists.sourceforge.net
>>>>> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
>>>>>
>>>
>>> .
>>>
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [f2fs-dev] [PATCH 1/2] f2fs: pass down write hints to block layer for bufferd write
  2017-12-28  5:05                   ` Hyunchul Lee
@ 2017-12-28 16:32                     ` Jaegeuk Kim
  0 siblings, 0 replies; 18+ messages in thread
From: Jaegeuk Kim @ 2017-12-28 16:32 UTC (permalink / raw)
  To: Hyunchul Lee
  Cc: Chao Yu, Chao Yu, Jens Axboe, linux-kernel, linux-f2fs-devel,
	kernel-team, linux-fsdevel, Hyunchul Lee

On 12/28, Hyunchul Lee wrote:
> Hi Jaegeuk,
> 
> On 12/28/2017 12:26 PM, Jaegeuk Kim wrote:
> > On 12/23, Chao Yu wrote:
> >> On 2017/12/15 10:06, Jaegeuk Kim wrote:
> >>> On 12/14, Hyunchul Lee wrote:
> >>>> Hi Jaegeuk,
> >>>>
> >>>> I need your comment about the fs_iohint mount option.
> >>>>
> >>>> a) w/o fs_iohint, propagate user hints to low layer.
> >>>> b) w/ fs_iohint, ignore user hints, and use hints which is generated
> >>>> with F2FS.
> >>>>
> >>>> Chao suggests this option. because user hints are more accurate than
> >>>> file system.
> >>>>
> >>>> This is resonable, But I have some concerns about this option. 
> >>>> The first thing is that blocks of a segments have different hints. This
> >>>> could make GC less effective. 
> >>>> The second is that the separation between LIFE_MEDIUM and LIFE_LONG is 
> >>>> really needed. I think that difference between them is a little ambigous 
> >>>> for users, and LIFE_SHORT and LIFE_EXTREME is converted to different 
> >>>> hints by F2FS.
> >>>
> >>> I think what we really can do would assign many user hints to our 3 DATA
> >>> logs likewise rw_hint_to_seg_type(), since it's just hints for user data.
> >>> Then, we can decide how to keep that as much as possible, since we have
> >>> another filesystem metadata such as meta and nodes. In addition, I don't
> >>> think we have to keep the original user-hints which makes F2FS logs be
> >>> messed up.
> >>>
> >>> With that mind, I can think of the below cases. Especially, if user wants
> >>> to keep their io_hints, we'd better recommend to use direct_io w/o fs_iohints.
> >>
> >>
> >>
> >>> In order to keep this policy, I think fs_iohints would be better to be a
> >>> feature set by mkfs.f2fs and detected by sysfs entries for users.
> >>>
> >>> 1) w/ fs_iohints
> >>>
> >>> User                        F2FS               Block
> >>> -------------------------------------------------------------------
> >>>                             Meta               WRITE_LIFE_MEDIUM
> >>>                             HOT_NODE           WRITE_LIFE_NOTSET
> >>>                             WARM_NODE          -'
> >>>                             COLD_NODE          WRITE_LIFE_NONE
> >>> ioctl(cold)                 COLD_DATA          WRITE_LIFE_EXTREME
> >>> extention list              -'                 -'
> >>> WRITE_LIFE_EXTREME          -'                 -'
> >>> WRITE_LIFE_SHORT            HOT_DATA           WRITE_LIFE_SHORT
> >>>
> >>> -- buffered_io
> >>> WRITE_LIFE_NOT_SET          WARM_DATA          WRITE_LIFE_LONG
> >>> WRITE_LIFE_NONE             -'                 -'
> >>> WRITE_LIFE_MEDIUM           -'                 -'
> >>> WRITE_LIFE_LONG             -'                 -'
> >>>
> >>> -- direct_io (Not recommendable)
> >>> WRITE_LIFE_NOT_SET          WARM_DATA          WRITE_LIFE_NOT_SET
> >>> WRITE_LIFE_NONE             -'                 WRITE_LIFE_NONE
> >>> WRITE_LIFE_MEDIUM           -'                 WRITE_LIFE_MEDIUM
> >>> WRITE_LIFE_LONG             -'                 WRITE_LIFE_LONG
> >>
> >> Agreed with above IO hint mapping rule.
> >>
> >>>
> >>> 2) w/o fs_iohints
> >>>
> >>> User                        F2FS               Block
> >>> -------------------------------------------------------------------
> >>>                             Meta               -
> >>>                             HOT_NODE           -
> >>>                             WARM_NODE          -
> >>>                             COLD_NODE          -
> >>> ioctl(cold)                 COLD_DATA          -
> >>> extention list              -'                 -
> >>>
> >>> -- buffered_io
> >>> WRITE_LIFE_EXTREME          COLD_DATA          -
> >>> WRITE_LIFE_SHORT            HOT_DATA           -
> >>> WRITE_LIFE_NOT_SET          WARM_DATA          -
> >>> WRITE_LIFE_NONE             -'                 -
> >>> WRITE_LIFE_MEDIUM           -'                 -
> >>> WRITE_LIFE_LONG             -'                 -
> >>
> >> Now we recommend direct_io if user wants to give IO hint for storage, I suspect
> >> that user would suffer performance regression issue w/o buffered IO.
> >>
> >> Another problem is that, now, in Android, it will be very hard to prompt
> >> application to migrate their IO pattern from buffered IO to direct IO, one
> >> possible way is distinguishing user data lifetime from FWK, e.g. set
> >> WRITE_LIFE_SHORT for cache file or tmp file, set WRITE_LIFE_EXTREME for media file.
> >>
> >> In order to support buffered_io, would it be better to change mapping as below?
> >>
> >> -- buffered_io
> >> WRITE_LIFE_EXTREME          COLD_DATA          WRITE_LIFE_EXTREME
> >> WRITE_LIFE_SHORT            HOT_DATA           WRITE_LIFE_SHORT
> >> WRITE_LIFE_NOT_SET          WARM_DATA          WRITE_LIFE_NOT_SET
> >> WRITE_LIFE_NONE             -'                 -'
> >> WRITE_LIFE_MEDIUM           -'                 -'
> >> WRITE_LIFE_LONG             -'                 -'
> > 
> > Agreed, and it makes more sense that we'd better keep the write hints on
> > userdata given by applications.
> > 
> > BTW, since we couldn't get any performance numbers with these, how about
> > adding a mount option like "-o iohints=MODE" where MODE may be one of
> > "fs-based", "user-based", and "off"?
> > 
> 
> "fs-based" equals "with fs_iohints", "user-based" equals "without fs_iohints"
> + Chao's suggest, and "off" means not passing down hints to block layer. right?

Yup, this'd allow us to add more options easily later.

Thanks,

> 
> Thanks.
> 
> > Thanks,
> > 
> >>
> >> Thanks,
> >>
> >>>
> >>> -- direct_io
> >>> WRITE_LIFE_EXTREME          COLD_DATA          WRITE_LIFE_EXTREME
> >>> WRITE_LIFE_SHORT            HOT_DATA           WRITE_LIFE_SHORT
> >>> WRITE_LIFE_NOT_SET          WARM_DATA          WRITE_LIFE_NOT_SET
> >>> WRITE_LIFE_NONE             -'                 WRITE_LIFE_NONE
> >>> WRITE_LIFE_MEDIUM           -'                 WRITE_LIFE_MEDIUM
> >>> WRITE_LIFE_LONG             -'                 WRITE_LIFE_LONG
> >>>
> >>>
> >>> Note that, I don't much care about how to manipulate streamid in nvme driver
> >>> in terms of LIFE_NONE or LIFE_NOTSET, since other drivers can handle them
> >>> in different ways. Taking a look at the definition, at least, we don't need
> >>> to assume that those are same at all. For example, if we can expolit this in
> >>> UFS driver, we can pass all the stream ids to the device as context ids.
> >>>
> >>> Thanks,
> >>>
> >>>>
> >>>> Thanks.
> >>>>
> >>>> On 12/12/2017 11:45 AM, Chao Yu wrote:
> >>>>> Hi Hyunchul,
> >>>>>
> >>>>> On 2017/12/12 10:15, Hyunchul Lee wrote:
> >>>>>> Hi Chao,
> >>>>>>
> >>>>>> On 12/11/2017 10:15 PM, Chao Yu wrote:
> >>>>>>> Hi Hyunchul,
> >>>>>>>
> >>>>>>> On 2017/12/1 16:28, Hyunchul Lee wrote:
> >>>>>>>> Hi Chao,
> >>>>>>>>
> >>>>>>>> On 11/30/2017 04:06 PM, Chao Yu wrote:
> >>>>>>>>> Hi Hyunchul,
> >>>>>>>>>
> >>>>>>>>> On 2017/11/28 8:23, Hyunchul Lee wrote:
> >>>>>>>>>> From: Hyunchul Lee <cheol.lee@lge.com>
> >>>>>>>>>>
> >>>>>>>>>> This implements which hint is passed down to block layer
> >>>>>>>>>> for datas from the specific segment type.
> >>>>>>>>>>
> >>>>>>>>>> segment type                     hints
> >>>>>>>>>> ------------                     -----
> >>>>>>>>>> COLD_NODE & COLD_DATA            WRITE_LIFE_EXTREME
> >>>>>>>>>> WARM_DATA                        WRITE_LIFE_NONE
> >>>>>>>>>> HOT_NODE & WARM_NODE             WRITE_LIFE_LONG
> >>>>>>>>>> HOT_DATA                         WRITE_LIFE_MEDIUM
> >>>>>>>>>> META_DATA                        WRITE_LIFE_SHORT
> >>>>>>>>>
> >>>>>>>>> Just noticed, if our user do not give the hint via ioctl, f2fs can
> >>>>>>>>> provider hint to lower layer according to hot/cold separation ability,
> >>>>>>>>> it will be okay. But once user give his hint which may be more accurate
> >>>>>>>>> than filesystem, hint converted by f2fs may be wrong.
> >>>>>>>>>
> >>>>>>>>> So what do you think of adding an option to control whether filesystem
> >>>>>>>>> can convert hint user given?
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>> I think it is okay for LIFE_SHORT and LIFE_EXTREME. because they are 
> >>>>>>>> converted to different hints.
> >>>>>>>
> >>>>>>> What I mean is introducing a mount option, e.g. fs_iohint,
> >>>>>>> a) w/o fs_iohint, propagate file/inode io_hint to low layer.
> >>>>>>> b) w/ fs_iohint, ignore file/inode io_hint, use io_hint which is generated
> >>>>>>> with filesystem's private rule.
> >>>>>>>
> >>>>>>
> >>>>>> Okay, I will implement this option and send this patch again.
> >>>>>
> >>>>> Let's wait for Jaegeuk's comments first?
> >>>>>
> >>>>>>
> >>>>>> Without fs_iohint, Even if data blocks are moved due to GC, 
> >>>>>> we should keep user hints. And if user hints are not given, 
> >>>>>> any hints are not passed down to block layer, right?
> >>>>>
> >>>>> Hmm.. that will be a problem, IMO, we can store last user's io_hint into inode
> >>>>> layout, so later when we trigger GC, we can use the last io_hint in inode rather
> >>>>> than giving no hint or fs' hint.
> >>>>>
> >>>>> I think it needs to discuss with original author of IO hint, what is the IO hint
> >>>>> policy when filesystem move block by itself after inode has been released in system.
> >>>>>
> >>>>> Thanks,
> >>>>>
> >>>>>>
> >>>>>> Thank you for comments.
> >>>>>>
> >>>>>>> Thanks,
> >>>>>>>
> >>>>>>>>
> >>>>>>>> file hint      segment type        io hint
> >>>>>>>> ---------      ------------        -------
> >>>>>>>> LIFE_SHORT     HOT_DATA            LIFE_MEDIUM
> >>>>>>>> LIFE_MEDIUM    WARM_DATA           LIFE_NONE
> >>>>>>>> LIFE_LONG      WARM_DATA           LIFE_NONE
> >>>>>>>> LIFE_EXTREME   COLD_DATA           LIFE_EXTREME
> >>>>>>>>
> >>>>>>>> the problem is that LIFE_MEDIUM and LIFE_LONG are converted to 
> >>>>>>>> the same hint, LIFE_NONE. I am not sure that the seperation between 
> >>>>>>>> LIFE_MEDIUM and LIFE_LONG is really needed. Because I guess that the 
> >>>>>>>> difference between them is a little ambigous for users, and if WARM_DATA 
> >>>>>>>> segment has two different hints, it can makes GC non-efficient.
> >>>>>>>>
> >>>>>>>> I wonder your thought about this.
> >>>>>>>>
> >>>>>>>> Thanks.
> >>>>>>>>
> >>>>>>>>> Thanks,
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>> ------------------------------------------------------------------------------
> >>>>>>>> Check out the vibrant tech community on one of the world's most
> >>>>>>>> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> >>>>>>>> _______________________________________________
> >>>>>>>> Linux-f2fs-devel mailing list
> >>>>>>>> Linux-f2fs-devel@lists.sourceforge.net
> >>>>>>>> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>> .
> >>>>>>
> >>>>>
> >>>>>
> >>>>> ------------------------------------------------------------------------------
> >>>>> Check out the vibrant tech community on one of the world's most
> >>>>> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> >>>>> _______________________________________________
> >>>>> Linux-f2fs-devel mailing list
> >>>>> Linux-f2fs-devel@lists.sourceforge.net
> >>>>> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
> >>>>>
> >>>
> >>> .
> >>>
> > 

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2017-12-28 16:32 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-11-28  0:23 [PATCH 1/2] f2fs: pass down write hints to block layer for bufferd write Hyunchul Lee
2017-11-28  0:23 ` [PATCH 2/2] f2fs: pass down write hints to block layer for direct write Hyunchul Lee
2017-11-30  6:32 ` [PATCH 1/2] f2fs: pass down write hints to block layer for bufferd write Chao Yu
2017-12-01  7:28   ` Jaegeuk Kim
2017-12-01  8:50     ` Hyunchul Lee
2017-12-14 21:18       ` Jaegeuk Kim
2017-11-30  7:06 ` Chao Yu
2017-12-01  8:28   ` Hyunchul Lee
2017-12-11 13:15     ` [f2fs-dev] " Chao Yu
2017-12-12  2:15       ` Hyunchul Lee
2017-12-12  2:45         ` Chao Yu
2017-12-14  1:33           ` Hyunchul Lee
2017-12-15  2:06             ` Jaegeuk Kim
2017-12-18  7:28               ` Hyunchul Lee
2017-12-23  9:44               ` Chao Yu
2017-12-28  3:26                 ` Jaegeuk Kim
2017-12-28  5:05                   ` Hyunchul Lee
2017-12-28 16:32                     ` Jaegeuk Kim

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).