* [PATCH] f2fs: use flush command instead of FUA for zoned device
@ 2022-04-19 21:57 ` Jaegeuk Kim
  0 siblings, 0 replies; 12+ messages in thread
From: Jaegeuk Kim @ 2022-04-19 21:57 UTC (permalink / raw)
  To: linux-kernel, linux-f2fs-devel; +Cc: Jaegeuk Kim

The block layer for zoned disk can reorder the FUA'ed IOs. Let's use flush
command to keep the write order.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
---
 fs/f2fs/file.c | 4 +++-
 fs/f2fs/node.c | 2 +-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index f08e6208e183..2aef0632f35b 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -372,7 +372,9 @@ static int f2fs_do_sync_file(struct file *file, loff_t start, loff_t end,
 	f2fs_remove_ino_entry(sbi, ino, APPEND_INO);
 	clear_inode_flag(inode, FI_APPEND_WRITE);
 flush_out:
-	if (!atomic && F2FS_OPTION(sbi).fsync_mode != FSYNC_MODE_NOBARRIER)
+	if ((!atomic && F2FS_OPTION(sbi).fsync_mode != FSYNC_MODE_NOBARRIER) ||
+			(atomic && !test_opt(sbi, NOBARRIER) &&
+					f2fs_sb_has_blkzoned(sbi)))
 		ret = f2fs_issue_flush(sbi, inode->i_ino);
 	if (!ret) {
 		f2fs_remove_ino_entry(sbi, ino, UPDATE_INO);
diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
index c280f482c741..7224a980056f 100644
--- a/fs/f2fs/node.c
+++ b/fs/f2fs/node.c
@@ -1633,7 +1633,7 @@ static int __write_node_page(struct page *page, bool atomic, bool *submitted,
 		goto redirty_out;
 	}
 
-	if (atomic && !test_opt(sbi, NOBARRIER))
+	if (atomic && !test_opt(sbi, NOBARRIER) && !f2fs_sb_has_blkzoned(sbi))
 		fio.op_flags |= REQ_PREFLUSH | REQ_FUA;
 
 	/* should add to global list before clearing PAGECACHE status */
-- 
2.36.0.rc0.470.gd361397f0d-goog



* Re: [PATCH] f2fs: use flush command instead of FUA for zoned device
  2022-04-19 21:57 ` [f2fs-dev] " Jaegeuk Kim
@ 2022-04-20  5:16   ` Christoph Hellwig
  -1 siblings, 0 replies; 12+ messages in thread
From: Christoph Hellwig @ 2022-04-20  5:16 UTC (permalink / raw)
  To: Jaegeuk Kim; +Cc: linux-kernel, linux-f2fs-devel

On Tue, Apr 19, 2022 at 02:57:03PM -0700, Jaegeuk Kim wrote:
> The block layer for zoned disk can reorder the FUA'ed IOs. Let's use flush
> command to keep the write order.

The block layer can reorder all commands.  Given that FUA only affects
the command itself that should not matter.  What is this trying to fix?


* Re: [PATCH] f2fs: use flush command instead of FUA for zoned device
  2022-04-20  5:16   ` [f2fs-dev] " Christoph Hellwig
@ 2022-04-20 18:12     ` Jaegeuk Kim
  -1 siblings, 0 replies; 12+ messages in thread
From: Jaegeuk Kim @ 2022-04-20 18:12 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-kernel, linux-f2fs-devel

On 04/19, Christoph Hellwig wrote:
> On Tue, Apr 19, 2022 at 02:57:03PM -0700, Jaegeuk Kim wrote:
> > The block layer for zoned disk can reorder the FUA'ed IOs. Let's use flush
> > command to keep the write order.
> 
> The block layer can reorder all commands.  Given that FUA only affects
> the command itself that should not matter.  What is this trying to fix?

Not for zoned disks, as I mentioned. mq-deadline serializes write commands,
but when there is a FUA|PREFLUSH command I got a reordered command, which the
disk fails with EIO.
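
Why a reordered write ends in an error can be sketched with a small userspace
model of a sequential-write-required zone. This is only an illustration under
assumptions: `struct zone_model` and `zone_model_write()` are hypothetical
names, not kernel or device interfaces, and returning -EIO simply mirrors the
error reported above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one sequential-write-required zone. */
struct zone_model {
	uint64_t start;	/* first sector of the zone */
	uint64_t len;	/* zone size in sectors */
	uint64_t wp;	/* write pointer: next sector that may be written */
};

/*
 * Accept a write only if it starts exactly at the write pointer and fits
 * inside the zone. Anything else fails, which is what happens to a write
 * that was dispatched ahead of the one that should have preceded it.
 */
static int zone_model_write(struct zone_model *z, uint64_t sector,
			    uint64_t nr_sectors)
{
	assert(z != NULL);

	if (sector != z->wp)
		return -EIO;	/* not at the write pointer: out of order */
	if (sector + nr_sectors > z->start + z->len)
		return -EIO;	/* would cross the end of the zone */

	z->wp += nr_sectors;
	return 0;
}
```

With two back-to-back writes to sectors 0-7 and 8-15, dispatching the second
one first fails in this model and leaves the write pointer untouched, while
the intended order succeeds.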


* Re: [PATCH] f2fs: use flush command instead of FUA for zoned device
  2022-04-19 21:57 ` [f2fs-dev] " Jaegeuk Kim
@ 2022-04-21  8:43   ` Damien Le Moal via Linux-f2fs-devel
  -1 siblings, 0 replies; 12+ messages in thread
From: Damien Le Moal @ 2022-04-21  8:43 UTC (permalink / raw)
  To: Jaegeuk Kim, linux-kernel, linux-f2fs-devel

On 4/20/22 06:57, Jaegeuk Kim wrote:
> The block layer for zoned disk can reorder the FUA'ed IOs. Let's use flush
> command to keep the write order.

Strictly speaking, for a request that has data, the problem is triggered
by REQ_PREFLUSH since in this case the request does not go through the
scheduler and is processed through the blk-flush machinery. REQ_FUA on its
own should not matter if the device supports it. If the device does not
support FUA, then the same problem can happen due to POSTFLUSH (again no
scheduler).

Bypassing the scheduler leads to the write not write-locking the zone,
which leads to reordering... Completely overlooked that case when the zone
write locking was implemented.

Ideally, the FS should not have to care about this. blk-flush machinery
should be a little more intelligent and process the write phase of the
request using the scheduler. Need to look into that.
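
The cases described above can be written down as a small decision function.
The sketch below is a simplified userspace model, not the blk-flush code: the
flag and step names are hypothetical, and it assumes the device has a volatile
write cache, so flush flags are not simply dropped.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical request flags for this sketch (not the kernel's values). */
enum {
	MODEL_PREFLUSH = 1 << 0,
	MODEL_FUA      = 1 << 1,
};

/* Steps the flush sequence has to run for one request. */
enum {
	STEP_PREFLUSH  = 1 << 0,
	STEP_DATA      = 1 << 1,
	STEP_POSTFLUSH = 1 << 2,
};

static unsigned int model_flush_steps(unsigned int flags, bool has_data,
				      bool dev_has_fua)
{
	unsigned int steps = 0;

	/* Only the two flags modelled here are accepted. */
	assert((flags & ~(MODEL_PREFLUSH | MODEL_FUA)) == 0);

	if (has_data)
		steps |= STEP_DATA;
	if (flags & MODEL_PREFLUSH)
		steps |= STEP_PREFLUSH;
	/* Without native FUA, durability is emulated by a flush after the data. */
	if ((flags & MODEL_FUA) && !dev_has_fua)
		steps |= STEP_POSTFLUSH;
	return steps;
}

/*
 * A data write that also needs a pre- or post-flush is processed by the
 * flush machinery, so it skips the scheduler and the zone write lock.
 */
static bool model_write_bypasses_scheduler(unsigned int flags, bool has_data,
					   bool dev_has_fua)
{
	unsigned int steps = model_flush_steps(flags, has_data, dev_has_fua);

	return (steps & STEP_DATA) &&
	       (steps & (STEP_PREFLUSH | STEP_POSTFLUSH));
}
```

In this model a FUA-only write to a device with native FUA still goes through
the scheduler, while PREFLUSH with data, or FUA on a device without FUA
support, does not.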

> 
> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
> ---
>  fs/f2fs/file.c | 4 +++-
>  fs/f2fs/node.c | 2 +-
>  2 files changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
> index f08e6208e183..2aef0632f35b 100644
> --- a/fs/f2fs/file.c
> +++ b/fs/f2fs/file.c
> @@ -372,7 +372,9 @@ static int f2fs_do_sync_file(struct file *file, loff_t start, loff_t end,
>  	f2fs_remove_ino_entry(sbi, ino, APPEND_INO);
>  	clear_inode_flag(inode, FI_APPEND_WRITE);
>  flush_out:
> -	if (!atomic && F2FS_OPTION(sbi).fsync_mode != FSYNC_MODE_NOBARRIER)
> +	if ((!atomic && F2FS_OPTION(sbi).fsync_mode != FSYNC_MODE_NOBARRIER) ||
> +			(atomic && !test_opt(sbi, NOBARRIER) &&
> +					f2fs_sb_has_blkzoned(sbi)))

Aligning the conditions and not breaking the second line would make this a
lot easier to read...

>  		ret = f2fs_issue_flush(sbi, inode->i_ino);
>  	if (!ret) {
>  		f2fs_remove_ino_entry(sbi, ino, UPDATE_INO);
> diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
> index c280f482c741..7224a980056f 100644
> --- a/fs/f2fs/node.c
> +++ b/fs/f2fs/node.c
> @@ -1633,7 +1633,7 @@ static int __write_node_page(struct page *page, bool atomic, bool *submitted,
>  		goto redirty_out;
>  	}
>  
> -	if (atomic && !test_opt(sbi, NOBARRIER))
> +	if (atomic && !test_opt(sbi, NOBARRIER) && !f2fs_sb_has_blkzoned(sbi))
>  		fio.op_flags |= REQ_PREFLUSH | REQ_FUA;

Is this really OK to do? A flush and a write issued as separate operations
may not lead to the same result as a preflush+FUA write.

Until the block layer is fixed to handle this properly, a simpler fix for
f2fs would be to force-enable the NOBARRIER option for zoned drives. That
would avoid these changes, no?

Also, with all the testing we do on SMR disks and f2fs (smaller, older SMR
disks due to the 16TB limit), we have never triggered this problem. How
did you trigger it?

>  
>  	/* should add to global list before clearing PAGECACHE status */


-- 
Damien Le Moal
Western Digital Research


* Re: [PATCH] f2fs: use flush command instead of FUA for zoned device
  2022-04-21  8:43   ` [f2fs-dev] " Damien Le Moal via Linux-f2fs-devel
@ 2022-04-21 15:20     ` Jaegeuk Kim
  -1 siblings, 0 replies; 12+ messages in thread
From: Jaegeuk Kim @ 2022-04-21 15:20 UTC (permalink / raw)
  To: Damien Le Moal; +Cc: linux-kernel, linux-f2fs-devel

On 04/21, Damien Le Moal wrote:
> On 4/20/22 06:57, Jaegeuk Kim wrote:
> > The block layer for zoned disk can reorder the FUA'ed IOs. Let's use flush
> > command to keep the write order.
> 
> Strictly speaking, for a request that has data, the problem is triggered
> by REQ_PREFLUSH since in this case the request does not go through the
> scheduler and is processed through the blk-flush machinery. REQ_FUA on its
> own should not matter if the device supports it. If the device does not
> support FUA, then the same problem can happen due to POSTFLUSH (again no
> scheduler).

I think the problem is data piggy-backed on a flush or FUA, whichever it is,
and that is what made me use a separate flush command.

> 
> Bypassing the scheduler leads to the write not write-locking the zone,
> which leads to reordering... Completely overlooked that case when the zone
> write locking was implemented.
> 
> Ideally, the FS should not have to care about this. blk-flush machinery
> should be a little more intelligent and process the write phase of the
> request using the scheduler. Need to look into that.

Please do. I'm okay with reverting this once the block layer supports it.

> 
> > 
> > Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
> > ---
> >  fs/f2fs/file.c | 4 +++-
> >  fs/f2fs/node.c | 2 +-
> >  2 files changed, 4 insertions(+), 2 deletions(-)
> > 
> > diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
> > index f08e6208e183..2aef0632f35b 100644
> > --- a/fs/f2fs/file.c
> > +++ b/fs/f2fs/file.c
> > @@ -372,7 +372,9 @@ static int f2fs_do_sync_file(struct file *file, loff_t start, loff_t end,
> >  	f2fs_remove_ino_entry(sbi, ino, APPEND_INO);
> >  	clear_inode_flag(inode, FI_APPEND_WRITE);
> >  flush_out:
> > -	if (!atomic && F2FS_OPTION(sbi).fsync_mode != FSYNC_MODE_NOBARRIER)
> > +	if ((!atomic && F2FS_OPTION(sbi).fsync_mode != FSYNC_MODE_NOBARRIER) ||
> > +			(atomic && !test_opt(sbi, NOBARRIER) &&
> > +					f2fs_sb_has_blkzoned(sbi)))
> 
> Aligning the conditions and not breaking the second line would make this a
> lot easier to read...

Sure.

> 
> >  		ret = f2fs_issue_flush(sbi, inode->i_ino);
> >  	if (!ret) {
> >  		f2fs_remove_ino_entry(sbi, ino, UPDATE_INO);
> > diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
> > index c280f482c741..7224a980056f 100644
> > --- a/fs/f2fs/node.c
> > +++ b/fs/f2fs/node.c
> > @@ -1633,7 +1633,7 @@ static int __write_node_page(struct page *page, bool atomic, bool *submitted,
> >  		goto redirty_out;
> >  	}
> >  
> > -	if (atomic && !test_opt(sbi, NOBARRIER))
> > +	if (atomic && !test_opt(sbi, NOBARRIER) && !f2fs_sb_has_blkzoned(sbi))
> >  		fio.op_flags |= REQ_PREFLUSH | REQ_FUA;
> 
> Is this really OK to do? A flush and a write issued as separate operations
> may not lead to the same result as a preflush+FUA write.
> 
> Until the block layer is fixed to handle this properly, a simpler fix for
> f2fs would be to force-enable the NOBARRIER option for zoned drives. That
> would avoid these changes, no?

No, that would hurt the consistency of FS metadata.

> 
> Also, with all the testing we do on SMR disks and f2fs (smaller, older SMR
> disks due to the 16TB limit), we have never triggered this problem. How
> did you trigger it?

This happens in Android only, since atomic_write for sqlite is taking this path.

> 
> >  
> >  	/* should add to global list before clearing PAGECACHE status */
> 
> 
> -- 
> Damien Le Moal
> Western Digital Research


* Re: [PATCH v2] f2fs: use flush command instead of FUA for zoned device
  2022-04-19 21:57 ` [f2fs-dev] " Jaegeuk Kim
@ 2022-04-21 16:05   ` Jaegeuk Kim
  -1 siblings, 0 replies; 12+ messages in thread
From: Jaegeuk Kim @ 2022-04-21 16:05 UTC (permalink / raw)
  To: linux-kernel, linux-f2fs-devel

The block layer for zoned disk can reorder the FUA'ed IOs. Let's use flush
command to keep the write order.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
---

 change log from v1:
  - minor clean up

 fs/f2fs/file.c | 3 ++-
 fs/f2fs/node.c | 2 +-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index f08e6208e183..eae2e7908072 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -372,7 +372,8 @@ static int f2fs_do_sync_file(struct file *file, loff_t start, loff_t end,
 	f2fs_remove_ino_entry(sbi, ino, APPEND_INO);
 	clear_inode_flag(inode, FI_APPEND_WRITE);
 flush_out:
-	if (!atomic && F2FS_OPTION(sbi).fsync_mode != FSYNC_MODE_NOBARRIER)
+	if ((!atomic && F2FS_OPTION(sbi).fsync_mode != FSYNC_MODE_NOBARRIER) ||
+	    (atomic && !test_opt(sbi, NOBARRIER) && f2fs_sb_has_blkzoned(sbi)))
 		ret = f2fs_issue_flush(sbi, inode->i_ino);
 	if (!ret) {
 		f2fs_remove_ino_entry(sbi, ino, UPDATE_INO);
diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
index c45d341dcf6e..144f9f966690 100644
--- a/fs/f2fs/node.c
+++ b/fs/f2fs/node.c
@@ -1631,7 +1631,7 @@ static int __write_node_page(struct page *page, bool atomic, bool *submitted,
 		goto redirty_out;
 	}
 
-	if (atomic && !test_opt(sbi, NOBARRIER))
+	if (atomic && !test_opt(sbi, NOBARRIER) && !f2fs_sb_has_blkzoned(sbi))
 		fio.op_flags |= REQ_PREFLUSH | REQ_FUA;
 
 	/* should add to global list before clearing PAGECACHE status */
-- 
2.36.0.rc2.479.g8af0fa9b8e-goog
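
For readers following the two hunks, the combined effect of the patched
conditions can be modelled in userspace as below. This is a sketch under
assumptions: `struct sync_model` and both helpers are hypothetical names that
only mirror the boolean expressions in the diff; they are not f2fs code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the state the patched conditions read. */
struct sync_model {
	bool atomic;		/* the "atomic" argument of both functions */
	bool nobarrier_opt;	/* test_opt(sbi, NOBARRIER) */
	bool fsync_nobarrier;	/* fsync_mode == FSYNC_MODE_NOBARRIER */
	bool zoned;		/* f2fs_sb_has_blkzoned(sbi) */
};

/* Mirrors the node.c hunk: does the node write carry REQ_PREFLUSH | REQ_FUA? */
static bool node_write_uses_preflush_fua(const struct sync_model *m)
{
	assert(m != NULL);
	return m->atomic && !m->nobarrier_opt && !m->zoned;
}

/* Mirrors the file.c hunk: is f2fs_issue_flush() called at flush_out? */
static bool issues_separate_flush(const struct sync_model *m)
{
	assert(m != NULL);
	return (!m->atomic && !m->fsync_nobarrier) ||
	       (m->atomic && !m->nobarrier_opt && m->zoned);
}
```

For an atomic sync with barriers enabled, exactly one of the two helpers
returns true: the flagged node write on conventional devices, the separate
flush on zoned ones.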



end of thread, other threads:[~2022-04-21 16:06 UTC | newest]

Thread overview: 12+ messages
2022-04-19 21:57 [PATCH] f2fs: use flush command instead of FUA for zoned device Jaegeuk Kim
2022-04-19 21:57 ` [f2fs-dev] " Jaegeuk Kim
2022-04-20  5:16 ` Christoph Hellwig
2022-04-20  5:16   ` [f2fs-dev] " Christoph Hellwig
2022-04-20 18:12   ` Jaegeuk Kim
2022-04-20 18:12     ` [f2fs-dev] " Jaegeuk Kim
2022-04-21  8:43 ` Damien Le Moal
2022-04-21  8:43   ` [f2fs-dev] " Damien Le Moal via Linux-f2fs-devel
2022-04-21 15:20   ` Jaegeuk Kim
2022-04-21 15:20     ` [f2fs-dev] " Jaegeuk Kim
2022-04-21 16:05 ` [PATCH v2] " Jaegeuk Kim
2022-04-21 16:05   ` [f2fs-dev] " Jaegeuk Kim
