linux-btrfs.vger.kernel.org archive mirror
* Major bug in BTRFS (syncs are ignored with libaio or io_uring)
@ 2022-10-27 21:21 Марк Коренберг
From: Марк Коренберг @ 2022-10-27 21:21 UTC (permalink / raw)
  To: linux-btrfs

How to reproduce (tested on kernel 6.1):

2.  mkfs.btrfs over a partition.
3.  mount -o lazytime,noatime
4.  touch file.dat
5.  chattr +C file.dat # turns off compression, checksumming and COW
6.  fallocate -l1G file.dat
7.  # prefill the file with random data
    fio -ioengine=psync -name=test -bs=1M -rw=write -filename=file.dat
8.  fio -ioengine=psync    -sync=1 -direct=1 -name=test -bs=4k -rw=randwrite -runtime=60 -filename=file.dat  # Will show, say, 2K IOPS
9.  fio -ioengine=io_uring -sync=1 -direct=1 -name=test -bs=4k -rw=randwrite -runtime=60 -filename=file.dat  # Will show, say, 32K IOPS
10. fio -ioengine=libaio   -sync=1 -direct=1 -name=test -bs=4k -rw=randwrite -runtime=60 -filename=file.dat  # Will show, say, 32K IOPS

Steps 9 and 10 show implausibly high IOPS.

This does not happen on, say, Ext4, where all three engines give
roughly the same IOPS.

Removing -sync=1 on Ext4 makes every engine return almost immediately
(as expected, because the writes get merged and written out very fast).

Adding or removing -sync=1 with io_uring or libaio changes nothing on
BTRFS, so it is definitely a bug.
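
The comparison in steps 8 to 10 could be scripted roughly as follows.
This is only a sketch, not part of the original report: the helper
names, the JSON output option, and the 4x threshold are assumptions
chosen for illustration, and fio may print extra notices that would
need stripping before the JSON is parsed.

```python
import json
import subprocess

# Options shared by steps 8-10 of the report (long-option spelling).
COMMON = ["--sync=1", "--direct=1", "--name=test", "--bs=4k",
          "--rw=randwrite", "--runtime=60", "--filename=file.dat"]


def fio_command(engine):
    """Build the fio command line for one I/O engine, asking for JSON output."""
    return ["fio", f"--ioengine={engine}", *COMMON, "--output-format=json"]


def write_iops(report):
    """Extract the write IOPS of the first job from a parsed fio JSON report."""
    return report["jobs"][0]["write"]["iops"]


def suspicious(iops_by_engine, baseline="psync", factor=4.0):
    """Return engines whose IOPS exceed the baseline by more than `factor`.

    The factor is an arbitrary illustrative threshold, not a measured bound.
    """
    base = iops_by_engine[baseline]
    return sorted(engine for engine, iops in iops_by_engine.items()
                  if engine != baseline and iops > factor * base)


def run_all(engines=("psync", "io_uring", "libaio")):
    """Run fio once per engine (needs fio installed) and collect write IOPS."""
    results = {}
    for engine in engines:
        out = subprocess.run(fio_command(engine), check=True,
                             capture_output=True, text=True).stdout
        results[engine] = write_iops(json.loads(out))
    return results
```

Calling suspicious(run_all()) from the directory holding file.dat would
list the engines whose synchronous-write rate looks too good compared
with psync.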


I consider this a bug in BTRFS, and a very important one, because
BTRFS is now the default FS in Fedora server/desktop. This bug may
cause data loss, which is why I set it to high priority.


*****************
https://bugzilla.redhat.com/show_bug.cgi?id=2117971
*****************


-- 
Segmentation fault


* Re: Major bug in BTRFS (syncs are ignored with libaio or io_uring)
@ 2022-10-28  4:30 ` Andrei Borzenkov
From: Andrei Borzenkov @ 2022-10-28  4:30 UTC (permalink / raw)
  To: Марк Коренберг, linux-btrfs

On 28.10.2022 00:21, Марк Коренберг wrote:
> I consider this a bug in BTRFS, and a very important one, because
> BTRFS is now the default FS in Fedora server/desktop. This bug may
> cause data loss, which is why I set it to high priority.
> 
> 

Could you explain how this can cause data loss?


* Re: Major bug in BTRFS (syncs are ignored with libaio or io_uring)
@ 2022-10-28  4:48   ` Марк Коренберг
From: Марк Коренберг @ 2022-10-28  4:48 UTC (permalink / raw)
  To: Andrei Borzenkov; +Cc: linux-btrfs

Databases and other software use fsync/sync to persist data and to
protect against data loss. Modern databases may write data through
newer, faster interfaces such as io_uring, or through libaio. Data
loss can occur if a power outage happens after an application has
reported to the upper layer that the data was persisted. An fsync/sync
must never be faked in the storage layers.
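
To make the contract concrete, here is a minimal sketch of what an
application relies on. It is an illustration only: durable_write is a
hypothetical helper name, and it uses a plain blocking write followed
by fsync rather than io_uring or libaio.

```python
import os


def durable_write(path, data, offset=0):
    """Write `data` at `offset` and return only after fsync() has succeeded."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
    try:
        written = os.pwrite(fd, data, offset)
        if written != len(data):
            raise OSError("short write")
        # Only after this call returns may the application tell its
        # upper layer that the data is persisted.
        os.fsync(fd)
    finally:
        os.close(fd)
    return written
```

If the filesystem completes the synchronous write without actually
flushing, the caller acknowledges data that a power outage can still
destroy.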

On Fri, 28 Oct 2022 at 07:30, Andrei Borzenkov <arvidjaar@gmail.com> wrote:
> Could you explain how this can cause data loss?



-- 
Segmentation fault


* Re: Major bug in BTRFS (syncs are ignored with libaio or io_uring)
@ 2022-10-28 10:23 ` Filipe Manana
From: Filipe Manana @ 2022-10-28 10:23 UTC (permalink / raw)
  To: Марк Коренберг
  Cc: linux-btrfs

On Thu, Oct 27, 2022 at 11:08 PM Марк Коренберг <socketpair@gmail.com> wrote:
>
> Adding/Removing -sync=1 with io_uring or libaio changes nothing on
> BTRFS (it's definitely a bug)

I confirm that the syncing is not happening often when using aio
(either old aio or io_uring).
I understand why it's happening, so I'll work on a fix for that.

Thanks for the report.



* Re: Major bug in BTRFS (syncs are ignored with libaio or io_uring)
@ 2022-11-11 12:01   ` Filipe Manana
From: Filipe Manana @ 2022-11-11 12:01 UTC (permalink / raw)
  To: Марк Коренберг
  Cc: linux-btrfs

On Fri, Oct 28, 2022 at 11:23 AM Filipe Manana <fdmanana@kernel.org> wrote:
>
> I confirm that the syncing is not happening often when using aio
> (either old aio or io_uring).
> I understand why it's happening, so I'll work on a fix for that.

Btw, I forgot to follow up, but the fix is already in Linus' tree:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=8184620ae21213d51eaf2e0bd4186baacb928172

It was also backported to the 6.0.8 and 5.15.78 stable releases.
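
For anyone who wants to check a machine against those two stable
releases, a rough sketch follows. The helper name and the table are
assumptions for illustration; it only knows the 5.15.y and 6.0.y series
named above, and distribution kernels may carry the fix under other
version numbers, so a None result means "look for the commit in the
changelog instead".

```python
import re

# First stable releases carrying the backport, as stated in this message.
FIXED_STABLE = {(5, 15): 78, (6, 0): 8}


def has_backport(release):
    """Return True/False for the 5.15.y and 6.0.y series, None if unknown."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if not match:
        return None
    major, minor = int(match.group(1)), int(match.group(2))
    patch = int(match.group(3) or 0)
    first_fixed = FIXED_STABLE.get((major, minor))
    if first_fixed is None:
        return None  # series not covered by the statement above
    return patch >= first_fixed
```

The input would typically be the output of uname -r, for example
has_backport("5.15.78-generic").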

Thanks.

