On 2019/4/29 3:27 AM, Hendrik Friedel wrote:
> Hello,
> thanks for your reply.
>
>>> 3) Even more, it would be good if btrfs would disable the write cache
>>> in that case, so that one does not need to rely on the user
>>
>> Personally speaking, if the user really believes it's the write cache
>> causing the problem or wants to be extra safe, then they should disable
>> the cache.
>
> How many percent of the users will be able to judge that?
>
>> As long as FLUSH is implemented without problem, the only faulty part
>> is btrfs itself, and I haven't found any proof of either yet.
>
> But you have searched?
>
>>> 2) I find the location of the (only?) warning -dmesg- well hidden. I
>>> think it would be better to notify the user when creating the
>>> file-system.
>>
>> A notification on creating the volume and one when adding devices
>> (either via `device add` or via a replace operation) would indeed be
>> nice, but we should still keep the kernel log warning.
>
> Ok, so what would be the way to move forward on that? Would it help if I
> create an issue on https://bugzilla.kernel.org/ ?

No need. See comment below.

>>> 3) Even more, it would be good if btrfs would disable the write cache
>>> in that case, so that one does not need to rely on the user
>>
>> I would tend to disagree here. We should definitely _recommend_ this to
>> the user if we know there is no barrier support, but just doing it
>> behind their back is not a good idea.
>
> Well, there is some room between 'automatic' and 'behind their back'.
> E.g.:
> "Barriers are not supported by /dev/sda. Automatically disabling
> write-cache on mount. You can suppress this with the
> 'enable-cache-despite-no-barrier-support-I-know-what-I-am-doing' mount
> option" (maybe we can shorten the option).

There is no problem using the write cache as long as the device supports
flush. The SATA and NVMe protocols specify that all devices should
support flush, and as long as flush is supported, FUA can be emulated.

Thus the write cache is not a problem at all, as long as flush is
implemented correctly.
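
To illustrate the point, here is a minimal sketch (not btrfs-specific; the
file path is only an example) of all an application has to do to get a
durable write while the volatile write cache stays enabled. The assumption
is that the filesystem issues a device cache FLUSH as part of fsync(),
which is exactly the write barrier being discussed here:

/* Durable write on top of an enabled volatile write cache.
 * Assumes the filesystem (e.g. btrfs) flushes the device cache on
 * fsync(); the path below is just an example. */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
        const char buf[] = "important data\n";
        int fd = open("/mnt/btrfs/important.dat",
                      O_WRONLY | O_CREAT | O_TRUNC, 0644);

        if (fd < 0) {
                perror("open");
                return EXIT_FAILURE;
        }
        if (write(fd, buf, strlen(buf)) != (ssize_t)strlen(buf)) {
                perror("write");
                close(fd);
                return EXIT_FAILURE;
        }
        /* fsync() returns only after the filesystem has flushed the
         * device's volatile write cache, so the data survives a power
         * loss even though the cache was never disabled. */
        if (fsync(fd) < 0) {
                perror("fsync");
                close(fd);
                return EXIT_FAILURE;
        }
        close(fd);
        return EXIT_SUCCESS;
}

So the cost of durability is one flush per fsync(), not a permanently
disabled cache.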
>
>> There are also plenty of valid reasons to want to use the write cache
>> anyway.
>
> I cannot think of one. Who would sacrifice data integrity/potential
> total loss of the filesystem for speed?

No data integrity is lost, and performance is greatly improved with the
write cache.

Thanks,
Qu

>
>> As far as FUA/DPO, I know of exactly _zero_ devices that lie about
>> implementing it and don't.
> ...
>> but the fact that Linux used to not issue a FLUSH command to the disks
>> when you called fsync in userspace.
>
> Ok, thanks for that clarification.
>
> Greetings,
> Hendrik
>
> ------ Original Message ------
> From: "Austin S. Hemmelgarn"
> To: "Hendrik Friedel"; "Qu Wenruo"; linux-btrfs@vger.kernel.org
> Sent: 03.04.2019 20:44:09
> Subject: Re: btrfs and write barriers
>
>> On 2019-04-03 14:17, Hendrik Friedel wrote:
>>> Hello,
>>>
>>> thanks for your reply.
>>>
>>>>> 3) Even more, it would be good if btrfs would disable the write
>>>>> cache in that case, so that one does not need to rely on the user
>>>>
>>>> Personally speaking, if the user really believes it's the write
>>>> cache causing the problem or wants to be extra safe, then they
>>>> should disable the cache.
>>>
>>> How many percent of the users will be able to judge that?
>>>
>>>> As long as FLUSH is implemented without problem, the only faulty
>>>> part is btrfs itself, and I haven't found any proof of either yet.
>>>
>>> But you have searched?
>>>
>>>>> 2) I find the location of the (only?) warning -dmesg- well hidden.
>>>>> I think it would be better to notify the user when creating the
>>>>> file-system.
>>>>
>>>> A notification on creating the volume and one when adding devices
>>>> (either via `device add` or via a replace operation) would indeed
>>>> be nice, but we should still keep the kernel log warning.
>>>
>>> Ok, so what would be the way to move forward on that? Would it help
>>> if I create an issue on https://bugzilla.kernel.org/ ?
>>
>> The biggest issue is actually figuring out if the devices don't
>> support write barriers (which means no FLUSH or broken FLUSH on Linux,
>> not no FUA/DPO, because as long as the device properly implements
>> FLUSH (and most do), Linux will provide a FUA emulation which works
>> for write barriers). Once you've got that, it should be pretty
>> trivial to add to the messages.
>>
>>>>> 3) Even more, it would be good if btrfs would disable the write
>>>>> cache in that case, so that one does not need to rely on the user
>>>>
>>>> I would tend to disagree here. We should definitely _recommend_
>>>> this to the user if we know there is no barrier support, but just
>>>> doing it behind their back is not a good idea.
>>>
>>> Well, there is some room between 'automatic' and 'behind their back'.
>>> E.g.:
>>> "Barriers are not supported by /dev/sda. Automatically disabling
>>> write-cache on mount. You can suppress this with the
>>> 'enable-cache-despite-no-barrier-support-I-know-what-I-am-doing'
>>> mount option" (maybe we can shorten the option).
>>
>> And that's still 'behind the back' because it's a layering violation.
>> Even LVM and MD don't do this, and they have even worse issues than we
>> do because they aren't CoW.
>>
>>>> There are also plenty of valid reasons to want to use the write
>>>> cache anyway.
>>>
>>> I cannot think of one. Who would sacrifice data integrity/potential
>>> total loss of the filesystem for speed?
>>
>> There are quite a few cases where the risk of data loss _just doesn't
>> matter_, and any data that could be invalid is also inherently stale.
>> Some trivial examples:
>>
>> * /run on any modern Linux system. Primarily contains sockets used by
>> running services, PID files for daemons, and other similar things that
>> only matter for the duration of the current boot of the system. These
>> days, it's usually in-memory, but some people with really tight memory
>> constraints still use persistent storage for it to save memory.
>> * /tmp on any sane UNIX system. Similar case to the above, but usually
>> for stuff that only matters on the scale of session lifetimes, or even
>> just process lifetimes.
>> * /var/tmp on most Linux systems. Usually the same case as /tmp.
>> * /var/cache on any sane UNIX system. By definition, if the data here
>> is lost, it doesn't matter, as it only exists for performance reasons
>> anyway. Smart applications will even validate the files they put here,
>> so corruption isn't an issue either.
>>
>> There are bunches of other examples I could list, but all of them are
>> far more situational and application-specific.
>>
>>>> As far as FUA/DPO, I know of exactly _zero_ devices that lie about
>>>> implementing it and don't.
>>> ...
>>>> but the fact that Linux used to not issue a FLUSH command to the
>>>> disks when you called fsync in userspace.
>>>
>>> Ok, thanks for that clarification.
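
Coming back to the point above that the hard part is figuring out whether
a device's FLUSH is actually in play: here is a rough sketch of how to
check, from userspace, what the kernel has concluded about a device's
volatile write cache and native FUA support. The device name "sda" and
the sysfs attribute paths are only examples and assume a kernel recent
enough to expose them:

/* Report the block layer's view of a device's write cache and FUA
 * support via sysfs. "sda" is just an example device name. */
#include <stdio.h>
#include <string.h>

static void show(const char *dev, const char *attr)
{
        char path[256], value[64] = "";
        FILE *f;

        snprintf(path, sizeof(path), "/sys/block/%s/queue/%s", dev, attr);
        f = fopen(path, "r");
        if (!f) {
                printf("%s: not available\n", path);
                return;
        }
        if (fgets(value, sizeof(value), f))
                value[strcspn(value, "\n")] = '\0';
        fclose(f);
        printf("%s: %s\n", path, value);
}

int main(void)
{
        /* "write back" means the kernel treats the device as having a
         * volatile cache and will send FLUSH (emulating FUA when the
         * fua attribute reads 0); "write through" means no volatile
         * cache is assumed and flushes are not needed. */
        show("sda", "write_cache");
        show("sda", "fua");
        return 0;
}

If a device reports "write back" here, the barrier path (FLUSH, plus FUA
or its emulation) is what protects the metadata, and the write cache can
stay enabled.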