On Sun, Apr 19, 2015 at 05:10:30PM +0200, Martin Steigerwald wrote:
> On Sunday, 19 April 2015, 22:31:02, Craig Ringer wrote:
> > On 19 April 2015 at 22:28, Martin Steigerwald wrote:
> > > On Sunday, 19 April 2015, 21:20:11, Craig Ringer wrote:
> > >> Hi all
> > >
> > > Hi Craig,
> > >
> > >> I'm looking into the advisability of running PostgreSQL on BTRFS, and
> > >> after looking at the FAQ there's something I'm hoping you could
> > >> clarify.
> > >>
> > >> The wiki FAQ says:
> > >>
> > >> "Btrfs does not force all dirty data to disk on every fsync or O_SYNC
> > >> operation, fsync is designed to be fast."
> > >>
> > >> Is that wording intended narrowly, to contrast with ext3's nasty
> > >> habit of flushing *all* dirty blocks for the entire file system
> > >> whenever anyone calls fsync()? Or is it intended broadly, to say that
> > >> btrfs's fsync won't necessarily flush all data blocks (just metadata)?
> > >>
> > >> Is that statement still true in recent BTRFS versions (3.18, etc)?
> > >
> > > I don't know, so I'll leave that for others to answer. I always assumed
> > > a strong fsync() guarantee, as in "it's on disk", with BTRFS. So I am
> > > interested in that as well.
> > >
> > > But for databases, did you consider the copy-on-write fragmentation
> > > BTRFS will give? Even with autodefrag, afaik it is not recommended to
> > > use it for large databases, on rotating media at least.
> >
> > I did, and any testing would need to look at the efficacy of the
> > chattr +C option on the database directory tree.
> >
> > PostgreSQL is itself copy-on-write (because of multi-version
> > concurrency control), so it doesn't make much sense to have the FS
> > doing another layer of COW.
> >
> > I'm curious as to whether +C has any effect on BTRFS's durability, too.
>
> You will lose the ability to snapshot that directory tree then.

No, you won't. The +C attribute still allows snapshotting and reflink copies.
However, after the snapshot, writes to either copy will result in that
copy being CoWed. (Specifically, a write to an extent of a +C file will
result in a CoW operation while that extent has more than one reference;
once there is only one reference, writes will not be CoWed again.)

The practical upshot of this is that every snapshot of, and subsequent
writes to, a +C file will introduce fragmentation in the same way that
writes to a non-+C file would.

You also have a disadvantage with +C that you lose the checksumming
features of the FS, and hence the self-healing properties if you're
running with btrfs-native RAID.

Hugo.

-- 
Hugo Mills             | Nothing right in my left brain. Nothing left in my
hugo@... carfax.org.uk | right brain.
http://carfax.org.uk/  |
PGP: E2AB1DE4          |
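
For anyone who finds the reference-count rule above easier to follow as
code, here is a toy model of it. It is only a sketch of the behaviour
described in this mail, not btrfs code: Extent, File, snapshot() and
write() are hypothetical names made up for illustration, and a real file
has many extents rather than one.

```python
"""Toy model of the +C (nodatacow) write rule described in this mail.

Not btrfs code: Extent, File, snapshot() and write() are hypothetical
names invented purely for illustration.
"""
from dataclasses import dataclass


@dataclass
class Extent:
    data: str
    refs: int = 1  # number of files (original + snapshots) sharing it


@dataclass
class File:
    extent: Extent
    cow_writes: int = 0  # writes that had to allocate a new extent


def snapshot(f: File) -> File:
    """Share the existing extent with a new file; no data is copied."""
    f.extent.refs += 1
    return File(extent=f.extent)


def write(f: File, data: str) -> None:
    """Write to a +C file: CoW if the extent is shared, else in place."""
    if f.extent.refs > 1:
        # Shared extent: drop our reference and write to a new extent.
        f.extent.refs -= 1
        f.extent = Extent(data=data)
        f.cow_writes += 1
    else:
        # Sole reference: overwrite in place, nothing new is allocated.
        f.extent.data = data


db = File(Extent("v1"))
snap = snapshot(db)  # the extent now has two references
write(db, "v2")      # shared, so this write is CoWed
write(db, "v3")      # db's new extent has one reference: in place
```

In this model only the first write after the snapshot allocates a new
extent, which is the point at which a +C file picks up fragmentation.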