From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from tartarus.angband.pl ([89.206.35.136]:58243 "EHLO tartarus.angband.pl" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751619AbdDKJzz (ORCPT ); Tue, 11 Apr 2017 05:55:55 -0400
Received: from kilobyte by tartarus.angband.pl with local (Exim 4.88) (envelope-from ) id 1cxsWe-0001oc-QH for linux-btrfs@vger.kernel.org; Tue, 11 Apr 2017 11:55:52 +0200
Date: Tue, 11 Apr 2017 11:55:52 +0200
From: Adam Borowski
To: linux-btrfs@vger.kernel.org
Subject: Re: btrfs filesystem keeps allocating new chunks for no apparent reason
Message-ID: <20170411095552.o5b4wysjqlbp57xa@angband.pl>
References: <4532f6ee-2a6e-412a-7230-edb76735d55f@mendix.com>
 <07a7f59e-64e0-4d09-5d32-01bc933fe38d@gmail.com>
 <20170410144533.664fc304@jupiter.sol.kaishome.de>
 <5488ea5a-b41c-5987-e664-ec17cf2d5e01@gmail.com>
 <20170410184444.08ced097@jupiter.sol.local>
 <20170410185437.235b3b86@jupiter.sol.kaishome.de>
 <7ea65b63-d399-c049-d466-681c1df2d025@gmail.com>
 <20170410201842.216893be@jupiter.sol.kaishome.de>
 <20170411060119.65b34774@jupiter.sol.kaishome.de>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
In-Reply-To: <20170411060119.65b34774@jupiter.sol.kaishome.de>
Sender: linux-btrfs-owner@vger.kernel.org
List-ID:

On Tue, Apr 11, 2017 at 06:01:19AM +0200, Kai Krakow wrote:
> Yes, I know all this. But I don't see why you still want noatime or
> relatime if you use lazytime, except for super-optimizing. Lazytime
> gives you POSIX conformity for a problem that the other options only
> tried to solve.

(Besides lazytime also working on mtime, and, technically, ctime.)

First: atime, in any form, murders snapshots.  On any filesystem that has
them, not just btrfs -- I've tested zfs and LVM snapshots, there's also
qcow2/vdi and so on.  On all of them, every single read-everything
operation costs you 5% disk space.  For a _read_ operation!

I've tested a /usr-y mix of files, for consistency with the guy who
mentioned this problem first.
Your mileage will vary depending on whether you store 100GB disk images
or a news spool.

Read-everything is quite rare, but most systems have at least one
stat-everything cronjob.  That touches only diratime, but that's still
1-in-11 inodes (remarkably consistent: I've checked a few machines with
drastically different purposes, and somehow the min was 10, max 12).

And no, marking snapshots as ro doesn't help: reading the live version
still breaks CoW.

Second: atime murders media with limited write endurance.  Modern SSDs
can cope well, but I for one work a lot with SD and eMMC.  Every single
SoC image I've seen uses noatime for this reason.

Third: relatime/lazytime don't eliminate the performance cost.  They fix
only frequently read files -- if you have a big filesystem where you read
a lot but individual files tend to be read rarely, relatime is as bad as
strictatime, and lazytime actually worse.  Both will do an unnecessary
write of all inodes.

Fourth: why?  Besides being POSIXLY_CORRECT, what do you actually gain
from atime?  I can think only of:

* new mail notification with mbox.  Just patch the mail reader to manually
  futimens(..., {UTIME_NOW,UTIME_OMIT}), it has no extra cost on !noatime
  mounts.  I've personally done so for mutt, the updated version will ship
  in Debian stretch; you can patch other mail readers although they tend
  to be rarely used in conjunction with shell access (and thus they have
  no need for atime at all).
* Debian's popcon's "vote" field.  Use "inst", and there's no gain from
  popcon for you personally.
* some intrusion detection forensics (broken by open(..., O_NOATIME))

Conclusion: death to atime!
-- 
⢀⣴⠾⠻⢶⣦⠀ Meow!
⣾⠁⢠⠒⠀⣿⡁
⢿⡄⠘⠷⠚⠋⠀ Collisions shmolisions, let's see them find a collision or second
⠈⠳⣄⠀⠀⠀⠀ preimage for double rot13!
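
For the mbox case above, here is a minimal sketch of what the
futimens(..., {UTIME_NOW,UTIME_OMIT}) call could look like in a mail
reader.  The helper name mark_mailbox_read() is made up for illustration
and is not taken from the mutt patch; the sketch also assumes the reader
already holds an open file descriptor for the mailbox.

```c
#define _POSIX_C_SOURCE 200809L  /* ask for futimens() and UTIME_* */
#include <fcntl.h>
#include <sys/stat.h>            /* futimens(), UTIME_NOW, UTIME_OMIT */
#include <time.h>                /* struct timespec */

/*
 * Hypothetical helper (name is an assumption, not from any real mail
 * reader): explicitly stamp atime with the current time and leave mtime
 * alone, so "mtime > atime" style new-mail checks keep working even when
 * the filesystem is mounted noatime.
 *
 * Returns 0 on success, -1 on error with errno set, like futimens().
 */
int mark_mailbox_read(int fd)
{
    const struct timespec times[2] = {
        { .tv_sec = 0, .tv_nsec = UTIME_NOW  },  /* times[0]: atime = now   */
        { .tv_sec = 0, .tv_nsec = UTIME_OMIT },  /* times[1]: mtime as-is   */
    };

    return futimens(fd, times);
}
```

A reader would call mark_mailbox_read(fd) once after it has shown the
mailbox to the user.  Explicit timestamp updates like this one are not
suppressed by the noatime mount option, which only turns off the implicit
update on every read.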