From: Kai Krakow <hurikhan77@gmail.com>
To: linux-btrfs@vger.kernel.org
Cc: systemd-devel@lists.freedesktop.org
Subject: Re: Slow startup of systemd-journal on BTRFS
Date: Sat, 14 Jun 2014 12:59:31 +0200
Message-ID: <jhmt6b-j4j.ln1@hurikhan77.spdns.de>
In-Reply-To: pan$625ac$a8aa7477$d0179ebe$c66ba817@cox.net

Duncan <1i5t5.duncan@cox.net> wrote:

> As they say, "Whoosh!"
> 
> At least here, I interpreted that remark as primarily sarcastic
> commentary on the systemd devs' apparent attitude, which can be
> (controversially) summarized as: "Systemd doesn't have problems because
> it's perfect.  Therefore, any problems you have with systemd must instead
> be with other components which systemd depends on."

Come on, sorry, but this is FUD. Really... ;-)

> IOW, it's a btrfs problem now in practice, not because it is so in a
> technical sense, but because systemd defines it as such and is unlikely
> to budge, so the only way to achieve progress is for btrfs to deal with
> it.

I think that systemd is even one of the early supporters of btrfs because it
will defragment readahead files on boot from btrfs. I'd suggest the problem
is to be found in the different semantics of COW filesystems. And if
someone loudly complains to the systemd developers about how bad they are at
doing their stuff - hmm, well, I would be disappointed/offended, too, as a
programmer, because a lot of very well done work has been put into systemd,
and I'd start ignoring such people. In Germany we have a saying for this:
"Wie man in den Wald hineinruft, so schallt es heraus." [1] (roughly: what
you shout into the forest is what echoes back). They are doing many things
right that had not been adapted to modern systems in the last twenty years
(or so) by the legacy init systems.

So let's start with my journals, on btrfs:

$ sudo filefrag *
system@0004fad12dae7676-98627a3d7df4e35e.journal~: 2 extents found
system@0004fae8ea4b84a4-3a2dc4a93c5f7dc9.journal~: 2 extents found
system@806cd49faa074a49b6cde5ff6fca8adc-000000000008e4cc-0004f82580cdcb45.journal: 5 extents found
system@806cd49faa074a49b6cde5ff6fca8adc-0000000000097959-0004f89c2e8aff87.journal: 5 extents found
system@806cd49faa074a49b6cde5ff6fca8adc-00000000000a166d-0004f98d7e04157c.journal: 5 extents found
system@806cd49faa074a49b6cde5ff6fca8adc-00000000000aad59-0004fa379b9a1fdf.journal: 5 extents found
system@ec16f60db38f43619f8337153a1cc024-0000000000000001-0004fae8e5057259.journal: 5 extents found
system@ec16f60db38f43619f8337153a1cc024-00000000000092b1-0004fb59b1d034ad.journal: 5 extents found
system.journal: 9 extents found
user-500@e4209c6628ed4a65954678b8011ad73f-0000000000085b7a-0004f77d25ebba04.journal: 2 extents found
user-500@e4209c6628ed4a65954678b8011ad73f-000000000008e7fb-0004f83c7bf18294.journal: 2 extents found
user-500@e4209c6628ed4a65954678b8011ad73f-0000000000097fe4-0004f8ae69c198ca.journal: 2 extents found
user-500@e4209c6628ed4a65954678b8011ad73f-00000000000a1a7e-0004f9966e9c69d8.journal: 2 extents found
user-500.journal: 2 extents found

I don't think these values are too bad, eh?
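
For anyone who wants to compare their own numbers, here is a minimal sketch
that sorts filefrag output by extent count. The function name
summarize_extents is hypothetical, and it assumes filefrag prints one
"NAME: N extents found" line per file, as in the listing above.

```shell
#!/bin/sh
# Turn filefrag lines ("NAME: N extents found") into "N NAME",
# sorted with the most fragmented file first.
summarize_extents() {
    # s\{0,1\} also accepts the singular "1 extent found".
    sed -n 's/^\(.*\): \([0-9][0-9]*\) extents\{0,1\} found$/\2 \1/p' | sort -rn
}

# Only call filefrag when file names were passed on the command line.
if [ "$#" -gt 0 ]; then
    filefrag "$@" | summarize_extents
fi
```

Saved under a name of your choice, it would be called with the journal
files as arguments (usually as root, since filefrag needs to read them).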

Well, how did I accomplish that?

First, I've set the journal directories nocow. Of course, systemd should do
this by default. I'm not sure if this is a packaging or a systemd code issue,
tho. But I think the systemd devs agree that on COW filesystems the journal
directories should be set nocow. After all, the journal is a transactional
database - it does not need COW protection at all costs. And I think it has
its own checksumming protection. So why let systemd bother with that? A lot
of other software has the same semantic problems with btrfs, too (e.g.
MySQL), and nobody shouts at the "inabilities" of those programmers. So why
at systemd? Just because it's intrusive by nature, being a radically and
newly designed init system, and thus requires some learning by its
users/admins/packagers? Really? Come on... As an admin and/or packager you
have to keep up with current technologies and developments anyway. It's only
important to hide the details from the users.
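
A minimal sketch of the nocow step, assuming the journal lives under
/var/log/journal (adjust JOURNAL_DIR to the directory that actually holds
the journal files). The helper names run and set_nocow are hypothetical,
and the script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Mark a directory NOCOW so that files created in it afterwards inherit
# the attribute. Defaults to a dry run that only prints the commands.
JOURNAL_DIR="${JOURNAL_DIR:-/var/log/journal}"   # assumed location
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

set_nocow() {
    # chattr +C only takes effect for new or empty files, so existing
    # journal files stay COW until they are rotated or recreated.
    run chattr +C "$1"
    # Show the resulting attributes of the directory itself.
    run lsattr -d "$1"
}

set_nocow "$JOURNAL_DIR"
```

Because the attribute is not applied to existing non-empty files, the
effect only shows up once new journal files have been created.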

Back to the extent counts: What I did next was to implement a defrag job
that regularly defrags the journal (actually, the complete log directory, as
other log files suffer from the same problem):

$ cat /usr/local/sbin/defrag-logs.rb 
#!/bin/sh
exec btrfs filesystem defragment -czlib -r /var/log

It can easily be converted into a timer job with systemd. This is left as an
exercise to the reader.
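
One possible way to do that, as a sketch only: the unit names
defrag-logs.service and defrag-logs.timer and the weekly schedule are
assumptions, not something from my setup. The script writes the units to a
scratch directory unless UNIT_DIR points elsewhere.

```shell
#!/bin/sh
# Generate a service/timer pair that runs the defrag script once a week.
# Set UNIT_DIR=/etc/systemd/system to install the units for real.
UNIT_DIR="${UNIT_DIR:-$(mktemp -d)}"

cat > "$UNIT_DIR/defrag-logs.service" <<'EOF'
[Unit]
Description=Defragment /var/log on btrfs

[Service]
Type=oneshot
ExecStart=/usr/local/sbin/defrag-logs.rb
EOF

cat > "$UNIT_DIR/defrag-logs.timer" <<'EOF'
[Unit]
Description=Run defrag-logs.service weekly

[Timer]
OnCalendar=weekly
Persistent=true

[Install]
WantedBy=timers.target
EOF

echo "units written to $UNIT_DIR"
echo "then: systemctl daemon-reload && systemctl enable defrag-logs.timer && systemctl start defrag-logs.timer"
```

Persistent=true makes systemd catch up on a run that was missed while the
machine was powered off.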

BTW: Actually, that job isn't currently executed on my system which makes 
the numbers above pretty impressive... However, autodefrag is turned on 
which may play into the mix. I'm not sure. I stopped automatically running 
those defrag jobs a while ago (I have a few more).
 
> An arguably fairer and more impartial assessment of this particular
> situation suggests that neither btrfs, which as a COW-based filesystem,
> like all COW-based filesystems has the existing-file-rewrite as a major
> technical challenge that it must deal with /somehow/, nor systemd, which
> in choosing to use fallocate is specifically putting itself in that
> existing-file-rewrite class, are entirely at fault.

This challenge affects not only systemd but also a lot of other packages
which do not play nice with btrfs semantics. But - as you correctly write -
you cannot point your finger at just one party. FS and user space have to
come together to evaluate and fix the problems on both sides.

> But that doesn't matter if one side refuses to budge, because then the
> other side must do so regardless of where the fault was, if there is to
> be any progress at all.

> Meanwhile, I've predicted before and do so here again, that as btrfs
> moves toward mainstream and starts supplanting ext* as the assumed Linux
> default filesystem, some of these problems will simply "go away", because
> at that point, various apps are no longer optimized for the assumed
> default filesystem, and they'll either be patched at some level (distro
> level if not upstream) to work better on the new default filesystem, or
> will be replaced by something that does.  And if neither upstream nor
> distro level does that patching, then at some point, people are going to
> find that said distro performs worse than other distros that do that
> patching.

> Another alternative is that distros will start setting /var/log/journal
> NOCOW in their setup scripts by default when it's btrfs, thus avoiding
> the problem.  (Altho if they do automated snapshotting they'll also have
> to set it as its own subvolume, to avoid the first-write-after-snapshot-
> is-COW problem.)  Well, that, and/or set autodefrag in the default mount
> options.

> Meanwhile, there's some focus on making btrfs behave better with such
> rewrite-pattern files, but while I think the problem can be made /some/
> better, hopefully enough that the defaults bother far fewer people in far
> fewer cases, I expect it'll always be a bit of a sore spot because that's
> just how the technology works, and as such, setting NOCOW for such files
> and/or using autodefrag will continue to be recommended for an optimized
> setup.


