From: Vladimir Bashkirtsev <vladimir@bashkirtsev.com>
To: Tommi Virtanen <tv@inktank.com>
Cc: Josh Durgin <josh.durgin@inktank.com>,
	ceph-devel <ceph-devel@vger.kernel.org>
Subject: Re: Poor read performance in KVM
Date: Sat, 21 Jul 2012 02:23:25 +0930
Message-ID: <50098D05.2080704@bashkirtsev.com>
In-Reply-To: <CADvuQRG1z0YQezCoJhCNFnC+03UDv6H3aJPUWBcbBw3-iUnA8Q@mail.gmail.com>

On 21/07/2012 2:12 AM, Tommi Virtanen wrote:
> On Fri, Jul 20, 2012 at 9:17 AM, Vladimir Bashkirtsev
> <vladimir@bashkirtsev.com> wrote:
>> not running. So I ended up rebooting the hosts, and that's where the fun
>> began: btrfs failed to umount, and on boot-up it spat out "btrfs: free
>> space inode generation (0) did not match free space cache generation
>> (177431)". I had not started ceph; I made an attempt to umount and the
>> umount just froze. Another reboot: same thing. I rebooted the second host
>> and it came back with the same error. So in effect I was unable to mount
>> btrfs and read it: no wonder ceph was unable to run. Actually according to the mons ceph was
> The btrfs developers tend to be good about bug reports that severe --
> I think you should email that mailing list and ask if that sounds like
> known bug, and ask what information you should capture if it happens
> again (assuming the workload is complex enough that you can't easily
> capture/reproduce all of that).
Well... The workload was fairly high - not something that usually happens 
on MySQL. Our client keeps imagery in MySQL, and his system was 
regenerating images (it takes a hi-res image and produces five or six 
smaller images plus a watermark). The job runs imagemagick, which keeps 
its temporary data on disk (and to ceph it is not really temporary data - 
it is data which must be committed to the osds), and then innodb in MySQL 
stores the results - which of course creates a number of pages and so 
appears as random writes to the underlying file system. From what I have 
seen, the write traffic created by this process was in the TB range (my 
whole ceph cluster is just 3.3TB). So it was a considerable amount of 
change on the filesystem.

I guess if we start that process again we will end up with a similar 
result in a few days - but for some reason I don't want to try it on the 
production system :)

I can scavenge something from the logs and post it to the btrfs devs. 
Thanks for the tip.
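In the meantime, one workaround I have seen suggested for exactly that 
"free space cache generation" complaint is mounting once with btrfs's 
clear_cache option so the cache gets discarded and rebuilt. A hedged 
fstab sketch - the device and mount point below are made up, and I have 
not verified this on our boxes yet:

```
# /etc/fstab sketch - device and mount point are assumptions.
# clear_cache makes btrfs throw away and rebuild the free space cache
# on the next mount; it only needs to stay in place for one boot.
/dev/sdb1  /var/lib/osd0  btrfs  defaults,clear_cache  0  0
```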
>
>> But it leaves me with very final question: should we rely on btrfs at this
>> point given it is having such major faults? What if I will use well tested
>> by time ext4?
> You might want to try xfs. We hear/see problems with all three, but
> xfs currently seems to have the best long-term performance and
> reliability.
>
> I'm not sure if anyone's run detailed tests with ext4 after the
> xattrs-in-leveldb feature; before that, we ran into fs limitations.
That's what I was thinking: before xattrs-in-leveldb I did not even 
consider ext4 a viable alternative, but now it may be reasonable to give 
it a go. Or maybe even have a mix of osds backed by different file 
systems? What is the devs' opinion on this?
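Something like this is what I had in mind - a hedged ceph.conf sketch 
(option names as I understand them from the current docs, so please 
correct me if they are wrong for this release): omap-backed xattrs for 
the ext4 osds, with osds on different filesystems side by side.

```
; hypothetical ceph.conf fragment - section names and option spelling
; are assumptions; check the docs for your release.
[osd]
        ; keep xattrs in leveldb (omap) so ext4's small xattr
        ; limit does not bite the osds
        filestore xattr use omap = true

[osd.0]
        ; osd.0 backed by xfs
        osd mkfs type = xfs

[osd.1]
        ; osd.1 backed by ext4
        osd mkfs type = ext4
```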


Thread overview: 23+ messages
2012-07-15 13:13 Poor read performance in KVM Vladimir Bashkirtsev
2012-07-16  6:16 ` Josh Durgin
2012-07-18  5:46   ` Vladimir Bashkirtsev
2012-07-18 15:27     ` Josh Durgin
2012-07-19 10:46       ` Vladimir Bashkirtsev
2012-07-19 12:19       ` Vladimir Bashkirtsev
2012-07-19 15:52         ` Tommi Virtanen
2012-07-19 18:06           ` Calvin Morrow
2012-07-19 18:15             ` Mark Nelson
2012-07-20  5:24               ` Vladimir Bashkirtsev
2012-07-20  5:24             ` Vladimir Bashkirtsev
2012-07-20  5:20           ` Vladimir Bashkirtsev
     [not found]       ` <50080D9D.8010306@bashkirtsev.com>
2012-07-19 18:42         ` Josh Durgin
2012-07-20  5:31           ` Vladimir Bashkirtsev
2012-07-20 16:17           ` Vladimir Bashkirtsev
2012-07-20 16:42             ` Tommi Virtanen
2012-07-20 16:53               ` Mark Nelson
2012-07-20 16:53               ` Vladimir Bashkirtsev [this message]
2012-07-29 15:31               ` Vladimir Bashkirtsev
2012-07-29 16:13                 ` Mark Nelson
2012-07-18 15:34     ` Josh Durgin
2012-07-18  5:49   ` Vladimir Bashkirtsev
2012-07-18  5:51   ` Vladimir Bashkirtsev
