* ceph performance under xen?
From: Brian Edmonds @ 2012-06-28 14:27 UTC
  To: ceph-devel

I've installed a little, four-node Ceph (0.47.2) cluster using Xen
virtual machines for testing, and when I run bonnie against a (kernel
driver) mount of it, it seems to be somewhat flaky (disturbing log
messages, occasional binary deaths), and very slow.  I expected some
slowness, with all the virtualization going on, but when bonnie is
running, simply doing a cd into the mount takes minutes to complete.
Is this to be expected, or is it worth spending time to investigate?
I intend to deploy this on bare metal eventually, but was hoping to
get some operational experience with Ceph before investing the money
in that.
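
(To put a rough number on the stalls, I've been timing directory
listings with a trivial script along these lines -- /mnt/ceph is just
where I happen to mount the kernel client, so adjust to taste:

    #!/usr/bin/env python
    # Crude probe: time a directory listing on the CephFS mount every
    # few seconds while bonnie runs, to see how long metadata ops stall.
    import os, time

    MOUNT = "/mnt/ceph"   # wherever the kernel client is mounted

    while True:
        start = time.time()
        entries = os.listdir(MOUNT)   # roughly what an interactive cd/ls hits
        print("%d entries in %.1fs" % (len(entries), time.time() - start))
        time.sleep(5)

During a bonnie run the listing regularly takes minutes rather than
milliseconds.)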

For background, I'm investigating Ceph as a "redundant array of
inexpensive machines" replacement for RAID arrays at home.  I'm
planning eventually to deploy something (even if I have to write it
myself) that works essentially like Google's GFS, with a POSIX,
mountable interface, running on the cheapest machines I can assemble,
so there should be no failures that cannot be dealt with through
obvious and cheap hardware replacement.

Brian.

* Re: ceph performance under xen?
From: Gregory Farnum @ 2012-06-29 18:55 UTC
  To: Brian Edmonds; +Cc: ceph-devel

On Thu, Jun 28, 2012 at 7:27 AM, Brian Edmonds <mornir@gmail.com> wrote:
> I've installed a little, four-node Ceph (0.47.2) cluster using Xen
> virtual machines for testing, and when I run bonnie against a (kernel
> driver) mount of it, it seems to be somewhat flaky (disturbing log
> messages, occasional binary deaths), and very slow.  I expected some
> slowness, with all the virtualization going on, but when bonnie is
> running, simply doing a cd into the mount takes minutes to complete.
> Is this to be expected, or is it worth spending time to investigate?
> I intend to deploy this on bare metal eventually, but was hoping to
> get some operational experience with Ceph before investing the money
> in that.

So right now you're using the Ceph filesystem, rather than RBD, right?
What processes do you have running on which machines/VMs? What's the
CPU usage on the ceph-mds process?
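
(If you don't have anything fancier set up, a quick way to sample that
is to poll ps for the daemons; a rough sketch, assuming the default
daemon names and an arbitrary 10-second interval:

    #!/usr/bin/env python
    # Sample %CPU and resident memory of the Ceph daemons via ps.
    import subprocess, time

    DAEMONS = ["ceph-mon", "ceph-osd", "ceph-mds"]

    while True:
        for name in DAEMONS:
            # -C selects by command name; '=' suffixes suppress headers
            ps = subprocess.Popen(["ps", "-C", name, "-o", "%cpu=,rss="],
                                  stdout=subprocess.PIPE)
            for line in ps.communicate()[0].decode().splitlines():
                cpu, rss = line.split()
                print("%-10s cpu=%s%% rss=%skB" % (name, cpu, rss))
        time.sleep(10)

top on each node will tell you much the same thing, of course.)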

And a warning: the filesystem, while nifty, is not yet
production-ready — it works great for some use cases but there are
some serious known bugs that aren't very hard to trigger, as we've
been doing a lot of QA on RADOS and its associated systems (which the
filesystem depends on) at the expense of the filesystem itself.

-Greg

* Re: ceph performance under xen?
From: Brian Edmonds @ 2012-06-29 20:54 UTC
  To: ceph-devel

On Fri, Jun 29, 2012 at 11:55 AM, Gregory Farnum <greg@inktank.com> wrote:
> So right now you're using the Ceph filesystem, rather than RBD, right?

Right, CephFS.  I'm actually not even very clear on what RBD is, and
how one might use it, but I'm sure I'll understand that in the
fullness of time.  I came to Ceph from a background of wanting to
replace my primary RAID array with a RAIM (redundant array of
inexpensive machines) cluster, and a co-worker suggested Ceph as a
possibility.

> What processes do you have running on which machines/VMs? What's the
> CPU usage on the ceph-mds process?

I have four VMs running Debian testing, with a dom0 on a recent 6-core
AMD cpu (I forget which one).  Each VM has two virtual cores, 1GB of
RAM, and a 500GB virtual disk partition formatted with btrfs, used for
both data and journal.  These are somewhat smaller than recommended,
but in the right ballpark, and the filesystem has so far not been used
to store any significant amount of data.  (Mostly just bonnie tests.)

All four VMs are running OSD, the first three are running MON, and the
first two MDS.  I mostly watch top on the first machine (if there's a
better tool for watching a cluster, please let me know), and it shows
the majority of the CPU time in wait, with the Ceph jobs popping up
from time to time with a fraction of a percent, sometimes up into
single digits.  It's also not uncommon to see a lot of idle time.
When I get some time I'm going to wrap some sort of collector around
the log files and feed the data into OpenTSDB.
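
(The log scraping will take some thought; as a first cut I'll probably
just sample /proc and push straight into OpenTSDB's plain-text "put"
interface on port 4242, something like the sketch below -- the tsdb
hostname and metric names are made up:

    #!/usr/bin/env python
    # First-cut collector: sample iowait and load average and push them
    # to OpenTSDB ("put <metric> <timestamp> <value> <tag>=<value>").
    import socket, time

    TSD = ("tsdb.home.lan", 4242)        # placeholder address
    HOST = socket.gethostname()

    def cpu_fields():
        # first line of /proc/stat: cpu user nice system idle iowait ...
        return [float(x) for x in open("/proc/stat").readline().split()[1:]]

    sock = socket.create_connection(TSD)
    prev = cpu_fields()
    while True:
        time.sleep(15)
        cur = cpu_fields()
        delta = [c - p for c, p in zip(cur, prev)]
        prev = cur
        iowait = 100.0 * delta[4] / (sum(delta) or 1.0)  # field 5 is iowait
        load1 = float(open("/proc/loadavg").read().split()[0])
        now = int(time.time())
        msg = ("put sys.cpu.iowait %d %.2f host=%s\n" % (now, iowait, HOST) +
               "put sys.load.1min %d %.2f host=%s\n" % (now, load1, HOST))
        sock.sendall(msg.encode())

Per-daemon CPU and whatever the ceph logs can provide can come later
once the plumbing works.)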

> And a warning: the filesystem, while nifty, is not yet
> production-ready — it works great for some use cases but there are
> some serious known bugs that aren't very hard to trigger, as we've
> been doing a lot of QA on RADOS and its associated systems (which the
> filesystem depends on) at the expense of the filesystem itself.

Good to know.  For now I'm just playing, but I eventually want to have
a distributed filesystem that I can use.  I'm curious to see how Ceph
does when deployed on real hardware, which I expect to have in the
next couple weeks.  Very simple stuff compared to what I see others on
the list discussing: a few dual-core Atom systems with a 1TB drive
and 4GB of RAM each, all on a 1Gb switch.

Brian.

* Re: ceph performance under xen?
From: Gregory Farnum @ 2012-06-29 21:06 UTC
  To: Brian Edmonds; +Cc: ceph-devel

On Fri, Jun 29, 2012 at 1:54 PM, Brian Edmonds <mornir@gmail.com> wrote:
> On Fri, Jun 29, 2012 at 11:55 AM, Gregory Farnum <greg@inktank.com> wrote:
>> So right now you're using the Ceph filesystem, rather than RBD, right?
>
> Right, CephFS.  I'm actually not even very clear on what RBD is, and
> how one might use it, but I'm sure I'll understand that in the
> fullness of time.  I came to Ceph from a background of wanting to
> replace my primary RAID array with a RAIM (redundant array of
> inexpensive machines) cluster, and a co-worker suggested Ceph as a
> possibility.
>
>> What processes do you have running on which machines/VMs? What's the
>> CPU usage on the ceph-mds process?
>
> I have four VMs running Debian testing, with a dom0 on a recent 6-core
> AMD cpu (I forget which one).  Each VM has two virtual cores, 1GB of
> RAM, and a 500GB virtual disk partition formatted with btrfs, used for
> both data and journal.  These are somewhat smaller than recommended,
> but in the right ballpark, and the filesystem has so far not been used
> to store any significant amount of data.  (Mostly just bonnie tests.)
>
> All four VMs are running OSD, the first three are running MON, and the
> first two MDS.  I mostly watch top on the first machine (if there's a
> better tool for watching a cluster, please let me know), and it shows
> the majority of the CPU time in wait, with the Ceph jobs popping up
> from time to time with a fraction of a percent, sometimes up into
> single digits.  It's also not uncommon to see a lot of idle time.
> When I get some time I'm going to wrap some sort of collector around
> the log files and feed the data into OpenTSDB.

Okay, there are two things I'd do here. First, create a cluster that
only has one MDS — the multi-MDS system is significantly less stable.
Second, you've got 3 monitors doing frequent fsyncs, and 4 OSDs doing
frequent syncs, which are all funneling into a single disk. That's
going to go poorly no matter what you're doing. ;) Try doing a smaller
cluster with just one monitor, one or two OSDs, and one MDS.
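
For what it's worth, a minimal ceph.conf for that kind of layout looks
roughly like the sketch below. The hostnames, addresses and paths are
placeholders, and I'm writing the option names from memory, so check
them against the docs before using it:

    [global]
        ; assumes a btrfs filesystem is already mounted at /data on each VM
        osd data = /data/$name
        osd journal = /data/$name/journal
        osd journal size = 1000       ; MB
        mon data = /data/$name

    [mon.a]
        host = vm1
        mon addr = 192.168.0.11:6789

    [mds.a]
        host = vm1

    [osd.0]
        host = vm2

    [osd.1]
        host = vm3

With only one monitor and two OSDs there are far fewer daemons syncing
to the shared disk underneath, and it's obvious which VM is doing what.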


>> And a warning: the filesystem, while nifty, is not yet
>> production-ready — it works great for some use cases but there are
>> some serious known bugs that aren't very hard to trigger, as we've
>> been doing a lot of QA on RADOS and its associated systems (which the
>> filesystem depends on) at the expense of the filesystem itself.
>
> Good to know.  For now I'm just playing, but I eventually want to have
> a distributed filesystem that I can use.  I'm curious to see how Ceph
> does when deployed on real hardware, which I expect to have in the
> next couple weeks.  Very simple stuff compared to what I see others on
> the list discussing: a few dual-core Atom systems with a 1TB drive
> and 4GB of RAM each, all on a 1Gb switch.

* Re: ceph performance under xen?
From: Brian Edmonds @ 2012-06-29 21:14 UTC
  To: Gregory Farnum; +Cc: ceph-devel

On Fri, Jun 29, 2012 at 2:06 PM, Gregory Farnum <greg@inktank.com> wrote:
> Okay, there are two things I'd do here. First, create a cluster that
> only has one MDS — the multi-MDS system is significantly less stable.

Ok, will do.

> Second, you've got 3 monitors doing frequent fsyncs, and 4 OSDs doing
> frequent syncs, which are all funneling into a single disk. That's
> going to go poorly no matter what you're doing. ;) Try doing a smaller
> cluster with just one monitor, one or two OSDs, and one MDS.

Oh, I missed that part.  The disks on the dom0 are actually logical
volumes on top of a (software) RAID5 with four physical disks.  That
may, of course, not be a better situation. =)

In any case, I'll rebuild the current four machines into one MON, one
MDS and two OSDs and see how that goes.  Thanks for the advice.

Brian.

* Re: ceph performance under xen?
From: Brian Edmonds @ 2012-06-30 17:02 UTC
  To: ceph-devel

On Fri, Jun 29, 2012 at 2:14 PM, Brian Edmonds <mornir@gmail.com> wrote:
> In any case, I'll rebuild the current four machines into one MON, one
> MDS and two OSDs and see how that goes.  Thanks for the advice.

Just rebuilt as described, and it is significantly more stable.
Changing and listing directories is essentially snappy now, bonnie++
completed without issue, and I can run ceph commands with reasonable
latency.  (Previously they would block for sometimes minutes at a time
before responding.)  I'd obviously prefer more redundancy, but I
expect I can live with this sort of arrangement for the present while
I'm playing around with it.

Thanks for the help.  I should have all the parts here to build the
bare-metal cluster the week after next.  In the meantime I'm going to
play with some operational issues so I'm all ready to go with the real
thing.

Brian.
