* hardware heterogeneous in same pool
From: Bruno Carvalho @ 2018-10-03 22:09 UTC
  To: ceph-devel, ceph-users

Hi Cephers, I would like to know how you are growing the cluster.

Do you use dissimilar hardware in the same pool, or create a separate
pool for each hardware group?

What problems would I run into using different hardware (CPU, memory,
disk) in the same pool?

Could someone share their experience with OpenStack?

Regards,

Bruno Carvalho


* Re: hardware heterogeneous in same pool
From: Jonathan D. Proulx @ 2018-10-03 22:34 UTC
  To: Bruno Carvalho; +Cc: ceph-devel, ceph-users

On Wed, Oct 03, 2018 at 07:09:30PM -0300, Bruno Carvalho wrote:
:Hi Cephers, I would like to know how you are growing the cluster.
:
:Do you use dissimilar hardware in the same pool, or create a separate
:pool for each hardware group?
:
:What problems would I run into using different hardware (CPU, memory,
:disk) in the same pool?

I've been growing with new hardware in old pools.

Due to the way RBD gets smeared across the disks, your performance is
almost always bottlenecked by the slowest storage location.

If you're just adding slightly newer, slightly faster hardware this is
OK, as most of the performance gain in that case comes from spreading
wider, not so much from individual drive performance.

But if you are adding a faster technology, like going from spinning
disk to SSD, you do want to think about how to transition.

I recently added SSDs to a previously all-HDD cluster (well, HDD data
with SSD WAL/DB).  For this I did fiddle with CRUSH rules.  First I made
the existing rules require HDD-class devices, which should have been a
no-op in my mind but actually moved 90% of my data.  The folks at CERN
made a similar discovery before me and, I think, even worked out a way
to avoid it; see
http://lists.ceph.com/pipermail/ceph-large-ceph.com/2018-June/000113.html
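
For reference, one common way to do the equivalent is roughly this
(the rule and pool names here are just placeholders):

    # new replicated rule restricted to HDD-class OSDs
    ceph osd crush rule create-replicated replicated_hdd default host hdd

    # point an existing pool at it; this is the step that can move data
    ceph osd pool set volumes crush_rule replicated_hdd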

After that I made new rules that took one SSD and two HDDs for each
replica set (in addition to spreading across racks or servers or
whatever), and after applying the new rule to the pools I use for Nova
ephemeral storage and Cinder volumes I set the SSD OSDs to have high
"primary affinity" and the HDDs to have low "primary affinity".

In the end this means the SSDs serve reads and writes, while writes to
the HDD replicas are buffered by the SSD WAL, so both reads and writes
are relatively fast (we'd previously been suffering on reads due to
IO load).

I left Glance images on HDD only, as those don't require much
performance in my world; same with RGW object storage, though for some
that may be performance sensitive.

The plan forward is more SSDs to replace HDDs, probably by first
getting enough to transition the ephemeral drives, then a set to move
block storage, then the rest over the next year or two.

The mixed SSD/HDD setup was a big win for us though, so we're happy
with it for now.

Scale matters with this, so for context we have:
245 OSDs in 12 servers
627 TiB RAW storage (267 TiB used)
19.44 M objects


hope that helps,
-Jon


* Re: hardware heterogeneous in same pool
From: Janne Johansson @ 2018-10-04  7:50 UTC
  To: Bruno Carvalho
  Cc: ceph-devel, Ceph Users



On Thu, 4 Oct 2018 at 00:09, Bruno Carvalho wrote:

> Hi Cephers, I would like to know how you are growing the cluster.
> Do you use dissimilar hardware in the same pool, or create a separate
> pool for each hardware group?
> What problems would I run into using different hardware (CPU, memory,
> disk) in the same pool?


I don't think CPU and RAM (and other hardware-related things like HBA
controller card brand) matter a lot; more is always nicer, but as long
as you don't add worse machines, as Jonathan wrote, you should not see
any degradation.

What you might want to look out for is whether the new disks are very
uneven compared to the old setup: if you used to have servers with
10x2TB drives and suddenly add one with 2x10TB, things might become
very unbalanced, since those differences will not be handled seamlessly
by the CRUSH map.
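
A quick way to spot that kind of imbalance is to compare per-OSD weight
and utilization, e.g. with:

    ceph osd df tree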

Apart from that, the only issue for us is "add drives, quickly set
crush reweight to 0.0 before all existing OSD hosts shoot massive
amounts of I/O at them, then script a slower raise of the crush weight
up to what they should end up at", to lessen the impact on our 24/7
operations.
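
In shell terms that amounts to something like the following (the OSD
id, step sizes, and health check are only a sketch, adjust to taste):

    # right after the new OSD comes up, before backfill kicks in
    ceph osd crush reweight osd.42 0.0

    # later, raise it in steps, letting recovery settle in between
    for w in 0.5 1.0 1.5 2.0; do
        ceph osd crush reweight osd.42 $w
        until ceph health | grep -q HEALTH_OK; do sleep 60; done
    done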

If you have weekends where no one accesses the cluster, or night-time
low-IO usage patterns, just upping the weight at the right hour might
suffice.

Lastly, for SSD/NVMe setups with good networking, this is almost moot;
they converge so fast it's almost unfair.  A real joy working with
expanding flash-only pools/clusters.

-- 
May the most significant bit of your life be positive.




* Re: hardware heterogeneous in same pool
From: Brett Chancellor @ 2018-10-04 17:18 UTC
  To: Janne Johansson
  Cc: Bruno Carvalho, ceph-devel, ceph-users



You could also set osd_crush_initial_weight = 0.  New OSDs will
automatically come up with a weight of 0 and you won't have to race the
clock.
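
That is, roughly, in ceph.conf on the OSD hosts (or centrally with
"ceph config set" on Mimic and newer):

    [osd]
    osd_crush_initial_weight = 0

    # or:
    ceph config set osd osd_crush_initial_weight 0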

-Brett

On Thu, Oct 4, 2018 at 3:50 AM Janne Johansson wrote:

>
> On Thu, 4 Oct 2018 at 00:09, Bruno Carvalho wrote:
>
>> Hi Cephers, I would like to know how you are growing the cluster.
>> Do you use dissimilar hardware in the same pool, or create a separate
>> pool for each hardware group?
>> What problems would I run into using different hardware (CPU, memory,
>> disk) in the same pool?
>
> I don't think CPU and RAM (and other hardware-related things like HBA
> controller card brand) matter a lot; more is always nicer, but as long
> as you don't add worse machines, as Jonathan wrote, you should not see
> any degradation.
>
> What you might want to look out for is whether the new disks are very
> uneven compared to the old setup: if you used to have servers with
> 10x2TB drives and suddenly add one with 2x10TB, things might become
> very unbalanced, since those differences will not be handled seamlessly
> by the CRUSH map.
>
> Apart from that, the only issue for us is "add drives, quickly set
> crush reweight to 0.0 before all existing OSD hosts shoot massive
> amounts of I/O at them, then script a slower raise of the crush weight
> up to what they should end up at", to lessen the impact on our 24/7
> operations.
>
> If you have weekends where no one accesses the cluster, or night-time
> low-IO usage patterns, just upping the weight at the right hour might
> suffice.
>
> Lastly, for SSD/NVMe setups with good networking, this is almost moot;
> they converge so fast it's almost unfair.  A real joy working with
> expanding flash-only pools/clusters.
>
> --
> May the most significant bit of your life be positive.



