* questions on networks and hardware
From: John Nielsen @ 2013-01-18 23:34 UTC
  To: ceph-devel

I'm planning a Ceph deployment which will include:
	10Gbit/s public/client network
	10Gbit/s cluster network
	dedicated mon hosts (3 to start)
	dedicated storage hosts (multiple disks, one XFS and OSD per disk, 3-5 to start)
	dedicated RADOS gateway host (1 to start)

I've done some initial testing and read through most of the docs but I still have a few questions. Please respond even if you just have a suggestion or response for one of them.

If I have "cluster network" and "public network" entries under [global] or [osd], do I still need to specify "public addr" and "cluster addr" for each OSD individually?

Which network(s) should the monitor hosts be on? If both, is it valid to have more than one "mon addr" entry per mon host or is there a different way to do it?

Is it worthwhile to have 10G NIC's on the monitor hosts? (The storage hosts will each have 2x 10Gbit/s NIC's.)

I'd like to have 2x 10Gbit/s NIC's on the gateway host and maximize throughput. Any suggestions on how to best do that? I'm assuming it will talk to the OSD's on the Ceph public/client network, so does that imply a third even-more-public network for the gateway's clients?

I think this has come up before, but has anyone written up something with more details on setting up gateways? Hardware recommendations, strategies to improve caching and performance, multiple gateway setups with and without a load balancer, etc.

Thanks!

JN



* RE: questions on networks and hardware
From: Holcombe, Christopher @ 2013-01-19 16:00 UTC
  To: John Nielsen, ceph-devel

Hi John,

I have the public/cluster network options set up in my config file.  You do not need to also specify an addr for each osd individually.  Here's an example of my working config:
[global]
        auth cluster required = none
        auth service required = none
        auth client required = none
        public network = 172.20.41.0/25
        cluster network = 172.20.41.128/25
        osd mkfs type = xfs
[osd]
        osd journal size = 1000
        filestore max sync interval = 30

[mon.a]
        host = plcephd01
        mon addr = 172.20.41.4:6789
[mon.b]
        host = plcephd03
        mon addr = 172.20.41.6:6789
[mon.c]
        host = plcephd05
        mon addr = 172.20.41.8:6789

[osd.0]
        host = plcephd01
        devs = /dev/sda3
[osd.X]
.... and so on...


-Chris

* Re: questions on networks and hardware
From: Wido den Hollander @ 2013-01-20 17:16 UTC
  To: John Nielsen; +Cc: ceph-devel

Hi,

On 01/19/2013 12:34 AM, John Nielsen wrote:
> I'm planning a Ceph deployment which will include:
> 	10Gbit/s public/client network
> 	10Gbit/s cluster network
> 	dedicated mon hosts (3 to start)
> 	dedicated storage hosts (multiple disks, one XFS and OSD per disk, 3-5 to start)
> 	dedicated RADOS gateway host (1 to start)
>
> I've done some initial testing and read through most of the docs but I still have a few questions. Please respond even if you just have a suggestion or response for one of them.
>
> If I have "cluster network" and "public network" entries under [global] or [osd], do I still need to specify "public addr" and "cluster addr" for each OSD individually?
>

Already answered, but no. You don't need to. The OSDs will bind to the 
available IP in that network.
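
For illustration, a minimal sketch of that (the subnets below are only 
placeholders, not a recommendation):

[global]
        # placeholder subnets -- each OSD binds to whatever address it
        # finds inside these ranges, no per-OSD addr lines required
        public network = 192.168.10.0/24
        cluster network = 192.168.20.0/24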

> Which network(s) should the monitor hosts be on? If both, is it valid to have more than one "mon addr" entry per mon host or is there a different way to do it?
>

They should be on the "public" network since the clients also need to be 
able to reach the monitors.
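
So, sticking with placeholder addresses, each monitor gets a single 
"mon addr" inside the public range, along the lines of:

[mon.a]
        host = monhost01                     # placeholder hostname
        mon addr = 192.168.10.11:6789        # an address in "public network"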

> Is it worthwhile to have 10G NIC's on the monitor hosts? (The storage hosts will each have 2x 10Gbit/s NIC's.)
>

No, not really. 1Gbit should be more than enough for your monitors. 3 
monitors should also be good. No need to go for 5 or 7.

> I'd like to have 2x 10Gbit/s NIC's on the gateway host and maximize throughput. Any suggestions on how to best do that? I'm assuming it will talk to the OSD's on the Ceph public/client network, so does that imply a third even-more-public network for the gateway's clients?
>

No, there is no third network. You could trunk the 2 NICs with LACP or 
something. Since the client will open TCP connections to all the OSDs 
you will get pretty good balancing over the available bandwidth.
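
As a rough sketch (Debian-style ifenslave config; the interface names, the 
address and the LACP-capable switch are all assumptions on my side):

# /etc/network/interfaces fragment on the gateway host
auto bond0
iface bond0 inet static
        address 192.168.10.41                # placeholder
        netmask 255.255.255.0
        bond-slaves eth2 eth3                # the two 10G ports
        bond-mode 802.3ad                    # LACP
        bond-miimon 100
        bond-xmit-hash-policy layer3+4       # hash on IP+port so the many
                                             # OSD connections spread out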

> I think this has come up before, but has anyone written up something with more details on setting up gateways? Hardware recommendations, strategies to improve caching and performance, multiple gateway setups with and without a load balancer, etc.

Not that I know of. I'm still trying to play with RGW and Varnish in front 
of it, but haven't really taken the time yet. The goal is to offload a 
lot of the caching to Varnish and have Varnish 'ban' objects when they 
change.

You could also use Varnish as a load balancer. But in this case you could 
also use RR-DNS or LVS with Direct Routing; that way only incoming 
traffic goes through the load balancer and return traffic goes directly 
to your network's gateway.
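
A very rough sketch of the LVS Direct Routing variant (addresses are 
placeholders and this is not a complete recipe):

# on the LVS director: one virtual HTTP service, two RGW backends,
# "-g" selects direct routing so replies bypass the director
ipvsadm -A -t 203.0.113.10:80 -s rr
ipvsadm -a -t 203.0.113.10:80 -r 192.0.2.11 -g
ipvsadm -a -t 203.0.113.10:80 -r 192.0.2.12 -g

# on each RGW host: carry the VIP on loopback and suppress ARP for it
ip addr add 203.0.113.10/32 dev lo
sysctl -w net.ipv4.conf.all.arp_ignore=1
sysctl -w net.ipv4.conf.all.arp_announce=2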

Wido

>
> Thanks!
>
> JN
>


* Re: questions on networks and hardware
From: Jeff Mitchell @ 2013-01-20 18:51 UTC
  To: Wido den Hollander; +Cc: John Nielsen, ceph-devel

Wido den Hollander wrote:
> No, not really. 1Gbit should be more than enough for your monitors. 3
> monitors should also be good. No need to go for 5 or 7.

I have 5 monitors, across 16 different OSD-hosting machines...is that 
going to *harm* anything?

(I have had issues in my cluster when doing upgrades where my monitor 
count fell to 3, so it felt like having the extra was nice, at that point.)

Thanks,
Jeff


* Re: questions on networks and hardware
From: Stefan Priebe @ 2013-01-20 18:52 UTC
  To: Wido den Hollander; +Cc: John Nielsen, ceph-devel

Hi,

On 20.01.2013 18:16, Wido den Hollander wrote:
>> If I have "cluster network" and "public network" entries under
>> [global] or [osd], do I still need to specify "public addr" and
>> "cluster addr" for each OSD individually?
>>
> Already answered, but no. You don't need to. The OSDs will bind to the
> available IP in that network.

What happens if there is more than one IP available in that network?

Stefan


* Re: questions on networks and hardware
From: Sage Weil @ 2013-01-20 19:03 UTC
  To: Stefan Priebe; +Cc: Wido den Hollander, John Nielsen, ceph-devel

On Sun, 20 Jan 2013, Stefan Priebe wrote:
> Hi,
> 
> On 20.01.2013 18:16, Wido den Hollander wrote:
> > > If I have "cluster network" and "public network" entries under
> > > [global] or [osd], do I still need to specify "public addr" and
> > > "cluster addr" for each OSD individually?
> > > 
> > Already answered, but no. You don't need to. The OSDs will bind to the
> > available IP in that network.
> 
> What happens if there is more than one IP available in that network?

It'll take either the first or last one returned by the kernel... you'd 
have to check the code to know which.  If you need finer control than 
that, 'public addr' and 'cluster addr' are there :)
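
For example (hostname and addresses are placeholders), the explicit 
per-daemon form looks like:

[osd.0]
        host = osdhost01
        public addr = 192.168.10.31
        cluster addr = 192.168.20.31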

sage



* Re: questions on networks and hardware
From: Sage Weil @ 2013-01-20 19:05 UTC
  To: Jeff Mitchell; +Cc: Wido den Hollander, John Nielsen, ceph-devel

On Sun, 20 Jan 2013, Jeff Mitchell wrote:
> Wido den Hollander wrote:
> > No, not really. 1Gbit should be more than enough for your monitors. 3
> > monitors should also be good. No need to go for 5 or 7.
> 
> I have 5 monitors, across 16 different OSD-hosting machines...is that going to
> *harm* anything?

The updates will be marginally more expensive, but I don't think it will 
cause any problems--not until you have 10 or more mons, I'd say.

The only thing to watch out for is the ceph-mon interfering with the 
ceph-osds.  If you have an old kernel without syncfs(2) (<2.6.39 or 
thereabouts), or if you are sharing the mon disk with an osd, there can be 
a negative interaction.

sage

> (I have had issues in my cluster when doing upgrades where my monitor count
> fell to 3, so it felt like having the extra was nice, at that point.)

5 will give you protection from 2 ceph-mon failures, which is nice.

sage


* Re: questions on networks and hardware
From: Gandalf Corvotempesta @ 2013-01-20 20:30 UTC
  To: John Nielsen; +Cc: ceph-devel

2013/1/19 John Nielsen <lists@jnielsen.net>:
> I'm planning a Ceph deployment which will include:
>         10Gbit/s public/client network
>         10Gbit/s cluster network

I'm still trying to find out whether a redundant cluster network is needed or not.
Is Ceph able to handle a cluster network failure on an OSD?

What I would like to know is whether Ceph will monitor OSDs from the public
side (which should be redundant, because clients will use it to connect
to the cluster) or from the cluster side.


* Re: questions on networks and hardware
From: Gregory Farnum @ 2013-01-20 22:23 UTC
  To: Gandalf Corvotempesta; +Cc: John Nielsen, ceph-devel

On Sunday, January 20, 2013 at 12:30 PM, Gandalf Corvotempesta wrote:
2013/1/19 John Nielsen <lists@jnielsen.net>:
> > I'm planning a Ceph deployment which will include:
> > 10Gbit/s public/client network
> > 10Gbit/s cluster network
> 
> 
> 
> I'm still trying to find out whether a redundant cluster network is needed or not.
> Is Ceph able to handle a cluster network failure on an OSD?
> 
> What I would like to know is whether Ceph will monitor OSDs from the public
> side (which should be redundant, because clients will use it to connect
> to the cluster) or from the cluster side.
> 
This is a bit embarrassing, but if you're actually using two networks and the cluster network fails but the client network stays up, things behave pretty badly (the OSDs will keep insisting it's failed, while the monitor will insist it's still working). I believe there's a branch working on this problem, but I haven't been involved with it.
It's not necessary to have split networks though, no.

Does that answer your question?
-Greg



* Re: questions on networks and hardware
From: Gandalf Corvotempesta @ 2013-01-20 22:43 UTC
  To: Gregory Farnum; +Cc: John Nielsen, ceph-devel

2013/1/20 Gregory Farnum <greg@inktank.com>:
> This is a bit embarrassing, but if you're actually using two networks and the cluster network fails but the client network stays up, things behave pretty badly (the OSDs will keep insisting it's failed, while the monitor will insist it's still working). I believe there's a branch working on this problem, but I haven't been involved with it.
> It's not necessary to have split networks though, no.

Ok.

> Does that answer your question?

Absolutely, but usually the cluster network is faster than the client network,
and being forced to use two cluster networks is very, very expensive.


* Re: questions on networks and hardware
From: Gregory Farnum @ 2013-01-20 23:17 UTC
  To: Gandalf Corvotempesta; +Cc: John Nielsen, ceph-devel

On Sunday, January 20, 2013 at 2:43 PM, Gandalf Corvotempesta wrote:
2013/1/20 Gregory Farnum <greg@inktank.com>:
> > This is a bit embarrassing, but if you're actually using two networks and the cluster network fails but the client network stays up, things behave pretty badly (the OSDs will keep insisting it's failed, while the monitor will insist it's still working). I believe there's a branch working on this problem, but I haven't been involved with it.
> > It's not necessary to have split networks though, no.
>  
>  
>  
> Ok.
>  
> > Does that answer your question?
>  
> Absolutely, but usually the cluster network is faster than the client network,
> and being forced to use two cluster networks is very, very expensive.

I'm not quite sure what you mean… the use of the "cluster network" and "public network" is really just intended as a convenience for people with multiple NICs on their box. There's nothing preventing you from running everything on the same network… (and more specifically, from running different speed grades to different boxes, but keeping them all on the same network).



* Re: questions on networks and hardware
From: Gandalf Corvotempesta @ 2013-01-21  8:19 UTC
  To: Gregory Farnum; +Cc: John Nielsen, ceph-devel

2013/1/21 Gregory Farnum <greg@inktank.com>:
> I'm not quite sure what you mean… the use of the "cluster network" and "public network" is really just intended as a convenience for people with multiple NICs on their box. There's nothing preventing you from running everything on the same network… (and more specifically, from running different speed grades to different boxes, but keeping them all on the same network).

I mean using two cluster networks and two public networks, 4 NICs in total.
Our cluster network will be 10GbE or Infiniband DDR/QDR, so making a
fully redundant cluster network (two NICs on each server, two
switches) would double our costs.


* Re: questions on networks and hardware
From: John Nielsen @ 2013-01-21 23:15 UTC
  To: ceph-devel

Thanks all for your responses! Some comments inline.

On Jan 20, 2013, at 10:16 AM, Wido den Hollander <wido@widodh.nl> wrote:

> On 01/19/2013 12:34 AM, John Nielsen wrote:
>> I'm planning a Ceph deployment which will include:
>> 	10Gbit/s public/client network
>> 	10Gbit/s cluster network
>> 	dedicated mon hosts (3 to start)
>> 	dedicated storage hosts (multiple disks, one XFS and OSD per disk, 3-5 to start)
>> 	dedicated RADOS gateway host (1 to start)
>> 
>> I've done some initial testing and read through most of the docs but I still have a few questions. Please respond even if you just have a suggestion or response for one of them.
>> 
>> If I have "cluster network" and "public network" entries under [global] or [osd], do I still need to specify "public addr" and "cluster addr" for each OSD individually?
> 
> Already answered, but no. You don't need to. The OSDs will bind to the available IP in that network.

Nice. That's how I hoped/assumed it would work but I have seen some configurations on the web that include both so I wanted to make sure.

>> Which network(s) should the monitor hosts be on? If both, is it valid to have more than one "mon addr" entry per mon host or is there a different way to do it?
> 
> They should be on the "public" network since the clients also need to be able to reach the monitors.

It sounds (from other followups) like there is work on adding some awareness of the cluster network to the monitor but it's not there yet. I'll stay tuned. It would be nice if the monitors and OSD's together could form a reachability map for the cluster and give the option of using the public network for the affected subset of OSD traffic in the event of a problem on the cluster network. Having the monitor(s) connected to the cluster network might be useful for that...

Then again, a human should be involved if there is a network failure anyway; manually changing the network settings in the Ceph config as a temporary workaround is already an option I suppose.

>> I'd like to have 2x 10Gbit/s NIC's on the gateway host and maximize throughput. Any suggestions on how to best do that? I'm assuming it will talk to the OSD's on the Ceph public/client network, so does that imply a third even-more-public network for the gateway's clients?
> 
> No, there is no third network. You could trunk the 2 NICs with LACP or something. Since the client will open TCP connections to all the OSDs you will get pretty good balancing over the available bandwidth.

Good suggestion, thanks. I actually just started using LACP to pair up 1Gb NIC's on a small test cluster and it's proven beneficial, even with less-than-ideal hashing from the switch.

To put my question here a different way: suppose I really really want to segregate the Ceph traffic from the HTTP traffic and I set up the IP's and routing necessary to do so (3rd network idea). Is there any reason NOT to do that?

>> Is it worthwhile to have 10G NIC's on the monitor hosts? (The storage hosts will each have 2x 10Gbit/s NIC's.)
> 
> No, not really. 1Gbit should be more than enough for your monitors. 3 monitors should also be good. No need to go for 5 or 7.

Maybe I'll have the monitors just be on the public network but use LACP with dual 1Gbit NIC's for fault tolerance (since I'll have the NIC's onboard anyway).

>> I think this has come up before, but has anyone written up something with more details on setting up gateways? Hardware recommendations, strategies to improve caching and performance, multiple gateway setups with and without a load balancer, etc.
> 
> Not that I know of. I'm still trying to play with RGW and Varnish in front of it, but haven't really taken the time yet. The goal is to offload a lot of the caching to Varnish and have Varnish 'ban' objects when they change.
> 
> You could also use Varnish as a load balancer. But in this case you could also use RR-DNS or LVS with Direct Routing; that way only incoming traffic goes through the load balancer and return traffic goes directly to your network's gateway.

I definitely plan to look in to Varnish at some point, I'll be sure to follow up here if I learn anything interesting.

Thanks again,

JN



* Re: questions on networks and hardware
From: Wido den Hollander @ 2013-01-22 13:20 UTC
  To: John Nielsen; +Cc: ceph-devel



On 01/22/2013 12:15 AM, John Nielsen wrote:
> Thanks all for your responses! Some comments inline.
>
> On Jan 20, 2013, at 10:16 AM, Wido den Hollander <wido@widodh.nl> wrote:
>
>> On 01/19/2013 12:34 AM, John Nielsen wrote:
>>> I'm planning a Ceph deployment which will include:
>>> 	10Gbit/s public/client network
>>> 	10Gbit/s cluster network
>>> 	dedicated mon hosts (3 to start)
>>> 	dedicated storage hosts (multiple disks, one XFS and OSD per disk, 3-5 to start)
>>> 	dedicated RADOS gateway host (1 to start)
>>>
>>> I've done some initial testing and read through most of the docs but I still have a few questions. Please respond even if you just have a suggestion or response for one of them.
>>>
>>> If I have "cluster network" and "public network" entries under [global] or [osd], do I still need to specify "public addr" and "cluster addr" for each OSD individually?
>>
>> Already answered, but no. You don't need to. The OSDs will bind to the available IP in that network.
>
> Nice. That's how I hoped/assumed it would work but I have seen some configurations on the web that include both so I wanted to make sure.
>
>>> Which network(s) should the monitor hosts be on? If both, is it valid to have more than one "mon addr" entry per mon host or is there a different way to do it?
>>
>> They should be on the "public" network since the clients also need to be able to reach the monitors.
>
> It sounds (from other followups) like there is work on adding some awareness of the cluster network to the monitor but it's not there yet. I'll stay tuned. It would be nice if the monitors and OSD's together could form a reachability map for the cluster and give the option of using the public network for the affected subset of OSD traffic in the event of a problem on the cluster network. Having the monitor(s) connected to the cluster network might be useful for that...
>
> Then again, a human should be involved if there is a network failure anyway; manually changing the network settings in the Ceph config as a temporary workaround is already an option I suppose.
>
>>> I'd like to have 2x 10Gbit/s NIC's on the gateway host and maximize throughput. Any suggestions on how to best do that? I'm assuming it will talk to the OSD's on the Ceph public/client network, so does that imply a third even-more-public network for the gateway's clients?
>>
>> No, there is no third network. You could trunk the 2 NICs with LACP or something. Since the client will open TCP connections to all the OSDs you will get pretty good balancing over the available bandwidth.
>
> Good suggestion, thanks. I actually just started using LACP to pair up 1Gb NIC's on a small test cluster and it's proven beneficial, even with less-than-ideal hashing from the switch.
>
> To put my question here a different way: suppose I really really want to segregate the Ceph traffic from the HTTP traffic and I set up the IP's and routing necessary to do so (3rd network idea). Is there any reason NOT to do that?
>

No, no problem! The RGW gateways simply have 10G to the internet and 10G 
to the Ceph cluster.

The OSDs on their part have a "cluster network" which they use for 
internal communication.

Ceph doesn't need to know about your 3rd network.
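
So on the gateway box it is simply two interfaces, along the lines of 
(names and addresses are placeholders):

# RGW host: eth0 faces the HTTP clients, eth1 sits in the Ceph
# "public network"; ceph.conf only needs to know about the latter
auto eth0
iface eth0 inet static
        address 203.0.113.21
        netmask 255.255.255.0
        gateway 203.0.113.1

auto eth1
iface eth1 inet static
        address 192.168.10.51
        netmask 255.255.255.0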

>>> Is it worthwhile to have 10G NIC's on the monitor hosts? (The storage hosts will each have 2x 10Gbit/s NIC's.)
>>
>> No, not really. 1Gbit should be more than enough for your monitors. 3 monitors should also be good. No need to go for 5 or 7.
>
> Maybe I'll have the monitors just be on the public network but use LACP with dual 1Gbit NIC's for fault tolerance (since I'll have the NIC's onboard anyway).
>
>>> I think this has come up before, but has anyone written up something with more details on setting up gateways? Hardware recommendations, strategies to improve caching and performance, multiple gateway setups with and without a load balancer, etc.
>>
>> Not that I know of. I'm still trying to play with RGW and Varnish in front of it, but haven't really taken the time yet. The goal is to offload a lot of the caching to Varnish and have Varnish 'ban' objects when they change.
>>
>> You could also use Varnish as a load balancer. But in this case you could also use RR-DNS or LVS with Direct Routing; that way only incoming traffic goes through the load balancer and return traffic goes directly to your network's gateway.
>
> I definitely plan to look in to Varnish at some point, I'll be sure to follow up here if I learn anything interesting.
>

One open issue is still having multiple Varnish caches and object banning. I 
proposed something for this some time ago: some "hook" in RGW you could 
use to inform an upstream cache to "purge" something from its cache.

That isn't there yet though.

Wido

> Thanks again,
>
> JN
>


* Re: questions on networks and hardware
From: Jeff Mitchell @ 2013-01-22 14:58 UTC
  To: Wido den Hollander; +Cc: John Nielsen, ceph-devel

Wido den Hollander wrote:
> One open issue is still having multiple Varnish caches and object banning. I
> proposed something for this some time ago: some "hook" in RGW you could
> use to inform an upstream cache to "purge" something from its cache.

Hopefully not Varnish-specific; something like the Last-Modified header 
would be good.

Also there are tricks you can do with queries; see for instance 
http://forum.nginx.org/read.php?2,1047,1052

--Jeff


* Re: questions on networks and hardware
From: Dan Mick @ 2013-01-22 21:05 UTC
  To: Gandalf Corvotempesta; +Cc: Gregory Farnum, John Nielsen, ceph-devel



On 01/21/2013 12:19 AM, Gandalf Corvotempesta wrote:
> 2013/1/21 Gregory Farnum <greg@inktank.com>:
>> I'm not quite sure what you mean… the use of the "cluster network" and "public network" is really just intended as a convenience for people with multiple NICs on their box. There's nothing preventing you from running everything on the same network… (and more specifically, from running different speed grades to different boxes, but keeping them all on the same network).
>
> I mean using two cluster networks and two public networks, 4 NICs in total.
> Our cluster network will be 10GbE or Infiniband DDR/QDR, so making a
> fully redundant cluster network (two NICs on each server, two
> switches) would double our costs.

Well, again, you needn't use two different networks for Ceph; that 
capability is there so that you *can* use the faster gear only for the 
cluster traffic if you want to limit costs but still segregate traffic. 
If you're just after a fully-redundant network (by which I assume you 
also mean fully-redundant fabric, power, etc.), there's no reason it 
can't all be fast gear.


Thread overview: 16 messages
2013-01-18 23:34 questions on networks and hardware John Nielsen
2013-01-19 16:00 ` Holcombe, Christopher
2013-01-20 17:16 ` Wido den Hollander
2013-01-20 18:51   ` Jeff Mitchell
2013-01-20 19:05     ` Sage Weil
2013-01-20 18:52   ` Stefan Priebe
2013-01-20 19:03     ` Sage Weil
2013-01-21 23:15   ` John Nielsen
2013-01-22 13:20     ` Wido den Hollander
2013-01-22 14:58       ` Jeff Mitchell
2013-01-20 20:30 ` Gandalf Corvotempesta
2013-01-20 22:23   ` Gregory Farnum
2013-01-20 22:43     ` Gandalf Corvotempesta
2013-01-20 23:17       ` Gregory Farnum
2013-01-21  8:19         ` Gandalf Corvotempesta
2013-01-22 21:05           ` Dan Mick
