* How to release Hammer osd RAM when compiled with jemalloc
@ 2016-12-13  2:22 Dong Wu
  2016-12-13 12:40 ` Sage Weil
  0 siblings, 1 reply; 5+ messages in thread
From: Dong Wu @ 2016-12-13  2:22 UTC (permalink / raw)
  To: The Sacred Order of the Squid Cybernetic, ceph-users

Hi all,
   I have a cluster with nearly 1000 OSDs, and each OSD already
occupies about 2.5 GB of physical memory on average, which puts each
host at roughly 90% memory usage. With tcmalloc we could use "ceph
tell osd.* release" to release unused memory, but my cluster's Ceph is
built with jemalloc, so that command is not available. Are there any
other ways to release some memory?

Another question: can I decrease the following config values, which
control how many osdmap epochs are cached, to lower each OSD's memory
usage?

"mon_min_osdmap_epochs": "500"
"osd_map_max_advance": "200",
"osd_map_cache_size": "500",
"osd_map_message_max": "100",
"osd_map_share_max_epochs": "100"


Here is the memory usage on one of my hosts:
Tasks: 547 total,   1 running, 546 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.7 us,  0.4 sy,  0.0 ni, 98.8 id,  0.0 wa,  0.0 hi,  0.1 si,  0.0 st
KiB Mem:  65474900 total, 65174136 used,   300764 free,    63472 buffers
KiB Swap:  4194300 total,  1273384 used,  2920916 free,  7100148 cached

PID USER      PR  NI  VIRT  RES  SHR S  %CPU %MEM    TIME+  COMMAND
190259 root      20   0 12.1g 2.6g 5328 S     8  4.2   5445:38 ceph-osd
177520 root      20   0 11.9g 2.6g 5560 S    10  4.1   4725:24 ceph-osd
166517 root      20   0 12.2g 2.5g 5320 S     4  4.1   5399:44 ceph-osd
171744 root      20   0 11.9g 2.5g 5984 S     4  4.1   4911:13 ceph-osd
   958 root      20   0 11.9g 2.5g 4652 S     6  4.0   4821:20 ceph-osd
 16134 root      20   0 12.4g 2.5g 5252 S     4  4.0   5336:00 ceph-osd
183738 root      20   0 12.1g 2.5g 4500 S     6  4.0   4748:43 ceph-osd
  8482 root      20   0 11.6g 2.5g 5760 S     4  4.0   4937:24 ceph-osd
161514 root      20   0 12.1g 2.5g 5712 S     6  3.9   4937:30 ceph-osd
 37148 root      20   0 5919m 2.4g 4164 S     2  3.9   2709:53 ceph-osd
 48327 root      20   0 5956m 2.4g 3872 S     0  3.8   2782:25 ceph-osd
 31214 root      20   0 5990m 2.4g 4336 S     4  3.8   3020:38 ceph-osd
 24254 root      20   0 5762m 2.4g 4404 S     4  3.8   2852:50 ceph-osd
 19524 root      20   0 5782m 2.4g 4608 S     2  3.8   2752:12 ceph-osd
 40557 root      20   0 5875m 2.4g 4492 S     4  3.8   2808:41 ceph-osd
 22458 root      20   0 5769m 2.3g 4084 S     2  3.8   2820:34 ceph-osd
 28668 root      20   0 5796m 2.3g 4424 S     2  3.8   2728:06 ceph-osd
 20885 root      20   0 5867m 2.3g 4368 S     2  3.7   2802:10 ceph-osd
 26382 root      20   0 5857m 2.3g 4176 S     4  3.7   3012:35 ceph-osd
 44276 root      20   0 5828m 2.3g 4792 S     0  3.6   2891:12 ceph-osd
 34035 root      20   0 5887m 2.2g 3984 S     4  3.5   2836:21 ceph-osd


Thanks.
Regards.


* Re: How to release Hammer osd RAM when compiled with jemalloc
  2016-12-13  2:22 How to release Hammer osd RAM when compiled with jemalloc Dong Wu
@ 2016-12-13 12:40 ` Sage Weil
  2016-12-14  1:39   ` Dong Wu
  0 siblings, 1 reply; 5+ messages in thread
From: Sage Weil @ 2016-12-13 12:40 UTC (permalink / raw)
  To: Dong Wu; +Cc: The Sacred Order of the Squid Cybernetic, ceph-users

On Tue, 13 Dec 2016, Dong Wu wrote:
> Hi all,
>    I have a cluster with nearly 1000 OSDs, and each OSD already
> occupies about 2.5 GB of physical memory on average, which puts each
> host at roughly 90% memory usage. With tcmalloc we could use "ceph
> tell osd.* release" to release unused memory, but my cluster's Ceph
> is built with jemalloc, so that command is not available. Are there
> any other ways to release some memory?

We explicitly call into tcmalloc to release memory with that command, but 
unless you've patched something in yourself there is no integration with 
jemalloc's release API.
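
For reference, the tcmalloc-backed command being referred to there is
the heap admin command, e.g.:

  # osd.0 as an example; asks tcmalloc to return unused pages to the OS
  ceph tell osd.0 heap release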

> Another question: can I decrease the following config values, which
> control how many osdmap epochs are cached, to lower each OSD's
> memory usage?
> 
> "mon_min_osdmap_epochs": "500"
> "osd_map_max_advance": "200",
> "osd_map_cache_size": "500",
> "osd_map_message_max": "100",
> "osd_map_share_max_epochs": "100"

Yeah.  You should be fine with 500, 50, 100, 50, 50.
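
Spelled out against the option names, that would be something like this
(a sketch; I haven't double-checked which of these are picked up live
via injectargs, so setting them in ceph.conf and restarting the OSDs is
the conservative route):

  ceph tell osd.* injectargs '--osd_map_max_advance 50 --osd_map_cache_size 100 --osd_map_message_max 50 --osd_map_share_max_epochs 50'
  # mon_min_osdmap_epochs stays at 500 (that one is a monitor-side option)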

sage


* Re: How to release Hammer osd RAM when compiled with jemalloc
  2016-12-13 12:40 ` Sage Weil
@ 2016-12-14  1:39   ` Dong Wu
       [not found]     ` <CAAL-TMdqQiLExSB+tTrGvnOiMxNgSZmM6KYEPyYCHJDouWfM4g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 5+ messages in thread
From: Dong Wu @ 2016-12-14  1:39 UTC (permalink / raw)
  To: Sage Weil; +Cc: The Sacred Order of the Squid Cybernetic, ceph-users

Thanks for your response.

2016-12-13 20:40 GMT+08:00 Sage Weil <sage@newdream.net>:
> On Tue, 13 Dec 2016, Dong Wu wrote:
>> Hi all,
>>    I have a cluster with nearly 1000 OSDs, and each OSD already
>> occupies about 2.5 GB of physical memory on average, which puts each
>> host at roughly 90% memory usage. With tcmalloc we could use "ceph
>> tell osd.* release" to release unused memory, but my cluster's Ceph
>> is built with jemalloc, so that command is not available. Are there
>> any other ways to release some memory?
>
> We explicitly call into tcmalloc to release memory with that command, but
> unless you've patched something in yourself there is no integration with
> jemalloc's release API.

Are there any ways to see the detailed memory usage of an OSD? If we
had a memory allocator that recorded detailed usage, would that be
helpful? Is anything like that on the roadmap?

>
>> Another question: can I decrease the following config values, which
>> control how many osdmap epochs are cached, to lower each OSD's
>> memory usage?
>>
>> "mon_min_osdmap_epochs": "500"
>> "osd_map_max_advance": "200",
>> "osd_map_cache_size": "500",
>> "osd_map_message_max": "100",
>> "osd_map_share_max_epochs": "100"
>
> Yeah.  You should be fine with 500, 50, 100, 50, 50.

>
> sage

Thanks.
Regards.


* Re: How to release Hammer osd RAM when compiled with jemalloc
       [not found]     ` <CAAL-TMdqQiLExSB+tTrGvnOiMxNgSZmM6KYEPyYCHJDouWfM4g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2016-12-14  1:46       ` Sage Weil
  2016-12-14  3:32         ` Dong Wu
  0 siblings, 1 reply; 5+ messages in thread
From: Sage Weil @ 2016-12-14  1:46 UTC (permalink / raw)
  To: Dong Wu; +Cc: The Sacred Order of the Squid Cybernetic, ceph-users

On Wed, 14 Dec 2016, Dong Wu wrote:
> Thanks for your response.
> 
> 2016-12-13 20:40 GMT+08:00 Sage Weil <sage-BnTBU8nroG7k1uMJSBkQmQ@public.gmane.org>:
> > On Tue, 13 Dec 2016, Dong Wu wrote:
> >> Hi all,
> >>    I have a cluster with nearly 1000 OSDs, and each OSD already
> >> occupies about 2.5 GB of physical memory on average, which puts each
> >> host at roughly 90% memory usage. With tcmalloc we could use "ceph
> >> tell osd.* release" to release unused memory, but my cluster's Ceph
> >> is built with jemalloc, so that command is not available. Are there
> >> any other ways to release some memory?
> >
> > We explicitly call into tcmalloc to release memory with that command, but
> > unless you've patched something in yourself there is no integration with
> > jemalloc's release API.
> 
> Are there any ways to see the detailed memory usage of an OSD? If we
> had a memory allocator that recorded detailed usage, would that be
> helpful? Is anything like that on the roadmap?

Kraken has a new mempool infrastructure and some of the OSD pieces have 
been moved into it, but only some.  There's quite a bit of opportunity to 
further categorize allocations to get better visibility here.
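
Where that accounting is exposed through the admin socket (I'm not sure
it made it into the first Kraken builds), the per-pool totals can be
dumped with something like:

  ceph daemon osd.0 dump_mempools   # osd.0 as an example; the command may be absent on older builds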

Barring that, your best bet is to use either tcmalloc's heap profiling or 
valgrind massif.  Both slow down execution by a lot (5-10x).  Massif has 
better detail, but is somewhat slower.
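
Rough sketches of both, with example IDs and paths:

  # tcmalloc heap profiler (needs an OSD linked against tcmalloc)
  ceph tell osd.0 heap start_profiler
  ceph tell osd.0 heap dump            # writes a .heap profile next to the OSD log; exact filename varies
  ceph tell osd.0 heap stop_profiler
  pprof --text /usr/bin/ceph-osd /var/log/ceph/osd.0.profile.0001.heap   # example paths

  # valgrind massif (run the OSD in the foreground under valgrind)
  valgrind --tool=massif /usr/bin/ceph-osd -f -i 0
  ms_print massif.out.<pid>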

sage


> 
> >
> >> Another question: can I decrease the following config values, which
> >> control how many osdmap epochs are cached, to lower each OSD's
> >> memory usage?
> >>
> >> "mon_min_osdmap_epochs": "500"
> >> "osd_map_max_advance": "200",
> >> "osd_map_cache_size": "500",
> >> "osd_map_message_max": "100",
> >> "osd_map_share_max_epochs": "100"
> >
> > Yeah.  You should be fine with 500, 50, 100, 50, 50.
> 
> >
> > sage
> 
> Thanks.
> Regards.
> 
> 


* Re: How to release Hammer osd RAM when compiled with jemalloc
  2016-12-14  1:46       ` Sage Weil
@ 2016-12-14  3:32         ` Dong Wu
  0 siblings, 0 replies; 5+ messages in thread
From: Dong Wu @ 2016-12-14  3:32 UTC (permalink / raw)
  To: Sage Weil; +Cc: The Sacred Order of the Squid Cybernetic, ceph-users

2016-12-14 9:46 GMT+08:00 Sage Weil <sage@newdream.net>:
> On Wed, 14 Dec 2016, Dong Wu wrote:
>> Thanks for your response.
>>
>> 2016-12-13 20:40 GMT+08:00 Sage Weil <sage@newdream.net>:
>> > On Tue, 13 Dec 2016, Dong Wu wrote:
>> >> Hi all,
>> >>    I have a cluster with nearly 1000 OSDs, and each OSD already
>> >> occupies about 2.5 GB of physical memory on average, which puts each
>> >> host at roughly 90% memory usage. With tcmalloc we could use "ceph
>> >> tell osd.* release" to release unused memory, but my cluster's Ceph
>> >> is built with jemalloc, so that command is not available. Are there
>> >> any other ways to release some memory?
>> >
>> > We explicitly call into tcmalloc to release memory with that command, but
>> > unless you've patched something in yourself there is no integration with
>> > jemalloc's release API.
>>
>> Are there any ways to see the detailed memory usage of an OSD? If we
>> had a memory allocator that recorded detailed usage, would that be
>> helpful? Is anything like that on the roadmap?
>
> Kraken has a new mempool infrastructure and some of the OSD pieces have
> been moved into it, but only some.  There's quite a bit of opportunity to
> further categorize allocations to get better visibility here.

Looking forward.

>
> Barring that, your best bet is to use either tcmalloc's heap profiling or
> valgrind massif.  Both slow down execution by a lot (5-10x).  Massif has
> better detail, but is somewhat slower.

I'll first try these tools in my test cluster to look at the memory
usage. But in our production cluster, can I just use ceph tell osd.*
injectargs '--osd_map_max_advance 50 --osd_map_cache_size 100
--osd_map_message_max 50 --osd_map_share_max_epochs 50' to lower the
OSDs' memory, or should I change ceph.conf and then restart the OSDs?

>
> sage
>
>
>>
>> >
>> >> Another question: can I decrease the following config values, which
>> >> control how many osdmap epochs are cached, to lower each OSD's
>> >> memory usage?
>> >>
>> >> "mon_min_osdmap_epochs": "500"
>> >> "osd_map_max_advance": "200",
>> >> "osd_map_cache_size": "500",
>> >> "osd_map_message_max": "100",
>> >> "osd_map_share_max_epochs": "100"
>> >
>> > Yeah.  You should be fine with 500, 50, 100, 50, 50.
>>
>> >
>> > sage
>>
>> Thanks.
>> Regards.
>>
>>


