From: Dan van der Ster <dan-EOCVfBHj35C+XT7JhA+gdA@public.gmane.org>
To: Stefan Priebe - Profihost AG
	<s.priebe-2Lf/h1ldwEHR5kwTpVNS9A@public.gmane.org>
Cc: "ceph-users-idqoXFIVOFJgJs9I8MT0rw@public.gmane.org"
	<ceph-users-idqoXFIVOFJgJs9I8MT0rw@public.gmane.org>,
	"ceph-devel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
	<ceph-devel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	spandankumarsahu-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org
Subject: Re: ceph mgr balancer bad distribution
Date: Thu, 1 Mar 2018 11:30:40 +0100
Message-ID: <CABZ+qqm-gMs9COEg2TVfNwEhVja8mGox00=0y5wQB7Z2QoSjSQ@mail.gmail.com>
In-Reply-To: <CABZ+qqmUPsiUDQV4kkHAWN0v=fCosaaHs0dGvXDaKUGKC7v9=g@mail.gmail.com>

On Thu, Mar 1, 2018 at 10:40 AM, Dan van der Ster <dan-EOCVfBHj35C+XT7JhA+gdA@public.gmane.org> wrote:
> On Thu, Mar 1, 2018 at 10:38 AM, Dan van der Ster <dan-EOCVfBHj35C+XT7JhA+gdA@public.gmane.org> wrote:
>> On Thu, Mar 1, 2018 at 10:24 AM, Stefan Priebe - Profihost AG
>> <s.priebe-2Lf/h1ldwEHR5kwTpVNS9A@public.gmane.org> wrote:
>>>
>>> On 01.03.2018 at 09:58, Dan van der Ster wrote:
>>>> On Thu, Mar 1, 2018 at 9:52 AM, Stefan Priebe - Profihost AG
>>>> <s.priebe-2Lf/h1ldwEHR5kwTpVNS9A@public.gmane.org> wrote:
>>>>> Hi,
>>>>>
>>>>> On 01.03.2018 at 09:42, Dan van der Ster wrote:
>>>>>> On Thu, Mar 1, 2018 at 9:31 AM, Stefan Priebe - Profihost AG
>>>>>> <s.priebe-2Lf/h1ldwEHR5kwTpVNS9A@public.gmane.org> wrote:
>>>>>>> Hi,
>>>>>>> On 01.03.2018 at 09:03, Dan van der Ster wrote:
>>>>>>>> Is the score improving?
>>>>>>>>
>>>>>>>>     ceph balancer eval
>>>>>>>>
>>>>>>>> It should be decreasing over time as the variances drop toward zero.
>>>>>>>>
>>>>>>>> You mentioned a crush optimize code at the beginning... how did that
>>>>>>>> leave your cluster? The mgr balancer assumes that the crush weight of
>>>>>>>> each OSD is equal to its size in TB.
>>>>>>>> Do you have any osd reweights? crush-compat will gradually adjust
>>>>>>>> those back to 1.0.
>>>>>>>
>>>>>>> I reweighted them all back to their correct weight.
>>>>>>>
>>>>>>> Now the mgr balancer module says:
>>>>>>> mgr[balancer] Failed to find further optimization, score 0.010646
>>>>>>>
>>>>>>> But as you can see it's heavily imbalanced:
>>>>>>>
>>>>>>>
>>>>>>> Example:
>>>>>>> 49   ssd 0.84000  1.00000   864G   546G   317G 63.26 1.13  49
>>>>>>>
>>>>>>> vs:
>>>>>>>
>>>>>>> 48   ssd 0.84000  1.00000   864G   397G   467G 45.96 0.82  49
>>>>>>>
>>>>>>> 45% usage vs. 63%
>>>>>>
>>>>>> Ahh... but look, the num PGs are perfectly balanced, which implies
>>>>>> that you have a relatively large number of empty PGs.
>>>>>>
>>>>>> But regardless, this is annoying and I expect lots of operators to get
>>>>>> this result. (I've also observed that the num PGs gets balanced
>>>>>> perfectly at the expense of the other score metrics.)
>>>>>>
>>>>>> I was thinking of a patch around here [1] that lets operators add a
>>>>>> score weight on pgs, objects, bytes so we can balance how we like.
>>>>>>
>>>>>> Spandan: you were the last to look at this function. Do you think it
>>>>>> can be improved as I suggested?
>>>>>
>>>>> Yes, the PGs are perfectly distributed, but I think most people
>>>>> would like to have a distribution by bytes and not PGs.
>>>>>
>>>>> Is this possible? I mean, in the code there is already a dict for pgs,
>>>>> objects and bytes, but I don't know how to change the logic. Just
>>>>> remove the pgs and objects from the dict?
>>>>
>>>> It's worth a try to remove the pgs and objects from this dict:
>>>> https://github.com/ceph/ceph/blob/luminous/src/pybind/mgr/balancer/module.py#L552
>>>
>>> Do I have to change this 3 to 1 because we have only one item in the dict?
>>> I'm not sure where the 3 comes from.
>>>         pe.score /= 3 * len(roots)
>>>
>>
>> I'm pretty sure that 3 is just for our 3 metrics. Indeed you can
>> change that to 1.
>>
>> I'm trying this on our test cluster here too. The last few lines of
>> output from `ceph balancer eval-verbose` will confirm that the score
>> is based only on bytes.
>>
>> But I'm not sure this is going to work -- indeed the score here went
>> from ~0.02 to 0.08, but the do_crush_compat doesn't manage to find a
>> better score.
>
> Maybe this:
>
> https://github.com/ceph/ceph/blob/luminous/src/pybind/mgr/balancer/module.py#L682
>
> I'm trying with that = 'bytes'

That seems to be working. I sent this PR as a start:
https://github.com/ceph/ceph/pull/20665

I'm not sure we need to mess with the score function, on second thought.

-- dan
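
For readers following the thread, here is a minimal sketch of why the divisor
discussed above tracks the number of metrics, and why a score averaged over
PGs and bytes can look good while bytes alone look bad. It is not the balancer
module's code: the function names, the deviation formula, and the equal 0.5
target share are assumptions made for illustration. The only real numbers are
the PG counts and used space of osd.48 and osd.49 quoted earlier; object
counts are left out because the thread does not give them.

```python
# Illustrative sketch only; NOT the actual ceph mgr balancer implementation.
# normalized_deviation() and score() are hypothetical helpers.

def normalized_deviation(actual, target):
    """Mean absolute difference between each OSD's share and its target share."""
    total = sum(actual.values())
    if total == 0:
        return 0.0
    return sum(abs(actual[osd] / total - target[osd]) for osd in actual) / len(actual)


def score(usage, target, metrics):
    """Average the per-metric deviations; the divisor is the number of metrics."""
    return sum(normalized_deviation(usage[m], target) for m in metrics) / len(metrics)


# Figures quoted in the thread for osd.48 and osd.49 (used space in GB).
usage = {
    "pgs": {48: 49, 49: 49},
    "bytes": {48: 397, 49: 546},
}
# Assumption: both OSDs have the same crush weight (0.84), so equal targets.
target = {48: 0.5, 49: 0.5}

combined = score(usage, target, ("pgs", "bytes"))
bytes_only = score(usage, target, ("bytes",))
print("pgs+bytes score: %.4f" % combined)
print("bytes-only score: %.4f" % bytes_only)
```

In this toy case the PG term is exactly zero, so averaging it in halves the
score without changing how unevenly the data is spread. That is consistent
with the score rising once only bytes are counted, as reported above.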
