From: Andrei Mikhailovsky <andrei-930XJYlnu5nQT0dZR+AlfA@public.gmane.org>
To: Daniel Swarbrick
	<daniel.swarbrick-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
Cc: ceph-users <ceph-users-idqoXFIVOFJgJs9I8MT0rw@public.gmane.org>,
	ceph-devel <ceph-devel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
Subject: Re: cluster down during backfilling, Jewel tunables and client IO optimisations
Date: Wed, 22 Jun 2016 16:54:51 +0100 (BST)
Message-ID: <829216253.136015.1466610890998.JavaMail.zimbra@arhont.com>
In-Reply-To: <nke15p$u9f$1@ger.gmane.org>

Hi Daniel,

Many thanks for your useful tests and your results.

How much IO wait are you seeing on your client VMs? Has it increased significantly?

Many thanks

Andrei

----- Original Message -----
> From: "Daniel Swarbrick" <daniel.swarbrick-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>
> To: "ceph-users" <ceph-users-idqoXFIVOFJgJs9I8MT0rw@public.gmane.org>
> Cc: "ceph-devel" <ceph-devel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
> Sent: Wednesday, 22 June, 2016 13:43:37
> Subject: Re: [ceph-users] cluster down during backfilling, Jewel tunables and client IO optimisations

> On 20/06/16 19:51, Gregory Farnum wrote:
>> On Mon, Jun 20, 2016 at 8:33 AM, Daniel Swarbrick wrote:
>>>
>>> At this stage, I have a strong suspicion that it is the introduction of
>>> "require_feature_tunables5 = 1" in the tunables. This seems to require
>>> all RADOS connections to be re-established.
>> 
>> Do you have any evidence of that besides the one restart?
>> 
>> I guess it's possible that we aren't kicking requests if the crush map
>> but not the rest of the osdmap changes, but I'd be surprised.
>> -Greg
> 
> I think the key fact to take note of is that we had long-running Qemu
> processes that had been started a few months ago, using Infernalis
> librbd shared libs.
> 
> If Infernalis had no concept of require_feature_tunables5, then it seems
> logical that these clients would block if the cluster were upgraded to
> Jewel and this tunable became mandatory.
> 
> I have just upgraded our fourth and final cluster to Jewel. Prior to
> applying optimal tunables, we upgraded our hypervisor nodes' librbd
> also, and migrated all VMs at least once, to start a fresh Qemu process
> for each (using the updated librbd).
> 
> We're seeing ~65% data movement due to chooseleaf_stable 0 => 1, but
> other than that, so far so good. No clients are blocking indefinitely.
