From: Peter Maloney <peter.maloney@brockmann-consult.de>
To: Sage Weil <sage@newdream.net>
Cc: ceph-devel <ceph-devel@vger.kernel.org>
Subject: Re: another scrub bug? blocked for > 10240.948831 secs
Date: Tue, 23 May 2017 11:29:20 +0200
Message-ID: <cae31cf5-344a-d6f6-c810-4f0aec031ae4@brockmann-consult.de>
In-Reply-To: <27bfebcd-6a91-bd00-82c5-e56af6775d52@brockmann-consult.de>

On 04/28/17 09:24, Peter Maloney wrote:
> On 04/20/17 20:58, Peter Maloney wrote:
>> On 04/20/17 18:19, Sage Weil wrote:
>>>  but I guess the 
>>> underlying first question is whether any snap deletion happened anywhere 
>>> around this period (2560+ sec before the warning, or around the time the 
>>> op was sent in epoch 83264).  
>> A snapshot based backup thing ran at 12:00 CEST and took until 18:25
>> CEST to finish, which overlaps that, and creates and removes 120
>> snapshots spread throughout the process.
>>> (And yeah, removed_snaps is the field that 
>>> matters here!)
>>>
>>> Thanks!
>>> sage
> This still happens in 10.2.7
>
>> 2017-04-28 04:41:59.343443 osd.9 10.3.0.132:6808/2704 18 : cluster
>> [WRN] slow request 10.040822 seconds old, received at 2017-04-28
>> 04:41:49.302552: replica scrub(pg:
>> 4.145,from:0'0,to:93267'6832180,epoch:93267,start:4:a2d2c99e:::rbd_data.4bf687238e1f29.000000000000f7a3:0,end:4:a2d2dcd6:::rbd_data.46820b238e1f29.000000000000bfbc:f25e,chunky:1,deep:0,seed:4294967295,version:6)
>> currently reached_pg
>> ...
>> 2017-04-28 06:07:09.975902 osd.9 10.3.0.132:6808/2704 36 : cluster
>> [WRN] slow request 5120.673291 seconds old, received at 2017-04-28
>> 04:41:49.302552: replica scrub(pg: 4.145,from:0'0,to:93
>> 267'6832180,epoch:93267,start:4:a2d2c99e:::rbd_data.4bf687238e1f29.000000000000f7a3:0,end:4:a2d2dcd6:::rbd_data.46820b238e1f29.000000000000bfbc:f25e,chunky:1,deep:0,seed:4294967295,version:
> and there are snaps created and removed around that time.

I changed some settings a long time ago for unrelated reasons, and since
then the problem has become far rarer: it has happened only once, although
that time many more than one request was blocked.

Here are the old settings:

> osd deep scrub stride = 524288  # 512 KiB
> osd scrub chunk min = 1
> osd scrub chunk max = 1
> osd scrub sleep = 0.5
And the new:
> osd deep scrub stride = 4194304  # 4 MiB
> osd scrub chunk min = 20
> osd scrub chunk max = 25
> osd scrub sleep = 4
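
To make the difference between the two sets of values concrete, here is a rough back-of-the-envelope sketch in Python. It is not how the OSD schedules scrubs internally; it assumes every chunk is filled to "osd scrub chunk max", that the OSD sleeps "osd scrub sleep" seconds once per chunk, and that objects are 4 MiB (the RBD default). The object count and the helper names are hypothetical and do not come from this thread.

```python
import math

def scrub_overhead(num_objects, chunk_max, sleep_s):
    """Return (chunks, total_sleep_seconds) for scrubbing one PG.

    Simplification: every chunk holds chunk_max objects and the
    OSD sleeps sleep_s seconds once per chunk.
    """
    chunks = math.ceil(num_objects / chunk_max)
    return chunks, chunks * sleep_s

def reads_per_object(object_bytes, stride_bytes):
    """Number of stride-sized reads needed to deep scrub one object."""
    return math.ceil(object_bytes / stride_bytes)

# Hypothetical PG size and object size, not taken from the thread.
NUM_OBJECTS = 10_000
OBJECT_BYTES = 4 * 1024 * 1024  # assumed 4 MiB RBD objects

old = scrub_overhead(NUM_OBJECTS, chunk_max=1, sleep_s=0.5)
new = scrub_overhead(NUM_OBJECTS, chunk_max=25, sleep_s=4)

old_reads = reads_per_object(OBJECT_BYTES, 524288)   # 512 KiB stride
new_reads = reads_per_object(OBJECT_BYTES, 4194304)  # 4 MiB stride

print("old: chunks=%d sleep=%.0fs reads/object=%d" % (old + (old_reads,)))
print("new: chunks=%d sleep=%.0fs reads/object=%d" % (new + (new_reads,)))
```

Under these assumptions the new values produce far fewer chunks per PG, and each chunk corresponds to one replica scrub request like the ones in the warnings above, so there are fewer requests that can end up blocked. This only illustrates the arithmetic; it does not explain the underlying bug.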




