From: Sage Weil
Subject: Re: another scrub bug? blocked for > 10240.948831 secs
Date: Thu, 20 Apr 2017 14:23:13 +0000 (UTC)
In-Reply-To: <456cafe7-ecaa-5c62-ca82-6c09a2b06ae7@brockmann-consult.de>
References: <94127293-3ba3-8bb8-db43-604b6a54eb55@brockmann-consult.de> <456cafe7-ecaa-5c62-ca82-6c09a2b06ae7@brockmann-consult.de>
To: Peter Maloney
Cc: ceph-devel

On Thu, 20 Apr 2017, Peter Maloney wrote:
> On 04/17/17 15:18, Sage Weil wrote:
> > On Sat, 15 Apr 2017, Peter Maloney wrote:
> >> Is this another scrub bug? Something just like this (1 or 2 requests
> >> blocked forever until osd restart) happened about 5 times so far, each
> >> time during recovery or some other thing I did myself to trigger it,
> >> probably involving snapshots. This time I noticed that it says scrub in
> >> the log. One other time it made a client block, but didn't seem to this
> >> time. I didn't have the same issue in 10.2.3, but I don't know if I
> >> generated the same load or whatever causes it back then.
> >>
> >> ceph version 10.2.5 (c461ee19ecbc0c5c330aca20f7392c9a00730367)
> >>
> >> If you want me to try 10.2.6 or 7 instead, I can do that, but no
> >> guarantee I can reproduce it any time soon.
> > There have been lots of scrub-related patches since 10.2.5, but I don't
> > see one that would explain this. I'm guessing there is a scrub waitlist
> > bug that we aren't turning up in qa because our thrashing tests are
> > triggering lots of other actions in sequence (peering from up/down osds
> > and balancing) and those probably have the effect of clearing the issue.
> >
> > Next time you see it, can you capture the output of 'ceph daemon osd.NNN
> > ops' so we can see what steps the request went through? Also, any
> > additional or more specific clues as to what might have triggered it
> > would help.
> >
> > Thanks!
> > sage
>
> Here, we go...
> > > 2017-04-20 05:03:58.522624 osd.20 10.3.0.132:6824/1088346 2078 : > > cluster [WRN] slow request 20480.953923 seconds old, received at > > 2017-04-19 23:22:37.568657: replica scrub(pg: > > 4.2d,from:0'0,to:79873'5433886,epoch:79873,start:4:b46a9ab4:::rbd_data.43e4a9238e1f29.00000000000019fe:e438,end:4:b46a9ab4:::rbd_data.43e4a9238e1f29.00000000000019fe:e43a,chunky:1,deep:1,seed:4294967295,version:6) > > currently reached_pg > > > > > > root@ceph3:/var/log/ceph # ceph pg dump | grep -E "^pg_stat|^4.2d" > > dumped all in format plain > > pg_stat objects mip degr misp unf bytes log > > disklog state state_stamp v reported up > > up_primary acting acting_primary last_scrub scrub_stamp > > last_deep_scrub deep_scrub_stamp > > 4.2d 3494 0 0 0 0 13061467136 > > 3012 3012 active+clean+scrubbing+deep 2017-04-19 > > 23:11:22.763008 80142'5447712 80142:3027638 [14,7,20] > > 14 [14,7,20] 14 76449'5396453 2017-04-18 > > 17:30:01.419498 73551'5298960 2017-04-15 03:23:48.821848 > > > > ceph daemon osd.14 ops > longscrubbug.14.log # ceph1 > > ceph daemon osd.7 ops > longscrubbug.7.log # ceph3 > > ceph daemon osd.20 ops > longscrubbug.20.log # ceph2 > > > > > > > > peter@peter:~/tmp/longscrubbug $ cat longscrubbug.14.log > > { > > "ops": [], > > "num_ops": 0 > > } > > > > peter@peter:~/tmp/longscrubbug $ cat longscrubbug.7.log > > { > > "ops": [], > > "num_ops": 0 > > } > > > > peter@peter:~/tmp/longscrubbug $ cat longscrubbug.20.log > > { > > "ops": [ > > { > > "description": "replica scrub(pg: > > 4.2d,from:0'0,to:79873'5433886,epoch:79873,start:4:b46a9ab4:::rbd_data.43e4a9238e1f29.00000000000019fe:e438,end:4:b46a9ab4:::rbd_data.43e4a9238e1f29.00000000000019fe:e43a,chunky:1,deep:1,seed:4294967295,version:6)", > > "initiated_at": "2017-04-19 23:22:37.568657", > > "age": 30563.826540, > > "duration": 30563.826803, > > "type_data": [ > > "reached pg", > > [ > > { > > "time": "2017-04-19 23:22:37.568657", > > "event": "initiated" > > }, > > { > > "time": "2017-04-19 23:22:37.568769", > > "event": "queued_for_pg" > > }, > > { > > "time": "2017-04-19 23:22:37.568814", > > "event": "reached_pg" > > } > > ] > > ] > > } > > ], > > "num_ops": 1 > > } > > > > to me that looks kinda non-useful... anything else you need? (I think > I'll leave this one alone for a while in case you reply soon and want > something) Hmm. Yeah, not terribly helpful. Can you attach a 'ceph osd dump -f json-pretty' for epoch 79870 through 79880, as well as the latest? My best guess is this is an interaction with snap trimming; the osdmap should indicate if a snap was deleted right around then. Thanks! sage