From mboxrd@z Thu Jan 1 00:00:00 1970
From: Sage Weil
Subject: Re: clearing unfound objects
Date: Tue, 12 Sep 2017 22:48:52 +0000 (UTC)
Message-ID:
References:
Mime-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Return-path:
Received: from mx1.redhat.com ([209.132.183.28]:41852 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1751526AbdILWsy (ORCPT ); Tue, 12 Sep 2017 18:48:54 -0400
In-Reply-To:
Sender: ceph-devel-owner@vger.kernel.org
List-ID:
To: Two Spirit
Cc: John Spray, ceph-devel

On Tue, 12 Sep 2017, Two Spirit wrote:
> >On Tue, 12 Sep 2017, Two Spirit wrote:
> >> I don't have any OSDs that are down, so the 1 unfound object I think
> >> needs to be manually cleared. I ran across a webpage a while ago that
> >> talked about how to clear it, but if you have a reference, would save
> >> me a little time.
> >
> >http://docs.ceph.com/docs/master/rados/troubleshooting/troubleshooting-pg/#failures-osd-unfound
>
> Thanks. That was the page I had read earlier.
>
> I've attached the full outputs to this mail and show just clips below.
>
> # ceph health detail
> OBJECT_UNFOUND 1/731529 unfound (0.000%)
>     pg 6.2 has 1 unfound objects
>
> There looks like one number that shouldn't be there...
> # ceph pg 6.2 list_missing
> {
>     "offset": {
> ...
>         "pool": -9223372036854775808,
>         "namespace": ""
>     },
> ...

I think you've snipped out the bit that has the name of the unfound
object?

sage

>
> # ceph -s
>     osd: 6 osds: 6 up, 6 in; 10 remapped pgs
>
> This shows under the pg query that something believes that osd "2" is
> down, but all OSDs are up, as seen in the previous ceph -s command.
> # ceph pg 6.2 query
>     "recovery_state": [
>         {
>             "name": "Started/Primary/Active",
>             "enter_time": "2017-09-12 10:33:11.193486",
>             "might_have_unfound": [
>                 {
>                     "osd": "0",
>                     "status": "already probed"
>                 },
>                 {
>                     "osd": "1",
>                     "status": "already probed"
>                 },
>                 {
>                     "osd": "2",
>                     "status": "osd is down"
>                 },
>                 {
>                     "osd": "4",
>                     "status": "already probed"
>                 },
>                 {
>                     "osd": "5",
>                     "status": "already probed"
>                 }
>
>
> If i go to a couple other OSDs, and run the same command,
> the osd "2" is listed as "already probed". They are not in sync. I
> double checked that all the OSDs were up on all 3 times I ran the
> command.
>
> Now. my question to debug this to figure out if I want to
> "revert|delete", is what in the heck are these file(s)/object(s)
> associated with the pg? I assume this might be in the MDS, but I'd
> like to see a file name associated with this to make a further
> determination of what I should do. I don't have enough information at
> this point to figure out how I should recover.
>
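
The two open questions in the quoted text (which OSDs the primary still
reports as unprobed, and what the unfound object is called) can both be
read out of the full JSON output rather than the clips. A minimal sketch
in Python, under these assumptions: the complete "ceph pg 6.2 query"
output is valid JSON with the "recovery_state" / "might_have_unfound"
layout shown above, and the list_missing output carries an "objects"
array whose entries hold the name under oid.oid. That second part is
clipped out of the output above, so those field names are an assumption,
and the helper names are made up for illustration.

```python
import json


def unprobed_osds(pg_query):
    """Return (osd, status) pairs from might_have_unfound whose status
    is anything other than "already probed"."""
    pending = []
    for state in pg_query.get("recovery_state", []):
        for peer in state.get("might_have_unfound", []):
            if peer.get("status") != "already probed":
                pending.append((peer.get("osd"), peer.get("status")))
    return pending


def unfound_object_names(list_missing):
    """Return the object names found in a list_missing document.

    ASSUMPTION: the names live in an "objects" array, each entry with a
    nested "oid" dict that has its own "oid" string. The clipped output
    in this thread does not show that part.
    """
    names = []
    for obj in list_missing.get("objects", []):
        oid = obj.get("oid")
        if isinstance(oid, dict) and "oid" in oid:
            names.append(oid["oid"])
    return names


if __name__ == "__main__":
    # Clip from the pg query in this thread, closed off so it parses.
    sample = json.loads("""
    {"recovery_state": [{
        "name": "Started/Primary/Active",
        "might_have_unfound": [
            {"osd": "0", "status": "already probed"},
            {"osd": "1", "status": "already probed"},
            {"osd": "2", "status": "osd is down"},
            {"osd": "4", "status": "already probed"},
            {"osd": "5", "status": "already probed"}
        ]}]}
    """)
    for osd, status in unprobed_osds(sample):
        print("osd.%s: %s" % (osd, status))
```

To run it against a live cluster, save the command output to a file
first and load that with json.load() instead of the inline sample.
Running the same check on the output collected at different times makes
the "not in sync" observation easy to compare side by side.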