* librbd: error finding header
@ 2012-07-09  5:42 Vladimir Bashkirtsev
  2012-07-09  9:03 ` Dan Mick
  0 siblings, 1 reply; 14+ messages in thread
From: Vladimir Bashkirtsev @ 2012-07-09  5:42 UTC
  To: ceph-devel

Hello,

I just hit this error:

error opening image sip.logics.net.au: (2) No such file or directory
2012-07-09 15:03:59.935835 7ffbe0673780 -1 librbd: error finding header: 
(2) No such file or directory

I Googled around and found that Oliver Francke had a similar issue back in
March. I read your responses, but it is still unclear to me where to start
digging. I just upgraded from 0.47.3 to 0.48 as a rolling upgrade, and all
RBD images are OK with the exception of this one. Notably, the upgrade was
done while the VM was up, and it continued to run unaffected (i.e. all data
inside the VM was written and read as if nothing had happened). That makes
me think there is no data loss, merely a broken header.

Of course I have a backup of the VM and can pull it back in no time, but I
am really interested (as I am sure other ceph users are): how does one
recover from such failures?

Regards,
Vladimir


* Re: librbd: error finding header
  2012-07-09  5:42 librbd: error finding header Vladimir Bashkirtsev
@ 2012-07-09  9:03 ` Dan Mick
  2012-07-09 10:29   ` Vladimir Bashkirtsev
  0 siblings, 1 reply; 14+ messages in thread
From: Dan Mick @ 2012-07-09  9:03 UTC
  To: Vladimir Bashkirtsev; +Cc: ceph-devel

Vladimir: you can do some investigation with the rados command.  What does
rados -p rbd ls show you?


* Re: librbd: error finding header
  2012-07-09  9:03 ` Dan Mick
@ 2012-07-09 10:29   ` Vladimir Bashkirtsev
  2012-07-09 16:30     ` Florian Haas
  2012-07-09 17:47     ` Dan Mick
  0 siblings, 2 replies; 14+ messages in thread
From: Vladimir Bashkirtsev @ 2012-07-09 10:29 UTC
  To: Dan Mick; +Cc: ceph-devel

On 09/07/12 18:33, Dan Mick wrote:
> Vladimir: you can do some investigation with the rados command.  What 
> does
> rados -p rbd ls show you?
Rather long list of:
rb.0.11.000000002786
rb.0.d.0000000054a2
rb.0.6.000000002eb5
rb.0.d.000000008294
rb.0.13.000000000377
rb.0.e.000000000629
rb.0.6.000000002756
rb.0.d.000000006156
rb.0.d.000000009b82
rb.0.5.000000000c9e
rb.0.d.0000000080ba
rb.0.f.000000000e75
rb.0.6.00000000ab4f
rb.0.d.0000000048e4
rb.0.d.000000005f67
rb.0.13.0000000014ad
rb.0.d.00000000e074
rb.0.f.000000001a4b
rb.0.13.0000000004a3
...

How do I find out which image these objects belong to?


* Re: librbd: error finding header
  2012-07-09 10:29   ` Vladimir Bashkirtsev
@ 2012-07-09 16:30     ` Florian Haas
  2012-07-10  3:28       ` Vladimir Bashkirtsev
  2012-07-09 17:47     ` Dan Mick
  1 sibling, 1 reply; 14+ messages in thread
From: Florian Haas @ 2012-07-09 16:30 UTC
  To: Vladimir Bashkirtsev; +Cc: Dan Mick, ceph-devel

On 07/09/12 12:29, Vladimir Bashkirtsev wrote:
> On 09/07/12 18:33, Dan Mick wrote:
>> Vladimir: you can do some investigation with the rados command.  What
>> does
>> rados -p rbd ls show you?
> Rather long list of:
> rb.0.11.000000002786
> rb.0.d.0000000054a2
> rb.0.6.000000002eb5
> rb.0.d.000000008294
> rb.0.13.000000000377
> rb.0.e.000000000629
> rb.0.6.000000002756
> rb.0.d.000000006156
> rb.0.d.000000009b82
> rb.0.5.000000000c9e
> rb.0.d.0000000080ba
> rb.0.f.000000000e75
> rb.0.6.00000000ab4f
> rb.0.d.0000000048e4
> rb.0.d.000000005f67
> rb.0.13.0000000014ad
> rb.0.d.00000000e074
> rb.0.f.000000001a4b
> rb.0.13.0000000004a3
> ...
> 
> How to find out to which image these objects belong?

"rbd info" would tell you the block prefix for the image you're looking
at. Or does that command give you an "error opening image" message as well?

Cheers,
Florian


* Re: librbd: error finding header
  2012-07-09 10:29   ` Vladimir Bashkirtsev
  2012-07-09 16:30     ` Florian Haas
@ 2012-07-09 17:47     ` Dan Mick
  2012-07-10  3:29       ` Vladimir Bashkirtsev
       [not found]       ` <4FFBA108.3010009@bashkirtsev.com>
  1 sibling, 2 replies; 14+ messages in thread
From: Dan Mick @ 2012-07-09 17:47 UTC
  To: Vladimir Bashkirtsev; +Cc: ceph-devel

Well, it's not so much those; those are the objects that hold data 
blocks.  You're more interested in the objects whose names end in 
'.rbd'.  These are the header objects, one per image, and are 
interpreted by rbd info, but I'm concerned that one of them may not exist.
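
The distinction between header objects and data objects can be sketched as below. This is only an illustration, assuming the output of `rados -p rbd ls` has already been read into a list of strings; the function name `split_rbd_objects` is hypothetical and not part of any Ceph tool.

```python
from collections import defaultdict


def split_rbd_objects(names):
    """Split object names into image headers and data objects.

    Returns (headers, data_by_prefix): headers is a list of image names
    taken from '<image>.rbd' objects; data_by_prefix maps each block
    prefix (e.g. 'rb.0.d') to the list of object indexes seen for it.
    """
    headers = []
    data_by_prefix = defaultdict(list)
    for name in names:
        if name.endswith(".rbd"):
            # Header object: one per image, named '<image>.rbd'.
            headers.append(name[: -len(".rbd")])
        elif name.startswith("rb."):
            # Data object: '<prefix>.<hex index>'.
            prefix, _, index = name.rpartition(".")
            try:
                data_by_prefix[prefix].append(int(index, 16))
            except ValueError:
                continue  # suffix is not a hex index; ignore it
    return headers, dict(data_by_prefix)
```

An image listed by rbd ls whose name does not appear in `headers` is one whose header object is missing.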


* Re: librbd: error finding header
  2012-07-09 16:30     ` Florian Haas
@ 2012-07-10  3:28       ` Vladimir Bashkirtsev
  0 siblings, 0 replies; 14+ messages in thread
From: Vladimir Bashkirtsev @ 2012-07-10  3:28 UTC
  To: Florian Haas; +Cc: Dan Mick, ceph-devel

On 10/07/12 02:00, Florian Haas wrote:
> On 07/09/12 12:29, Vladimir Bashkirtsev wrote:
>> On 09/07/12 18:33, Dan Mick wrote:
>>> Vladimir: you can do some investigation with the rados command.  What
>>> does
>>> rados -p rbd ls show you?
>> Rather long list of:
>> rb.0.11.000000002786
>> rb.0.d.0000000054a2
>> rb.0.6.000000002eb5
>> rb.0.d.000000008294
>> rb.0.13.000000000377
>> rb.0.e.000000000629
>> rb.0.6.000000002756
>> rb.0.d.000000006156
>> rb.0.d.000000009b82
>> rb.0.5.000000000c9e
>> rb.0.d.0000000080ba
>> rb.0.f.000000000e75
>> rb.0.6.00000000ab4f
>> rb.0.d.0000000048e4
>> rb.0.d.000000005f67
>> rb.0.13.0000000014ad
>> rb.0.d.00000000e074
>> rb.0.f.000000001a4b
>> rb.0.13.0000000004a3
>> ...
>>
>> How to find out to which image these objects belong?
> "rbd info" would tell you the block prefix for the image you're looking
> at. Or does that command give you an "error opening image" message as well?
Yes, the same error. But Dan guessed it right.
>
> Cheers,
> Florian




* Re: librbd: error finding header
  2012-07-09 17:47     ` Dan Mick
@ 2012-07-10  3:29       ` Vladimir Bashkirtsev
       [not found]       ` <4FFBA108.3010009@bashkirtsev.com>
  1 sibling, 0 replies; 14+ messages in thread
From: Vladimir Bashkirtsev @ 2012-07-10  3:29 UTC
  To: Dan Mick; +Cc: ceph-devel

On 10/07/12 03:17, Dan Mick wrote:
> Well, it's not so much those; those are the objects that hold data 
> blocks.  You're more interested in the objects whose names end in 
> '.rbd'.  These are the header objects, one per image, and are 
> interpreted by rbd info, but I'm concerned that one of them may not 
> exist.
Right on the ball: the .rbd object for the image concerned just does not
exist. So how can we recover from this? And why did it disappear in the
first place? (I guess the latter may be related to some sort of bug.)

* Re: librbd: error finding header
       [not found]         ` <4FFBB74F.2050702@inktank.com>
@ 2012-07-10  9:25           ` Vladimir Bashkirtsev
  2012-07-10 20:08             ` Josh Durgin
  0 siblings, 1 reply; 14+ messages in thread
From: Vladimir Bashkirtsev @ 2012-07-10  9:25 UTC
  To: Dan Mick; +Cc: ceph-devel

On 10/07/12 14:32, Dan Mick wrote:
>
>
> On 07/09/2012 08:27 PM, Vladimir Bashkirtsev wrote:
>> On 10/07/12 03:17, Dan Mick wrote:
>>> Well, it's not so much those; those are the objects that hold data
>>> blocks.  You're more interested in the objects whose names end in
>>> '.rbd'.  These are the header objects, one per image, and are
>>> interpreted by rbd info, but I'm concerned that one of them may not
>>> exist.
>> Right on the ball: .rbd for image concerned just does not exist. So how
>> can we recover from this? And why it has disappeared in first place? (I
>> guess latter may be related to some sort of bug)
>
> Don't know why it might have disappeared.  Recovery: no easy way. It's 
> possible that image header could be reconstructed, but there aren't 
> any tools written to do it (the header format is pretty uncomplicated).
Well... Then I either need to rebuild it manually or clean up the remains
of the image to free up space. Given that the rbd tool refuses to do
anything without the .rbd object, the cleanup appears to be manual as well.

I have run rbd info on the rest of the images and excluded the rb.* objects
belonging to good images. Now I know the broken image has the prefix rb.0.1,
and technically I can clean out the objects belonging to this image. But rbd
ls seems to pull the list of rbd images from somewhere: the broken image
must be removed from there as well. I am not sure where it is stored.
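
The elimination step described above can be sketched as follows. This is a minimal illustration, assuming the object names come from `rados -p rbd ls` and the known prefixes were collected by hand from `rbd info` on each healthy image; the function name `orphan_prefixes` is hypothetical.

```python
def orphan_prefixes(object_names, known_prefixes):
    """Return block prefixes that no healthy image claims.

    object_names: all object names in the pool.
    known_prefixes: block prefixes reported for the images that still open.
    """
    found = set()
    for name in object_names:
        # Data objects look like '<prefix>.<hex index>'; skip headers.
        if name.startswith("rb.") and not name.endswith(".rbd"):
            found.add(name.rpartition(".")[0])
    return sorted(found - set(known_prefixes))
```

With a single broken image, the result should contain exactly one prefix; more than one would mean several images have lost their headers.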

Alternatively, how hard would it be to throw together a quick tool which
picks up these objects and reconstructs the .rbd header? Something tells me
that it should be relatively straightforward.

I have no pressing need to recover this image - I have pulled the backup
and it is now on its merry way. But for the future's sake we need to get
this one resolved: another day someone else will hit it.

----------------------------------

30 minutes later:

I have looked at the structure of rbd_obj_header_ondisk and it really is
quite simple. The image has no snapshots, which makes everything
straightforward. The order is the default 22; the size is, well, unknown,
but finding the object with the highest index provides some guidance. I got
the rbd header from another image, changed the name and size using hexedit,
put it back and voilà - the image is back and running. I am not quite sure
about integrity, but at least it will now be possible to remove the image
cleanly.
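
The size guess from the highest object index can be sketched like this. It is an estimate only, assuming the index suffix is hexadecimal (as the listed names suggest) and that each object covers 2^order bytes; the function name `estimate_image_size` is hypothetical. Because rbd is thin-provisioned, blocks that were never written have no object, so the result is a lower bound on the real image size.

```python
def estimate_image_size(object_names, prefix, order=22):
    """Lower-bound image size in bytes from the highest data object index.

    prefix: block prefix of the image, e.g. 'rb.0.1'.
    order: log2 of the object size; 22 means 4 MiB objects.
    """
    object_size = 1 << order
    indexes = [
        int(name.rpartition(".")[2], 16)
        for name in object_names
        if name.rpartition(".")[0] == prefix
    ]
    if not indexes:
        return 0
    # Indexes start at 0, so the highest index N implies N + 1 objects.
    return (max(indexes) + 1) * object_size
```

For example, a highest index of 0xff with order 22 gives 256 objects of 4 MiB each, i.e. at least 1 GiB.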

>
> It certainly shouldn't have just happened.  Any idea what operations 
> might have been in progress when it did?
Obviously not. I have been running ceph over the last few months trying to
get it off track, and until now I had no major issues. The VM concerned was
running while I did the upgrade from 0.47.3 to 0.48. After that I asked the
list whether it is safe to live migrate a VM with rbd cache on. Josh
confirmed that it is safe to do so, so I live migrated the VM to another
host. No dramas; everything still ran. Then I updated the hosts (a rolling
update again - migrating VMs away while rebooting hosts). I have around 10
VMs (including heavily loaded ones) and all of them migrated around without
any issues. Then suddenly this VM refused to migrate. While typing this I
remembered that there was one issue between the upgrade of ceph and the
failure to migrate: one of the pgs turned inconsistent. pg repair fixed it
and I immediately forgot about it. Could that be the reason why this .rbd
disappeared? (I went to check the logs, but logrotate had already removed
them.)


* Re: librbd: error finding header
  2012-07-10  9:25           ` Vladimir Bashkirtsev
@ 2012-07-10 20:08             ` Josh Durgin
  2012-07-12  2:40               ` Vladimir Bashkirtsev
  0 siblings, 1 reply; 14+ messages in thread
From: Josh Durgin @ 2012-07-10 20:08 UTC
  To: Vladimir Bashkirtsev; +Cc: Dan Mick, ceph-devel

On 07/10/2012 02:25 AM, Vladimir Bashkirtsev wrote:
> On 10/07/12 14:32, Dan Mick wrote:
>>
>>
>> On 07/09/2012 08:27 PM, Vladimir Bashkirtsev wrote:
>>> On 10/07/12 03:17, Dan Mick wrote:
>>>> Well, it's not so much those; those are the objects that hold data
>>>> blocks.  You're more interested in the objects whose names end in
>>>> '.rbd'.  These are the header objects, one per image, and are
>>>> interpreted by rbd info, but I'm concerned that one of them may not
>>>> exist.
>>> Right on the ball: .rbd for image concerned just does not exist. So how
>>> can we recover from this? And why it has disappeared in first place? (I
>>> guess latter may be related to some sort of bug)
>>
>> Don't know why it might have disappeared.  Recovery: no easy way. It's
>> possible that image header could be reconstructed, but there aren't
>> any tools written to do it (the header format is pretty uncomplicated).
> Well... Then somehow either I need to rebuild it manually or clean up
> image remains to free up space. Given that rbd tool refuses to do
> anything without .rbd object then clean up appears to be manual as well.
>
> I have run rbd info on the rest of images and excluded rb.* objects
> belonging to good images. Now I know broken image has prefix of rb.0.1
> and technically I can clean out objects belonging to this image. But rbd
> ls seems to pull the list of rbd images from somewhere: broken image
> must be removed from there as well. Not sure where it is stored.

This is stored in the rbd_directory object. 'rbd rm' tries to do
as much as it can when missing the header, including removing the image
from the directory. If you do 'rbd rm image --debug-rbd 2' you should
see this happen. You'll still get the message about the header being
missing, but it should continue and remove it from the rbd_directory
object as well. It can't remove the data objects since it doesn't
know the correct prefix without the header.

> Alternatively how hard it would be to throw together a quick tool which
> picks up these objects and reconstructs .rbd header? Something tells me
> that it should be relatively straight forward.

It's pretty simple if you don't have any snapshots. If you do have
snapshots, you would need to figure out which snapshot ids they have,
and without the header the only way you could do that would be to
examine the rbd data objects on the osds (there's no way to examine
which selfmanaged snapshots exist via librados right now).

Alternatively, you could brute force the snapshot ids by attempting to 
read from each snapshot id for each data object (since not all of them
will have all snapshots). If they all return -ENOENT for a given
snapshot id, that snapshot id doesn't exist in the image.
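
The brute-force search could be sketched as below. This is an illustration only: `exists_at_snap` is a hypothetical callable supplied by the caller, which would attempt a read of the given object at the given snapshot id and return False when the read fails with -ENOENT. No real librados call is shown here.

```python
def find_snapshot_ids(object_names, candidate_ids, exists_at_snap):
    """Return the candidate snapshot ids that exist for the image.

    A snapshot id is kept as soon as any data object can be read at it;
    it is dropped only if every object reports "not found", since not
    all objects will have all snapshots.
    """
    present = []
    for snap_id in candidate_ids:
        if any(exists_at_snap(name, snap_id) for name in object_names):
            present.append(snap_id)
    return present
```

The cost is up to one read per object per candidate id, so this is only practical for small images or a narrow range of ids.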

If any snapshots had different sizes, or in future versions of rbd had 
other metadata change, you might need to recreate that metadata to be 
able to use the snapshot.

If you created a new header ignoring any snapshots that existed,
you would end up with space still being used by the snapshots after
you removed the image.

> I have no pressing need to recover this image - I have pulled the backup
> and now it is on its merry way. But just for future sake we need to get
> this one resolved: another day someone else will hit it.
>
> ----------------------------------
>
> 30 minutes later:
>
> I have looked at structure of rbd_obj_header_ondisk and really it is
> quite simple. Image has no snapshots and so it makes everything straight
> forward. Order is default 22, size - well, unknown but finding object
> with highest index provides some guidance. get rbd header from another
> image, using hexedit changed name and size, put it back and viola -
> image is back and running. Not quite sure about integrity but at least
> now it will allow to remove image cleanly.
>
>>
>> It certainly shouldn't have just happened.  Any idea what operations
>> might have been in progress when it did?
> Obviously not. I am running ceph over last few months trying to get it
> off track and till now had no major issues. VM concerned was running
> while I did upgrade from 0.47.3 to 0.48. After that point I have asked
> the list if it is safe to live migrate VM with rbd cache on. Josh
> confirmed that it is safe to do so. So I have live migrated VM to
> another host. No dramas. Still everything runs. Then I have updated
> hosts (rolling update again - migrating VMs away while rebooting hosts).
> I have around 10 VMs (including heavily loaded) and all of them migrated
> around without any issues. Then suddenly this VM refused to migrate.
> While I was typing it I remembered that there was one issue between
> upgrade of ceph and failure to migrate: one of pgs turned inconsistent.
> pg repair fixed it and I immediately forgot about it. Could it be the
> reason why this .rbd disappeared? (Went to check logs but logrotate
> already removed it).

Yeah, the inconsistent pg was very likely the problem. If you see
that happen again it would be good to save the osd logs so we can try to
figure out how it happened. PG repair won't be able to replace missing 
objects if they're missing on all replicas.

Josh


* Re: librbd: error finding header
  2012-07-10 20:08             ` Josh Durgin
@ 2012-07-12  2:40               ` Vladimir Bashkirtsev
  2012-07-12  4:41                 ` Josh Durgin
  0 siblings, 1 reply; 14+ messages in thread
From: Vladimir Bashkirtsev @ 2012-07-12  2:40 UTC
  To: joshd; +Cc: Dan Mick, ceph-devel

On 11/07/12 05:38, Josh Durgin wrote:
> On 07/10/2012 02:25 AM, Vladimir Bashkirtsev wrote:
>> On 10/07/12 14:32, Dan Mick wrote:
>>>
>>>
>>> On 07/09/2012 08:27 PM, Vladimir Bashkirtsev wrote:
>>>> On 10/07/12 03:17, Dan Mick wrote:
>>>>> Well, it's not so much those; those are the objects that hold data
>>>>> blocks.  You're more interested in the objects whose names end in
>>>>> '.rbd'.  These are the header objects, one per image, and are
>>>>> interpreted by rbd info, but I'm concerned that one of them may not
>>>>> exist.
>>>> Right on the ball: .rbd for image concerned just does not exist. So 
>>>> how
>>>> can we recover from this? And why it has disappeared in first 
>>>> place? (I
>>>> guess latter may be related to some sort of bug)
>>>
>>> Don't know why it might have disappeared.  Recovery: no easy way. It's
>>> possible that image header could be reconstructed, but there aren't
>>> any tools written to do it (the header format is pretty uncomplicated).
>> Well... Then somehow either I need to rebuild it manually or clean up
>> image remains to free up space. Given that rbd tool refuses to do
>> anything without .rbd object then clean up appears to be manual as well.
>>
>> I have run rbd info on the rest of images and excluded rb.* objects
>> belonging to good images. Now I know broken image has prefix of rb.0.1
>> and technically I can clean out objects belonging to this image. But rbd
>> ls seems to pull the list of rbd images from somewhere: broken image
>> must be removed from there as well. Not sure where it is stored.
>
> This is stored in the rbd_directory object. 'rbd rm' tries to do
> as much as it can when missing the header, including removing the image
> from the directory. If you do 'rbd rm image --debug-rbd 2' you should
> see this happen. You'll still get the message about the header being
> missing, but it should continue and remove it from the rbd_directory
> object as well. It can't remove the data objects since it doesn't
> know the correct prefix without the header.
I managed to recreate the header and then ran rbd rm as normal.
>
>> Alternatively how hard it would be to throw together a quick tool which
>> picks up these objects and reconstructs .rbd header? Something tells me
>> that it should be relatively straight forward.
>
> It's pretty simple if you don't have any snapshots. If you do have
> snapshots, you would need to figure out which snapshot ids they have,
> and without the header the only way you could do that would be to
> examine the rbd data objects on the osds (there's no way to examine
> which selfmanaged snapshots exist via librados right now).
>
> Alternatively, you could brute force the snapshot ids by attempting to 
> read from each snapshot id for each data object (since not all of them
> will have all snapshots). If they all return -ENOENT for a given
> snapshot id, that snapshot id doesn't exist in the image.
>
> If any snapshots had different sizes, or in future versions of rbd had 
> other metadata change, you might need to recreate that metadata to be 
> able to use the snapshot.
>
> If you created a new header ignoring any snapshots that existed,
> you would end up with space still being used by the snapshots after
> you removed the image.
Thank you for your explanation. I believe other people will definitely
find it useful. My case was not severe, but having a clear idea of how the
underlying rados storage keeps rbd images is definitely a bonus should
anything like this happen again.

Just a quick question: is the index at the end of the object name
rb.*.*.<index> the sequential number of the object in the rbd image? I.e.
to find the size of an image, do we find the highest index and multiply it
by the block size (based on order)? I guess the size of an image is not
recorded anywhere except the header?
>
>> I have no pressing need to recover this image - I have pulled the backup
>> and now it is on its merry way. But just for future sake we need to get
>> this one resolved: another day someone else will hit it.
>>
>> ----------------------------------
>>
>> 30 minutes later:
>>
>> I have looked at structure of rbd_obj_header_ondisk and really it is
>> quite simple. Image has no snapshots and so it makes everything straight
>> forward. Order is default 22, size - well, unknown but finding object
>> with highest index provides some guidance. get rbd header from another
>> image, using hexedit changed name and size, put it back and viola -
>> image is back and running. Not quite sure about integrity but at least
>> now it will allow to remove image cleanly.
>>
>>>
>>> It certainly shouldn't have just happened.  Any idea what operations
>>> might have been in progress when it did?
>> Obviously not. I am running ceph over last few months trying to get it
>> off track and till now had no major issues. VM concerned was running
>> while I did upgrade from 0.47.3 to 0.48. After that point I have asked
>> the list if it is safe to live migrate VM with rbd cache on. Josh
>> confirmed that it is safe to do so. So I have live migrated VM to
>> another host. No dramas. Still everything runs. Then I have updated
>> hosts (rolling update again - migrating VMs away while rebooting hosts).
>> I have around 10 VMs (including heavily loaded) and all of them migrated
>> around without any issues. Then suddenly this VM refused to migrate.
>> While I was typing it I remembered that there was one issue between
>> upgrade of ceph and failure to migrate: one of pgs turned inconsistent.
>> pg repair fixed it and I immediately forgot about it. Could it be the
>> reason why this .rbd disappeared? (Went to check logs but logrotate
>> already removed it).
>
> Yeah, the the inconsistent pg was very likely the problem. If you see
> that happen again it would be good to save the osd logs so we can try to
> figure out how it happened. PG repair won't be able to replace missing 
> objects if they're missing on all replicas.
I really did a stupid thing in allowing these logs to rotate out. I was
preoccupied with something else when I hit the inconsistency, which was
fixed by pg repair. It did not occur to me to at least copy the logs away.
Now it is too late, but I will certainly remember it next time.
>
> Josh




* Re: librbd: error finding header
  2012-07-12  2:40               ` Vladimir Bashkirtsev
@ 2012-07-12  4:41                 ` Josh Durgin
  2012-07-12 16:00                   ` Tommi Virtanen
  0 siblings, 1 reply; 14+ messages in thread
From: Josh Durgin @ 2012-07-12  4:41 UTC
  To: Vladimir Bashkirtsev; +Cc: Dan Mick, ceph-devel

On 07/11/2012 07:40 PM, Vladimir Bashkirtsev wrote:
> On 11/07/12 05:38, Josh Durgin wrote:
>> On 07/10/2012 02:25 AM, Vladimir Bashkirtsev wrote:
>>> On 10/07/12 14:32, Dan Mick wrote:
>>>>
>>>>
>>>> On 07/09/2012 08:27 PM, Vladimir Bashkirtsev wrote:
>>>>> On 10/07/12 03:17, Dan Mick wrote:
>>>>>> Well, it's not so much those; those are the objects that hold data
>>>>>> blocks.  You're more interested in the objects whose names end in
>>>>>> '.rbd'.  These are the header objects, one per image, and are
>>>>>> interpreted by rbd info, but I'm concerned that one of them may not
>>>>>> exist.
>>>>> Right on the ball: .rbd for image concerned just does not exist. So
>>>>> how
>>>>> can we recover from this? And why it has disappeared in first
>>>>> place? (I
>>>>> guess latter may be related to some sort of bug)
>>>>
>>>> Don't know why it might have disappeared.  Recovery: no easy way. It's
>>>> possible that image header could be reconstructed, but there aren't
>>>> any tools written to do it (the header format is pretty uncomplicated).
>>> Well... Then somehow either I need to rebuild it manually or clean up
>>> image remains to free up space. Given that rbd tool refuses to do
>>> anything without .rbd object then clean up appears to be manual as well.
>>>
>>> I have run rbd info on the rest of images and excluded rb.* objects
>>> belonging to good images. Now I know broken image has prefix of rb.0.1
>>> and technically I can clean out objects belonging to this image. But rbd
>>> ls seems to pull the list of rbd images from somewhere: broken image
>>> must be removed from there as well. Not sure where it is stored.
>>
>> This is stored in the rbd_directory object. 'rbd rm' tries to do
>> as much as it can when missing the header, including removing the image
>> from the directory. If you do 'rbd rm image --debug-rbd 2' you should
>> see this happen. You'll still get the message about the header being
>> missing, but it should continue and remove it from the rbd_directory
>> object as well. It can't remove the data objects since it doesn't
>> know the correct prefix without the header.
> Managed to recreate header and run rbd rm as per normal
>>
>>> Alternatively how hard it would be to throw together a quick tool which
>>> picks up these objects and reconstructs .rbd header? Something tells me
>>> that it should be relatively straight forward.
>>
>> It's pretty simple if you don't have any snapshots. If you do have
>> snapshots, you would need to figure out which snapshot ids they have,
>> and without the header the only way you could do that would be to
>> examine the rbd data objects on the osds (there's no way to examine
>> which selfmanaged snapshots exist via librados right now).
>>
>> Alternatively, you could brute force the snapshot ids by attempting to
>> read from each snapshot id for each data object (since not all of them
>> will have all snapshots). If they all return -ENOENT for a given
>> snapshot id, that snapshot id doesn't exist in the image.
>>
>> If any snapshots had different sizes, or in future versions of rbd had
>> other metadata change, you might need to recreate that metadata to be
>> able to use the snapshot.
>>
>> If you created a new header ignoring any snapshots that existed,
>> you would end up with space still being used by the snapshots after
>> you removed the image.
> Thank you for your explanation. I believe other people will definitely
> find it useful. My case was not severe, but having a clear idea of how
> the underlying rados storage keeps rbd images is definitely a bonus should
> anything like it happen again.

Yeah, we're trying to document newer things better. The design for
layering is at http://ceph.com/docs/master/dev/rbd-layering/, and
the new format that goes with it is partially described in this thread:

http://permalink.gmane.org/gmane.comp.file-systems.ceph.devel/6595

A summary of that should be added to the online docs too.

> Just a quick question: is the index at the end of the object name
> rb.*.*.<index> the sequential number of the object in the rbd image? I.e.
> to find the size of an image, we find the highest index, multiply it by
> the block size (based on order) and get the size of the image? I guess
> the size of an image is not recorded anywhere except the header?

Size is only recorded in the header. One of the reasons is that
there's only one place to update when the image is resized
(transactions are only possible on a single object).

You're right about the object name - you can get its offset in the
image that way. Since rbd is thin-provisioned, however, the highest
index object might not be the highest possible object. When you first
create an image, only the header object is created.
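
To illustrate the arithmetic, here is a minimal Python sketch (not an
existing tool). It assumes the index suffix is a zero-padded hexadecimal
object number and that the image uses the default order of 22 (4 MiB
objects); the function names are made up for this example.

```python
def object_offset(name, order=22):
    """Byte offset within the image of the data object called `name`.

    Assumption: the text after the last '.' is the object number in hex.
    """
    index = int(name.rsplit(".", 1)[1], 16)
    return index << order


def aligned_size_guess(names, prefix, order=22):
    """Object-aligned size estimate from the data objects that exist.

    This is only a guess: the real size may be larger (thin provisioning
    means trailing objects may never have been written) or up to one
    object size minus one byte smaller (sizes need not be multiples of
    the object size).
    """
    offsets = [object_offset(n, order)
               for n in names if n.startswith(prefix + ".")]
    if not offsets:
        return 0
    return max(offsets) + (1 << order)


# Hypothetical listing, e.g. collected from 'rados -p rbd ls'.
listing = [
    "rb.0.1.000000000000",
    "rb.0.1.00000000000a",
    "rb.0.2.0000000000ff",  # belongs to a different image
]
print(aligned_size_guess(listing, "rb.0.1"))
```

The estimate is only a starting point for the size field in a rebuilt
header, for the reasons given above.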

Josh

>>
>>> I have no pressing need to recover this image - I have pulled the backup
>>> and now it is on its merry way. But just for future sake we need to get
>>> this one resolved: another day someone else will hit it.
>>>
>>> ----------------------------------
>>>
>>> 30 minutes later:
>>>
>>> I have looked at the structure of rbd_obj_header_ondisk and it really is
>>> quite simple. The image has no snapshots, which makes everything
>>> straightforward. Order is the default 22; size is unknown, but finding
>>> the object with the highest index provides some guidance. I got the rbd
>>> header from another image, changed the name and size using hexedit, put
>>> it back and voilà: the image is back and running. I am not quite sure
>>> about integrity, but at least the image can now be removed cleanly.
>>>
>>>>
>>>> It certainly shouldn't have just happened.  Any idea what operations
>>>> might have been in progress when it did?
>>> Obviously not. I am running ceph over last few months trying to get it
>>> off track and till now had no major issues. VM concerned was running
>>> while I did upgrade from 0.47.3 to 0.48. After that point I have asked
>>> the list if it is safe to live migrate VM with rbd cache on. Josh
>>> confirmed that it is safe to do so. So I have live migrated VM to
>>> another host. No dramas. Still everything runs. Then I have updated
>>> hosts (rolling update again - migrating VMs away while rebooting hosts).
>>> I have around 10 VMs (including heavily loaded) and all of them migrated
>>> around without any issues. Then suddenly this VM refused to migrate.
>>> While I was typing it I remembered that there was one issue between
>>> upgrade of ceph and failure to migrate: one of pgs turned inconsistent.
>>> pg repair fixed it and I immediately forgot about it. Could it be the
>>> reason why this .rbd disappeared? (Went to check logs but logrotate
>>> already removed it).
>>
>> Yeah, the inconsistent pg was very likely the problem. If you see
>> that happen again it would be good to save the osd logs so we can try to
>> figure out how it happened. PG repair won't be able to replace missing
>> objects if they're missing on all replicas.
> I really did a stupid thing in allowing these logs to rotate out. I was
> preoccupied with something else when I hit the inconsistency, which was
> fixed by pg repair, and it did not occur to me to at least copy the logs
> away. Now it is too late, but I will certainly remember it next time.
>>
>> Josh
>
>



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: librbd: error finding header
  2012-07-12  4:41                 ` Josh Durgin
@ 2012-07-12 16:00                   ` Tommi Virtanen
  2012-07-13 13:06                     ` Vladimir Bashkirtsev
  0 siblings, 1 reply; 14+ messages in thread
From: Tommi Virtanen @ 2012-07-12 16:00 UTC (permalink / raw)
  To: Josh Durgin; +Cc: Vladimir Bashkirtsev, Dan Mick, ceph-devel

On Wed, Jul 11, 2012 at 9:41 PM, Josh Durgin <josh.durgin@inktank.com> wrote:
> You're right about the object name - you can get its offset in the
> image that way. Since rbd is thin-provisioned, however, the highest
> index object might not be the highest possible object. When you first
> create an image, only the header object is created.

You can re-create it with a size that's known to be greater than the
old size (put in a terabyte extra, or something), and then use a
partitioning tool to see what the disk layout really is, and resize
based on that.
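
A minimal sketch of the "see what the disk layout really is" step, assuming
the image carries a classic MBR partition table with 512-byte sectors and
that the first 512 bytes of the image have already been read into a bytes
object. The function name is hypothetical. Extended partitions are not
followed, and a GPT disk only has a protective MBR here, so a real
partitioning tool remains the safer choice.

```python
import struct

SECTOR = 512  # assumption: 512-byte logical sectors


def mbr_end_bytes(sector0):
    """Byte offset just past the last primary partition in an MBR."""
    if len(sector0) < 512 or sector0[510:512] != b"\x55\xaa":
        raise ValueError("no MBR boot signature found")
    end_sector = 0
    for i in range(4):  # four primary entries, 16 bytes each, from 446
        entry = sector0[446 + 16 * i: 446 + 16 * (i + 1)]
        part_type = entry[4]
        # Little-endian 32-bit start LBA and sector count at bytes 8..15.
        start, count = struct.unpack_from("<II", entry, 8)
        if part_type != 0 and count:
            end_sector = max(end_sector, start + count)
    return end_sector * SECTOR


# Build a fake first sector: one Linux partition, start 2048, 20480 sectors.
fake = bytearray(512)
fake[510:512] = b"\x55\xaa"
fake[446 + 4] = 0x83
struct.pack_into("<II", fake, 446 + 8, 2048, 20480)
print(mbr_end_bytes(bytes(fake)))
```

The result is the smallest size that still covers every primary partition;
the image could then be resized down to something at or above that value.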

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: librbd: error finding header
  2012-07-12 16:00                   ` Tommi Virtanen
@ 2012-07-13 13:06                     ` Vladimir Bashkirtsev
  2012-07-14  0:34                       ` Josh Durgin
  0 siblings, 1 reply; 14+ messages in thread
From: Vladimir Bashkirtsev @ 2012-07-13 13:06 UTC (permalink / raw)
  To: Tommi Virtanen; +Cc: Josh Durgin, Dan Mick, ceph-devel

On 13/07/12 01:30, Tommi Virtanen wrote:
> On Wed, Jul 11, 2012 at 9:41 PM, Josh Durgin <josh.durgin@inktank.com> wrote:
>> You're right about the object name - you can get its offset in the
>> image that way. Since rbd is thin-provisioned, however, the highest
>> index object might not be the highest possible object. When you first
>> create an image, only the header object is created.
> You can re-create it with a size that's known to be greater than the
> old size (put in a terabyte extra, or something), and then use a
> partitioning tool to see what the disk layout really is, and resize
> based on that.
Good point. However, ceph should not need to be aware of an image's
internal structure. In most installations the image will contain a
partition table, which can obviously be used to calculate the image size,
but in some cases (when the whole image is used for something else) it may
not. Perhaps a good approach for RBD would be to create the first and last
objects of an image when the RBD header is created. That would waste a bit
of space, but these objects generally hold the partitioning information,
and their mere existence would establish the boundaries of the image. It
does not help with snapshots, but it would definitely be helpful for a
recovery tool.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: librbd: error finding header
  2012-07-13 13:06                     ` Vladimir Bashkirtsev
@ 2012-07-14  0:34                       ` Josh Durgin
  0 siblings, 0 replies; 14+ messages in thread
From: Josh Durgin @ 2012-07-14  0:34 UTC (permalink / raw)
  To: Vladimir Bashkirtsev; +Cc: Tommi Virtanen, Dan Mick, ceph-devel

On 07/13/2012 06:06 AM, Vladimir Bashkirtsev wrote:
> On 13/07/12 01:30, Tommi Virtanen wrote:
>> On Wed, Jul 11, 2012 at 9:41 PM, Josh Durgin <josh.durgin@inktank.com>
>> wrote:
>>> You're right about the object name - you can get its offset in the
>>> image that way. Since rbd is thin-provisioned, however, the highest
>>> index object might not be the highest possible object. When you first
>>> create an image, only the header object is created.
>> You can re-create it with a size that's known to be greater than the
>> old size (put in a terabyte extra, or something), and then use a
>> partitioning tool to see what the disk layout really is, and resize
>> based on that.
> Good point. However, ceph should not need to be aware of an image's
> internal structure. In most installations the image will contain a
> partition table, which can obviously be used to calculate the image size,
> but in some cases (when the whole image is used for something else) it may
> not. Perhaps a good approach for RBD would be to create the first and last
> objects of an image when the RBD header is created. That would waste a bit
> of space, but these objects generally hold the partitioning information,
> and their mere existence would establish the boundaries of the image. It
> does not help with snapshots, but it would definitely be helpful for a
> recovery tool.

Ceph definitely needs to store the image size. Things like qemu need
to know the size of the block device to report to the guest bios. It's
also useful to know how much space your rbd images have allocated. We
can't assume there's a partition table, or that it's accurate. Ceph
shouldn't need to interpret the contents of an image, since it's
defined by the end user.

Since rbd images can have sizes that are not multiples of their object
size, it also wouldn't give you the exact size. Also, using discard/TRIM
support, objects may be deleted.

If you always make your images a multiple of the object size, never use
discard/TRIM, and write to the end of the image after you create
it, you could tell the size from the highest-numbered object that exists.
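
As a worked example of why the highest object alone is ambiguous, the
sketch below maps an image size to the index of the last object it can
contain. The helper name is made up, and the default order of 22 (4 MiB
objects) is assumed.

```python
def highest_index(size_bytes, order=22):
    """Index of the last data object an image of this size can contain.

    Only meaningful for size_bytes > 0.
    """
    object_size = 1 << order
    # Round up to whole objects, then convert the count to a 0-based index.
    return (size_bytes + object_size - 1) // object_size - 1


MiB = 1 << 20
GiB = 1 << 30

# A 10 GiB image and one that is 1 MiB smaller end in the same object,
# so the object listing cannot distinguish between them.
print(highest_index(10 * GiB), highest_index(10 * GiB - MiB))
```

Any size within the same 4 MiB window gives the same answer, so the exact
size still has to come from somewhere else.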

I don't think this buys you much over doing what Tommi suggested.

Josh

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2012-07-14  0:34 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-07-09  5:42 librbd: error finding header Vladimir Bashkirtsev
2012-07-09  9:03 ` Dan Mick
2012-07-09 10:29   ` Vladimir Bashkirtsev
2012-07-09 16:30     ` Florian Haas
2012-07-10  3:28       ` Vladimir Bashkirtsev
2012-07-09 17:47     ` Dan Mick
2012-07-10  3:29       ` Vladimir Bashkirtsev
     [not found]       ` <4FFBA108.3010009@bashkirtsev.com>
     [not found]         ` <4FFBB74F.2050702@inktank.com>
2012-07-10  9:25           ` Vladimir Bashkirtsev
2012-07-10 20:08             ` Josh Durgin
2012-07-12  2:40               ` Vladimir Bashkirtsev
2012-07-12  4:41                 ` Josh Durgin
2012-07-12 16:00                   ` Tommi Virtanen
2012-07-13 13:06                     ` Vladimir Bashkirtsev
2012-07-14  0:34                       ` Josh Durgin
