* [linux-lvm] Missing PV
@ 2012-04-24 13:24 Brian McCullough
  2012-04-26 14:58 ` Brian McCullough
  0 siblings, 1 reply; 11+ messages in thread
From: Brian McCullough @ 2012-04-24 13:24 UTC (permalink / raw)
  To: linux-lvm

I have encountered a situation where vgscan and vgchange are complaining
about a missing UUID.

As far as I know, all, or almost all, of the LV is on the PV that is
known ( how do I know for sure? ), so I think that I am trying to just
"remove" the PV and recover what I can of the LV.

I have read Milan Broz's slides from 2009, and the only piece that I
seem to be missing is the recovery of the LV.


I have made a copy of the PV to work with and the procedure that I need
to follow, as I understand it, is as follows:


vgscan
vgchange -a y ( fails )
vgchange -a y --partial
vgreduce --removemissing vgname

then what?

I have done:

pvs -o +uuid
lvs -o +devices

But am not sure how to interpret the results.  lvs shows two entries for
the LV that I am interested in.  The second entry shows "unknown device(0)".  



Thank you,
Brian


* Re: [linux-lvm] Missing PV
  2012-04-24 13:24 [linux-lvm] Missing PV Brian McCullough
@ 2012-04-26 14:58 ` Brian McCullough
  2012-04-26 16:47   ` Milan Broz
  0 siblings, 1 reply; 11+ messages in thread
From: Brian McCullough @ 2012-04-26 14:58 UTC (permalink / raw)
  To: linux-lvm

On Tue, Apr 24, 2012 at 09:24:19AM -0400, Brian McCullough wrote:
> I have encountered a situation where vgscan and vgchange are complaining
> about a missing UUID.
> 
> As far as I know, all, or almost all, of the LV is on the PV that is
> known ( how do I know for sure? ), so I think that I am trying to just
> "remove" the PV and recover what I can of the LV.

Sorry to be dense, but I don't feel confident about proceeding before I
know what the next step should be.

I am pretty sure that I can remove the "lost" PV, using the
instructions that I have found in multiple places, including the
referenced slide deck, but I have not been able to find anything about
recovering the LV that spans from the existing PV into the lost one.

Since most, if not all of the existing data from that LV is on the
"good" PV, I would hope that I can recover that filesystem.  The
question is "How?"



> I have read Milan Broz's slides from 2009, and the only piece that I
> seem to be missing is the recovery of the LV.
> 
> 
> I have made a copy of the PV to work with and the procedure that I need
> to follow, as I understand it, is as follows:
> 
> 
> vgscan
> vgchange -a y ( fails )
> vgchange -a y --partial
> vgreduce --removemissing vgname
> 
> then what?
> 
> I have done:
> 
> pvs -o +uuid
> lvs -o +devices
> 
> But am not sure how to interpret the results.  lvs shows two entries for
> the LV that I am interested in.  The second entry shows "unknown device(0)".  


Thanks,
Brian


* Re: [linux-lvm] Missing PV
  2012-04-26 14:58 ` Brian McCullough
@ 2012-04-26 16:47   ` Milan Broz
  2012-04-26 17:23     ` Brian McCullough
  0 siblings, 1 reply; 11+ messages in thread
From: Milan Broz @ 2012-04-26 16:47 UTC (permalink / raw)
  To: LVM general discussion and development

On 04/26/2012 04:58 PM, Brian McCullough wrote:
> On Tue, Apr 24, 2012 at 09:24:19AM -0400, Brian McCullough wrote:
>> I have encountered a situation where vgscan and vgchange are complaining
>> about a missing UUID.
>>
>> As far as I know, all, or almost all, of the LV is on the PV that is
>> known ( how do I know for sure? ), so I think that I am trying to just
>> "remove" the PV and recover what I can of the LV.
> 
> Sorry to be dense, but I don't feel confident about proceeding before I
> know what the next step should be.

It is not clear what exactly you are trying to do and what your configuration
looks like.

Btw there are several examples as well
http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6/html/Logical_Volume_Manager_Administration/mdatarecover.html

You said you have missing PV, right?

- why is the PV missing? What exactly happened?
(overwritten, removed from system, hw failed?)

- what was on that missing PV? e.g. which part of the LV?

("lvs -o +devices" should tell you; paste the output somewhere. If it is
the first segment that is missing, you will probably not recover the fs on it.)

All recovery now depends on info above and what you really want:

1) either you have the old disk and you want to recover the metadata on it
and attach it back to the VG

2) or you just want to recover data from the existing PVs
(replace missing PV segments with zeroes, for example)

3) or you want to completely remove all LVs which were even partially on this
lost PV (no data recovery, just make the VG consistent again)

Which option do you want? I guess 2) ?

(btw all situations are described on my slides you mentioned,
http://mbroz.fedorapeople.org/talks/LinuxAlt2009_2/ - but it is possible
some info is not up to date, there were some small changes.
I also borrowed some info from Bryn's lvm recovery talk.)

> I am pretty sure that I can remove the "lost" PV, using the
> instructions that I have found in multiple places, including the
> referenced slide deck, but I have not been able to find anything about
> recovering the LV that spans from the existing PV into the lost one.

See the section on missing_stripe_filler and --partial activation
(the default stripe filler is "error" - all IO on a missing segment fails
with an io error)

vgchange/lvchange should then replace the missing segments with this filler.

(See how you can use the "zero" replacement on my slides above. This
is better for data recovery, similar to what dd_rescue does)
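
A minimal sketch of the "zero" filler idea, assuming a device-mapper zero
target is an acceptable stand-in for the missing segments. The helper name
zero_table, the device name zero_missing and the 10 GiB size are placeholders
for illustration and are not taken from the slides. The function only prints
a table line, so it is safe to run; the privileged steps are left as comments
and should only be tried on a copy of the data.

```shell
#!/bin/sh
# Print a device-mapper table line for an all-zero target.
# $1 is the size in 512-byte sectors.
zero_table() {
    printf '0 %s zero\n' "$1"
}

# 10 GiB expressed in 512-byte sectors (arbitrary example size; the filler
# is assumed to need to be at least as large as the missing PV).
sectors=$((10 * 1024 * 1024 * 2))
zero_table "$sectors"

# As root, the table would then be used roughly like this
# (zero_missing and vgname are placeholders):
#   dmsetup create zero_missing --table "$(zero_table "$sectors")"
#   # in /etc/lvm/lvm.conf, section "activation":
#   #     missing_stripe_filler = "/dev/mapper/zero_missing"
#   vgchange -a y --partial vgname
```

Reads from the filled-in area then return zeroes instead of io errors, which
is what lets a copy tool or fsck keep going past the hole.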

Milan


* Re: [linux-lvm] Missing PV
  2012-04-26 16:47   ` Milan Broz
@ 2012-04-26 17:23     ` Brian McCullough
  2012-04-26 17:47       ` Milan Broz
  0 siblings, 1 reply; 11+ messages in thread
From: Brian McCullough @ 2012-04-26 17:23 UTC (permalink / raw)
  To: Milan Broz; +Cc: LVM general discussion and development

On Thu, Apr 26, 2012 at 06:47:17PM +0200, Milan Broz wrote:
> On 04/26/2012 04:58 PM, Brian McCullough wrote:
> > On Tue, Apr 24, 2012 at 09:24:19AM -0400, Brian McCullough wrote:
> >> I have encountered a situation where vgscan and vgchange are complaining
> >> about a missing UUID.
> >>
> >> As far as I know, all, or almost all, of the LV is on the PV that is
> >> known ( how do I know for sure? ), so I think that I am trying to just
> >> "remove" the PV and recover what I can of the LV.
> > 
> > Sorry to be dense, but I don't feel confident about proceeding before I
> > know what the next step should be.
> 
> It is not clear what exactly you are trying to do and what your configuration
> looks like.

I'm sorry to be unclear.  I will try and explain.




To begin with, this is a KVM Virtual Machine living in an Ubuntu 11.10 (
recently upgraded ) environment.

For each VM, I have created an LV in the host, which contains the .qcow
files, usually one per VM, which are the "virtual disks."



> You said you have missing PV, right?

In the case of the machine in question, the original "disk" was found to
be too small by the user, and another qcow file was created to handle
the excess.

Last Tuesday ( a week+ ago ), the primary Sysadmin for the host machine
rebooted the machine for various reasons, I understand, and afterward
this VM would not restart, with a missing PV.



> - why the PV is missing? What exactly happened?
> (overwritten, removed from system, hw failed?)

The qcow file is still there, but LVM claims that it is missing, from
what I understand from the messages.


pvs -o +uuid looks like:

  PV             VG       Fmt  Attr PSize  PFree  PV UUID
  /dev/vda5      eyeball4 lvm2 a-    9.76g      0 cL9Emm-3uB0-KRxV-561E-bdaC-r3c3-YLlRzA
  /dev/vdb2      eyeball  lvm2 a-   78.75g      0 FqSnyb-GUGF-Iz2u-FTSN-SYHS-hIR2-b7vy73
  unknown device eyeball  lvm2 a-   40.00g 20.00g kXXFhn-oalZ-9D12-0CvG-6b4w-RjcE-Jb171k


> - what was on that missing PV? e.g. which part of LV?
> 
> ("lvs -o +devices" should tell, paste it somewhere, if
> it is the first segment missing, you will perhaps not recover fs on it)

I understand.  What I gather from lvs is that it is the last segments.

  LV     VG      Attr   LSize   Origin Snap%  Move Log Copy%  Convert Devices
  home   eyeball -wi---  89.89g                                       /dev/vdb2(2268)
  home   eyeball -wi---  89.89g                                       unknown device(0)
  root   eyeball -wi-a- 332.00m                                       /dev/vdb2(0)
  swap_1 eyeball -wi-a- 732.00m                                       /dev/vdb2(1990)
  tmp    eyeball -wi-a- 380.00m                                       /dev/vdb2(2173)
  usr    eyeball -wi-a-   4.66g                                       /dev/vdb2(83)
  var    eyeball -wi-a-   2.79g                                       /dev/vdb2(1275)


I can see, and if I do a vgchange --partial, I can mount and read
everything except home, which is where the "critical" data is.
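
One way to read that lvs output, as a rough sketch: any LV with a row whose
Devices column says "unknown device" has at least one segment on the missing
PV. The helper name lvs_on_missing_pv is made up for illustration, and the
sample rows are copied from the listing above with the empty columns dropped.

```shell
#!/bin/sh
# Print the names of LVs that have at least one segment on a missing PV.
# Input on stdin: rows in the style of "lvs -o +devices".
lvs_on_missing_pv() {
    awk '/unknown device/ { print $1 }' | sort -u
}

# Sample rows from this thread.
lvs_on_missing_pv <<'EOF'
home   eyeball -wi---  89.89g /dev/vdb2(2268)
home   eyeball -wi---  89.89g unknown device(0)
root   eyeball -wi-a- 332.00m /dev/vdb2(0)
swap_1 eyeball -wi-a- 732.00m /dev/vdb2(1990)
EOF

# On the real system the input would come from lvs itself, for example:
#   lvs --noheadings -o lv_name,devices eyeball | lvs_on_missing_pv
```

For the sample rows this prints only "home", which matches what I see when
mounting: every other LV sits entirely on /dev/vdb2.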



> All recovery now depends on info above and what you really want:
> 
> 1) either you have old disk and you want to recover metadata on it
> and attach it back to VG
> 
> 2) or you want just recover data from existing PVs
> (replace missing PV segments with zeroes for example)
> 
> 3) or you want completely remove all LVs which were even partially on this
> lost PV (no data recovery, just make VG consistent again)
> 
> What is the option you want to do? I guess 2) ?

You are correct.  Number 2 is my goal.



> (btw all situations are described on my slides you mentioned,
> http://mbroz.fedorapeople.org/talks/LinuxAlt2009_2/ - but it is possible
> some info is not up to date, there were some small changes.
> And I borrowed some info from Bryn lvm recovery talk as well)

Perhaps I wasn't reading clearly, but if your number 2 was in those
slides, I didn't understand how to apply it to my situation.


> > I am pretty sure that I can remove the "lost" PV, using the
> > instructions that I have found in multiple places, including the
> > referenced slide deck, but I have not been able to find anything about
> > recovering the LV that spans from the existing PV into the lost one.
> 
> See the section for missing_stripe_filler and --partial activation
> (default stripe filler is "error" - all IO on missing segment fails
> with io error)

I think that there were missing steps ( for me, they might actually 
have been there ) in this process.  I got lost in this area.  Yes, I
think I saw you create the "empty" section, but didn't understand how
you involved it in the recovery process.


> vgchange/lvchange should then replace these missing with this filler.
> 
> (See how you can use "zero" replacement on my slides above. This
> is better for data recovery, similar to dd_rescue job)

I have had friends recommend dd_rescue for physical drive recovery, but
didn't see how to apply it here, and didn't think to do so.


I was about to ask more questions, but I think that I will let you guide
me in the direction I need to go, whether with more information that I
can provide, or in the solution steps.


> Milan

Thank you,
Brian


* Re: [linux-lvm] Missing PV
  2012-04-26 17:23     ` Brian McCullough
@ 2012-04-26 17:47       ` Milan Broz
  2012-04-26 18:24         ` Brian McCullough
  2012-04-26 19:43         ` Brian McCullough
  0 siblings, 2 replies; 11+ messages in thread
From: Milan Broz @ 2012-04-26 17:47 UTC (permalink / raw)
  To: Brian McCullough; +Cc: LVM general discussion and development

On 04/26/2012 07:23 PM, Brian McCullough wrote:

>> You said you have missing PV, right?
> 
> In the case of the machine in question, the original "disk" was found to
> be too small by the user, and another qcow file was created to handle
> the excess.
> 
> Last Tuesday ( a week+ ago ), the primary Sysadmin for the host machine
> rebooted the machine for various reasons, I understand, and afterward
> this VM would not restart, with a missing PV.
> 
> 
> 
>> - why the PV is missing? What exactly happened?
>> (overwritten, removed from system, hw failed?)
> 
> The qcow file is still there, but LVM claims that it is missing, from
> what I understand from the messages.

Are you sure that the VM sees the content of that second qcow file? I don't think so.

It seems lvm is not your problem at all. I guess once you fix your
VM configuration, lvm will activate that without any data loss.

Or is that qcow file corrupted?

Check with lsblk (if available) and in /dev/ that you _really_ have the
physical device which was added later. If not, the question is what
changed in your VM so that it is no longer visible after the reboot.

Check the log when starting the VM - it must log that the second qcow is used.
Check the log inside the guest - which devices were there (/dev/vdc?) and
are now missing... etc

> unknown device eyeball  lvm2 a-   40.00g 20.00g kXXFhn-oalZ-9D12-0CvG-6b4w-RjcE-Jb171k

>   LV     VG      Attr   LSize   Origin Snap%  Move Log Copy%  Convert Devices
>   home   eyeball -wi---  89.89g                                       /dev/vdb2(2268)
>   home   eyeball -wi---  89.89g                                       unknown device(0)

From this I guess 20GB of data is missing at the end of home. Quite a lot...
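
The arithmetic behind that guess, as a small sketch. The figures are the
rounded sizes from the pvs and lvs output quoted in this thread, so the result
is only approximate; the function name missing_estimate is made up for
illustration.

```shell
#!/bin/sh
# Estimate how much of "home" lives on the missing PV, two different ways.
# All sizes are in GiB as printed by pvs/lvs (lower-case g).
missing_estimate() {
    awk 'BEGIN {
        # Way 1: space allocated on the missing PV = PSize - PFree.
        pv_used = 40.00 - 20.00
        # Way 2: home size minus what is left for home on /dev/vdb2
        # after root, swap_1, tmp (MiB) and usr, var (GiB).
        others   = (332 + 732 + 380) / 1024 + 4.66 + 2.79
        home_vdb = 78.75 - others
        printf "allocated on missing PV: %.2f GiB\n", pv_used
        printf "home outside /dev/vdb2: %.2f GiB\n", 89.89 - home_vdb
    }'
}

missing_estimate
```

Both ways give about 20 GiB, and since the "unknown device" row comes after
the /dev/vdb2 row, that part is at the end of the LV.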

>> All recovery now depends on info above and what you really want:
>>
>> 1) either you have old disk and you want to recover metadata on it
>> and attach it back to VG
>>
>> 2) or you want just recover data from existing PVs
>> (replace missing PV segments with zeroes for example)
>>
>> 3) or you want completely remove all LVs which were even partially on this
>> lost PV (no data recovery, just make VG consistent again)
>>
>> What is the option you want to do? I guess 2) ?
> 
> You are correct.  Number 2 is my goal.

With the info above I think you should try 1) first :)

Milan


* Re: [linux-lvm] Missing PV
  2012-04-26 17:47       ` Milan Broz
@ 2012-04-26 18:24         ` Brian McCullough
  2012-04-26 19:43         ` Brian McCullough
  1 sibling, 0 replies; 11+ messages in thread
From: Brian McCullough @ 2012-04-26 18:24 UTC (permalink / raw)
  To: Milan Broz; +Cc: LVM general discussion and development


Thank you for the thoughts.

I will investigate and see if I can "mount" the missing disk after all.
I created a new VM to work with, and copied the qcow files from the
original VM to try and not damage anything fatally.


Onward with Option 1.

I will let you know how it goes.


Thanks,
Brian


* Re: [linux-lvm] Missing PV
  2012-04-26 17:47       ` Milan Broz
  2012-04-26 18:24         ` Brian McCullough
@ 2012-04-26 19:43         ` Brian McCullough
  2012-04-26 20:45           ` Milan Broz
  1 sibling, 1 reply; 11+ messages in thread
From: Brian McCullough @ 2012-04-26 19:43 UTC (permalink / raw)
  To: Milan Broz; +Cc: LVM general discussion and development

On Thu, Apr 26, 2012 at 07:47:24PM +0200, Milan Broz wrote:
> 
> Are you sure that VM see that second qcow file content? I don't think so.

You are right.

dmesg shows that the "drive" is found and accessible to the system.

HOWEVER, no partitions.  Or more properly, no partition table!   ????

Confirmed by fdisk.


I then did a dd of the first 512K of that disk, and then ran strings on
it.  When I did that, I saw strings like "GRUB," "Hard Disk" and such
fairly early in the file.

I also saw "LVM2," which I know is the beginning of LVM2 meta-data.  I
can see a nice block of at least two different revisions of LVM2
meta-data on this drive. ( I stopped looking there. )


I tried a tool I know of called "testdisk," which attempts to
recover partition tables, and it wasn't able to help.  Odd.


Still looking.


B


* Re: [linux-lvm] Missing PV
  2012-04-26 19:43         ` Brian McCullough
@ 2012-04-26 20:45           ` Milan Broz
  2012-04-26 21:13             ` Brian McCullough
  0 siblings, 1 reply; 11+ messages in thread
From: Milan Broz @ 2012-04-26 20:45 UTC (permalink / raw)
  To: Brian McCullough; +Cc: LVM general discussion and development

On 04/26/2012 09:43 PM, Brian McCullough wrote:
> On Thu, Apr 26, 2012 at 07:47:24PM +0200, Milan Broz wrote:
>>
>> Are you sure that VM see that second qcow file content? I don't think so.
> 
> You are right.
> 
> dmesg shows that the "drive" is found and accessible to the system.
> 
> HOWEVER, no partitions.  Or more properly, no partition table!   ????

If it was the second disk, it can be without a partition table; lvm does not
require one.

Never try to recover something when you are not sure if it was there ;-)

You can easily check - in the guest, go to the /etc/lvm directory,
there should be metadata backups. Check them, there should be a comment
saying which disk was there.
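
A rough sketch of pulling that information out of a backup file. The backups
normally live in /etc/lvm/backup/<vgname>, with older versions in
/etc/lvm/archive/, and each PV entry has an "id" line followed by a "device"
line marked as a hint. The helper name pv_hints is made up, and the sample
below is hypothetical: the UUIDs are the ones from the pvs output in this
thread, but /dev/vdc is only a guess at the old device name.

```shell
#!/bin/sh
# Print "PV-UUID device-hint" pairs from LVM text metadata read on stdin.
# Relies on the "device" line following the "id" line inside each PV entry.
pv_hints() {
    awk '
        $1 == "id"     && $2 == "=" { gsub(/"/, "", $3); id = $3 }
        $1 == "device" && $2 == "=" { gsub(/"/, "", $3); print id, $3 }
    '
}

# Hypothetical excerpt of a metadata backup.
pv_hints <<'EOF'
physical_volumes {
	pv0 {
		id = "FqSnyb-GUGF-Iz2u-FTSN-SYHS-hIR2-b7vy73"
		device = "/dev/vdb2"	# Hint only
	}
	pv1 {
		id = "kXXFhn-oalZ-9D12-0CvG-6b4w-RjcE-Jb171k"
		device = "/dev/vdc"	# Hint only
	}
}
EOF

# On the real guest:
#   pv_hints < /etc/lvm/backup/eyeball
```

The UUID that pvs now reports as "unknown device" can then be matched to the
device name it had when the backup was written.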

Can you paste some old metadata (_correct_ metadata, before disk disappeared)
to pastebin and send a link to it? Or directly to mail.

Run "blkid -p <disk>" on it - what does it report?

> I tried a tool I know of called "testdisk," which attempts to
> recover partition tables, and it wasn't able to help.  Odd.

Why? Such attempts can do more damage. Maybe someone just installed
e.g. grub there - this is still fully recoverable.

Paste that metadata above, we will see.

Milan


* Re: [linux-lvm] Missing PV
  2012-04-26 20:45           ` Milan Broz
@ 2012-04-26 21:13             ` Brian McCullough
  2012-04-27  3:08               ` Brian McCullough
  0 siblings, 1 reply; 11+ messages in thread
From: Brian McCullough @ 2012-04-26 21:13 UTC (permalink / raw)
  To: Milan Broz; +Cc: LVM general discussion and development

On Thu, Apr 26, 2012 at 10:45:30PM +0200, Milan Broz wrote:
> On 04/26/2012 09:43 PM, Brian McCullough wrote:
> > On Thu, Apr 26, 2012 at 07:47:24PM +0200, Milan Broz wrote:
> 
> Never try to recover something when you are not sure if it was there ;-)

I completely understand and agree.  That's why I keep asking questions,
both of you and it!


I have to run to an appointment right now, but I will return to this
afterward.


I feel a LOT closer to success.


Thanks,
Brian


* Re: [linux-lvm] Missing PV
  2012-04-26 21:13             ` Brian McCullough
@ 2012-04-27  3:08               ` Brian McCullough
  2012-04-28 17:01                 ` Brian McCullough
  0 siblings, 1 reply; 11+ messages in thread
From: Brian McCullough @ 2012-04-27  3:08 UTC (permalink / raw)
  To: Milan Broz; +Cc: LVM general discussion and development

On Thu, Apr 26, 2012 at 05:13:20PM -0400, Brian McCullough wrote:
> On Thu, Apr 26, 2012 at 10:45:30PM +0200, Milan Broz wrote:
> > On 04/26/2012 09:43 PM, Brian McCullough wrote:
> > > On Thu, Apr 26, 2012 at 07:47:24PM +0200, Milan Broz wrote:
> > 
> > Never try to recover something when you are not sure if it was there ;-)
> 
> I completely understand and agree.  That's why I keep asking questions,
> both of you and it!
> 
> 
> I have to run to an appointment right now, but I will return to this
> afterward.

It appears that I have got that LV back!

I am waiting for the user to check things, and I will let you know
tomorrow.


Thank you,
Brian


* Re: [linux-lvm] Missing PV
  2012-04-27  3:08               ` Brian McCullough
@ 2012-04-28 17:01                 ` Brian McCullough
  0 siblings, 0 replies; 11+ messages in thread
From: Brian McCullough @ 2012-04-28 17:01 UTC (permalink / raw)
  To: Milan Broz; +Cc: LVM general discussion and development


Milan,

I promised you an explanation, and here it is.


It was, at least partially, my fault.


I don't really understand why the original VM did not, and would not,
restart correctly.


However, I discovered, after I had made copies of everything and created
a new VM to work with, that there were two different "formats" of
"virtual disk drives," even though they were all called "qcow2."  There
were a combination of "raw" and "qcow2" disks.


Once I got the VM configuration file, and therefore the VM
configuration, matching the formats of the disk drives, I was able to
completely recover the "missing" PV, and therefore the VG and LV.
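
For anyone hitting the same thing, a small sketch of how the real format of an
image file can be checked regardless of what the file is called. It relies on
qcow and qcow2 images starting with the four bytes "QFI" followed by 0xfb;
anything else is simply reported as "raw-or-other", so this is a heuristic
only, and the function name image_format_guess is made up. "qemu-img info" on
the file reports the format properly.

```shell
#!/bin/sh
# Guess an image format from the first four bytes of the file.
# qcow/qcow2 images begin with 51 46 49 fb ("QFI" + 0xfb).
image_format_guess() {
    magic=$(dd if="$1" bs=4 count=1 2>/dev/null | od -An -tx1 | tr -d ' \n')
    if [ "$magic" = "514649fb" ]; then
        echo "qcow"
    else
        echo "raw-or-other"
    fi
}

# Demonstration on a throwaway file that only carries the magic bytes.
tmp=$(mktemp)
printf 'QFI\373' > "$tmp"
image_format_guess "$tmp"
rm -f "$tmp"

# On the host, for each virtual disk (the path is a placeholder):
#   image_format_guess /path/to/disk.img
#   qemu-img info /path/to/disk.img
```

A raw image attached as qcow2, or the other way round, shows up in the guest
as a disk with unrecognisable content, which fits what I was seeing.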

During my investigation, I had issued a "removemissing," so the LVM
Meta-data was "corrupted," but I edited a copy of that and 
vgcfgrestore corrected everything.


I did an e2fsck of the LV, to make sure, but the user tells me that
everything looks the way that he expected, so far.


So, even though the problem really wasn't what I thought it was, I want
to thank you very much for the patience and guidance toward helping me
find the ( apparent ) real problem.  

I am still not happy about how rebooting the host machine caused so much
damage, with one LV completely losing its contents ( the qcow2 file has
completely disappeared ) and therefore one VM being destroyed, and more
than this one not being able to restart correctly, but at least, most of
the VMs are back in action.


Thanks again,
Brian

