linux-lvm.redhat.com archive mirror
 help / color / mirror / Atom feed
* [linux-lvm] exposing snapshot block device
@ 2019-10-22 10:47 Dalebjörk, Tomas
  2019-10-22 13:57 ` Zdenek Kabelac
  0 siblings, 1 reply; 53+ messages in thread
From: Dalebjörk, Tomas @ 2019-10-22 10:47 UTC (permalink / raw)
  To: linux-lvm

[-- Attachment #1: Type: text/plain, Size: 929 bytes --]

Hi

When you create a snapshot of a logical volume.

A new virtual dm- device will be created with the content of the changes 
from the origin.

This cow device can than be used to read changed contents etc.


In case of an incident, this cow device can be used to read back the 
changed content to its origin using the "lvmerge" command.


The question I have is if there is a way to couple an external cow 
device to an empty equaly sized logical volume,

so that the empty logical volume is aware of that all changed content 
are placed on this attached cow device?

If that is possible, than it will help making instant recovery of LV 
volumes from an external source using native lvmerge command, from for 
example a backup server.


[EMPTY LOGICAL VOLUME]
 ������� ^
 ����� � |
 �� � lvmerge
 ������� |

[ATTACHED COW DEVICE]

Regards Tomas


[-- Attachment #2: Type: text/html, Size: 1715 bytes --]

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [linux-lvm] exposing snapshot block device
  2019-10-22 10:47 [linux-lvm] exposing snapshot block device Dalebjörk, Tomas
@ 2019-10-22 13:57 ` Zdenek Kabelac
  2019-10-22 15:29   ` Dalebjörk, Tomas
  0 siblings, 1 reply; 53+ messages in thread
From: Zdenek Kabelac @ 2019-10-22 13:57 UTC (permalink / raw)
  To: LVM general discussion and development, Dalebjörk, Tomas

Dne 22. 10. 19 v 12:47 Dalebj�rk, Tomas napsal(a):
> Hi
> 
> When you create a snapshot of a logical volume.
> 
> A new virtual dm- device will be created with the content of the changes from 
> the origin.
> 
> This cow device can than be used to read changed contents etc.
> 
> 
> In case of an incident, this cow device can be used to read back the changed 
> content to its origin using the "lvmerge" command.
> 
> 
> The question I have is if there is a way to couple an external cow device to 
> an empty equaly sized logical volume,
> 
> so that the empty logical volume is aware of that all changed content are 
> placed on this attached cow device?
> 
> If that is possible, than it will help making instant recovery of LV volumes 
> from an external source using native lvmerge command, from for example a 
> backup server.

For most info how old snapshot for so called 'thick' LVs works - check
these papers: http://people.redhat.com/agk/talks/


lvconvert --merge

is in fact 'instant' operation - when it happens - you can immediately access
'already merged' content  while the merge is happening in the background
(you can look for copies percentage in lvs command)


However 'thick' LVs with old snapshots are rather 'dated' technology
you should probably checkout the usage of  thinly provisioned LVs.

Regards

Zdenek

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [linux-lvm] exposing snapshot block device
  2019-10-22 13:57 ` Zdenek Kabelac
@ 2019-10-22 15:29   ` Dalebjörk, Tomas
  2019-10-22 15:36     ` Zdenek Kabelac
  0 siblings, 1 reply; 53+ messages in thread
From: Dalebjörk, Tomas @ 2019-10-22 15:29 UTC (permalink / raw)
  To: Zdenek Kabelac, LVM general discussion and development

Thanks for feedback,


I know that thick LV snapshots are out dated, and that one should use 
thin LV snapshots.

But my understanding is that the dm- cow and dm - origin are still 
present and available in thin too?


Example of a scenario:

1. Create a snapshot of LV testlv with the name snaplv
2. Perform a full copy of the snaplv using for example dd to a block device
3. Delete the snapshot

Now I would like to re-attach this external block device as a snapshot 
again.

After all, it is just a dm and LVM config, right? So for example:

1. create a snapshot of testlv with the name snaplv
2. re create the -cow meta data device : 
<offset><chunk_size><data>...<offset><chunk_size><data><EOF>
 ��� Recreate this -cow meta data device by telling the origin that all 
data has been changed and are in the cow device (the raw device)
3. If the above were possible to perform, than it could be possible to 
instantly get at copy of the LV data using the lvconvert --merge command

I have already invented a way to perform "block level incremental 
forever"; using the -cow device.

And a possibility to reverse the blocks, to copy back only changed 
content from external devices.

But, it would be better if the cow device could be recreated in a faster 
way, mentioning that all blocks are present on an external device, so 
that the LV volume can be restored much quicker using "lvconvert 
--merge" command.

That would be super cool!

Imagine backing up multi terrabyte sized volumes in minutes to external 
destinations, and restoring the data in seconds using instant recovery 
by re-creating or emulating the cow device, and associating all blocks 
to an external device?

Regards Tomas


Den 2019-10-22 kl. 15:57, skrev Zdenek Kabelac:

> Dne 22. 10. 19 v 12:47 Dalebj�rk, Tomas napsal(a):
>> Hi
>>
>> When you create a snapshot of a logical volume.
>>
>> A new virtual dm- device will be created with the content of the 
>> changes from the origin.
>>
>> This cow device can than be used to read changed contents etc.
>>
>>
>> In case of an incident, this cow device can be used to read back the 
>> changed content to its origin using the "lvmerge" command.
>>
>>
>> The question I have is if there is a way to couple an external cow 
>> device to an empty equaly sized logical volume,
>>
>> so that the empty logical volume is aware of that all changed content 
>> are placed on this attached cow device?
>>
>> If that is possible, than it will help making instant recovery of LV 
>> volumes from an external source using native lvmerge command, from 
>> for example a backup server.
>
> For most info how old snapshot for so called 'thick' LVs works - check
> these papers: http://people.redhat.com/agk/talks/
>
>
> lvconvert --merge
>
> is in fact 'instant' operation - when it happens - you can immediately 
> access
> 'already merged' content� while the merge is happening in the background
> (you can look for copies percentage in lvs command)
>
>
> However 'thick' LVs with old snapshots are rather 'dated' technology
> you should probably checkout the usage of� thinly provisioned LVs.
>
> Regards
>
> Zdenek
>
>

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [linux-lvm] exposing snapshot block device
  2019-10-22 15:29   ` Dalebjörk, Tomas
@ 2019-10-22 15:36     ` Zdenek Kabelac
  2019-10-22 16:13       ` Dalebjörk, Tomas
  2019-10-22 16:15       ` Stuart D. Gathman
  0 siblings, 2 replies; 53+ messages in thread
From: Zdenek Kabelac @ 2019-10-22 15:36 UTC (permalink / raw)
  To: Dalebjörk, Tomas, LVM general discussion and development

Dne 22. 10. 19 v 17:29 Dalebj�rk, Tomas napsal(a):
> Thanks for feedback,
> 
> But, it would be better if the cow device could be recreated in a faster way, 
> mentioning that all blocks are present on an external device, so that the LV 
> volume can be restored much quicker using "lvconvert --merge" command.
> 
> That would be super cool!
> 
> Imagine backing up multi terrabyte sized volumes in minutes to external 
> destinations, and restoring the data in seconds using instant recovery by 
> re-creating or emulating the cow device, and associating all blocks to an 
> external device?
> 

Hi

I do not want to break your imagination here, but that is exactly the thing 
you can do with thin provisioning and thin_delta tool.

You just work with LV, take snapshot1, take snapshot2,
send delta between  s1 -> s2  to remove machine,
remove s1, take s3, send delta  s2 -> s3...

It's just not automated by lvm2 ATM...

Using this with old snapshot would be insanely inefficient...

Regards

Zdenek

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [linux-lvm] exposing snapshot block device
  2019-10-22 15:36     ` Zdenek Kabelac
@ 2019-10-22 16:13       ` Dalebjörk, Tomas
  2019-10-23 10:26         ` Zdenek Kabelac
  2019-10-22 16:15       ` Stuart D. Gathman
  1 sibling, 1 reply; 53+ messages in thread
From: Dalebjörk, Tomas @ 2019-10-22 16:13 UTC (permalink / raw)
  To: Zdenek Kabelac, LVM general discussion and development

That is cool,

But, are there any practical example how this could be working in reality.

Eg:

lvcreate -s mysnap vg/testlv

thin_dump vg/mysnap > deltafile # I assume that this should be the name 
of the snapshot?

But... How to recreate only the metadata only?, so that the meta data 
changes are associated to an external device?

thin_restore -i metadata < deltafile # that will restore the metadata, 
but I also want the restored meta data to point out the location of the 
data from for example a file or a raw deice


I have created a way to perform block level incremental forever by 
reading the -cow device, and thin_dump would be nice replacement for that.

This can also be reversed, so that the thin_restore can be used to 
restore the meta data and the data@same time (If I now the format of it)

But it would be much more better if one can do the restoration in 
background using "lvconvert --merge" tool, by first restoring the 
metadata (I can understand that this part is needed), and assoicate all 
the data to an external raw disk or much more better a file, so that all 
changes associated to this restored snapshot can be found on the file.


Not so good to explain this, but I hope you understand how I am thinking.

A destroyed thin pool, can than be restored instantly using a backup 
server as the cow similar device.

Regards Tomas

Den 2019-10-22 kl. 17:36, skrev Zdenek Kabelac:
> Dne 22. 10. 19 v 17:29 Dalebj�rk, Tomas napsal(a):
>> Thanks for feedback,
>>
>> But, it would be better if the cow device could be recreated in a 
>> faster way, mentioning that all blocks are present on an external 
>> device, so that the LV volume can be restored much quicker using 
>> "lvconvert --merge" command.
>>
>> That would be super cool!
>>
>> Imagine backing up multi terrabyte sized volumes in minutes to 
>> external destinations, and restoring the data in seconds using 
>> instant recovery by re-creating or emulating the cow device, and 
>> associating all blocks to an external device?
>>
>
> Hi
>
> I do not want to break your imagination here, but that is exactly the 
> thing you can do with thin provisioning and thin_delta tool.
>
> You just work with LV, take snapshot1, take snapshot2,
> send delta between� s1 -> s2� to remove machine,
> remove s1, take s3, send delta� s2 -> s3...
>
> It's just not automated by lvm2 ATM...
>
> Using this with old snapshot would be insanely inefficient...
>
> Regards
>
> Zdenek
>

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [linux-lvm] exposing snapshot block device
  2019-10-22 15:36     ` Zdenek Kabelac
  2019-10-22 16:13       ` Dalebjörk, Tomas
@ 2019-10-22 16:15       ` Stuart D. Gathman
  2019-10-22 17:02         ` Tomas Dalebjörk
                           ` (2 more replies)
  1 sibling, 3 replies; 53+ messages in thread
From: Stuart D. Gathman @ 2019-10-22 16:15 UTC (permalink / raw)
  To: LVM general discussion and development; +Cc: Dalebjörk, Tomas

[-- Attachment #1: Type: text/plain, Size: 2678 bytes --]

On Tue, 22 Oct 2019, Zdenek Kabelac wrote:

> Dne 22. 10. 19 v 17:29 Dalebj�rk, Tomas napsal(a):
>> But, it would be better if the cow device could be recreated in a faster 
>> way, mentioning that all blocks are present on an external device, so that 
>> the LV volume can be restored much quicker using "lvconvert --merge" 
>> command.

> I do not want to break your imagination here, but that is exactly the thing 
> you can do with thin provisioning and thin_delta tool.

lvconvert --merge does a "rollback" to the point at which the snapshot
was taken.  The master LV already has current data.  What Tomas wants to
be able to do a "rollforward" from the point at which the snapshot was
taken.  He also wants to be able to put the cow volume on an
extern/remote medium, and add a snapshot using an already existing cow.

This way, restoring means copying the full volume from backup, creating
a snapshot using existing external cow, then lvconvert --merge 
instantly logically applies the cow changes while updating the master
LV.

Pros:

"Old" snapshots are exactly as efficient as thin when there is exactly
one.  They only get inefficient with multiple snapshots.  On the other
hand, thin volumes are as inefficient as an old LV with one snapshot.
An old LV is as efficient, and as anti-fragile, as a partition.  Thin
volumes are much more flexible, but depend on much more fragile database
like meta-data.

For this reason, I always prefer "old" LVs when the functionality of
thin LVs are not actually needed.  I can even manually recover from
trashed meta data by editing it, as it is human readable text.

Updates to the external cow can be pipelined (but then properly
handling reads becomes non trivial - there are mature remote block
device implementations for linux that will do the job).

Cons:

For the external cow to be useful, updates to it must be *strictly*
serialized.  This is doable, but not as obvious or trivial as it might
seem at first glance.  (Remote block device software will take care
of this as well.)

The "rollforward" must be applied to the backup image of the snapshot.
If the admin gets it paired with the wrong backup, massive corruption
ensues.  This could be automated.  E.g. the full image backup and
external cow would have unique matching names.  Or the full image backup
could compute an md5 in parallel, which would be store with the cow.
But none of those tools currently exist.

-- 
 	      Stuart D. Gathman <stuart@gathman.org>
"Confutatis maledictis, flamis acribus addictis" - background song for
a Microsoft sponsored "Where do you want to go from here?" commercial.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [linux-lvm] exposing snapshot block device
  2019-10-22 16:15       ` Stuart D. Gathman
@ 2019-10-22 17:02         ` Tomas Dalebjörk
  2019-10-22 21:38         ` Gionatan Danti
  2019-10-23 10:46         ` Zdenek Kabelac
  2 siblings, 0 replies; 53+ messages in thread
From: Tomas Dalebjörk @ 2019-10-22 17:02 UTC (permalink / raw)
  To: Stuart D. Gathman; +Cc: LVM general discussion and development

[-- Attachment #1: Type: text/plain, Size: 3472 bytes --]

Thanks for feedbak.

Think of lvmsync as a tool, which reads the block changes from the cow
device.
<offset><chunk_size><data>...
Lets assume that I am able to recreate this cow format instantly back to
the server,
and present this as a file with the name "cowfile" on the file system for
simplicity.

Is it possible than in some way, to use this cowfile in someway to inform
LVM about the location of the snapshot area, so that lvconvert --merge can
be used to restore the data quicker, using this cowfile.

The cowfile will include all blocks for the logical volume.

Regards Tomas

Den tis 22 okt. 2019 kl 18:15 skrev Stuart D. Gathman <stuart@gathman.org>:

> On Tue, 22 Oct 2019, Zdenek Kabelac wrote:
>
> > Dne 22. 10. 19 v 17:29 Dalebjörk, Tomas napsal(a):
> >> But, it would be better if the cow device could be recreated in a
> faster
> >> way, mentioning that all blocks are present on an external device, so
> that
> >> the LV volume can be restored much quicker using "lvconvert --merge"
> >> command.
>
> > I do not want to break your imagination here, but that is exactly the
> thing
> > you can do with thin provisioning and thin_delta tool.
>
> lvconvert --merge does a "rollback" to the point at which the snapshot
> was taken.  The master LV already has current data.  What Tomas wants to
> be able to do a "rollforward" from the point at which the snapshot was
> taken.  He also wants to be able to put the cow volume on an
> extern/remote medium, and add a snapshot using an already existing cow.
>
> This way, restoring means copying the full volume from backup, creating
> a snapshot using existing external cow, then lvconvert --merge
> instantly logically applies the cow changes while updating the master
> LV.
>
> Pros:
>
> "Old" snapshots are exactly as efficient as thin when there is exactly
> one.  They only get inefficient with multiple snapshots.  On the other
> hand, thin volumes are as inefficient as an old LV with one snapshot.
> An old LV is as efficient, and as anti-fragile, as a partition.  Thin
> volumes are much more flexible, but depend on much more fragile database
> like meta-data.
>
> For this reason, I always prefer "old" LVs when the functionality of
> thin LVs are not actually needed.  I can even manually recover from
> trashed meta data by editing it, as it is human readable text.
>
> Updates to the external cow can be pipelined (but then properly
> handling reads becomes non trivial - there are mature remote block
> device implementations for linux that will do the job).
>
> Cons:
>
> For the external cow to be useful, updates to it must be *strictly*
> serialized.  This is doable, but not as obvious or trivial as it might
> seem at first glance.  (Remote block device software will take care
> of this as well.)
>
> The "rollforward" must be applied to the backup image of the snapshot.
> If the admin gets it paired with the wrong backup, massive corruption
> ensues.  This could be automated.  E.g. the full image backup and
> external cow would have unique matching names.  Or the full image backup
> could compute an md5 in parallel, which would be store with the cow.
> But none of those tools currently exist.
>
> --
>               Stuart D. Gathman <stuart@gathman.org>
> "Confutatis maledictis, flamis acribus addictis" - background song for
> a Microsoft sponsored "Where do you want to go from here?" commercial.

[-- Attachment #2: Type: text/html, Size: 4300 bytes --]

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [linux-lvm] exposing snapshot block device
  2019-10-22 16:15       ` Stuart D. Gathman
  2019-10-22 17:02         ` Tomas Dalebjörk
@ 2019-10-22 21:38         ` Gionatan Danti
  2019-10-22 22:53           ` Stuart D. Gathman
  2019-10-23 10:46         ` Zdenek Kabelac
  2 siblings, 1 reply; 53+ messages in thread
From: Gionatan Danti @ 2019-10-22 21:38 UTC (permalink / raw)
  To: LVM general discussion and development; +Cc: Dalebjörk, Tomas

Hi,

Il 22-10-2019 18:15 Stuart D. Gathman ha scritto:
> "Old" snapshots are exactly as efficient as thin when there is exactly
> one.  They only get inefficient with multiple snapshots.  On the other
> hand, thin volumes are as inefficient as an old LV with one snapshot.
> An old LV is as efficient, and as anti-fragile, as a partition.  Thin
> volumes are much more flexible, but depend on much more fragile 
> database
> like meta-data.

this is both true and false: while in the single-snapshot case 
performance remains acceptable even from fat snapshots, the btree 
representation (and more modern code) of the "new" (7+ years old now) 
thin snapshots gurantees significantly higher performance, at least on 
my tests.

Note #1: I know that the old snapshot code uses 4K chunks by default, 
versus the 64K chunks of thinsnap. That said, I recorded higher thinsnap 
performance even when using a 64K chunk size for old fat snapshots.
Note #2: I generally disable thinpool zeroing (as I use a filesystem 
layer on top of thin volumes).

I 100% agree that old LVM code, with its plain text metadata and 
continuous plain-text backups, is extremely reliable and easy to 
fix/correct.

> For this reason, I always prefer "old" LVs when the functionality of
> thin LVs are not actually needed.  I can even manually recover from
> trashed meta data by editing it, as it is human readable text.

My main use of fat logical volumes is for boot and root filesystems, 
while thin vols (and zfs datasets, but this is another story...) are 
used for data partitions.

The main thing that somewhat scares me is that (if things had not 
changed) thinvol uses a single root btree node: losing it means losing 
*all* thin volumes of a specific thin pool. Coupled with the fact that 
metadata dump are not as handy as with the old LVM code (no 
vgcfgrestore), it worries me.

> The "rollforward" must be applied to the backup image of the snapshot.
> If the admin gets it paired with the wrong backup, massive corruption
> ensues.  This could be automated.  E.g. the full image backup and
> external cow would have unique matching names.  Or the full image 
> backup
> could compute an md5 in parallel, which would be store with the cow.
> But none of those tools currently exist.

This is the reason why I have not used thin_delta in production: an 
error from my part in recovering the volume (ie: applying the wrong 
delta) would cause massive data corruption. My current setup for instant 
recovery *and* added resiliance is somewhat similar to that: RAID -> 
DRBD -> THINPOOL -> THINVOL w/periodic snapshots (with the DRBD layer 
replicating to a sibling machine).

Regards.

-- 
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti@assyoma.it - info@assyoma.it
GPG public key ID: FF5F32A8

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [linux-lvm] exposing snapshot block device
  2019-10-22 21:38         ` Gionatan Danti
@ 2019-10-22 22:53           ` Stuart D. Gathman
  2019-10-23  6:58             ` Gionatan Danti
  2019-10-23 10:12             ` Zdenek Kabelac
  0 siblings, 2 replies; 53+ messages in thread
From: Stuart D. Gathman @ 2019-10-22 22:53 UTC (permalink / raw)
  To: Gionatan Danti
  Cc: Dalebjörk, Tomas, LVM general discussion and development

On Tue, 22 Oct 2019, Gionatan Danti wrote:

> The main thing that somewhat scares me is that (if things had not changed) 
> thinvol uses a single root btree node: losing it means losing *all* thin 
> volumes of a specific thin pool. Coupled with the fact that metadata dump are 
> not as handy as with the old LVM code (no vgcfgrestore), it worries me.

If you can find all the leaf nodes belonging to the root (in my btree
database they are marked with the root id and can be found by sequential
scan of the volume), then reconstructing the btree data is
straightforward - even in place.

I remember realizing this was the only way to recover a major customer's
data - and had the utility written, tested, and applied in a 36 hour
programming marathon (which I hope to never repeat).  If this hasn't
occured to thin pool programmers, I am happy to flesh out the procedure.
Having such a utility available as a last resort would ratchet up the
reliability of thin pools.

-- 
 	      Stuart D. Gathman <stuart@gathman.org>
"Confutatis maledictis, flamis acribus addictis" - background song for
a Microsoft sponsored "Where do you want to go from here?" commercial.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [linux-lvm] exposing snapshot block device
  2019-10-22 22:53           ` Stuart D. Gathman
@ 2019-10-23  6:58             ` Gionatan Danti
  2019-10-23 10:06               ` Tomas Dalebjörk
  2019-10-23 10:12             ` Zdenek Kabelac
  1 sibling, 1 reply; 53+ messages in thread
From: Gionatan Danti @ 2019-10-23  6:58 UTC (permalink / raw)
  To: Stuart D. Gathman
  Cc: Dalebjörk, Tomas, LVM general discussion and development

Il 23-10-2019 00:53 Stuart D. Gathman ha scritto:
> If you can find all the leaf nodes belonging to the root (in my btree
> database they are marked with the root id and can be found by 
> sequential
> scan of the volume), then reconstructing the btree data is
> straightforward - even in place.
> 
> I remember realizing this was the only way to recover a major 
> customer's
> data - and had the utility written, tested, and applied in a 36 hour
> programming marathon (which I hope to never repeat).  If this hasn't
> occured to thin pool programmers, I am happy to flesh out the 
> procedure.
> Having such a utility available as a last resort would ratchet up the
> reliability of thin pools.

Very interesting. Can I ask you what product/database you recovered?

Anyway, giving similar ability to thin Vols would be awesome.

Thanks.

-- 
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti@assyoma.it - info@assyoma.it
GPG public key ID: FF5F32A8

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [linux-lvm] exposing snapshot block device
  2019-10-23  6:58             ` Gionatan Danti
@ 2019-10-23 10:06               ` Tomas Dalebjörk
  0 siblings, 0 replies; 53+ messages in thread
From: Tomas Dalebjörk @ 2019-10-23 10:06 UTC (permalink / raw)
  To: Gionatan Danti; +Cc: LVM general discussion and development

[-- Attachment #1: Type: text/plain, Size: 1642 bytes --]

Many thanks for all the feedback.

The idea works for those applications that supports snapshots.
Like Sybase / SAP Adaptive Server Enterprise, Sybase / SAP IQ Server, DB2,
MongoDB, MariaDB/MySQL, PostgreSQL etc..

Anyhow, back to the origin question:
Is there a way how to re-create the cow- format.
so that lvconvert --merge can be used.
Or by having lvconvert --merge to accept to read from a "cow file"

If that would be possible, than instant recovery would be possible from an
external source, like a backup server.

Regards Tomas

Den ons 23 okt. 2019 kl 08:58 skrev Gionatan Danti <g.danti@assyoma.it>:

> Il 23-10-2019 00:53 Stuart D. Gathman ha scritto:
> > If you can find all the leaf nodes belonging to the root (in my btree
> > database they are marked with the root id and can be found by
> > sequential
> > scan of the volume), then reconstructing the btree data is
> > straightforward - even in place.
> >
> > I remember realizing this was the only way to recover a major
> > customer's
> > data - and had the utility written, tested, and applied in a 36 hour
> > programming marathon (which I hope to never repeat).  If this hasn't
> > occured to thin pool programmers, I am happy to flesh out the
> > procedure.
> > Having such a utility available as a last resort would ratchet up the
> > reliability of thin pools.
>
> Very interesting. Can I ask you what product/database you recovered?
>
> Anyway, giving similar ability to thin Vols would be awesome.
>
> Thanks.
>
> --
> Danti Gionatan
> Supporto Tecnico
> Assyoma S.r.l. - www.assyoma.it
> email: g.danti@assyoma.it - info@assyoma.it
> GPG public key ID: FF5F32A8
>

[-- Attachment #2: Type: text/html, Size: 2450 bytes --]

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [linux-lvm] exposing snapshot block device
  2019-10-22 22:53           ` Stuart D. Gathman
  2019-10-23  6:58             ` Gionatan Danti
@ 2019-10-23 10:12             ` Zdenek Kabelac
  1 sibling, 0 replies; 53+ messages in thread
From: Zdenek Kabelac @ 2019-10-23 10:12 UTC (permalink / raw)
  To: LVM general discussion and development, Stuart D. Gathman,
	Gionatan Danti
  Cc: Dalebjörk, Tomas

Dne 23. 10. 19 v 0:53 Stuart D. Gathman napsal(a):
> On Tue, 22 Oct 2019, Gionatan Danti wrote:
> 
>> The main thing that somewhat scares me is that (if things had not changed) 
>> thinvol uses a single root btree node: losing it means losing *all* thin 
>> volumes of a specific thin pool. Coupled with the fact that metadata dump 
>> are not as handy as with the old LVM code (no vgcfgrestore), it worries me.
> 
> If you can find all the leaf nodes belonging to the root (in my btree
> database they are marked with the root id and can be found by sequential
> scan of the volume), then reconstructing the btree data is
> straightforward - even in place.
> 
> I remember realizing this was the only way to recover a major customer's
> data - and had the utility written, tested, and applied in a 36 hour
> programming marathon (which I hope to never repeat).� If this hasn't
> occured to thin pool programmers, I am happy to flesh out the procedure.
> Having such a utility available as a last resort would ratchet up the
> reliability of thin pools.


There have been made great enhancements in thin_repair tool (>=0.8.5)
But of course further fixes and extensions are always welcomed by Joe.

There are unfortunately some 'limitations' what can be fixed with current
metadata format but lots of troubles we have witnessed in past are now
mostly 'covered' by the recent kernel driver. But if there is known case 
causing troubles - please open BZ so we can look over it.



Regards

Zdenek

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [linux-lvm] exposing snapshot block device
  2019-10-22 16:13       ` Dalebjörk, Tomas
@ 2019-10-23 10:26         ` Zdenek Kabelac
  2019-10-23 10:56           ` Tomas Dalebjörk
  0 siblings, 1 reply; 53+ messages in thread
From: Zdenek Kabelac @ 2019-10-23 10:26 UTC (permalink / raw)
  To: LVM general discussion and development, Dalebjörk, Tomas,
	Zdenek Kabelac

Dne 22. 10. 19 v 18:13 Dalebj�rk, Tomas napsal(a):
> That is cool,
> 
> But, are there any practical example how this could be working in reality.
> 

There is not yet a practical example available from our lvm2 team yet.

So we are only describing the 'model' & 'plan' we have ATM...


> 
> I have created a way to perform block level incremental forever by reading the 
> -cow device, and thin_dump would be nice replacement for that.


COW is dead technology from our perspective - it can't cope with recent
performance of modern drives like NVMe...

So our plan is to focus on thinp technology here.


Zdenek

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [linux-lvm] exposing snapshot block device
  2019-10-22 16:15       ` Stuart D. Gathman
  2019-10-22 17:02         ` Tomas Dalebjörk
  2019-10-22 21:38         ` Gionatan Danti
@ 2019-10-23 10:46         ` Zdenek Kabelac
  2019-10-23 11:08           ` Gionatan Danti
  2 siblings, 1 reply; 53+ messages in thread
From: Zdenek Kabelac @ 2019-10-23 10:46 UTC (permalink / raw)
  To: LVM general discussion and development, Stuart D. Gathman
  Cc: Dalebjörk, Tomas

Dne 22. 10. 19 v 18:15 Stuart D. Gathman napsal(a):
> On Tue, 22 Oct 2019, Zdenek Kabelac wrote:
> 
>> Dne 22. 10. 19 v 17:29 Dalebj�rk, Tomas napsal(a):
>>> But, it would be better if the cow device could be recreated in a faster 
>>> way, mentioning that all blocks are present on an external device, so that 
>>> the LV volume can be restored much quicker using "lvconvert --merge" command.
> 
>> I do not want to break your imagination here, but that is exactly the thing 
>> you can do with thin provisioning and thin_delta tool.
> 
> lvconvert --merge does a "rollback" to the point at which the snapshot
> was taken.� The master LV already has current data.� What Tomas wants to
> be able to do a "rollforward" from the point at which the snapshot was
> taken.� He also wants to be able to put the cow volume on an
> extern/remote medium, and add a snapshot using an already existing cow.
> 
> This way, restoring means copying the full volume from backup, creating
> a snapshot using existing external cow, then lvconvert --merge instantly 
> logically applies the cow changes while updating the master
> LV.
> 
> Pros:
> 
> "Old" snapshots are exactly as efficient as thin when there is exactly
> one.� They only get inefficient with multiple snapshots.� On the other
> hand, thin volumes are as inefficient as an old LV with one snapshot.
> An old LV is as efficient, and as anti-fragile, as a partition.� Thin
> volumes are much more flexible, but depend on much more fragile database
> like meta-data.


Just few 'comments' - it's not really comparable - the efficiency of thin-pool 
metadata outperforms old snapshot in BIG way (there is no point to talk about 
snapshots that takes just couple of MiB)

There is also BIG difference about the usage of old snapshot origin and snapshot.

COW of old snapshot effectively cuts performance 1/2 if you write to origin.

> For this reason, I always prefer "old" LVs when the functionality of
> thin LVs are not actually needed.� I can even manually recover from
> trashed meta data by editing it, as it is human readable text.

On the other hand you can loose  COW snapshot at any moment in time
if your 'COW' storage is no big enough - this is very different
from thin-poo.....

Regards

Zdenek

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [linux-lvm] exposing snapshot block device
  2019-10-23 10:26         ` Zdenek Kabelac
@ 2019-10-23 10:56           ` Tomas Dalebjörk
  0 siblings, 0 replies; 53+ messages in thread
From: Tomas Dalebjörk @ 2019-10-23 10:56 UTC (permalink / raw)
  To: Zdenek Kabelac; +Cc: LVM general discussion and development

[-- Attachment #1: Type: text/plain, Size: 874 bytes --]

Thanks,

Ok, looking at thin than.
Is there a way to adopt similarities using thin instead?

Regards Tomas

Den ons 23 okt. 2019 kl 12:26 skrev Zdenek Kabelac <zkabelac@redhat.com>:

> Dne 22. 10. 19 v 18:13 Dalebjörk, Tomas napsal(a):
> > That is cool,
> >
> > But, are there any practical example how this could be working in
> reality.
> >
>
> There is not yet a practical example available from our lvm2 team yet.
>
> So we are only describing the 'model' & 'plan' we have ATM...
>
>
> >
> > I have created a way to perform block level incremental forever by
> reading the
> > -cow device, and thin_dump would be nice replacement for that.
>
>
> COW is dead technology from our perspective - it can't cope with recent
> performance of modern drives like NVMe...
>
> So our plan is to focus on thinp technology here.
>
>
> Zdenek
>
>

[-- Attachment #2: Type: text/html, Size: 1353 bytes --]

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [linux-lvm] exposing snapshot block device
  2019-10-23 10:46         ` Zdenek Kabelac
@ 2019-10-23 11:08           ` Gionatan Danti
  2019-10-23 11:24             ` Tomas Dalebjörk
                               ` (3 more replies)
  0 siblings, 4 replies; 53+ messages in thread
From: Gionatan Danti @ 2019-10-23 11:08 UTC (permalink / raw)
  To: LVM general discussion and development, Zdenek Kabelac,
	Stuart D. Gathman
  Cc: Dalebjörk, Tomas

On 23/10/19 12:46, Zdenek Kabelac wrote:
> Just few 'comments' - it's not really comparable - the efficiency of 
> thin-pool metadata outperforms old snapshot in BIG way (there is no 
> point to talk about snapshots that takes just couple of MiB)

Yes, this matches my experience.

> There is also BIG difference about the usage of old snapshot origin and 
> snapshot.
> 
> COW of old snapshot effectively cuts performance 1/2 if you write to 
> origin.

If used without non-volatile RAID controller, 1/2 is generous - I 
measured performance as low as 1/5 (with fat snapshot).

Talking about thin snapshot, an obvious performance optimization which 
seems to not be implemented is to skip reading source data when 
overwriting in larger-than-chunksize blocks.

For example, consider a completely filled 64k chunk thin volume (with 
thinpool having ample free space). Snapshotting it and writing a 4k 
block on origin will obviously cause a read of the original 64k chunk, 
an in-memory change of the 4k block and a write of the entire modified 
64k block to a new location. But writing, say, a 1 MB block should *not* 
cause the same read on source: after all, the read data will be 
immediately discarded, overwritten by the changed 1 MB block.

However, my testing shows that source chunks are always read, even when 
completely overwritten.

Am I missing something?

-- 
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti@assyoma.it - info@assyoma.it
GPG public key ID: FF5F32A8

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [linux-lvm] exposing snapshot block device
  2019-10-23 11:08           ` Gionatan Danti
@ 2019-10-23 11:24             ` Tomas Dalebjörk
  2019-10-23 11:26               ` Tomas Dalebjörk
  2019-10-24 16:01               ` Zdenek Kabelac
  2019-10-23 12:12             ` Ilia Zykov
                               ` (2 subsequent siblings)
  3 siblings, 2 replies; 53+ messages in thread
From: Tomas Dalebjörk @ 2019-10-23 11:24 UTC (permalink / raw)
  To: Gionatan Danti; +Cc: Zdenek Kabelac, LVM general discussion and development

[-- Attachment #1: Type: text/plain, Size: 2485 bytes --]

I have tested FusionIO together with old thick snapshots.
I created the thick snapshot on a separate old traditional SATA drive, just
to check if that could be used as a snapshot target for high performance
disks; like a Fusion IO card.
For those who doesn't know about FusionIO; they can deal with 150-250,000
IOPS.

And to be honest, I couldn't bottle neck the SATA disk I used as a thick
snapshot target.
The reason for why is simple:
- thick snapshots uses sequential write techniques

If I would have been using thin snapshots, than the writes would most
likely be more randomized on disk, which would have required more spindles
to coop with this.

Anyhow;
I am still eager to hear how to use an external device to import snapshots.
And when I say "import"; I am not talking about copyback, more to use to
read data from.

Regards Tomas

Den ons 23 okt. 2019 kl 13:08 skrev Gionatan Danti <g.danti@assyoma.it>:

> On 23/10/19 12:46, Zdenek Kabelac wrote:
> > Just few 'comments' - it's not really comparable - the efficiency of
> > thin-pool metadata outperforms old snapshot in BIG way (there is no
> > point to talk about snapshots that takes just couple of MiB)
>
> Yes, this matches my experience.
>
> > There is also BIG difference about the usage of old snapshot origin and
> > snapshot.
> >
> > COW of old snapshot effectively cuts performance 1/2 if you write to
> > origin.
>
> If used without non-volatile RAID controller, 1/2 is generous - I
> measured performance as low as 1/5 (with fat snapshot).
>
> Talking about thin snapshot, an obvious performance optimization which
> seems to not be implemented is to skip reading source data when
> overwriting in larger-than-chunksize blocks.
>
> For example, consider a completely filled 64k chunk thin volume (with
> thinpool having ample free space). Snapshotting it and writing a 4k
> block on origin will obviously cause a read of the original 64k chunk,
> an in-memory change of the 4k block and a write of the entire modified
> 64k block to a new location. But writing, say, a 1 MB block should *not*
> cause the same read on source: after all, the read data will be
> immediately discarded, overwritten by the changed 1 MB block.
>
> However, my testing shows that source chunks are always read, even when
> completely overwritten.
>
> Am I missing something?
>
> --
> Danti Gionatan
> Supporto Tecnico
> Assyoma S.r.l. - www.assyoma.it
> email: g.danti@assyoma.it - info@assyoma.it
> GPG public key ID: FF5F32A8
>

[-- Attachment #2: Type: text/html, Size: 3347 bytes --]

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [linux-lvm] exposing snapshot block device
  2019-10-23 11:24             ` Tomas Dalebjörk
@ 2019-10-23 11:26               ` Tomas Dalebjörk
  2019-10-24 16:01               ` Zdenek Kabelac
  1 sibling, 0 replies; 53+ messages in thread
From: Tomas Dalebjörk @ 2019-10-23 11:26 UTC (permalink / raw)
  To: Gionatan Danti; +Cc: Zdenek Kabelac, LVM general discussion and development

[-- Attachment #1: Type: text/plain, Size: 2972 bytes --]

And the block size for thick snapshots can be set when using the lvcreate
command.
And the automatic growing of a snapshot can be configured too in the lvm
configuration.

Same issues with both thin and thick, if you run out of space.
//T

Den ons 23 okt. 2019 kl 13:24 skrev Tomas Dalebjörk <
tomas.dalebjork@gmail.com>:

> I have tested FusionIO together with old thick snapshots.
> I created the thick snapshot on a separate old traditional SATA drive,
> just to check if that could be used as a snapshot target for high
> performance disks; like a Fusion IO card.
> For those who doesn't know about FusionIO; they can deal with 150-250,000
> IOPS.
>
> And to be honest, I couldn't bottle neck the SATA disk I used as a thick
> snapshot target.
> The reason for why is simple:
> - thick snapshots uses sequential write techniques
>
> If I would have been using thin snapshots, than the writes would most
> likely be more randomized on disk, which would have required more spindles
> to coop with this.
>
> Anyhow;
> I am still eager to hear how to use an external device to import snapshots.
> And when I say "import"; I am not talking about copyback, more to use to
> read data from.
>
> Regards Tomas
>
> Den ons 23 okt. 2019 kl 13:08 skrev Gionatan Danti <g.danti@assyoma.it>:
>
>> On 23/10/19 12:46, Zdenek Kabelac wrote:
>> > Just few 'comments' - it's not really comparable - the efficiency of
>> > thin-pool metadata outperforms old snapshot in BIG way (there is no
>> > point to talk about snapshots that takes just couple of MiB)
>>
>> Yes, this matches my experience.
>>
>> > There is also BIG difference about the usage of old snapshot origin and
>> > snapshot.
>> >
>> > COW of old snapshot effectively cuts performance 1/2 if you write to
>> > origin.
>>
>> If used without non-volatile RAID controller, 1/2 is generous - I
>> measured performance as low as 1/5 (with fat snapshot).
>>
>> Talking about thin snapshot, an obvious performance optimization which
>> seems to not be implemented is to skip reading source data when
>> overwriting in larger-than-chunksize blocks.
>>
>> For example, consider a completely filled 64k chunk thin volume (with
>> thinpool having ample free space). Snapshotting it and writing a 4k
>> block on origin will obviously cause a read of the original 64k chunk,
>> an in-memory change of the 4k block and a write of the entire modified
>> 64k block to a new location. But writing, say, a 1 MB block should *not*
>> cause the same read on source: after all, the read data will be
>> immediately discarded, overwritten by the changed 1 MB block.
>>
>> However, my testing shows that source chunks are always read, even when
>> completely overwritten.
>>
>> Am I missing something?
>>
>> --
>> Danti Gionatan
>> Supporto Tecnico
>> Assyoma S.r.l. - www.assyoma.it
>> email: g.danti@assyoma.it - info@assyoma.it
>> GPG public key ID: FF5F32A8
>>
>

[-- Attachment #2: Type: text/html, Size: 4068 bytes --]

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [linux-lvm] exposing snapshot block device
  2019-10-23 11:08           ` Gionatan Danti
  2019-10-23 11:24             ` Tomas Dalebjörk
@ 2019-10-23 12:12             ` Ilia Zykov
  2019-10-23 12:20             ` Ilia Zykov
  2019-10-23 12:59             ` Zdenek Kabelac
  3 siblings, 0 replies; 53+ messages in thread
From: Ilia Zykov @ 2019-10-23 12:12 UTC (permalink / raw)
  To: linux-lvm

[-- Attachment #1: Type: text/plain, Size: 993 bytes --]

On 23.10.2019 14:08, Gionatan Danti wrote:
> 
> For example, consider a completely filled 64k chunk thin volume (with
> thinpool having ample free space). Snapshotting it and writing a 4k
> block on origin will obviously cause a read of the original 64k chunk,
> an in-memory change of the 4k block and a write of the entire modified
> 64k block to a new location. But writing, say, a 1 MB block should *not*
> cause the same read on source: after all, the read data will be
> immediately discarded, overwritten by the changed 1 MB block.
> 
> However, my testing shows that source chunks are always read, even when
> completely overwritten.

Not only read but sometimes write.
I watched it without snapshot. Only zeroing was enabled. Before wrote
new chunks "dd bs=1048576 ..." chunks were zeroed. But for security it's
good. IMHO: In this case best choice firstly write chunks to the disk
and then give this chunks to the volume.

> 
> Am I missing something?
> 




[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 3703 bytes --]

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [linux-lvm] exposing snapshot block device
  2019-10-23 11:08           ` Gionatan Danti
  2019-10-23 11:24             ` Tomas Dalebjörk
  2019-10-23 12:12             ` Ilia Zykov
@ 2019-10-23 12:20             ` Ilia Zykov
  2019-10-23 13:05               ` Zdenek Kabelac
  2019-10-23 12:59             ` Zdenek Kabelac
  3 siblings, 1 reply; 53+ messages in thread
From: Ilia Zykov @ 2019-10-23 12:20 UTC (permalink / raw)
  To: linux-lvm

[-- Attachment #1: Type: text/plain, Size: 989 bytes --]

On 23.10.2019 14:08, Gionatan Danti wrote:
> 
> For example, consider a completely filled 64k chunk thin volume (with
> thinpool having ample free space). Snapshotting it and writing a 4k
> block on origin will obviously cause a read of the original 64k chunk,
> an in-memory change of the 4k block and a write of the entire modified
> 64k block to a new location. But writing, say, a 1 MB block should *not*
> cause the same read on source: after all, the read data will be
> immediately discarded, overwritten by the changed 1 MB block.
> 
> However, my testing shows that source chunks are always read, even when
> completely overwritten.

Not only read but sometimes write.
I watched it without snapshot. Only zeroing was enabled. Before wrote
new chunks "dd bs=1048576 ..." chunks were zeroed. But for security it's
good. IMHO: In this case good choice firstly write chunks to the disk
and then give this chunks to the volume.
> 
> Am I missing something?
> 



[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 3695 bytes --]

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [linux-lvm] exposing snapshot block device
  2019-10-23 11:08           ` Gionatan Danti
                               ` (2 preceding siblings ...)
  2019-10-23 12:20             ` Ilia Zykov
@ 2019-10-23 12:59             ` Zdenek Kabelac
  2019-10-23 14:37               ` Gionatan Danti
  3 siblings, 1 reply; 53+ messages in thread
From: Zdenek Kabelac @ 2019-10-23 12:59 UTC (permalink / raw)
  To: LVM general discussion and development, Gionatan Danti,
	Stuart D. Gathman
  Cc: Dalebjörk, Tomas

Dne 23. 10. 19 v 13:08 Gionatan Danti napsal(a):
> On 23/10/19 12:46, Zdenek Kabelac wrote:
>> Just few 'comments' - it's not really comparable - the efficiency of 
>> thin-pool metadata outperforms old snapshot in BIG way (there is no point to 
>> talk about snapshots that takes just couple of MiB)
> 
> Yes, this matches my experience.
> 
>> There is also BIG difference about the usage of old snapshot origin and 
>> snapshot.
>>
>> COW of old snapshot effectively cuts performance 1/2 if you write to origin.
> 
> If used without non-volatile RAID controller, 1/2 is generous - I measured 
> performance as low as 1/5 (with fat snapshot).
> 
> Talking about thin snapshot, an obvious performance optimization which seems 
> to not be implemented is to skip reading source data when overwriting in 
> larger-than-chunksize blocks.

Hi

There is no such optimization possible for old snapshots.
You would need to write ONLY to snapshots.

As soon as you start to write to origin - you have to 'read' original data 
from origin, copy them to COW storage, once this is finished, you can
overwrite origin data area with your writing I/O.

This is simply never going to work fast ;) - the fast way is thin-pool...

Old snapshots were designed for 'short' lived snapshots (so you can take
a backup of volume which is not being modified underneath).

Any idea of improving this old snapshots target are sooner or later going to 
end-up with thin-pool anyway :)  (we've been in this river many many years 
back in time...)


> For example, consider a completely filled 64k chunk thin volume (with thinpool 
> having ample free space). Snapshotting it and writing a 4k block on origin 

There is no support of  snapshot of snapshot  with old snaps...
It would be extremely slow to use...

> However, my testing shows that source chunks are always read, even when 
> completely overwritten.
> 
> Am I missing something?

Yep - you would need to always jump to your 'snapshot' - so instead of
keeping 'origin' on  major:minor  - it would need to become a 'snapshot'...
Seriously complex concept to work with - especially when there is thin-pool...

Regards

Zdenek

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [linux-lvm] exposing snapshot block device
  2019-10-23 12:20             ` Ilia Zykov
@ 2019-10-23 13:05               ` Zdenek Kabelac
  2019-10-23 14:40                 ` Gionatan Danti
  0 siblings, 1 reply; 53+ messages in thread
From: Zdenek Kabelac @ 2019-10-23 13:05 UTC (permalink / raw)
  To: LVM general discussion and development, Ilia Zykov

Dne 23. 10. 19 v 14:20 Ilia Zykov napsal(a):
> On 23.10.2019 14:08, Gionatan Danti wrote:
>>
>> For example, consider a completely filled 64k chunk thin volume (with
>> thinpool having ample free space). Snapshotting it and writing a 4k
>> block on origin will obviously cause a read of the original 64k chunk,
>> an in-memory change of the 4k block and a write of the entire modified
>> 64k block to a new location. But writing, say, a 1 MB block should *not*
>> cause the same read on source: after all, the read data will be
>> immediately discarded, overwritten by the changed 1 MB block.
>>
>> However, my testing shows that source chunks are always read, even when
>> completely overwritten.
> 
> Not only read but sometimes write.
> I watched it without snapshot. Only zeroing was enabled. Before wrote
> new chunks "dd bs=1048576 ..." chunks were zeroed. But for security it's
> good. IMHO: In this case good choice firstly write chunks to the disk
> and then give this chunks to the volume.


Yep - we are recommending to disable zeroing as soon as chunksize >512K.

But for 'security' reason the option it's up to users to select what fits the 
needs in the best way - there is no  'one solution fits them all' in this case.

Clearly when you put a modern filesystem (ext4, xfs...) on top of thinLV - you 
can't read junk data - filesystem knows very well about written portions.
But if you will access thinLV device on 'block-level' with 'dd' command you 
might see some old data trash if zeroing is disabled...

For smaller chunksizes  zeroing is usually not a big deal - with bigger chunks 
it slows down initial provisioning in major way - but once the block is 
provisioned there are no further costs....

Regards

Zdenek

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [linux-lvm] exposing snapshot block device
  2019-10-23 12:59             ` Zdenek Kabelac
@ 2019-10-23 14:37               ` Gionatan Danti
  2019-10-23 15:37                 ` Zdenek Kabelac
  0 siblings, 1 reply; 53+ messages in thread
From: Gionatan Danti @ 2019-10-23 14:37 UTC (permalink / raw)
  To: Zdenek Kabelac, LVM general discussion and development,
	Stuart D. Gathman
  Cc: Dalebjörk, Tomas

On 23/10/19 14:59, Zdenek Kabelac wrote:
> Dne 23. 10. 19 v 13:08 Gionatan Danti napsal(a):
>> Talking about thin snapshot, an obvious performance optimization which 
>> seems to not be implemented is to skip reading source data when 
>> overwriting in larger-than-chunksize blocks.
> 
> Hi
> 
> There is no such optimization possible for old snapshots.
> You would need to write ONLY to snapshots.
> 
> As soon as you start to write to origin - you have to 'read' original 
> data from origin, copy them to COW storage, once this is finished, you can
> overwrite origin data area with your writing I/O.
> 
> This is simply never going to work fast ;) - the fast way is thin-pool...
> 
> Old snapshots were designed for 'short' lived snapshots (so you can take
> a backup of volume which is not being modified underneath).
> 
> Any idea of improving this old snapshots target are sooner or later 
> going to end-up with thin-pool anyway :)� (we've been in this river many 
> many years back in time...)
> 
> 
>> For example, consider a completely filled 64k chunk thin volume (with 
>> thinpool having ample free space). Snapshotting it and writing a 4k 
>> block on origin 
> 
> There is no support of� snapshot of snapshot� with old snaps...
> It would be extremely slow to use...
> 
>> However, my testing shows that source chunks are always read, even 
>> when completely overwritten.
>>
>> Am I missing something?
> 
> Yep - you would need to always jump to your 'snapshot' - so instead of
> keeping 'origin' on� major:minor� - it would need to become a 'snapshot'...
> Seriously complex concept to work with - especially when there is 
> thin-pool...

Hi, I was speaking about *thin* snapshots here. Rewriting the example 
given above (for clarity):

"For example, consider a completely filled 64k chunk thin volume (with 
thinpool having ample free space). Snapshotting it and writing a 4k 
block on origin will obviously cause a read of the original 64k chunk, 
an in-memory change of the 4k block and a write of the entire modified 
64k block to a new location. But writing, say, a 1 MB block should *not* 
cause the same read on source: after all, the read data will be 
immediately discarded, overwritten by the changed 1 MB block."

I would expect that such large-block *thin* snapshot rewrite behavior 
would not cause a read/modify/write, but it really does.

Is this a low-hanging fruit or there are more fundamental problem 
avoiding read/modify/write in this case?

Thanks.

-- 
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti@assyoma.it - info@assyoma.it
GPG public key ID: FF5F32A8

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [linux-lvm] exposing snapshot block device
  2019-10-23 13:05               ` Zdenek Kabelac
@ 2019-10-23 14:40                 ` Gionatan Danti
  2019-10-23 15:46                   ` Ilia Zykov
  0 siblings, 1 reply; 53+ messages in thread
From: Gionatan Danti @ 2019-10-23 14:40 UTC (permalink / raw)
  To: LVM general discussion and development, Zdenek Kabelac, Ilia Zykov

On 23/10/19 15:05, Zdenek Kabelac wrote:
> Yep - we are recommending to disable zeroing as soon as chunksize >512K.
> 
> But for 'security' reason the option it's up to users to select what 
> fits the needs in the best way - there is no� 'one solution fits them 
> all' in this case.

Sure, but again: if writing a block larger than the underlying chunk, 
zeroing can (and should) skipped. Yet I seem to remember that the new 
block is zeroed in any case, even if it is going to be rewritten entirely.

Do I remember wrongly?

-- 
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti@assyoma.it - info@assyoma.it
GPG public key ID: FF5F32A8

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [linux-lvm] exposing snapshot block device
  2019-10-23 14:37               ` Gionatan Danti
@ 2019-10-23 15:37                 ` Zdenek Kabelac
  2019-10-23 17:16                   ` Gionatan Danti
  0 siblings, 1 reply; 53+ messages in thread
From: Zdenek Kabelac @ 2019-10-23 15:37 UTC (permalink / raw)
  To: LVM general discussion and development, Gionatan Danti,
	Stuart D. Gathman
  Cc: Dalebjörk, Tomas

Dne 23. 10. 19 v 16:37 Gionatan Danti napsal(a):
> On 23/10/19 14:59, Zdenek Kabelac wrote:
>> Dne 23. 10. 19 v 13:08 Gionatan Danti napsal(a):
>>> Talking about thin snapshot, an obvious performance optimization which 
>>> seems to not be implemented is to skip reading source data when overwriting 
>>> in larger-than-chunksize blocks.
> 
> "For example, consider a completely filled 64k chunk thin volume (with 
> thinpool having ample free space). Snapshotting it and writing a 4k block on 
> origin will obviously cause a read of the original 64k chunk, an in-memory 
> change of the 4k block and a write of the entire modified 64k block to a new 
> location. But writing, say, a 1 MB block should *not* cause the same read on 
> source: after all, the read data will be immediately discarded, overwritten by 
> the changed 1 MB block."
> 
> I would expect that such large-block *thin* snapshot rewrite behavior would 
> not cause a read/modify/write, but it really does.
> 
> Is this a low-hanging fruit or there are more fundamental problem avoiding 
> read/modify/write in this case?

Hi

If you use 1MiB chunksize for thin-pool and you use  'dd' with proper bs size
and you write 'aligned' on 1MiB boundary (be sure you user  directIO, so you 
are not a victim of some page cache flushing...) - there should not be any 
useless read.

If you still do see such read - and you can easily reproduce this with latest 
kernel - report a bug please with your reproducer and results.

Regards

Zdenek

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [linux-lvm] exposing snapshot block device
  2019-10-23 14:40                 ` Gionatan Danti
@ 2019-10-23 15:46                   ` Ilia Zykov
  0 siblings, 0 replies; 53+ messages in thread
From: Ilia Zykov @ 2019-10-23 15:46 UTC (permalink / raw)
  To: LVM general discussion and development

[-- Attachment #1: Type: text/plain, Size: 920 bytes --]

On 23.10.2019 17:40, Gionatan Danti wrote:
> On 23/10/19 15:05, Zdenek Kabelac wrote:
>> Yep - we are recommending to disable zeroing as soon as chunksize >512K.
>>
>> But for 'security' reason the option it's up to users to select what
>> fits the needs in the best way - there is no  'one solution fits them
>> all' in this case.
> 
> Sure, but again: if writing a block larger than the underlying chunk,
> zeroing can (and should) skipped. Yet I seem to remember that the new

At this case if we get reset before a full chunk written, the tail of
the chunk will be a foreign old data (if meta data already written) -
little security problem.
We need firstly write a data to the disk and then give the fully written
chunk to the volume. But I think it's 'little' complicate matters.

> block is zeroed in any case, even if it is going to be rewritten entirely.
> 
> Do I remember wrongly?
> 



[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 3695 bytes --]

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [linux-lvm] exposing snapshot block device
  2019-10-23 15:37                 ` Zdenek Kabelac
@ 2019-10-23 17:16                   ` Gionatan Danti
  0 siblings, 0 replies; 53+ messages in thread
From: Gionatan Danti @ 2019-10-23 17:16 UTC (permalink / raw)
  To: Zdenek Kabelac
  Cc: Dalebjörk, Tomas, LVM general discussion and development

Il 23-10-2019 17:37 Zdenek Kabelac ha scritto:
> Hi
> 
> If you use 1MiB chunksize for thin-pool and you use  'dd' with proper 
> bs size
> and you write 'aligned' on 1MiB boundary (be sure you user  directIO,
> so you are not a victim of some page cache flushing...) - there should
> not be any useless read.
> 
> If you still do see such read - and you can easily reproduce this with
> latest kernel - report a bug please with your reproducer and results.
> 
> Regards
> 
> Zdenek

OK, I triple-checked my numbers and you are right: on a fully updated 
CentOS 7.7 x86-64 box with kernel-3.10.0-1062.4.1 and lvm2-2.02.185-2, 
it seems that the behavior I observed on older (>2 years ago) is not 
present anymore.

Take this original lvm setup:
[root@localhost ~]# lvs -o +chunk_size
   LV       VG     Attr       LSize   Pool     Origin Data%  Meta%  Move 
Log Cpy%Sync Convert Chunk
   root     centos -wi-ao----  <6.20g                                     
                         0
   swap     centos -wi-ao---- 512.00m                                     
                         0
   thinpool centos twi-aot---   1.00g                 25.00  14.16        
                     64.00k
   thinvol  centos Vwi-a-t--- 256.00m thinpool        100.00              
                         0

Taking a snapshot (lvcreate -s /dev/centos/thinvol -n thinsnap) and 
overwriting 1 MB of data on origin via "dd if=/dev/urandom 
of=/dev/centos/thinvol bs=1M count=32 oflag=direct" results in the 
following I/O to/from disk:

[root@localhost ~]# dstat -d -D sdc
---dsk/sdc---
  read  writ
1036k   32M

As you can see, while 1 MB was indeed read (due to metadata read?), no 
other read amplification occoured.

Now I got curious to see if zeroing behave in the same manner. So, I 
deleted thinsnap & thinvol, toggled zeroing on (lvchange -Zy 
centos/thinpool), and recreated thinvol:

[root@localhost ~]# lvs -o +chunk_size
   LV       VG     Attr       LSize   Pool     Origin Data%  Meta%  Move 
Log Cpy%Sync Convert Chunk
   root     centos -wi-ao----  <6.20g                                     
                         0
   swap     centos -wi-ao---- 512.00m                                     
                         0
   thinpool centos twi-aotz--   1.00g                 0.00   11.04        
                     64.00k
   thinvol  centos Vwi-a-tz-- 256.00m thinpool        0.00                
                         0

[root@localhost ~]# dstat -d -D sdc
--dsk/sdc--
  read  writ
    0    13M
  520k   19M

Again, no write amplificaton occoured.

Kudos to all the team for optimizing lvmthin in this manner, it really 
is a flexible and great performing tool.
Regards.

-- 
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti@assyoma.it - info@assyoma.it
GPG public key ID: FF5F32A8

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [linux-lvm] exposing snapshot block device
  2019-10-23 11:24             ` Tomas Dalebjörk
  2019-10-23 11:26               ` Tomas Dalebjörk
@ 2019-10-24 16:01               ` Zdenek Kabelac
  2019-10-25 16:31                 ` Tomas Dalebjörk
  1 sibling, 1 reply; 53+ messages in thread
From: Zdenek Kabelac @ 2019-10-24 16:01 UTC (permalink / raw)
  To: LVM general discussion and development, Tomas Dalebjörk,
	Gionatan Danti

Dne 23. 10. 19 v 13:24 Tomas Dalebjörk napsal(a):
> I have tested FusionIO together with old thick snapshots.
> I created the thick snapshot on a separate old traditional SATA drive, just to 
> check if that could be used as a snapshot target for high performance disks; 
> like a Fusion IO card.
> For those who doesn't know about FusionIO; they can deal with 150-250,000 IOPS.
> 
> And to be honest, I couldn't bottle neck the SATA disk I used as a thick 
> snapshot target.
> The reason for why is simple:
> - thick snapshots uses sequential write techniques
> 
> If I would have been using thin snapshots, than the writes would most likely 
> be more randomized on disk, which would have required more spindles to coop 
> with this.
> 
> Anyhow;
> I am still eager to hear how to use an external device to import snapshots.
> And when I say "import"; I am not talking about copyback, more to use to read 
> data from.

Format of 'on-disk' snapshot metadata for old snapshot is trivial - being some
header + pairs of dataoffset-TO-FROM -  I think googling will reveal couple
python tools playing with it.

You can add pre-created COW image to LV  with  lvconvert --snapshot
and to avoid 'zeroing' metadata use option -Zn
(BTW in the same way you can detach snapshot from LV with --splitsnapshot so 
you can look how the metadata looks like...)

Although it's pretty unusual why would anyone create first the COW image with 
all the special layout and then merge it to LV - instead of directly 
merging...   There is only the 'little' advantage of minimizing 'offline' time 
of such device   (and it's the reason why --split exists).

Regards

Zdenek

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [linux-lvm] exposing snapshot block device
  2019-10-24 16:01               ` Zdenek Kabelac
@ 2019-10-25 16:31                 ` Tomas Dalebjörk
  2019-11-04  5:54                   ` Tomas Dalebjörk
  0 siblings, 1 reply; 53+ messages in thread
From: Tomas Dalebjörk @ 2019-10-25 16:31 UTC (permalink / raw)
  To: Zdenek Kabelac; +Cc: Gionatan Danti, LVM general discussion and development

[-- Attachment #1: Type: text/plain, Size: 2760 bytes --]

Wow!

Impressing.
This will make history!

If this is possible, than we are able to implement a solution, which can do:
- progressive block level incremental forever (always incremental on block
level : this already exist)
- instant recovery to point in time (using the mentioned methods you just
described)

For example, lets say that a client wants to restore a file system, or a
logical volume to how it looked a like yesterday.
Eventhough there are no snapshot, nor any data.
Than the client (with some coding); can start from an empty volume, and
re-attach a cow device, and convert that using lvconvert --merge, so that
the copying can be done in background using the backup server.

If you forget about "how we will re-create the cow device"; and just
focusing on the LVM ideas of re-attaching a cow device.
Do you think that I have understood it correctly?


Den tors 24 okt. 2019 kl 18:01 skrev Zdenek Kabelac <zkabelac@redhat.com>:

> Dne 23. 10. 19 v 13:24 Tomas Dalebjörk napsal(a):
> > I have tested FusionIO together with old thick snapshots.
> > I created the thick snapshot on a separate old traditional SATA drive,
> just to
> > check if that could be used as a snapshot target for high performance
> disks;
> > like a Fusion IO card.
> > For those who doesn't know about FusionIO; they can deal with
> 150-250,000 IOPS.
> >
> > And to be honest, I couldn't bottle neck the SATA disk I used as a thick
> > snapshot target.
> > The reason for why is simple:
> > - thick snapshots uses sequential write techniques
> >
> > If I would have been using thin snapshots, than the writes would most
> likely
> > be more randomized on disk, which would have required more spindles to
> coop
> > with this.
> >
> > Anyhow;
> > I am still eager to hear how to use an external device to import
> snapshots.
> > And when I say "import"; I am not talking about copyback, more to use to
> read
> > data from.
>
> Format of 'on-disk' snapshot metadata for old snapshot is trivial - being
> some
> header + pairs of dataoffset-TO-FROM -  I think googling will reveal couple
> python tools playing with it.
>
> You can add pre-created COW image to LV  with  lvconvert --snapshot
> and to avoid 'zeroing' metadata use option -Zn
> (BTW in the same way you can detach snapshot from LV with --splitsnapshot
> so
> you can look how the metadata looks like...)
>
> Although it's pretty unusual why would anyone create first the COW image
> with
> all the special layout and then merge it to LV - instead of directly
> merging...   There is only the 'little' advantage of minimizing 'offline'
> time
> of such device   (and it's the reason why --split exists).
>
> Regards
>
> Zdenek
>
>
>

[-- Attachment #2: Type: text/html, Size: 3391 bytes --]

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [linux-lvm] exposing snapshot block device
  2019-10-25 16:31                 ` Tomas Dalebjörk
@ 2019-11-04  5:54                   ` Tomas Dalebjörk
  2019-11-04 10:07                     ` Zdenek Kabelac
  0 siblings, 1 reply; 53+ messages in thread
From: Tomas Dalebjörk @ 2019-11-04  5:54 UTC (permalink / raw)
  To: Zdenek Kabelac; +Cc: Gionatan Danti, LVM general discussion and development

[-- Attachment #1: Type: text/plain, Size: 3897 bytes --]

Hi

I have some additional questions related to this.
regarding this statement:
“ While the merge is in progress, reads or writes to the origin appear as they were directed to the snapshot being merged. ”

What exactly does that mean?

Will that mean that before changes are being placed on the origin device, it has to first:
read the data from the snapshot back to origin, copy the data back from origin to the snapshot, and than after that allow changes to happen?
if that is the case, does it keep track of that this block should not be copied again?

and will the ongoing merge priorities this block before the other background copying?

how about read operations ?
will the requested read operations on the origin volume be prioritized before the copying of snapshot data?

I didn’t find much information about this, hence why I ask here

assuming that someone has executed: lvconvert - - merge -b snapshot

thanks for the feedback 

Skickat från min iPhone

> 25 okt. 2019 kl. 18:31 skrev Tomas Dalebjörk <tomas.dalebjork@gmail.com>:
> 
> 
> Wow!
> 
> Impressing.
> This will make history!
> 
> If this is possible, than we are able to implement a solution, which can do:
> - progressive block level incremental forever (always incremental on block level : this already exist)
> - instant recovery to point in time (using the mentioned methods you just described)
> 
> For example, lets say that a client wants to restore a file system, or a logical volume to how it looked a like yesterday.
> Eventhough there are no snapshot, nor any data.
> Than the client (with some coding); can start from an empty volume, and re-attach a cow device, and convert that using lvconvert --merge, so that the copying can be done in background using the backup server.
> 
> If you forget about "how we will re-create the cow device"; and just focusing on the LVM ideas of re-attaching a cow device.
> Do you think that I have understood it correctly?
> 
> 
> Den tors 24 okt. 2019 kl 18:01 skrev Zdenek Kabelac <zkabelac@redhat.com>:
>> Dne 23. 10. 19 v 13:24 Tomas Dalebjörk napsal(a):
>> > I have tested FusionIO together with old thick snapshots.
>> > I created the thick snapshot on a separate old traditional SATA drive, just to 
>> > check if that could be used as a snapshot target for high performance disks; 
>> > like a Fusion IO card.
>> > For those who doesn't know about FusionIO; they can deal with 150-250,000 IOPS.
>> > 
>> > And to be honest, I couldn't bottle neck the SATA disk I used as a thick
>> > snapshot target.
>> > The reason for why is simple:
>> > - thick snapshots uses sequential write techniques
>> > 
>> > If I would have been using thin snapshots, than the writes would most likely 
>> > be more randomized on disk, which would have required more spindles to coop 
>> > with this.
>> > 
>> > Anyhow;
>> > I am still eager to hear how to use an external device to import snapshots.
>> > And when I say "import"; I am not talking about copyback, more to use to read 
>> > data from.
>> 
>> Format of 'on-disk' snapshot metadata for old snapshot is trivial - being some
>> header + pairs of dataoffset-TO-FROM -  I think googling will reveal couple
>> python tools playing with it.
>> 
>> You can add pre-created COW image to LV  with  lvconvert --snapshot
>> and to avoid 'zeroing' metadata use option -Zn
>> (BTW in the same way you can detach snapshot from LV with --splitsnapshot so 
>> you can look how the metadata looks like...)
>> 
>> Although it's pretty unusual why would anyone create first the COW image with 
>> all the special layout and then merge it to LV - instead of directly 
>> merging...   There is only the 'little' advantage of minimizing 'offline' time 
>> of such device   (and it's the reason why --split exists).
>> 
>> Regards
>> 
>> Zdenek
>> 
>> 

[-- Attachment #2: Type: text/html, Size: 5353 bytes --]

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [linux-lvm] exposing snapshot block device
  2019-11-04  5:54                   ` Tomas Dalebjörk
@ 2019-11-04 10:07                     ` Zdenek Kabelac
  2019-11-04 14:40                       ` Tomas Dalebjörk
  0 siblings, 1 reply; 53+ messages in thread
From: Zdenek Kabelac @ 2019-11-04 10:07 UTC (permalink / raw)
  To: Tomas Dalebjörk; +Cc: LVM general discussion and development

Dne 04. 11. 19 v 6:54 Tomas Dalebjörk napsal(a):
> Hi
> 
> I have some additional questions related to this.
> regarding this statement:
> “ While the merge is in progress, reads or writes to the origin appear as they 
> were directed to the snapshot being merged. ”
> 
> What exactly does that mean?
> 
> Will that mean that before changes are being placed on the origin device, it 
> has to first:
> read the data from the snapshot back to origin, copy the data back from origin 
> to the snapshot, and than after that allow changes to happen?
> if that is the case, does it keep track of that this block should not be 
> copied again?

Hi

When the 'merge' is in progress -  your 'origin' is no longer accessible
for your normal usage. It's hiddenly active and only usable by snapshot-merge 
target)

So during 'merging' - you can already use you snapshot like if it would be and
origin - and in the background there is a process that reads data from 
'snapshot' COW device and copies them back to hidden origin.
(this is what you can observe with 'lvs' and copy%)

So any 'new' writes to such device lends at right place -  reads are either 
from COW (if the block has not yet been merged) or from origin.

Once all blocks from 'COW' are merged into origing - tables are remapped again
so all 'supportive' devices are removed and only your 'now fully merged' 
origin becomes present for usage (while still being fully online)

Hopefully it gets more clear.


For more explanation how DM works - probably visit:
http://people.redhat.com/agk/talks/

> and will the ongoing merge priorities this block before the other background 
> copying?
> 
> how about read operations ?
> will the requested read operations on the origin volume be prioritized before 
> the copying of snapshot data?

The priority is that you always get proper block.
Don't seek there the 'top most' performance - the correctness was always the 
priority there and for long time there is no much devel effort on this ancient 
target - since  thin-pool usage is simply way more superior....

1st. note - major difficulty comes from ONLINE usage. If you do NOT need 
device to be online (aka you keep 'reserve' copy of device) - you can merge 
things directly into a device - and I simply don't see why you would want to 
complicate this whole with extra step of transforming data into COW format 
first and the do online merge.

2nd. note - clearly one cannot start 'merge' of snapshot into origin while 
such origin device is in-use (i.e. mounted) - as that would lead to 
'modification' of such filesystem under its hands.

Regards

Zdenek

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [linux-lvm] exposing snapshot block device
  2019-11-04 10:07                     ` Zdenek Kabelac
@ 2019-11-04 14:40                       ` Tomas Dalebjörk
  2019-11-04 15:04                         ` Zdenek Kabelac
  2019-11-05 16:40                         ` Mikulas Patocka
  0 siblings, 2 replies; 53+ messages in thread
From: Tomas Dalebjörk @ 2019-11-04 14:40 UTC (permalink / raw)
  To: Zdenek Kabelac; +Cc: LVM general discussion and development

[-- Attachment #1: Type: text/plain, Size: 5103 bytes --]

Thanks for feedback.

Let me try to type different scenarios:

We have an origin volume, lets call it: /dev/vg00/lv00
We convert a snapshot volume to origin volume, lets call it:
/dev/vg00/lv00-snap
- all blocks has been changed, and are represented in the
/dev/vg00/lv00-snap, when we start the lvconvert process

I assume that something reads the data from /dev/vg00/lv00-snap and copy
that to /dev/vg00/lv00
It will most likely start from the first block, to the last block to copy.
The block size is 1MB on /dev/vg00/lv00-snap, and we have for simplicity
the same block size on the origin /dev/vg00/lv00

Scenario 1: A read comes want to read block LP 100, but lvconvert has not
yet copied that LP block.
Will the read comes from /dev/vg00/lv00-snap directly and delivered to
requestor?
Or will lvconvert prioritize to copy data from /dev/vg00/lv00-snap to
/dev/vg00/lv00 for that block, and let the requestor wait until the copying
has been completed, so that a read operation can happen from origin?
Or will the requestor have to wait until the copy data from
/dev/vg00/lv00-snap to /dev/vg00/lv00 for that block has been completed,
without any prioritization?

Scenario 2: A write comes want to write block LP 100, but lvconvert has not
yet copied that LP block (yes, I do understand that origin is hidden now)
Will lvconvery prioritize to copy data from /dev/vg00/lv00-snap to
/dev/vg00/lv00 for that block, and let the requestor write the changes
directly on the origin after the copying has been performed?
Or will the write be blocked until lvconvert has finished the copying of
the requested block, and than a write can be accepted to the origin?
Or where will the changes be written?

It is important for me to understand, as the backup device that I want to
map as a COW device is a read only target, and is not allowed to be written
to.
If read happends from the backup COW device, and writes happends to the
origin, than it is possible to create an instant recovery.
If writes happends to the backup COW device, than it not that easy to
implement a instance reovery solution, as the backup device is write
protected.

Thanks in advance.

Den mån 4 nov. 2019 kl 11:07 skrev Zdenek Kabelac <zkabelac@redhat.com>:

> Dne 04. 11. 19 v 6:54 Tomas Dalebjörk napsal(a):
> > Hi
> >
> > I have some additional questions related to this.
> > regarding this statement:
> > “ While the merge is in progress, reads or writes to the origin appear
> as they
> > were directed to the snapshot being merged. ”
> >
> > What exactly does that mean?
> >
> > Will that mean that before changes are being placed on the origin
> device, it
> > has to first:
> > read the data from the snapshot back to origin, copy the data back from
> origin
> > to the snapshot, and than after that allow changes to happen?
> > if that is the case, does it keep track of that this block should not be
> > copied again?
>
> Hi
>
> When the 'merge' is in progress -  your 'origin' is no longer accessible
> for your normal usage. It's hiddenly active and only usable by
> snapshot-merge
> target)
>
> So during 'merging' - you can already use you snapshot like if it would be
> and
> origin - and in the background there is a process that reads data from
> 'snapshot' COW device and copies them back to hidden origin.
> (this is what you can observe with 'lvs' and copy%)
>
> So any 'new' writes to such device lends at right place -  reads are
> either
> from COW (if the block has not yet been merged) or from origin.
>
> Once all blocks from 'COW' are merged into origing - tables are remapped
> again
> so all 'supportive' devices are removed and only your 'now fully merged'
> origin becomes present for usage (while still being fully online)
>
> Hopefully it gets more clear.
>
>
> For more explanation how DM works - probably visit:
> http://people.redhat.com/agk/talks/
>
> > and will the ongoing merge priorities this block before the other
> background
> > copying?
> >
> > how about read operations ?
> > will the requested read operations on the origin volume be prioritized
> before
> > the copying of snapshot data?
>
> The priority is that you always get proper block.
> Don't seek there the 'top most' performance - the correctness was always
> the
> priority there and for long time there is no much devel effort on this
> ancient
> target - since  thin-pool usage is simply way more superior....
>
> 1st. note - major difficulty comes from ONLINE usage. If you do NOT need
> device to be online (aka you keep 'reserve' copy of device) - you can
> merge
> things directly into a device - and I simply don't see why you would want
> to
> complicate this whole with extra step of transforming data into COW format
> first and the do online merge.
>
> 2nd. note - clearly one cannot start 'merge' of snapshot into origin while
> such origin device is in-use (i.e. mounted) - as that would lead to
> 'modification' of such filesystem under its hands.
>
> Regards
>
> Zdenek
>
>

[-- Attachment #2: Type: text/html, Size: 6031 bytes --]

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [linux-lvm] exposing snapshot block device
  2019-11-04 14:40                       ` Tomas Dalebjörk
@ 2019-11-04 15:04                         ` Zdenek Kabelac
  2019-11-04 17:28                           ` Tomas Dalebjörk
  2019-11-05 16:40                         ` Mikulas Patocka
  1 sibling, 1 reply; 53+ messages in thread
From: Zdenek Kabelac @ 2019-11-04 15:04 UTC (permalink / raw)
  To: Tomas Dalebjörk; +Cc: LVM general discussion and development

Dne 04. 11. 19 v 15:40 Tomas Dalebjörk napsal(a):
> Thanks for feedback.
> 
>

> Scenario 2: A write comes want to write block LP 100, but lvconvert has not 
> yet copied that LP block (yes, I do understand that origin is hidden now)
> Will lvconvery prioritize to copy data from /dev/vg00/lv00-snap to 
> /dev/vg00/lv00 for that block, and let the requestor write the changes 
> directly on the origin after the copying has been performed?
> Or will the write be blocked until lvconvert has finished the copying of the 
> requested block, and than a write can be accepted to the origin?
> Or where will the changes be written?

Since the COW device contains not only 'data' but also 'metadata'  blocks
and during the 'merge' it's being updated so it 'knows' which data has
been already merged back to origin (in other words during the merge the usage 
of COW is being reduced towards 0)  - I assume your 'plan' stops right here
and there is not much point to explore how much sub-optimal the rest of 
merging process is  (and as said - primary aspect was robustness - so if there 
is crash in any moment in time - data remain correct)

Regards

Zdenek

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [linux-lvm] exposing snapshot block device
  2019-11-04 15:04                         ` Zdenek Kabelac
@ 2019-11-04 17:28                           ` Tomas Dalebjörk
  2019-11-05 16:24                             ` Zdenek Kabelac
  0 siblings, 1 reply; 53+ messages in thread
From: Tomas Dalebjörk @ 2019-11-04 17:28 UTC (permalink / raw)
  To: Zdenek Kabelac; +Cc: LVM general discussion and development

thanks, I understand that meta data blocks needs to be update, that I can understand.
how about the other questions?
like : data write will happen towards which device? cow device or after the copying has been completed to the origin disk?


Skickat från min iPhone

> 4 nov. 2019 kl. 16:04 skrev Zdenek Kabelac <zkabelac@redhat.com>:
> 
> Dne 04. 11. 19 v 15:40 Tomas Dalebjörk napsal(a):
>> Thanks for feedback.
>> 
> 
>> Scenario 2: A write comes want to write block LP 100, but lvconvert has not yet copied that LP block (yes, I do understand that origin is hidden now)
>> Will lvconvery prioritize to copy data from /dev/vg00/lv00-snap to /dev/vg00/lv00 for that block, and let the requestor write the changes directly on the origin after the copying has been performed?
>> Or will the write be blocked until lvconvert has finished the copying of the requested block, and than a write can be accepted to the origin?
>> Or where will the changes be written?
> 
> Since the COW device contains not only 'data' but also 'metadata'  blocks
> and during the 'merge' it's being updated so it 'knows' which data has
> been already merged back to origin (in other words during the merge the usage of COW is being reduced towards 0)  - I assume your 'plan' stops right here
> and there is not much point to explore how much sub-optimal the rest of merging process is  (and as said - primary aspect was robustness - so if there is crash in any moment in time - data remain correct)
> 
> Regards
> 
> Zdenek
> 

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [linux-lvm] exposing snapshot block device
  2019-11-04 17:28                           ` Tomas Dalebjörk
@ 2019-11-05 16:24                             ` Zdenek Kabelac
  0 siblings, 0 replies; 53+ messages in thread
From: Zdenek Kabelac @ 2019-11-05 16:24 UTC (permalink / raw)
  To: LVM general discussion and development, Tomas Dalebjörk

Dne 04. 11. 19 v 18:28 Tomas Dalebjörk napsal(a):
> thanks, I understand that meta data blocks needs to be update, that I can understand.
> how about the other questions?
> like : data write will happen towards which device? cow device or after the copying has been completed to the origin disk?

Hi

I'd assume - if the block is still mapped in COW and the block is not yet 
merged into origin - the 'write' needs to lend COW - as there is no 'extra' 
information about which 'portion' of the chunk has been already 'merged'.
If you happen to 'write' your I/O to currently merged 'chunk' - you will
wait till check gets merged and metadata are updated and then your I/O land in 
origin.

But I don't think there are any optimization made - as it doesn't really 
matter too much in terms of the actual merging speed -  if couple I/O are 
repeated - who cares - on the overall time of whole merging process it will 
have negligible impact - and as said - the preference was made towards 
simplicity and correctness.

For the most details - just feel free to take a look at:

linux/drviers/md/dm-snap.c

i.e. function snapshot_merge_next_chunks()

The snapshot was designed to be small and map a very low percentage of origin 
device - it's never been assumed to be used with 200GiB and similar snapshot 
COW size....

Regads

Zdenek

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [linux-lvm] exposing snapshot block device
  2019-11-04 14:40                       ` Tomas Dalebjörk
  2019-11-04 15:04                         ` Zdenek Kabelac
@ 2019-11-05 16:40                         ` Mikulas Patocka
  2019-11-05 20:56                           ` Tomas Dalebjörk
  1 sibling, 1 reply; 53+ messages in thread
From: Mikulas Patocka @ 2019-11-05 16:40 UTC (permalink / raw)
  To: Tomas Dalebjörk
  Cc: LVM general discussion and development, Zdenek Kabelac

[-- Attachment #1: Type: TEXT/PLAIN, Size: 3088 bytes --]



On Mon, 4 Nov 2019, Tomas Dalebjörk wrote:

> Thanks for feedback.
> 
> Let me try to type different scenarios:
> 
> We have an origin volume, lets call it: /dev/vg00/lv00
> We convert a snapshot volume to origin volume, lets call it: /dev/vg00/lv00-snap
> - all blocks has been changed, and are represented in the /dev/vg00/lv00-snap, when we start the lvconvert process
> 
> I assume that something reads the data from /dev/vg00/lv00-snap and copy that to /dev/vg00/lv00
> It will most likely start from the first block, to the last block to copy.

Merging starts from the last block on the lv00-snap device and it proceeds 
backward to the beginning.

> The block size is 1MB on /dev/vg00/lv00-snap, and we have for simplicity the same block size on the origin /dev/vg00/lv00
> 
> Scenario 1: A read comes want to read block LP 100, but lvconvert has not yet copied that LP block.
> Will the read comes from /dev/vg00/lv00-snap directly and delivered to requestor?

Yes.

> Or will lvconvert prioritize to copy data from /dev/vg00/lv00-snap to /dev/vg00/lv00 for that block, and let the requestor wait until the copying has been completed, so
> that a read operation can happen from origin?
> Or will the requestor have to wait until the copy data from /dev/vg00/lv00-snap to /dev/vg00/lv00 for that block has been completed, without any prioritization?

It only waits if you attempt to read or write the block that is currently 
being copied.

If you read data that hasn't been merged yet, it reads from the snapshot, 
if you read data that has been merged, it reads from the origin, if you 
read data that is currently being copied, it waits.

> Scenario 2: A write comes want to write block LP 100, but lvconvert has not yet copied that LP block (yes, I do understand that origin is hidden now)
> Will lvconvery prioritize to copy data from /dev/vg00/lv00-snap to /dev/vg00/lv00 for that block, and let the requestor write the changes directly on the origin after the
> copying has been performed?

No.

> Or will the write be blocked until lvconvert has finished the copying of the requested block, and than a write can be accepted to the origin?
> Or where will the changes be written?

The changes will be written to the lv00-snap device.

If you write data that hasn't been merged yet, the write is redirected to 
the lv00-snap device. If you write data that has already been merged, the 
write is directed to the origin device. If you write data that is 
currently being merged, it waits.

> It is important for me to understand, as the backup device that I want to map as a COW device is a read only target, and is not allowed to be written to.

You can't have read-only COW device. Both metadata and data on the COW 
device are updated during the merge.

> If read happends from the backup COW device, and writes happends to the origin, than it is possible to create an instant recovery.
> If writes happends to the backup COW device, than it not that easy to implement a instance reovery solution, as the backup device is write protected.
> 
> Thanks in advance.

Mikulas

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [linux-lvm] exposing snapshot block device
  2019-11-05 16:40                         ` Mikulas Patocka
@ 2019-11-05 20:56                           ` Tomas Dalebjörk
  2019-11-06  9:22                             ` Zdenek Kabelac
  2019-11-07 16:54                             ` Mikulas Patocka
  0 siblings, 2 replies; 53+ messages in thread
From: Tomas Dalebjörk @ 2019-11-05 20:56 UTC (permalink / raw)
  To: Mikulas Patocka; +Cc: LVM general discussion and development, Zdenek Kabelac

[-- Attachment #1: Type: text/plain, Size: 3788 bytes --]

Thanks,

That really helped me to understand how the snapshot works.
Last question:
- lets say that block 100 which is 1MB in size is in the cow device, and a
write happen that wants to something or all data on that region of block
100.
Than I assume; based on what have been previously said here, that the block
in the cow device will be overwritten with the new changes.

Regards Tomas

Den tis 5 nov. 2019 kl 17:40 skrev Mikulas Patocka <mpatocka@redhat.com>:

>
>
> On Mon, 4 Nov 2019, Tomas Dalebjörk wrote:
>
> > Thanks for feedback.
> >
> > Let me try to type different scenarios:
> >
> > We have an origin volume, lets call it: /dev/vg00/lv00
> > We convert a snapshot volume to origin volume, lets call it:
> /dev/vg00/lv00-snap
> > - all blocks has been changed, and are represented in the
> /dev/vg00/lv00-snap, when we start the lvconvert process
> >
> > I assume that something reads the data from /dev/vg00/lv00-snap and copy
> that to /dev/vg00/lv00
> > It will most likely start from the first block, to the last block to
> copy.
>
> Merging starts from the last block on the lv00-snap device and it proceeds
> backward to the beginning.
>
> > The block size is 1MB on /dev/vg00/lv00-snap, and we have for simplicity
> the same block size on the origin /dev/vg00/lv00
> >
> > Scenario 1: A read comes want to read block LP 100, but lvconvert has
> not yet copied that LP block.
> > Will the read comes from /dev/vg00/lv00-snap directly and delivered to
> requestor?
>
> Yes.
>
> > Or will lvconvert prioritize to copy data from /dev/vg00/lv00-snap to
> /dev/vg00/lv00 for that block, and let the requestor wait until the copying
> has been completed, so
> > that a read operation can happen from origin?
> > Or will the requestor have to wait until the copy data from
> /dev/vg00/lv00-snap to /dev/vg00/lv00 for that block has been completed,
> without any prioritization?
>
> It only waits if you attempt to read or write the block that is currently
> being copied.
>
> If you read data that hasn't been merged yet, it reads from the snapshot,
> if you read data that has been merged, it reads from the origin, if you
> read data that is currently being copied, it waits.
>
> > Scenario 2: A write comes want to write block LP 100, but lvconvert has
> not yet copied that LP block (yes, I do understand that origin is hidden
> now)
> > Will lvconvery prioritize to copy data from /dev/vg00/lv00-snap to
> /dev/vg00/lv00 for that block, and let the requestor write the changes
> directly on the origin after the
> > copying has been performed?
>
> No.
>
> > Or will the write be blocked until lvconvert has finished the copying of
> the requested block, and than a write can be accepted to the origin?
> > Or where will the changes be written?
>
> The changes will be written to the lv00-snap device.
>
> If you write data that hasn't been merged yet, the write is redirected to
> the lv00-snap device. If you write data that has already been merged, the
> write is directed to the origin device. If you write data that is
> currently being merged, it waits.
>
> > It is important for me to understand, as the backup device that I want
> to map as a COW device is a read only target, and is not allowed to be
> written to.
>
> You can't have read-only COW device. Both metadata and data on the COW
> device are updated during the merge.
>
> > If read happends from the backup COW device, and writes happends to the
> origin, than it is possible to create an instant recovery.
> > If writes happends to the backup COW device, than it not that easy to
> implement a instance reovery solution, as the backup device is write
> protected.
> >
> > Thanks in advance.
>
> Mikulas

[-- Attachment #2: Type: text/html, Size: 4327 bytes --]

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [linux-lvm] exposing snapshot block device
  2019-11-05 20:56                           ` Tomas Dalebjörk
@ 2019-11-06  9:22                             ` Zdenek Kabelac
  2019-11-07 16:54                             ` Mikulas Patocka
  1 sibling, 0 replies; 53+ messages in thread
From: Zdenek Kabelac @ 2019-11-06  9:22 UTC (permalink / raw)
  To: Tomas Dalebjörk; +Cc: LVM general discussion and development

Dne 05. 11. 19 v 21:56 Tomas Dalebjörk napsal(a):
> Thanks,
> 
> That really helped me to understand how the snapshot works.
> Last question:
> - lets say that block 100 which is 1MB in size is in the cow device, and a 
> write happen that wants to something or all data on that region of block 100.
> Than I assume; based on what have been previously said here, that the block in 
> the cow device will be overwritten with the new changes.

Yes - it needs to be written to 'COW' device - since when the block will be 
merged - it would overwrite whatever would have been written in 'origin'
(as said - there is nothing else in snapshot metadata then  'from->to' block 
mapping table - so there is no way to store information about a portion of 
'chunk' being already written into origin) - and 'merge' needs to work 
reliable in cases like 'power-off' in the middle of merge operation...

Regards

Zdenek

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [linux-lvm] exposing snapshot block device
  2019-11-05 20:56                           ` Tomas Dalebjörk
  2019-11-06  9:22                             ` Zdenek Kabelac
@ 2019-11-07 16:54                             ` Mikulas Patocka
  2019-11-07 17:29                               ` Tomas Dalebjörk
  1 sibling, 1 reply; 53+ messages in thread
From: Mikulas Patocka @ 2019-11-07 16:54 UTC (permalink / raw)
  To: Tomas Dalebjörk
  Cc: LVM general discussion and development, Zdenek Kabelac

[-- Attachment #1: Type: TEXT/PLAIN, Size: 512 bytes --]



On Tue, 5 Nov 2019, Tomas Dalebjörk wrote:

> Thanks,
> 
> That really helped me to understand how the snapshot works.
> Last question:
> - lets say that block 100 which is 1MB in size is in the cow device, and a write happen that wants to something or all data on that region of block 100.
> Than I assume; based on what have been previously said here, that the block in the cow device will be overwritten with the new changes.

Yes, the block in the cow device will be overwritten.

Mikulas

> Regards Tomas

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [linux-lvm] exposing snapshot block device
  2019-11-07 16:54                             ` Mikulas Patocka
@ 2019-11-07 17:29                               ` Tomas Dalebjörk
  2020-09-04 12:09                                 ` Tomas Dalebjörk
  0 siblings, 1 reply; 53+ messages in thread
From: Tomas Dalebjörk @ 2019-11-07 17:29 UTC (permalink / raw)
  To: Mikulas Patocka; +Cc: LVM general discussion and development, Zdenek Kabelac

[-- Attachment #1: Type: text/plain, Size: 659 bytes --]

Great, thanks!

Den tors 7 nov. 2019 kl 17:54 skrev Mikulas Patocka <mpatocka@redhat.com>:

>
>
> On Tue, 5 Nov 2019, Tomas Dalebjörk wrote:
>
> > Thanks,
> >
> > That really helped me to understand how the snapshot works.
> > Last question:
> > - lets say that block 100 which is 1MB in size is in the cow device, and
> a write happen that wants to something or all data on that region of block
> 100.
> > Than I assume; based on what have been previously said here, that the
> block in the cow device will be overwritten with the new changes.
>
> Yes, the block in the cow device will be overwritten.
>
> Mikulas
>
> > Regards Tomas

[-- Attachment #2: Type: text/html, Size: 984 bytes --]

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [linux-lvm] exposing snapshot block device
  2019-11-07 17:29                               ` Tomas Dalebjörk
@ 2020-09-04 12:09                                 ` Tomas Dalebjörk
  2020-09-04 12:37                                   ` Zdenek Kabelac
  2020-09-07 13:09                                   ` Mikulas Patocka
  0 siblings, 2 replies; 53+ messages in thread
From: Tomas Dalebjörk @ 2020-09-04 12:09 UTC (permalink / raw)
  To: Mikulas Patocka; +Cc: LVM general discussion and development, Zdenek Kabelac

[-- Attachment #1: Type: text/plain, Size: 1738 bytes --]

hi

I tried to perform as suggested
# lvconvert —splitsnapshot vg/lv-snap
works fine
# lvconvert -s vg/lv vg/lv-snap
works fine too

but...
if I try to converting cow data directly from the meta device, than it doesn’t work
eg
# lvconvert -s vg/lv /dev/mycowdev
the tool doesn’t like the path
I tried to place a link in /dev/vg/mycowdev -> /dev/mycowdev
and retried the operations 
# lvconveet -s vg/lv /dev/vg/mycowdev
but this doesn’t work either

conclusion 
even though the cow device is an exact copy of the cow device that I have saved on /dev/mycowdev before the split, it wouldn’t work to use to convert back as a lvm snapshot 

not sure if I understand the tool correctly, or if there are other things needed to perform, such as creating virtual information about the lvm VGDA data on the first of this virtual volume named /dev/mycowdev 

let me know what more steps are needed

beat regards Tomas

Sent from my iPhone

> On 7 Nov 2019, at 18:29, Tomas Dalebjörk <tomas.dalebjork@gmail.com> wrote:
> 
> 
> Great, thanks! 
> 
> Den tors 7 nov. 2019 kl 17:54 skrev Mikulas Patocka <mpatocka@redhat.com>:
>> 
>> 
>> On Tue, 5 Nov 2019, Tomas Dalebjörk wrote:
>> 
>> > Thanks,
>> > 
>> > That really helped me to understand how the snapshot works.
>> > Last question:
>> > - lets say that block 100 which is 1MB in size is in the cow device, and a write happen that wants to something or all data on that region of block 100.
>> > Than I assume; based on what have been previously said here, that the block in the cow device will be overwritten with the new changes.
>> 
>> Yes, the block in the cow device will be overwritten.
>> 
>> Mikulas
>> 
>> > Regards Tomas

[-- Attachment #2: Type: text/html, Size: 2583 bytes --]

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [linux-lvm] exposing snapshot block device
  2020-09-04 12:09                                 ` Tomas Dalebjörk
@ 2020-09-04 12:37                                   ` Zdenek Kabelac
  2020-09-07 13:09                                   ` Mikulas Patocka
  1 sibling, 0 replies; 53+ messages in thread
From: Zdenek Kabelac @ 2020-09-04 12:37 UTC (permalink / raw)
  To: Tomas Dalebjörk; +Cc: LVM general discussion and development

Dne 04. 09. 20 v 14:09 Tomas Dalebjörk napsal(a):
> hi
> 
> I tried to perform as suggested
> # lvconvert —splitsnapshot vg/lv-snap
> works fine
> # lvconvert -s vg/lv vg/lv-snap
> works fine too
> 
> but...
> if I try to converting cow data directly from the meta device, than it doesn’t 
> work
> eg
> # lvconvert -s vg/lv /dev/mycowdev
> the tool doesn’t like the path
> I tried to place a link in /dev/vg/mycowdev -> /dev/mycowdev

Hi

lvm2 does only support 'objects' within VG without any plan to support 
'external' devices.

So user may not take any 'random' device in a system and use it for
commands like lvconvert.

There is always very strict requirement to place block devices as VG member 
first (pvcreate, vgextend...) and then user can allocate space of this device 
for various LVs.

> conclusion
> even though the cow device is an exact copy of the cow device that I have 
> saved on /dev/mycowdev before the split, it wouldn’t work to use to convert 
> back as a lvm snapshot

COW data needs to be simply stored on an LV for use with lvm2.

You may of course use the 'dmsetup' command directly and arrange your
snapshot setup in the way to combine various kinds of devices - but this
is going completely without any lvm2 command involved - in this case
you have to fully manipulate all devices in your device stack with this
dmsetup command.

Regards

Zdenek

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [linux-lvm] exposing snapshot block device
  2020-09-04 12:09                                 ` Tomas Dalebjörk
  2020-09-04 12:37                                   ` Zdenek Kabelac
@ 2020-09-07 13:09                                   ` Mikulas Patocka
  2020-09-07 14:14                                     ` Dalebjörk, Tomas
  1 sibling, 1 reply; 53+ messages in thread
From: Mikulas Patocka @ 2020-09-07 13:09 UTC (permalink / raw)
  To: Tomas Dalebjörk
  Cc: LVM general discussion and development, Zdenek Kabelac

[-- Attachment #1: Type: TEXT/PLAIN, Size: 2030 bytes --]



On Fri, 4 Sep 2020, Tomas Dalebjörk wrote:

> hi
> I tried to perform as suggested
> # lvconvert —splitsnapshot vg/lv-snap
> works fine
> # lvconvert -s vg/lv vg/lv-snap
> works fine too
> 
> but...
> if I try to converting cow data directly from the meta device, than it doesn’t work
> eg
> # lvconvert -s vg/lv /dev/mycowdev
> the tool doesn’t like the path
> I tried to place a link in /dev/vg/mycowdev -> /dev/mycowdev
> and retried the operations 
> # lvconveet -s vg/lv /dev/vg/mycowdev
> but this doesn’t work either
> 
> conclusion  even though the cow device is an exact copy of the cow 
> device that I have saved on /dev/mycowdev before the split, it wouldn’t 
> work to use to convert back as a lvm snapshot 
> 
> not sure if I understand the tool correctly, or if there are other 
> things needed to perform, such as creating virtual information about the 
> lvm VGDA data on the first of this virtual volume named /dev/mycowdev 

AFAIK LVM doesn't support taking existing cow device and attaching it to 
an existing volume. When you create a snapshot, you start with am empty 
cow.

Mikulas

> let me know what more steps are needed
> 
> beat regards Tomas
> 
> Sent from my iPhone
> 
>       On 7 Nov 2019, at 18:29, Tomas Dalebjörk <tomas.dalebjork@gmail.com> wrote:
> 
>       Great, thanks!
> 
> Den tors 7 nov. 2019 kl 17:54 skrev Mikulas Patocka <mpatocka@redhat.com>:
> 
> 
>       On Tue, 5 Nov 2019, Tomas Dalebjörk wrote:
> 
>       > Thanks,
>       >
>       > That really helped me to understand how the snapshot works.
>       > Last question:
>       > - lets say that block 100 which is 1MB in size is in the cow device, and a write happen that wants to something or all data on that region of block
>       100.
>       > Than I assume; based on what have been previously said here, that the block in the cow device will be overwritten with the new changes.
> 
>       Yes, the block in the cow device will be overwritten.
> 
>       Mikulas
> 
>       > Regards Tomas
> 
> 
> 

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [linux-lvm] exposing snapshot block device
  2020-09-07 13:09                                   ` Mikulas Patocka
@ 2020-09-07 14:14                                     ` Dalebjörk, Tomas
  2020-09-07 14:17                                       ` Zdenek Kabelac
  0 siblings, 1 reply; 53+ messages in thread
From: Dalebjörk, Tomas @ 2020-09-07 14:14 UTC (permalink / raw)
  To: Mikulas Patocka; +Cc: LVM general discussion and development, Zdenek Kabelac

[-- Attachment #1: Type: text/plain, Size: 2677 bytes --]

Hi Mikulas,

Thanks for the replies

I am confused now with the last message?

LVM doesn't support taking existing cow device and attaching it to an 
existing volume?

Isn't that what "lvconvert --splitsnapshot" & "lvconvert -s" is ment to 
be doing?

lets say that I create the snapshot on a different device using these steps:

root@src# lvcreate -s -L 10GB -n lvsnap vg/lv /dev/sdh
root@src# lvconvert ---splitsnapshot vg/lvsnap
root@src# echo "I now move /dev/sdb to another server"
root@tgt# lvconvert -s newvg/newlv vg/lvsnap


Regards Tomas

Den 2020-09-07 kl. 15:09, skrev Mikulas Patocka:
>
> On Fri, 4 Sep 2020, Tomas Dalebjörk wrote:
>
>> hi
>> I tried to perform as suggested
>> # lvconvert —splitsnapshot vg/lv-snap
>> works fine
>> # lvconvert -s vg/lv vg/lv-snap
>> works fine too
>>
>> but...
>> if I try to converting cow data directly from the meta device, than it doesn’t work
>> eg
>> # lvconvert -s vg/lv /dev/mycowdev
>> the tool doesn’t like the path
>> I tried to place a link in /dev/vg/mycowdev -> /dev/mycowdev
>> and retried the operations
>> # lvconveet -s vg/lv /dev/vg/mycowdev
>> but this doesn’t work either
>>
>> conclusion  even though the cow device is an exact copy of the cow
>> device that I have saved on /dev/mycowdev before the split, it wouldn’t
>> work to use to convert back as a lvm snapshot
>>
>> not sure if I understand the tool correctly, or if there are other
>> things needed to perform, such as creating virtual information about the
>> lvm VGDA data on the first of this virtual volume named /dev/mycowdev
> AFAIK LVM doesn't support taking existing cow device and attaching it to
> an existing volume. When you create a snapshot, you start with am empty
> cow.
>
> Mikulas
>
>> let me know what more steps are needed
>>
>> beat regards Tomas
>>
>> Sent from my iPhone
>>
>>        On 7 Nov 2019, at 18:29, Tomas Dalebjörk <tomas.dalebjork@gmail.com> wrote:
>>
>>        Great, thanks!
>>
>> Den tors 7 nov. 2019 kl 17:54 skrev Mikulas Patocka <mpatocka@redhat.com>:
>>
>>
>>        On Tue, 5 Nov 2019, Tomas Dalebjörk wrote:
>>
>>        > Thanks,
>>        >
>>        > That really helped me to understand how the snapshot works.
>>        > Last question:
>>        > - lets say that block 100 which is 1MB in size is in the cow device, and a write happen that wants to something or all data on that region of block
>>        100.
>>        > Than I assume; based on what have been previously said here, that the block in the cow device will be overwritten with the new changes.
>>
>>        Yes, the block in the cow device will be overwritten.
>>
>>        Mikulas
>>
>>        > Regards Tomas
>>
>>

[-- Attachment #2: Type: text/html, Size: 3532 bytes --]

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [linux-lvm] exposing snapshot block device
  2020-09-07 14:14                                     ` Dalebjörk, Tomas
@ 2020-09-07 14:17                                       ` Zdenek Kabelac
  2020-09-07 16:34                                         ` Tomas Dalebjörk
  0 siblings, 1 reply; 53+ messages in thread
From: Zdenek Kabelac @ 2020-09-07 14:17 UTC (permalink / raw)
  To: Dalebjörk, Tomas, Mikulas Patocka
  Cc: LVM general discussion and development

Dne 07. 09. 20 v 16:14 Dalebjörk, Tomas napsal(a):
> Hi Mikulas,
> 
> Thanks for the replies
> 
> I am confused now with the last message?
> 
> LVM doesn't support taking existing cow device and attaching it to an existing 
> volume?
> 
> Isn't that what "lvconvert --splitsnapshot" & "lvconvert -s" is ment to be doing?
> 
> lets say that I create the snapshot on a different device using these steps:
> 
> root@src# lvcreate -s -L 10GB -n lvsnap vg/lv /dev/sdh
> root@src# lvconvert ---splitsnapshot vg/lvsnap
> root@src# echo "I now move /dev/sdb to another server"
> root@tgt# lvconvert -s newvg/newlv vg/lvsnap
> 

Hi

This is only supported as long as you stay within VG.
So newlv & lvsnap must be in a single VG.

Note - you can 'vgreduce' PV from VG1 and vgextend to VG2.
But it always work on whole PV base - you can't mix
LV between VGs.

Zdenek

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [linux-lvm] exposing snapshot block device
  2020-09-07 14:17                                       ` Zdenek Kabelac
@ 2020-09-07 16:34                                         ` Tomas Dalebjörk
  2020-09-07 16:42                                           ` Zdenek Kabelac
  0 siblings, 1 reply; 53+ messages in thread
From: Tomas Dalebjörk @ 2020-09-07 16:34 UTC (permalink / raw)
  To: Zdenek Kabelac; +Cc: LVM general discussion and development

thanks for feedback 

so if I understand this correctly
# fallocate -l 100M /tmp/pv1
# fallocate -l 100M /tmp/pv2
# fallocate -l 100M /tmp/pv3

# losetup —find —show /tmp/pv1
# losetup —find —show /tmp/pv2
# losetup —find —show /tmp/pv3

# vgcreate vg0 /dev/loop0
# lvcreate -n lv0 -l 1 vg0
# vgextend vg0 /dev/loop1
# lvcreate -s -l 1 -n lvsnap /dev/loop1
# vgchange -a n vg0

# lvconvert —splitsnapshot vg0/lvsnap

# vgreduce vg0 /dev/loop1

# vgcreate vg1 /dev/loop2
# lvcreate -n lv0 -l 1 vg1
# vgextend vg1 /dev/loop1 
# lvconvert -s vg1/lvsnap vg1/lv0

not sure if the steps are correct?

regards Tomas

Sent from my iPhone

> On 7 Sep 2020, at 16:17, Zdenek Kabelac <zkabelac@redhat.com> wrote:
> 
> Dne 07. 09. 20 v 16:14 Dalebjörk, Tomas napsal(a):
>> Hi Mikulas,
>> Thanks for the replies
>> I am confused now with the last message?
>> LVM doesn't support taking existing cow device and attaching it to an existing volume?
>> Isn't that what "lvconvert --splitsnapshot" & "lvconvert -s" is ment to be doing?
>> lets say that I create the snapshot on a different device using these steps:
>> root@src# lvcreate -s -L 10GB -n lvsnap vg/lv /dev/sdh
>> root@src# lvconvert ---splitsnapshot vg/lvsnap
>> root@src# echo "I now move /dev/sdb to another server"
>> root@tgt# lvconvert -s newvg/newlv vg/lvsnap
> 
> Hi
> 
> This is only supported as long as you stay within VG.
> So newlv & lvsnap must be in a single VG.
> 
> Note - you can 'vgreduce' PV from VG1 and vgextend to VG2.
> But it always work on whole PV base - you can't mix
> LV between VGs.
> 
> Zdenek
> 

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [linux-lvm] exposing snapshot block device
  2020-09-07 16:34                                         ` Tomas Dalebjörk
@ 2020-09-07 16:42                                           ` Zdenek Kabelac
  2020-09-07 17:37                                             ` Tomas Dalebjörk
  2020-09-07 19:56                                             ` Tomas Dalebjörk
  0 siblings, 2 replies; 53+ messages in thread
From: Zdenek Kabelac @ 2020-09-07 16:42 UTC (permalink / raw)
  To: Tomas Dalebjörk; +Cc: LVM general discussion and development

Dne 07. 09. 20 v 18:34 Tomas Dalebjörk napsal(a):
> thanks for feedback
> 
> so if I understand this correctly
> # fallocate -l 100M /tmp/pv1
> # fallocate -l 100M /tmp/pv2
> # fallocate -l 100M /tmp/pv3
> 
> # losetup —find —show /tmp/pv1
> # losetup —find —show /tmp/pv2
> # losetup —find —show /tmp/pv3
> 
> # vgcreate vg0 /dev/loop0
> # lvcreate -n lv0 -l 1 vg0
> # vgextend vg0 /dev/loop1
> # lvcreate -s -l 1 -n lvsnap /dev/loop1
> # vgchange -a n vg0
> 
> # lvconvert —splitsnapshot vg0/lvsnap
> 
> # vgreduce vg0 /dev/loop1


Hi

Here you would need to use 'vgsplit' rather - otherwise you
loose the mapping for whatever was living on /dev/loop1

> 
> # vgcreate vg1 /dev/loop2
> # lvcreate -n lv0 -l 1 vg1
> # vgextend vg1 /dev/loop1

And  'vgmerge'


> # lvconvert -s vg1/lvsnap vg1/lv0
> 
> not sure if the steps are correct?
> 


I hope you realize the content of vg1/lv0 must be exactly same
as vg0/lv0.

As snapshot COW volume contains only 'diff chunks' - so if you
would attach snapshot to 'different' lv - you would get only mess.


Zdenek

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [linux-lvm] exposing snapshot block device
  2020-09-07 16:42                                           ` Zdenek Kabelac
@ 2020-09-07 17:37                                             ` Tomas Dalebjörk
  2020-09-07 17:50                                               ` Zdenek Kabelac
  2020-09-07 19:56                                             ` Tomas Dalebjörk
  1 sibling, 1 reply; 53+ messages in thread
From: Tomas Dalebjörk @ 2020-09-07 17:37 UTC (permalink / raw)
  To: Zdenek Kabelac; +Cc: LVM general discussion and development

thanks 

ok
 vgsplit/merge instead
and after that lvconvert-s

yes, I am aware of the issues with corruption
but if the cow device has all data, than no corruption will happen, right?

if COW has a copy of all blocks
than a lvconvert —merge, or mount of the snapshot volume will be without issues

right?

regards Tomas

Sent from my iPhone

> On 7 Sep 2020, at 18:42, Zdenek Kabelac <zkabelac@redhat.com> wrote:
> 
> Dne 07. 09. 20 v 18:34 Tomas Dalebjörk napsal(a):
>> thanks for feedback
>> so if I understand this correctly
>> # fallocate -l 100M /tmp/pv1
>> # fallocate -l 100M /tmp/pv2
>> # fallocate -l 100M /tmp/pv3
>> # losetup —find —show /tmp/pv1
>> # losetup —find —show /tmp/pv2
>> # losetup —find —show /tmp/pv3
>> # vgcreate vg0 /dev/loop0
>> # lvcreate -n lv0 -l 1 vg0
>> # vgextend vg0 /dev/loop1
>> # lvcreate -s -l 1 -n lvsnap /dev/loop1
>> # vgchange -a n vg0
>> # lvconvert —splitsnapshot vg0/lvsnap
>> # vgreduce vg0 /dev/loop1
> 
> 
> Hi
> 
> Here you would need to use 'vgsplit' rather - otherwise you
> loose the mapping for whatever was living on /dev/loop1
> 
>> # vgcreate vg1 /dev/loop2
>> # lvcreate -n lv0 -l 1 vg1
>> # vgextend vg1 /dev/loop1
> 
> And  'vgmerge'
> 
> 
>> # lvconvert -s vg1/lvsnap vg1/lv0
>> not sure if the steps are correct?
> 
> 
> I hope you realize the content of vg1/lv0 must be exactly same
> as vg0/lv0.
> 
> As snapshot COW volume contains only 'diff chunks' - so if you
> would attach snapshot to 'different' lv - you would get only mess.
> 
> 
> Zdenek
> 

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [linux-lvm] exposing snapshot block device
  2020-09-07 17:37                                             ` Tomas Dalebjörk
@ 2020-09-07 17:50                                               ` Zdenek Kabelac
  2020-09-08 12:32                                                 ` Dalebjörk, Tomas
  0 siblings, 1 reply; 53+ messages in thread
From: Zdenek Kabelac @ 2020-09-07 17:50 UTC (permalink / raw)
  To: Tomas Dalebjörk; +Cc: LVM general discussion and development

Dne 07. 09. 20 v 19:37 Tomas Dalebjörk napsal(a):
> thanks
> 
> ok
>   vgsplit/merge instead
> and after that lvconvert-s
> 
> yes, I am aware of the issues with corruption
> but if the cow device has all data, than no corruption will happen, right?
> 
> if COW has a copy of all blocks
> than a lvconvert —merge, or mount of the snapshot volume will be without issues

If the 'COW' has all the data - why do you need then snapshot ?
Why not travel whole LV instead of snapshot ?

Also - nowdays this old (so called 'thick') snapshot is really slow compared 
with thin-provisioning - might be good if you check what kind of features
you can gain/loose if you would have switched to thin-pool
(clearly whole thin-pool (both data & metadata) would need to travel between 
your VGs.)

Regards

Zdenek

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [linux-lvm] exposing snapshot block device
  2020-09-07 16:42                                           ` Zdenek Kabelac
  2020-09-07 17:37                                             ` Tomas Dalebjörk
@ 2020-09-07 19:56                                             ` Tomas Dalebjörk
  2020-09-07 20:22                                               ` Tomas Dalebjörk
  2020-09-07 21:02                                               ` Tomas Dalebjörk
  1 sibling, 2 replies; 53+ messages in thread
From: Tomas Dalebjörk @ 2020-09-07 19:56 UTC (permalink / raw)
  To: Zdenek Kabelac; +Cc: LVM general discussion and development

hi 
I tried all these steps
but when I associated the snapshot cow device back to an empty origin, and typed the lvs command
the data% output shows 0% instead of 37% ?
so it looks like that the lvconvert -s vg1/lvsnap vg1/lv0 looses the cow data?

perhaps ypu can guide me how this can be done?

btw, just to emulate s full copy, I executed the 
dd if=/dev/vg0/lv0 of=/dev/vg1/lv0 
before the lvconvert -s, to make sure the last data is there

and than I tried to mount the vg1/lv0 which worked fine
but the data was not at snapshot view
even mounting vg1/lvsnap works fine
but with wrong data

confused over how and why vgmerge should be used as vgsplit does the work?

regards Tomas

Sent from my iPhone

> On 7 Sep 2020, at 18:42, Zdenek Kabelac <zkabelac@redhat.com> wrote:
> 
> Dne 07. 09. 20 v 18:34 Tomas Dalebjörk napsal(a):
>> thanks for feedback
>> so if I understand this correctly
>> # fallocate -l 100M /tmp/pv1
>> # fallocate -l 100M /tmp/pv2
>> # fallocate -l 100M /tmp/pv3
>> # losetup —find —show /tmp/pv1
>> # losetup —find —show /tmp/pv2
>> # losetup —find —show /tmp/pv3
>> # vgcreate vg0 /dev/loop0
>> # lvcreate -n lv0 -l 1 vg0
>> # vgextend vg0 /dev/loop1
>> # lvcreate -s -l 1 -n lvsnap /dev/loop1
>> # vgchange -a n vg0
>> # lvconvert —splitsnapshot vg0/lvsnap
>> # vgreduce vg0 /dev/loop1
> 
> 
> Hi
> 
> Here you would need to use 'vgsplit' rather - otherwise you
> loose the mapping for whatever was living on /dev/loop1
> 
>> # vgcreate vg1 /dev/loop2
>> # lvcreate -n lv0 -l 1 vg1
>> # vgextend vg1 /dev/loop1
> 
> And  'vgmerge'
> 
> 
>> # lvconvert -s vg1/lvsnap vg1/lv0
>> not sure if the steps are correct?
> 
> 
> I hope you realize the content of vg1/lv0 must be exactly same
> as vg0/lv0.
> 
> As snapshot COW volume contains only 'diff chunks' - so if you
> would attach snapshot to 'different' lv - you would get only mess.
> 
> 
> Zdenek
> 

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [linux-lvm] exposing snapshot block device
  2020-09-07 19:56                                             ` Tomas Dalebjörk
@ 2020-09-07 20:22                                               ` Tomas Dalebjörk
  2020-09-07 21:02                                               ` Tomas Dalebjörk
  1 sibling, 0 replies; 53+ messages in thread
From: Tomas Dalebjörk @ 2020-09-07 20:22 UTC (permalink / raw)
  To: Zdenek Kabelac; +Cc: LVM general discussion and development

yes, 
we need the snapshot data, as it is provisioned from the backup target and can’t be changed

we will definitely look into thin snapshots later, but want to first making sure that we can reanimate the cow device as a device and associate this with an empty origin

we want if possible be able to associate this cow to a new empty vg/lv using new vgname/lvname if possible 

after all, it is just an virutal volume

Sent from my iPhone

> On 7 Sep 2020, at 21:56, Tomas Dalebjörk <tomas.dalebjork@gmail.com> wrote:
> 
> hi 
> I tried all these steps
> but when I associated the snapshot cow device back to an empty origin, and typed the lvs command
> the data% output shows 0% instead of 37% ?
> so it looks like that the lvconvert -s vg1/lvsnap vg1/lv0 looses the cow data?
> 
> perhaps ypu can guide me how this can be done?
> 
> btw, just to emulate s full copy, I executed the 
> dd if=/dev/vg0/lv0 of=/dev/vg1/lv0 
> before the lvconvert -s, to make sure the last data is there
> 
> and than I tried to mount the vg1/lv0 which worked fine
> but the data was not at snapshot view
> even mounting vg1/lvsnap works fine
> but with wrong data
> 
> confused over how and why vgmerge should be used as vgsplit does the work?
> 
> regards Tomas
> 
> Sent from my iPhone
> 
>> On 7 Sep 2020, at 18:42, Zdenek Kabelac <zkabelac@redhat.com> wrote:
>> 
>> Dne 07. 09. 20 v 18:34 Tomas Dalebjörk napsal(a):
>>> thanks for feedback
>>> so if I understand this correctly
>>> # fallocate -l 100M /tmp/pv1
>>> # fallocate -l 100M /tmp/pv2
>>> # fallocate -l 100M /tmp/pv3
>>> # losetup —find —show /tmp/pv1
>>> # losetup —find —show /tmp/pv2
>>> # losetup —find —show /tmp/pv3
>>> # vgcreate vg0 /dev/loop0
>>> # lvcreate -n lv0 -l 1 vg0
>>> # vgextend vg0 /dev/loop1
>>> # lvcreate -s -l 1 -n lvsnap /dev/loop1
>>> # vgchange -a n vg0
>>> # lvconvert —splitsnapshot vg0/lvsnap
>>> # vgreduce vg0 /dev/loop1
>> 
>> 
>> Hi
>> 
>> Here you would need to use 'vgsplit' rather - otherwise you
>> loose the mapping for whatever was living on /dev/loop1
>> 
>>> # vgcreate vg1 /dev/loop2
>>> # lvcreate -n lv0 -l 1 vg1
>>> # vgextend vg1 /dev/loop1
>> 
>> And  'vgmerge'
>> 
>> 
>>> # lvconvert -s vg1/lvsnap vg1/lv0
>>> not sure if the steps are correct?
>> 
>> 
>> I hope you realize the content of vg1/lv0 must be exactly same
>> as vg0/lv0.
>> 
>> As snapshot COW volume contains only 'diff chunks' - so if you
>> would attach snapshot to 'different' lv - you would get only mess.
>> 
>> 
>> Zdenek
>> 

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [linux-lvm] exposing snapshot block device
  2020-09-07 19:56                                             ` Tomas Dalebjörk
  2020-09-07 20:22                                               ` Tomas Dalebjörk
@ 2020-09-07 21:02                                               ` Tomas Dalebjörk
  1 sibling, 0 replies; 53+ messages in thread
From: Tomas Dalebjörk @ 2020-09-07 21:02 UTC (permalink / raw)
  To: Zdenek Kabelac; +Cc: LVM general discussion and development

it worked

I missed the -Zn flag


Sent from my iPhone

> On 7 Sep 2020, at 21:56, Tomas Dalebjörk <tomas.dalebjork@gmail.com> wrote:
> 
> hi 
> I tried all these steps
> but when I associated the snapshot cow device back to an empty origin, and typed the lvs command
> the data% output shows 0% instead of 37% ?
> so it looks like that the lvconvert -s vg1/lvsnap vg1/lv0 looses the cow data?
> 
> perhaps ypu can guide me how this can be done?
> 
> btw, just to emulate s full copy, I executed the 
> dd if=/dev/vg0/lv0 of=/dev/vg1/lv0 
> before the lvconvert -s, to make sure the last data is there
> 
> and than I tried to mount the vg1/lv0 which worked fine
> but the data was not at snapshot view
> even mounting vg1/lvsnap works fine
> but with wrong data
> 
> confused over how and why vgmerge should be used as vgsplit does the work?
> 
> regards Tomas
> 
> Sent from my iPhone
> 
>> On 7 Sep 2020, at 18:42, Zdenek Kabelac <zkabelac@redhat.com> wrote:
>> 
>> Dne 07. 09. 20 v 18:34 Tomas Dalebjörk napsal(a):
>>> thanks for feedback
>>> so if I understand this correctly
>>> # fallocate -l 100M /tmp/pv1
>>> # fallocate -l 100M /tmp/pv2
>>> # fallocate -l 100M /tmp/pv3
>>> # losetup —find —show /tmp/pv1
>>> # losetup —find —show /tmp/pv2
>>> # losetup —find —show /tmp/pv3
>>> # vgcreate vg0 /dev/loop0
>>> # lvcreate -n lv0 -l 1 vg0
>>> # vgextend vg0 /dev/loop1
>>> # lvcreate -s -l 1 -n lvsnap /dev/loop1
>>> # vgchange -a n vg0
>>> # lvconvert —splitsnapshot vg0/lvsnap
>>> # vgreduce vg0 /dev/loop1
>> 
>> 
>> Hi
>> 
>> Here you would need to use 'vgsplit' rather - otherwise you
>> loose the mapping for whatever was living on /dev/loop1
>> 
>>> # vgcreate vg1 /dev/loop2
>>> # lvcreate -n lv0 -l 1 vg1
>>> # vgextend vg1 /dev/loop1
>> 
>> And  'vgmerge'
>> 
>> 
>>> # lvconvert -s vg1/lvsnap vg1/lv0
>>> not sure if the steps are correct?
>> 
>> 
>> I hope you realize the content of vg1/lv0 must be exactly same
>> as vg0/lv0.
>> 
>> As snapshot COW volume contains only 'diff chunks' - so if you
>> would attach snapshot to 'different' lv - you would get only mess.
>> 
>> 
>> Zdenek
>> 

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [linux-lvm] exposing snapshot block device
  2020-09-07 17:50                                               ` Zdenek Kabelac
@ 2020-09-08 12:32                                                 ` Dalebjörk, Tomas
  0 siblings, 0 replies; 53+ messages in thread
From: Dalebjörk, Tomas @ 2020-09-08 12:32 UTC (permalink / raw)
  To: Zdenek Kabelac; +Cc: LVM general discussion and development

[-- Attachment #1: Type: text/plain, Size: 2719 bytes --]

Hi,


This is the steps that I did.

- the COW data exists on /dev/loop1, including space for the PV header + 
metadata

I created a fakevg template file from vgcfgbackup /tmp/fakevg.bkp

( the content of this file I created fake uuid etc... )


I craete a fake uuid for the PV

# pvcreate -ff -u fake-uidx-nrxx-xxxx --restorefile /tmp/fakevg.bkp


And created rhe metadata from the backup

# vgcfgrestore -f /tmp/fakevg.bkp fakevg


I can now see the lvsnap in fakevg

Perhaps the restore can be done directly to the destination vg? not sure...

Anyhow, I than used the vgsplit to move the fakevg data to the 
destination vg

# vgsplit fakevg destvg /dev/loop1


I know have the lvsnap volume in the correct volume group

 From here, I connected the lvsnap to a lv destination using

# lvconvert -Zn -s destvg/lvsnap destvg/destlv


I know have a snapshot connected to the origin destlv

 From here, I can either mount the snapshot and start using it, or 
revert to the snapshot

# lvchange -a n destvg/destlv
# lvconvert --merge -b destvg/lvsnap
# lvchange -a y destvg/destlv


Now to my questions...

is there any DBUS api that can perform the vgcfgrestore operations that 
I can use through C?

or another ways to recreate the metadata?

I have to now use two steps: pvcreate + vgcfgrestore, where I just need 
to actually restore just the metadata (only vgcfgrestore)?

If I run vgcfgrestore without pvcreate, than vgcfgrestore will not find 
the pvid, and cant be executed with a parameter like:

# vgcfgrestore -f vgXX.bkp /dev/nbd

Instead it has to be used with the parameter vgXX pointing out the 
volume group...


I can live with vgcfgrestore + pvcreate, but would prefer to use the 
libblockdev (DBUS) or another api from C directly.

What options do I have?


Thanks for an excellent help

God Bless

Tomas



Den 2020-09-07 kl. 19:50, skrev Zdenek Kabelac:
> Dne 07. 09. 20 v 19:37 Tomas Dalebjörk napsal(a):
>> thanks
>>
>> ok
>>   vgsplit/merge instead
>> and after that lvconvert-s
>>
>> yes, I am aware of the issues with corruption
>> but if the cow device has all data, than no corruption will happen, 
>> right?
>>
>> if COW has a copy of all blocks
>> than a lvconvert —merge, or mount of the snapshot volume will be 
>> without issues
>
> If the 'COW' has all the data - why do you need then snapshot ?
> Why not travel whole LV instead of snapshot ?
>
> Also - nowdays this old (so called 'thick') snapshot is really slow 
> compared with thin-provisioning - might be good if you check what kind 
> of features
> you can gain/loose if you would have switched to thin-pool
> (clearly whole thin-pool (both data & metadata) would need to travel 
> between your VGs.)
>
> Regards
>
> Zdenek
>

[-- Attachment #2: Type: text/html, Size: 4365 bytes --]

^ permalink raw reply	[flat|nested] 53+ messages in thread

end of thread, other threads:[~2020-09-08 12:32 UTC | newest]

Thread overview: 53+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-10-22 10:47 [linux-lvm] exposing snapshot block device Dalebjörk, Tomas
2019-10-22 13:57 ` Zdenek Kabelac
2019-10-22 15:29   ` Dalebjörk, Tomas
2019-10-22 15:36     ` Zdenek Kabelac
2019-10-22 16:13       ` Dalebjörk, Tomas
2019-10-23 10:26         ` Zdenek Kabelac
2019-10-23 10:56           ` Tomas Dalebjörk
2019-10-22 16:15       ` Stuart D. Gathman
2019-10-22 17:02         ` Tomas Dalebjörk
2019-10-22 21:38         ` Gionatan Danti
2019-10-22 22:53           ` Stuart D. Gathman
2019-10-23  6:58             ` Gionatan Danti
2019-10-23 10:06               ` Tomas Dalebjörk
2019-10-23 10:12             ` Zdenek Kabelac
2019-10-23 10:46         ` Zdenek Kabelac
2019-10-23 11:08           ` Gionatan Danti
2019-10-23 11:24             ` Tomas Dalebjörk
2019-10-23 11:26               ` Tomas Dalebjörk
2019-10-24 16:01               ` Zdenek Kabelac
2019-10-25 16:31                 ` Tomas Dalebjörk
2019-11-04  5:54                   ` Tomas Dalebjörk
2019-11-04 10:07                     ` Zdenek Kabelac
2019-11-04 14:40                       ` Tomas Dalebjörk
2019-11-04 15:04                         ` Zdenek Kabelac
2019-11-04 17:28                           ` Tomas Dalebjörk
2019-11-05 16:24                             ` Zdenek Kabelac
2019-11-05 16:40                         ` Mikulas Patocka
2019-11-05 20:56                           ` Tomas Dalebjörk
2019-11-06  9:22                             ` Zdenek Kabelac
2019-11-07 16:54                             ` Mikulas Patocka
2019-11-07 17:29                               ` Tomas Dalebjörk
2020-09-04 12:09                                 ` Tomas Dalebjörk
2020-09-04 12:37                                   ` Zdenek Kabelac
2020-09-07 13:09                                   ` Mikulas Patocka
2020-09-07 14:14                                     ` Dalebjörk, Tomas
2020-09-07 14:17                                       ` Zdenek Kabelac
2020-09-07 16:34                                         ` Tomas Dalebjörk
2020-09-07 16:42                                           ` Zdenek Kabelac
2020-09-07 17:37                                             ` Tomas Dalebjörk
2020-09-07 17:50                                               ` Zdenek Kabelac
2020-09-08 12:32                                                 ` Dalebjörk, Tomas
2020-09-07 19:56                                             ` Tomas Dalebjörk
2020-09-07 20:22                                               ` Tomas Dalebjörk
2020-09-07 21:02                                               ` Tomas Dalebjörk
2019-10-23 12:12             ` Ilia Zykov
2019-10-23 12:20             ` Ilia Zykov
2019-10-23 13:05               ` Zdenek Kabelac
2019-10-23 14:40                 ` Gionatan Danti
2019-10-23 15:46                   ` Ilia Zykov
2019-10-23 12:59             ` Zdenek Kabelac
2019-10-23 14:37               ` Gionatan Danti
2019-10-23 15:37                 ` Zdenek Kabelac
2019-10-23 17:16                   ` Gionatan Danti

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).